Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 0, no. 0, pp. 0–0 (2012)
Advances in the Planarization Method: Effective Multiple Edge Insertions Markus Chimani 1 Carsten Gutwenger 2
1Institute of Computer Science, FSU Jena 2Department of Computer Science, TU Dortmund
Abstract The planarization method is the strongest known method to heuristically find good solutions to the general crossing number problem in graphs: Starting from a planar subgraph, one iteratively inserts edges, representing crossings via dummy vertices. In recent years, several improvements have been made, both from the practical and the theoretical point of view. We review these advances and conduct an extensive study of the algorithms’ practical implications. In doing so, we also present the first implementation of an approximation algorithm for the crossing number problem of general graphs. We compare the obtained results with known exact crossing number solutions and show that modern techniques allow surprisingly tight results in practice.
Article type: Regular Paper
The research was partially supported by EUROGIGA project GraDR 10-EuroGIGA-OP-003. Markus Chimani was funded by a Carl-Zeiss-Foundation juniorprofessorship. E-mail addresses: [email protected] (Markus Chimani) [email protected] dortmund.de (Carsten Gutwenger)
1 Introduction
Given a graph G = (V,E), the crossing number problem asks how to draw G in the plane with the fewest possible edge crossings. The planarization method is probably the best-known and most successful heuristic to tackle the crossing number problem in practice. In its simplest form, it runs in two phases: first, a (large) planar subgraph G0 = (V,E0) of G is computed. Then, the temporarily removed edges F := E \ E0 are re-inserted one after another, each time solving a single edge insertion problem. This problem can be stated as follows: Let H be a planar graph and e an edge not yet in H. We search for a smallest planar graph H+ which represents a drawing of H + e where edge crossings are replaced by dummy vertices of degree 4, and all these crossings occur on the edge e. Hence, when removing the image of e from H+, we obtain a planar embedding of H. Using this method, each edge of F is inserted into a planar graph until we obtain a planarization G+, representing G in a planar way by using dummy vertices for crossings.
In the first proposal [2] of this heuristic, the insertion problem was considered with respect to a fixed embedding (cyclic order of the edges around their incident vertices) of the planar graph H. (That is, after obtaining the planar subgraph G0, one embedding of G0 is fixed and retained throughout the whole insertion phase.) A simple linear-time BFS algorithm in the dual graph of H suffices to find an optimal solution. Later, and rather surprisingly, it was shown in [18] that there exists a linear-time algorithm, using the SPQR-tree data structure, which finds the optimal insertion path for e over all possible planar embeddings of H. In [17] it was shown that, in practice, this approach is vastly superior to the former in terms of the overall number of crossings obtained.
In recent years it was furthermore shown that there exists a (rather complex) insertion algorithm [6] to optimally insert a vertex with all its incident edges into a planar graph H, considering all possible embeddings of H (vertex insertion). On the other hand, it is NP-hard to insert an arbitrary set of edges simultaneously (multiple edge insertion), even when the embedding of the planar graph into which to insert is fixed [35].
Most interestingly, the single edge insertion problem (over all possible embeddings of H) is known to approximate the crossing number of H + e within a factor of ∆/2, where ∆ is the graph’s maximum (vertex-)degree [4, 19]. That is, there exists a drawing of H + e in which no two edges of H cross and whose number of crossings is at most ∆/2 times the crossing number of H + e. Similarly, the vertex insertion problem approximates the crossing number of the resulting graph [8]. In particular, the proof of the latter can be generalized to show that an optimal multiple edge insertion solution, with respect to an edge set F and over all possible embeddings of G0, approximates the crossing number of G0 + F within a factor dependent only on ∆ and |F| [8].
Hence, the question arose whether this multiple edge insertion problem can be efficiently approximated. After a rather complicated approach in [10], a simpler and at the same time approximation-wise stronger algorithm was presented only recently [7]. The algorithm reuses concepts of the SPQR-tree based single edge insertion and seems simple enough to be implemented and used in practice. The latter paper also shows that the traditional iterative single edge insertion algorithm cannot be an approximation strategy for the crossing number of G.
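The approximation statement for single edge insertion can be written compactly. The notation below is introduced here only for illustration and is not taken from the cited papers: cr(·) denotes the crossing number, and ins(H, e) denotes the number of crossings in an optimal insertion of e into H over all planar embeddings of H. The left inequality holds because every insertion solution is itself a drawing of H + e; the right inequality is the ∆/2 bound stated above.

```latex
\[
  \operatorname{cr}(H+e) \;\le\; \operatorname{ins}(H,e) \;\le\; \frac{\Delta}{2}\cdot\operatorname{cr}(H+e)
\]
```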
Contribution. In this paper we present recent advances of the planarization approach from a practical point of view. On the one hand, we show how to improve on the traditional approach of iteratively inserting single edges, via the use of strong postprocessing routines. On the other hand, we give the first implementation of a simultaneous multiple edge insertion algorithm; hence, this is also the first practical study of any crossing number approximation algorithm for arbitrary graphs. By considering graph classes of known crossing numbers (either from theory or from the application of the currently strongest branch-and-cut based exact crossing minimization algorithm [9]) we can deduce a very good practical performance of these heuristics: they usually find optimum, or at least very-close-to-optimum, solutions.
In the next section we recapitulate the central decomposition data structures used for the insertion algorithms. We then summarize the general strategies and central steps of these algorithms in Section 3. Based thereon, Section 4 describes improvements as well as implementation and engineering choices that allow the algorithms to work well in practice. Finally, Section 5 contains the full discussion of our experimental evaluation, and Section 6 concludes the paper with some interesting open problems.
2 Preliminaries
In order to present our algorithmic choices and modifications, we first have to briefly introduce two central decomposition structures, used in all algorithms dealing with the insertion problem over all possible embeddings of H.
Let H be a connected graph. The BC-tree B = B(H) of H is a tree with two different node types B and C: For each cut vertex in H, B contains a unique corresponding C-node, and for each block, i.e., a maximal two-connected subgraph or a bridge, in H a unique corresponding B-node. Two nodes in B are adjacent if and only if they correspond to a block b and a cut vertex c, such that c ∈ b. It is well-known that the size of B is linear in the size of H, and that B can be constructed in linear time, by computing the biconnected components of H; see [21, 34].
Based thereon, we can further decompose each non-trivial block H0 (i.e., a block that is not a bridge) via an SPQR-tree [12, 13] into its triconnected components: While SPQR-trees are more complicated than BC-trees, they also only require linear size and can be constructed in linear time [16, 20]. This data structure is particularly interesting, as it directly encodes all (exponentially many) planar embeddings of H0. We use the definition from [5, 7] which does not use Q-nodes, and therefore call the decomposition SPR-tree for conciseness. In brief, each tree node corresponds to a skeleton, a “sketch” of H0 where certain subgraphs are replaced by virtual edges; a skeleton’s structure is restricted to only three simple types. By repeatedly merging the skeletons of adjacent nodes (at their virtual edges representing each other), we can obtain the original graph. See Figure 1 for an example.
Figure 1: A planar graph (top left), its SPR-tree (top right), and its decomposition showing the skeletons (bottom). Virtual edges are represented by dotted lines.
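The BC-tree construction described above can be illustrated with a short sketch. This is not the implementation used in the paper: it is a minimal recursive depth-first search with low-points, assuming a connected simple graph given as an adjacency dictionary. The names `bc_tree` and `adj` are hypothetical, and the recursion makes it suitable for small examples only.

```python
def bc_tree(adj):
    """Return (blocks, cut_vertices, tree_edges) of a connected simple graph.

    adj maps each vertex to a list of its neighbours (assumed input format).
    A bridge shows up as a block with exactly two vertices.
    tree_edges pairs a block index (B-node) with a cut vertex (C-node).
    """
    disc, low = {}, {}          # discovery index and low-point per vertex
    edge_stack, blocks = [], []
    cut_vertices = set()
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        children = 0
        for w in adj[u]:
            if w not in disc:                      # tree edge
                children += 1
                edge_stack.append((u, w))
                dfs(w, u)
                low[u] = min(low[u], low[w])
                if low[w] >= disc[u]:
                    # u separates the subtree of w from the rest: close a block
                    if parent is not None or children > 1:
                        cut_vertices.add(u)
                    block = set()
                    while True:
                        a, b = edge_stack.pop()
                        block.update((a, b))
                        if (a, b) == (u, w):
                            break
                    blocks.append(frozenset(block))
            elif w != parent and disc[w] < disc[u]:  # back edge to an ancestor
                edge_stack.append((u, w))
                low[u] = min(low[u], disc[w])

    dfs(next(iter(adj)), None)
    # A B-node and a C-node are adjacent iff the cut vertex lies in the block.
    tree_edges = sorted((i, c) for i, b in enumerate(blocks)
                        for c in cut_vertices if c in b)
    return blocks, cut_vertices, tree_edges


# Two triangles joined by the bridge {c, d}.
example = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
blocks, cuts, tree_edges = bc_tree(example)
```

In this example the blocks are the two triangles and the bridge, and the cut vertices are c and d, so the BC-tree is a path that alternates between B-nodes and C-nodes.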
Definition 1 (SPR-tree). Let H0 be a biconnected graph with at least three vertices. The SPR-tree T = T(H0) of H0 is the (unique) smallest tree satisfying the following properties:
i. Each node ν in T corresponds to a skeleton Sν = (Vν, Eν) which is a “sketch” (minor, in fact) of H0: Certain subgraphs are replaced by single virtual edges. The non-virtual edges are referred to as original edges.
ii. The tree has three different node types with specific skeleton structure:
S: The skeleton is a simple cycle; it represents a serial component.
P: The skeleton consists of two vertices and multiple edges between them; it represents a parallel component.
R: The skeleton is a simple triconnected graph. Note that a planar triconnected graph has a unique embedding (up to mirroring).
iii. For the edge (ν, µ) in T we have: Sν contains a virtual edge eµ which represents the subgraph described by Sµ, and vice versa.
iv. We can obtain the original graph H0 by iteratively merging the skeletons of adjacent tree nodes: For the edge (ν, µ) in T, let eµ (eν) be the virtual edge in ν (µ) representing the subgraph described by Sµ (Sν, respectively). Clearly, both edges eµ and eν connect the same vertices, say u and v. We obtain a merged graph (Vν ∪ Vµ, (Eν ∪ Eµ) \ {eµ, eν}) by gluing the graphs together at u and v and removing eµ and eν.
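The merge operation of property iv can be sketched in a few lines. The representation is an assumption made for this example only: a skeleton is a list of triples `(u, v, ref)`, where `ref` is `None` for an original edge and otherwise names the tree node that the virtual edge stands for. The function name `merge_skeletons` is hypothetical.

```python
def merge_skeletons(skel_nu, skel_mu, nu, mu):
    """Merge two skeletons along the tree edge (nu, mu).

    The virtual edge in skel_nu referring to mu and the virtual edge in
    skel_mu referring to nu are removed; all other edges are kept.
    """
    e_mu = [e for e in skel_nu if e[2] == mu]
    e_nu = [e for e in skel_mu if e[2] == nu]
    assert len(e_mu) == 1 and len(e_nu) == 1, "expected one twin pair"
    # Both virtual edges must connect the same two vertices u and v.
    assert set(e_mu[0][:2]) == set(e_nu[0][:2])
    return ([e for e in skel_nu if e[2] != mu] +
            [e for e in skel_mu if e[2] != nu])


# Two triangles a-c-b and a-d-b sharing the original edge {a, b}:
# one P-node on {a, b} and two S-nodes (simple cycles).
P = [("a", "b", None), ("a", "b", "S1"), ("a", "b", "S2")]
S1 = [("a", "c", None), ("c", "b", None), ("a", "b", "P")]
S2 = [("a", "d", None), ("d", "b", None), ("a", "b", "P")]

merged = merge_skeletons(P, S1, "P", "S1")       # still holds the edge to S2
graph = merge_skeletons(merged, S2, "P", "S2")   # only original edges remain
```

After both merges no virtual edge is left, and the five original edges of the graph are recovered.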
Observe that the merge operation guarantees that the end vertices of a virtual edge are in fact a 2-cut (a split pair), i.e., their removal splits the graph in two or more components. In fact, the skeletons are exactly the triconnected components of H0 discussed in [20]. In the algorithmic description of the multiple edge insertion approximation algorithm [7], an amalgamated version of these trees is considered:
Definition 2 (Con-tree). Let H be a connected planar graph. The con-tree C = C(H) is formed by the BC-tree B(H) which holds SPR-trees T(H0) for all non-trivial blocks H0 of H.
Clearly, the linear-sized con-tree C can be obtained from H in linear time.
3 Planarization Approach
In the planarization scheme sketched above, we can assume that the original graph G is connected, as otherwise the crossing number problem decomposes into multiple independent problems. Furthermore, the initial planar subgraph G0 can be assumed to be maximal (i.e., no edge of G can be added without losing planarity) and hence also connected. For the single edge insertion algorithms, we will usually consider any intermediate graph H; for the multiple edge insertion algorithm we set H := G0.
3.1 Single Edge Insertion
We briefly recapitulate the central ingredients of the exact linear-time algorithm by Gutwenger et al. [18] to solve the single edge insertion problem over all possible embeddings of H. Let v1, v2 be the vertices we want to connect in H via a new edge.
First consider a fixed embedding of H and let HD be its dual. We define an insertion path to be a path in HD connecting a face incident to v1 with a face incident to v2. The length of this path is then the number of edge crossings necessary to insert the edge {v1, v2} into the embedded H along this path; each dual edge in the insertion path corresponds to an edge in H that is to be crossed. We can directly compute the shortest insertion path via standard breadth-first search (BFS).
Now consider H with variable embedding. Let L be the unique shortest path in B(H) from a B-node containing v1 to a B-node containing v2. The optimal insertion path for {v1, v2} in H can be obtained by concatenating the optimal insertion paths within the (non-trivial) blocks on this path L; we can always nest blocks at a common cut vertex into each other such that no additional crossings arise. For a block H0 represented by a B-node on L, let $v_i^{H0}$, i = 1, 2, denote vi if vi ∈ V(H0), or the cut vertex in H0 closest to vi otherwise. It remains to find optimal insertion paths from any face incident to $v_1^{H0}$ to any face incident to $v_2^{H0}$, for each non-trivial block H0.
To this end, let QH0 be the unique shortest path in T(H0) from a skeleton containing $v_1^{H0}$ to a skeleton containing $v_2^{H0}$. It was shown in [18] that only the embeddings of the skeletons along QH0 matter. In a nutshell, the algorithm walks along these skeletons and fixes suitable embeddings for them, one after another. Finally, an optimal embedding is found and fixed, and one can use the simple BFS algorithm on the dual graph to insert the edge {$v_1^{H0}$, $v_2^{H0}$} optimally.
In the following, we can consider a con-chain Q in C(H) of the edge {v1, v2} as an extended version of L, where the “subpaths” QH0 are stored at each non-trivial block H0 along L.
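The fixed-embedding step, a breadth-first search in the dual graph, can be sketched as follows. This is an illustration under stated assumptions, not the linear-time implementation referred to in the text: the embedding is assumed to be given as a list of faces, each face a list of vertices along its boundary, and the function name `shortest_insertion_path` is hypothetical. Edges that have the same face on both sides (bridges) are ignored, since crossing them never helps.

```python
from collections import deque


def shortest_insertion_path(faces, v1, v2):
    """Return the primal edges crossed by a shortest insertion path.

    faces: list of vertex cycles describing a fixed planar embedding.
    The result is a list of frozensets {a, b}; its length is the number
    of crossings needed to insert the edge {v1, v2}.
    """
    # Map each primal edge to the faces on its two sides.
    sides = {}
    for f, cycle in enumerate(faces):
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            sides.setdefault(frozenset((a, b)), []).append(f)

    # Dual graph: two faces are adjacent if they share a primal edge.
    dual = {f: [] for f in range(len(faces))}
    for edge, fs in sides.items():
        if len(fs) == 2 and fs[0] != fs[1]:
            f, g = fs
            dual[f].append((g, edge))
            dual[g].append((f, edge))

    # Multi-source BFS from all faces incident to v1.
    sources = [f for f, cycle in enumerate(faces) if v1 in cycle]
    targets = {f for f, cycle in enumerate(faces) if v2 in cycle}
    pred = {f: None for f in sources}
    queue = deque(sources)
    while queue:
        f = queue.popleft()
        if f in targets:
            crossed = []
            while pred[f] is not None:      # walk back to a source face
                f, edge = pred[f]
                crossed.append(edge)
            return crossed[::-1]
        for g, edge in dual[f]:
            if g not in pred:
                pred[g] = (f, edge)
                queue.append(g)
    return None


# Triangle a, b, c with x inside and y outside, both adjacent to a, b, c.
faces = [["x", "a", "b"], ["x", "b", "c"], ["x", "c", "a"],
         ["y", "a", "b"], ["y", "b", "c"], ["y", "c", "a"]]
path = shortest_insertion_path(faces, "x", "y")
```

In this example, inserting {x, y} requires crossing exactly one edge of the triangle, whereas two vertices on a common face can be connected without any crossing.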
3.2 Multiple Edge Insertion
Let us briefly review the approximation algorithm for the multiple edge insertion problem by Chimani and Hliněný [7]. Let H := G0 be the initial planar subgraph of G into which to insert the edges in $F = \{e_i\}_{1 \le i \le |F|}$. Assume we could independently insert each edge ei ∈ F into H. Using the above algorithm for single edge insertions, we would obtain a con-chain Qi for each edge ei, and therefore a so-called embedding preference for each node on Qi with respect to ei. We obtain a common embedding of H via an iterative voting scheme on the (in general conflicting) embedding preferences per con-tree node; see also Section 4.2. Among other properties, the scheme ensures that at any con-tree node at least one preference is satisfied. After realizing the so-chosen embedding, we can once again use the simple BFS algorithm in the dual graph to insert all edges of F (iteratively) into this fixed embedding.
The part of the algorithm that is crucial for the proof is that any two con-chains Qi, Qj are either disjoint or intersect in one sub-chain. Hence, two con-chains (think of simple paths) “deviate” at no more than two nodes in the con-tree (think of a usual tree): once when the two paths come together and once when they part. It is shown in [7] that the embedding preferences may conflict only very locally at these two “places” (called passes). The exact definition of a pass is quite involved; here, it suffices to state that such a pass might span at most five con-tree nodes. Overall we can bound the number of nodes where some con-chains disagree on the embedding preference, as well as the number of additional crossings necessary to route an edge through a skeleton that is embedded differently than desired. This gives an approximation factor for the optimal multiple edge insertion with respect to G0 and F, and, subsequently, for the crossing number of G.
It remains to clarify what an embedding preference actually is: Observe that S-nodes do not allow different embeddings of their skeletons. For an R-node (a triconnected planar graph), we have only a unique planar embedding and its mirror. For a P-node, each inserted edge may want two particular edges of the skeleton to be cyclically adjacent (in, say, clockwise direction). Finally, for a C-node each inserted edge may want a particular incident face in an adjacent block to be identified with a particular incident face in another adjacent block.
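The structural fact used above, namely that two chains are either disjoint or share a single sub-chain, can be illustrated on an ordinary tree. The sketch below is a simplification: a plain tree given as an adjacency dictionary stands in for the con-tree, and the names `tree_path` and `shared_subchain` are hypothetical.

```python
from collections import deque


def tree_path(tree, s, t):
    """Return the unique path from s to t in a tree (adjacency dict)."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in tree[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return path[::-1]


def shared_subchain(tree, chain_a, chain_b):
    """Return the nodes shared by two tree paths, in the order of chain_a.

    chain_a and chain_b are (start, end) pairs. In a tree, the shared nodes
    always form one contiguous sub-path, which the assertion checks.
    """
    path_a = tree_path(tree, *chain_a)
    on_b = set(tree_path(tree, *chain_b))
    idx = [i for i, node in enumerate(path_a) if node in on_b]
    if idx:
        assert idx == list(range(idx[0], idx[-1] + 1))
    return [path_a[i] for i in idx]


# Path 1-2-3-4-5 with extra leaves 6 (at 2), 7 (at 4) and 8 (at 5).
tree = {1: [2], 2: [1, 3, 6], 3: [2, 4], 4: [3, 5, 7],
        5: [4, 8], 6: [2], 7: [4], 8: [5]}
shared = shared_subchain(tree, (1, 5), (6, 7))
```

Here the chains 1 to 5 and 6 to 7 share the sub-chain 2, 3, 4; they come together at node 2 and part at node 4, which are the only two places where their preferences could conflict.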
4 Engineering
4.1 Iterative Single Edge Insertion and Postprocessing
In the traditional planarization heuristic, we “simply” insert the temporarily removed edges F one after another into the planar subgraph. After each insertion, we replace the arising crossings by dummy vertices, and hence proceed with a planar graph. There are various ways to fine-tune the obtained result via postprocessing, as already discussed in [17]. The simplest variant, and in fact a quite effective one, is to start the insertion process multiple times, each time with a different, randomized order of F. Additionally, each such insertion run can be improved: After having inserted all edges, we can again remove some original edge e from the planarization (i.e., we remove all the subedges and dummy vertices that represent e), and re-insert it, possibly requiring fewer crossings. For this operation, we can consider either the inserted edges F (ins), all edges (all), or the x% of the edges with the most crossings (most x, for some constant x). In [17] it was shown that these approaches lead to greatly improved results.
Herein, we propose a further improvement on these methods. The incremental (inc) strategy basically applies the all strategy after each single insertion step. That is, after the insertion of an edge e ∈ F, we try to remove and reinsert every other edge already in the graph, in order to obtain a better crossing number, before proceeding with the next edge from F. We will see that this approach again dominates the previously best strategy all, though at the cost of a vastly increased running time.
Note that all these strategies, when applied in a fixed embedding setting, are also applicable to the multi-edge insertion problem, after fixing an embedding into which all edges in F need to be inserted. Formally, the inc setting has to restrict itself to only try to reinsert the edges in F, in order to retain the approximation guarantee.
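The difference between the ins, all, and most x strategies lies only in which edges are selected for removal and re-insertion. The following sketch shows that selection step alone; it does not perform any insertion. The function name, the argument layout, and the reading of "x% of the edges" as a percentage of all edges (rounded up) are assumptions made for this example.

```python
import math


def reinsertion_candidates(strategy, all_edges, inserted, crossings, x=25):
    """Select the edges to remove and re-insert in one postprocessing pass.

    strategy:  "ins", "all" or "most"
    all_edges: every original edge of the graph
    inserted:  the edges of F that were inserted into the planar subgraph
    crossings: dict mapping an edge to its current number of crossings
    x:         percentage used by the "most" strategy (assumed parameter)
    """
    if strategy == "ins":
        return list(inserted)
    if strategy == "all":
        return list(all_edges)
    if strategy == "most":
        k = math.ceil(len(all_edges) * x / 100)
        ranked = sorted(all_edges, key=lambda e: crossings.get(e, 0),
                        reverse=True)
        return ranked[:k]
    raise ValueError("unknown strategy: " + strategy)


edges = ["e1", "e2", "e3", "e4"]
inserted = ["e3", "e4"]
crossings = {"e1": 0, "e2": 3, "e3": 1, "e4": 5}
top_half = reinsertion_candidates("most", edges, inserted, crossings, x=50)
```

Under this reading, the inc strategy corresponds to running the "all" selection after every single insertion, restricted to the edges already present at that point.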
Interestingly, after having obtained a postprocessed solution in the fixed embedding, we can run the all postprocessing where the graph’s embedding may change, i.e., using the optimal edge insertion over all possible embeddings! As the solution value never increases, the algorithm retains its approximation guarantee and improves the number of crossings in practice.
From the approximation point of view, we can observe that the first part of the algorithm (fixing a suitable overall embedding) tries to minimize the number of crossings between edges in F and edges in G0, while the postprocessing routines most importantly try to reduce the number of crossings between edges in