Improved Optimal and Approximate Power Graph Compression for Clearer Visualisation of Dense Graphs

Tim Dwyer∗ Christopher Mears† Kerri Morgan‡ Todd Niven§ Kim Marriott¶ Mark Wallace‖

Monash University, Australia

(a) Flat graph with 30 nodes and 65 links.
(b) Power Graph rendering computed using the heuristic of Royer et al. [12]: 7 modules, 36 links.
(c) Power Graph computed using Beam Search (see 5.1) finds 15 modules to reduce the link count to 25.

Figure 1: Three renderings of a network of dependencies between methods, properties and fields in a software system. In the Power Graph renderings an edge between a node and a module implies the node is connected to every member of the module. An edge between two modules implies a bipartite clique. In this way the Power Graph shows the precise connectivity of the directed graph but with much less clutter.
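The reading rule in the caption (an edge to a module stands for an edge to every member, and a module-to-module edge stands for a bipartite clique) can be illustrated with a minimal sketch. The function name `expand_power_edges`, the dictionary layout and the toy data are assumptions made for illustration; they are not taken from the paper or from Figure 1.

```python
from itertools import product


def expand_power_edges(power_edges, modules):
    """Expand power-graph edges into the flat directed edges they imply.

    power_edges: iterable of (source, target) pairs; each endpoint is either
                 a node id or a module name that is a key of `modules`.
    modules:     dict mapping a module name to the set of nodes it contains
                 (a nested module is assumed to list all descendant nodes).
    """
    def members(endpoint):
        # A module stands for all of its members; a plain node for itself.
        return modules.get(endpoint, {endpoint})

    flat = set()
    for src, dst in power_edges:
        # Every member of the source connects to every member of the target.
        flat.update(product(members(src), members(dst)))
    return flat


# Hypothetical toy data: two modules and one plain node "q".
modules = {"M1": {"a", "b"}, "M2": {"x", "y", "z"}}
power_edges = [("M1", "M2"), ("q", "M1")]
flat_edges = expand_power_edges(power_edges, modules)
print(len(power_edges), "power edges imply", len(flat_edges), "flat edges")
```

Because the expansion is deterministic, the flat graph can always be recovered from the power graph, which is the sense in which the compression is lossless.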

ABSTRACT

Drawings of highly connected (dense) graphs can be very difficult to read. Power Graph Analysis offers an alternate way to draw a graph in which sets of nodes with common neighbours are shown grouped into modules. An edge connected to the module then implies a connection to each member of the module. Thus, the entire graph may be represented with much less clutter and without loss of detail. A recent experimental study has shown that such lossless compression of dense graphs makes it easier to follow paths. However, computing optimal power graphs is difficult. In this paper, we show that computing the optimal power-graph with only one module is NP-hard and therefore likely NP-hard in the general case. We give an ILP model for power graph computation and discuss why ILP and CP techniques are poorly suited to the problem. Instead, we are able to find optimal solutions much more quickly using a custom search method. We also show how to restrict this type of search to allow only limited back-tracking to provide a heuristic that has better speed and better results than previously known heuristics.

arXiv:1311.6996v1 [cs.CG] 14 Nov 2013

1 INTRODUCTION

In real-world applications such as biology and software engineering it is common to find network structures that are too dense to visualise in a way that individual links can still be followed. Such graphs occur frequently in nature as power-law or small-world networks. In practice, very dense graphs are often visualised in a way that focuses less on high-fidelity readability of edges and more on highlighting highly-connected nodes or clusters of nodes through techniques such as force-directed layout or abstraction determined by community detection [8]. Dense edge clutter may be alleviated to better show node labels by rendering the edges very faintly or with aggregate techniques like bundling [6]. Though such approaches may give a rough indication of the general graph structure they make following precise edge paths difficult or impossible.

Such path following is even more difficult when graphs are directed. Distinguishing direction on edges allows for up to twice as many distinct edges in the graph. For people trying to understand the graphs to accurately follow directed paths, edge curves must be drawn in enough isolation that any indications of direction (such as tapering, arrowheads or gradients [5]) are clearly visible.

Recently, alternative approaches have been suggested that attempt to retain the fidelity of individual edge paths by introducing drawing conventions that allow a large number of actual edges to be precisely implied by a small set of composite edges. In particular, so-called Power Graph Analysis constructs a hierarchy over nodes, such that nodes with similar neighbour sets are placed in the same group or module. An edge connected to the module then implies a connection to each member of the module. Power Graph Analysis utilises lossless compression since, by contrast with bundling or community-based clustering, no information is lost in the rendering. That is, the technique can be said to be information faithful [9] such that the full graph can be reconstructed by careful inspection of the drawing. Figure 1 gives a small example of the application of Power Graph Analysis techniques to a small software dependency graph.

Power Graph Analysis has been shown to have practical application to visualising biological networks [12], detecting communities in social and biological networks [14] and more recently in software dependency diagrams [3]. A recent user study has shown that, for path-finding tasks, power-graph style groupings were more readable than flat graphs, even for people with very little training [3].
Relatively little attention has been devoted to algorithms for finding power graph decompositions, i.e. the best choice of modules in the power graph. Royer et al. [12] give a heuristic for finding the decomposition and Dwyer et al. [3] give a constraint programming formulation for finding the optimal decomposition with respect to various criteria, such as fewer edges or fewer edges crossing group boundaries. Unfortunately, an empirical evaluation of the two algorithms given in [3] reveals that the heuristic of [12] is not very effective at finding an optimal solution while the constraint programming approach is too slow for practical use, taking days to run for larger graphs.

Thus, we need better methods to find power graph decompositions. This is the subject of this paper. In particular, we focus on finding power-graphs that are optimal or approximately optimal with respect to the number of edges in the decomposition.

In Section 3 we prove that the problem of minimising the number of edges in a power graph decomposition with only one group is NP-hard. This is strong evidence that the general problem is NP-hard, though specific proofs for unrestricted modules and different goal criteria are needed.

In Section 4 we look at declarative models of the Power Graph problem for input into general-purpose solvers. In particular, we offer some refinements to the Constraint Programming model introduced in [3] (Section 4.1) and propose a new Integer Linear Programming model (Section 4.2).

In Section 5 we explore explicit search methods. In Section 5.1 we introduce a beam-search based approximate method that produces power graphs for a given flat graph that are much closer to optimal than previous greedy heuristics. In Section 5.2 we extend this method to a full search strategy that is able to find optimal power graphs relatively efficiently using a lower-bound calculation to cut branches of the search that are not useful. Our experiments (Section 6) show that this approach is orders of magnitude faster than solving the declarative models using generic solvers.

(a) A small graph with 23 edges.
(b) A power graph decomposition of the same graph can be drawn with only 7 edges.

Figure 2: A small example of power-graph decomposition.

2 BACKGROUND AND DEFINITIONS

Royer et al. [12] coined the term Power Graph Analysis to describe a technique for reducing edge clutter by introducing a module hierarchy over an undirected graph, such that edges connected to modules imply a connection to every child of the module. More formally, for a directed graph $G = (V,E)$ a power graph configuration is a set of modules $M$ where each module $m \in M$ is a subset of $V$. We assume that $M$ includes the set of trivial modules $\{\{v\} \mid v \in V\}$.

Our use of the term module is due to the similarity with modules considered in Modular Decomposition. Like Modular Decomposition, modules can be nested to form a hierarchy and if two modules overlap one must be fully contained in the other. That is, for $m,n \in M$, if $m \cap n \neq \emptyset$ then $n \subset m$ or $m \subset n$. As in Modular Decomposition, a power graph decomposition is drawn with a set of representative edges $R$ uniquely specified by $E$ and $M$, such that a representative edge between two modules represents a complete bipartite graph in $E$ over the children of the two modules. That is, for two modules $m$ and $n$, a representative edge $(m,n) \in R$ represents the set of edges $\mathrm{rep}(m,n) = \{(u,v) \mid u \in m, v \in n\}$. In drawings of power graphs, the set of representative edges $R$ must be minimal, i.e. $\nexists\, e_1,e_2 \in R$ s.t. $\mathrm{rep}(e_1) \subset \mathrm{rep}(e_2)$.

Unlike Modular Decomposition, however, the power graph definition above allows a representative edge to span module boundaries. For example, in Figure 2(b), the edge from node 5 to module $\{0,2,4\}$ implies that $E$ contains edges $(5,0),(5,2),(5,4)$ but says nothing about any modules containing node 5. This permits greater compression than modular decomposition, but can also cause confusion when reading the diagram, as considered by Dwyer et al. [3].

Royer et al. introduced a simple heuristic for obtaining the power graph decomposition for a given graph in $O(|V|^3 |E| \log |E|)$ time. Their procedure involved first computing a "candidate module hierarchy" over the nodes in the graph by repeated greedy matching of nodes or modules with similar Jaccard index between their neighbour sets. A second iterative phase involves greedily instantiating the module that reduces the most edges. This is repeated until no more edge reduction is possible.

This algorithm is reasonably fast but, as we show in Section 6, produces results that are far from optimal. The Beam Search method described in Section 5.1 has faster run-time (depending on a beam size parameter) and finds solutions that are very close to optimal. Furthermore, in Section 5.2, we show that it is easy to introduce backtracking into this explicit search method, to produce a search that returns an optimal solution. Such a complete search is orders of magnitude faster than the best we have been able to achieve with generic solvers applied to declarative models, despite extensive experimentation with redundant constraints to limit the search space in those models, as described in Section 4.

3 COMPLEXITY ANALYSIS

In this section we show that computing the optimal power-graph with a single module is NP-complete for undirected bipartite graphs. It then follows that the result holds for directed graphs in general. As an optimal module for a single module case need not be a module in the optimal power-graph with multiple modules (see Figure 3), it seems likely that it is hard in the general case too. Finding an optimal single module is equivalent to finding a biclique subgraph $(A,B)$ (here $A$ and $B$ denote the two independent sets of nodes) that maximises the edge savings, that is, $|A||B| - |A|$, $|A| \le |B|$. The largest part, $B$, is taken to be the single module. This problem is different to the Maximum Vertex Biclique Problem (MVBP) that finds an induced biclique that maximises the number of nodes, $|A| + |B|$, and to the Maximum Edge Biclique Problem (MBP) that finds a biclique subgraph that maximises the number of edges.

Figure 3: The optimal solution for a single module (shaded) is not in the optimal solution for two modules (dashed lines).

Case 1 $|A| > k$: Suppose $|A| = k + c$, $c \in (0,k]$. Then

\[
\begin{aligned}
|A||B| - |A| &\le (k+c)\left(\frac{k(k-1)}{2} + \frac{(k-c)(k-c-1)}{2}\right) - (k+c) \\
&= k^3 - k^2 - k - \frac{k}{2}\left(c^2 + c\right) + \frac{c}{2}\left(c^2 + c - 2\right) \\
&< k^3 - k^2 - k.
\end{aligned}
\]

Case 2 $|A| \le k$: Suppose $|A| = k - c$, $c \in [0,k]$. Now as $G$ has no clique of size $\ge k$, the number of edges in the subgraph of $G$ induced by $V - A$ (and hence $|B'|$) is $\le \frac{(k-2)(k+c)^2}{2(k-1)}$ by Turán's theorem [13]. So

\[
\begin{aligned}
|A||B| - |A| &\le (k-c)\left(\frac{k(k-1)}{2} + \frac{(k-2)(k+c)^2}{2(k-1)}\right) - (k-c) \\
&\le k^3 - k^2 - k - \frac{1}{2(k-1)}\left((c^2+1)k^2 - 2c^2 k + c(c^2-1)(k-2)\right).
\end{aligned}
\]

Figure 4: The construction $G'$.
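The algebra in the two cases can be checked numerically. The sketch below evaluates both bounds with exact rational arithmetic and compares them with the closed forms and with the target value $k^3 - k^2 - k$. The function names are hypothetical, and the check covers only integer values of $c$ over a small range of $k$, so it is a sanity check of the expansions rather than a proof.

```python
from fractions import Fraction


def target(k):
    # The value k^3 - k^2 - k that both cases are compared against.
    return k**3 - k**2 - k


def case1_bound(k, c):
    # (k+c) * ( k(k-1)/2 + (k-c)(k-c-1)/2 ) - (k+c)
    inner = Fraction(k * (k - 1), 2) + Fraction((k - c) * (k - c - 1), 2)
    return (k + c) * inner - (k + c)


def case1_closed(k, c):
    # k^3 - k^2 - k - (k/2)(c^2 + c) + (c/2)(c^2 + c - 2)
    return target(k) - Fraction(k, 2) * (c**2 + c) + Fraction(c, 2) * (c**2 + c - 2)


def case2_bound(k, c):
    # (k-c) * ( k(k-1)/2 + (k-2)(k+c)^2 / (2(k-1)) ) - (k-c)
    inner = Fraction(k * (k - 1), 2) + Fraction((k - 2) * (k + c) ** 2, 2 * (k - 1))
    return (k - c) * inner - (k - c)


def case2_closed(k, c):
    # k^3 - k^2 - k - [ (c^2+1)k^2 - 2c^2 k + c(c^2-1)(k-2) ] / (2(k-1))
    correction = (c**2 + 1) * k**2 - 2 * c**2 * k + c * (c**2 - 1) * (k - 2)
    return target(k) - Fraction(correction, 2 * (k - 1))


all_ok = True
for k in range(3, 40):
    for c in range(1, k + 1):  # Case 1: integer c in (0, k]
        all_ok &= case1_bound(k, c) == case1_closed(k, c)
        all_ok &= case1_closed(k, c) < target(k)
    for c in range(0, k + 1):  # Case 2: integer c in [0, k]
        all_ok &= case2_bound(k, c) == case2_closed(k, c)
        all_ok &= case2_closed(k, c) <= target(k)
print(all_ok)
```

Using `Fraction` avoids floating-point rounding in the division by $2(k-1)$, so the equality checks are exact.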

(c) Beamsearch decomposition k = 1. Decomposition has 6 modules, 19 edges and 17 crossings between edges and module boundaries.
(d) Optimal decomposition obtained with beam search k = 10 (optimality proven with method described in Section 5.2). Decomposition has 6 modules, 16 edges and 13 crossings between edges and module boundaries.

Figure 6: Flat and power graph renderings of a graph with 10 nodes and 51 edges. It is interesting to note that the reduction in clutter from (b) to (c) is fairly obvious; however, the qualitative improvement from (c) to (d) is less dramatic. Layouts were obtained using the yEd software (http://yfiles.com) using a combination of automatic layout and manual refinement with automatic edge routing.

Figure 7: Comparison of cost versus graph size for the various power graph decomposition heuristics. [Plot of Edges in Decomposition against Graph Size (number of nodes) for Beam Search with several beam sizes k and for Greedy Jaccard Clustering.]

Figure 8: Comparison of running times for Beam Search with various beam sizes and Greedy Jaccard Clustering heuristic. [Plot of Run Time (seconds) against Node Count.]

Note that the ILP model only tries to minimize edges, while both variants of the CP model try to break ties by minimising the number of module boundary crossings and number of modules as in [3]. For comparison with the CP model, we implemented two variants of the optimal search: one only minimising edges, and the other breaking ties in the same way as the CP model. The tie-breaking variant is slower because it explores a larger search space and must do more work at each step to compute boundary crossings.

All exhaustive methods were run on an Intel Core i7 2.67 GHz CPU. The ILP solver was run in parallel, with 8 cores running for 20 hours. The ILP model was run with the commercial Gurobi solver (http://www.gurobi.com/), and the CP model was run with the lazy-clause generation solver CPX (http://www.opturion.com/cpx.html). The ILP model and CP model without redundant constraints failed to find the optimal solution. The other three methods found the optimal solution and proved it to be optimal.

7 CONCLUSION

This paper has presented a number of results, both practical and theoretical. On the practical side, the best-first search method (or beam search $BS_1$) is much faster than the previous best available heuristic, the Greedy Jaccard Clustering [12] (JC). For example, a dense graph with 100 nodes and 1500 edges can be decomposed in 0.35 seconds with $BS_1$ compared to 2.74 seconds for JC. This is a significant enough performance improvement to enable new scenarios like continuous decomposition of live-streaming dynamic graph data.

Our experiments have shown that it also computes decompositions that are much closer to optimal, e.g. for the graph above $BS_1$ achieves a decomposition with 624 edges compared to 1078 for JC. We have also shown the applicability of the beam search technique to obtain still more optimal results (e.g. again for the 100 node graph $BS_{10}$ finds a 612 edge decomposition) in reasonable time and memory.

For the optimal power graph decomposition problem, we have contributed the first known ILP model and significant improvements to a previous CP model. While there have been many improvements to the compiler and solver technologies for such mathematical programming techniques this problem still defies simple, efficient declarative models. By contrast, we have provided an explicit search method that is able to find and prove optimality in minutes compared to days. Truly optimal solutions may not be necessary for practical power graph visualisation, but being able to efficiently compute them at least for small instances has proven invaluable in helping us develop better heuristic methods such as the beam search. For example, development of the beam search was in large part motivated by discovering just how bad the Jaccard clustering method was by comparison to optimal decompositions.

On the theoretical side we give the first known NP-hardness proof for a power-graph analysis problem (the one module case) which is strong ground-work for a general NP-hardness proof, though it is also likely that variations of the problem (such as slightly different constraints or goal functions) may require separate proofs or may even be polynomial time. As mentioned above more general complexity analysis is required for the power graph problem and its variants.

8 FURTHER WORK

A related technique for simplifying dense graphs is confluent graph drawing [2]. Like power-graph decomposition this also involves identifying bipartite components. Current confluent graph drawing algorithms have tended to focus on the computability of planar confluent graphs. There has been little focus on developing methods that can do something reasonable when a planar drawing is not possible. We think an optimisation-based approach related to the techniques described in this paper may have some success in this regard.

Finally, we need better layout methods for power-graph decompositions and clustered graphs generally. The examples in this paper were initially arranged with the Y-Files organic layout method but required significant manipulation to achieve pleasing alignment and to minimise crossings between edges.

REFERENCES

[1] B. Bollobás, C. Borgs, J. Chayes, and O. Riordan. Directed scale-free graphs. In Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, SODA '03, pages 132–139. Society for Industrial and Applied Mathematics, 2003.
[2] M. Dickerson, D. Eppstein, M. T. Goodrich, and J. Y. Meng. Confluent drawings: Visualizing non-planar diagrams in a planar way. Graph Algorithms and Applications, 9(1):31–52, 2005.
[3] T. Dwyer, N. Henry Riche, K. Marriott, and C. Mears. Edge compression techniques for visualization of dense directed graphs. IEEE Transactions on Visualization and Computer Graphics, To Appear, 2013. http://marvl.infotech.monash.edu/~timd/Dwyer2013EdgeCompression.pdf
[4] M. R. Garey and D. S. Johnson. Computers and Intractability. Freeman, New York, 1979.
[5] D. Holten, P. Isenberg, J. J. van Wijk, and J. Fekete. An extended evaluation of the readability of tapered, animated, and textured directed-edge representations in node-link graphs. In Proceedings of the Pacific Visualization Symposium (PacificVis), pages 195–202. IEEE, 2011.
[6] D. Holten and J. J. van Wijk. Force-directed edge bundling for graph visualization. In Proceedings of the 11th Eurographics / IEEE - VGTC conference on Visualization, EuroVis'09, pages 983–998, Aire-la-Ville, Switzerland, 2009. Eurographics Association.
[7] D. S. Johnson. The NP-completeness column: an ongoing guide. J. Algorithms, 8:438–448, 1987.
[8] M. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23):8577–8582, 2006.
[9] Q. Nguyen, P. Eades, and S.-H. Hong. On the faithfulness of graph visualizations. In Proc. Pacific Visualization (PacificVis), pages 209–216. IEEE, 2013.
[10] P. Norvig. Paradigms of Artificial Intelligence Programming: Case Studies in Common LISP. Morgan Kaufmann, 1991.
[11] R. Peeters. The maximum edge biclique problem is NP-complete. Discrete Applied Mathematics, 131:651–654, 2003.
[12] L. Royer, M. Reimann, B. Andreopoulos, and M. Schroeder. Unraveling protein networks with power graph analysis. PLoS Computational Biology, 4(7):e1000108, 2008.
[13] P. Turán. On an extremal problem in graph theory. Matematikai és Fizikai Lapok, 48:436–452, 1941.
[14] I. Varlamis and G. Tsatsaronis. Mining potential research synergies from coauthorship graphs using power graph analysis. International Journal of Web Engineering and Technology, 7(3):250–272, 2012.
[15] M. Yannakakis. Node- and edge-deletion NP-complete problems. In Proceedings of the tenth annual ACM symposium on Theory of Computing, STOC '78, pages 253–264, New York, NY, USA, 1978. ACM.
[16] M. Yannakakis. Node-deletion problems on bipartite graphs. SIAM Journal of Computing, 10(2):310–327, 1981.

APPENDIX

For the published version of this paper these additional details will be provided as an on-line tech report.

A ILP MODEL

The following is a complete description of the ILP model used to minimise the number of power edges (or equivalently to maximise the number of edges saved by adding modules).

Input and Parameters

$n$ is the number of vertices of the input graph $G$.
$V = \{0,1,\ldots,n-1\}$ represents the vertices of the input graph $G$.
$e(u,v)$ represents the edges of the input graph $G$ as an adjacency matrix. That is, $e(u,v) = 1$ if $(u,v)$ is an edge of $G$ and $e(u,v) = 0$ otherwise.
$m$ is the number of modules with at least two elements (we consider each singleton vertex to belong to its own module).
$M = \{0,1,\ldots,n+m-1\}$ represents the set of all modules.

Integer decision variables

$\mathrm{sav}[m_1,m_2]$ is the number of edges that may be removed from $G$ (to then be replaced by a single edge) if the modules $m_1$ and $m_2$ are added.

Binary decision variables

$\mathrm{mod}[v,m]$ takes the value 1 if and only if vertex $v$ belongs to the module $m$.
$\mathrm{ind}[v,m]$ takes the value 1 if and only if $(v,u)$ is an edge, for all $u$ in the module $m$.
$\mathrm{bic}[m_1,m_2]$ takes the value 1 if and only if, for every vertex $v \in m_1$ and every vertex $u \in m_2$, the pair $(u,v)$ is an edge in $G$.
$\mathrm{dis}[m_1,m_2]$ takes the value 1 if and only if the modules $m_1$ and $m_2$ are disjoint sets of vertices.
$\mathrm{sub}[m_1,m_2]$ takes the value 1 if and only if the module $m_1$ is a proper subset of the module $m_2$.
$\mathrm{mInd}[v,m_1,m_2]$ takes value 1 if and only if $v \in m_1$ and $\mathrm{ind}[v,m_2] = 1$.
$\mathrm{vMod}[v,m_1,m_2]$ takes value 1 if and only if $v \in m_1$ and $v \in m_2$.
$\mathrm{sVer}[v_1,v_2,m_1,m_2]$ takes value 1 if and only if $(v_1,v_2)$ is an edge with $v_1 \in m_1$ and $v_2 \in m_2$ and the edge $(v_1,v_2)$ can be removed if $m_1$ and $m_2$ are added.
$\mathrm{sMod}[m_1,m_2]$ takes value 1 if and only if $\mathrm{sav}[m_1,m_2] > 0$.

Objective

Maximise $\sum \{\, \mathrm{sav}[m_1,m_2] - \mathrm{sMod}[m_1,m_2] \mid m_1,m_2 \in M,\ m_1 \neq m_2 \,\}$.

Constraints

1. $\sum_{u \in V} (e(v,u) - 1)\,\mathrm{mod}[u,m_1] \ge n(\mathrm{ind}[v,m_1] - 1)$, $v \in V$, $m_1 \in M$.
2. $\sum_{u \in V} (e(v,u) - 1)\,\mathrm{mod}[u,m_1] \le \mathrm{ind}[v,m_1] - 1$, $v \in V$, $m_1 \in M$.
3. $\mathrm{mInd}[v,m_1,m_2] \le \mathrm{mod}[v,m_1]$, $v \in V$, $m_1 \neq m_2 \in M$.
4. $\mathrm{mInd}[v,m_1,m_2] \le \mathrm{ind}[v,m_2]$, $v \in V$, $m_1 \neq m_2 \in M$.
5. $\mathrm{mInd}[v,m_1,m_2] \ge \mathrm{mod}[v,m_1] + \mathrm{ind}[v,m_2] - 1$, $v \in V$, $m_1 \neq m_2 \in M$.
6. $\sum_{v \in V} (\mathrm{mInd}[v,m_1,m_2] - \mathrm{mod}[v,m_1]) \ge n(\mathrm{bic}[m_1,m_2] - 1)$, $m_1 \neq m_2 \in M$.
7. $\sum_{v \in V} (\mathrm{mInd}[v,m_1,m_2] - \mathrm{mod}[v,m_1]) \le \mathrm{bic}[m_1,m_2] - 1$, $m_1 \neq m_2 \in M$.
8. $\mathrm{vMod}[v,m_1,m_2] \le \mathrm{mod}[v,m_1]$, $v \in V$, $m_1 \neq m_2 \in M$.
9. $\mathrm{vMod}[v,m_1,m_2] \le \mathrm{mod}[v,m_2]$, $v \in V$, $m_1 \neq m_2 \in M$.
10. $\mathrm{vMod}[v,m_1,m_2] \ge \mathrm{mod}[v,m_1] + \mathrm{mod}[v,m_2] - 1$, $v \in V$, $m_1 \neq m_2 \in M$.
11. $\sum_{v \in V} \mathrm{vMod}[v,m_1,m_2] \le n(1 - \mathrm{dis}[m_1,m_2])$, $m_1 \neq m_2 \in M$.
12. $\sum_{v \in V} \mathrm{vMod}[v,m_1,m_2] \ge 1 - \mathrm{dis}[m_1,m_2]$, $m_1 \neq m_2 \in M$.
13. $\mathrm{dis}[m_1,m_2] = \mathrm{dis}[m_2,m_1]$, $m_1 \neq m_2 \in M$.
14. $\sum_{v \in V} (\mathrm{vMod}[v,m_1,m_2] - \mathrm{mod}[v,m_1]) \ge n(\mathrm{sub}[m_1,m_2] - 1)$, $m_1 \neq m_2 \in M$.
15. $\sum_{v \in V} (\mathrm{vMod}[v,m_1,m_2] - \mathrm{mod}[v,m_1]) \le \mathrm{sub}[m_1,m_2] - 1$, $m_1 \neq m_2 \in M$.
16. $\sum_{v \in V} \mathrm{vMod}[v,m_1,m_2] \le \left(\sum_{v \in V} \mathrm{mod}[v,m_2]\right) - \mathrm{sub}[m_1,m_2]$, $m_1 \neq m_2 \in M$.
17. $\mathrm{dis}[m_1,m_2] + \mathrm{sub}[m_1,m_2] + \mathrm{sub}[m_2,m_1] = 1$, $m_1 \neq m_2 \in M$.
18. $\mathrm{sVer}[v_1,v_2,m_1,m_2] \le e(v_1,v_2)$, $v_1,v_2 \in V$, $m_1,m_2 \in M$.
19. $\mathrm{sVer}[v_1,v_2,m_1,m_2] \le \mathrm{mod}[v_1,m_1]$, $v_1,v_2 \in V$, $m_1,m_2 \in M$.
20. $\mathrm{sVer}[v_1,v_2,m_1,m_2] \le \mathrm{mod}[v_2,m_2]$, $v_1,v_2 \in V$, $m_1,m_2 \in M$.
21. $\mathrm{sVer}[v_1,v_2,m_1,m_2] \le \mathrm{bic}[m_1,m_2]$, $v_1,v_2 \in V$, $m_1,m_2 \in M$.
22. $\sum \{\, \mathrm{sVer}[v_1,v_2,m_1,m_2] \mid m_1,m_2 \in M,\ m_1 \neq m_2 \,\} \le 1$, $v_1,v_2 \in V$.
23. $\mathrm{sav}[m_1,m_2] \le \sum \{\, \mathrm{sVer}[v_1,v_2,m_1,m_2] \mid v_1,v_2 \in V,\ v_1 \neq v_2 \,\}$, $m_1 \neq m_2 \in M$.
24. $\mathrm{sMod}[m_1,m_2] \le \mathrm{sav}[m_1,m_2]$, $m_1 \neq m_2 \in M$.
25. $\mathrm{sav}[m_1,m_2] \le n^2\,\mathrm{sMod}[m_1,m_2]$, $m_1 \neq m_2 \in M$.
26. $\sum_{v \in V} \mathrm{mod}[v,m_1] = 1$, $m_1 \in V$.
27. $\mathrm{mod}[m_1,m_1] = 1$, $m_1 \in V$.

Constraints 1 and 2 define the variables $\mathrm{ind}[v,m]$. Constraints 3 to 5 define $\mathrm{mInd}[v,m_1,m_2]$. Constraints 6 and 7 define $\mathrm{bic}[m_1,m_2]$. Constraints 8 to 10 define $\mathrm{vMod}[v,m_1,m_2]$. Constraints 11 to 13 define $\mathrm{dis}[m_1,m_2]$. Constraints 14 to 16 define $\mathrm{sub}[m_1,m_2]$. Constraint 17 says that for any two distinct modules $m_1$ and $m_2$, either $m_1$ and $m_2$ are disjoint, or one is a proper subset of the other. Constraints 18 to 21 define the variables $\mathrm{sVer}[v_1,v_2,m_1,m_2]$. Constraint 22 says that no edge can be counted twice in the saving calculation. Constraint 23 defines the variables $\mathrm{sav}[m_1,m_2]$. Constraints 24 and 25 define $\mathrm{sMod}[m_1,m_2]$. Constraints 26 and 27 force each vertex to be a singleton module.
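The linear constraints of the ILP model encode logical conditions on binary variables, and for small sizes this can be confirmed by exhaustive enumeration. The sketch below does so for two representative groups: constraints 8 to 10 (a three-inequality encoding of a logical AND) and constraints 1 and 2 (a big-M style encoding of "v is adjacent to every member of the module"). The helper names and the choice of $n = 4$ are assumptions made for illustration; no ILP solver is involved.

```python
from itertools import product


def feasible_vmod(mod1, mod2):
    """Binary values of vMod allowed by constraints 8, 9 and 10."""
    return [x for x in (0, 1)
            if x <= mod1 and x <= mod2 and x >= mod1 + mod2 - 1]


def feasible_ind(e_row, mod_col):
    """Binary values of ind[v, m1] allowed by constraints 1 and 2.

    e_row[u]   = e(v, u) for one fixed vertex v.
    mod_col[u] = mod[u, m1] for one fixed module m1.
    """
    n = len(e_row)
    s = sum((e - 1) * m for e, m in zip(e_row, mod_col))
    return [x for x in (0, 1) if s >= n * (x - 1) and s <= x - 1]


# Constraints 8-10 leave exactly one feasible value: mod1 AND mod2.
and_ok = all(feasible_vmod(a, b) == [a & b]
             for a, b in product((0, 1), repeat=2))

# Constraints 1-2 leave exactly one feasible value: 1 if and only if
# every member u of the module satisfies e(v, u) = 1.
n = 4
ind_ok = all(
    feasible_ind(e_row, mod_col)
    == [int(all(e for e, m in zip(e_row, mod_col) if m))]
    for e_row in product((0, 1), repeat=n)
    for mod_col in product((0, 1), repeat=n)
)
print(and_ok, ind_ok)
```

Under this encoding an empty module yields $\mathrm{ind} = 1$ vacuously, which matches the "for all $u$ in the module" wording of the variable definition.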