Graph Visualization
Yifan Hu and Martin Nöllenburg
Synonyms
• Graph Drawing
• Graph Layout
• Network Visualization

Definition
Graph visualization is an area of mathematics and computer science, at the intersection of geometric graph theory and information visualization. It is concerned with visual representation of graphs that reveals structures and anomalies that may be present in the data, and helps the user to understand and reason about the graphs.

Overview
Graph visualization is concerned with visual representations of graph or network data. Effective graph visualization reveals structures that may be present in the graphs, and helps the users to understand and analyze the underlying data.
A graph consists of nodes and edges. It is a mathematical structure describing relations among a set of entities, where a node represents an entity, and an edge exists between two nodes if the two corresponding entities are related.
A graph can be described by writing down the nodes and the edges. For example, this is a social network of people and how they relate to each other:
{Andre ↔ Beverly, Andre ↔ Diane, Andre ↔ Fernando, Beverly ↔ Garth, Beverly ↔ Ed, Carol ↔ Andre, Carol ↔ Diane, Carol ↔ Fernando, Diane ↔ Beverly, Diane ↔ Garth, Diane ↔ Ed, Farid ↔ Aadil, Farid ↔ Latif, Farid ↔ Izdihar, Fernando ↔ Diane, Fernando ↔ Garth, Fernando ↔ Heather, Garth ↔ Ed,
Garth ↔ Heather, Heather ↔ Jane, Izdihar ↔ Mawsil, Jane ↔ Farid, Jane ↔ Aadil, Latif ↔ Aadil, Mawsil ↔ Latif}.
This social network tells us that “Farid” is a friend of “Aadil”, “Latif” is a friend of “Aadil”, and so on. However, this mathematical notation does not immediately convey the structure of the network. On the other hand, Fig. 1 shows a visualization of this graph. We can see at a glance that this graph has two clusters. This example illustrates that graph visualization can give us an overall sense of the data. It reveals structures and anomalies, and helps us to ask questions that can in turn be answered through interacting with the visualization itself, or by writing algorithms to mine the data for evidence seen in the visualization.

Fig. 1 Graph visualization of a small social network.

In this chapter, a graph G = (V,E) consists of a set of nodes (vertices) V, and a set of edges E, which are pairs of nodes. Denote by n = |V| and m = |E| the number of nodes and edges, respectively. If there is an edge from node i to node j, we denote that as i → j. If the graph is undirected, then we denote the edge as i ↔ j, and call i and j neighboring (or adjacent) nodes.

Node-Link Diagrams and Layout Aesthetics
By far the most common type of graph layout is the so-called node-link diagram as seen in Fig. 1. Here nodes are represented by points or simple geometric shapes like ellipses or rectangles, whereas edges are drawn as simple curves linking the corresponding pair of nodes. In this chapter we restrict our attention to node-link diagrams; for alternative types of graph representations see the survey of von Landesberger et al. (2011).
The algorithmic graph layout problem consists in finding node positions in the plane (or in 3-dimensional space) and edge representations as straight lines or simple curves such that the resulting layout faithfully depicts the graph and certain aesthetic quality criteria are satisfied and optimized. We list the most common criteria. The influence of most of them on human graph reading tasks has been empirically confirmed (Purchase 1997).
• crossings: the fewer edge crossings the better (a layout without crossings exists only for planar graphs)
• bends: the fewer edge bends the better; ideally edges are straight-line
• edge lengths: use as uniform edge lengths as possible
• angular resolution: angles between edges at the same node should be large
• crossing angles: angles of pairs of crossing edges should be large
• area and aspect ratio: the layout should be as compact as possible
• edge slopes: few and regularly spaced edge slopes should be used (e.g., orthogonal layouts use only horizontal and vertical edges)
• neighborhood: neighbors of each node in the graph should be neighbors in the layout as well
The above list is not comprehensive and there may be additional application-specific constraints and criteria or global aspects such as displaying symmetries. Moreover, some criteria may contradict each other. Typically only a subset of criteria is optimized by a graph layout algorithm and trade-offs between different criteria need to be considered.

Key Research Findings

Undirected Graph Drawing
The layout algorithms and techniques presented in this section do not make use of edge directions, but they can obviously be applied to both undirected and directed graphs. Undirected graphs are often represented by node-link diagrams with straight-line edges. We denote by $x_i$ the location of node i in the layout. Here $x_i$ is a point in 2- or 3-dimensional Euclidean space.
Spring embedders (Eades 1984; Fruchterman and Reingold 1991; Kamada and Kawai 1989) are the most widely used layout algorithms for undirected graphs. They attempt to find aesthetic node placement by representing the problem as one of minimizing the energy of a physical system. The guiding principles are that nodes that are connected by an edge should be near each other, while no nodes should be too close to each other (cf. neighborhood aesthetic). Depending on the exact physical model, we could further divide the spring embedders into two types. For convenience, we call the first type Spring-electrical model, and the second type Spring/Stress model.

Spring-electrical model
This model was first introduced by Peter Eades (1984). A widely used variant, which is given below, is due to Fruchterman and Reingold (1991). The model is best understood as a system of springs and electrical charges; we therefore call it the spring-electrical model, to differentiate it from the spring/stress model that relies on springs only, even though historically both are called spring embedders.
In this model, each edge is replaced by a spring with an ideal length of 0, which pulls nodes that share an edge together. At the same time, imagine that nodes have the same type of electrical charges (e.g., positive) that push them apart. Specifically, there is an attractive spring force exerted on node i from its neighbor j, which is proportional to the squared distance between these two nodes,

$$F_a(i,j) = -\frac{\|x_i - x_j\|^2}{K}\,\frac{x_i - x_j}{\|x_i - x_j\|}, \quad i \leftrightarrow j, \tag{1}$$

where K is a parameter related to the nominal edge length of the final layout. The repulsive electrical force exerted on node i from any node j is inversely proportional to the distance between these two nodes,

$$F_r(i,j) = \frac{K^2}{\|x_i - x_j\|}\,\frac{x_i - x_j}{\|x_i - x_j\|}, \quad i \neq j. \tag{2}$$

The spring-electrical model can be solved with a force-directed procedure by starting from an initial (e.g., random) layout, calculating the combined attractive and repulsive forces on each node, and moving the nodes along the direction of the force for a certain step length. This process is repeated, with the step length decreasing every iteration, until the layout stabilizes. This procedure is formally stated in Algorithm 1.
The spring-electrical model as described by equations (1) and (2) cannot be used directly for large graphs. The repulsive force exists on all pairs of nodes, so the computational complexity of each iteration is quadratic in the number of nodes. Force approximation techniques based on space decomposition data structures, such as the Barnes-Hut algorithm, can be used to approximate the repulsive forces efficiently (Tunkelang 1999; Quigley 2001; Hachul and Jünger 2004).
For large graphs, the force-directed algorithm, which uses the steepest descent process and re-positions one node at a time to minimize the energy locally, is likely to get stuck in local minima, because the physical system of springs and electrical charges can have many local minimum configurations. This can be overcome by the multilevel technique. In this technique, a sequence of smaller and smaller graphs is generated from the original graph, each capturing the essential connectivity information of its parent. The force-directed algorithm can be applied to this sequence of graphs, from small to large, each time using the layout of the smaller graph as the start layout for the larger graph. Combining the multilevel algorithm and the force approximation technique, algorithms based on the spring-electrical model can be used to lay out graphs with millions of nodes and edges in seconds (Walshaw 2003; Hu 2005).

Spring/Stress Model
The spring model, also known as the stress model, assumes that there are springs connecting all pairs of nodes in the graph, with the ideal spring length equal to the graph theoretical distance between the nodes. The spring energy, also known as the stress, of this spring system is

$$\sum_{i \neq j} w_{ij} \left( \|x_i - x_j\| - d_{ij} \right)^2, \tag{3}$$

where $d_{ij}$ is the ideal distance between nodes i and j. The layout that minimizes the above stress energy is an optimal layout of the graph. Typically $w_{ij} = 1/d_{ij}^2$.
The spring model was proposed by Kamada and Kawai (1989) in graph drawing, although it dates back to Multidimensional Scaling (MDS) (Kruskal 1964; Kruskal and Seery 1980), and the term MDS is sometimes used to describe the embedding process based on the stress model.
There are several ways to minimize the spring energy (3). A force-directed algorithm could be applied, where the
Algorithm 1 ForceDirectedAlgorithm(G,x,tol,K)
input: graph G = (V,E), initial positions x, tolerance tol, and nominal edge length K
set step = initial step length
repeat
    x0 = x
    for (i ∈ V) {
        f = 0                                  // f is a 2/3D vector
        for (j ↔ i, j ∈ V) f ← f + Fa(i, j)    // attractive force, see equation (1)
        for (j ≠ i, j ∈ V) f ← f + Fr(i, j)    // repulsive force, see equation (2)
        xi ← xi + step ∗ (f/||f||)             // update position of node i
    }
until (||x − x0|| < tol ∗ K)
return x
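As an illustration, Algorithm 1 and the forces of equations (1) and (2) could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: positions are 2D tuples, the function names are hypothetical, the initial step length is taken to be K, the step length shrinks by an assumed factor of 0.9 per iteration (the text only says that it decreases), and a max_iter guard is added that the pseudocode does not have.

```python
import math


def attractive_force(xi, xj, K):
    """Equation (1): spring force on node i from neighbor j, pulling i toward j."""
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)  # a spring of length zero exerts no force
    magnitude = dist ** 2 / K
    return (-magnitude * dx / dist, -magnitude * dy / dist)


def repulsive_force(xi, xj, K):
    """Equation (2): electrical force on node i from node j, pushing i away from j."""
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)  # assumption: skip coincident nodes (direction undefined)
    magnitude = K ** 2 / dist
    return (magnitude * dx / dist, magnitude * dy / dist)


def force_directed_layout(edges, x, tol, K, step=None, decay=0.9, max_iter=1000):
    """Sketch of Algorithm 1 for an undirected graph given as (i, j) pairs.

    x maps every node to its initial (x, y) position; a new dict is returned.
    step, decay and max_iter are illustrative assumptions, not from the text.
    """
    x = dict(x)
    neighbors = {i: set() for i in x}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    step = K if step is None else step  # assumed initial step length
    for _ in range(max_iter):
        moved_sq = 0.0  # accumulates ||x - x0||^2 over this sweep
        for i in x:
            fx = fy = 0.0
            for j in x:
                if j == i:
                    continue
                rx, ry = repulsive_force(x[i], x[j], K)
                fx, fy = fx + rx, fy + ry
                if j in neighbors[i]:
                    ax, ay = attractive_force(x[i], x[j], K)
                    fx, fy = fx + ax, fy + ay
            norm = math.hypot(fx, fy)
            if norm > 0.0:
                # Move node i by `step` along the unit force direction.
                x[i] = (x[i][0] + step * fx / norm, x[i][1] + step * fy / norm)
                moved_sq += step ** 2
        if math.sqrt(moved_sq) < tol * K:
            break
        step *= decay  # assumed geometric step-length schedule
    return x


if __name__ == "__main__":
    start = {"a": (0.0, 0.0), "b": (1.0, 0.2), "c": (0.3, 1.0)}
    layout = force_directed_layout([("a", "b"), ("b", "c")], start, tol=0.01, K=1.0)
    for node, (px, py) in layout.items():
        print(node, round(px, 3), round(py, 3))
```

A quick sanity check on the model: for two nodes joined by a single edge, the forces balance when d²/K = K²/d, that is, at distance d = K, so such a pair should settle at roughly the nominal edge length.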