Clustering and Centrality for Graph Visualization


Sergio Camelo (camelo@stanford.edu)

Problem

One of the most common techniques for graph visualization is force-directed drawing. It consists of associating the graph with a mechanical system and then finding a low-energy equilibrium state of that system.

In this work we build on top of the d3-force engine to emphasize the centrality and community structure of the graph. To do so we construct new forces and exploit color, size and context.

Motivation

Centrality measures reveal the "most important" nodes of a graph (which could refer to the most central, the most popular, or the most powerful nodes, depending on the definition), while community detection attempts to find a "natural partition" of the graph into different communities.

Both properties are very important for understanding graph structure, but they are usually not built into graph visualization engines. Even when they are, they usually do not interact with the visualization as new forces (i.e. they do not affect position), but only with color and size.

Figure 1: Force-directed visualization of Les-Miserables.

Clustering

We detect communities through modularity optimization. Communities are color-coded and they can be exploded with a ctrl-click to reveal inner structure.

Figure 2: Visualizing clusters. The exploded community corresponds to the enemies of Valjean, the main character.

Centrality

We calculate centrality using the page-rank algorithm. We also calculate personalized page-rank centrality, which can be activated by clicking on any node.

Figure 3: Personalized page-rank for the character Marius. Bigger nodes correspond to characters that are more central to him.

Interface

The user gets context by hovering over each node to get a description of the character. We include a clustering force, which repels nodes from different communities, and a gravity force, which attracts more central nodes to the center of the graph. The user can set the strength of these forces and the average edge length.

Conclusion

Centrality and community detection improve the understanding of network data, especially if context is provided (otherwise, it is difficult to explain partitions and importance). Results, however, depend heavily on using appropriate algorithms. In the case of the Les Miserables data, for example, few algorithms gave results as good as modularity maximization.

Color and node size show an improvement over the raw force-directed graph. It is not clear, however, that forces like gravity and cluster repulsion allow us to identify new information.

Future Work

Some fruitful directions for future research are:
• Implementing fast centrality and clustering algorithms for big graphs
• Building a fast approximate Verlet integration engine for graphs on many nodes
• Creating graphs from web data, while automatically extracting context information

References

[1] B. Baingana, G. Giannakis. Embedding Graphs under Centrality Constraints for Network Visualization. arXiv:1401.4408
[2] M. Bannister, D. Eppstein, M. Goodrich, L. Trott. Force-Directed Graph Drawing Using Social Gravity and Scaling. arXiv:1209.0748
[3] M. Bostock. D3 force-engine. https://github.com/d3/d3-force
[4] M. Bostock. Les-Miserables graph. https://bl.ocks.org/mbostock/4062045
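
As a rough illustration of the centrality step described in the poster, the sketch below computes page-rank and personalized page-rank by power iteration. It is not the poster's implementation: the function name `pagerank`, the damping factor of 0.85, the convergence tolerance, and the toy graph are all assumptions made for this example.

```python
def pagerank(adj, damping=0.85, personalization=None, tol=1e-12, max_iter=500):
    """Power-iteration page-rank (illustrative sketch, not the poster's code).

    adj maps each node to a list of its out-neighbors; for an undirected
    graph every edge is listed in both directions.
    personalization optionally maps nodes to non-negative teleport weights;
    putting all weight on one node gives page-rank personalized to that node.
    """
    nodes = list(adj)
    if personalization is None:
        teleport = {v: 1.0 / len(nodes) for v in nodes}
    else:
        total = sum(personalization.get(v, 0.0) for v in nodes)
        teleport = {v: personalization.get(v, 0.0) / total for v in nodes}

    rank = dict(teleport)
    for _ in range(max_iter):
        # Rank held by nodes without out-links is redistributed via teleport.
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new_rank = {
            v: (1.0 - damping + damping * dangling) * teleport[v] for v in nodes
        }
        for u in nodes:
            if adj[u]:
                share = damping * rank[u] / len(adj[u])
                for v in adj[u]:
                    new_rank[v] += share
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy graph (not the Les Miserables data): a hub with three
# neighbors, one of which ("c") has an extra neighbor "d".
toy_graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}

global_rank = pagerank(toy_graph)
# Personalized page-rank: all teleport mass goes to the clicked node "d".
personal_rank = pagerank(toy_graph, personalization={"d": 1.0})
```

In a visualization such as the one described, the resulting scores could then be mapped to node radius, so that clicking a node switches from `global_rank` to the ranking personalized to that node.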
Recommended publications
  • NetworkX Tutorial
    5.03.2020 tutorial
    NetworkX tutorial. Source: https://github.com/networkx/notebooks. Minor corrections: JS, 27.02.2019.
    Creating a graph
    Create an empty graph with no nodes and no edges.
        import networkx as nx
        G = nx.Graph()
    By definition, a Graph is a collection of nodes (vertices) along with identified pairs of nodes (called edges, links, etc.). In NetworkX, nodes can be any hashable object, e.g. a text string, an image, an XML object, another Graph, a customized node object, etc. (Note: Python's None object should not be used as a node, as it determines whether optional function arguments have been assigned in many functions.)
    Nodes
    The graph G can be grown in several ways. NetworkX includes many graph generator functions and facilities to read and write graphs in many formats. To get started, though, we'll look at simple manipulations. You can add one node at a time,
        G.add_node(1)
    add a list of nodes,
        G.add_nodes_from([2, 3])
    or add any nbunch of nodes. An nbunch is any iterable container of nodes that is not itself a node in the graph (e.g. a list, set, graph, file, etc.).
        H = nx.path_graph(10)
        G.add_nodes_from(H)
    Note that G now contains the nodes of H as nodes of G. In contrast, you could use the graph H as a node in G.
  • NetworkX: Network Analysis with Python
    NetworkX: Network Analysis with Python. Salvatore Scellato. Full tutorial presented at the XXX SunBelt Conference: "NetworkX introduction: Hacking social networks using the Python programming language" by Aric Hagberg & Drew Conway.
    Outline: 1. Introduction to NetworkX 2. Getting started with Python and NetworkX 3. Basic network analysis 4. Writing your own code 5. You are ready for your project!
    1. Introduction to NetworkX
    Network analysis. Vast amounts of network data are being generated and collected:
    • Sociology: web pages, mobile phones, social networks
    • Technology: Internet routers, vehicular flows, power grids
    How can we analyze these networks?
    Python awesomeness. NetworkX is a "Python package for the creation, manipulation and study of the structure, dynamics and functions of complex networks."
    • Data structures for representing many types of networks, or graphs
    • Nodes can be any (hashable) Python object, edges can contain arbitrary data
    • Flexibility ideal for representing networks found in many different fields
    • Easy to install on multiple platforms
    • Online up-to-date documentation
    • First public release in April 2005
    Design requirements:
    • Tool to study the structure and dynamics of social, biological, and infrastructure networks
    • Ease-of-use and rapid development in a collaborative, multidisciplinary environment
    • Easy to learn, easy to teach
    • Open-source tool base that can easily grow in a multidisciplinary environment with non-expert users
  • GraphPrism: Compact Visualization of Network Structure
    GraphPrism: Compact Visualization of Network Structure. Sanjay Kairam, Diana MacLean, Manolis Savva, Jeffrey Heer. Stanford University Computer Science Department. {skairam, malcdi, msavva, jheer}@cs.stanford.edu
    ABSTRACT
    Visual methods for supporting the characterization, comparison, and classification of large networks remain an open challenge. Ideally, such techniques should surface useful structural features (such as effective diameter, small-world properties, and structural holes) not always apparent from either summary statistics or typical network visualizations. In this paper, we present GraphPrism, a technique for visually summarizing arbitrarily large graphs through combinations of 'facets', each corresponding to a single node- or edge-specific metric (e.g., transitivity). We describe a generalized approach for constructing facets by calculating distributions of graph metrics over increasingly large local neighborhoods and representing these as a stacked multi-scale histogram. Evaluation with paper prototypes shows that, with minimal training, static GraphPrism diagrams can aid network analysis experts in performing basic analysis tasks with network data. Finally, we contribute the design of an interactive system using linked selection between GraphPrism overviews and node-link detail views. Using a case study of data from a co-authorship network, we illustrate how GraphPrism facilitates interactive exploration of network data.
    Figure 1: GraphPrism and node-link diagrams for the largest component of a co-authorship graph. Facets shown: Connectivity, Transitivity, Density, Conductance, Jaccard, MeetMin. Created by Sanjay Kairam. Visualization using D3.
    [...] compactly summarizing networks of arbitrary size.
  • Graph Simplification and Interaction
    Graph Clarity, Simplification, & Interaction
    http://i.imgur.com/cW19IBR.jpg https://www.reddit.com/r/CableManagement/
    Today
    • Today's Reading: Lombardi Graphs
      – Bezier Curves
    • Today's Reading: Clustering/Hierarchical Edge Bundling
      – Definition of Betweenness Centrality
    • Emergency Management Graph Visualization
      – Sean Kim's masters project
    • Reading for Tuesday & Homework 3
    • Graph Interaction Brainstorming Exercise
    "Lombardi drawings of graphs", Duncan, Eppstein, Goodrich, Kobourov, Nöllenburg, Graph Drawing 2010
    • Circular arcs
    • Perfect angular resolution (edges form equal angles at vertices)
    • Arcs only intersect 2 vertices (at endpoints)
    • (not required to be crossing free)
    • Vertices may be constrained to lie on circle or concentric circles
    • People are more patient with aesthetically pleasing graphs (will spend longer studying to learn/draw conclusions)
    • What about relaxing the circular arc requirement and allowing Bezier arcs?
    • How does it scale to larger data?
    • Long curved arcs can be much harder to follow
    • Circular layout of nodes is often very good!
    • Would like more pseudocode
    Cubic Bézier Curve
    • 4 control points
    • Curve passes through first & last control point
    • Curve is tangent at P0 to (P1-P0) and at P3 to (P3-P2)
    http://www.e-cartouche.ch/content_reg/cartouche/graphics/en/html/Curves_learningObject2.html
    http://www.webreference.com/dlab/9902/bezier.html
    "Force-directed Lombardi-style graph drawing", Chernobelskiy et al., Graph Drawing 2011.
    • Relaxation of the Lombardi Graph requirements
    • "straight-line segments
  • Constrained Graph Drawing
    Constrained Graph Drawing. Dissertation submitted for the academic degree of Doctor of Natural Sciences (Doktor der Naturwissenschaften) by Barbara Pampel, née Schlieper, at the Universität Konstanz, Mathematisch-Naturwissenschaftliche Sektion, Fachbereich Informatik und Informationswissenschaft. Date of the oral examination: 14 July 2011. First referee: Prof. Dr. Ulrik Brandes. Second referee: Prof. Dr. Michael Kaufmann.
    Parts of this thesis are based on publications that resulted from collaboration with other researchers. Substantial contributions were made to all of this content. Chapter 3 (Bachmaier, Brandes, and Schlieper, 2005; Brandes and Schlieper, 2009). Chapter 4 (Brandes and Pampel, 2009). Chapter 6 (Brandes, Cornelsen, Pampel, and Sallaberry, 2010b).
    Summary
    Networks are used in a wide variety of research areas to represent relational data. With suitable mathematical methods, these networks can be represented as graphs. A graph is a structure made up of nodes and of edges that connect the nodes. Both the edges and the nodes can carry further information. This information can be attached to the individual elements, but it can also arise from the arrangement and distribution of the elements. With algorithms (structured sequences of instructions) from the field of graph drawing, a wide variety of information from different research areas can be visualized. Graphical representations can
  • Chapter 18 Spectral Graph Drawing
    Chapter 18: Spectral Graph Drawing
    18.1 Graph Drawing and Energy Minimization
    Let $G = (V, E)$ be some undirected graph. It is often desirable to draw a graph, usually in the plane but possibly in 3D, and it turns out that the graph Laplacian can be used to design surprisingly good methods. Say $|V| = m$. The idea is to assign a point $\rho(v_i)$ in $\mathbb{R}^n$ to the vertex $v_i \in V$, for every $v_i \in V$, and to draw a line segment between the points $\rho(v_i)$ and $\rho(v_j)$. Thus, a graph drawing is a function $\rho\colon V \to \mathbb{R}^n$.
    We define the matrix of a graph drawing $\rho$ (in $\mathbb{R}^n$) as an $m \times n$ matrix $R$ whose $i$th row consists of the row vector $\rho(v_i)$ corresponding to the point representing $v_i$ in $\mathbb{R}^n$. Typically, we want $n < m$; in fact $n$ should be much smaller than $m$. A representation is balanced iff the sum of the entries of every column is zero, that is, $\mathbf{1}^\top R = 0$. If a representation is not balanced, it can be made balanced by a suitable translation. We may also assume that the columns of $R$ are linearly independent, since any basis of the column space also determines the drawing. Thus, from now on, we may assume that $n \le m$.
    Remark: A graph drawing $\rho\colon V \to \mathbb{R}^n$ is not required to be injective, which may result in degenerate drawings where distinct vertices are drawn as the same point. For this reason, we prefer not to use the terminology graph embedding, which is often used in the literature. This is because in differential geometry, an embedding always refers to an injective map. The term graph immersion would be more appropriate.
  • Graph Drawing by Stochastic Gradient Descent
    Graph Drawing by Stochastic Gradient Descent. Jonathan X. Zheng, Samraat Pawar, Dan F. M. Goodman.
    Abstract: A popular method of force-directed graph drawing is multidimensional scaling using graph-theoretic distances as input. We present an algorithm to minimize its energy function, known as stress, by using stochastic gradient descent (SGD) to move a single pair of vertices at a time. Our results show that SGD can reach lower stress levels faster and more consistently than majorization, without needing help from a good initialization. We then show how the unique properties of SGD make it easier to produce constrained layouts than previous approaches. We also show how SGD can be directly applied within the sparse stress approximation of Ortmann et al. [1], making the algorithm scalable up to large graphs.
    Index Terms: Graph drawing, multidimensional scaling, constraints, relaxation, stochastic gradient descent.
    1 Introduction
    Graphs are a common data structure, used to describe everything from social networks to food webs, from metabolic pathways to internet traffic. Any set of pairwise relationships between entities can be described by a graph, and the ever increasing amount of data being collected means that visualizing graphs for exploratory analysis has become an important task. Node-link diagrams are an intuitive representation of [...]
    [...] to the shortest path distance between vertices $i$ and $j$, with $w_{ij} = d_{ij}^{-2}$ to offset the extra weight given to longer paths due to squaring the difference [3]. This definition was popularized for graph layout by Kamada and Kawai [4], who minimized the function using a localized 2D Newton-Raphson method, while within the MDS community Kruskal [5] originally used gradient descent [6].
  • NetworkX: Network Analysis with Python
    NetworkX: Network Analysis with Python. Dionysis Manousakas (special thanks to Petko Georgiev, Tassos Noulas and Salvatore Scellato). Social and Technological Network Data Analytics, Computer Laboratory, University of Cambridge, February 2017.
    Outline: 1. Introduction to NetworkX 2. Getting started with Python and NetworkX 3. Basic network analysis 4. Writing your own code 5. Ready for your own analysis!
    1. Introduction to NetworkX
    Introduction: networks are everywhere… Social networks, mobile phone networks, vehicular flows, web pages/citations, Internet routing. How can we analyse these networks? Python + NetworkX.
    Introduction: why Python? Python is an interpreted, general-purpose high-level programming language whose design philosophy emphasises code readability.
    + Clear syntax, indentation; multiple programming paradigms; dynamic typing, late binding; multiple data types and structures; strong on-line community; rich documentation; numerous libraries; expressive features; fast prototyping.
    - Can be slow. Beware when you are analysing very large networks.
    Introduction: Python's Holy Trinity
    • SciPy: Python's primary library for mathematical and statistical computing. Contains toolboxes for numeric optimization, signal processing, statistics, and more…
    • NumPy is an extension to include multidimensional arrays and matrices. Both SciPy and NumPy rely on the C library LAPACK for very fast implementation.
    • Matplotlib is the primary plotting library in Python. Supports 2-D and 3-D plotting. All plots are highly customisable and ready for professional publication.
  • Homophily at a Glance: Visual Homophily Estimation in Network Graphs Is Robust Under Time Constraints
    SN Soc Sci (2021) 1:139. https://doi.org/10.1007/s43545-021-00153-2. ORIGINAL PAPER.
    Homophily at a glance: visual homophily estimation in network graphs is robust under time constraints. Daniel Reimann · André Schulz · Robert Gaschler. Received: 21 August 2020 / Accepted: 29 April 2021 / Published online: 24 May 2021. © The Author(s) 2021.
    Abstract: Network graphs are used for high-stake decision making in medical and other contexts. For instance, graph drawings conveying relatedness can be relevant in the context of spreading diseases. Node-link diagrams can be used to visually assess the degree of homophily in a network, a condition where the presence of a link is more likely when nodes are similar. In an online experiment (N = 531), we tested how robustly laypeople can judge homophily from node-link diagrams and how variation of time constraints and layout of the diagrams affect judgments. The results showed that participants were able to give appropriate judgments. While granting more time led to better performance, the effects were small. Rather, the first seconds account for most of the information an individual can extract from the graphs. Furthermore, we showed a difference in performance between two types of layouts (bipartite and polarized). Results have consequences for communicating the degree of homophily in network graphs to the public.
    Keywords: Node-link diagrams · Network graphs · Homophily · Perception
    Introduction: Compared to numbers in tables, visualizations of data allow us to harness the computational power of the visual system and extract relevant information at low effort in little time (e.g., Schnotz and Bannert 2003).
  • IAT 814 Knowledge Visualization Networks, Graphs and Trees
    IAT 814 Knowledge Visualization: Networks, Graphs and Trees. Lyn Bartram. Graphs and Trees | IAT814
    Administrivia
    • Assignment 2 presentations next week
    • 5 each day
    • 15 minutes each, including time for discussion
    • Remember your two questions!
    • Assignment 3 out tonight (you'll like this one).
    We live in a connected world
    • Online social networks: Facebook, Twitter ~ people connected online
    • Information networks: WWW ~ web pages connected through hyperlinks
    • Computer networks: the Internet ~ computers and routers connected through wired/wireless connections
    • What is a network? "any collection of objects in which some pairs of these objects are connected by links" [Easley and Kleinberg, 2011]
    Visualizing Relations
    Why relations? Isn't all data inherently relational?
    • Visualising data: seeing the patterns between the data values and attributes that emerge and associate or disassociate in some way
    • Visualising relations: when how one datum relates to another is an element in itself
    • We want to see the overall structure of the data set
    • Patterns emerge from structure as well as from values/attributes
    Common Applications
    • Process Visualization (e.g., Visio)
    • Dependency Graphs
    • Biological Interactions (Genes, Proteins)
    • Computer Networks
    • Social Networks
    • Concept maps
    • Ontologies
    • Simulation and Modeling
    • Probability maps
    Example slides: Process flow diagrams; Dependency graphs; Gene networks; Computer networks; Internet (What does the Internet look like? Email paths. Nature, 406, 353-354 (27 July 2000)); Social networks; Concept maps; Ontologies; Simulation graphs; Probability maps (Bayesian network); Is this a network?
    • Several ways to represent relations
    • Choice depends on relation, task and scale T.
  • A Numerical Optimization Approach to General Graph Drawing
    A Numerical Optimization Approach to General Graph Drawing. Daniel Tunkelang. January 1999. CMU-CS-98-189. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
    Thesis Committee: Daniel Sleator, Chair; Paul Heckbert; Bruce Maggs; Omar Ghattas, Civil and Environmental Engineering; Mark Wegman, IBM T. J. Watson Research Center.
    Copyright © 1999 Daniel Tunkelang. This research was supported by the National Science Foundation under grant number DMS-9509581. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of the NSF or the U.S. government.
    Keywords: visualization, graph drawing, numerical optimization, force-directed
    Abstract: Graphs are ubiquitous, finding applications in domains ranging from software engineering to computational biology. While graph theory and graph algorithms are some of the oldest, most studied fields in computer science, the problem of visualizing graphs is comparatively young. This problem, known as graph drawing, is that of transforming combinatorial graphs into geometric drawings for the purpose of visualization. Most published algorithms for drawing general graphs model the drawing problem with a physical analogy, representing a graph as a system of springs and other physical elements and then simulating the relaxation of this physical system. Solving the graph drawing problem involves both choosing a physical model and then using numerical optimization to simulate the physical system. In this dissertation, we improve on existing algorithms for drawing general graphs. The improvements fall into three categories: speed, drawing quality, and flexibility.
  • 55 GRAPH DRAWING Emilio Di Giacomo, Giuseppe Liotta, Roberto Tamassia
    55 GRAPH DRAWING. Emilio Di Giacomo, Giuseppe Liotta, Roberto Tamassia.
    INTRODUCTION
    Graph drawing addresses the problem of constructing geometric representations of graphs, and has important applications to key computer technologies such as software engineering, database systems, visual interfaces, and computer-aided design. Research on graph drawing has been conducted within several diverse areas, including discrete mathematics (topological graph theory, geometric graph theory, order theory), algorithmics (graph algorithms, data structures, computational geometry, VLSI), and human-computer interaction (visual languages, graphical user interfaces, information visualization). This chapter overviews aspects of graph drawing that are especially relevant to computational geometry. Basic definitions on drawings and their properties are given in Section 55.1. Bounds on geometric and topological properties of drawings (e.g., area and crossings) are presented in Section 55.2. Section 55.3 deals with the time complexity of fundamental graph drawing problems. An example of a drawing algorithm is given in Section 55.4. Techniques for drawing general graphs are surveyed in Section 55.5.
    55.1 DRAWINGS AND THEIR PROPERTIES
    TYPES OF GRAPHS
    First, we define some terminology on graphs pertinent to graph drawing. Throughout this chapter let $n$ and $m$ be the number of graph vertices and edges respectively, and $d$ the maximum vertex degree (i.e., number of edges incident to a vertex).
    GLOSSARY
    Degree-k graph: Graph with maximum degree $d \le k$.
    Digraph: Directed graph, i.e., graph with directed edges.
    Acyclic digraph: Digraph without directed cycles.
    Transitive edge: Edge (u, v) of a digraph is transitive if there is a directed path from u to v not containing edge (u, v).
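
The glossary definition of a transitive edge quoted in the last excerpt above can be checked mechanically. The sketch below is only an illustration of that definition, not code from the cited chapter; the function name `transitive_edges` and the edge-list input format are assumptions made here, and the digraph is assumed to have no parallel edges.

```python
def transitive_edges(edges):
    """Return the edges (u, v) that are transitive: there is a directed
    path from u to v that does not contain the edge (u, v) itself.

    edges is a list of (u, v) pairs describing a digraph without
    parallel edges (an assumption of this sketch).
    """
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
        successors.setdefault(v, set())

    result = []
    for u, v in edges:
        # Depth-first search from u that never traverses the edge (u, v).
        stack = [w for w in successors[u] if w != v]
        seen = set(stack)
        found = False
        while stack and not found:
            x = stack.pop()
            if x == v:
                found = True
                break
            for y in successors[x]:
                if (x, y) == (u, v):
                    continue  # skip the edge under test
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        if found:
            result.append((u, v))
    return result


# Small hypothetical digraphs: a -> b -> c plus the shortcut a -> c,
# and a plain chain without any shortcut.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
chain = [("a", "b"), ("b", "c"), ("c", "d")]
```

Here `transitive_edges(triangle)` reports only the shortcut edge, since the path a, b, c connects its endpoints without using it, while the plain chain has no transitive edges.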