Watson Health Cognitive Visualization Lab

Perspectives on Network Visualization

Cody Dunne
November 5, 2015
[email protected]
ibm.biz/cogvislab

Agenda

Introduction, Node-Link, Composite, Alternate, Coordinated, Discussion

Cody Dunne, PhD – Cognitive Visualization RSM
Web: ibm.biz/codydunne
Email: [email protected]

My research focuses on information visualization, visual analytics, and cognitive computing. My goal is to design and build interactive visualization systems that help users reach their goals effectively. Specifically, I want to make data discovery, cleaning, exploration, and sharing of results as fluid, easy, and accessible as possible. While at IBM I have been involved in many projects to this end, collaborating with teams across the organization, with more coming!

[Figure labels: Epidemiology & dynamic networks; network type overviews; group overlap]

I received a PhD and M.S. in Computer Science under Ben Shneiderman at the University of Maryland Human-Computer Interaction Lab, and a B.A. in Computer Science and Mathematics from Cornell College. I contribute to NodeXL and have worked at Microsoft Research and NIST.

[Figure labels: news term occurrence; computer network traffic; literature exploration]

Network Visualization in IBM
To name only a few

TED WDA-LS WDA WGA

Alchemy Twitter i2 Designers IOC Epidemics

Network Visualization in IBM
Services exposing network data (many more coming)

Node-Link Network Visualizations

Layout Algorithms Matter
Immense variation in layout readability and speed

Hachul & Jünger, 2006

Readability Metrics for Layouts
Algorithm Understanding
Talk from the Graph Summit 2015: ibm.biz/gs_rm_slides ibm.biz/gs_rm_video
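As one concrete illustration of a layout readability metric, the sketch below counts edge crossings in a straight-line node-link layout. It is a minimal example and not the metric implementation from the cited work; the function names (`segments_cross`, `count_edge_crossings`) and the two sample layouts are assumptions made for illustration.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def segments_cross(a, b, c, d):
    """True when segments ab and cd properly cross (touching and collinear cases are ignored)."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def count_edge_crossings(pos, edges):
    """Count pairs of edges that cross; pairs sharing an endpoint are skipped."""
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if len({u1, v1, u2, v2}) < 4:
            continue
        if segments_cross(pos[u1], pos[v1], pos[u2], pos[v2]):
            crossings += 1
    return crossings


# Two hypothetical layouts of the same 4-cycle.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
square = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
bowtie = {"a": (0, 0), "b": (1, 1), "c": (1, 0), "d": (0, 1)}
print(count_edge_crossings(square, edges), count_edge_crossings(bowtie, edges))
```

The same graph scores differently under different layouts, which is the kind of feedback a readability metric can give a designer. The pairwise check is quadratic in the number of edges, so this form only suits small networks.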

Dunne C, Ross SI, Shneiderman B and Martino M (2015), "Readability metric feedback for aiding node-link visualization designers", IBM Journal of Research and Development.
Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-36763-2_2
Dunne C and Shneiderman B (2009), "Improving graph drawing readability by incorporating readability metrics: A software tool for network analysts". University of Maryland. Human-Computer Interaction Lab Tech Report No. (HCIL-2009-13).

Node-Link Network Visualizations: Motif Simplification

Lostpedia articles
Observations

1: There are repeating patterns in networks (motifs)

2: Motifs often dominate the visualization

3: Motif members can be functionally equivalent

Motif Simplification: Motif Design
Fan Motif and 2-Connector Motif (Lostpedia articles)

Motif Simplification: Fan Glyph Design
Motif Simplification: Connector Glyph Design
Motif Simplification: Clique Glyph Design
Motif Simplification: Interactivity Video

Motif Simplification
Readable and scalable network visualizations for understanding relationships

• WDA-LS Drug Repurposing
• Concept Insights Biomarker Analysis
• SharpC Medical record concepts
• i2 Heterogeneous Networks

Motif Simplification
Discussion of WDA-LS and Concept Insights

WDA-LS Drug Repurposing
• Conditions explored in client's use case (Asthma, Colitis, and Systemic Lupus Erythematosus (SLE)) seen in topology
• With the motifs we were able to easily find all the diseases with any required connectivity grouping
• Easily see neighboring/related conditions
– Quickly find all diseases with required connectivity grouping
– Existing conditions this drug is aimed at, such as Encephalomyelitis and Autoimmune Diseases, are positioned in the same region of the network
– If Colitis proves to be a promising target, might also consider exploring the importance of neighbors Liver Diseases, Psoriasis, Hepatitis, …
• Rather than reduce size, see a lossless overview, then drill down in a focused way
• Optional complement gives users greater control and a wider variety of tools to accelerate discovery
• Prototype technique can be extended and refined to address client needs

Concept Insights Biomarker Network
• Visual encoding of network statistics and similarity measures can help identify insights
• Drastic reduction in network complexity through filtering by similarity (esp. >60%)
• Motif membership appears meaningful
• Interactive tools can support the mental map while filtering, through animations and iterative layout
• Flow of concepts between motifs at progressively higher filtering can be shown in a Sankey diagram to show strong relationships
• Can speed up clique detection through pre-processing

CHI Paper
• Controlled experiment with 36 users showed that motif simplification improves user task performance
– Reducing complexity
– Understanding larger or hidden relationships
• Algorithms for detecting fans, connectors, and cliques
• Publicly available implementation in NodeXL: nodexl.codeplex.com
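To make the fan and 2-connector motifs concrete, here is a minimal detection sketch. It assumes an undirected, unweighted network and uses a simple degree-based definition (a fan is a head node with several degree-1 leaves; a 2-connector is a set of degree-2 span nodes that link the same pair of anchors). It is not the NodeXL implementation, clique detection is omitted, and the function names and toy edge list are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def find_fans(adj, min_leaves=2):
    """Map each head node to the degree-1 neighbors (leaves) attached only to it."""
    fans = defaultdict(set)
    for node, nbrs in adj.items():
        if len(nbrs) == 1:
            (head,) = nbrs
            fans[head].add(node)
    return {h: leaves for h, leaves in fans.items() if len(leaves) >= min_leaves}


def find_two_connectors(adj, min_spans=2):
    """Map each anchor pair to the degree-2 span nodes that link only that pair."""
    spans = defaultdict(set)
    for node, nbrs in adj.items():
        if len(nbrs) == 2:
            spans[frozenset(nbrs)].add(node)
    return {pair: s for pair, s in spans.items() if len(s) >= min_spans}


# Hypothetical toy network: one fan around "hub", one 2-connector between "a" and "b".
edges = [("hub", "l1"), ("hub", "l2"), ("hub", "l3"), ("hub", "a"),
         ("a", "s1"), ("s1", "b"), ("a", "s2"), ("s2", "b"), ("a", "s3"), ("s3", "b")]
adj = build_adjacency(edges)
print(find_fans(adj))
print(find_two_connectors(adj))
```

Each detected group could then be replaced by a single glyph sized by its member count, which is where the reduction in visual complexity comes from.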

Dunne C and Shneiderman B (2013), "Motif simplification: improving network visualization readability with fan, connector, and clique glyphs", In CHI '13.
Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-36763-2_2

Node-Link Network Visualizations: Showing Group Membership

Disjoint Set Visualization
Network grouping/partitions
• Attributes
• Topology
• Combinations
• Manual

Twitter ties at the 2012 Collective Intelligence Conference @ MIT
Squarified Treemap (Rodrigues et al., 2011)

Group-in-a-Box Meta-Layouts
Variants

• Squarified Treemap (Rodrigues et al., 2011)

• Croissant-Doughnut

• Force-Directed

[Figure labels: Croissant; Doughnut; Force-Directed]

Furnas's Generalized Fisheye Views (1986)
Card, Robertson, & Mackinlay's Information Visualizer (1991)
Ishii & Ullmer's Tangible Bits (1997)

Spatial Temporal Attribute

Grapes-in-a-Box

Group-in-a-Box Meta-Layouts
See local and global connections

• Concept Insights Biomarker Analysis
• SharpC Medical record concepts
• Similarity for Active Learning
• Innovation in Pennsylvania
• U.S. Senate Voting Patterns

Group-in-a-Box Meta-Layouts
Discussion

• Three Group-in-a-Box layout algorithms for dissecting networks
• Improved group and overview visualization
• Empirical evaluation on 309 Twitter networks using readability metrics
• Publicly available implementation in NodeXL: nodexl.codeplex.com
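The core idea of giving each group its own box, sized by membership, can be sketched as follows. This is a deliberately simplified slice-and-dice partition, not the squarified treemap, croissant-doughnut, or force-directed variants named in the slides; the function name, the group sizes, and the canvas dimensions are assumptions for illustration.

```python
def slice_and_dice(groups, x, y, w, h):
    """Assign each group a rectangle (x, y, w, h) with area proportional to its size.

    Groups are placed largest first; each cut is made across the longer side
    of the remaining space. Assumes every group size is positive.
    """
    boxes = {}
    items = sorted(groups.items(), key=lambda kv: kv[1], reverse=True)
    remaining = sum(size for _, size in items)
    for name, size in items:
        share = size / remaining
        if w >= h:
            # Cut a vertical strip off the left of the remaining space.
            boxes[name] = (x, y, w * share, h)
            x, w = x + w * share, w * (1 - share)
        else:
            # Cut a horizontal strip off the top of the remaining space.
            boxes[name] = (x, y, w, h * share)
            y, h = y + h * share, h * (1 - share)
        remaining -= size
    return boxes


# Hypothetical group sizes (number of nodes per group) on a 100 x 100 canvas.
groups = {"A": 50, "B": 30, "C": 20}
boxes = slice_and_dice(groups, 0.0, 0.0, 100.0, 100.0)
for name, box in boxes.items():
    print(name, box)
```

A second pass would lay out each group's nodes inside its box with any ordinary layout algorithm, then draw the between-group edges across boxes.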

Chaturvedi S, Dunne C, Ashktorab Z, Zacharia R, and Shneiderman B (2014), "Group-in-a-Box meta-layouts for topological clusters and attribute-based groups: space efficient visualizations of network communities and their ties", CGF: Computer Graphics Forum.
Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-36763-2_2
Rodrigues EM, Milic-Frayling N, Smith M, Shneiderman B, and Hansen DL (2011), "Group-in-a-Box layout for multi-faceted analysis of communities", In SocialCom '11. pp. 354-361. DOI:10.1109/PASSAT/SocialCom.2011.139

Overlapping Set Visualization
ABACUS API – Fortune 500

Composite Network Visualizations

Composite Network Visualization: VoroGraph

GLEAM Epidemic Model
Population basins, local commuting, and global flights

Voronoi Tessellation
Western Europe

Centroidal Voronoi Tessellation – Animated!
Western Europe

VoroGraph: Animated Transitions
VoroGraph: Interface
VoroGraph: Contiguous Edge Coding
VoroGraph: Data Squares
VoroGraph: Non-Contiguous Edge Coding
VoroGraph: Force-Directed Group-in-a-Box

VoroGraph: Discussion

• Equal-population hexagons discretize the space for countability
• Easier attribute comparison with color/size coding
• Hexagons make clear it is an artificial representation
• Enforces generalization
• Contiguous relationship display
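A centroidal Voronoi tessellation, as named in the VoroGraph slides, is commonly approximated with Lloyd relaxation: assign every point to its nearest site, then move each site to the centroid of its cell, and repeat. The sketch below does this over a discrete grid of unweighted sample points. It is an assumption-laden illustration, not VoroGraph's code; weighting samples by population would be a natural extension but is not shown, and all names and starting coordinates are hypothetical.

```python
def lloyd_relaxation(sites, samples, iterations=10):
    """Approximate a centroidal Voronoi tessellation over a discrete sample set."""
    sites = list(sites)
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0] for _ in sites]
        for px, py in samples:
            # The nearest site decides which Voronoi cell this sample belongs to.
            i = min(range(len(sites)),
                    key=lambda k: (sites[k][0] - px) ** 2 + (sites[k][1] - py) ** 2)
            sums[i][0] += px
            sums[i][1] += py
            sums[i][2] += 1
        # Move each site to its cell centroid; a site with an empty cell stays put.
        sites = [(sx / n, sy / n) if n else site
                 for site, (sx, sy, n) in zip(sites, sums)]
    return sites


# Hypothetical setup: a 20 x 20 grid of samples over the unit square, three clumped sites.
samples = [((i + 0.5) / 20, (j + 0.5) / 20) for i in range(20) for j in range(20)]
start = [(0.10, 0.10), (0.20, 0.10), (0.15, 0.20)]
relaxed = lloyd_relaxation(start, samples, iterations=20)
print(relaxed)
```

Animating the intermediate site positions from one iteration to the next gives the kind of smooth transition the "Animated!" slide title suggests.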

Dunne C, Muller M, Perra N, and Martino M (2015), "VoroGraph: Visualization Tools for Epidemic Analysis", In CHI '15 Interactivity.

Composite Network Visualization: BrainVis
Brain From Back: Back-view of the brain with the network of activity and connections overlaid onto colored brain regions
Brain From Below: Bottom-view of the brain with the network of activity and connections overlaid onto colored brain regions
Brain From Above: Top-view of the brain with the network of activity and connections overlaid onto colored brain regions
Brain From Side: Side-view of the brain with color-coded brain regions and physical connections

Alternate Network Visualizations

Line Charts
Tree visualization citation network aggregation

TM=Treemaps
CT=Cone Trees
HT=Hyperbolic Trees

Shneiderman B, Dunne C, Sharma P and Wang P (2012), "Innovation trajectories for information visualizations: Comparing treemaps, cone trees, and hyperbolic trees", IVS: Information Visualization. Vol. 11(2). pp. 87-105.

Academic Academic Papers Patents

Matrix/Heatmap Visualizations
Co-occurrence of organizations related to business intelligence

Business Intelligence 2000-2009:
Tech1 • Google • Yahoo • Stanford • Apple
Tech2 • IBM, Cognos • Microsoft • Oracle
Finance • NASDAQ • NYSE • SEC • NCR • MicroStrategy

Matrix/Heatmap Visualizations for Time
Co-occurrence of organizations related to business intelligence
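The data behind a co-occurrence matrix or heat map can be sketched in a few lines: count how many documents mention each pair of organizations, then expand the counts into a symmetric matrix. The document lists below are made up purely for illustration (they are not the data behind the slides), and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered pair of organizations, how many documents mention both."""
    counts = Counter()
    for orgs in documents:
        for a, b in combinations(sorted(set(orgs)), 2):
            counts[(a, b)] += 1
    return counts


def to_matrix(counts, labels):
    """Expand pair counts into a symmetric matrix whose rows and columns follow `labels`."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for (a, b), n in counts.items():
        if a in index and b in index:
            matrix[index[a]][index[b]] = n
            matrix[index[b]][index[a]] = n
    return matrix


# Hypothetical documents, each reduced to the organizations it mentions.
documents = [
    ["IBM", "Cognos"],
    ["IBM", "Microsoft", "Oracle"],
    ["IBM", "Cognos", "Oracle"],
]
labels = ["Cognos", "IBM", "Microsoft", "Oracle"]
matrix = to_matrix(cooccurrence_counts(documents), labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

Running the same count once per time period would give one matrix per period, which is one way to feed a time-oriented matrix view. Reordering `labels` so that related organizations sit next to each other is what makes block structure visible.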

Gove R, Gramsky N, Kirby R, Sefer E, Sopan A, Dunne C, Shneiderman B and Taieb-Maimon M (2011), "NetVisia: Heat map & matrix visualization of dynamic statistics & content", In SocialCom '11: Proc. 2011 IEEE 3rd International Conference on Social Computing. pp. 19-26.

Alternate Network Visualizations: GraphTrail

GraphTrail Lab study

• GraphTrail can make the same findings as other tools
– And more!
• New users can make findings
• New users understand the exploration history
– And usually motivation!

GraphTrail Discussion

• A system for exploring large multivariate, heterogeneous networks using aggregation by node and edge attributes,
• A method for capturing a user's exploration history and integrating it directly into the workspace, and
• A longitudinal field study and a qualitative lab study that prove the utility of these approaches.
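Aggregation by a node attribute can be illustrated with a small sketch: nodes sharing an attribute value collapse into one aggregate node, and edges are counted between the resulting groups. This is only a guess at the general idea under simple assumptions (undirected edges, one categorical attribute); it is not GraphTrail's implementation, and the names and toy data are hypothetical.

```python
from collections import Counter


def aggregate_by_attribute(node_attrs, edges, key):
    """Collapse nodes that share the same value of `key` into one aggregate node.

    Returns (node_counts, edge_counts): how many nodes fall in each group and
    how many original edges run between each unordered pair of groups.
    """
    group_of = {node: attrs[key] for node, attrs in node_attrs.items()}
    node_counts = Counter(group_of.values())
    edge_counts = Counter()
    for u, v in edges:
        pair = tuple(sorted((group_of[u], group_of[v])))
        edge_counts[pair] += 1
    return node_counts, edge_counts


# Hypothetical heterogeneous network of papers, authors, and a venue.
node_attrs = {
    "p1": {"type": "paper"}, "p2": {"type": "paper"},
    "a1": {"type": "author"}, "a2": {"type": "author"},
    "v1": {"type": "venue"},
}
edges = [("a1", "p1"), ("a2", "p1"), ("a1", "p2"),
         ("p1", "v1"), ("p2", "v1"), ("p1", "p2")]
node_counts, edge_counts = aggregate_by_attribute(node_attrs, edges, "type")
print(dict(node_counts))
print(dict(edge_counts))
```

The aggregate counts are small enough to show as bar charts or a compact node-link view even when the underlying network is large.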

Park DG, Dunne C, Ragan E, and Elmqvist N (2015), "StoryFacets: Generating Multiple Representations of Exploratory Data Analysis for Communication", under submission.
Dunne C, Riche NH, Lee B, Metoyer RA and Robertson GG (2012), "GraphTrail: Analyzing large multivariate, heterogeneous networks while supporting exploration history", In CHI '12. pp. 1663-1672.
Riche N, Lee B and Dunne C (2011), "Interactive visualization for exploring multi-modal, multi-relational, and multivariate graph data". U.S. Patent.

Alternate Network Visualizations: CACTI

Network controllability
Driver Nodes

How to control a directed complex network with a minimum number of nodes?

What is the minimum number of driver nodes (N_D) of real-world networks?
How to locate them efficiently?
Which topological characteristics determine N_D?
How robust is network controllability?

Complex systems are difficult to understand and represent. An organism's genome, an ecosystem's food web, and even a social network are just a few examples of complex systems important not only to diverse scientific disciplines but also to art and media. Connectedness is the essence of complex systems. A change in any one member element can propagate and affect the entire system, even if parts of it are loosely connected. How can we draw many interdependent connected components? Our technique is based on research on network controllability, and allows us to identify the "skeleton" of a variety of networks. With this technique, data scientists do not need to represent every relationship to get a full sense of what the system does. Moreover, meta-information such as the overall system structure becomes comprehensible and the class of network can be perceived.

A) A directed network G(A). All nodes in the network are called state vertices.

B) The network is controlled by three input vertices (origins), which are marked in blue. The controlled network is described by a digraph G(A, B).

C) The U-rooted factorial connection of the digraph G(A,B) is composed of vertex-disjoint stems and cycles.

D) The cacti are built from the U-rooted factorial connection. The cactus on the left contains a stem and four buds.
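One way to answer the "minimum number of driver nodes" question comes from the structural controllability literature (Liu, Slotine and Barabási, 2011): N_D = max(N - |M*|, 1), where M* is a maximum matching of the directed network and the unmatched nodes are the driver nodes. The slides do not state which method the CACTI work uses, so the sketch below is an illustration of that known result rather than the authors' algorithm; the function names and test graphs are assumptions.

```python
def maximum_matching(nodes, edges):
    """Kuhn's augmenting-path algorithm on the bipartite (source copy, target copy) view.

    Returns a dict mapping each matched target node to the source node matched to it,
    so that no two chosen edges share a start node or an end node.
    """
    successors = {u: [] for u in nodes}
    for u, v in edges:
        successors[u].append(v)
    matched_source = {}  # target node -> source node

    def try_augment(u, visited):
        for v in successors[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in matched_source or try_augment(matched_source[v], visited):
                matched_source[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())
    return matched_source


def driver_nodes(nodes, edges):
    """Nodes left unmatched by a maximum matching; at least one node is always returned."""
    matched = maximum_matching(nodes, edges)
    unmatched = [n for n in nodes if n not in matched]
    # A perfectly matched network still needs one input; any node can serve.
    return unmatched or [nodes[0]]


# Hypothetical test graphs: a directed path, an outward star, and a directed cycle.
path = driver_nodes([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
star = driver_nodes([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)])
cycle = driver_nodes([1, 2, 3], [(1, 2), (2, 3), (3, 1)])
print(len(path), len(star), len(cycle))
```

The maximum matching is generally not unique, so the particular set of driver nodes can vary even though its size does not. The recursive augmenting search is fine for small examples; very large networks would call for an iterative or Hopcroft-Karp implementation.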

Coordinated Network Visualizations

NodeXL
A tool for teaching and research in

Bonsignore EM, Dunne C, Rotman D, Smith M, Capone T, Hansen DL and Shneiderman B (2009), "First steps to NetViz Nirvana: Evaluating with NodeXL", In SocialCom '09. pp. 332-339.

Mohammad S, Dunne C and Dorr B (2009), "Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus", In EMNLP '09. pp. 599-608.

Smith M, Shneiderman B, Milic-Frayling N, Rodrigues EM, Barash V, Dunne C, Capone T, Perer A and Gleave E (2009), "Analyzing (social media) networks with NodeXL", In C&T '09. pp. 255-264.

http://nodexl.codeplex.com/

Coordinated Network Visualizations
Watson News Explorer

http://news-explorer.mybluemix.net/

Action Science Explorer
Interactive data visualization for rapid understanding of scientific literature

Amjad S, Mukhtar H and Dunne C (2014), "Automating scholarly article data collection with Action Science Explorer", In ICOSST '14: Proc. 8th IEEE 2014 International Conference on Open-Source Systems and Technologies.

Dunne C, Shneiderman B, Gove R, Klavans J and Dorr B (2012), "Rapid understanding of scientific paper collections: Integrating statistics, text analytics, and visualization", JASIST: Journal of the American Society for Information Science and Technology. Vol. 63(12). pp. 2351-2369.

Gove R, Dunne C, Shneiderman B, Klavans J and Dorr B (2011), "Evaluating visual and statistical exploration of scientific literature networks", In VL/HCC '11: Proc. 2011 IEEE Symposium on Visual Languages and Human-Centric Computing. pp. 217-224.

http://www.cs.umd.edu/hcil/ase/

NetGrok
Visualizing real-time computer network traffic

Blue R, Dunne C, Fuchs A, King K and Schulman A (2008), "Visualizing real-time network resource usage", In VizSec '08: Proc. 5th international workshop on Visualization for Computer Security. pp. 119-135.

http://www.cs.umd.edu/projects/netgrok/

Discussion

Graph Drawing 2017
Hosted in September by IBM
Cambridge, MA, USA

Cody Dunne, IBM Watson [email protected]
T. Alan Keahey, IBM Watson [email protected]