ABSTRACT Title of Document: INTEGRATING
Total Page:16
File Type:pdf, Size:1020Kb
ABSTRACT Title of Document: INTEGRATING STATISTICS AND VISUALIZATION TO IMPROVE EXPLORATORY SOCIAL NETWORK ANALYSIS. Adam Perer, Doctor of Philosophy, 2008 Directed By: Professor Ben Shneiderman, Department of Computer Science Social network analysis is emerging as a key technique to understanding social, cultural and economic phenomena. However, social network analysis is inherently complex since analysts must understand every individual’s attributes as well as relationships between individuals. There are many statistical algorithms which reveal nodes that occupy key social positions and form cohesive social groups. However, it is difficult to find outliers and patterns in strictly quantitative output. In these situations, information visualizations can enable users to make sense of their data, but typical network visualizations are often hard to interpret because of overlapping nodes and tangled edges. My first contribution improves the process of exploratory social network analysis. I have designed and implemented a novel social network analysis tool, SocialAction (http://www.cs.umd.edu/hcil/socialaction), that integrates both statistics and visualizations to enable users to quickly derive the benefits of both. Statistics are used to detect important individuals, relationships, and clusters. Instead of tabular display of numbers, the results are integrated with a network visualization in which users can easily and dynamically filter nodes and edges. The visualizations simplify the statistical results, facilitating sensemaking and discovery of features such as distributions, patterns, trends, gaps and outliers. The statistics simplify the comprehension of a sometimes chaotic visualization, allowing users to focus on statistically significant nodes and edges. SocialAction was also designed to help analysts explore non-social networks, such as citation, communication, financial and biological networks. My second contribution extends lessons learned from SocialAction and provides designs guidelines for interactive techniques to improve exploratory data analysis. A taxonomy of seven interactive techniques are augmented with computed attributes from statistics and data mining to improve information visualization exploration. Furthermore, systematic yet flexible design goals are provided to help guide domain experts through complex analysis over days, weeks and months. My third contribution demonstrates the effectiveness of long term case studies with domain experts to measure creative activities of information visualization users. Evaluating information visualization tools is problematic because controlled studies may not effectively represent the workflow of analysts. Discoveries occur over weeks and months, and exploratory tasks may be poorly defined. To capture authentic insights, I designed an evaluation methodology that used structured and replicated long-term case studies. The methodology was implemented on unique domain experts that demonstrated the effectiveness of integrating statistics and visualization. INTEGRATING STATISTICS AND VISUALIZATION TO IMPROVE EXPLORATORY SOCIAL NETWORK ANALYSIS. By Adam Nathaniel Perer Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2008 Advisory Committee: Professor Ben Shneiderman, Chair Professor Ben Bederson Professor Lise Getoor Professor Jen Golbeck Professor Alan Neustadtl © Copyright by Adam Nathaniel Perer 2008 Dedication Dedicated to my Mom and Dad, for always letting me push buttons. ii Acknowledgements I first thank my advisor, Ben Shneiderman. He has been a vast source of wisdom, inspiration, and motivation throughout my graduate career. It is hard to imagine a more supportive, kind, and dedicated mentor than Dr. Shneiderman. I’d also like to thank my all-star committee for shaping and refining my dissertation: Ben Bederson, Lise Getoor, Jennifer Golbeck and Alan Neustadtl. The Human-Computer Interaction Lab at the University of Maryland has been an amazing place to spend during my doctoral years. I thank the leadership of Allison Druin, the thoughtful Catherine Plaisant, the insightful François Guimbretière, the problem solver Anne Rose, and the helpful Kiki Schneider. The students of the HCIL have also played a pivotal role throughout my graduate career. In particular, Amy Karlson, Georg Apitz, Jinwook Seo, Dave Wang, Hyunyoung Song, Aleks Aris, Alex Quinn, Bill Kules, Haixia Zhao, Gene Chipman, Jerry Fails, Sherri Massey, Nick Chen, Bongshin Lee, Chang Hu and Chunyuan Liao deserve much praise for making the lab such an enlightening atmosphere. Other computer science students, Nikhil Swamy and Dave Levin, have also been critical to my success. Outside of the University of Maryland, I’ve also had several wonderful research opportunities. I thank Marc Smith for continual support and inviting me to spend two wonderful summers at Microsoft Research. I thank Eric Bier for hosting me for a summer at Xerox PARC and allowing me to innovate in the place I admired most growing up. I also thank Christine Youngblut (Institute for Defense Analyses), Takeo Kanade (Carnegie Mellon University), Michael Widom (Carnegie Mellon iii University), and Randy Beer (Case Western Reserve University), without whom I may have never learned the joy of research in computer science. My friends and family deserve much of the credit for my graduate school success. I couldn’t imagine more amazing parents than the ones I have, who have always given me the confidence and support to pursue what I love. My sister, Jennifer Slattery, is a constant source of love and pride. My significant other, Bita Azhdam, has made the last 18 months of my life among the best. The undying support of my friends can also not go unstated: Gregg Tarter, Sara Tadikamalla, Eric Dash, Ben Kelley, Michael Kalnicki, Matt Ittigson, Michael Rosenthal, Josh Baugher, Bethia Cullis, Ankita Singh, Jeremy Baugher, Andy Gordon, Atheir Abbas, Tim Fasel, Marisa Pauly, Julie Chung, Jaime Setton, Melissa Bowman, Ezi Yomtovian, Margo Serlin and Aaron Cohen. Finally, I thank Grammy, Nana, and Colin for looking out for me from above. iv Table of Contents Dedication ..................................................................................................................... ii Acknowledgements ...................................................................................................... iii Table of Contents .......................................................................................................... v Chapter 1: Introduction ................................................................................................ 1 1.1 Contribution (C1:Integration) ................................................................................. 3 1.2 Contribution (C2:Guidelines) ................................................................................. 5 1.3 Contribution (C3:Evaluation) ................................................................................. 6 Chapter 2: Related Work .............................................................................................. 8 2.1 Social Network Analysis (SNA) ............................................................................. 8 2.2 SNA Tools ............................................................................................................ 10 2.3 Network and Graph Visualization ........................................................................ 13 2.4 Layout Distortion Techniques ............................................................................... 15 2.5 Interactive Techniques .......................................................................................... 18 2.6 Guides for Discovery ............................................................................................ 26 2.7 Evaluation of Information Visualization Systems ................................................ 33 2.8 Summary ............................................................................................................... 34 Chapter 3: Integrating Statistics and Visualization to Improve Exploratory Data Analysis of Social Networks ....................................................................................... 35 3.1 Introduction ........................................................................................................... 35 3.2 Designing for Social Network Analysis: Integrating Statistics with Visualization ..................................................................................................................................... 36 3.3 Overview: Gaining Insights from the Network Structure ..................................... 40 3.3.1 Visual overview ............................................................................................. 40 3.3.2 Statistical Overviews ..................................................................................... 43 3.4 Ranking Nodes: Gaining Insights from the Network’s Individuals ...................... 43 3.4.1 Visual Coding ................................................................................................ 45 3.4.2 Filtering by Rankings ..................................................................................... 47 3.5 Ranking Edges: Gaining Insights from the Network’s Relations ......................... 49 3.6 Plotting Node Rankings: Detecting Patterns of Individuals ...............................