Visual Analytics: New Tools for Gaining Insight from Your Data
Ben Shneiderman [email protected]
Founding Director (1983-2000), Human-Computer Interaction Lab
Professor, Department of Computer Science
Member, Institute for Advanced Computer Studies
University of Maryland, College Park, MD 20742
Twitter: @benbendc
Interdisciplinary research community - Computer Science & Info Studies - Psych, Socio, Poli Sci & MITH
(www.cs.umd.edu/hcil)
Design Issues
• Input devices & strategies
  • Keyboards, pointing devices, voice
  • Direct manipulation
  • Menus, forms, commands
• Output devices & formats
  • Screens, windows, color, sound
  • Text, tables, graphics
  • Instructions, messages, help
• Collaboration & Social Media
• Help, tutorials, training
• Search
• Visualization
www.awl.com/DTUI (Fifth Edition: 2010)

HCI Pride: Serving 5B Users
Mobile, desktop, web, cloud
Diverse users: novice/expert, young/old, literate/illiterate, abled/disabled, cultural, ethnic & linguistic diversity, gender, personality, skills, motivation, ...
Diverse applications: E-commerce, law, health/wellness, education, creative arts, community relationships, politics, IT4ID, policy negotiation, mediation, peace studies, ...
Diverse interfaces: Ubiquitous, pervasive, embedded, tangible, invisible, multimodal, immersive/augmented/virtual, ambient, social, affective, empathic, persuasive, ...
Workshop Overview
Wordle.net
Information Visualization
• Visual bandwidth is enormous
• Human perceptual skills are remarkable
  • Trend, cluster, gap, outlier...
  • Color, size, shape, proximity...
• Three challenges
  • Meaningful visual displays of massive data
  • Interaction: widgets & window coordination
  • Process models for discovery
Information Visualization & Visual Analytics: 1999, 2004, 2010

Business takes action
• General Dynamics buys MayaViz
• Agilent buys GeneSpring
• Google buys Gapminder
• Oracle buys Hyperion
• Microsoft buys Proclarity
• InfoBuilders buys Advizor Solutions
• SAP buys (Business Objects buys Xcelsius & Inxight & Crystal Reports)
• IBM buys (Cognos buys Celequest) & ILOG
• TIBCO buys Spotfire
Spotfire: Retinol’s role in embryos & vision

Spotfire: DC natality data
http://registration.spotfire.com/eval/default_edu.asp

10M - 100M pixels: Large displays

100M-pixels & more

1M-pixels & less: Small mobile devices

Information Visualization: Mantra
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
• Overview, zoom & filter, details-on-demand
Information Visualization: Data Types
SciViz
• 1-D Linear: Document Lens, SeeSoft, Info Mural
• 2-D Map: GIS, ArcView, PageMaker, Medical imagery
• 3-D World: CAD, Medical, Molecules, Architecture

InfoViz
• Multi-Var: Spotfire, Tableau, GGobi, TableLens, ParCoords
• Temporal: LifeLines, TimeSearcher, Palantir, DataMontage
• Tree: Cone/Cam/Hyperbolic, SpaceTree, Treemap
• Network: Pajek, JUNG, UCINet, SocialAction, NodeXL
infosthetics.com flowingdata.com infovis.org www.infovis.net/index.php?lang=2
Anscombe’s Quartet
       1              2              3              4
   x      y       x      y       x      y       x      y
 10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
  8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
 13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
  9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
 11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
 14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
  6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
  4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
 12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
  7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
  5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89
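The quartet is usually shown to make the point that datasets with nearly identical summary statistics can look very different once plotted. A minimal sketch, in plain Python, of how those statistics can be recomputed from the four datasets; the names `QUARTET`, `summarize`, and `SUMMARY` are illustrative and not part of the original slides.

```python
from math import sqrt

# x values shared by datasets 1-3 (dataset 4 has its own x column).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

QUARTET = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, Pearson correlation and least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),   # sample variance
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARY = {k: summarize(xs, ys) for k, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for k, s in SUMMARY.items():
        print(k, {name: round(value, 3) for name, value in s.items()})
```

Plotting each dataset as a scatterplot is what reveals the differences that these numbers hide.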
Property            Value
Mean of x           9.0
Variance of x       11.0
Mean of y           7.5
Variance of y       4.12
Correlation         0.816
Linear regression   y = 3 + 0.5x

Temporal Data: TimeSearcher 1.3
• Time series
  • Stocks
  • Weather
  • Genes
• User-specified patterns
• Rapid search
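One way to picture a user-specified pattern over time series is a "timebox" query: a rectangle covering a time range and a value range, where a series matches if all of its values in that time range stay inside the value range. The sketch below assumes that query model and is not TimeSearcher's actual code; `in_timebox`, `search`, and the sample data are hypothetical.

```python
def in_timebox(series, t_start, t_end, v_min, v_max):
    """True if every value at indices t_start..t_end (inclusive) lies in [v_min, v_max]."""
    window = series[t_start:t_end + 1]
    # Reject boxes that extend past the end of the series.
    if len(window) != t_end - t_start + 1:
        return False
    return all(v_min <= v <= v_max for v in window)


def search(dataset, timeboxes):
    """Return the names of series that satisfy every timebox (a conjunctive query)."""
    return [
        name
        for name, series in dataset.items()
        if all(in_timebox(series, *box) for box in timeboxes)
    ]


# Hypothetical data: three short series sampled at the same five time points.
DATASET = {
    "rising": [1, 2, 3, 4, 5],
    "falling": [5, 4, 3, 2, 1],
    "flat": [3, 3, 3, 3, 3],
}

if __name__ == "__main__":
    # Low at the start (t = 0..1) and high at the end (t = 3..4).
    print(search(DATASET, [(0, 1, 0, 2), (3, 4, 4, 6)]))
```

Because each box is a simple range test, a query can be re-evaluated on every drag of a box, which is what makes this style of search feel rapid.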
Temporal Data: TimeSearcher 2.0
• Long time series (>10,000 time points)
• Multiple variables
• Controlled precision in match (Linear, offset, noise, amplitude)
LifeLines: Patient Histories
www.cs.umd.edu/hcil/lifelines

LifeLines2: Contrast+Creatine

LifeLines2: Align-Rank-Filter & Summarize

LifeFlow: Aggregation Strategy
Temporal Categorical Data (4 records)
LifeLines2 format
Tree of Event Sequences
LifeFlow Aggregation
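The aggregation strategy above (individual records, then a tree of event sequences, then an aggregated view) can be sketched as a prefix tree in which each node counts the records that share the same sequence of events up to that point. This is an illustrative sketch, not LifeFlow's implementation: the four records and their event names are hypothetical, and timing information, which LifeFlow also displays, is left out.

```python
def build_tree(records):
    """Aggregate event sequences into a prefix tree of counts."""
    root = {"count": 0, "children": {}}
    for events in records:
        node = root
        node["count"] += 1
        for event in events:
            # Records that share a prefix share the same path of nodes.
            node = node["children"].setdefault(event, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def flatten(node, path=()):
    """List (event path, record count) pairs in depth-first order."""
    rows = []
    for event, child in node["children"].items():
        child_path = path + (event,)
        rows.append((child_path, child["count"]))
        rows.extend(flatten(child, child_path))
    return rows


# Hypothetical temporal categorical data: 4 records, each an ordered event list.
RECORDS = [
    ["Arrival", "Emergency", "ICU", "Exit"],
    ["Arrival", "Emergency", "Floor", "Exit"],
    ["Arrival", "Emergency", "ICU", "Floor"],
    ["Arrival", "Floor", "Exit"],
]

TREE = build_tree(RECORDS)

if __name__ == "__main__":
    for path, count in flatten(TREE):
        print("  " * (len(path) - 1) + f"{path[-1]}: {count}")
```

In an aggregated display, each node's count would typically set the height of its bar, so common sequences stand out and rare ones remain visible as thin slivers.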
www.cs.umd.edu/hcil/lifeflow

LifeFlow: Interface with User Controls
Treemap: Gene Ontology
+ Space filling
+ Space limited
+ Color coding
+ Size coding
- Requires learning
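To make "space filling" and "size coding" concrete, here is a minimal sketch of the classic slice-and-dice treemap layout: each node's rectangle is divided among its children in proportion to their sizes, alternating the split direction at each level. The tree, the canvas size, and the function names are hypothetical, and the tools shown in these slides may use other layouts (for example squarified or Voronoi).

```python
def node_size(node):
    """Size of a leaf, or the summed size of all leaves below an internal node."""
    children = node.get("children")
    if not children:
        return node["size"]
    return sum(node_size(child) for child in children)


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return a list of (name, x, y, width, height) rectangles, parents first."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = sum(node_size(child) for child in children)
    if total == 0:
        return out
    offset = 0.0
    for child in children:
        frac = node_size(child) / total
        if depth % 2 == 0:
            # Even depth: divide the width, children sit side by side.
            slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:
            # Odd depth: divide the height, children are stacked.
            slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out


# Hypothetical hierarchy; leaf sizes could be file sizes, market caps, and so on.
TREE = {
    "name": "root",
    "children": [
        {"name": "A", "size": 6},
        {"name": "B", "children": [
            {"name": "B1", "size": 3},
            {"name": "B2", "size": 1},
        ]},
    ],
}

RECTS = slice_and_dice(TREE, 0, 0, 100, 50)

if __name__ == "__main__":
    for name, x, y, w, h in RECTS:
        print(f"{name:5s} x={x:6.1f} y={y:6.1f} w={w:6.1f} h={h:6.1f}")
```

Every leaf's area is proportional to its size and the leaves together cover the whole canvas; color would then be assigned from a second attribute.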
(Shneiderman, ACM Trans. on Graphics, 1992 & 2003)
www.cs.umd.edu/hcil/treemap/

Treemap: Smartmoney MarketMap
www.smartmoney.com/marketmap
Market falls steeply Feb 27, 2007, with one exception
Market falls steeply Sept 22, 2011, some exceptions
Market mixed, February 8, 2008: Energy & Technology up, Financial & Health Care down
Market rises, September 1, 2010, Gold contrarians
Market rises, March 21, 2011, Sprint declines

Treemap: Newsmap (Marcos Weskamp)
newsmap.jp

Treemap: WHC Emergency Room (6304 patients in Jan 2006)
Group by Admissions/MF, size by service time, color by age

Treemap: WHC Emergency Room (6304 patients in Jan 2006) (only those with service time >12 hours)
Group by Admissions/MF, size by service time, color by age

Treemap: Supply Chain
www.hivegroup.com

Treemap: Nutritional Analysis
www.hivegroup.com

Treemap: Spotfire Bond Portfolio Analysis
www.spotfire.com

Treemap: NY Times – Car & Truck Sales
www.cs.umd.edu/hcil/treemap/

Treemap (Voronoi): NY Times - Inflation
www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html

VisualComplexity.com: Manuel Lima

Discovery Process: Systematic Yet Flexible
Preparation
• Own the problem & define the schedule
• Data cleaning & conditioning
• Handle missing & uncertain data
• Extract subsets & link to related information
SocialAction
• Integrates statistics & visualization
• 4 case studies, 4-8 weeks (journalist, bibliometrician, terrorism analyst, organizational analyst)
• Identified desired features, gave strong positive feedback about benefits of integration

www.cs.umd.edu/hcil/socialaction
Perer & Shneiderman, CHI2008, IEEE CG&A 2009

Network from Database Tables
www.centrifugesystems.com

NodeXL: Network Overview for Discovery & Exploration in Excel
www.codeplex.com/nodexl

NodeXL: Import Dialogs
www.codeplex.com/nodexl

Tweets at #WIN09 Conference: 2 groups
‘GOP’ tweets, clustered (red-Republicans)
Twitter networks: #SOTU
WWW2010 Twitter Community
Twitter Network for “msrtf11 OR techfest”
Twitter Network for “SpaceX”
Twitter Network for “TTW”
Twitter Network for #CI2012

Innovation Clusters: People, Locations, Companies
11,000 nodes, 26,000 links
Labels: No Location, Philadelphia, Pharmaceutical/Medical, Pittsburgh Metro, Westinghouse Electric
Legend: Patent, Tech, Navy SBIR (federal), PA DCED (state), Related patent; 2: Federal agency; 3: Enterprise; 5: Inventors; 9: Universities; 10: PA DCED; 11/12: Phil/Pitt metro cnty; 13-15: Semi-rural/rural cnty; 17: Foreign countries; 19: Other states
CHI2010 Twitter Community
www.codeplex.com/nodexl/

Flickr clusters for “mouse”: Computer, Mickey, Animal

Flickr networks

Analyzing Social Media Networks with NodeXL
I. Getting Started with Analyzing Social Media Networks
  1. Introduction to Social Media and Social Networks
  2. Social Media: New Technologies of Collaboration
  3. Social Network Analysis

II. NodeXL Tutorial: Learning by Doing
  4. Layout, Visual Design & Labeling
  5. Calculating & Visualizing Network Metrics
  6. Preparing Data & Filtering
  7. Clustering & Grouping

III. Social Media Network Analysis Case Studies
  8. Email
  9. Threaded Networks
  10. Twitter
  11. Facebook
  12. WWW
  13. Flickr
  14. YouTube
  15. Wiki Networks

www.elsevier.com/wps/find/bookdescription.cws_home/723354/description

Social Media Research Foundation
Researchers who want to
- create open tools
- generate & host open data
- support open scholarship
Map, measure & understand social media
Support tool projects to collect, analyze & visualize social media data.
smrfoundation.org

Sense-Making Loop
Thomas & Cook: Illuminating the Path (2004)

Sense-Making Loop: Expanded
Thomas & Cook: Illuminating the Path (2004)

Discovery Process: Systematic Yet Flexible
Preparation
• Own the problem & define the schedule
• Data cleaning & conditioning
• Handle missing & uncertain data
• Extract subsets & link to related information

Purposeful exploration – Hypothesis testing
• Range & distribution
• Relationships & correlations
• Clusters & gaps
• Outliers & anomalies
• Aggregation & summary
• Split & trellis
• Temporal comparisons & multiple views
• Statistics & forecasts

Situated decision making - Social context
• Annotation & marking
• Collaboration & coordination
• Decisions & presentations

UN Millennium Development Goals
To be achieved by 2015
• Eradicate extreme poverty and hunger
• Achieve universal primary education
• Promote gender equality and empower women
• Reduce child mortality
• Improve maternal health
• Combat HIV/AIDS, malaria and other diseases
• Ensure environmental sustainability
• Develop a global partnership for development
30th Anniversary Symposium May 22-23, 2013
www.cs.umd.edu/hcil
For More Information
• Visit the HCIL website for 400 papers & info on videos: www.cs.umd.edu/hcil
• Conferences & resources: www.infovis.org
• See Chapter 14 on Info Visualization:
  Shneiderman, B. and Plaisant, C., Designing the User Interface: Strategies for Effective Human-Computer Interaction: Fifth Edition (2010) www.awl.com/DTUI
• Edited Collections:
  Card, S., Mackinlay, J., and Shneiderman, B. (1999) Readings in Information Visualization: Using Vision to Think
  Bederson, B. and Shneiderman, B. (2003) The Craft of Information Visualization: Readings and Reflections
• Treemaps
  • HiveGroup: www.hivegroup.com
  • Smartmoney: www.smartmoney.com/marketmap
  • HCIL Treemap 4.0: www.cs.umd.edu/hcil/treemap
• Spotfire: www.spotfire.com
• TimeSearcher: www.cs.umd.edu/hcil/timesearcher
• NodeXL: nodexl.codeplex.com
• Hierarchical Clustering Explorer: www.cs.umd.edu/hcil/hce
• LifeLines2: www.cs.umd.edu/hcil/lifelines2
• Similan: www.cs.umd.edu/hcil/similan