Tamara Munzner Department of Computer Science University of British Columbia
Visualization Analysis & Design
Tamara Munzner
Department of Computer Science, University of British Columbia
D3 Unconference Keynote, November 21 2015, San Francisco CA
http://www.cs.ubc.ca/~tmm/talks.html#vad15d3
@tamaramunzner

Defining visualization (vis)
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.

Why have a human in the loop?
Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods.
• don’t need vis when fully automatic solution exists and is trusted
• many analysis problems ill-specified
  – don’t know exactly what questions to ask in advance
• possibilities
  – long-term use for end users (e.g. exploratory analysis of scientific data)
  – presentation of known results
  – stepping stone to better understanding of requirements before developing models
  – help developers of automatic solution refine/debug, determine parameters
  – help end users of automatic solutions verify, build trust

Why use an external representation?
• external representation: replace cognition with perception
[Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis) 14(6):1253-1260, 2008.]

Why represent all the data?
• summaries lose information, details matter
  – confirm expected and find unexpected patterns
  – assess validity of statistical model
Anscombe’s Quartet: identical statistics (slide values are rounded to integers)
  x mean           9
  x variance       10
  y mean           8   (about 7.5 unrounded)
  y variance       4   (about 3.75 unrounded)
  x/y correlation  1   (about 0.816 unrounded)

Analysis framework: Four levels, three questions
• domain situation
  – who are the target users?
• abstraction
  – translate from specifics of domain to vocabulary of vis
  – what is shown? data abstraction
    • often don’t just draw what you’re given: transform to new form
  – why is the user looking at it? task abstraction
• idiom
  – how is it shown?
    • visual encoding idiom: how to draw
    • interaction idiom: how to manipulate
• algorithm
  – efficient computation
[A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009).]
[A Multi-Level Typology of Abstract Visualization Tasks. Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013).]

Why is validation difficult?
• different ways to get it wrong at each level
  – domain situation: you misunderstood their needs
  – data/task abstraction: you’re showing them the wrong thing
  – visual encoding/interaction idiom: the way you show it doesn’t work
  – algorithm: your code is too slow
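As a sketch of the Anscombe’s Quartet point made earlier (summary statistics can be identical while the data differ), the following Python snippet recomputes the statistics from the standard published quartet values. The function name `summarize` is illustrative, and population variance is an assumption, chosen because it reproduces the x variance of 10 shown on the slide.

```python
from math import sqrt
from statistics import mean, pvariance

# Anscombe's Quartet: four small (x, y) datasets.
# Datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return (x mean, x variance, y mean, y variance, Pearson correlation).

    Variances are population variances (an assumption, see lead-in).
    """
    mx, my = mean(xs), mean(ys)
    vx, vy = pvariance(xs), pvariance(ys)
    # Population covariance, so it is consistent with pvariance above.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return mx, vx, my, vy, cov / sqrt(vx * vy)


for name, (xs, ys) in QUARTET.items():
    mx, vx, my, vy, r = summarize(xs, ys)
    print(f"{name}: x mean {mx:.2f}, x var {vx:.2f}, "
          f"y mean {my:.2f}, y var {vy:.2f}, r {r:.3f}")
```

All four datasets yield nearly the same five numbers, yet their scatterplots look very different, which is the case for showing the data rather than only the summary.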
Why is validation difficult?
• solution: use methods from different fields at each level
  – domain situation (problem-driven work)
    • anthropology/ethnography: observe target users using existing tools
    • measure adoption
  – data/task abstraction
    • anthropology/ethnography: observe target users after deployment (field study)
  – visual encoding/interaction idiom
    • design: justify design with respect to alternatives
    • cognitive psychology: analyze results qualitatively; measure human time with lab experiment (lab study)
  – algorithm (technique-driven work)
    • computer science: measure system time/memory; analyze computational complexity
[A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009).]

Why analyze?
• imposes a structure on huge design space
  – scaffold to help you think systematically about choices
  – analyzing existing as stepping stone to designing new
Example: SpaceTree and TreeJuxtaposer
  What?  Tree
  Why?   Actions: Present, Locate, Identify
         Targets: Path between two nodes
  How?   SpaceTree: Encode, Navigate, Select, Filter, Aggregate
         TreeJuxtaposer: Encode, Navigate, Select, Arrange
[SpaceTree: Supporting Exploration in Large Node Link Tree, Design Evolution and Empirical Evaluation. Grosjean, Plaisant, and Bederson. Proc. InfoVis 2002, p 57–64.]
[TreeJuxtaposer: Scalable Tree Comparison Using Focus+Context With Guaranteed Visibility. ACM Trans. on Graphics (Proc. SIGGRAPH) 22:453–462, 2003.]

What?
• Datasets
  – data types: items, attributes, links, positions, grids
  – data and dataset types
    • tables: items, attributes
    • networks & trees: items (nodes), links, attributes
    • fields: grids, positions, attributes
    • geometry: items, positions
    • clusters, sets, lists: items
  – dataset types: tables, multidimensional table, networks, trees, fields (continuous), geometry (spatial)
  – dataset availability: static, dynamic
• Attributes
  – attribute types: categorical; ordered (ordinal, quantitative)
  – ordering direction: sequential, diverging, cyclic

Types: Datasets and data
• dataset types
  – tables: items (rows), attributes (columns), cell containing value
  – networks: node (item), link
  – spatial
    • fields (continuous): grid of positions, cell, attributes (columns), value in cell
    • geometry (spatial): position
• attribute types
  – categorical
  – ordered: ordinal, quantitative

Why?
• Actions
  – Analyze
    • Consume: Discover, Present, Enjoy
    • Produce: Annotate, Record, Derive
  – Search
                        Target known   Target unknown
      Location known    Lookup         Browse
      Location unknown  Locate         Explore
  – Query: Identify, Compare, Summarize
• Targets
  – All data: Trends, Outliers, Features
  – Attributes
    • One: Distribution, Extremes
    • Many: Dependency, Correlation, Similarity
  – Network data: Topology, Paths
  – Spatial data: Shape
• {action, target} pairs
  – discover distribution
  – compare trends
  – locate outliers
  – browse topology

Actions I: Analyze
• consume
  – discover vs present
    • classic split
    • aka explore vs explain
  – enjoy
    • newcomer
    • aka casual, social
• produce
  – annotate, record
  – derive
    • crucial design choice

Derive
• don’t just draw what you’re given!
  – decide what the right thing to show is
  – create it with a series of transformations from the original dataset
  – draw that
• one of the four major strategies for handling complexity
Original data: exports, imports
Derived data: trade balance = exports − imports

Analysis example: Derive one attribute
• Strahler number
  – centrality metric for trees/networks
  – derived quantitative attribute
  – draw top 5K of 500K for good skeleton
[Using Strahler numbers for real time visual exploration of huge graphs. Auber. Proc. Intl. Conf. Computer Vision and Graphics, pp. 56–69, 2002.]
Task 1
  What?  In: tree. Out: quantitative attribute on nodes
  Why?   Derive
Task 2
  What?  In: tree + quantitative attribute on nodes. Out: filtered tree (removed unimportant parts)
  Why?   Summarize topology
  How?   Reduce: Filter

Actions II: Search
• what does user know?
  – target, location
                      Target known   Target unknown
    Location known    Lookup         Browse
    Location unknown  Locate         Explore

Actions III: Query
• what does user know?
  – target, location
• how much of the data matters?
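Returning to the Strahler-number example above, here is a minimal sketch of its two tasks (derive a quantitative attribute, then filter on it), restricted to rooted trees. The cited paper also covers general graphs, which this sketch does not attempt. The dict-of-children tree representation, the function names `strahler_numbers` and `filter_skeleton`, and the use of a threshold instead of a fixed top-k cutoff (such as the 5K of 500K on the slide) are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical tree representation: node name -> list of child names.
Tree = Dict[str, List[str]]


def strahler_numbers(tree: Tree, root: str) -> Dict[str, int]:
    """Task 1 (derive): compute a Strahler number for every node.

    A leaf gets 1. An internal node gets the maximum of its children's
    numbers, plus 1 if that maximum occurs in at least two children.
    """
    numbers: Dict[str, int] = {}
    # Iterative post-order traversal, so deep trees do not hit
    # Python's recursion limit.
    stack = [(root, False)]
    while stack:
        node, children_done = stack.pop()
        children = tree.get(node, [])
        if not children:
            numbers[node] = 1
        elif not children_done:
            stack.append((node, True))  # revisit after the children
            stack.extend((child, False) for child in children)
        else:
            values = [numbers[child] for child in children]
            top = max(values)
            numbers[node] = top + 1 if values.count(top) >= 2 else top
    return numbers


def filter_skeleton(tree: Tree, numbers: Dict[str, int], threshold: int) -> Tree:
    """Task 2 (filter): keep only nodes whose number meets the threshold."""
    keep = {node for node, value in numbers.items() if value >= threshold}
    return {node: [c for c in tree.get(node, []) if c in keep] for node in keep}


example: Tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
}
numbers = strahler_numbers(example, "root")
skeleton = filter_skeleton(example, numbers, threshold=2)
print(numbers)
print(skeleton)
```

Because a parent's number is never smaller than any of its children's, thresholding keeps a connected subtree around the root, which is why a simple filter on the derived attribute can serve as a skeleton.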