Why Does This Suck? Information Visualization
Jeffrey Heer
UC Berkeley | PARC, Inc.
CS160 – 2004.11.22
(includes numerous slides from Marti Hearst, Ed Chi, Stuart Card, and Peter Pirolli)

Basic Problem
We live in a new ecology.

Scientific Journals
Journals/person increases 10X every 50 years
[Chart: Journals and Journals/People x10^6, log scale, against Year (1750 to 2000); Darwin, V. Bush, and You marked along the year axis]

Web Ecologies
1 new server every 2 seconds
7.5 new pages per second
[Chart: Servers, log scale, Aug-92 to Aug-98]
Source: World Wide Web Consortium, Mark Gray, Netcraft Server Survey

Human Capacity
[Chart: log scale against year (1750 to 2000); Darwin, V. Bush, and You marked along the year axis]

Attentional Processes
“What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.”
~Herb Simon, as quoted by Hal Varian, Scientific American, September 1995

Human-Information Interaction
• The real design problem is not increased access to information, but greater efficiency in finding useful information.
• Increasing the rate at which people can find and use relevant information improves human intelligence.
[Chart: Amount of Accessible Knowledge against Cost [Time]]

Information Visualization
• Leverage highly-developed human visual system to achieve rapid understanding of abstract information.
1.2 b/s (Reading)
2.3 b/s (Pictures)

Information Visualization
• “Transformation of the symbolic into the geometric” (McCormick et al., 1987)
• “... finding the artificial memory that best supports our natural means of perception.” (Bertin, 1983)
• The depiction of information using spatial or graphical representations, to facilitate comparison, pattern recognition, change detection, and other cognitive skills by making use of the visual system. (Hearst, 2003)

Why Visualization?
• Use the eye for pattern recognition; people good at
  • scanning
  • recognizing
  • remembering images
• Graphical elements facilitate comparisons via
  • length
  • shape
  • orientation
  • texture
• Animation shows changes across time
• Color helps make distinctions
• Aesthetics make the process appealing

Visualization Success Stories

Visualization Success Story
Mystery: what is causing a cholera epidemic in London in 1854?

Visualization Success Story
Illustration of John Snow’s deduction that a cholera epidemic was caused by a bad water pump, circa 1854. Horizontal lines indicate location of deaths.
From Visual Explanations by Edward Tufte, Graphics Press, 1997

A Visualization Expedition
(a tour through past and present)

Perspective Wall

Starfield Displays
Slide adapted from Chris North

Film Finder

Table Lens

Distortion Techniques

Indented Hierarchy Layout
Places all items along vertically spaced rows
Uses indentation to show parent child relationships
Breadth and depth end up fighting for space resources

Reingold-Tilford Layout
Top-down layout
Uses separate dimensions for breadth and depth
tidier drawing of trees - reingold, tilford

TreeMaps
Space-filling technique that divides space recursively
Segments space according to ‘size’ of children nodes
map of the market – smartmoney.com

SpaceTree

Cone Trees
Tree layout in three dimensions
Shadows provide 2D structure
Can also make “Balloon Trees” – 2D version of ConeTree
cone tree – robertson, mackinlay, and card

Degree-of-Interest Trees

Hyperbolic Trees

Network visualization
Often uses physics models (e.g., edges as springs) to perform layout.
Can be animated and interacted with.

Network Visualization
Skitter, www.caida.org

WebBook

Web Forager

Document Lens

Data Mountain
Supports document organization in a 2.5 dimensional environment.

Designing Visualizations
(some tricks of the trade)

Graphical Excellence [Tufte]
• the well-designed presentation of interesting data – a matter of substance, of statistics, and of design
• consists of complex ideas communicated with clarity, precision and efficiency
• is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space
• requires telling the truth about the data.

Interactive Tasks [Shneiderman]
1. Overview: Get an overview of the collection
2. Zoom: Zoom in on items of interest
3. Filter: Remove uninteresting items
4. Details on demand: Select items and get details
5. Relate: View relationships between items
6. History: Keep a history of actions for undo, replay, refinement
7. Extract: Make subcollections

Proposed Data Types
1. 1D: timelines,…
2. 2D: maps,…
3. 3D: volumes,…
4. Multi-dimensional: databases,…
5. Hierarchies/Trees: directories,…
6. Networks/Graphs: web,…
7. Document collections: digital libraries,…
This is useful, but what’s wrong here?

Basic Types of Data
• Nominal (qualitative)
  • (no inherent order)
  • city names, types of diseases, ...
• Ordinal (qualitative)
  • (ordered, but not at measurable intervals)
  • first, second, third, …
  • cold, warm, hot
  • Mon, Tue, Wed, Thu …
• Interval (quantitative)
  • integers or reals

Ranking of Applicability of Properties for Different Data Types
(Mackinlay 88, Not Empirically Verified)

QUANT               ORDINAL             NOMINAL
Position            Position            Position
Length              Density             Color Hue
Angle               Color Saturation    Texture
Slope               Color Hue           Connection
Area                Texture             Containment
Volume              Connection          Density
Density             Containment         Color Saturation
Color Saturation    Length              Shape
Color Hue           Angle               Length

Visualization Design Patterns
• Pre-Attentive Patterns
  • Leverage things that automatically “pop-out” to human attention
  • Stark contrast in color, shape, size, orientation
• Gestalt Properties
  • Use psychological theories of visual grouping
  • proximity, similarity, continuity, connectedness, closure, symmetry, common fate, figure/ground separation
• High Data Density
  • Maximize number of items/area of graphic
  • This is controversial! Whitespace may contribute to good visual design… so balance appropriately.
• Small Multiples
  • Show varying visualizations/patterns adjacent to one another
  • Enable Comparisons

Visualization Design Patterns
• Focus+Context
  • Highlight regions of current interest, while de-emphasizing but keeping visible surrounding context.
  • Can visually distort space, or use degree-of-interest function to control what is and isn’t visualized.
• Dynamic Queries
  • Allow rapid refinement of visualization criteria
  • Range sliders, Query sliders
• Panning and Zooming
  • Navigate large spaces using a camera metaphor
• Semantic Zooming
  • Change content presentation based on zooming level
  • Hide/reveal additional data in accordance with available space

Software Architectures
• The Information Visualization Reference Model [Chi, Card, Mackinlay, Shneiderman]

Evaluating Visualizations
• Visualizations are user interfaces, too… established methodologies can be used.
• Questions to ask
  • What tasks do you expect people to perform with the visualization?
  • What interfaces currently exist for this task?
  • In what ways do you expect different visualizations to help or hurt aspects of these tasks?
• Metrics: task time, success rate, information gained (e.g., test the user, or exploit priming effects), eye tracking.

Evaluating Hyperbolic Trees
• The Great CHI’97 Browse-Off: Individual browsers race against the clock to perform various retrieval and comparison tasks.
• Hyperbolic Tree won against M$ File Explorer and others.
• Can we conclude that it is the better browser?

Evaluating Hyperbolic Trees
• No!
  • Different people operating each browser.
  • Tasks were not ecologically valid.
  • Can’t say what is better for what.
• PARC researchers did extensive eye-tracking studies uncovering very nuanced visual psychology.
• Found Hyperbolic Tree is better when underlying information design (e.g., tree structure and labeling) is better.
• In case of CHI Browse Off, the Hyperbolic Tree had a quicker human user “behind the wheel”.
• Moral: Exercise judicious study design, but also don’t feel let down if task times are not being radically improved… subtleties abound.

Questions?
Jeffrey Heer
[email protected]
prefuse
http://prefuse.sourceforge.net

Accuracy Ranking of Quantitative Perceptual Tasks
Estimated; only pairwise comparisons have been validated
(Mackinlay 88 from Cleveland & McGill)

Interpretations of Visual Properties
Some properties can be discriminated more accurately but don’t have intrinsic meaning
• Density (Greyscale): Darker -> More
• Size / Length / Area: Larger -> More
• Position: Leftmost -> first, Topmost -> first
• Hue: ??? no intrinsic meaning
• Slope: ??? no intrinsic meaning

Micro-Aspects of Visualization Design
(aka fun with visual psychology)

Preattentive Processing
• A limited set of visual properties are processed preattentively
  • (without need for focusing attention).
• This is important for design of visualizations
  • what can be perceived immediately
  • what properties are good discriminators
  • what can mislead viewers
All Preattentive Processing figures from Healey 97
http://www.csc.ncsu.edu/faculty/healey/PP/PP.html

Example: Color Selection
Viewer can rapidly and accurately