Lecture 14: Evaluation
Information Visualization (CPSC 533C, Fall 2011)
Tamara Munzner, UBC Computer Science
Wed, 2 November 2011

Readings Covered
- Evaluating Information Visualizations. Sheelagh Carpendale. Chapter in Information Visualization: Human-Centered Issues and Perspectives, Springer LNCS 4950, 2008, p 19-45.
  - thorough survey/discussion; won't summarize here
- The Challenge of Information Visualization Evaluation. Catherine Plaisant. Proc. Advanced Visual Interfaces (AVI) 2004.
- The Perceptual Scalability of Visualization. Beth Yost and Chris North. IEEE TVCG 12(5) (Proc. InfoVis 06), Sep 2006, p 837-844.
- Effectiveness of Animation in Trend Visualization. George G. Robertson, Roland Fernandez, Danyel Fisher, Bongshin Lee, and John T. Stasko. IEEE TVCG 14(6) (Proc. InfoVis 2008), p 1325-1332.

Further Readings
- Task-Centered User Interface Design. Clayton Lewis and John Rieman. Chapters 0-5.
- Turning Pictures into Numbers: Extracting and Generating Information from Complex Visualizations. J. Gregory Trafton, Susan S. Kirschenbaum, Ted L. Tsui, Robert T. Miyamoto, James A. Ballas, and Paula D. Raymond. Intl Journ. Human Computer Studies 53(5), p 827-850.
- Artery Visualizations for Heart Disease Diagnosis. Michelle A. Borkin, Krzysztof Z. Gajos, Amanda Peters, Dimitrios Mitsouras, Simone Melchionna, Frank J. Rybicki, Charles L. Feldman, and Hanspeter Pfister. IEEE TVCG 17(12) (Proc. InfoVis 2011), p 2479-2488.

Psychophysics
- method of limits: find the limits of human perception, i.e. the threshold where performance degrades
- staircase procedure: finds that threshold faster
- method of adjustment: find the optimal stimulus level by letting subjects control the level themselves
- interference between channels: multi-modal studies
  - ex: MacLean 2005, Perceiving Ordinal Data Haptically Under Workload: haptic feedback for interruption while participants were visually (and cognitively) busy
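The staircase procedure above is easy to misremember, so here is a minimal sketch of a 1-up/1-down staircase, which converges on the roughly 50% detection threshold. This is an illustration only, not from the lecture: the function names and parameters are hypothetical, and detected() is a simulated observer standing in for a real stimulus presentation.

```python
import random

def detected(level, threshold=0.42):
    """Hypothetical simulated observer: reports a detection when the
    stimulus level exceeds a noisy internal threshold. A real study
    would present a stimulus and record the participant's yes/no."""
    return level + random.gauss(0, 0.05) > threshold

def staircase(start=1.0, step=0.1, n_reversals=8):
    """1-up/1-down staircase: step down after each detection, up after
    each miss; estimate the threshold from the reversal points."""
    level, prev_dir, reversals = start, None, []
    while len(reversals) < n_reversals:
        direction = -1 if detected(level) else +1    # down on hit, up on miss
        if prev_dir is not None and direction != prev_dir:
            reversals.append(level)                  # direction flipped: a reversal
        prev_dir = direction
        level = max(0.0, level + direction * step)
    return sum(reversals) / len(reversals)           # mean level at reversals

print(staircase())   # converges near the simulated threshold of 0.42
```

The advantage over the method of limits is exactly what the slide says: by concentrating trials near the threshold instead of sweeping the whole stimulus range, the staircase reaches a stable estimate in far fewer trials.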
Cognitive Psychology
- repeat simple but important tasks, measuring reaction time or error
- Miller's 7 +/- 2 short-term memory experiments
- Fitts' Law (target selection)
- Hick's Law (decision making given n choices)

Structural Analysis
- requirement analysis, task analysis
- structured interviews: can be used almost anywhere, for open-ended questions and answers
- rating/Likert scales: commonly used to solicit subjective feedback
  - ex: NASA-TLX (Task Load Index) to assess mental workload
  - ex: "it is frustrating to use the interface": Strongly Disagree | Disagree | Neutral | Agree | Strongly Agree

Comparative User Studies
- hypothesis testing
- error detection methods
- hypothesis: a precise problem statement
  - ex: participants will be faster with a coordinated overview+detail display than with an uncoordinated display or a detail-only display when the task requires reading details
  - measurement: faster
  - objects of comparison: coordinated O+D display; uncoordinated O display; uncoordinated D display
  - condition of comparison: the task requires reading details

Comparative User Studies: Factors and Levels
- factors (independent variables)
  - ex: interface, task, participant demographics
- levels: the number of values in each factor
- combinatorial explosion
  - severe limits on the number of conditions, limited by length of study and number of participants
  - possible workaround is multiple sessions

Comparative User Studies: Within, or Between?
- within: everybody does all the conditions
  - can lead to ordering effects
  - can account for individual differences and reduce noise, thus can be more powerful and require fewer participants
- between: divide participants into groups; each group does only some conditions
- possible confounds? learning effect: did everybody use the interfaces in a certain order? if so, are people faster because they are more practiced, or because of a true interface effect? (see the counterbalancing sketch after the result-analysis slide below)

Comparative User Studies: Measurements (Dependent Variables)
- performance indicators: task completion time, error rates, mouse movement
- subjective participant feedback: satisfaction ratings, closed-ended questions, interviews
- observations: behaviors, signs of frustration
- number of participants: depends on effect size and study design (the power of the experiment)

Comparative User Studies: Result Analysis
- should know how to analyze the main results/hypotheses BEFORE the study
- hypothesis-testing analysis (using ANOVA or t-tests): tests how likely observed differences between groups are due to chance alone
  - ex: a p-value of 0.05 means a 5% probability of seeing such a difference by chance alone; usually good enough for HCI studies
- pilots! should have a good idea of the forthcoming results BEFORE running the actual study trials
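As a concrete companion to the result-analysis slide, here is a minimal sketch of the ANOVA/t-test style of analysis using SciPy. The completion times are invented illustration data; the three groups mirror the coordinated/uncoordinated/detail-only displays from the hypothesis example earlier.

```python
from scipy import stats

# Made-up task completion times in seconds (one value per participant).
coord_od  = [41.2, 38.5, 44.0, 39.9, 42.3, 40.1]   # coordinated O+D display
uncoord_o = [52.7, 49.1, 55.3, 50.8, 53.0, 48.6]   # uncoordinated O display
detail_d  = [58.4, 55.2, 60.1, 57.7, 59.0, 56.3]   # uncoordinated D display

# Two groups: independent-samples t-test (between-subjects comparison).
t, p = stats.ttest_ind(coord_od, uncoord_o)
print(f"t-test: t = {t:.2f}, p = {p:.4f}")   # p < 0.05: unlikely by chance alone

# Three or more groups: one-way ANOVA as the omnibus test.
f, p = stats.f_oneway(coord_od, uncoord_o, detail_d)
print(f"ANOVA:  F = {f:.2f}, p = {p:.4f}")
```

Note that ttest_ind is the between-subjects variant; a within-subjects design would call for the paired test (stats.ttest_rel) instead.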
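And, looking back at the factors-and-levels and within/between slides, a small sketch of how conditions multiply and of the counterbalancing promised above. The factors and their levels are illustrative, and the rotation is the simplest cyclic Latin square, one standard guard against ordering effects in within-subjects designs.

```python
from itertools import product

interfaces = ["coordinated O+D", "uncoordinated O", "uncoordinated D"]  # 3 levels
tasks      = ["read details", "find trend"]                            # 2 levels

# Levels multiply: 3 x 2 = 6 conditions (the combinatorial explosion).
conditions = list(product(interfaces, tasks))
print(len(conditions))

# Within-subjects: rotate the condition order per participant so that no
# single ordering dominates (cyclic Latin square counterbalancing).
def order_for(participant_id):
    k = participant_id % len(conditions)
    return conditions[k:] + conditions[:k]

for pid in range(3):
    print(pid, order_for(pid))
```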
Evaluation Throughout Design Cycle
- user/task centered design cycle: initial assessments, iterative design process, benchmarking, deployment
- The Challenge of Information Visualization Evaluation. Catherine Plaisant. Proc. AVI 2004.
- Task-Centered User Interface Design. Clayton Lewis and John Rieman. Chapters 0-5.

Initial Assessments
- what kinds of problems is the system aiming to address? ex: analyze a large and complex dataset
- who are your target users? ex: data analysts
- what are the tasks? what are the goals? ex: find trends and patterns in the data via exploratory analysis (Amar/Stasko task taxonomy paper)
- what are their current practices? ex: statistical analysis; visual spotting of trends and patterns
- why and how can visualization be useful?
- talk to the users, and observe what they do: task analysis

Iterative Design Process
- does your design address the users' needs? can they use it? where are the usability problems?
- evaluate without users: cognitive walkthrough, action analysis, heuristic analysis
- evaluate with users: usability evaluations (think-aloud)
- identify problems, go back to the previous step

Benchmarking
- how does your system compare to existing ones? empirical, comparative studies
- ask specific questions: compare an aspect of the system with specific tasks
- quantitative, but limited: bottom-line measurements

Deployment
- how is the system used in the wild? how are people using it? does the system fit into the existing workflow? environment?
- contextual studies, field studies

Compare Systems vs. Characterize Usage
- user/task centered design cycle: initial assessments; iterative design process; benchmarking (head-to-head comparison); deployment (identify problems, go back to the previous step)
- understanding/characterizing techniques: tease apart factors; when and how is a technique appropriate
- the line between the two is blurry: intent matters

Perceptual Scalability
- what are the perceptual/cognitive limits when screen-space constraints are lifted?
- macro/micro views
- perceptually scalable: no increase in task completion times when normalized to the amount of data
[The Perceptual Scalability of Visualization. Beth Yost and Chris North. IEEE TVCG 12(5) (Proc. InfoVis 06), Sep 2006, p 837-844.]

Perceptual Scalability: Design
- 2 display sizes, between-subjects: 2 vs. 32 Mpixel display (data size also increased proportionally)
- 3 visualization designs, within-subjects: small multiples of bars; embedded graphs; embedded bars
- 7 tasks, within-subjects
- 42 tasks per participant: 3 vis x 7 tasks x 2 trials

Embedded Visualizations
- attribute-centric instead of space-centric

Small Multiples Visualizations
[figure slides; images from Yost and North 2006]

Results
- 20x increase in data, but only a 3x increase in absolute task times
- significant 3-way interaction between display size, visualization, and task

Results
- visual encoding important on small displays
  - DS: mults sig slower than graphs on small
  - DS: mults sig slower than embedded on large
  - OS: bars sig faster than graphs for small
  - OS: no sig difference bars/graphs for large
- spatial grouping important on large displays
  - embedded sig faster and preferred over small mult
  - no bar/graph differences

Critique
- first study of macro/micro effects: breaking new ground
- many possible followups: physical navigation vs. virtual navigation
  - The Effects of Peripheral Vision and Physical Navigation in Large Scale Visualization. GI 08.
  - Move to Improve: Promoting Physical Navigation to Increase User Performance with Large Displays. CHI 07.

Trends: Animation, Trails, SmallMult
- Gapminder: animated bubble charts + a human presenter
- x/y position, size, color, animation
- is animation effective? presentation vs. analysis; trends vs. transitions
[Effectiveness of Animation in Trend Visualization. Robertson et al. IEEE TVCG 14(6) (Proc. InfoVis 2008), p 1325-1332.]

Trends
- many countertrends lost in clutter

Small Multiples
- individual plots get small

Design
- 2 uses: presentation vs. analysis (between-subjects)
- 3 vis encodings: animation vs. traces vs. small mults
- 2 dataset

Results
- small multiples more accurate than animation
- animation faster for presentation, slower for analysis
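Closing with a back-of-envelope check of the "perceptually scalable" definition from the Yost/North Results slide above: the ratios below are the slide's rough figures (20x the data, about 3x the absolute time), not the paper's raw measurements.

```python
data_ratio = 20.0   # the large display showed ~20x more data
time_ratio = 3.0    # absolute task times grew only ~3x

# "Perceptually scalable" = no increase in task time once normalized
# to the amount of data shown.
normalized = time_ratio / data_ratio
print(normalized)   # 0.15: per-datum time actually fell, so by this
                    # definition the designs scaled perceptually
```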