Evaluating Multivariate Network Visualization Techniques Using a Validated Design and Crowdsourcing Approach

Carolina Nobre¹, Dylan Wootton¹, Lane Harrison², Alexander Lex¹
¹University of Utah, Salt Lake City, UT, USA
²Worcester Polytechnic Institute, Worcester, MA, USA

This is the authors' preprint version of this paper. Please cite the following reference: Carolina Nobre, Dylan Wootton, Lane Harrison, Alexander Lex. Evaluating Multivariate Network Visualization Techniques Using a Validated Design and Crowdsourcing Approach. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), to appear, 2020.

ABSTRACT
Visualizing multivariate networks is challenging because of the trade-offs necessary for effectively encoding network topology and encoding the attributes associated with nodes and edges. A large number of multivariate network visualization techniques exist, yet there is little empirical guidance on their respective strengths and weaknesses. In this paper, we describe a crowdsourced experiment, comparing node-link diagrams with on-node encoding and adjacency matrices with juxtaposed tables. We find that node-link diagrams are best suited for tasks that require close integration between the network topology and a few attributes. Adjacency matrices perform well for tasks related to clusters and when many attributes need to be considered.

We also reflect on our method of using validated designs for empirically evaluating complex, interactive visualizations in a crowdsourced setting. We highlight the importance of training, compensation, and provenance tracking.

Author Keywords
Multivariate networks visualization; crowdsourced evaluation.

CCS Concepts
• Human-centered computing → Empirical studies in visualization;

INTRODUCTION
Multivariate networks (MVNs) are networks comprised of the network's topology in the form of nodes and links, and attributes about those nodes and links. Most real-life networks are multivariate: a social network has node attributes such as the age, name, and institutional affiliations of individuals; and edge attributes such as the types and frequencies of interactions. In many analysis cases, it is necessary to see both the topology and attributes simultaneously [27]. A recent survey by Nobre et al. [24] discusses 11 types of multivariate network visualizations (MVNVs) and gives recommendations of when to use which visualization technique. These recommendations, however, are mostly based on visualization best practices and rely to only a small degree on empirical data, perpetuating the notion of visualization as an "empire built on sand" [21].

How do we know whether a particular visual encoding or an interaction technique is better than the many alternatives available? The visualization literature largely relies on two approaches: controlled experiments that measure correctness and time on simplified tasks for well-defined stimuli [22] and evaluations with experts, such as insight-based evaluation [32, 28] and case studies [34]. Carpendale [6] discusses the conflict between evaluating visualizations with real users and real datasets, with high ecological validity, and controlled experiments that require recruiting a large enough participant sample to draw quantitative conclusions. Running controlled experiments requires abstraction and simplification of real-life tasks and careful variation of a small number of design factors [22].

In this paper, we attempt to answer questions about the merits of two multivariate network visualization techniques by pushing the limits of controlled experiments for complex, interactive visualization techniques. We compare two MVNVs: node-link diagrams (NL) with on-node encoding and adjacency matrices (AM) with a juxtaposed tabular visualization. Testing these two conditions required the design and implementation of two functional prototypes. We designed and implemented these techniques based on existing guidelines, and followed a multi-stage testing and piloting process. We also elicited feedback from experts on network visualization to validate and refine the designs. The aim of these activities was to ensure that the two conditions in the experiment resemble visualizations that might be used in practice. We postulate eight hypotheses and evaluate them with a set of 16 tasks, which we derive from Nobre et al.'s task analysis [24]. We also report on insights generated in an open-ended task.

Our contributions are twofold: We provide the first set of empirical evidence on the performance of two important MVNV visualization techniques for different tasks, and we develop and reflect on an approach for controlled experiments using complex, validated visualization techniques.

RELATED WORK
Here we provide an analysis of prior evaluations of network visualization techniques, followed by a discussion of the current landscape of crowdsourced evaluations of interactive visualization techniques. For prior work regarding MVNVs, we refer to a recent survey [24] and to Kerren et al.'s book [19].

Network Evaluation Studies
There is a long history of work that has evaluated the merits of different network representations [38]. However, most of this work focuses on network topology and treats node and edge attributes only cursorily, if at all. We limit our discussion to larger studies that evaluate NLs and/or AMs.

Several studies compare NL and AM representations [14, 18, 7, 26, 30]. Ghoniem et al. [14] assessed the performance of both approaches using seven topology-based tasks on graphs of sizes between 20 and 100 nodes and densities from 20% to 60% of all possible edges. Interactivity was limited to selecting nodes and links. They found that NL outperformed AM in small and sparse graphs, but for larger or more dense graphs, AM produced more accurate results. The exception is path-based tasks, where node-link diagrams outperformed the adjacency matrix regardless of network size or density. In a follow-up study, Keller et al. [18] evaluated six tasks on a domain-specific, directed network, using both NL and AM representations. They confirmed the results of Ghoniem et al. for the investigated network types. Okoe et al. [26] also reproduced the Ghoniem et al. study for larger networks (up to 330 nodes), but for much sparser graphs with densities of 1.6% to 3.2% of edges. They used a more diverse set of tasks in a large, crowdsourced study. Interactions included panning and zooming, moving nodes in the NL condition, and highlighting nodes and incident edges. Color was used to encode clusters that were detected algorithmically. Their work confirms earlier findings on each technique and reveals that adjacency matrices perform better on cluster tasks.

Ren et al. [30] compared NL with two sorting variants of AMs for networks of 20 and 50 nodes in a large, crowdsourced study. They found that NL resulted in higher accuracy and faster response time, but that this difference diminished as participants became familiar with the visualizations.

Three studies evaluated approaches for visualizing two or more edge attributes. Alper et al. [3] found that for tasks involving the comparison of weighted graphs, AMs outperformed NLs. Abuthawabeh et al. [1] ran a user study comparing AMs with [...]

[...] al. [22] have categorized user studies that are aimed at evaluating user performance into two types: (1) understanding the limits of visual perception and cognition for specific visual encodings, and (2) assessing how one visualization or interaction technique compares to another. User studies can be carried out either in a controlled lab setting or by using a crowdsourcing platform. Although lab studies afford the most control over participant selection, participant attention, training, and the testing environment, they also incur a high cost of recruiting users, as well as an inherently limited participant pool [13]. Crowdsourcing platforms offer a potential solution to this problem by providing access to a much larger group of participants. The existing user performance work on evaluating interactive visualizations is characterized primarily by validating a new approach comparing it with existing ones [4, 2], which is almost exclusively carried out in a lab setting with a smaller number of participants. Even though crowdsourced studies have been frequently used for performance evaluations of the perceptual type (e.g., [35, 16, 15]), they are relatively scarce for evaluating interactive visualization techniques (e.g., [26, 11]). This scarcity may stem from the perceived challenges of using a remote group of non-expert participants to evaluate an interactive system, a topic we investigate in this work. Additionally, very little work has compared two or more existing complex, interactive techniques.

CHALLENGES
The visualization community has embraced a wide set of quantitative and qualitative evaluation methodologies, ranging from quantitative experiments conducted in a lab or on crowdsourcing platforms, to qualitative studies, to insight-based evaluation, to case studies [22]. In this paper, we conduct a crowdsourced study with two complex, interactive visualization techniques. Developing such techniques requires many design decisions. How can we know that any effects we observe are not confounded by any of these design decisions? Carpendale [6] describes three factors to consider: generalizability (can a study be applied to other people and situations), precision (can a
