
Visualization Viewpoints
Editor: Theresa-Marie Rhyne

Pathways for Theoretical Advances in Visualization

Min Chen, University of Oxford
Georges Grinstein, University of Massachusetts
Chris R. Johnson, University of Utah
Jessie Kennedy, Edinburgh Napier University
Melanie Tory, Tableau

More than a decade ago, Chris Johnson proposed the "Theory of Visualization" as one of the top research problems in visualization.1 Since then, there have been several theory-focused events, including three workshops and three panels at IEEE Visualization (VIS) conferences. Together, these events have produced a set of convincing arguments:

■ As in all scientific and scholarly subjects, theoretical development in visualization is a necessary and integral part of the progression of the subject itself.
■ Theoretical developments in visualization can draw on theoretical advances in many disciplines, including, for example, mathematics, computer science, engineering science, psychology, neuroscience, and the social sciences.
■ Visualization holds a distinctive position, connecting human-centric processes (such as human perception, cognition, interaction, and communication) with machine-centric processes (such as statistics, algorithms, and machine learning). It therefore provides a unique platform for conducting theoretical studies that may impact on other disciplines.
■ Compared with many mature disciplines (such as mathematics, physics, biology, psychology, and philosophy), theoretical research activities in visualization are sparse. The subject can therefore benefit from a significantly increased effort to make new theoretical advances.

Visualization theory is commonly viewed as a research focus for only a few individual researchers. Its outcomes, perhaps in the form of theorems or laws, are perceived as too distant from practice to be useful. Perhaps inspired by well-known theoretical breakthroughs in the history of science, visualization researchers may unconsciously hold high expectations for the originality, rigor, and significance of the theoretical advancements to be made in a research project or presented in a research paper.

On the contrary, although textbooks tend to attribute major breakthroughs to single pioneers at specific times and places, such breakthroughs generally take years or even decades and are usually the product of numerous incremental developments, including a substantial number of erroneous solutions suggested by the pioneers themselves as well as by many lesser-known individuals. Many complex discoveries did not initially appear to have elegant proofs, and it has taken some challenging and often questionable steps to obtain the well-formulated solutions we find in modern textbooks. For example, most readers associate the theory of general relativity with Albert Einstein's November 1915 discovery in Berlin. According to Pedro Ferreira,2 Einstein first speculated about the generalization in 1907. He then published two papers with Marcel Grossmann (Zurich) in 1913 that sketched out the theory and worked with David Hilbert (Göttingen) on the problem in June 1915. Some of the most important discoveries related to the theory of general relativity are Mercury's perihelion shift (Le Verrier, 1859), the 1919 eclipse expedition (Eddington, Cottingham, Crommelin, and Davidson), the evolving universe (Friedmann, 1922; Lemaître, 1927), the expanding universe (Slipher, 1915; Lundmark, 1924; Hubble and Humason, 1929), the big bang (Lemaître, 1931), and the black hole (Schwarzschild, 1916; Chandrasekhar, 1935; Landau, 1938; Oppenheimer and his students, 1939). Some ill-fated solutions also followed Einstein's 1915 discovery, the most notable of which were perhaps the static universe (Einstein and de Sitter) and the suspended universe (Eddington).


Figure 1. A theoretical foundation typically evolves through iterative developments. The figure arranges the aspects (taxonomies and ontologies; principles and guidelines; conceptual models; theoretic frameworks; quantitative laws and theoretic systems) and links them with transformations such as providing definitions, categorization, and suggested relations; suggesting causal relations; providing context and structure; enabling measurement, abstraction, and deduction of laws; transforming to axioms, theorems, and conjectures; providing mathematical validation and numerical simulation; a subset becoming a model; and a model becoming a system. The development of each aspect both influences and benefits from that of the others, and a successful transformation between different aspects indicates a theoretical enhancement of understanding. Note that the third aspect, "Conceptual Models and Theoretic Frameworks," is represented by two boxes in the figure.

During an Alan Turing Institute event in London in April 2016 on the theoretical foundation of visual analytics, discussions on the need to build such a theoretical foundation varied greatly, with opinions ranging from "Visualization should not be physics-envy" to "It is irresponsible for academics not to try." After two days of presentations, discussions, and debates, the attendees gradually converged on a common understanding that a theoretical foundation consists of several aspects (see the "Major Aspects of a Theoretical Foundation" sidebar for more details) and that every visualization researcher should be able to make direct contributions to some aspect of the theoretical foundation of visualization.

During the IEEE VIS conference in Baltimore, Maryland, in October 2016, a discussion panel took this viewpoint further by outlining avenues for pursuing theoretical research in each aspect. This article is a structured reflection by the panelists on the discussions during that IEEE VIS 2016 panel.

In this article, we first review four major aspects of a theoretical foundation and then discuss the interactions and transformations between them. Figure 1 provides an overview of the discourse in this article.

Major Aspects of a Theoretical Foundation

Attendees at an Alan Turing Institute event in London in April 2016 on the theoretical foundation of visual analytics reached a consensus that a theoretical foundation consists of the following aspects.

Taxonomies and Ontologies. In scientific and scholarly disciplines, a collection of concepts is commonly organized into a taxonomy or an ontology. In the former, concepts are known as taxa and are typically arranged hierarchically using a tree structure. In the latter, concepts, often in conjunction with their instances, attributes, and other entities, are organized into a network, where edges represent various relations and rules.

Principles and Guidelines. A principle is a law or rule that must be followed and is usually expressed as a qualitative description. A guideline describes a process or a set of actions that may lead to a desired outcome or, alternatively, actions to be avoided to prevent an undesired outcome. The former usually implies confidence in the high degree of generality and certainty of the causality concerned, whereas the latter suggests that a causal relation may be subject to specific conditions.

Conceptual Models and Theoretic Frameworks. The terms models and frameworks have broad interpretations. Here we consider that a conceptual model is an abstract representation of a real-world phenomenon, process, or system, featuring different functional components and their interactions. A theoretic framework provides a collection of measurements and basic operators and functions for working with these measurements. The former provides a tentative description of complex causal relations in the real world, and the latter provides a basis for evaluating different models quantitatively.

Quantitative Laws and Theoretic Systems. A quantitative law describes a causal relation of concepts using a set of measurements and a computable function confirmed under a theoretic framework. Under a theoretic framework, a conceptual model can be transformed into a theoretic system through axioms (postulated quantitative principles) and theorems (confirmed quantitative laws). Unconfirmed guidelines are thus conjectures, and contradictory guidelines are paradoxes.

Taxonomies and Ontologies

For millennia, humans have been classifying things in the world around them by describing and naming concepts to facilitate communication (see the "History of Taxonomy and Ontology" sidebar). The most significant and enduring effort is the classification of life on Earth, which commenced in Aristotle's time, became mainstream through the work of Linnaeus, and continues to this day, with new species being identified and alternative classifications of existing species being proposed.3

Alternative classifications (taxonomies) arise over time as a result of differing opinions about the importance of the differentiating characteristics used in creating the concepts (taxa). These differing opinions are usually the result of new data becoming available, often through technological advances, which can result in the same organism being classified according to different taxonomic opinions and subsequently having several alternative names, which may in turn lead to miscommunication. Newer classifications are usually improvements on previous ones, but sometimes the existence of alternative classifications reflects a disagreement as to how to interpret the data on which the classification is based.

Ontologies are representations of different relationships among various concepts. Naturally, they are built on the taxonomic classification of the concepts of both entities and relationships. Taxonomies and ontologies are means of conceptualizing, understanding, organizing, and reasoning about these entities and relationships. They are central to communicating about the world around us. They play an increasing role in the visualization field, allowing us to organize and formalize our knowledge.

A brief review of the literature over the past three decades reveals at least 70 publications containing some form of visualization taxonomy.3 Three questions are relevant when considering visualization taxonomies: What is being classified (domain)? Why is the taxonomy being developed (purpose)? How is the taxonomy constructed (process)? Taxonomies have been proposed to classify many aspects of visualization, including systems, tools, techniques, interaction approaches, data types, user tasks, visual encodings, input methods, and evaluation strategies. These aspects can be classified according to different criteria. For example, visualization techniques can be classified by the analytical tasks they support, the visual encoding or metaphor used, the data type, or the domain in which they are employed.

The visualization community has found taxonomies useful in its research. Taxonomies offer a shared vocabulary with which we can communicate effectively and reduce misunderstanding.4 They orient us among the vast number of techniques and tools that have already been developed, often across disparate domains. Taxonomies are therefore frequently adopted in literature surveys to categorize existing work. Furthermore, using taxonomies as design spaces can reveal novel research opportunities, for example, by conducting gap analysis, as the sketch below illustrates.
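To make the idea of gap analysis concrete, here is a minimal sketch of our own (the dimensions, their values, and the surveyed techniques are all illustrative, not drawn from any published taxonomy): it enumerates a design space as the Cartesian product of taxonomy dimensions and reports the combinations not yet covered by surveyed work.

```python
from itertools import product

# Illustrative taxonomy dimensions (hypothetical, not a published taxonomy).
design_space = {
    "data_type": ["tabular", "network", "field", "text"],
    "task":      ["overview", "comparison", "anomaly detection"],
    "encoding":  ["position", "color", "size"],
}

# Techniques surveyed so far, classified along the same dimensions (made up).
surveyed = {
    ("tabular", "overview", "position"),
    ("network", "comparison", "color"),
    ("field",   "anomaly detection", "color"),
}

# Gap analysis: enumerate the full design space and list unexplored cells.
dimensions = list(design_space)
all_cells = set(product(*design_space.values()))
gaps = sorted(all_cells - surveyed)

print(f"{len(gaps)} of {len(all_cells)} cells are unexplored, e.g.:")
for cell in gaps[:5]:
    print("  " + ", ".join(f"{d}={v}" for d, v in zip(dimensions, cell)))
```

In practice, of course, many cells of such a product space are uninteresting or infeasible; the value of the enumeration is in prompting the question of why a cell is empty.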
In comparison, the term "ontology" appears much less frequently in the visualization literature. This is partly because some studies on ontological relationships are presented as qualitative models. Because ontologies are typically described in ontology languages, such as OWL (Web Ontology Language) and RDF (Resource Description Framework), they can be used by algorithms in visualization systems. For example, ontologies can be used to generate annotations and filter or highlight visual objects automatically in visualization, to enable the automated creation of visualizations, and to integrate keyword search and visual exploration in a user interface.5 The sketch that follows conveys the flavor of such ontology-driven automation.
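The following toy sketch is our own illustration (the class names, instances, and the highlight() stand-in are all hypothetical; a real system would use an OWL/RDF toolkit and a scene graph): it stores subclass and instance relations as triples and highlights every visual object whose concept falls under a queried class.

```python
# Toy ontology as subject-predicate-object triples (hypothetical classes).
triples = {
    ("Protein",  "subClassOf", "Molecule"),
    ("Enzyme",   "subClassOf", "Protein"),
    ("kinase_A", "instanceOf", "Enzyme"),
    ("ligand_B", "instanceOf", "Molecule"),
}

def subclasses(cls):
    """All classes reachable from cls by following subClassOf downward (incl. cls)."""
    found, frontier = {cls}, {cls}
    while frontier:
        frontier = {s for (s, p, o) in triples
                    if p == "subClassOf" and o in frontier} - found
        found |= frontier
    return found

def instances_under(cls):
    """Instances whose class is cls or any of its subclasses."""
    classes = subclasses(cls)
    return {s for (s, p, o) in triples if p == "instanceOf" and o in classes}

# Highlight all visual objects standing for proteins (catches the enzyme too,
# because Enzyme is a subclass of Protein in the toy ontology).
for node in instances_under("Protein"):
    print(f"highlight({node})")   # stand-in for a real scene-graph call
```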

For designers, taxonomies and ontologies play a role in systemizing the design process and can be employed at multiple stages, such as domain characterization and abstraction, selection of appropriate visual encodings and interaction techniques, and formulation of data and information flows. In addition, taxonomies and ontologies provide the basis for studying causal relationships, thereby facilitating the development of guidelines and qualitative models.

History of Taxonomy and Ontology

The term taxonomy comes from the Greek word taxis (meaning "order" or "arrangement") and the suffix -nomos (meaning "law" or "science").1 Plato was among the first to formulate methods for grouping objects based on their similar properties. Aristotle wrote Categories, which provides an in-depth study of classes and objects. Naming and classifying plants and animals dates back to the origin of human languages. The development of modern botanical and zoological taxonomy is often attributed to Carl Linnaeus (1707-1778), a Swedish botanist who defined many of the rules that taxonomists use today. The development of taxonomy in biology facilitated the paradigm shift in the 19th century when the theory of evolution was proposed. The automatic construction of hierarchical categorization schemes began in the 1960s, with applications such as decision-tree-based classification, computational phylogenetics, and topic analysis in text mining.

The term ontology comes from the Greek prefix onto- (meaning "being" or "that which is") and the suffix -logia (meaning "logical discourse," "study," or "theory").2 The term ontologia first appeared in the works of the German philosophers Jacob Lorhard (1606) and Rudolf Göckel (1613). It refers to the philosophical study of the concept of "being" and its variants (for example, "becoming," "existence," and "reality") as well as the categorization of the concept and the relationships between different categories. Taxonomy is often viewed as a subset of ontology, which primarily considers the grouping relationships; ontology can be seen as a generalization of taxonomy that allows for different types of relationships among different entities.

An ontology is a form of knowledge representation,3 where entities are defined with names, types, properties, and different relationships with other entities. Its applications in computer science include artificial intelligence, the Semantic Web, biomedical informatics, library science, systems engineering, software engineering, and many more. The methodology has also been used in visualization.4

References
1. P.F. Stevens, The Development of Biological Systematics, Columbia Univ. Press, 1994.
2. J.F. Mora, "On the Early History of 'Ontology,'" Philosophy and Phenomenological Research, vol. 24, no. 1, 1963, pp. 36-47.
3. J.F. Sowa, Conceptual Structures: Information Processing in Mind and Machine, Addison-Wesley, 1984.
4. O. Gilson et al., "From Web Data to Visualization via Ontology Mapping," Computer Graphics Forum, vol. 27, no. 3, 2008, pp. 959-966.

Building taxonomies and ontologies is an investigative science because they often feature partial and evolving hypotheses. A number of considerations therefore arise during the process, including determining the subpopulation to study; identifying the characteristics used to define a class, a relation, or the level of specificity; comparing the importance of different characteristics; differentiating among the various terms used for specifying characteristics; selecting effective visualization techniques for visualizing large taxonomies and ontologies; automatically generating a taxonomy or ontology from text analysis of the visualization literature; and automatically evolving a taxonomy or ontology using machine learning.

Taxonomies and ontologies are fundamental tools that help with understanding, communication, and development in the visualization field. Still, a number of challenges and open questions remain: Can we define a methodology for creating, comparing, and integrating taxonomies and ontologies? At what levels and granularity should taxonomies or ontologies be specified? How do we select one or more taxonomies (or ontologies) for our work? The visualization field continues to change, so taxonomies and ontologies must evolve as well. We must continue to improve their construction and use.

Principles and Guidelines

A guideline embodies a wisdom advising a sound practice. This may be a course of action to take or to avoid in achieving a goal. Guidelines are commonly outlined based on accumulated experience and knowledge about some causal relations in a process. It takes courage and conviction to propose a new guideline. And it takes a lot more courage and fair-mindedness to accept critiques of a proposed guideline and then retract or refine it.

Some guidelines stand the test of time and become principles. Many others are effective in only specific circumstances. Because of the qualitative nature of framing guidelines and the typically self-directed mechanism for creating and evolving them, now and then some may be defined without rigorous care, generalized beyond their intended application, become out of date, or conflict with other guidelines. Many documents about guidelines therefore contain a disclaimer such as: "By definition, following a guideline is never mandatory. Guidelines are not binding and are not enforced."6

In many disciplines, such as biology and medicine, guidelines have played an indispensable role and are rigorously evaluated, critiqued, and maintained. In other disciplines, such as physics, chemistry, and engineering, old wisdoms have gradually been transformed into quantitative laws and quantitative process management.

In the visualization field, guidelines have no doubt played a positive role in designing and developing visualization systems, as well as in education (see the "Examples of Visualization Guidelines" sidebar for more details). For example, Miriah Meyer and her colleagues considered guidelines to be an integral part of an agile process for developing visual designs and visualization systems, helping designers make choices.7 Torre Zuk and his colleagues argued that guidelines can be used as heuristics for evaluating visual designs and visualization systems.8

These recommendations inevitably place a huge burden on the correctness and effectiveness of guidelines. If visualization guidelines are going to play a pivotal role, as these researchers suggested,7,8 we will need to take several steps:

■ develop mechanisms for curating, evaluating, critiquing, and refining guidelines in an open and transparent manner;
■ establish a culture of open, democratic, evidence-based discourse on guidelines and enable broader participation in the discourse beyond the current scale of a few papers and blogs; and
■ inspire researchers to study guidelines, including their evolution and applicability under different conditions, using scientific methods and, when appropriate opportunities arise, to transform guidelines into quantitative laws and process management.

Examples of Visualization Guidelines

In the visualization field, various books, research papers, and online media recommend several hundred different guidelines. Here we offer just a subset of examples:

■ Maximize the data-ink ratio: E.R. Tufte, The Visual Display of Quantitative Information, Graphics Press, 1983, p. 93.
■ Overview first, zoom and filter, then details on demand: B. Shneiderman, "The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations," Proc. IEEE Symp. Visual Languages, 1996, pp. 336-343.
■ Rainbow color map guidelines: B.E. Rogowitz and L.A. Treinish, "Data Visualization: The End of the Rainbow," IEEE Spectrum, vol. 35, no. 12, 1998, pp. 52-59, and D. Borland and R.M. Taylor II, "Rainbow Color Map (Still) Considered Harmful," IEEE Computer Graphics & Applications, vol. 27, no. 2, 2007, pp. 14-17.
■ Ten guidelines for data visualization: C. Kelleher and T. Wagener, "Ten Guidelines for Effective Data Visualization in Scientific Publications," Environmental Modelling & Software, vol. 26, no. 6, 2011, pp. 822-827.
■ Fourteen guidelines for data visualization: schoolofdata.org/2013/04/26/data-visualization-guidelines-by-gregor-aisch-international-journalism-festival/.
■ Six guidelines for creative visualization: www.tut.com/article/details/12-6-guidelines-for-creative-visualization/?articleId=12.

Social scientists have established research methods for collecting and analyzing qualitative data in order to infer concrete theoretical insights, including taxonomies, ontologies, guidelines, and conceptual models. One such method is grounded theory.9 It involves observing practical phenomena in the wild (to ground the theory in real-world data), identifying categories of instances (events, processes, occurrences, participants, and so on), making links between categories, and establishing relationships between them. The method utilizes descriptive labeling (referred to as coding) to conceptualize discrete instances of phenomena systematically. It advocates continuous comparative analysis and negative case analysis to ensure the coding is comprehensive, meticulous, and up to date. It encourages researchers to interact with data by asking questions, to broaden the sampling by exploring related phenomena, and to write memos.

By enabling categorization and relationship discovery, the grounded theory method can facilitate the development of visualization taxonomies and ontologies by supporting the analysis of causal relationships. By pursuing both positive and negative case studies and undertaking continuous comparative analysis, we facilitate the evaluation, critique, revision, and improvement of guidelines. By enabling the curation of a relatively complete and coherent set of causal relationships functioning in a system, we help establish a conceptual model. The toy sketch below mimics the mechanics of the coding step.
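The sketch is purely our own illustration, with made-up observations and code labels; real grounded-theory coding is interpretive work by a researcher, not an automatic computation. It shows only the bookkeeping side: observations labeled with codes, and code co-occurrence counts that suggest candidate links between categories.

```python
from collections import Counter
from itertools import combinations

# Field observations, each already labeled with descriptive codes (made up).
observations = [
    {"zoom_used", "detail_sought"},
    {"zoom_used", "detail_sought", "overview_first"},
    {"overview_first", "orientation_gained"},
    {"zoom_used", "detail_sought"},
]

# Count how often pairs of codes co-occur across observations.
pair_counts = Counter()
for codes in observations:
    pair_counts.update(combinations(sorted(codes), 2))

# Frequent co-occurrence suggests a candidate relationship between categories,
# to be scrutinized via comparative and negative case analysis, not accepted as is.
for (a, b), n in pair_counts.most_common():
    if n >= 2:
        print(f"candidate link: {a} <-> {b}  (seen together {n}x)")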
Conceptual Models and Theoretic Frameworks

A conceptual model can be a representation of an object, process, or system. It is typically used to describe and explain the causal relationships exhibited in the phenomena of a physical, biological, economic, social, or any other type of system, which may be intuitively observable, may not be experienced directly, or may be totally hypothesized.

The descriptions of many models are accompanied by visual representations that help link conceptualization with observation. The physicist Richard Feynman created new visual abstractions of the physics and mathematics of quantum electrodynamics so that he could more easily reason about the complex mathematics.10 Feynman famously had his van painted with his diagrams of the interactions of subatomic particles (see Figure 2).


Figure 2. Richard Feynman's 1975 Dodge van. Feynman had the behavior model of subatomic particles painted on the sides. (Courtesy of ArtCenter College of Design.)

In most disciplines, model development has been a driving force for progression. It fuels and guides the advancement of a subject by enabling abstraction, proposition, prediction, and validation (using experimentation, mathematics, and computation). Models are central to what researchers do, both in their research and when communicating their explanations. The development of the standard model in particle physics was a collective effort of scientists around the world throughout the second half of the 20th century. In the same way, the discovery of the double-helix model of DNA was an iterative research endeavor in the early 1950s. Many intermediate steps, ranging from the partial alpha-helix model and the incorrect triple-helix model by Linus Pauling to the x-ray diffraction experiments by Rosalind Franklin and others, paved the way for James Watson and Francis Crick to formulate the landmark model in biology.

In the visualization field, researchers have proposed more than a dozen conceptual models for describing the relationships among data, visualization systems, analytical techniques, interaction methods, human perception and cognition, user tasks, and application contexts.3 The goal of such models is to help us describe, understand, reason about, and predict what people can do in a visualization process and environment, which actions might lead to which results in given circumstances, and which workflows are more efficient or effective than others.

For example, a personal visualization model for fitness tracker data11 helped explain why the on-calendar visualization approach was more effective than a traditional fitness feedback tool, and more importantly, it provided a theoretical basis from which general design guidelines for behavior tools can be derived. Another example is a human cognition model for visualization.12 Based on human ergonomics and cognitive psychology, the model defines human leverage points where cognitive experiments can be conducted for quantitative and qualitative evaluation of visualizations. Similarly, sense-making models have played an important role in supporting the design of interactive analysis tools.13

Hence, building correct and effective conceptual models for visualization must be an endeavor on the part of the visualization community. Learning from other disciplines, we must significantly increase our efforts in experimentation, theorization, and computational simulation and validation.

Experimentation and Qualitative Theorization

The visualization literature includes more than 40 empirical studies of human perception and cognition in visualization, as well as more than 40 others comparing different visualization techniques. In addition, through numerous application case studies, visualization researchers have had firsthand experience observing a variety of data, visualization systems, analytical techniques, interaction methods, human perception and cognition, user tasks, and application contexts in the wild.

These empirical studies and application case studies provide opportunities to formulate new models, perform continuous comparative analysis, probe negative experience, critique and improve existing models, broaden theoretical sampling, and explore model unification and theoretical saturation, all of which are advocated by the grounded theory methodology mentioned earlier. Rigorously building and analyzing qualitative models will inevitably motivate further theorization through the development of quantitative models.
Quantitative Theorization

In many applications, especially in the physical sciences, models are often formulated using a particular mathematical framework. For example, in physics, Newton invented calculus (also credited to Leibniz) to underpin classical mechanics, and Einstein used Riemannian geometry to underpin his general theory of relativity. Today, we commonly see publications entitled "mathematical framework X for model Y." In some situations, a model Y may itself have evolved into an elegant mathematical framework that can be used to underpin other models. For example, information theory, which is underpinned by probability theory, has become a fundamental framework for telecommunication, data communication, data compression, and data encryption.

Several mathematical frameworks have been proposed for underpinning quantitative theorization in visualization, including information theory14 and algebra.15 Naturally, we hope that some qualitative models in the visualization literature can be described using such a framework with quantitative measurements, which may not be quite accurate initially. Lack of accuracy does not always mean wrong, however. We must remember that Newton's first law of motion could not be fully validated until the technology for creating the conditions of a vacuum became available. Having errors is not always unhelpful, either. We must remember that the discrepancy between the prediction of Newtonian gravity and the observed orbit of Mercury inspired the discovery of the theory of general relativity. As a flavor of what an information-theoretic framework can measure, consider the sketch below.
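The example is our own minimal sketch in the spirit of the information-theoretic framework cited above,14 with made-up data and an arbitrary binning; it computes the Shannon entropy of a data attribute and of a coarse visual mapping of it, so that the drop in entropy quantifies the information lost by the mapping.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A data attribute and a coarse visual mapping of it (made-up example).
data = [3, 7, 8, 12, 13, 21, 22, 40]
binned = ["low" if v < 10 else "mid" if v < 25 else "high" for v in data]

h_data, h_vis = entropy(data), entropy(binned)
print(f"H(data) = {h_data:.2f} bits, H(visual) = {h_vis:.2f} bits")
print(f"information lost by the mapping: {h_data - h_vis:.2f} bits")
```

Whether such a loss is a cost or a benefit depends on the task: an overview deliberately discards detail, which is exactly the kind of trade-off a quantitative framework lets us reason about explicitly.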

Computational Simulation and Validation

In most disciplines where visualization techniques are routinely deployed, computational science, together with experimentation and theorization, now constitutes the "third pillar" of scientific inquiry, enabling researchers to build and test models of complex phenomena. Advances in computing and connectivity make it possible to capture, analyze, and develop computational models for unprecedented amounts of experimental, observational, and simulation data, addressing problems previously deemed intractable or beyond imagination.16

Once we have quantitative models of visualization phenomena and processes, we can simulate such models computationally, validating them against experimental results and making predictions about causal relations in a visualization process. For example, we can model the relationships among volume datasets, volume-rendering algorithms, and the resultant imagery data. Such a model can be used to predict the discretization errors, order of accuracy, and convergence performance, as well as to verify whether they meet the requirements of the application concerned.17 The cognition literature shows that human observers' perception errors may not linearly correlate with discretization errors, so it would be exciting to extend such a model to include more elements of human perception and cognition. The sketch below shows the arithmetic behind an order-of-accuracy estimate in such verification.
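The following is our own hedged sketch (the error values are synthetic, not from any verification study) of the standard convergence-rate estimate: given errors e measured at successive grid spacings h, the observed order of accuracy between two refinement levels is p = log(e1/e2) / log(h1/h2).

```python
import math

# Synthetic (grid spacing, measured error) pairs from successive refinements.
refinements = [(0.100, 8.0e-3), (0.050, 2.1e-3), (0.025, 5.2e-4)]

# Observed order of accuracy between consecutive refinement levels:
# p = log(e1/e2) / log(h1/h2). A second-order method should give p ~ 2.
for (h1, e1), (h2, e2) in zip(refinements, refinements[1:]):
    p = math.log(e1 / e2) / math.log(h1 / h2)
    print(f"h: {h1:.3f} -> {h2:.3f}   observed order p = {p:.2f}")
```

If the observed p falls well below the method's theoretical order, verification has caught a discrepancy worth investigating, which is precisely the role such models play.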
Quantitative Laws and Theoretic Systems

In all branches of science, many quantitative laws are regarded as disruptive discoveries because they represent great leaps in our understanding of causal relationships, from numerical uncertainty to numerical certainty.18 As we discussed earlier, any visualization guideline that has stood the test of time should be regarded as a principle. Furthermore, any principle in visualization can be formulated and proved under a theoretic framework (see the "An Example Theoretic System: Probability Theory" sidebar for an example). For example, part of Ben Shneiderman's guideline "overview first, zoom and filter, then details on demand" was proved using information theory (including an anomaly investigation).14 The filtering part of the guideline likely requires a more complex proof because defining the filtering that would result in the desired details might require additional variables.

In many disciplines, some laws have parameters that may be constants. The discovery of such fundamental constants (such as the speed of light and the absolute zero temperature) transforms postulated laws into truly quantitative laws. Discovering values that fit such parameters often requires extensive experimentation. For example, in psychology, Fitts's law has two parameters that vary according to the choice of input device, and Stevens' law also has two parameters that vary according to the choice of physical stimulus. Such parameters suggest that a more general quantitative law may be hidden underneath. For example, if Newton's second law of motion had used volume instead of mass, it would have required an object-dependent parameter that we now know as density. Worse, if it had used surface area instead of mass, one would need even more object-dependent parameters.

The visualization discipline provides great opportunities for postulating parameterized laws and for discovering values for such parameters in different scenarios. From such discoveries, we could potentially make more fundamental leaps in our understanding, as long as we continue to investigate the causes of any unattractive parameterization. The sketch below shows how the two parameters of Fitts's law can be estimated from pointing trials.
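As an example of discovering values for such parameters, here is a minimal sketch of our own (the timing data are fabricated): it fits the two Fitts's law parameters a and b in MT = a + b * log2(2D/W) by ordinary least squares. Repeating the fit per input device yields the device-specific constants mentioned above.

```python
import math

# Fabricated pointing trials: (distance D, target width W, movement time in s).
trials = [(256, 32, 0.62), (512, 32, 0.78), (512, 16, 0.91), (1024, 16, 1.05)]

# Fitts's law: MT = a + b * ID, with index of difficulty ID = log2(2D / W).
xs = [math.log2(2 * d / w) for d, w, _ in trials]
ys = [t for _, _, t in trials]

# Ordinary least squares for the slope b and intercept a.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x
print(f"a = {a:.3f} s, b = {b:.3f} s/bit")
```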


An Example Theoretic System: Probability Theory

(Ω, E, P) is a measure space, where Ω is the sample space, E is the event space, and P(e) is the probability measure of an event e ∈ E.

Axioms
1. The probability of an event is a nonnegative real number: P(e) ∈ R, P(e) ≥ 0, ∀e ∈ E.
2. The probability that at least one of the elementary events in the entire sample space will occur is 1: P(Ω) = 1.
3. Any countable sequence of mutually exclusive events (e₁, e₂, …) satisfies P(e₁ ∪ e₂ ∪ ⋯) = P(e₁) + P(e₂) + ⋯.

Example Law: Monotonicity
If E_A is a subset of or equal to E_B, then the probability of E_A is less than or equal to the probability of E_B. That is, if E_A ⊆ E_B, then P(E_A) ≤ P(E_B).

A Skeleton of a Theoretic System for Visualization

(W, Q, X) is a measure space, where W is the sample space, Q is a state space defined by a subset of all possible alphabets in visualization (such as data D, task T, medium M, visual representation V, human capability H, and interaction I), and X is a subset of all possible measures in visualization (such as probability, mutual information, accuracy, time, cognitive load, error, and uncertainty).

Axioms
1. An axiom may be defined based on a principle (one that has stood the test of time), and it cannot be deduced from the other axioms.
2. …

Example Law: Optimal Visual Representation
Let v ∈ V be a visual representation, where v is optimal under a particular goodness measure M ∈ X. Let S be the state space based on all variables Q − {V}, that is, the subset of Q without the visual representation V. With appropriate definitions of M and S, we have M(v, s) ≥ M(w, s), ∀w ∈ V, ∀s ∈ S.
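To illustrate how a law follows from axioms in such a system, here is a minimal proof sketch of the monotonicity law above; it is standard probability-theory reasoning supplied by us, not taken from the sidebar itself.

```latex
% Monotonicity from the axioms: if E_A \subseteq E_B then P(E_A) \le P(E_B).
\begin{align*}
E_B &= E_A \cup (E_B \setminus E_A)
      && \text{disjoint decomposition, since } E_A \subseteq E_B \\
P(E_B) &= P(E_A) + P(E_B \setminus E_A)
      && \text{by Axiom 3 (additivity over exclusive events)} \\
       &\ge P(E_A)
      && \text{by Axiom 1, } P(E_B \setminus E_A) \ge 0.
\end{align*}
```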

When a number of quantitative laws share a common measure space that includes all the variables to be measured and all the measurement functions, they indicate the existence of a theoretic system, where new quantitative laws can be inferred from existing ones (see the "A Skeleton of a Theoretic System for Visualization" sidebar for an example). In mathematics, axiomatization has been one of the driving forces in discovering rich axiomatic systems, each of which is underpinned by a set of primitive axioms. Historically, the early efforts that aimed to derive a self-complete axiomatic system motivated many innovations (such as in geometry), but they often failed to achieve the aim itself. Such failures led to Gödel's incompleteness theorems, which confirmed that such a self-complete axiomatic system is unattainable for any slightly complex theoretical system. Nevertheless, discovering axioms in a theoretical system is a noteworthy achievement in itself, as long as we are aware of the axioms' limitations. Such a discovery is analogous to the pursuit of curating, evaluating, critiquing, and revising guidelines to discover principles.

One challenge in formulating a theoretic system for visualization is that there appear to be many variables in a visualization process, such as the source datasets, visualization tasks, display media, interaction devices, human viewers' knowledge and experience, interaction actions, application contexts, and so on. Some measurements are more attainable, such as data size, accuracy, and time. Other measurements may be problematic in terms of their theoretical conceptualization or practical implementation, such as information, knowledge, cognitive load, and task performance. Nevertheless, a theoretic system can be built bit by bit. We might start with a subset of these variables while fixing the other variables to a set of constants related to a scenario. We could also identify principles applicable to such a scenario and use them to formulate axioms and laws. Then, we could derive new laws from the existing axioms and laws in the system and test these new laws using experimentation and simulation. Any negative testing results will motivate further investigations into the theoretic system itself, as well as into the experimentation and simulation methods, yielding new improvements and advancements. New laws derived and confirmed in this way can be disseminated as new guidelines in practice.

The development of small theoretic systems will naturally lead to new advancements through integration and unification. For example, one theoretic system may focus on cognitive load in its measure space, and another may focus on training costs. Their unification would result in a more elegant and applicable theoretic system. We can also expand our horizons in the endeavor to build theoretic systems for visualization, for example, by addressing the relationships between visualization and emotions, aesthetics, language, social objects, or ethics.

Building a Theoretical Foundation

The visualization field has already seen more than 100 research papers on different aspects of a theoretical foundation for visualization. A recent keyword search using the term "visualization theory," for example, returned a wide variety of topics.

Intriguingly, all the returned items contained the word "measure" or variants of it. All included some ordered or numerical measurements, such as reliability, accuracy, correctness, limits, or optimality. Some papers discussed these measurements in the context of a framework, a model, or some form of a theory, and most included the word "quantify" or its variants. In addition to traditional quantities such as accuracy, precision, and time, the search results revealed some ambitious attempts to measure particular forms of human insight, understanding, performance, creativity, knowledge, cognitive load, learning, confidence, and many other attributes. A similar search returned more than 200 individual authors within the visualization community.

Building a theoretical foundation should not be equated with creating a theory. Theoretical research is about creating new fundamental knowledge in each aspect and about making transformations, as shown in Figure 1. Taxonomies are essential for identifying all the concepts (variables) and their states (values) in visualization. Ontologies are essential for identifying the interactions among these concepts (functions and relational variables). Under the contextual framework of taxonomies and ontologies, guidelines and principles postulate causal relationships. By organizing a collection of causal relationships coherently in an ontology that may also define other relationships, we can establish a qualitative model. In return, the development of a model informs us of any need for a new concept in a taxonomy or a new relationship in an ontology, while motivating us to discover new guidelines or to study conflicts among guidelines. The grounded theory method and other research methods from the social sciences can help us achieve such transformations methodologically and systematically.
Using a quantitative theoretic framework, we can transform a qualitative model into a quantitative model, providing opportunities for model validation using experiments and computational simulation. Similarly, guidelines and principles can be quantitatively defined, leading to a more formal approach to defining causal relationships in visualization. When a quantitative model is structured as a theoretic system, we can infer new laws and prove or disprove a postulated law (for example, one formulated based on a guideline) using the existing axioms and laws in the system. A quantitative model, law, or theoretic system is predictive and therefore falsifiable. In turn, developing theoretic systems and investigating their extension and unification will stimulate new taxonomies, ontologies, guidelines, and models, thereby enriching visualization's theoretical foundation.

Building a theoretical foundation for visualization is the collective responsibility of the visualization community. In the literature, hundreds of authors have already contributed to different aspects of the foundation. The visualization community has demonstrated its ability to formulate taxonomies, propose guidelines, and create models. It possesses unparalleled experience of working with a spectrum of visualization users and has accumulated much insight about the cost-benefits of many visualization and visual analytics workflows in different applications. Through collaboration, the community has acquired knowledge for empirical studies, mathematical modeling, and computational simulation, and it is continuing to learn new skills.

The community needs to build its confidence in directing a new generation of research students and postdoctoral researchers to tackle fundamental problems. Perhaps reviewers need to adjust their expectations of novelty to reflect the actual theoretical research activities of other scientific disciplines. For example, arxiv.org lists 6,202 articles in 2016 alone in the category of "High Energy Physics - Theory." The collective effort to build a theoretical foundation in physics is enormous, making any significant breakthrough much less romantic than portrayed by the media.

Making significant theoretical advances will lead to significant advances in practical visualization applications. For example, we all talk about "design" as an action in practice. A design space is commonly defined by a taxonomy or ontology. Most guidelines are proposed for improving designs. Most models suggest that designs or design processes can be optimized. When we have mathematically proven the correctness of a design guideline, this implies that the guideline must be obeyed in practice under the conditions defined by the corresponding quantitative law.


We hope every visualization researcher can find at least one pathway in this article through which to explore unanswered questions, known problems, and identified deficiencies in the theoretical foundation of visualization. No doubt, there are other pathways featuring as-yet unasked questions, unknown problems, and unidentified deficiencies. Like any research, building a theoretical foundation for visualization presents many challenges. It may not be all smooth sailing. We must always respect such challenges "in theory," but we should never be afraid of them "in practice."

References
1. C. Johnson, "Top Scientific Visualization Research Problems," IEEE Computer Graphics & Applications, vol. 24, no. 4, 2004, pp. 13-17.
2. P.G. Ferreira, The Perfect Theory, Little, Brown, 2014.
3. M. Chen et al., "References for Theoretic Researches in Visualization," 2017; sites.google.com/site/drminchen/themes/theory-refs.
4. H.-J. Schulz et al., "A Design Space of Visualization Tasks," IEEE Trans. Visualization and Computer Graphics, vol. 19, no. 12, 2013, pp. 2366-2375.
5. S. Carpendale et al., "Ontologies in Biological Data Visualization," IEEE Computer Graphics & Applications, vol. 34, no. 2, 2014, pp. 8-15.
6. US Dept. of Veterans Affairs, "TRM Glossary"; www.va.gov/trm/TRMGlossaryPage.asp.
7. M. Meyer et al., "The Nested Blocks and Guidelines Model," Information Visualization, vol. 14, no. 3, 2015, pp. 234-249.
8. T. Zuk et al., "Heuristics for Information Visualization Evaluation," Proc. AVI Workshop on Beyond Time and Errors: Novel Evaluation Methods for Information Visualization (BELIV), 2006, pp. 1-6.
9. B. Glaser and A.L. Strauss, The Discovery of Grounded Theory: Strategies for Qualitative Research, Aldine Transactions, 1967.
10. D. Kaiser, Drawing Theories Apart, Univ. of Chicago Press, 2005.
11. D. Huang, M. Tory, and L. Bartram, "A Field Study of On-Calendar Visualizations," Proc. 42nd Graphics Interface Conf. (GI), 2016, p. 3.
12. R.E. Patterson et al., "A Human Cognition Framework for Information Visualization," Computers & Graphics, vol. 42, 2014, pp. 42-58.
13. P. Pirolli and S. Card, "The Sensemaking Process and Leverage Points for Analyst Technology as Identified through Cognitive Task Analysis," Proc. Int'l Conf. Intelligence Analysis, vol. 5, 2005, pp. 2-4.
14. M. Chen et al., Information Theory Tools for Visualization, AK Peters/CRC Press, 2016.
15. G. Kindlmann and C. Scheidegger, "An Algebraic Process for Visualization Design," IEEE Trans. Visualization and Computer Graphics, vol. 20, no. 12, 2014, pp. 2181-2190.
16. D. Reed et al., Computational Science: Ensuring America's Competitiveness, President's Information Technology Advisory Committee (PITAC), June 2005; www.nitrd.gov/pitac/reports/20050609_computational/computational.pdf.
17. T. Etiene et al., "Verifying Volume Rendering Using Discretization Error Analysis," IEEE Trans. Visualization and Computer Graphics, vol. 20, no. 1, 2014, pp. 140-154.
18. M. Chen et al., "From Data Analysis and Visualization to Causality Discovery," Computer, vol. 44, no. 10, 2011, pp. 84-87.

Min Chen is a professor at the University of Oxford. Contact him at [email protected].

Georges Grinstein is a research professor at the University of Massachusetts. Contact him at [email protected].

Chris R. Johnson is a distinguished professor of computer science and directs the Scientific Computing and Imaging (SCI) Institute at the University of Utah. Contact him at [email protected].

Jessie Kennedy is a professor at Edinburgh Napier University. Contact her at [email protected].

Melanie Tory is a senior research scientist at Tableau Software. Contact her at [email protected].

Contact department editor Theresa-Marie Rhyne at [email protected].
