The Pennsylvania State University

The Graduate School

Department of Geography

EXPLICITLY REPRESENTING GEOGRAPHIC CHANGE IN ANIMATIONS

WITH BIVARIATE SYMBOLIZATION

A Thesis in

Geography

by

M. Thomas A. Auer

© 2009 M. Thomas A. Auer

Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science

August 2009

The thesis of M. Thomas A. Auer was reviewed and approved* by the following:

Alan M. MacEachren, Professor of Geography, Thesis Advisor

Cynthia A. Brewer, Professor of Geography

Karl Zimmerer, Professor of Geography, Head of the Department of Geography

*Signatures are on file in the Graduate School

ABSTRACT

Animated maps provide an intuitive method for representing univariate time-series data, but often fail to present additional relevant information saliently, making recognition of certain patterns difficult. Using a second variable in animations to represent the magnitude of change between time states has been suggested as an effective method for enabling users to more easily recognize patterns of change in a geographic time-series. This work seeks to answer the question: Does explicitly representing geographic change in animated maps enable users to answer questions about patterns of change easily? To address this research question, bivariate symbols (with both the value of the data and the magnitude of change between time frames represented) were created and tested. Selective attention theory (SAT) was used in selecting bivariate symbol types (separable and integral). Domain analysis with experts from the Avian Knowledge Network (AKN) was performed to determine appropriate map reading tasks for use in task-based experiments using AKN data. Combined with existing task typologies, material from the domain analysis helped form a new task typology of movement patterns found in aggregated spatiotemporal point data. Formal task-based experiments followed, where participants were placed into one of five experiment groups (each using a different symbol) and asked to perform the same series of statement agreement and certainty ratings while studying map animations. Results show that aside from questions explicitly about change, univariate non-change symbolization may be most appropriate. Future studies should focus on testing different data relationships (independent, interdependent, or unrelated) with symbol variations that may have different attention behaviors as predicted by SAT. The results presented here improve the understanding of whether explicit change symbolization helps elucidate geographic time-series patterns or hinders the overall effectiveness of map animation.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

ACKNOWLEDGEMENTS

1 Introduction

1.1 Exploratory Geovisualization
1.1.1 Map Animations
1.2 Avian Knowledge Network Data
1.2.1 Visualization Demand
1.3 Representing Geographic Change
1.3.1 Understanding Complex Time-Series Patterns
1.3.2 Relevant Monitoring Tasks
1.4 Questions
1.4.1 Main Question
1.4.2 Secondary Question
1.4.3 Tertiary Question
1.5 Selective Attention Theory
1.5.1 Bivariate Symbolization
1.6 Approach & Expectations
1.6.1 Domain Analysis
1.6.1.1 Expectations
1.6.2 Task-based Experiment
1.6.2.1 Expectations
1.7 Implications
1.8 Thesis Overview

2 Literature Review

2.1 Animated Map Use
2.1.1 Origins
2.1.1.1 Dynamic Variables
2.1.1.2 Static Small-Multiples
2.1.2 Critiques of Animation Use
2.1.3 Appropriate Reading Tasks
2.1.4 Typologies & Change Monitoring
2.2 Representing Change
2.2.1 Background
2.2.2 MapTime
2.3 Bivariate Point Symbols
2.4 Selective Attention Theory
2.4.1 Nelson’s Experiments
2.5 Experiment Design
2.5.1 Domain Analysis
2.5.1.1 Focus Groups
2.5.1.2 Target Users
2.5.2 Task-based Experiment
2.5.2.1 Guiding Examples

3 Domain Analysis & Task Typology Formation

3.1 Domain Analysis
3.1.1 Domain Experts
3.1.2 Focus Group Session
3.1.2.1 Protocol
3.1.2.2 Animated Map Stimuli
3.1.2.3 Outcomes
3.2 Focus Group Session Coding & Task Typology Formation
3.2.1 Focus Group Material
3.2.2 Task Coding
3.2.3 Existing Typology Application
3.2.3.1 Wehrend
3.2.3.2 Blok
3.2.3.3 Andrienko et al.
3.2.4 Task Categorization & Structure
3.2.4.1 Task Levels & Components
3.2.4.2 Space, Time, and Attribute Use
3.2.4.3 Visualization Goals
3.2.4.4 Example Tasks & Example Accomplishments
3.2.4.5 Format
3.3 A Task Typology for the Movement Patterns of Aggregated Point Data
3.3.1 Generalization
3.3.2 Task Accomplishment
3.3.3 Sub-tasks
3.3.4 Outline
3.3.5 Description of Task Types
3.4 Summary

4 Task-based Experiment

4.1 Methodology
4.1.1 Symbol Selection
4.1.2 Animated Map Design
4.1.2.1 Data Handling
4.1.2.2 Classification
4.1.2.3 Temporal Legends
4.1.2.4 Interactivity
4.1.2.5 Tweening
4.1.2.6 Smoothing
4.1.2.7 Animation Frame Rate
4.1.3 Task Selection & Response Variables
4.1.3.1 Task Selection
4.1.3.2 Response Variables
4.1.3.3 Experiment Structure
4.1.4 Experiment Design
4.1.4.1 Participants
4.1.4.2 Format
4.1.4.3 Experiment Data Set Generation
4.1.5 Execution
4.1.5.1 Pilot Experiment
4.1.5.2 Main Experiment
4.1.6 Analysis
4.1.7 Predictions
4.1.7.1 Tasks about trend in Value (6 questions)
4.1.7.2 Tasks about trend in Geography (6 questions)
4.1.7.3 Tasks about trend in Change (6 questions)
4.1.7.4 Tasks about Value versus Geography Dominance (6 questions)
4.2 Results
4.2.1 Main Study
4.2.1.1 Participant Culling
4.2.2 Analysis
4.2.2.1 True and False Statements
4.2.3 Effects
4.2.3.1 Symbol Group Effects
4.2.3.2 Symbol Effects by Task
4.2.3.3 Task Effects
4.2.3.4 Univariate/Bivariate Effects
4.2.3.5 Value versus Geography Variation Effect
4.2.3.6 Increasing/Decreasing Trends vs. Remaining About the Same Effects
4.2.4 Limitations
4.2.5 Summary

5 Conclusions

5.1 Summary of Findings
5.1.1 Typology Formation through Domain Analysis
5.1.2 Task-based Experiment
5.2 Impacts
5.2.1 Typology Use
5.2.2 Attribute Value and Geography Dimensions Focus in Animated Maps
5.2.3 Bivariate Animated Mapping
5.2.4 Selective Attention Theory
5.2.5 Domain Specific Knowledge
5.3 Future Research
5.3.1 Tasks
5.3.2 Data Relationships and Redundancy
5.3.3 Animation Speeds
5.4 Conclusion

References

Appendix A Focus Group Session Script

Appendix B Focus Group Session Animated Map Stimuli

LIST OF FIGURES

Figure 3-1: An Example Screenshot of the Coding Table

Figure 3-2: Typology Formation Work Flow

Figure 3-3: The mapped relationship between descriptive statements and tasks

Figure 3-4: A graphical depiction of the flow required for accomplishing a task

Figure 4-1: Legend for Bivariate Separable Symbol using Orientation (change) and Hue (data value)

Figure 4-2: Legend for Bivariate Integral Symbol using Saturation and Value (data value) and Hue (change)

Figure 4-3: An example map using the Bivariate Integral symbol in the digital survey setting

Figure 4-4: An example map using the Bivariate Separable symbol in the digital survey setting

Figure 4-5: Example static AKN map (eBird 2009)

Figure 4-6: Legend for Univariate Non-Change Symbol using Color Value and Saturation (data value)

Figure 4-7: Legend for Univariate Change Symbol Using Hue

Figure 4-8: Legend for Univariate Change Symbol Using Orientation

Figure 4-9: Temporal Smoothing Mask given by Monmonier (1996)

Figure 4-10: A screenshot of the right side of the digital survey interface that participants used to provide rankings (see Figure 4-14 for the full interface screenshot). The middle of the screen was intentionally blank.

Figure 4-11: A screenshot of the screen in the digital survey that thoroughly explains a participant’s legend

Figure 4-12: Screenshot of the trial pre-test examples for the Univariate Non-Change Symbol

Figure 4-13: Screenshot of the trial pre-test examples for the Bivariate Separable Symbol

Figure 4-14: A screenshot of the practice animation example where participants rank their agreement

Figure 4-15: Histogram of agreement ratings for true statements. 0 = Completely Disagree, 10 = Completely Agree.

Figure 4-16: Histogram of agreement ratings for false statements. 0 = Completely Disagree, 10 = Completely Agree.

Figure 4-17: Main Effect of True and False Statements on Agreement Difference showing a higher mean agreement difference for false statements

Figure 4-18: Main Effect of True and False Statements on Certainty Rating showing a lower mean certainty rating for false statements

Figure 4-19: Main Effect of True and False Statements on Response Time showing a slower mean response time for false statements

Figure 4-20: Main Effect of Symbol on Agreement Difference

Figure 4-21: Main Effect of Symbol on Response Time

Figure 4-22: Main Effect of Symbol on Agreement Difference for Change Tasks

Figure 4-23: Main Effect of Task on Agreement Difference

Figure 4-24: Main Effect of Task on Certainty

Figure 4-25: Main Effect of Task on Response Time

Figure 4-26: Interaction effects of Symbol by Task for Mean Agreement Difference

Figure 4-27: Interaction effects of Symbol by Task for Mean Certainty Rating

Figure 4-28: Interaction effects of Symbol by Task for Mean Response Time

Figure 4-29: Significant interaction between symbol and task for response time

Figure 4-30: Main Effect of Univariate/Bivariate Symbol on Mean Response Time

Figure 4-31: Effect of Univariate/Bivariate Symbol across all tasks for Response Time

Figure 4-32: Main Effect of Value vs. Geography Task Variation on Agreement Differences

Figure 4-33: Main Effect of Value vs. Geography Task Variation on Certainty Rating

Figure 4-34: Main Effect of Value vs. Geography Task Variation on Response Time

Figure 4-35: Main Effect of Trend Type on Agreement Difference

Figure 4-36: Main Effect of Trend Type on Agreement Difference

Figure 4-37: Main Effect of Trend Type on Response Time

LIST OF TABLES

Table 3-1: Animated Map Materials Used As Prompts in the Focus Group Session

Table 3-2: Typology Task Coding Count

Table 3-3: Visualization goals as defined by Wehrend (1993)

Table 3-4: An example typology task description structure

Table 3-5: Low Level Tasks

Table 3-6: Intermediate Level Tasks

Table 3-7: High Level Tasks

Table 4-1: Experiment Conditions

Table 4-2: Experiment Factors

Table 4-3: F-values for Interaction of Symbol by Task for the Full Response Set and Culled Response Set

Table 4-4: Symbol Group Culling Differences

Table 4-5: Summary of ANOVA statistics for all effects (* denotes statistical significance at a level of 0.0167)

Table 4-6: Summary of responses for all symbols by task (* denotes statistical significance at a level of 0.0167)

Table 4-7: Summary of ANOVA statistics for symbol effects by task (* denotes statistical significance at a level of 0.0167)

ACKNOWLEDGEMENTS

First and foremost, I would like to thank my advisor, Alan MacEachren, for his tireless guidance, generous use of his time in helping to move this thesis forward, and for offering many points of intellectual stimulation. I am very grateful to have had such a successful relationship. I would also like to thank my committee members. Cindy Brewer has been steadfast and committed to the progress of my education and this thesis. Doug Miller has given countless hours of mentorship and provided many important pieces of wisdom along the way. The Penn State Geography community has been an excellent one for fostering critical thinking and active engagement. Craig McCabe deserves my thanks for sharing in the thesis experience and in light-hearted camaraderie that made the process that much more enjoyable. Finally, I need to thank my close friends and family for their unfailing support throughout the process.

1 Introduction

The motivation for this thesis originates from a desire to develop effective strategies for visualizing spatiotemporal patterns of both short-term bird migration movements and long-term bird distribution change. Map animations are an intuitive method for creating an active sense of movement and change that can simulate and model real-world behaviors. The use of animated maps differs from non-animated forms of space-time visualization that take advantage of interactivity or, simply, static representations. Often the latter are used to track the movement of discretely identified entities (e.g., a bird, a car, a person) through filtering and querying. Here, the focus will be on studying broad-scale patterns of spatially and temporally aggregated point data, with bird sightings used as the example.

With goals of developing effective methods for visualizing bird migration and distributional change, this thesis builds on past research into methods for enhancing the ability of map animation to reveal patterns of movement and change, a challenge that has been frequently addressed in the study of exploratory geovisualization. Harrower (2002, p. 1) emphasizes this by stating that, “…the visual representation of change is a fundamental challenge for .”

The best approaches to helping users understand patterns of movement and change in visualization methods are yet to be understood. Map animation remains a frequently cited tool in attempting to address these kinds of challenges. However, its overall effectiveness in accomplishing specific map reading tasks remains uncertain. A main goal of this thesis is to better understand the utility map animation has for revealing patterns of movement and change in large space-time datasets.

To address this issue, this research first seeks to understand what types of pattern-discovery and change-monitoring tasks users would like visualizations for this type of data to support. Second, it seeks to determine, through formal experimental methods, what forms of representation are most successful in symbolizing change for those tasks. This two-phase approach has been accomplished by completing a focus group with domain experts (who are knowledgeable about the dataset used) to determine appropriate tasks and then using those tasks in an experiment with undergraduate students to test different symbolizations of the data for those tasks. Originating in perceptual research, Selective Attention Theory is used to classify bivariate symbols used in the experiment by their dimensional interaction behavior. This approach can be seen as a top-down cognitive-perceptual method that first seeks to develop an understanding of visualization use and then supports implementation of symbols through a controlled experiment.

The remainder of this chapter will briefly highlight pertinent background material to place this work amongst efforts to develop effective visualization methods for representing change and improving map animation design. First, discussions of exploratory geovisualization, the Avian Knowledge Network experts and data being used, and challenges associated with representing geographic change will be presented. An overview of methods involving domain analysis and task-based experiment will be given, with the goals of this thesis, some implications, and a roadmap to the remainder of the thesis following.

1.1 Exploratory Geovisualization

A number of authors studying exploratory geovisualization and dynamic cartography consider map animation to be an appropriate method for supporting domain experts who study spatiotemporal relationships (MacEachren et al. 1998a; Andrienko et al. 2000; Blok 2000). However, our understanding of map animations, their abilities to facilitate data exploration, and their most appropriate tasks, uses, and applications remains uncertain. Recently, Harrower and Fabrikant (2008, p. 62) opined that, “to realize the full potential of animated maps…we need to better understand for what kinds of representational tasks they are well suited…and how variations in the design of animated maps impact our ability to communicate and learn.”

Another important facet of the geovisualization agenda that this work addresses is the fact that the generation of large spatially and temporally referenced datasets is outpacing the development of methods to analyze, synthesize and communicate them (Chen et al. 2008). By seeking to improve our knowledge about appropriate tasks for animations and subsequent effective representations, this work will improve our ability to use map animations as a viable method for analyzing large spatiotemporally-referenced datasets.

1.1.1 Map Animations

Dorling (1992, p. 215) said, “To animate is to create the illusion of movement.” Certainly, animation can produce a synthetic sense of motion in a way that no other map form can. I see this capability as an important method of matching the mental representations of real-world dynamic phenomena that people may have (Griffin et al. 2006) to their appropriate depiction. Bird migration is a dynamic, seasonal phenomenon that occurs on a short time scale, composed of the movements of many individual species. The apparent motion generated by time series map animation has the potential to prompt enhanced understanding of the movements of those species. While bird distributional changes appear to “move” on the map, these “movements” cannot be directly linked to discrete space-time entities or events (e.g., individual birds or flocks of birds); instead they are aggregate patterns of change in attribute across space. The results of this study will speak to the ability of animated maps to facilitate pattern recognition for aggregate spatiotemporal point data.

1.2 Avian Knowledge Network Data

This thesis is motivated in part by the availability of a large, publicly available dataset of distributed point-based bird observations in North America, collectively known as the Avian Knowledge Network (AKN). The dataset is suitable for use in the study of space-time representations in that it is large, heterogeneous, and publicly available, and it is recent enough that the relevant questions it might help address are far from determined; thus exploratory geovisualization is likely to help in generating questions to pursue. Further, staff who work in collecting, managing and distributing this dataset make apt domain experts to work with in the first phase of this thesis, which is focused on task analysis, because of the expertise with bird distributions and migration that is required to ensure the quality of the data. As a result, sample data subsets from the AKN will be used both for creating focus-group stimuli and for building and testing the bivariate representation.

1.2.1 Visualization Demand

An additional benefit of working with a targeted domain group and a dataset that is fast-growing and extremely large is that the results of such work benefit both GIScientists and domain experts. As a project started less than 10 years ago, the AKN has now amassed over 50 million bird observations at over 450,000 independent locations. Data-mining methods first applied to the dataset sought to employ computational multivariate analysis techniques to discover unanticipated relationships (Caruana et al. 2006). However, that work was largely not visual. While early analysis attempts worked to account for noise in the data by using advanced statistical models, they neglected the important goal of facilitating a private, exploratory visual thinking process (DiBiase 1990). Methods that AKN staff currently use to visualize the spatiotemporal dimensions of their data are rudimentary at best and would benefit from an improvement through applied dynamic cartography.

1.3 Representing Geographic Change

Previous work by Slocum et al. (2004) has suggested that univariate map animations are not necessarily adequate for assisting users in identifying patterns of change in a time-series. However, Slocum et al. suggest that adding a second visual variable to the animation to indicate the magnitude of change between time slices is a way to reveal these patterns. Harrower (2002) notes that future work with animated maps should be directed at incorporating temporal change representation into a variety of animated maps. Within this context of past work, this thesis focuses on testing an explicit representation of temporal change within a non-interactive map animation setting that uses bivariate symbolization of data and change.

1.3.1 Understanding Complex Time-Series Patterns

In the nascent days of geovisualization research, MacEachren and Ganter (1990) emphasized the potential of maps to, “stimulate scientific insight by facilitating the discovery of patterns and relationships in spatial data.” While development of the field has directed some attention at pulling apart individual pieces of pattern recognition, there has been little development in understanding of the ways in which more complex and synoptic space-time patterns are comprehended by map-readers. How are complex patterns of broad spatial and temporal scales processed and understood by map readers? How do we as GIScientists begin to understand the perceptual behavior, cognitive processes, and elemental tasks behind such pattern recognition? And how can we develop tools that support these sorts of pattern recognition? These questions are just a few that indicate the importance of developing ways to understand how map-readers seeking to understand complex processes can use visual models of those processes to better understand them.

1.3.2 Relevant Monitoring Tasks

One way we can begin to understand the ways complex processes are understood is by studying the map reading tasks completed when animated maps are being used to study patterns and change. Considering the map reading goals and tasks that can be met by any animated map is important in understanding use and ensuring effective design. There are many potential tasks and often the misidentification (or lack of identification) of the map reading goals of an experiment has resulted in the failure to confirm the hypotheses of previous research. Derived from study of both the relevant literature, which provides guidance for appropriate animated map reading tasks, and the results of the domain analysis, the animations and map reading tasks used here are intended for the recognition of broader spatiotemporal patterns, such as trends over time. The typology derived from domain analysis provides a more specific overview of relevant tasks that fit within this framework.

1.4 Questions

This section presents the central questions this thesis seeks to answer: a main question and two complementary questions that must be answered in order to address it.

1.4.1 Main Question

The first question of this research is: Does explicitly representing both the magnitude of temporal point data and the magnitude of change in data values between frames of an animated geographic time-series enable users to answer questions about patterns and rates of change quickly, easily and accurately? Will explicitly representing geographic change help elucidate patterns, hinder their recognition in a time-series, or have little to no effect?

1.4.2 Secondary Question

A second question focuses on the choice of bivariate symbolization as informed by selective attention theory: Does signifying the two components in a bivariate symbol pair using visual variables that are expected (by selective attention theory and prior research by others) to be separable versus integral change the answer to the question above and, if so, how? Depending on the map-reading task at hand, these alternative forms of representation may yield different responses, supporting the use of one or the other in animated bivariate maps generally or supporting different symbol choices for different tasks that each enables.

1.4.3 Tertiary Question

The third question of this research asks, simply: what map-reading tasks do experts in the data domain perform while studying map animations of their data? There is little understanding of generalized domain-related tasks as they pertain specifically to the use of map animations and space-time datasets. A new task typology, derived from domain expert focus group material, will be formed in attempting to answer this question.

1.5 Selective Attention Theory

Selective Attention Theory (SAT) can be defined as, “a way of measuring the perceptual grouping of features in a visual image” (Nelson 2000a, p. 262). It is derived from psycho-physical studies seeking to understand the nature of vision (Duncan 1984). SAT was also used to understand how symbols were grouped based on their graphical dimensions. Speeded classification tests performed on groups of symbols derived a spectrum of categories defined by the dimensional interaction of the symbols. A visually separable bivariate symbol allows users to attend to each variable dimension separately, each being independent (MacEachren 1995). Alternatively, the dimensions of a visually integral bivariate symbol cannot be visually separated and are interdependent (Nelson 2000a). In the middle, a visually configural symbol can display behaviors of both separable and integral symbols, with potential emergent properties.

In this research, SAT is used both to help classify and select symbols for testing and to develop predictions about how the symbols are used to recognize patterns and trends of change. Testing bivariate map animations based on SAT will contribute to our understanding of pre-cognitive constraints in relation to acquiring information from animated maps. To my knowledge, this is the first application of selective attention theory in investigating multidimensional symbolization choices for animated maps.

1.5.1 Bivariate Symbolization

Bivariate symbol design, largely driven in this work by selective attention theory, is also constrained by the challenge of appropriately symbolizing change and by the structure of the data itself. Symbolizing change as a divergent (negative, none, positive), ordinal scheme limits the number of possible bivariate symbols for use in each of the two SAT categories used in this experiment (integral and separable; configural was not used as its behaviors are not as well understood). A process of practical design and logical elimination yielded one bivariate pair in each category for final experiment use. The impact of the data characteristics and representing change on symbol design will be thoroughly discussed in the methodology section.
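
As a rough illustration of the divergent change scheme described above, the following minimal Python sketch pairs each frame's data value with a change class (negative, none, positive) computed against the previous frame. It is not the procedure used in this thesis: the function names, the `tolerance` parameter, and the choice to label the first frame "none" are assumptions made only for this example.

```python
from typing import List, Tuple


def classify_change(prev: float, curr: float, tolerance: float = 0.0) -> str:
    """Label the change between two frames as negative, none, or positive.

    `tolerance` is a hypothetical threshold: differences whose magnitude
    does not exceed it are treated as no change.
    """
    delta = curr - prev
    if delta > tolerance:
        return "positive"
    if delta < -tolerance:
        return "negative"
    return "none"


def bivariate_frames(series: List[float], tolerance: float = 0.0) -> List[Tuple[float, str]]:
    """Pair each frame's data value with its change class.

    Assumption: the first frame has no predecessor, so it is labeled "none".
    """
    frames: List[Tuple[float, str]] = [(series[0], "none")] if series else []
    for prev, curr in zip(series, series[1:]):
        frames.append((curr, classify_change(prev, curr, tolerance)))
    return frames


# Hypothetical observation counts for one location over four time frames.
counts = [3, 5, 5, 2]
print(bivariate_frames(counts))
# → [(3, 'none'), (5, 'positive'), (5, 'none'), (2, 'negative')]
```

Each resulting pair could then drive the two dimensions of a bivariate symbol, with one visual variable encoding the data value and the other encoding the change class.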

1.6 Approach & Expectations

Following is a discussion of the approach to answering the above questions and some of the expectations developed in planning to address them.

1.6.1 Domain Analysis

As part of the User-centered Design (UCD) process for software development, Robinson et al. (2005) identify Work Domain Analysis as the first step. Because this research does not seek to complete an iterative, refined final product, the domain analysis phase will be the only segment of the UCD process implemented, as it is the crucial one for understanding the relevant tasks of domain experts to be used with map animations and change representations.

Visualization development projects working with domain experts have used a number of methods for collecting information in the process, including interviews (Brewer 2005; Bhowmick et al. 2008), literature analysis (Blok 2000; Bhowmick et al. 2008), focus groups (Harrower 2000), and multifaceted procedures (Brewer 2005; Robinson et al. 2005). While Brewer (2005) notes that, “there is no single, best method for conducting studies of work and expertise,” certain methods are more appropriate considering the goal of eliciting knowledge from experts.

To understand relevant tasks, a focus group was conducted in which domain experts responded to prompts while studying map animations of bird migration and distributional changes. A focus group was selected as an efficient way both to develop an understanding of the tasks that experts use animated maps to address and to obtain input from the experts on map animation design. The results of the focus group session were coded and formed into a typology to build a lexicon of relevant tasks related to the data set. Subsets of these tasks were employed within the task-based experiment as a basis for testing change representation symbology.

1.6.1.1 Expectations

Domain experts are expected to have a variety of methods for reading animated maps, characterizing migration and distribution, and communicating their understanding of these processes to others in viewing map animations. No single type or method of explaining their knowledge-formation processes is expected. Instead, their explanations of the complex, spatially and temporally broad phenomena will more likely be constructed of smaller, piece-wise map-reading units. Synthesizing and cogently generalizing the domain experts’ explanations will be critical in composing a general understanding of how these experts read animated maps.

1.6.2 Task-based Experiment

Task-based experiments have been used frequently in cartographic research to address important questions, especially those directed at testing the effectiveness of symbols and map forms. The design of such experiments requires practical consideration in designing experiment elements such as stimuli, response data collection methods, participant selection, and analysis methods. Decisions related to these aspects of experiment design will be discussed in Chapter Two and Chapter Four.

1.6.2.1 Expectations

In addressing the first research question, this study contributes to knowledge about our ability to recognize patterns of change when change is directly represented on a map animation, as opposed to when it is something that we have to derive from the animation ourselves. It is unclear whether adding a second variable dimension, via bivariate symbolization, to a map animation in this manner will unburden and offload perceptual processes related to recognizing change or whether it will further complicate the map animations, making their use more difficult.

Addressing the secondary question delves further into the explicit representation itself, considering the use of each type of bivariate pair, as derived from selective attention theory. Separability and integrality represent two extremes of a spectrum of variable dimension interaction. It is likely that these two different dimensional interaction behaviors for variable pairs will serve different map reading functions to accomplish different goals of deriving information from a map animation. By testing both types against different map reading tasks, this work is designed to show which map reading tasks are best addressed by each type.

1.7 Implications

The work presented in this thesis has implications for multiple aspects of geovisualization research, including task typology development, animated mapping, representing change, and the application of selective attention theory. By working with domain experts to establish a set of known tasks for use with animated maps and categorizing those tasks into a cogent typology, this work helps to further our understanding of relevant tasks for visualization development and methods for developing task typologies such as the one presented here. This work indirectly reveals perceptual and cognitive overload limits for animated maps, adding to our understanding of the ability of visual system processing to comprehend animated time-series maps. Proven to be successful in some situations, bivariate change representations can be seen as a way of making thematically relevant information in a map animation more perceptually salient and more readily available for cognitive processing.

Results from this study contribute to our understanding of visual attention and its applied use in cartographic experiments by testing the effectiveness of separable and integral variable pairs to facilitate users answering questions about patterns and trends of change. Using previously classified bivariate symbols, the results will link visual attention research with map animations by showing what map reading tasks are facilitated by users attending to different, separable variable dimensions or holistic, integral variable dimensions in an animated map. This relates visual attention to bivariate symbols, animation, and change in completing domain-relevant map reading tasks.

1.8 Thesis Overview

The next chapter will provide a thorough review of relevant literature as it pertains to change representation, bivariate mapping, selective attention theory, animated mapping, and experiment design. The third chapter of this thesis will describe the two phases of the methodology, the domain analysis and the task-based experiment, separately, with links between the two offered as necessary. Chapter Four will continue in bifurcated fashion, presenting the results of the domain analysis as a typology and the results of the task-based experiment as the analysis of digital survey response data. Likewise, Chapter Five will separately interpret and analyze the results of both sections, providing links between the pieces as relevant. Finally, the sixth chapter will summarize with conclusions, provide an assessment of this thesis, and offer suggestions for future work.

2 Literature Review

This chapter summarizes relevant literature pertaining to the two major goals of this thesis. The first is to develop a representative set of expert-informed, animated map reading tasks relevant to understanding space-time phenomena and to form a new task typology based on those tasks. The second is to test bivariate change representations in an animated map setting for their effectiveness in recognizing patterns and trends. To contextualize these goals requires discussing research directed to animated maps, change representation, bivariate symbolization, selective attention theory, and experiment design. Each of these topics is reviewed below.

2.1 Animated Map Use

The historical context of animated maps and their use has been previously summarized by a number of authors (Campbell and Egbert 1990; Harrower 2004; Harrower and Fabrikant 2008). Campbell and Egbert (1990) discuss thirty years of limited progress, reviewing nascent efforts, exceptional examples, research prospects, and technical dimensions, and identifying the need for development beyond mere technological improvements. In an updated review of animated mapping, Harrower and Fabrikant (2008) focus on animated map characteristics, design, and potential troubles. With the “very real risk that mapping technology is outpacing cartographic theory,” Harrower and Fabrikant shift the focus toward understanding how animations work, identifying their successes and the reasons why they are successful, in a move to ensure that animated maps are used appropriately and more effectively. The authors state the importance of understanding the types of representational tasks for which animated maps are best suited, so that animated maps may become a more powerful tool in the agenda of geovisualization research.

2.1.1 Origins

Early time-series map typologies provide a background for understanding the development of animations as they are used to represent both time-series and change. The first typology, proposed by Monmonier (1990), is composed of three static forms: Dance, detailing the movement through space and time; Chess, representing individual time slices; and Change, directly showing the difference between two time periods. DiBiase et al. (1992) extended the latter with three forms to visualize change: fly-bys, re-expression, and time series. Fly-bys (moving through space) and re-expression (attribute change across a data range, and thus not over time) are not particularly relevant to this study. However, time series, or chronological change, provides a theoretical basis for developing animations that visualize change in an ordered, chronological time-series, as a way to depict the movement of an entity through time and across space. Concurrently, Dorling (1992) provides a similar typology, experimenting with examples of animating time (for geographic objects) and praising animated time-series as being most successful at representing continuous change, a movement phenomenon he claims our eye-brain system is well-adapted to perceive.

2.1.1.1 Dynamic variables

Putting time on a map adds another dimension of variables, termed the “dynamic variables.” The original set of three (duration, rate of change, and order) were first introduced by DiBiase et al. (1992). MacEachren (1995) extended these, adding another three (display date, frequency, and synchronization). The variables were described to detail how they could be manipulated for animation design, graphic authoring and “narrative” construction. Under the circumstances of automated data-driven animated map generation, they require less manipulation. The ordered, chronological nature of time-series data provides inherent selections for display date and order, and the values in a data-driven animation automatically control the rate of change (often variable) for a sequence of frames. Animated maps using a single dataset do not require synchronization. Frequency is determined by the temporal resolution available in the data and can be adjusted by aggregating to coarser temporal units. The duration of each frame represents the real-world duration of each time slice (one frame for each week). As all time slices are the same length, the duration of each frame is identical. The process of selecting the dynamic variables in the experiment phase of this thesis will be discussed in the methodology.
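As a minimal sketch of how several dynamic variables fall out of the data in a data-driven animation, the following Python example aggregates hypothetical daily counts into weekly time slices and gives every frame the same on-screen duration. The function name, the sample counts, and the 0.5-second frame duration are illustrative assumptions, not values used in this thesis.

```python
from datetime import date, timedelta

def weekly_frames(daily_counts, frame_seconds=0.5):
    """Aggregate (date, count) pairs into chronologically ordered weekly frames."""
    totals = {}
    for day, count in daily_counts:
        # Frequency: coarsen daily observations to the Monday that starts their week.
        week_start = day - timedelta(days=day.weekday())
        totals[week_start] = totals.get(week_start, 0) + count
    # Order and display date follow directly from sorting the time slices.
    # Duration is identical for every frame because every slice spans one week.
    return [
        {"display_date": wk, "value": totals[wk], "duration_s": frame_seconds}
        for wk in sorted(totals)
    ]

# Hypothetical data: two observations per day for fourteen days, starting on a Monday.
start = date(2008, 3, 3)
daily = [(start + timedelta(days=i), 2) for i in range(14)]
frames = weekly_frames(daily)
```

In this sketch only frequency (the aggregation unit) and frame duration remain as design choices; display date and order are determined by the data themselves.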

2.1.1.2 Static Small-Multiples

An important alternative to map animation that has also often been used to display time-series data is static small-multiples (SSMs). Bertin (1967/1983) introduces the notion of a homogeneous graphical series that portrays a collection of small maps or diagrams with unique invariants, including, amongst other things, time-series. However, Tufte (1983) was the first to term this graphical design technique a “small-multiple”. He acknowledges them as resembling frames of a movie (thus relating them to animation) and praises them for a number of reasons, including the fact that they are “efficient in interpretation.” Monmonier (1990), in turn, was the first to suggest SSMs as a potent strategy for geographic time-series visualization of maps, especially when used for juxtaposition. While SSMs are cited as a reliable method for allowing users to patiently and freely study and compare individual time slices (Dorling 1992), they become unwieldy when the number of multiples begins to overwhelm the ability to practically view them as a group (Martis 1989). Datasets with enormous temporal resolution and extent are becoming increasingly available and demand analysis (Chen et al. 2008), such that in many uses (e.g., when the goal of analysis is hourly, daily, or weekly counts for a full time series) SSMs are no longer a practical option. SSMs may be useful if the data were aggregated substantially into a manageable number of time slices (e.g., to yearly totals) or if subsets of time were considered one at a time (e.g., aggregating to the months of one year).

There is substantial support for SSMs and map animations facilitating different types of tasks. Eye-tracking studies (Fabrikant et al. 2008; Fabrikant and Garlandini 2009) using data of dynamic phenomena have begun to delimit appropriate tasks for both SSMs and animated maps, finding that SSMs offer the most affordances for tasks that involve comparison and non-sequential study. Griffin et al. (2006) found that users could more quickly and easily identify subtle space-time clusters with animated maps than with static small-multiples. Additionally, Johnson and Nelson (1998) found that participants were more successful at identifying trends and patterns of change with animations than they were with static paper or computer map series.

While SSMs function largely for comparison, animations function best with trends and overall patterns.

2.1.2 Critiques of Animation Use

A number of authors (Cutler 1998; Morrison 2000; Tversky 2002; Hegarty 2003) express skepticism about the use of animation, offering evidence that for the reading tasks designated in each respective study, static representations were more successful than animations. These authors claim animation is not beneficial in facilitating learning and that it is not cognitively or perceptually plausible in accomplishing certain reading tasks. Of these studies, only Cutler used geographically-based maps.

Studies by Lowe (1999; 2003; 2004) focused on complex animations and their ability to facilitate construction of quality mental models, a higher level, complex task based on cognitive processes that involve using domain knowledge, developing causal relationships, and interrogating animations in extracting information. In those studies, he found that perceptually salient information in map animations was not the most thematically relevant, inhibiting task accomplishment. Independently, Harrower and Fabrikant (2008) contend that the improvement of graphic design principles for map animations will make thematically relevant information more perceptually salient, improving the success of map animations.

Despite evidence from non-cartographic research that indicates that map animations may not be particularly successful, there is ample support, including previously mentioned literature, for animation use from cartographic research. Griffin et al. (2006) show that pattern presence detection is better enabled by animation than by SSMs using moving cluster stimuli in maps. MacEachren et al. (1998a) found that experts using animation discovered patterns more readily than experts using static time-stepping. Finally, research by Ogao and Kraak (2002) indicates that animations can indeed facilitate geospatial data visualization processes and should be used for such.

In addition, my biggest critique of animation skeptics, from both cognitive science and cartography alike, is that the reading tasks of many experiments have not been congruent with what animation is best at facilitating. Map animations are certainly not suited to some reading tasks, such as time-slice-specific quantity read-out, legend evaluation, or comparison of two non-adjacent time slices. I believe that map animation, in contrast, is well suited to tasks related to recognizing trends, cycles, and overall patterns of change and movement. Animation has a role to play in geovisualization, but we need to better understand how it can do so most appropriately and effectively. The following section reviews research relevant to selection of appropriate reading tasks for map animations.

2.1.3 Appropriate Reading Tasks

Koussoulakou and Kraak (1992) were the first to develop a framework for understanding map reading tasks for map animations, drawing on Bertin’s (1967/1983) principles of reading levels to present a framework of potential questions directed at spatiotemporal maps. Johnson and Nelson (1998) tested users on quantity evaluation and trend pattern recognition tasks, finding that animations were suitable for the latter but not the former. This trend pattern recognition task was relatively simple (the goal being to merely identify overall decrease, increase or lack of change in a stream level simulation) and does not speak to the potential of animations to aid in recognition of more complex space-time-attribute patterns of change. Additionally, Harrower (2007) found that participants discovered broad-scale patterns of seasonal employment waves in a study pitting classed, animated choropleth maps against unclassed, animated choropleth maps, despite the fact that the intended map reading task did not require the user to search for these kinds of patterns.

In that same study, Harrower (2007) began with a goal of examining, “how map readers perceive geographic patterns of change on a choropleth map and their understanding of the relative stability of those data values.” I believe that these may be two very different and disparate tasks and that this is evident in the final map reading task selection for Harrower’s experiment. Harrower asks participants to watch two pairs of unclassed and classed animations; one pair with a subset of metropolitan regions highlighted and one pair with a subset of states highlighted. He then asks users to indicate which regions changed the most or least. This is then a quantity evaluation map reading task for those selected regions, not a task that asks the user to perceive and explain patterns of change as a whole. Research previously mentioned suggests that map animations do not function well as a quantity evaluation tool, so it appears that these are not entirely appropriate tasks to direct at map animations.

While our understanding of appropriate tasks for map animations is still incomplete, one important role of animation is that it can be seen as a way of assisting the user in recognizing broader patterns of trend and change. Tasks congruent with this role have been described in typologies seeking to categorize the types of analysis performed with dynamic spatiotemporal data. A review of these typologies follows.

2.1.4 Typologies & Change Monitoring

The task typologies discussed in this section were used in the first phase of research to help categorize the reading tasks of domain experts using animated maps. Additionally, these typologies provide a basis for discussing the ways of typifying these tasks and linking them to map animation uses. These reading tasks were developed by working with material generated from a focus group session with domain experts. To form a typology of these tasks required coding and classification. While a more thorough discussion of this typology formation process will be presented in the methodology section, a discussion of the following three existing typologies, which provided guidance in that process, is provided here.

One of the earliest, simple typologies was defined by Koussoulakou and Kraak (1992), who identified reading levels for both space and time. Overall levels (as compared to intermediate or elementary levels) for both space and time yielded an example question that involved the trend of population density across an entire period of time. The presence of trend in the overall category directs my attention towards overall tasks as being appropriate animated map reading tasks.

Another early typology with a more general audience was developed by Wehrend (1993), who formulated a method to identify visualization goals, subsequently matching them to examples of appropriate techniques. The method involved mapping one of nine actions to one of seven types of data. While his typology covered many non-spatial data forms, the types of direction, shape, position, and spatially extended region or object (SERO) have relevance to geographically oriented tasks. The types of actions outlined by Wehrend ranged from simple tasks, such as identify, locate, or distinguish, to more complex and complicated tasks, such as compare, associate, and correlate. While this method is potentially useful, only examples of visualization techniques described and detailed in that book up to that point in time are linked to the tasks, and no generalizations of techniques for each task are given.

Blok (2000) offers a typology for characterizing the types of change monitoring tasks that can be performed with time series geospatial data. Blok breaks down potential tasks into those that acknowledge (1) change in the spatial domain, (2) change in the temporal domain, and (3) comparisons. Changes in the temporal domain include two scales: short and long series. The long series includes cycle and trend, the two most relevant tasks as they relate to map animations. While Blok (2005) later links dynamic visual variables to methods of interactivity used with animation, at a broader level most change monitoring tasks are not linked to appropriate representation forms and readers are left to make their own matches.

The most widely used typology for spatiotemporal geovisualization is that of Andrienko et al. (2003), who draw on Bertin (1967/1983) and Peuquet (2002), separating task types into three dimensions (search level, search target, and cognitive operation). The cognitive operation divides tasks into those that identify and those that compare. The search target has a divide between those tasks that define a “when” and seek to find the “what” and “where” and those tasks that define the “what” and the “where” and seek to find the “when.” Finally, the search level decomposes into combinations of general (overall) and elementary (individual) levels within the “when” and the “what” and “where.” Knowing that compare tasks are not appropriate for map animations and that general “when” and general “what” and “where” levels are most likely to be successful leaves only a small piece of this typology relevant to map animations. Nonetheless, the Andrienko et al. typology is important to use in light of the fact that such a generalized typology does not exist otherwise for spatiotemporal data.
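To make concrete how small that relevant piece is, the sketch below enumerates the task space implied by the three dimensions as they are summarized above and filters it to the combinations argued to suit map animation. The labels and the flat encoding are a simplification assumed for illustration; they are not the notation of Andrienko et al. (2003).

```python
from itertools import product

# Simplified, assumed encoding of the three dimensions described in the text.
OPERATIONS = ("identify", "compare")
TARGETS = ("when -> what+where", "what+where -> when")
LEVELS = ("general", "elementary")

# Search level is applied separately to "when" and to "what"/"where".
tasks = [
    {"operation": op, "target": tg, "when_level": wl, "what_where_level": sl}
    for op, tg, wl, sl in product(OPERATIONS, TARGETS, LEVELS, LEVELS)
]

# Keep only identify tasks at the general level on both components.
animation_tasks = [
    t for t in tasks
    if t["operation"] == "identify"
    and t["when_level"] == "general"
    and t["what_where_level"] == "general"
]

print(len(animation_tasks), "of", len(tasks), "task types remain")
```

Under this encoding, two of the sixteen combinations remain, one for each search target.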

2.2 Representing Change

Methods for cartographically representing change have been underdeveloped in the literature. Harrower (2002) was amongst the first to summarize and synthesize attempts to conceptualize change, providing both a categorization of types of change possible in time geography and constructing a “Change Task Cube” to evaluate the difficulty of change tasks that a user may perform while studying a map animation. While this thesis does not attempt to advance the conceptualization of change as it applies to time geography, exploring previous attempts at developing cartographic methods for representing change is important, as novel methods are developed here.

2.2.1 Background

Harrower (2002) built on existing typologies of space, time, and space-time to develop a conceptual framework for representing change within map animation. He categorizes types of change as the following: (1) location, (2) geometry – size/shape, (3) attribute, and (4) state/existence. These categories provide a good foundation for understanding the types of change possibly occurring in a given dataset and matching ways of designing map animations towards tasks directed at those types of change. Noting that Blok (1999) suggests animated maps as allowing for both identification and comparison of changes, he posits the question of whether animated maps are suitable for both, and how those can be best represented.

His solution was a novel one, termed “visual benchmarks,” which were intended to help answer the question: “How does the current state of the phenomenon compare to what is about to happen or what has just happened?” A potential answer to this question was addressed by adding a second map symbol to indicate any number of important values in the data set, including extremes, starting or ending points, critical values, or moments just before or after the current one. Harrower tested a number of variations of benchmarks in an interactive animation setting, finding that when used with proportional symbol maps benchmarks increased accuracy.

2.2.2 MapTime

Further support for adding additional symbolization to time series animated maps to improve use is provided by Slocum et al. (2000, 2004) in their user-based evaluation of MapTime, a software system for exploring spatiotemporal point data with different symbolizations. An important finding of that study was that overall spatiotemporal patterns of population change in the United States were missed in a univariate animation that used graduated circles to represent the population sizes of cities. However, those overall patterns were immediately apparent to researchers and subjects when presented in a single, static change map. The static change map, in turn, could not give users the realistic sense of movement, change, or rate of change that an animated map did. Offering a possible method for combining static change maps and map animations, Slocum et al. (2000, p. 26) suggest that, “shading the interior of the circle to represent the degree of change,” could be successful in helping users recognize such patterns. Slocum et al. qualitatively evaluated this possible solution, finding some success. However, they expected that a map animation with the degree of change represented by shading the circle interior would work best as a supplement to a univariate animation or static small multiple series, not as a substitute, because of complexity resulting from simultaneous changes in both circle color and size.

Considering the successes of both Harrower (2002) and Slocum et al. (2000, 2004), I believe that a bivariate symbol showing both the magnitude of point data values and the magnitude of change (between frames) in a map animation will be more effective in revealing spatiotemporal trends and patterns of change than using univariate symbols in a map animation. To do this requires exploring methods for bivariate mapping.
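The idea of pairing a data value with its frame-to-frame change can be sketched in a few lines of Python. The example below follows the suggestion quoted from Slocum et al. by mapping the value to circle size and the magnitude of change to interior shade; the function name, the square-root area scaling, the normalization of shade to the range 0 to 1, and the sample values are all assumptions for illustration and do not describe the symbols tested in this thesis.

```python
import math

def bivariate_frames(values, max_radius=20.0):
    """For one location, pair each frame's value with its change from the previous frame."""
    peak = max(values) or 1  # avoid division by zero for an all-zero series
    # The first frame has no predecessor, so its change is taken to be zero.
    changes = [0] + [b - a for a, b in zip(values, values[1:])]
    max_change = max(abs(c) for c in changes) or 1
    frames = []
    for v, c in zip(values, changes):
        frames.append({
            "radius": max_radius * math.sqrt(v / peak),  # circle area scales with value
            "shade": abs(c) / max_change,  # 0 = no change, 1 = largest change in series
            "change": c,  # signed change, kept for reference
        })
    return frames

# Hypothetical weekly counts at a single location.
frames = bivariate_frames([4, 16, 16, 9])
```

Each frame then carries two visual dimensions for the same point, so a reader does not have to derive the change by comparing successive circle sizes from memory.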

2.3 Bivariate Point Symbols

Literature directed at static bivariate mapping has been summarized by a number of authors (MacEachren 1995; Brewer and Campbell 1998; Nelson 2000). Nelson’s (2000a) work builds on prior work and is particularly relevant here due to her focus on empirical experiments investigating bivariate symbol properties. Nelson breaks previous research on bivariate mapping into three categories: symbol design, redundant coding studies, and experimental studies seeking to test effectiveness. While the impacts of perception on bivariate symbols will be discussed in detail below, studies involving bivariate symbols and animated maps are practically non-existent, beyond the two (Harrower 2000; Slocum et al. 2000, 2004) mentioned in the previous section. Cartographic investigation into bivariate symbols at this point has been limited largely to static maps.

The core focus of this thesis is on bivariate symbolizations for representing both data values and change in map animations. To address this focus systematically, an approach to categorizing, selecting, and testing symbol types is needed. This is accomplished by studying the perceptual nature of bivariate symbols as done by previous authors. MacEachren (1995) draws on Bertin’s (1967/1983) “associativity,” the ability of a map reader to ignore individual dimensions of a variable pair, to identify selective attention theory as a possible route for studying bivariate symbol attention behavior in maps. Nelson (2000a, 2000b) provides the main foundation of research and testing of selective attention theory with bivariate symbols for this thesis.

2.4 Selective Attention Theory

The application of selective attention theory (SAT) to bivariate symbols and their use in a map setting has been discussed by MacEachren (1995) and MacEachren et al. (1998b) and has been thoroughly investigated for point symbols that depict quantitative data by Nelson (1999, 2000a, 2000b). Nelson (2000a, p. 262) defines selective attention theory as “a way of measuring the perceptual grouping of features in a visual image,” and notes that SAT “contends that our ability to analyze a symbol’s graphic variables (i.e., color, size) is affected by other graphic variables in the same symbol.” This latter contention has obvious ramifications for bivariate symbols, which inherently have two graphic variable dimensions, such that the application of SAT to bivariate symbol design is an intuitive method for guiding design and studying their effectiveness. Indeed, Nelson completes a series of studies applying SAT to bivariate symbol design (1999, 2000a, 2000b).

SAT has its origins in psychological research in the 1950s and 1960s (Nelson 1999). An experimental task known as the speeded-classification test allowed researchers to evaluate interactions between dimensions in a given stimulus. What results is a classification along a spectrum of dimensional interaction. On one end is separable, where the two dimensions of the symbol can be “pulled apart” and studied separately. Separable symbols can also be thought of as those where the two dimensions can be attended to separately from all other dimensions or where the dimensions behave independently. On the other end of the spectrum of dimensional interaction is integral. The two dimensions of integral symbols cannot be attended to separately from each other and the viewer is forced to study them as an interdependent pair. In the middle lies configural, where the two dimensions may be attended to individually or can be combined to create emergent properties. However, these emergent properties are not well understood and the consequences of having a configural symbol are subsequently not well understood either.

As a result of having interdependent or independent dimensions, the nature and relationship of the two datasets to be mapped (whether they are correlated, interrelated, or not) requires that appropriate variable pairs be selected for the particular data set to be used. An interdependent variable pair may be most successful with data that is correlated, while an independent variable pair may achieve the most success with data that is not inherently related. This has not been tested.

2.4.1 Nelson’s Experiments

Nelson’s (2000a) first experiments used the speeded-classification test to categorize commonly used cartographic symbols by their dimensional interactions, in an abstract non-map setting. Nelson’s (2000b) second experiment was in a map setting, seeking to understand whether symbol classifications made in an abstract setting held once they were done in a map setting. The results not only provided confirmation of dimensional interaction for some bivariate symbol pairs in a map setting, but also identified processing difficulty (based on reaction times) for some of the pairs. A list of experimentally-identified separable pairs included: graduated symbol hue-size, point symbol hue-shape, graduated symbol size-value, point symbol shape-size, size-value, dot numerousness-hue. Rectangle height-width and choropleth value-saturation were named as integral pairs as a result of her experiments. Nelson also ranked these on processing difficulty (based on reaction times), with rectangle height-width ranking as least difficult amongst the integral variable pairs. Graduated symbol hue-size and point symbol hue-shape were fastest for separable variable pairs. Considering the added processing stress associated with animation, choosing variable pairs that can be processed quickly is important.

Nelson (2000a, 2000b) tested only the combinations that were used most frequently in the cartographic literature. Other possible variable pairs may therefore exist, including those that are easier to process and that may function better in animation. However, it will not be feasible in this thesis to develop and test other possibilities. Regardless, Nelson’s research is critical in guiding symbol design choices for both experiment and map use alike, as it identifies known attention behaviors and difficulty for a select set of symbols.

2.5 Experiment Design

Two methodological phases of this thesis were completed: a domain analysis and a formal task-based experiment. The first phase, domain analysis with experts, identifies map reading tasks for animations with relevant data. The second phase, task-based experiment, seeks to evaluate the effectiveness of explicitly representing change with bivariate symbols in an animated map. Guidance for research design can be found in the literature on task analysis with domain experts and in the literature on cartographic experiments, respectively. This section provides an overview of domain analysis, analysis methods, and target users, discussing specific examples with the greatest influence on methods used in this thesis.

2.5.1 Domain Analysis

Numerous studies in geovisualization have used domain expert knowledge in the process of designing tools to facilitate exploration and discovery with geographic data (Buttenfield 1999; Harrower et al. 2000; Edsall 2003; Acevedo et al. 2008). Responding to a call from MacEachren and Kraak (2001) to, “…develop a comprehensive user-centered design approach to geovisualization usability,” Robinson et al. (2005) combined a range of usability techniques to design geovisualization tools, developing a process that places Work Domain Analysis at the forefront of the process, but also placing user participation at every stage. Further, previous studies (Harrower et al. 2000; Harrower 2002) have used target users in the first stage of experimental design, ensuring that stimulus materials are functional for target audiences, such that they have ecological validity. Accordingly, domain analysis will be used to address the central objective of developing a new task typology, used to identify specific experimental tasks, and to obtain input about experiment materials.

While domain expert knowledge plays a strong role in developing an understanding of reading tasks directed at geospatial information, methods for accessing domain expert knowledge vary. Blok’s (2000) typology of change monitoring tasks was based on the questions of domain specialists monitoring geographic phenomena in space and time, extracted from domain literature. However, beyond simple observation of the domain expert exploration process, active involvement and interaction with experts is often helpful in improving the usability of representations and interactive tools (Bhowmick et al. 2008) and understanding the process of inference, hypothesis formation and knowledge synthesis engaged in by users (Robinson 2008).

2.5.1.1 Focus Groups

Monmonier and Johnson (1991) note that despite frequent use in advertising and marketing research at that time, the cartographic literature had ignored the use of focus groups for evaluating map design. Monmonier and Gluck (1994) became the first to pioneer the application of focus groups to the process of developing dynamic cartographic products, performing two focus groups with 26 information, cartographic, and computer specialists in an effort to improve

interface effectiveness. Contending that focus groups are a “low-cost, efficient qualitative method for investigation and design improvement,” Monmonier and Gluck (1994, p. 37) had participants view and discuss dynamic map stimuli, including graphic scripts for exploring spatial distribution correlation, time-series, and user-control enhancements. They found that participants were able to successfully identify true patterns in the data and give critiques that were useful in the refinement of the visualization product.

Focus groups are particularly flexible methods for engaging experts, being used both to contribute to usability studies targeted at product improvement and to provide an initial framework for experiments. Not only can experts reveal valuable information on their goals with cartographic stimuli (as would be used in an experiment), but they can offer free-form criticism and feedback that can be used to help improve the stimuli (as is common with the usability process). Focus group sessions can be implemented both to determine reading tasks, by having experts study stimuli and describe their map reading process, and to garner critiques of stimuli for improvement in a formal testing situation. The entirety of this process will be thoroughly described in the methodology section.

2.5.1.2 Target Users

In seeking to determine appropriate tasks for the use of any visualization method, the expertise of a target user group is an important consideration. This is especially true for change representations that may be used by both experts, seeking to explore patterns in their data, and novices, who may encounter the representation in learning situations focused on patterns of change. The use of either group during domain analysis or during task-based experiment requires considering the match between expertise and both task determination and experiment performance.

As discussed in section 2.5.1, domain expertise plays an important role in understanding

domain-relevant map reading tasks. Novices are not suitable sources for determining tasks. Lowe

(1999) remarks that novices lacking domain knowledge associated with the content of a map animation may not be able to form appropriate mental representations and may attend to perceptually compelling aspects, as opposed to those that are thematically important. Cutler

(1998) supports this in finding that, in an experiment pitting animation against static paper maps, test subjects’ reading levels and prior knowledge were better indicators of map content comprehension than was the type of map that they were given in the experiment. With this in mind, domain experts are enlisted in this research both to help elucidate appropriate map reading tasks for animated maps with domain-relevant data and to help improve the quality of those maps for use in the experiment.

Experts are an obvious choice for task determination. However, because SAT focuses on low-level perceptual processing, domain expertise is not relevant at the experimental level and novices can act as participants. They can be used to evaluate the success of change symbolizations for reading tasks identified with experts, as they do not suffer from the bias inherent with domain experts who helped to generate the material for reading task determination.

Using tasks derived from the focus group session with experts in the subsequent task-based experiment gives this study more ecological validity, because change symbolizations tested will be better understood and more useful for those tasks. As a result, experts benefit by gaining animated mapping techniques that are tested based on their tasks, potentially making them more useful for the domain knowledge formation process and for use by more novice users of their data.

2.5.2 Task-based Experiment

Empirical, controlled experiments focused on cartographic stimuli, following methods borrowed from psychology, have a long history, dating back to pioneering research such as

Arthur Robinson’s 1952 The Look of Maps (Montello 2002). The advent of computers brought a

new age of experiments that involves digital stimuli and dynamic maps, including digital or

online surveying methods to display stimuli and collect response data (e.g. Harrower 2002;

Griffin et al. 2006; Midtbø et al. 2007; Midtbø and Nordvik 2007). Harrower (2002) broadly

groups previous task-based experiments on aspects of map reading, including: memory recall,

rate estimation, rate misinterpretation, speed performance, recognition of general patterns, and

tracking eye movement. And while a number of specific experiment examples provide ample

guidance in designing the formal task-based experiment used in the thesis, the literature is

particularly lacking in summarizations of methodological techniques for successful experimental

design procedures targeted at cartographic and geovisualization practices.

While the methodological foundations for this thesis are placed within the last 30 years of

cartographic experiments, new trends in experimental methodology are afoot. The most

notable of these is the re-birth of eye-tracking studies (Fabrikant et al. 2008) and the application of computationally-based saliency models of visual attention to cartography (Fabrikant and

Goldsberry 2005). Fabrikant et al. contend that comparative studies seeking to prove that one representation form is better than another are ineffective and that as a research community we

“should instead be interested in finding out how highly interactive visual analytic displays work, identifying when they are successful and why” (Fabrikant et al. 2008, p. 202).

2.5.2.1 Guiding Examples

Past research by Harrower (2002) provided a key basis for design of the main experiment in this thesis. His work was an important model because it was an example of a successful attempt to build and design from scratch an entirely digital survey, one that guided the user through the survey process and automatically generated maps as stimuli. However, a number of other previous studies provided guidance on small aspects of experimental design decisions, including format, stimuli design, question and task structure, statistical methods, order effects and randomization, pilot study testing, interface design, and proctoring protocol. Beyond literature already cited in this chapter, a large number of other research methodologies were drawn upon

(Slocum 1983; Brewer et al. 1997; Edsall et al. 1997; Brewer and Pickle 2002; Andrienko et al.

2002; Blok 2005). The specific contributions of each will be detailed in the methodology section, which follows.

3 Domain Analysis & Task Typology Formation

This chapter presents the methods used to perform domain analysis, the process of forming a new typology of tasks derived from domain analysis, and the task typology itself. First, a focus group session was designed and implemented with domain experts to learn about how these domain experts perform analytical tasks with spatiotemporal and change-related data, as well as how they use animated maps to accomplish those tasks. Second, the focus group session material was used in conjunction with existing task typologies to identify tasks, apply existing

typologies to those identified tasks, and categorize tasks through a coding process that yielded a

new task typology for the movement patterns found in aggregated spatiotemporal point data.

Finally, the task typology itself, along with a discussion of its use, will follow. This material will

link the outcomes of the domain analysis with the task typology by describing the typology

formation process, as guided by existing task typologies directed at spatiotemporal patterns and

the focus group session material.

3.1 Domain Analysis

The goal of the domain analysis was to identify and extract typical map reading tasks of

domain experts studying Avian Knowledge Network (AKN) data, so that those tasks could be

used in a task-based experiment with the AKN data. The task-based experiment sought to test the effectiveness of different change symbolizations in animated maps. The Avian Knowledge

Network is housed at the Cornell Lab of Ornithology (CLO), with a staff whose purpose is to help collect, manage, study, and serve the data. To elicit map reading tasks from the staff, a single focus group session with prompts and animated map stimuli was developed and moderated. The

results of that session generated a body of material that was transcribed, coded, and categorized to identify tasks to be used as input toward constructing a typology of movement patterns typical of aggregated spatiotemporal point data. This typology building process was aided by applying existing typologies to help form a new typology from the extracted domain-specific tasks. A description of this process follows.

3.1.1 Domain Experts

A particular interest in domain experts that work on bird migration and bird distribution is motivated by my own personal interest in the topic. Prior to transitioning into geographic research, my interests were based heavily in ornithology and birding, having focused an undergraduate thesis on bird migration and working as an avian field technician after my undergraduate education. As a result, I have a good knowledge of the domain topic and have had previous experience with experts studying bird distribution and migration. This background led me to approach members of the CLO managing the AKN to recruit experts from within their group for the work. The CLO is renowned for its expertise in all fields of avian research and staff members managing the AKN are considered experts in their field in terms of bird species distribution, migration, and occurrence. As a result, these staff members were ideal choices. A request for collaboration was sent to the director of the program and six of the AKN staff agreed to hold a single two-hour focus group session.

3.1.2 Focus Group Session

A focus group session was the best means of domain analysis for a number of reasons.

First, focus group participants were afforded an opportunity to build consensus on analysis goals

for their data. While individual and independent participation may have produced a more diverse set of map reading tasks, group response revealed a cohesive, uniform set of tasks. Domain experts frequently perform a range of relevant map reading tasks in communicating bird migrations and distributions, fostering community amongst the users of their data, and producing materials for those users. These skills are key aspects to their recruitment of novice citizen-science users who will further contribute data. Thus, consensus built amongst multiple experts who operate in this way means that tasks derived from the consensus are well suited to both the domain experts and novice-level users of the data who are recruited by the domain experts and, in turn, learn from materials produced by these experts. Second, from a practical perspective, AKN staff members are busy with extensive travel schedules, such that building consensus asynchronously would have been nearly impossible. As a result, a single, thorough session completed with multiple members at once was the most practical option.

3.1.2.1 Protocol

A single two-hour focus group session was planned for and carried out in early

September 2008. I acted as the moderator. The session was held in a conference room at the

Cornell Lab of Ornithology offices in Ithaca, New York. Five members attended. A script had been developed that involved an introduction and four sections. Two modes of operation were written into the script. The first accounted for a full two-hour session as planned. The second accounted for the possibility of an unexpectedly shorter session (~1 hour); however, this was not necessary, as the session ran the full two hours. A full copy of the script has been appended as

Appendix A. The focus group session was audio recorded, with no names linked to participants, and was transcribed for later use in typology formation.

The first section of the session was intended to identify visualization goals for AKN data

and to develop an understanding of current methods and plans for visualizing AKN data. No

stimuli were presented in the first section. The second section was intended to prompt discussion

on data formatting issues, including statistical standardization and aggregation methods, as a way

of informing proper design of animated maps for the task-based experiments. During

this section both static and animated examples, previously published by AKN staff members, were

presented to stimulate discussion. The third section offered prompts to encourage discussion

about monitoring change and about methods for temporally smoothing the data, in an effort to

again inform cartographic design, but also to get participants thinking about reading tasks for the

data. No stimuli were presented in the third section. The fourth and final section presented map

animations that I designed; participants were asked to study these and describe their reading

process out loud, discussing it with other participants. This section was intended specifically to

elicit reading tasks directed at different animated examples (detailed below) of their data.

Participants were instructed not to discuss interactivity or general design, but instead to focus on

patterns in the data, change, and overall impressions about the phenomenon. Prompts were given

that focused on aspects of the data, including patterns in density, change, spatial extent,

clustering, paths, and overall pattern. The focus group session was ended by addressing any

unanswered questions from the participants.

3.1.2.2 Animated Map Stimuli

The animated maps used in the focus group session included ones produced by the AKN

staff and ones I produced. Animations produced specifically for the focus group were varied to

include examples of the following: short-term (months within a year) and long-term (years within a decade) temporal extents, temporal smoothing methods using moving windows, symbols the

staff had been using, and new symbols that included proportional symbols and bivariate representations of change. Animated map materials used in domain analysis can be found in

Appendix B (on the CD or as part of the eTD). Table 3-1 outlines the order of the materials used.

Table 3-1: Animated Map Materials Used As Prompts in the Focus Group Session

Order & Species | Author | Symbolization | Temporal Extent | Temporal Resolution | Smoothing
1. Purple Martin | AKN | Point Symbol Squares colored from red to white (white represents a high number of sightings) | 1 year | 1 day | None
2. Ross’s Goose | Auer | 100 km Grid Cells (increasing value in green for increasing number of sightings) | 1 decade | 1 year | 3 month moving window
3. Western Grebe | Auer | 100 km Grid Cells (increasing value in green for increasing number of sightings) | 1 year | 1 month | None
4. Western Grebe | Auer | 100 km Grid Cells (increasing value in green for increasing number of sightings) | 1 year | 1 month | 3 month moving window
5. Dickcissel | Auer | Unclassed Gridded Proportional Circles | 1 decade | 1 year | None
6. Arctic Tern | Auer | Unclassed Gridded Proportional Circles | 1 year | 1 month | None
7. Dickcissel | Auer | Unclassed Gridded Proportional Circles – bivariate change symbolization (Red – Negative, Blue – Positive, Gray – None) | 1 decade | 1 year | None
8. Ross’s Goose | Auer | Unclassed Gridded Proportional Circles – bivariate change symbolization (Red – Negative, Blue – Positive, Gray – None) | 1 year | 1 month | None
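
The bivariate change symbolization listed for animations 7 and 8 pairs a circle sized by the number of sightings with a color indicating the direction of change. A minimal Python sketch of that pairing follows. It assumes that change is measured as the difference between consecutive time frames, that a `tolerance` of zero defines "no change," and that the first frame is shown as gray; the function names and the sample counts are hypothetical and are not taken from the stimuli themselves.

```python
def change_color(previous, current, tolerance=0.0):
    """Classify the change between two consecutive frames by direction."""
    delta = current - previous
    if delta > tolerance:
        return "blue"   # positive change
    if delta < -tolerance:
        return "red"    # negative change
    return "gray"       # no change


def bivariate_symbols(series):
    """Pair each frame's value (circle size) with a change color.

    The first frame has no predecessor, so it is treated as 'no change'
    here; that choice is an assumption of this sketch.
    """
    symbols = [(series[0], "gray")] if series else []
    for previous, current in zip(series, series[1:]):
        symbols.append((current, change_color(previous, current)))
    return symbols


# Hypothetical sighting counts for one grid cell over five frames.
symbols = bivariate_symbols([2, 5, 5, 1, 4])
print(symbols)
```

Each tuple holds the value that would drive symbol size and the color that would encode change, which is the pairing participants were asked to read in the fourth section of the session.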

3.1.2.3 Outcomes

The focus group session ran for two hours, as planned, and was successful in that

discussion generated a large body of material from which to extract specific examples of map

reading tasks directed at the animated map stimuli. Overall, discussion was productive. One late-

arriving participant and a few side-tracked discussions produced redundant or irrelevant material

(the latter occurred despite my best efforts to steer the discussion). Discussion on map reading

tasks during the fourth section was successful in that participants freely offered their reading and

thought processes, describing the patterns in the data. Some participants were resistant to new

forms of symbolization and had difficulty describing patterns using those forms, so it was helpful

to have also shown animated maps that they had created, as these stimulated quite a bit of

discussion about patterns in the data. Nonetheless, critiques of new symbolizations and animation

methods helped to guide future animation design. These will be discussed below.

Participants commented on the following aspects of animation design during the focus group session: data units, temporal resolution, moving temporal windows, symbolizations, and outlier appearance. In the animations created for the focus group session, the data units used were the number of sightings of a species per unit time for a given grid cell. Discussion amongst participants strongly suggested that they preferred some form of standardization, such as the number of sightings of a species, adjusted by the number of times a species was not seen for a given grid cell at a particular time. For the task-based experiments, frequency (number of times the given species was sighted out of the number of times sightings of any kind were recorded for a given location) was used, as the AKN data allowed for easy calculation of that unit and it was

the unit that AKN staff had been accustomed to in their own visualizations.
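
As a rough illustration of the frequency unit described above, the following Python sketch computes, for each grid cell and week, the share of recorded observation events in which the species was sighted. The record layout (`cell`, `week`, `species_seen`), the function name, and the sample records are assumptions made for illustration; they do not reflect the actual AKN data schema.

```python
from collections import defaultdict


def sighting_frequency(records):
    """Return {(cell, week): frequency}.

    Frequency is the number of records reporting the species divided by
    the number of records of any kind for that grid cell and week.
    """
    seen = defaultdict(int)   # records in which the species was sighted
    total = defaultdict(int)  # all records, regardless of species
    for cell, week, species_seen in records:
        key = (cell, week)
        total[key] += 1
        if species_seen:
            seen[key] += 1
    return {key: seen[key] / total[key] for key in total}


# Hypothetical records: (grid cell id, week number, was the species sighted?)
records = [
    ("A1", 14, True), ("A1", 14, False), ("A1", 14, False), ("A1", 14, True),
    ("B2", 14, False), ("B2", 14, False),
]
freq = sighting_frequency(records)
print(freq)
```

Because the denominator counts every recorded observation, heavily visited cells are not inflated relative to rarely visited ones, which is the kind of standardization the participants asked for.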

Focus group animations that I created used either months or years as the temporal

resolution. However, participants agreed that a resolution of days or weeks was preferable, as the

patterns in the data occurred at a temporal scale much closer to days. For the task-based experiments, a temporal resolution of one week was used, as it was the most feasible considering the data infrastructure limitations I was working with.

Participants disliked animations that used moving temporal windows as a method of temporal smoothing. The experts considered those animations as distorting the reality of occurrence and timing, by displaying sightings at times that they did not in fact occur (in the case of moving windows, two to three weeks before or after the actual occurrence). As discussed in

Chapter Four, alternative temporal smoothing methods were applied that did not distort the reality of occurrence and timing.
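
The experts’ objection can be seen in a small numerical sketch. Assuming a centered three-period moving average (the exact windowing used in the focus group animations is not detailed here, so this is an illustrative stand-in), a species sighted in only one period appears to be present in the neighboring periods as well. The counts below are invented.

```python
def moving_average(values, window=3):
    """Centered moving average; the window is truncated at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical sightings for one grid cell: the species occurs only in period 3.
counts = [0, 0, 0, 9, 0, 0, 0]
smoothed = moving_average(counts)
print(smoothed)  # → [0.0, 0.0, 3.0, 3.0, 3.0, 0.0, 0.0]
```

The smoothed series shows nonzero values in periods 2 and 4, where no sightings were recorded, which is the distortion of occurrence and timing that participants described.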

New symbolizations presented to the focus group participants that used graduated circles to represent the number of sightings were met with general confusion and disapproval. One participant misinterpreted large graduated circles (representing a large number of sightings) as actually representing larger regions. Another participant felt that graduated circles gave too much weight to outliers that were oversampled. In this case, he was referring to vagrant birds that birders chased and over-reported in relation to their true rate of occurrence. Participants agreed that they would prefer symbolizations that reduced the visual impact of such outliers, as that would allow a more faithful representation of a species’ overall pattern. Finally, one participant felt that the size of the graduated symbol overwhelmed the color used to represent change in a bivariate pair. While the symbol selection process discussed in Chapter Four eliminated size as a visual variable for use in the experiment, the participants’ comments helped to first identify potential problems with it.

The remaining discussion was quite substantial and provided a large body of rich material for task extraction, and subsequently, task typology formation.

3.2 Focus Group Session Coding & Task Typology Formation

As a result of successfully gathering a large amount of focus group session material to work with, the next step was to form a task typology from that material. This was done by first identifying and extracting tasks from the material generated by the focus group session. By merging the tasks from existing typologies with the tasks identified in the focus group session, it was possible to create a new typology that uniquely focuses on the movement patterns found in aggregated spatiotemporal point data. This new typology is intended to support the second phase of research (task-based experiment), and potentially to support geovisualization tool design by providing input to choices on methods and functionality to implement. The steps required to complete this process are described in this section, starting with the coding of the focus group session material, continuing with the application of existing typologies, and finishing with the grouping and categorization of the tasks.

3.2.1 Focus Group Material

First, the audio recordings of the focus group session were transcribed to text and the text was annotated with key events. Annotation involved marking the points at which different stimuli were displayed (these were announced in the audio recording by the moderator). Relevant responses to prompts and stimuli were extracted, grouped, and labeled by order, topic and participant. Large, complex responses were decomposed into individual statements that formed cogent descriptive statements of simpler, independent patterns. Within these statements, the verbs and clauses were identified and marked to code the language being used by participants in describing their reading of the animated maps. In this format, tasks were more easily isolated for matching with applied existing typologies and for subsequent classification.

3.2.2 Task Coding

With the tasks from the focus group session material isolated, it was possible to code each one with multiple pieces of information, which are described in this section. This included applying tasks from existing typologies, which will be detailed in the next section.

Systematically, each isolated focus group session task was coded with the existing typology tasks and with its use of space, time, and attribute. Coding tasks by their use of space, time, and attribute was done for two major reasons. First, it was done to develop an understanding of the complexity of the task. Second, it was done to help relate the completion of the task to what dimensions of space, time, and attribute would be required in any visualization or representation method seeking to facilitate successful completion of that task. Focus group session tasks could, and often did, receive multiple codes and belonged to multiple task bins. Finally, each focus group task was coded with a working name. As tasks were coded, groups emerged in the process, and comments were written for each task, to help guide the final formation process. Figure 3-1 shows an example screenshot of the coding table used.

Figure 3-1: An example screenshot of the coding table

Following, Table 3-2 provides summary statistics on the number of times the task types from the three existing task typologies were used. These summary statistics help in describing the distribution of tasks coded to statements, in understanding the overall relationship of the existing

typology tasks to the focus group session material, and in linking the coding process with task categorization. Additionally, while the general task names and final task types that statements were given in the intermediate process of coding were not the ones ultimately applied in the resulting new typology, they did help to conceptually group tasks and were the pieces of the coding process that helped categorize the final set of tasks.

Table 3-2: Typology Task Coding Count

Existing Typology Task | Number of Times Coded

Wehrend (1993)
Identify | 41
Locate | 29
Distinguish | 29
Categorize | 10
Cluster | 30
Rank | 43
Compare | 15
Associate | 6
Correlate | 1

Blok (2000)
Appearance/Disappearance | 32
Mutation (Nominal) | 6
Mutation (Increase/Decrease) | 20
Movement Along a Trajectory | 23
Boundary Shift | 14
Moment in Time | 14
Pace | 3
Duration | 5
Sequence | 10
Frequency | 2
Cycle | 4
Trend | 4
Short Series Same/Different | 7
Long Series Same/Opposite/Different | 5
Long Series In Phase/Phase Difference | 3

Andrienko et al. (2003)
Identify / when>what + where / general when and general what + where | 5
Identify / when>what + where / general when, elementary what + where | 5
Identify / when>what + where / elementary when, general what + where | 5
Identify / when>what + where / elementary when and elementary what + where | 7
Identify / what + where>when / general when and general what + where | 18
Identify / what + where>when / general when, elementary what + where | 17
Identify / what + where>when / elementary when, general what + where | 0
Identify / what + where>when / elementary when and elementary what + where | 14
Compare / when>what + where / general when and general what + where | 1
Compare / when>what + where / general when, elementary what + where | 0
Compare / when>what + where / elementary when, general what + where | 1
Compare / when>what + where / elementary when and elementary what + where | 3
Compare / what + where>when / general when and general what + where | 0
Compare / what + where>when / general when, elementary what + where | 11
Compare / what + where>when / elementary when, general what + where | 0
Compare / what + where>when / elementary when and elementary what + where | 2
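
Counts of the kind reported in Table 3-2 can be produced by tallying the codes attached to each isolated statement, bearing in mind that a statement may carry several codes from each typology. The Python sketch below is only an illustration of that tallying step: the dictionary layout, the function name, and the three sample statements are hypothetical, and the coding in this thesis was recorded in a table rather than in code.

```python
from collections import Counter

# Hypothetical coded statements: each statement may carry several codes
# from each existing typology (the codes assigned here are invented).
coded_statements = [
    {"wehrend": ["Identify", "Locate"], "blok": ["Appearance/Disappearance"]},
    {"wehrend": ["Identify", "Rank"], "blok": ["Boundary Shift", "Trend"]},
    {"wehrend": ["Compare"], "blok": ["Appearance/Disappearance"]},
]


def tally_codes(statements, typology):
    """Count how often each task type of one typology was applied."""
    counts = Counter()
    for statement in statements:
        counts.update(statement.get(typology, []))
    return counts


wehrend_counts = tally_codes(coded_statements, "wehrend")
blok_counts = tally_codes(coded_statements, "blok")
print(wehrend_counts.most_common())
print(blok_counts.most_common())
```

Because statements can receive multiple codes, the totals for a typology can exceed the number of statements, as they do in Table 3-2.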

3.2.3 Existing Typology Application

As discussed in Chapter Two, tasks from existing typologies were applied to aid in grouping and categorizing the extracted and isolated focus group session tasks. The following task typologies were applied to the extracted focus group session tasks: Wehrend (1993), Blok

(2000), and Andrienko et al. (2003). The tasks from these three typologies were used in the coding process, applying these task types to those identified in the focus group session. Ideas from Bertin (1967/1983) and Yattaw (1999) provided additional guidance; however, their classification of tasks was not specific enough for use in coding. Each typology offered a different perspective in forming the new typology. Wehrend’s typology was largely used to understand and derive the active goals of visualization for each task, while Blok’s typology lent naming conventions and provided a foundation for similar movement patterns. The Andrienko et al.

(2003) typology was useful in forming task levels and in defining search targets by applying it within each task. Before the existing task typologies were applied, the definitions of their tasks

were extracted and numerically ordered for ease of coding. This section will describe the application and contributions of each typology.

3.2.3.1 Wehrend

Applying Wehrend’s (1993) list of visualization goals made it easy to identify the verb/clause structure present in a descriptive statement of the extracted map reading tasks from the focus group session material. The reading task verb actions in the focus group session tasks, as identified by the verb/clause structure, could then be matched to Wehrend’s “types of actions.”

These “types of actions” were so useful that, as more thoroughly described later, they were integrated with the typology itself. However, Wehrend’s “types of data” were deemed less useful in forming this new typology, as the type of data (aggregated spatiotemporal point) used here is covered by only a few (direction, position, and spatially extended region or object) of

Wehrend’s “types of data,” as his taxonomy was intended for visualization techniques generally, not just those that are geographic.

3.2.3.2 Blok

Blok’s (2000) typology was the most useful in grouping and naming the focus group session tasks, as her domain source (environmental monitoring) and visualization goals (change monitoring) were the most similar to this work. As a result, a number of task names, goals, and descriptions were drawn directly from Blok’s typology, including (with examples from the new typology developed here in parentheses): appearance/disappearance (presence/absence and arrival/departure), movement along a trajectory (path description), boundary shift (regional shift), cycle (same), and trend (same). Blok also divided her tasks by monitoring questions, with those

that use identification and comparison across short and long time series. However, I believe that these types of question uses are better defined by the more recent Andrienko et al. (2003) typology, and subsequently, theirs was the one used in defining question approaches for the new typology. Blok also categorizes her tasks by spatial, temporal, and spatiotemporal dimensions based on methods of reasoning, to help organize the types of tasks. The new typology acknowledges these dimensions, but uses the dimensions (also including attribute) as a way of helping to structure the tasks based on complexity and as a way of helping to link tasks with the dimensions of space-time-attribute needed in a visualization method to facilitate task completion.

As a result, Blok’s temporal reasoning tasks are not carried over directly.

3.2.3.3 Andrienko et al.

Three important features of the typology presented here were derived primarily from the

Andrienko et al. (2003) typology. First, their typology introduced the notion of levels, based on reading levels defined originally by Bertin (1967/1983) to categorize the object-oriented perspective of an exploratory task. While not directly adapted nor completely analogous, the new typology uses complexity levels based around the amount of effort required to complete a task.

These complexity levels are based on how many descriptive statements are required to complete the task and what use of space, time and attribute is required by a given visualization method designed to facilitate completion of the task.

Second, and most important, the Andrienko et al. typology can be fitted within the entire new typology to allow operational search target identification. Their typology separates task types into three dimensions: search level, search target, and cognitive operation. The cognitive operation divides tasks into those that identify and those that compare. Focusing on the behaviors of individual objects or groups of objects as the “what”, the search target has a divide between

those tasks that define a “when” and seek to find the “what” and “where” and those tasks that

define the “what” and the “where” and seek to find the “when.” Finally, the search level

decomposes into combinations of general level (overall) and elementary level (individual) with the “when” and the “what” and “where.”

This multi-dimensional approach of identifying an exploratory task can be applied to the tasks defined in this new typology by replacing the object or group of objects (representing the

“what” in the Andrienko et al. typology) with the pattern to be described in the completion of a task. In this scenario, a task that starts with an elementary “when” and seeks the elementary “what

+ where,” might instead be “Describe the [TASK X] pattern at the given time moment” as opposed to “Describe characteristics of this object (location) at the given time moment,” as defined by Andrienko et al. (p. 510). Operating in this way, it is possible to replace the “what + where” in the Andrienko et al. typology with the task given in the new typology to define specific examples for each. In the presentation of the new typology a few examples will be given (e.g.,

Describe the pattern of timing for Yellow Warbler during spring migration in North America.).

However, in using the new typology, the complete application of the Andrienko et al. typology to derive specific exploratory tasks will be left up to the reader.
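
For readers who wish to carry out that application, the three dimensions can be enumerated mechanically. The Python sketch below crosses the two cognitive operations, the two search targets, and the four search-level combinations to give sixteen task types, labeled in the same form used in Table 3-2, and fills the pattern name into the elementary-“when” wording quoted above. The list names and the `instantiate` helper are hypothetical conveniences, and only the one wording given in the text is templated.

```python
from itertools import product

OPERATIONS = ["Identify", "Compare"]
SEARCH_TARGETS = ["when>what + where", "what + where>when"]
SEARCH_LEVELS = [
    "general when and general what + where",
    "general when, elementary what + where",
    "elementary when, general what + where",
    "elementary when and elementary what + where",
]

# Every combination of cognitive operation, search target, and search level.
task_types = [
    " / ".join(combo)
    for combo in product(OPERATIONS, SEARCH_TARGETS, SEARCH_LEVELS)
]


def instantiate(pattern_name):
    """Fill a pattern from the new typology into the example wording for a
    task that starts with an elementary 'when'."""
    return f"Describe the {pattern_name} pattern at the given time moment"


print(len(task_types))
print(task_types[0])
print(instantiate("arrival/departure"))
```

Wordings for the remaining fifteen combinations are not given in the text and would need to be written by the reader in the same manner.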

3.2.4 Task Categorization & Structure

Once each focus group session task was coded, it was possible to group them into similar tasks, order the tasks, and form a typology based on their use of space, time, attribute, tasks applied from the existing typologies, and their individual roles in the domain experts’ process of characterizing patterns in the data. While aspects of typology formation were somewhat subjective on my part as a result of my experience with the domain, domain experts, and their

47 tasks, the applied existing typologies helped to keep the formation process more uniform and objective. Figure 3-2 provides a graphical depiction of the overall typology formation process.

Figure 3-2: Typology Formation Work Flow

This section describes how the typology was structured and some of the logic behind its formation. This includes a description of task levels, space-time-attribute use, visualization goals, task components, example tasks as driven by the application of the Andrienko et al. (2003) typology, example descriptive statements, and, finally, the format of the typology.


3.2.4.1 Task Levels & Components

The typology was formed hierarchically, grouping tasks into three levels of task complexity (low, intermediate, and high) based on the following: 1) how much a given task contributed to the completion of other tasks (low contributes to intermediate and high), 2) the length, detail, and complexity of the descriptive statement required to describe the pattern (high level tasks required longer statements), and 3) how fully a visualization method needs to utilize space-time-attribute to facilitate completion of the task (low level tasks may require only one dimension, such as space; high level tasks may require all three).

Of these three drivers of typology structure, the biggest was a task’s contribution to the completion of other tasks, derived empirically from the fact that the low-level, simpler tasks are components of the higher level tasks. Understanding the relationship between task levels depends on one observation: the descriptive statements for complex patterns in the domain analysis focus group session material were almost always composed of simple, fractional statements that could be considered individual tasks in their own right. In the process of extracting the tasks, applying the existing typologies, and categorizing and forming the typology, these simple statements were used to define the low-level simple tasks that, when combined, formed intermediate and high level tasks, whose resulting descriptive statements were the original complex pattern descriptions provided in the domain analysis process. Figure 3-3 provides a graphical representation of this relationship.


Figure 3-3: The mapped relationship between descriptive statements and tasks

Low level tasks are frequently completed by characterizing the pattern using simple statements, such as “western distribution.” Intermediate level tasks are often accomplished by combining a small number of low level statements to make comparisons (either intrinsic or extrinsic) amongst aspects of the pattern to characterize it. High level tasks require multiple statements from low and intermediate level tasks to form a construction of the overall pattern, yielding the lengthiest, most detailed, and most complex descriptive statements. As task complexity levels increase, successively more effort is required to complete the task, with longer and more thorough descriptive statements.

An important point is that tasks categorized in this new typology can overlap both in definition and completion. However, defining a hierarchy of order was important, as there was an apparent order based on the fact that a number of simpler tasks (those ordered as low level) had to be accomplished in completing more complex tasks (those ordered as intermediate or high level). Tasks can overlap in their goals, as some fuzziness between task categories is inherent and the boundaries are not crisp. The details behind typology structuring follow in the next section.

3.2.4.2 Space, Time, and Attribute Use

Each task in the typology presented below is specified with the minimum dimensions of space, time, and attribute (STA) use that a visualization or representation method must take advantage of in facilitating the successful completion of a task. The categorization process revealed that tasks could be grouped based on the complexity of their space, time, and attribute use. This minimum dimensional use can help to guide the choice of an appropriate representational method to facilitate accomplishing the goal of that task. For example, tasks that only require space or time might be best facilitated with static representations, such as a single map or a timeline, respectively. On the other end of the spectrum, tasks that require the combined use of all dimensions (space, time, and attribute) will demand a richer, more complex visualization method that combines multiple techniques such as advanced, multi-dimensional interfaces that allow filtering, querying, interaction, and scaling.

Thus, each task is assigned a minimal level of STA use that a visualization method must capitalize on to make the task accomplishable and the goal achievable. However, exceptions for each task may exist. Some tasks that I have listed as requiring at least space and time may actually function with only one or the other. In these cases, the typology user must decide if the dimensions of space, time, and attribute provided by a visualization method are adequate to facilitate completion of the task.
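One way to make this check concrete is to treat each dimension combination as a set of letters (S, T, A) and ask whether a visualization method supplies at least one of the combinations listed for a task. The sketch below does this for two entries taken from the outline in Section 3.3.4; the function name and the set-based encoding are assumptions made for illustration, and the exceptions noted above still require the typology user's judgment.

```python
def supports(provided: str, minimum_options: list[str]) -> bool:
    """Return True if the dimensions a method provides cover at least one
    of the minimum dimension combinations listed for a task."""
    have = set(provided)
    return any(set(option) <= have for option in minimum_options)


# Minimum STA use for two tasks, as listed in the typology outline.
MIN_STA = {
    "Presence/Absence": ["S", "T", "ST"],
    "Rate": ["STA"],
}

# A single static map is assumed here to provide only the spatial dimension.
print(supports("S", MIN_STA["Presence/Absence"]))
print(supports("S", MIN_STA["Rate"]))
# A method providing space, time, and attribute together.
print(supports("STA", MIN_STA["Rate"]))
```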


3.2.4.3 Visualization Goals

The visualization goals described by Wehrend (1993) served two purposes: they guided the formation of tasks identified in the focus group session into a new typology, and they are listed in the typology as example actions that can link each task to established visualization methods for accomplishing those goals. Wehrend’s method pairs a “type of action” with a “type of data” to identify visualization methods for completing a goal defined by the pairing. Table 3-3 provides Wehrend’s definitions (sans the graphical examples he references) for those “types of action.”

The visualization goals included for each task represent what I believe to be the most commonly sought goals for that particular task. However, others could likely be used for a given task. The matching of visualization goals to tasks is meant to represent commonly expected cases and is not meant to provide a complete list of all possible matches. Additionally, as Wehrend’s visualization goals were intended for a breadth of generic visualization methods (those not specific to geography), they are not a perfect fit, but instead a place to start in identifying appropriate methods for visualizing the data to accomplish the task presented in this typology.

Table 3-3: Visualization goals as defined by Wehrend (1993)

Visualization Goal: Definition

Identify: “To identify is to establish the collective characteristics by which an object is distinctly recognizable. Identification is the finest level of detail when an individual object is being considered.” (p. 188)

Locate: “To locate is to determine specific position. This position may be either absolute or relative. Locate is frequently associated with a spatially extended region or object. In these cases, there are two properties to consider: position and boundaries with respect to the spatially extended region or object. Boundaries include the magnitude, range, or distance over which it extends.” (p. 188)

Distinguish: “To distinguish is to recognize as different or distinct. This depth of detail necessitates only that objects be recognized as different; no identification is necessary.” (p. 188)

Categorize: “To categorize is to place in specifically defined divisions in a classification. Categorization is often considered when multiple objects are to be studied and some organization is desired.” (p. 188)

Cluster: “To cluster is to join into groups of the same, similar, or related type. Clustering is intended to be used for putting together things that need to be either conceptually or physically grouped. Whereas categorizing places objects into preexisting categories or groups, clustering creates the groups as the objects are placed in them.” (p. 188)

Rank: “To rank a data type is to give it an order or position with respect to other objects of like type. Ranking is intended to be used when an absolute or a relative ranking is to be given to some series of objects. For instance, it may be important to know that the maximum and minimum elements in a group are based on some type of ordering system. Ranking also implies some type of comparison, at least in a relative sense. Stating that one object is greater or smaller than another requires comparison. Objects can be ranked according to many criteria: alphabetical, numerical, or chronological.” (p. 188)

Compare: “To compare is to examine so as to notice similarities and differences. Comparing is important when two or more objects need to be looked at and no rank is implied for them. As we stated in discussing rank, comparison is similar to ranking, except that you may want to compare without explicitly ordering: you may want to compare two faces in order to see how they differ.” (p. 189)

Associate: “To associate is to link or join in a relationship. Association is used when a relationship is drawn between two or more objects that may be otherwise unrelated. These objects or attributes need not be of the same type and often are not.” (p. 189)

Correlate: “To correlate is to establish a direct connection. The connection may be causal, complementary, parallel, or reciprocal. Correlation is used when the relationship between two or more objects is connected in a manner that is important, if not obvious.” (p. 189)

3.2.4.4 Example Tasks & Example Accomplishments

For each task entry in this new typology, two types of examples will be given. The first will be an “Example Task,” as defined by the application of the Andrienko et al. (2003) typology, as previously discussed. These examples represent a typical case from the possible combinations that would arise from applying the Andrienko et al. typology to this new typology; they give some guidance on what a common task would look like in that scenario. The second will be an example descriptive statement that would be derived by accomplishing that task. This will be called an “Example Accomplishment.” For example, if the Example Task was “Describe the Absence/Presence pattern of Arctic Tern on St. Paul Island in October,” the Example Accomplishment would be “Arctic Tern is absent from St. Paul Island during October.” Example accomplishments are adapted from the responses provided in the focus group session.

3.2.4.5 Format

Each task has the following: the minimum dimensions of space-time-attribute a visualization must use to facilitate completion of the task, a full written description, a set of visualization goals derived from Wehrend’s set (Wehrend 1993) for possible use in linking the tasks to appropriate visualization methods, an example task as derived from the application of the Andrienko et al. (2003) typology, and an example accomplishment from the focus group session material. An example task description structure is presented in Table 3-4.

Table 3-4: An example typology task description structure

TASK
Minimum STA Use: Any combination of Space-Time-Attribute Use
Viz Goals: Examples from Wehrend (1993), such as: Identify, Locate, Distinguish
Description: Includes a written explanation of what the task requires to complete it and how it is most often used. Any qualifying information is provided.
Components: Any lower level tasks that contribute to the given task.
Example Task: A task as defined by the application of the Andrienko et al. (2003) typology to the given task, such as, “Describe the Absence/Presence of Arctic Tern on St. Paul Island in October.”
Example Accomplishment: An adapted statement from the focus group session, such as, “Arctic Tern is absent from St. Paul Island during October.”
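For readers who want to work with the typology programmatically, the entry structure in Table 3-4 maps naturally onto a simple record type. The sketch below is one possible encoding, populated with the Presence/Absence entry from Table 3-5 (description abridged); the class name, field names, and the `summary` helper are hypothetical and are not part of the typology itself.

```python
from dataclasses import dataclass, field


@dataclass
class TaskEntry:
    name: str
    min_sta_use: list[str]   # minimum space-time-attribute combinations
    viz_goals: list[str]     # Wehrend (1993) visualization goals
    description: str
    components: list[str] = field(default_factory=list)  # lower level tasks
    example_task: str = ""
    example_accomplishment: str = ""

    def summary(self) -> str:
        """One-line summary: name, minimum STA use, and visualization goals."""
        sta = ", ".join(self.min_sta_use)
        goals = ", ".join(self.viz_goals)
        return f"{self.name} [{sta}]: {goals}"


presence_absence = TaskEntry(
    name="Presence/Absence",
    min_sta_use=["S", "T", "ST"],
    viz_goals=["Identify", "Locate"],
    description=("Identifying in space, time, or space-time the presence "
                 "or absence of the target phenomenon."),
    example_task=("Describe the Presence/Absence pattern of Arctic Tern "
                  "on St. Paul Island in October."),
    example_accomplishment="Arctic Tern is present on St. Paul Island in October.",
)

print(presence_absence.summary())
```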


3.3 A Task Typology for the Movement Patterns of Aggregated Point Data

This typology is created with an assumption, based on past research, that different representations will facilitate different levels of inference affordance, and thus different tasks dependent on goal. Making links between tasks in this typology and appropriate forms of visualization remains a topic of future work, which will be discussed in Chapter Five. This section first presents information on the typology’s use, its potential generalization, how tasks are accomplished, and some of the sub-tasks present in the typology. Then, an outline of the typology will be followed by the typology itself, before a summary of this chapter is presented.

3.3.1 Generalization

While this typology is closely linked to tasks derived from a domain-specific situation, having applied more general existing task typologies means that there is potential for this typology to be generalized to other situations involving similar data. One example is data about bird flu, or other avian-borne diseases, which will also be based on aggregate point (incident) data and can be expected to exhibit similar movement patterns. The typology is expected to apply to any data with movement patterns similar to those of aggregated bird populations. This may include other animal species, cell phone data, mobile device tracking, or any number of other data sets that aggregate inconsistently sampled point data to study collective movement patterns.


3.3.2 Task Accomplishment

This section details the process involved in accomplishing a task. Some of this material has been introduced in other ways within this chapter. This section differs in stringing the process together from a perspective of using the task typology.

Most simply, a task as conceptualized here is accomplished by forming a descriptive statement of the pattern sought by the task. However, the complete trajectory of task accomplishment is more nuanced. The flow associated with accomplishing a task can be found in Figure 3-4. First, a user selects an analysis or exploratory goal, directed at the data. Then, with the goal in mind, a user identifies a task that helps them reach that goal. With a task identified, the user would then apply the Andrienko et al. (2003) typology to form a specific example for accomplishment. To accomplish the task, the user visualizes the data. Here, the visualization goals (Wehrend 1993) and the space, time, and attribute use would guide the user to select an appropriate method of visualization. Then, based upon the visualization, the user can accomplish the task and the goal, as reflected in the form of a descriptive statement that answers the question posed by the task. If a task is of a high or intermediate level, the user will first have to accomplish the low level tasks that compose that given high or intermediate level task. Following this flow, from selecting a goal to describing a pattern, best facilitates the exploratory process.
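The requirement that component tasks be accomplished first can be sketched as a dependency walk over the Components entries of the typology. The mapping below is a simplified reading of the Components fields in Tables 3-5 through 3-7: optional or “often used” components are omitted, Directional Description is treated as having no sub-components (following its low level classification), and Regional Shift is chosen where Rate allows “Path Description and/or Regional Shift.” The function name is hypothetical.

```python
# Simplified component mapping for the tasks needed by the Rate example.
COMPONENTS = {
    "Presence/Absence": [],
    "Extent Characterization": [],
    "Outlier Identification": [],
    "Directional Description": [],
    "Timing": ["Presence/Absence", "Extent Characterization"],
    "Regional Shift": ["Extent Characterization", "Directional Description"],
    "Density Characterization": [
        "Extent Characterization",
        "Presence/Absence",
        "Outlier Identification",
    ],
    "Rate": [
        "Extent Characterization",
        "Regional Shift",
        "Timing",
        "Density Characterization",
    ],
}


def accomplishment_order(task: str, components: dict[str, list[str]]) -> list[str]:
    """List the tasks to accomplish, components before the tasks they compose."""
    order: list[str] = []

    def visit(name: str) -> None:
        if name in order:
            return
        for part in components[name]:
            visit(part)  # accomplish each component first
        order.append(name)

    visit(task)
    return order


for step in accomplishment_order("Rate", COMPONENTS):
    print(step)
```

The walk assumes the component relationships contain no cycles, which holds for the hierarchy as described, since components always sit at the same or a lower level than the task they compose.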


Figure 3-4: A graphical depiction of the flow required for accomplishing a task.

3.3.3 Sub-tasks

A few tasks identified through domain analysis were not distinct enough to warrant their own specific type. As a result, these have been categorized as sub-tasks under the main tasks they are based upon. Both sub-tasks (Arrival/Departure and Density Characterization) are specific examples of their parents, but differ from their parents in the methods required to accomplish them. These sub-tasks can be considered examples of where the typology categorization is a bit fuzzy. Nonetheless, these sub-tasks are valuable individually and for intermediate and higher level task accomplishment.


3.3.4 Outline

Tasks are ordered hierarchically, as described, and include, in parentheses, the minimum dimensions of space, time, and attribute use necessary for any visualization method to facilitate completion of that task.

1. Low Level Tasks
   A. Presence/Absence (S, T, ST)
   B. Extent Characterization (S, T, ST)
   C. Outlier Identification (S, T, ST)
   D. Directional Description (ST)
2. Intermediate Level Tasks
   A. Timing (T, ST)
      a. Arrival/Departure (T, ST)
   B. Regional Shift (ST)
   C. Cluster Formation (ST)
      a. Density Characterization (SA, TA, STA)
   D. Path Description (ST)
3. High Level Tasks
   A. Rate (STA)
   B. Trend (STA)
   C. Cycle (STA)
   D. Cluster Dynamics (Multi-object Tracking) (STA)
   E. External Information Association (STA)

3.3.5 Descriptions of Task Types

Table 3-5 presents the low level tasks.

Table 3-5: Low Level Tasks

A. Presence/Absence
Min. STA Use: S, T, ST
Viz Goals: Identify, Locate
Description: This type focuses on identifying in space, time, or space-time the presence or absence of the target phenomenon. Spatial and temporal scale can range from the finest resolution to the broadest. As a low level type, there are no sub-components, but it often acts as a necessary component of higher level tasks.
Components: None.
Example Task: Describe the Presence/Absence pattern of Arctic Tern on St. Paul Island in October.
Example Accomp.: “Arctic Tern is present on St. Paul Island in October.”

B. Extent Characterization
Min. STA Use: S, T, ST
Viz Goals: Identify, Locate, Distinguish
Description: The description of the spatial, temporal, and spatiotemporal extent of a phenomenon relies on both defining boundaries and describing frequencies. The detail of the boundary provided can range from rather simple, such as a term like “Western” or “October,” to a more nuanced demarcation of range at a finer scale. Extent Characterization departs from Presence/Absence when the identification of extent or range shifts from locating existence at single locations given a time to describing the entire extent of existence given any range of times. Of the low level tasks, Extent Characterization may often be the lengthiest to carry out, due to the amount of information needed to do the task.
Components: None.
Example Task: Describe the pattern of Extent for Boreal Chickadee.
Example Accomp.: “Boreal Chickadee occurs throughout Alaska and Canada south to southern Canada and some of the northernmost United States.”

C. Outlier Identification
Min. STA Use: S, T, ST
Viz Goals: Identify, Locate, Distinguish, Cluster, Categorize, Compare
Description: This type involves separating and identifying outliers or anomalies from the remainder of the pattern. Spatial and temporal scale can range from the finest resolution to the broadest. However, this task is often performed at finer-scale discrete instances. In the domain-specific example given here, this task is imbued with some level of subjectivity as to how anomalous a point really is from the overall pattern. In many other applications, this task will have a more precise statistical interpretation for identifying outliers.
Components: None.
Example Task: Describe the pattern of outliers for Western Grebe.
Example Accomp.: “Western Grebe is a vagrant to the east in the winter only.”

D. Directional Description
Min. STA Use: S, T, ST
Viz Goals: Identify, Distinguish, Categorize, Compare
Description: This type combines space and time to derive directional movement of a phenomenon. Directional fineness can be as broad as a primary cardinal direction or as detailed as describing movement toward a specific physiographic feature. Descriptors often include terms such as “from,” “to,” “towards,” “along,” and “away.”
Components: This type requires comparing previous and current states of Extent Characterization or Presence/Absence to derive directional movement. It is categorized as a low level task as it contributes to many intermediate and high level tasks.
Example Task: Describe the pattern of Directional movement for Western Grebe during the fall and winter.
Example Accomp.: “Western Grebe moves south in fall and winter.”


Table 3-6 presents the intermediate level tasks.

Table 3-6: Intermediate Level Tasks

A. Timing
Min. STA Use: T, ST
Viz Goals: Distinguish, Categorize, Compare
Description: This type focuses on time in relating the temporal behavior through space, with descriptors of the pattern often given as specific moments in time. Largely, this refers to the temporal description of movement, starting with a group, or entity, and describing when or where it is or is not at a given location. Expectations of occurrence through time are often included.
Components: This type often requires using and combining both spatial and temporal dimensions of Presence/Absence and Extent Characterization, but can also include Directional Description.
Example Task: Describe the pattern of timing for Yellow Warbler during spring migration in North America.
Example Accomp.: “Yellow Warbler appears everywhere within a three day window. The event is that fast.”

A.a. Arrival/Departure
Min. STA Use: T, ST
Viz Goals: Distinguish, Categorize, Compare, Rank
Description: The most common sub-type of timing is the acknowledgement of arrival or departure to or from a location at a given time. The most desired temporal resolution for this sub-task is single-day, but often weeks and even months can be used. This sub-task often involves ranking, categorizing, and comparing, using terms such as “First,” “Last,” “Early,” and “Late.”
Components: This type often requires using and combining both spatial and temporal dimensions of Presence/Absence and Extent Characterization, but can also include Directional Description.
Example Task: Describe the pattern of Arrival/Departure for Purple Martins during spring migration in North America.
Example Accomp.: “East of the Mississippi, Purple Martins arrive first, in late March, compared to birds in the west that arrive in mid-April.”

B. Regional Shift
Min. STA Use: ST
Viz Goals: Locate, Distinguish, Compare
Description: This type characterizes changes in existing extents over time. Often these are described over longer windows of time, but they can occur over shorter periods. The task involves first identifying and describing the extent of the phenomenon and then noting, over time, changes in the extent, usually as expansions or contractions described with the direction and often the magnitude of the shift.
Components: Completing Extent Characterization and Directional Description is necessary.
Example Task: Describe the pattern of Regional Shift for Dickcissel over the period of 2003 to 2006 for the central Great Plains.
Example Accomp.: “Dickcissel experiences a huge pulse in 2005 along the Colorado-Kansas border, shifting east.”

C. Cluster Formation
Min. STA Use: ST
Viz Goals: Cluster, Categorize, Compare
Description: This type of clustering involves identifying sub-groups of the whole pattern and treating them as independent objects. The most common form of this type is the study of sub-populations of a whole, those that have distinct geographical boundaries. This includes space-time clustering, such as abundance peaks in concentration for periods of movement, and is required to describe the distribution and movements of sub-populations.
Components: Completing Extent Characterization, Presence/Absence, and Outlier Identification is essential to completing this task.
Example Task: Describe the pattern of Clusters for Purple Martin in North America.
Example Accomp.: “Purple Martin has sub-populations in Arizona and the Pacific Northwest, when compared to the main population east of the Mississippi.”

C.a. Density Characterization
Min. STA Use: SA, TA, STA
Viz Goals: Categorize, Cluster, Rank, Compare
Description: This type involves characterizing the distribution within known extents in terms of density, as opposed to simple cluster outlining. Regions of high or low density are often discerned from the pattern as a whole. This type can be used either to describe spatially variable densities of occurrence or to identify locations in space-time that have distinctive patterns of density.
Components: Completing Extent Characterization, Presence/Absence, and Outlier Identification is essential to completing this task. The use of attribute with these components sets this type apart from Cluster Formation.
Example Task: Describe the pattern of Density for Snow Goose during wintering months in North America.
Example Accomp.: “Naturally occurring populations of Snow Goose in California, Texas, and New Mexico represent thousands of birds, whereas everything east of the Mississippi represents individual birds.”

D. Path Description
Min. STA Use: ST
Viz Goals: Cluster, Compare, Categorize, Associate
Description: This type involves the description of a path or route of movement. In the domain-specific examples given here, this was seen as a way of characterizing the migratory routes or dispersal routes of bird species. Often, this is done by describing regions of seasonal movement between areas where the phenomenon was sedentary, but it can take on descriptions more complex than those that describe a simple one-directional path between two regions. Often the paths are not spatially linear or direct. Additionally, the paths can be considered to have area, and describing the path area is a good method for characterizing the extent of migration in this instance.
Components: Completing Extent Characterization and Directional Description is required to define both the path and the regions that the path connects. As the path is often not a consistently occupied region in space or time, Timing is often used to describe the temporal origins and ends of the path use.
Example Task: Describe the pattern of Paths for Purple Martin during spring migration in North America.
Example Accomp.: “In the west, you see Purple Martins migrate along the California coast towards the Pacific Northwest.”

Table 3-7 presents the high level tasks.

Table 3-7: High Level Tasks

A. Rate
Min. STA Use: STA
Viz Goals: Distinguish, Categorize, Rank, Compare
Description: Completing the Rate type demands describing the density of movement over a given period of time. Rate can be applied to arrival and departure of migration, or to the dispersal or contraction of a distributional change. Commonly used descriptors include “fast,” “slow,” “quickly,” and “gradual.” Most often, rate requires reading of attribute space to judge the flow or wave properties of a movement. Rate goes beyond timing by describing the magnitude of movement adjusted by unit time.
Components: This type requires the completion of Extent Characterization, Path Description and/or Regional Shift, to define the spatial dimensions. Most important is the completion of Timing, combined with Density Characterization (the use of attribute in this case) to form a more complex spatiotemporal description.
Example Task: Describe the rate of Purple Martin migration patterns in North America.
Example Accomp.: “Purple Martin migration experiences a slower progression north in the spring, but in the fall they virtually vanish off of the map in a very short period.”

B. Trend
Min. STA Use: STA
Viz Goals: Categorize, Rank, Compare, Associate, Correlate
Description: This type of task involves characterizing the trend in occurrence through time of any length and over space of any extent. In the domain-specific example given here, this task is most often completed at a period of one year to describe migrations and longer periods (multi-year) to describe distributional changes. This can be done through descriptors such as “increasing,” “decreasing,” “not changing,” or any number of other terms often used to describe time series. The use of the attribute dimension is necessary to characterize properties of volume and density.
Components: The completion of this type requires using Extent Characterization, Presence/Absence, Directional Description, and Timing as building blocks for informing the description of Regional Shift and Density Characterization. Patterns detected in Cluster Formation and Outlier Identification are also often used in the completion of this type.
Example Task: Describe the pattern of Trend for Dickcissel over a ten-year period in North America.
Example Accomp.: In studying the movement of Dickcissel, a completion of this task type would involve describing year-to-year distributional shifts (for example, from west-central Great Plains to the Midwest) and noting expanse and density increase in eastern states.

C. Cycle
Min. STA Use: STA
Viz Goals: Categorize, Compare, Associate, Correlate
Description: This complex type involves describing the cyclical nature of movement through time and over space. Building on both forms of lower tasks, the description of cycle demands the definition of starting and end points, paths, direction, and timing. Often a cycle will have both stationary and mobile periods. Important to these are descriptions of frequency (how often a pattern is repeated) and persistence (how long a pattern persists). Additionally, it is possible to describe long-term changes in short-term cycles, such that this task can be temporally multi-scale.
Components: It is necessary to complete at least the following when completing a Cycle description: Extent Characterization, Presence/Absence, Directional Description, and Timing, as informing Path Description and often Density Characterization. In addition, Cluster Formation and Outlier Identification help to fully describe a given cycle.
Example Task: Describe the pattern of Cycle for a year of Cerulean Warbler movement.
Example Accomp.: Characterizations of a cycle often require describing parts in seeking to construct a scenario that best relates the overall pattern. Starting with a description of the extent and density of wintering grounds, one would proceed to discuss the timing and paths of migration from wintering grounds to breeding territory, noting the clustering and dense localization associated with migration stopovers. Following this, a description of the summer extent and a detailing of the fall migration back to the wintering grounds would certainly include a notation of the pattern of vagrant outliers to the east.

D. Cluster Dynamics
Min. STA Use: STA
Viz Goals: Categorize, Compare, Associate, Correlate
Description: This type draws on Cluster Formation and Path Description; however, its use goes beyond simple paths. Multi-object tracking is often completed in this task, and can include any number of ways of describing the behavior of one or more cluster groups through space and time, including path description, but also potentially describing intrinsic changes in a cluster, cluster movement timing, and behavior relative to other clusters. Monitoring multiple clusters requires making spatiotemporal comparisons between clusters and characterizing disparate behaviors as a way of understanding the overall, holistic pattern.
Components: Precursors of both Cluster Formation and Path Description are necessary. Directional Description, Density Characterization, Regional Shift, and Rate are needed to provide a thorough understanding of multi-object space-time patterns.
Example Task: Describe the pattern of Cluster Dynamics for Sandhill Crane sub-populations during fall migration in North America.
Example Accomp.: “With Sandhill Cranes, one can see all the subpopulations: the Western population breeding in the west, the birds near the Great Lakes, and then the sub-group in Florida. In September, the Western birds begin to migrate, but the eastern birds stay put. Then, the Great Lakes birds move and the Florida population stays.”

E. External Information Association
Min. STA Use: STA
Viz Goals: Associate, Correlate
Description: The most complex task, Association requires relating prior knowledge or other information sources with the target pattern. Examples would include relating times of breeding, wintering, and migration with spatiotemporal behavior. This could also include molt information, species-specific biology, or other related abiotic or biotic factors.
Components: Any previous types can be used as well as those not present in this typology.
Example Task: Describe the pattern of nest box use for Purple Martin detectability during spring migration in North America.
Example Accomp.: “The eastern population of Purple Martin uses nest boxes, so they’re more detectable when they arrive.”

3.4 Summary

This new task typology for the movement patterns found in aggregated spatiotemporal point data has been formed through domain analysis, used to identify the map reading tasks of domain experts, and through the application of existing task typologies to aid in categorization. While the focus group material used to generate this new typology and the examples given in it are domain-specific, applying existing typologies in the formation process means that the typology should be applicable to any situation involving movement patterns found in aggregated spatiotemporal point data. This typology differs from existing typologies in that it focuses on a more specific type of data (aggregated spatiotemporal point) and a broader set of task goals (movement patterns). It will be especially useful in guiding the design of geovisualization tools that require temporally dynamic forms of representation and those that capitalize on heterogeneous spatiotemporal point data to support the analysis of movement patterns present in those data.

This chapter presented the methods used to perform domain analysis, the process of forming a task typology of tasks derived from domain analysis, and the task typology itself. The process of determining the animated map reading tasks of domain experts through a focus group session generated ample material for use in forming a new task typology. By coding the focus group session material to identify tasks, applying existing typologies to those identified tasks, and categorizing tasks, a new task typology was formed. Finally, the task typology itself, along with a discussion of its use, was presented. The description of this process helped to link the outcomes of the domain analysis with the task typology by describing the typology formation process, as guided by existing task typologies directed at spatiotemporal patterns.

4 Task-based Experiment

The task-based experiment had one primary goal, to answer the question: does explicitly representing geographic time-series change with a bivariate map symbol help or hurt the recognition of patterns of change in an animated time-series map? This chapter focuses on methods and results of the primary experiment. First, the decisions associated with selecting bivariate symbols, designing animated maps, and choosing experimental design variables will be presented. Second, a discussion of experiment execution and predictions for experiment results are given. Finally, the results of the experiment and discussion of those results will be presented, along with a discussion of some limitations of the experiment.

4.1 Methodology

Once tasks were extracted from the focus group session material and the new typology was formed, the process of designing a task-based experiment to answer the main question of this thesis could proceed. This required a series of decisions. First, bivariate map symbols had to be selected to represent both the magnitude of point data values and the magnitude of change between frames. This selection process was guided by selective attention theory. Elements of animated map design for use in the experiment had to be decided upon, with many previous studies informing those decisions. A selection of tasks was made (guided by the task typology), the experiment question and response structure was built, and statistical analysis methods were identified. After a digital survey was coded and constructed to collect experiment response data, a pilot study was completed with 20 Penn State graduate students to refine the layout and format of the digital survey, check for errors, test the statistical analysis methods, and decide upon final task selections. After making changes identified in the pilot study, the main experiment was performed with 63 Penn State undergraduate students. This section reports on the details of this process.

4.1.1 Symbol Selection

A key goal this research addresses is to investigate the effectiveness of symbolization choices for showing change in time series data in animated maps. Thus, a key decision for the experiment was which symbols were to be compared. The independent variable in the experiment, then, was the variation in symbols.

As noted above, selective attention theory guided symbol choice. Selective attention theory identifies three categories of dimensional interaction for bivariate symbols: integral, separable, and configural. As previously mentioned, configural variables (which represent a middle ground between integral and separable symbols) are not studied in this thesis. As a result, one integral and one separable variable pair were to be selected. In addition, to ground the interpretation of results for the two options, a univariate symbol that did not represent change was added as a control. The starting point for selecting the two bivariate symbols (one integral and one separable pair) was Nelson’s (2000b) published list of commonly used bivariate symbols, categorized by their dimensional interactions from selective attention theory. However, to fully explain the choices made, it is necessary to discuss aspects of visual variable syntactics and strategies for representing change that had an impact on symbol selection.

Visual variable syntactics as described by MacEachren (1995) were used to match data measurement levels to the appropriate symbol use. Bivariate symbols in this setting represented AKN point data values and the change between frames of those point values. The AKN point data values are numerical data and best represented with visual variables that are effective for symbolizing numerical data. MacEachren lists location and size as good for numerical data and color value, color saturation, color hue, texture, and orientation as marginally effective. Second, change could be represented as nominal (change – yes/no), ordinal (positive, no, negative), interval (standard deviations of change), or numerical (magnitude of change). However, merely showing the presence or absence of change (nominal) was not seen as particularly valuable and methods for classifying and calculating meaningful quantitative levels of change (interval and numerical) proved difficult with little guidance on doing so, especially for an animated map. In the end, change was represented ordinally.

Representing positive, no, and negative change means that symbols should show a divergent scheme, for increasing, decreasing, and no change. Symbolizing a negative value of change posed an interesting problem for symbol design. An additional problem was identified when experimenting with a symbol pair that used size as the variable for the point value data. Locations on the map changing from a value greater than zero to a value of zero (disappearance) were found to experience significant negative change and needed to be symbolized as such, despite showing no value and having no size as a symbol. This ruled out size as a practical variable for representing the data magnitude; thus some alternative to size was needed for representing the point data values. These two problems will be discussed on a case-by-case basis below.
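The ordinal treatment of change, and the disappearance case that ruled out size, can be illustrated with a minimal sketch. The function name, the cell identifiers, and the example values are hypothetical, and treating only exactly equal values as "no change" is an assumption; the thesis does not state whether a tolerance was used.

```python
def ordinal_change(previous, current):
    """Classify change between two frames: +1 increase, 0 no change, -1 decrease.

    Assumption: values must be exactly equal to count as no change.
    """
    if current > previous:
        return 1
    if current < previous:
        return -1
    return 0


# Hypothetical grid-cell values for two consecutive frames.
previous_frame = {"cell_a": 4.0, "cell_b": 2.0, "cell_c": 3.0, "cell_d": 0.0}
current_frame = {"cell_a": 6.0, "cell_b": 2.0, "cell_c": 0.0, "cell_d": 0.0}

change = {
    cell: ordinal_change(previous_frame[cell], current_frame[cell])
    for cell in previous_frame
}

# Disappearance: a cell that held a value and now holds zero. Its change is
# negative even though its current value is zero, so a size-based symbol
# would have nothing to draw for it.
disappeared = [
    cell
    for cell in previous_frame
    if previous_frame[cell] > 0 and current_frame[cell] == 0
]
```

In this example `cell_c` is the disappearing location: it must still carry a "decrease" symbol in the current frame, which is why a variable other than size was needed for the data value.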

Having eliminated size as a potential visual variable, the remaining separable pairs identified by Nelson (2000b) include: dot numerousness – hue, point symbol shape – hue, point symbol value – shape, and orientation – hue. After testing these in an animated setting, dot numerousness – hue was eliminated as it was impossible to represent locations experiencing disappearance as the point data value was represented by the number of dots (zero value meant zero dots with no way to congruently represent the negative change). Point symbol forms were given up or down arrows indicating positive or negative change, but this created significant interaction problems with orientation that made discerning the directions of the arrows in a map impossible. The only remaining choice for a separable pair was orientation (change) – hue (data value) (Figure 4-1).

Figure 4-1: Legend for Bivariate Separable Symbol using Orientation (for change) and Spectral Scheme (for data value)

A thick vertical line represented increase, a small square in the middle represented no change, and a thick horizontal line represented decrease (as a minus sign). Color hue, though, is only a marginally effective visual variable for numerical data (MacEachren 1995). A solution was found in applying a selection from a carefully designed spectral scheme (Brewer 1997). However, the spectral scheme is not composed of hue alone, but instead includes hue, value, and saturation as visual variables, providing more contrast between steps. While using this part-spectral scheme meant that numerical data could be represented, in part, with hue, this final variable pairing was not completely faithful to the original identified by Nelson (2000b). Additionally, using the spectral scheme meant that zero values could be included as part of the scheme. This variable pair is termed “Bivariate Separable” in subsequent discussion.

Nelson (2000b) categorized only two examples of integral variable pairs: saturation-value and height-width. Both of these presented problems, as the use of size in height-width made it impossible to represent negative change. Likewise, it was impossible to represent a divergent scheme with only color saturation or value. Despite not testing it in a map setting, Nelson (2000a) identifies hue-saturation as being integral based on psychological experiments reported in the literature. However, MacEachren et al. (1998b) found that it performed poorly compared to a separable texture overlay and an adjacent display of secondary information, suggesting that it is not the optimal choice of integral variable pair. Regardless, it was the only one available: the types of change (negative, no, positive) could be represented by different hues, and the point data value could be represented primarily by saturation of a given hue for a particular type of change (Figure 4-2).

When creating symbol schemes that use saturation with graphic design software such as Adobe Flash (with ActionScript 3.0), adjusting saturation has an impact on both saturation and value, such that the final symbolization is actually a combination of both. While saturation can be set independently in Flash, changing the saturation of a color also changes its value.

Figure 4-2: Legend for Bivariate Integral Symbol using Value/Saturation (for data value) and Hue (for change)

This variable pair is termed “Bivariate Integral” in subsequent discussion. One important distinction to make between the Bivariate Separable and Bivariate Integral legends is that while the Bivariate Separable symbols are shown as discrete points in the legend, the Bivariate Integral symbols are shown as a continuous shading scheme. However, the symbols (grid cells or shapes) for both Bivariate Separable and Bivariate Integral were represented on the map as unclassed and along a continuum, not as the three discrete categories drawn in the Bivariate Separable legend. That legend was difficult to draw as a continuum, which would have more closely represented the symbols drawn on the map. Figure 4-3 and Figure 4-4 provide examples of both bivariate symbols in the digital survey setting. A discussion of classification will be presented in the next section.

Figure 4-3: An example map using the Bivariate Integral symbol in the digital survey setting

Figure 4-4: An example map using the Bivariate Separable symbol in the digital survey setting

The non-change control group symbol was based on the existing symbolization of maps created by the AKN staff. This symbolization used increasing value and decreasing saturation in green to represent an increasing number of bird sightings. Please see Figure 4-5 for an example of their maps and Figure 4-6 for the legend of the non-change symbol, termed “Univariate Non-Change,” used in the experiments. The same caveat about graphic design manipulation of saturation and value discussed above for the Bivariate Integral symbol applies to this symbolization as well, such that it is actually a scheme of saturation and value, not just value.

Figure 4-5: Example static AKN map (eBird 2009)

Figure 4-6: Legend for Univariate Non-Change Symbol using Color Value/Saturation (data value)

To more completely understand the effectiveness of the two bivariate change symbolizations, univariate change symbols showing only the ordinal level of change (and no point data value) occurring at a given location based on the two integral and separable symbols were included in the experiment. These were abstracted from the bivariate examples and can be seen in Figure 4-7 and Figure 4-8. Adding these resulted in a total of five variable pairs (or five testing groups) for the experiment.

Figure 4-7: Legend for Univariate Change Symbol Using Hue

Figure 4-8: Legend for Univariate Change Symbol Using Orientation

4.1.2 Animated Map Design

Harrower (2003, p. 1) cautions map-makers faced with creating animated maps, saying that “Cartographers who want to use animation to make a better map must know the strengths and limitations of animation as a tool, and how map-readers are likely to be impacted by animation.”

This is especially true when animated maps are to be designed for use in experiments. Animated map use in this situation requires careful consideration and control of many design elements, so that unaccounted variability does not impact the response data and results can speak to the intended variation as faithfully as possible. Relevant decisions as they apply to data format, classification, interactivity, temporal legends, tweening, smoothing, and frame rate will be described in this section.

4.1.2.1 Data Handling

Map animations of choropleth maps are not uncommon. Calculating the differences between frames is easily done with animated choropleth maps when the collection units are the same. However, as the AKN data is collected and stored as discrete, rarely overlapping, points in space and time, it is necessary to aggregate the data both spatially and temporally. The spatial aggregation of raw data into 100 km grid cells allows direct calculation between time frames, thus it is possible to calculate change. Spatial aggregation also allows a variety of symbol types, as previously discussed, to be used in gridded format. Data consistency can be highly variable at the finest temporal resolutions of a single day or even a single hour. Temporally aggregating the data to weeks yields a more consistent pattern, one without gaps in sampling.
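The aggregation step described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to prepare the AKN data: observations are assumed to be already projected to kilometre coordinates, counts are assumed to be summed within each cell and week, and all names and example values are hypothetical.

```python
from collections import defaultdict

CELL_SIZE_KM = 100   # grid cell size given in the text
DAYS_PER_WEEK = 7    # weekly temporal aggregation given in the text


def aggregate(observations):
    """Sum counts per (week, grid cell).

    Each observation is (x_km, y_km, day_index, count). Summing, rather
    than averaging, is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    for x_km, y_km, day_index, count in observations:
        cell = (int(x_km // CELL_SIZE_KM), int(y_km // CELL_SIZE_KM))
        week = day_index // DAYS_PER_WEEK
        totals[(week, cell)] += count
    return dict(totals)


def change_between(totals, week_a, week_b):
    """Per-cell difference (week_b minus week_a); absent cells count as zero."""
    cells = {cell for (week, cell) in totals if week in (week_a, week_b)}
    return {
        cell: totals.get((week_b, cell), 0.0) - totals.get((week_a, cell), 0.0)
        for cell in cells
    }


# Hypothetical point observations: (x_km, y_km, day_index, count).
observations = [
    (120.0, 40.0, 0, 3),
    (180.0, 95.0, 5, 2),
    (130.0, 60.0, 8, 9),
    (410.0, 250.0, 2, 4),
]

weekly = aggregate(observations)
delta = change_between(weekly, 0, 1)
```

Because every frame shares the same grid, the difference between frames is defined for every cell, including cells that are empty in one of the two weeks.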

4.1.2.2 Classification

Classification is not a straightforward task when creating a series of animated maps using multiple examples of species with different ranges and distributions of data. One possible solution is to leave the data unclassified. Slocum et al. (2008) reviewed the results of experimental studies on both classed and unclassed maps, but did not conclusively advocate either for any particular situation, noting that no studies made direct comparison between the classed and unclassed maps. Harrower (2007) pits a classified map animation against an unclassified map animation in a test of quantity estimation. While his experiment indicated that unclassed maps performed no better for that task, qualitative suggestions by participants indicated that the unclassed maps may have in fact been better for revealing subtle geographic shifts in regional patterns.

While translating knowledge derived from studies done on one map form to a different map form is not well understood, unclassified data and legends may be reasonably applied in this instance. This is due to the fact that map reading tasks for use in this experiment are similar to what unclassed legends were successful for in Harrower’s (2007) experiment. Harrower offers further support for unclassed animation by arguing that animated classed maps create artifacts of classification in that some locations frequently change classes when located near a break while others can experience significant change in value while staying entirely within a class.

Unclassed maps, however, consistently utilize the full range of a color scheme and will disguise any differences in overall magnitude from one data set to another. This is not an important issue in this case, though, as none of the tasks ask participants to make comparisons between data sets.
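The trade-off noted above can be seen in a small sketch. It assumes that an unclassed scheme places each value at a position between 0 and 1 along the color ramp by min-max scaling within its own data set; this scaling rule and the example values are assumptions for illustration, not a description of the maps built for the experiment.

```python
def unclassed_position(values):
    """Scale values to [0, 1] positions along a color ramp (min-max scaling).

    Assumption: each data set is scaled to its own minimum and maximum.
    """
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Two hypothetical species with very different overall magnitudes.
common_species = [0, 50, 100]
rare_species = [0, 1, 2]

common_positions = unclassed_position(common_species)
rare_positions = unclassed_position(rare_species)
```

Both data sets occupy identical positions on the ramp, so the maps look alike even though one species is fifty times more abundant. This matters only when readers compare across data sets.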

4.1.2.3 Temporal Legends

Temporal legend use in any animated map setting requires careful consideration. When evaluating trends or patterns of change in spatiotemporal data, users must have a time referent and awareness of the passage of time in an animation. Edsall et al. (1997) tested three forms of temporal legend (linear, cyclical, and text-based), failing to find that any were significantly more effective than any other; however, the experiment had flaws and used a small sample of participants. Experimentation by Midtbø et al. (2007) was similarly inconclusive in testing the effectiveness of many different temporal legend forms, but did yield user preferences towards temporal legend form, in that circular legends were preferred for cyclical data (24 hour periods), while linear legends were preferred for non-cyclical data (years, decades). Slocum et al. (2004) found that participants naturally thought of clocks and calendars in association with time, but implemented a linear time line, suggesting that such time lines should be detailed and clearly integrated with the map. Considering that the temporal extent of many animation examples used in the experiment is only five to fifteen frames long, representing a linear progression for a sub-seasonal period, a linear legend is the most logical choice.

4.1.2.4 Interactivity

Allowing a user to interact with an animated map in an experiment setting, where they could adjust the dynamic variables, has the potential to introduce unaccounted levels of variability to experiments. Since interaction with animated maps is not a focus of this research, it was not used in the experiment. Experimentally, eliminating user interaction with maps in order to focus on the key independent variable of symbol type is important. However, the use of non-interactive maps can be frustrating for users due to disappearance (features are only visible for a short time) and personal preferences for animation frame rates (Harrower 2007). In animated map experiments, Harrower addressed this problem by substituting a looping time series animation for the individual animation rate adjustment often provided in interactive animated maps. In his experiment, participants saw the same animation, at the same rate, without the ability to adjust rate and introduce variability, but the animation looped repeatedly. Looping alleviates potential varying preferences for animation speed amongst users, as they have an opportunity to adjust to the given rate and answer questions after becoming accustomed to the rate. Additionally, the impact of disappearance is lessened as users have an opportunity to view the map repeatedly, potentially studying features more thoroughly than if they had only seen the animation once. In the experiments detailed here, the animations are looped indefinitely, with a 3 second “rewinding” screen to indicate that the animation is starting over.

4.1.2.5 Tweening

Acevedo and Masuoka (1997) contend that an animation with more frames offers better perception of change and movement. The authors suggest temporal interpolation as a way to sustain display rate and create smoother transitions for animations that lack an adequate number of frames. Similarly, smoothing transitions via tweening (or temporal interpolation) is suggested by Fabrikant and Goldsberry (2005) as a way to make important features in a map animation more salient. However, an overall smoothing effect on an animation as a result of tweening could potentially exacerbate problems related to change blindness, because the change transitions will happen more gradually, thus becoming harder to detect. Considering the temporal resolution available in the data (single days), tweening becomes an unnecessary step in improving animations in this situation.

4.1.2.6 Smoothing

Animations created from data with a temporal resolution of one week, such as the data used here, can suffer from jumpiness, variability, or spikiness that can make viewing the animation difficult in raw form. An alternative to tweening that can solve this problem is smoothing, or temporal generalization (Monmonier 1996). As opposed to tweening, which interpolates and adds values between time slices to create more gradual transitions, smoothing applies a filter that dampens peaks and valleys to make the time-series less noisy. However, smoothing in this way does not add temporal resolution that did not exist.

Figure 4-9: Temporal Smoothing Mask given by Monmonier (1996)

Monmonier described a 5-frame moving filter that replaced each value with the mean of the value of a current frame (Figure 4-9) and two values before and two values after the current frame to smooth the series. A number of possible weights were tested with the data used for the experiment and a three-frame filter with weights of 0.2, 0.6, and 0.2 was applied. This resulted in a less noisy animation that preserved the nature of the data.
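The three-frame filter can be sketched as a weighted moving average over the time series of a single grid cell. The weights 0.2, 0.6, and 0.2 come from the text; leaving the first and last frames unchanged is an assumption, since the treatment of the series ends is not stated, and the function name and example values are hypothetical.

```python
WEIGHTS = (0.2, 0.6, 0.2)  # previous, current, and next frame weights from the text


def smooth(series, weights=WEIGHTS):
    """Apply a three-frame weighted moving filter to one cell's time series.

    Assumption: the first and last frames are kept as-is because they
    lack a neighbor on one side.
    """
    if len(series) < 3:
        return list(series)
    before, centre, after = weights
    smoothed = [series[0]]
    for i in range(1, len(series) - 1):
        smoothed.append(
            before * series[i - 1] + centre * series[i] + after * series[i + 1]
        )
    smoothed.append(series[-1])
    return smoothed


# A deliberately spiky hypothetical weekly series for a single grid cell.
weekly_counts = [0.0, 10.0, 0.0, 10.0, 0.0]
smoothed_counts = smooth(weekly_counts)
```

The filter keeps the same number of frames as the input, which is the distinction drawn above between smoothing and tweening: peaks and valleys are dampened, but no temporal resolution is added.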

4.1.2.7 Animation Frame Rate

Few cartographic experiments have sought to determine an effective range of map animation pace for selected tasks. Griffin et al. (2006) used four paces in their experiment, finding that participants were most successful with a pace of 9 seconds (9000 milliseconds) per frame. Psychological experiments (Shapiro 1994 via Wolfe et al. 2006) have shown that the probability of correctly identifying a second stimulus following the successful identification of an initial stimulus, amongst a stream of distracting stimuli, is greatest at animation speeds lower than 150 milliseconds per frame and at speeds greater than 500 milliseconds per frame. Considering that pilot study participants reported difficulties with a speed of 75 milliseconds per frame, the main study experiment shifted to the other side of the curve provided by Shapiro and presented stimuli at a frame rate of 500 milliseconds per frame. While this is quite a bit faster than the most successful frame rate found by Griffin et al., Acevedo and Masuoka (1997) contend that if an animation is shown too slowly, change or trend in an animation is not communicated.

4.1.3 Task Selection & Response Variables

4.1.3.1 Task Selection

After making symbol selection and animated map design decisions, the next step in creating an experiment design was to select tasks. Based on the literature reviewed, only a small subset of the tasks detailed in the typology are ones that map animation can be expected to support independently from interaction with the map interface. A number of the tasks were low-level, simple tasks focused on space, time, and attribute independently, thus tasks that could be best addressed with a static map or graph. Map animations have been shown to be particularly useful for tasks directed at spatiotemporal trends in the data, thus to tasks that address space, time, and attribute components of data together. As a result, tasks directed at characterizing spatiotemporal trends in the data were chosen as the major task category for the experiment.

In addition, the pilot study included a section of tasks from the typology that were directed at qualitatively-described, language-specific examples of geographic pattern, similar to those described in the focus group sessions. These were related to specific geographic movement behaviors. An example used was, "Across the time period, as a whole, the extent of this species shrinks and contracts southward to locations in California, along the Mexican border and along the Gulf Coast." Similarly described statements from the following new typology tasks were used: Presence/Absence, Directional Description, Path Description, Regional Shift (including contraction and expansion), and Cluster Formation (including grouping and scattering).

These were removed from the main experiment, as results from the pilot study indicated that there was little to no variation in participant response for those tasks, with most participants completely agreeing with those statements. This indicated that tasks about trends in geography may be considerably easier than other tasks. As a result, tasks about geography in the main experiment study were quantitatively similar to the other tasks used, which will be described below. Further, while graduate students were likely to know enough geography to locate those types of descriptions for those tasks, undergraduate students may not have and may have failed in evaluating symbol success on those tasks.

Having participants focus on the different dimensions of the variables being used was desirable. This way, predictions from selective attention theory could inform the interpretation of the results appropriately. This was accomplished by breaking trend tasks into four categories of trends over time: tasks about value, tasks about geography, tasks explicitly about change, and tasks that asked the participants to compare trends in value versus trends in geography.

Specifically, tasks about value referred to the overall trend in the number of bird sightings per grid cell across the entire map, and thus about trends in density. Tasks about geography referred to the overall trend in the number of cells with sightings, thus about trends in geographic extent. Tasks about change referred to the overall trend in the number of cells showing positive change (those with increasing frequency counts), representing change in geography. Tasks directed at value versus geography referred to the dominance of one trend over the other (e.g., the increase in value was greater than, less than, or equal to the increase in geography). Each set of tasks included three variations in trend, which will be described below in the Experiment Structure section. Please see Table 4-1, below, for a full delineation of task conditions.
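To make the three trend quantities concrete, the following sketch computes one summary per frame for each of them. It is an interpretation under stated assumptions rather than the calculation used to build the experiment data sets: "value" is taken as the mean count over occupied cells, "geography" as the number of occupied cells, and "positive change" as the number of cells whose count rose since the previous frame. All names and example values are hypothetical.

```python
def frame_summaries(frames):
    """Summarize each frame (a dict of cell -> count) for the three trend types."""
    summaries = []
    previous = None
    for frame in frames:
        occupied = [count for count in frame.values() if count > 0]
        # Value: mean sightings per occupied cell (assumed definition of density).
        value = sum(occupied) / len(occupied) if occupied else 0.0
        # Geography: number of cells with any sightings (geographic extent).
        geography = len(occupied)
        # Change: number of cells with a higher count than in the previous frame.
        if previous is None:
            positive_change = None  # undefined for the first frame
        else:
            cells = set(frame) | set(previous)
            positive_change = sum(
                1 for cell in cells if frame.get(cell, 0) > previous.get(cell, 0)
            )
        summaries.append(
            {
                "value": value,
                "geography": geography,
                "positive_change": positive_change,
            }
        )
        previous = frame
    return summaries


# Three hypothetical frames in which density and extent both increase.
frames = [
    {"a": 2, "b": 0, "c": 0},
    {"a": 4, "b": 2, "c": 0},
    {"a": 6, "b": 4, "c": 2},
]
summaries = frame_summaries(frames)
```

Reading each quantity down the list of frames gives the trend a participant is asked to judge; a value versus geography task compares how strongly the first two quantities rise.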

4.1.3.2 Response Variables

Three response variables were used to measure participants’ success with the map animations. The first two metrics, agreement with trend descriptions and certainty in that agreement, were rated by the user on a scale from zero to ten, with no midpoint. Midpoints have been shown to result in the participant giving non-answer, no-opinion responses in a subconscious effort to appease the survey giver (Raaijmakers 2000). Thus, mid-points were removed and participants had the option of giving a response at 0, 2, 4, 6, 8, or 10. For agreement, these corresponded to “Very Strongly Disagree,” “Strongly Disagree,” “Disagree,” “Agree,” “Strongly Agree,” and “Very Strongly Agree,” respectively. For certainty, these corresponded to “Definitely Uncertain,” “Fairly Uncertain,” “Somewhat Uncertain,” “Somewhat Certain,” “Fairly Certain,” and “Definitely Certain,” respectively. These were labeled on the response scale so that participants saw them and could use the verbal category description to decide upon their rating.

The final response variable was response time, which was recorded from the moment participants clicked “Start Animation” until they clicked “Submit” to record their answer. This was measured in seconds. Figure 4-10 provides a screenshot of the digital survey interface that participants used to provide their rating.

Figure 4-10: A screenshot of the right side of the digital survey interface that participants used to provide ratings (please see Figure 4-14 for the full interface screenshot). The middle of the screen was intentionally blank.

4.1.3.3 Experiment Structure

Three of the four task types (value, geography, change) included three variations in trend: an increasing trend, a decreasing trend, and a trend that remained about the same. Value vs. Geography tasks had three different variations, all using increasing trends. The first stated that the animation was showing more increase in value than in geography, the second that the animation was showing more increase in geography than in value, and the third that the animation was showing an equal increase in both value and geography.

Each task trend variation (and value vs. geography variation) included both a true statement and a false statement, where the participant was expected to either agree or disagree, respectively. This ensured that participants more thoroughly studied each animation instead of consistently agreeing to each example without scrutinizing them. As a result, the final method for calculating and analyzing the agreement was by measuring the difference of the participant’s response from the intended “correct” answer. Specifically, for a true statement, a rating of 10 (Very Strongly Agree) was scored as a 0 difference from the intended answer while a rating of 4 (Disagree) was scored as a 6 (different than intended). For false statements, a rating of 0 was scored as a 0 difference from intended while a 10 (Very Strongly Agree) would be scored as a 10.
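The scoring rule described above amounts to the absolute distance between a rating and the intended end of the scale. A minimal sketch follows; the function and constant names are hypothetical.

```python
SCALE = (0, 2, 4, 6, 8, 10)  # permitted ratings; the scale has no midpoint


def difference_from_intended(rating, statement_is_true):
    """Distance of a rating from the intended answer (10 if true, 0 if false)."""
    if rating not in SCALE:
        raise ValueError(f"rating must be one of {SCALE}, got {rating}")
    intended = 10 if statement_is_true else 0
    return abs(rating - intended)


# The worked cases given in the text.
true_full_agreement = difference_from_intended(10, statement_is_true=True)
true_disagreement = difference_from_intended(4, statement_is_true=True)
false_full_disagreement = difference_from_intended(0, statement_is_true=False)
false_full_agreement = difference_from_intended(10, statement_is_true=False)
```

Lower scores therefore indicate better performance for both true and false statements, which allows the two statement types to be analyzed on a common scale.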

Finally, the data shown in the animation for a true statement was faithful to the intention for that condition. However, the data shown in the animation for a false statement had a trend opposite to the intention for that condition. For example, if the task was value and the task variation was an increasing trend, but the statement intention was false, the animation showed data with a decreasing trend.

With four task types and five symbol groups, there were a total of 20 conditions, with six variants within each (three task variations, each having a true statement and a false statement).

Table 4-1 displays all conditions.

Table 4-1: Experiment Conditions

Task | Variation | Statement | Data Showed | Statement Intention
Value | Increasing | Over time, the trend in the number of bird sightings at initial locations is INCREASING. | Increase | True
Value | Increasing | Over time, the trend in the number of bird sightings at initial locations is INCREASING. | Decrease | False
Value | Decreasing | Over time, the trend in the number of bird sightings at initial locations is DECREASING. | Decrease | True
Value | Decreasing | Over time, the trend in the number of bird sightings at initial locations is DECREASING. | Increase | False
Value | Remaining the Same | Over time, the trend in the number of bird sightings at initial locations is REMAINING ABOUT THE SAME. | No Change | True
Value | Remaining the Same | Over time, the trend in the number of bird sightings at initial locations is REMAINING ABOUT THE SAME. | Increase | False
Geography | Increasing | Over time, the trend in the number of locations with bird sightings is INCREASING. | Increase | True
Geography | Increasing | Over time, the trend in the number of locations with bird sightings is INCREASING. | Decrease | False
Geography | Decreasing | Over time, the trend in the number of locations with bird sightings is DECREASING. | Decrease | True
Geography | Decreasing | Over time, the trend in the number of locations with bird sightings is DECREASING. | Increase | False
Geography | Remaining the Same | Over time, the trend in the number of locations with bird sightings is REMAINING ABOUT THE SAME. | No Change | True
Geography | Remaining the Same | Over time, the trend in the number of locations with bird sightings is REMAINING ABOUT THE SAME. | Increase | False
Change | Increasing | Over time, the trend in the number of locations showing POSITIVE CHANGE is INCREASING. | Increase | True
Change | Increasing | Over time, the trend in the number of locations showing POSITIVE CHANGE is INCREASING. | No Change | False
Change | Decreasing | Over time, the trend in the number of locations showing POSITIVE CHANGE is DECREASING. | Decrease | True
Change | Decreasing | Over time, the trend in the number of locations showing POSITIVE CHANGE is DECREASING. | Increase | False
Change | Remaining the Same | Over time, the trend in the number of locations showing POSITIVE CHANGE is REMAINING ABOUT THE SAME. | No Change | True
Change | Remaining the Same | Over time, the trend in the number of locations showing POSITIVE CHANGE is REMAINING ABOUT THE SAME. | Increase | False
Value vs. Geography | Value greater than Geography | Over time, there is more increase in NUMBER OF SIGHTINGS PER LOCATION than in NUMBER OF LOCATIONS. | Value Greater | True
Value vs. Geography | Value greater than Geography | Over time, there is more increase in NUMBER OF SIGHTINGS PER LOCATION than in NUMBER OF LOCATIONS. | Geography Greater | False
Value vs. Geography | Geography greater than Value | Over time, there is more increase in NUMBER OF LOCATIONS than in NUMBER OF SIGHTINGS PER LOCATION. | Geography Greater | True
Value vs. Geography | Geography greater than Value | Over time, there is more increase in NUMBER OF LOCATIONS than in NUMBER OF SIGHTINGS PER LOCATION. | Value Greater | False
Value vs. Geography | Geography equal to Value | Over time, there is an equal increase in NUMBER OF SIGHTINGS PER LOCATION as there is in NUMBER OF LOCATIONS. | Both Equal | True
Value vs. Geography | Geography equal to Value | Over time, there is an equal increase in NUMBER OF SIGHTINGS PER LOCATION as there is in NUMBER OF LOCATIONS. | Geography Greater | False
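The counts stated for the design (20 conditions, each with six variants) can be checked by enumerating the factor levels. The sketch below uses the labels from Table 4-1; the variable and function names are hypothetical.

```python
from itertools import product

TASKS = ("Value", "Geography", "Change", "Value vs. Geography")
SYMBOLS = (
    "Univariate Non-Change",
    "Univariate Change Hue",
    "Univariate Change Orientation",
    "Bivariate Integral",
    "Bivariate Separable",
)
TREND_VARIATIONS = ("Increasing", "Decreasing", "Remaining the Same")
VERSUS_VARIATIONS = (
    "Value greater than Geography",
    "Geography greater than Value",
    "Geography equal to Value",
)
STATEMENTS = (True, False)


def variants_for(task):
    """Return the (variation, statement intention) pairs for one task type."""
    if task == "Value vs. Geography":
        variations = VERSUS_VARIATIONS
    else:
        variations = TREND_VARIATIONS
    return list(product(variations, STATEMENTS))


# A condition is one task type paired with one symbol group.
conditions = list(product(TASKS, SYMBOLS))
variants_per_condition = {task: len(variants_for(task)) for task in TASKS}
```

Here `len(conditions)` is 20 and every task type yields six variants, matching the design described in the text.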

To facilitate a better understanding of the results presented later in this chapter, Table 4-2 provides a list of the within-participant factors and between-participants factors. This will help in understanding the application of repeated measures ANOVA with a between-participants factor in analyzing the response data collected. This will be done both to answer the main questions of this thesis identified in the first chapter and also to identify noticeable factor effects on these kinds of tasks and bivariate symbols in animated maps.

Table 4-2: Experiment Factors

Within-participant Factors

Factor | Trials
TREND | (1) Increasing/Decreasing (2) Remaining About the Same
STATEMENT | (1) True (2) False
TASK | (1) Value (2) Geography (3) Change (4) Value versus Geography
VERSUS | (1) Value greater than Geography (2) Geography greater than Value (3) Value equal to Geography

Between-Participants Factors

Group | Condition
SYMBOL | (1) Univariate Non-Change (2) Univariate Change Hue (3) Univariate Change Orientation (4) Bivariate Integral (5) Bivariate Separable

4.1.4 Experiment Design

The process of constructing and designing the experiment drew upon many resources.

The largest contribution of all was Harrower (2002), who implemented one of the first digital,

Adobe Flash-based surveys to collect response data on animated map stimuli. A similar approach

was used here. This section outlines the details of participant selection, digital survey format,

question structure and ordering, and experiment data set generation.

4.1.4.1 Participants

Participants for the pilot study were Penn State Geography graduate students. Ten had

experience with GIScience research and ten did not. This diversity in experience provided a range

of feedback, from both those who had experience with designing GIS experiments, maps, and

interfaces and those who did not, but had experience studying geographic phenomena and could

be considered to be similar to target users. These participants were not compensated.

Participants for the main study were Penn State undergraduates, recruited from

introductory Geography lectures. Novice undergraduate students were considered reasonable participants in this study for multiple reasons. First, the perceptual nature of the experiment meant that domain experience was not expected to influence performance on the tasks. No expert knowledge was required to identify trends in the data, as the names of species in the examples were removed so that the examples were as generic as possible. Second, map animations of this kind are often designed for use by individuals learning about or studying distributions and change in distributions of any phenomena for which data collected at points are aggregated to areas for display and analysis. Thus, the university student population provided a reasonable (and practical) sample of potential users. Consequently, results from the experiment can be applied to any situation where undergraduate learning through animated maps is a desirable form of communication.

It should be noted that the limited age range of the student participants does not adequately represent the full age range of expected users, which likely includes older users of AKN data. As a result, the effects of aging on eyesight or cognitive spatial abilities could lead to somewhat different performance in a group that includes a broader range of participant ages.

Participants in the main study were compensated either $5 or extra credit for their time.

As expected, the main effect of compensation method on the three response variables was found to not be statistically significant.

Participants were randomly assigned to one of five groups, one for each of the symbolization types discussed above. Assigning participants to groups based on the different symbols was done to study the success of participants using a single symbol across all of the tasks. Each participant saw maps with only one symbol type, and each group completed the same tasks with a different symbol. To deal with order effects, questions were ordered based on a Balanced Latin Square design, which ensured that, across the whole population, each question occurred equally often before and after every other question. Animations and statements were presented together, and each participant saw them in a different order and with a different symbolization, depending on which symbol group they were in.
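The thesis does not say how its Balanced Latin Square was built. As an illustration only, the sketch below uses the standard Williams construction for an even number of conditions, which balances immediate precedence: each condition directly follows every other condition exactly once across the rows. The function name and the use of 24 (the number of tasks reported here) are assumptions made for the example, not a description of the original ActionScript code.

```python
def balanced_latin_square(n):
    """Return an n x n balanced Latin square (Williams design) for even n.

    Each row is one presentation order of conditions 0..n-1. Every condition
    appears once per row and once per column, and each ordered pair of
    distinct conditions appears exactly once as immediate neighbours.
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction requires an even n >= 2")

    # First row follows the pattern 0, 1, n-1, 2, n-2, 3, ...
    first_row = [0]
    low, high = 1, n - 1
    for position in range(1, n):
        if position % 2 == 1:
            first_row.append(low)
            low += 1
        else:
            first_row.append(high)
            high -= 1

    # Every later row shifts the first row by one condition (mod n).
    return [[(condition + shift) % n for condition in first_row]
            for shift in range(n)]


# Hypothetical usage: participant k would see the tasks in the order
# given by row k % 24 of the square.
orders = balanced_latin_square(24)
```

With 24 tasks this yields 24 distinct orders; assigning participants to rows in rotation keeps the orders as evenly used as the sample size allows.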

4.1.4.2 Format

The digital survey was written in ActionScript 3.0 for Adobe Flash, intended to be run in any internet browser. All symbol variations of the digital survey for both the pilot study and the main study experiments can be found in the Penn State GeoVISTA Center’s Resources Digital

Library (http://www.geovista.psu.edu). A screen size of 1680 x 1040 pixels was selected to match the monitor size in one of the Penn State Geography computer labs where the experiment would be proctored.

Figure 4-11: A screenshot of the screen in the digital survey that thoroughly explains a participant’s legend

The process of completing the digital survey used in the experiment was as follows. After agreeing to the terms of implied consent, participants saw an introduction screen that described the experiment and outlined what participation involved. The next screen (Figure 4-11) thoroughly explained both the symbol legend for their group and the temporal legend. The following screen showed them an example simplified map animation with four regions depicted in a grid. This screen indicated the location of the information necessary to start the animation, illustrated how the statement would appear, and presented the interface with which to provide their rating of the statement. Before beginning the experiment, participants performed three trial tasks (Figure 4-12 and Figure 4-13) with simplified maps, where they were asked to indicate the type of trend (increasing, decreasing, or remaining about the same) that was occurring. This was intended to ensure that they understood the symbolization method used for the maps to be presented in the main session. The results from this section were used to remove participants who got two or more trial tasks incorrect, since this indicated that they either misunderstood the task or were not attempting to complete it accurately. After completing the three trial tasks, the participants adapted to the format of the questions by going through four practice example animations containing full maps before starting the main set.


Figure 4-12: Screenshot of the trial pre-test examples for the Univariate Non-Change Symbol

Figure 4-13: Screenshot of the trial pre-test examples for the Bivariate Separable Symbol

Each animation screen first showed the statement for which the participant was to rate their agreement. After the data finished loading, a three-second delay gave the participant time to read the statement for the given animation. A button then appeared that allowed participants to start the animation. After the animation started, it looped continuously, with a “Rewinding” screen at the end of each loop. Participants were able to freely study the animation and give their answer at any time by clicking a number on the rating scale and clicking “submit” (Figure 4-14). As noted above, response time was measured as the time between when they clicked to start the animation and when they clicked to submit their agreement rating.

Figure 4-14: A screenshot of the practice animation example where participants rate their agreement

After rating their agreement, participants rated their certainty in that response without seeing the animation again, as it disappeared from the screen. Responses proceeded in this manner until participants had completed all 24 tasks. At the end, a screen asked them to complete anonymous questions about themselves, including: sex, age, whether or not they had taken a class in cartography, and whether or not they considered themselves a bird watcher.


4.1.4.3 Experiment Data Set Generation

This experimental method required a set of 24 animated map time-series that matched the categories of task delineated above. As is typical for such experiments there are two options. The first is to select representative real data and the second is to generate data to meet specific criteria.

I opted for the first option, as there was a large body of data to select from and using real data makes the results of the experiment more applicable to a real-world scenario. Selecting representative data sets required a method of quantifying the type (value, geography, change, value vs. geography), variation (increasing, decreasing, or remaining the same) and slope of a trend. For example, a task about trend in value, where a participant was asked to agree or disagree with a statement that indicated that the number of bird sightings per cell was increasing over time, required an animation known to have that trend. Tasks that asked the participant to distinguish between trends in value and trends in geography (Value vs. Geography tasks) were those where an increasing trend was either the result of more sightings per cell, the result of more cells with sightings, or an equal increase in both.

The method for quantifying trends involved extracting week-by-week information from the animations and studying the time-series patterns. Code in ActionScript 3 tallied the values for sightings per cell, number of cells, and number of cells showing each type of change over time.
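The original tally was written in ActionScript 3 and is not reproduced in the text. Purely as a sketch of the kind of per-frame summary described above, the Python below assumes a hypothetical data layout in which each animation frame is a dictionary mapping a cell identifier to that cell's sighting frequency. The function name, the layout, and the choice to treat a cell as occupied when its frequency is above zero are all assumptions.

```python
def tally_frames(frames):
    """Summarise each frame of a gridded time series.

    frames: list of dicts, one per week, mapping cell id -> sighting frequency
            (hypothetical layout, assumed for this sketch).
    Returns one dict per frame with the number of occupied cells, the mean
    frequency over occupied cells, and counts of cells that increased,
    decreased, or did not change relative to the previous frame.
    """
    rows = []
    previous = None
    for frame in frames:
        # Assumption: a cell counts as occupied when its frequency is > 0.
        occupied = {cell: value for cell, value in frame.items() if value > 0}
        n_cells = len(occupied)
        mean_value = sum(occupied.values()) / n_cells if n_cells else 0.0

        changes = {"increase": 0, "decrease": 0, "no change": 0}
        if previous is not None:
            # Compare every cell occupied in either frame; absent means 0.
            for cell in occupied.keys() | previous.keys():
                delta = occupied.get(cell, 0.0) - previous.get(cell, 0.0)
                if delta > 0:
                    changes["increase"] += 1
                elif delta < 0:
                    changes["decrease"] += 1
                else:
                    changes["no change"] += 1

        rows.append({"cells": n_cells, "mean_value": mean_value, **changes})
        previous = occupied
    return rows
```

The three resulting series (mean value, cell count, and change counts) correspond loosely to the value, geography, and change task types, and could be exported for inspection in a spreadsheet.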

These time-series values were then brought into Microsoft Excel where they could be visually examined. The previously selected frame rate largely determined that the minimum number of frames necessary to show a series that could be studied in a reasonable amount of time was 5 frames, such that any animations used in the experiment had to be at least that long. Periods showing no change were evaluated by measuring the variance across the selected time series, ensuring that the “no change” trend cases had a statistical variance less than or equal to 0.0001.

Periods showing increases or decreases were evaluated with linear regression, ensuring that any trends had a slope of at least 0.1 frequency units (number of times the given species was sighted out of the number of times sightings of any kind were recorded for a given location) per frame and an R-squared value of 0.90 or better. Systematically, I worked through a list of North American bird species, studying the time series of each and identifying statistically significant trends that could possibly be used for example animations. These trends were then visually inspected in animation form to ensure that patterns were not overly difficult to recognize, as at some of the lower slopes (less than 0.4) some patterns were too visually noisy, scattered, or brief.

They were also inspected to ensure that there were at least a dozen locations with sightings on each map at any frame in the animation. Once a suitable species example was found for each task condition and the animation was deemed appropriate and adequate, it was then placed into the experiment.
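The numeric screening criteria described above (at least 5 frames, variance of at most 0.0001 for "no change", and a slope of at least 0.1 per frame with R-squared of at least 0.90 for a trend) can be sketched as a small classifier. The original screening was done by inspection in Microsoft Excel, so this is an illustrative reconstruction rather than the author's procedure. The function name, the use of sample variance, and the "unclassified" and "too short" labels are assumptions.

```python
from statistics import mean, variance

MIN_FRAMES = 5                   # minimum series length stated in the text
MAX_NO_CHANGE_VARIANCE = 0.0001  # variance ceiling for "no change" series
MIN_ABS_SLOPE = 0.1              # frequency units per frame
MIN_R_SQUARED = 0.90


def classify_trend(values):
    """Label a per-frame series as increasing, decreasing, or no change."""
    n = len(values)
    if n < MIN_FRAMES:
        return "too short"

    # Assumption: sample variance; the text does not say which form was used.
    if variance(values) <= MAX_NO_CHANGE_VARIANCE:
        return "no change"

    # Ordinary least-squares fit of value against frame index 0..n-1.
    frames = range(n)
    x_bar, y_bar = mean(frames), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in frames)
    syy = sum((y - y_bar) ** 2 for y in values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(frames, values))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)

    if abs(slope) >= MIN_ABS_SLOPE and r_squared >= MIN_R_SQUARED:
        return "increasing" if slope > 0 else "decreasing"
    return "unclassified"
```

A series that passes this numeric screen would still need the visual inspection and minimum-location checks described above before being used as a stimulus.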

4.1.5 Execution

The pilot and main experiments were proctored differently, as the goals for each were different. First, the goal of the pilot study was to inform the design of the survey and tasks. A group of graduate students took the survey and provided free-form feedback to critique the format and look for errors with the survey. Second, the main study sought to collect the main body of data for analysis. The process for executing and proctoring each component of the study is described below.

4.1.5.1 Pilot Experiment

For the pilot experiment, participants were asked to take the survey and provide free-form written feedback at any point during their experience. This was done in a Penn State Geography Department computer lab. Participants were instructed to focus on the overall format of the

experiment. Specifically, they were asked to provide feedback or critique on the interface

usability, the language used, the animation frame rates, and the difficulty of the tasks. If anything

confused them or if they did not understand something, they were asked to record that. As

previously mentioned, the pilot study helped guide the final selection of tasks and task

arrangement for the main study experiment, as well as a final selection of the animation frame

rate. Additionally, a number of participants identified coding errors in the animations, language

that was unclear, or interface problems. These were fixed for the main experiment.

4.1.5.2 Main Experiment

For the main experiment, participants came to a Penn State Geography computer lab.

Each participant was screened for three criteria before they were assigned to a symbol group.

First, an Ishihara color-vision test was administered, as one of the symbol groups was not

readable by those with color-impaired vision. No participants registered color-impaired vision.

Second, they were asked, as screening questions, whether they had taken a cartography class and whether they considered themselves bird watchers. Participants with experience in these categories were considered the largest potential sources for bias within the target participant pool. A small number of participants indicated that they had indeed taken a cartography class or that they were bird watchers. These participants were distributed amongst the five groups so that no single group had a disproportionate share of participants with these two traits. After participants were screened, they were placed into a symbol group based on the next available computing station, such that the order in which they arrived at the testing facility determined which group they were placed into, subject to their answers to the two screening questions. Participation from that point was guided by the digital survey, which had pointers and interface cues for completing it. Thus, all participants received the same digital instructions using examples appropriate to the group they

were in. Responses from the survey were sent automatically via a PHP script on the Penn State

server to my email address. These responses were then copied to an Excel table for final coding

before being used in analysis.

4.1.6 Analysis

Two consultations with the Statistical Consulting Center (SCC) at the Penn State

Department of Statistics recommended that ANOVA with repeated measures be used to analyze

results; a key reason is that it allowed calculation of between- and within-group interactions of

the results. For each main effect or interaction effect given in the results, the degrees of freedom, F-value, and p-value are reported. Tables of results were compiled in Microsoft Excel and the ANOVA statistics were generated using SAS 9.1.

4.1.7 Predictions

A set of predictions derived from selective attention theory guided analysis and interpretation of results. The predictions relate to each of the three response variables. The following predictions are grouped by the task types in the experiment, with a listing of the goals for each task type and what questions the answers from those tasks will allow me to answer.

4.1.7.1 Tasks about trend in Value (six questions)

The goal of these tasks was to identify the abilities users have to monitor trends in value

(number of sightings per location) with different symbols. Univariate non-change symbols are expected to result in the best performance for value tasks. Bivariate forms are expected to perform less well and/or to show a significant increase in response time, as the additional symbolization of change may make it more difficult to study just trends in value. However, the possibility exists that the bivariate separable symbol may equal the univariate non-change symbol, as users may be able to separate and attend to only the value dimension independently without being forced to attend to both value and change dimensions as they would with the bivariate integral symbol. SAT would predict that the integral symbol would perform more slowly and less well for tasks that require attending to just one part of the symbol. Univariate change maps should perform the least well for value tasks, as they have no information about value coded within them.

4.1.7.2 Tasks about trend in Geography (six questions)

The questions in this section are intended to study what impacts the different forms have on studying trends in geography, specifically the number of locations with sightings. Thus these questions focus on the trends in the geographic extent of a given distribution. Change symbolizations are expected to succeed for these tasks, because the change encoding makes identifying locations (and thus the number of locations) experiencing change easier. Univariate change forms may be most successful here, as they are less complex than the bivariate forms and may have faster response times.

4.1.7.3 Tasks about trend in Change (six questions)

Tasks in this section are focused on revealing the difference between the two bivariate forms and how participants are able to use them to answer specific questions about change. Logically, the univariate non-change symbol is predicted to perform poorly, as it has no change encoding. SAT would predict that for tasks about change, which involve a single variable dimension and do not require studying the other, the separable symbol should be most successful, as users can attend to just the change dimension. Accordingly, under this prediction, the Bivariate Integral symbol would perform poorly, as being forced to attend to both dimensions would make studying just the change dimension more difficult.

4.1.7.4 Tasks about Value versus Geography Dominance (six questions)

Results from these tasks will help determine whether participants can recognize that one of the trends (increase in value or geography, or equal increase in both) was dominant or not.

Change forms are expected to show a tendency for indicating that trends were a result of an increase in geography, as opposed to an increase in value, based on previous predictions that change encoding facilitates recognition of locations that are changing. This may be especially true for the univariate change forms, which have only a very abstracted representation of value.

4.2 Results

This section presents the results of the task-based experiment phase of research. The section begins with an overview of the data collection and cleaning process. ANOVA with repeated measures is used to examine main and interaction effects for both within-participant and between-participant factors. Specifically, each response variable is analyzed individually. Results of the symbol effect, the symbol effect by each task individually, and the task effect are presented first to address the primary and secondary questions of this thesis. Other interesting results follow. Discussion is provided throughout. Finally, this section closes with discussion of

limitations and a summary of the chapter.

4.2.1 Main Study

Two phases of data collection with participants resulted in a total of 63 full sets of

responses. An additional five sets were lost as a result of participant actions (clicking “back” in

the browser window) or server failure (browser crashed or server did not deliver responses). The

full body of 63 response sets was analyzed using descriptive statistics of the response variable

means for both within-group and between-group effects. This was done to explore the data for

patterns, to check for errors, and to help guide the process of culling participants.

4.2.1.1 Participant Culling

Exploratory analysis revealed that the response data may have suffered from substantial noise. As a result, participants’ responses in the pre-test section of the survey were reviewed. Nine participants got two out of the three pre-test questions wrong and were excluded from analysis. Repeated measures ANOVA statistics for the between-participant by within-participant interaction of symbol by task type for each of the response variables indicated that removing those nine participants decreased the noise in the data, as the F-value increased in each case. The F-value is the ratio of between-group variability to within-group variability; for given degrees of freedom, a larger F-value corresponds to a smaller p-value. Table 4-3 provides the F-values for the full set of 63 and for the set after removing those nine participants.

Table 4-3: F-values for Interaction of Symbol by Task for the Full Response Set and Culled Response Set

 | Full Response Set (63) | Culled Response Set (54)
Agreement Difference | 23.05 | 23.29
Certainty | 15.18 | 15.20
Response Time | 6.58 | 6.88

Removing those nine participants also revealed that some symbol groups were more likely to get two trial questions wrong. Table 4-4 shows the distribution of participants before and after culling those who got two pre-test questions wrong. Interestingly, the two univariate change representations had the most wrong answers for the pre-testing questions. Observation of the test participants leads me to suggest that since no value component in these symbols changed on a frame-by-frame basis in the simplified pre-test examples, participants had a hard time identifying that change was occurring, despite the fact that change was plainly encoded (statically) in the symbol. An over-simplification of the trial test animations (four very simplified grid cells) as compared to the animations used in the main body of the experiment (a continental map with hundreds of symbols) may explain this behavior.

Table 4-4: Symbol Group Culling Differences

Symbol Group | Full Response Set (63) | Culled Response Set (54) | Difference | Final Response Set (53)
Univariate Non-Change | 12 | 11 | 1 | 10
Univariate Change Hue | 13 | 9 | 4 | 9
Univariate Change Orientation | 13 | 11 | 2 | 11
Bivariate Integral | 12 | 11 | 1 | 11
Bivariate Separable | 13 | 12 | 1 | 12

Analysis of symbol group by task type interaction effects for response time indicated that effects for that particular response were less pronounced than for the other two response variables (agreement difference and certainty). An investigation into the distribution of response times showed that one participant generated the highest value (119 seconds; the next highest was 78 seconds) and at least one other extreme response time (64 seconds, one of only three responses greater than 60 seconds). The reason for this is uncertain, but it is possible that poorly monitored participants were browsing in other windows or using cellular phones during the experiment, despite being asked not to do so. This participant was removed, again decreasing noise in the response data for response time.

4.2.2 Analysis

After completing exploratory analysis and selecting a final group of participant responses for detailed analysis, each between-participant factor and within-participant factor was analyzed to study both the main effects and interaction effects on response values. ANOVA statistics were calculated to determine statistical significance for effects. These are used to help answer the primary and secondary questions of this thesis as well as to identify other interesting results that have an impact on animated maps, bivariate symbology, change representation, and selective attention theory. This section systematically breaks down and presents the results of the main study, organized by factors. Plots for notable and statistically significant main effects and interaction effects will be provided, along with ANOVA statistics.

In establishing a level of significance to evaluate ANOVA statistics, and because there were three response variables, it was necessary to check for correlation between the three response variables. All three response variables were found to be significantly correlated (p < 0.0001) and a Bonferroni adjustment was made, setting the level of significance for evaluation to 0.0167. This was done to control the overall Type I error in analyzing the response data.
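The adjusted threshold of 0.0167 is consistent with dividing a conventional family-wise level of 0.05 by the three response variables. The 0.05 starting point is an assumption here, as the text reports only the adjusted value. The short sketch below shows the arithmetic and applies the threshold to the three Symbol Group*Task p-values reported in Table 4-5; the function name is illustrative.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under a Bonferroni adjustment."""
    return family_alpha / n_tests


# Assumption: a family-wise level of 0.05 across three response variables.
alpha = bonferroni_alpha(0.05, 3)

# Symbol Group*Task p-values as reported in Table 4-5.
p_values = {
    "agreement difference": 0.0345,
    "certainty": 0.1058,
    "response time": 0.0002,
}
significant = {name: p < alpha for name, p in p_values.items()}
```

Under this threshold only the response time interaction is flagged, which matches the asterisks in Table 4-5; the agreement difference interaction (p = 0.0345) would have passed an unadjusted 0.05 level.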

In addition to ANOVA statistics, the means of the response variables were analyzed

using the Tukey-Kramer multiple pairwise comparison procedure to determine if any means were

significantly different from one another. However, analysis revealed that very few pairs of means

were significantly different from each other and sufficiently different to yield meaningful results

that deserve discussion.

4.2.2.1 True and False Statements

First, response quality was checked with the effect of true and false statements on response values.

Figure 4-15: Histogram of agreement ratings for true statements. 0 = Completely Disagree, 10 = Completely Agree.

Figure 4-16: Histogram of agreement ratings for false statements. 0 = Completely Disagree, 10 = Completely Agree.

Histograms of agreement values for the two types of questions (Figure 4-15 and Figure 4-16) show that while the distribution for true statements was as expected (a “distance decay” function from high agreement to low agreement), the distribution for false statements was skewed towards the center of the response spectrum, suggesting that participants were less likely to completely disagree with a false statement than they were to completely agree with a true statement. The within-subjects main effect of true and false statement types for all three response variables supports that notion. For false statements, the mean agreement difference was higher (Figure 4-17), the mean certainty rating was lower (Figure 4-18), and the mean response time was slower (Figure 4-19), indicating poorer overall performance for false statements.

F = 27.94, df = 1, p < 0.0001
Figure 4-17: Main Effect of True and False Statements on Agreement Difference showing a higher mean agreement difference for false statements.

F = 24.28, df = 1, p < 0.0001
Figure 4-18: Main Effect of True and False Statements on Certainty Rating showing a lower mean certainty rating for false statements.

F = 7.98, df = 1, p = 0.0069
Figure 4-19: Main Effect of True and False Statements on Response Time showing a slower mean response time for false statements.

The mismatch between the two distributions of agreement values indicates that participants were indeed studying the animations closely enough to recognize that they could not simply agree with every statement. However, the fact that false statements had a markedly different distribution, a higher mean agreement difference, a lower mean certainty, and a slower mean response time suggests that discerning if a targeted phenomenon is not occurring, or is behaving differently than expected, is harder than discerning if it is in fact occurring. This effect can also be explained by acquiescence on the part of the participants. With agree/disagree type questions, participants are more likely to agree, especially when they do not know the correct response (Krosnick 1999) or the statements are positively worded (Friborg et al. 2006).

4.2.3 Effects

Table 4-5 provides a summary of the ANOVA statistics for each effect for all three response variables collected in the experiment. Each effect is then analyzed individually in the sections that follow.

Table 4-5: Summary of ANOVA statistics for all effects (* denotes statistical significance at a level of 0.0167)

Effects | Agreement Difference (df, F, p) | Certainty (df, F, p) | Response Time (df, F, p)
Symbol Group | 4, 1.69, 0.168 | 4, 0.4, 0.8 | 4, 2.38, 0.07
Task Type | 3, 136, <0.0001* | 3, 88.9, <0.0001* | 3, 26.3, <0.0001*
Trend | 1, 34.55, <0.0001* | 1, 13.16, 0.0007* | 1, 31.68, <0.0001*
True/False | 1, 27.94, <0.0001* | 1, 24.28, <0.0001* | 1, 7.98, 0.0069*
Value versus Geography | 2, 6.56, 0.0021* | 2, 6.42, 0.0024* | 2, 11.5, 0.3198
Univariate/Bivariate | 1, 0.05, 0.8257 | 1, 0.08, 0.7773 | 1, 5.7, 0.0217
Integral/Separable | 1, 0.84, 0.3698 | 1, 1.61, 0.2182 | 1, 2.4, 0.1366
Symbol Group*Task | 12, 1.94, 0.0345 | 12, 1.57, 0.1058 | 12, 3.38, 0.0002*
Univariate/Bivariate*Task | 3, 0.1, 0.9584 | 3, 0.29, 0.8351 | 3, 4.47, 0.0051*
Integral/Separable*Task | 3, 0.49, 0.6934 | 3, 0.07, 0.9779 | 3, 2.4, 0.1366
Symbol Group*Value versus Geography | 2, 0.59, 0.782 | 2, 0.8, 0.6044 | 2, 0.35, 0.9445

4.2.3.1 Symbol Group Effects

The results presented in this section can be used to address the primary question of this

thesis, which is: Does explicitly representing both the magnitude of temporal point data and the

magnitude of change in data values between frames of an animated geographic time-series enable

users to answer questions about patterns and rates of change quickly, easily and accurately?

The main effect of symbol group on the three response variables was not significant. This suggests that no single symbol had a clear advantage over the others for the tasks used in this experiment. If this result were to hold for an experiment with more participants and/or more tasks, it might suggest that SAT predictions based on tests with static map symbols may not transfer to a setting involving time-series map animation. However, as discussed in section 4.1.1, constraints imposed in representing change bivariately meant that less-than-optimal variable pairs (orientation and spectral scheme for Bivariate Separable and saturation + value and hue for Bivariate Integral) were used in the experiment. This likely had an impact on the effectiveness of the two bivariate symbols.

While statistically significant differences were not found, the effect of symbol on performance is worth discussing, as the effects for two of the response variables (agreement difference and response time) were close to being significant and thus would be worth further investigation with a larger group of participants. Figure 4-20 shows little variation in agreement difference between symbols, but participants using the Bivariate Integral symbol did have the lowest agreement difference across the full set of tasks. Figure 4-21 shows that response times were lowest for the two Univariate Change symbols (with hue having the fastest response time) and highest for the Bivariate Separable symbol.

F = 1.69, df = 4, p = 0.168
Figure 4-20: Main Effect of Symbol on Agreement Difference

F = 2.38, df = 4, p = 0.07
Figure 4-21: Main Effect of Symbol on Response Time

The main effect across all tasks addresses the main question of this thesis only inconclusively. It potentially shows that participants using the Bivariate Integral symbol had the lowest agreement difference and the lower response time of the two bivariate symbols. While participants using the Univariate Change symbols were faster, they also produced much higher agreement differences, indicating that overall performance may have been poor. Again, as these effects were not significant, symbol possibly does not have an effect on performance. On the other hand, this lack of significance may be due to the unbalanced distribution of participants amongst symbol groups combined with small sample size. Additional testing, with a larger balanced distribution, is needed before results can be considered robust enough to guide map animation design. Further, as previously discussed, the use of poorly rated bivariate symbol pairs likely had an impact on the performance of those symbol types in the experiment.


4.2.3.2 Symbol Effects by Task

Following analysis of the symbol effect for all tasks, inspecting the effect of symbol group on the response variables for each task individually is important. Table 4-6 presents the response variable means for all five symbols and all four tasks. Table 4-7 presents the ANOVA statistics for those response variable means. This set of ANOVA results indicates that the effect of symbol generates a significant difference for the agreement difference measure on the change tasks (as expected), but not for other tasks. There were no significant differences in certainty judgments and only one for response time, on the Value versus Geography tasks.

Table 4-6: Summary of responses for all symbols by task (* denotes statistical significance at a level of 0.0167)

Task                       Response              Univariate    Univariate    Univariate    Bivariate     Bivariate
                                                 Non-Change    Change Hue    Change        Integral      Separable
                                                 (Value)                     Orientation
Value Tasks                Agreement Difference  4             4.7           4.4           4.1           4.1
                           Certainty Rating      7.6           6.7           6.8           7             6.5
                           Response Time         13.9          12.5          11.8          14.5          19.8
Geography Tasks            Agreement Difference  1.2           1.5           1.5           1.1           1.4
                           Certainty Rating      8.8           8.7           8.4           9             8.5
                           Response Time         9.2           8.7           9.6           10            11.5
Change Tasks               Agreement Difference  4*            2.6*          3*            2.3*          2.8*
                           Certainty Rating      7.3           7.8           7.8           8.1           7.8
                           Response Time         12.2          7.5           10.7          9.5           13.1
Value vs. Geography Tasks  Agreement Difference  4.3           4.4           5             4.3           4.3
                           Certainty Rating      7             6.7           6.3           6.8           6.3
                           Response Time         10.2*         7.9*          12.8*         14.2*         17.7*
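As a rough cross-check on the agreement difference means in Table 4-6, the four task means can be averaged for each symbol. The sketch below is only illustrative: it assumes an unweighted average across the four task types, which need not equal the overall means plotted in Figure 4-20 if the task types contained different numbers of questions. The names `SYMBOLS`, `agreement_difference`, and `mean_by_symbol` are hypothetical and introduced here for illustration.

```python
from statistics import mean

# Column order follows Table 4-6.
SYMBOLS = [
    "Univariate Non-Change",
    "Univariate Change Hue",
    "Univariate Change Orientation",
    "Bivariate Integral",
    "Bivariate Separable",
]

# Mean agreement difference per task type, transcribed from Table 4-6.
agreement_difference = {
    "Value": [4.0, 4.7, 4.4, 4.1, 4.1],
    "Geography": [1.2, 1.5, 1.5, 1.1, 1.4],
    "Change": [4.0, 2.6, 3.0, 2.3, 2.8],
    "Value vs. Geography": [4.3, 4.4, 5.0, 4.3, 4.3],
}


def mean_by_symbol(table):
    """Unweighted mean across task types for each symbol (an assumption)."""
    return {
        symbol: round(mean(row[i] for row in table.values()), 3)
        for i, symbol in enumerate(SYMBOLS)
    }


averages = mean_by_symbol(agreement_difference)
lowest = min(averages, key=averages.get)
print(averages)
print("Lowest mean agreement difference:", lowest)
```

Under this assumption the Bivariate Integral symbol has the lowest average, which is consistent with the pattern described for Figure 4-20.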


Table 4-7: Summary of ANOVA statistics for symbol effects by task (* denotes statistical significance at a level of 0.0167)

                                   Agreement Difference    Certainty               Response Time
Effects                            df   F      p           df   F      p           df   F      p
Symbol, Value Tasks                4    0.77   0.5497      4    1.5    0.2168      4    1.79   0.1455
Symbol, Geography Tasks            4    0.58   0.678       4    0.49   0.746       4    1.21   0.3173
Symbol, Change Tasks               4    3.76   0.0097*     4    0.56   0.693       4    1.79   0.1462
Symbol, Value vs. Geography Tasks  4    1.9    0.1252      4    0.46   0.7632      4    3.42   0.0154*
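The starred cells in Table 4-7 can be recovered by comparing each p-value with the stated significance level of 0.0167. The sketch below does only that comparison; it does not recompute the ANOVA. The threshold is close to 0.05 / 3, which would be consistent with an adjustment for the three response variables, but that reading is an assumption and is not stated in this section. The names `ALPHA`, `p_values`, and `significant_effects` are hypothetical.

```python
# Significance level stated in the caption of Table 4-7.
ALPHA = 0.0167

# p-values for the effect of symbol, transcribed from Table 4-7,
# keyed by (task type, response variable).
p_values = {
    ("Value", "Agreement Difference"): 0.5497,
    ("Value", "Certainty"): 0.2168,
    ("Value", "Response Time"): 0.1455,
    ("Geography", "Agreement Difference"): 0.678,
    ("Geography", "Certainty"): 0.746,
    ("Geography", "Response Time"): 0.3173,
    ("Change", "Agreement Difference"): 0.0097,
    ("Change", "Certainty"): 0.693,
    ("Change", "Response Time"): 0.1462,
    ("Value vs. Geography", "Agreement Difference"): 0.1252,
    ("Value vs. Geography", "Certainty"): 0.7632,
    ("Value vs. Geography", "Response Time"): 0.0154,
}


def significant_effects(p_by_cell, alpha=ALPHA):
    """Return the (task, response variable) cells whose p-value is below alpha."""
    return sorted(cell for cell, p in p_by_cell.items() if p < alpha)


for task, response in significant_effects(p_values):
    print(f"{task} tasks, {response}: p = {p_values[(task, response)]}")
```

Only two of the twelve cells fall below the threshold: agreement difference for the change tasks and response time for the value versus geography tasks.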

The fact that for most response variables and most tasks the main effect of symbol was not significant suggests that symbol does not affect performance and that no particular symbol can be suggested as better than any other based on these results. However, the significant effect of symbol on Change tasks for agreement difference is potentially important, since a central focus of this research is how to represent change along with data in map animations. Figure 4-22 indicates that the Univariate Non-Change symbol performs much more poorly, with a significantly higher agreement difference than the other symbols. This offers support for addressing the central question of this thesis, by suggesting that bivariate symbols representing change were more successful for explicit change tasks than a Univariate Non-Change symbol.


F = 3.76, df = 4, p = 0.0097

Figure 4-22: Main Effect of Symbol on Agreement Difference for Change Tasks

The results in this section can be used to address the secondary question of this thesis, which is: Does signifying the two components in a bivariate symbol pair using visual variables that are expected (by selective attention theory and prior research by others) to be separable versus integral change the answer to the primary question and, if so, how? However, it is difficult to conclusively answer this question, as none of these effects of symbol on the three response variables for the four tasks proved to be statistically significant. This does suggest that differences in symbol have little impact on performance for each task. Regardless, there is value in discussing the potential impact of this main effect as it helps address predictions from SAT on how symbols should have performed.

SAT would predict that the bivariate separable symbol would perform better for those tasks where only one dimension of the variable needs to be attended to separately. In this case, that is the value tasks and the change tasks. The geography tasks do not necessarily require attention to either dimension, just the presence or absence of a symbol of any kind. However, encoding of change may make detecting trends in geography easier as the locations experiencing change will be encoded in a way that may make them easier to detect.

Change tasks require attending only to the dimension encoded with change, but may benefit from having change integrally bound to value as in the integral symbol, due to a combination of symbol redundancy and interrelated data. Dobson (1983) found that for three different reading tasks with static maps, graphic symbol redundancy improved both response times and accuracy. Thus, with interrelated data and symbol redundancy (change being derived from data value), the Bivariate Integral symbol would be predicted to yield better performance for participants using that symbol. Accordingly, the integrally forced attention of the interrelated and redundant dimensions present in both bivariate symbols can explain success for the Bivariate Integral symbol, but not the Bivariate Separable symbol, where the dimensions are still redundant, but not integrally bound.

The results presented in Table 4-7 suggest that, except for agreement difference in change tasks and response time for value versus geography tasks, no symbol performed significantly better than any other. This is despite the prediction from SAT that, as discussed above and in the predictions in the first half of this chapter, particular symbols should outperform others for different tasks. The fact that they do not can possibly be explained by the relationship between the two datasets used in this experiment and how integral and separable variable pairs behave with different dataset relationships.

In this experiment, one variable dimension represented the magnitude of value of the point data, while the other dimension represented the change between frames of that data, such that the second dimension is derived from the first. As a result, the two sets of data are inherently interrelated. This is important, because the forced integration experienced with integral variable pairs is said to work best for interrelated (or correlated) data sets. Separable variable pairs, however, are said to work best for independent data sets. Predictions based on SAT in this case would have suggested that Bivariate Separable would have performed better, as users could attend to one dimension at a time, and in this case, the important one, the change dimension. However, this was not the case due to the benefit of symbol redundancy and interrelated data sets that made Bivariate Integral more effective. This is especially true in a situation such as this, where data was interrelated and in particular where there was a strong temporal autocorrelation, as is often found in time-series map animations.
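The dependence of the change dimension on the value dimension can be made concrete with a small sketch. It assumes that change is computed as a simple difference between consecutive frames at one location; the exact change measure used to build the experiment symbols is not specified in this section, and the function names and sample counts below are hypothetical.

```python
def frame_changes(values):
    """Return the change from each frame to the next for one location."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def bivariate_frames(values):
    """Pair each frame's value with its change from the previous frame.

    The first frame has no predecessor, so its change is None.
    """
    changes = [None] + frame_changes(values)
    return list(zip(values, changes))


# Hypothetical counts at one location over four animation frames.
counts = [3, 5, 5, 2]
print(bivariate_frames(counts))
# [(3, None), (5, 2), (5, 0), (2, -3)]
```

Because every change value is fully determined by the value series, the two symbol dimensions carry redundant, interrelated information rather than independent data sets.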

An oversight of this study was that data relationships other than temporally interrelated ones were not tested within the experiment, so it is not possible to be entirely confident that this behavior for integral and separable variable pairs would hold with different data relationships (such as independently correlated, where one dataset is not derived from the other as in this case, or unrelated). Testing different data relationships in this framework should be a key aspect of future research. For now, results from this experiment can inform bivariate animated map use by suggesting that map designers link the relationships of the data with dimensional interaction behaviors of the variable pairs.

4.2.3.3 Task Effects

The within-participant task effect on the three response variables can be analyzed in two ways. First, the main effect can be used to understand the impacts that task type has on performance for all symbols and for map animation use in general. Second, symbol by task interaction effects can be used to identify symbols that behaved differently across the tasks for any of the response variables to further help address the main question of this thesis. This section provides both analyses.

Figure 4-23 shows that the within-participant main effect of task on mean agreement difference results in a markedly better performance (a lower mean agreement difference) for geography tasks than for both value tasks and tasks asking the participant to discern between value and geography. This suggests that map animations such as the ones used in this experiment may be more suitable for facilitating the study of trends across a number of locations than the study of trends of value at given locations. Recognizing attribute-driven trends in value as a whole is apparently more difficult than recognizing trends in geography as a whole. This behavior is further supported by the same main effect for certainty (Figure 4-24) and somewhat by response time (Figure 4-25), as certainties were significantly higher for geography tasks than they were for all other tasks. Response times were somewhat lower, and this difference was also significant.

F = 136, df = 3, p < 0.0001

Figure 4-23: Main Effect of Task on Agreement Difference

F = 88.9, df = 3, p < 0.0001

Figure 4-24: Main Effect of Task on Certainty

F = 26.3, df = 3, p < 0.0001

Figure 4-25: Main Effect of Task on Response Time

In studying the interaction effects of symbol by task for the three response variables, the only significant interaction is that for response time (Figure 4-28), which shows a strong interaction in which the Univariate Non-Change symbol performs much more poorly relative to other symbols for change tasks than for other tasks. Participants using this symbol had much slower response times when compared to the other symbols and tasks. While not significant (but close), similar interactions for agreement difference (Figure 4-26) and certainty (Figure 4-27) support poor performance for the Univariate Non-Change symbol for Change tasks as compared to other tasks. The Univariate Non-Change symbol had a higher mean agreement difference and a lower certainty. These results help answer the main question of this thesis, by suggesting that bivariate symbols representing change are more successful for explicit change tasks than is a Univariate Non-Change symbol. It would be difficult to understand change from typical univariate animated maps that represent only the data and expect map readers to interpret change from the change inherent in the animation.

F = 1.94, df = 12, p = 0.0345

Figure 4-26: Interaction effects of Symbol by Task for Mean Agreement Difference

F = 1.57, df = 12, p = 0.1058

Figure 4-27: Interaction effects of Symbol by Task for Mean Certainty Rating

F = 3.38, df = 19, p = 0.0002

Figure 4-28: Interaction effects of Symbol by Task for Mean Response Time

Another noticeable, but less substantial, interaction effect between symbol and task is that which is expressed through the Univariate Change Orientation symbol for response time. Figure 4-29 shows that while other symbols experience a decrease in response time for change tasks as compared to other tasks, both Univariate Non-Change (expected and observed, as previously discussed) and Univariate Change Orientation experience little decrease, suggesting that these forms are slower at change recognition. A similar interaction was not present for the agreement difference and certainty response variables. Both the Univariate Change Orientation symbol and the Bivariate Separable symbol (the bivariate version of Univariate Change Orientation) have slower response times when compared to the Univariate Change Hue and Bivariate Integral symbols, respectively. This behavior suggests that orientation may be a poor visual variable for detecting trends in change in an animated map setting. This finding is supported by recent research (Fabrikant et al. 2009) showing that symbols using orientation as a visual variable perform poorly for detecting change in a flicker paradigm.

F = 3.00, df = 4, p = 0.0273

Figure 4-29: Significant interaction between symbol and task for response time.


4.2.3.4 Univariate/Bivariate Effects

Aggregating the symbols encoding change to their respective univariate and bivariate categories provides another look at behavior across the four tasks. Of the between-participants main effects of univariate or bivariate symbol on the three response variables, only that of response time is statistically significant. That effect shows (Figure 4-30) that participants using bivariate symbols responded more slowly than those using univariate symbols. The effect (Figure 4-31) of symbol for response time shows that response times for bivariate symbols are slower across all tasks. This suggests that the abstraction of change without value symbolization makes the use of Univariate symbols quicker. This is possibly explained by the fact that there is less complexity in working with only one variable dimension. As can be seen in Figure 4-28, the Bivariate Separable symbol had much slower response times across all tasks as compared to other symbols. This behavior runs against SAT predictions that dimensions of a bivariate separable symbol could be attended to separately and as such, a bivariate separable symbol should not perform so slowly. This may be the result of using a Bivariate Separable symbol (orientation and spectral scheme) that was not faithful to the combination of orientation and hue that originally classified that variable pair as separable. While this modification (using the spectral scheme instead of hue) may have had an impact on the symbol’s dimensional interaction behavior, the spectral scheme was a better visual variable for numerical data values than hue would have been.

F = 5.7, df = 1, p = 0.0217

Figure 4-30: Main Effect of Univariate/Bivariate Symbol on Mean Response Time

F = 4.47, df = 3, p = 0.0051

Figure 4-31: Effect of Univariate/Bivariate Symbol across all tasks for Response Time

4.2.3.5 Value versus Geography Variation Effect

The task of value versus geography was designed and used as a result of the pilot study, which indicated that participants had no reason to disagree with tasks of pure-language descriptions of geographic movement behavior. As a result, the value versus geography task set includes three variations: those where increasing trends were the result of value, or the result of geography, or they were equal.

F = 6.56, df = 2, p = 0.0021

Figure 4-32: Main Effect of Value vs. Geography Task Variation on Agreement Differences

F = 6.42, df = 2, p = 0.0024

Figure 4-33: Main Effect of Value vs. Geography Task Variation on Certainty Rating

F = 1.15, df = 2, p = 0.3198

Figure 4-34: Main Effect of Value vs. Geography Task Variation on Response Time

The main effect of within-participant value vs. geography task variation on the response variables indicates that for mean agreement difference (Figure 4-32) participants were significantly more correct on task variations where geography trends were greater than value trends. However, they were slightly more certain (Figure 4-33) with task variations where the reverse was true. While not significant, the same effect on response time (Figure 4-34) showed that participants were fastest on tasks where the trends were equal. Interestingly, participants were less certain on task variations where trends were equally the result of both value and geography. Additionally, as the mean agreement difference was lower for Geography greater than Value trends, there may be some preferential focus on patterns in geography, which is supported by previous findings in this thesis that correctly recognizing trends in geography was much easier for participants.

4.2.3.6 Increasing/Decreasing Trends vs. Remaining About the Same Effects

Three of the four tasks (except for Value versus Geography tasks which were structured differently as previously discussed) included the following three different types of trends: increasing, decreasing, and remaining about the same. The pilot study used “varying widely” as a fourth type of trend, but those results showed that participants had difficulty interpreting that type of trend, resulting in extreme response variance. As a result, the main study used only the three trend types previously mentioned. Aggregating these types to increasing and decreasing trends and remaining about the same trends and studying the within-participant main effects of those trends for the three response variables reveals some interesting patterns.

F = 34.55, df = 1, p < 0.0001

Figure 4-35: Main Effect of Trend Type on Agreement Difference

F = 13.16, df = 1, p = 0.0007

Figure 4-36: Main Effect of Trend Type on Certainty

F = 31.68, df = 1, p < 0.0001

Figure 4-37: Main Effect of Trend Type on Response Time

Most notably, when compared with the effect of increasing and decreasing trends, the effect of remaining about the same trends resulted in a higher mean agreement difference (Figure 4-35), a lower mean certainty (Figure 4-36) and a slower mean response time (Figure 4-37). This suggests that in this animated map setting, participants had a harder time determining whether a trend was remaining about the same, as compared with determining whether a trend was increasing or decreasing. This is logical, as determining whether there is no trend may be a relatively difficult task with map animations, a stimulus that is frequently changing. The full ramifications of this behavior might be better understood if these tasks were tested in comparison with other map forms, such as static small-multiples.


4.2.4 Limitations

Experiment results often have limitations in impact, generality, and certainty of interpretation. While some of these limitations may be minor, they are worth noting. For the experiment used in this thesis, the limitations include a relatively narrow set of tasks used, lack of interactivity, animation example generation methods that were partly subjective, the use of a single animation frame rate, limited demographic diversity of participants, unbalanced participant distribution amongst symbol groups, and the use of less-than-optimal symbol pairs.

As only a limited number of tasks were feasibly implemented in this experiment, those restricted to geographic time-series trends, the results that apply to the use of bivariate symbols and the application of selective attention theory in an animated setting are valid only for tasks comparable to the ones used. While the tasks used were limited to trends, the types of trend tasks (value, geography, change) used represented a reasonable breadth of possible trend type tasks. Our knowledge of appropriate task types is limited with regard to those that animated maps will be most useful for, so it is hard to evaluate how many different tasks could be relevant. However, it is possible that tasks different from the ones used in this experiment would yield completely different results and would offer different guidance for the use of bivariate symbols in representing geographic change in map animations.

The animated maps used in the experiment presented here were strictly non-interactive, in an effort to prevent unaccounted variability in the results of the experiment. However, animations have been argued to be much more successful when the user has the ability to manipulate and control elements of a map animation (Andrienko 2000; Harrower and Fabrikant 2008). Under conditions involving interactivity, it is expected that the use of bivariate symbols and change representations may differ widely from the behavior predicted by the results of this experiment.

The process of generating animation examples for use in the experiment was somewhat subjective. In spite of the care taken to select animation examples that most faithfully represented the intended trends each was meant to represent, the possibility exists that the process of selecting examples resulted in a confounding variable. It was not possible, with the method used, to ensure that all examples meant to represent the same category of task were perfectly comparable, as each represented a different real species that exhibited notable differences in geographic distribution, frequency, and spatiotemporal movement patterns. While this may have had an impact on results, it is also worth noting that symbols that are not successful in real world map settings are not particularly valuable to cartographers.

The selection of a single animation frame rate limits the applicability of the results from the experiment. Different animation frame rates might result in different uses of the bivariate symbols tested here, outside of an experiment setting. Within a similar experiment setting, different animation frame rates may have yielded different results than those presented here. However, the results presented here are likely valid for a range of animation frame rates similar to the one (500 milliseconds per frame) used in this experiment.

Participants included in this experiment were restricted to a select range of undergraduate students. Thus, there was a limited range of ages, cognitive abilities, previous experience, and motivation present in the participant pool. This sample does not represent the full breadth of potential ages, abilities, and experience that a more diverse audience may have. Under conditions with a broader and more diverse set of ages, abilities, experiences, and motivations, different results could be expected.

Finally, as a result of participant culling based on trial task performance, the distribution of participants across the five symbol groups, which represented the between-participants independent variable in the experiment, was uneven and unbalanced. To provide more robust results, the experiment should evenly distribute participants across all five symbol groups. The unbalanced distribution may have had an impact on the validity of the statistics applied.

As previously discussed, the use of poorly rated bivariate symbol pairs likely had an impact on the performance of those symbol types in the experiment. Based on constraints imposed by representing change, the Bivariate Integral pair of saturation and hue was the only integral variable pair option. As a result, it was necessarily used, despite previous research showing it to be a less-than-optimal pair. Similarly constrained, the Bivariate Separable symbol, which was intended to use orientation and hue, instead used orientation and a spectral scheme, the latter composed of three visual variables: hue, value, and saturation. This meant that it did not faithfully represent the visual variables it was originally intended to use. While this symbol adjustment may have had an impact on the symbol’s dimensional interaction behavior, the spectral scheme was a better visual variable for numerical data values than hue would have been. That these less-than-optimal (integral) or altered (separable) bivariate pairs were used in the experiment likely meant that their performance, as it relates to predictions based on SAT, was not as representative as it would have been with more optimal visual variable pairings.

4.2.5 Summary

This chapter presented the process of designing and executing the task-based experiment phase of this research. The first section provided a thorough discussion of experiment design decisions and the process of selecting tasks, designing the digital survey, generating examples, and performing the experiment. The second section presented an analysis and discussion of the response data generated from the task-based experiment survey. Results were broken down by within-participant and between-participant factors and both the main effects and interaction effects were discussed for each, with repeated measures ANOVA providing statistical support for observed patterns in the data. A discussion of the limitations of the experiment was presented towards the end of the chapter.

Broadly, results of the task-based experiment indicate that participants tended to be more successful with tasks about trends in geography, as compared to tasks about trends in value. The results addressed the central question of this thesis by tentatively showing that for explicit change tasks bivariate symbols representing change are more successful than a univariate non-change symbol. Inconclusive results countered SAT predictions that for certain tasks, particular symbols were more likely to be successful. Instead, the lack of significant differences suggests that the effect of symbol has little impact on performance across tasks. Additionally, the results of the experiment may show that orientation, as a visual variable, is not successful at helping users recognize trends in map animations.

More robust results may yet support the successful application of selective attention theory to animated mapping. As discussed in this section, SAT predicted that for certain tasks, particular symbols were likely to be more successful. However, results for symbol effects on response variables were not significant and the SAT predictions did not hold. Instead, counter to SAT predictions, symbol was not shown to have a significant impact on performance across the tasks used. This was possibly explained by the behavior of integral and separable variables as it relates to both symbol redundancy (change derived from data value) and the interrelated data relationship implemented in the examples used in this experiment. This explanation suggests that SAT predictions cannot be fully evaluated in this case where data was not independent, where symbol redundancy was present, and in particular where there was a strong temporal autocorrelation as is found in a map animation. Finally, the use of less-than-optimal visual variable pairs (orientation and spectral scheme for separable and saturation+value and hue for integral) may have had an impact on the performance of bivariate symbols in the experiment.

5 Conclusions

This thesis accomplished two central goals. The first goal was to derive the animated map reading tasks of domain experts to use in a task-based experiment and to help form a new task typology. The second goal was to determine whether representing change with a bivariate symbol in an animated setting helped or hurt a user’s ability to recognize patterns of change. Tasks derived as a result of the first goal were used first to form a new task typology in combination with existing typologies. The domain analysis and task typology formation process presented in Chapter Three addressed the tertiary question of this thesis presented in Chapter One. Second, tasks derived from domain analysis were used in the task-based experiment phase of research to address the primary and secondary questions of this thesis presented in Chapter One. This chapter reviews the overall success of this research, summarizing important findings, discussing the impacts of this work, and offering avenues of future research.

5.1 Summary of Findings

In accomplishing the two goals of this thesis, both expected and unexpected results were obtained. These were detailed in the previous two chapters, where the results of both task typology formation and task-based experiment completion were presented and discussed. Some of those findings are more conclusive than others and have more impact on the future directions of research. This section summarizes the most important findings from accomplishing both goals of this thesis.


5.1.1 Typology Formation through Domain Analysis

Domain analysis was intended to identify the animated map reading tasks of domain experts working with their data. This was accomplished through the means of a focus group session that allowed consensus building amongst the experts and elucidated important map reading tasks. Combining tasks extracted from the focus group session and three existing typologies, a new typology was formed for movement patterns found in aggregated spatiotemporal point data. Tasks from this new typology were implemented in the task-based experiment phase of this research. The task typology may find use outside of this thesis, by offering geovisualization tool designers a guide that can help them identify the types of tasks that users are likely to perform. Further application of this new typology is encouraged to determine how generally it can be used.

5.1.2 Task-based Experiment

The results of the task-based experiment help answer the central question of this thesis: does representing change with bivariate symbology help users recognize patterns of change in an animated map? Results presented in Chapter Four show that a univariate non-change symbol, representing only value (and not encoding change), performed poorly for tasks explicitly directed at change as compared to both univariate and bivariate symbols representing change. This offers support that bivariate change symbolization can help users recognize patterns of change in an animated map.

Results from the task-based experiment also showed that participants with any symbol type were more successful with questions about trends in geography (the number of locations) than with questions about trends in value (number of units per location). This is supported by evidence from previous research (described in Chapter Two) that indicates that reading tasks related to map legend readout and quantity evaluation are not successful with map animations. This also suggests that map animations may be most successful with simplified and abstracted representations of attribute value, those that make thematically relevant information (in this case, change) the most perceptually salient.

5.2 Impacts

The results of this thesis have broad impacts on aspects of dynamic cartography, including research methods, task typology use for geovisualization tool development, animated map design, and the application of selective attention theory in cartography. While some of these impacts have been stated in the previous two chapters, more specific contributions are outlined in this section.

5.2.1 Typology Use

Existing task typologies for use with geovisualization have been oriented around very general map reading tasks and have not been generated from domain-specific sources, nor have they been designed to meet domain-specific application goals. The task typology formed in this thesis is one of the first to define tasks based on both a specific kind of data, aggregated point data, and a mode of visualization, animated mapping. Extracting the tasks, via domain analysis, and forming a new typology, guided by existing ones, has shown that it is possible to derive useful tasks through relatively easy and simple methods of accessing domain knowledge. This task typology could offer geovisualization designers and users a way to understand the types of exploratory map reading tasks directed at the movement patterns found in aggregated spatiotemporal point data.

5.2.2 Attribute Value and Geography Dimension Focus in Animated Maps

We are only beginning to understand the cognitive limits associated with using animated maps. Research in the field seeks to develop our understanding of how map animations can be best used in the geovisualization toolkit. Task-based experiment results from this thesis, showing that users were much more successful identifying trends in geography than trends in value, add to our understanding of the perceptual-cognitive limits of map animations. This finding suggests that animated maps showing complex symbols, multiple classes of data, or large ranges of data values may not be successful for helping users recognize patterns and trends in data. Instead, animated map makers may be successful by creating map animations with simplified symbols that abstract relevant aspects of the data to reveal patterns in geography. Further, animated maps may be best as part of a set of tools in which they are relied upon to help users understand change in geography over time, while other tools are relied upon to help users understand other aspects of the targeted phenomenon.

5.2.3 Bivariate Animated Mapping

Prior to this research, the use of bivariate symbols in animated maps was untested. This work has shown that bivariate animated maps representing change are likely more successful than univariate animated maps not representing change in facilitating recognition of patterns and trends of change in a spatiotemporal dataset for explicit change tasks. Other examples of bivariate animated maps that seek to abstract relevant information in an effort to make the information more salient will likely find success as well. Hopefully, the results of this thesis will open the door for both future map makers and researchers to experiment with and study bivariate animated maps further.

5.2.4 Selective Attention Theory

While SAT predicted that, for certain tasks with an independent data set, particular symbols would be more effective, the results suggested that this was not the case in the experiment. There are three potential explanations: less-than-optimal use of symbol pairs, data relationships, and symbol redundancy.

First, as discussed in Chapter 4, less-than-optimal visual variable pairs were used for both the Bivariate Integral and the Bivariate Separable symbols as a result of constraints imposed by representing change. This sub-optimal symbol use likely had an impact on the overall effectiveness of the bivariate symbols in the experiment. If other variable pairs were used in a similar setting, different results might be expected. However, considering the constraints of representing both data and change, other pairs may not be readily available.

Second, the failure of the SAT predictions to hold may be explained by the interrelated character of the data. The better success of the Bivariate Integral symbol with interrelated data may suggest that integral variable pairs should be used with interrelated or correlated datasets. It remains uncertain for which data relationships separable variable pairs will be most effective.

Third, symbol effectiveness was likely affected both by the interrelated data sets used and by the symbol redundancy present in the bivariate symbols (because change was derived from data value). Symbol redundancy can result in much better performance when compared to symbols that do not use redundancy. In this case, the better performance of the Bivariate Integral symbol can potentially be explained by the fact that, in a situation where both bivariate symbols had redundancy, only the Bivariate Integral symbol forced integral attention to both dimensions. Thus, the results can help support the use of symbol redundancy, especially as it applies to integral and separable bivariate pairs used in map animations.
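
The redundancy described here arises because the change dimension is computed from the same series as the value dimension. A minimal Python sketch of that dependence follows. It assumes that change is the simple difference between consecutive time frames; the exact change measure used in the experiment is not restated in this section, and the function name and sample values are hypothetical.

```python
def frame_changes(values):
    """Return the change between each pair of consecutive time frames.

    Assumption: change is the simple difference value[t] - value[t-1].
    The first frame has no predecessor, so its change is None.
    """
    changes = [None]
    for previous, current in zip(values, values[1:]):
        changes.append(current - previous)
    return changes


# Hypothetical values for one map cell across five time frames.
cell_values = [2, 5, 9, 9, 4]
cell_changes = frame_changes(cell_values)

# Each frame of a bivariate symbol would encode both members of the pair.
bivariate_frames = list(zip(cell_values, cell_changes))
print(bivariate_frames)
```

Because every change value is fully determined by the value series, the two symbol dimensions are not independent, which is the sense of redundancy discussed in this section.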

5.2.5 Domain Specific Knowledge

An underlying, yet understated, goal of this thesis was to provide a benefit to the domain experts who participated in the study by improving the quality of their visualizations. Being able to suggest symbolizations that were tested with their tasks and their data could be valuable to those domain experts. While their existing symbolization method has been shown to be most successful for recognizing basic trends in value, it was shown to perform poorly for recognizing patterns of change. The new change symbolizations tested here should help improve their ability to study and recognize patterns and trends in change.

5.3 Future Research

Throughout the research process, a number of questions remained unanswered, while new questions were revealed. Results from both task typology formation and the task-based experiment help to identify avenues of future research. This section suggests a number of topics for future research that arose from the process of completing this thesis.

5.3.1 Tasks

Map animations may serve a limited number of map reading task types. Previous research reviewed in the literature section has identified trend type tasks as being successfully accomplished with map animations, but map animations certainly work for some other tasks while failing for yet others. The research presented here limited task-based experiments to one type of task, trend. In addition to the further investigation of bivariate mapping and selective attention theory suggested by the results, future work with map animations should be directed at better delimiting successful task types so that appropriate symbolizations can be implemented for those tasks. Additionally, future research should seek to make better links between tasks in the new task typology presented here and the most appropriate forms of visualization, in an effort to establish a lexicon for developing visualization tools or toolkits that can best facilitate tasks identified for a target user group.

5.3.2 Data Relationships and Redundancy

Important, yet originally overlooked, elements of working with bivariate symbols and selective attention theory are the data relationships being symbolized and the dimensional interaction behaviors of the variable pairs being used. In this work, only interrelated data were used, and it was found that, as explained by selective attention theory, an integral variable pair was more successful with trend tasks in an animated map setting than was a separable variable pair.

This is also potentially explained by symbol redundancy present in the symbols (change being derived from data value) and the fact that with the Bivariate Integral symbol, forced, integrated attention of the two redundant symbols occurred, while it did not in the Bivariate Separable symbol. Including multiple datasets with different data relationships could have revealed successes for each variable pair type for the given dataset relationship. Based on selective attention theory, separable variable pair types would be predicted to work best for independent data relationships. Future efforts should be directed at testing other data relationships and both non-redundant and redundant symbolizations in a similar animated map setting.

5.3.3 Animation Speeds

While not directly studied in this work, the role of animation frame rates in facilitating animated map use across a number of tasks or goals remains understudied. Psychological research, as discussed in Chapter Two, defined a curve of reading success for animated stimuli, indicating that there are two ranges of speeds that may be most successful (less than 100 msec/frame and greater than 500 msec/frame). This behavior should be tested in an animated map setting with different tasks to determine whether different speeds are more or less effective for different tasks. Animated map design is in need of authoritative research on frame rate use.
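
To relate the frame durations quoted above to the frame rates that animation software usually exposes, the conversion is simple arithmetic. The short Python sketch below uses only the two thresholds stated in the text; the function names and the label for the middle range are hypothetical and serve illustration only.

```python
FAST_LIMIT_MS = 100   # durations below this fall in the fast range (from the text)
SLOW_LIMIT_MS = 500   # durations above this fall in the slow range (from the text)


def frames_per_second(ms_per_frame):
    """Convert a frame duration in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


def speed_range(ms_per_frame):
    """Label a frame duration relative to the two ranges quoted in the text."""
    if ms_per_frame < FAST_LIMIT_MS:
        return "fast range"
    if ms_per_frame > SLOW_LIMIT_MS:
        return "slow range"
    return "intermediate"  # hypothetical label for the span between the ranges


for duration in (50, 250, 1000):
    print(duration, frames_per_second(duration), speed_range(duration))
```

Under this conversion, the two ranges correspond to more than 10 frames per second and fewer than 2 frames per second, respectively.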

5.4 Conclusion

The goals of this experiment were largely achieved. Our understanding of the animated map reading tasks that are relevant for domain experts has been advanced. A new task typology was formed from the focus group session material, and the resulting tasks were implemented in the task-based experiment. The process of task-based experiment design and execution, which required careful deliberation, was successful on many levels. Task-based experiment results helped address the central questions of this thesis and unveiled new questions for future research. The research presented here contributes to our knowledge of animated maps, extending the application of selective attention theory to dynamic forms. Hopefully, this work will encourage the use of bivariate symbols in animated mapping as a successful method for representing change and helping users recognize patterns of change in spatiotemporal data. Future work with animated maps should strive to make connections between domain-oriented goals and perceptually-based map behaviors in seeking to better understand how maps can be used to solve problems.

References

Acevedo, W. and P. Masuoka (1997). "Time-series animation techniques for visualizing urban growth." Computers and Geosciences 23(4): 423-435.

Andrienko, G. and N. Andrienko (2005). "Visual Exploration of the Spatial Distribution of Temporal Behaviors." Proceedings of the Ninth International Conference on Information Visualisation (IV’05): 799-806.

Andrienko, N. and G. Andrienko (2007). "Designing Visual Analytics Methods for Massive Collections of Movement Data." Cartographica: The International Journal for Geographic Information and Geovisualization 42(2): 117-138.

Andrienko, N., G. Andrienko, et al. (2000). "Supporting visual exploration of object movement." Proceedings of the working conference on Advanced visual interfaces: 217-220.

Andrienko, N., G. Andrienko, et al. (2003). "Exploratory spatio-temporal visualization: an analytical review." Journal of Visual Languages and Computing 14(6): 503-541.

Andrienko, N., G. Andrienko, et al. (2002). "Testing the Usability of Interactive Maps in CommonGIS." Cartography and Geographic Information Science 29(4): 325-343.

Bertin, J. (1967). Semiology of Graphics: Diagrams, Networks, Maps, The University of Wisconsin Press.

Bhowmick, T., A. L. Griffin, et al. (2008). "Informing geospatial toolset design: Understanding the process of cancer data exploration and analysis." Health and Place.

Blok, C. (2000). Monitoring Change: Characteristics of Dynamic Geo-spatial Phenomena for Visual Exploration. Spatial Cognition II, LNAI 1849. C. F. e. al. Berlin, Springer-Verlag.

Blok, C., B. Kobben, et al. (1999). "Visualization of Relationships between Spatial Patterns in Time by Cartographic Animation." Cartography and Geographic Information Science 26(2).

Blok, C. A. (2005). Dynamic visualization variables in animation to support monitoring of spatial phenomena. International Institute for Geo-Information Science and Earth Observation (ITC). Enschede, Netherlands, Utrecht University. Ph.D.

Brewer, C. A., A. M. MacEachren, et al. (1997). "Mapping Mortality: Evaluating Color Schemes for Choropleth Maps." Annals of the Association of American Geographers 87(3): 411-438.

Brewer, C. A. and L. Pickle (2002). "Evaluation of Methods for Classifying Epidemiological Data on Choropleth Maps in Series." Annals of the Association of American Geographers 92(4): 662-681.

137

Brewer, I. (2005). Understanding work with geospatial information in emergency management: A cognitive systems engineering approach in GIscience, The Pennsylvania State University.

Brewer, I. and A. J. Campbell (1998). "Beyond Graduated Circles: Varied Point Symbols for Representing Quantitative Data on Maps." Cartographic Perspectives 29(Winter 1998).

Buttenfield, B. (1999). "Usability Evaluation of Digital Libraries." Science And Technology Libraries 17: 39-60.

Campbell, C. S. and S. L. Egbert (1990). "Animated Cartography / Thirty Years of Scratching the Surface." Cartographica: The International Journal for Geographic Information and Geovisualization 27(2): 24-46.

Caruana, R., M. Elhawary, et al. (2006). "Mining citizen science data to predict prevalence of wild bird species." Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining: 909-915.

Chen, J., A. M. MacEachren, et al. (2008). "Supporting the Process of Exploring and Interpreting SpaceTime Multivariate Patterns: The Visual Inquiry Toolkit." Cartography and Geographic Information Science 35(1): 33-50.

Cutler, M. E. (1998). The Effects of Prior Knowledge on Children's Abilities to Read Static and Animated Maps. Department of Geography, University of South Carolina. M.S.

DiBiase, D. (1990). "Visualization in the earth sciences." Earth and Mineral Sciences 59(2): 13- 18.

DiBiase, D., A. M. MacEachren, et al. (1992). "Animation and the Role of Map Design in Scientific Visualization." Cartography and Geographic Information Science 19(4): 201-214.

Dobson, M. W. (1983). "Visual information processing and cartographic communication: The utility of redundant stimulus dimensions." Graphic communication and design in contemporary cartography. Chichester, UK: John Wiley & Sons: 149-75.

Dorling, D. (1992). "Stretching Space and Splicing Time: From Cartographic Animation to Interactive Visualization." Cartography and Geographic Information Science 19(4): 215-227. eBird. (2009). eBird: An online database of bird distribution and abundance [web application]. Version 2. eBird, Ithaca, New York. Available: http://www.ebird.org. (Accessed: May 25, 2009).

Edsall, R. M., M. J. Kraak, et al. (1997). "Assessing the effectiveness of temporal legends in environmental visualization." GIS/LIS 97: 28-30.

Fabrikant, S. and S. Garlandini (2009). Visual Change Detection in a Blink of the Eye American Association of Geographers Annual Meeting. Las Vegas, NV.

138 Fabrikant, S. I. and K. Goldsberry (2005). "Thematic relevance and perceptual salience of dynamic geovisualization displays." Proceedings of 22nd ICA international cartographic conference: mapping approaches into a changing world, A Coruna, Spain.

Fabrikant, S. I., S. Rebich-Hespanha, et al. (2008). "Novel Method to Measure Inference Affordance in Static Small-Multiple Map Displays Representing Dynamic Processes." The Cartographic Journal 45(3): 201-215.

Friborg, O., M. Martinussen, et al. (2006). "Likert-based vs. semantic differential-based scorings of positive psychological constructs: A psychometric comparison of two versions of a scale measuring resilience." Personality and Individual Differences 40(5): 873-884.

Griffin, A. L., A. M. MacEachren, et al. (2006). "A Comparison of Animated Maps with Static Small-Multiple Maps for Visually Identifying Space-Time Clusters." Annals of the Association of American Geographers 96(4): 740-753.

Harrower, M. (2002). Visual Benchmarks: Representing Geographic Change with Map Animation. Department of Geography. University Park, PA, The Pennsylvania State University. Ph.D.: 289.

Harrower, M. (2003). "Tips for Designing Effective Animated Maps." Cartographic Perspectives 44: 63-65.

Harrower, M. (2004). "A Look at the History and Future of Animated Maps." Cartographica: The International Journal for Geographic Information and Geovisualization 39(3): 33-42.

Harrower, M. (2007). "Unclassed Animated Choropleth Maps." The Cartographic Journal 44(4): 313-320.

Harrower, M. and S. Fabrikant (2008). "The role of map animation in geographic visualization." Geographic Visualization, Wiley and Sons: Chichester UK: 49–65.

Harrower, M., A. MacEachren, et al. (2000). "Design, implementation, and assessment of geographic visualization tools to support earth science education." Cartography & Geographic Information Systems 27: 279-293.

Hegarty, M., S. Kriz, et al. (2003). "The Roles of Mental Animations and External Animations in Understanding Mechanical Systems." Cognition and Instruction 21(4): 209-249.

Johnson, H. and E. S. Nelson (1998). "Using flow maps to visualize time-series data: Comparing the effectiveness of a paper map series, a computer map series, and animation." Cartographic Perspectives 30: 47-64.

Koussoulakou, A. and M. J. Kraak (1992). "Spatio-temporal maps and cartographic communication." Cartographic Journal 29(2): 101-108.

Kraak, M. J. and D. E. van de Vlag (2007). "Understanding Spatiotemporal Patterns: Visual Ordering of Space and Time." Cartographica: The International Journal for Geographic Information and Geovisualization 42(2): 153-161.

139

Krosnick, J. A. (1999). "Survey research." Annual Review of Psychology 50(1): 537-567.

Lloyd, D., J. Dykes, et al. (2007). Understanding geovisualization users and their requirements–a user-centred approach.

Lowe, R. (2004). "Interrogation of a dynamic visualization during learning." Learning and Instruction 14(3): 257-274.

Lowe, R. K. (1999). "Extracting information from an animation during complex visual learning." European Journal of Psychology of Education XIV: 225-244.

Lowe, R. K. (2003). "Animation and learning: selective processing of information in dynamic graphics." Learning and Instruction 13(2): 157-176. MacEachren, A. (1995). How Maps Work. New York, NY, The Guilford Press.

MacEachren, A. M., F. P. Boscoe, et al. (1998a). "Geographic Visualization: Designing Manipulable Maps for Exploring Temporally Varying Georeferenced Statistics." Proceedings, Information Visualization 98: 19-20.

MacEachren, A. M., C. A. Brewer, et al. (1998b). "Visualizing georeferenced data: representing reliability of health statistics." Environment and Planning A 30(9): 1547-1561.

MacEachren, A. M. and J. H. Ganter (1990). "A Pattern Identification Approach to Cartographic Visualization." Cartographica: The International Journal for Geographic Information and Geovisualization 27(2): 64-81.

MacEachren, A. M. and M. J. Kraak (2001). "Research challenges in geovisualization." Cartography and Geographic Information Science 28(1): 3-12.

Martis, K. C. (1989). The Historical of Political Parties in the United States Congress, 1789-1989. . New York, NY, MacMillan.

Midtbø, T., K. C. Clarke, et al. (2007). "Human Interaction with Animated Maps: The portrayal of the passage of time." Proceedings, SCANGIS'2007.

Midtbø, T. and T. Nordvik (2007). "Effects of Animations in Zooming and Panning Operations on Web maps: A Web-based Experiment." The Cartographic Journal 44(4): 292-303.

Monmonier, M. (1990). "Strategies For The Visualization Of Geographic Time-Series Data." Cartographica: The International Journal for Geographic Information and Geovisualization 27(1): 30-45.

Monmonier, M. (1996). "Temporal Generalization for Dynamic Maps." Cartography and Geographic Information Science 23(2): 96-98.

Monmonier, M. and B. B. Johnson (1991). Using qualitative data gathering techniques to improve the design of environmental maps. 15th International Cartographic Conference, Bournemouth, U.K.

140

Monmonier, M. S. (1994). "Minimum-change categories for dynamic temporal choropleth maps." Journal of the Pennsylvania Academy of Science 68(1): 42-47.

Montello, D. R. (2002). "Cognitive Map-Design Research in the Twentieth Century: Theoretical and Empirical Approaches." Cartography and Geographic Information Science 29(3): 283-304.

Morrison, J. B. (2000). Does animation facilitate learning? An evaluation of the congruence and equivalence hypothesis. Department of Pyschology, Stanford University. Ph.D.: 161.

Nelson, E. S. (1999). "Using selective attention theory to design bivariate point symbols." Cartographic Perspectives 32: 6-28.

Nelson, E. S. (2000)a. "Designing Effective Bivariate Symbols: The Influence of Perceptual Grouping Processes." Cartography and Geographic Information Science 27(4): 261-278.

Nelson, E. S. (2000)b. "The Impact of Bivariate Symbol Design on Task Performance in a Map Setting." Cartographica: The International Journal for Geographic Information and Geovisualization 37(4): 61-78.

Ogao, P. J. and M. J. Kraak (2002). "Defining visualization operations for temporal cartographic animation design." International Journal of Applied Earth Observations and Geoinformation 4(1): 23-31.

Raaijmakers, Q. A. W. (2000). "Adolescents' midpoint responses on Likert-type scale items: neutral or missing values?" International Journal of Public Opinion Research 12(2): 209-217.

Robinson, A. C., J. Chen, et al. (2005). "Combining Usability Techniques to Design Geovisualization Tools for Epidemiology." Cartography and Geographic Information Science 32(4): 243-255.

Shapiro, K. L. (1994). “The Attentional Blink: The Brain's Eyeblink.” Current Directions Psychological Science 3:86-89.

Slocum, T. A. (1983). "Predicting Visual Clusters on Graduated Circle Maps." The American Cartographer 10(1): 59-72.

Slocum, T. A., D. C. Cliburn, et al. (2003). "Evaluating the Usability of a Tool for Visualizing the Uncertainty of the Future Global Water Balance." Cartography and Geographic Information Science 30(4): 299-318.

Slocum, T. A., R. McMaster, et al. (2008). Thematic Cartography and Geovisualization, Prentice Hall Upper Saddle River, NJ.

Slocum, T. A., R. S. Sluter, et al. (2004). "A Qualitative Evaluation of MapTime, A Program For Exploring Spatiotemporal Point Data." Cartographica: The International Journal for Geographic Information and Geovisualization 39(3): 43-68.

141 Slocum, T. A., S. C. Yoder, et al. (2000). "MapTime: Software for Exploring Spatiotemporal Data Associated with Point Locations." Cartographica: The International Journal for Geographic Information and Geovisualization 37(1): 15-32.

Tufte, E. R. (1983). The visual display of quantitative information. Cheshire, CT, Graphics Press.

Tversky, B., J. B. Morrison, et al. (2002). "Animation: can it facilitate." International Journal of Human-Computer Studies 57(4): 247-262.

Ware, C. (2004). Information Visualization: Perception for Design. San Francisco, CA, Morgan Kaufmann.

Wolfe, J. M., K. R. Kluender, et al. (2006). Sensation and Perception. Sunderland, MA, Sinauer Associates, Inc.

Wehrend, S. (1993). Taxonomy of Visualization Goals. Visual Cues: Practical Data Visualization. P. R. Keller and M. M. Keller. Los Alamitos, CA, IEEE Computer Society Press: 187-199.

Yattaw, N. J. (1999). "Conceptualizing Space and Time: A Classification of Geographic Movement." Cartography and Geographic Information Science 26(2).

Appendix A

SCRIPT FOR FOCUS GROUP SESSION WITH AKN STAFF

PLAN A: Two Hours (Full Meeting)

PLAN B: One Hour or less (Short Meeting – important questions in bold)

INTRODUCTION (10 minutes)

Administer Consent Forms.

Discuss pertinent aspects of consent.

Get signatures and collect forms (make photocopies after session is complete).

Introduce myself, topic and format for the session:

“I need to indicate what is important within this session. During discussion, I’d like you to focus on elements related to visualization of the AKN data and recognizing patterns of change within the data, across different temporal scales, and mainly at broader spatial scales. As well, keep in mind the end representational goals for AKN and what users of the AKN data will see.”

Be explicit about the things to look at. “For practical reasons, we’re looking to represent time-series data for one species at a time.”

Offer things that I’m not looking for. Tell them about targets for these animations…researchers, ecology students, not amateurs.

I will periodically end discussion once I’ve gotten enough information, so don’t be surprised.

Field any initial questions.

SECTION ONE (15 minutes) – Visualization Goals

(Skip first half if on PLAN B)

How do you currently visualize the AKN data to facilitate your needs?

Follow up: Is it working?

How do you envision visualizing the AKN data?

Specifically, what information do you wish to derive through visualization and exploration?

Follow up: Finding errors? Exploring trends? Learning?

SECTION TWO (20 minutes) – Data Format

(combine prompts to one if on PLAN B)

AKN data is stored as discrete points. What format should the AKN data be presented in?

Further: (Show examples) Points or Areas? Raster or Vector?

Follow Up: Would a combination (flexibility for the user) be suitable?

What, if any, statistical standardization should be used when representing the AKN data?

Follow Up: Does this change the format that the AKN data should be presented in?

SECTION THREE (15 minutes) – Temporal Smoothing & Change Monitoring

(Skip first half if on PLAN B)

What role does monitoring change over time play in regards to visualizing the data?

Follow up: How do you imagine this being better done?

How important is accounting for noise through temporal smoothing?

Further: Could temporal aggregates of data play a role?

BREAK (skip if on PLAN B)

SECTION FOUR (45 minutes) – Map Reading Tasks from Animations

(Main part if on PLAN B)

Include a quick introduction to this section that discusses the need to focus on patterns in the data, change, and overall impression reading (as opposed to map readout or quantity evaluation). Don’t worry about: map format, interactivity, appearance, etc. We’re not critiquing an interface.

Categories:

eBird 1 – Diving Ducks AKN movie

eBird 2 – RTHU AKN movie

eBird 3 – Purple Martin AKN movie

akn1 – Ross’s Goose, by year

akn1MW – Ross’s Goose, by year, moving Window

akn2 – Western Grebe, by Month

akn2MW – Western Grebe, by Month, moving Window

prop1 – Dickcissel, by year

prop2 – Arctic Tern, by Month

biv1 – Dickcissel, by year

biv2 – Ross’s Goose, by Month

Administer the following prompts during the showings, if necessary:

What are you looking for in terms of: density, change, spatial extent, clustering, paths, overall pattern?

FINISHING (5 minutes)

Address any questions and offer any prompts that may have arisen during the main sections.

Field any questions from participants.

Make photocopies of consent forms and return to participants.

(10 minute buffer)

Appendix B

FOCUS GROUP SESSION ANIMATED MAP STIMULI

This appendix contains animated map stimuli presented during the focus group session.

1. Purple Martin – AKN

2. Ross’s Goose – Auer

3. Western Grebe – Auer

4. Western Grebe – Moving Window – Auer

5. Dickcissel – Auer

6. Arctic Tern – Auer

7. Dickcissel – Bivariate Symbolization – Auer

8. Ross’s Goose – Bivariate Symbolization – Auer