Avian Flu Case Study with Nspace and Geotime
Total Page:16
File Type:pdf, Size:1020Kb
Avian Flu Case Study with nSpace and GeoTime Pascale Proulx, Sumeet Tandon, Adam Bodnar, David Schroh, Robert Harper, and William Wright* Oculus Info Inc. ABSTRACT 1.1 nSpace - A Unified Analytical Workspace GeoTime and nSpace are new analysis tools that provide nSpace combines several interactive visualization techniques to innovative visual analytic capabilities. This paper uses an create a unified workspace that supports the analytic process. One epidemiology analysis scenario to illustrate and discuss these new technique, called TRIST (The Rapid Information Scanning Tool), investigative methods and techniques. In addition, this case study uses multiple linked views to support rapid and efficient scanning is an exploration and demonstration of the analytical synergy and triaging of thousands of search results in one display. The achieved by combining GeoTime’s geo-temporal analysis other technique, called the Sandbox, supports both ad-hoc and capabilities, with the rapid information triage, scanning and sense- formal sense making within a flexible free thinking environment. making provided by nSpace. Additionally, nSpace makes use of multiple advanced A fictional analyst works through the scenario from the initial computational linguistic functions using a web services interface brainstorming through to a final collaboration and report. With the and protocol [6, 3]. efficient knowledge acquisition and insights into large amounts of documents, there is more time for the analyst to reason about the 1.1.1 TRIST problem and imagine ways to mitigate threats. The use of both nSpace and GeoTime initiated a synergistic exchange of ideas, where hypotheses generated in either software tool could be cross- TRIST, as seen in Figure 1, uses advanced information retrieval referenced, refuted, and supported by the other tool. methods to support rapid scanning and exploration of large datasets. The design of TRIST was informed by recent research CR Categories: H.5.2 [Information Interfaces & Presentations]: into information retrieval [4,7,17], the development of the User Interfaces – Graphical User Interfaces (GUI); I.3.6 cognitive dimensions framework [5], and the idea of an [Methodology and Techniques]: Interaction Techniques. information-seeking environment [8]. TRIST supports multiple analytical processes, including query Keywords: visual analytics, information visualization, human planning, rapid scanning of large corpus of documents, and result information interaction, sense making, geo-spatial information characterization across multiple dimensions [12]. Analysts work systems, temporal analysis, user centered design with TRIST to triage their massive data and to extract information into the Sandbox evidence marshalling environment. 1 INTRODUCTION GeoTime and nSpace are two novel visual analytic applications that have been developed in collaboration with analysts to support the investigation of large and complex datasets. nSpace is used for triaging massive data and for analytical sense-making [12], [22], and is currently undergoing experimental evaluation and pilot deployment. GeoTime supports the visualization and analysis of entities and events over time and geography [13] and is currently in transition to deployment for analysts to use on a day-to-day basis. The epidemiology dataset was created by the authors’ for the purposes of investigating the analytic process and evaluating nSpace and GeoTime using an ecologically valid scenario. This dataset is based on real news reports detailing the current outbreak of avian flu. The analyst used for the evaluation was experienced with both nSpace and GeoTime, but had no formal training in the art of intelligence analysis. * email: {pascale.proulx, sumeet.tandon, adam.bodnar, david.schroh, rob.harper, bill.wright}@oculusinfo.com Figure 1: TRIST interface. Left column: Launch Queries, Query History, and Dimensions panes. Middle column: Displayed dimensions with categorized results and Document Viewer. Right column: Entities pane. An evaluation of TRIST conduced over a two day period by The National Institute of Standards and Technology (NIST), found that analyst productivity and quality of analysis increased when using the tool [18]. © 2006 Oculus Info Inc. Page 1 1.1.2 Sandbox occurrence and distribution, in agent or host factors, or even in health care practices. Typical analysis tools used in epidemiology The Sandbox is a flexible and expressive environment for include statistical measures, spatial analysis, geographic working with evidence and generating hypotheses. While concept information systems, and time series analysis tools [1]. maps [11,15] and link analysis tools [10,19] aim to support similar activates, their node-link representations can be limiting, and 2 THE AVIAN FLU CASE STUDY readability and usability decrease as the volume of data increases. The Sandbox supports both ad-hoc and more formal analytical The avian flu case study presents several analytical challenges. sense making though the use of fluid gestures for interaction. The dynamic nature of the data requires tools which support Additionally, the Sandbox supports automated process model temporal analysis, and enable the understanding of complex geo- templates, layers for managing multiple dimensions of data, spatial events. The objectives for the case study are listed in assertions with evidence templates, and scalability mechanisms to Table 1. The role of the analyst will be played by Lynn, a support larger analytic tasks [22]. fictional analyst who is familiar with both nSpace and GeoTime, An evaluation of the Sandbox was conducted by the NIST but has no formal training in analysis, and can thus be considered Visualization and Usability Group in 2005 [16]. Four analysts a novice user. The case study will provide a detailed examination participated in the experiment consisting of three hours of training of the analytic process as Lynn uses nSpace and GeoTime to and eight hours of work on an intelligence analysis task. Results fulfill the objectives of the exercise. The data used in this case indicated that the Sandbox enabled the analysts to work quickly study is publicly available and consists entirely of internet sources and efficiently, and perform a higher level of analysis when obtained through TRIST via the Google search engine. Key compared to other tools. references include the World Health Organization’s avian flu disease outbreak news [21] and events timeline [20]. 1.2 GeoTime - Concurrent Temporal and Spatial Analysis Table 1. Analysis Objectives GeoTime improves perception and understanding of entity Investigate and characterize the avian flu outbreaks movements, events, relationships, and interactions over time Identify and analyze trends within a geospatial context. As seen in Figure 2, events are Identify key issues and threats represented within an X,Y,T coordinate space, in which the X,Y Develop, support or refute, and refine hypotheses about plane represents geographic space, and the T axis represents time. cause and effects In addition to providing spatial context, the ground plane marks Propose options for mitigating future risk the instant of focus between past and future, meaning that events along the timeline ‘occur’ when they meet the surface. 2.1 Thinking without Borders Several other visualization techniques for analyzing complex event interactions only display information along a single Lynn has just received the assignment to analyze the distribution, dimension, typically one of time, geography or network determinants, and potential impacts of the recent avian flu connectivity. outbreak. She begins by brainstorming in the Sandbox using her An evaluation of GeoTime found that its unified geo-temporal prior and tacit knowledge to generate hypotheses about the representation increased analysts’ understanding of entity possible causes and effects of the recent outbreaks. She notes key relationships and behaviors [13]. questions and outlines her analytical strategy for further investigation. The fluid gestural interaction within the Sandbox supports the early stages of analysis, which are often unconstrained to enable the generation of new ideas and concepts. Lynn simply points and clicks to create annotations. She draws circles to create groups and draw links to connect related ideas. Using these techniques, Lynn’s thoughts flow freely, and are quickly organized with drag and drop and semi-automatic layout. Emphasis can also be added to create visual structure. Alternative scenarios are sketched out and the resulting empty assertions become explicit reminders to find supporting and refuting evidence, as shown in Figure 3. Figure 2: Individual views of movement in time are translated into a continuous spatiotemporal representation. 1.3 Epidemiology Analysis Epidemiology is the study of the distribution and determinants of health and diseases, morbidity, injuries, disability and mortality in populations [14]. In this context, an epidemiologist can be thought of as a disease detective. Common tasks involved in epidemiology include surveillance, outbreak investigation, control measures, risk assessment, event characterization, and long term impact research and the evaluation of interventions. Trend analysis and change detection are also important in Figure 3: Brainstorming in the Sandbox. epidemiological analysis. Changes can occur in disease © 2006 Oculus Info Inc. Page 2 2.2 Acquiring Analytical Awareness all the documents that contain each entity become visible, and the entity