The National SecuritySecu Agency’s Review of Emerging Technologies prp occesessisinngg 7PMM  /P/ t  

An Information veriify Visualization Primer and Field Trip

Visual Analytics – Adventures in Exploring Information Spaces

Do You Know What’s Happening – Just Beyond Your View?

Knowledge Visualization with Concept Mapping Tool

Letter from the Editor

A picture is worth a thousand words. We need to see the big picture.

As these clichés indicate, pictures help us understand concepts and ideas that words cannot convey—no matter how many words we use. Pictures help us make mental leaps and enable us to see things differently. Combining this human need to use visuals as short-cuts for ideas and the deluge of data we face at work and at home, the fi eld of data visualization helps us make sense of our fast-paced world.

In our second issue of 2008, The Next Wave (TNW) presents articles on several aspects of the fi eld of data visualization. Our fi ve articles include:

t an information visualization primer and “tour” of the Internet

t an explanation of visual analytics, an emerging fi eld that focuses on injecting interactive visualization throughout the analysis process

t a description of longshot displays, a specifi c visualization technique that is used to maintain information awareness or context in a display showing large amounts of data

t an introduction to concept mapping as a technique and as a set of computer-

based tools for capturing, representing, and sharing knowledge t a brief look at visualization in virtual worlds, specifi cally in Second Life We hope you enjoy the issue and come away with a big picture of the fi eld of data visualization.

The Next Wave is published to disseminate signifi cant technical advancements and research activities in telecommunications and information technologies. Mentions of company names or commercial products do not imply endorsement by the U.S. Government. Articles present views of the authors and not necessarily those of NSA or the TNW staff.

For more information, please contact us at TNW@tycho ncsc.mil

CONTENTS

FEATURES 4 An Information Visualization Primer and Field Trip

14 Visual Analytics: Adventures in Exploring Information

20 Do You Know What’s Happening – Just Beyond Your View?

26 Knowledge Visualization with Concept Mapping Tool

FOCUS 30 Visualizing Data in the Metaverse An Information Visualization Primer and Field Trip

The commonplace concepts of information explosion and information overload… the calls to convert data into usable information and knowledge… the rallying cry “Volume, velocity, variety!” that describes an ever-increasing amount of data—all of these refer to the challenge of analyzing and understanding data. Terms, for example, information economy and knowledge economy, have been coined to refer to the world’s post-industrial economies, racing to understand and profi t from their data. On the large scale, our nation faces the globalizing infl uence of the Internet and the access to more data, which is fueling a drive to collect and capitalize on the data. On a smaller scale, you may face the task of analyzing several months of data and writing a report on your conclusions. To do so, you must query several different databases, normalize and collect the data together, identify trends and outliers, produce understandable, synthesized conclusions… all in time, you hope, to get home for the season premiere of Lost.

What Is Visualization? visualization as the human, mental activity A related defi nition is offered by Visualization is one strategy for starting from perception and leading to a the National Visualization and Analytics Center (NVAC): making sense of data. For the remainder mental model in memory that represents of this article, we will consider whether some level of understanding. Moving Visual Analytics—“The science of analytical reasoning facilitated by visualization could help you work more outside of the human, this article examines interactive visual interfaces.” [2] effi ciently and effectively. The article how technology can serve as external aids focuses on the goal of understanding in the formation of human comprehension. Visualization is a computerized interface using visual cues to improve data, especially when the understanding From the viewpoint that technology can human understanding of data. Visual cues relates to a task that must be performed. help us to be smart, this article provides can range from points, lines, shapes, colors, This article does not treat the topic of this commonly used defi nition that neatly intensity, textures, motion—anything visualization used for art or entertainment, captures both the purpose and the method that can be used to encode aspects of the but rather for analytic advantage and of visualization: data to make the data more accessible to scientifi c breakthrough. Visualization—The use of human perception. Effective visualization What constitutes visualization? Some computer-supported, interactive, visual means selecting and organizing the visual authors, defi ning visualization, focus on representations of data to amplify cues so that you can do your work faster internal human processes. They discuss cognition.” [1] or with fewer mistakes or possibly with discoveries that you would not otherwise Average Average Residential Residential Electricity Electricity make. Cost (cents Adopted Cost (cents Adopteded per Electricty per Electrictyty Within the fi eld of visualization, a State kilowatt hour) Restructuring State kilowatt hour) Restructuturing Alabama 8.23 no Monana 8.22 yes distinction has historically been made Alaska 13.71 no Nebraska 6.55 no Arizona 8.17 yes Nevada 10.88 no between scientifi c visualization and Arkansas 7.78 no Nevada 15.02 yes Figure 1: California 13.47 no New Hampshire 11.38 yes Data about information visualization. The distinction Colorado 9.2 no New Jersey 9.01 no 1) the average revolves around the type of data to be Connecticut 16.26 yes New Mexico 16.52 yes Delaware 8.66 yes New York 8.95 no cost of electricity visualized. Scientifi c visualization is Florida 11.14 no North Carolina 6.45 no for the 50 US Georgia 8.71 no North Dakota 8.94 yes states and based on physical data about, for example, Hawaii 22.94 no Ohio 8.46 no Idaho 6.14 no Oklahoma 7.46 noo the District of molecules or the weather. This type of data I liniois 8.05 yes Oregon 10.05 yees Columbia and Indiana 7.77 no Pennsylvania 14.73 yyes generally has a natural visual representation Iowa 9.68 no Rhode Is and 8.82 no 2) which states because the data refl ect an existing Kansas 7.43 no South Caro ina 7.33 no have adopted Kentucky 6.53 no South Dakota 6.53 no electricity three-dimensional reality. Examples of Louisiana 9.59 no Tennessee 11.85 yes Maine 14.07 yes Texas 7.35 no restructuring scientifi c visualization include climate Maryland 7.93 yes Utah 13.36 no Massachusetts 18.43 yes Vermont 7.94 yes studies, simulations of weapons testing, Michigan 9.56 yes Virginia 6.67 no Minnesota 8.63 no Washington 6.15 no models of chemical or biological systems, Mississippi 9.66 no Wisconsin 10.2 no air fl ow dynamics of vehicles, and Missouri 6.55 no Wyoming 7.07 no District of Columbia 7.89 yes structural engineering models. In contrast, information visualization is based on abstract data that, consequently, do not have an obvious visual representation. Examples Figure 2: Published by the of information visualization include stock Baltimore Sun, market views, sports statistics analysis, this map displays human genome mapping, the architecture the data from Figure 1. of a system, and social network Average electricity graphs. This article focuses on information cost appears in visualization. each state. Yellow circles mark Repeating from above, visualization the states that have is one strategy for making sense of data. adopted electricity restructuring. However, there are other strategies, and visualization is almost always used in conjunction with strategies from information retrieval, data correlation/ The Price Of Power fusion, interaction techniques, and analytic HI MA NY algorithms (e.g., data mining approaches, CT NH RI clustering techniques, and natural language ME AK CA processing). Visualization, per se, is not a VT TX NJ complete solution. A complete solution FL NV includes a way to collect and store data, WI PA Figure 3: IA fast query mechanisms, possibly some MS LA Bar chart showing MI automated analysis, and an interface that CO average residential NM NC electricity cost for gives the human maximum fl exibility OH SC the 50 US states GA in examining the data and automated DE and the District of MN OK Columbia. analysis. AL MT AZ The states marked IL A common combination is to mix VA in yellow have MD DC adopted electricity visualization with interaction techniques AR IN restructuring. to produce interactive visualization. OR KS U.S. States (descending order of average electricity cost) UT Interaction techniques include motions SD WY WA such as mouse clicks, mouse dragging, NE MO TN or fi nger actions on touch screens. The KY ND WV use of direct interactions to control ID D a visualization results in dynamic 0 5 10 15 20 25 Average residential electricity cost visualizations that change as a user directly (cents per kilowatt-hour)

The Next Wave n Vol 17 No 2 n 2008 5 manipulates the visualization. Interactive set as a textual table. Using Figure 1, how of this answer lies in the fact that humans visualization functions as both input and would you proceed to solve your assigned process, even with brief exposure, certain output mechanisms between the human task? Figure 2 shows the same data in a visual cues very quickly. Perception is the and machine. As a subtle, final point in format that was published in the Baltimore visual process by which people initially defining visualization, you can think of Sun, in June 2006, as a geographical map see their world. Perception is not actually a interactive visualization as a database with color coding [3]. Figure 3 shows the single mental system, but rather comprises interface. The interaction techniques data set as a bar chart with color coding. a number of vision subsystems. As a enable “query by direct manipulation,” Which data representation do you want simplification, you can think of perception which is hopefully simpler than writing to use and why… the tabular form, the as a two-stage process. In the first stage, an equivalent database query. This colored map, or the bar chart? visual information is processed rapidly and combination is especially helpful because Your choice of representation likely in parallel to identify basic features of the information overload results not only from gives you an example showing how data external visual frame. In the second stage, increasing volumes of data, but also from representation can make a difference. visual attention, which is largely a function the increasing complexity of the questions Your decision may involve a number of of eye movement, directs processing, and that we want to ask. observations. For instance, you may note items in the external visual frame are that the table is alphabetized by state name, processed sequentially. Why Use Visualization? whereas you might want it to be ordered The full mechanisms involved in human Why might one prefer a visual by a different attribute of the data. You perception and cognition are not fully representation to a textual representation may note that the map provides a familiar understood, but some generally accepted or an analytic algorithm? Answers like “A orientation but that the geographical concepts do exist that help to explain why picture is worth a thousand words” or “The orientation does not directly pertain to visualization may be helpful. In the first eye is the highest bandwidth to the brain” your task. In the end, it is unlikely that you stage of perception, we know that certain do not suffice. These vague, uninformative will assess these three data representations features pop out to a human eye, including statements appear in marketing, for to be equal. You will likely prefer one of 2D spatial position, color, size, orientation, example, but provide no actual explanation the representations because you judge it to and shape. This theoretical mechanism is for an inquisitive reader. Let’s take a more “work better” for your assigned task. called pre-attentive processing, because precise stab at the question using both This specific example is, in fact, the work occurs rapidly (on the order of an inductive example and then a more consistent with cognitive studies of 10 milliseconds per visual item) before detailed answer. how representation affects people’s conscious attention occurs. To illustrate As our example, I will describe a data comprehension and problem solving this effect, consider how the task of finding set and a task that you must complete. There skills. Looking at the cognitive and human the occurrences of the 2s in the numbers are three representations of the data offered factors literature, you can find many below is a tedious prospect: examples where one representation is to you, and you can choose one of the three 712985791828759871983749812723 to help you complete the task. The data better or worse than another for a specific 958719873498719384719983798134958 concern the cost of electricity in the fifty task. In designing an interface, therefore, 639847937558783449867873647658365 US states and the District of Columbia and success occurs when the designer creates 847594060374961837173587834587349 also which states have opted for electricity an effective representation for the intended 560346807740759189638738761475983 restructuring. Your task is to answer this tasks. question: Is there a relationship between So representation can make a 374068340576818734591874598373698 electricity restructuring and the cost of difference in performance, but why would 239898569833749573985791847868579 electricity? Figure 1 shows the full data a visual representation be helpful? The crux 040795806597037918618726981649867 In contrast, when highlighted for pre- attentive processing, the relevant digits ap- pear before you pay conscious attention: 712985791828759871983749812723 Figure 4: An example of 958719873498719384719983798134958 the principle 639847937558783449867873647658365 of proximity. Because of spatial 847594060374961837173587834587349 placement, people 560346807740759189638738761475983 tend to perceive the dots on the 374068340576818734591874598373698 left as columns and the dots on 239898569833749573985791847868579 the right as rows. 040795806597037918618726981649867

6 An Information Visualization Primer and Field Trip Figure 5: Visual Thesaurus display for the word Quiet Figure 6: Inspecting one defi nition of Quiet (Visual Thesaurus)

Image or text from the Visual Thesaurus (http://www.visualthesaurus.com), copyright 1998-2008 Thinkmap, Inc. All rights reserved.

Going beyond the fi rst stage of As fi nal points motivating the value Exploring Relationships: perception, we also know that humans of visualization, here are two comments Visual Thesaurus engage in object perception, which includes about how not to use visualization. Tour Stop: http://www. perceiving that several visual features are Visualization can be especially helpful for visualthesaurus.com/ related and form a group. Building from exploratory analysis; for example, some Visual Thesaurus is an interactive researchers hypothesize that visualization basic visual elements such as color and visualization of word relationships in is appropriate when the user does not stereo depth, people build contours and the English language [5]. The site is a yet know what questions to ask or what boundaries and use these to segment their subscription-based website. However, you to look for. However, for well-defi ned visual world into regions. The Gestalt can test drive the interface on a trial basis. tasks, visualization is not a good substitute principles, as related to perception, describe To see the visualization and interaction if the solution can be automatically how people perceive visual elements as features, enter the word quiet. Quiet is a calculated. Also, not all visualization is organized into patterns. For example, the good example word because it has a rich helpful; visualization can certainly be principle of proximity captures the fact set of associations. (Good and abstract are poorly designed. Chartjunk [4], exotic that, when people see objects that are close also good example words.) 3D layouts, and animations, for example, together, they perceive the objects as one Searching on the word quiet results group (see Figure 4). may be eye candy that seems interesting but that distracts, confuses, or misleads the in a view displaying all of the defi nitions The rationale for visualization is viewer. An appropriate visualization for a of quiet and other words related to those that judicious use of visual cues and task might be a simple, straight-forward defi nitions (see Figure 5). Defi nitions arrangements assists a person in forming chart that lets the data shine through. appear as circles and are colored according and interpreting a mental model (also to their parts of speech (i.e., nouns, known as a cognitive map) of some data. Visualization Tour Of The adjectives, verbs, and adverbs). Words If data are encoded so that useful semantic Internet are connected to their defi nitions by solid attributes are mapped to strong visual Now it is time for some exercises for lines. All of the words connected to a cues, then the resulting representation the reader! To make these visualization single defi nition are synonyms. Antonyms takes advantage of human perception. A defi nitions and descriptions less academic are linked to a defi nition by dashed red visual feature or visual pattern that attracts and more tangible, follow along on this lines. a viewer’s attention, then, corresponds fi eld trip. Each stop in the tour illustrates Defi nitions can also link to other to an interesting fact or trend in the data. visualization and interaction techniques, defi nitions by dashed lines that represent a Effective visualization can support needs right in your own cyber backyard. variety of relationships. Visual Thesaurus such as detecting relationships, detecting Together, the stops were selected to show currently represents 16 different types of patterns (both seeing trends and outliers), you a range of possibilities that open relationships, such as “is similar to,” “see tracking events over time, change data to exploration and that help a viewer also,” and “is a type of.” By hovering the detection/monitoring, and improving understand sometimes subtle relationships mouse pointer over a line, the specifi c situation awareness. in the data. relationship depicted by the line appears.

The Next Wave n Vol 17 No 2 n 2008 7 At the right side of the display, all Animating Scatterplots: providing animated statistics to promote definitions in the current view are listed Gapminder a fact-based understanding of our world. in groups according to the four parts of Tour Stop: http://www.gapminder. To do so, Gapminder uses a scatterplot speech. You can browse the definitions in org/world/ as its visual technique. Its scatterplots are the current view either by hovering your Gapminder World is a site that displays souped up with flexible, interactive options mouse over the visual side or by moving data (e.g., from the World Bank and that include animation. your mouse over the list entries. Moreover, UNICEF) about global development trends, Using drop-down menus, you can the two sides are coordinated; highlighting for example, the changing relationship choose the data for the x-axis and y-axis a definition on one side will highlight the between income levels and child mortality of the scatterplot as well as the data scale corresponding location on the other side [6]. Gapminder World describes its goal as (linear or logarithmic). Using the color and (see Figure 6). The Visual Thesaurus view is oriented around a single focus item (quiet in the case of our running example), which is placed in the center of the view. To choose a new focus, left-click on the word or definition (either in the visual side or list side). For certain words and definitions, right-clicking expands the item without refocusing the view. Using the these options, you can journey through the English language in any direction, using the current display as a starting point. Visual Thesaurus uses the concept of graphs as its visual technique. Here, graph is used, as in computer science and graph theory, to mean a set of relationships between entities. Nodes represent entities, and edges represent the relationships between entities. Renoir is another example Figure 7: Gapminder World view showing relationship between income level and of a graph-based visualization tool. life expectancy for three countries (time period 1975 to 2004) From the standpoint of interface design, Visual Thesaurus successfully tackles two challenges. The first challenge is one of data scale: How does the tool cope when it cannot display all of the user’s data in a single visualization? After all, Visual Thesaurus includes over 140,000 words and 115,000 definitions. The second challenge is one of data fusion: Can the tool provide effective, synthesized views of multiple types of data? Visual Thesaurus addresses data scale by providing navigation through subsets of the full graph of the English language. A user views visually manageable, yet still useful, subsets of data and can easily move to new subsets. Visual Thesaurus addresses data fusion by combining different node and edge types

within its graphs yet uses color and line Figure 8: Gapminder World view showing relationship between income level and styles to visually distinguish the multiple fertility rate for four countries (time period 1975 to 2004) types of data. A user works with all of the data in a unified way. Visualization from Gapminder World, powered by Trendalyzer from www.gapminder.org.

8 An Information Visualization Primer and Field Trip Figure 9: Map of the Market, using the blue/yellow color scheme, Figure 10: Map of the Market, using the red/green color scheme, taken on February 11, 2008 taken on February 27, 2007

size panels, you also choose how the data is Gapminder’s animation trails enable a user Figure 11: colored and sized. By moving your mouse to notice overall changes in the data while Control over the color legend, you can highlight also remembering the specific progression Panel for the SmartMoney data points in the scatterplot. You can of the data of most interest. Map of the select items by clicking on points within Market Grouping Data: Stock Market the scatterplot and by checking items in Map and News Map the selection list. Hovering your mouse over a point in the scatterplot reveals the Tour Stop: http://www.smartmoney. precise information for that point. com/marketmap/ The benefits of the interactive The SmartMoney Map of the Market provides a view of more than 500 stocks features are magnified when combined grouped into sectors (e.g., health care, with the ability to animate the scatterplot. financial, and consumer cyclicals) and Animation displays the data over time and, industries (within health care, the industries for pre-selected points, creates trails that are pharmaceuticals, medical products, SmartMoney.com ©2008. reveal trends over time. As an example, biotechnology, and managed care) [7]. SmartMoney is a joint Figure 7 shows upward trends in life publishing venture of Within an industry, the visual layout DowJones & Co and expectancy and income level in the United is fixed so that neighboring companies Hearst SM Partnership. States and China since 1975, but a very SmartMoney is a have historically had similar stock price registered trademark. different story, reflecting civil war and movements. As a result of this layout, the All rights reserved. genocide, for Rwanda. Figure 8 explores Used with permission user has increased ability to notice larger from SmartMoney. how fertility rates and income levels have market movements, with trends appearing changed for the United States, China, as regions of light and dark. With daily India, and Mexico. performance of all of the stocks displayed price gains and blue areas indicate losses, From an interface design standpoint, in a visualization that fits into a single compared to prices from the previous day’s Gapminder shows how simple visual and screen, the user immediately gets an market close. Dark colors indicate little or interactive techniques are powerful in the overview for the current trading day. no price change. Figure 10, a map with a right mix. Hardly an exotic technique, Figure 9 is a map that captured the red/green color scheme, shows a bearish scatterplots are often helpful for state of the market on a mixed day, with day in which the Dow Jones Industrial representing numerical data. By combining a fair amount of gainers, but some losers, Average fell 3.29 percent, which was the selection and animation, Gapminder particularly in the financial sector. Each biggest market drop seen in the prior four overcomes one difficulty with animation, square represents a single company years. Losses were across all sectors, one namely human memory limitations. and is sized by the company’s market exception being the happy, green company Whereas people might see an interesting capitalization. Each square is colored by in consumer staples. progression while watching an animation, the company’s stock price performance. Figure 11 shows the Map of the they probably will not remember details This example is using a blue/yellow color Market control panel, which includes after the animation has completed. scheme. Yellow areas indicate stocks with the color encoding being used. Using

The Next Wave n Vol 17 No 2 n 2008 9 “Overview, zoom & filter, details-on- demand.” This mantra, conceived by the eminent visualization researcher Ben Shneiderman, captures the idea of initially providing a user with an overview of a dataset. The overview provides a context for full understanding. From the overview and based on the question at hand, the user can navigate and drill down to the interesting details. Spotting Trends Over Time: Baby Name Popularity Tour Stop: http://www. babynamewizard.com/namevoyager/ NameVoyager is a visual interface to the most popular names for babies born in the United States [12][13]. The database comprises the top 1000 boys and girls Figure 12: Newsmap, taken on February 11, 2008, showing world, national, names for each decade since the 1890s. The technology, and entertainment headlines in the United States media overall NameVoyager view encompasses 5000 names. the control panel, you can select the Hierarchical groups are displayed as The data is maintained and available color scheme (blue/yellow or red/green). nested blocks. In the case of the Map of from the Social Security Administration You can choose to augment the map by the Market, the grouping helps a user (SSA). You can visit the SSA website marking companies appearing in current notice sector trends and exceptions to (http://www.ssa.gov/OACT/babynames/) news stories or by highlighting the top five those trends. for more information about the data stock gainers or losers. You can select how Newsmap is another application of collection. Moreover, the SSA website price changes are displayed (e.g., change treemaps (Extra Credit Tour Stop: http:// provides a textual comparison to since yesterday’s closing market price or www.marumushi.com/apps/newsmap/ NameVoyager. The webpage http://www. change over the past year). Finally, you can newsmap.cfm) [10]. Newsmap is a visual ssa.gov/OACT/babynames/decades/ use a search feature to locate individual version of ’s News Aggregator names1990s.html shows the list of popular companies. [11]. The Aggregator allows names for the 1990s. NameVoyager Within a map, you can click on users to search and browse news stories includes this decade along with the rest of companies and headline icons to drill appearing in 4,500 different news sources. the century. down to details. Clicking a company Newsmap presents a summary of the Looking at a NameVoyager view opens a menu that allows you to zoom into headlines (see Figure 12), highlighting (see Figure 13), each strip is a popularity a sector or industry or to access company the stories receiving the most attention (as measure for one name. A strip’s thickness details. Clicking a headline icon presents measured by the number of articles on the is based on the name’s frequency of use the related news article. For example, topic) for the indicated time period. The for that decade; the thickness of a strip in Figure 9, a user can inspect one of Smithsonian Institute (Extra Credit Tour varies over time. The overall height of the the bigger price decreases, American Stop: http://historywired.si.edu/index. graph shows the cumulative use for the International Group, and read the related html) offers a treemap-based exhibit of currently displayed names. Names appear article “AIG Shares Sink As Credit Losses 450 museum artifacts. in alphabetical order, from top to bottom. Mount.” From the standpoint of interface Each name is colored as either a girl’s or Map of the Market uses a visualization design, treemaps maximize the use of boy’s name. technique called a treemap [8][9]. The screen real estate. Compared to a graph Starting from the initial view of all treemap design is a technique for showing layout, which can contain a lot of white, top-ranked names, you can explore the data data in groups, especially for data that unused space, compact treemaps use all of by hovering the mouse, clicking on a strip, forms hierarchical grouping. Data in the pixels to represent data. Treemaps also or doing a prefix search. Mousing over a the same group are displayed together are an example of a design principle within name will highlight its strip and display within a contiguous block on the screen. the information visualization community: its rank for the given decade. Clicking a

10 An Information Visualization Primer and Field Trip name will zoom in on that name. Most stock prices of different companies. For currently has relatively active topic hubs on powerfully, doing a prefix search will stock prices, line graphs would probably food safety issues, crime statistics for the zoom in on a group of names. To conduct be a better visual choice. United States, terrorism, and presidential a search, click to activate the search box. campaigns. Collaborative Visualization: Type a letter and NameVoyager will zoom As a social experiment, Many Eyes Many Eyes into all names beginning with that letter. looks at the question of what happens You can add more letters to your search Tour Stop: http://services. when large numbers of people can access to narrow the focus of the graph. Notice alphaworks.ibm.com/manyeyes/home visualization technology. Many Eyes that, as the graph changes, NameVoyager Launched in 2007, Many Eyes is exemplifies a trend in the information adjusts the scale of the y-axis to maximize a participatory website where visitors visualization community towards giving visibility. can upload data, create interactive people greater access to data and visual Exploring with NameVoyager, you visualizations, and contribute comments analysis tools. Many Eyes also exemplifies can answer specific questions and see [14][15]. By providing a website and the concept of Web 2.0, which is the idea larger trends. After searching for your visualization tools, its developers hope to of using the Internet as the platform name (and your spouse’s, children’s, and spark collaborative, deliberative analysis for information sharing and social parents’ names…), you may ask historical and discussion about data. collaboration. Proponents of Web 2.0 see and cultural questions (e.g., What has A Many Eyes user can view and the World Wide Web as a place to maximize happened to the name Adolph? Are people comment on data and visualizations individual and collective intelligence by naming their children Lexus?). In doing created by others. A user can also load new enabling users to contribute as well as to prefix searches, you may start to notice data and create visualizations based on consume information. group trends. For example, Figure 14 how the data is formatted. Currently, Many Visualization for the Masses shows that names beginning with “I” are Eyes provides roughly fifteen interactive rebounding. Does this pattern hold true for visualization options, including treemaps, The trend towards giving people other vowels? stack graphs, and bubble charts. Figure 15 greater access to data and useful NameVoyager uses a visualization shows a bubble chart of data from the US visualizations is also occurring in the press. technique known as stack (or stacked) Centers for Disease Control and Prevention Both print and web-based publications graphs. Each strip, individually equivalent (CDC) of gastrointestinal illness outbreaks are using visualizations as access to data. to a single line graph, rests on top of a stack on international cruise ships. Each bubble Consequently, readers have a chance to of other line graphs. From the standpoint represents a cruise ship line, and colored assess whether the data support conclusions of interface design, stack graphs were slices within a bubble represent the number presented in the media. designed to visualize change over time of ill passengers for each ship in the cruise Tour Stop: http://www.slate.com/ for groups of non-negative quantities, line. Visualizations on Many Eyes can be id/2180392/ especially if it makes sense to add up the collected into topic hubs to allow people Slate, a daily web-based magazine, groups. As a counterexample, it generally with specific interests to focus attention provides an example of using a social does not make sense to add together the on relevant data. For example, Many Eyes network graph (see Figure 16)[16].

Figure 13: NameVoyager interface highlighting Figure 14: NameVoyager interface showing a resurgence the long-term popularity of the name James of interest in names beginning with the letter “I”

The Next Wave n Vol 17 No 2 n 2008 11 Figure 15: Many Eyes bubble chart titled “Outbreak Summaries for Figure 16: Slate’s summary of the Mitchell Report International Cruise Ships: Jan. 2004 - Aug. 2007”

The interactive graph summarizes the way, Ericson emphasized that the team performance from shortly after home run Mitchell Report about performance- members function as journalists rather 400 to home run 600, Bonds’ rate of home enhancing drugs in professional baseball. than illustrators. runs increases (refl ected by the slope of the The graph condenses the 409-page report Tour Stop: http://www.nytimes.com/ line). This acceleration differs from other into key fi ndings about people and their interactive/2007/10/23/us/20071023_ players, whose home run rates tend to relationships. For a reader who does FIRES_GRAPHIC.html# plateau or decrease when they reach their not also read the full Mitchell Report, thirties. According to claims about steroid Tour Stop: http://www.nytimes. the graph serves as a comprehensible use, Bonds allegedly began taking steroids com/ref/sports/20070731_BONDS_ summary of the material. For a reader who before his 14th season (shortly after his GRAPHIC.html# also reads the full report, the graph helps 400th home run). Final examples from The Tour Stop: http://www.nytimes. the reader organize, remember, and review New York Times focus on the words that com/ref/washington/20070123_ the fi ndings. politicians use in speeches and how their STATEOFUNION.html The New York Times provides many change over time [20][21]. Views Tour Stop: http://www.nytimes. other examples of static, in-print graphics of State of The Union speeches (using com/interactive/2007/12/13/us/ and dynamic, on-line visualizations. Matt bubble charts) and presidential candidate Ericson, deputy graphics editor at The New politics/20071213_DEBATE_GRAPHIC. debates (via a transcript analyzer) enable York Times, recently spoke about how html#transcript readers to fi nd passages relevant to their the information graphics team produces As an example, The New York Times interests (e.g., budget, social security, or visualizations for a broad audience [17]. published a visualization of the wildfi res war). in southern California in October 2007 Using the principle “Show, Don’t Tell,” the References team strives to present raw data clearly so [18]. The online visualization integrated that a reader does not have to comb through data from California state and county [1] Card S, Mackinlay J, & Shneiderman an associated article to understand the emergency management agencies and the B. Readings in Information Visualization: visualization. They pay close attention to United States Forest Service, providing Using Vision to Think. Morgan Kaufmann accurate portrayal, avoiding visualizations a direct understanding of the extent and Publishers; 1999. that tend to cause mistaken conclusions. progression of the fi res. An example of [2] Thomas JJ & Cook K, editors. They augment their visualizations with sports data, a visualization of Barry Bonds Illuminating the Path: The Research and textual descriptions to guide readers to home runs record shows the accumulation Development Agenda for Visual Analytics interesting areas of the data. They provide of home runs for the 21 professional [Internet]. IEEE Computer Society; 2005 more than one view when needed to players with the most home runs in [retrieved 26 Feb 2008]. Book available at: explain a complex or subtle point. In this baseball [19]. Looking at Barry Bonds’ http://nvac.pnl.gov/agenda.stm

12 An Information Visualization Primer and Field Trip [3] Bremner E & Fellenz C. The Price of IEEE Transactions on Visualization and Power. Baltimore Sun. 11 June 2006. Computer Graphics. 2007;13(6):1121- [4] Tufte ER. The Visual Display of 1128. Quantitative Information. Graphics Press; [16] Perer A & Wilson C. The Steroids 1983. Social Network: An Interactive Feature [5] Visual Thesaurus [Internet]. Thinkmap, on the Mitchell Report. Slate [Internet]. Inc; 1998-2008 [retrieved 11 Feb 2008]. updated 21 Dec 2007 [retrieved 14 Feb Available at: www.visualthesaurus.com 2008]. Available at: http://www.slate.com/ id/2180392/ [6] Gapminder [Internet]. Gapminder [17] Ericson M. InfoVis Keynote World [retrieved 11 Feb 2008]. Available Address. Visualizing Data for the Masses: at: http://www.gapminder.org/world Information Graphics at The New York [7] SmartMoney Map of the Market Times. IEEE Transactions on Visualization [Internet]. Dow Jones & Company, Inc. and Computer Graphics. 2007;13(6):xxv. and Hearst SM [retrieved 11 Feb 2008 and Additional information available at http:// 27 Feb 2007]. Available at: http://www. www.ericson.net/, [retrieved 22 Feb 2008] smartmoney.com/marketmap/ [18] Aigner E, Bloch M, Carter S, [8] Johnson B & Shneiderman B. Tree- Cox A, & Ericson M. The Extent of the maps: A Space-Filling Approach to the Fires. The New York Times [Internet]. Visualization of Hierarchal Information 23 Oct 2007 [retrieved 26 Feb 2008]. Structures. In: Proceedings IEEE Infor- Available at: http://www.nytimes.com/ mation Visualization 1991; p. 297-282. interactive/2007/10/23/us/20071023_ [9] Shneiderman B. Treemaps for Space- FIRES_GRAPHIC.html Constrained Visualization of Hierarchies [19] Carter S, Cox A, & Ward J. Paths [Internet]. [retrieved 25 Feb 2008]. to the Top of the Home Run Charts. The Available at: http://www.cs.umd.edu/hcil/ New York Times [Internet]. 08 Aug 2007 treemap-history/ [retrieved 26 Feb 2008]. Available at: http:// [10] Newsmap [Internet]. Weskamp M www.nytimes.com/ref/sports/20070731_ & Albritton D [ retrieved 11 Feb 2008]. BONDS_GRAPHIC.html# Available at: http://www.marumushi.com/ [20] Werschkul B. The 2007 State apps/newsmap/newsmap.cfm of the Union Address. The New [11] Google News Aggregator [Internet]. York Times [Internet]. [retrieved 26 [retrieved 11 Feb 2008]. Available at: Feb 2008]. Available at: http://www. http://news.google.com/ nytimes.com/ref/washington/20070123_ STATEOFUNION.html [12] Baby Name Wizard’s Name Voyager [Internet]. Generation Grownup [21] Carter S, Dance G, Ericson M, LLC [retrieved 11 Feb 2008]. Available Jackson T, Singh V, Ellis J, & Sharpe at: http://www.babynamewizard.com/ M. Democratic Debate: Analyzing namevoyager/. the Details. The New York Times [Internet]. 13 Dec 2007 [retrieved 26 [13] Wattenberg M. Baby Names, Feb 2008]. Available at: http://www. Visualization, and Social Data Analysis. nytimes.com/interactive/2007/12/13/us/ In: Proceedings IEEE Information politics/20071213_DEBATE_GRAPHIC. Visualization 2005; p 1-7. html#transcript [14] Many Eyes [Internet]. Visual Communication Lab, IBM [retrieved 15 Feb 2008]. Available at: http://services. alphaworks.ibm.com/manyeyes/home [15] Viegas F, Wattenberg M, van Ham F, Kriss J, & McKeon M. Many Eyes: A Site for Visualization at Internet Scale.

The Next Wave n Vol 17 No 2 n 2008 13 Visual Analytics: Adventures in Exploring Information Spaces

On a recent adventure in a remote part of the Grand Canyon, I gazed at a panel of rock art depicting the revered pursuit of an ancient hunt. These 2,000 year- old petrogylphs1 seemed to tell the sto- ry of a hunter and the hunted. A man armed with a bow and arrow was track- ing a bighorn sheep. At least, that’s the interpretation that unfolded in my mind. In reality, the meaning and purpose of such drawings is unknown and is often debated by scholars. These early static visualizations chiseled in rock have out- lasted both time and meaning. We do not know the original intention of the ancient artist. Was he telling a story, recording history, paying homage to the gift of food, relaying info to his bud- dies where to find good hunting? It’s a mystery, but clearly, he was recording some important notion for himself and others. We don’t know what he was thinking or intending as he rendered his

14 Visual Analytics masterpiece. A static drawing or visualization with forts for applying ideas of visual analytics ter support human judgment and analysis. limited additional supporting information to intelligence problems. An over-arching The intent was to make the best possible is easily misinterpreted. Its meaning can goal of this article is to inform and inspire use of intelligence information and ap- be difficult to share with others. A good, you to think of your problems, analyses, propriately share it with others to prevent, useful, well-conceived visualization may and information spaces in terms of visual deter, and respond to threats. Therefore, readily show relationships or concepts that analytics. Are you stumped, gazing at your in 2004, DHS chartered and established answer specific questions. However, what own “rock art”? Perhaps, the NVAC pro- NVAC specifically to address the need for if you want to ask a slightly different ques- gram can help you become a full-fledged better analysis through visualization [2]. tion? What if you want to vary one param- “archeologist” in your analytic pursuits. A major objective of NVAC was to eter? What if you want to explore an idea define a long-term research and develop- that was not anticipated by the original Visual Analytics ment agenda for visual analytics to facili- “artist”? You don’t want to gaze at a static Everything old is new again. tate advanced analytical insight. Based in piece of “rock art” and guess an interpreta- There is nothing new under the sun. part on a long, well-established record of laudable research in information visu- tion. You, the curious, delving intelligence Visual Analytics is often touted as a alization, DHS chose Pacific Northwest analyst, want to explore many facets of new emerging discipline that combines ef- National Laboratory (PNNL) to oversee your information space. You want to look forts from a variety of other fields. This is and lead the NVAC effort. NVAC pulled at data from multiple perspectives and test true. However, in many respects, it might together a multi-disciplinary panel of top different theories. You want to examine, also be viewed as the maturing of visual- researchers from diverse fields to develop compare, and contrast multiple relation- ization applied to the complete spectrum this research agenda through a series of ships and patterns. You require a dynamic, of analysis. One could argue that the very workshops. The panel’s resulting research interactive, visual exploration environ- earliest computer graphics such as Ivan was published as a well-received and com- ment to conduct your own adventures in Sutherland’s Sketchpad system were also prehensive 2005 publication titled “Illu- analysis. Providing such a useful analytic early forms of visual analytics [1]. How- minating the Path: The Research and De- exploratorium is exactly the goal of a new ever, in recent years there has been a key velopment Agenda for Visual Analytics” multi-disciplinary field of research known confluence of events: 1) the maturing of (see Figure 1) [3,4,5]. This noteworthy as visual analytics. Visual Analytics fo- dependent sub-fields of research; 2) suf- publication is a key milestone in launching cuses on injecting interactive visualization ficiently powerful, affordable, interac- this new field of visual analytics as well throughout the analysis process. Here vi- tive computers and software; 3) scale: the as providing a long-term research agenda sualization is not an after-thought or used explosion in the sizes and complexity of to achieve its goals. It focuses on the need only for presentation but an integral part of datasets that outstrip traditional analysis for collaborative and cooperative work in analysis. methods; and 4) wide-spread demand for academia, government, and industry to This article is an introduction to the better analysis methods to deal with this improve analysis in Homeland Security new, exciting research area of visual ana- complexity by many people from many di- applications. lytics. We outline the multi-disciplinary verse domains. The combination of these In this publication, the panel carefully nature of this work and point to its roots in new technological capabilities and experi- chose to define visual analytics as “the sci- supporting fields. We then focus on a new ences as well as the pent-up demand for national research initiative in visual ana- better analysis methods seems to have co- lytics known as the National Visualization alesced into a critical nexus. and Analytics Center (NVAC). NVAC is The oft-cited birth of this new push sponsored by the Department of Homeland in modern visual analytics is traced to a Security (DHS) and implements a broad research panel gathered by DHS following research agenda focused on homeland se- the September 11 attacks on the United curity applications in academia, industry, States. The subsequent response to these and government. The National Security horrific events by the US Government Agency (NSA) is an active participant in took many forms. The establishment of these NVAC efforts. We briefly discuss our DHS itself was one such result. One gen- relationship and focus on a current research eral thrust within DHS was to develop challenge problem that NVAC is pursuing new technologies to provide technical and for the Research Directorate’s Knowledge analytical advantages in preventing terror- Discovery Research group. In the process, ism. With the increased complexities of we mention a few examples of in-house ef- analysis, technologies were needed to bet- Figure 1: NVAC report: Illuminating the Path 1 Petroglyphs are images chiseled in rock. Pictographs are images painted on rock.

The Next Wave n Vol 17 No 2 n 2008 15 ence of analytical reasoning facilitated by nteractive visual interfaces.”

Visual Analytics is the science of analytical reasoning facilitated by interactive visual interfaces.

Visual Analytics as a Research Discipline One very important factor that dif- ferentiates visual analytics from its sup- porting fields of information visualization and human-computer interaction (HCI) is its emphasis on analytical reasoning. Tra- ditional visualization focuses on visual representations and visual mappings of data. HCI focuses on effective interaction

in terms of usability and utility. However, Figure 2: The Multi-disciplinary Scope of Visual Analytics how these interactive visualizations best goals of visual analytics is better integra- search centers where NVAC would guide fit into the analytic cognitive workflow has tion and fertilization of these disciplines. (and fund) research and education in visu- often been ignored. The act of using visual- Hence, the NVAC report strongly suggests al analytics. These Regional Visualization ization often takes analysts out of their an- multi-disciplinary education and training and Analytics Centers (RVACs) would be alytic “zone” or cognitive working model. in some of these key areas. One thrust of primary links to academia and research Visual analytics emphasizes analytical rea- NVAC is to educate and train the next gen- and serve as regional centers of excel- soning and attempts to blend visualization eration of researchers and practitioners in lence. Government and industry partners throughout the analytic process without vi- visual analytics. Hence NVAC is working could work directly with RVACs and help olating this cognitive workflow. An entire with academia, industry, and government drive their research problems. An RVAC early section of the NVAC report (chapter to help formulate and plan a visual analyt- can be a single institution or a team of 2) focuses on “The Science of Analytical ics curriculum. institutions. Specific RVACs might focus Reasoning” and discusses visualization’s on particular domains or research areas. role in the analytic sense-making loop. National Visualization and Analytics Center Similarly, Dr. Thomas envisioned Gov- The NVAC report also characterizes NVAC is a multitude of things. It’s ernment Visualization and Analytic Cen- the analytic use of visual analytics in an idea. It’s a DHS research program. ters (GVACs). These government centers four main ways: It’s a government resource, an academic would help drive NVAC analytic problems • Synthesize information and derive in- resource, and an industry resource. It’s a as well as serve as test beds and technolo- sight from massive, dynamic, ambiguous, network of collaborative researchers and gy delivery conduits for NVAC developed and often conflicting data. centers. It’s an education resource, an capabilities and technologies. • Detect the expected and discover the analytic resource, and technical resource. In addition to directing and funding unexpected. In addition to all of these virtual facets, research at these RVACs, NVAC itself • Provide timely, defensible, and under- the NVAC program is also a real, physical conducts research at its PNNL facili- standable assessments. center of excellence. NVAC is located ties. To guide this research, NVAC seeks • Communicate assessment effectively at PNNL, in Richland, WA. When DHS analytic input and challenge problems for action. decided to form NVAC, they chose PNNL from the intelligence community, DHS, Since analysis is very much a multi- to direct and run the NVAC program and other parts of government. To date, disciplinary endeavor, visual analytics is based on PNNL’s previous experience and two government centers (GVACs) have also multi-disciplinary in nature. It draws research in visualization and government been established and actively participate heavily from a wide range of fields includ- analytic applications. Dr. Jim Thomas in the NVAC research program: CIA and ing information visualization, interaction, of PNNL was selected to be director of NSA. The Canadian government has cognitive science, knowledge discovery, NVAC and serve as the nation’s visual also followed the development of NVAC computer science, math, statistics, per- analytics champion. with much interest. Just recently, the De- ception, and data management, as well From the very beginning, Dr. Thomas fence R&D Canada announced their own as others (see Figure 2). One of the main envisioned a network of key academic re- NVAC-like research program, nominally

16 Visual Analytics [3] Bremner E & Fellenz C. The Price of IEEE Transactions on Visualization and Power. Baltimore Sun. 11 June 2006. Computer Graphics. 2007;13(6):1121- [4] Tufte ER. The Visual Display of 1128. Quantitative Information. Graphics Press; [16] Perer A & Wilson C. The Steroids 1983. Social Network: An Interactive Feature [5] Visual Thesaurus [Internet]. Thinkmap, on the Mitchell Report. Slate [Internet]. Inc; 1998-2008 [retrieved 11 Feb 2008]. updated 21 Dec 2007 [retrieved 14 Feb Available at: www.visualthesaurus.com 2008]. Available at: http://www.slate.com/ id/2180392/ [6] Gapminder [Internet]. Gapminder [17] Ericson M. InfoVis Keynote World [retrieved 11 Feb 2008]. Available Address. Visualizing Data for the Masses: at: http://www.gapminder.org/world Information Graphics at The New York [7] SmartMoney Map of the Market Times. IEEE Transactions on Visualization [Internet]. Dow Jones & Company, Inc. and Computer Graphics. 2007;13(6):xxv. and Hearst SM [retrieved 11 Feb 2008 and Additional information available at http:// 27 Feb 2007]. Available at: http://www. www.ericson.net/, [retrieved 22 Feb 2008] smartmoney.com/marketmap/ [18] Aigner E, Bloch M, Carter S, [8] Johnson B & Shneiderman B. Tree- Cox A, & Ericson M. The Extent of the maps: A Space-Filling Approach to the Fires. The New York Times [Internet]. Visualization of Hierarchal Information 23 Oct 2007 [retrieved 26 Feb 2008]. Structures. In: Proceedings IEEE Infor- Available at: http://www.nytimes.com/ mation Visualization 1991; p. 297-282. interactive/2007/10/23/us/20071023_ [9] Shneiderman B. Treemaps for Space- FIRES_GRAPHIC.html Constrained Visualization of Hierarchies [19] Carter S, Cox A, & Ward J. Paths [Internet]. [retrieved 25 Feb 2008]. to the Top of the Home Run Charts. The Available at: http://www.cs.umd.edu/hcil/ New York Times [Internet]. 08 Aug 2007 treemap-history/ [retrieved 26 Feb 2008]. Available at: http:// [10] Newsmap [Internet]. Weskamp M www.nytimes.com/ref/sports/20070731_ & Albritton D [ retrieved 11 Feb 2008]. BONDS_GRAPHIC.html# Available at: http://www.marumushi.com/ [20] Werschkul B. The 2007 State apps/newsmap/newsmap.cfm of the Union Address. The New [11] Google News Aggregator [Internet]. York Times [Internet]. [retrieved 26 [retrieved 11 Feb 2008]. Available at: Feb 2008]. Available at: http://www. http://news.google.com/ nytimes.com/ref/washington/20070123_ STATEOFUNION.html [12] Baby Name Wizard’s Name Voyager [Internet]. Generation Grownup [21] Carter S, Dance G, Ericson M, LLC [retrieved 11 Feb 2008]. Available Jackson T, Singh V, Ellis J, & Sharpe at: http://www.babynamewizard.com/ M. Democratic Debate: Analyzing namevoyager/. the Details. The New York Times [Internet]. 13 Dec 2007 [retrieved 26 [13] Wattenberg M. Baby Names, Feb 2008]. Available at: http://www. Visualization, and Social Data Analysis. nytimes.com/interactive/2007/12/13/us/ In: Proceedings IEEE Information politics/20071213_DEBATE_GRAPHIC. Visualization 2005; p 1-7. html#transcript [14] Many Eyes [Internet]. Visual Communication Lab, IBM [retrieved 15 Feb 2008]. Available at: http://services. alphaworks.ibm.com/manyeyes/home [15] Viegas F, Wattenberg M, van Ham F, Kriss J, & McKeon M. Many Eyes: A Site for Visualization at Internet Scale.

The Next Wave n Vol 17 No 2 n 2008 13 size is an order of magnitude improve- ment beyond most cited and recognized capabilities. This has truly been a ground- breaking effort. For the coming year, we are continuing this work with new empha- sis on using analytics to guide and abstract visual representation and layout during navigation and interaction. We consider this initial prototype to be an interesting base capability for further testing and ex- ploring a variety of interesting analytics. In summary, we have found participating with NVAC to be very beneficial in sup- porting our programs in visual analytics.

Conclusion This article has been a very brief over- view and introduction to the new multi- disciplinary field of visual analytics. A key Figure 3: Interactive Visualization and Navigation of Large-Scale Graphs focus of visual analytics is the science of analytical reasoning by interactive visual interfaces. This requires tight integration of analytics, visualization, and interaction to provide a rich cognitive exploratorium. NVAC is leading a national research initia- tive for further research and education in this field with the goal of advancing analy- sis for homeland security applications. NSA is actively participating in NVAC and is currently funding a challenge problem to advance the state of large-scale graph analysis for analytic applications. We wel- come further discussion and participation in our efforts. We also encourage you to learn more about NVAC and participate as appropriate. Visual Analytics is intended to be an active, engaging exploratory process of discovery. Interactive probing of rela- FigurFigure 4: MuMultipleltiple LeLevelvelso fD etaetailil and AbAbstractionstraction in LaLarrgge-Scalee Scale GrGraphsaphs tionships and testing of hypotheses is fa- filter and details-on-demand paradigm is abstract parts of the graph. An overview cilitated by visual feedback. Reflecting on much more effective for understanding window is used to maintain overall context my amateur interpretation of static Grand large datasets than a bottom-up query and of the entire dataset. Figures 3 and 4 are Canyon rock art, it is clear that meaningful display paradigm; 3) Interactive visual dis- examples and early results of this NVAC analysis or interpretation is difficult with- covery methods are necessary for finding research challenge problem using a syn- out further context or supporting informa- the “unknown unknowns”; 2 4) Size and thetic dataset. Our synthetic test dataset tion. These 2,000-year-old static visual- complexity of information spaces have consists of a co-author network of paper izations are frozen in time with obscure outpaced the limits of many visualization publications in which a node represents an meaning. Hopefully, as you explore your capabilities. author and a link represents a co-authored own information spaces, you will have In this work, progressive disclosure paper. It should be noted that interactive sufficient analytic tools and a supportive and multiple levels of detail are used to display and navigation of graphs of this environment to conduct a full “archeologi-

2 Finding “unknown unknowns” - identifying the gaps or missing information in an analytic information space, realizing what you don’t know

18 Visual Analytics cal excavation.” The emerging field of vi- References and Resources sual analytics might provide you with the [1] Sutherland IE. Sketchpad: A Man-Ma- proper analytic environment to help you chine Graphical Communication System. turn your static “rock art” into a lucrative Proceedings of the Spring Joint Computer mine of gold. Conference. Baltimore (MD): Spartan Books, 1963. [2] National Visualization and Analytics Center (NVAC): http:// nvac.pnl.gov [3] Thomas JJ & Cook K, editors. Illu- minating the Path: The Research and De- velopment Agenda for Visual Analytics [Print]. IEEE Computer Society; 2005. ISBN 0-7695-2323-4. [4] Thomas JJ & Cook K, editors. Illu- minating the Path: The Research and De- velopment Agenda for Visual Analytics [Internet]. IEEE Computer Society; 2005. Book available from http://nvac.pnl.gov/ agenda.stm [5] Thomas JJ & Cook K, editors. Illumi- nating the Path: The Research and Devel- opment Agenda for Visual Analytics [In- ternet movie]. Available from: nvac.pnl. gov/agenda.stm#movie [6] Regional Visualization and Analytics Centers (RVACs): nvac.pnl.gov/centers. stm [7] North-East Visualization and Analytics Center (NEVAC): www.geovista.psu.edu/ NEVAC/ [8] Purdue Visualization and Analytics Center (PURVAC): engineering.purdue. edu/PURVAC/ [9] Stanford University RVAC: graphics. stanford.edu/~hanrahan [10] South-East Regional Visualization and Analytics Center (SRVAC): srvac. uncc.edu [11] Pacific Rim Visualization and Analyt- ics Center (PARVAC): www hitl.washing- ton.edu/parvac/ [12] IEEE VAST Challenge: www.cs.umd. edu/hcil/VASTchallenge08/ [13] Information Visualization Bench- marks Repository: www.cs.umd.edu/hcil/ infovisRepository [14] Visual Analytics Digital Library: vadl.cc.gatech.edu

The Next Wave n Vol 17 No 2 n 2008 19 Do You Know What’s Happening — Just Beyond Your View?

Analysts are always looking for more to provide new tools for presenting those to return a data subset that becomes the ba- information to broaden their understand- data to better support the analytic process. sis for the day’s work. Within this list, the ing of a subject in order to provide the best As analysts find growing volumes of data analyst is expected to work through each possible assessment. They want to know at their disposal, software developers are of the items, select which data to work what is not yet known, to see beyond their retuning display perspectives to make the (typically done in a sequential fashion), current view, and it is this desire that drives information available faster and more in and then conduct the appropriate actions. the collection of more and more data. This synch with the requirements of intelligence But does this Filter-Select-Analyze (FSA) need to understand what is outside the cur- analysis. In this article, we describe how model accurately represent and complete- rent view is also important when design- longshot displays are a critical element ly support the cognitive work that needs ing representations that support analysts to complement the current trend toward to be accomplished by an intelligence ana- who must work through these resultant powerful but highly focused visualization lyst? What are the hidden assumptions large data sets. techniques. Longshot displays explicitly contained within an FSA model? In a world where technology en- support the analyst’s ability to understand The first assumption the FSA model ables the processing and analysis of a data in its larger context, focus and refocus makes is that the intelligence analysts can huge amount of electronic data, it falls to on individual elements of that data set for accurately construct a query that will ap- the intelligence analyst to make sense of detailed inspection in traditional visual- propriately filter the database down into a it all. One area in which technology has izations, and prevent the “keyhole effects” meaningful subset of data that is ready for attempted to support intelligence analy- that contribute to “Premature Closure” of analysis. When given only the selected re- sis is through data visualizations and in- analytic conclusions [1]. sults from a query, the analyst is left won- formation displays. Visualizations, in A technology common in many ana- dering—what data did I miss and why? which data are displayed in recognizable lytic tools is the ability to query a database After all, analysis is an iterative narrow- graphic elements, are increasingly moving Longshot displays are used as a visualization technique to into mainstream applications as a remedy maintain information awareness or context in a domain display for information overload. Just as tech- with large amounts of data. The name “Longshot Display” is nological advancements have improved used to describe a variant of any number of displays that range data collection, attempts have been made from a simple scatterplot to more complex applications.

20 Do You Know What’s Happening ing and broadening process. In addition all data related to a particular task to an Criteria for Longshot Displays to narrowing functions, the support tools analyst simultaneously in a single global In order for a display to be analysts use should cater to broadening context, providing the proverbial 40-thou- considered a longshot display it must pass several criteria: searches to help recognize relevant infor- sand foot view from which the entire data 1) The display must contain all data mation, break fixations on single or brittle set can be seen and understood. related to the user’s defined hypotheses, and/or widen the hypothesis cognitive work. set that is considered to explain the avail- Characteristics of Longshot 2) The data must be placed in a able data [1]. Displays meaningful frame of reference. A second common assumption of vi- Showing all the data in the data set is 3) The display must provide sualization techniques is that if the under- a necessary but not sufficient requirement mechanisms to support user lying data (e.g., the results from a query) to create a longshot display. An effective focusing and refocusing. can be represented, then the visualization longshot display will also have frames of 4) The display must be coupled with a details view, in which users can has done all that it needs to do. The data reference (the variables used on the infor- focus on selected data. can be selected from the visualization, mation space axes) that support the infor- therefore it is a good visualization, and mation relationship requirements under- `represented on the display, the user is also the work to understand the data falls on lying the user’s relevant cognitive work. able to see and determine the value of an the user. However, simply assuming ap- For example, a map of the world with individual datum. The analyst is given the plication programming interface (API) data overlaid on it will show all data on it best of both worlds. data compatibility between data and pre- (since the world will contain all the data), Text lists and spreadsheets do not sup- sentation layers does not ensure that the but may not qualify as a longshot display port (and can often undermine) this abil- analysts are effectively supported. What if the analyst does not need the data in a ity to identify patterns and force analysts is necessary is a set of visualizations and spatial frame of reference provided by the to think about data one element at a time. technology that supports all of the cogni- map to complete an analysis on the data. Such serial processing is constrained by tive work the intelligence analyst must ac- Establishing the proper frames of ref- the limits of human memory and requires complish [2]. Effective support for intelli- erence for the display allows analysts to significant cognitive processing to make gence analysts requires that visualizations detect patterns within the data set. In such the connections hidden by such “flattened” provide mechanisms for supporting a large emergent pattern displays, although each representations of the data. By presenting number of cognitively difficult tasks, in- datum is represented, the data as a whole the data to take advantage of the ability to cluding (but certainly not limited to): accumulate into larger coherent structures, see patterns, the representation designer • pulling records and running database providing meaning to the data set as a searches, alters how analysts process data, shifting • understanding and managing the results of whole. the work from effort intensive cognitive the data search, This design approach is grounded in processing into the faster, unconscious, • staying on top of new developments while doing research and analysis, the fundamental competency of the hu- less error prone processing of the visual • determining which datum out of the set to man perceptual system. Research on hu- system. This is equivalent to a computer, select for analysis, man perceptual capabilities has repeatedly in which graphics processing is faster and • completing analysis of selected data, and shown that our perceptual systems exhibit • determining what kind of information is more efficient when visual information is helpful and what is not. the tendency to seek out patterns in the stored and processed on specialized video While no one component technology world, requiring little cognitive effort to hardware instead of using the general pur- can facilitate all of the necessary cogni- see and understand these patterns. These pose CPU. tive tasks, the suite of support tools avail- pattern recognition capabilities are so But how does a representation de- able to the analyst must facilitate the syn- prevalent that our perceptual systems may signer determine what the proper frames chronized accomplishment of these tasks. even create false patterns when objects of reference should be for the longshot Most visualizations focus only on support- have no relationship to one another (e.g., display? The key to answering this ques- ing analysts to run database searches and consider how easy it is to spot animals or tion lies in understanding the work the perform analysis on the returned data, yet other objects in cloud formations1). user is trying to accomplish. The cognitive they fail to support the other forms of cog- When displays are designed to take work of the analyst will drive the infor- nitive work. Coupling these detail-focused advantage of our visual system’s powerful mation relationships needed to complete displays with an information space, long- pattern recognition processing, the ana- that work, which will guide the frames of shot displays will provide support for these lyst doesn’t need to remember the value reference to be used. For example, if an other forms of cognitive work, enabling of each datum; the pattern reveals the data analyst’s report is due at a certain time, the analysts to more effectively work with the set (see: Longshot Example 1). At the designer may consider using time until the overall data set. Longshot displays present same time, as each individual data value is report is due as a key frame of reference in

1 Of course, designers must also be careful about creating false patterns with their data representations.

The Next Wave n Vol 17 No 2 n 2008 21 Longshot Example 1: Intelligence Questions Display theth visualization. In doing so, it is clearly An Intelligence Question is a request for information that can be directly visiblevi which reports are coming due and answered based on collected evidence in the world. These are de-aggregations wwhich are not, which will help the analyst of Intelligence Needs, which ask higher level, operational questions. dedecide which report to work on next. NSA is tracking a large number of However, the reader is cautioned from Intelligence Questions at a time (the thinking that the same frames of reference number is large enough that a precise th count is not known). Managing that many eaeasily and arbitrarily generalize across Intelligence Questions (or any entity, for that matter) is a daunting challenge, but mmany visualizations; the most important managing these Intelligence Questions is essential to operational success. ininformation required to complete the cog- Therefore, the NSA’s Laboratory for Creative Development (LCD) devised ninitive work should always determine the a conceptual solution that utilized a apappropriate frames of reference [2]. Re- Longshot display. LCD identified the key frames of tuturning to the preceeding example, the im- reference for monitoring the set of pportanceo of the analyst report (i.e., priority) Intelligence Questions: the priority of each Intelligence Question, whether anand number of previous reports generated reports have been generated in the past on each Intelligence Question, and how on that topic may be more important than long each Intelligence Question has gone without a report. The LCD believed that, while more information existed for each Intelligence Question, these three parameters would allow analysts to decide where in the ththe report’s due date, and the representa- set of Intelligence Questions to focus. ttionio designer may choose to use these as Those Intelligence Questions that are higher priority are higher on the y-axis of a scatter-plot. On the left side of the y-axis are those Intelligence Questions that are new and have never been reported on, those to ththe frames of reference. There is no ge- the right have been reported on in the past, but require continued scrutiny. Intelligence Questions that are older are further out on the x-axis (making the display reflective along the y-axis). Given that many Intelligence neneric way to generate the proper frames of Questions may have the same set value for each parameter, the size of a mark in the visualizations indicates rereference—the designer must understand how many Intelligence Questions share those parameters. The display clearly reveals to the analyst high priority Intelligence Questions that have gone a long time ththe cognitive work of the analyst and de- without an assessment (upper corners of the display), as well as low priority Intelligence Questions that have tetermine the relative importance of all the been assessed recently (lower-center of the display. It is easy to see when low priority work is being done over high priority work (although why is left for the details display—an assessment could have been made on ininformation required to support that work. a low priority Intelligence Question because the information required was easily available, while not data were available for the high priority Intelligence Questions). The visualization guides analysts to where to focus next, Techniques exist to support the extraction allowing them to better manage the set of thousands. anand distillation of this cognitive work. Within the field of Cognitive Systems En- ggineering,i the application of a Cognitive Longshot Example 2: Performance View Work Analysis has traditionally been the Performance View is a replacement tetechnique by which our understanding of visualization design that supports workers monitoring the health and flow of a wwork domain demands is turned into rep- processing network. The previous version of the system evolved as an aggregation of reresentation design requirements [3]. individual sensor display elements, reaching a real estate constraint that at its upper Establishing the proper frames of ref- limit allowed the user to monitor around 200-250 sensors at once, with the usual eerencer for the data is important to place the detail displays that provide specific localized ddata in its proper context, but a longshot information. Each sensor monitored a node in the processing network, although the ddisplay also must allow an analyst to focus main display, using a group alarm approach (a type of statistical summary), provided anand refocus on the data of interest. The un- very little information other than the most serious condition found in the underlying dderstanding of the data set as a whole must detailed data. However, no information was provided about what was causing errors, bbe offset with the ability to understand or even where in the subset errors were aann individual datum within the set. This occurring. The users were having a difficult time understanding how the network was ppresents the user with two perspectives on behaving. ththe data that they must handle, labeled by Despite the paucity of information provided by the group alarm technique, the main display was nearing its upper limit on the number of sensor monitors that could be placed on the screen at one time. Still, users Tufte as the Micro/Macro approach [4], in foresaw the need to monitor many more sensors at one time (into the thousands), and realized that all forms of scaling the current display design would be insufficient. The users were already struggling to understand wwhich the representation of each datum al- the network, and adding more sensors to monitor would have exacerbated the problem. lolows the data to group into a larger whole. The large information set demanded that the display allow users to refocus their attention to important events in the network, which is best achieved in a longshot display. The new display allows operators to The concept of combining detailed, monitor tens of thousands of data points at one time, in real time, by taking advantage of Micro/Macro design principles [4]. ffocusedo central processing within the The data reported on each system is placed into a single 2 high row. There are four main components cocontext of a broader surround is so funda- to the network: the inboxes that receive data, the software that processes data, the outboxes that send data, and the overall hard drive of the system. In general, inboxes or outboxes that are full, overworked/down mmental it forms the basis of our own visual processes, and full hard drives are problematic. Data that are normal are less important (although must be shown). Within the display, as inboxes and outboxes grow fuller, the data mark representing their value pprocessing in the eye. Because the world moves towards the outside of each row (breaking away from the rest of the data) and become brighter on ppresents far more data than we can analyze the screen. Similarly, as processes and hard drives moved from nominal, they became more prevalent on the screen (changing color and becoming brighter). The user is quick to see where problems are occurring, in detail, our visual system is organized since they almost leap off of the display, even though each mark is only 2 pixels wide by 2 pixels high. When multiple problems are occurring at the same time, the user is able to know where to focus, since the type ininto a central region with very high acuity and magnitude of each problem is known. Once a user decides to focus on a particular network issue, Performance View allows the user to quickly access a detailed display for the system of interest. aandn a surround region that is processed at lolower resolution. Our visual field is ap- 22 Do You KnKnowow WhaWhat’t’s HaHappeningppening proximately 180 deg, but only the central shshould be focusing their efforts. Analysts 2 deg is processed in the highest level of ccan more effectively manage their time detail. This center-surround arrangement aandn effort working with the data, ensuring enables humans to be aware of significant ththe most important work is getting done at events (e.g., a fast moving object) over a aall times. very large field of view, and then move our This ability to focus and refocus is a eyes to focus the high acuity processing on kkey difference between longshot displays the region of interest. If our eyes only pos- aandn displays that allow analysts to com- sessed the relatively small 2 deg region of ppletely zoom out on a data set (design- detailed viewing, we would never know eersr often build this capability into maps). when potentially important events had WWhile all the data are visible on the screen, happened outside this area of focus. Simi- ththe zoomed out view simply jams the data larly, visualization systems that contain ininto the window, making most of the data only detail views with no corresponding inincomprehensible. Such displays leave longshot display do not sufficiently sup- aanalystsn unable to determine where to fo- port analyst refocusing. cucus, requiring them to zoom in and scan This ability for an analyst to under- aaroundr to determine the appropriate region stand both the whole and the individual is a oof focus, thus defeating the purpose of the Figure 1: Without a longshot display, systems key benefit provided by longshot displays. force users to view their data through a keyhole. zzoomed-out view. While analysts can only work on a single Longshot displays provide one way of escaping the In the same vein, statistical summa- Keyhole Effect by placing data in the necessary datum (or a few data) at a time, the dy- context so that users can determine where in the ries of a data set do not sufficiently allow a namic nature of world events and analytic data set to focus (photo courtesy of Flip Phillips). user to refocus in the data set. While each tasking means that the appropriate focus events, have increased difficulty in navi- data value is taken into account in a statis- may be constantly changing. Important gating novel data environments, and gen- tical summary, the actual data values are new events are constantly occurring (or at erate incoherent models of the explored hidden from the user—producing a differ- least may be occurring), and visualizations data space [5]. ent type of keyhole. While the statistical must support the analyst’s need to refocus This Keyhole Effect is instantiated in summary can allow judgments on the data even as they are engaged in detailed con- many different ways, the most common— set as a whole, it prevents data judgments sideration of a narrow part of the data set. though not only—being the scrolled data at the individual level. Each datum has a In showing the entire data set, longshot list (or any table, spreadsheet, etc. that in- value unto itself, and this value needs to displays enable analysts to refocus when cludes a scroll bar to see the entire set). Dis- be known, especially since analysts do events change in the world, to hone in on plays that show only a focused, key-holed not work on the data set as a whole—they the specific data that should be worked on view of data give analysts great detail into work on an individual datum. It becomes given the current situation (see: Longshot a select few data while giving the analyst extremely difficult to work on a single da- Example 2). When events change, the ana- no insights into the remainder of the data tum when it is rolled up into a summary, lyst is easily able to see what aspect of the set. This display makes it virtually impos- and it certainly will not allow the analyst data set is the most important and decide sible for analysts to redirect their attention to adjust focus when necessary. whether or not to redirect the focus based when an important event happens outside Longshot displays, on the other hand, on the changing situation. This ability to their focused view. In these visualizations, make it possible to work with each indi- redirect attention is vital in dynamic, time- the representation designer is requiring vidual datum as a true, independent entity. critical situations. the analyst to look at the right place at the Additionally, longshot displays make it Supporting an analyst’s ability to fo- right time to see when changes occur. The possible to understand the impact of outli- cus and refocus within the data set helps larger the data set, the more unlikely that ers on the data set. As a small subset of overcome what has been labeled the Key- the analyst will ever be fortunate enough the data set, outliers are not able to greatly hole Effect, which occurs when people to have their detail view pointed at exactly impact the status of the overall data set. are only provided a limited view from the the right place to see the onset of a signifi- Yet, these outliers are often the interesting much larger information space of relevant cant event. data to analysts, as they represent where information2. The Keyhole Effect hobbles However, longshot displays, with all important events are occurring. With long- a viewer’s peripheral awareness by nar- the data presented simultaneously, make shot displays, analysts are able to focus on rowing both perceptual and attentional it easier for analysts to see where they both the typical data and the atypical data, mechanisms, leading analysts to miss new depending on their current interests, mak-

2 The Keyhole Effect refers to the ability of a person to understand a room based on what they can see by looking through a keyhole in a door. A person can focus on a small part of the room, and can even scan around the room, but can never truly understand what is going on in the room through this keyhole view. In the same way, analysts will not be able to understand their entire data set while they are narrowly focused on only one small portion of it.

The Next Wave n Vol 17 No 2 n 20020088 2323 ing selection of data to attend to easier. representation designer can highlight the of salience are graphical elements that This need to refocus also has impli- most relevant data and reduce the vis- organize and provide structure for the cations for the use of filters on the data ibility of the less relevant data, making it overall display (e.g., Plot labels). At the set. When large data sets are collected, easier for the analyst to focus on what is next level are so-called static data or refer- data filters are used to reduce the data set important while still preserving the ability ence marks. These are important values available to analysts by eliminating data to see the larger context. Representation for the system, but they are known in ad- that are deemed irrelevant. Filters can be a designers can control the color, brightness, vance and do not change in the course of very useful method for supporting analysis size, and position of each datum (to name normal operations. The last three levels of extremely large data sets, but they can a few options) in order to give analysts in- of salience are reserved for dynamic data also be misused and ultimately create ad- sights into how important each datum may elements: (1) routine dynamic data should ditional work for analysts. be. Regardless of how important each da- be the least salient of the three levels, with System designers use filters based on tum seems, all data make it to the screen, more salience assigned to (2) urgent dy- the premise that it is possible to know a and analysts can still return to those data namic data, and finally, (3) abnormalities priori what data will be important and what if they want to broaden their focus. This or alarms should be the most salient items data will be irrelevant. When this distinc- gives analysts greater control over their on the display. tion can be made, it becomes easy to sepa- data set and can allow them to see discon- Salience in the display is a statement rate what seems to be good data from what nects between what a pre-programmed of design intent which results from an as- seems to be bad data. Unfortunately, filters relevancy generator has deemed to be ir- sessment of the relative importance of the are often designed to completely eliminate relevant for the task and what the analyst different information required to support those data that are deemed irrelevant. This actually thinks is irrelevant. the relevant cognitive work (e.g., abnor- is the ultimate form of a keyhole. Not only A useful technique for achieving some malities or unexpected events should be are analysts given only a partial view of of the information highlighting described made most salient on the display in order the data set at a time (since once data are above involves utilizing the natural per- to immediately draw the attention of the ceptual way in which our visual system operator). However, perceived salience of filtered, they are often placed into stan- processes different types of visible features the visual display is based on the pre-at- dard representation keyholes), but there to influence the perceived salience of par- tentive perceptual mechanisms of the hu- are large portions of the data set that are ticular graphical elements of the display. man visual system, and results from spatial left completely irretrievable. This com- A salience map (see Figure 2) enables a and/or temporal contrasts of the individual plete removal of data causes a couple of representation designer to make a specific display elements [6]. Thus, in order to ef- complications for analysis. First, analysts rank ordering of the various elements that fectively direct the analyst’s attention, cre- have no idea what data have been removed will constitute the final display. The over- ate relevant contrasts to highlight informa- by the filter. While the distinction between all display space can be structured based tion, and group data into meaningful com- important and irrelevant data may be clear on decisions about the relative importance binations, representation designers must some of the time, just as often the differ- of information re- Figure 2: The use of a Salience Map to structure information in a visual display. ence between important and irrelevant is lated to the cogni- Controlling salience allows a representation designer to effectively direct attention very murky. This is especially true in in- and highlight information based on the cognitive work the display must support. tive work the display telligence collection, in which obfusca- must support. tion can be a key tactic of an adversary. The appropriate Unfortunately, since the filter removes the number of levels in data, analysts will have no way of know- a salience map de- ing what they are not seeing—the data pends on the com- filter prevents users from ever refocusing plexity of the domain on these data. These data can never be ex- under consideration amined more closely. While the majority and the nature of the of the data may indeed be irrelevant, the graphical elements select few that are relevant may be most in the display de- valuable to the analysts. sign. In the example Since longshot displays are able to salience map shown represent large data sets to the analyst, in Figure 2, we show there is less of a need to completely re- five salience levels. move data that does not pass the filters At the lowest level limits3. Instead of removing the data, the

3 In some cases, filters are unavoidable. Even when some data must be removed from the data set altogether, this does not free the designer from the responsibility of using a longshot display for the remaining data set.

24 Do You Know What’s Happening understand and utilize characteristics of text, and allowing the analysts to focus on our pre-attentive perceptual mechanisms. the portion of the data as desired. As the Finally, the ability to focus and refo- pattern changes, so too can the analyst’s cus within the data set provides little value focus. This can enable better navigation if the analyst is unable to work on a partic- of the information space, allowing ana- ular datum. Therefore, in order to satisfy lysts to work on the most important data. the final criteria, the longshot display must With tightly coupled longshot and detail also be coupled with a detail display that displays, analysts are able to understand allows users to work on the focused data. and manage the entire data set, and still Longshot displays are important, but they generate an analysis on the individual da- cannot exist in isolation, as they will sup- tum. By adding a longshot display to the port some of the user’s cognitive work, but detail display, analysts’ work will never be cannot support all of it. The two data rep- limited by what may be just outside their resentations (longshot and detail displays) focused view—everything is in view. provide complementary perspectives on the same data and will be of great ben- efit when combined into a single tool for analysts. The longshot display provides analysts with the ability to see the entire information space to determine where to focus, while the detail display allows the analyst to focus in on the area selected. Combining the longshot and detail views References and Selected helps analysts complete the difficult cog- Reading nitive work they face, and allows analysts [1] Patterson ES, Roth EM, & Woods DD. to quickly go from focusing on the data set Predicting Vulnerabilities in Computer- as a whole to focusing on a single datum to Supported Inferential Analysis Under assess and then back again as needed. Data Overload. Cognition, Technology &

Work. 2001;(3):224-237. Supporting a New Analytic Approach for Working [2] Norman DA. Things That Make Us through Large Data Sets Smart: Defending Human Attributes in Longshot displays are critically the Age of the Machine. Reading (MA): necessary complements to existing detailed Addison-Wesley; 1993. views in order to add the robustness of [3] Elm WC, Potter SS, Gualtieri JW, Roth directing/redirecting attention to the EM, & Easter JR. Applied Cognitive Work already powerful ‘drill down and inspect’ Analysis: A Pragmatic Methodology for capabilities provided by traditional detail Designing Revolutionary Cognitive Affor- displays. They support additional cognitive dances. In: Hollnagel E, editor. Handbook work that cannot be supported solely by for Cognitive Task Design. London(UK): detail displays (e.g., refocusing when Lawrence Erlbaum Associates, Inc.; 2003. events change and selecting the proper [4] Tufte ER. Envisioning Information. data for analysis given the situation), Cheshire (CT): Graphics Press; 1990. allowing analysts to alter their approach [5] Woods DD, Watts JC. How not to have to large data sets. Rather than using the to navigate through too many displays. In: FSA model, analysts can shift to the richer Helander MG, Landauer TK, Prabhu P, and more resilient Organize-(Re)Focus- editors. Handbook of Human–Computer Analyze (OFA) model of intelligence analysis. Interaction. 2nd ed. Amsterdam (NL): El- sevier Science; 1997. p. 617–650. By choosing the proper frames of ref- erence for the data, the longshot display [6] Wolfe JM. Visual Search. In: Pashler utilizes the human perceptual system’s H, editor. Attention. London (UK): Uni- tendency to detect patterns in the world, versity College Press; 1998. organizing the data in a meaningful con-

The Next Wave n Vol 17 No 2 n 2008 25 Knowledge Visualization with Concept Mapping Tool

Mission-critical knowledge resides in the memories of experts and in a relatively small subset of external resources, often widely distributed, such as technical re- ports, manuals, books, online material, etc. In order to make such knowledge explicit and sharable, it is necessary to have an effective elicitation and capture methodology with a powerful and useful representation scheme. There are many methods and tools that can perform one or more of these require- ments. Through research, experimentation, and observation, achieving many of the requirements led to the discovery of a simple yet powerful representation scheme (concept mapping) based on Ausubel’s Assimilation Theory of meaningful learning [1]. The representation scheme described by Joe Novak is based on Ausubel’s Theory, which is different from other methodology schemes such as mind maps, knowledge maps, conceptual structures [2], and the like, in that it conveys rich semantic con- tent and also provides an organizing factor for other resources and media that are added to the concept map knowledge model. Background though sometimes multiple words or sym- domains of the concept map. Cross-links bols such as + or % are used. Propositions Concept maps were developed in 1972 help one see how a concept in one domain are statements about some object or event in the course of Novak’s research program of knowledge represented on the map is in the universe, either naturally occurring at Cornell, where he sought to follow and related to a concept in another domain or constructed [7]. Propositions contain understand changes in children’s knowl- shown on the map. In the creation of new two or more concepts connected using link- knowledge, cross-links often represent edge of science [3]. During the course ing words or phrases to form a meaningful creative leaps on the part of the knowledge of this study, the researchers interviewed statement. Sometimes these are called se- producer. many children and found it difficult to mantic units, or units of meaning. identify specific changes in the children’s There are many more innovative char- understanding of science concepts by ex- Characteristics of acteristics to concept maps, for example, amination of interview transcripts. This Concept Maps in the case of resources, which are events or objects that will help clarify the mean- program was based on the learning psy- Concept maps are depicted by the ing of a given concept. Normally these chology of David Ausubel [1][4][5]. The concepts that are represented in a hier- are not included in ovals or boxes, since fundamental idea in Ausubel’s cognitive archical fashion with the most inclusive, they are specific events or objects and do psychology is that learning takes place by most general concepts at the top of the not represent concepts. Resources can be the assimilation of new concepts and prop- map and the more specific, less general anything, for example, images, videos, ositions into existing concept and proposi- concepts arranged hierarchically below. URLs, spreadsheets, PowerPoint slides, tional frameworks held by the learner. This The hierarchical structure for a particular documents, other concept maps, etc. It is knowledge structure as held by a learner is domain of knowledge also depends on the critical to understand that these and many also referred to as the individual’s cogni- context in which that knowledge is being more characteristics of concept maps are tive structure. Out of the necessity to find applied or considered. Therefore, it is best important also to the facilitation of cre- a better way to represent children’s con- to construct concept maps with reference ative thinking and innovation. ceptual understanding emerged the idea of to some particular question, called a focus representing children’s knowledge in the question, for which an answer is sought. Applications of Concept form of a concept map. Thus was born a A situation or event that requires organiza- Maps new tool not only for use in research, but tion of knowledge for understanding may To give a sense of the utility of Cmaps, also for many other uses. provide context for the concept map [8]. they have been used to show gaps in train- Another important characteristic of The Foundation of Concept ees’ knowledge, capturing the proficiency concept maps is the inclusion of cross- Map scale for a level of performance. Concept links. These are relationships or links be- mapping is used as part of the curricular Living in the digital age made it neces- tween concepts in different segments or sary to transform the meaningful learning infrastructure of many individual schools methodology into a more desirable format. Therefore, the Institute for Human and Machine Cognition (IHMC) with the help of Joe Novak created a computer-based tool that included all foundational theo- ries. The Cmap Tool was designed to be a simple-to-use but powerful environment that facilitates collaboration and sharing during the construction of a concept map- based knowledge model. Concept maps are meaningful diagrams (directed graphs) for organizing and representing knowledge by using concepts and propositions as node- link-node expressions of the simple form [6]. A concept is defined as a perceived regularity in events or objects, or records Figure 1: An example of a concept map. This map was made on the basis of collec- of events or objects, designated by a label. tive external experience after conducting a cognitive task analysis with intelligence The label for most concepts is a word, al- analysts, in an unclassified setting.

The Next Wave n Vol 17 No 2 n 2008 27 Concept maps can be used to: U iˆVˆÌ Ž˜œÜi`}i vÀœ“ iÝ«iÀÌà dcmitype:Event U `iÈ}˜ ˜iÜ ÌiV ˜œœ}Þ LÞ domain experts dc:type dcterms:issued dcterms:hasVersion U ÀiÛi> iÝ«iÀ̇˜œÛˆVi differences 2000-07-11 m ??:vocabulary-ter (literal) ?:Event-001 U >VµÕˆÀi ÜvÌÜ>Ài‡>ÃÈÃÌi` knowledge U LÀ>ˆ˜Ã̜À“ U à >Ài Ž˜œÜi`}i

U ÃÕ««œÀÌ ÌÀ>ˆ˜ˆ˜} >˜` Figure 2: A representation of what a very simple RDF triple would look performance like visually, and, on the right, is a textual representation in OWL.

U ˆ`i˜ÌˆvÞ Ž˜œÜi`}i }>«Ã The wide use of concept mapping from structure. The spatial (diagrammatic) and U VÀi>Ìi ˜iÜ Ž˜œÜi`}i] vœÀ grade school kids to research scientists has temporal features of Cmaps (Cmapping as example, turning tacit created many architectural challenges for a process) can be used as a tool to achieve knowledge into an the Cmap Tool. The present architecture clarity about intended meanings of propo- organizational resource has been relatively closed to third-party sitions (concepts and their relations). U “>« Ìi>“ Ž˜œÜi`}i developers who wish to interoperate with The adoption of the Cmap Tool in- the Cmap Tool until recently, when a new terface to a Semantic Web Graphic User U ÃÌÀÕVÌÕÀi Vœ˜Vi«ÌÕ> µÕiÀˆià approach called the Knowledge Exchange Interface (GUI) provides a fully functional U Vœ“«>Ài “iÌ œ`œœ}ˆià Architecture (KEA) [10][11], provided the display, editing and recomposing of OWL, data and procedural framework to allow and Resource Description Framework U ÃÌÀÕVÌÕÀi ˆ˜}ՈÃ̈V `iw˜ˆÌˆœ˜ third-party developers to design programs (RDF) meta models. The tool, called U `iÈ}˜ Vœ“«iÌi˜VÞ µÕiÃ̈œ˜Ã that interact with the Cmap Tool suite of CMap-based Ontology Editor (COE), is programs, thus providing a new range of acquiring widespread use as an ontology applications that can be developed and in- viewer and editor, capable of importing and by entire school systems [9]. Concept tegrated with Cmaps. and exporting various formats. COE en- map knowledge models can serve as liv- ables users to display, manipulate, edit, ing repositories of expert knowledge, en- Formalizing the Informal of and compose ontologies and other formal couraging knowledge sharing as well as Concept Maps representations in a variety of languages. supporting knowledge preservation [3]. Upon first seeing concept maps, a They can serve as interfaces for decision common reaction is, “That’s just a flow The challenge to display very large support systems where the model of the diagram.” Only with time, instruction, ontologies provided insight into the over- expert’s knowledge becomes the interface and experience do people come to see that all look and feel of understanding concept for a performance support tool or training concept maps can take on different func- maps. The idea of visualizing, the abil- aid [6]. Concept map knowledge models tions with respect to the representation of ity to handle rapid navigation facilities have successfully captured and preserved concepts and their relations. Upon further and “zooming” provided some novel ap- the domain knowledge of experts in such observation and research, the development proaches in this domain. This technique areas as terrain analysis, aviation weather of expressing concept maps for the repre- has enhanced the overall capability of the forecasting, and engineering [9]. sentation of artifacts used in the Semantic Cmap Tool. The Semantic Web commu- Through first-hand knowledge and Web (Web 3.0) were created, more specifi- nity has taken great interest in the visual- observation, there were many more inter- cally the Web Ontology Language (OWL) ization of these formal languages to gain esting applications of concept maps. As [12]. The confirmation that much of the insight from the normal textual method of more and more users find their niche, there utility of the Cmap Graphic User Interface descriptions [13]. have been great benefit and promise to- (GUI) arises from its ability to utilize the COE has been used to interface with wards mission in the usage of the concept essentially spatial structure of the display knowledge mining engines and reasoners. mapping tool and its fundamental meth- in meaningful ways, not the least simply It has been adapted to input useful data and odology. It brings together communities as a memory aid and a rapid visual rec- to display results from data analysis. The to share and collaborate on knowledge. ognition mode for “familiar” knowledge key to moving from informal to formal

28 Knowledge Visualization With Concept Mapping Tool References [10] Cañas AJ, Hill G, Granados A, Perez C, & Perez JD. The Network Architecture [1] Ausubel DP. The Psychology of of CmapTools. Pensacola (FL): Institute Meaningful Verbal Learning. New York for Human and Machine Cognition; 2003. (NY): Grune and Stratton; 1963. Report No.: IHMC CmapTools 2003-01. [2] Sowa JF. Conceptual Graphs as a Uni- [11] Cañas AJ, Hill G, Bunch L, Carff R, versal Knowledge Representation. Com- Eskridge T, & Perez C. KEA: A Knowl- puters & Mathematics with Applications. edge Exchange Archtecture Based on 1992;23(2-5):75-93. Web-services, Concept Maps, and Cmap- representation allows one to see the possi- [3] Novak JD & Gowin DB. Learning Tools. In: Cañas AJ., & Novak JD, edi- bility of a smooth transition of automation How to Learn. Ithaca (NY): Cornell Uni- tors. Proceedings of the Second Interna- without needing to learn the details and versity Press; 1984. tional Conference on Concept Mapping; to use the machine to help offload many 2006. San Jose, Costa Rica: University of functions as well as share the information. [4] Ausubel DP. Educational Psychology: Costa Rica. p. 304-310. Building this bridge between knowl- A Cognitive View. New York (NY): Holt, [12] Eskridge T, Hayes P, Hoffman R, & edge models that reflect human learning Rinehardt and Winston; 1968. Warren, M. Formalizing the Informal: and formal ontologies that can support [5] Ausubel DP, Novak JD, & Hanesian A Confluence of Concept Mapping and machine inference is a challenge, but it is H. Educational Psychology: A Cogni- the Semantic Web. In: Cañas A, editor. necessary and important. tive View. 2nd ed. New York (NY): Holt Proceedings of the Second International Rinehart and Winston; 1978. Summary and Conclusion Conference on Concept Mapping; 2006. [6] Hoffman RR & Linton G. Elicit- San Jose, Costa Rica: University of Costa The theoretical foundation and the or- ing and Representing the Knowledge Rica. igins of concept maps have been presented of Experts. In: Ericsson KA, Charness [13] Verirl I. What is the Value of Graphi- in this article. Initially, concept maps may N, Feltovich P, & Hoffman R, editors. cal Displays in Learning? Education Psy- appear to be just another graphical repre- Cambridge Handbook of Expertise and chology Review. 2002;(14):261-290. sentation of information. Understanding Expert Performance. New York (NY): the foundations of concept maps and proper Cambridge University Press; 2006. p. use of the Cmap Tool will lead the user to 203-222. see that it is truly a profound and powerful [7] Novak JD & Cañas AJ. The Theory tool. The software was designed to have a Underlying Concept Maps and How to low threshold, making it easy to learn for Construct and Use Them. Pensacola (FL): new and naïve users, but at the same time Institute for Human and Machine Cogni- have a high ceiling, enabling sophisticated tion; 2008. Report No.: IHMC Cmap- and expert users to develop large, complex Tools 2006-01 Rev01-2008 knowledge models. Concept mapping’s [8] Cañas AJ & Novak JD. Re-examining greatest contributions are facilitating col- the Foundations for Effective Use of laboration, documenting team analytic ef- Concept Maps. In: AJ Cañas & JD No- forts, targeting information, and fostering vak, editors. Proceedings of the Second an analytical environment. International Conference on Concept Mapping; 2006. San Jose, Costa Rica: University of Costa Rica. p. 494-502. [9] Cañas AJ, Coffey JW, Carnot MJ, Feltovich P, Hoffman R, Feltovich J, & Novak JD. (Institute for Human and Machine Cognition, Pensacola, FL) A Summary of Literature Pertaining to the Use of Concept Mapping Techniques and Technologies for Education and Perfor- mance Support. Report to the Chief of Naval Education and Training; 2003.

The Next Wave n Vol 17 No 2 n 2008 29 Visualizing Data in the Metaverse

From the earliest cave drawings of a mastodon hunt to Leonardo Di Vinci’s sketches of the ornithopter, humans have tried to visualize their experiences and their dreams. Images captured on cave walls provided a collective perspective of the hunt that might have altered tactics for the next kill; detailed renderings of feathers and fulcrums helped turn flights of fancy into flying machines. People often need to see how things work before we understand how they work.

Throughout history, new ways to vi- generation of gamers for the virtual world the metaverser is filling up with more sualize data have repeatedly spawned dis- that may be redefining how we work, serious projectsj that lend themselves to covery and invention. Through the twen- learn, shop, and play. Just as data models interactivity. tieth century, these visions were captured and simulations have provided new ways Businessess have opened offices in in drawings, diagrams, photographs, and to visualize substances and systems, the SL to conductu virtual meetings and sales even movies. But data visualization has metaverse gives us an alternative way to presentations. Retail merchants are using evolved into more than a “snapshot” of look at ourselves and the environment we SL to promote their products, with some an event or physical state to preserve it inhabit. shops set up to let customers design virtual for analysis. New data visualization tools Second Life was the first virtual world products for their avatars, and then receive can provide a portal for the viewer to step to demonstrate what life in the metaverse similar items for themselves. Government through, to enter and experience an other- could be like. Launched in 2003 by Linden agencies, embassies, and various service wise inaccessible world. Research, Inc., Second Life (SL) slowly industries have discovered the benefits of As the ways we visualize information amassed a devout following, registering providing easy, interactive access to their rapidly evolve, so do the ways we access over 13 million accounts by March 2008. information, any time of day or night. that information. It took only one genera- At any given time, tens of thousands of SL Several museums have found that 3-D tion for libraries of atlases and textbooks residents can be found engaged in virtually exhibits in SL are a natural extension for to yield to shared databases and online ac- every activity you would expect to find their local displays. And the potential for cess. Now, pages are turned by a clicked among a similar population in real life. education and research is boundless, with mouse button instead of a licked thumb. The curators of SL, at Linden Labs, numerous schools and universities already It may take even less time for scrolling are committed to letting their virtual building virtual campuses and hosting through pages on a computer monitor to world evolve naturally, with communities, classes in SL. become equally arcane, as the computer customs, laws, and even an economy As supercomputing power becomes al- world we know today yields to the virtual settling into social equilibrium with few most commonplace, elaborate data models world of tomorrow. constraints. Their new world was designed and simulations are finding their way into A digital virtual world—the Meta- to be created by its residents—a crucible more and more hands. The increase in ac- verse—was vividly described by Neal for a grand social experiment. curacy and resolution of what we perceive Stephenson in his 1992 cyberpunk novel The first to come to SL were the provides for better interpretations of how Snow Crash. Residents of the Metaverse gamers. They were soon followed by things work and what things mean. These buy, sell, socialize, create, and learn, just as profiteers, hucksters, and people who new ways to present information are being they do in the physical universe. The only were just curious. As SL has grown in met with new ways to receive it. Visual- real difference is that these residents are popularity—and profitability—businesses, izing data does not have to be limited to digital manifestations of the people who merchants, government agencies, nonprofit a passive event in two-dimensional space. create them—control them—are them. organizations, museums, and schools have In the metaverse, the possibilities for vi- Online games such as World of started setting up shop as well. Along with sualizing data are virtually limitless. Warcraft and Everquest have prepared a the virtual beach houses and nightclubs,

30 Visualizing Data in the Metaverse The National Oceanic and Atmospheric Administration (NOAA) was one of the first government agencies to establish a Second Life island—a plot of virtual land that, as the name implies, is surrounded by water. One of NOAA’s more popular interactive exhibits is a weather map of the United States. Live reports of weather conditions from major weather stations across the country are used to recreate prevailing weather systems, com- plete with rain clouds, lightning, and snow. Not only can visitors observe the weather in 3-D, they can walk among the clouds.

The American Chemical Association (ACS) sponsors an edu- cational museum in SL. One area of their island has floating platforms with giant molecules that can be disassembled. By touching a box representing a specific part of the molecule—the Tyrosine R Group, for example—the corresponding section of the molecule drops to the ground.

NASA (the National Aeronautics and Space Administration) sup- ports more than one island in SL. At NASA CoLab’s Second Life Mission, visitors can fly with satellites, walk on the moon, and sit on the surface of other planets. Outside NASA’s virtual plane- tarium, a visitor can select any point on a world map. Afterward, the night sky in the planetarium displays the stars and planets as they would appear from the selected location.

The Starry Night island in SL has devised an innovative way to engage viewers in works of art. Vincent van Gogh’s paintings are displayed on easels, set up in various locations in a virtual French villa. The villa is designed around the areas where Van Gogh lived and painted. Many of the paintings are positioned just where they would have been as blank canvases. The “museum goer” has the opportunity to view the paintings, and then actually walk across the Langlois bridge or sit at a café table under that now famous starry night.

The Next Wave n Vol 17 No 2 n 2008 31 i040425