Capability Based Planning and Information Visualization: Development of Visualizations

C. Campbell, M-A. Labrie, T. Lamoureux, R. Guo, G. Smith, and S. Pronovost
CAE Professional Services (Canada) Inc.
1135 Innovation Dr., Ottawa, Ontario K2G 3G7, Canada

Contract Project Manager: Richard Percival
PWGSC Contract Number: W7714-11-5174
CSA: Mark Rempel, Chad Young, Strategic Planning Operations Research Team

DRDC CORA CR 2012-065 March 2012

Defence R&D Canada Centre for Operational Research and Analysis

Strategic Planning Operational Research


Principal Author

Original signed by Catherine Campbell C. Campbell CAE Professional Services (Canada), Inc.

Approved by

Original signed by Robert Burton Robert Burton Section Head, Joint Systems Analysis

Approved for release by

Original signed by Paul Comeau Paul Comeau Chief Scientist

© Her Majesty the Queen in Right of Canada, as represented by the Minister of National Defence, 2012

Abstract

The Capability Based Planning (CBP) process in Canada is based on an extensive data model that integrates scenario information, effects information, capabilities, force elements, a strategic cost model, and a Program Activity Architecture (PAA) model. Each element is a dense and highly inter-related set of data. Understanding the breadth and depth of considerations that must be made in CBP is difficult for senior decision-makers and their staffs. This contract investigates the potential for information visualization to support these people in understanding and making effective use of CBP model data to support decision-making. To this end, this report describes the preliminary use of two information visualization toolkits (Prefuse and Flare, selected during phase I of this work) to develop visualizations. This work had two purposes: familiarization with the toolkits, and evaluation of the relative strengths and weaknesses of the two toolkits. The output from this preliminary work is presented here, as are the results of an evaluation session involving the contract Scientific Authority (SA). Based on this preliminary work, Prefuse is recommended for further development work, and a number of features of a prototype information visualization application are introduced that may be helpful for senior decision makers.



Executive summary

Capability Based Planning and Information Visualization: Development of Visualizations
Campbell, C.; Labrie, M-A.; Lamoureux, T.; Guo, R.; Smith, G.; and Pronovost, S.; DRDC CORA CR 2012-065; Defence R&D Canada – CORA; March 2012.

Introduction or background: The Capability Based Planning (CBP) process in Canada is based on an extensive data model that integrates scenario information, effects information, capabilities, force elements, a strategic cost model, and a Program Activity Architecture (PAA) model. Each element is a dense and highly inter-related set of data. Understanding the breadth and depth of considerations that must be made in CBP is difficult for senior decision-makers and their staffs. This contract investigates the potential for information visualization to support these people in understanding and making effective use of CBP model data to support decision-making. To this end, this report describes the preliminary use of two information visualization toolkits (Prefuse and Flare, selected during phase I of this work) to develop simple visualizations. This work had two purposes: familiarization with the toolkits, and evaluation of the relative strengths and weaknesses of the two toolkits.

Results: The output from this preliminary work is presented here, as are the results of an evaluation session involving the contract Scientific Authority (SA). Flare was found to be more attractive for users, but it was harder to use because of the need to learn ActionScript. A number of different visualizations were developed in each toolkit, as follows:

• Dependency graph (Prefuse and Flare);
• Hyperbolic tree (Flare);
• Tree view (Prefuse);
• Tree map (Prefuse); and,
• Term/word cloud (Prefuse).

Also, a number of demonstrations of different interactions were developed for each toolkit.

Given the team's degree of familiarity with Prefuse, in part because it is Java-based, it was decided to focus on Prefuse for the development of an information visualization prototype.

Significance: The preliminary work with the two information visualization toolkits allowed the team to select a toolkit on which to focus effort, and led to the development of plans for a prototype information visualization application, to be described in a report for phase III of this contract.


Future plans: Based on this preliminary work, Prefuse is recommended for further development work, and a number of features of a prototype information visualization application are introduced that may be helpful for CBP decision makers.



Table of contents

Abstract
Executive summary
List of figures
List of tables
1 Introduction
  1.1 Purpose and Scope
  1.2 Objectives
  1.3 Scope
  1.4 This Document
  1.5 Method
2 Initial implementation
  2.1 Prefuse
    2.1.1 Dependency graph
    2.1.2 Tree view
    2.1.3 Tree map
    2.1.4 Term/Word cloud
    2.1.5 Interaction features
    2.1.6 Visualization types that were not implemented
    2.1.7 Approach and lessons learned
      2.1.7.1 General
      2.1.7.2 Adding label data in the graphml file
      2.1.7.3 Trees in TreeML
      2.1.7.4 Multiple edges
      2.1.7.5 Interconnecting different visualizations
  2.2 Flare
    2.2.1 Dependency graph
    2.2.2 Hyperbolic Tree
    2.2.3 Interaction features
    2.2.4 Visualization types that were not implemented
    2.2.5 Approach and lessons learned
  2.3 Graphical User Interface
    2.3.1 GUI Concept Evolution
    2.3.2 GUI Prototype
3 Assessment of initial implementations
4 Creative development
5 Feasibility assessment
6 Enhanced implementation plan
Annex A: Minutes of the Implementation Assessment and Creative Development Meeting
List of symbols/abbreviations/acronyms/initialisms

List of figures

Figure 1: A Dependency graph view between force elements and bottom level capabilities.
Figure 2: Dependencies for a selected force element with edge values displayed
Figure 3: Tree view (orientation 1).
Figure 4: Tree map representation of the capability framework and PAA
Figure 5: Tree Map visualization of Capability Framework and PAA with “paa or capability_framework::3” entered as a search criteria.
Figure 6: Force elements in a word cloud view.
Figure 7: Tree view (orientation 1).
Figure 8: Tree view (orientation 2 - detail).
Figure 9: Tree view (orientation 3 - detail).
Figure 10: Tree view (orientation 4).
Figure 11: Result of double-clicking on a node.
Figure 12: Flow map example available with the Stanford library.
Figure 13: Flare Dependency Graph
Figure 14: Flare Dependency Graph showing selected node and related edges
Figure 15: Hyperbolic Tree in Flare
Figure 16: Inspiration for overview development: yEd – Graph Editor (http://www.yworks.com/en/products_yed_about.html)
Figure 17: Example of yEd-Graph editor interface including an overview pane, Successors pane, Structure view, Main window, Palette pane for editing, and Properties view
Figure 18: Initial concepts for information visualization interface layout and functionality
Figure 19: Initial concepts for information visualization interface design and specific capabilities
Figure 22: First prototype: initial screen - dependency graph
Figure 23: First prototype: node selection and zoom in on main window and tree view

List of tables

Table 1: Summary of outcomes of research conducted during the first phase of this project
Table 2: Proposed visualization development activities
Table 3: Feasibility Analysis for Additional Development of Visualizations in Prefuse

1 Introduction

1.1 Purpose and Scope

The purpose of this contract is to investigate the field of information visualization and to develop visualizations that will assist strategic decision makers in the Department of National Defence (DND) and the Canadian Forces (CF) to employ the Capability-Based Planning (CBP) approach to guide the Department’s strategic activities. To successfully carry out this contract, datasets contributing to the CBP process were studied, the means to create visualizations were investigated, and the various strengths and weaknesses of different visualization types were mapped to the needs of the datasets and users. This led to the selection of a finite number of visualization types to be implemented in the selected toolkits, as presented in Table 1 (justification for these selections is presented in the phase 1 report). The development of these visualizations is the focus of this report.

Table 1: Summary of outcomes of research conducted during the first phase of this project

Selected visualization types | Selected interaction techniques | Selected toolkits
Dependency graph | Zoom +/- | Prefuse
Tree view | Fisheye | Flare
Flow map | Pan |
Hyperbolic tree (radial space-filling tree graphs) | Selection |
Self-organizing map | Search/query |
Tree map | Filtering |
Term/Word cloud | |
Gauges | |
Starfield | |
Basic visualizations (bar graph, pie chart, scatter plot, etc.) | |

1.2 Objectives

The contract objectives most relevant to this second phase of the work are:

1. Develop and demonstrate interactive information visualization and related functionality;

2. Assess the ultimate benefit of interactive visualizations that have been built to explore and exploit the relationships between existing data, models and frameworks within the domain of strategic planning and enterprise risk management as it currently exists with DND/CF.


1.3 Scope

A subset of the dataset used within the CBP process is represented by five ‘dummy’ datasets that are richly interconnected. These datasets include variables (i.e. ‘nodes’) and relationships between the variables (i.e. ‘edges’). The datasets represent: force planning scenario and desired effects; the capability framework; force elements; the Program Activity Architecture (PAA); and, the Strategic Cost Model (SCM). These datasets are discussed in detail in the phase 1 report and represent the scope of data to be considered for selection and development of visualizations using the selected toolkits.

Although upwards of 80 visualization development toolkits were identified, as discussed in the phase 1 report, the focus of this contract is on two in particular: Prefuse (www.prefuse.org) and Flare (flare.prefuse.org). The Scientific Authority (SA) for this contract provided a shortlist of 10 visualization types for implementation. These, or modified versions of these, were implemented in the visualization development toolkits. Implementation of additional visualization types and variations of these are discussed as part of the creative development activity in this phase of the project (outlined below).

1.4 This Document

This document reports on the second phase of work in this project and is divided into six sections:

1. Introduction (this section): describes the background to the contract, the approach taken to the overall contract, and a review of the process and outcomes from the first phase of this contract;

2. Documentation of the implemented visualizations;

3. Results of the Assessment of initial visualizations;

4. Results of Creative Development;

5. Feasibility assessment of recommended changes; and,

6. Enhanced Implementation Plan: describes the plan for further development of selected visualizations.

Computer code developed as part of this phase of work has been provided to the SA electronically.

There are two other reports in this contract:

1. The phase 1 report describes the CBP process and datasets used in this work; the process for identification and outcomes from the evaluation of various classes of visualization types and their relative strengths and weaknesses; frequently referenced open-source visualization toolkits and their relative strengths and weaknesses; the selection criteria developed to support evaluation of visualization types and toolkits in the context of the CBP dataset; and an Implementation plan outlining the steps required to build the selected visualizations using the selected toolkit.

2. The phase 3 report describes the enhanced implementation, discusses outstanding issues and provides a list, with brief descriptions, of potential ways of improving the visualizations to facilitate the CBP process.


1.5 Method

This phase of the project is predominantly concerned with initial development of visualizations, assessment of initial implementations, and development of plans for future implementations. The process was conducted as follows:

1. Initial implementation of visualizations using subsets of the CBP dataset provided by the Scientific Authority;

2. Assessment of the initial visualizations in collaboration with the SA;

3. Creative development workshop targeting possible elaborations on the developed visualizations, selection of additional visualizations and interaction techniques to be implemented in the second development effort;

4. Feasibility assessment for desired elaborations in the context of allocated resources and project timeline; and

5. Development of an implementation plan for the development of enhanced visualizations.


2 Initial implementation

The team focussed on implementation of the selected visualizations in Prefuse and Flare. The toolkits were investigated in parallel such that a comparison between them could be made with respect to:

1. level of effort to produce visualizations, and

2. visual output produced.

The following sub-sections present the visualization types and interaction techniques that were achieved in Prefuse and Flare.

2.1 Prefuse

Initial implementations using Prefuse focussed on the list of preferred visualizations provided by the SA in the SOW. From that list the following visualizations were implemented, with varying degrees of success, using a subset of the CBP dataset:

• Dependency graph
• Tree view
• Tree map
• Term/Word cloud

As part of the familiarization phase of the project, the labelling of nodes and edges, colour coding, and edge thickness were explored in isolation. Many of these features were integrated into the visualizations listed above. Basic interaction techniques were also explored, including drag-and-drop movement of nodes, mouse-over labelling, search fields, and selection of edges and nodes.

The visualization types that were not implemented include:

• Flow map
• Hyperbolic tree
• Self-organizing map
• Gauges
• Starfield
• Basic visualizations (as demonstrated on flex.org)


The remainder of this section presents the implemented visualizations, the rationale for not implementing the remaining visualizations and a discussion of lessons learned from the successes, challenges and anticipated capabilities of the Prefuse program with respect to the implementation of these visualizations for the CBP dataset.

2.1.1 Dependency graph

The dependency graph implementation includes representation of the dependencies between the force elements and the capability framework. Note that only the lowest level elements in the capability framework are shown in this work, since they are the only ones with a direct relationship with the force elements. A circle layout is used to place the nodes. The force elements are displayed in orange and the capability nodes are displayed in light-blue, illustrating some of the labelling capabilities of the Prefuse toolkit.

When the visualization is first opened by the user, all of the edges are represented as shown in Figure 1. Due to the selected subset of data, only data for cost_horizon_0 is represented and displayed directly on the edge.

If a user double-clicks on a node, the color of that node will change to red and only the edges related to that node will be visible. The result of this action is presented in Figure 2.

Figure 1: A Dependency graph view between force elements and bottom level capabilities.


Figure 2: Dependencies for a selected force element with edge values displayed
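The circle layout and the double-click behaviour described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the Prefuse implementation: the node names, the edge values and the helper functions `circle_layout` and `select_node` are all hypothetical, and the colours simply mirror those named in the text.

```python
import math

def circle_layout(node_ids, radius=1.0):
    """Place nodes evenly around a circle; returns {node_id: (x, y)}."""
    n = len(node_ids)
    return {
        node_id: (radius * math.cos(2 * math.pi * i / n),
                  radius * math.sin(2 * math.pi * i / n))
        for i, node_id in enumerate(node_ids)
    }

def select_node(edges, base_colours, selected):
    """Mimic a double-click: recolour the selected node and keep its edges only."""
    colours = dict(base_colours)
    colours[selected] = "red"
    visible = [e for e in edges if selected in (e[0], e[1])]
    return colours, visible

# Hypothetical sample data: (force element, capability, cost_horizon_0 value)
edges = [
    ("FE_A", "CAP_1", 10.0),
    ("FE_A", "CAP_2", 4.5),
    ("FE_B", "CAP_2", 7.0),
]
base_colours = {"FE_A": "orange", "FE_B": "orange",
                "CAP_1": "lightblue", "CAP_2": "lightblue"}

positions = circle_layout(sorted(base_colours))
colours, visible = select_node(edges, base_colours, "FE_A")
```

After the call, `visible` holds only the two edges attached to the hypothetical force element `FE_A`, and the other nodes keep their original colours.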

2.1.2 Tree view

The tree view implementation was developed using a subset of the original dataset: the capability framework hierarchy and associated information for each node. It was transformed to TreeML format using available open source example code1. Using this data subset, tree values are defined for each node, corresponding to the score of the capability element for the three scenarios. As the file was generated manually, all the values are 0.0.
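To make the transformation concrete, the sketch below writes a small hierarchy out in a TreeML-like structure (a `tree` element with `declarations`, `branch`, `leaf` and `attribute` children, as described by the DTD referenced in the footnote). It is an illustration only, not the code used in this work: the node names, the scenario attribute names and the placeholder scores of 0.0 are assumptions.

```python
import xml.etree.ElementTree as ET

def add_node(parent_xml, node):
    """Append one hierarchy node (and its descendants) as branch/leaf elements."""
    tag = "branch" if node.get("children") else "leaf"
    elem = ET.SubElement(parent_xml, tag)
    ET.SubElement(elem, "attribute", name="name", value=node["name"])
    for scenario, score in node.get("scores", {}).items():
        ET.SubElement(elem, "attribute", name=scenario, value=str(score))
    for child in node.get("children", []):
        add_node(elem, child)

def to_treeml(root, score_names):
    """Build a TreeML-style document with attribute declarations up front."""
    tree = ET.Element("tree")
    decls = ET.SubElement(tree, "declarations")
    ET.SubElement(decls, "attributeDecl", name="name", type="String")
    for score_name in score_names:
        ET.SubElement(decls, "attributeDecl", name=score_name, type="Real")
    add_node(tree, root)
    return tree

# Hypothetical scenario names and a two-leaf capability hierarchy.
score_names = ["scenario_1", "scenario_2", "scenario_3"]
zero_scores = {name: 0.0 for name in score_names}
hierarchy = {
    "name": "Capability_Framework",
    "children": [
        {"name": "Capability_Framework::1", "scores": zero_scores},
        {"name": "Capability_Framework::2", "scores": zero_scores},
    ],
}

treeml_text = ET.tostring(to_treeml(hierarchy, score_names), encoding="unicode")
```

Generating the file from a script in this way would also make it straightforward to substitute real scores for the placeholder values once they are available.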

1 See http://prefuse.org/doc/api/prefuse/data/io/TreeMLReader.html and http://www.nomencurator.org/InfoVis2003/download/treeml.dtd

A global view of the Tree view visualization examples is presented in Figure 3. Additional examples are used to illustrate possible interactions and manipulations by the user in section 2.1.5.

Figure 3: Tree view (orientation 1).


2.1.3 Tree map

This implementation displays the capability framework and PAA hierarchy in a tree map view (Figure 4). It does not allow the user to see node data such as the score of a capability vs. a scenario, but this is a possibility. The names of the lowest level of the hierarchy (leaves) are displayed along the bottom of the window when the mouse cursor is over a node. Thus, this implementation gives an overview of the size of a sub-tree i.e. the more leaves a sub-tree has, the larger the region occupied by this sub-tree. The tree map can also provide graphical indications of other properties of nodes, for instance the importance or some other specified value, if the user chose to arrange the tree map according to particular data fields.

Figure 4: Tree map representation of the capability framework and PAA
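The relationship between leaf count and region size can be illustrated with a simple "slice-and-dice" partition, in which each sub-tree receives a share of its parent's rectangle proportional to its number of leaves. This is a sketch of the general idea only: it is not the layout algorithm used by Prefuse, and the hierarchy and the functions `leaf_count` and `slice_layout` are hypothetical.

```python
def leaf_count(node):
    """Number of leaves below a node (a leaf counts as one)."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(leaf_count(child) for child in children)

def slice_layout(node, x, y, w, h, horizontal=True, out=None):
    """Assign each node a rectangle (x, y, w, h) sized by its share of leaves."""
    out = {} if out is None else out
    out[node["name"]] = (x, y, w, h)
    total = leaf_count(node)
    offset = 0.0
    for child in node.get("children", []):
        share = leaf_count(child) / total
        if horizontal:
            slice_layout(child, x + offset * w, y, share * w, h, False, out)
        else:
            slice_layout(child, x, y + offset * h, w, share * h, True, out)
        offset += share
    return out

# Hypothetical hierarchy: sub-tree "A" has three leaves, sub-tree "B" has one.
hierarchy = {
    "name": "root",
    "children": [
        {"name": "A", "children": [{"name": "A1"}, {"name": "A2"}, {"name": "A3"}]},
        {"name": "B"},
    ],
}
rects = slice_layout(hierarchy, 0.0, 0.0, 100.0, 100.0)
```

Here sub-tree `A` occupies three quarters of the width and `B` the remaining quarter. Replacing `leaf_count` with a sum over some other node value would size the regions by that data field instead.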

The tree map also shows the ability of Prefuse to support searching. The example in Figure 5 presents a varied search for nodes beginning with ‘PAA’ or ‘Capability_Framework::3’. Although not necessarily possible in a tree map, such a search could also be performed for edges, which are also named.


Figure 5: Tree Map visualization of Capability Framework and PAA with “paa or capability_framework::3” entered as a search criteria.
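The search behaviour shown in Figure 5 can be approximated as a case-insensitive prefix match in which the word "or" separates alternative prefixes. The sketch below is an assumption about that behaviour written in plain Python, not the toolkit's search code, and the node names are hypothetical.

```python
def prefix_search(names, query):
    """Return names that start with any of the 'or'-separated prefixes."""
    prefixes = [p.strip() for p in query.lower().split(" or ") if p.strip()]
    return [name for name in names
            if any(name.lower().startswith(p) for p in prefixes)]

# Hypothetical node names in the style of the search string in Figure 5.
names = [
    "PAA::1",
    "PAA::1::2",
    "Capability_Framework::3::1",
    "Capability_Framework::2",
    "Force_Element::7",
]
matches = prefix_search(names, "paa or capability_framework::3")
```

With this sample list, `matches` contains the two PAA nodes and `Capability_Framework::3::1`. The same function could be applied to edge names.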

2.1.4 Term/Word cloud

In the term/word cloud implementation, force elements are represented in a word cloud view. Notice that the node parameter cost_horizon_0 is used to determine the size of each element (see Figure 6). Placement of the labels (words) is arbitrary and the user can move them around by selecting and dragging elements across the screen. If appropriate data was provided for the nodes, the distance between words could be meaningful, denoting, for example, similarity between words or the importance of a relationship between one word and another. This would lead to clustering, providing a quick and effective means for the user to understand the structure of the dataset.


Figure 6: Force elements in a word cloud view.
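One simple way to derive label size from a node parameter such as cost_horizon_0 is a linear mapping onto a font-size range. The sketch below assumes such a mapping for illustration. The scaling actually used in the implementation is not documented here, and the point sizes and force element names are hypothetical.

```python
def font_sizes(values, min_pt=10.0, max_pt=48.0):
    """Map each value linearly onto a font size between min_pt and max_pt."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    sizes = {}
    for name, value in values.items():
        # If all values are equal, fall back to the smallest size.
        fraction = 0.0 if span == 0 else (value - lo) / span
        sizes[name] = min_pt + fraction * (max_pt - min_pt)
    return sizes

# Hypothetical cost_horizon_0 values for three force elements.
cost_horizon_0 = {"FE_A": 0.0, "FE_B": 50.0, "FE_C": 100.0}
sizes = font_sizes(cost_horizon_0)
```

A logarithmic or square-root mapping may be preferable if the values span several orders of magnitude.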

2.1.5 Interaction features

Many controls are available to modify the view of visualizations displayed in Prefuse. The following interactions were achieved using the Tree view:

1. Ctrl-1, Ctrl-2, Ctrl-3 and Ctrl-4 allow changing the orientation of the tree (see Figure 7, Figure 8, Figure 9 and Figure 10).

2. Left-mouse button: drag the view.

3. Right-mouse button: zoom.

4. Mouse-over node: display node’s name in the bottom of the window

5. Double-click on a node: display node information (values for each scenario) in a contextual view (see Figure 11).

6. Search box at lower right.

It was discovered that the current mechanism for viewing the scores of each capability against each scenario does not allow for a quick overview of the data: the values can only be viewed for one node at a time. Since these data are related to nodes, it would be possible to adapt each node’s shape, size or colour depending on the scenario the user wishes to analyze. For example, radio buttons could be added to allow the user to select the scenario of interest.
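As a sketch of the colour-coding idea, the function below maps every node's score for one selected scenario onto a white-to-blue scale, so that all scores for that scenario are visible at once. It is illustrative only: the scenario names, scores and colour scale are assumptions, and no such feature was implemented in this phase.

```python
def scenario_colours(scores, scenario):
    """Return {node: (r, g, b)} shading nodes from white (low) to blue (high)."""
    values = [node_scores[scenario] for node_scores in scores.values()]
    lo, hi = min(values), max(values)
    colours = {}
    for node, node_scores in scores.items():
        # With no spread in the scores, every node gets the low-end colour.
        t = 0.0 if hi == lo else (node_scores[scenario] - lo) / (hi - lo)
        shade = round(255 * (1.0 - t))
        colours[node] = (shade, shade, 255)
    return colours

# Hypothetical scores for two capabilities against two scenarios.
scores = {
    "Capability_Framework::1": {"scenario_1": 0.0, "scenario_2": 2.0},
    "Capability_Framework::2": {"scenario_1": 4.0, "scenario_2": 2.0},
}
colours_s1 = scenario_colours(scores, "scenario_1")
colours_s2 = scenario_colours(scores, "scenario_2")
```

A radio-button handler would only need to call `scenario_colours` with the newly selected scenario and repaint the nodes.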

Figure 7: Tree view (orientation 1).

Figure 8: Tree view (orientation 2 - detail).


Figure 9: Tree view (orientation 3 - detail).


Figure 10: Tree view (orientation 4).


Figure 11: Result of double-clicking on a node.

2.1.6 Visualization types that were not implemented

The following visualization types were not implemented during this phase of the project:

1. Flow map

2. Hyperbolic tree

3. Self-organizing map

4. Gauges

5. Starfield

6. Basic visualizations (as demonstrated on the Flex visualization toolkit website [not applicable for Prefuse])

One reason for not implementing some of these visualization types was the structure of the data in the dataset. For example: the scenario and effects dataset does not require a tree view; the flow map is difficult without significant time- and location-based data2; and the self-organizing map would be more relevant if users were inputting data and the map was “learning” based on the inputted data. The absence of pre-existing examples of open-source code with similar data structures also made it more difficult to quickly adapt Prefuse visualizations to the CBP dataset. For example, existing Prefuse objects, although compatible with code that results in a flow map representation, require a great deal of development work in both the toolkit and the data in order to implement.

Gauges are not possible in Prefuse, and the basic Flex visualizations requested by the SA are not applicable to the Java-based Prefuse toolkit. The utility of the starfield for the CBP context was perceived as less significant than that of the dependency graph and tree view, and it was therefore not prioritized (although the starfield map was given a higher priority later in this project).

It is worth noting that the review of online references discovered that a student at Stanford University developed custom additions to the library to perform a flow map layout, as illustrated in Figure 12. Therefore, if this visualization type is perceived to be very interesting for the CBP task, this custom library would be a good starting point (see http://graphics.stanford.edu/~dphan/code/flowmap/).

Figure 12: Flow map example available with the Stanford library.

Similar extensions of the Prefuse library were not identified for hyperbolic trees, self-organizing maps or starfields. This does not mean that these visualizations are impossible to create using Prefuse, only that example code was not found and development would therefore be more time-consuming.

2 Flow maps typically use time data that is continuous rather than discrete, such as that in the ‘Horizon’ attributes. However, distance metrics can be calculated from discrete data in order to create a flow map, but this was not done for this contract.

2.1.7 Approach and lessons learned

The development of visualizations in Prefuse required the software development team to become familiar with the toolkit capabilities and function. During this process a number of important considerations were realized that may contribute to future development during the next phase of this project. These approaches and challenges are presented in the following paragraphs.

2.1.7.1 General

The Prefuse visualizations are initially less visually appealing than those generated by Flare. However, it is reasonable to expect that rendering functions could be added to address the “look and feel” of the visualizations generated by the toolkit. Ultimately the Prefuse toolkit is more powerful than Flare because of the Java back-end that allows for significant computing power, as well as being a language that is widely used by developers (opening up the possibilities of a large user community from whom to draw support, as well as ongoing public/private development of the Prefuse tool3).

2.1.7.2 Adding label data in the graphml file

The dataset was modified to include, for each node of the dataset, the following element: custom label for the node, where “custom label for the node” is replaced by the value of the id parameter already defined for each node. This way, Prefuse has easy access to this data during processing. This data was used to label Prefuse nodes, filter nodes, etc. Note that this addition to the dataset was made automatically using regular expressions in the Notepad++ tool (http://notepad-plus-plus.org/).
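The same modification could also be scripted rather than made with editor regular expressions. The sketch below uses Python's standard XML library as an alternative to the Notepad++ approach described above. The `data` element, the key id `label` and the sample node ids are assumptions made for illustration, since the exact element added in this work is not reproduced here.

```python
import xml.etree.ElementTree as ET

NS = "http://graphml.graphdrawing.org/xmlns"

def add_labels(graphml_text, key_id="label"):
    """Give every node a <data> child whose text repeats the node's id."""
    ET.register_namespace("", NS)  # keep GraphML as the default namespace on output
    root = ET.fromstring(graphml_text)
    # Declare the new node attribute (assumed key id) before the graph element.
    key = ET.Element(f"{{{NS}}}key", {
        "id": key_id, "for": "node",
        "attr.name": key_id, "attr.type": "string",
    })
    root.insert(0, key)
    for node in root.iter(f"{{{NS}}}node"):
        data = ET.SubElement(node, f"{{{NS}}}data", {"key": key_id})
        data.text = node.get("id")
    return ET.tostring(root, encoding="unicode")

# Hypothetical two-node GraphML fragment.
sample = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="directed">
    <node id="Force_Element::1"/>
    <node id="Capability_Framework::3"/>
    <edge source="Force_Element::1" target="Capability_Framework::3"/>
  </graph>
</graphml>"""
labelled = add_labels(sample)
```

A scripted approach is repeatable when the dataset is regenerated, whereas a manual search-and-replace has to be redone each time.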

2.1.7.3 Trees in TreeML

For tree based layouts, the TreeML format was used (as opposed to the GraphML format) and followed the models provided in example code4.

2.1.7.4 Multiple edges

Displaying a dense graph with multiple edges between two nodes may be a challenge from two perspectives. First, the default edge renderer draws the edges on top of one another. Second, if the edges were separated so as to be distinguishable, the result might be a cluttered display that is difficult for the user to make sense of without advanced filtering tools. There are two immediately apparent solutions: 1) create a new edge renderer that uses a more appropriate algorithm; or 2) provide the user with GUI controls (e.g. filters) that allow them to select and display only one edge at a time. Both solutions are possible, but require additional development.
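Both solutions can be sketched independently of the toolkit. In the illustration below, `parallel_offsets` spreads edges that share the same pair of nodes symmetrically around the straight line between them, which is the kind of calculation a custom edge renderer would need, and `filter_by_kind` keeps a single edge type, as a GUI filter would. The function names, the spacing value and all edge attributes other than cost_horizon_0 are hypothetical.

```python
from collections import defaultdict

def parallel_offsets(edges, spacing=6.0):
    """Return {edge index: perpendicular offset} so parallel edges do not overlap."""
    groups = defaultdict(list)
    for index, (source, target, _kind) in enumerate(edges):
        groups[frozenset((source, target))].append(index)
    offsets = {}
    for members in groups.values():
        count = len(members)
        for rank, index in enumerate(members):
            # Centre the group on the straight line between the two nodes.
            offsets[index] = (rank - (count - 1) / 2) * spacing
    return offsets

def filter_by_kind(edges, kind):
    """Keep only the edges of one type, as a GUI filter control would."""
    return [edge for edge in edges if edge[2] == kind]

# Hypothetical edges: (source, target, attribute represented by the edge)
edges = [
    ("FE_A", "CAP_1", "cost_horizon_0"),
    ("FE_A", "CAP_1", "cost_horizon_1"),
    ("FE_A", "CAP_1", "cost_horizon_2"),
    ("FE_B", "CAP_1", "cost_horizon_0"),
]
offsets = parallel_offsets(edges)
only_h0 = filter_by_kind(edges, "cost_horizon_0")
```

The filter is far cheaper to implement than a new renderer, at the cost of showing the user only one relationship at a time.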

2.1.7.5 Interconnecting different visualizations

The mechanism by which to display two graphs using two different layouts while having edges between nodes of the two graphs is not initially evident. For example, it is not clear how to connect a hyperbolic tree layout for the PAA and a tree map for the capability framework with edges between nodes of the two structures. However, displaying the same data in two different visualization types and moving back and forth between visualization types, where one reflects changes in the other, may be possible.

3 Prefuse development has been opened up to the community and updates are available at github.com/prefuse/Prefuse.
4 See http://www.nomencurator.org/InfoVis2003/download/treeml.dtd.

2.2 Flare

Initial implementations in Flare focussed on the list of preferred visualizations provided by the SA in the SOW. From that list the following visualizations were implemented, with varying degrees of success, using a subset of the CBP dataset:

• Dependency graph

• Hyperbolic tree

Each of these visualizations was generated during the familiarization phase. Very little customization was done on the visualization beyond small tweaks to the code to enable the selected portion of the GraphML file to display in Flare. Basic interaction techniques were explored including selection of nodes, highlighting selected nodes and changing the color of selected edges.

The visualization types that were not implemented include:

• Tree view

• Tree map

• Flow map

• Self-organizing map

• Gauges

• Starfield

• Basic visualizations (as demonstrated on the Flex visualization toolkit website)

The remainder of this section presents the implemented visualizations, the rationale for not implementing the remaining visualizations and a discussion on the successes, challenges and anticipated capabilities of Flare with respect to the implementation of these visualizations for the CBP dataset.

2.2.1 Dependency graph

The Flare dependency graph was visually attractive ‘out of the box’. The edges were bundled in such a way as to minimize clutter on the screen (see Figure 13). When a node was clicked, only related edges remained, and these were colour-coded to denote ‘…influences…’ or ‘…is influenced by…’ (see Figure 14).

Figure 13: Flare Dependency Graph

Figure 14: Flare Dependency Graph showing selected node and related edges

There were some errors in translation from the example provided on the Flare website to the implementation realized by the development team. For example:

• Nodes are not clickable

• Node labels are all rendered horizontally

• Only lowest level nodes are visible (not the tree structure)

2.2.2 Hyperbolic Tree

The Hyperbolic Tree in Flare was not as immediately appealing as the Dependency Graph. Also, in Flare, it was called a ‘Circle Graph’. However, in common with a Hyperbolic Tree, it would reorganize around the selected node. The Hyperbolic Tree in Flare is shown in Figure 15.

Figure 15: Hyperbolic Tree in Flare

This preliminary implementation of the Hyperbolic Tree did not include any features such as edge labelling or hiding of unconnected edges and nodes. However, the reorganization of the map to place the selected node at the centre could be useful. In particular, if the subsequent positioning of nodes relative to the selected node corresponded to some value, such as the value of the edge connecting them, such that high value edges resulted in nodes being closer together, this could be useful in providing the viewer with additional cues to the structure and values contained in the dataset.
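One way to read the suggestion above is as a radial placement rule in which the distance from the selected node shrinks as the connecting edge value grows. The sketch below is not Flare or Prefuse code; the linear mapping, the radius bounds, and the sample values are assumptions chosen only to make the idea concrete.

```python
import math


def radial_positions(edge_values, min_radius=50.0, max_radius=200.0):
    """Place neighbours around a selected node located at the origin.

    `edge_values` maps neighbour name -> value of the edge joining it to the
    selected node. Higher values map linearly to smaller radii.
    """
    if not edge_values:
        return {}
    low = min(edge_values.values())
    high = max(edge_values.values())
    span = high - low
    positions = {}
    for i, (name, value) in enumerate(sorted(edge_values.items())):
        # Normalise to [0, 1]; if all values are equal, use the midpoint.
        strength = (value - low) / span if span else 0.5
        radius = max_radius - strength * (max_radius - min_radius)
        # Spread neighbours evenly around the circle, in name order.
        angle = 2.0 * math.pi * i / len(edge_values)
        positions[name] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


# Hypothetical edge values for three neighbours of the selected node.
print(radial_positions({"a": 1.0, "b": 3.0, "c": 2.0}))
```

A non-linear mapping (for example logarithmic) may suit skewed edge values better; the linear form is used here only because it is the simplest to inspect.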

2.2.3 Interaction features

Interaction with the visualizations included node selection, edge selection, and highlighting of edges and nodes. No further interaction or animations were investigated in this phase of the project for the Flare toolkit.

2.2.4 Visualization types that were not implemented

The remaining visualization types were not implemented due to perceived limitations of the Flare toolkit in comparison to DND/CF Information Technology (IT) infrastructure requirements. In particular:

• the need to learn ActionScript rather than Java;

• the use of Adobe Flash tools to create an information visualization tool;

• the need to maintain up-to-date versions of browser plug-ins and Flash players; and

• ActionScript and Flash’s focus on graphics at the expense of algorithmic processing power.

Given these considerations, the Flare toolkit was considered ill-suited to the needs of DND.

2.2.5 Approach and lessons learned

Visualizations were generated following example open source code. Though this toolkit is capable of generating aesthetically pleasing visualizations, the fact that it is based in ActionScript (i.e. targets browsers) is viewed as a potential barrier to future development efforts. Flare is primarily a graphics/animation engine with less ability to draw upon or be interoperable with other computing resources, making it less likely to be as flexible and responsive to the development needs of DND. Any application developed using this toolkit would need to be linked to a better back end. It would also be difficult to keep up to date (in a secure information technology environment like that at DND) due to frequent version changes to Flash. For these reasons, Flare received less attention from the development team.

2.3 Graphical User Interface

In addition to investigating the implementation of specific visualization types and interaction techniques in Prefuse and Flare, a Graphical User Interface (GUI) was developed as a conceptual sketch to address user requirements for visualizing and using the CBP dataset from a more holistic perspective. This section presents this GUI and explains how it is intended to house the information visualizations and to support interaction and sensemaking with the CBP model.

2.3.1 GUI Concept Evolution

A number of the requirements generated during phase one of this project relate to interaction with the visualization in a way that will support sensemaking and data analysis. For example:

• Consistent visual language across visualizations;

• The ability to use different layouts with different datasets to obtain clear views of the data;

• The capability to gain an overview of the entire collection of data;

• Filtering data in and out, i.e. select subsets of the data to view according to logical parameters;

• Use of legends, datatips, and other contextual views of supportive information (mouse-over, hover, etc.); and,

• Visual separation, categorical boundaries and groupings, and other guidelines in accordance with human factors and human-computer interaction best practices.

The common GUI approach of presenting a single application window with several different working areas and views provided some inspiration for development of a GUI that could be used to facilitate manipulation of the visualizations and interaction with the dataset. Screenshots of a tool (yEd: http://www.yworks.com/en/products_yed_about.html ) in Figure 16 and Figure 17 show the same SCM dataset filtered in different ways. The main menu options along the top provide additional viewing and filtering options.

Figure 16: Inspiration for overview development: yEd – Graph Editor (http://www.yworks.com/en/products_yed_about.html)

Figure 17: Example of yEd Graph Editor interface including an Overview pane, Successors pane, Structure view, Main window, Palette pane for editing, and Properties view.

Ideas and evolutions of the GUI concept for an information visualization application are presented in Figure 18 through Figure 21.

Figure 18: Initial concepts for information visualization interface layout and functionality

Figure 18 depicts initial thoughts about the development of a windowed application that would simultaneously present multiple perspectives on the data, from overviews to details. This would address the likely needs of the user to understand the detailed relationships and characteristics of the data while also appreciating the impact on the dataset as a whole.

Figure 19 depicts some initial ideas for the default main view. The intent was that the dependency graph would be the foundation view, augmented by a graph around the circumference appropriate to the dataset being represented (in this view, a hierarchy). The dataset would be successively aggregated as less detail was required by the user (i.e. as the user focuses attention, through their cursor, further away from a particular dataset). Figure 19 also shows ideas for a modified fisheye view as well as a node detail view, which would be ‘held’ until the next cursor selection within the window. Figure 19 also has filter fields and settings, as well as a search function to support user navigation of large datasets.

Figure 19: Initial concepts for information visualization interface design and specific capabilities

Figure 20: Second iteration of information visualization interface layout

Figure 21: Third iteration of information visualization interface layout

Figure 20 and Figure 21 depict iterations of the windowed application. In particular, Figure 21 shows the tabbed main view, which provides the user quick access to different views of the data, all of which are linked to whatever manipulation is carried out in another view.

2.3.2 GUI Prototype

A GUI prototype was developed (see Figure 22 and Figure 23) to function with the dependency graph visualization built using Prefuse. This GUI houses eight frames:

1. Overview window (top left) shows the entire visualization with an inset marking the area the user has currently ‘zoomed/panned to’ in the Main view;

2. Neighbours window (middle left) shows the nearest neighbour nodes to the selected node;

3. Properties window (bottom left) lists the properties of the selected nodes including values and attributes;

4. Main view (top centre) is the largest central frame where the user can interact with the visualization;

5. Dataset view (bottom centre) shows a tree-view of the selected node;

6. Visibility filters (top right) enable the user to show or hide some or all of the dataset;

7. Colour filters (middle right) allow the user to apply a colour to edges with a value above a specified threshold; and,

8. Bottom right is currently unassigned.
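The colour filter in frame 7 amounts to a threshold test on each edge value. A minimal sketch follows; the colour names, the edge tuples, and the strict "greater than" comparison are assumptions, and the prototype's actual Prefuse implementation is not shown in this report.

```python
DEFAULT_COLOUR = "grey"


def colour_edges(edges, threshold, highlight="red"):
    """Map each (source, target) pair to a colour based on its edge value.

    `edges` is an iterable of (source, target, value) tuples. Edges whose
    value is strictly above `threshold` receive the highlight colour; all
    others keep the default colour.
    """
    return {
        (source, target): highlight if value > threshold else DEFAULT_COLOUR
        for source, target, value in edges
    }


# Hypothetical edge values, for illustration only.
sample = [("n1", "n2", 0.2), ("n2", "n3", 0.9), ("n1", "n3", 0.5)]
print(colour_edges(sample, threshold=0.5))
```

The same test could be applied to node values if the filter were extended beyond edges.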

Interaction functionality includes:

1. A search function at the top left;

2. Zooming in and out using the scroll wheel on mouse;

3. Click and drag to re-centre the visualization in the main window;

4. Click on any node and it moves to centre and shows only related edges; and,

5. Different datasets are colour coded.

For this initial prototype, the main window and dataset views do not work together. Screenshots of the prototype appear in Figure 22 and Figure 23.

Figure 22: First prototype: initial screen - dependency graph

Figure 23: First prototype: node selection and zoom in on main window and tree view

3 Assessment of initial implementations

The assessment of initial visualization implementations was conducted collaboratively between the CAE PS team (i.e. software developers and human factors specialists) and the SA during a face to face meeting on March 21, 2011. The minutes from this meeting are presented in Annex A.

At the meeting, the team reviewed the prototypes one-by-one and discussed:

1. The level of effort required to generate these visualizations;

2. Their ability to meet visualization and interaction requirements; and

3. Scenarios where the visualization might be useful in the context of the CBP process.

It was subsequently determined by those in attendance that:

1. Development work should continue with Prefuse only. The rationale for this decision was the perceived flexibility of Prefuse in comparison to Flare, the development team's greater level of comfort with Prefuse than with Flare, the belief that there are likely a number of software tools that could be used to augment the visual appeal of Prefuse products if so desired in future, and the acknowledgement that at this point in the development process focussing resources in one area would facilitate development of more advanced visualization implementations. In short, compared to the Flash-based Flare, the Java-based Prefuse toolkit allows for platform independence, additional calculation power and increased deployability within DND/CF.

2. The following visualizations would be pursued:

a. Flow map: this need not be linked to geographic information and could instead be linked to hierarchy or time horizon. For example, the SCM dataset may be used to generate a flow map by computing the total at the highest level of the hierarchy and then attributing costs at successive levels of decomposition at each node (based on node data). There may be ways of determining the coordinate points of the nodes; the locations could be based on the coordinates developed for the starfield map.

b. Self-organizing map: this would require two dimensions of data on the nodes in order to compute the geometric distance between nodes; these could easily be calculated from the capability scores in two or more scenarios.

c. Starfield would require at least two dimensions of data on the nodes, which could be calculated from the capability scores in two (or more) scenarios.

d. Word-cloud where proximity is associated with edge value and word ‘size’ is based on node data.

3. Gauges would not be prototyped because they are not part of the Prefuse set of visualizations.

4. CAE PS would modify the provided GraphML files as required to generate desired visualizations.

5. All data subsets should be linked, resulting in the creation of a generic interface to display the dataset. This interface should be in line with the GUI concepts presented by CAE PS including multiple tabs giving user options of how to view the visualizations and including filtering and search capabilities.
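Items 2a to 2c above rest on two small computations: totalling costs over a hierarchy, and measuring the geometric distance between nodes from their capability scores in two or more scenarios. The sketch below shows one possible reading of each. The bottom-up summation, the Euclidean distance, and all node names and values are assumptions for illustration; they are not taken from the SCM dataset.

```python
import math


def roll_up_costs(children, leaf_costs, root):
    """Return {node: total cost} for the hierarchy below `root`.

    A parent's total is the sum of its children's totals; a leaf's total is
    its own cost (0.0 if none is recorded).
    """
    totals = {}

    def visit(node):
        kids = children.get(node, [])
        if kids:
            totals[node] = sum(visit(kid) for kid in kids)
        else:
            totals[node] = leaf_costs.get(node, 0.0)
        return totals[node]

    visit(root)
    return totals


def scenario_distance(scores_a, scores_b):
    """Euclidean distance between two nodes' capability scores.

    Each argument holds one score per scenario, in the same scenario order.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(scores_a, scores_b)))


# Hypothetical hierarchy, costs, and scores.
hierarchy = {"total": ["land", "air"], "land": ["armour", "infantry"]}
costs = {"armour": 3.0, "infantry": 2.0, "air": 4.0}
print(roll_up_costs(hierarchy, costs, "total"))
print(scenario_distance((1.0, 2.0), (4.0, 6.0)))
```

With two scenarios, the pair of scores for a node could also serve directly as its starfield coordinates, which is consistent with the suggestion that flow map locations reuse the starfield coordinates.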

4 Creative development

Immediately following this assessment, the team engaged in creative development, a collaborative effort between the contractor and the Scientific Authority. This activity involved generating ideas for new iterations and augmentations of the visualizations and associated interaction techniques that would be of use to identified users. Specifically, the team:

1. Generated a wish list for prototype enhancements;

2. Conducted a review of how the data set could be used or modified to generate specific data visualizations (i.e. add fictitious data for the purpose of supporting visualizations); and

3. Conducted a review and discussion of proposed overview screen options for user interaction with the dataset via visualizations, filters, search functions, etc.

The following was the initial use scenario presented during the creative development session. The text is taken directly from the presentation:

“Upon opening the GUI, a default overview visualization of the dataset appears. The user can filter the dataset based on pre identified data subsets (PAA, capability framework, horizon 1, 2, 3 etc.). The user can then select a different tab to display an alternate view of that same filtered data. These tabs will be linked such that the user can move back and forth between tabs without having to re-select filtering criteria.”
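The linked-tab behaviour described in this scenario can be sketched as a single shared filter model that every tab observes, so that filter criteria are chosen once and each view updates from the same state. The class and method names below are hypothetical and stand in for whatever GUI components the prototype uses.

```python
class SharedFilter:
    """Holds the active data subsets and notifies registered views."""

    def __init__(self):
        self._subsets = set()
        self._listeners = []

    def subscribe(self, callback):
        """Register a callable to be invoked whenever the filter changes."""
        self._listeners.append(callback)

    def set_subsets(self, subsets):
        """Replace the active subsets and push the change to every view."""
        self._subsets = set(subsets)
        for callback in self._listeners:
            # Each view receives its own copy of the current selection.
            callback(set(self._subsets))


class TabView:
    """Stand-in for one visualization tab; it only records what to show."""

    def __init__(self, name, shared_filter):
        self.name = name
        self.visible_subsets = set()
        shared_filter.subscribe(self._on_filter_changed)

    def _on_filter_changed(self, subsets):
        self.visible_subsets = subsets


shared = SharedFilter()
tabs = [TabView("dependency graph", shared), TabView("tree view", shared)]
shared.set_subsets({"PAA", "horizon 1"})
print({tab.name: sorted(tab.visible_subsets) for tab in tabs})
```

Because the tabs hold no filter state of their own beyond what the shared model sends them, switching tabs never requires the criteria to be re-selected.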

These efforts resulted in the following list of proposed development activities (see Table 2; items with a ‘*’ under the column labelled ‘Priority’ were designated of highest importance):

Table 2: Proposed visualization development activities

| No. | Visualization/Functionality Development Activities | Priority |
|-----|-----|-----|
| 1 | CAE PS to decide the number of edges between nodes | |
| 2 | Solution to the problem of showing all datasets in a single view might be to represent the hierarchies on the outwards with the connected nodes (lowest level) towards the centre of the circle (note, this may be superseded by ongoing development) | |
| 3 | Flow map. It may not necessarily be linked to geographic info, could be linked to hierarchy or time horizon | * |
| 4 | Self organizing map, will require two dimensions of data on nodes in order to compute the geometric distance between nodes | * |
| 5 | Starfield will require two dimensions of data on the nodes | * |
| 6 | Word-cloud where proximity is associated with edge value and word ‘size’ is based on node data | * |
| 7 | CAE PS to generate required “dummy” data in the form of node or edge attributes and distance metrics as required to achieve visualizations that may be interesting | |
| 8 | Breadcrumbs (see footnote 5) are required | |
| 9 | Investigate attributes that will allow bundling of data points (aggregation) to support readability | |
| 10 | Try improving the labels, e.g. shorter and directing away from the centre (note, this may be superseded by ongoing development) | |
| 11 | Explore coloring the nodes, groups of nodes, filtering and aggregation: can one zoom in and out of aggregation to view the dataset at different levels? | |
| 12 | CAE PS to implement full functionality on tree view | * |
| 13 | Investigate functionality of the tree diagram: is it possible to highlight other aspects of the data? Bundle links? Illustrate a relationship between nodes by making them closer together? | |
| 14 | In the overview investigate a way to illustrate to the user the scope of the dataset they have filtered to (so they know how much they have filtered out) | |
| 15 | Investigate the use of keywords to guide the user’s search (like Google), e.g. alphabetized ‘address book’, indented list, etc. | |
| 16 | CAE PS to move forward with Java development and put all resources on Prefuse | |
| 17 | CAE PS to evaluate the feasibility of the visualization tool/interface offering the user the ability to update the GraphML file by assigning nodes a new name, edge, value, attribute, etc. | |
| 18 | Comment on the possibility of integrating with MatLab in the report (C can be wrapped in Java, MatLab can be wrapped in C, etc.) | |
| 19 | CAE PS to send a revised outline of this phase 1 report to the scientific authority | |
| 20 | By Friday, March 25th, CAE PS to send notes from this meeting and programmer’s estimate of feasibility of work (as described above) to be completed on schedule | |

5 Breadcrumbs refers to a navigation aid used in user interfaces. There are three types of breadcrumbs: path (that the user has taken to reach that page); location (where the page is located in the website hierarchy); and attribute (information that categorizes the current page).

5 Feasibility assessment

Following creative development and the identification of desirable iterations of the visualizations, each proposed activity was assessed for feasibility by software development professionals familiar with the CBP dataset and visualization toolkits. The feasibility assessment was based on a rating system of “high”, “medium”, “low”, where each activity associated with development was scored as follows:

1. H (high) if it was estimated that the completion of the activity was achievable within the timeframe and resources available for the project;

2. M (medium) if it was estimated there was a good chance the activity may be achievable within the timeframe and resources available for the project; and,

3. L (low) if it was unlikely the development team would be able to address the activity within the time remaining on the contract.

The complete list of enhanced implementation activities proposed as a result of creative development and meetings with the SA is presented in Table 3.

Table 3: Feasibility Analysis for Additional Development of Visualizations in Prefuse

| No. | Visualization/Functionality Development Activities | Priority | Feasibility (H/M/L) | Notes |
|-----|-----|-----|-----|-----|
| 1 | CAE PS to decide the number of edges between nodes | | H | |
| 2 | Solution to the problem of showing all datasets in a single view might be to represent the hierarchies on the outwards with the connected nodes (lowest level) towards the centre of the circle (note, this may be superseded by ongoing development) | | L | Not enough time or budget remaining to write new code to achieve this objective |
| 3 | Flow map. It may not necessarily be linked to geographic info, could be linked to hierarchy or time horizon. | * | M to H | The SCM dataset may be used to generate flow map, e.g. could compute the total at the highest level of the hierarchy, then attribute costs at successive levels of decomposition at each node (based on node data). There may be ways of determining the coordinate points of the nodes. The locations could be based on the coordinates developed for the starfield map |
| 4 | Self organizing map, will require two dimensions of data on nodes in order to compute the geometric distance between nodes | * | M to H | |
| 5 | Starfield will require two dimensions of data on the nodes | * | M to H | |
| 6 | Word-cloud where proximity is associated with edge value and word ‘size’ is based on node data | * | M to H | |
| 7 | CAE PS to generate required “dummy” data in the form of node or edge attributes and distance metrics as required to achieve visualizations that may be interesting | | H | CAE PS has decided to use SCM data to demonstrate Flow Map, Self Organizing Map, Starfield and Word Cloud |
| 8 | Breadcrumbs are required | | M | Implementation will probably not be the one expected by customer. Instead of showing what actions have been taken, breadcrumbs will be more like ‘undo’ function |
| 9 | Investigate attributes that will allow bundling of data points (aggregation) to support readability | | H | |
| 10 | Try improving the labels, e.g. shorter and directing away from the centre (note, this may be superseded by ongoing development) | | H | |
| 11 | Explore coloring the nodes, groups of nodes, filtering and aggregation: can one zoom in and out of aggregation to view the dataset at different levels? | | M | |
| 12 | CAE PS to implement full functionality on tree view | * | H | |
| 13 | Investigate functionality of the tree diagram: is it possible to highlight other aspects of the data? Bundle links? Illustrate a relationship between nodes by making them closer together? | | L to M | Not enough time or budget remaining to write new code to achieve this objective |
| 14 | In the overview investigate a way to illustrate to the user the scope of the dataset they have filtered to (so they know how much they have filtered out) | | M | More discussion required on how this should be implemented |
| 15 | Investigate the use of keywords to guide the user’s search (like Google), e.g. alphabetized ‘address book’, indented list, etc. | | L | |
| 16 | CAE PS to move forward with Java development and put all resources on Prefuse | | H | |
| 17 | CAE PS to evaluate the feasibility of the visualization tool/interface offering the user the ability to update the GraphML file by assigning nodes a new name, edge, value, attribute, etc. | | L | |
| 18 | Comment on the possibility of integrating with MatLab in the report (C can be wrapped in Java, MatLab can be wrapped in C, etc.) | | L | |

6 Enhanced implementation plan

On the basis of the initial implementation of the prototype, an enhanced implementation plan was developed. The enhanced implementation plan takes the form of a functional specification for the final prototype to be delivered under this contract. The functional specification is provided in the bulleted list below (list is in no particular order).

• Build an information visualization application that permits presentation of several different visualizations of the same or related data.

• Provide different frames in the application so the user has multiple levels and perspectives on data in a single view, allowing them to maintain situation awareness of the model at all times.

• Implement the dependency graph, force directed graph (instead of a self-organizing map), tree map, tree view, starfield map, and word cloud.

• Take the force directed graph and make the font size proportional to the sum of node values.

• Implement a Parallel Coordinates graph.

• Allow quick access to different visualizations through tabbed frames.

• Include interactivity such as filters, search, zoom in and out, click-and-drag panning, and fisheye lens effects.

• Include graphical coding strategies such as colours, labels, grouping/bundling, etc. as appropriate.

• Permit algorithms that draw the visualizations to be manipulated by the user.

• Make the visualization engines data ‘agnostic’; that is, ensure that the visualization engine can read any standard format file (e.g. GraphML) and extract the salient structural features in order to render the data appropriately, rather than hard-coding the visualization engine to render a specific data file correctly.

• Bundle edges so that the dependency graph looks more like the Flare version of the visualization.

• Use bars on the edge of the dependency graph (e.g. use cost data to generate length of bars that grow outwards from the nodes). The bars should include the node label/name.

• Orient the labels (or bars) outwards from the node.

• Enable collapsing and expanding of the tree structure that appears in the bottom pane.

• The threshold filter is currently based on edge data, but it could also be valid for node data. The GUI should show the two options even if functionality is currently limited to edges.

• Implement the ability to re-order nodes.

• Implement the ability to change the diameter of the fisheye lens in accordance with user inputs.

• Implement interconnectedness between all tabs and views (e.g. GUI tabs are linked, overview is linked; a change executed on one graph will propagate to all other tabs).
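The data-agnostic requirement in the specification above can be illustrated with a reader that takes its attribute names from the key declarations in the GraphML file itself instead of hard-coding them. The sketch uses Python's standard XML parser rather than Prefuse's own GraphML reader, and the sample document, attribute names, and values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical GraphML document; attribute names and values are illustrative.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="k0" for="node" attr.name="cost" attr.type="double"/>
  <key id="k1" for="edge" attr.name="weight" attr.type="double"/>
  <graph edgedefault="directed">
    <node id="n1"><data key="k0">3.5</data></node>
    <node id="n2"/>
    <edge source="n1" target="n2"><data key="k1">0.8</data></edge>
  </graph>
</graphml>"""


def read_graphml(text):
    """Return (nodes, edges) using only what the file itself declares."""
    root = ET.fromstring(text)
    # Map key ids to the attribute names declared in the file.
    names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}

    def attributes(element):
        # Fall back to the raw key id if the file declares no name for it.
        return {names.get(d.get("key"), d.get("key")): d.text
                for d in element.findall("g:data", NS)}

    nodes = {n.get("id"): attributes(n) for n in root.findall(".//g:node", NS)}
    edges = [(e.get("source"), e.get("target"), attributes(e))
             for e in root.findall(".//g:edge", NS)]
    return nodes, edges


nodes, edges = read_graphml(SAMPLE)
print(nodes)
print(edges)
```

Values are left as strings here; a fuller reader would convert them according to each key's declared attr.type.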

Annex A Minutes of the Implementation Assessment and Creative Development Meeting

Development of Interactive Information Visualization and Data Manipulation Prototypes for Strategic Planning Applications

Implementation Assessment and Creative Development Meeting March 21, 2011

Attendees

CAE PS: Tab Lamoureux, Catherine Campbell, Gregoire Seguin, and Grant Smith

DRDC: Chad Young

Activities

1. Presentation of data visualization prototypes by CAE PS

a. Overview of work to date

b. Overview of prototypes generated in Prefuse

c. Discussion of outstanding issues and options

i. Continue with development of visualizations in both Prefuse and Flare?

ii. Potential future migration to alternative visualization toolkit?

iii. Review of schedule and deliverables

2. Implementation assessment and creative development

a. Review of each data visualization type and wish list for prototype enhancements

b. Review of how the data set could be used or modified to generate specific data visualizations (i.e. add fictitious data for the purpose of supporting visualizations)

c. Review and discussion of proposed Overview screen options for user interaction with the dataset via visualizations, filters, search functions, etc.

3. Review of SOW and planned report contents

Meeting Notes

Progress to date (CAE PS):

1. Literature review, familiarization with the dataset, familiarization with the visualization types and available toolkits complete. Report pending.

2. Most of the development focus has been using Prefuse

3. Recent addition of two software specialists to support further development

4. No effort has gone into the visual appeal of the visualizations; up to this point we have just been trying to generate the visualizations using the dataset

5. We are on track for delivery of the phase 1 report on April 18 and prototype visualizations on April 29th.

Scientific Authority Comment: Overall objective of this contract is to prove that this type of dataset can be visualized. The next step will be to take these prototypes and implement them using other datasets to generate prototypes that DC can work/experiment with. One step further would be the development of a deployable visualization tool.

Discussion of prototypes presented:

6. With respect to the SCM dataset, CAE PS to decide the number of edges between nodes: 6 edges, one per attribute, or one edge with multiple attributes. Note: The existing database (where real SCM data is held) assigns multiple attributes to both edges and nodes as appropriate. The question is which is easier to implement/represent with respect to generating data visualizations?

7. Perhaps Treemap could be used for SCM?

8. Gauges will or will not be prototyped depending on if we choose to proceed with Prefuse or Flare

9. The example Prefuse tree is easier to navigate through, it should be able to expand/contract

10. The context/hierarchy of the dataset is lost in the dependency graph prototype – a possible solution might be to represent the hierarchies on the outwards with the connected nodes (lowest level) towards the centre of the circle.

11. Would like to see the following visualizations:

a. Flow map, may not necessarily be linked to geographic info, could be linked to hierarchy or time horizon. E.g. the SCM dataset may be used to generate flow map, e.g. could compute the total at the highest level of the hierarchy, then attribute costs at successive levels of decomposition at each node (based on node data). There may be ways of determining the coordinate points of the nodes. The locations could be based on the coordinates developed for the starfield map

b. Self organizing map, will require two dimensions of data on nodes in order to compute the geometric distance between nodes

c. Starfield will require two dimensions of data on the nodes

d. Word-cloud where proximity is associated with edge value and word ‘size’ is based on node data

iv. Note: These may require that additional “dummy” data be generated to support the visualization. CAE PS to generate required “dummy” data in the form of node or edge attributes and distance metrics as required to achieve visualizations that may be interesting.

12. SA leaning towards ActionScript. The issue with these toolkits is that they are monolithic; the user has to know the desired visualization, and that is not what DRDC is looking for here.

13. Performance issues are not a priority at this time, the real dataset is approximately 10x the one given to CAE PS to work with.

14. Future development could take two tracks (SA prefers option a)

a. CAE PS takes the graphML file and modifies it as required to generate desired visualizations, no nested trees, all datasets are linked, the result is creation of a generic interface to display the dataset. This is in line with the overview presented by CAE PS (with multiple windows giving user options of how to view the visualizations and including filtering and search capabilities). Desired scenario: on screen one the user picks their filters for the dataset and can then select a tab to display an alternate view of that same filtered data. The user should be able to filter the data once and then view it in multiple visualization forms using the tabs.

b. CAE PS creates individual visualizations and works to make each of them as good as possible within the time remaining.

15. Breadcrumbs are required

16. Investigate attributes that will allow bundling of data points (aggregation) to support readability – dynamic visualizations where the selected data point moves into the centre and the graph reorganizes accordingly do not work with large datasets, but if these large datasets are bundled (e.g. according to a hierarchy) to display like a smaller dataset, it works.

17. Try improving the labels, e.g. shorter and directing away from the centre (as in the Flare example)

18. Explore coloring the nodes, groups of nodes, filtering and aggregation: can one zoom in and out of aggregation to view the dataset at different levels?

19. CAE PS to implement full functionality on tree view

20. Investigate functionality of the tree diagram: is it possible to highlight other aspects of the data? Bundle links? Illustrate a relationship between nodes by making them closer together?

21. In the overview investigate a way to illustrate to the user the scope of the dataset they have filtered to (so they know how much they have filtered out)

22. Investigate the use of keywords to guide the user’s search (like Google), e.g. alphabetized ‘address book’, indented list, etc.

23. CAE PS to move forward with Java development and put all resources on Prefuse

24. CAE PS to evaluate the feasibility of the visualization tool/interface offering the user the ability to update the GraphML file by assigning a node a new name, edge, value, attribute, etc.

25. Comment on the possibility of integrating with MATLAB in the report (C can be wrapped in Java, MATLAB can be wrapped in C, etc.)
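
As a rough illustration of what item 24 would involve, the sketch below edits a GraphML document using only the standard Java XML (DOM) classes. It is a minimal sketch under stated assumptions, not the tool's implementation: the class GraphMlEditor and its methods are hypothetical names introduced here, the code does not use Prefuse, and it assumes un-prefixed GraphML element names (node, edge, data, graph) with node names stored in a data element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Prefuse): edits GraphML held as a string.
class GraphMlEditor {

    // Parse GraphML text into a DOM tree.
    static Document parse(String graphMl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(graphMl)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse GraphML", e);
        }
    }

    // Serialize the DOM tree back to GraphML text.
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize GraphML", e);
        }
    }

    // Return the <node> element with the given id, or null if there is none.
    static Element findNode(Document doc, String nodeId) {
        NodeList nodes = doc.getElementsByTagName("node");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element n = (Element) nodes.item(i);
            if (nodeId.equals(n.getAttribute("id"))) {
                return n;
            }
        }
        return null;
    }

    // Set (or create) <data key="..."> on a node; false if the node is missing.
    static boolean setNodeData(Document doc, String nodeId, String key, String value) {
        Element node = findNode(doc, nodeId);
        if (node == null) {
            return false;
        }
        NodeList data = node.getElementsByTagName("data");
        for (int i = 0; i < data.getLength(); i++) {
            Element d = (Element) data.item(i);
            if (key.equals(d.getAttribute("key"))) {
                d.setTextContent(value);
                return true;
            }
        }
        Element d = doc.createElement("data");
        d.setAttribute("key", key);
        d.setTextContent(value);
        node.appendChild(d);
        return true;
    }

    // Append an <edge> to the first <graph>; false if an endpoint or graph is missing.
    static boolean addEdge(Document doc, String sourceId, String targetId) {
        Element graph = (Element) doc.getElementsByTagName("graph").item(0);
        if (graph == null || findNode(doc, sourceId) == null
                || findNode(doc, targetId) == null) {
            return false;
        }
        Element edge = doc.createElement("edge");
        edge.setAttribute("source", sourceId);
        edge.setAttribute("target", targetId);
        graph.appendChild(edge);
        return true;
    }
}
```

In an interface of the kind discussed above, a rename action would call parse, then setNodeData with the key that holds the node name, then serialize and write the result back to the file. Refusing edits that refer to unknown node ids keeps the file loadable after an update.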

Review of contract and deliverables:

26. Confirmed that first report (to be delivered April 18th) will include:

a. A list of problems CAE PS sees and the way forward

b. Links to sample code and visualization examples

27. CAE PS to send a revised outline of this phase 1 report to the scientific authority

28. By Friday, March 25th, CAE PS to send notes from this meeting and the programmer’s estimate of the feasibility of completing the work (as described above) on schedule.


List of symbols/abbreviations/acronyms/initialisms

CBP      Capability Based Planning
CF       Canadian Forces
DND      Department of National Defence
DRDC     Defence Research & Development Canada
DRDKIM   Director Research and Development Knowledge and Information Management
PAA      Program Activity Architecture
SCM      Strategic Cost Model
SOW      Statement of Work


This page intentionally left blank.

DOCUMENT CONTROL DATA
(Security classification of title, body of abstract and indexing annotation must be entered when the overall document is classified)

1. ORIGINATOR (The name and address of the organization preparing the document. Organizations for whom the document was prepared, e.g. Centre sponsoring a contractor's report, or tasking agency, are entered in section 8.)

CAE Professional Services (Canada) Inc., 1135 Innovation Dr., Ottawa, Ontario K2G 3G7

2. SECURITY CLASSIFICATION (Overall security classification of the document including special warning terms if applicable.)

UNCLASSIFIED (NON-CONTROLLED GOODS) DMC A REVIEW: GCEC June 2010

3. TITLE (The complete document title as indicated on the title page. Its classification should be indicated by the appropriate abbreviation (S, C or U) in parentheses after the title.)

Capability Based Planning and Information Visualization: Development of Visualizations

4. AUTHORS (last name, followed by initials – ranks, titles, etc. not to be used)

[author names illegible in source]

5. DATE OF PUBLICATION (Month and year of publication of document.)
6a. NO. OF PAGES (Total containing information, including Annexes, Appendices, etc.)
6b. NO. OF REFS (Total cited in document.)

March 2012    45

7. DESCRIPTIVE NOTES (The category of the document, e.g. technical report, technical note or memorandum. If appropriate, enter the type of report, e.g. interim, progress, summary, annual or final. Give the inclusive dates when a specific reporting period is covered.)

Contract Report

8. SPONSORING ACTIVITY (The name of the department project office or laboratory sponsoring the research and development – include address.)

Defence R&D Canada – CORA, 101 Colonel By Drive, Ottawa, Ontario K1A 0K2

9a. PROJECT OR GRANT NO. (If appropriate, the applicable research and development project or grant number under which the document was written. Please specify whether project or grant.)

9b. CONTRACT NO. (If appropriate, the applicable number under which the document was written.)

PWGSC W7714-11-5174

10a. ORIGINATOR'S DOCUMENT NUMBER (The official document number by which the document is identified by the originating activity. This number must be unique to this document.)

DRDC CORA CR 2012-065

10b. OTHER DOCUMENT NO(s). (Any other numbers which may be assigned this document either by the originator or by the sponsor.)

11. DOCUMENT AVAILABILITY (Any limitations on further dissemination of the document, other than those imposed by security classification.)

Unlimited

12. DOCUMENT ANNOUNCEMENT (Any limitation to the bibliographic announcement of this document. This will normally correspond to the Document Availability (11). However, where further distribution (beyond the audience specified in (11)) is possible, a wider announcement audience may be selected.)

Unlimited

13. ABSTRACT (A brief and factual summary of the document. It may also appear elsewhere in the body of the document itself. It is highly desirable that the abstract of classified documents be unclassified. Each paragraph of the abstract shall begin with an indication of the security classification of the information in the paragraph (unless the document itself is unclassified) represented as (S), (C), (R), or (U). It is not necessary to include here abstracts in both official languages unless the text is bilingual.)

The Capability Based Planning (CBP) process in Canada is based on an extensive data model that integrates scenario information, effects information, capabilities, force elements, a strategic cost model, and a Program Activity Architecture (PAA) model. Each element is a dense and highly inter-related set of data. Understanding the breadth and depth of considerations that must be made in CBP is difficult for senior decision-makers and their staffs. This contract investigates the potential for information visualization to support these people in understanding and making effective use of CBP model data to support decision-making. To this end, this report describes the preliminary use of two information visualization toolkits (Prefuse and Flare, selected during phase I of this work) to develop simple visualizations. This work had two purposes: familiarization with the toolkits, and evaluation of the relative strengths and weaknesses of the two toolkits. The output from this preliminary work is presented here, as are the results of an evaluation session involving the contract Scientific Authority (SA). Based on this preliminary work, Prefuse is recommended for further development work, and a number of features of a prototype information visualization application are introduced that may be helpful for CBP decision makers.

Au Canada, le processus de planification fondée sur les capacités (PFC) s’appuie sur un vaste modèle de données qui intègre de l’information sur le scénario, de l’information sur les effets, des capacités, des éléments de la force, un modèle de coût stratégique, et un modèle d’Architecture des activités de programme (AAP). Chaque élément contient un ensemble de données denses et hautement interreliées. La compréhension de l’étendue et de la portée des considérations qui peuvent être utilisées dans la PFC est difficile pour les principaux décideurs et leur effectif. Le présent travail permet d’examiner la possibilité d’utiliser la visualisation de l’information afin d’aider ces gens à comprendre et à utiliser efficacement les données du modèle de PFC pour appuyer la prise de décisions. À cette fin, le rapport décrit l’utilisation préliminaire de deux trousses d’outils de visualisation de l’information (Prefuse et Flare, qui ont été sélectionnés pendant la phase I de ce travail) afin de développer des outils de visualisation. Ce travail avait deux objectifs : la familiarisation avec les trousses d’outils et l’évaluation des forces et faiblesses relatives des deux trousses. Le résultat de ce travail préliminaire est présenté dans ce rapport, ainsi que les résultats de la séance d’évaluation faisant intervenir l’autorité scientifique des marchés (ASM). À partir des travaux préliminaires, Prefuse a été recommandé pour le travail de développement à venir. De plus, certaines caractéristiques d’un prototype d’application de visualisation de l’information ont été mises en œuvre et pourraient être utiles aux principaux décideurs.

14. KEYWORDS, DESCRIPTORS or IDENTIFIERS (Technically meaningful terms or short phrases that characterize a document and could be helpful in cataloguing the document. They should be selected so that no security classification is required. Identifiers, such as equipment model designation, trade name, military project code name, geographic location may also be included. If possible keywords should be selected from a published thesaurus, e.g. Thesaurus of Engineering and Scientific Terms (TEST) and that thesaurus identified. If it is not possible to select indexing terms which are Unclassified, the classification of each should be indicated as with the title.)

Capability Based Planning; Information Visualization; Strategic Planning