A System Design for the Dynamic Cognitive Mapping of Wiki Articles

By Daniel Pettus

A MASTER OF ENGINEERING REPORT

Submitted to the College of Engineering at Texas Tech University in Partial Fulfillment of The Requirements for the Degree of

MASTER OF ENGINEERING

Approved

______Dr. A. Ertas (Chair)

______Dr. T. Maxwell (member)

______Dr. M. Tanik (member)

______Dr. J. Woldstad (COE Representative)

October 25, 2008

ACKNOWLEDGEMENTS

I would like to thank Dr. Atila Ertas, the Department of Mechanical Engineering at Texas Tech University, the entire Texas Tech staff, and guest speakers for their role in organizing the Master of Engineering program for Raytheon employees.

I would like to thank my fellow students in the Master of Engineering program for their support in helping me accomplish my goals.

I would like to thank my coworkers and management for accommodating a class schedule and deadlines that often did not sync well with the schedule and deadlines of my professional career.

I would like to thank my family and loved ones for offering their support, understanding, enthusiasm, and sacrifice during this time.

Most of all, I would like to thank Minh “Michele” Luong, who encouraged me to join this program and who provided me with her confidence and support throughout. I would not have accomplished so much without her.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
DISCLAIMER
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
CHAPTER I INTRODUCTION
CHAPTER II BACKGROUND
2.1 The Background of Wikis
2.2 The Background of Cognitive Mapping
2.2.1 Concept Mapping
2.2.2 Mind Mapping
2.2.3 Dialogue Mapping
CHAPTER III ANALYSIS OF TOOLS
3.1 Cognitive Mapping Tools
3.1.1 Concept Mapping
3.1.1.1 CmapTools
3.1.2 Dialogue Mapping
3.1.2.1 Compendium
3.1.3 Mind Mapping
3.1.3.1 MindManager
3.1.3.2 FreeMind
3.1.3.2.1 FreeMind Flash Browser
3.1.4 Cognitive Mapping Conclusion
3.2 Wikis
3.2.1 Platypus Wiki
3.2.2 MediaWiki
3.2.2.1 Semantic MediaWiki
3.2.3 Wiki Conclusion
3.3 Source Parsing Tools
3.3.1 Natural Language Processing
3.3.2 Structure Parsing
3.3.3 Source Parsing Conclusion
CHAPTER IV WIKI-POWERED CONCEPT MAP DESIGN
4.1 Scope
4.2 Design Process
4.2.1 Requirements
4.2.2 System Architecture
4.2.2.1 Contextual
4.2.2.1.1 Interrogative Data
4.2.2.1.1.1 Why does the work need to be done? (Goals)
4.2.2.1.1.2 Who are the participants? (Stakeholders)
4.2.2.1.1.3 What information and entities are used? (Data / Entities)
4.2.2.1.1.4 How do they operate? (Processes)
4.2.2.1.1.5 Where do they operate? (Geography)
4.2.2.1.1.6 When are the various actions taken? (Timeline)
4.2.2.1.1.7 Relationship Matrix (Why vs. Who)
4.2.2.1.2 Context Diagram
4.2.2.1.3 Scope and Boundaries
4.2.2.1.3.1 Deliverables
4.2.2.1.3.1.1 External
4.2.2.1.3.1.2 Functionality
4.2.2.1.3.1.3 Data
4.2.2.1.3.1.4 Technical Structure
4.2.2.1.4 Quality Attributes
4.2.2.2 Operational
4.2.2.2.1 Operational Concept
4.2.2.2.2 Key Scenarios
4.2.2.2.3 Actors
4.2.2.2.4 Use Cases
4.2.2.2.5 Activity Diagram
4.2.2.3 Logical
4.2.2.3.1 Functional Decomposition
4.2.2.3.2 Logical Solution
4.2.2.3.3 Logical Block Diagram
4.2.2.3.4 Data Flow Diagram
4.2.2.3.5 Sequence Diagrams
4.2.2.3.5.1 Search and View Wiki Mind-map
4.2.2.3.5.2 Browse Wiki Mind-map
4.2.2.3.5.3 View Wiki Mind-map Source
4.2.2.3.6 Evaluation
4.2.2.4 Physical
4.2.2.4.1 Physical Components
4.2.2.4.1.1 Hardware
4.2.2.4.1.2 Software
4.2.2.4.1.3 Data
4.2.2.4.1.4 Interface
4.2.2.4.2 Physical Approach
4.2.2.4.3 Static View of Physical System
4.2.2.4.4 Trade of approaches to quality attributes
4.2.2.4.4.1 Usability
4.2.2.4.4.2 Robust
4.2.2.4.4.3 Reliability
CHAPTER V SYSTEM MOCKUP
5.1 Search and View Wiki Mind-map
5.1.1 Opening Page
5.1.2 Search and View Results
5.1.3 Search the Contents of the Map
5.2 Browse Wiki Mind-map
5.2.1 Expand the Nodes
5.3 View Wiki Mind-map Source
5.3.1 View Main Topic
5.3.2 View Categories
5.3.3 View Linked Topic
5.3.4 View External Page
CHAPTER VI SUMMARY AND CONCLUSIONS
REFERENCES
APPENDIX A ACRONYM LIST
APPENDIX B RESOURCE LIST

DISCLAIMER

The opinions expressed in this report are strictly those of the author and are not necessarily those of Raytheon, Texas Tech University, or any U.S. Government agency.

ABSTRACT

In recent years, wikis have become an important tool for online collaboration and information dissemination in the public, academic, and enterprise arenas.

“A wiki is a collection of web pages designed to enable anyone who accesses it to contribute or modify content, using a simplified markup language. Wikis are often used to create collaborative websites and to power community websites. For example, the collaborative encyclopedia Wikipedia is one of the best-known wikis. Wikis are used in businesses to provide affordable and effective intranets and for Knowledge Management.” [Wikipedia/Wiki, 2008]

Any user can easily add content to a wiki by means of its markup language. The underlying markup language, like HTML, is loosely structured. This allows an author or group of authors to determine, individually or collectively, how to organize, display, and reference content. Because of this loose structure, wikis run the risk of disorienting the reader. Structure and style may be inconsistent, and a reader may not understand the relevance of the material within the overall context.

Cognitive mapping can be a solution to this issue. Cognitive mapping considers thinking as a self-organizing information system, i.e., a cognitive map will maintain the relevance of information regardless of whether new information is added or existing information is modified or removed. Cognitive maps are graphical representations of the structure of this information. They can give a wiki the structure necessary to aid understanding.

The human brain typically identifies visual patterns more easily than non-visual ones. Visualization tools such as cognitive maps are also less restricted by barriers such as language. Using techniques to visualize information helps to communicate and clarify ideas, reveal hidden patterns and relationships, and gain insight into new ideas.

In this paper, we shall discuss some of the issues with wikis and describe a system that uses cognitive mapping to visualize the structure of information within wiki articles in an attempt to address these issues, and to utilize the brain’s ability to comprehend through visualization.

LIST OF FIGURES

Figure 1 The Problem with Wikipedia
Figure 2 Sample Concept Map
Figure 3 Sample Mind Map
Figure 4 Sample Dialogue Map
Figure 5 Context Diagram
Figure 6 Activity Diagram
Figure 7 Functional Decomposition Breakdown
Figure 8 Logical Block Diagram
Figure 9 Data Flow Diagram
Figure 10 Sequence Diagram - Search and View Wiki Mind-Map
Figure 11 Sequence Diagram - Browse Wiki Mind-Map
Figure 12 Sequence Diagram - View Wiki Mind-Map Source
Figure 13 Physical System
Figure 14 Mockup Opening Page
Figure 15 Mockup Search and View Results
Figure 16 Mockup Search the Contents of the Map
Figure 17 Mockup Expand the First Level Node
Figure 18 Mockup Expand Second Level Node
Figure 19 Mockup Select Main Topic
Figure 20 Mockup View Main Topic
Figure 21 Mockup Select Category
Figure 22 Mockup View Category
Figure 23 Mockup Select Linked Topic
Figure 24 Mockup View Linked Topic
Figure 25 Mockup Select External Page
Figure 26 Mockup View External Page

LIST OF TABLES

Table 1 System Requirements
Table 2 Relationship Matrix
Table 3 External Deliverables
Table 4 Functionality Deliverables
Table 5 Data Deliverables
Table 6 Technical Deliverables
Table 7 Quality Pugh Chart
Table 8 Use Case - Search for wiki articles
Table 9 Use Case - Render mind-map of wiki article
Table 10 Use Case - Browse linked wiki articles
Table 11 Use Case - View original wiki article

CHAPTER I INTRODUCTION

Bo Leuf and Ward Cunningham, developers of the first wiki, asked the following question,

“What is the main limit of current network-based collaboration models?” [Leuf and Cunningham, 2001]

They were not satisfied with the models of the day, i.e., e-mail exchange and mailing lists, shared repositories, and interactive content update systems employing version control techniques. These models all suffer from a high degree of redundancy and make it difficult to write in collaboration with others.

In the 1990s, wikis became a more convenient and popular collaboration model. Some of the reasons that wikis became popular are as follows:

• It is very easy to add content to a wiki by means of its simple markup language

• A wiki’s underlying markup language is not tightly structured, so that people can decide collectively how to organize their content

• Any user can add or organize content in a wiki

• Wikis actively allow for and encourage collaboration

Although wikis clearly have their benefits, they also have their limitations. Wiki contributors add to a wiki topic using their own dialect. The use of such general-purpose, or natural, language makes it easy for these users, in their own fashion, to contribute to the knowledge sharing effort in a wiki.

Conversely, the use of natural language restricts the possible set of knowledge providers and consumers to people who comprehend the language. Such restrictions place limits on the level of collaboration that these systems were meant to enhance.

The loose structure of wikis is both a benefit and a limitation. The starting phase of a wiki article is crucial in determining its success. Although a loose structure encourages content generation, if the structure of the article is not well defined at the outset, then further additions by others will amplify the initial problem. There is no taxonomy previously decided on by the wiki community, so information in the article often grows organically. When this occurs, there eventually comes a moment when it becomes necessary to restructure the article to put everything in order. This is often not an easy task.

A third limitation is the fact that links within an article are often unidirectional, i.e., the linked topic has no knowledge of the referrer. To understand a topic, a reader may begin to follow the included links to gain further insight. Unfortunately, within a few levels of indirection, the reader may be reading topics that have little or nothing to do with the original material.

Figure 1 The Problem with Wikipedia

When you combine these limitations with the fact that structure between distinct topics is usually inconsistent, the risk of disorientation is high.

Throughout the course of human evolution, the brain has had a much longer time to develop the skills to recognize, understand, and recall visual patterns than it has had to develop formal languages, spoken or written. As a result, the human brain recognizes shapes and drawings more easily than it understands words and numbers.

Barriers introduced by the use of natural language can be overcome by the use of a visual representation of the information within the article. Disorientation due to poor structure is also overcome by the mind’s ability to easily comprehend visual information. Visual representation can also provide authors a way to see relationships between concepts, allowing them to be aware of the inner structure of what they are writing.

This paper will define a lightweight and convenient system that uses cognitive mapping techniques to overcome the aforementioned limitations of the wiki model through visual representation. The system aims to clarify relationships between concepts written in wiki articles. It will highlight the structure of a wiki article so that the user can more easily identify, recognize, and understand the content within.

CHAPTER II BACKGROUND

2.1 The Background of Wikis

In 1994, Howard G. “Ward” Cunningham designed and implemented a system for authoring and linking web pages using only a web browser. This software was titled WikiWikiWeb, or Wiki for short. The term “WikiWiki” was chosen because he remembered once being told to take the Wiki Wiki Shuttle at the Honolulu International Airport. The word “wiki” in Hawaiian means “fast” and Cunningham thought the word would accurately reflect his system. He describes a wiki as “the simplest online database that could possibly work.” [Leuf and Cunningham, 2001]

In 1995, Cunningham deployed his wiki as an add-on to the Portland Pattern Repository, a site hosted by Cunningham’s company, C2, where topics on software development patterns are discussed.

Since then, the wiki concept has found wide acceptance, first within other technical and academic communities and then within the general population. The most well-known example, Wikipedia, currently contains “over 10 million articles in 253 languages” [Wikipedia/Wikipedia, 2008], and is arguably the most accessed resource of information online today.

The essence of the wiki idea has been summarized on many occasions. For the sake of the argument of this paper, we are condensing the wiki concept as follows:

• “Wiki is a piece of server software that allows users to freely create and edit Web page content using any Web browser. Wiki supports hyperlinks and has a simple text syntax for creating new pages and crosslinks between internal pages on the fly.” [Wiki.org/WhatIsAWiki, 2008]

• “Wiki is unusual among group communication mechanisms in that it allows the organization of contributions to be edited in addition to the content itself.” [Wiki.org/WhatIsAWiki, 2008]

• “Like many simple concepts, "open editing" has some profound and subtle effects on Wiki usage. Allowing everyday users to create and edit any page in a Web site is exciting in that it encourages democratic use of the Web and promotes content composition by nontechnical users.” [Wiki.org/WhatIsAWiki, 2008]

2.2 The Background of Cognitive Mapping

Cognitive maps are a means by which we can organize and store knowledge spatially in order to allow us to conceptualize information in a visual manner. Cognitive mapping may be defined as a “process composed of a series of psychological transformations by which an individual acquires, codes, stores, recalls, and decodes information about the relative locations and attributes of phenomena in their everyday spatial environment.” [Downs and Stea, 2005] Cognitive maps can be represented by various methods and for our system we shall consider the following three: Concept Maps, Mind Maps, and Dialogue Maps.

2.2.1 Concept Mapping

Concept mapping was initiated by J. D. Novak in 1960 at Cornell University and is based on the theories of David Ausubel. Novak concludes that “Meaningful learning involves the assimilation of new concepts and propositions into existing cognitive structures.” [Novak and Gowin, 1984] His work culminated in the creation of concept mapping.

A concept map is a type of cognitive map. Concept mapping, in this sense, represents “a structured process, focused on a topic or construct of interest, involving input from one or more participants, that produces an interpretable pictorial view (concept map) of their ideas and concepts and how these are interrelated.” [Social Research Methods/Concept Mapping, 2008] Nodes in the map represent concepts. Links between the nodes represent relationships between concepts. These relationships are labeled on the link. Links can be one-way, two-way, or non-directional. A set of concepts representing a specific category may be grouped and labeled as such. The map may express temporal or causal relationships between concepts.
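As a rough illustration of this structure, a concept map can be held as a small labeled graph. The sketch below is only one possible representation; the names `ConceptMap`, `Link`, and `add_link`, and the sample avionics propositions, are hypothetical and are not taken from any concept mapping tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """A labeled relationship between two concepts."""
    source: str
    label: str
    target: str
    directed: bool = True  # False models a non-directional link


@dataclass
class ConceptMap:
    """Concepts are nodes; labeled links connect them."""
    concepts: set = field(default_factory=set)
    links: list = field(default_factory=list)

    def add_link(self, source, label, target, directed=True):
        # Adding a link implicitly registers both concepts as nodes.
        self.concepts.update((source, target))
        self.links.append(Link(source, label, target, directed))

    def propositions(self):
        """Read each link back as a 'concept label concept' phrase."""
        return [f"{link.source} {link.label} {link.target}"
                for link in self.links]


# Hypothetical sample data, for illustration only.
cmap = ConceptMap()
cmap.add_link("Avionics", "includes", "Navigation")
cmap.add_link("Navigation", "relies on", "GPS")
print(cmap.propositions())
```

Reading a link together with its two concepts as a short phrase is what distinguishes this structure from an unlabeled graph: the label carries the relationship.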

The following example is a concept map on the topic of “Avionics”. This map was made available from the IHMC (Institute for Human and Machine Cognition).


Figure 2 Sample Concept Map

2.2.2 Mind Mapping

Mind mapping is another type of cognitive mapping, sometimes referred to as semantic mapping, idea mapping, or word webbing. It describes a variety of strategies designed to show how key words or concepts are related to one another through graphic representations. Mind mapping techniques have been used for centuries as a device to aid in brainstorming, visual thinking, and problem solving. Although several people are currently credited with the creation of the modern mind map, Dr. Allan Collins is widely known and credited through his work in the early 1960s with mind mapping of semantic networks of natural languages developed in the late 1950s. Mind maps are structured around a central key word. Like concept maps, relationships fork from this key word, but mind maps are typically more radial in structure and concepts are arranged by order of importance. Each branch represents a conceptual grouping of concepts for the topic. The non-linear structure of mind maps is said to promote non-linear thought and encourage brainstorming and the realization of new ideas. Unlike concept maps, mind maps do not distinguish between concepts (nodes) and the relationships between them (links). In mind maps nodes and links are the same and are non-directional.
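Because the links in a mind map carry no label or direction of their own, a plain tree rooted at the central key word is enough to hold one. The following is a minimal sketch under that assumption; `MindNode`, `add`, and `outline` are hypothetical names, and the sample topics are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MindNode:
    """One topic in a mind map; children are its sub-branches."""
    text: str
    children: list = field(default_factory=list)

    def add(self, text):
        """Attach a new sub-branch and return it for further nesting."""
        child = MindNode(text)
        self.children.append(child)
        return child


def outline(node, depth=0):
    """Flatten the tree into indented lines, central topic first."""
    lines = ["  " * depth + node.text]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


root = MindNode("Avionics")             # central key word
systems = root.add("Aircraft systems")  # first-level branch
systems.add("Navigation")
systems.add("Communications")
root.add("History")
print("\n".join(outline(root)))
```

The order of `children` can stand in for the ordering by importance described above, since branches are emitted in the order they were added.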

The following example is a mind map generated from a Wikipedia article on “Avionics”.


Figure 3 Sample Mind Map

2.2.3 Dialogue Mapping

Dialogue Mapping is a visual representation of group communication. As the group’s discussion of a particular topic unfolds, the map grows. This tool is to be used in a collaborative environment where each participant can view a representation of the discussion so far. The map serves as a “group memory,” to convey the discussion threads and dialogue paths used to make a decision on a particular topic.

The following example is a dialogue map regarding the “School Budget Decision”. This map was made available from the Compendium Institute.

Figure 4 Sample Dialogue Map

CHAPTER III ANALYSIS OF TOOLS

3.1 Cognitive Mapping Tools

For each of the three cognitive mapping paradigms discussed above, a wide array of applications is available. We will not go into detail on every tool; instead, we will discuss the more popular ones for each type and choose the best tool and map for the system. For a list of these resources, see Appendix B.

3.1.1 Concept Mapping

3.1.1.1 CmapTools

CmapTools is a freely downloadable software toolkit that facilitates the creation and manipulation of Concept Maps. The toolkit is made available by the IHMC (Institute for Human and Machine Cognition), a not-for-profit research institute affiliated with the Florida University System. The toolkit is well designed and very intuitive, but it lacks the ability to import raw data types, like RDF (Resource Description Framework), and does not provide an API (Application Programming Interface) for interfacing via other software.

3.1.2 Dialogue Mapping

3.1.2.1 Compendium

Compendium is a software tool used to “provide a flexible visual interface for managing the connections between information and ideas.” [Compendium, 2008] It is provided freely by the Compendium Institute for non-commercial use. The application’s focus is on Dialogue Mapping, specifically “visualizing the connections between people, ideas, and information at multiple levels, in mapping discussions and debates, and what skills are needed to do so in a participatory manner that engages all stakeholders.” [Compendium, 2008] In light of this, the dialogue mapping paradigm and its tools do not seem to be a good fit for the type of information we wish to convey with our system.

3.1.3 Mind Mapping

3.1.3.1 MindManager

MindManager, formerly MindMan, is an enterprise-level mind mapping software application commercially developed by Mindjet Corporation. MindManager is marketed for use as a commercial application to provide services that allow teams to collaborate in real time. MindManager provides its license holders with an open API for reading in data from RSS feeds or COTS applications such as Microsoft Excel and Outlook. Although the MindManager API would be convenient for our needs, its cost and focus on corporate collaboration make it prohibitive for our system.

3.1.3.2 FreeMind

FreeMind is a freely available, open source mind mapping application licensed under the GNU General Public License. It is written in Java and its source code is available for download. These factors allow FreeMind to be used and enhanced for our system via the Java API. Since it is written in Java, it can run on any system via the Java Runtime Environment or be easily ported as an application accessible from a web browser. The application provides multiple export formats and already integrates with most of the popular wiki engines, including MediaWiki, the engine behind Wikipedia.

3.1.3.2.1 FreeMind Flash Browser

Since FreeMind is open source software licensed under the GNU General Public License (GPL), developers are free to modify the code and publish their work. In fact, a group of artists and developers has created a Flash browser for FreeMind and released it under the GPL. Acting as a mini browser, it handles browser events, such as clicking hyperlinks for navigation, navigating forward and back in history, and copying results to the clipboard. Map-specific events are also handled, such as unfolding and folding all nodes, panning and zooming, and opening links in a new window. Since it uses Flash as the container instead of Java, users can access it via the Adobe Flash plugin rather than a more heavyweight Java applet running in a Java Runtime Environment. Its API is accessible via JavaScript, allowing for easy integration into the View component of the system we are designing.

3.1.4 Cognitive Mapping Conclusion

“A Mind Map is a diagram used to represent words, ideas, tasks, or other items linked to and arranged radially around a central key word or idea.” [Wikipedia/Mind-map, 2008] This is the paradigm that fits best with the information we wish to convey, and of the three types of cognitive maps, a mind map is the fastest and easiest to use. The FreeMind Flash browser is a very convenient, powerful, and flexible tool for rendering cognitive maps from a given data source. For these reasons we shall use a mind map as the model for the cognitive map and use the FreeMind Flash browser for its rendering.
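To make the idea of a data source for the renderer concrete, the sketch below emits a mind map as FreeMind-style XML. It assumes the FreeMind .mm layout of nested `node` elements inside a `map` element, with `TEXT`, `LINK`, and `FOLDED` attributes; the version string is a placeholder, the function name `to_freemind` and the sample data are hypothetical, and the output should be checked against the FreeMind release and Flash browser actually in use.

```python
import xml.etree.ElementTree as ET


def to_freemind(topic, branches, link_base=None):
    """Build a FreeMind-style <map> element from a topic and its branches.

    `branches` maps a first-level branch title to a list of child titles.
    """
    # Placeholder version string; adjust to the FreeMind release in use.
    root = ET.Element("map", version="0.9.0")
    center = ET.SubElement(root, "node", TEXT=topic)
    for title, children in branches.items():
        # Branches start folded so the map opens in a compact state.
        branch = ET.SubElement(center, "node", TEXT=title, FOLDED="true")
        for child in children:
            attrs = {"TEXT": child}
            if link_base:
                # Hypothetical: link each leaf back to its wiki article.
                attrs["LINK"] = link_base + child.replace(" ", "_")
            ET.SubElement(branch, "node", attrs)
    return root


mm = to_freemind(
    "Avionics",
    {"See also": ["Flight recorder"]},
    link_base="https://en.wikipedia.org/wiki/",
)
print(ET.tostring(mm, encoding="unicode"))
```

Keeping the generator independent of the renderer means the same tree could later be exported to another mind map format without touching the parsing stage.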

3.2 Wikis

3.2.1 Platypus Wiki

Platypus Wiki is “a project to develop an enhanced Wiki Wiki Web with ideas borrowed from the Semantic Web.” [Platypus, 2008] The Platypus Wiki software directs authors to encode semantic data within wiki pages using RDF and Web Ontology Language (OWL). This is beneficial because articles written for a Platypus Wiki will already have a semantic description of the underlying information defined. This highly structured information can easily be rendered as a cognitive map without the need for parsing the source text. Unfortunately, the model for the information in the article is established only when the authors add formal notation to the source. This action must occur manually upon the creation, or update, of the article. At this time, this action cannot be performed programmatically after the fact, though many researchers are looking into this. Relying on the semantic information at this time limits the scope and flexibility of our system. Only new articles written for this wiki could be used, thus preventing the users of our system from having access to the millions of other wiki articles already written elsewhere.

3.2.2 MediaWiki

MediaWiki is “a free software wiki package originally written for Wikipedia. It is now used by several other projects of the non-profit Wikimedia Foundation and by many other wikis.” [MediaWiki, 2008] MediaWiki provides a rich feature set and provides a mechanism for users to develop extensions for additional functionality. Rich text content is also supported, allowing for the rendering of mathematical formulas, graphs, musical notation, and foreign language notation. Such extensive features and ease of use have made MediaWiki one of the most popular wiki software applications today, most notably, through its use by Wikipedia.

3.2.2.1 Semantic MediaWiki

Semantic MediaWiki is an extension to MediaWiki. Semantic MediaWiki allows for the addition of semantic annotations to wiki articles. Much like Platypus Wiki, these annotations need to be added to the article’s source to be used. Most available MediaWiki articles are not created using the Semantic MediaWiki extension, so we face the same issues as discussed with Platypus Wiki.

3.2.3 Wiki Conclusion

Due to the enormous volume of topics already published in Wikipedia and other popular wikis that also run MediaWiki, our system would be most beneficial if it targets MediaWiki, specifically Wikipedia, as its source material. Almost none of these topics contain the semantic annotations used by Semantic MediaWiki. This implies that, in order to process these articles, the system requires a means of parsing the wiki source text into a form that can be used for rendering the topic as a mind map.
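Before the source text can be parsed it has to be obtained. One way to do so, sketched below, is MediaWiki's `action=raw` request on `index.php`, which returns an article's unrendered wikitext. The function name `raw_source_url` is hypothetical, the location of `index.php` is an assumption about how a given wiki is deployed, and fetching and error handling are deliberately left out.

```python
from urllib.parse import urlencode


def raw_source_url(script_base, title):
    """Return the URL that asks a MediaWiki site for an article's wikitext.

    `script_base` is the site's index.php location (an assumption about the
    deployment), e.g. "https://en.wikipedia.org/w/index.php".
    """
    # MediaWiki titles use underscores in place of spaces.
    query = urlencode({"title": title.replace(" ", "_"), "action": "raw"})
    return f"{script_base}?{query}"


print(raw_source_url("https://en.wikipedia.org/w/index.php", "Flight recorder"))
```

Because only the base address differs between installations, the same helper would apply to any MediaWiki site whose `index.php` location is known.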

3.3 Source Parsing Tools

3.3.1 Natural Language Processing

One method we can apply to generate structure for our mind map would be to employ natural language processing algorithms. Natural language processing (NLP) is “a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages.” [Wikipedia/NLP, 2008] Using defined lexicons of a given language, NLP can process the text of an article into an OWL ontology. This ontology can then be processed for its semantic relationships and rendered by the mind mapping software.

3.3.2 Structure Parsing

Another method for parsing the content could be to ignore the semantic meaning of the text altogether. If we were to consider that the main concepts of a well laid out article can be represented by the structure of the article itself and its references to outside sources, then we can apply a simple algorithm to parse this structure and describe the topic with a mind map regardless of the meaning of its content.
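A minimal sketch of such a structure-only pass is shown below. It assumes MediaWiki-style markup, in which section headings are written as `== Title ==` and internal links as `[[Target|label]]`; the function name `parse_structure`, the dictionary layout, and the sample text are hypothetical, and templates, tables, and external links are ignored for brevity.

```python
import re

# "== Title ==" through "====== Title ======"; the closing run must match.
HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")
# Captures the link target, stopping at "|" (label) or "#" (section anchor).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def parse_structure(title, wikitext):
    """Turn section headings and internal links into a nested outline.

    Returns a dict {"text", "level", "links", "children"}; the meaning of the
    prose is ignored entirely, only the markup structure is read.
    """
    root = {"text": title, "level": 1, "links": [], "children": []}
    stack = [root]  # open sections, outermost first
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            node = {"text": match.group(2), "level": level,
                    "links": [], "children": []}
            # Close sections at the same or deeper level before attaching.
            while len(stack) > 1 and stack[-1]["level"] >= level:
                stack.pop()
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            # Links belong to the innermost open section; skip duplicates.
            for target in WIKILINK.findall(line):
                target = target.strip()
                if target not in stack[-1]["links"]:
                    stack[-1]["links"].append(target)
    return root


sample = """Avionics are [[aircraft]] electronics.
== History ==
Early [[radio]] sets.
=== Radar ===
See [[Radar|radar systems]].
== Systems ==
[[Navigation]] and [[radio]].
"""
tree = parse_structure("Avionics", sample)
print([child["text"] for child in tree["children"]])
```

Each section becomes a branch and each linked article a candidate leaf, so the result maps directly onto the radial layout of a mind map. Category links would appear here as ordinary targets prefixed with `Category:` and could be split into their own branch.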

3.3.3 Source Parsing Conclusion

Natural language processing is a field that could be written about extensively on its own. Unfortunately, the complexities involved place it beyond the scope of the system we are defining here. Additionally, this technology is difficult to deploy for the multiple languages prevalent in the wiki communities, and lexicon-based NLP cannot understand words that it has never seen before. A statistics-based language parser can help resolve this, but NLP based on statistics is less accurate and does not understand words. The universal and lightweight nature of the structure parsing approach makes it the best fit for the process of parsing the wiki source text.

CHAPTER IV WIKI-POWERED CONCEPT MAP DESIGN

4.1 Scope

This system is being designed as an academic proof of concept. Its goal is to utilize established standards and techniques in the field of knowledge modeling in a new and resourceful way. This initial phase will interface with Commercial Off-The-Shelf (COTS) products via their APIs using the Java programming language.

Future phases will focus on limiting the use of COTS in favor of developing new algorithms with capabilities that further realize the goals of the system. Such capabilities may include the application of natural language processing and the processing of additional data sources, such as raw text. Engines for rendering other cognitive maps may also be developed for processing multiple types of content.

4.2 Design Process

The design for this system was performed using an object-oriented approach. Analysis was performed by identifying the use cases and detailing the flow of events for each. The Contextual, Operational, Logical, and Physical aspects of the architecture are derived and explained in detail below.

4.2.1 Requirements

Since this system is an academic proof of concept, there are no formal customer requirements; therefore, all requirements have been derived through the analysis of the design. These requirements are provided as follows:

Table 1 System Requirements

ID Description Search Engine Parser Map Browser

1 Any MediaWiki, including Wikipedia, shall be searchable X

2 Search wildcards shall be supported X

3 Multiple search results shall be handled X

4 MediaWiki disambiguation pages shall be handled X

5 NULL search results shall be handled X

6 MediaWiki error messages shall be handled X X

7 MediaWiki articles shall be processed X

8 Mind maps shall be dynamically rendered X

9 Mind maps shall be imbedded X

10 Mind maps shall be exportable X

11 Mind maps shall be browsable X

12 Mind maps shall be bookmarked X

13 Mind maps shall be panned X

14 Mind maps shall be zoomed X

15 Mind maps shall be refreshed X

16 Mind maps shall be recentered X

17 Mind maps shall link to source content X

18 Mind maps shall be collapsible X

19 Mind maps shall be expandable X
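Requirements 3 through 5 imply that the search stage must tell several outcomes apart before a map can be rendered. The sketch below shows one way this might be done; the `Outcome` names and the `classify` function are hypothetical, and the template markers used to recognize a disambiguation page are an assumption that would need to be confirmed for each target wiki.

```python
from enum import Enum


class Outcome(Enum):
    NONE = "no results"
    SINGLE = "single article"
    MULTIPLE = "multiple results"
    DISAMBIGUATION = "disambiguation page"


# Assumption: disambiguation pages carry one of these template markers.
DISAMBIG_MARKERS = ("{{disambiguation", "{{disambig")


def classify(results):
    """Classify a list of (title, wikitext) search hits."""
    if not results:
        return Outcome.NONE
    if len(results) > 1:
        return Outcome.MULTIPLE
    _, text = results[0]
    lowered = text.lower()
    if any(marker in lowered for marker in DISAMBIG_MARKERS):
        return Outcome.DISAMBIGUATION
    return Outcome.SINGLE


print(classify([]).value)
print(classify([("Mercury", "{{Disambiguation}}")]).value)
```

Only the `SINGLE` outcome leads directly to rendering; the others would prompt the user to refine or choose among results.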

4.2.2 System Architecture

4.2.2.1 Contextual

In this section, we shall perform the following:

• Purpose: Define the area of interest and its interactions

• Activities: Gather information, identify gaps and overlaps, resolve conflicts, set scope, boundaries and quality attributes

• Artifacts:

o Interrogatives

o Context Diagrams

o Lists

4.2.2.1.1 Interrogative Data

Scope the context of the architecture via interrogatives: why, who, what, how, where and when.

4.2.2.1.1.1 Why does the work need to be done? (Goals)

The work needs to be done because currently the concepts within the content of wiki articles are loosely coupled and the relevance of the material and overall context is easily lost.

4.2.2.1.1.2 Who are the participants? (Stakeholders)

The participants of the system are the data analysts who use the system to mine for pertinent information and the content authors with whose data the system will interact.

4.2.2.1.1.3 What information and entities are used? (Data / Entities)

The data the system uses is the content of the bodies of the articles published to the wiki. The articles in the wiki shall be accessible online for processing by the system.

4.2.2.1.1.4 How do they operate? (Processes)

A content processing component shall use algorithms to process the content of the wiki article and produce a model of its structure.

A mapping component shall then render the structural data into a visual representation corresponding to the layout of a mind map.

A browser component shall display the map and handle user driven events, such as refresh, zoom, pan, and click.
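The user-driven events named for the browser component can be pictured as updates to a small piece of view state. The sketch below is illustrative only; `MapView`, its fields, the event names, and the zoom limits of 0.25 and 4.0 are all assumptions and do not describe the FreeMind Flash browser's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class MapView:
    """Hypothetical view state kept by the browser component."""
    zoom: float = 1.0
    offset: tuple = (0, 0)
    folded: set = field(default_factory=set)

    def handle(self, event, **args):
        """Apply one user event and return the view for chaining."""
        if event == "zoom":
            # Clamp so repeated zooming cannot shrink or grow without bound.
            self.zoom = min(4.0, max(0.25, self.zoom * args["factor"]))
        elif event == "pan":
            x, y = self.offset
            self.offset = (x + args["dx"], y + args["dy"])
        elif event == "click":
            # Clicking a node toggles whether its branch is folded.
            self.folded ^= {args["node"]}
        elif event == "refresh":
            # Refresh restores the default zoom and position.
            self.zoom, self.offset = 1.0, (0, 0)
        else:
            raise ValueError(f"unknown event: {event}")
        return self


view = MapView()
view.handle("zoom", factor=2.0).handle("pan", dx=10, dy=-5)
view.handle("click", node="History")
print(view)
```

Keeping the state separate from the rendered map means a refresh only has to reset the view, not rebuild the underlying structure.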

4.2.2.1.1.5 Where do they operate? (Geography)

The system is web based, so by definition it is geographically agnostic. The cognitive mapping system and the wiki server may reside on the same host or on separate hosts. If on separate hosts, the two must be able to communicate across their respective networks.

4.2.2.1.1.6 When are the various actions taken? (Timeline)

An author publishes a wiki page in accordance with the wiki’s markup language. They reference other wiki articles or external sources and imbed media such as images, video, and audio. The user of the system will perform a search on a particular topic of interest. The user will then select from the search results. Each search result corresponds to a published wiki article. Upon selecting the search result, the user is presented with a mind map of the content of the article. The user can select nodes in the map and follow relationships to other content. The user can also choose to view the source content from which the map was derived.

4.2.2.1.1.7 Relationship Matrix (Why vs. Who)

The analyst will want to discover relationships and relevance of the subject they are researching.

The author may wish to confirm that they have established the relationships they intend to present. However, authors already believe their content is relevant, so they would not use such a system to determine relevance.

Table 2 Relationship Matrix

                         Analyst   Author
Display Relationships    X         X
Display Relevance        X


4.2.2.1.2 Context Diagram

Scope the players involved in the architecture

Figure 5 Context Diagram

4.2.2.1.3 Scope and Boundaries

Define the attributes that identify the system’s scope and boundaries, i.e., the data and services that will be provided by the system, and the ones that will be externally required. Define the scope and boundaries from multiple viewpoints via an in/out list.

4.2.2.1.3.1 Deliverables

Define the outline of the system so that it is clear whether entities and actions near the perimeter of the system are “in” or “out”.

4.2.2.1.3.1.1 External

Table 3 External Deliverables Topic In Out

Software license agreement X

Wiki Concept Map System software executables X

Wiki Concept Map System software source code X

User Documentation X

Requirements specification X

API documentation X

4.2.2.1.3.1.2 Functionality

Table 4 Functionality Deliverables Topic In Out

Interface any wiki X

Dynamically render mind-maps X

Cache maps in database X

Cache wiki articles in database X

Cache wiki data in database X

Support HTTPS (port 443) X

Support processing of non-English content X

Imbedded mind-map user interface X

Bookmark maps X

Deploy to any web server X

4.2.2.1.3.1.3 Data

Table 5 Data Deliverables Topic In Out

Process textual data X

Process imbedded media X

Process non-ASCII text X

Import external data from files X

Export to image X

Export to various mind-map COTS formats X

4.2.2.1.3.1.4 Technical Structure

Table 6 Technical Deliverables Topic In Out

Allow for remote wiki server X

Allow Wiki Concept Map System server and Wiki server (with database) on same host X

Server to be customer furnished X

4.2.2.1.4 Quality Attributes

Support the evaluation and enhancement of the architecture in areas of key importance by defining the properties by which the stakeholders will judge the architecture’s quality.

Possible Attributes:

• Modular

• Open

• Standards Compliant

• Loosely Coupled

• Robust to Errors

• Functional

• Performance

• Maintainability

• Testability

• Reliability

• Efficiency

• Adaptability

• Usability

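Table 7 Quality Pugh Chart

Columns 1 to 13 are the attributes in row order. "<" marks a comparison won by the row attribute, "^" marks one won by the column attribute, and "." marks an empty cell. Rank is the attribute's total number of wins.

                          1  2  3  4  5  6  7  8  9 10 11 12 13  Rank
 1 Modular                .  <  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^     1
 2 Open                   .  .  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^     0
 3 Standards Compliant    .  .  .  <  ^  ^  ^  ^  ^  ^  ^  ^  ^     3
 4 Loosely Coupled        .  .  .  .  ^  ^  ^  ^  ^  ^  ^  ^  ^     2
 5 Robust to Errors       .  .  .  .  .  <  <  <  <  ^  <  <  <    11
 6 Functional             .  .  .  .  .  .  <  <  <  ^  <  <  ^     9
 7 Performance            .  .  .  .  .  .  .  <  <  ^  <  <  ^     8
 8 Maintainability        .  .  .  .  .  .  .  .  <  ^  <  ^  ^     6
 9 Testability            .  .  .  .  .  .  .  .  .  ^  ^  ^  ^     4
10 Reliability            .  .  .  .  .  .  .  .  .  .  <  <  ^    11
11 Efficiency             .  .  .  .  .  .  .  .  .  .  .  ^  ^     5
12 Adaptability           .  .  .  .  .  .  .  .  .  .  .  .  ^     7
13 Usable                 .  .  .  .  .  .  .  .  .  .  .  .  .    11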

Based on the results of the Pugh analysis of the set of attributes determined to be important to the stakeholders, the top quality attributes (in no particular order) are:

• Robust to Errors

• Reliability

• Usable
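The tally behind Table 7 can be sketched as follows. The sketch assumes that "<" means the row attribute wins a pairwise comparison and "^" means the column attribute wins; that reading is an inference, although it reproduces every value in the Rank column. The names ATTRIBUTES, ROWS, and tally are illustrative only.

```python
ATTRIBUTES = [
    "Modular", "Open", "Standards Compliant", "Loosely Coupled",
    "Robust to Errors", "Functional", "Performance", "Maintainability",
    "Testability", "Reliability", "Efficiency", "Adaptability", "Usable",
]

# Row i lists the outcomes of attribute i against each later attribute,
# transcribed from Table 7 (upper triangle only).
ROWS = [
    "<^^^^^^^^^^^",
    "^^^^^^^^^^^",
    "<^^^^^^^^^",
    "^^^^^^^^^",
    "<<<<^<<<",
    "<<<^<<^",
    "<<^<<^",
    "<^<^^",
    "^^^^",
    "<<^",
    "^^",
    "^",
    "",
]


def tally(attributes, rows):
    """Count pairwise wins: '<' credits the row, '^' credits the column."""
    wins = {name: 0 for name in attributes}
    for i, row in enumerate(rows):
        for offset, mark in enumerate(row, start=1):
            winner = attributes[i] if mark == "<" else attributes[i + offset]
            wins[winner] += 1
    return wins


scores = tally(ATTRIBUTES, ROWS)
top_score = max(scores.values())
top_attributes = sorted(name for name, won in scores.items() if won == top_score)
```

Under this reading, the three attributes tied for the highest count are the same three listed above, which is why they are given in no particular order.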

4.2.2.2 Operational

In this section, we shall perform the following:

• Purpose: Describe the work that the system must do

• Activities: Identify actors and actions and describe their interactions

• Artifacts:

o Scenarios

o Use cases

o Activity Diagrams

4.2.2.2.1 Operational Concept

• Mission: Construct cognitive map of a wiki article

• Capabilities : Contextual processing, Establish cognitive relationships, Automated layout rendering engine, Interactive map interface

• Scenario : The user of the system will perform a search on a particular topic of interest. The user will then select from the search results. The user is presented with a mind map of the content of the article. The user can select nodes in the map and follow relationships to other content or choose to view the source content from which the map was derived.

• Use Case : See below

4.2.2.2.2 Key Scenarios

• Search for wiki articles

• Render mind-map of wiki article

• Browse linked wiki articles

• View original wiki article

4.2.2.2.3 Actors

• Analyst

• Wiki

• Wiki Article

• Wiki Concept Map System

4.2.2.2.4 Use Cases

Table 8 Use Case - Search for wiki articles

Description: Search for wiki articles

Goal: Return article or articles whose content matches the search criteria

Actors: Analyst, Wiki, Wiki article, Wiki Concept Map System

Preconditions: Wiki is accessible and contains published articles

Triggers: N/A

Flows:

Basic : Browse to Wiki Concept Map System, enter search criteria in search field, receive search results

Alternate : Browse to Wiki Concept Map System, enter search criteria in search field, receive message indicating no results found

Exception : Browse to Wiki Concept Map System, enter invalid search criteria in search field, receive error message from the system

Post Conditions: N/A

Supplemental Requirements: N/A
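The three flows of this use case can be sketched as a single function that classifies the outcome of a search. This is only an illustration: the name run_search, the outcome labels, and the rule that blank criteria are "invalid" are all assumptions, and the wiki search itself is passed in as a callable so the sketch stays independent of any particular wiki.

```python
def run_search(criteria, search):
    """Return (outcome, titles) for one search request.

    `search` is any callable that takes a query string and returns a list
    of matching article titles.
    """
    query = criteria.strip()
    if not query:
        # Exception flow. Treating blank input as invalid is an assumption.
        return ("error", [])
    titles = search(query)
    if not titles:
        return ("no_results", [])  # Alternate flow
    return ("results", titles)     # Basic flow


def fake_wiki_search(query):
    """Stand-in for a real wiki search, used only for this example."""
    articles = ["Avionics", "Aircraft", "Radar"]
    return [title for title in articles if query.lower() in title.lower()]


basic = run_search("avionics", fake_wiki_search)
alternate = run_search("submarine", fake_wiki_search)
exception = run_search("   ", fake_wiki_search)
```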

Table 9 Use Case - Render mind-map of wiki article

Description: Render mind-map of wiki article

Goal: Display the mind-map for a particular wiki article

Actors: Analyst, Wiki, Wiki article, Wiki Concept Map System

Preconditions: Wiki is accessible and contains published articles. Wiki search has returned result(s).

Triggers: N/A

Flows:

Basic : Select entry in the search results, view mind-map of selected wiki article

Alternate : Select browser favorite (bookmark) of previously viewed map, view mind-map of selected wiki article

Exception : Select entry in the search results, receive error message from the system

Post Conditions: N/A

Supplemental Requirements: N/A

Table 10 Use Case - Browse linked wiki articles

Description: Browse linked wiki articles

Goal: Analyst can navigate from the current map to a linked article

Actors: Analyst, Wiki, Wiki article, Wiki Concept Map System

Preconditions: Wiki is accessible. Analyst is viewing the map of a published article. The wiki article references other wiki article(s).

Triggers: N/A

Flows:

Basic : Select linked article in mind map, receive mind map of linked article

Alternate : N/A

Exception : Select linked article in mind map, receive error message from the system regarding page not found.

Post Conditions: N/A

Supplemental Requirements: N/A

Table 11 Use Case - View original wiki article

Description: View original wiki article

Goal: Analyst can view the original contents of the article from which the current map is rendered

Actors: Analyst, Wiki, Wiki article, Wiki Concept Map System

Preconditions: Wiki is accessible. Analyst is viewing the map of a published article.

Triggers: N/A

Flows:

Basic : Select center node of current mind-map, receive source wiki article in current browser page

Alternate : Select center node of current mind-map while selecting hotkey for a new tab, receive source wiki article in a new browser tab.

Exception : Select linked article in mind map, receive error message from the system regarding page not found.

Post Conditions: N/A

Supplemental Requirements: N/A

4.2.2.2.5 Activity Diagram

Model the logic captured by the use cases.

Figure 6 Activity Diagram

4.2.2.3 Logical

In this section we shall perform the following:

• Purpose: Relate the structure of the solutions to the structure of the work to be done

• Activities: Decompose, aggregate and allocate, identify patterns

• Artifacts:

o Functional Decomposition Tree

o Logical Block diagrams

o Data flow diagrams

o Sequence diagrams

4.2.2.3.1 Functional Decomposition

Breakdown of the functional components of the system

Figure 7 Functional Decomposition Breakdown

4.2.2.3.2 Logical Solution

The sequence of interactions required to use the system

1. Search wiki

2. Select search result

3. View mind-map

4. Select linked article

5. View wiki article

4.2.2.3.3 Logical Block Diagram

Logical relationships of the functional components of the system

Figure 8 Logical Block Diagram

4.2.2.3.4 Data Flow Diagram

Flow of data within the system

Figure 9 Data Flow Diagram

4.2.2.3.5 Sequence Diagrams

Possible sequence of events in the system

4.2.2.3.5.1 Search and View Wiki Mind-map

Figure 10 Sequence Diagram - Search and View Wiki Mind-Map

4.2.2.3.5.2 Browse Wiki Mind-map

Figure 11 Sequence Diagram - Browse Wiki Mind-Map

4.2.2.3.5.3 View Wiki Mind-map Source

Figure 12 Sequence Diagram - View Wiki Mind-Map Source

4.2.2.3.6 Evaluation

Evaluating the results of the logical breakdown against the quality attributes identified in Section 4.2.2.1.4, we find nothing to indicate that the logical model would fail to yield a system that is reliable, usable, and robust to errors.

4.2.2.4 Physical

In this section, we shall perform the following:

• Purpose: Define the physical attributes and organization of the system

• Activities: Allocate, partition, define interfaces, select technical approaches

• Artifacts:

o Block diagrams

o Weighted matrices

o Interface definitions

o Data schema

4.2.2.4.1 Physical Components

4.2.2.4.1.1 Hardware

• System Server

• Wiki Server

• Server Rack

• Server Room

• Workstation

• Network Infrastructure

4.2.2.4.1.2 Software

• Application Server

• Browser

• Adobe Flash browser plugin

• System Executables

• Java Runtime Environment

• FreeMind Flash browser

• MediaWiki Application

4.2.2.4.1.3 Data

• Published Wiki Articles

4.2.2.4.1.4 Interface

• HTTP Protocol

4.2.2.4.2 Physical Approach

The physical approach is to connect a physical web server to the same network that has access to the wiki. The server may be installed in a rack in a server room in order to handle high traffic volume, or run on a workstation for limited individual use. The system may or may not be installed on the same physical server as the wiki. The network infrastructure must be in place for the system server to communicate with the wiki server. Users shall access the system via a workstation.

4.2.2.4.3 Static View of Physical System

Figure 13 Physical System

4.2.2.4.4 Trade of approaches to quality attributes

4.2.2.4.4.1 Usability

IEEE defines usability as, “the ease with which a user can learn to operate, prepare inputs for, and interpret outputs of a system or component.” [IEEE, 1990]

The architecture provided allows for the presentation layer to be customizable to the extent that the resulting map is understandable and navigable. Language is derived from the source content and should be considered clear and simple with respect to the context of the subject. The presentation layer shall provide the mechanisms to allow the user to navigate within the map or follow relationships to other subjects.

4.2.2.4.4.2 Robust

As defined in Wikipedia, a robust architecture “is said to be one that exhibits an optimal degree of fault-tolerance, backward compatibility, forward compatibility, extensibility, reliability, maintainability, availability, serviceability, usability, and such other quality attributes as necessary and/or desirable.” [Wikipedia/SA, 2008]

IEEE defines robustness as, “the degree to which a system or component can function correctly in the presence of invalid inputs or stressful environment conditions.” [IEEE, 1990]

Data for the system is derived from published wiki articles. Maps of such articles are rendered dynamically. In the event that a search does not return any results, no map based on the user's search criteria will be rendered. Search results in the system are dependent on information published to the wiki. If for any reason the data is unsearchable, the system cannot be used. Methods to remove this coupling may include the pre-processing and caching of wiki articles. Such a design introduces additional requirements for increased storage and synchronization. Resolution of these issues shall be considered in future phases.

4.2.2.4.4.3 Reliability

IEEE defines reliability as “the ability of a system or component to perform its required functions under stated conditions for a specified period of time.” [IEEE, 1990]

For this system, the biggest factor in reliability is that when a user selects a link, either in the search results or while navigating the mind-map, the system responds with pertinent information. The W3C (World Wide Web Consortium) states that, “successful link traversal generally means finding a resource with perfect precision and recall, and retrieving an authentic representation of the resource in a timely fashion, i.e. with sufficiently low latency.” [W3C, 2008] In a web application, successful link traversal correlates to the application’s reliability.

Successful link traversal within the system is dependent on the validity of the information published to the wiki. If for any reason the linked data is unreachable, the system cannot be used. Methods to remove this coupling may include the implementation of a web crawler to validate links within source material. Such a design introduces additional requirements for increased processing. Resolution of these issues shall be considered in future phases.

CHAPTER V System Mockup

5.1 Search and View Wiki Mind-map

5.1.1 Opening Page

As defined in the architecture, the system is accessed via a web browser and is configured to search a particular wiki, in this example Wikipedia. The interface to the system may appear like the mockup below. A simple Google-like form is presented containing a single text field to enter search criteria.

Figure 14 Mockup Opening Page

5.1.2 Search and View Results

For our case, we have entered a search for the topic “Avionics”. Using the MediaWiki API, we submit our search criteria to Wikipedia and receive the topic page as a result. Internally, the context parser processes the source text and submits the information to the FreeMind Flash browser, which is embedded in the page below the search form. The FreeMind Flash browser constructs the mind map from the data and displays it to the user within the browser.

Figure 15 Mockup Search and View Results
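The search step might be sketched as follows, using the search module of the MediaWiki Action API (action=query with list=search). The endpoint URL and the helper names build_search_url and titles_from_response are assumptions for illustration; the report does not specify which API calls the system makes. The sketch builds the request URL and parses a response body without performing any network access.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for the wiki being searched (Wikipedia in the mockup).
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(criteria, limit=10):
    """Build a MediaWiki Action API full-text search request."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": criteria,
        "srlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def titles_from_response(body):
    """Extract article titles from the JSON body of a search response."""
    data = json.loads(body)
    return [hit["title"] for hit in data.get("query", {}).get("search", [])]


url = build_search_url("Avionics")

# Hand-written sample in the shape the search module returns; not real data.
sample_body = '{"query": {"search": [{"title": "Avionics"}, {"title": "Radar"}]}}'
titles = titles_from_response(sample_body)
```

Fetching the URL (for example with urllib.request.urlopen) and passing the decoded body to titles_from_response would complete the round trip.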

5.1.3 Search the Contents of the Map

The integrated FreeMind Flash browser allows us to search within the map itself. A text field is provided where the search criteria are entered. Upon submitting the criteria, the first node that matches the criteria is highlighted. Resubmitting the search highlights the next match, and so on.

Figure 16 Mockup Search the Contents of the Map
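The find-next behavior described above can be illustrated with a short sketch. It is not the FreeMind Flash browser's implementation; the class name MapSearch, the case-insensitive substring match, and the wrap-around to the first match after the last one are all assumptions.

```python
def flatten(node):
    """Yield node titles in depth-first order, starting at the center node."""
    yield node["title"]
    for child in node.get("children", []):
        yield from flatten(child)


class MapSearch:
    """Remembers the last match so a repeated search moves to the next one."""

    def __init__(self, outline):
        self.titles = list(flatten(outline))
        self.criteria = None
        self.position = -1

    def find_next(self, criteria):
        if criteria != self.criteria:
            # New criteria: start again from the top of the map.
            self.criteria, self.position = criteria, -1
        needle = criteria.lower()
        count = len(self.titles)
        for step in range(1, count + 1):
            index = (self.position + step) % count
            if needle in self.titles[index].lower():
                self.position = index
                return self.titles[index]
        return None  # nothing in the map matches


outline = {
    "title": "Avionics",
    "children": [
        {"title": "Main categories", "children": [
            {"title": "Aircraft avionics"},
            {"title": "Mission avionics"},
        ]},
    ],
}
finder = MapSearch(outline)
matches = [finder.find_next("avionics") for _ in range(4)]
```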

5.2 Browse Wiki Mind-map

The integrated FreeMind Flash browser allows us to navigate the map. Functions for pan and zoom are provided.

5.2.1 Expand the Nodes

The initial map presented by the browser has all branches of the nodes fully collapsed. This presents to the user the main structure of the article. The user can then expand all the nodes or choose to drill down a particular path. Selecting the “+” will expand the next level below the selected node. In our example “Main categories” has been expanded and all the child nodes are presented.

Figure 17 Mockup Expand the First Level Node
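One way to hand a collapsed map to a FreeMind-compatible viewer is to write the article outline as a FreeMind map document, in which nested node elements carry a TEXT attribute and a FOLDED attribute marks a collapsed branch. The sketch below assumes the outline is a nested dictionary and that the viewer honors FOLDED; the function name to_freemind and the version string are illustrative choices, not details taken from the system.

```python
import xml.etree.ElementTree as ET


def to_freemind(outline, fold=True):
    """Serialize {"title": ..., "children": [...]} as FreeMind map XML."""
    root = ET.Element("map", version="0.9.0")

    def add(parent, item, is_center):
        node = ET.SubElement(parent, "node", TEXT=item["title"])
        children = item.get("children", [])
        # Collapse every branch except the center node, so the first view
        # shows only the main structure of the article.
        if fold and children and not is_center:
            node.set("FOLDED", "true")
        for child in children:
            add(node, child, False)

    add(root, outline, True)
    return ET.tostring(root, encoding="unicode")


outline = {
    "title": "Avionics",
    "children": [
        {"title": "Main categories", "children": [
            {"title": "Aircraft avionics"},
        ]},
        {"title": "History"},
    ],
}
document = to_freemind(outline)
```

Expanding a node with "+" in the viewer then corresponds to unfolding one of the branches marked here.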

Nodes can contain other nodes or be leaf nodes that can no longer be expanded. In our example, we further drill down by expanding the node “Aircraft avionics”. This node represents one of the many main categories within the article. All of the child nodes within “Aircraft avionics” are leaf nodes and cannot be further expanded.

Figure 18 Mockup Expand Second Level Node

5.3 View Wiki Mind-map Source

5.3.1 View Main Topic

As with all mind-maps, the topic of the map, the term that we searched for, is signified by the center node. This correlates to the top level, or main topic, of the wiki article. If we wish to view the contents of the article starting at the top level, we simply select the node in the browser.

Figure 19 Mockup Select Main Topic

The FreeMind Flash browser handles the event and in a new browser window displays the wiki page at the location of the node selected, in this case the top level.


Figure 20 Mockup View Main Topic

5.3.2 View Categories

Any node in the map can be selected in the browser to show the content. Nodes in the map surrounded by a border correspond to distinct sections in the Wikipedia article. For example, selecting the node “Aircraft avionics” will present to us, in a new window, the Wikipedia article at the corresponding category, or chapter.

Figure 21 Mockup Select Category

The FreeMind Flash browser handles the event and in a new browser window displays the wiki page at the location of the node selected, in this case “Aircraft avionics”.
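A node that corresponds to a section can be linked to the article URL with a fragment naming that section. The following sketch assumes the common MediaWiki convention of replacing spaces with underscores in both page titles and section anchors, and it only handles simple titles; headings containing punctuation may be encoded differently by the wiki. The base URL and the function name section_url are assumptions.

```python
from urllib.parse import quote

# Assumed article base URL for the wiki in the mockup.
WIKI_BASE = "http://en.wikipedia.org/wiki/"


def section_url(article, section=None):
    """Return the URL of an article, optionally positioned at a section."""
    page = quote(article.replace(" ", "_"))
    if not section:
        return WIKI_BASE + page  # center node: top of the article
    anchor = quote(section.replace(" ", "_"))
    return WIKI_BASE + page + "#" + anchor


top_level = section_url("Avionics")
category = section_url("Avionics", "Aircraft avionics")
```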


Figure 22 Mockup View Category

5.3.3 View Linked Topic

Not all nodes correspond to sections in the article matching the topic. Links within the source article that link to other Wikipedia topics are also rendered as nodes in the mind map. These nodes are indicated on the map by a green double-arrow icon. For example, selecting the node “direct current / DC” will present to us, in a new window, the Wikipedia article matching that topic.

Figure 23 Mockup Select Linked Topic
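Internal links of this kind could be collected from the article source with a small pattern match. The sketch below assumes MediaWiki link markup of the form [[target]] or [[target|label]]; skipping any target that contains a colon (files, categories, interwiki links) is a simplification that would also drop the occasional article whose title contains a colon. The name extract_wikilinks is hypothetical.

```python
import re

# [[target]], [[target|label]], or [[target#section|label]]
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]+))?\]\]")


def extract_wikilinks(wikitext):
    """Return (target, label) pairs for links to other articles."""
    links = []
    for target, label in WIKILINK.findall(wikitext):
        target = target.strip()
        if ":" in target:
            continue  # simplification: skip File:, Category:, interwiki links
        links.append((target, (label or target).strip()))
    return links


sample = (
    "Aircraft use 400 Hz AC and 28 V [[direct current|DC]] power. "
    "See also [[Radar]]. [[File:Cockpit.png|thumb]]"
)
links = extract_wikilinks(sample)
```

Each pair could then become a node whose label is the link text and whose destination is the target article.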

The FreeMind Flash browser handles the event and in a new browser window displays the wiki article for the node selected, in this case “Direct Current”.


Figure 24 Mockup View Linked Topic

5.3.4 View External Page

Not all links in Wikipedia link to other Wikipedia topics. Links to external sites are also rendered as nodes in the mind map. External links are identified by the system by noting that the domain in the link’s URL (Uniform Resource Locator) is different than the domain of the wiki in which we are searching. External links can be distinguished by the icon of an arrow pointing away from a page. For example, selecting the node “400Hz Electrical Systems” will present to us, in a new window, the external page referenced by the link in the article.

Figure 25 Mockup Select External Page
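The domain comparison described above can be sketched in a few lines. The function name is_external and the example URLs are illustrative; treating a relative link as internal is an assumption, and under this rule a link to a different language edition of the same wiki counts as external because its host name differs.

```python
from urllib.parse import urlparse


def is_external(link_url, wiki_base_url):
    """True when the link's host differs from the host of the wiki."""
    link_host = urlparse(link_url).hostname
    wiki_host = urlparse(wiki_base_url).hostname
    if link_host is None:
        return False  # relative link, so it stays on the same wiki
    return link_host != wiki_host


WIKI = "http://en.wikipedia.org/wiki/"
external = is_external("http://www.example.com/400hz.html", WIKI)
internal = is_external("http://en.wikipedia.org/wiki/Direct_current", WIKI)
relative = is_external("/wiki/Radar", WIKI)
```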

The FreeMind Flash browser handles the event and in a new browser window displays the external page linked in the Wikipedia article, in this case the site referenced by the link having the label “400Hz Electrical Systems”.

Figure 26 Mockup View External Page

CHAPTER VI SUMMARY AND CONCLUSIONS

Mind mapping is a technique for visually organizing and working with information. It is used as a tool to aid in creativity, organization, productivity, and memory. Mind mapping can help capture ideas and organize, prioritize, and visualize complex information. This technique can be used to work with both the big picture and the details of a specific topic. It can reveal potential connections and show where additional information is needed. Mind mapping is a useful tool for any project involving large amounts of information, including research, writing, brainstorming, and project planning and management. One area that spans many of these activities, and where mind maps can be particularly beneficial, is the analysis of published wiki articles.

Traditionally mind maps were created by hand on a piece of paper. Naturally this process has been enhanced through the use of commercial software such as MindManager, or open source software such as FreeMind. Although very useful, these tools still require us to manually construct the mind map.

This paper describes a design for a lightweight system, using existing technologies, that automatically generates mind maps from a given data source, specifically Wikipedia articles.

As technology continues to progress in the fields of cognitive mapping and language processing, ideas such as those presented here are likely to be expanded upon. Future versions may include the integration of natural language processing so that relationships at the granularity of individual lexical items can be derived and mapped. The relationship between a text and its map may even become bi-directional, where not only does text generate a mind map, but modifications to the map also modify the source text. Systems such as this will help automate the visualization of complex information and eventually play a role in its organization and publication, thus facilitating the spread of information.

REFERENCES

1. [Wikipedia/Wiki, 2008] Wikipedia Website: http://en.wikipedia.org/wiki/Wiki .

2. [Wikipedia/Mind-map, 2008] Wikipedia Website: http://en.wikipedia.org/wiki/Mind_map .

3. [Wikipedia/RDF, 2008] Wikipedia Website: http://en.wikipedia.org/wiki/Resource_Description_Framework .

4. [IEEE, 1990] IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries. (New York, NY: 1990).

5. [Wikipedia/SA, 2008] Wikipedia Website: http://en.wikipedia.org/wiki/Systems_architecture .

6. [W3C, 2008] The World Wide Web Consortium (W3C) Website: http://www.w3.org/Propagation/reliable-links.html .

7. [Leuf and Cunningham, 2001] The Wiki Way: Quick Collaboration on the Web , Author : Bo Leuf and Ward Cunningham, Publisher : Addison-Wesley, Boston, 2001.

8. [Wikipedia/Wikipedia, 2008] Wikipedia Website: http://en.wikipedia.org/wiki/Wikipedia .

9. [Wiki.org/WhatIsAWiki, 2008] Wiki.org Website: http://wiki.org/wiki.cgi?WhatIsWiki .

10. [Downs and Stea, 2005] Image and Environment: Cognitive Mapping and Spatial Behavior , Author : Roger M. Downs and David Stea, Publisher : Aldine Publishing Company, Chicago, 1973.

11. [Social Research Methods/Concept Mapping, 2008] Social Research Methods Knowledge Base Website: http://www.socialresearchmethods.net/kb/conmap.htm .

12. [Novak and Gowin, 1984] Learning How to Learn, First Edition, Author : Joseph D. Novak and D. Bob Gowin, Publisher : Cambridge University Press, Cambridge, 1984.

13. [Compendium, 2008] Compendium Institute Website: http://compendium.open.ac.uk/institute/ .

14. [Platypus, 2008] Platypus Wiki Website: http://platypuswiki.sourceforge.net/ .

15. [MediaWiki, 2008] MediaWiki Website: http://www.mediawiki.org/wiki/MediaWiki .

16. [Wikipedia/NLP, 2008] Wikipedia Website: http://en.wikipedia.org/wiki/Natural_language_processing .

APPENDIX A ACRONYM LIST

Acronym   Definition
ASCII     American Standard Code for Information Interchange
API       Application Programming Interface
COTS      Commercial Off-The-Shelf
GNU       GNU’s Not Unix
GPL       GNU General Public License
HTML      HyperText Markup Language
HTTP      HyperText Transfer Protocol
HTTPS     HTTP over SSL
IEEE      Institute of Electrical and Electronics Engineers
IHMC      Institute for Human & Machine Cognition
JRE       Java Runtime Environment
NLP       Natural Language Processing
OWL       Web Ontology Language
RDF       Resource Description Framework
RSS       Really Simple Syndication
SSL       Secure Sockets Layer
URL       Uniform Resource Locator
W3C       World Wide Web Consortium

APPENDIX B RESOURCE LIST

• Adobe Flash Plugin: http://www.adobe.com/products/flashplayer/

• Compendium Institute: http://compendium.open.ac.uk/institute//index.htm

• FreeMind: http://freemind.sourceforge.net/wiki/index.php/Main_Page

• FreeMind Flash Browser: http://www.efectokiwano.net/mm/

• IEEE: http://www.ieee.org/portal/site

• IHMC CMapTools: http://cmap.ihmc.us/Index.html

• Java Runtime Environment: http://www.java.com/en/

• MediaWiki: http://www.mediawiki.org/wiki/MediaWiki

• Mindjet MindManager: http://www.mindjet.com/products/mindmanager_pro/default.aspx

• Platypus Wiki: http://platypuswiki.sourceforge.net

• Semantic MediaWiki: http://semantic-mediawiki.org/wiki/Semantic_MediaWiki

• W3C: http://www.w3.org

• Wikipedia: http://en.wikipedia.org/wiki/Main_Page

• WikiMindMap: http://www.wikimindmap.org

• WordNet: http://wordnet.princeton.edu/
