The Pennsylvania State University The Graduate School College of Earth and Mineral Sciences

GEOAGENT-BASED KNOWLEDGE SYSTEMS

A Thesis in

Geography

by

Chaoqing Yu

© 2005 Chaoqing Yu

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

December 2005

The thesis of Chaoqing Yu has been reviewed and approved* by the following:

Donna J. Peuquet Professor of Geography Thesis Adviser Chair of Committee

Brenton M. Yarnal Professor of Geography

Alan M. MacEachren Professor of Geography

John Yen Professor of Information Sciences and Technology

Roger M. Downs Professor of Geography Head of the Department of Geography

*Signatures are on file in the Graduate School.

ABSTRACT

Modern geography focuses on studying processes. In addition to observed phenomena, the study of geographic processes must (and does) place emphasis on understanding how components interact within geographic systems. As a fundamental tool for geographic representation and spatial analysis, current GISystems (geographic information systems) are nevertheless still data centered. While they are good at representing “what” and “where” information, they have limited capabilities in representing higher-level knowledge. This is because current GISystems lack a means of capturing and representing human understanding of geographic processes to address “how” and “why” questions. In addition, non-observational factors such as laws, policies, regulations, plans, and cultural elements (e.g., religions, customs) cannot be easily represented.

Instead of the traditional data-centered approach, this dissertation presents a knowledge-oriented strategy for the representation of geographic processes. To reach that end, two major steps are adopted: (1) introducing the concept of GeoAgents as the spatiotemporally distributed knowledge-representation components, and (2) presenting an integrated approach to incorporate multiple knowledge-representation techniques with geospatial databases. GeoAgents are defined in this dissertation as spatial, dynamic, and scale-dependent agents within an explicitly geographic context. By incorporating GeoAgents with graph-based concept maps, rule-based expert systems, quantitative models, and geospatial databases, this research develops a Java-based prototype, the GeoAgent-based Knowledge System (GeoAgentKS), that allows the representations of diverse kinds of geographic knowledge and spatial data to be integrated in a single cohesive software system.

To examine the knowledge-oriented strategy of geographic representation in real-world problems, GeoAgentKS is employed in a case study to represent the complex geographic processes relevant to community water systems (CWSs) in Central Pennsylvania. In this case study, geographic knowledge is captured via interpretation of pre-existing documents and computer-based concept-mapping interviews with domain experts. To evaluate the usability of GeoAgentKS, evaluation interviews with different experts and novices were conducted to assess the adequacy of the knowledge representation and its effectiveness in conveying knowledge. The experts in the evaluation interviews believed that it was possible to use GeoAgentKS to represent the complex, dynamic, and scale-dependent human-environment interactions, and that the knowledge stored in GeoAgentKS could be quickly learned by novices.

TABLE OF CONTENTS

List of figures
List of tables
Acknowledgements
Chapter 1 Introduction
  1.1 Introduction
  1.2 What are data, information, and knowledge?
  1.3 Geographic processes and knowledge representation
    1.3.1 Defining process
    1.3.2 Complexities in representing process
  1.4 Existing strategies for representing knowledge
  1.5 Goals and methods of the current research
  1.6 Case study: representing human-environment interactions relevant to CWSs
  1.7 Organization of this dissertation
Chapter 2 Existing knowledge-related representation techniques
  2.1 Bridging data and knowledge
    2.1.1 Categorization
    2.1.2 Fuzzy set theory
    2.1.3 Ontology
  2.2 Explicit and implicit knowledge representation techniques
    2.2.1 Rule-based knowledge systems
    2.2.2 Graph-based knowledge representation
    2.2.3 Implicit knowledge representation strategies
  2.3 Distributed knowledge representation: intelligent agents
    2.3.1 Definitions of intelligent agents
    2.3.2 Typology of agents
    2.3.3 Agent applications in GIScience
  2.4 Summary
Chapter 3 A description of GeoAgents and implementation in an integrated representation scheme
  3.1 Introduction
  3.2 Definition of GeoAgents
    3.2.1 The basic concept of GeoAgents
    3.2.2 Interactions between GeoAgents and geospatial databases
    3.2.3 Communication among GeoAgents
    3.2.4 Concept maps in representing world state dynamics
  3.3 A brief example: using multi-GeoAgents via concept graphs for representing geographic processes
  3.4 Facilitating geographic knowledge sharing and decision making
  3.5 Summary
Chapter 4 Implementation of the GeoAgent-based Knowledge System
  4.1 Introduction
  4.2 Open source software adapted for GeoAgentKS
    4.2.1 MadKit
    4.2.2 JESS
    4.2.3 Touchgraph
    4.2.4 GeoTools
  4.3 Integrating multiple representation techniques in GeoAgentKS
    4.3.1 The architecture of GeoAgentKS and overall information flow
    4.3.2 The user interface
  4.4 Knowledge acquisition for GeoAgentKS
  4.5 Summary
Chapter 5 Case study design: geographic knowledge acquisition, representation, and performance evaluation
  5.1 Objectives of the case study
  5.2 Overall methodology
    5.2.1 Knowledge acquisition and representation
    5.2.2 Evaluation of the knowledge representation
  5.3 Capturing knowledge and populating the in GeoAgentKS
    5.3.1 Capturing knowledge via interpretation of text-based documents
    5.3.2 Capturing domain knowledge via interviews
    5.3.3 Integration of the discrete knowledge representations
    5.3.4 Building a corresponding
  5.4 Evaluating the integrated knowledge base
    5.4.1 Evaluation by domain experts
    5.4.2 Evaluation by novices
    5.4.3 Scenario simulation for a comprehensive performance test of GeoAgentKS
  5.5 Summary
Chapter 6 Case study: representing the process of community water system management in Central Pennsylvania
  6.1 Introduction
  6.2 General characteristics of CWSs and an overview of the case study
  6.3 Knowledge acquisition and representation
    6.3.1 Formalizing behavioral rules and concept maps from documents
    6.3.2 Capturing experts' how/why knowledge in computer-supported interviews
  6.4 Integration of diverse knowledge representations and geospatial data
    6.4.1 Integrating diverse knowledge representations
    6.4.2 Integrating the knowledge representation with the database
  6.5 Evaluation and use of knowledge representation in GeoAgentKS
    6.5.1 The experts' evaluations
    6.5.2 Evaluations of non-experts
    6.5.3 Complex scenario simulation
  6.6 Summary
Chapter 7 Summary and discussion
  7.1 The research goal and approaches
  7.2 The functionality of GeoAgentKS
  7.3 Contributions of this research
  7.4 Future research challenges
  7.5 Conclusions
References
Appendices
  Appendix A: Agent-related functions in MadKit (http://www.MadKit.org)
    Appendix A.1 The general definitions and functions in MadKit
    Appendix A.2 Agent-related JESS functions provided by MadKit
  Appendix B: Two examples of the interviewing transcripts
    Appendix B.1 Transcript A
    Appendix B.2 Transcript B
  Appendix C: Examples of the execution results from GeoAgents’ rule firing for simulation example
    C.1 Independent responses to a power outage
    C.2 Local cooperative responses to a power outage
    C.3 Hierarchical social-environment interactions in drought conditions
  Appendix D: The standards of the indices for determining drought severity
    D.1 Precipitation Deficit Drought Indicators
    D.2 Drought triggering criteria for the stream flows, groundwater levels, and PHDI
    D.3 Drought triggering criteria for the reservoir in Central Pennsylvania

LIST OF FIGURES

Figure 2.1: (a) forward chaining; (b) backward chaining
Figure 2.2: A sample semantic network regarding a chair
Figure 2.3: A frame-based representation of a chair
Figure 2.4: A conceptual graph, representing ‘a cat is on a mat’
Figure 2.5: A neuron of the neural network
Figure 2.6: A subsumption architecture
Figure 2.7: The hierarchical conceptual framework for agents
Figure 3.1: The interactions among GeoAgents and geospatial databases
Figure 3.2: Using multi-GeoAgents for representing the social and natural elements influencing community water systems and the interaction among them
Figure 4.1: Connectivity of the major GeoAgentKS modules/facilities
Figure 4.2: The user interface of the GeoAgent-based Knowledge System
Figure 4.3: The properties of a concept node
Figure 4.4: Adding the data environment for the GeoAgent
Figure 4.5: GeoAgents are aware of their environmental status via checking the concept nodes
Figure 6.1: Centre County, Pennsylvania
Figure 6.2: The CWSs involved in the case study
Figure 6.3: The text of the Power Outage Plan of the College Township CWS
Figure 6.4: Sentence analysis of the Power Outage Plan of College Township CWS
Figure 6.5: Planning the GeoAgent's goal-driven actions from the Power-Outage Emergency Plan of the College Township CWS
Figure 6.6: Constructing concept maps to establish the GeoAgent’s environmental conditions from the text documents
Figure 6.7: The author (front) interviews a water manager (back) using computer-based concept mapping on May 31, 2004
Figure 6.8: (a) A part of the concept map of Millheim CWS
Figure 6.8: (b) The concept map of Upper Halfmoon CWS
Figure 6.8: (c) The concept map of Aaronsburg CWS
Figure 6.9: A portion of the pseudo code used for development of the communication rules in the CollegeTownship_CWS and Centre_Daily_Times GeoAgents
Figure 6.10: Linking the separate concept maps of individual CWSs to shared concept nodes
Figure 6.11: (a) Integrating the database with the interview-derived concept map of Aaronsburg CWS
Figure 6.11: (b) Integrating the database with the interview-derived concept map of Millheim CWS
Figure 6.12: (a) An early stage of the drought development on May 01, 1999
Figure 6.12: (b) GeoAgents automatically identified a drought warning and evoked social responses in drought warning operations
Figure 6.12: (c) The GeoAgents identified a drought emergency on July 17th, 1999; the drought-related laws and plans are activated

LIST OF TABLES

Table 1.1: Comparisons of the understandings of data, information, and knowledge
Table 1.2: Properties of data, information, and knowledge from the scientific perspective
Table 6.1: The duration, percentage of the information captured the three computer-based-concept-mapping interviews
Table 6.2: Summary of the students' evaluation
Table 6.3: The environmental conditions of the power outage in Example I
Table 6.4: The environmental conditions of the power outage in Example II

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my advisor, Donna Peuquet, for her support and encouragement throughout my graduate studies. My best thanks go to the other professors on my dissertation committee: Brent Yarnal, Alan MacEachren, and John Yen. My best thanks also go to David O’Sullivan and Mark Gahegan for their help. Without their help, I would never have been able to finish this dissertation.

I wish to express my sincere appreciation to Susan Spaugh, Noelle Capparelle, Bob Hibbert, and Michele Henry for their generous advice, assistance, and technical support during my study. I would also like to thank Ola Ahlqvist, Isaac Brewer, Guo Chen, Xiping Dai, Jianwei Dou, Sven Fuhrmann, Diansheng Guo, Allyson Jacobs, Junyan Luo, Rob Neff, Bill Pike, Dan Wei, Jessica Whitehead, and Biliang Zhou for the great time of working together and sharing many memorable moments.

Special thanks go to my wife, Xinghua Han, for her love, understanding, and support during this Ph.D. program.

Chapter 1

INTRODUCTION

"In the latter part of the 20th century there has been a substantial change in the nature of geographic knowledge. Throughout most of the history of the discipline, geographic knowledge has been declarative, i.e., it has focused on collecting and representing the physical and human facts of existence. In the latter part of this century there has been a change from inventory dominated activity to the creation of knowledge generated by emphasizing cognitive demands, such as understanding 'why' and 'how' in addition to 'what' and 'where'. This has required a change from an emphasis on form to an emphasis on process."

R. Golledge (2002, p. 1)

1.1 Introduction

With the advancement, availability, and popularization of modern computer technologies, GISystems (Geographic Information Systems) are being applied to an increasingly wide range of geographic domains including urban planning, land parcel and land-use change, natural resource assessment, public utility management, agricultural management, and traffic and transit monitoring. GISystems and related software have subsequently come to be viewed as essential tools for geographic representation, spatial analysis, and problem solving (Mark et al., 1996; Johnston, 1999; Longley, Goodchild et al., 2001).

The first GISystems were motivated by the need to record large collections of Earth-related data in an integrated manner, and to make rapid retrieval and simple calculation tasks more efficient than manually processing paper maps. Examples of early GISystems include CGIS (the Canada Geographic Information System, 1963; see Tomlinson, 1973) and MAGI (the Maryland Automated Geographic Information system, 1974; see http://www.msgic.state.md.us/msgicinf/magi.htm). Since 1992, GIScience (Geographic Information Science) has emerged as a distinct interdisciplinary field (Goodchild, 1992; Goodchild, 2004b). It has become recognized not merely as the concepts behind a collection of tools for geographic data handling and analysis, but as an integrated approach and methodology for understanding Earth-related phenomena.

Modern geography focuses on understanding geographic processes. The interactions among the components within geographic processes tend to be highly dynamic, interlinked, and scale-dependent. In addition to physical processes, human alteration of the Earth is substantial and growing (Vitousek, Mooney et al., 1997), and involves complex chains of cause-and-effect. For example, environmental water quality is affected by urbanization (housing and road density, etc.), which is in turn affected by public policy and economics governing how and where things may be built at both local and regional scales. Because of the complex and non-deterministic interactions of social and natural components, improving our understanding of geographic processes generally requires representation and analysis of higher-level, domain-specific human knowledge in addition to observational data. The complexity of the processes being investigated also requires the application of expertise from multiple knowledge domains.

There have been significant advances in statistical and other analytical capabilities over the 40 years since the first GISystems. One such advance is the recent development of collaboratories, or collaborative problem solving environments, allowing multiple users to bring differing knowledge domains to bear to solve a problem. Nevertheless, the basic approach of tabulating and analyzing what can be seen and counted has never changed since the first GISystems in the 1960s. Current GISystems are thus data-driven. Higher-level, derived information, which could aid in providing insights into how to interpret the observational data, is rarely stored. While there has been much recent attention in the GIScience research community to the representation of domain knowledge as ontologies, these efforts tend to focus on using such ontologies for resolving varying views in data interoperability (e.g., Fonseca, Egenhofer et al., 2001), data categorization (e.g., Gahegan et al., 2003; Smith and Mark, 2001), and database design (e.g., Frank, 2001). Interpretation of the data to derive more general insights is assumed to depend upon the human users. What can be derived from current GISystems is thus limited by the expertise, experience, and memories of the specific people using the system at a given time. This can be problematic, particularly in an emergency situation when a solution must be quickly derived.

This dissertation therefore aims to greatly extend the current capabilities of GISystems. The intention is to go beyond the limitations of data-driven approaches and (1) explore an integrated knowledge-oriented strategy for geographic representation and analysis, and (2) make explicit representation of process with an emphasis on how natural and social components interact dynamically in the Earth system.

Goodchild (2004b) asserted that three elements are required if GISystems are to gain the appropriate capabilities for dealing with process: (1) progress in representing time in GISystems and in the development of methods for the analysis of spatiotemporal data; (2) a much closer coupling between hypotheses about process and the methods of analysis and visualization implemented in GISystems; and (3) using "process objects" to represent process, which he described as like digital data but different in that they are dynamic rather than static. While he brought attention to this critical need and emphasized that temporal data is an essential ingredient, he offered no insights for how to represent or use derived information or prior knowledge stored within GISystems for studying process. Moreover, the term "process object" as he used it is ambiguous, but he seems to be suggesting a solely technological (and data-based) solution rather than a different conceptual approach.

The conceptual approach used in the current research is to include a knowledge layer that can help users more effectively interpret the large, heterogeneous stores of data now available. This approach requires both a means of representing and integrating space-time knowledge within the GISystem, and a means of integrating a highly interactive visualization facility that allows the user to easily gather higher-level information from the system and guide the analysis.

Before proceeding further, some basic concepts need to be clarified, including: What, exactly, are information and knowledge, as distinct from data? What is a geographic process? How does representation of process relate to knowledge? Ways to achieve the knowledge-oriented representation of geographic process will then be discussed in the remainder of this chapter.

1.2 What are data, information, and knowledge?

We often use the terms data, information, and knowledge interchangeably in everyday speech. For example, the Merriam-Webster English Dictionary (www.m-w.com) defines data as:

“1: factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation… 2: information output by a sensing device or organ that includes both useful and irrelevant or redundant information and must be processed to be meaningful 3: information in numerical form that can be digitally transmitted or processed”

information as:

“1: the communication or reception of knowledge or intelligence 2 a (1): knowledge obtained from investigation, study, or instruction (2): intelligence, news (3): facts, data b: the attribute inherent in and communicated by one of two or more alternative sequences or arrangements of something (as nucleotides in DNA or binary digits in a computer program) that produce specific effects c (1): a signal or character (as in a communication system or computer) representing data (2): something (as a message, experimental data, or a picture) which justifies change in a construct (as a plan or theory) that represents physical or mental experience or another construct d: a quantitative measure of the content of information; specifically : a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed 3: the act of informing against a person…”

and knowledge as:

“2 a (1) : the fact or condition of knowing something with familiarity gained through experience or association (2): acquaintance with or understanding of a science, art, or technique b (1): the fact or condition of being aware of something (2): the range of one's information or understanding c: the circumstance or condition of apprehending truth or fact through reasoning: COGNITION d: the fact or condition of having information or of being learned … 4 a : the sum of what is known: the body of truth, information, and principles acquired by mankind…"

This reflects a significant overlap in meaning among these terms in common usage, which has carried over into the GIScience literature. Nevertheless, there is general agreement on the distinctions among these terms in the scientific literature.

Based on Stenmark (2002), Table 1.1 summarizes the views of various authors. Longley et al. (2001) claim that data are independent from any particular context. Davenport (1997) argues that data are obtained from simple and direct observation. Most authors share the view that information has meaning. Longley et al. (2001) argue that information is interpreted from data according to some purpose. Knowledge is described very differently, using terms such as beliefs, concepts, insights, experience, and abstraction.

To summarize, data can be said to refer to uninterpreted (raw) facts; information contains fragmentary meaning in that it is interpreted and contextualized; and knowledge denotes a broader understanding that is cumulative and interlinked. Data, information, and knowledge are thus seen together as progressive levels of understanding (Ackoff, 1989). Similar to Quigley and Debons (1999), Zeleny (1987) asserts that these terms represent a hierarchy of knowing-nothing, knowing-what, knowing-how, and knowing-why. Machlup (1983) claims that knowledge is usually well organized, while information is always fragmented and less organized.

Table 1.1: Comparisons of the understandings of data, information, and knowledge

AUTHORS | DATA | INFORMATION | KNOWLEDGE
Zeleny (1987) | knowing-nothing | knowing-what | knowing-how and knowing-why
Machlup (1983) | raw, reification of information | fragmented, ephemeral, and empirical flow | structured, enduring, and consistent stock
Woodward (1992) | raw quantitative or qualitative facts for creating information | data ordered and contextualized in ways that gives them meaning | the cumulative understanding of information
Wiig (1993) | - | facts organized to describe a situation or condition | truths and beliefs, perspectives and concepts, judgments and expectations, methodologies and know-how
Nonaka and Takeuchi (1995) | - | a flow of meaningful messages | commitments and beliefs created from these messages
Spek and Spijkervet (1997) | not yet interpreted symbols | data with meaning | the ability to assign meaning
Davenport (1997) | simple observations | data with relevance and purpose | valuable information from the human mind
Davenport and Prusak (1998) | a set of discrete facts | a message meant to change the receiver’s perception | experience, values, insights and contextual information
Quigley and Debons (1999) | does not answer questions to a particular problem | answers the questions what, where, when, or who | answers the questions why or how
Choo, Detlor, et al. (2000) | facts | data vested with meaning | justified true beliefs
Bellinger, Castro, et al. (2000) | is raw; simply exists and has no significance beyond its existence | data that has been given meaning by way of relational connection | is the appropriate collection of information, such that its intent is to be useful
Schreiber, Akkermans, et al. (2000) | uninterpreted signals | meaning attached to data | (1) attach purpose and competence to information; (2) potential to generate actions
Longley, Goodchild, et al. (2001) | raw data, context-free | data being given some interpretation, serving some purposes | value of information being added by interpretation based on contexts, experiences, and purpose
Setzer (2001) | a sequence of quantified or quantifiable symbols | informal abstraction representing something of significance to a particular person | a personal, inner abstraction of something that has been experienced by someone
Lo, Yeung, et al. (2002) | a collection of facts in numerical values, characters, symbols, or signals | processed data that are meaningful, valuable and useful to users | the concepts of data and information

What (as well as where and when) questions can be answered using information, whereas knowledge addresses how something behaves and provides explanatory assumptions as to why certain circumstances may occur; in other words, knowledge amounts to a conceptual model of the world. While it is difficult to draw clear boundaries between these terms, there is a definite progression from the concrete and distinct to the abstract and integrated that can be summarized as shown in Table 1.2.

Table 1.2: Properties of data, information, and knowledge from the scientific perspective

Data
• raw and uninterpreted facts

Information
• has meaning attached
• interpreted from data
• what/where/when/who
• fragmentary and less organized

Knowledge
• cumulative understanding of information
• well organized
• how/why
• linked to judgment and actions

1.3 Geographic processes and knowledge representation

1.3.1 Defining process

As already stated, and as reflected in the quote from Golledge (2002) on the first page, modern Geography is increasingly concerned with understanding dynamic processes occurring in the landscape. Process goes far beyond simple sensory or observational information (Gaile and Willmott, 1989; Golledge, 2002). Based on the discussion above concerning knowledge vs. data as viewed in science, geographic process is about knowledge as it relates to how/why as a level of explanation and understanding. Process, therefore, transcends observed facts and is rather a holistic and unified description of the entire form and mechanism of a dynamic system. This can be expressed simply as:

Form + Mechanism = Process

Form here refers to the component parts of a phenomenon and their spatial arrangement. The term mechanism refers to how something operates and how components interact dynamically.

1.3.2 Complexities in representing process

Different types of models as used in science (mostly in mathematical form) correspond to different levels of understanding of a given process, from description (derived by knowing what) to simulation and prediction (derived by knowing how/why). Descriptive models address only the form of a phenomenon. These are typically the first types of models developed. As integrated understanding of the mechanism increases, simulation and prediction models can be built. These incorporate both mechanism and form, and thereby represent process. While simulation and prediction models exist in a variety of earth-related fields (e.g., meteorology, economics, transportation, and ecology), these models are largely dependent upon the availability of data and have limited capabilities to represent complex social systems (e.g., cultural and legal systems).

In addition to using mathematical models, many GIScientists have attempted to represent geographic processes through building space-time database models. For instance, Langran (1989) proposed the models of “amendment vectors” and the “temporal grid” to add a time component to vector- and raster-based spatial databases. Raper and Livingstone (1995) presented an object-oriented spatial model (OOgeomorph) that adds a time property to geographic objects for representing coastal geomorphological phenomena over a raster data set (also see Wachowicz (1999) for object-oriented approaches). Peuquet and Duan (1995) proposed an event-based spatio-temporal data model (ESTDM) to represent temporal information of land-use change (also see Peuquet, 1999). Yuan (2001) integrated raster and vector data representations with temporal events for modeling precipitation changes in Oklahoma using ArcInfo GIS. Wang and Cheng (2002) developed a mobility-oriented spatio-temporal data model to support transport modeling with consideration of traveling activities by extending ArcView GIS. Shirabe (2004) attempted to extend the raster data model with the consideration of spatio-temporal criteria to support decisions in scheduling timber harvest. In general, these research efforts have focused on describing the observed changing phenomena in GISystems, rather than human understanding of the mechanism behind the phenomena.

To fully address geographic processes, representing human knowledge of how the world works then becomes essential. Compared with the complexity of the world and the totality of human knowledge, an individual's knowledge is very limited. No person can be an expert in everything in today's society, and no one’s knowledge is perfect or complete. Professionals in various disciplines and fields tend to specialize, allowing for relatively more complete knowledge within a confined knowledge domain. Because of a unique totality of life experiences and different social-cultural contexts and perspectives, different people also do not interpret the same geographic phenomena in exactly the same way or give exactly the same explanation for what is observed (Brodaric, Gahegan, et al., 2004). For instance, a meteorologist and a farmer would view the same rain event very differently, because each interprets it within a different problem context (air mass circulation vs. crop growth). Even different people within the same scientific field or problem context may give very different explanations due to different goals and beliefs. In addition, many geographic phenomena evolve gradually and continuously over space and time without clear boundaries. To represent human understanding of geographic process, therefore, it is necessary to address the context-dependency and inexactness of geographic knowledge to facilitate sharing how the world is conceptualized by different people.

1.4 Existing strategies for representing knowledge

Mathematics is widely employed for representing scientific understanding of the relations amongst the elements involved in a system. Following this tradition, many existing geographic models use mathematical formulae to simulate and predict geographic phenomena and have a basis in quantifiable, observable features.

In the Artificial Intelligence community, rule-based expert systems (i.e., systems that represent knowledge in the form of IF-THEN rules) are broadly utilized to formalize sets of interrelated qualitative rules to achieve complex problem solving (Luger, 2002). Graph-based knowledge representation techniques, such as semantic networks (Quillian, 1968), frames (Minsky, 1975), concept maps (Novak and Gowin, 1984), and conceptual graphs (Sowa, 1984), have been widely applied to represent qualitative and individualized understanding of relations. Each of these knowledge representation techniques has inherent strengths and weaknesses.

Agent-based technologies have been rapidly growing since the early 1990s. An agent is conceptualized as a software entity situated within a part of a simulated virtual environment (e.g., its digital surroundings, other agents, and networked communities) with its own knowledge domain. It can interact with this environment, can communicate with other agents, and is capable of performing goal-driven actions (Genesereth and Ketchpel, 1994; Franklin and Graesser, 1997; Hogg and Jennings, 1997; Rana, Preist et al., 2000). Problems are solved within multi-agent systems via communication and cooperation among agents (Nwana, 1996; Wooldridge and Jennings, 1999). Agents also have the capability for storing knowledge and for learning as the basis of their actions. The mechanism for internal knowledge and inter-agent relationship representation can (and often does) include other graph-based, rule-based, and mathematical techniques.

Agent-related technologies have already been adopted for various geographic applications (Batty and Jiang, 1999; Bryson, Luck et al., 2000; Manson, 2000; O'Sullivan and Haklay, 2000; Batty, 2001; Conte, Edmonds et al., 2001; Parker, Berger et al., 2001; Raubal, 2001; Parker, Manson et al., 2002; Parker and Meretsky, 2002; Sengupta and Bennett, 2003). Existing applications of agents within the GIScience literature can generally be classified into two distinct areas: emergence-oriented agent-based modeling (ABM), and distributed problem solvers.

From the emergence-oriented ABM perspective, an interconnected complex system consists of relatively simple elements that are organized to form more intelligent, more adaptive higher-level behaviors (Johnson, 2001). The general purpose of an ABM application is to discover the simple rules that govern complex systems. Before the appearance of the term intelligent agents, what are traditionally called self-organizing cellular automata (CA) and the Schelling Model for social segregation had already been utilized in geographic contexts (see Couclelis, 1985; Couclelis, 1988; Schelling, 1969; Schelling, 1971; Schelling, 1978). Agents in this emergence-oriented modeling strategy are often called artificial-life agents (Langton, 1988; Adami, 1998; Dean, 2003). More recent examples of ABM applications include simulation of societies (Gilbert and Doran, 1994; Epstein and Axtell, 1996; Gimblett, 2002), the self-organizing city (Portugali, Benenson et al., 1994; Portugali, Benenson et al., 1997), and the modeling of land use and land cover change (Manson, 2000; Parker, Berger, Manson and McConnell, 2001; Parker, Manson et al., 2003). In these bottom-up modeling applications (Brooks, 1991b), agents are relatively homogeneous. They have no inter-agent communication and no independent knowledge base (Franklin and Graesser, 1997). In addition, their actions are usually decided by simple behavioral rules.

Another way in which agent-related technologies have been applied in geography is from a tool-oriented perspective. In this perspective, agents are viewed as distributed problem solvers, also called "software agents" in AI (artificial intelligence). They are viewed as tools for producing “smarter” solutions for specific tasks. For example, Rodrigues and Raper (1999) define the concept of “spatial agents” for performing spatial data mining, improving GISystem interfaces, facilitating spatial tasks, and connecting different spatial systems. Another example is the “Distributed Intelligent Geographical Modeling Environment” (Sengupta and Bennett, 2003), which was developed for spatial decision support based on Web-accessible repositories of spatial data and models. This type of agent can have very diverse and complex internal structures (incorporating other representational elements such as behavioral rules), an explicit communication language, and sophisticated inter-agent cooperation.

Although the above techniques can represent human knowledge from the given perspectives, compared with the diverse human understanding of the world, no single technology is able to address the complexity of geographic processes. Instead of using these techniques separately, this dissertation attempts to incorporate multiple knowledge-representation techniques with the conventional data-centered GISystems to represent the geographic world in an integrated way.

1.5 Goals and methods of the current research

The primary goal of this dissertation is to propose a new form of GISystem that encompasses means of representing and using high-level, integrative knowledge concerning both the mechanism and the form of geographic processes. The overarching objective is to provide an integrated representation approach that links spatially distributed and multiple-scaled knowledge representations with the conventional observational spatial databases found in existing GISystems. Achieving this objective will represent progress beyond the current what/where, data-oriented focus in GISystems toward an emphasis on how/why knowledge of geographic process.

To represent process, the methods used for geographic knowledge representation should be able to deal with diverse perspectives of how humans understand the dynamic interactions among geographic components. They should also allow the stored knowledge to be used by GISystem software for deriving conclusions in an automated fashion, and by individuals and groups of diverse individuals as an aid to, and extension of, their own knowledge for facilitating process modeling and problem-solving. In the current research, there are two major steps to approach these needs: (1) the introduction of GeoAgents as the spatiotemporally distributed knowledge-representation layer, and (2) the use of multiply linked GeoVisualization techniques in a highly interactive environment.

Although the agent-based approach, as discussed above, has already been applied in geographic contexts, agents are generally considered as a technology, or a methodology, to solve task-specific problems. This dissertation extends the agent-based approach to a more generic notion of a "geographic agent" or "GeoAgent", which more broadly integrates a number of knowledge representation techniques, has direct links with a spatial database, and communicates with the user interactively through GeoVisualization. The approach used in this dissertation integrates contemporary technologies in GIScience and artificial intelligence to significantly enhance the representational power available within GISystems, and subsequently their analytical power.

To demonstrate this approach, this research develops a Java-based prototype, the GeoAgent-based Knowledge System (GeoAgentKS), which integrates GeoAgents with graph-based concept maps, rule-based expert systems, models, and geospatial databases.

1.6 Case study: representing human-environment interactions relevant to CWSs

To test the knowledge-oriented representation strategy on a real-world problem, the GeoAgentKS prototype is applied in a case study to represent the dynamic, complex, and scale-dependent human-environment interactions relevant to community water systems (CWSs) in Central Pennsylvania. In this case study, a systematic set of methods is employed to capture and formalize geographic knowledge as GeoAgents' behavioral rules, concept maps, and models in GeoAgentKS via (1) interpretation of written documentation of laws, regulations, and plans, and (2) computer-supported interviews with domain experts. Geospatial data are collected after the knowledge-acquisition process and integrated with the geographic knowledge representation. Different experts and novices are then interviewed to evaluate the correctness, adequacy, and usability of GeoAgentKS.

Overall, this dissertation aims to break through the current data-centered paradigm in GIScience toward a knowledge-oriented representation and analysis of geographic processes. This requires using multiple knowledge and data representation technologies in an integrated way. The integrated representational approach implemented in GeoAgentKS provides a means of representing the complexity of geographic processes, and a means for communicating this complexity in an intuitive way as an aid to decision-makers.

1.7 Organization of this dissertation

The second chapter provides a literature review of knowledge representation, discussing the range of existing techniques such as graphs, expert systems, and intelligent agents, as well as the current research efforts on knowledge representation within GIScience. Chapter 3 introduces the concept of GeoAgents in detail and gives a brief example of how GeoAgents can be employed in representing the complexity of a real-world problem. Chapter 4 illustrates the architecture of the GeoAgent-based Knowledge System (GeoAgentKS) and describes how it is implemented. Chapter 5 describes the research design of using the GeoAgentKS in the case study. Chapter 6 describes the detailed methodologies and results of the case study relevant to community water systems (CWS) in Central Pennsylvania. Finally, Chapter 7 provides a summary of this research, and suggests potential research topics for future study.


Chapter 2

EXISTING KNOWLEDGE-RELATED REPRESENTATION TECHNIQUES

“To solve really hard problems, we’ll have to use several different representations.”

M. Minsky (1991, p38)

As described in Chapter 1, there can be differing views of geographic phenomena depending on human individuals’ expertise and problem contexts. There are also limitations in the knowledge of individuals, inherent fuzziness of object definitions and variable levels of exactness of spatial information. To increase our understanding of the complexity of geographic processes, it is important to represent the knowledge of such diverse views and uncertainty in GISystems.

Although a number of knowledge-representation techniques have been developed in various fields, including mathematics and Artificial Intelligence (AI), for representing diverse kinds of knowledge, no single technique has the capability of representing the complexity of the social and natural interactions in geographic processes. The current chapter focuses on existing knowledge-representation approaches, discusses their applications in GIScience, and identifies the feasibility and potential problems of these approaches for use in an integrated knowledge-oriented strategy for representing geographic processes. Since process includes form and mechanism (page 9), any attempt at representing geographic processes needs to consider both observed phenomena and a high-level understanding of the physical and natural relations. In other words, it is necessary to integrate geographic data with knowledge representation in GISystems. In recent years, research efforts in categorization, fuzziness, and ontology have provided a foundation to link human knowledge with observational data. For representing geographic processes, in addition to considering these research efforts, it is also important to address the dynamic interactions and causes-and-effects among the involved geographic components. Before introducing the new methodologies proposed in this research, this chapter provides a background review of the existing knowledge representations and discusses their feasibility for the representation of geographic processes.

Section one below reviews the recent research on categorization, fuzziness, and ontology. Section two describes and compares approaches that have been developed for knowledge representation, including rule-based expert systems, concept maps, and frames. Section three provides a description of the various types and capabilities of the intelligent agents that can be used to store knowledge and perform knowledge-driven actions.

2.1 Bridging data and knowledge

As discussed in section 1.3.2, representing geographic knowledge needs to address context dependency and inexactness, and also needs to facilitate sharing different people’s conceptualizations of the world. Techniques for dealing with context dependency and inexactness have been approached from both a practical data handling perspective and from a cognitive perspective. In current GIScience, the study of categorization provides a way to deal with how the world is conceptualized from the observed phenomena. The use of fuzzy set theory provides a means of representing uncertainty. Ontology-related research attempts to facilitate sharing knowledge among different domains. As categorization, fuzzy set theory, and ontology have relevance in an anticipated integrated approach, I review these below.

2.1.1 Categorization

Categorization is an essential human cognitive strategy for reducing unneeded detail. It plays a major role in using knowledge schemata (or mental models) to structure what we know and what we see (MacEachren, 1995). Lakoff (1987) considered a category to be a grouping of things that are considered similar, or are treated in a similar way. Rosch (1978) asserted that categories are structured information based on the perceivers' interpretation of the world. The function of the category structure is to manage a maximum amount of information with the least amount of effort. Tversky (1992) asserted that by using categories it is possible to reduce the number of properties needed for a functional cognitive representation to a few simple and general properties (also see Peuquet, 2002).

Suchan (1998) described differences between classical category theory and prototype category theory as discussed within the cognitive literature. In the former, category members are treated as homogeneous, with no overlaps between categories. In the latter, categories can be internally heterogeneous, with idealized examples, graded memberships, and indeterminate boundaries. A prototype generally refers to the best or ‘prototypical’ example. Lakoff (1987) proposed what are known as radial categories. Radial categories also include examples that are less typical and may differ from the prototype in one or more features.

A number of researchers have applied cognitive category theory within GIScience and Geography. Perhaps one of the best known geographic examples is the Anderson hierarchical land use and land cover classification system, which was developed by the USGS (U.S. Geological Survey, see Anderson, Hardy et al., 1976) and quickly became a de facto standard for defining land use in spatial databases. Suchan (1998) applied prototype category theory in her rural vs. urban categorization study, and concluded that this theory can be a valuable strategy for extending the capabilities of representation in GISystem databases. Lloyd et al. (2002) conducted cognitive experiments to test how background knowledge and category levels could affect categorization of aerial photographs by human subjects. They found that all subjects could more easily categorize higher-order land-use classes (e.g., Agriculture, Forest, and Water) than lower-order categories (e.g., Commercial, Industrial, and Residential). Also, geographers had more success than non-geographers during a single categorization round.

With the advancement of remote sensing and other data-acquisition technologies, massive amounts of geographic data are now available to allow quick detection of changes in geographic processes. The large data sets, on the other hand, challenge human capability in direct data interpretation and effective data use. Categorization provides a means to group and organize the observed data in a way that is more natural to how humans cognitively store and use geographic knowledge.

2.1.2 Fuzzy set theory

The concept of a fuzzy set was introduced by Zadeh (1965), and provides a quantitative approach for dealing with vagueness or non-discrete category membership. Fuzzy set theory allows gradations of membership to be expressed as degrees between 0 and 1, rather than as traditional Boolean (i.e., yes or no) memberships.
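The contrast between Boolean and graded membership can be illustrated with a minimal Python sketch. The category ('steep slope'), the threshold values, and the function names are hypothetical choices made only for illustration; the operators shown are the standard minimum, maximum, and complement operators associated with Zadeh's formulation.

```python
def boolean_steep(slope_deg: float, threshold: float = 15.0) -> bool:
    """Crisp (Boolean) membership: a slope either is 'steep' or is not."""
    return slope_deg >= threshold


def fuzzy_steep(slope_deg: float, low: float = 5.0, high: float = 25.0) -> float:
    """Graded membership in 'steep': 0 at or below `low`, 1 at or above
    `high`, and rising linearly in between (illustrative values only)."""
    if slope_deg <= low:
        return 0.0
    if slope_deg >= high:
        return 1.0
    return (slope_deg - low) / (high - low)


def fuzzy_and(a: float, b: float) -> float:
    """Intersection of two membership degrees (minimum operator)."""
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    """Union of two membership degrees (maximum operator)."""
    return max(a, b)


def fuzzy_not(a: float) -> float:
    """Complement of a membership degree."""
    return 1.0 - a


if __name__ == "__main__":
    for slope in (2.0, 14.9, 15.0, 30.0):
        print(slope, boolean_steep(slope), fuzzy_steep(slope))
```

Under these assumed thresholds, a slope of 14.9 degrees is simply 'not steep' in the Boolean version, whereas the fuzzy version assigns it a partial degree of membership close to that of a 15-degree slope.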

In recent years, fuzzy set theory has been applied in various geographic contexts, including accuracy assessment (Woodcock and Gopal, 2000), fuzzy measures in multi-criteria evaluation (Jiang and Eastman, 2000), fuzzy cellular automata models of urban growth (Liu and Phinn, 2001), modeling uncertainty in natural resource analysis (Davis and Keller, 1997), modeling time in GISystems (Dragicevic and Marceau, 2000), assessing similarity of categorical maps (Hagen, 2000), hierarchical fuzzy pattern matching in land use maps (Power, Simms et al., 2001), and fuzzy classifications (Irvin, Ventura et al., 1997; Wilson and Burrough, 1999; Lucieer and Kraak, 2004).

In general, these applications have focused on handling uncertainty inherent in the data in order to derive more meaningful information for the specific analytic task at hand, rather than as a top-down knowledge-oriented strategy for understanding the complexity of geographic processes (see Liu and Satur 1999; McNeese, et al. 2000; Perusich and McNeese 2005).

2.1.3 Ontology

Ontology has recently become a dominant theme in GIScience, particularly as a strategy for defining concepts and their interrelationships for shared databases. Nevertheless, the term is not well understood. Much research in GIScience involving geographic categorization and cognition has become ‘ontological,’ perhaps partly because of the popularity of the term. The basic purpose of an ontology for any shared geographic database is to establish a set of agreed-upon objects or elements from the users’ perspective that can be commonly understood and shared in different problem or application domains.

According to Smith and Mark (2001), ontology has two distinct perspectives, one from philosophy, and the other from information science or information technology (IT). As a branch of philosophy, ontology is defined as dealing with the nature and the organization of reality (Guarino and Giaretta, 1995). In the information science sense, an ontology is a ‘neutral and computationally tractable description or theory of a given domain which can be accepted and reused by all information gatherers in that domain’ (see Smith and Mark 2001, p594). In other words, ontology in information science is viewed as a methodology for formalizing what is known conceptually from the users' point of view (Gruber, 1993).

As described by Guarino (1995), GIScience uses the information science perspective of ontology. Ontology is thereby used as a tool for overcoming semantic differences and for finding commonalities among user views for the purpose of developing data models (and databases) that can be shared among a broader user community. It ‘starts with conceptualizations, and goes from there to a description of corresponding domains of objects or closed world data models’ (see Smith and Mark, 2001, p594).

Ontology has recently been applied to problems of interoperability (Wiederhold, 1994), including semantic granularity (Fonseca, Egenhofer et al., 2001), ontology-based metadata generation (Stuckenschmidt and Harmelen, 2001), and ontological geographic category investigation (Smith and Mark, 2001). Frank (2001) attempted to develop a single, agreed-upon ontology to maintain consistency in multiple spatial databases. Kokla and Kavouras (2001) employed the notion of ontology to allow information exchange among different domain ontologies in order to facilitate sharing and reusing geographic information within diverse data standards and different geographic concepts. Gahegan et al. (2003) developed an ontological tool, ConceptVISTA, and applied it to describe the hierarchies of concepts for land cover classification. Cruz et al. (2004) developed an approach for aligning the concepts of different ontologies that allows users to query hundreds of databases using a single query that hides the underlying heterogeneities. Fonseca and Martin (2004) investigated the importance of space and time within the structure of ecological ontologies. Jones et al. (2004) incorporated a geographic ontology into their spatial search engine for Web-document collection, spatio-textual indexing, and metadata extraction. Overall, these research efforts provide examples of sharing knowledge among different domains, and can potentially be used for representing diverse human understanding of complex geographic processes.

In summary, the research on categorization, fuzziness, and ontology establishes the connections between the observed data and high-level human knowledge, and provides the necessary ways to describe how the component parts of an observed phenomenon are conceptually categorized, represented, and shared. Nevertheless, these research efforts have focused on the representation of what/where/when (see Mennis et al. 2000) rather than of how and why questions. The current research aims to address this deficiency with a focus on capturing and representing human understanding of the mechanism, the cause-and-effect chains, and the dynamic interactions in geographic processes. The rest of this chapter discusses the potential knowledge-representation technologies and their feasibility for representing geographic processes.

2.2 Explicit and implicit knowledge representation techniques

Existing knowledge representation techniques for representing derived, higher-level knowledge for subsequent storage and use in a computing environment have been predominantly developed within Artificial Intelligence (AI). This section provides a brief introduction to the most popular strategies for either explicit or implicit knowledge representations, including rule-based systems, graph-based techniques, neural networks, and the subsumption architecture. Examples of previous research using most of these techniques within GIScience will also be given.

2.2.1 Rule-based knowledge systems

Rule-based production systems are known as a type of expert system since they are intended to simulate the reasoning process of a human expert. According to Luger (2002, see p248), an expert system allows easy update, maintenance, and reuse of the stored knowledge; uses (necessarily) imperfect and context-dependent knowledge to obtain useful solutions to problems; and provides explanation subsystems to support how/why explanations. A rule-based expert system is comprised of two parts: the knowledge base, which represents knowledge within a given domain, and the inference engine, which evaluates conclusions from the knowledge base via a logical reasoning process.

Domain knowledge in the knowledge base is usually expressed as qualitative IF … THEN … rules:

IF <condition> THEN <conclusion>

The ‘conclusion’ part is also called ‘action,’ so that the rules are usually described as condition-action rules. To implement the general knowledge represented in the rules, case-specific FACTS are needed to meet the conditions in the IF part of the rules. Facts are unconditional information. They are assumed to be true at the time they are used.

The inference engine has three main functions: (1) collecting rules whose conditions in the IF parts match the available FACTS, (2) performing the actions in the ‘THEN’ parts of rules that are executed (or fired), and (3) using a conflict-resolution strategy to ensure that only one rule is fired when the conditions of more than one rule are matched.

[Figure 2.1 diagram, panels (a) and (b): Rules 1, 2, and 3 link FACT1 to FACT6, through AND/OR conditions, to Decisions A, B, and C.]

Figure 2.1: (a) forward chaining; (b) backward chaining

There are two strategies for evaluating the rules: forward chaining and backward chaining (see Figure 2.1). Forward chaining is also known as data-driven inference, and backward chaining as goal-driven inference (Russell and Norvig, 1995; Hopgood, 2001). In the forward-chaining strategy, the rules are fired whenever the facts can satisfy the IF parts of these rules (e.g., in Figure 2.1 (a), rules 1 and 2 fire first, then rule 3). In the backward-chaining strategy, the inference engine seeks steps to activate rules whose preconditions are not yet met (e.g., rule 3 in Figure 2.1 (b)). A set of explicit goals is required, consisting of statements about which rules (e.g., rules 1 and 2 in Figure 2.1 (b)) need to be used in the next step. The rules are fired only when they can potentially satisfy a goal. When more information is needed, new sub-goals are automatically created to fire additional relevant rules until the goal is satisfied or is determined to be non-satisfiable (Winston, 1993; Forgy, 2004). Eventually, a decision must be justified by facts, which serve to contextualize the rule-based knowledge within a given situation.
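The two evaluation strategies can be sketched in a few lines of Python. This is a simplified illustration rather than the design of any particular expert system shell: the three rules are hypothetical, only loosely modeled on the rule names used with Figure 2.1, every rule condition is treated as a plain conjunction, and conflict resolution is omitted, so all matching rules are fired.

```python
# Each hypothetical rule maps a set of conditions (the IF part) to one
# conclusion (the THEN part). The wiring of the rules is an assumption.
RULES = {
    "rule1": ({"FACT1", "FACT2"}, "DecisionA"),
    "rule2": ({"FACT5", "FACT6"}, "DecisionB"),
    "rule3": ({"DecisionA", "DecisionB"}, "DecisionC"),
}


def forward_chain(facts, rules=RULES):
    """Data-driven inference: repeatedly fire every rule whose IF part is
    satisfied by what is currently known, until nothing new is concluded.
    Returns the final set of known statements and the firing order."""
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, (conditions, conclusion) in rules.items():
            if name not in fired and conditions <= known:
                known.add(conclusion)
                fired.append(name)
                changed = True
    return known, fired


def backward_chain(goal, facts, rules=RULES, _seen=frozenset()):
    """Goal-driven inference: a goal holds if it is a given fact, or if some
    rule concludes it and each of that rule's conditions holds as a sub-goal."""
    if goal in facts:
        return True
    if goal in _seen:  # guard against circular rule sets
        return False
    for conditions, conclusion in rules.values():
        if conclusion == goal and all(
            backward_chain(c, facts, rules, _seen | {goal}) for c in conditions
        ):
            return True
    return False


if __name__ == "__main__":
    facts = {"FACT1", "FACT2", "FACT5", "FACT6"}
    print(forward_chain(facts)[1])
    print(backward_chain("DecisionC", facts))
```

With the four facts supplied, the forward-chaining function fires rules 1 and 2 before rule 3, while the backward-chaining function starts from the goal and works down to the facts that justify it. If one of the supporting facts is withheld, the goal is reported as non-satisfiable.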

Rule-based knowledge systems have been applied in GIScience for the purposes of database management, data analysis, or decision support, but few of them have been used for the knowledge-oriented representation of complex geographic processes. In early examples, Smith and Peuquet (1984) used a semantic net and an expert system shell for context-dependent, knowledge-based search in large spatial databases. Mackay et al. (1993) used a knowledge-based system for managing spatiotemporal ecological simulations. Ross (1993) applied an expert system for soil erosion mitigation in logging operations on steep lands. Loh et al. (1994) incorporated a rule-based expert system with a GISystem for forest resource management. Frank (1996) used a Spatial Expert System to encode cardinal topological directions in a qualitative manner for spatial reasoning. Ferrier and Wadge (1997) integrated the ArcInfo GISystem (see www.esri.com) and a rule-based expert system called NEXPERT OBJECT for geological analysis of sedimentary basins. More recently, Lukasheh et al. (2001) reviewed methodologies for integrating expert systems, GIS, and Decision Support Systems (DSS), and their application in landfill design and management. Prasad and Sinha (2003) developed an approach that uses an expert system for natural resource database management to facilitate remote sensing applications. Liang et al. (2004) applied a GIS-based expert system for agricultural development in Gansu Province, China.

To represent geographic processes, in addition to the above applications, it is possible to use expert systems to store the behavioral rules of the relevant geographic components so as to represent the dynamic interactions among these components. Nevertheless, because expert systems conventionally have been utilized in a centralized fashion, they have limited capabilities to represent the spatially distributed interactions within a geographic system. Therefore, more knowledge-representation technologies are required to address the representation of the complexity of geographic processes.

2.2.2 Graph-based knowledge representation

Much of human knowledge is less well-defined than the expert systems just described require. Rule-based systems work best for very constrained problem contexts. In the hope of increasing the flexibility of computer-based knowledge representation, AI researchers have developed graph-based representational techniques intended to simulate the relationships of concepts in human cognition (e.g., for representing human language) as a means of learning more about how the human mind works. This approach is based on what is known as associationist theory (see Luger, 2002, p199). For example, the concept of ‘bird’ is related to other concepts such as ‘feather,’ ‘wing,’ ‘fly,’ ‘animal,’ and ‘egg.’ Using graph-based methods, concept inheritance and exception can be easily represented. Concept inheritance defines a ‘kind-of’ relationship between classes. A subclass inherits all the attributes of its super-class. For example, the class of ‘canary’ inherits all the attributes of the class of ‘bird.’ Concept exception also defines a ‘kind-of’ hierarchy, in which most of the attributes of the super-class are inherited by the sub-class, but some attributes are not inherited in specific cases. For example, an ostrich is a bird, which has feathers and wings like other birds, but cannot fly.
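Concept inheritance and exception map naturally onto class inheritance and attribute overriding in an object-oriented language. The following Python sketch uses the bird, canary, and ostrich examples from the text; the attribute names are illustrative assumptions, not a representation taken from any cited system.

```python
class Bird:
    """Super-class holding the attributes shared by birds in general."""
    has_feathers = True
    has_wings = True
    can_fly = True


class Canary(Bird):
    """Concept inheritance: a canary is a kind of bird and inherits
    every attribute of Bird, while adding one of its own."""
    can_sing = True


class Ostrich(Bird):
    """Concept exception: an ostrich is a kind of bird and inherits
    feathers and wings, but overrides the inherited ability to fly."""
    can_fly = False


if __name__ == "__main__":
    for cls in (Canary, Ostrich):
        print(cls.__name__, cls.has_feathers, cls.has_wings, cls.can_fly)
```

Here the exception is expressed locally in the sub-class, so the general statement that birds can fly does not need to be rewritten for the single case in which it fails.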

The original purpose in the development of associationist theory was simulation of the neural interconnections within the human brain. A number of knowledge representation technologies have been derived using this approach, including semantic networks (Collins and Quillian, 1969), concept maps (Zaff, McNeese et al., 1993), frames (Minsky, 1975), and conceptual graphs (Sowa, 1984). Each of these is discussed below.

2.2.2.1 Semantic networks and concept maps

Semantic networks and concept maps are similar network approaches differentiated mostly by application and disciplinary context. This technology is known in AI and computer science generally as semantic networks, but is often called concept maps elsewhere. For both, there are two basic labeled components: nodes and links.

nodes (points/vertices): representing concepts, instances, or events

links (arcs/edges): representing relationships between nodes, such as is-a, part-of, kind-of, interactive relations (e.g., used-by, caused-by, made-of)

Semantic networks and concept maps are commonly represented graphically, where the links are labeled with relationship names, and the direction of the relationship is denoted via arrows, as shown in Figure 2.2. The simple example semantic network shown here can be interpreted as: Tom is a student of the Geography Department. He is using chair #1, which is a wooden yellow chair owned by the Department. A chair consists of different parts, including legs, a seat, and a back.

[Figure 2.2 diagram: the nodes furniture, chair, chair #1, back, seat, legs, Geography Department, Tom, yellow, and wood, joined by links labeled kind_of, part_of, isa, owned_by, student_of, used_by, made_of, and is.]

Figure 2.2: A sample semantic network regarding a chair
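A semantic network of this kind can be stored as a list of labeled, directed links (subject, relation, object). The Python sketch below encodes the chair example described above; the direction assigned to each link and the helper function names are my reading of the example and should be treated as assumptions.

```python
# Each triple is (source node, link label, target node).
TRIPLES = [
    ("chair", "kind_of", "furniture"),
    ("back", "part_of", "chair"),
    ("seat", "part_of", "chair"),
    ("legs", "part_of", "chair"),
    ("chair #1", "isa", "chair"),
    ("chair #1", "owned_by", "Geography Department"),
    ("chair #1", "used_by", "Tom"),
    ("chair #1", "made_of", "wood"),
    ("chair #1", "is", "yellow"),
    ("Tom", "student_of", "Geography Department"),
]


def targets(node, relation, triples=TRIPLES):
    """Nodes reached from `node` by following `relation` links forward."""
    return [t for s, r, t in triples if s == node and r == relation]


def sources(node, relation, triples=TRIPLES):
    """Nodes that point to `node` through `relation` links."""
    return [s for s, r, t in triples if t == node and r == relation]


def ancestors(node, triples=TRIPLES):
    """All categories reachable by climbing isa / kind_of links."""
    found = []
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for s, r, t in triples:
            if s == current and r in ("isa", "kind_of") and t not in found:
                found.append(t)
                frontier.append(t)
    return found


if __name__ == "__main__":
    print(targets("chair #1", "used_by"))
    print(sources("chair", "part_of"))
    print(ancestors("chair #1"))
```

Queries then amount to following links: the parts of a chair are the sources of its part_of links, and climbing the isa and kind_of links from chair #1 shows that it is a chair and therefore also a piece of furniture.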

One well-known study on semantic networks within AI was Collins and Quillian's (1969) experiment in modeling human information storage. They asked people different questions about birds, such as ‘Is a canary a bird?’ ‘Can an ostrich fly?’ or ‘Can a canary sing?’ and measured the response time to answer such questions. They argued that their results provided psychological evidence that human knowledge was organized in a hierarchical network fashion. This was based on the observation that more time was needed when searching multiple levels of connected concepts than when searching more directly linked ones.

Novak defines a ‘concept’ within a concept map as a ‘perceived regularity in events or objects, or records of events or objects, designated by a label’ (http://cmap.coginst.uwf.edu/info/printer.html). Novak and Gowin (1984) consider concept mapping as a hierarchical process of organizing concepts and relationships.

While concept maps and semantic networks are very similar, some researchers have described some differences between them. According to Zaff et al. (1993), there are two ways in which they differ. First, concepts in semantic networks are organized in a principally hierarchical fashion with relatively few links. In contrast, concept maps are constructed with heterogeneous links emerging among the concepts, with not as much hierarchical structure (also see Bruillard and Baron 2000). Second, it is possible to include a time-line with the concept map in order to represent temporal relationships (Zaff et al. 1993).

Because semantic networks and concept maps can be used to provide users with interrelated information, they have been applied in GIScience for geographic information retrieval, hierarchical category management, text analysis, GISystem user-interface improvement, and visualization. For example, Maderlechner and Mayer (1994) used semantic networks and frames for automated acquisition of geographic information from scanned maps. Chen et al. (1997) defined a geographic knowledge representation system that employed semantic networks to allow text analysis in the management of digital libraries. Tönjes and Grown (1998) proposed a method to use semantic networks for road extraction from multi-sensor imagery. Baatz and Schäpe (1999) applied semantic networks to manage the class hierarchy in their classification of multi-scale remote sensing images. Xu (2003) implemented a semantic network representation of spatial relations to improve the user interface of the ArcView GISystem. Marangoz et al. (2004) incorporated object-oriented image analysis and semantic networks for managing hierarchical classes in extracting roads and buildings from remote sensing images.

Jayakumar and Barua (2004) integrated concept maps and spatial maps in GISystems in the context of urban municipality reforms to provide support for spatial decisions. Dai and Gahegan (2004) applied a concept-map-based approach and visualization technologies for category development.

Because semantic networks and concept maps are flexible, users can define any relationships between concept nodes. Such techniques provide users with an intuitive way to describe their understanding of what components are involved and how they are related in a geographic process.

2.2.2.2 Frames

A frame, originated by Minsky (1975) as a knowledge-representation technique for computer applications, is generally described as a formalized structure for representing a stereotyped situation. Minsky describes his theory in cognitive terms as:

When one encounters a new situation (or makes a substantial change in one’s view of the present problem), one selects from memory a structure called a Frame (Minsky, 1975).

This is a remembered structure that is adapted to fit reality by changing details as necessary. A frame structure can be viewed as a network of nodes and relations (Figure 2.3). Each frame node can be made up of many slots. The type of information can vary considerably from slot to slot within the same frame node: scalar values, a range of values, relationships (part-of, etc.), or behavioral rules. Each slot may also contain multiple entries, called facets.

node 1: chair
  superclass: furniture
  owner: geography department
  part: (back, seat, legs)
  include: (wooden chair, metal chair, ...)

node 2: wooden chair
  superclass: chair
  ID: #1
  color: yellow
  made_of: wood
  user: student

node 3: student
  superclass: person
  name: Tom
  department: geography
  university: Penn State

Figure 2.3: A frame-based representation of a chair (see Figure 2.2)

The key characteristic of frames is that individual nodes within this structure provide a great deal of flexibility in the types of information that can be contained within it. The slots in a frame can contain information such as name, relationships (i.e., linkages) with other frames, attributes and their default values (or potential value ranges), and behavioral rules of the object or situation represented.

According to Luger (2002), the frame structure extends the semantic network representation in two important ways. First, it can contain behavioral rules within a slot, and it supports more flexible inheritance in classes and slot values. Second, because complex structures are allowed within frames, this type of knowledge structure is more amenable to representing complex objects than semantic networks. Minsky's frame structure has been implemented within many expert systems, including JESS (Java-based Expert System Shell), as an alternative to rule-based systems.
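Slots, slot inheritance, and behavioral rules stored in slots can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the JESS implementation or any system cited here: the `Frame` class, its `get` method, and the `label` rule are hypothetical names, while the slot values are taken from Figure 2.3.

```python
class Frame:
    """A frame node: named slots plus an optional superclass frame (hypothetical)."""

    def __init__(self, name, superclass=None, slots=None):
        self.name = name
        self.superclass = superclass
        self.slots = dict(slots or {})

    def get(self, slot):
        """Return a slot value, inheriting from superclass frames when it is missing."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                value = frame.slots[slot]
                # A callable slot acts as a behavioral rule evaluated on this frame.
                return value(self) if callable(value) else value
            frame = frame.superclass
        raise KeyError(f"{self.name} has no slot {slot!r}")


furniture = Frame("furniture")
chair = Frame("chair", furniture, {
    "owner": "geography department",
    "part": ("back", "seat", "legs"),
    # Hypothetical behavioral rule, not part of Figure 2.3.
    "label": lambda f: f"{f.get('color')} {f.name}",
})
wooden_chair = Frame("wooden chair", chair, {
    "ID": "#1",
    "color": "yellow",
    "made_of": "wood",
})
```

Here `wooden_chair.get("part")` is resolved by inheritance from the chair frame, and `wooden_chair.get("label")` evaluates a rule defined on the superclass against the more specific frame's own slot values.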

Because of the above advantages, frames have been applied within GIScience for storing category information, facilitating data analysis, and structuring spatial relations. For instance, Peuquet (1984, 1987) proposed a frame-based data structure for a knowledge-based GISystem. Smith, Peuquet, et al. (1987) presented a frame-based prototype called KBGIS-II, which integrated a layer of semantic representation with a GISystem. Mori and Cosoli (1991) applied frames to describe the perimeter and area of a region, its classification, and its location in analyzing remote sensing imagery. Maderlechner and Mayer (1994) utilized frames for acquiring geographic information from scanned maps. Raubal and Worboys (1999) used a frame structure in implementing a knowledge representation for assisting way finding. Mennis, Peuquet, and Qian (2000) used a frame structure to represent the hierarchical categories of space-time knowledge. Finally, Sha and Hu (2004) applied a frame structure to store the relations among the spatial entities in their Structured Spatial Knowledge Management (SSKM) system.

2.2.2.3 Conceptual graphs

In addition to semantic networks and frame structures, another well-known graph-based knowledge representation is the conceptual graph (see Sowa 1984). The nodes of the conceptual graph are either concepts or relationships. These are conventionally represented graphically as rectangles and circles respectively (Figure 2.4). The links between these elements are not labeled. This approach has been integrated with to allow reasoning (Sowa, 2000). The major intent of this strategy, however, is for representing human language. It has not yet been applied as widely in GIScience as concept maps, semantic nets or frames.

Figure 2.4: A conceptual graph representing ‘a cat is on a mat’: [Cat] → (on) → [mat]. Source: http://www.jfsowa.com/cg/cgexampw.htm

In recent years, however, conceptual graphs have been employed in GIScience for improving interoperability between databases and GISystem interfaces at the conceptual/design level. For example, Yuan (1997) attempted to use a conceptual graph to facilitate information communications among data models in order to enhance geographic information interoperability. Sharma et al. (2003) applied conceptual graphs to improve their speech-gesture driven GISystem interface and grammatical analysis for crisis management applications. Roddick et al. (2003) utilized conceptual graphs to assist the determination of the semantic similarity of the attribute values in geospatial databases. Karalopoulos et al. (2004) used conceptual graphs to achieve semantic interoperability focused on geographic categories.

The above graph-based knowledge-representation techniques in general can be used as a graphical device to describe relations in a very flexible way. These techniques usually can also be integrated with databases and other knowledge-representation technologies, such as expert systems, to allow automated reasoning, and thus can be potentially used to represent human understanding of the dynamic relations among geographic components in geographic processes. Nevertheless, the above explicit knowledge-representation strategies are not the only way to represent how humans understand the world. The following subsection introduces a couple of implicit knowledge representation approaches, which can be potentially applied for automated learning and distributed representation.

2.2.3 Implicit knowledge representation strategies

Different from the above knowledge representation techniques, neural networks and the subsumption architecture (Brooks 1991b) do not use explicit knowledge representation for problem solving. Instead of working at the cognitive level as those described above, these are intended to simulate the physical operation of the human brain. They emphasize dynamic interactions with very low-level elements and connections.

Figure 2.5: A neuron of the neural network (inputs x1, x2, x3, … xn with weights w1, w2, w3, … wn; output y0)

Neural networks, also known as ‘connectionist architecture’ (Luger 2002), were originally envisioned as a simulation of the neural architecture of the human brain for information processing. A neural network is composed of multiple neurons. Each neuron has multiple numerical inputs x1, x2, … xn (Figure 2.5). The weights w1, w2, … wn describe the varying connection strength with adjoining neurons. The output value y0 indicates an activation level of a given neuron, which is determined by the inputs and weights. With multiple interconnected neurons, the calculation of connection strength for each neuron takes place in parallel, and weights and output values are adjusted dynamically (Bechtel, 1987; Bechtel, 1988; Fodor and Pylyshyn, 1988). Using this strategy, it is possible to derive meaningful information or associations from complicated or imprecise data. Luger (2002) argued that neural networks are particularly useful for machine learning (such as classification tasks), pattern recognition, noise filtering, and prediction.
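How a single neuron turns inputs and weights into an activation level can be shown with a few lines of code. The text above does not specify the activation function, so the sketch below assumes a weighted sum passed through a logistic (sigmoid) function, which is one common choice; the function name `neuron_output` and the optional `bias` term are likewise assumptions made for illustration.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Activation y0 of one neuron for inputs x1..xn and weights w1..wn.

    Assumption: the activation is the logistic function of the weighted sum.
    """
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))


# Two inputs whose weighted contributions cancel out (0.5 - 0.5 = 0).
y0 = neuron_output([1.0, 2.0], [0.5, -0.25])
```

Learning in a network of such neurons amounts to adjusting the weights so that the outputs move toward desired values; that step is not shown here.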

Neural networks have been broadly applied in GIScience, mostly for improving classification, identifying patterns from data, or spatial simulations. For instance, Easson and Barr (1996) integrated neural networks with the ArcInfo GISystem to interpret natural resource information. German (1999) employed the neural network approach to improve the search strategies in their classifiers for GISystem data. Graham and Goswami (2001) used a neural network to identify trends or patterns in the urban environment using historical census data for Baltimore, Maryland. Rigol et al. (2001) described a methodology using neural network technology for the spatial interpolation of daily minimum air temperature. Li and Yeh (2002) presented a method to simulate the evolution of multiple land uses based on integrating neural networks and cellular automata in GISystems. Tatem et al. (2003) developed a neural-network-based method to increase the spatial resolution from existing agricultural land cover maps.

Figure 2.6: (a) A single layer of the subsumption architecture (inputs, outputs, reset, suppressor, inhibitor), (b) A subsumption architecture (Levels 0 to 3 connecting sensors to actuators) (Brooks 1986, 1991)

Subsumption architecture (Figure 2.6), introduced by Brooks (1986, 1998, 2001), is built on multiple simple layers using parallel and distributed connections. This approach was originally intended to provide mobile robots with the functionality displayed by lower-level life forms, such as ants and flies. An ant, for example, may have simple navigation techniques that can be described as a series of sensors and actuators.

The central idea of Brooks' argument is based on his approach to replicating human-level intelligence in a machine. He believes that human intelligence is too complex to be implemented as a whole at present. Instead, we should start by building robots that model simple actions, because highly complex intelligent behavior emerges from the interactions of various simpler behaviors in the context of particular tasks and the local environment. Thus it should be possible, according to this view, to gradually achieve increasingly complex intelligent behaviors based on the lessons learned from experiments.

Subsumption architecture is similar to electronic circuitry diagrams. Each layer has inputs, outputs, and reset lines, as well as suppressor and inhibitor nodes (see Figure 2.6 (a)) that are software controlled. The inputs and outputs can receive or send messages, and a reset serves to initialize a given layer to a predefined starting condition. A suppressor node on the input line allows the replacement of the normal flow of data into a layer by data from another layer. On the output line, the inhibitor node allows the layer to inhibit the output of another layer at a specific time. The connection of sensors (inputs) and actuators (outputs) in multiple layers forms a subsumption architecture, in which all layers operate at the same time in parallel (see Figure 2.6 (b)).
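The layered control idea can be sketched in a highly simplified form. The code below is not Brooks' implementation: it models only output inhibition (an active higher layer overrides the output of lower layers), omits suppressor and reset lines, and evaluates the layers sequentially rather than in true parallel. The behaviors `cruise` and `avoid`, the sensor key `obstacle_distance`, and the function `run_layers` are hypothetical names chosen for illustration.

```python
def cruise(sensors):
    """Level 0 (hypothetical): always propose moving forward."""
    return "forward"


def avoid(sensors):
    """Level 1 (hypothetical): propose a turn only when an obstacle is close."""
    return "turn_left" if sensors["obstacle_distance"] < 1.0 else None


def run_layers(layers, sensors):
    """Return the actuator command for layers ordered from level 0 upward.

    Every layer reads the same sensor values. A layer returning None is
    inactive; an active higher layer inhibits the output of the layers below.
    """
    outputs = [layer(sensors) for layer in layers]
    command = None
    for output in outputs:  # iterate from the lowest to the highest level
        if output is not None:
            command = output
    return command


clear_path = run_layers([cruise, avoid], {"obstacle_distance": 5.0})
blocked_path = run_layers([cruise, avoid], {"obstacle_distance": 0.5})
```

Note that no layer stores a model of the world: each command is computed directly from the current sensor values, which is the sense in which the approach lacks an explicit knowledge representation.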

Robots have indeed been built using subsumption architecture. They have goals, deal with multiple objectives within these overall goals, and can pass messages to each other to resolve possible conflicts. Based on combinations of simple actions, they can perform relatively complex tasks. For example, they can avoid both static and dynamic obstacles while navigating through space and can explore distant places using sonar.

Subsumption architecture has been criticized because it (1) does not have an explicit knowledge representation; (2) does not have the capability of reasoning due to the absence of explicit knowledge representation, and (3) is situated in a purely local environment. It is difficult to adopt or develop learning mechanisms, and although theoretically possible, it is not easy to build very complex systems using this approach because of these characteristics (Jennings, et al., 1998; Luger, 2002).

The subsumption architecture was an important development in that while it does not represent knowledge in any explicit way, it introduced a mechanism for problem solving in a distributed and cooperative fashion. As such, it became the basis of independent intelligent agents, as described below.

2.3 Distributed knowledge representation: intelligent agents

Intelligent agents have recently become a popular research topic in computer science as a method for distributed knowledge representation and problem solving. While there is a continuing debate on exactly what an intelligent agent is, the fundamental concept of the agent-based approach is understood as consisting of multiple agents, each possessing relatively constrained knowledge and acting in a coordinated way. Through the use of multiple intelligent agents, an overall system of distributed knowledge can be built that has substantially greater functionality (in problem-solving power) than the sum of the individual agents by acting collectively (d'Inverno and Luck, 1997).

The origin of the concept of ‘agent’ is rooted in the fundamental understanding of the term ‘intelligence’ in the field of AI. The heart of AI is seen broadly as knowledge representation and reasoning, which can be traced back to philosophy and logic (Newell and Simon 1976, Luger 2002). Due to the appearance of distributed computing¹, DAI (Distributed AI) has itself become an important research direction since the late 1970s, and has challenged the traditional method within the field of AI of centralized problem solving (Nwana, 1996). The concept of agents can be traced back to the 1970s, when Hewitt (1977) proposed the concept of an ‘actor’ as a self-contained, interactive, and concurrently executing object.

1 Distributed computing allows a big computation task to be split into smaller chunks and performed by many independent computers in order to take advantage of the idle time in these computers.

2.3.1 Definitions of intelligent agents

As stated above, there is no commonly agreed definition of what an agent is. According to Shoham (1993), an intelligent agent is different from other knowledge representation strategies within AI because it has internal or mental states such as knowledge, belief, intention, and obligation (Shoham, 1993; Wooldridge and Jennings, 1995; Bradshaw, 1997). Franklin and Graesser consider an agent to be a system situated within a part of an environment that senses and acts on that environment over time in pursuit of its own agenda (Franklin and Graesser, 1997). Jennings et al. argue that generally an agent is characterized by situatedness, autonomy, and flexibility (Jennings, Sycara and Wooldridge, 1998; Luger, 2002). Situatedness means that the agent receives sensory inputs from its environment so that it can perform actions, and change the environment in some way. Autonomy refers to the fact that the agent can act without the direct intervention of humans (or other agents), and has control over its own actions and internal states. Flexibility indicates that an agent can perform reflex, goal-driven, cooperative, and social actions. Besides the above three characteristics, Flores-Mendez (1999) asserts that agent attributes should also include knowledge-level communication ability, mobility, reactivity, and temporal continuity.

Because the multiple criteria in characterizing the concept of agents make it difficult to reach a commonly accepted definition, Luck and d’Inverno (2001) attempted to avoid rigid definitions of agents. Instead, they tried to develop a conceptual framework for agents by simply specifying the minimum requirements for an entity to be classified as an agent. They proposed an environment with a four-tiered containment hierarchy, consisting of entities, objects, agents, and autonomous agents, as shown in Figure 2.7. In their framework, an entity is a collection of attributes, and an environment is a collection of entities. An object is an entity with a set of actions. An agent is an object with goals. An autonomous agent is a self-motivated agent that pursues its own agenda rather than being under the control of other agents. They believe that these four classes are the fundamental components of a human cognitive view of the world (Luck and d'Inverno, 2001).

Figure 2.7: The hierarchical conceptual framework for agents (figure labels: environment, objects, agents, autonomous agents) (source: derived from Luck and d’Inverno 2001)
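The containment hierarchy can be expressed as a classification over one record type, where each tier adds a requirement to the tier below it. This is a sketch under stated assumptions rather than Luck and d'Inverno's formal specification: the `Entity` dataclass, the `classify` function, the use of a non-empty `motivations` set to stand for "self-motivated", and the example entities are all introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """An entity is a collection of attributes; the other fields may be empty."""
    attributes: frozenset
    actions: frozenset = frozenset()
    goals: frozenset = frozenset()
    motivations: frozenset = frozenset()  # assumed stand-in for self-motivation


def classify(entity):
    """Return the most specific tier of the hierarchy that the entity satisfies."""
    if not entity.actions:
        return "entity"
    if not entity.goals:
        return "object"
    if not entity.motivations:
        return "agent"
    return "autonomous agent"


# Hypothetical examples; the environment is simply a collection of entities.
environment = [
    Entity(frozenset({"elevation"})),
    Entity(frozenset({"capacity"}), actions=frozenset({"release_water"})),
    Entity(frozenset({"name"}), actions=frozenset({"irrigate"}),
           goals=frozenset({"maintain_yield"})),
    Entity(frozenset({"name"}), actions=frozenset({"plant", "sell"}),
           goals=frozenset({"earn_income"}),
           motivations=frozenset({"household_welfare"})),
]
tiers = [classify(e) for e in environment]
```

Because each tier only adds a requirement, every autonomous agent is also an agent, an object, and an entity, which is what the containment hierarchy expresses.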

2.3.2 Typology of agents

Because of the continued debate on the definition of ‘agents,’ there are various ways of classifying agents in the literature (Nwana 1996). The following distinctions are the most widely used.

2.3.2.1 Reactive vs. cognitive agents

Agents can be generally classified as reactive (reflexive) or cognitive (deliberative) with respect to how they respond to external stimuli (Ferber, 1999; Aylett and Luck, 2000). Reactive agents usually take actions immediately after an external stimulus using a simple sensor-actuator mechanism (e.g., Brooks’ subsumption architecture). They have no internal representation or knowledge base from which to retrieve prior information about the environment. Therefore, environmental stimuli can directly determine an agent’s actions. Researchers generally view reactive agents as having no goals, but Luck and d’Inverno (2001) argue that this is an over-simplification. In their conceptual framework for agent definitions, any agent should be designed with either explicit or implicit goals. For example, in Brooks’ experiment, agents (or robots) are reactive in terms of their other characteristics, but have goals (Brooks, 1991b).

In contrast, cognitive agents respond to external stimuli somewhat indirectly because they have internal states or memory. Each agent contains knowledge for determining its actions. Such knowledge may include goals, intentions, beliefs, commitments, roles, tasks, plans, behavioral rules, memories of past experiences, known or remembered states of the environment, and states of other agents (Ferber, 1999). Therefore, when an agent receives a stimulus from its external environment, its decision making is affected both by its internal knowledge and by environmental conditions (Faratin, Sierra et al., 1999; Luck, 1999; Sierra, Faratin et al., 1999). Each cognitive agent can operate in a relatively independent way due to the knowledge it contains and its reasoning capability. The actions of cognitive agents can be driven by either external stimulation or internal motivations.
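The contrast between the two agent types can be made concrete with a toy example. The sketch below is purely illustrative: the classes, the `flood_warning` stimulus, the `stay_safe` goal, and the action names are hypothetical and do not come from any of the cited systems. The reactive agent maps a stimulus directly to an action, while the cognitive agent also consults its goal and its memory of past stimuli.

```python
class ReactiveAgent:
    """Maps each stimulus directly to an action; keeps no internal state."""

    def __init__(self, rules):
        self.rules = rules  # stimulus -> action

    def act(self, stimulus):
        return self.rules.get(stimulus, "idle")


class CognitiveAgent:
    """Chooses an action from the stimulus, a goal, and remembered stimuli."""

    def __init__(self, goal):
        self.goal = goal
        self.memory = []  # past stimuli, oldest first

    def act(self, stimulus):
        seen_before = stimulus in self.memory
        self.memory.append(stimulus)
        if stimulus == "flood_warning" and self.goal == "stay_safe":
            # The same stimulus leads to a different action once it is remembered.
            return "evacuate" if seen_before else "monitor"
        return "idle"


reactive = ReactiveAgent({"flood_warning": "move_uphill"})
cognitive = CognitiveAgent(goal="stay_safe")
reactive_actions = [reactive.act("flood_warning") for _ in range(2)]
cognitive_actions = [cognitive.act("flood_warning") for _ in range(2)]
```

The reactive agent repeats the same response to the repeated stimulus, whereas the cognitive agent's second response differs because its internal state has changed.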

2.3.2.2 Artificial-life agents vs. software agents

In addition to categorizing agents as being either reactive or cognitive, agents can also be categorized as being either artificial-life or software agents. For instance, Franklin and Graesser (1997) put the notion of agents into a larger context, and classified them hierarchically, borrowing the biological classification methodology. In their view, agents can be classified into three high-level categories: biological agents (e.g., humans and animals), robotic agents, and computational agents. Computational agents can be further classified into artificial-life agents and software agents.

Artificial-life agents are based on the bottom-up view of emergence theory (Langton, 1988; Adami, 1998; Dean, 2003). The term emergence can be understood as something that is more than the sum of its parts. For instance, a person consists of innumerable cells, but a person is much more than the sum of these single cells. Johnson (2001) gives a formal definition: ‘(Emergence) is what happens when an interconnected system of relatively simple elements self-organizes to form more intelligent, more adaptive higher-level behavior’ (also see Epstein and Axtell, 1996). The general purpose of using this type of agent is to discover simple rules that govern complex systems.

According to Franklin and Graesser (1997), artificial life agents live only in artificial environments on a computer screen (such as being considered as single grid cells on a screen) or in computer memory. A single artificial life agent usually incorporates no internal concepts or independent knowledge. It does not send or receive messages to or from other agents, and cannot do much without interacting with others. Thus these are always reactive agents.

According to Genesereth and Ketchpel (1994), an entity in an application program can be called a ‘software agent’ if and only if it communicates with other agents using some form of communication language. A software agent can be either a reactive agent or a cognitive agent (Nwana, 1996), but cognitive agents are always software agents. In general, software-agent applications integrate communication-based cooperation with AI's tradition in knowledge representation and reasoning.

2.3.2.3 Other classifications

Nwana (1996) tried to classify agents using multiple standards of functionality such as mobility, deliberative thinking, attributes, and roles. He then concluded that there are seven types of agents: collaborative agents, interface agents, mobile agents, information/internet agents, reactive agents, hybrid agents, and smart agents.

Collaborative agents are distinguished in their capability of cooperating with each other for problem solving. Interface agents are used to facilitate the interface management of a multi-agent system. Mobile agents are characterized by the ability to move around in their environment. Information/internet agents can be utilized to help the management of the vast amount of information on the Internet. Reactive agents, as previously defined, have no internal knowledge representation and no reasoning capability. Smart agents have capabilities to learn from their experiences in interacting with the external environment. And hybrid agents possess two or more of the above attributes.

Doran et al. (1997) defined two categories of agents according to their capability of cooperation: independent agents and cooperative agents. Independent agents include discrete and emergent agents, and cooperative agents include communicative and non-communicative agents. Agents in a multi-agent system are discrete if they are independent from each other without communication. From a user's viewpoint, emergent agents appear to be working together. But from the viewpoint of software functionality, they are not, because they are simply carrying out their own individual behavior (analogous to artificial life agents). Communicative agents are distinguished in that they can communicate with each other by sending and receiving signals (i.e., communication).

2.3.3 Agent applications in GIScience

Agent-based approaches have been applied in GIScience for many different purposes. In this dissertation, these applications are classified into two categories: agent-based modeling (ABM), and distributed problem solvers. Each group of applications is described below.

2.3.3.1 Agent-based modeling (ABM)

The most popular agent-based application in GIScience has been agent-based modeling (ABM), especially in modeling land use and land cover change (LUCC). For example, Evans and Kelley (2004) used this technique to simulate forest re-growth with consideration of varying scales. These agents represent individual households in the research area. Evans and Kelley used a complex mathematical method for representing their agents' behavioral rules akin to classical mathematical modeling, in which geographic parameters are weighted quantitatively. For instance, the impact of terrain slopes, household preference in land use (e.g., agriculture, farming, and pasture), available household labor hours, and spatial neighborhood relationships were given numerical values to simulate their impacts on the LUCC. They also explored the sensitivity of the model to scale effects. A similar approach has also been applied to other LUCC-related ABM applications (Parker, Berger, Manson and McConnell, 2001; Parker, Manson, Janssen, Hoffmann and Deadman, 2003; Manson 2000; and Chong 2004).

In an early example of ABM, even before the term was first used, the Schelling Model (Schelling, 1969; Schelling, 1971) used two types of cellular automata (black and white) to simulate segregated societies. With the popularization of agent-based approaches, more geographic applications have been seen in the recent literature. Epstein and Axtell (1996) proposed ‘Sugarscape,’ in which artificial life agents, or cellular automata, are distributed in a raster data environment and move randomly in a pixel-space to collect sugar in order to increase their ‘health.’ The authors believe that using this bottom-up strategy with simple rules can simulate the dynamics of a complex society, as previously described in Section 2.3.2.2. Sanders et al. (1997) introduced a multi-agent approach for simulating urbanism. Batty and Jiang (1999) summarized several agent-based applications on simulating shortest-path finding, visual field, and watershed dynamics. Schelhorn et al. (1999) provided an overview of STREETS, an agent-based pedestrian model that simulates the movement of the pedestrian population inside urban areas. Westervelt and Hopkins (1999) described a model to keep track of the movements of individual species (i.e., animal predators and prey) within a small wildlife population over space and time using the GISystem software GRASS. More recent efforts have been focused on integrating ABM approaches with GISystem data layers to provide geographic agent-modeling software toolkits (e.g., Repast, http://repast.sourceforge.net/index.html) for simulating social and ecological phenomena (also see Gimblett, 2002).

ABM applications have generally been based on emergence theory (Parker, Manson, et al., 2003; Epstein and Axtell 1996). Agents as employed in these applications are relatively homogeneous artificial-life agents. The movements (or actions) of each agent are usually determined by the status of its neighboring agents or local environmental conditions using mathematical models or simple rules. There is no direct inter-agent communication or cooperation.
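A neighborhood-driven rule of this kind can be sketched in the spirit of the Schelling Model. The code below is an illustrative simplification, not Schelling's original procedure or any cited model: the grid encoding, the `threshold` parameter with its default of 0.5, the random relocation to any empty cell, and the function names are assumptions made here. Each agent looks only at its immediate neighbors and moves if too few of them share its type; agents never exchange messages.

```python
import random

EMPTY = None  # marker for an unoccupied cell


def neighbors(grid, r, c):
    """Yield the contents of the up-to-eight cells surrounding (r, c)."""
    rows, cols = len(grid), len(grid[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield grid[r + dr][c + dc]


def is_satisfied(grid, r, c, threshold=0.5):
    """An agent is satisfied if enough occupied neighbors share its type."""
    occupied = [n for n in neighbors(grid, r, c) if n is not EMPTY]
    if not occupied:
        return True
    same = sum(1 for n in occupied if n == grid[r][c])
    return same / len(occupied) >= threshold


def step(grid, rng, threshold=0.5):
    """Move each currently unsatisfied agent to a random empty cell.

    Returns the number of agents that were unsatisfied at the start of the step.
    """
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    unhappy = [(r, c) for r, c in cells
               if grid[r][c] is not EMPTY and not is_satisfied(grid, r, c, threshold)]
    for r, c in unhappy:
        empties = [(i, j) for i, j in cells if grid[i][j] is EMPTY]
        if not empties:
            break
        i, j = rng.choice(empties)
        grid[i][j], grid[r][c] = grid[r][c], EMPTY
    return len(unhappy)


grid = [["B", "B", EMPTY],
        ["B", "W", EMPTY]]
moved = step(grid, random.Random(0))
```

Repeating `step` until it returns zero is the usual way such a model is run; any large-scale pattern that appears is emergent, since no rule refers to anything beyond an agent's own neighborhood.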

2.3.3.2 Distributed problem solvers

In addition to ABM applications, agents can be viewed as distributed problem solvers, or tools, to provide ‘smarter’ solutions for users in a GISystem environment. These applications are generally inherited from the concept of software agents. For example, as noted in Chapter 1, Rodrigues and Raper (1999) envisioned the concept of ‘spatial agents’ that can be potentially applied to perform spatial data mining, improve GISystem interfaces, facilitate spatial analysis, connect different GISystems, and share data and information over networks. Gimblett et al. (1997) applied agents in their Intelligent Decision Support and Simulation System (IDSS), where agents are used to assist natural resource managers in assessing and managing dynamic recreation behaviors and social interactions, and in resolving conflicts in wilderness settings. Raubal (2001) used agents to simulate behaviors in a cognitively plausible way for finding paths through an airport (also see Frank, Bittner et al., 2001). Sengupta (2003) presented a framework, called ‘Distributed Intelligent Geographical Modeling Environment’ (DIGME), for spatial decision support that utilizes Web-accessible repositories of spatial data and models. Nute et al. (2004) developed an agent-based decision support system for forest ecosystem management called ‘NED-2,’ in which agents were used to manage the interface, GISystem data files, and simulation models.

In contrast to ABM applications, agents as distributed problem solvers can have sophisticated internal knowledge representations and reasoning capabilities, and can communicate with other agents. Although the application contexts can be very different, agents are generally viewed as tools for improving user interfaces, facilitating data sharing, developing better algorithms, or building a decision-making environment. They are not utilized as a means of more generalized process representation in a GISystem context.

2.4 Summary

Due to the level of complexity involved in the real world, there is no single technology sufficient for representing geographic processes as previously described in Chapter 1. As discussed in this chapter, each existing approach for representing derived, higher-level knowledge has its specific strengths and uses. Categorization is a fundamental strategy for reducing unneeded detail. Fuzzy set theory is useful in handling uncertainty in order to derive more meaningful information from quantitative approaches. Ontology aims to overcome semantic differences in order to achieve better data sharing and interoperability. These approaches currently have been mostly applied to improve data representation, e.g., for enhancing database organization, for guiding better data collection, for uncovering new information from data, or for finding commonalities among user views on data model design within GIScience.

As a graph-based knowledge representation strategy, concept maps allow flexible representation of geographic relationships, can be extended to include time in knowledge representation, and can provide vivid knowledge visualization to users. Nevertheless, it is difficult to use concept maps to perform automated reasoning to solve very complex problems. Rule-based expert systems often include logic that allows such systems to use stored knowledge for analysis and problem solving, but they are largely constrained to narrowly defined problem domains.

Although there are many instances of applying multiple agents within the GIScience literature, they are either used from the emergence-oriented perspective to explore relatively simple rules (e.g. ABM) for simulating the dynamics of geographic phenomena, or applied from the software-engineering perspective to improve the functionality of GISystems. Thus these applications have limited capabilities to represent highly complex geographic processes that involve both physical and social components. The distributed and cooperative approach for knowledge representation in multi-agent systems, however, offers great potential and an opportunity to develop an agent form that also incorporates other representation strategies. Such an agent form is called ‘GeoAgents’ in this dissertation to represent the dynamic social and natural interactions in geographic processes.

In the current research, an integrated solution is provided to combine multiple representation techniques. First, the concept of GeoAgents is defined in Chapter 3 to significantly extend the notion of agents by integrating knowledge-representation techniques with geospatial databases, with the aim of representing the knowledge and environmental conditions of different levels of social groups, organizations, or individuals. An integrated representational framework is then proposed in Chapter 4 to incorporate concept mapping technology, GeoAgents, and space-time representations to achieve knowledge-oriented representation of geographic processes.


Chapter 3

A DESCRIPTION OF GEOAGENTS AND THEIR IMPLEMENTATION IN AN INTEGRATED REPRESENTATION SCHEME

“If spatial representation is to remain central to our theorizing about the external world, it follows that the challenge for those who wish to create and use spatial representations is to employ existing GIS critically and to look for new ways to enlarge their scope and expressiveness. In this way, spatial representation will continue to open up new ways of exploring structure, relationships, and causality in the world.”

J. F. Raper (1999, p. 61)

3.1 Introduction

As discussed in Chapter 1, representing geographic processes involves both form and mechanism. In a GISystem context, this requires the capability to store not only what/where data, but also how/why level knowledge in order to address qualitative relationships, inexactness, context-dependencies and interdependencies (see section 1.3.2). GISystems should also have the capability of using the stored process representation in an automated fashion so as to enhance our insight into geographic processes on a theoretical level.

Chapter 2 detailed the existing methods for handling these requirements, but also showed that no single approach currently handles all of them. The concept of GeoAgents is described here as an integrative approach.

3.2 Definition of GeoAgents

Although exact definitions differ, the general idea of agents, as described in the previous chapter, is that they are computer representations of separable and discernible elements in a process that, taken together, can be used to describe as well as potentially simulate and/or control that process. As such, the agent approach can be considered a distributed-knowledge-representation technique. Unlike other distributed knowledge techniques, however, agents actively interact with each other as separate actors within a given process. Given this key characteristic, an agent-based representation is inherently dynamic. Cognitive agents also have awareness of their environment and may act according to their own goals, which may or may not coincide with the goals of other agents.

3.2.1 The basic concept of GeoAgents

GeoAgents (geographic agents) go beyond the standard notion of agents in that they are spatial, dynamic, and scale-dependent agents within an explicitly geographic context. Because of the complex and highly interrelated nature of elements within geographic processes, a GeoAgent-based representation requires a more diverse set of elements than other agent-based applications. A GeoAgent-based representation requires a multi-level integration of reactive and cognitive agents as described in the computer science literature, as well as a concept map to provide a tangible, graphic link between the process represented and the human user.

Some cognitive GeoAgents can perform goal-driven actions, including learning, reasoning, moving within geographic space-time, communicating with other GeoAgents, and retrieving information from geospatial databases. Such an individual GeoAgent would correspond to an independent intelligent agent, and could represent a person, company, government entity, or some other element of the human milieu. An independent agent, without goals or beliefs but still with behavioral rules, would correspond to some element of the physical environment (stream, hill, etc.). Agents of the appropriate type can also serve as generic representations of any of these (cities, streams, etc.). Generally speaking, GeoAgents are an approach to the distributed representation of geographic processes. As such, they can be employed for simulation and learning, as well as collaborative problem solving.

Characteristics of individual GeoAgents can differ depending on their role in the process represented, but spatiality, dynamism, and scale-dependency are the basic characteristics of all GeoAgents. Spatiality means that GeoAgents have a location in geographic space. Moreover, they have the capability of being ‘aware’ of their natural and social environment and can respond to the environmental changes in that space. Therefore linking the behavioral rules of individual GeoAgents with corresponding geospatial databases is essential.

The behavioral rules of GeoAgents are also dynamic. A GeoAgent often has a lifespan, which can include both birth and death. Its action rules are subject to change over time as the environment and the states of other GeoAgents change. The behavioral rules in force at any given time parallel the behavior, at that time, of the real-world entities or process components they represent. For instance, if a GeoAgent is used to represent a community water system in year 2001, its actions should be based on the technology that was feasible in 2001, as well as the social policies and laws in force at that time. Its physical environment (e.g., land cover, geology, water sources, weather) should also be based on the state of the environment in 2001. To avoid temporal incongruities, proper time representation among multiple GeoAgents, geospatial databases, and behavioral rules is thus crucial.

GeoAgents have multi-directional interactions (e.g. one-to-one, one-to-many, and many-to-many) with regard to scale. In other words, they can interrelate with other GeoAgents at the same geographic scale, or with GeoAgents at larger or smaller scales. Relations among GeoAgents can also be peer-to-peer or hierarchical with regard to geographic entity hierarchies. In representing human-environment interactions, for example, scale can pertain to different levels of social institutions or groups.

3.2.2 Interactions between GeoAgents and geospatial databases

As described in section 3.2 above, multi-level reactive and cognitive GeoAgents are aware of their physical environment and react to it, both individually and collectively. Some cognitive GeoAgents can be used to represent human behaviors and store social laws, regulations, plans, and goals. Reactive GeoAgents can be used to represent the behaviors of physical elements (e.g., diffusion of pollutants). Technically, the lowest-level GeoAgents are ‘bots’ (i.e., robots), which have very simple and clearly defined behaviors. These behaviors may consist of only a single task. In this case, tasks are usually expressed algorithmically in deterministic terms. An important role of these bots is to fetch data from databases or perform numerical calculations on the data at the request of, and for, higher-level GeoAgents. Calculations could include something as simple as averaging, or something more complex such as examining multiple drought indices to determine drought severity according to a prescribed formula. Figure 3.1 shows the interaction among multiple GeoAgents and their physical environment as stored in the geospatial database. In the figure, ‘b’ refers to low-level bots that receive requests from higher-level GeoAgents.

[Figure 3.1 diagram; labels: natural behaviors (reactive agents); human behaviors (cognitive agents); low-level bots (b); geospatial databases (observed facts)]

Figure 3.1: The interactions among GeoAgents and geospatial databases

3.2.3 Communication among GeoAgents

Inter-GeoAgent communication is crucial for representing dynamic and interconnected geographic processes. Unlike cellular automata, GeoAgents can interact beyond their immediate spatial neighbors by sending messages. In AI, several types of agent-communication languages are being developed. The most popular ones include KQML (Knowledge Query and Manipulation Language, see Finin, Fritzson et al., 1994) and FIPA ACL (Foundation for Intelligent Physical Agents, Agent Communication Language, http://www.fipa.org). These two languages are very similar. Both are message-oriented, have similar syntax, have an ontology to provide vocabulary, and have the capability for an agent to express its intentions. Each message generally includes the sender, the receiver, the action (e.g., A requests B to perform action X), and the content (or message) to be interpreted by the receiver. According to Hon (2004), one major difference between these two languages is that ACL can describe the effects on the internal status of the sender and receiver, but KQML cannot. For GeoAgents, users can choose either of these two languages according to their individual preferences. In this dissertation, ACL is applied in the case study (see Chapter 6).
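The four message parts named above (sender, receiver, action, and content) can be illustrated with a minimal sketch. The class name `AgentMessage`, its field names, and the agent names are hypothetical, and the rendered string only loosely imitates the parenthesized textual style of these languages; it is not a conforming KQML or FIPA ACL encoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentMessage:
    """Hypothetical container for the four message parts described in the text."""
    sender: str
    receiver: str
    action: str    # the communicative act, e.g. "request" or "inform"
    content: str   # payload to be interpreted by the receiver

    def render(self) -> str:
        # Illustrative textual form only; real ACL/KQML messages carry more fields.
        return (f"({self.action} :sender {self.sender} "
                f":receiver {self.receiver} :content \"{self.content}\")")


# Example: a state agency informs a community water system of a drought emergency.
msg = AgentMessage("DEP_PA", "CollegeTownship_CWS", "inform", "drought_emergency")
rendered = msg.render()
print(rendered)
```

Making the message immutable (`frozen=True`) is a design choice of this sketch: once sent, a message should not be altered by either party.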

3.2.4 Concept maps in representing world state dynamics

To convey understanding of a geographic process as represented collectively by the GeoAgents, it is important to represent relationships among GeoAgents and their environment in a user-friendly way. As noted in Chapter 2, because they lend themselves to visualization in the form of a graph, concept maps provide a means for describing relationships.

As described above, GeoAgents provide a means of representing specific actors or entities within an environment that may have goals and behavioral rules. This is, then, a distributed representation of how and why knowledge, but it does not represent the state of a process at any given time for a specific circumstance (including which elements are actively influencing other elements and in what ways). Concept maps can serve this function, showing active elements visually. These elements may or may not be GeoAgents. In other words, not all social and physical elements have behaviors and goals that necessitate their definition as agents. Some elements are simply entities within the environment that are influenced by agents. Concept maps provide a means of representing ‘process in motion,’ including which agents and other entities may be relevant to a given state as the process evolves. This also means that the concept map serving this function needs to be dynamic. In essence, a concept map can be changed via the rule firings and other actions of GeoAgents. This is a significant extension of how concept maps have been used in the past.

In this role, concept maps would be tightly interlinked with a GeoAgent structure. Concept nodes here can be GeoAgents, their surrounding elements, or events. The edges in the concept map can represent causal, taxonomic, or temporal relations between GeoAgents and their environmental elements. From the concept maps, therefore, users can see a visual portrayal of the process, and a concept map at any given moment portrays the process state at that moment. The GeoAgents can sense this state by checking the concept map and then respond to the environmental changes.
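A minimal sketch may help make the idea of a dynamic concept map concrete: nodes carry a current status, edges carry a relation type, and an agent ‘senses’ the process state by checking the nodes it is linked to. All names here (`ConceptMap`, `sense_and_respond`, the node and action labels) are hypothetical illustrations, not the data structures of the prototype.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptMap:
    status: dict = field(default_factory=dict)  # node name -> current state
    edges: list = field(default_factory=list)   # (source, relation, target)

    def relate(self, source, relation, target):
        self.edges.append((source, relation, target))

    def set_status(self, node, value):
        # A rule firing, a model output, or a user edit could call this.
        self.status[node] = value

    def neighbors(self, node):
        return [t for s, _, t in self.edges if s == node]


def sense_and_respond(agent, cmap, rules):
    """Check each node linked to the agent; collect actions whose condition holds."""
    actions = []
    for node in cmap.neighbors(agent):
        action = rules.get((node, cmap.status.get(node)))
        if action:
            actions.append(action)
    return actions


cmap = ConceptMap()
cmap.relate("CWS", "depends_on", "power_supply")
cmap.relate("CWS", "depends_on", "system_pressure")
cmap.set_status("power_supply", "on")
cmap.set_status("system_pressure", "normal")

# Assumed condition-action rule: (node, state) -> action.
rules = {("power_supply", "off"): "start_emergency_generator"}

before = sense_and_respond("CWS", cmap, rules)
cmap.set_status("power_supply", "off")   # the world state changes
after = sense_and_respond("CWS", cmap, rules)
print(before, after)
```

The same map object serves both purposes discussed in the text: it can be drawn for the user, and it can be queried by agents.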

This extended use of concept maps also does not conflict with the use of this knowledge representation approach in its usual application. Concept maps can be used to aid in capturing users’ understanding (e.g. via interviews) of the interrelationships of geographic entities and providing a user-friendly graphical device for portraying that knowledge.

3.3 A brief example: using multi-GeoAgents via concept graphs for representing geographic processes

GeoAgents make it possible to represent complex geographic processes in a flexible manner that allows the integration of qualitative information and quantitative data and models at multiple scales. Figure 3.2 shows an example of a water management-related geographic process using GeoAgents via a concept map to provide a visual portrayal of the interrelationships among these GeoAgents. The symbols with shaded squares designate the GeoAgents for representing social elements, such as the EPA (Environmental Protection Agency), the Pennsylvania DEP (Department of Environmental Protection), local CWSs (community water systems), factories, or individual water users. The shaded circles designate individual GeoAgents for representing the components of physical processes regarding precipitation, groundwater, or stream flow. The label ‘b’ refers to bots that fetch data and perform calculations for higher-level GeoAgents. The relations among these social and natural elements can be represented with concept maps. The more naturalistic symbols (rain, cloud, trees, etc.) represent the stored data.


Figure 3.2: Using multi-GeoAgents for representing the social and natural elements influencing community water systems and the interaction among them (E: EPA; D: DEP; c(i): CWS(i); u(i): water user (i); f1: a factory; p: rules regarding precipitation; g: rules regarding groundwater; s: rules regarding stream flow; b: low-level bots) (Source of the image: EPA, 1999, p.14)

GeoAgents can incorporate laws, regulations, or plans of various social elements (e.g., EPA, DEPs, or CWSs) and the rules governing their interactions. For example, the EPA enforces nation-wide laws and standards for safe drinking water. State-level DEPs inherit these national standards, also enforce state-wide regulations and plans, and monitor water sources of local CWSs. In addition to enforcing regulations, each CWS has its own plans to ensure stable and safe water supply for individual water users via policy. Therefore, high-level social goals (or standards) can be hierarchically inherited and specified in different levels of institutions, and eventually passed to and embodied in individual water users’ daily life.

Social processes are also interrelated with physical processes. For example, drought, pollution, and land-use/land-cover changes can significantly affect the quantity and quality of water supply. Some GeoAgents can be set with goals to analyze the development of drought conditions or to track changes in pollution levels over space and time. For instance, five major indices are used to determine drought severity in Pennsylvania: amount of precipitation, stream flow, groundwater levels, reservoir storage levels, and the PHDI (Palmer Hydrologic Drought Index). The indices can be considered as drought models that interpret (or calculate) the observed data as a meaningful drought signal, such as drought watch, drought warning, or drought emergency. As noted earlier, some low-level bots can be used to fetch data or calculate the status of particular drought indices. Higher-level GeoAgents can be used to store rules for analyzing drought severity according to the results from these bots, and then inform a specific GeoAgent (e.g., DEP) about drought development. Once a drought emergency is identified, for example, the DEP will inform local CWSs; each CWS will then follow its own drought contingency plans and communicate with individual water users and local news media in order to reduce water use during the drought period.
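The division of labor described above (bots compute per-index status, a higher-level GeoAgent combines the results, and the DEP informs local CWSs) can be sketched as follows. Everything numeric here is an assumption made for illustration: the readings are treated as already normalized to a 0-100 score, the cutoffs and the three-index quorum are invented, and none of this reproduces Pennsylvania's actual drought criteria.

```python
from statistics import mean

LEVELS = ["normal", "watch", "warning", "emergency"]


def index_status_bot(scores):
    """Low-level bot: average recent scores and map them to a status (hypothetical cutoffs)."""
    avg = mean(scores)
    if avg >= 75:
        return "normal"
    if avg >= 50:
        return "watch"
    if avg >= 25:
        return "warning"
    return "emergency"


def assess_drought(statuses, quorum=3):
    """Higher-level rule (assumed): the most severe level reached by at least `quorum` indices."""
    ranks = [LEVELS.index(s) for s in statuses.values()]
    for level in range(len(LEVELS) - 1, 0, -1):
        if sum(r >= level for r in ranks) >= quorum:
            return LEVELS[level]
    return "normal"


def dep_notify(severity, cws_names):
    """DEP agent: on an emergency, produce one message per local CWS."""
    if severity != "emergency":
        return []
    return [("DEP", cws, "drought_emergency") for cws in cws_names]


# Invented, pre-normalized readings for the five indices named in the text.
readings = {
    "precipitation": [20, 24],
    "stream_flow": [18, 22],
    "groundwater": [30, 40],
    "reservoir_storage": [60, 64],
    "phdi": [10, 14],
}
statuses = {name: index_status_bot(vals) for name, vals in readings.items()}
severity = assess_drought(statuses)
messages = dep_notify(severity, ["c1", "c2"])
print(statuses, severity, messages)
```

Keeping the bots free of any decision logic beyond their single index mirrors the text's description of bots as single-task, deterministic helpers.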

It is important to note that the above process of human-environment interactions is scale-dependent. For example, the GeoAgents of DEPs should interact with the database that stores the statewide data, and CWSs should interact with local-scale databases. The interactions among these GeoAgents represent a cross-scale integration in geographic processes.

3.4 Facilitating geographic knowledge sharing and decision making

As noted earlier, representation of geographic processes should increase our ability to make complex spatial decisions quickly and enhance our insight into complex geographic processes on a theoretical level. Therefore, a user-centered system design is required to make the stored representation intuitive for users. Using the GeoAgent-based approach combined with GeoVisualization, users should be able to retrieve, easily understand, and share at least three different levels of geographic information and knowledge: (1) locations, (2) relations, and (3) overall process.

The location information can be displayed in maps that allow users to directly retrieve what/where information from geospatial databases. The geographic relations (i.e., causal, categorical, or temporal relations) can be represented and displayed with graph-based concept maps, so that users can see ‘how’ and ‘why’ elements are interconnected. To support decision making in complex problem environments, GeoAgents should have the capability of learning from past circumstances, reasoning, and using the stored social and scientific knowledge so as to provide guidance for users on ‘what to do’ under given conditions.

3.5 Summary

This chapter attempts to significantly extend the existing notion of agents in GIScience to a more generic concept here called GeoAgents. Potentially, GeoAgents can be used to store knowledge relating to social as well as physical process components, and to represent the complexity of scale-dependent geographic processes, especially human-environment interactions. Integrated with geospatial databases, GeoAgent-based representation should be able to facilitate user retrieval of complex spatial information, increase the understanding of geographic processes, and support appropriate spatial decision-making. Based on the conceptual-level description of GeoAgents in this chapter, the implementation of a Java-based prototype will be introduced in the next chapter.


Chapter 4

IMPLEMENTATION OF THE GEOAGENT-BASED KNOWLEDGE SYSTEM

4.1 Introduction

The previous chapter described the concept of GeoAgents in detail and their use within a larger software environment designed for knowledge-based analysis and decision making. A brief example of how GeoAgents could be employed for solving real-world problems was also given. The GeoAgent-based Knowledge System (GeoAgentKS), a Java-based prototype intended to demonstrate the power and usefulness of GeoAgents within an integrated framework, is described in the current chapter.

As discussed in Chapters 2 and 3, the agent-based approach allows compartmentalization and thus great flexibility for knowledge representation over various domains and sub-domains in a distributed knowledge base, as well as efficiency advantages at the implementation level due to the ability to perform tasks in parallel. Concept maps lend themselves to visual representation of the overall knowledge structure, serving as a roadmap for the user. GeoAgentKS integrates concept maps and a complex GeoAgent structure with constrained, domain-specific mathematical simulation models (stream flow model, drought model, etc.) and a geospatial database within a single, cohesive software environment.

The prototype described here was not started from scratch. Instead, multiple pre-existing open-source software packages were adapted to facilitate implementation. Each of the pre-existing software packages used is introduced in section two below. How these multiple components were employed to achieve an integrated software tool is described in section three. In section four, the knowledge-acquisition strategy used for building a GeoAgentKS is introduced. How the knowledge base was built for the current prototype is discussed in detail in Chapter 5.

4.2 Open source software adapted for GeoAgentKS

The open-source software packages adapted for GeoAgentKS include MadKit, JESS, Touchgraph, and GeoTools. The general capabilities that each provides are briefly described below.

4.2.1 MadKit

First released in 1997, MadKit (a Multi-agent Development Kit, http://www.MadKit.org) was developed by Jacques Ferber and his group in the Computer Science and Artificial Intelligence Department at the University of Montpellier, France. MadKit is a Java-based multi-agent environment in which agents play roles within groups. MadKit provides general agent facilities, including lifecycle management, agent organization, and message passing (see Appendix A.1 for more detail), and allows a high degree of heterogeneity in how individual agents are internally structured and the subsequent roles they may play. Both ACL and KQML (see Chapter 3) are implemented within MadKit as inter-agent communication languages. MadKit is integrated with JESS and provides more than 20 functions (see the definitions in Appendix A.2) for constructing agent-related behavioral rules.

4.2.2 JESS

JESS (Java-based Expert System Shell, http://herzberg.ca.sandia.gov/jess/) is a rule-based inference engine, originally conceived as a Java version of CLIPS (C Language Integrated Production System). CLIPS was initially developed in 1984 at NASA's Johnson Space Center in response to the high cost of expert systems and the poor integration of expert systems with other languages. The first version of JESS was released in 1995 at Sandia National Laboratories (http://www.sandia.gov/). JESS now possesses many advanced features, including both forward-chaining and backward-chaining reasoning strategies. JESS rules are represented as IF … THEN … structures in a distributed knowledge base composed of the specific rules associated with each agent (Friedman-Hill, 2004).
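To illustrate what forward chaining over IF … THEN … rules means, the following is a minimal sketch written in Python rather than in JESS syntax. It is a naive fixed-point loop, not the matching algorithm an actual inference engine uses, and the fact and rule names are hypothetical.

```python
def forward_chain(facts, rules):
    """Repeatedly fire any rule whose IF-part is satisfied until nothing new is derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # IF all conditions are known facts THEN assert the conclusion.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


# Hypothetical rules for a community water system agent.
rules = [
    (frozenset({"drought_emergency"}), "activate_contingency_plan"),
    (frozenset({"activate_contingency_plan"}), "notify_water_users"),
    (frozenset({"power_outage", "no_generator"}), "request_assistance"),
]

derived = forward_chain({"drought_emergency"}, rules)
print(sorted(derived))
```

Backward chaining would instead start from a goal such as `notify_water_users` and search for rules whose conclusions establish it.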

4.2.3 Touchgraph

Touchgraph (http://www.touchgraph.com/) was developed by Alexander Shapiro at TouchGraph LLC, and was first released in 2002. Touchgraph has been utilized in many commercial Websites to make complex information linkages more intuitive for Web users. In recent years, Touchgraph has been extended as an ontology tool called ConceptVISTA (www.geovista.psu.edu/ConceptVISTA/index.jsp) by the GeoVISTA Center (see www.geovista.psu.edu/index.jsp) for ontology creation and visualization. Touchgraph utilizes a graph-based knowledge representation. Users are able to navigate visually through the graph by rotating and zooming in/out to read the labeled graph nodes and their relationships.

4.2.4 GeoTools

GeoTools (http://www.geotools.org) is a Java-based open-source GISystem package, which began as the GeoTools library in 1996 at the University of Leeds, UK. GeoTools currently provides a GIS toolkit for the Open Geospatial Consortium (OGC), an international consortium of hundreds of companies, government agencies, and universities that develops publicly available geo-processing tools. Similar to many commercial GISystem products (e.g., ArcGIS), the GeoTools library provides standard GISystem capabilities for storing, manipulating, and displaying data in both raster and vector form. Within GeoTools, operations on vector data are accomplished through the use of JTS (Java Topology Suite, http://www.vividsolutions.com/jts/jtshome.htm), which defines the topology of points, lines, and polygons.

4.3 Integrating multiple representation techniques in GeoAgentKS

In order to integrate the capabilities of the above software packages within GeoAgentKS, the primary programming work in this research included the development of a GUI (graphical user interface) for GeoAgentKS, and the extension of MadKit and Touchgraph. The GUI developed as the front-end of GeoAgentKS displays the concept map, the geographic maps, and the GeoAgents via linked views, and also allows users to control these components. For MadKit, the agent kernel was extended to allow direct user interaction with GeoAgents. This extension entailed the addition of concept nodes as a basic element that can be registered together with the active agents. If an agent is active, the concept node that represents this agent in the concept map is treated as an internal property of this agent. An agent in MadKit can have an independent user interface to display the results from an internal rule firing. This user interface was extended with a map window for displaying the spatial information with which the agent interacts. For the concept map in Touchgraph, concept nodes with a time component, spatial properties, and an agent linkage were all added.

This section describes the implementation of GeoAgentKS in detail, including its architecture, information flow, and GUI, utilizing these pre-existing open-source packages.

[Figure 4.1 diagram; labeled components: GUI; concept map (Touchgraph); agent kernel (Madkit) with group/role, lifecycle, and messaging facilities; geospatial database engine (GeoTools); GeoAgents, each with a knowledge base (KB); models; inference engine (JESS); geospatial database]

Figure 4.1: Connectivity of the major GeoAgentKS modules/facilities (KB: knowledge base)

4.3.1 The architecture of GeoAgentKS and overall information flow

Figure 4.1 shows how the various components of GeoAgentKS are interrelated. GeoAgents contain behavioral rules that together comprise a distributed knowledge base. As shown in Figure 4.1, MadKit provides the agent kernel that manages agent groups and the roles of GeoAgents, and passes messages among agents. The group and role information pertaining to a specific GeoAgent is defined as part of its internal knowledge and is registered in the agent kernel when this GeoAgent is active. GeoAgents can use the group and role information to identify each other for message passing (see Appendix A.2 for the methods of passing messages, and of defining groups/roles in JESS rules).
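The idea of identifying message recipients by group and role, rather than by individual name, can be sketched as below. This is a hypothetical toy registry written for illustration; the class `AgentKernel` and its methods are not the MadKit API, and the group and role labels are invented.

```python
from collections import defaultdict


class AgentKernel:
    """Toy kernel: registers agents under (group, role) and routes messages by role."""

    def __init__(self):
        self._members = defaultdict(set)   # (group, role) -> agent names
        self._inboxes = defaultdict(list)  # agent name -> received messages

    def register(self, agent, group, role):
        # Called when an agent becomes active.
        self._members[(group, role)].add(agent)

    def send_to_role(self, sender, group, role, content):
        """Deliver content to every agent playing `role` in `group`; return the receivers."""
        receivers = sorted(self._members.get((group, role), ()))
        for name in receivers:
            self._inboxes[name].append((sender, content))
        return receivers

    def inbox(self, agent):
        return list(self._inboxes.get(agent, []))


kernel = AgentKernel()
kernel.register("DEP_PA", "water_management", "regulator")
kernel.register("CollegeTownship_CWS", "water_management", "supplier")
kernel.register("OtherTownship_CWS", "water_management", "supplier")  # invented agent

receivers = kernel.send_to_role("DEP_PA", "water_management", "supplier", "drought_emergency")
print(receivers, kernel.inbox("CollegeTownship_CWS"))
```

Addressing by role means the sender does not need to know which suppliers are currently active, only that the role exists.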

The concept map plays a central role in GeoAgentKS, and this module is connected to all other functional components except the inference engine. Concept maps are often used as a graphical knowledge representation device to allow the user to more easily understand an ontology (i.e., the elements relevant to a given knowledge domain and their interrelationships) and to allow database designers to derive a common schema for shared databases. In addition to being used in this conventional way, the concept map in GeoAgentKS is also used for representing the state of a process at any given moment in time and for a given place. For example, a concept node can represent a drought or a well. The status of the concept node can be calculated from a model (e.g., drought model), queried from the database (e.g., depth of the well), or input by the user. If calculated from a model, the concept node points to the definition of this model, in which the variables (e.g., precipitation or soil moisture) are linked to the observed data in the database. If the database stores a set of time-series data, the modeling outputs can be considered as a simulation of the dynamic environmental changes (e.g. the drought development from start to end). The geospatial database engine uniquely provides access to observational data stored in the geospatial database. Within the geospatial database, observational data are stored in standard shape-file format. The GeoAgents are ‘aware’ of their environmental conditions via interpretation of the stored concept map and the model outputs.
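The three sources of a concept node's status described above (a model, a database query, or user input) can be sketched as follows. The class, the field names, the order in which sources are consulted, and the toy drought model with its cutoff of 50 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConceptNode:
    name: str
    model: Optional[Callable[[dict], str]] = None  # computes status from observed data
    db_field: Optional[str] = None                 # attribute to read from a record
    user_value: Optional[str] = None               # value entered by the user

    def status(self, record: dict):
        # Assumed precedence: model first, then database field, then user input.
        if self.model is not None:
            return self.model(record)
        if self.db_field is not None:
            return record[self.db_field]
        return self.user_value


def toy_drought_model(record):
    """Hypothetical model: flag drought when precipitation falls below an invented cutoff."""
    return "drought" if record["precipitation"] < 50 else "no_drought"


drought = ConceptNode("drought", model=toy_drought_model)
well = ConceptNode("well", db_field="well_depth")
station = ConceptNode("station", user_value="in_good_condition")

# Invented time series: applying the model to each record yields a simple simulation.
series = [
    {"precipitation": 80, "well_depth": 30},
    {"precipitation": 45, "well_depth": 32},
    {"precipitation": 20, "well_depth": 35},
]
drought_trace = [drought.status(r) for r in series]
well_trace = [well.status(r) for r in series]
print(drought_trace, well_trace, station.status(series[0]))
```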

Using the overall system architecture shown in Figure 4.1, the conceptual schemes in Figures 3.1 and 3.2 are functionally achieved in GeoAgentKS. For instance, the conceptual relationships among the social and natural elements (e.g., EPA, DEP, CWS, precipitation, stream flows, and wells) can be represented in the concept map. The roles and actions performed by these elements within geographic processes can be represented as GeoAgents’ condition-action rules or models. By using the geospatial database engine, as discussed above, data relating to the concept map, GeoAgents, and models can be dynamically retrieved from the database to support the representation of the interactions among the relevant social and natural elements.

4.3.2 The user interface

To represent the knowledge pertaining to geographic processes, the graphical user interface (GUI) of GeoAgentKS was designed to allow users to simultaneously see the GeoAgents’ actions, each GeoAgent’s data environment, and the relations among GeoAgents and their environmental components. Figure 4.2 shows the GUI of the current prototype of GeoAgentKS. As seen in this screen image, relationships among the many elements can be represented in the concept map (upper left). Users can use colors and shapes to enhance the knowledge visualization. In this concept map, for example, the concept nodes drawn as blue sharp-corner rectangles are linked with the files that store the behavioral rules of the specific GeoAgents. The concept nodes drawn as round-corner rectangles in different colors represent the environmental components of different GeoAgents. When a concept node is selected by the user, this node changes to yellow.


Figure 4.2: The user interface of the GeoAgent-based Knowledge System

Each GeoAgent has a private GUI (right) including a text window and a map window. For example, there are two GeoAgents displayed in Figure 4.2, which are CollegeTownship_CWS (i.e., College Township Water Authority) and DEP_PA (i.e., the Pennsylvania Department of Environmental Protection). The text windows show the actions from the rule firing within the GeoAgents' knowledge bases. The GeoAgents' private map windows display the multiple scales of data environment with which they interact. For instance, the DEP_PA GeoAgent interacts with a statewide data environment, and CollegeTownship_CWS with a more detailed local data environment. A particular concept node can be dynamically linked to the geographic entity represented in the database. For example, note that performing a mouse-over on the concept node of ‘CollegeTownship_CWS’ results in the service area of the College Township Water Authority being highlighted in red on all map displays.

As a part of how the user interface functions, each concept map in this prototype has a special, invisible concept node named ‘GISPARAMETERS’ that stores the needed spatial parameters. These include the data file names within the database and the display properties (e.g., colors). When the concept map is loaded, the related geospatial information in the database is automatically displayed cartographically below the concept map (see ‘the Study Area’ (lower left) in Figure 4.2).

Figure 4.3 demonstrates how a particular concept node can be linked with the geospatial database and a GeoAgent. This is a dialogue window of the concept node ‘CollegeTownship_CWS.’ In this window, the user can define the feature attribute (i.e., the ‘feature’ box) and the data file (or a map layer, i.e., the ‘GIS file’ box) for this concept node. For instance, the current concept node is linked with ‘College Township Water Authority’ in the ‘CWS’ data layer. If the user graphically selects this concept node, the database engine automatically queries the defined feature name in the data layer and highlights this feature in the cartographic display (Figure 4.2).
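The link between a concept node and a named feature in a data layer amounts to a lookup by attribute value. A minimal sketch under stated assumptions: the layer is an in-memory list of records rather than a shape file, and the function name, the `NAME` attribute, and the second feature are hypothetical.

```python
def select_feature(layers, gis_file, feature_name, name_field="NAME"):
    """Return the records in the named layer whose name attribute matches the concept node."""
    return [f for f in layers.get(gis_file, []) if f.get(name_field) == feature_name]


# Invented stand-in for the 'CWS' data layer.
layers = {
    "CWS": [
        {"NAME": "College Township Water Authority", "id": 1},
        {"NAME": "Another Water Authority", "id": 2},
    ]
}

# Properties a concept node might store, as entered in its dialogue window.
node = {"label": "CollegeTownship_CWS",
        "feature": "College Township Water Authority",
        "gis_file": "CWS"}

highlighted = select_feature(layers, node["gis_file"], node["feature"])
print(highlighted)
```

A display component could then draw the returned records in a highlight color.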


[Figure 4.3 screen image; callouts: linked to the GeoAgent’s rule file; links to its feature name in the GIS database; the name of the loaded data file]

Figure 4.3: The properties of a concept node

The user can also specify behavioral rules for a particular GeoAgent and its data environment from the associated concept node. As shown in Figure 4.3, for example, the concept node ‘CollegeTownship_CWS’ is linked with its agent rule file ‘\geoagents\CollegeTownship_CWS.clp’ in the ‘Agent file’ box. In addition, using a pop-up menu (Figure 4.4), the user can add the data environment (e.g., data layers) for the GeoAgent. The method of doing so is to store the properties of this data environment (e.g., names of the data files, fields of the data tables, and the display colors) in a file. Once the user loads this file for the GeoAgent, the concept node can point to the actual data stored in the database. When the user clicks ‘Start GeoAgent’ on the pop-up menu, a new GeoAgent is initialized. During the initialization, the rule file is loaded to Madkit, and a new agent (i.e., the GeoAgent of CollegeTownship_CWS) and its user interface are created and displayed, and the data environment of this GeoAgent is also displayed.

Similarly, the concept node ‘DEP_PA’ is linked to the rule file and data environment of the GeoAgent ‘DEP_PA’ (Figure 4.2).

Figure 4.4: Adding the data environment for the GeoAgent of CollegeTownship_CWS

Figure 4.5 illustrates in more detail how the user can visualize the relationships between GeoAgents and the state of the world as represented via the concept map. The example portrayed shows the GeoAgents responding to a power outage. As discussed earlier, the status of the concept nodes can be either input directly by the user, or retrieved from the database by querying the spatial attributes defined in the concept nodes. As an example in the GUI, the ‘system_status’ of the ‘StateCollege_CWS’ now is ‘in_good_condition.’ Similarly, various environmental elements are described in other concept nodes, such as ‘surroundings,’ ‘emergency generator,’ ‘system pressure,’ and ‘station.’


Figure 4.5: GeoAgents are aware of their environmental status via checking the concept nodes

GeoAgents automatically check the relevant concept nodes based on their behavioral rules to determine the environmental conditions at a given place and time. These actions are also noted in the display. For example, in Figure 4.5, the concept nodes turn green in the display when they are checked by GeoAgents. From the changed colors, the user is able to see which environmental elements may affect a GeoAgent's actions in a given situation. The specific actions (i.e., results of rule firings) for individual GeoAgents are displayed in the text window of their private interfaces (also see the ‘GeoAgent's actions’ in Figure 4.2). To support decision making, the actions here are defined as suggestions to users about what to do. For example, actions in Figure 4.2 are the appropriate steps for DEP_PA or CollegeTownship_CWS to cope with a drought warning.
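A rough sketch of this check-and-suggest cycle follows. It is not the JESS rule base used in the prototype; the `Rule` class, the `run_rules` function, the node values, and the suggestion text are hypothetical and serve only to show how checked nodes could be recorded (for display in green) while fired rules yield suggestions.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict   # concept-node name -> required status
    suggestion: str    # text shown to the user when the rule fires


def run_rules(rules, node_status):
    """Check rules against node statuses.

    Returns (checked, suggestions): the set of node names the agent
    looked at, and the suggestions of every rule whose conditions hold.
    """
    checked = set()
    suggestions = []
    for rule in rules:
        checked.update(rule.conditions)  # every referenced node is checked
        if all(node_status.get(n) == v for n, v in rule.conditions.items()):
            suggestions.append(rule.suggestion)
    return checked, suggestions


# Hypothetical statuses and rule, loosely modeled on the power-outage example.
node_status = {"system_status": "in_good_condition",
               "emergency generator": "available"}
rules = [Rule({"system_status": "in_good_condition",
               "emergency generator": "available"},
              "Switch to the emergency generator")]
checked, suggestions = run_rules(rules, node_status)
```

Here `checked` would drive the color change in the concept map, while `suggestions` corresponds to the text shown in the GeoAgent's private interface.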

Concept maps were also extended to include a time component. As shown in Figure 4.3, the user can directly input a time property into individual concept nodes so that the temporal relationships among them can be represented.

4.4 Knowledge acquisition for GeoAgentKS

Knowledge sources for building the concept map and GeoAgents’ distributed knowledge base included both domain experts and pre-existing documents. Because building the sophisticated human-computer interface mechanisms needed for knowledge acquisition was beyond the scope of the current research, the GeoAgentKS prototype does not yet have a user-friendly interface necessary for domain experts to enter complex and often abstract concepts directly. A knowledge engineer (i.e., knowledge programmer) with a strong technical background is still required to operate this prototype tool and to act as translator for the domain experts in order to capture and represent their knowledge within GeoAgentKS. Much of the experts' knowledge can be captured via interviews, with the knowledge engineer operating this tool and performing the necessary translation as part of the conversation. This approach allows the domain expert to verify and augment information as the entry process proceeds. The knowledge engineer can also interpret written materials (e.g., laws, regulations, or plans) and build a knowledge representation (or components of one), and then ask domain experts to evaluate and improve the representation. The next chapter describes a case study illustrating this process.

4.5 Summary

This chapter described the implementation of the GeoAgentKS prototype, which integrates a complex structure of independent intelligent agents with a concept mapping facility, a rule-based inference engine, and a geospatial database into a single, cohesive tool for representing geographic processes and assisting complex decision making.

Within this prototype, the concept map plays a central role in linking knowledge representations with databases, both for internal functioning and for aiding the user.

Using a distributed approach for geographic knowledge representation, it is possible to address the heterogeneous and scale-dependent nature of geographic processes, including physical processes, cultural processes, and the interactions of the two.

Dealing with temporal dynamics was the most difficult task encountered in designing and implementing the prototype. This research only addressed time representation by expanding the concept map with a time component to allow the establishment of temporal relations among observed geographic events, and by saving observed changes (e.g., the weekly drought data) into the database to allow dynamic modeling. But how to represent the changing laws and regulations and how to build a mechanism to achieve temporal congruity among the distributed knowledge/data representations still remain topics for future work.

The next chapter introduces a CWS-related case study in Central Pennsylvania using the GeoAgentKS prototype. This case study demonstrates the prototype’s advanced functionality in representing complex human-environment relations.


Chapter 5

CASE STUDY DESIGN: GEOGRAPHIC KNOWLEDGE ACQUISITION, REPRESENTATION, AND PERFORMANCE EVALUATION

5.1 Objectives of the case study

As discussed in Chapter 1, current GISystems are designed to handle discrete and (usually) quantified observational data. As such, they provide no tools for either (1) handling qualitative and inexact socially based knowledge such as laws, regulations, and policies within a given problem context to assist decision making, or (2) representing higher-level, how/why knowledge of geographic relations to better understand geographic processes. GISystems currently rely completely on the expertise and personal knowledge of the individual user for these two critical aspects of geographic representation and analysis.

Given the complexity of real-world problems being currently addressed using GISystems and the multiple-domain knowledge required, these are significant limitations, and the GeoAgentKS prototype, as developed in this research, is designed to specifically address these limitations. This is a new form of GISystem, incorporating a knowledge layer that contains both qualitative information and higher-level knowledge in addition to a data layer (i.e., a database) with observational data. As described in Chapter 4, the knowledge layer is implemented by integrating graph-based, rule-based, and distributed knowledge-representation techniques (Figure 4.2).

As this research was supported by the HERO project (Human-Environment Regional Observatory, http://hero.geog.psu.edu), one of the research topics incorporated within that larger project was used for the application context of the case study: studying the community water systems (CWSs) in Central Pennsylvania. The objectives of this case study were to (1) demonstrate how non-observational and higher-level abstract knowledge as represented in GeoAgentKS can be captured, stored, and represented to the user, and (2) evaluate the utility of such knowledge for both experts and novices. The current chapter describes the case study design. In section two below, the overall strategy is outlined. Section three describes the methods used for capturing and verifying knowledge from multiple sources in more detail. Section four discusses how the combined geographic knowledge/database, once built, was used and evaluated. Examples of these methods and the results of the CWS case study are described in detail in Chapter 6.

5.2 Overall methodology

Knowledge acquisition and representation for most application contexts involves deriving information from text sources as well as from interviews with domain experts.

Testing the adequacy of the knowledge representation is done in two ways: (1) having both domain experts and novices use the prototype system, and (2) simulating specific real-world events from the past and evaluating the system response against the historical response of the individuals and organizations involved.

5.2.1 Knowledge acquisition and representation

As mentioned above, geographic knowledge can be captured by interpreting pre-existing documents and by interviewing experts. In this research, external documents are used for constructing behavioral rules of the GeoAgents and initial versions of the concept maps. To extend document-derived knowledge, knowledge of domain experts is captured via interviews using the concept-mapping function within GeoAgentKS.

Interviews are essential for capturing up-to-date information (since written documents can rapidly become obsolete), and for capturing individualized understanding of geographic processes relevant to functioning of the GeoAgents.

In the example application context used for the current case study (i.e., the management of CWSs), there are many written documents (e.g., laws, policies, emergency plans, and operation manuals) that describe both the physical and organizational structures of individual CWSs and the appropriate responses to environmental changes as prescribed by regulation and accepted practice. Because these procedures are generally well organized and are written using a fairly standardized vocabulary, as is the case for many organizations, it is relatively easy to formalize the knowledge contained within these documents as behavioral rules for GeoAgents.

In addition to the documents, there is much domain-specific knowledge that is not documented, including recent changes relating to organizational structure or operating rules and individualized understanding of the causes and effects of particular events learned through personal experience. Such knowledge can only be elicited directly from the experts who run or otherwise have operational linkages with the specific organization.

The concept-mapping function within GeoAgentKS was utilized in the current case study to facilitate knowledge elicitation during the interviews with experts. Concept maps were initially developed to provide a tool to aid knowledge elicitation through the use of diagrammatic visualization of the knowledge space being conveyed. This interview approach is referred to as computer-based concept-mapping (also see Bruillard and Baron 2000) in the rest of this dissertation.

The knowledge engineer begins the knowledge acquisition and representation process by building independent sets of behavioral rules for individual GeoAgents and separate concept maps for various portions of the knowledge domain using text sources.

Multiple rounds of checking and comparing the knowledge stored in GeoAgentKS with the original text sources may be required to ensure consistency before proceeding with the expert knowledge construction interviews. Once the independent representations are established, they are gradually expanded, refined, and integrated into a single and interrelated knowledge base through additional knowledge construction interviews via concept-mapping and subsequent consistency checks by the knowledge engineer.

Integration of concept maps can include the resolution of conflicting nodes derived from differing sources, and the merging of redundant nodes. It is by linking shared concept nodes that separate concept maps are merged into a larger concept map.

To resolve the conflicts between different knowledge sources, the knowledge engineer then asks different experts to validate whether the linkages are correct.

Integration in the current context also includes building the communication mechanisms between GeoAgents (i.e., formalizing how the rules of individual GeoAgents interact).

Two important things related to the strategy of knowledge acquisition and representation using concept maps are noted here:

First, concept maps should always be constructed initially from the written documents before the concept-mapping interviews, if the documents are available. This allows the experts to (1) verify the knowledge captured from the text documents, (2) add their own unique knowledge during the interviews, and (3) validate the resulting concept map.

Second, geospatial data should be collected after completion of the knowledge acquisition process and then linked to the knowledge representation so as to support the concepts and rules represented within the GeoAgentKS knowledge base.

5.2.2 Evaluation of the knowledge representation

After the knowledge acquisition phase, user evaluations are required to examine (1) whether the contents of the integrated knowledge representation are correct and adequate, and (2) whether experts, as well as non-experts, find the integrated knowledge representation in GeoAgentKS easy to comprehend and use. Because non-experts usually have a different perspective on the domain knowledge represented, it is necessary to obtain reactions from domain novices as well as domain experts in order to examine the capability of GeoAgentKS in conveying and sharing knowledge.

Experts, by definition, possess abundant domain-specific background knowledge and expertise, and can thus judge accuracy and adequacy of the existing knowledge representation. In conducting both the knowledge-construction and the separate evaluation interviews with experts, they must first be asked to review the entire knowledge representation (i.e., complete concept map, GeoAgents’ actions, and geospatial maps). This step allows them to become familiar with the visual and conceptual representation methods used, as well as the knowledge represented to that point in the process. They are then asked to identify any knowledge they believe to be incorrect and to add information they think is missing.

Note that expert evaluation interviews are different from expert validation of their own concept maps in the knowledge-acquisition (i.e., concept mapping) interviews. The evaluation interviews focus on evaluating the usability of the entire integrated knowledge representation, while the purpose of the experts’ validation of their own concept maps at the end of a concept mapping interview is to check for any errors or omissions in the knowledge representation as they had just conveyed it, focusing on identifying specific elements and how they interact.

In the second phase of the evaluation interviews, experts and novices are asked how they would use GeoAgentKS if it were available on their desks as a problem-solving tool. The focus in interviewing novices, who have little or no knowledge of the given domain, is to examine whether these non-experts can directly and quickly learn and understand the captured complex knowledge (i.e., organized information, see page 9) from the knowledge representations. If so, the concept maps within GeoAgentKS can potentially be used as an effective tool for explanation and learning.

Details of the entire knowledge acquisition, representation, and evaluation process are described in sequence in the remainder of this chapter.

5.3 Capturing knowledge and populating the knowledge base in GeoAgentKS

In this research, as discussed earlier, the knowledge-engineering process involves three basic phases: (1) initial knowledge capture via interpretation of text documents; (2) subsequent computer-based concept-mapping interviews to allow experts to verify and refine the knowledge representation derived in step (1); and (3) integration of discrete knowledge elements into a single system. This section discusses each of these steps in sequence.

5.3.1 Capturing knowledge via interpretation of text-based documents

To formalize document-based knowledge, there are two tasks that proceed in parallel: (1) identification of GeoAgents and construction of the GeoAgents' behavioral rules, and (2) derivation of the concept nodes and their interrelationships. These two tasks are done in parallel, because the concept map can serve as a visual aid for deriving a knowledge structure to help the establishment of GeoAgents’ behavioral rules.

To construct the behavioral rules from documents, text analysis is performed by hand. Although there have been many research efforts exploring technologies for achieving knowledge formalization from text in an automated fashion (Gelbart and Smith, 1992; Schweighofer, Rauber et al., 2001), these technologies are generally still at an early stage. Another reason for manually performing the knowledge-formalization process for GeoAgentKS in this research is that it allows the knowledge engineer to maintain closer control over the process, integrating multiple knowledge-representation techniques and diverse knowledge sources.

To build the rules for a collection of interacting individual GeoAgents from written documents, the knowledge engineer needs to analyze the text and convert it into executable rules for GeoAgents. More specifically, this research explored three procedures: sentence analysis, pseudo coding, and rule formalization. For sentence analysis, the first step is to plan GeoAgents’ goals, tasks, and actions. Achieving a goal usually requires a set of tasks, and a task consists of a set of actions (Ferber, 1999). For example, ‘a book published’ is a goal state, ‘to write a book’ is a task, and ‘writing’ is an action. A goal is derived from the overall text. For high-level tasks, the text is divided into multiple groups of sentences using a text editor (e.g., Word or Notepad). A particular task for an individual GeoAgent may be derived from a group of sentences. For low-level actions, the general method is to highlight the verbs of individual sentences. Pseudo code is non-executable and usually combines some of the structure of a programming language with an informal natural-language description. The purpose of pseudo coding is to capture the generalized structure of behavioral rules for GeoAgents so that the knowledge engineer is not distracted by the detailed syntax of the particular expert-system language (i.e., JESS) utilized within GeoAgentKS. The pseudo code can be just a list of key words. Rule formalization is used, in turn, to translate the pseudo code into executable JESS rules within the distributed GeoAgent knowledge bases by using a text editor. For many condition-action rules, as discussed in Chapter 4, the FACTS of the condition components (i.e., the IF part) are guided by reference to the relevant concept nodes in the concept map.
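The last two procedures can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions, not the procedure used in this research (where the translation was done by hand in a text editor): the keyword-list form of the pseudo code, the `formalize` function, and the drought example are hypothetical, and the emitted text merely follows the general `defrule` condition-action form of CLIPS-style languages such as JESS.

```python
def formalize(name, if_facts, then_actions):
    """Turn a keyword-style pseudo-code rule into CLIPS/JESS-like text.

    if_facts: list of (concept_node, status) pairs for the IF part.
    then_actions: list of suggestion strings for the THEN part.
    """
    lines = [f"(defrule {name}"]
    for node, status in if_facts:
        lines.append(f"  ({node} {status})")      # one fact pattern per line
    lines.append("  =>")
    for action in then_actions:
        lines.append(f'  (printout t "{action}" crlf)')
    lines[-1] += ")"   # close the defrule form
    return "\n".join(lines)


# Hypothetical pseudo code: IF drought_status warning THEN request conservation
rule_text = formalize(
    "drought-warning-response",
    [("drought_status", "warning")],
    ["Request voluntary water conservation"],
)
print(rule_text)
```

Separating the keyword list from the emitted syntax mirrors the motivation given above: the structure of a rule can be settled first, and the language-specific details added afterwards.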

5.3.2 Capturing domain knowledge via interviews

The objective of the concept-mapping interviews, as discussed earlier, is to capture the experts' how/why level of understanding of the geographic processes relevant to the given application or knowledge domain. The concept-mapping interviews consist of three components: (1) pre-interview preparation, (2) interviews for constructing concept maps, and (3) concept map validation.

5.3.2.1 Pre-interview preparation

In accordance with the Pennsylvania State University’s human subject protection rules for the current research, each interview is planned to be completed within about one hour. Since capturing and formalizing experts' informal knowledge for any domain is an individualized and time-consuming process, careful preparation before the interviews is key to obtaining a good result. Because the purpose of the knowledge construction interviews is to capture how/why level knowledge, the interview questions were designed to prompt experts to explain the causes and effects relevant to geographic events pertaining to their specific domain of expertise. It is also necessary to prepare in advance (1) a printed sample concept map to help the expert participants understand how to use concept maps to represent their knowledge, and (2) paper maps upon which the expert can mark some spatial information relevant to geographic features that appear in the concept map.

A sequence of questions is designed to prompt a description by the expert, yet also leave sufficient latitude for them to describe their knowledge domain in terms that feel natural to them. The initial questions therefore must be fairly broad, but still focus on deriving key elements of the knowledge domain at hand, as would be represented within a concept map (e.g., things, events, and the relationships among them). Tailoring the exact wording to the given context, an initial question must focus on determining what the main elements in this knowledge domain are. A follow-up question would then focus on where these are located geographically. Given elicitation of these basic elements, subsequent questions ask which events are integral to the process or affect its operation. This form of question is intended to guide the conversation toward the functional interrelationships among the elements. To summarize, the sequence of questions begins with a broad initial focus, moves on to refinement and revision, and ends with follow-up questions on specific how and why elements and combinations of elements.

5.3.2.2 Knowledge-acquisition interviews

During the knowledge-acquisition interviews, the experts cannot be expected to learn how to operate GeoAgentKS by themselves. The interviewees need to focus on the structure of the relevant knowledge domain with the guidance of the knowledge engineer.

Before making concept maps, the knowledge engineer demonstrates the document-derived concept maps (i.e., the printed sample concept map) that relate directly to the experts’ own area, and explains conceptually how the concept-mapping function of GeoAgentKS is used to modify and extend this particular form of knowledge representation. Once the expert understands the method, he/she starts verifying and adjusting the document-derived concept maps, and then adds his/her own knowledge to the concept map. Note that if text documents are not available for the specific domain to be addressed, the printed sample concept map is only used as an example for the purposes of demonstration, and the construction of the concept map must start from scratch.

The knowledge engineer operates GeoAgentKS as the expert responds, adding concept nodes (i.e., various relevant entities and events described by the interviewees) and adding/changing relationships among these concept nodes. If the expert mentions some mappable geographic features (e.g., wells or filtration plants) during the interview, the knowledge engineer asks the expert to mark the locations of these features on paper maps. This is initially a learning process for the interviewees, and adding the first few concept nodes gives them a better understanding of what a concept map is and how it can be used to represent their knowledge.

5.3.2.3 Concept map validation

After the initial knowledge capture phase of each concept-mapping interview, the expert interviewee is asked to validate the contents in the concept map that he/she just constructed in order to examine how well the concept mapping functions in GeoAgentKS captured the relevant domain knowledge (note that because of the limited time in each interview, the expert interviewee usually is not requested to validate the document-derived GeoAgents’ rules in the same interview). To validate whether or not the concept map (i.e., the one the expert just constructed) successfully represents his/her conveyed knowledge, the expert interviewee is asked to review the entire concept map and assess how much knowledge has been captured in GeoAgentKS.

The interviewee first reviews the entire concept map in detail, with the interviewer rotating and zooming in/out as requested. The interviewee is then asked: (1) Has GeoAgentKS captured his/her knowledge? The answer to this question is measured with a ‘yes’ or ‘no’ score. (2) How much of his/her knowledge relevant to the interview topic is captured in the concept map? The interviewee is asked to give a direct estimate using percentage. (3) If the knowledge representation is incomplete, what is the reason?

5.3.3 Integration of the discrete knowledge representations

As noted earlier, the GeoAgents' behavioral rules and concept maps are developed in a relatively independent way during the knowledge-acquisition stage, with rules for specific GeoAgents, and individual concept maps developed via the text analysis process and by individual interviews. It is thus necessary to integrate the entire knowledge representation by combining the individual concept maps into a single and coherent concept map and then establishing the functional interrelationships of the GeoAgents.

The method for integrating independent behavioral rules of the GeoAgents is to build inter-GeoAgent communication protocols. For example, the documents of a particular CWS usually describe when this CWS needs to communicate with other agencies. In the GeoAgent representing this CWS, a communication rule then can be formalized to define what information needs to be sent to the GeoAgents of other agencies. For the message receivers, the corresponding receiving rules are needed to interpret the contents of the messages so as to respond to the message sender’s request.

As discussed in Chapter 4, an ACL (Agent Communication Language) encoded within Madkit is used in GeoAgentKS to implement such communication.
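The pairing of a sender's communication rule with a receiver's interpreting rule can be sketched in a few lines. This is a plain-Python illustration and not Madkit's messaging API: the `Agent` class, its `send` and `handle_messages` methods, the message fields, and the example request are all hypothetical.

```python
from collections import deque


class Agent:
    def __init__(self, name):
        self.name = name
        self.inbox = deque()   # messages waiting to be interpreted
        self.log = []          # responses produced by receiving rules

    def send(self, receiver, performative, content):
        """Communication rule: deliver a message to another agent."""
        receiver.inbox.append(
            {"sender": self.name, "performative": performative,
             "content": content})

    def handle_messages(self):
        """Receiving rule: interpret each message and respond to it."""
        while self.inbox:
            msg = self.inbox.popleft()
            if msg["performative"] == "request":
                self.log.append(
                    f"{self.name} handles '{msg['content']}' "
                    f"from {msg['sender']}")


cws = Agent("CollegeTownship_CWS")
dep = Agent("DEP_PA")
cws.send(dep, "request", "report power outage")   # hypothetical content
dep.handle_messages()
```

The point of the sketch is only that each communication rule on the sending side needs a matching interpretation rule on the receiving side; otherwise the message sits in the inbox without effect.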

To integrate the separate concept maps into a large concept map, separate concept maps are linked via shared concept nodes. For example, if the separate concept maps portray different CWSs within the same county, the concept node of this county name can then be used to link the concept nodes of these CWSs so as to link the separate concept maps. Detailed examples of how to integrate multiple concept maps are described in Chapter 6 as part of the case study description.
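Merging through shared nodes can be sketched as follows, assuming (hypothetically) that each concept map is stored as a set of (source, relationship, target) triples. The `nodes` and `merge_maps` helpers and the example triples are illustrative; 'AnotherCWS' is a placeholder rather than a system from the case study.

```python
def nodes(concept_map):
    """All concept nodes that appear in a set of triples."""
    return {n for src, _, dst in concept_map for n in (src, dst)}


def merge_maps(map_a, map_b):
    """Merge two concept maps; return (merged_map, shared_nodes)."""
    shared = nodes(map_a) & nodes(map_b)
    if not shared:
        raise ValueError("maps share no concept node to link them")
    return map_a | map_b, shared


map_a = {("CollegeTownship_CWS", "located_in", "Centre County")}
map_b = {("AnotherCWS", "located_in", "Centre County")}
merged, shared = merge_maps(map_a, map_b)
```

Because the maps are sets, identical triples from different sources collapse automatically; conflicting triples would still need review by the knowledge engineer and the experts.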

5.3.4 Building a corresponding database

Collecting spatial data and constructing the geospatial database containing the observational data needs to take place after the interpretation of documents and the concept-mapping interviews with the domain experts. This sequencing is necessary simply because knowing exactly which data would be relevant (i.e., relating to the entities and events stored in the concept maps and GeoAgents' rules) is impossible until completion of the knowledge acquisition phase. The spatial information required for the database (e.g., geology, land cover, streams, road systems, and well locations) can be collected from the relevant organization directly, from public data sources via the Internet, and from digitized paper maps marked by the experts during the interviews. The detailed method for integrating databases with concept maps and GeoAgents' rules was previously described in Chapter 4 as part of the system design.

5.4 Evaluating the integrated knowledge base

To examine the effectiveness of the above database-enhanced geographic knowledge base, evaluation interviews must be conducted with both experts and novices to test (1) whether the domain knowledge is correctly and adequately captured within GeoAgentKS, and (2) whether the stored knowledge can be easily learned and used. As a separate evaluation step, simulations using GeoAgentKS independently test whether the prototype can adequately represent the real world and provide appropriate results in real-world problem scenarios without direct guidance by a human expert.

5.4.1 Evaluation by domain experts

After construction and integration of the knowledge base, subsequent evaluation interviews with different experts were conducted to assess how well the entire knowledge system in this prototype can represent the particular knowledge domain, as well as how well such a system can help provide a tool for experts in problem-solving. The objectives of these evaluation interviews are (1) to assess the GeoAgents' action rule base, and (2) to evaluate the use of GeoAgentKS for modeling specific event sequences.

At the beginning of each of the evaluation interviews with an expert who has not previously seen GeoAgentKS, the knowledge engineer interactively operates GeoAgentKS to demonstrate how geographic information and knowledge is represented in the concept maps, the agent rules, and in the geospatial database. Once the experts understand how GeoAgentKS works, they are asked to read the integrated concept map with the guidance of the knowledge engineer, and then to verify whether the concept nodes and relationships between them are correctly represented in their opinion. If not, the experts can request the knowledge engineer to correct whatever they believe to be errors.

For evaluating the collective performance of GeoAgents, two procedures are needed. First, the knowledge engineer demonstrates the GeoAgents' environmental conditions represented within the relevant concept nodes to the experts, and asks the experts to think of what the appropriate actions would be if they were facing such conditions. Second, the knowledge engineer asserts the same conditions to the relevant GeoAgents and asks the expert to count how many actions recommended by the GeoAgents are incomplete, missing, or wrong by reading the rule-firing results from the interface of GeoAgentKS. After the above evaluations, the knowledge engineer asks the experts to describe how they might use GeoAgentKS if it were available to them.
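The expert's tally could be recorded with a structure like the one below. This is a hypothetical bookkeeping sketch and not part of GeoAgentKS; the judgment labels mirror the three error types named above plus 'correct', and the sample judgments are invented for illustration.

```python
from collections import Counter

LABELS = ("correct", "incomplete", "missing", "wrong")


def tally(judgments):
    """Count expert judgments per label; judgments maps action -> label."""
    for action, label in judgments.items():
        if label not in LABELS:
            raise ValueError(f"unknown label for {action!r}: {label!r}")
    counts = Counter(judgments.values())
    return {label: counts.get(label, 0) for label in LABELS}


# Invented example judgments for one scenario.
judgments = {
    "action A": "correct",
    "action B": "incomplete",
    "action C": "correct",
    "action D": "missing",
}
summary = tally(judgments)
```

Reporting a count for every label, including zeros, keeps summaries comparable across experts and scenarios.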

5.4.2 Evaluation by novices

In addition to evaluation by domain experts, interviews with novices are also needed to test whether the knowledge representation within GeoAgentKS can be easily learned and used. To avoid the direct impact from the knowledge engineer’s background knowledge, the novice participants are requested to explore the concept map by themselves. At the beginning of each of these evaluation interviews, the novice is taught to use some simple functions of GeoAgentKS so that they can navigate the concept maps and query the spatial information of the geographic features from the concept nodes. Once these participants finish reading each of the concept maps alone, they are asked pre-designed questions to test if they truly understand the represented knowledge.

Because one of the foci of this dissertation is representing how/why-level knowledge, the questions for the evaluation interviews are designed to test if these novices are able to answer relevant how/why questions. Thus these questions need to be specifically related to what is represented in GeoAgentKS. These questions can often be the same ones asked to the experts during the concept-mapping interviews. The method for measuring the performance of each novice is to count how many of the expected key points are mentioned in his/her answers.
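The key-point count can be illustrated with a simple sketch. It assumes, hypothetically, that each expected key point is represented by a few phrases and that a case-insensitive substring match is enough to decide whether it was mentioned; in an actual interview this judgment would rest with the interviewer. The `score_answer` function, the key points, and the sample answer are invented for illustration.

```python
def score_answer(answer, key_points):
    """Return the expected key points that the answer mentions.

    A key point counts as mentioned if any of its phrases occurs in the
    answer (case-insensitive substring match).
    """
    text = answer.lower()
    return [name for name, phrases in key_points.items()
            if any(p.lower() in text for p in phrases)]


# Invented key points and answer, for illustration only.
key_points = {
    "generator": ["emergency generator", "backup power"],
    "pressure": ["system pressure"],
    "notify": ["notify", "inform"],
}
answer = "They start the emergency generator and watch the system pressure."
mentioned = score_answer(answer, key_points)
score = len(mentioned)
```

Returning the matched key points rather than only the count makes it possible to see which parts of the represented knowledge novices tend to miss.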

Finally, the novice participants are asked to give any comments about the advantages or disadvantages of using GeoAgentKS they might see for geographic representation.

5.4.3 Scenario simulation for a comprehensive performance test of GeoAgentKS

In addition to the experts’ and novices’ evaluations, which inherently are influenced by the effectiveness of the graphics, a different test is needed to examine the robustness of GeoAgentKS in automated reasoning based on the knowledge representation. The capabilities of GeoAgentKS in representing and interpreting complex, dynamic, and scale-dependent geographic processes are tested by running complex scenario simulations. This simulation testing does not have the benefit of interactive human guidance and interpretive input during analysis, and therefore relies only on the reasoning abilities of GeoAgentKS. Such an input-output simulation needs to utilize the GeoAgents’ internal rules with quantitative models and the database working together. Evaluation of the simulation results is accomplished by comparing (1) the simulation results with the original knowledge sources to validate logical correctness, or (2) the simulation results with what actually happened, in terms of both the event and the response of the human decision makers (assuming the decision makers did what they should have done).
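The comparison step can be sketched as a set comparison between the actions produced by a simulation run and a reference list taken from the knowledge source or the historical record. The `compare_actions` function and the example action names are hypothetical; the sketch ignores the order and timing of actions, which a fuller comparison would also need to consider.

```python
def compare_actions(simulated, reference):
    """Compare simulated actions with a reference list of actions."""
    sim, ref = set(simulated), set(reference)
    return {
        "matched": sorted(sim & ref),   # in both simulation and reference
        "missing": sorted(ref - sim),   # expected but not simulated
        "extra": sorted(sim - ref),     # simulated but not in the reference
    }


# Invented action names for illustration.
simulated = ["notify DEP_PA", "start generator"]
reference = ["notify DEP_PA", "start generator", "issue public notice"]
report = compare_actions(simulated, reference)
```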

5.5 Summary

This chapter introduced the methods and steps needed for building a knowledge base and for demonstrating and evaluating how GeoAgentKS can be used to effectively capture and use higher-level knowledge. For capturing and representing geographic knowledge, I discussed the research design of first using text analysis of written documents for formalizing GeoAgents' behavioral rules and concept nodes and then using computer-based concept-mapping interviews to capture domain experts' how/why level knowledge. To evaluate the correctness, adequacy, and usability of the knowledge representation within GeoAgentKS, this chapter also introduced an approach for conducting the evaluation interviews with experts and novices and the methods for measuring these evaluation results. The next chapter shows how these methods were applied in a specific case study to represent the complex human-environment relations relevant to community water systems (CWSs) in Central Pennsylvania.


Chapter 6

CASE STUDY: REPRESENTING THE PROCESS OF COMMUNITY WATER SYSTEM MANAGEMENT IN CENTRAL PENNSYLVANIA

6.1 Introduction

As discussed in Chapter 5 (page 66), the objectives of the GeoAgentKS case study in this research include (1) examining the capabilities of GeoAgentKS in capturing, representing, and conveying the knowledge of geographic processes via interpretation of text documents and interviews, and (2) evaluating how easily the system can be used for decision making by both domain experts and novices. The methods used to meet the two objectives and the reasoning behind them were also described.

Figure 6.1: Centre County, Pennsylvania

As discussed in Chapter 1, one of the research topics incorporated within the HERO project was adopted for the current case study: the community water systems (CWSs) of Centre County, Pennsylvania (Figure 6.1). In the second section below, the general characteristics of CWSs are introduced. The third section presents how the knowledge-capture methods described in the previous chapter were applied. In the fourth section, construction of the databases to support the knowledge representation is described. The fifth section discusses the procedures and results of the knowledge-base evaluation.

6.2 General characteristics of CWSs and an overview of the case study

The United States Environmental Protection Agency (EPA) defines CWSs as water systems that serve at least 15 connections or 25 people on a year-round basis. There are approximately 54,000 CWSs in the United States drawing on surface and groundwater supplies to serve roughly 268 million people (EPA, 2003). Whether serving large cities or small communities, CWSs are a key organizational mechanism for providing a sustainable and safe water supply for the general population, and for protecting public health and community well-being. Water quality in streams and aquifers is largely affected by the natural environment, including such factors as geology, land cover, climate, and extreme weather events (e.g., drought, cold, or storms). The quality of water delivered by CWSs is further influenced by public policy, regulations, financial constraints, and other socially based factors. Each CWS has its own unique aspects within a broader set of physical infrastructure factors and interactions with the social and natural environment.
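The EPA threshold quoted above can be read as a simple predicate. The sketch below is a simplified reading of that single sentence, not the full regulatory definition, and the function and parameter names are hypothetical.

```python
def is_community_water_system(connections: int, people_served: int,
                              year_round: bool) -> bool:
    """Simplified reading of the definition quoted in the text:
    at least 15 connections or 25 people, served on a year-round basis."""
    return year_round and (connections >= 15 or people_served >= 25)


# Hypothetical examples.
small_rural_system = is_community_water_system(26, 60, year_round=True)
seasonal_camp = is_community_water_system(100, 300, year_round=False)
```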

In this research, as noted in the previous chapter, the methods used to elicit knowledge of exactly how an individual CWS works entailed interpretation of text documents as well as interviews with local water managers. Because many of the water managers within the research area had already participated in several undergraduate research projects in the HERO project before this case study, only a limited number of water managers agreed to participate in the current research. Figure 6.2 shows the six CWSs involved in this research: the Aaronsburg Waterpipes Corporation, the College Township Water Authority, the Millheim Borough Water System, the Penn State University Water System, the State College Borough Water Authority, and the Upper Halfmoon Water Association.

Figure 6.2: The CWSs involved in the case study

As discussed in Chapter 5, analysis of text documents describing relevant laws, regulations, and plans was used to derive an initial identification of GeoAgents and their behavioral rules, and to formalize the concept nodes of the overall knowledge domain and their interrelationships. The documents used in this research included the emergency plans and standalone drought contingency plans (e.g., College Township Water Authority, 2002b; College Township Water Authority, 2002a; Millheim Borough Water System, 2003). On the state level, the DEP's drought management plan (DEP PA, 2001) and the law of PA Code 35, Chapter 119 (i.e., ‘Prohibition of Nonessential Water Uses in a Commonwealth Drought Emergency Area,’ see Pennsylvania Emergency Management Agency and Council, 1985) were utilized.

Interviews with experts were conducted to capture and represent their how/why-level knowledge, and subsequently to evaluate and use the existing knowledge representation in GeoAgentKS. In this research, six experts in total (i.e., four water managers and two local planners) agreed to participate in the interviews. Two water managers (i.e., those of Aaronsburg and Upper Halfmoon) participated in the knowledge-acquisition interviews. The water manager of College Township CWS and the two planners evaluated the integrated knowledge representation (i.e., the large concept map and GeoAgents' performance) within GeoAgentKS. The Millheim water manager voluntarily participated in both.

In addition to the above interviews with experts, seven Penn State students volunteered to evaluate the use of GeoAgentKS regarding knowledge understanding and use as domain non-experts.

6.3 Knowledge acquisition and representation

Chapter 5 outlined the general methods needed to conduct text analysis of documents and computer-supported interviews to capture and represent geographic knowledge. This section illustrates how these methods were used in the CWS case study.

6.3.1 Formalizing behavioral rules and concept maps from documents

Text documents were used in the case study for initially formalizing the behavioral rules encapsulated within the GeoAgents and constructing the initial concept map. It is challenging for anyone to program a perfect set of formalized rules in JESS directly from the complex documents because both the rule structure and the JESS syntax would have to be considered simultaneously. To address this, as discussed in Chapter 5, looser pseudo codes, with an emphasis on the rule structure without considering the detailed syntax, were derived first. These pseudo codes were translated into exact JESS syntax as a subsequent step. Moving from text analysis to formalized GeoAgents' behavioral rules (i.e., JESS rules) thus required three steps: (1) sentence analysis, (2) pseudo coding, and (3) rule formalization. The concept nodes describing elements in the GeoAgents' environment were built in parallel with the rule formalization.

Name of the CWS: College Township CWS
Emergency: Power outage
Corrective actions:
Station A and Station B are considerably dependent upon electric power. The operator should visually inspect the stations for smoke, fire, or alarms causing or resulting from power failure. The operator should determine whether the entire station or only portions of the station are without power. The operator should determine if the station itself or the surrounding area is without power. The operator should contact the local power company or an electrical contractor. If the power failure is the result of the local power company, the operator should call the power company and inform the power company that a part or the whole public water system has been effected, and find out how long the power outage is expected to last. If the power outage is short term (less than two hours), the existing storage facilities should be able to supply the water system. If the power outage is expected to last more than two hours, the following procedures should be followed:
1. Emergency generator: obtain, connect, and use emergency generator to operate the Station A. If Emergency generator is not an option then,
2. Emergency interconnection: operate to the extent possible.
3. Restrictive Water use: contact the local fire departments, local radio station, and newspaper.
4. If the system pressure drops below 20 PSI (Pounds per Square Inch), additional boil water order restriction must be issued and notification to the public.
5. Water hauling.

Figure 6.3: The text of the Power Outage Plan of the College Township CWS


These steps are discussed below using the ‘Power-Outage Emergency Plan’ (see Figure 6.3 for the original text) of the College Township Water Authority (labeled ‘College Township CWS’) as an example. This plan was documented as a part of the Emergency Response Plan, which also includes the response plans for Disinfection System Failure, Contamination of Supply, Source Pump Failure, and Drought Conditions. This document, as well as another standalone drought-contingency plan, was collected from the office of College Township CWS.

Sentence group 1: checking the causes of the power outage

The station is considerably dependent upon electric power. The operator should visually inspect the station for smoke, fire, or alarms causing or resulting from power failure. The operator should determine whether the entire station or only portions of the station are without power. The operator should determine if the stations or the surrounding areas are without power.

Sentence group 2: determining how long it will last

The operator should contact the local Power Company or an electrical contractor. If the power failure is the result of the local power company, the operator should call the power company and inform the power company that a part or the whole public water system has been effected, and find out how long the power outage is expected to last.

Sentence group 3: taking actions to recover water supply

If the power outage is short term (less than two hours), the existing storage facilities should be able to supply the water system. If the power outage is expected to last more than two hours, the following procedures should be followed:
1. Emergency generator: obtain, connect, and use emergency generator to operate the Station A. If Emergency generator is not an option then,
2. Emergency interconnection: operate to the extent possible
3. Restrictive water use: contact the local fire departments, local radio station, and newspaper.
4. If the system pressure drops below 20 PSI (pounds per square inch), additional boil water order restriction must be issued and notification to the public.
5. Water hauling

Figure 6.4: Sentence analysis of the Power Outage Plan of College Township CWS

6.3.1.1 Sentence analysis for planning GeoAgents' goal-driven actions

In the above Power Outage Emergency Plan, a power outage results in two simultaneous states: out of power, and water supply stopped. The primary goal for the water operator is to restore the water supply, even though restoring power can be a means to recover the water supply. Therefore, the goal state in this emergency plan is ‘water supply restored.’ To reach this goal state, the tasks and actions are identified from sentence analysis by grouping sentences and underlining key words. For instance, as shown in Figure 6.4, three high-level tasks are derived from grouping the entire text into three sets of sentences. The low-level actions are derived from the bold verbs.

Figure 6.5 shows the result of the sentence-analysis as a hierarchical structure of the goal, tasks, and actions. Three high-level tasks are identified from Figure 6.4, including (1) identifying the causes of the power outage, (2) determining how long the power outage will last, and (3) taking actions to recover the water supply. Each of the high-level tasks can be further decomposed into multiple lower level tasks, and ultimately into executable actions. For instance, the task of identifying the causes of the power outage includes two lower-level tasks: checking surroundings, and checking the stations.

If the surrounding areas are experiencing a power outage, the GeoAgent needs to send a message to the power company. Otherwise, it needs to check the station.


[Diagram with four columns: Goal, High-level tasks, Low-level tasks, Actions]

Goal state: Water Supply Recovered

Task 1: To identify the causes of the power outage
    Check surroundings
        • No power: assert power company failure
        • Power: assert station failure
    Check the stations
        • Check fire, smoke, alarm
        • Check failure: entire or partial

Task 2: To identify how long it will last
    Station failure
        • Partial & no fire: < 2 hours
        • Otherwise: > 2 hours
    Power comp. failure
        • Inform the power failure
        • Ask how long it will last

Task 3: To recover water supply
    Use the stored water
        • Recovered < 2h, goal reached
    Use generator
        • Check availability
        • Available: goal reached
        • Not available >> interconnection
    Use interconnection
        • Check extent
        • Operate: goal reached
    Restrict water use
        • Contact fire Dep. & media
    Water hauling
        • Goal reached
    Check water pressure
        • PSI > 20: do nothing
        • PSI < 20: contact media

Figure 6.5: Planning the GeoAgent's goal-driven actions from the Power-Outage Emergency Plan of the College Township CWS
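The goal, task, and action hierarchy of Figure 6.5 can also be written down as a nested data structure. The Python sketch below is one possible encoding, offered only as an illustration: the key names and the wording of the actions are paraphrased for the example and are not identifiers used in GeoAgentKS.

```python
# Goal -> high-level tasks -> low-level tasks -> actions (labels are illustrative).
PLAN = {
    "goal": "water_supply_recovered",
    "tasks": {
        "identify_causes": {
            "check_surroundings": ["assert power-company failure if no power",
                                   "assert station failure otherwise"],
            "check_stations": ["check fire, smoke, alarm",
                               "check whether failure is entire or partial"],
        },
        "identify_lasting_time": {
            "station_failure": ["estimate < 2 hours if partial and no fire",
                                "estimate > 2 hours otherwise"],
            "power_company_failure": ["inform the power company",
                                      "ask how long the outage will last"],
        },
        "recover_water_supply": {
            "use_stored_water": ["goal reached if power recovers within 2 hours"],
            "use_generator": ["check availability"],
            "use_interconnection": ["check extent", "operate"],
            "restrict_water_use": ["contact fire departments and media"],
            "water_hauling": ["haul water"],
            "check_water_pressure": ["contact media if pressure < 20 PSI"],
        },
    },
}


def iter_actions(plan):
    """Yield (high-level task, low-level task, action) triples in plan order."""
    for high, lows in plan["tasks"].items():
        for low, actions in lows.items():
            for action in actions:
                yield high, low, action


actions = list(iter_actions(PLAN))
```

Walking the structure top-down in this way mirrors the decomposition from high-level tasks to executable actions described in the text.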

To encode these actions as behavioral rules, the key words appearing in the text are highlighted according to their types. For example, the clauses introduced by ‘whether’ and ‘if’ (enclosed in boxes) denote the condition parts of the GeoAgent's condition-action rules. The verbs, such as ‘inspect,’ ‘determine,’ ‘find out,’ ‘contact,’ ‘call,’ or ‘inform,’ are in bold to denote the action parts of the rules. The nouns are underlined; these denote the subjects of specific actions within the rules, and are also mapped into the concept map as concept nodes (see Figure 4.5). Some of these nouns may also be GeoAgents. The critical states of the environmental elements are in italics to denote their possible values (i.e., FACTS) in given conditions.
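The highlighting scheme described above was applied by hand in this research. Purely as an illustration of the idea, the Python sketch below tags condition indicators and action verbs in a sentence by keyword lookup; the word lists are taken from the examples in the text, while the function name and the matching strategy are assumptions made for this sketch.

```python
import re

# Word lists taken from the examples named in the text.
CONDITION_INDICATORS = {"if", "whether"}
ACTION_VERBS = {"inspect", "determine", "find out", "contact", "call", "inform"}


def tag_sentence(sentence: str) -> dict[str, list[str]]:
    """Return the condition indicators and action verbs found in a sentence."""
    lowered = sentence.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    return {
        "conditions": sorted(CONDITION_INDICATORS & words),
        # Whole-word search so that multi-word verbs such as 'find out' match.
        "actions": sorted(
            verb for verb in ACTION_VERBS
            if re.search(rf"\b{re.escape(verb)}\b", lowered)
        ),
    }


tags = tag_sentence(
    "The operator should determine whether the entire station or only "
    "portions of the station are without power."
)
```

A keyword lookup of this kind would only produce candidates; deciding which clauses form a rule's condition and which verbs form its actions still requires the manual sentence analysis described in this section.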

6.3.1.2 Pseudo coding

The pseudo codes are derived from the conceptual outline constructed via the sentence analysis (see Figure 6.5). These pseudo codes can be simply a list of key words in a text editor, e.g., the required rules for accomplishing a task, the key conditions in the ‘IF’ part of a rule, or the needed actions in the ‘THEN’ part; see examples below.

(1) the goal and high-level tasks of the power outage plan:

    IF (Power == power_outage)
    StateUpdating:
        (1) power_outage = true
        (2) goal = water_supply_recovered
        (3) goal_status = not_achieved
    Tasks:
        (1) identify causes
        (2) identify lasting time
        (3) to recover water supply
        (4) determine boil water order

(2) the task of recovering water supply:

    IF (task == recover_water_supply && goal_status == not_achieved)
    Tasks:
        (1) use the stored water
        (2) check generator
        (3) check interconnection
        (4) restrict water use
        (5) check system pressure
        (6) water hauling
        (7) withdraw the current task

(3) the actions of using generator (i.e., from the other part of the emergency plan):

    IF (task == check_generator && emergency_generator == available && goal_status == not_achieved)
    Actions:
        (1) check the generator's fuel and oil level
        (2) connect generator leads to external terminal box
        (3) put Off-Auto switches for High Service pumps to the Off position
        (4) use key interlock to open the main circuit breaker
        (5) use key interlock to close the standby generator breaker
        (6) start generator when the breaker is closed
        (7) goal reached: water_supply_recovered
        (8) withdraw the current task
        (9) withdraw the current goal
    StateUpdating:
        (1) goal_status = achieved

6.3.1.3 Formalizing executable GeoAgents' behavioral rules

Using the conceptual rule structures revealed in the pseudo codes, executable behavioral rules can be formalized with consideration of the syntax of the expert system (JESS). The example below shows three of the formal JESS rules derived from the pseudo codes shown in the previous section.

(1) The rule of inspecting power status in the GeoAgent of College Township CWS

(defrule inspect_station_power ""
    ; Fires when a power-outage fact is present in working memory.
    ?powerStatus <- (power outage)
    =>
    (printout t ">Inspect power: CollegeTownship water authority power outage! " crlf)
    ; Set the goal and mark it as not yet achieved.
    (assert (goal (goal_state water_supply_recovered)))
    (assert (goal_achieved (status FALSE)))
    (printout t ">Goal state: water_supply_recovered" crlf)
    ; Generate the four high-level tasks.
    (assert (task_is_to (task examine) (argument_1 causes_of_power_outage)))
    (assert (task_is_to (task identify) (argument_1 power_outage_lasting_time)))
    (assert (task_is_to (task recover) (argument_1 water_supply)))
    (assert (task_is_to (task determine) (argument_1 boil_water_order)))
)

(2) The rule of recovering water supply in the GeoAgent of College Township CWS

(defrule recover_water_supply ""
    ; Fires while the goal is unmet, the recovery task is active,
    ; and the outage is expected to last more than two hours.
    ?goal <- (goal_achieved (status FALSE))
    ?task1 <- (task_is_to (task recover) (argument_1 water_supply))
    ?lasting_time <- (power_outage more_than_two_hours)
    =>
    (printout t ">Power outage will last more than two hours" crlf)
    (assert (water_system_status (condition emergency)))
    (printout t ">The water system is under an emergency condition" crlf)
    ; Decompose the recovery task into lower-level tasks.
    (assert (task_is_to (task use) (argument_1 emergency_generator)))
    (assert (task_is_to (task check) (argument_1 interconnections)))
    (assert (task_is_to (task launch) (argument_1 water_hauling)))
    (assert (task_is_to (task restrict) (argument_1 water_use)))
    ; Withdraw the decomposed task so the rule does not fire again.
    (retract ?task1)
)


(3) The rule of using emergency generator in the GeoAgent of College Township CWS (i.e., derived from the content beyond Figure 6.3)

(defrule use_emergency_generator ""
    ; Fires while the goal is unmet, the generator task is active,
    ; and an emergency generator is available.
    ?goal <- (goal_achieved (status FALSE))
    ?task1 <- (task_is_to (task use) (argument_1 emergency_generator))
    ?availability <- (emergency_generator available)
    =>
    (printout t "::Need to operate the emergency generator" crlf)
    (printout t "__Check the generator's fuel and oil level. " crlf)
    (printout t "__Connect generator leads to external terminal box. " crlf)
    (printout t "__Put Off-Auto switches for High Service pumps to the Off position. " crlf)
    (printout t "__Hand-Off-Auto switch for well pump to off position. " crlf)
    (printout t "__Use Key interlock to open the main (normal power) circuit breaker. " crlf)
    (printout t "__Use Key interlock to close the standby generator breaker. " crlf)
    (printout t "__When the breaker is closed (turned on), start generator. " crlf)
    (printout t "__The generator started. " crlf)
    (printout t "__Now using the emergency generator..." crlf)
    (assert (generator (status running)))
    (printout t "> The emergency generator is running" crlf)
    ; Withdraw the task, record the goal as achieved, and retract the old goal status.
    (retract ?task1)
    (assert (goal (goal_state water_supply_recovered) (status TRUE)))
    (assert (goal_achieved (status TRUE)))
    (retract ?goal)
)
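To make the control flow of the three rules above easier to follow outside JESS, the sketch below re-expresses them as a tiny forward-chaining loop in Python. It is not how GeoAgentKS executes rules: facts are plain tuples, each rule fires at most once (a simplification of the way a rule engine avoids re-firing on the same facts), and the `run` function and the tuple layout are assumptions made for this sketch.

```python
def inspect_station_power(wm: set) -> bool:
    """Rule 1: a power outage creates the goal and the high-level tasks."""
    if ("power", "outage") not in wm:
        return False
    wm.update({
        ("goal", "water_supply_recovered"),
        ("goal_achieved", False),
        ("task", "examine", "causes_of_power_outage"),
        ("task", "identify", "power_outage_lasting_time"),
        ("task", "recover", "water_supply"),
        ("task", "determine", "boil_water_order"),
    })
    return True


def recover_water_supply(wm: set) -> bool:
    """Rule 2: decompose the recovery task when the outage is long."""
    needed = {("goal_achieved", False), ("task", "recover", "water_supply"),
              ("power_outage", "more_than_two_hours")}
    if not needed <= wm:
        return False
    wm.update({
        ("water_system_status", "emergency"),
        ("task", "use", "emergency_generator"),
        ("task", "check", "interconnections"),
        ("task", "launch", "water_hauling"),
        ("task", "restrict", "water_use"),
    })
    wm.discard(("task", "recover", "water_supply"))  # like (retract ?task1)
    return True


def use_emergency_generator(wm: set) -> bool:
    """Rule 3: run the generator and mark the goal as achieved."""
    needed = {("goal_achieved", False), ("task", "use", "emergency_generator"),
              ("emergency_generator", "available")}
    if not needed <= wm:
        return False
    wm.add(("generator", "running"))
    wm.discard(("task", "use", "emergency_generator"))
    wm.discard(("goal_achieved", False))
    wm.add(("goal_achieved", True))
    return True


def run(wm: set, rules: list) -> list:
    """Fire each rule at most once, repeating until no rule can fire."""
    fired, pending, progress = [], list(rules), True
    while progress:
        progress = False
        for rule in list(pending):
            if rule(wm):
                fired.append(rule.__name__)
                pending.remove(rule)
                progress = True
    return fired


memory = {("power", "outage"), ("power_outage", "more_than_two_hours"),
          ("emergency_generator", "available")}
fired = run(memory, [inspect_station_power, recover_water_supply,
                     use_emergency_generator])
```

Starting from a working memory that contains only the outage, its expected duration, and generator availability, the three rules chain: each one asserts the facts that the next one matches on.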

To represent the dynamic nature of the GeoAgents' internal knowledge, it is important that each GeoAgent maintains its own internal state, as well as knowledge of environmental (external) states that are relevant to its tasks. For instance, if a power outage (i.e., an external state) is identified, the relevant GeoAgent needs to generate a new goal and a set of high-level tasks (i.e., internal states). These external and internal states are stored in the GeoAgents' memory. Once the environmental condition is changed, the goal is reached, or a high-level task is decomposed into lower-level tasks or actions, then the old condition, goal, or task needs to be deactivated to avoid redundant execution of the related rules. This kind of distributed knowledge is one reason why inter-agent communication is critical.

Figure 6.6: Constructing concept maps to establish the GeoAgent’s environmental conditions from the text documents

6.3.1.4 Formulating an initial concept map

As discussed in Chapter 5, formalizing GeoAgents’ behavioral rules and constructing the initial concept maps proceed in parallel because these two types of knowledge are highly interrelated. GeoAgents ‘sense’ their environmental conditions (i.e., FACTS) and fire their internal rules via checking the relevant concept nodes (see Chapter 4). As shown in Figure 6.4, the underlined nouns (e.g., ‘power,’ ‘stations,’ ‘surrounding areas,’ ‘system pressure,’ and ‘emergency generator’) can be used to describe the environmental conditions in the concept map. For example, as shown in Figure 6.6, the concept nodes, such as ‘power,’ ‘surroundings,’ ‘station,’ ‘emergency_generator,’ and ‘system_pressure’ (i.e., the brown, round-corner rectangle concept nodes) are used to describe the power-outage-related status of the water system. Their conditions can be specified inside the concept nodes. For instance, ‘system_pressure’ can be ‘below_20_PSI’ or ‘above_20_PSI’; ‘surroundings’ can be ‘no_power’ or ‘normal’; and ‘emergency_generator’ can be ‘available’ or ‘not_available.’ The relevant GeoAgents are mapped with the blue sharp-corner rectangle concept nodes. To make these concept nodes ‘understandable’ to the GeoAgent, the terms used in the concept nodes have to be the same as those used in the GeoAgent’s internal rules. The green oval concept of ‘Power Outage’ is considered to be an event.
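The requirement that concept-node terms match the terms used in the rules can be checked mechanically. The Python sketch below illustrates one way to do so, using the node names and states quoted above; the data layout and the function name are assumptions made for this example and do not describe how GeoAgentKS stores concept maps.

```python
# Concept nodes and their allowed states, as quoted in the text.
concept_nodes = {
    "system_pressure": {"below_20_PSI", "above_20_PSI"},
    "surroundings": {"no_power", "normal"},
    "emergency_generator": {"available", "not_available"},
}

# (node, state) pairs referenced by rule conditions; the last entry
# deliberately misspells the node name to show what the check catches.
rule_conditions = [
    ("emergency_generator", "available"),
    ("surroundings", "no_power"),
    ("emergency_generators", "available"),
]


def unmatched_conditions(nodes: dict, conditions: list) -> list:
    """Return rule conditions whose node or state has no match in the concept map."""
    return [(node, state) for node, state in conditions
            if state not in nodes.get(node, set())]


problems = unmatched_conditions(concept_nodes, rule_conditions)
```

Any pair reported by such a check would be a condition that the GeoAgent could never ‘sense’ from the concept map as written.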

6.3.2 Capturing experts' how/why knowledge in computer-supported interviews

As noted in the previous chapter, in addition to the document-derived knowledge, computer-based concept-mapping interviews were conducted in this case study to capture experts' up-to-date, detailed, and individualized understanding of the geographic processes relevant to their CWSs. The method used was to ask the experts first to verify the document-derived knowledge representation, then to add new contents to it utilizing their own knowledge, and finally to verify the entire resulting concept map. Because it was found that not every organization has documented regulations and plans available, as was the case for the Aaronsburg CWS, the interviewee sometimes had to construct the concept map from scratch. In this case, an initial concept map from another CWS was used as an example for the purpose of explaining the idea of concept maps to the water manager. The following subsections describe the three procedures relevant to knowledge acquisition: the question design, the interviews for developing the concept maps and GeoAgent rules, and the validation.

6.3.2.1 Question design

The questions used during the concept-mapping interviews were designed to capture three types of interrelated knowledge specific to the given knowledge domain: (1) the existing social and natural elements related to the relevant CWS, (2) the significant changes or events that happened or can happen to these elements, and (3) the cause-and-effect relationships among these changes or events. As mentioned in the previous chapter, the questions used in the case study interviews were, of necessity, very broad, with the primary intention of prompting the interviewees to describe the potential phenomena of their own CWSs. As much latitude as possible was given to allow the experts to describe their knowledge domain in terms that were natural to them and to describe the elements they felt were most important.

The following are the questions that were used for the knowledge-acquisition interviews.

• Using the concept map diagram, describe the current status of your water system regarding the water sources, geology, filtration, personnel, customers, and interconnections.
• Where are the above elements? Can you mark them on the paper map?
• Please describe several interesting social or natural events that happened in the past that affected your water system, such as drought, cold, storm, flood, lightning, contamination, or any social changes.
• Please describe on the concept map why your water system was affected by these events, and how your system responded.

The concept map at this stage is used, in graphic form, as an overall representational device to aid the knowledge-acquisition process.

6.3.2.2 Developing concept maps in computer-supported interviews

At the beginning of each interview, the water manager was told that the interview would be computer-based. A printed sample concept map was shown to demonstrate how to make a concept map using GeoAgentKS. Generally, this sample concept map was derived from the documents of the interviewee’s CWS and used as the starting point (e.g., Figure 6.6). Once the water manager understood the purpose and the method of the interview, the pre-designed questions given above were asked in order to start the concept mapping process.

Figure 6.7: The author (front) interviews a water manager (back) using computer-based concept mapping on May 31, 2004

As shown in Figure 6.7, during a knowledge-acquisition interview, a water manager sat beside and watched the author operating GeoAgentKS on a laptop computer. The water manager's answers to the questions were represented as the organized concept nodes and relations in a concept map as the interviewee watched. Usually, the concept-mapping process required the interviewer to follow the general questions above with simple questions representing specific components of the more general question. This technique allowed the interviewee to focus on specific aspects of his system, when needed. Such questions included: What is the primary water source? How many water customers are served in this water system? Are there any interconnections with other water systems for water sharing?

Using this approach, the water managers then started to think of the terms for the concept nodes and of the relations to describe the status of the water system on the concept maps. When the water manager mentioned a particular geographic feature (e.g., water source), he was asked to mark the location of this feature on the prepared paper map.

After finishing the description of the current water-system status, the water manager was asked about historical events that affected the water system and influenced its current state. The water manager usually would describe several specific past or current events, e.g., ‘We are abandoning the old primary well this year,’ ‘We built a new filtration plant in the last year,’ or ‘The snowstorm in 1995 severely affected our system.’ The author then picked one topic and asked further how/why questions, such as ‘Why do you have to abandon the old primary well?’ or ‘How did your system respond to the snowstorm in 1995?’ Again, the water manager answered these questions by making new concept nodes, or linking existing ones, in the concept map.

Three examples of the computer-based concept-mapping interviews are illustrated in Figure 6.8 ((a), (b), and (c)) below.


Figure 6.8: (a) A part of the concept map of Millheim CWS

Figure 6.8 (a) illustrates a part of the concept map made by the water manager for the GeoAgent of the Millheim Borough Water System (labeled ‘Millheim_CWS’ in the concept map). In the left branches of this concept map, the water manager explained why Philips Creek, instead of Elk Creek, was utilized as the primary source. He also mapped how Millheim CWS responded to the drought event in 1999. As shown in Figure 6.8 (a), Millheim CWS uses surface water sources, including Philips Creek and Elk Creek. Philips Creek is used as the primary water source because it is surrounded by forest, which protects it from pollution. In contrast, Elk Creek is sensitive to pollutants from agricultural and natural sources, which are mostly caused by pesticides, chemical fertilizers, or heavy rains. In 1999, a drought severely affected Philips Creek and resulted in the stream flow being below the level required to meet water demands. Elk Creek was also influenced by this drought (see the comments added by the water manager for the concept node ‘less flow’), but its flow remained sufficient to provide the required water supply. During the drought, therefore, the primary water source was switched from Philips Creek to Elk Creek.
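The cause-and-effect chain in this part of the concept map can be treated as a small directed graph and traced backward to answer a ‘why’ question. The Python sketch below illustrates the idea; the node labels, the single ‘caused’ relation, and the `causes_of` function are simplifications invented for this example and do not reproduce the actual nodes of Figure 6.8 (a).

```python
# (cause, relation, effect) links paraphrasing the Millheim drought account.
edges = [
    ("drought_1999", "caused", "Philips_Creek_flow_below_demand"),
    ("Philips_Creek_flow_below_demand", "caused",
     "primary_source_switched_to_Elk_Creek"),
    ("drought_1999", "caused", "Elk_Creek_less_flow"),
]


def causes_of(target: str, links: list) -> list:
    """Trace backward from an effect and return the links that lead to it."""
    chain, frontier, seen = [], [target], set()
    while frontier:
        node = frontier.pop()
        for link in links:
            cause, _, effect = link
            if effect == node and link not in seen:
                seen.add(link)
                chain.append(link)
                frontier.append(cause)
    return chain


why_switched = causes_of("primary_source_switched_to_Elk_Creek", edges)
```

The returned chain runs from the immediate cause back to the drought, which corresponds to reading the concept map against the direction of its arrows.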


Figure 6.8: (b) The concept map of Upper Halfmoon CWS

The concept map shown in Figure 6.8 (b) was developed in an interview with the manager of the Upper Halfmoon CWS. This concept map represents the water manager's understanding of how and why the CWS had been developed and modernized from a tiny water company over the past 40 years, with all of the component elements and their interrelationships (i.e., his understanding of the process). His understanding, represented in this concept map, is summarized as follows:

In 1964, a drought triggered the initial formation of the water company, which only served 26 households in a tiny rural community at the beginning. Until 1990, the water quality had been poor due to geological impacts on the two old wells. On the other hand, because of the rapid growth of nearby State College since the 1970s, Upper Halfmoon Township has also become a fast-growing area in terms of population and rural development, which in turn has increased demand for high water quality and a stable water supply. In 1990, cooperation between the water company and the real-estate developers was a ‘win-win’ decision, which resulted in the abandonment of the two old wells and the development of two new wells. These two wells have provided high water quality and a stable water supply. Because of better water sources, this town was able to attract and support continuing population increases, which in turn brought more income to the CWS. Therefore the company invested more money to rebuild its infrastructure and expand the water system. With 14 years of system expansion and modernization since 1990, this CWS now serves more than 550 households. Overall, the Upper Halfmoon CWS is currently in good shape in terms of water sources and revenue.

Figure 6.8: (c) The concept map of Aaronsburg CWS

The concept map in Figure 6.8 (c) shows more complex human-environment relations related to the Aaronsburg CWS, which is a small private water system serving about 400 people in eastern Centre County. During the interview with the manager for this water system, he explained why this CWS was trying to abandon the old primary water source well #8:

This well is located on a limestone aquifer. Naturally occurring acid-bearing water dissolves the limestone and produces sinkholes. Because of the sinkholes, surface water can flow into the ground and result in direct surface influence to the water source. Therefore the water quality in well #8 is poor.

This well was drilled in 1988, and people knew of this water-quality problem from the very beginning. The well had nevertheless been utilized as the primary water source because small water systems (e.g., Aaronsburg CWS) were not required to do SWIP (Surface Water Identification Protocol) testing at that time. In 1996, however, the federal Safe Drinking Water Act Amendments (SDWAA) were enacted, requiring every public water supplier to meet federal quality standards. To comply with this law, Pennsylvania DEP conducted the first SWIP test on well #8 in 1997. The results prompted DEP to suggest that the Aaronsburg CWS pump less water from well #8, adopt a boil water advisory in the short term, and find new water sources as a longer term solution. In 1998, Aaronsburg CWS adopted a boil water advisory, and obtained water from the nearby Millheim CWS.

The Aaronsburg CWS drilled a new well (i.e., well #4) on a sandstone aquifer in 2002. The DEP did a SWIP test on this new well in the same year and determined that this new well was also under direct surface influence. As a result, the Aaronsburg CWS decided to build a new filtration plant in 2003. Well #4 has nevertheless continued to be the primary water source. The Aaronsburg CWS started to abandon well #8 in 2004.

Because of the cost of the new well and the new filtration plant, the price of water is expected to increase to almost 30 times higher than before these expenses (i.e., $0.16 per thousand gallons vs. an estimated $4.50 per thousand gallons on completion of the project).

6.3.2.3 The experts' validation of the concept maps at the end of knowledge-acquisition interviews

After the concept-map development, as discussed in Chapter 5, the interviewee was asked to review the entire concept map and to validate (1) whether GeoAgentKS was able to capture his/her knowledge in the computer-based interview, and (2) how much of his/her knowledge was captured. If the knowledge representation was incomplete, the interviewee was asked to identify the reason.

The assessment of the satisfaction of these experts for whether GeoAgentKS could adequately capture their knowledge was scored with ‘yes’ or ‘no.’ All three water managers (100%) answered ‘yes.’ For example, the following is a dialogue between the author and the water manager of Upper Halfmoon CWS:

Author: "If we have enough time (because the interview ran out of time), do you think using this (concept-mapping) tool can capture what you are thinking?"

Manager: "Oh, yeah, easily. Well, I shouldn't say ‘easily’; it takes some time. But I think it could certainly do it. Yeah, that is an interesting tool there. It is interesting, I mean, that final product there. I had not really looked at the history, you know, the causes and effects before. So, yeah, I think you could capture."

To assess how much of the interviewee's knowledge was captured in the concept map, the question asked was: ‘Can you use a percentage to estimate how much of your knowledge relevant to the topics in this interview was captured in the concept map?’ All the interviewees could provide an estimate as a percentage. Table 6.1 summarizes the lengths of the interviews, the percentages of captured information, and the reasons for incompleteness. For the Aaronsburg and Upper Halfmoon CWSs, the interviews took 75 and 70 minutes respectively, and the water managers believed that the concept maps captured about 70% and 85%, respectively, of what they intended to express. The reason for the incompleteness in both cases was a lack of time. The interview with the water manager of Millheim CWS voluntarily ran much longer (about 110 minutes), and the water manager believed that 100% was captured.

Table 6.1: The duration and percentage of information captured in the three computer-based concept-mapping interviews

Name of CWS                         Interview duration   Captured information   Reasons for incompleteness
Aaronsburg Waterpipes Corporation   75 minutes           70%                    Out of time
Upper Halfmoon Water Association    70 minutes           85%                    Out of time
Millheim Borough Water System       110 minutes          100%                   -

6.4 Integration of diverse knowledge representations and geospatial data

After the knowledge-acquisition phase, the knowledge engineer integrated the separate knowledge representations derived from text and expert interviews into a single system representing knowledge about all CWSs within the study area. The relevant geospatial data to support these knowledge representations were subsequently collected.

6.4.1 Integrating diverse knowledge representations

As discussed in Chapter 5, to integrate the independent knowledge representations of the individual CWSs, the knowledge engineer needs to (1) build inter-GeoAgent communication linkages to allow GeoAgents to interact with each other, and (2) link the separate concept maps to a single, integrated concept map. These two steps are described below.

As shown in Figure 6.3, the operation documentation of a CWS defines when the CWS needs to communicate with other agencies. For example, when a power outage is expected to last more than two hours, College Township CWS needs to contact the fire department, local radio station, and newspaper, and check the emergency interconnections. These communications indicate the social relationships among these agencies. To represent such social relationships, the mechanism of inter-GeoAgent communication needs to be established in the GeoAgents’ internal rules. Within GeoAgentKS, as discussed in Chapter 4, both the ACL and KQML agent communication languages are implemented within MadKit (see Appendix A.2). In the current research, the ACL language was utilized.

(a) CollegeTownship_CWS

(defrule check_system_pressure
  IF (task == check_system_pressure && system_pressure == below_20_PSI)
  Actions:
  (1) query the names of the newspaper and radio station
  (2) sendACLMessage to the newspaper and radio station => action: broadcast; content: boil_water_needed_for_CollegeTownship_water_users
  (3) withdraw the current task
)

(b) CentreDailyTimes

(defrule broadcast_the_requested_notification ""
  IF (action_in_received_message == broadcast)
  Actions:
  (1) check the content of the received message
  (2) broadcastMessage: the content of the received message
  (3) withdraw the message
)

Figure 6.9: A portion of the pseudo code used for development of the communication rules in the CollegeTownship_CWS and Centre_Daily_Times GeoAgents

Figure 6.9 shows part of the pseudo code for the communication between the GeoAgents for CollegeTownship_CWS and the print news medium CentreDailyTimes concerning public notification of the boil-water order. According to the text document (Figure 6.3), if the system pressure is below 20 PSI, the public must be notified of the boil-water order. The knowledge engineer can subsequently build the rules for the communication between these two GeoAgents. For example, Figure 6.9 (a) shows a rule in the GeoAgent of CollegeTownship_CWS that checks the system pressure from the concept map (see Figure 6.6). If the system pressure is ‘below_20_PSI,’ the CWS must notify the newspaper and the radio station. As discussed in Chapter 3, each ACL message contains the sender, the receiver, the action (e.g., A requests B to perform action X), and the content. For instance, the message sent by CollegeTownship_CWS in Figure 6.9 (a) includes the action ‘broadcast’ and the content ‘boil_water_needed_for_CollegeTownship_water_users.’ Figure 6.9 (b) shows that once a particular news medium, such as Centre_Daily_Times (i.e., a local newspaper), receives a message with the action ‘broadcast,’ it checks the content of the message and broadcasts this message to the public (i.e., all the GeoAgents under the relevant groups).
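The exchange described above can be illustrated with a minimal Python sketch. It is not the GeoAgentKS implementation: the class names (AclMessage, NewsMediaAgent, CwsAgent) and the direct method call used for message delivery are assumptions made for illustration only. The message fields (sender, receiver, action, content) and the two rule conditions follow the description of Figure 6.9.

```python
from dataclasses import dataclass, field

# Content string taken from the rule in Figure 6.9 (a).
BOIL_WATER_NOTICE = "boil_water_needed_for_CollegeTownship_water_users"


@dataclass(frozen=True)
class AclMessage:
    """Hypothetical container for the four message parts named in the text."""
    sender: str
    receiver: str
    action: str
    content: str


@dataclass
class NewsMediaAgent:
    """Illustrative stand-in for a news-media GeoAgent."""
    name: str
    broadcasts: list = field(default_factory=list)

    def receive(self, message: AclMessage) -> None:
        # Mirrors rule (b): act only on messages whose action is 'broadcast'.
        if message.action == "broadcast":
            self.broadcasts.append(message.content)


@dataclass
class CwsAgent:
    """Illustrative stand-in for a CWS GeoAgent."""
    name: str
    media: list  # NewsMediaAgent instances to notify

    def check_system_pressure(self, system_pressure: str) -> list:
        # Mirrors rule (a): notify the media only when pressure is below 20 PSI.
        sent = []
        if system_pressure == "below_20_PSI":
            for outlet in self.media:
                message = AclMessage(self.name, outlet.name,
                                     "broadcast", BOIL_WATER_NOTICE)
                outlet.receive(message)  # direct call stands in for delivery
                sent.append(message)
        return sent


newspaper = NewsMediaAgent("CentreDailyTimes")
cws = CwsAgent("CollegeTownship_CWS", [newspaper])
cws.check_system_pressure("below_20_PSI")
print(newspaper.broadcasts)
```

In this sketch the sender never inspects the receiver's rules; it only states the requested action and the content, which is the separation that the two rules in Figure 6.9 rely on.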

[Figure 6.10 graphic. Recoverable node labels: DEP_PA, Pennsylvania, Centre_County, Millheim_CWS, Aaronsburg_CWS, UpperHalfmoon_CWS, sources, pipes, connections, filtration plant, personnel, 550 households, three wells. Recoverable link labels: serves, includes, has.]

Figure 6.10: Linking the separate concept maps of individual CWSs to shared concept nodes

As noted in Chapter 5, separate concept maps are linked together via the concept nodes that are common to (i.e., shared by) two or more concept maps. For example, because all of the CWSs studied in this research are located in Centre County, Pennsylvania, as shown in Figure 6.10, the knowledge engineer was able to link the concept maps of individual CWSs to the concept node ‘Centre_County’ as a common, higher-level node, and linked ‘Centre_County’ with ‘DEP_PA’ via ‘Pennsylvania.’ The merged concept map was then verified during the evaluation interviews with the experts (who did not make new concept maps). Generally, the experts agreed with the contents represented in the merged concept map except for two elements, which were modified using their input. First, the experts suggested linking ‘DEP_PA’ with the concept nodes of individual CWSs directly (see Figure 4.2) because each CWS has a much stronger relationship with DEP than with any county-level government agency. Second, the water manager of CollegeTownship_CWS noted that they had drilled a new backup well just a couple of months before the interview and were trying to make new emergency plans for this well. The concept map was therefore modified to include a node for that new well.
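The linking step can be sketched in Python by treating each concept map as a set of (node, link, node) triples, so that maps join wherever they use the same node name. This is only an illustrative data structure, not the one used in GeoAgentKS. The function names and the link labels attached to the CWS nodes (‘includes’ and ‘uses’) are assumptions. The node names come from the text.

```python
def nodes_of(concept_map):
    """Return every node name appearing in a set of (node, link, node) triples."""
    return {node for subject, _, obj in concept_map for node in (subject, obj)}


def shared_nodes(map_a, map_b):
    """Nodes common to two concept maps; these are the candidate linking points."""
    return nodes_of(map_a) & nodes_of(map_b)


def merge_concept_maps(*concept_maps):
    """Union of all triples; maps become connected through identically named nodes."""
    merged = set()
    for concept_map in concept_maps:
        merged |= set(concept_map)
    return merged


# Illustrative fragments only; the link labels here are assumed, not from the thesis.
state_map = {
    ("DEP_PA", "serves", "Pennsylvania"),
    ("Pennsylvania", "includes", "Centre_County"),
}
millheim_map = {
    ("Centre_County", "includes", "Millheim_CWS"),
    ("Millheim_CWS", "uses", "Philips_Creek"),
}
aaronsburg_map = {
    ("Centre_County", "includes", "Aaronsburg_CWS"),
    ("Aaronsburg_CWS", "uses", "well_8"),
}

merged = merge_concept_maps(state_map, millheim_map, aaronsburg_map)
print(shared_nodes(millheim_map, aaronsburg_map))
```

The experts' first correction (linking ‘DEP_PA’ directly to each CWS node) would amount to adding one triple per CWS to the merged set in this representation.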

Once the concept-map integration is complete, the knowledge engineer can link the rule files of specific GeoAgents with their corresponding concept nodes (see Figure 4.3). The integration of GeoAgents’ behavioral rules and the graph-based concept maps is thus achieved in three ways. First, many concept nodes are reinforced by the behavioral rules, because the environmental conditions for the IF parts of the relevant rules are retrieved from the concept nodes. Second, knowledge represented within the concept maps can be shared among GeoAgents. Third, knowledge contained in the concept maps can be used to generate new rules for GeoAgents as the result of some specific analysis. For example, as learned from the concept map made by the water manager of Millheim CWS, surface water sources are usually sensitive to drought and nearby land use. From the Aaronsburg concept map, the chemical process in the limestone aquifer was shown to often result in poor groundwater quality. Thus it is possible for the knowledge engineer to generalize a set of new rules relating water quality to environmental elements for the GeoAgents. If a CWS is planning to search for new water sources, it can learn from the lessons provided by other CWSs (as stored in the concept maps and GeoAgents) to decide what type of water source is more appropriate, where the new source should be placed, or what problems it could encounter as part of this search.

6.4.2 Integrating the knowledge representation with the database

As also discussed earlier, data collection takes place after knowledge acquisition and integration. Examples of integrating GeoAgents with the database were introduced in Chapter 4. The data used in the CWS case study were collected from the Internet or digitized from the relevant documents or the paper maps marked by the water managers during the interviews. For example, the data on land use and land cover, geology, surface water, cities, and roads were collected from PASDA (Pennsylvania Spatial Data Access, http://www.pasda.psu.edu/). Furthermore, drought was clearly identified as having a major impact on water supply in this area of Pennsylvania. The weekly drought data of Centre County were therefore collected from multiple Websites, including the drought information center of the DEP (http://www.dep.state.pa.us/dep/subject/hotopics/drought/DroughtTech.htm), and the USGS (US Geological Survey, e.g., at http://pa.water.usgs.gov/gw_report/index.html, http://pa.water.usgs.gov/ar/wy99/susq_intro.html, and http://pa.water.usgs.gov/monitor). The weekly Palmer Drought Severity Index (PDSI) from the Climate Prediction Center was also utilized (see http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/regional_monitoring/palmer/1999/weekly_PALMER_1999.shtml).

In the remainder of this subsection, two examples are provided to illustrate how the database is used to support the knowledge representation in the interview-derived concept maps. A third example is also given in the next section, which illustrates a more complex simulation application relevant to drought management.

Figure 6.11: (a) Integrating the database with the interview-derived concept map of Aaronsburg CWS

As shown in Figure 6.11 (a, b), geospatial data were integrated with the concept maps made by the water managers of Aaronsburg CWS and Millheim CWS. For Aaronsburg CWS, groundwater is used as the primary water source,² so geology data were collected to support the knowledge representation. From Figure 6.11 (a), the user can observe an overlapping relation between well #8 and the limestone aquifer in the spatial map. With such spatial relationships in mind, it is easy to understand why the physical process (i.e., carbonate weathering of the limestone aquifer) described in the concept map causes the poor water quality in well #8.

² For security reasons, the spatial information of the water sources in this dissertation is only for demonstration purposes. The water sources in the maps may not represent their actual locations.

Figure 6.11: (b) Integrating the database with the interview-derived concept map of Millheim CWS

In contrast to the Aaronsburg CWS, the primary source in the Millheim CWS is surface water. Land use and land cover data were thus collected because the surface environment can significantly affect the water sources for this CWS. The map in Figure 6.11 (b) provides the user an intuitive visualization of the spatial relationships between the water source and land cover to allow easier understanding of why different water sources were used in this CWS. For instance, Philips Creek looks much smaller than Elk Creek. Nonetheless, Philips Creek is used as the primary water source because forests surround and protect this stream. In contrast, much of the upstream portion of Elk Creek runs through cleared farmland and consequently suffers from agricultural pollution and from excessive sedimentation during heavy rains.

6.5 Evaluation and use of knowledge representation in GeoAgentKS

After completion of the knowledge and data acquisition phases, evaluation interviews were conducted to evaluate the usability of the entire integrated knowledge representation in GeoAgentKS. More specifically, as discussed in Chapter 5, the objectives of such evaluations were to examine the effectiveness of the knowledge representation within GeoAgentKS. The evaluation steps include: (1) evaluation by domain experts of the GeoAgents’ performance, (2) evaluation by non-domain experts of the effectiveness of GeoAgentKS in conveying new knowledge, and (3) scenario simulations and subsequent comparison of the performance of GeoAgentKS with the original knowledge sources and historical situations. The detailed methods and results of these evaluations are described below.

6.5.1 The experts' evaluations

As noted in Section 6.2, four experts (two water managers and two local planners) participated in the evaluation interviews to evaluate the entire integrated knowledge representation within GeoAgentKS. These four experts participated in three separate interviews. In the first two interviews, the water managers of the College Township CWS and Millheim CWS were interviewed separately. In the third interview, two local planners, who had expertise in the CWSs of the entire Centre County, were interviewed together. As described in Chapter 5, the purpose of these evaluations was to assess how reasonable the GeoAgents' actions were, and to capture the domain experts’ reaction concerning how they might see GeoAgentKS being useful.

To evaluate the performance of the GeoAgents, as discussed in Chapter 5, the method used was to ask the experts to read the environmental conditions defined in the concept nodes, and think of what the reasonable responses would be under such conditions. The same conditions were then asserted to the GeoAgents. From the rule-firing results displayed in the interface of GeoAgentKS, the experts were asked to count how many actions taken by GeoAgents they saw as wrong, missing, or incomplete. As noted in Chapter 4, an action here is defined as a suggestion made by the GeoAgent to users on what to do (see examples in Appendix C).

The experts evaluated the local emergency plans (e.g., power outage) and the drought contingency plans of the College Township CWS and Millheim CWS, as well as the state-level drought emergency regulations and plans as encoded in the knowledge base. About 70 actions were examined in total in the evaluation interviews, and none of the experts identified inappropriate actions in the outputs of rule firing. One water manager explained that the GeoAgents could indeed accurately follow what was defined in the documented regulations and plans, so no obvious errors were identified. In reality, however, water operators' actual actions could be much more flexible than the ways the GeoAgents performed in the experiments.

The experts were also asked to suggest purposes for which GeoAgentKS might be useful. One water manager said this system could be useful for larger agencies or policy makers, such as the DEP, to test their operational rules and to verify how their (current or proposed) policies would work in response to particular events, with the possibility of modifying their rules and policies if needed. Another water manager believed that GeoAgentKS was “a very useful tool for small municipalities or county level offices (with limited professional staff).” The local planners said that they face many laws and regulations every day, which change frequently and are difficult to memorize. Using this GeoAgent-based system would be valuable in supporting quick decision making in emergency situations.

6.5.2 Evaluations of non-experts

In addition to the domain experts' evaluations, this research also interviewed seven non-experts (i.e., two Ph.D. students, three master's students, and two undergraduates, recruited from different geography classes at the Pennsylvania State University) to examine whether the knowledge representation within GeoAgentKS can be easily learned and used. Before the evaluations, all the participants declared that they had little background knowledge regarding CWSs. As discussed in Chapter 5, because the novices had no background knowledge to judge the accuracy of GeoAgents' actions and only limited time was allowed in each interview, these evaluations focused on testing whether the experts’ how/why-level knowledge represented in the concept maps could be easily conveyed to these non-experts.

As also discussed in Chapter 5, the method of such evaluations was to ask the students to read the concept maps alone, and then to ask them how/why questions to test whether they truly understood the concept maps. Each student’s understanding was measured by counting how many of the key points of the expected answers were mentioned in their actual answers (see Appendix B for two examples of these interview transcripts).
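The counting procedure can be sketched as follows. This is an illustrative reconstruction, not the scoring script used in the study. The function names are assumptions, and the three coded answers are made-up placeholders showing the input format rather than interview data. Only the key-point labels are taken from Table 6.2.

```python
def tally_key_points(expected_points, coded_answers):
    """Count, for each expected key point, how many answers mention it.

    coded_answers holds one set of key-point labels per student, as judged
    by whoever codes the interview transcripts.
    """
    return {point: sum(point in answer for answer in coded_answers)
            for point in expected_points}


def percentage(count, total):
    """Share of students, rounded to a whole percent as in Table 6.2."""
    return round(100 * count / total)


# Key points for the Aaronsburg question (labels from Table 6.2).
expected = ["limestone_aquifer", "EPA_SDWAA_1996", "DEP_SWIP_1997", "new_well_4"]

# Placeholder coding of three hypothetical answers (not real interview data).
answers = [
    {"limestone_aquifer", "EPA_SDWAA_1996"},
    {"limestone_aquifer", "DEP_SWIP_1997", "new_well_4"},
    {"limestone_aquifer"},
]

counts = tally_key_points(expected, answers)
print({point: percentage(n, len(answers)) for point, n in counts.items()})
```

With seven participants, percentage(5, 7) gives 71 and percentage(6, 7) gives 86, which matches the rounding used in Table 6.2.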

Table 6.2: Summary of the students' evaluation

Questions / Expected answers                                   Students who answered correctly
                                                               Total     Percentage

Do you understand the concept maps?
    Yes                                                        7         100%
    No                                                         0

Can you explain why the smaller Philips Creek, instead of the larger Elk Creek, is used as the primary water source for Millheim CWS?
    Philips: protected by forest                               7         100%
    Elk: exposed to farmland; pollution                        7         100%

Can you explain how Upper Halfmoon CWS has developed into a modern CWS over the past 40 years?
    Drought 1964: formation of the CWS                         4         57%
    Tiny CWS, poor water quality before 1990                   7         100%
    Growth of State College since 1970s                        5         71%
    Incorporation of the CWS and the developers in 1990        6         86%
    New sources since 1990                                     6         86%
    Population growth, more income, modernization, since 1990  7         100%

Can you explain why Aaronsburg CWS was trying to abandon its previous primary water source, Well #8?
    Limestone aquifer, sinkhole, poor water quality            7         100%
    EPA SDWAA, 1996                                            5         71%
    DEP SWIP test, 1997                                        5         71%
    New well #4 (2003)                                         1         14%

Table 6.2 summarizes the interview questions, the expected key points of the answers, and the number of students who addressed these key points. All of the student participants (100%) claimed that they could understand every concept map presented to them, and all answered correctly why Philips Creek was used as the primary water source in the Millheim CWS. Although one or two points were missing from some of their answers, all of the participants could explain most of the causal relations between events and could describe how the Upper Halfmoon CWS had developed into a modern CWS over the past 40 years. For the concept map of Aaronsburg CWS, when answering why this CWS was trying to abandon Well #8, at least five of the seven participants (71%) mentioned both the geological impacts and the social influences from the DEP (i.e., SWIP test) and EPA (i.e., Safe Drinking Water Act Amendments of 1996), but only one participant mentioned the use of the new well #4.

Overall, the students agreed that by using GeoAgentKS, they could gain considerable knowledge quickly. For example, the students provided many insightful comments regarding the advantages of using GeoAgentKS for geographic representation in comparison with the data-centered representations in many current GISystem products (e.g., ArcGIS). Below are some examples:

• "I guess the (geospatial) map will show me the 'what,' the 'what' of the (water) systems, but not necessarily how things work; or the 'how' and 'why'; (such as) where things are; why they are; and how things operate within a (water) system. The (geospatial) map will show me kind of 'what' and 'where,' but not 'how' and 'why,' really."

• "The (geospatial) map shows relationships based on space, but not necessarily based on causes and effects. On the concept map, the linkages show, you know, 'this is required by that,' 'this is equal to this,' 'this includes this,' or 'has this,' or 'is based on that.' And those terms add knowledge to the concept map. But you don't get it in (geospatial) maps."

• "The GIS software would not have much information on it. It is not very meaningful if you only know where the location is. It is better to have both information (i.e., the knowledge layer) and location."

• "You can see the story, the history, very quickly… So you can see the history, why these things happened. For example, I can see here, because of the vulnerability to drought (i.e., the drought of 1999 on the Millheim concept map), they switched the source (from Philips Creek) to Elk Creek."

• "I think it (i.e., the concept map) is better than having a narrative paragraph, and then looking at the (geospatial) map. It allows seeing the two things (i.e., the concept map and the geospatial map). It's very nice… (I) could get a lot of knowledge very quickly."

6.5.3 Complex scenario simulation

As noted in Chapter 5, the purpose of complex scenario simulation was to test the capabilities of GeoAgentKS in representing and reasoning about complex, dynamic, and scale-dependent geographic processes. In this subsection, the scenarios of power-outage and drought management were used as examples for demonstrating system performance for complex simulations. Three examples are presented below to demonstrate the results of scenario simulations by comparing them with the original knowledge sources and with what actually happened in the past.

Table 6.3: The environmental conditions of the power outage in Example I

Environmental factors                    Status
Power Outage
    Surroundings                         Normal
    Seriousness                          None
    Station                              fire_and_smoking
    Emergency_generators                 in_good_condition
    System_pressure                      above_20_PSI
    System_status (StateCollege_CWS)     in_good_condition
    System_status (PSU_CWS)              in_good_condition

6.5.3.1 Example I: independent responses to a power outage

This first example demonstrates that the GeoAgent is able to interact with its environment independently by following its internal agenda. Assuming there is a power outage, the GeoAgents are aware of the environmental conditions, as shown in Table 6.3, by checking the concept map (see Figure 4.5). Rule-firing results (see Appendix C.1) of GeoAgents' responses show that only CollegeTownship_CWS responded to such conditions. Using its internal behavioral rules, this GeoAgent checked the environmental conditions, and identified that the power outage was caused by ‘fire_and_smoking’ within the station. It then inspected the emergency generator, which was ‘in_good_condition,’ and followed the detailed operating steps and started the generator (e.g., checking fuel level, connecting, and opening the circuit breaker). When the generator was running, the water supply was recovered, and the goal state (see Figure 6.5) was reached.

6.5.3.2 Example II: local cooperative responses to a power outage

The second example illustrates that GeoAgents are able to interact with each other to deal with a local emergency event cooperatively. When the environmental conditions of the power outage in the concept map were changed to those listed in Table 6.4, five GeoAgents responded to the environmental change, including CollegeTownship_CWS, the Fire_Department, the AlleghenyPower_Company, the StateCollege_CWS, and the Centre_Daily_Times newspaper (see Appendix A.2 for their detailed actions).

Table 6.4: The environmental conditions of the power outage in Example II

Environmental factors                    Status
Power Outage
    Surroundings                         no_power
    Seriousness                          Serious
    Station                              no_fire_or_smoke
    Emergency_generators                 Not_available
    System_pressure                      Below_20_PSI
    System_status (StateCollege_CWS)     In_good_condition
    System_status (PSU_CWS)              In_good_condition

The CollegeTownship_CWS GeoAgent identified that the power outage was due to a failure at the power company, because the station status was ‘no_fire_or_smoke’ and the surrounding areas had ‘no_power.’ It sent a message to the local power company (i.e., AlleghenyPowerCompany) to ask how long the power outage would last. Because this was a ‘serious’ power outage, the GeoAgent of AlleghenyPowerCompany replied to CollegeTownship_CWS that the outage would last ‘more_than_two_hours.’ CollegeTownship_CWS then checked the emergency generator and found that it was ‘not_available.’ CollegeTownship_CWS therefore considered this situation to be an emergency.

As specified for this situation, CollegeTownship_CWS sent out three messages. The first message was sent to the StateCollege_CWS GeoAgent to request use of the emergency interconnection. The second message was sent to the local news media (i.e., Centre_Daily_Times here) to broadcast the power-outage emergency to the public and request restricted water use. The third message was sent to the Fire_Department requesting the preparation of water hauling as a backup solution. When CollegeTownship_CWS detected that the system pressure was ‘below_20_PSI,’ it sent another message to the Centre_Daily_Times with the request to broadcast a boil-water advisory. After receiving an ‘OK’ message from StateCollege_CWS for using the interconnection, CollegeTownship_CWS started to use the emergency interconnection to share water from StateCollege_CWS. As a result of these responses, the water supply was restored, and the goal state was thus reached.
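The decision path followed by CollegeTownship_CWS in Examples I and II can be approximated by the Python sketch below. It is a simplified illustration under stated assumptions, not the rule base of the GeoAgent. The function name plan_response, the wording of the action strings, and the way the steps are ordered are all hypothetical. The condition names and values follow Tables 6.3 and 6.4, compared case-insensitively because the two tables capitalize values differently.

```python
def plan_response(conditions, expected_duration=None):
    """Return an ordered list of actions for a power outage (illustrative only).

    conditions maps the factor names of Tables 6.3/6.4 to their status.
    expected_duration is the reply from the power company, if one was requested.
    """
    cond = {key: str(value).lower() for key, value in conditions.items()}
    actions = []

    # Diagnose the cause of the outage.
    if cond["Station"] == "fire_and_smoking":
        actions.append("diagnose: fire_and_smoking within the station")
    elif cond["Surroundings"] == "no_power":
        actions.append("ask power company how long the outage will last")

    # A working generator restores supply without help from other agents.
    if cond["Emergency_generators"] == "in_good_condition":
        actions.append("start emergency generator")
        return actions

    # No generator and a long outage: treat as an emergency (three messages).
    if expected_duration == "more_than_two_hours":
        actions.append("request emergency interconnection from StateCollege_CWS")
        actions.append("ask Centre_Daily_Times to broadcast restricted water use")
        actions.append("ask Fire_Department to prepare water hauling")

    # Low pressure triggers the boil-water advisory.
    if cond["System_pressure"] == "below_20_psi":
        actions.append("ask Centre_Daily_Times to broadcast boil-water advisory")
    return actions


example_1 = {"Surroundings": "Normal", "Station": "fire_and_smoking",
             "Emergency_generators": "in_good_condition",
             "System_pressure": "above_20_PSI"}
example_2 = {"Surroundings": "no_power", "Station": "no_fire_or_smoke",
             "Emergency_generators": "Not_available",
             "System_pressure": "Below_20_PSI"}

print(plan_response(example_1))
print(plan_response(example_2, expected_duration="more_than_two_hours"))
```

The sketch collapses what the GeoAgents do through separate rule firings and message exchanges into one function, so it shows the branching logic only, not the inter-agent protocol.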

From the rule-firing results in the above two examples, the GeoAgents could perform both independent and cooperative actions to achieve the goals at the local level. The decision-making behaviors of the GeoAgents were based upon reasoning about the environmental conditions and interpreting the messages of inter-GeoAgent communications. In these examples, although there were no historical events available for validating the accuracy of GeoAgents’ performance, in comparison with the original text of the emergency plan (see Figure 6.3), the GeoAgents could analyze the environmental conditions and perform reasonable actions. In the next subsection, a more complex example is presented to show hierarchical interactions among GeoAgents in a drought situation and compare the simulation with a real-world scenario.

6.5.3.3 Example III: integrating GeoAgents with models for drought-related hierarchical interaction

A drought typically involves complex physical and social interactions at multiple scales. The third example demonstrated here is focused on representing such complexity by integrating quantitative drought models (i.e., launched from the concept map) with multiple GeoAgents. The hierarchical (top-down and bottom-up) GeoAgents' interactions are examined for representing scale-dependent human-environment relations.

In Pennsylvania, there are five major indices utilized to determine drought severity (watch, warning, or emergency). These indices include precipitation deficits, stream flows, groundwater levels, reservoir storage, and PHDI (Palmer Hydrologic Drought Index, see Smith, 1998, and www.dep.state.pa.us/dep/subject/hotopics/drought/facts/FS2472DroughtMgmtInPA.htm). These indices are calculated independently from their corresponding variables (see the calculation methods in Smith, 1998, and http://nadss.unl.edu/PDSIReport/pdsi/calculation.html). If three or more of the indices indicate a drought watch, warning, or emergency in a county or group of counties, DEP_PA will issue a corresponding drought phase, and activate the relevant drought management operations for those areas. The standards of each index for determining the drought severity are given in Appendix D. Considering the complexity of the calculation, the drought model indices were pre-calculated and subsequently entered into the database.

The functions of fetching the pre-calculated drought indices from the database are linked with a particular concept node (e.g., ‘drought severity’), and the user can launch a graphical interface from the concept map to observe the dynamic changes of the drought indices. Once the drought-index data are retrieved from the database, the GeoAgents can automatically interpret these results as a particular drought phase (e.g., watch, warning, or emergency), and then respond to it cooperatively. The example here simulates the human-environment interactions in the drought of 1999.
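The "three or more indices" rule can be sketched in Python as follows. This is a hedged illustration rather than the GeoAgents' actual rule set. The function name, the dictionary keys, and the example levels are assumptions, as is the choice to let an index at a more severe level also count toward a less severe phase. The thresholds that map raw index values to levels (Appendix D) are not reproduced here.

```python
# Severity levels in increasing order.
SEVERITY = ["normal", "watch", "warning", "emergency"]


def drought_phase(index_levels, min_agreeing=3):
    """Return the most severe phase supported by at least min_agreeing indices.

    index_levels maps each index name to the level it currently indicates.
    Assumption: an index at 'emergency' also supports 'warning' and 'watch'.
    """
    ranks = [SEVERITY.index(level) for level in index_levels.values()]
    for rank in range(len(SEVERITY) - 1, 0, -1):
        if sum(r >= rank for r in ranks) >= min_agreeing:
            return SEVERITY[rank]
    return "normal"


# Hypothetical levels for one county and week (not data from the 1999 drought).
levels = {
    "precipitation_deficit": "warning",
    "stream_flow": "warning",
    "groundwater_level": "emergency",
    "reservoir_storage": "normal",
    "PHDI": "watch",
}
print(drought_phase(levels))
```

Under a stricter reading in which only indices at exactly the same level count, the comparison `r >= rank` would become `r == rank`; the text does not say which reading the DEP applies.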

Figure 6.12: (a) An early stage of the drought development on May 01, 1999

Figure 6.12: (b) GeoAgents automatically identified a drought warning on June 26, 1999

Figure 6.12: (c) The GeoAgents identified a drought emergency on July 17, 1999

Figure 6.12 (a, b, c) shows the dynamic interactions among multiple GeoAgents and the reaction to drought in summer 1999. Cognitive GeoAgents are used to represent social/organizational interrelationships including the state-level DEP_PA, Governor_PA, the local newspaper (i.e., Centre_Daily_Times), CollegeTownship_CWS, StateCollege_CWS, Millheim_CWS, and the individual-level Millheim_WaterUser1 and CollegeTownship_WaterUser1, as well as physical features (see Appendix C.3 for detailed execution results of these GeoAgents).

Figure 6.12 (a) shows an early stage of the drought development on May 01, 1999, as the process is simulated via observed weather data and the organizational response represented in the knowledge base. In Centre County, no drought was identified, and therefore none of the organization GeoAgents had to respond. On June 26, 1999 (see Figure 6.12 (b)), a ‘drought warning’ was identified from the outputs of the drought models (i.e., three or more outputs of the drought models were interpreted as ‘drought warning’; see Appendix D). The GeoAgent of DEP_PA announced a ‘drought warning’ and sent messages to local CWSs to follow the drought-warning operations, and to the news media to broadcast the drought-warning condition. At the local level, in response to the request from DEP_PA, the CollegeTownship_CWS monitored its groundwater source, for instance, and the Millheim_CWS monitored its surface water sources, Elk Creek and Philips Creek. The Centre_Daily_Times GeoAgent broadcast the drought warning to the public, including the request for a voluntary reduction of water usage by 10-15%. Note that the GeoAgent of Governor_PA did not have to respond to this drought warning because only voluntary water-use reduction was required, and no endorsement was needed from the Governor_PA to activate the drought-related laws.

Continuing the simulation, on July 17, 1999, a drought emergency was identified in Centre County from the drought indices (Figure 6.12 (c)). The GeoAgent of DEP_PA recommended that Governor_PA proclaim a drought emergency. Once the Governor declared a drought emergency, the related laws and drought management regulations were activated in the declared regions. DEP_PA then scheduled the weekly meeting with the Commonwealth Drought Task Force, started daily updating of the drought report on its Website, and increased the drought monitoring frequency from weekly to daily.

Mandated water conservation measures were activated at this point. The state-level goal of an immediate water-use reduction of 25% was transferred to individual-level water users via local GeoAgents (i.e., top-down communications). For instance, DEP_PA asked the news media to broadcast to the public the enactment of the law prohibiting non-essential water use. After receiving the message from DEP_PA and by sharing DEP_PA’s knowledge base, the Centre_Daily_Times GeoAgent began to broadcast the water uses restricted during the drought emergency, e.g., watering grass, watering golf courses, filling swimming pools, and cleaning mobile equipment (as defined by PA Code 35, Chapter 119, 1985).

The DEP_PA GeoAgent also sent messages to local CWS GeoAgents requesting that they enact their own drought contingency plans. After receiving the message from DEP_PA, the CollegeTownship_CWS, for instance, informed its water users that the allotted water use was 40 gallons per day per household, and that the monthly charges for excess use were $8.88 per thousand gallons for the first 2000 gallons and $17.54 thereafter. The Millheim_CWS sent similar messages to its water customers but with different prices: the monthly charges for excess use were $7.00 per thousand gallons for the first 2000 gallons and $15.00 thereafter. The detailed simulation results (see Appendix C.3) show that the individual water users, CollegeTownship_WaterUser1 and Millheim_WaterUser1, correctly received the messages from their corresponding water suppliers.
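
To make the two-tier pricing concrete, the following sketch computes a monthly excess-use charge from the figures quoted above. It is an illustration only: the class and method names are hypothetical, and it assumes that the allotment is summed over the days of the billing month, that both tier rates are per thousand gallons, that excess gallons are billed pro rata, and that the 40-gallon allotment also applies to the Millheim example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical sketch of the two-tier excess-use charge described in the text. */
final class ExcessUseCharge {
    /** Household allotment from the text: 40 gallons per day. */
    static final long DAILY_ALLOTMENT_GAL = 40;
    /** Size of the first pricing tier, in gallons of excess use. */
    static final long FIRST_TIER_GAL = 2000;

    /**
     * Computes the monthly excess-use charge in dollars.
     * Assumption: rates are dollars per thousand gallons of excess use,
     * applied pro rata to the gallons falling in each tier.
     */
    static BigDecimal monthlyCharge(long usedGallons, int daysInMonth,
                                    BigDecimal firstTierRate, BigDecimal secondTierRate) {
        long allotment = DAILY_ALLOTMENT_GAL * daysInMonth;
        long excess = Math.max(0L, usedGallons - allotment);
        long firstTier = Math.min(excess, FIRST_TIER_GAL);
        long secondTier = excess - firstTier;
        return firstTierRate.multiply(BigDecimal.valueOf(firstTier))
                .add(secondTierRate.multiply(BigDecimal.valueOf(secondTier)))
                .divide(BigDecimal.valueOf(1000), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // A 31-day month gives a 1,240-gallon allotment; 4,240 gallons used
        // leaves 3,000 gallons of excess (2,000 in tier one, 1,000 in tier two).
        System.out.println(monthlyCharge(4240, 31,
                new BigDecimal("8.88"), new BigDecimal("17.54"))); // prints 35.30
        System.out.println(monthlyCharge(4240, 31,
                new BigDecimal("7.00"), new BigDecimal("15.00"))); // prints 29.00
    }
}
```

Under these assumptions, the same hypothetical household would be charged $35.30 at the College Township rates and $29.00 at the Millheim rates.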

In addition to the top-down communications described above, two individual requests for water use were introduced to test bottom-up communications. In the first case, the CollegeTownship_WaterUser1 GeoAgent intended to fill a swimming pool during the drought emergency and sent a message to the CollegeTownship_CWS applying for this water use. After automated reasoning using the shared knowledge base of the DEP_PA GeoAgent, the CollegeTownship_CWS answered that the application was not approved, and explained to the applicant that ‘the use of any water to fill and top off swimming pools’ was prohibited in a drought emergency (i.e., under PA Code 35, Chapter 119). In the second case, the Millheim_WaterUser1 GeoAgent sent a message to the Millheim_CWS GeoAgent to apply for permission to water new grass during non-working hours. According to the encoded water-use laws, the Millheim_CWS approved this application, on the condition that the applicant use water only from 5:00 pm to 9:00 am and only by restrictive means such as a bucket, a can, or a hand-held hose with an automatic shut-off nozzle.
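
The two bottom-up cases can be read as condition-action rules. The sketch below is a hypothetical illustration of that reading, not the rule base used in GeoAgentKS: the class, enum, and method names are invented, only the two water uses from the text are encoded, and the behavior outside a drought emergency (approve without conditions) is an assumption.

```java
import java.time.LocalTime;

/** Hypothetical condition-action rules for the two water-use applications. */
final class WaterUseRules {
    enum Use { FILL_SWIMMING_POOL, WATER_NEW_GRASS }

    /** Result of evaluating one application. */
    static final class Decision {
        final boolean approved;
        final String explanation;

        Decision(boolean approved, String explanation) {
            this.approved = approved;
            this.explanation = explanation;
        }
    }

    /** Overnight watering window from the text: 5:00 pm to 9:00 am. */
    static final LocalTime WINDOW_START = LocalTime.of(17, 0);
    static final LocalTime WINDOW_END = LocalTime.of(9, 0);

    /** True if t falls inside the window, which wraps around midnight. */
    static boolean inWateringWindow(LocalTime t) {
        return !t.isBefore(WINDOW_START) || !t.isAfter(WINDOW_END);
    }

    static Decision evaluate(Use use, LocalTime requestedTime, boolean droughtEmergency) {
        if (!droughtEmergency) {
            // Assumption: no mandatory restrictions apply outside an emergency.
            return new Decision(true, "No emergency restrictions in effect.");
        }
        switch (use) {
            case FILL_SWIMMING_POOL:
                return new Decision(false,
                        "Filling or topping off swimming pools is prohibited in a drought emergency.");
            case WATER_NEW_GRASS:
                if (inWateringWindow(requestedTime)) {
                    return new Decision(true,
                            "Approved from 5:00 pm to 9:00 am using a bucket, a can, or a "
                            + "hand-held hose with an automatic shut-off nozzle.");
                }
                return new Decision(false,
                        "Watering new grass is allowed only from 5:00 pm to 9:00 am.");
            default:
                return new Decision(false, "No rule is encoded for this water use.");
        }
    }

    public static void main(String[] args) {
        Decision pool = evaluate(Use.FILL_SWIMMING_POOL, LocalTime.of(10, 0), true);
        Decision grass = evaluate(Use.WATER_NEW_GRASS, LocalTime.of(18, 30), true);
        System.out.println(pool.approved + ": " + pool.explanation);
        System.out.println(grass.approved + ": " + grass.explanation);
    }
}
```

Each decision carries an explanation string, mirroring the way the simulated CWS GeoAgents explained their answers to the applicants.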

Emergency backup plans were also triggered during the simulated drought emergency condition, and different CWSs adopted different solutions appropriate for their varying situations. For instance, the CollegeTownship_CWS agent sent a message to the StateCollege_CWS (i.e., peer-to-peer communication) to use the interconnection for water sharing. The Millheim_CWS monitored its water source, detected that the flume flow in the primary source, Phillips Creek, was less than 0.037 cubic feet per second, and switched the water source to Elk Creek.
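
The source-switching behavior amounts to a simple threshold rule, sketched below. The class and method names are hypothetical; the threshold and creek names follow the text, and the sketch assumes that the agent does not switch back automatically, since the text does not describe a return to the primary source.

```java
/** Hypothetical sketch of the threshold rule for switching water sources. */
final class SourceMonitor {
    /** Minimum flume flow for the primary source, in cubic feet per second. */
    static final double MIN_PRIMARY_FLOW_CFS = 0.037;
    static final String PRIMARY = "Phillips Creek";
    static final String BACKUP = "Elk Creek";

    private String activeSource = PRIMARY;

    /**
     * Processes one flow reading from the primary source and returns the
     * source in use afterwards. Assumption: once on backup, stay on backup.
     */
    String onFlowReading(double primaryFlowCfs) {
        if (primaryFlowCfs < MIN_PRIMARY_FLOW_CFS) {
            activeSource = BACKUP;
        }
        return activeSource;
    }

    public static void main(String[] args) {
        SourceMonitor monitor = new SourceMonitor();
        System.out.println(monitor.onFlowReading(0.050)); // prints Phillips Creek
        System.out.println(monitor.onFlowReading(0.030)); // prints Elk Creek
    }
}
```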

Overall, compared with what actually happened in this historical event, the above drought scenario simulation shows that GeoAgentKS is generally able to describe dynamic, complex, and scale-dependent human-environment interactions. In both the simulation and the real world, the drought warning was identified in June and the drought emergency in July. In the simulation, the earliest drought warning was identified on June 26, 1999, and the drought emergency on July 17, 1999. In reality, the DEP issued a drought warning on June 10, and Governor Tom Ridge declared an emergency on July 20, 1999. A reason for these differences may be that the data used in this research were collected from different Websites and could thus differ from the data actually used by DEP. The water manager of the College Township CWS noted in the interview that they did receive the letter from the DEP and reacted to the drought as the drought-contingency plan required (e.g., reducing water use by 25% and increasing the water price). The water manager of the Millheim CWS mentioned that both they and the news media notified the public of the non-essential water-use restrictions, and that they switched the water source to Elk Creek. The Millheim manager also noted that the automated-reasoning results for the bottom-up water-use application were reasonable, and that it was ‘for sure’ that the drought example was able to represent the complex process of human-environment interactions.
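
The size of the timing differences can be worked out directly from the four dates given above. The short sketch below does so with the standard java.time API; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper comparing simulated and actual declaration dates. */
final class TimingComparison {
    /** Days by which the simulated date trails the actual one (negative if earlier). */
    static long simulatedLagDays(LocalDate actual, LocalDate simulated) {
        return ChronoUnit.DAYS.between(actual, simulated);
    }

    public static void main(String[] args) {
        long warningLag = simulatedLagDays(
                LocalDate.of(1999, 6, 10), LocalDate.of(1999, 6, 26));
        long emergencyLag = simulatedLagDays(
                LocalDate.of(1999, 7, 20), LocalDate.of(1999, 7, 17));
        System.out.println(warningLag);   // prints 16
        System.out.println(emergencyLag); // prints -3
    }
}
```

That is, the simulated drought warning came 16 days after the actual one, while the simulated drought emergency came 3 days before the actual declaration.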

In summary, the simulated actions under these historical drought conditions show that the GeoAgents can respond to environmental changes independently and cooperatively. The GeoAgents' interactions were performed both locally and hierarchically. The actions performed by the GeoAgents depend upon their internal agendas, environmental conditions, and inter-GeoAgent communications. The drought example demonstrated that GeoAgents' qualitative, goal-driven actions could be integrated with the quantitative models to represent the interactions between physical and social processes. The overall internal evaluation indicates that it is possible to use GeoAgentKS to represent complex human-environment interactions.

What the GeoAgents can do depends upon what knowledge is encoded in their knowledge bases. In the current case study, the GeoAgents are capable of dealing with a drought or a power outage, but cannot cope with other situations, such as terrorists attempting to damage the water supply. To do so, extra knowledge sources (e.g., plans for dealing with terrorism) and data sources (e.g., what and where the damages are) need to be formalized into GeoAgentKS using the methodologies presented in this chapter. In addition, the examples in the case study only showed the GeoAgents dealing with one event at a time. The current GeoAgentKS is also capable of dealing with two or more events at once (e.g., a power outage during a drought) without any extra changes. For example, when the drought models are running, the GeoAgents check the modeling results from the weekly drought data, as well as other environmental conditions (e.g., power status). If a power outage and a drought warning are identified at the same time, the GeoAgents will give users suggestions on how to respond to both.
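
One way to see why concurrent events need no extra changes is that independent condition-action rules can all fire on the same set of conditions. The sketch below illustrates this under stated assumptions: the class names, the two condition flags, and the wording of the power-outage suggestion are hypothetical and are not taken from the GeoAgentKS rule base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: independent rules firing on concurrent events. */
final class ConcurrentEvents {
    /** Snapshot of the environmental conditions a GeoAgent senses. */
    static final class Conditions {
        final boolean droughtWarning;
        final boolean powerOutage;

        Conditions(boolean droughtWarning, boolean powerOutage) {
            this.droughtWarning = droughtWarning;
            this.powerOutage = powerOutage;
        }
    }

    /** A condition-action rule whose action is a suggestion to the user. */
    static final class Rule {
        final Predicate<Conditions> condition;
        final String suggestion;

        Rule(Predicate<Conditions> condition, String suggestion) {
            this.condition = condition;
            this.suggestion = suggestion;
        }
    }

    /** Illustrative rules; the suggestion texts are placeholders. */
    static final List<Rule> RULES = List.of(
            new Rule(c -> c.droughtWarning,
                    "Request a voluntary water-use reduction of 10-15%."),
            new Rule(c -> c.powerOutage,
                    "Follow the encoded power-outage response plan."));

    /** Collects the suggestions of every rule whose condition holds. */
    static List<String> suggest(Conditions current) {
        List<String> suggestions = new ArrayList<>();
        for (Rule rule : RULES) {
            if (rule.condition.test(current)) {
                suggestions.add(rule.suggestion);
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        // Both events at once: both rules fire, so two suggestions are printed.
        suggest(new Conditions(true, true)).forEach(System.out::println);
    }
}
```

Because each rule tests only its own condition, adding a second simultaneous event simply causes a second rule to fire.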

6.6 Summary

As discussed in Chapter 5, the objective of the case study was to demonstrate how to capture and represent non-observational and higher-level knowledge within GeoAgentKS, and how to evaluate the correctness, adequacy, and usefulness of the knowledge-oriented geographic representation. Using the methods presented in Chapter 5, this chapter discussed the detailed procedures and results of representing the geographic processes of the CWS-related human-environment interactions. In the three knowledge-acquisition interviews, the expert interviewees were able to use GeoAgentKS to represent their how/why-level knowledge. In addition, the document-derived GeoAgents' rules could be integrated with the concept map, as well as with mathematical models, to represent independent, local, and hierarchical human-environment interactions.

From the experts' evaluations and the system simulations, GeoAgents were shown to be capable of responding to their environmental changes rationally. The expert evaluators believed that GeoAgentKS could be useful for supporting decision making (especially in emergency management). The novice evaluators believed that the knowledge represented within GeoAgentKS could be quickly learned and easily shared.

Chapter 7

SUMMARY AND DISCUSSION

This chapter summarizes the overall research, presenting the research goal, the approaches, the new tool GeoAgentKS, and the case study. It then discusses the contributions of this dissertation and the research challenges for future study.

7.1 The research goal and approaches

The primary goal of this dissertation was to extend the current data-centered GISystem with capabilities for representing and using high-level knowledge in order to address both the form and the mechanism of a geographic process. In addition to what is derived from observation or data, geographic knowledge must include the how/why-level understanding of the dynamic relations among geographic components, as well as the non-observational social laws, policies, regulations, plans, beliefs, and other cultural elements (e.g., religions, customs). Thus, representing geographic processes, per se, requires a knowledge-oriented approach instead of the traditional data-centered methodology.

A fundamental shortcoming of current data-centered GISystems is their inability to represent knowledge. This dissertation aimed to overcome this shortcoming and to approach the research goal by developing a knowledge layer that is connected with the geospatial database. Requirements for this knowledge layer were that it must have a means of (1) representing and integrating space-time knowledge within the GISystem, and (2) presenting that knowledge in a way that allows the users to easily gather higher-level information to guide spatial analysis.

The approach to meeting these requirements is to integrate multiple knowledge and data representation technologies. Among the existing representation techniques, graph-based knowledge-representation techniques are good at representing relations, rule-based expert systems are good at automated reasoning and at reusing stored knowledge in actions, the agent-based approach is good at distributed representation and cooperative problem solving, and geospatial databases are good at representing location information.

This dissertation integrates these representation techniques to address the complexity of geographic processes in two steps: (1) significantly extending the concept of ‘agent’ to ‘GeoAgent’ as a spatial, temporal, distributed, and scale-dependent component for knowledge representation, and (2) integrating GeoAgents with multiple knowledge-representation techniques and geospatial databases.

7.2 The functionality of GeoAgentKS

To validate the feasibility of the knowledge-oriented strategy in representing geographic processes, the above integrated-representation approach has to be achieved at the implementation level and examined in a real-world problem context. In this research, a Java-based prototype, the GeoAgent-based Knowledge System (GeoAgentKS), was implemented and applied in a case study for representing complex human-environment interactions relevant to CWSs in Central Pennsylvania.

The GeoAgentKS prototype includes both a database layer and a knowledge layer. In the knowledge layer, geographic knowledge can be represented as a combination of graph-based concept maps, mathematical models, and condition-action rules in a knowledge base distributed among GeoAgents (Figure 4.1). The concept map plays a pivotal role within GeoAgentKS in linking the various knowledge representations with the geospatial database. For example, a concept node can be linked to a GeoAgent's rules, to an algorithm of the model, and to its data environment. Using their internal rules, GeoAgents can sense their environmental conditions by checking the concept nodes or interpreting the outputs of models, communicate with each other, and perform goal-driven actions to respond to social and natural environmental change.
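
The linking role of a concept node can be pictured with a small data-structure sketch. It is not the GeoAgentKS implementation: the class, its fields, and the example identifiers are hypothetical, and a model is reduced here to a function that returns a single number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleSupplier;

/** Hypothetical sketch of a concept node linking rules, a model, and data. */
final class ConceptNode {
    final String name;
    /** Identifiers of GeoAgent rules that refer to this concept. */
    final List<String> ruleIds = new ArrayList<>();
    /** Optional link to a model algorithm that produces the concept's value. */
    private DoubleSupplier model;
    /** Optional link to a layer or table in the geospatial database. */
    private String dataLayer;

    ConceptNode(String name) {
        this.name = name;
    }

    ConceptNode linkRule(String ruleId) {
        ruleIds.add(ruleId);
        return this;
    }

    ConceptNode linkModel(DoubleSupplier model) {
        this.model = model;
        return this;
    }

    ConceptNode linkData(String dataLayer) {
        this.dataLayer = dataLayer;
        return this;
    }

    String dataLayer() {
        return dataLayer;
    }

    /** What a GeoAgent "senses" when it checks this node: the model output. */
    double sense() {
        if (model == null) {
            throw new IllegalStateException("No model linked to " + name);
        }
        return model.getAsDouble();
    }

    public static void main(String[] args) {
        // All identifiers below are illustrative placeholders.
        ConceptNode flow = new ConceptNode("PrimarySourceFlow")
                .linkRule("switch-source-rule")
                .linkModel(() -> 0.030)
                .linkData("stream_gauges");
        System.out.println(flow.name + " = " + flow.sense()
                + " (rules: " + flow.ruleIds + ", data: " + flow.dataLayer() + ")");
    }
}
```

A rule attached to the node can then test the sensed value against a threshold without knowing which model or database layer produced it.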

In this integrated system, users can retrieve from GeoAgentKS (1) what and where information in the geographic maps, (2) how and why geographic elements are interrelated, (3) the dynamic interactions among the scale-dependent social and physical elements, and (4) direct suggestions regarding what to do to support decision making. To validate such advanced functionality, in the case study, GeoAgentKS represented the complex CWS-relevant human-environment relations.

7.3 Contributions of this research

This dissertation presents a new means for representing geographical processes in a GISystem context, demonstrates a prototype that integrates multiple techniques in geographic knowledge and data representation, and provides examples regarding how to represent complex, dynamic, and scale-dependent human-environment interactions. In this section, the contributions of this research are summarized as they relate to (1) the conceptual-level understanding of geographic process, (2) the technical implementation to achieve the representation of geographic process, and (3) the practical applications in representing process, with a focus on human-environment interactions.

Representation of geographic processes should address not only the facts of changes and events, but also the human understanding of dynamic relations or mechanisms of the geographic systems, which goes far beyond the observed data. This dissertation presents the need for representing geographic processes as a holistic and unified description of the entire form and mechanism of a dynamic geographic system (i.e., process = form + mechanism).

On the implementation level, although there are multiple techniques already available for knowledge representation, they are generally utilized separately in geographic applications, rather than in an integrated manner. The Java-based GeoAgentKS developed in this research demonstrates the power of such an integrated approach. Because of this integration, it is possible to apply the knowledge-oriented strategy of geographic representation to complex real-world problems. This prototype also provides a tangible example for developing similar academic or commercial software tools to achieve knowledge-oriented representation.

For the application-level contributions, this research explored a systematic set of knowledge-engineering methods to achieve knowledge-oriented representation of geographic processes using examples relevant to community water systems (CWSs). Such methods include (1) geographic knowledge acquisition from interviews and written documents, (2) representation of the captured knowledge with multiple techniques, (3) validation of the correctness and adequacy of the knowledge representation, (4) integration of the knowledge representation and the database, and (5) evaluation of the usability of the represented knowledge in GeoAgentKS.

7.4 Future research challenges

The development of knowledge-oriented representation of geographic processes and of GeoAgent-based technologies is just beginning. Future research challenges relate to knowledge representation and advanced problem solving.

For knowledge representation, it is important to consider the following three themes: development of representation standards, methods for knowledge acquisition from heterogeneous sources, and time representation. First, an immediate need is to develop a protocol or standards to allow diverse knowledge representations to be sharable and interchangeable. For example, a possible solution is to establish a generic set of ontologies that can be used for inter-GeoAgent communication and for merging concept maps. Second, it is important to develop new approaches for capturing heterogeneous sources of geographic knowledge. These approaches include automated text interpretation and effective capture of individual-level informal knowledge. The technology of automated text interpretation can potentially facilitate the knowledge-elicitation process for geographic knowledge representation, but this technology is not yet well developed. Although interviews are widely applied in many fields for individual-level knowledge acquisition, this approach is usually time consuming and expensive. The usability of GeoAgentKS also needs to be improved to allow experts to input their knowledge into a knowledge base more directly, quickly, and easily. For instance, GeoAgentKS could be shared as a public knowledge acquisition tool on the Internet with a highly user-friendly GUI. The experts themselves could then make concept maps or formalize GeoAgents' behavioral rules without being constrained by time or location. Third, in addition to space-time representation, it is necessary to build temporal knowledge bases to better represent observed dynamics of geographic processes.

The current techniques for inter-agent communication (e.g., ACL and KQML) are good at synchronous message sending and receiving, but asynchronous inter-GeoAgent communication is still undeveloped. In the real world, message passing can be achieved in an asynchronous way. For example, the DEP and some water authorities usually upload their decisions to a shared web site, and the other water authorities may find out what they are doing later via exploring the web site. How to establish such asynchronous communication for GeoAgents will be an interesting research topic for future study.
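
The shared-web-site example suggests a bulletin-board style of asynchronous communication, in which writers post decisions and readers catch up whenever they choose. The following sketch is one possible illustration of that idea, offered as an assumption rather than a design proposed in this research; the class and method names and the message texts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bulletin board for asynchronous inter-GeoAgent messages. */
final class BulletinBoard {
    /** One posted decision; sequence numbers start at 1. */
    static final class Post {
        final int sequence;
        final String author;
        final String message;

        Post(int sequence, String author, String message) {
            this.sequence = sequence;
            this.author = author;
            this.message = message;
        }
    }

    private final List<Post> posts = new ArrayList<>();

    /** Publishes a message without waiting for any reader; returns its sequence number. */
    synchronized int publish(String author, String message) {
        int sequence = posts.size() + 1;
        posts.add(new Post(sequence, author, message));
        return sequence;
    }

    /** Returns all posts newer than the last sequence number the reader has seen. */
    synchronized List<Post> readAfter(int lastSeenSequence) {
        int from = Math.max(0, Math.min(lastSeenSequence, posts.size()));
        return new ArrayList<>(posts.subList(from, posts.size()));
    }

    public static void main(String[] args) {
        BulletinBoard board = new BulletinBoard();
        board.publish("DEP_PA", "Drought emergency declared for Centre County.");
        board.publish("StateCollege_CWS", "Interconnection available for water sharing.");

        // A reader that last saw post 1 picks up only the newer post, later on.
        for (Post post : board.readAfter(1)) {
            System.out.println(post.sequence + " " + post.author + ": " + post.message);
        }
    }
}
```

Each reader keeps its own last-seen sequence number, so senders and receivers never need to be active at the same time.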

In many cases, accuracy is very important for decision making. According to the evaluations by the experts in the case study, the GeoAgents can accurately follow documented laws and regulations represented within the system. In some situations, however, users need more ‘intelligent’ or more flexible solutions for problem solving. In the current research, although GeoAgents deal well with what is known or planned in their knowledge bases, they have limited capability for coping with unknown situations. In the future, GeoAgents need to possess more advanced functionality, such as machine learning and automated planning.

7.5 Conclusions

Overall, this research explored a knowledge-oriented approach for representing geographic processes in a GISystem context. To that end, it introduced the concept of GeoAgents, which can be used to represent heterogeneous geographic knowledge in a spatially distributed and interactive manner. Integrating GeoAgents with concept maps, expert systems, mathematical models, and databases, this dissertation demonstrated a systematic set of methods and examples to achieve geographic knowledge acquisition, representation, integration, and evaluation by using GeoAgentKS, a new form of GISystem developed in this research.

The results of the case study show that it is possible to use GeoAgentKS to represent complex, dynamic, and scale-dependent human-environment relations with consideration of how/why-level understanding and socially based knowledge, such as laws, regulations, and plans. The case study also showed that the knowledge represented within GeoAgentKS can be easily shared and used by users for their learning and decision-making purposes.

I envision that the next generation of GISystem will go beyond the current situation of being simply data-rich toward a knowledge-rich era. This research provides a theoretical foundation, with an implementation-level prototype and application-level examples, for developing the knowledge-oriented GISystem.

REFERENCES

Ackoff, R.L., 1989. From Data to Wisdom. Journal of Applied Systems Analysis 16: 3-9. Adami, C., 1998. Introduction to Artificial Life. Springer-Verlag, Santa Clara, CA. Anderson, J.R., Hardy, E.T., Roach, J.T. and Winmer, R.E., 1976. A land use and land cover classification system for use with remote sensing data: U.S. Geological Survey Professional Paper 964. Aylett, R. and Luck, M., 2000. Applying Artificial Intelligence to Virtual Reality: Intelligent Virtual Environments. Applied Artificial Intelligence, 14(1): 3-32. Baatz, M. and Schäpe, A., 1999. Object-Oriented and Multi-Scale Image Analysis in Semantic Networks, 2nd International Symposium: Operationalization of Remote Sensing, 16-20 August, ITC, NL. Barnsley, M., 1999. Digital remotely-sensed data and their characteristics. In: P.A. Longley, M.F. Goodchild, D.J. Maguire and D.W. Rhind (Editors), Geographical Information Systems. John Wiley & Sons, Inc., New York, pp. 451-466. Batty, M., 2001. Polynucleated urban landscapes. Urban Studies., 38(4): 635-655. Batty, M. and Jiang, B., 1999. Multi-Agent Simulation: New Approaches To Exploring Space-Time Dynamics Within GIS, the Annual Meeting of GISRUK '99 (Geographical Information Systems Research - UK), University of Southampton, 14-16 April, 1999. Bechtel, W., 1987. Connectionism and the Philosophy of Mind: an Overview. The Southern Journal of Philosophy, Supplement: 17-41. Bechtel, W., 1988. Connectionism and Rules and Representation Systems: Are They Compatible? Philosophical Psychology, 1: 5-15. Bellinger, G., Castro, D. and Mills, A., 2000. Data, Information, Knowledge, and Wisdom. http://www.outsights.com/systems/dikw/dikw.htm. Bradshaw, J.M., 1997. An Introduction to Software Agents. In: Software Agents. In: J.M. Bradshaw (Editor), Software Agents. AAAI Press, Menlo Park, Calif, pp. 3-46. Brodaric, B., Gaheganb, M. and Harrap, R., 2004. The art and science of mapping: computing geological categories from field data. 
Computers & Geosciences, 30: 719-740. Brooks, R.A., 1986. A Robust Layered Control System For A Mobile Robot. IEEE Journal Of Robotics And Automation, RA-2: 14-23. Brooks, R.A., 1989. A robot that walks: Emergent behavior from a carefully evolved network. Neural Computation: 235-262.

Brooks, R.A., 1991a. Intelligence without reason, Proceedings of IJCAI-91. Morgan Kaufmann, San Mateo, CA, pp. 569-595. Brooks, R.A., 1991b. Intelligence without representation. Artificial Intelligence, 44: 139-159. Bruillard, E. and Baron, G.L., 2000. Computer-based concept mapping: a review of a cognitive tool for students. In: D. Benzie and D. Passey (Editors), Proceedings of ICEUT2000, 16th IFIP World Computer Congress, pp. 332-338. Brule, J.F. and Blount, A., 1989. Knowledge Acquisition. McGraw-Hill, New York. Bryson, K., Luck, M., Joy, M. and Jones, D.T., 2000. Applying Agents to Bioinformatics in GeneWeaver. Cooperative Information Agents IV, Lecture Notes in Artificial Intelligence, 1860, 60-71, Springer-Verlag. Chamberlin, D. and Boyce, R., 1974. SEQUEL: a structured English query language, Proceedings ACM SIGFIDET Workshop Conference. ACM Press, New York, pp. 249-264. Chen, H., Smith, T.R., Larsgaard, M.L., Hill, L.L. and Ramsey, M., 1997. A geographic knowledge representation system (GKRS) for multimedia geospatial retrieval and analysis. International Journal of Digital Libraries, 1(2): 132-152. Chen, P.P.-S., 1976. The entity-relationship model: toward a unified view of data. ACM Transactions on Database Systems, 1(1): 9-36. Choo, C.W., Detlor, B. and Turnbull, D., 2000. Web Work: Information Seeking and Knowledge Work on the World Wide Web. Kluwer Academic Publishers, Dordrecht. Christopherson, R.W., 2000. Geosystems: An Introduction to Physical Geography. Prentice Hall, Inc. Upper Saddle River, New Jersey. Codd, E., 1970. A relational model for large shared data banks. Communications of the ACM, 13: 377-387. College Township Water Authority, 2002a. Drought Contingency Plan, State College, PA. College Township Water Authority, 2002b. Emergency Response Plan, State College, PA. Collins, A. and Quillian, M.R., 1969. Retrieval time from semantic memory. Journal of Verbal Learning & Verbal Behavior, 8: 240-247. Conte, R., Edmonds, B., Moss, S.
and Sawyer, R.K., 2001. Sociology and social theory in agent based social simulation: a symposium. Computational and Mathematical Organization Theory, 7: 183-205. Couclelis, H., 1985. Cellular worlds: A Framework for Modeling Micro-Macro Dynamics. Environment and Planning A 17: 585–596. Couclelis, H., 1988. Of Mice and Men: What Rodent Populations Can Teach Us about Complex Spatial Dynamics. Environment and Planning A, 20: 99–109. Cruz, I.F., Sunna, W. and Chaudhry, A., 2004. Semi-Automatic Ontology Alignment for Geospatial Data Integration. In: M.J. Egenhofer, C. Freksa and H. Miller

(Editors), Geographic Information Science: third International Conference. Adelphi, MD, USA, October 2004. Lecture Notes in Computer Science 3234. Springer, pp. 51-66. Dai, X. and Gahegan, M., 2004. Integrated Approach for Category Development. In: M.J. Egenhofer, C. Freksa and H. Miller (Editors), Geographic Information Science: third International Conference: Extended abstracts and Poster Summaries. Adelphi, MD, USA, October 2004. Lecture Notes in Computer Science 3234. 59-61, pp. 51-66. Davenport, T.H., 1997. Information Ecology. Oxford University Press, New York, NY. Davenport, T.H. and Prusak, L., 1998. Working Knowledge: How organizations manage what they know. Harvard Business School Press, Boston. Davis, T.J. and Keller, C.P., 1997. Modelling uncertainty in natural resource analysis using fuzzy sets and Monte Carlo simulation: slope stability prediction. International Journal of Geographical Information Science, 11(5): 409-435. Dean, S., 2003. Artificial Life, http://www.webslave.dircon.co.uk/alife/intro.html. d'Inverno, M. and Luck, M., 1997. Making and Breaking Engagements: An Operational Analysis of Agent Relationships. In: Zhang and Lukose (Editors), Multi-Agent Systems Methodologies and Applications: Proceedings of the Second Australian Workshop on Distributed Artificial Intelligence. Lecture Notes in Artificial Intelligence, 1286. Springer-Verlag, pp. 48-62. Doran, J.E., Franklin, S., Jennings, N.R. and Norman, T.J., 1997. On cooperation in multi-agent systems. The Knowledge Engineering Review, 12(3): 309-314. Dragicevic, S. and Marceau, D.J., 2000. A fuzzy set approach for modelling time in GIS. International Journal of Geographical Information Science, 14(3): 225-246. Easson, G.L. and Barr, D.J., 1996. Integration of GIS and Artificial Neural Networks for Natural Resource Applications, 1996 ESRI User Conference, Palm Springs, California, May 20-24, 1996. EPA, 1999. 25 Years of the Safe Drinking Water Act: History and Trends.
EPA 816-R- 99-007, Environment Protection Agency. EPA, 2003. Water on tap: what you need to know. EPA 816-K-03-007, U.S. Environmental Protection Agency, Office of Water, Washington, DC. Epstein, J. and Axtell, R., 1996a. Growing Artificial Societies: Social Science from the Bottom Up. Brookings Institutions Press, Washington, D.C. Epstein, J.M. and Axtell, R., 1996b. Growing Artificial Societies: Social Science from the Bottom Up. Brookings Press & MIT Press, Washington, DC. Faratin, P., Sierra, C., Jennings, N.R. and Buckle, P., 1999. Designing Responsive and Deliberative Automated Negotiators, AAAI Workshop on Negotiation: Settling Conflicts and Identifying Opportunities, Orlando, FL, pp. 12-18. Ferber, J., 1999. Multi-agent systems: an introduction to distributed artificial intelligence. Addison-Wesley, New York, NY.

Ferrier, G. and Wadge, G., 1997. An integrated GIS and knowledge-based system as an aid for the geological analysis of sedimentary basins. International Journal of Geographical Information Systems, 11(3): 281-297. Finin, T.W., Fritzson, R., McKay, D. and McEntire, R., 1994. KQML as an agent communication language, Proceedings of the 3rd International Conference on Information and Knowledge Management (CIKM’94). ACM Press, Gaithersburg (Maryland), pp. 456–463. Flores-Mendez, R.A., 1999. Towards a Standardization of Multi-Agent System Frameworks. ACM Crossroads http://www.acm.org/crossroads/. Fodor, J.A. and Pylyshyn, Z.W., 1988. Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28: 3-71. Fonseca, F., Egenhofer, M., Davis, C. and Câmara, G., 2001. Semantic Granularity in Ontology-Driven Geographic Information Systems. AMAI Annals of Mathematics and Artificial Intelligence - Special Issue on Spatial and Temporal Granularity. Fonseca, F. and Martin, J., 2004. Space and Time in Eco-Ontologies. AI Communications - The European Journal on Artificial Intelligence, 17(4): 259-269. Forgy, C., 2004. Forward and Backward Chaining. http://www.rulespower.com/contents/article1.html. Frank, A., 1996. Qualitative spatial reasoning: cardinal directions as an example. International Journal of Geographical Information Systems, 10(3): 269-290. Frank, A., Bittner, S. and Raubal, M., 2001. Spatial and Cognitive Simulation with Multi-agent Systems. In: D.R. Montello (Editor), Spatial Information Theory - Foundations of Geographic Information Science, Proceedings of COSIT 2001. Springer, Berlin, Heidelberg, New York, Morro Bay, CA, USA, September 2001, pp. 124-139. Franklin, S. and Graesser, A., 1997. Is It an Agent, or Just a Program?: A Taxonomy for Autonomous Agents. In: J.P. Muller, M.J. Wooldridge and N.R. Jennings (Editors), Intelligent Agents III: Agent Theories, Architectures, and Languages. Springer-Verlag, Berlin, pp. 21-35.
Gahegan, M., Dai, X., Macgill, J., Oswal, S. and Pike, W., 2003. From concept to data and back again: connecting mental spaces with data and analysis methods, Proceeding of the 7th International Conference on GeoComputation, Southampton, UK. Gaile, G.I. and Willmont, C.J., 1989. Geography in America. Merrill Pub. Co. Gelbart, D. and Smith, J., 1992. Towards combining automated text retrieval and case- based expert legal advice. Law Technology Journal, 1(2): 19-24. Genesereth, M.R. and Ketchpel, S.P., 1994. Software Agents. Communications of the ACM, 37(7): 48-53. Gilbert, N. and Doran, J., 1994. Simulating Societies: The Computer Simulation of Social Phenomena. UCL Press, London.

Gimblett, H.R. (Editor), 2002. Integrating Geographic Information Systems and Agent-based Modeling Techniques for Simulating Social and Ecological Processes. Oxford University Press, Oxford and New York. Girard, M.-C. and Girard, C.M., 2003. Processing of Remote Sensing Data. A.A. Balkema Publishers, Paris. Golledge, R.G., 1988. Integrating spatial knowledge, Proceedings, International Geographical Congress. Sydney, Australia. Golledge, R.G., 2002. The Nature of Geographic Knowledge. Annals of the Association of American Geographers, 94(2): 1-14. Goodchild, M.F., 1992. Geographical Information Science. International Journal of Geographical Information Science, 6(1): 31-45. Goodchild, M.F., 2001. Models of scale and scales of modeling. In: N.J. Tate and P.M. Atkinson (Editors), Modeling scale in Geographic information science. John Wiley & Sons, Ltd., Chichester, UK, pp. 3-10. Goodchild, M.F., 2004a. GIScience, Geography, Form, and Process. Annals of the Association of American Geographers, 94(4): 709-714. Goodchild, M.F., 2004b. The Validity and Usefulness of Laws in Geographic Information Science and Geography. Annals of the Association of American Geographers, 94(2): 300-303. Goodchild, M.F., Egenhofer, M.J., Fegeas, R. and Kottman, C., 1999. Interoperating geographical information systems. Kluwer Academic Publishers, Boston. Gould, P., 1981. Letting the data speak for themselves. Annals of the Association of American Geographers, 71(2): 166-176. Graham, T.E. and Goswami, I., 2001. Baltimore's Urban Environment Using GIS and Neural Networks, 2001 ESRI User Conference, San Diego, CA, July 11, 2001. Guarino, N. and Giaretta, P., 1995. Ontologies and Knowledge Bases: Towards a Terminological Clarification. In: N.J.I. Mars (Editor), Towards Very Large Knowledge Bases. IOS Press. Hagen, A., 2000. Fuzzy set approach to assessing similarity of categorical maps. International Journal of Geographical Information Science, 74(3): 235-249. Haklay, M.M., 2004.
Map Calculus in GIS: a proposal and demonstration. International Journal of geographical information science, 18(2): 107-127. Harley, J.B. and Woodard, D., 1987. The history of cartography: Cartography on prehistoric, ancient, and medieval Europe and the Mediterranean. University of Chicago Press, Chicago. Hewitt, C., 1977. Viewing Control Structures as Patterns of Passing Messages. Artificial Intelligence, 8(3): 323-364. Hogg, L.M. and Jennings, N.R., 1997. Socially Rational Agents, AAAI Fall symposium on Socially Intelligent Agents, Boston, Mass., November 8-10, pp. 61-63. Hon, H.S., 2004. Comparison of KQML and FIPA ACL. http://infoeng.ee.ic.ac.uk/~malikz/surprise2001/hsh99e/article2/Article%202.htm.

153 Hopgood, A.A., 2001. Intelligent System for Engineers and Scientists. CRC Press, Boca Raton, Florida. Irvin, B.J., Ventura, S.J. and Slater, B.K., 1997. Fuzzy and isodata classification of landform elements from digital terrain data in Pleasent Valley, Wisconsin. Geoderma, 77:137-154. Jennings, N.R., Sycara, K. and Wooldridge, M., 1998. A Roadmap of Agent Research and Development. Int Journal of Autonomous Agents and Multi-Agent Systems, 1(1): 7-38. Jiang, H. and Eastman, J.R., 2000. Application of fuzzy measures in multi-criteria evaluation in GIS. International Journal of Geographical Information Science, 14(2): 173-184. Johnson, S., 2001. Emergence: the connected lives of ants, brains, cities, and software. Scribner, New York. Johnston, R.J., 1999. Geography and GIS. In: P.A. Longley, M.F. Goodchild, D.J. Maguire and D.W. Rhind (Editors), Geographical Information Systems. John Wiley & Sons, Inc., New York, pp. 39-48. Jones, C.B., Abdelmoty, A.I., Finch, D., Fu, G. and Vaid, S., 2004. The SPIRIT Spatial Search Engine: Architecture, Ontologies and Spatial Indexing. In: M.J. Egenhofer, C. Freksa and H. Miller (Editors), GIScience 2004, University of Maryland, Maryland, pp. 125-139. Karalopoulos, A., Kokla, M. and Kavouras, M., 2004. Geographic Knowledge Representation Using Conceptual Graphs, 7th AGILE Conference on Geographic Information Science, Crete, Greece. Kavouras, M. and Kokla, M., 2002. A method for the formalization and integration of geographical categorizations. International Journal of geographical information science, 18(5): 439-453. Kim, W., Garza, J., Ballou, N. and Woelk, D., 1990. Architecture of the ORION next generation database system. IEEE transactions on knowledge and data engineering, 2: 109-125. Kokla, M. and Kavouras, M., 2001. Fusion of top-level and geographical domain ontologies based on context formation and complementarity. International Journal of geographical information science, 15(7): 679-687. Lakoff, G., 1987. 
Women, fire, and dangerous things: What categories reveal about the mind. University of Chicago Press, Chicago. Langran, G., 1989. Time in Geographic Information Systems. PhD dissertation Thesis, University of Washington, Seattle, USA. Langran, G. and Chrisman, N.R., 1988. A framework for temporal geographic information. Cartographica, 25(3): 1-14. Langton, C., 1988. Artificial life. Addison Wesley, Redwood City and Menlo Park, CA. Li, X. and Yeh, A.G.-O., 2002. Neural-network-based cellular automata for simulating multiple land use changes using GIS. International Journal of geographical information science, 16(4): 323-343.

154 Liu, Y. and Phinn, S.R., 2001. Developing a Cellular Automaton Model of Urban Growth Incorporating Fuzzy Set Approaches, Proceedings of the 6th International Conference on GeoComputation, University of Queensland, Brisbane, Australia. 24 - 26 September 2001. Liu, Z. Q. and Satur, R. 1999, Contextual fuzzy cognitive map for decision support in geographic information systems. IEEE Transactions on Fuzzy Systems, 7(5), 495- 507. Lloyd, R., Hodgson, M.E. and Stokes, A., 2002. Visual Categorization with Aerial Photographs. Annals of the Association of American Geographers, 92(2): 241– 266. Lloyd, R., Patton, D. and Cammack, R., 1996. Basic-level categories. Professional Geographer, 48(2): 181–194. Lo, C.P. and Yeung, A.K.W., 2002. Concepts and techniques of geographic information systems. PH series in geographic information science. Prentice Hall, Upper Saddle River, New Jersey. Loh, D.K., Hsieh, Y.T., Choo, Y.K. and Holtfrerich, D.H., 1994. Integration of a rule- based expert system with GIS for forest resource management. Journal of Computers and electronics in agriculture, 11: 215-228. Longley, P.A., Goodchild, M.F., Maguire, D.J. and Rhind, D.W., 1999. Introduction. In: P.A. Longley, M.F. Goodchild, D.J. Maguire and D.W. Rhind (Editors), Geographical Information Systems. John Wiley & Sons, Inc., New York, pp. 29- 38. Longley, P.A., Goodchild, M.F., Maguire, D.J. and Rhind, D.W., 2001. Geographical Information Systems and Science. John Wiley & Sons, Inc., Chichester, England. Lucieer, A. and Kraak, M.J., 2004. Interactive and visual fuzzy classification of remotely sensed imagery for exploration of uncertainty. International Journal of Geographical Information Science, 18(5): 491-512. Luck, M., 1999. From definition to deployment: What next for agent-based systems? Knowledge Engineering Review, 14(2): 119-124. Luck, M. and d’Invernoz, M., 1998. Motivated Behaviour for Goal Adoption. In: C. Zhang and D. 
Lukose (Editors), The Fourth Australian Workshop on Distributed Artificial Intelligence. Springer-Verlag, 1998, pp. 58-73. Luck, M. and d'Inverno, M., 1998. Motivated Behaviour for Goal Adoption. In: Zhang and Lukose (Editors), Multi-Agent Systems: Theories, Languages and Applications - Proceedings of the Fourth Australian Workshop on Distributed Artificial Intelligence. Lecture Notes in Artificial Intelligence 1544. Springer- Verlag, pp. 58-73. Luck, M. and d'Inverno, M., 2001. A Conceptual Framework for Agent Definition and Development. The Computer Journal, 44(1): 1-20. Luck, M. and d'Inverno, M., 2003. Unifying Agent Systems. Annals of Mathematics and Artificial Intelligence, 37(1): 131-167.

155 Luck, M., d'Inverno, M., Fisher, M. and contributors, F., 1998. Foundations of Multi- Agent Systems: Techniques, Tools and Theory. Knowledge Engineering Review, 13(3): 297-302. Luger, G.F., 2002. Artificial intelligence, structures and strategies for complex problem solving. Pearson Education, Harlow, England. Lukasheh, A.F., Droste, R.L.y. and Warith, M.A., 2001. Review of Expert Systems (ES), Geographic Information System (GIS), Decision Support System (DSS), and their applications in landfill design and management. Waste Management & Research, 19: 177-185. MacEachren, A. M. 1995. How Maps Work: Representation, Visualization and Design, New York: The Gilford Press Machlup, F., 1983. Semantic Quirks in Studies of Information. In: F. Machlup and U. Mansfield (Editors), The Study of Information. John Wiley & Sons, New York, pp. 641-671. Mackay, D.S., Robinson, V.B. and Band, L.E., 1993. An integrated knowledge-based system for managing spatiotemporal ecological simulations. AI Applications, 7(1): 29-36. Maderlechner, G. and Mayer, H., 1994. Automated acquisition of geographic information from scanned maps for GIS using frames and semantic networks, 12th International Conference on Pattern Recognition, IEEE, pp. 361-363. Manson, S.M., 2000. Agent-based dynamic spatial simulation of land-use/cover change in the Yucatan peninsula, Mexico., Fourth International Conference on Integrating GIS and Environmental Modeling (GIS/EM4), Banff, Canada. Marangoz, A.M., Oruc, M. and Buyuksalih, G., 2004. Object-oriented Image Analysis and Semantic Network for Extracting the Roads and Buildings from Ikonos Pan- sharpened Images. International Archives of Photogrammetry and Remote Sensing, 35(B3). Mark, D.M., Chrisman, N., Frank, A.U., McHaffie, P.H. and Pickles, J., 1996. The GIS History Project. http://www.geog.buffalo.edu/ncgia/gishist/. McNeese, M. D., Rentsch, J. R., and Perusich, K. 
2000, Modeling, measuring, and mediating teamwork: the use of fuzzy cognitive maps and team member schema similarity to enhance BMC3 I decision making, pp. 1081. Mennis, J.L., Peuquet, D.J. and Qian, L., 2000. A conceptual framework for incorporating cognitive principles into geographical database representation. International Journal of Geographical Information Science, 14(6): 501-520.

Millheim Borough Water System, 2003. Emergency Response Plan, Millheim, PA. Minsky, M., 1975. A Framework for Representing Knowledge. In: P. Winston (Editor), The Psychology of Computer Vision. McGraw-Hill. Minsky, M., 1991. Logical Versus Analogical or Symbolic Versus Connectionist or Neat Versus Scruffy. AI Magazine, 12(2): 34-51. Minsky, M., 2000. Commonsense-based interfaces: to build a machine that truly learns by itself will require a commonsense knowledge representing the kinds of things

156 even a small child already knows. COMMUNICATIONS OF THE ACM, 43(8): 67-74. Mori, A. and Cosoli, P., 1991. An Expert Tool for SPOT-Landsat/TM Data Integration. nternational Geoscience and Remote Sensing Symposium, 3: 1869-1872. Newell, A. and Simon, H.A., 1976. Computer Science as Empirical Inquiry: Symbols and Search. Communications of the ACM, 19(3): 113-126. Nonaka, I. and Takeuchi, H., 1995. The knowledge-creating company: how Japanese companies create the dynamics of innovation. Oxford University Press, New York. Novak, J.D., 1990. Concept mapping: A usefull tool for science education. Journal of Research in Science Teaching, 27(10): 937-949. Novak, J.D., 1991. Clarify with concept maps: A tool for students and teachers alike. The Science Teacher, 58(7): 45-49. Novak, J.D. and Gowin, D.B., 1984. Learning How To Learn. Cambridge University Press, New York. Nute, D., Potter, W.D., Maier, F., Wang, J., Twery, M., Rauscher, H.M., Knopp, P., Thomasma, S., Dass, M., Uchiyama, H. and Glende, A., 2004. NED-2: An Agent- Based Decision Support System for Forest Ecosystem Management. Envirnomental Modeling and Software, 19: 831-843. Nwana, H.S., 1996. Software Agents: An Overview. Knowledge Engineering Review, 11(3): 205-244. O'Sullivan, D. and Haklay, M., 2000. Agent-based models and individualism: is the world agent-based? Environment and Planning A, 32(8): 1409-1425. Parker, D.C., Berger, T., Manson, S.M. and McConnell, W.J., 2001. Agent-Based Models of Land-Use / Land-Cover Change (LUCC Report Series No.6). 6, LUCC Focus 1 Office, Anthropological Center for Training and Research on Global Environmental Change, Indiana University, 2002. Parker, D.C., Manson, S.M., Janssen, M.A., Hoffmann, M.J. and Deadman, P., 2002. Multi-Agent Systems for the Simulation of Land-Use and Land-Cover Change: A Review. http://www.csiss.org/events/other/agent- based/papers/maslucc_overview.pdf. Parker, D.C., Manson, S.M., Janssen, M.A., Hoffmann, M.J. and Deadman, P., 2003. 
Multi-Agent Systems for the Simulation of Land-Use and Land-Cover Change: A Review. Annals of the Association of American Geographers, 93(2): 314-337. Parker, D.C. and Meretsky, V., 2002. Measuring Emergent Properties of Agent-Based Land-use Models Using Spatial Metrics. Agriculture, Ecosystems, and Environment. Pattison, W.D., 1964. The four tradtions of geography. Journal of Geography, 63: 211- 216. Perusich, K. and McNeese, M. D. 2005, Using fuzzy cognitive maps as an intelligent analyst. CIHSPS 2005 – IEEE International Conference on Computational

157 Intelligence for Homeland Security and Personal Safety, Orlando, FL, USA, 31 March - 1 April 2005, pp. 9 Peucker, T.K. and Chrisman, N., 1975. Cartographic data structures. The American Cartographer, 2(1): 55-69. Peuquet, D.J., 1984. An conceptual framework and comparison of spatial data models. Cartographica, 21(14): 66-113. Peuquet, D.J., 1987. Research issues in artificial intelligence and geographic information systems, Proceedings--International Geographic Information Systems (IGIS) Symposium: The Research Agenda I, pp. I-119. Peuquet, D.J., 1988. Representations of geographic space: toward a conceptual synthesis. Annals of the Association of American Geographers, 78: 375-394. Peuquet, D.J., 1993. What, Where and When - A Conceptual Basis for Design of Spatiotemporal Databases, Workshop on Advances in Geographic Information Systems, in conjunction with Conference on Information and Knowledge Management, Association for Computing Machinery, Washington, D.C. Peuquet, D.J., 1994. It’s about time: a conceptual framework for the representation of temporal dynamics in geographic information systems. Annals of the Association of American Geographers. 84(3): 441–461. Peuquet, D.J., 1999. Time in GIS and geographical databases. In: P.A. Longley, M.F. Goodchild, D.J. Maguire and D.W. Rhind (Editors), Geographical Information Systems. John Wiley & Sons, Inc., New York, pp. 91-103. Peuquet, D.J., 2001. Making Space for Time: Issues in Space-Time Data Representation. GeoInformatica, 5(1): 11-32. Peuquet, D.J., 2002. Representations of space and time. Guilford Press, New York. Peuquet, D.J. and Duan, L., 1995. An Event-Based Spatiotemporal Data Model (ESTDM) for Temporal Analysis of Geographical Data. International Journal of Geographical Information Systems, 9(1): 7-24. Portugali, J., Benenson, I. and Omer, I., 1994. Sociospatial residential dynamics: stability and instability within a self-organizing city. Geographical Analysis., 26(4): 321- 340. 
Portugali, J., Benenson, I. and Omer., I., 1997. Spatial cognitive dissonance and sociospatial emergence in a self-organizing city. Environment and Planning B: Planning & Design, 24(2): 263-285. Power, C., Simms, A. and White, R., 2001. Hierarchical fuzzy pattern matching for the regional comparison of land use maps. International Journal of geographical information science, 15(1): 77-100. Prasad, R. and Sinha, A.K., 2003. Role of Expert System in Natural Resources Management. www.gisdevelopment.net/ application/nrm/overview/ma03130.htm. Quigley, E.J. and Debons, A., 1999. Interrogative Theory of Information and Knowledge, in Proceedings of SIGCPR '99, ACM Press, New Orleans, LA., pp. 4-10. Quillian, M.R., 1968. Semantic memory. In: M. Minsky (Editor), Semantic information Processing, Cambridge, MA, pp. 227-270.

158 Rana, O., Preist, C. and Luck, M., 2000. Progress in Multi-Agent Systems Research. Knowledge Engineering Review, 15(3): 285-292. Raper, J. and Livingstone, D., 1995. Development of a Geomorphological Spatial Model Using Object-Oriented Design. International Journal of Geographical Information Systems, 9(4): 395-383. Raper, J.F., 1999. Spatial representation: the scientist's perspective. In: P.A. Longley, M.F. Goodchild, D.J. Maguire and D.W. Rhind (Editors), Geographical Information Systems. John Wiley & Sons, Inc., New York, pp. 61-70. Raper, J.F., 2000. Multidimensional geographic information science. Taylor & Francis, London and New York. Raubal, M., 2001a. Human wayfinding in unfamiliar buildings: a simulation with a cognizing agent. Cognitive Processing: 363-388. Raubal, M., 2001b. Ontology and epistemology for agent-based wayfinding simulation. International Journal of geographical information science, 15(7): 653-665. Raubal, M. and Worboys, M., 1999. A Formal Model of the Process of Wayfinding in Built Environments. In: C. Freksa and D. Mark (Editors), Spatial Information Theory - Cognitive and Computational Foundations of Geographic Information Science, International Conference COSIT '99. Stade, Germany, pp. 381-399. Rigol, J.P., Jarvis, C.H. and Stuart, N., 2001. Artificial neural networks as a tool for spatial interpolation. International Journal of Geographical Information Science, 15(4): 323-343. Roddick, J.F., Hornsby, K. and de Vries, D., 2003. A unifying semantic distance model for determining the similarity of attribute values, Proceedings of the twenty-sixth Australasian computer science conference on Conference in research and practice in information technology, Adelaide, Australia, pp. 111 - 118. Rodrigues, A. and Raper, J., 1999. Defining spatial agents. In: A.S. Câmara and J. Raper (Editors), Spatial multimedia and virtual reality. Taylor & Francis, London. Rodríguez, M.A. and Egenhofer, M., 2004. 
Comparing geospatial entity classes: an asymmetric and context-dependent similarity measure. International Journal of geographical information science, 18(3): 229-256. Rosch, E., 1978. Principles of categorization. In: E. Rosch and B. Lloyd (Editors), Cognition and Categorization. Lawrence Erlbaum Associates, Hillsdale, N.J, pp. 27–77. Ross, J., 1993. An expert system for soil erosion mitigation in logging operations on steep land. AI Applications in Natural Resources, Agriculture, and Environmental Sciences, 7(4): 69-70. Russell, S. and Norving, P., 1995. Artificial intelligence, a mordern approach. Prentice- Hall, Inc., Upper Saddle River, NJ. Satur, R. and Liu, Z. Q. 1999, A contextual fuzzy cognitive map framework for geographic information systems. IEEE Transactions on Fuzzy Systems, 7(5), 481- 494.

159 Schelhorn, T., O’Sullivan, D., Haklay, M. and Thurstain-Goodwin, M., 1999. STREETS: an agent-based pedestrian model. Schelling, T.C., 1969. Models of segregation. American Economic Association Papers and Proceedings., 59(2): 488-493. Schelling, T.C., 1971. Dynamic models of segregation. Journal of Mathematical Sociology, 1: 143-186. Schelling, T.C., 1978. Micromotives and macrobehavior. Norton, New York. Schreiber, G., Akkermans, H., anjewierden, A., de Hoog, R., Shadbolt, N., Van de Velde, W. and Wielinga, B., 2000. Knowledge Engineering and Management: the CommonKADS Methodology. The MIT Press, Cambridge, Massachusetts. Schweighofer, E., Rauber, A. and Dittenbach, M., 2001. Automatic text representation, classification and labeling in European law, Proceedings of the 8th international conference on Artificial intelligence and law. ACM Press, NY, USA, St. Louis, Missouri, United States, pp. 78 - 87. Sengupta, R.R. and Bennett, A.D.A., 2003. Agent-based modelling environment for spatial decision support. International Journal of geographical information science, 17(2): 157–180. Setzer, V.W., 2001. Data, Information, Knowledge and Competency. http://www.ime.usp.br/~vwsetzer/data-info.html. Sha, Z. and Hu, Z., 2004. SSKM: A tool to improve spatial decision support ability of GIS, International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, Istanbul, Turkey, pp. 286-293. Sharma, R., Yeasin, M., Krahnstöver, N., Rauschert, I., Cai, G., Brewer, I., MacEachren, A. and Sengupta, K., 2003. Speech-Gesture Driven Multimodal Interfaces for Crisis Management. Proceedings of the IEEE, Special Issue on Multimodal Human-Computer Interface, 91(9): 1327-1354. Shirabe, T., 2004. Towards a temporal extension of spatial allocation modeling. In: M.J. Egenhofer, C. Freksa and H. Miller (Editors), GIScience 2004, University of Maryland, Maryland, pp. 285-298. Shoham, J. and Tennenholtz, M., 1995. 
Social laws for artificial agent societies: Off-line design. Artificial Intelligence(73). Shoham, Y., 1993. Agent-oriented programming. Artificial Intelligence, 60(1): 51–92. Sierra, C., Faratin, P. and Jennings, N.R., 1999. Deliberative Automated Negotiators Using Fuzzy Similarities, EUSFLAT-ESTYLF Joint Conference on Fuzzy Logic, Palma de Mallorca, Spain, pp. 155-158. Smith, B., 2003. Ontology. In: L. Floridi (Editor), The Blackwell Guide to the Philosophy of Computing and Information. Blackwell, Malden, MA, pp. 155-166. Smith, B. and Mark, D., 2001. Geographical categories: an ontological investigation. International Journal of geographical information science, 15(7): 591-612. Smith, E.E., 1998. An Expert System Approach to Regional Drought Monitoring in Pennsylvania. M.S. Thesis, The Pennsylvania State University, University Park, 159 pp.

160 Sowa, J.F., 1984. Conceptual structures: information processing in mind and machine. Reading, MA: Addison-Wesley. Sowa, J.F., 2000. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks Cole Publishing Co., Pacific Grove, CA. Sowa, J.F., 2002. Semantic Networks. http://www.jfsowa.com/pubs/semnet.htm. Spek, R.v.d. and Spijkervet, A., 1997. Knowledge Management: Dealing Intelligently with Knowledge, CIBIT, Utrecht. Stenmark, D., 2002. Information vs. Knowledge: The Role of intranets in Knowledge Management. In Proceedings of HICSS-35, IEEE Press, Hawaii, January 7-10, 2002. Stonebraker, M., 1986. Inclusion of abstract data types and abstract indexes in a database system, Proceedings 1986 IEEE data engineering conference. IEEE Computer Society, Los Alamitos, pp. 262-269. Stuckenschmidt, H. and Harmelen, F.v., 2001. Ontology-based Metadata Generation from Semi-Structured Information, Proceedings of the 1st International Conference on Knowledge Capture (K-CAP 2001), Morgan Kaufmann. Suchan, T.A., 1998. Categories in geographic representation. Ph.D Thesis, The Pennsylvania State University, University Park, 236 pp. Tambe, M., Johnson, W.L., Jones, R.M., Koss, F., Laird, J.E., Rosenbloom, P.S. and Schwamb, K., 1995. Intelligent Agents for Interactive Simulation Environments. Spring 1995 Issue of AI Magazine. Tatem, A.J., Lewis, H.G., Atkinson, P.M. and Nixon, M.S., 2003. Increasing the spatial resolution of agricultural land cover maps using a Hopfield neural network. International Journal of Geographical Information Science, 17(7): 647-672. Tiangng, L., Quanggong, C., Jizhou, R. and Yuansu, W., 2004. A Gis-Based Expert System For Pastoral Agricultural Development In Gansu Province P. R. China. New Zealand Journal of Agricultural Research, 47(3): 313-325. Tomlinson, R.F., 1973. A technical description of the Canada geographic information system. Tönjes, R. and Grown, S., 1998. Knowledge based road extraction from multisensor imagery. 
In: T.I.S.O.R.A.S.C.F.M.A.M. Pixels" (Editor), Columbus, Ohio, USA, July 1998. Tversky, B., 1992. Distortions in cognitive maps. Geoforum, 23(2): 131-138. Tversky, B. and Hemenway, K., 1983. Categories of environmental scenes. Cognitive Psychology, 15: 121–149. Tversky, B. and Hemenway, K., 1984. Objects, parts, and categories. Journal of Experimental Psychology : General, 113(2): 169–193. U.S. Census Bureau, 1969. The DIME geocoding system, Report No. 4, Census use study. Vitousek, P.M., Mooney, H.A., Lubchenco, J. and Melillo, J.M., 1997. Human Domination of Earth's Ecosystems. Science, 277(25): 494-499.

161 Wachowicz, M., 1999. Object-oriented design for temporal GIS. Research monographs in GIS. Taylor & Francis, London, Bristol. Wang, D. and Cheng, T., 2001. A spatio-temporal data model for activity-based transport demand modelling. International Journal of Geographical Information Science, 15(3): 561-585. Westervelt, J.D. and Hopkins, L.D., 1999. Modeling mobile individuals in dynamic landscapes. International Journal of geographical information science, 13(3): 191- 208. Wiederhold, G., 1994. Interoperation, Mediation and Ontologies, International Symposium on Fifth Generation Computer Systems (FGCS94), Tokyo, Japan. Wiig, K.M., 1993. Knowledge Management Foundations: Thinking About Thinking - How People and Organizations Create, Represent, and Use Knowledge,. Schema Press, Arlington, TX. Wilson, J.P. and Burrough, P.A., 1999. Dynamic Modeling, Geostatistics, and Fuzzy Classificaiton: New Sneakers for a New Geography? Annals of the Association of American Geographers, 89(4). Winston, P.H., 1993. Artificial Intelligence. Addison-Wesley Longman. Woodcock, C.E. and Gopal, S., 2000. Fuzzy set theory and thematic maps: accuracy assessment and area estimation. International Journal of geographical information science, 14(2): 153-172. Woodward, D., 1992. Representations of the world. In: R.F. Abler, M.G. Marcus and J.M. Olson (Editors), Geography's Inner Worlds. Rutgers Universtiy Press New Brunswick, New Jersey. Wooldridge, M. and Jennings, N.R., 1995. Intelligent Agents: Theory and Practice. The Knowledge Engineering Review, 10(2): 115-152. Wooldridge, M.J. and Jennings, N.R., 1999. Software Engineering with Agents: Pitfalls and Pratfalls. IEEE Internet Computing, 3(3): 20-27. Xu, J., 2003. Implement an Intelligent ArcView User Interface Using SNePS, 23rd Annual ESRI User Conference, San Diego, California, July 7-11, 2003. Yuan, M., 1994. Wildfire conceptual modeling for building GIS space-time models, Proceedings of GIS/LIS '94, Phoenix. Yuan, M., 1997. 
Development of a Global Conceptual Schema for Interoperable Geographic Information, International Conference and Workshop on Interoperating Geographic Information Systems. http://www.ncgia.ucsb.edu/conf/interop97/program/papers/yuan/yuan.html, Santa Barbara, California. Yuan, M., 2001. Representing complex geographic phenomena in GIS. Cartography and Geographic Information Science, 28(2): 83-96. Zadeh, L.A., 1965. Fuzzy Sets. Information and Control, 8: 338- 353. Zaff, B.S., McNeese, M.D. and Snyder, D.E., 1993. Capturing multiple perspectives: a user-centered approach to knowledge and design acquisition. Knowledge Acquisition, 5: 79-116.

162 Zeleny, M., 1987. Management Support Systems: Towards Integrated Knowledge Management. Human Systems Management, 7(1): 59-70.

APPENDICES

Appendix A: Agent-related functions in MadKit (http://www.MadKit.org)

Appendix A.1 The general definitions and functions in MadKit

• Agent: an active, communicating entity that plays roles within groups.
• Groups: atomic sets of agent aggregation. Each agent is part of one or more groups. In its most basic form, a group is only a way to tag a set of agents. In a more developed form, in conjunction with the role definition, it may represent any usual multi-agent system. Groups have the following characteristics: (1) an agent can be a member of multiple groups at the same time; (2) groups can freely overlap; and (3) an agent may request admission to any group.
• Roles: abstract representations of an agent's functions, services, or identifications within a group. Each agent can handle multiple roles, and each role handled by an agent is local to a group.
• Life cycle: all agents contain "activate" and "end" sections. When an agent is created, the MadKit kernel calls its activate() method to instantiate, construct, and register the agent. When an agent terminates, the kernel calls its end() method.
• Messaging: MadKit provides several kinds of predefined messages, such as ACLMessages and KQMLMessages.
• Graphical interfaces: each agent has its own graphical interface, which can be a simple label or a complex interface in its own window.
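The life cycle and the group-local nature of roles described above can be illustrated with a small, self-contained sketch. It is written in Python purely for illustration and does not use MadKit itself: the classes ToyAgent and ToyKernel, the methods launch, kill, and play, and the group and role names are all hypothetical. Only the activate/end calling order and the "roles are local to a group" rule are taken from the description above.

```python
class ToyAgent:
    """Hypothetical stand-in for an agent; this is not the MadKit Agent class."""

    def __init__(self, name):
        self.name = name
        self.log = []    # life-cycle events, in the order the kernel triggers them
        self.roles = {}  # group name -> set of roles (a role is local to its group)

    def activate(self):
        # Called once by the kernel when the agent is created.
        self.log.append("activate")

    def end(self):
        # Called once by the kernel when the agent terminates.
        self.log.append("end")

    def play(self, group, role):
        # An agent may belong to several groups and hold several roles in each.
        self.roles.setdefault(group, set()).add(role)


class ToyKernel:
    """Hypothetical kernel that drives the activate/end life cycle."""

    def __init__(self):
        self.running = []

    def launch(self, agent):
        agent.activate()
        self.running.append(agent)

    def kill(self, agent):
        self.running.remove(agent)
        agent.end()


kernel = ToyKernel()
well_agent = ToyAgent("well-agent")
kernel.launch(well_agent)
well_agent.play("water-system", "source")
well_agent.play("water-system", "monitored-site")
well_agent.play("regulation", "test-subject")
kernel.kill(well_agent)
print(well_agent.log)
```

In this sketch the same agent holds two roles in one group and a third role in another group, and the kernel, not the agent, decides when activate() and end() run.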

Appendix A.2 Agent-related JESS functions provided by Madkit

• (me): returns a reference to the current agent.
• (pause me): pauses the agent for the given amount of time in milliseconds.
• (println s): displays the string s on the standard output of the agent.
• (getAddress a): returns the agent address of the agent a.
• (setName a s), (getName a): sets (or returns) the name of an agent.
• (createGroup c g), (createGroup g): creates a group g in the community c if the parameter is present, or in the community "public" otherwise.
• (requestRole c g r), (requestRole g r): requests to play the role r in the group g of the community c if the parameter is present, or of the community "public" otherwise. As usual, groups, roles, and communities are passed as strings.
• (leaveRole c g r), (leaveRole g r): leaves the role r in the group g of the community c if the parameter is present, or of the community "public" if not. If the current agent plays only the role r in g, it also leaves the group.
• (leaveGroup c g), (leaveGroup g): leaves the group g of the community c if the parameter is present, or of the community "public" if not.
• (getAgentsWithRole c g r), (getAgentsWithRole g r): returns a list of all agents (represented by their AgentAddress) playing the role r in the group g. The group is taken from the community c if the parameter is present, or from the community "public" if not.
• (getAgentWithRole c g r), (getAgentWithRole g r): returns one agent taken at random from the list returned by getAgentsWithRole.
• (isCommunity c): returns TRUE if the community c exists, i.e., if the kernel (represented by the SiteAgent) is present in this community. Returns FALSE otherwise.
• (isGroup c g), (isGroup g): returns TRUE if the group g exists in the community c if the parameter is present, or in the community "public" if not. Returns FALSE otherwise.
• (isRole c g r), (isRole g r): returns TRUE if the role r exists in the group g of the community c if the parameter is present, or of the community "public" if not. Returns FALSE otherwise.
• (getMyGroups? c), (getMyGroups): returns the list of all groups of which the current agent is a member. Groups are taken from the community c if the parameter is present, or from the community "public" if not.
• (getExistingGroups? c), (getExistingGroups): returns the list of all groups existing in the community c if the parameter is present, or in the community "public" if not.
• (getMyRoles? c g), (getMyRoles g): returns the list of all roles played by the current agent in the group g. The group is taken from the community c if the parameter is present, or from the community "public" if not.
• (getExistingRoles? c g), (getExistingRoles g): returns the list of all roles existing in the group g. The group is taken from the community c if the parameter is present, or from the community "public" if not.
• (getAvailableCommunities): returns the list of all communities in which the current kernel (through its SiteAgent) is present. By default, it contains at least one element, the "public" community.
• (sendMessage a m): sends the message m to the agent represented by its agent address a.
• (broadcastMessage c g r m), (broadcastMessage g r m): sends the message m to all agents playing the role r within the group g in the community c (or in the community "public" if this parameter is not present).
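The semantics shared by several of the functions listed above (an optional community that falls back to "public", leaving a group once the last role in it is dropped, and broadcasting by role) can be sketched as a small in-memory registry. This is an illustrative Python model under stated assumptions, not MadKit or JESS code: ToyOrganization and all of its method names are hypothetical, agents are represented by plain strings instead of AgentAddress objects, and the community is passed as a keyword argument, whereas the JESS functions take it as an optional first argument.

```python
from collections import defaultdict

DEFAULT_COMMUNITY = "public"  # fallback when no community is given, as in the list above


class ToyOrganization:
    """Hypothetical registry that mirrors the described semantics; not MadKit code."""

    def __init__(self):
        # (community, group) -> role -> set of agent names
        self._roles = defaultdict(lambda: defaultdict(set))
        self.inbox = defaultdict(list)  # agent name -> messages received

    def request_role(self, agent, group, role, community=DEFAULT_COMMUNITY):
        self._roles[(community, group)][role].add(agent)

    def leave_role(self, agent, group, role, community=DEFAULT_COMMUNITY):
        roles = self._roles[(community, group)]
        roles[role].discard(agent)
        if not roles[role]:
            del roles[role]  # drop roles that no agent plays any more

    def is_member(self, agent, group, community=DEFAULT_COMMUNITY):
        # An agent that plays no role in a group is no longer a member of it.
        return any(agent in members
                   for members in self._roles[(community, group)].values())

    def get_agents_with_role(self, group, role, community=DEFAULT_COMMUNITY):
        return sorted(self._roles[(community, group)].get(role, set()))

    def broadcast_message(self, group, role, message, community=DEFAULT_COMMUNITY):
        for agent in self.get_agents_with_role(group, role, community):
            self.inbox[agent].append(message)


org = ToyOrganization()
org.request_role("a1", "sensors", "reporter")                       # community "public"
org.request_role("a2", "sensors", "reporter")                       # community "public"
org.request_role("a2", "sensors", "reporter", community="private")  # explicit community
org.broadcast_message("sensors", "reporter", "status?")             # reaches "public" only
org.leave_role("a1", "sensors", "reporter")                         # a1 held no other role
print(org.get_agents_with_role("sensors", "reporter"))
```

Keying the registry on the (community, group) pair keeps groups with the same name in different communities separate, which is what the "public" fallback in the function list implies.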


Appendix B: Two examples of the interview transcripts

Y: the interviewer, Chaoqing Yu
P: the participant (interviewee)
(): comments, or words that were not clear but had a meaning similar to the text in the parentheses.

Appendix B.1 Transcript A

A1. Concept Map 1: Aaronsburg CWS

Y: Do you think you can understand this concept map?
P: Yes

Y: Since you said that you did understand this concept map, can you explain why they (i.e., the Aaronsburg-CWS) were trying to abandon the primary source well #8 in 2004?
P: Ah, they (DEP) don't like it. They did SWIP test, which was Surface Water Identification Protocol. And they concluded that it was under direct surface influence. So DEP suggested drilling a new well, boil water advisory, and reach water from Millheim water system. It was bad because it was in the limestone aquifer, which has sinkholes. Because of that, it got surface water influence.
Y: The question is, this well was drilled in 1988 and had been used as the primary water source since then. Why did they just attempt to abandon it until recently?
P: Because…, I don't know. Cost? (Exploring the concept map again)
P: OK, I think I get the answer now. Because the Safe Drinking Water Act Amendments of 1996 came from EPA, and was adopted by DEP, which (required) that SWIP test (i.e., 1997). And it is related to some surface water influence, because it is on the limestone with sinkholes. Ah, so that's why.
Y: Yes.

Y: With the aid of the layer of the geospatial map, do you think you can understand the spatial relationships between the wells and aquifers better?
P: Yes. And I can see where the aquifers are. I think it is good. I have an idea.
Y: Yeah?
P: Maybe it is nice to have some dots to represent the cities. Or you can make some other spatial landmarks, or be able to zoom out to see the routes of Pennsylvania.
Y: That is a very good idea.

Y: Can you imagine what problem this system now is facing?
P: Cost, money. The cost (i.e., water price here) is increasing. They had leaking problems. And they are not very big because I can see the population is (only) around 400. So this makes them don’t have a lot of tax money to support this. And their pipes are old, about 12 miles. Maybe it will be nice to be able to drag a line here so I see what the 12-mile looks like on the map, just because it's on the concept map. So I might want to see what the 12-mile looks like.

Y: Can you summarize how the DEP affected this water system?
P: They came in, did the SWIP tests. And the primary well was well #8 until recently. And they found surface water influence. So they had to find new wells and adopt boil advisory. So they drilled another well. And the DEP did another SWIP test. And they found Cryptosporidium and Giardia, whatever. So they need filtration plant. That cost them around $770k, a whole lot of money. That came to a big problem. The DEP, presumably, is responsible for them to have (safe drinking) water, like filtration. That cost them lots lots of money, very poor.

A2. Concept Map 2: The Upper Halfmoon Town CWS

P: Probably (you can put more spatial information, such as) the 322 (i.e., a road) and I-80 on the map. Y: OK. You want to know some more spatial information? P: Yeah. I know where Aaronsburg is, I know where Millheim is, but I didn't know where Upper Halfmoon was. Yeah, I definitely want some spatial information. I mean, it depends on what kinds of question you will ask me. But the first thing I would do is to locate where this place is.

Y: Can you explain how this system was formed and has developed from a tiny rural water system since the 1960s, and why this water system now has a very good shape in term of water sources and revenue? (The laptop screen was closed while the participant was answering the question) P: Just all started back. A warm, hot, sunny day in 1964, ah…, there was a drought. The farmers got together to create the water company, led by Ralph Sealey, served 26 households. The time they had two wells, old wells. That's 1964. They wanted some new wells, but that didn't happen until 1990 when the real estate came, because the State College's booming. The whole State College area, including Upper Halfmoon, was booming. So there was cooperation between the developers and the water company. The old wells, at least one of them, had problem in water quality, maybe because of the limestone aquifer. So they were getting rid of these two wells, and got two new wells, I don't remember, because the sandy stone aquifer, but not a limestone aquifer anyway. These two wells were nice, were expanded and modernized. And they got more money, have more people. And everybody is happy. Y: Right. That is what's going on there. P: On this map, I didn't see any DEP test. Y: In that interview, we just ran out of time. We didn't mention that. P: OK. Maybe you can have a node to have "SWIP test" there, with an attribute window just says whatever information needed there.

A3. Concept Map 3: The Millheim CWS

Y: The Elk Creek is bigger than the Philips Creek. Can you explain why Elk Creek is a secondary source? P: OK. I just want to start from a small branch. (Exploring reading the concept map…), Because Philips Creek has better water quality. It is protected. Y: I mean Elk creek. P: Elk creek is poor because agriculture and natural runoffs.

Y: Can you explain how the snowstorm in 1995 affected the water system, and how the water system responded to this event? P: The snow knocked down the trees, and (caused) power outage. It caused the alarm, and … (exploring the concept map)… one full time guy was informed by the alarm. He decided to use a generator to pump water to the storage tank.

A4. Open questions:

Y: Do you feel that the spatial information was useful for your understanding the contents of the concept maps? If so, in which ways? P: Yes. The spatial information is important, not the questions you asked me. For example, that geology information, that is interesting. To answer your question, I don't think that is necessary. That is only (beneficial) for my own knowledge.

Y: With a layer of concept maps on the top of geospatial data, do you think there are any differences between using this tool and using the current data-oriented GIS tools, such as MapInfo or ArcGIS? P: I guess because I use Arc so much I wish you could put more spatial information on it. Not for criticizing, just for suggestions, it (i.e., ArcView) should have more capabilities, because I know what ArcView can do. Then I can help to compare to that. For example this (i.e., the GIS map of Millheim on the screen), I really want to know more information. So I would probably just look at ArcView. It would not be that much. You could have more layer, streets, and dots for cities. Y: The question behind is, do you think you can answer the how and why questions only use ArcGIS if without much background information? P: You mean the questions you just asked? Y: Yes. P: No. I mean, having a concept map is obviously very nice. I think, once I got hung on it, I tried so much myself to read a long line (i.e., branch of the concept nodes, linked with directional relationship). Sometime I think you have to stop, to another node. That's a little confusing at first. But once you got hung on it, I think it is better than having a narrative paragraph. And then looking the map, it allows seeing the two things (i.e., the concept map and the GIS map), it's very nice; and being able to see the linkages in the way in the concept map rather than the traditional narrative type thing. Y: text-based? P: text-based, yeah. I think it is better.

Y: You mean it (concept map) is better (than narrative paragraph) to build connections? P: yes, or could get a lot of knowledge very quickly. You look at the nodes, and get an idea of what's going on. We can go back to see the "snow storm", "water pressure", "pump station" later, and "online support." I know eventually what happened there. If I want more information, I can move over. Yes, it is much better, definitely.

Y: Do you have any suggestions or comments to improve its functionality or usability? P: I guess a couple of things about the concept map, I just sort of picky. It will be nice if I don't have to rotate the concept map too often to read it. For example, this relationship "include", presumably is underneath the "reservoir" node, I have to rotate it in order to see it better. It will be nice if just automatically (to be displayed) when you move on a line. Because I just felt I had to rotate all the time. If it is a big concept map, I think I will come to get annoy. Y: OK. P: And I think it will be nice to be able to set my own color scheme. If you define each of the nodes as thing, time, whatever, and then I can make all those time green, all the places blue. And then I will get more information more quickly. Y: OK. P: And, I guess the things with shapes. I see the ovals, squares. I don't know what are the differences there. Just more interactive capability will make it nicer. Y: Yeah. P: Oh, the last thing. I see each of the nodes, or relationships, has a popup window. But some of them don't. You mean the "*" (indicates that)? Y: yes. P: To me, that is not apparent enough. Maybe sometimes you can make it bolder, with underline, color, whatever, just more apparent.

Appendix B.2 Transcript B

B1. Concept Map 1: Aaronsburg CWS

Y: Do you think you can understand this concept map? P: I think I do understand.

Y: Since you said that you did understand this concept map, can you explain why they (i.e., the Aaronsburg-CWS) were trying to abandon the primary source well #8 in 2004? P: Yeah, I can. Basically, well #8 is on the limestone aquifer, which has sinkholes that can give direct influence from the surface, where the surface water flows to the well and result in poor water quality. In conjunction with the DEP, they made them test the water. And they found direct surface influence. So they could have done a couple of different things. They could have treated the water from the well. And they did pump less water, ah… for using boil advisory when in the service, which they did. Or just find new wells, so they ended up going that direction (i.e., finding new wells). Y: The question is, this well was drilled in 1988 and had been used as the primary water source since then. Why did they just attempt to abandon it until recently?

P: Interesting. So the DEP really made difference. I think this program was new. I think so, Yeah. I can see that. OK, that happened in 96, yeah. I remember that happened in Bellefonte (a town in the State College area, Centre County, PA). They had to start testing now. They also have to test the big spring (in Bellefonte), because of the SWIP.

Y: With the aid of the layer of the geospatial map, do you think you can understand the spatial relationships between the wells and aquifers better? P: As long as better legend is coded, but wasn’t. The map is confusing me (i.e., only one color used for highlighting the features).

Y: Can you imagine what problem this system now is facing? P: Huge. Y: What's the problem? P: Well, the biggest problem is that, OK, the wells were impacted by the SWIP tests. They abandoned the old primary source. They drilled another well (the new well), which is on the sandstone aquifer. They thought that's going to be safe because it is sandstone aquifer. They drilled the well, and did the SWIP test (in 2002). And they found that this well did also have surface water connection (i.e., one cryptosporidium and two giardia found in this test). So they had to build the expensive filtration plant. So the price is going to rise up. I think that is the problem. It is not fun. Y: Because this is a tiny system. P: Yeah, poor people. Well, because I am studying some environment issues. I understand what DEP to do what it does. But it's true that they have huge impact on small communities. Y: OK, that is actually the next question.

Y: Can you summarize how the DEP affected this water system? P: They have to abandon the primary well, and to do the very expensive treating. And they have springs, but I guess that must be not that much water. Y: That's the influence from the social side. What is the physical impact on this system? P: The connection between the surface water and ground water, and contamination. I am guessing, I don't know this area, that there would be (some influence from) agriculture and urbanization (note: not for the new well, because it is on a mountain, and is surrounded by forest). Y: You are doing pretty well.

B2. Concept Map 2: The Upper Halfmoon Town CWS

Y: Can you explain how this system was formed and has developed from a tiny rural water system since the 1960s, and why this water system now has a very good shape in term of water sources and revenue? (The laptop screen was closed while the participant was answering the question) P: On the drought side, one of the physical things was that they were relying on springs that sounds they weren't deep springs because they were sensitive to drought (i.e., 1964, before the formation of this water system). The other physical thing was the wells (i.e., the two old wells) they drilled. They just didn't yield very well, or they had poor quality. They had surface influence, or they just weren't yielding very well. I guess the other physical influence was, just proximity here, a growing community. That's sort of the physical thing. Y: OK, how about socioeconomic reasons? P: I guess, it (the Halfmoon Town) used to be a farming community. So they were relying on their water for their well-being. So they came together to form that water company in the first place. The other driver was obviously because the economic development and population growth in State College. That is a sort of demographic thing that changing the income level and tax.

B3. Concept Map 3: The Millheim CWS

Y: The Elk Creek is bigger than the Philips Creek. Can you explain why Elk Creek is a secondary source? P: Well, looks like mainly water quality. The agriculture runoff creates the problem, such as pollution and erosion.

Y: Can you explain how the snowstorm in 1995 affected the water system, and how the water system responded to this event? P: So the storm knocked down the trees, and cut off the power supply. There was a big power outage. The water pressure went down. They had an alarm to inform the full-time employee. They used a generator to pump water to the storage (tank) to increase the water pressure.

B4. Open questions:

Y: Do you feel that the spatial information was useful for your understanding the contents of the concept maps? If so, in which ways? P: For me, personally, not really. But if I am trying to come out alternative, and know the area very well, it will be helpful to have spatial data. If I am trying to plan or make decision, I think it will be very helpful. But for me, I don't think that makes that much difference. Again, it is interesting, but not very helpful. I guess that it will be helpful to people who are familiar with this area. But I think I got the most from the concept map.

Y: With a layer of concept maps on the top of geospatial data, do you think there are any differences between using this tool and using the current data-oriented GIS tools, such as MapInfo or ArcGIS? P: I guess the map will show me the 'what' of the systems, but not necessarily how things work; or the 'how' and 'why'; where things are; why they are; and how things operate within a system. The map will show me kind of 'what' and 'where,' but not 'how' and 'why,' really. Y: Wah… that is exactly what I am looking for. P: Oh, good! I am not cheating… Ha Ha… Y: You could see some organized information.

P: Right, that's very interesting. Y: Using regular GIS maps, you have to build the connections (or relationships) by yourself, is that true? P: That's a good point. To tell the stories in the map, you would need a lot more either time series maps, or categorize those… you know… ah… That will be hard to do. Yeah, You definitely need to use the concept map. You could tell some story. I think, with the map, you have to use a lot of series maps. Y: How about regulations or laws? P: Yeah. You could put some event, such as SWIP test, but it is hard to present in the map, I think. Especially looking at the case of the SWIP thing, it came about 96, but they didn't actually test until 98 (i.e., should be 97 here), for whatever reason they didn't have to. So you don't even know when the SWIP will happen. Yeah, I think that (i.e., the concept map) was helpful.

Y: Do you have any suggestions or comments to improve its functionality or usability? P: When you are trying to tell the stories, sometimes it will be helpful to make the concept map visually standardized, like water sources are always the main branch, maybe the infrastructure another branch, and the personnel another branch. Y: You are right. P: Very interesting.

Appendix C: Examples of the execution results from GeoAgents’ rule firing for simulation example

C.1 Independent responses to a power-outage

The executing results from the GeoAgent of CollegeTownship_CWS

CollegeTownship_CWS is running...
The CollegeTownship_CWS GeoAgent joins the StateCollegeCWS_Team.
__Inspect power: power outage in CollegeTownship Water Authority!
::Need to identify the cause_of_power_outage
__Inspect powerStation: power outage due to fire_and_smoking!
::Need to check the system pressure
__The system pressure above_20_PSI
::Need to check Emergency_Generator
__The generator is available.

::Need to start the generator.
__Check the generator's fuel and oil level.
__Connect generator leads to external terminal box.
__Put Off-Auto switches for High Service pumps to the Off position.
__Use key interlock to open the main circuit breaker.
__Use key interlock to close the standby generator breaker.
__When the breaker is closed, start generator.

__The generator started.
__Now using the emergency generator...

C.2 Local cooperative responses to a power outage

The executing results from the GeoAgent of CollegeTownship_CWS

CollegeTownship_CWS is running...
The CollegeTownship_CWS GeoAgent joins the StateCollegeCWS_Team.
__Inspect power: power outage in CollegeTownship Water Authority!
::Need to identify the cause_of_power_outage
__Inspect powerStation: Station is in a good condition.
__Inspect surroundings: Surroundings no power.
__Power outage due to local power company failure.

::Need to ask the Power Company how_long_to_recover
>> sending message: (ASK_POWEROUTAGE_TIME: content how_long_to_recover ) to mka:JESSAgent-3: AlleghenyPowerCompany,8@psu-fmswuxs7mn7:K1105739232018
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: ANSWER_POWEROUTAGE_TIME, with content: more_than_two_hours
__Get answer: PowerOutage will last more_than_two_hours

::Need to check Emergency_Generator
__Check Emergency_Generator
__Emergency generator not available
::Need to check the emergency interconnection.
>> sending message: (REQUEST_INTERCONNECTION: content Emergency_Interconnection ) to mka:JESSAgent-2: StateCollege_CWS,7@psu-fmswuxs7mn7:K1105739232018

::Need to inform radio,newspaper:power problem, restrictive water use...
>> sending message: (POWEROUTAGE_COLLEGETOWNSHIP_CWS: content restrictive_water_use_required_for_College_Township_CWS_water_users) to mka:JESSAgent-5: Centre_Daily_Times,10@psu-fmswuxs7mn7:K1105739232018

::Need to check the system pressure
__The system pressure below 20 PSI.
::Need to issue boil water order restriction, and notify public.
__Inform radio,newspaper:boil water order restriction...
>> sending message: (POWEROUTAGE_COLLEGETOWNSHIP_CWS: content boil_water_required_for_College_Township_CWS_water_users) to mka:JESSAgent-5: Centre_Daily_Times,10@psu-fmswuxs7mn7:K1105739232018

::Need to inform the fire Department
>> sending message: (LOW_SYSTEM_PRESSURE:content Emergency_Condition_in_College_Township_CWS ) to mka:JESSAgent-4: Fire_department,9@psu-fmswuxs7mn7:K1105739232018
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: ANSWER_FROM_FIRE_DEPARTMENT, with content: Message_received_and_preparing_water_hauling
__Get answer from the fire Department: Message_received_and_preparing_water_hauling

<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: ANSWER_EMERGENCY_INTERCONNECTION, with content: OK
__Get answer of emergency interconnection: OK
::Need to launch Emergency Interconnection
__Emergency Interconnection is now in use...

The executing results from the GeoAgent of AlleghenyPowerCompany

AlleghenyPowerCompany is running...
__A power outage at CollegeTownship
__Condition: serious; will last more_than_two_hours
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message, ASK_POWEROUTAGE_TIME , with content: how_long_to_recover
>> sending message: (ANSWER_POWEROUTAGE_TIME :content more_than_two_hours ) to mka:JESSAgent: CollegeTownship_CWS,6@psu-fmswuxs7mn7:K1105739232018
__Answered to ASK_POWEROUTAGE_TIME: more_than_two_hours
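
The two logs above show one ask/answer round trip between CollegeTownship_CWS and AlleghenyPowerCompany. As a rough illustration of that exchange only, the following Python sketch replays it with an in-memory mailbox. The `Message` and `MessageBus` classes and the `power_company_step` function are hypothetical stand-ins introduced here; they are not the MadKit or JESS interfaces that produced the logs.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    receiver: str
    act: str      # message type, e.g. "ASK_POWEROUTAGE_TIME"
    content: str  # message content, e.g. "how_long_to_recover"


class MessageBus:
    """Hypothetical in-memory mailboxes, one queue per agent name."""

    def __init__(self):
        self.mailboxes = defaultdict(deque)

    def send(self, msg):
        self.mailboxes[msg.receiver].append(msg)

    def receive(self, agent):
        box = self.mailboxes[agent]
        return box.popleft() if box else None


def power_company_step(bus, name="AlleghenyPowerCompany",
                       outage_duration="more_than_two_hours"):
    """Answer one pending ASK_POWEROUTAGE_TIME request, if there is one."""
    msg = bus.receive(name)
    if msg is not None and msg.act == "ASK_POWEROUTAGE_TIME":
        bus.send(Message(name, msg.sender,
                         "ANSWER_POWEROUTAGE_TIME", outage_duration))


bus = MessageBus()
bus.send(Message("CollegeTownship_CWS", "AlleghenyPowerCompany",
                 "ASK_POWEROUTAGE_TIME", "how_long_to_recover"))
power_company_step(bus)
reply = bus.receive("CollegeTownship_CWS")
print(reply.act, reply.content)
```

The message types and contents are copied from the logs; the single-step, synchronous delivery is a simplification chosen for the sketch.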

The executing results from the GeoAgent of StateCollege_CWS

StateCollege_CWS is running...
The StateCollege_CWS GeoAgent joins the StateCollegeCWS_Team.
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message REQUEST_INTERCONNECTION with content: Emergency_Interconnection
>> sending message: (ANSWER_EMERGENCY_INTERCONNECTION :content OK ) to mka:JESSAgent: CollegeTownship_CWS,6@psu-fmswuxs7mn7:K1105739232018
__Answered REQUEST_INTERCONNECTION: OK

The executing results from the GeoAgent of FireDepartment_StateCollege

FireDepartment_StateCollege is running...
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message INFORM_EMERGENCY_CONDITION with content:
__Emergency_Condition_in_College_Township_CWS
>> sending message: (ANSWER_FROM_FIRE_DEPARTMENT :content Message_received_and_preparing_water_hauling ) to mka:JESSAgent: CollegeTownship_CWS,6@psu-fmswuxs7mn7:K1105739232018
__Answered INFORM_EMERGENCY_CONDITION: Message_received_and_preparing_water_hauling
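
Read together, the CollegeTownship_CWS logs in C.1 and C.2 suggest a small decision procedure: use the emergency generator when it is available, otherwise request an interconnection and ask for restrictive water use, and issue a boil water order and alert the fire department when system pressure is below 20 PSI. The Python sketch below is one reading of that procedure, inferred from the logs alone. The function name, the action labels, and the treatment of exactly 20 PSI as acceptable are assumptions made for illustration, not the rule base the GeoAgents actually run.

```python
def respond_to_power_outage(pressure_psi, generator_available, interconnection_ok):
    """Return the ordered actions for one power outage.

    The branch order mirrors the C.1 and C.2 logs; it is an inferred
    reading, not the GeoAgents' actual rules.
    """
    actions = []
    if generator_available:
        # C.1: the generator is available, so it is started.
        actions.append("start_emergency_generator")
    else:
        # C.2: no generator, so ask a neighbouring system and warn users.
        actions.append("request_emergency_interconnection")
        actions.append("inform_media:restrictive_water_use")
    if pressure_psi < 20:
        # C.2: low pressure triggers the boil water order and fire notice.
        actions.append("issue_boil_water_order")
        actions.append("inform_fire_department")
    if not generator_available and interconnection_ok:
        # C.2: the interconnection is launched once the neighbour answers OK.
        actions.append("launch_emergency_interconnection")
    return actions


independent = respond_to_power_outage(35, True, False)   # C.1-like inputs
cooperative = respond_to_power_outage(15, False, True)   # C.2-like inputs
print(independent)
print(cooperative)
```

The pressure values 35 and 15 are arbitrary examples on either side of the 20 PSI mark; the logs report only "above" and "below".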

C.3 Hierarchical social-environment interactions in drought conditions

The executing results from the GeoAgent of DEP_PA

DEP_PA is running...
__A drought emergency is identified.
::Need to convene_PEMA_meeting with Drought_Emergency_Management_Council
__Convene a PEMA meeting with Drought_Emergency_Management_Council
::Need to recommend: the_Governor, a_proclamation_of_drought_emergency
>> sending message: (DROUGHT_EMERGENCY :content a_proclamation_of_drought_emergency ) to mka:JESSAgent-2: Governor_PA,7@psu-fmswuxs7mn7:K1106169373957
<< receiving message of type: madkit.lib.messages.ACLMessage
__suggested the_Governor to a_proclamation_of_drought_emergency
__I received a message: DROUGHT_EMERGENCY_PROCLAMATION, with content: agreed
__ Chapters 118, 119 and 120 of the Emergency Management Regulation activated
::Need to post drought daily reports on the DEP drought Website..

::Need to send_letters to public_water_suppliers, drought_emergency, follow_drought_contingency_plans
::Need to increase monitoring_activities from_weekly_to_daily
>> sending message: (DROUGHT_EMERGENCY :content follow_drought_contingency_plans ) to mka:JESSAgent-4: Millheim_CWS,9@psu-fmswuxs7mn7:K1106169373957
>> sending message: (DROUGHT_EMERGENCY :content follow_drought_contingency_plans ) to mka:JESSAgent-5: CollegeTownship_CWS,10@psu-fmswuxs7mn7:K1106169373957
>> sending message: (DROUGHT_EMERGENCY :content follow_drought_contingency_plans ) to mka:JESSAgent-6: StateCollege_CWS,11@psu-fmswuxs7mn7:K1106169373957
>> sending message: (DROUGHT_EMERGENCY :content follow_drought_contingency_plans ) to mka:JESSAgent-3: PSU_CWS,8@psu-fmswuxs7mn7:K1106169373957
__Letters have been sent to public_water_suppliers to follow_drought_contingency_plans

::Need to schedule weekly_meetings with_Commonwealth_Drought_Task_Force
__schedule weekly_meetings with_Commonwealth_Drought_Task_Force

::Need to inform_media: drought_emergency, nonessential water use prohibited
>> sending message: (DROUGHT_EMERGENCY :content nonessential water use prohibited ) to mka:JESSAgent-7: Centre_Daily_Times,12@psu-fmswuxs7mn7:K1106169373957
__informed media: drought_emergency, nonessential water use prohibited

The executing results from the GeoAgent of Centre_Daily_Times

Centre_Daily_Times is running...
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: DROUGHT_EMERGENCY, with content: nonessential water use prohibited
__broadcasting: DROUGHT_EMERGENCY, request: nonessential water use prohibited

CHAPTER 119. PROHIBITION OF NONESSENTIAL WATER USES IN A COMMONWEALTH DROUGHT EMERGENCY AREA:
watering_grass : is restricted in the drought emergency condition!
watering_athletic_fields : is restricted in the drought emergency condition!
watering_landscaped_areas: is restricted in the drought emergency condition!
watering_golf_courses: is restricted in the drought emergency condition!
washing_paved_surfaces: is restricted in the drought emergency condition!
ornamental_water_use: is restricted in the drought emergency condition!
cleaning_mobile_equipment: is restricted in the drought emergency condition!
serving_water_in_restaurants_clubs_eating_places: is restricted in the drought emergency condition!
filling_swimming_pools : is restricted in the drought emergency condition!
using_a_fire_hydrant: is restricted in the drought emergency condition!
other_non_beneficial_water_use: is restricted in the drought emergency condition!

The executing results from the CollegeTownship_CWS

CollegeTownship_CWS is running...
The CollegeTownship_CWS GeoAgent joins the StateCollegeCWS_Team.

<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: DROUGHT_EMERGENCY, with content: follow_drought_contingency_plans
__DROUGHT_EMERGENCY issued

__Objective is to reduce 25% water use
::Need to follow: DEP_rquired_drought_emergency_plan
__DEP_rquired_drought_emergency_plan launched.
__activating law: PROHIBITION OF NONESSENTIAL WATER USES
__Prohibition of nonessential water use activated

::Need to activate the drought contingency plan.
__The drought contingency plan activated.
::Need to inform water users the restricted amount of water use and prices in excess:
>> sending message: (INFORM_WATER_PRICE :content DroughtEmergency_AllottedWaterUse:40gal/day.person;MonthlyExcess:$8.77/kgal_for_first_2kgal_And_$17.54/kgal_thereafter ) to mka:JESSAgent-6: CollegeTownship_WaterUser1,11@psu-fmswuxs7mn7:K1106070633444

::Need to check interconnections
>> sending message: (CHECK_INTERCONNECTION :content Emergency_Interconnection ) to mka:JESSAgent-10: StateCollege_CWS,15@psu-fmswuxs7mn7:K1106070633444
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: ANSWER_EMERGENCY_INTERCONNECTION, with content: OK
__Get answer of emergency interconnection: OK

DEP_rule:
::Need to daily monitor: Spring Creek Park Well; at The intersection of X Road and Y Drive, College Township, Centre County, PA
__Monitoring list include: (water-usage well_static pumping_level well_run_time well_flow_rate stream_flow)
__Monitored source: Spring Creek Park Well

<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: WATER_USE_APPLICATION, with content: filling_swimming_pools
__According to The Emergency Management Services Code, 35 Pa.C.,
__Chapter 119. Prohibition Of Nonessential Water Uses In A Commonwealth Drought Emergency Area,
__The application purpose is a kind of nonessential water uses:
__Description: Code35 Chapter119, prohibited_water_use: the use of any water to fill and top off swimming pools.
__Conclusion: 'filling_swimming_pools' is prohibited in drought emergency condition!
>> sending message: ( APPLICATION_NOT_APPROVED :content Code35 Chapter119, prohibited_water_use: the use of any water to fill and top off swimming pools. ) to mka:JESSAgent-4: CollegeTownship_WaterUser1,11@psu-fmswuxs7mn7:K1106270452576

The executing results from the GeoAgent of CollegeTownship_WaterUser1

CollegeTownship_WaterUser1 is running...
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message INFORM_WATER_PRICE.
__The content is: DroughtEmergency_AllottedWaterUse:40gal/day.person;MonthlyExcess:$8.77/kgal_for_first_2kgal_And_$17.54/kgal_thereafter

__I want to apply using water for: filling_swimming_pools.
>> sending message: (WATER_USE_APPLICATION : content filling_swimming_pools ) to mka:JESSAgent-5: CollegeTownship_CWS,10@psu-fmswuxs7mn7:K1106169373957
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: APPLICATION_NOT_APPROVED .
__The content is: Code35 Chapter119, prohibited_water_use: the use of any water to fill and top off swimming pools.
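
The INFORM_WATER_PRICE message above packs a two-tier excess tariff into one string: 40 gallons per person per day are allotted, and monthly use beyond the allotment is charged at $8.77 per thousand gallons for the first 2 kgal and $17.54 per thousand gallons thereafter. A minimal Python sketch of that arithmetic follows. The function name, the 30-day month, the pooling of the allotment over a household, and the proration of partial kgal are all assumptions made here; the log states only the rates and the allotment.

```python
def monthly_excess_charge(used_gal, persons, days=30,
                          allot_gal_per_person_day=40,
                          first_tier_rate=8.77, later_rate=17.54,
                          first_tier_kgal=2.0):
    """Charge in dollars for water used above the drought allotment.

    Assumes a 30-day month and a household-level allotment; both are
    illustrative choices, not taken from the log.
    """
    allotted_gal = allot_gal_per_person_day * persons * days
    excess_kgal = max(0.0, used_gal - allotted_gal) / 1000.0
    tier1_kgal = min(excess_kgal, first_tier_kgal)         # first 2 kgal
    tier2_kgal = max(0.0, excess_kgal - first_tier_kgal)   # remainder
    return round(tier1_kgal * first_tier_rate + tier2_kgal * later_rate, 2)


# A three-person household is allotted 40 * 3 * 30 = 3600 gal per month.
within = monthly_excess_charge(3000, persons=3)
one_kgal_over = monthly_excess_charge(4600, persons=3)
three_kgal_over = monthly_excess_charge(6600, persons=3)
print(within, one_kgal_over, three_kgal_over)
```

Passing `first_tier_rate=7.00, later_rate=15.00` gives the corresponding figures for the Millheim_CWS rates that appear later in this appendix.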

The executing results from the GeoAgent of Millheim_CWS

Millheim_CWS is running...

<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: DROUGHT_EMERGENCY, with content: follow_drought_contingency_plans
__DROUGHT_EMERGENCY issued

__Objective is to reduce 25% water use
::Need to follow: DEP_rquired_drought_emergency_plan
__DEP_rquired_drought_emergency_plan launched.
__Activating law: PROHIBITION OF NONESSENTIAL WATER USES
__Prohibition nonessential water use activated

::Need to activate the drought contingency plan.
__The drought contingency plan activated.
::Need to inform water users the restricted amount of water use and prices in Excess:
>> sending message: (INFORM_WATER_PRICE :content DroughtEmergency_AllottedWaterUse:40gal/day.person;MonthlyExcess:$7.00/kgal_for_first_2kgal_And_$15.00/kgal_thereafter ) to mka:JESSAgent-9: Millheim_WaterUser1,14@psu-fmswuxs7mn7:K1106169373957

DEP_rule:
::Need to daily monitor: Elk_Creek; at east of the Feature B.
__Monitoring list include: (water-usage creek_flows dropped_depth flow_in_flume)
__Monitored source: Elk_Creek
DEP_rule:
::Need to daily monitor: Phillips_Creek; at northwest of the Feature A.
__Monitoring list include: (water-usage creek_flows dropped_depth flow_in_flume)
__Monitored source: Phillips_Creek
__The flume flow in Phillips Creek is less than 0.037 cubic feet per second.
::Need to switch the water source to Elk Creek .
__Now using water from Elk Creek.

<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: WATER_USE_APPLICATION, with content: watering_new_grass_at_non_working_hours
__Conclusion of the application: 'watering_new_grass_at_non_working_hours' is allowed.
__But, according to the Emergency Management Services Code, 35 Pa.C.,
__Chapter 119. Prohibition Of Nonessential Water Uses In A Commonwealth Drought Emergency Area,
__The application has to meet the extra requirements.
>> sending message: ( APPLICATION_APPROVED :content Code35 Chapter119, water_use_allowed_in_this_condition: water may be used to establish and maintain newly seeded and sodded grass areas when applied between the hours of 5 p.m. and 9 a.m. by means of a bucket, can or hand held hose equipped with an automatic shut-off nozzle, or when applied between the hours of 7 p.m. and 11 p.m. by any other means designed and operated to ensure effective conservation. ) to mka:JESSAgent-9: Millheim_WaterUser1,14@psu-fmswuxs7mn7:K1106169373957

The executing results from the Millheim_WaterUser1

Millheim_WaterUser1 is running...
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message INFORM_WATER_PRICE.
__The content is: DroughtEmergency_AllottedWaterUse:40gal/day.person;MonthlyExcess:$7.00/kgal_for_first_2kgal_And_$15.00/kgal_thereafter
__I want to apply using water for: watering_new_grass_at_non_working_hours.
>> sending message: (WATER_USE_APPLICATION :content watering_new_grass_at_non_working_hours ) to mka:JESSAgent-4: Millheim_CWS,9@psu-fmswuxs7mn7:K1106169373957
<< receiving message of type: madkit.lib.messages.ACLMessage
__I received a message: APPLICATION_APPROVED .
__The content is: Code35 Chapter119, water_use_allowed_in_this_condition: water may be used to establish and maintain newly seeded and sodded grass areas when applied between the hours of 5 p.m. and 9 a.m. by means of a bucket, can or hand held hose equipped with an automatic shut-off nozzle, or when applied between the hours of 7 p.m. and 11 p.m. by any other means designed and operated to ensure effective conservation.
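
The two water-use applications in C.3 end differently: filling swimming pools is refused outright, while watering newly seeded grass is approved subject to time-of-day and method conditions. The Python sketch below shows how such a review could be expressed, using only the conditions quoted in the messages above. The function names, the method labels, the `NEEDS_REVIEW` fallback, and the choice to treat the end hour of each window as excluded are hypothetical; this is not the Chapter 119 rule set encoded in the GeoAgents.

```python
# Purposes refused in the C.3 logs; the set is not a full list of Chapter 119.
PROHIBITED_USES = {"filling_swimming_pools"}

# Hand methods named in the approval message (labels are illustrative).
HAND_METHODS = {"bucket", "can", "hand_held_hose_with_shutoff_nozzle"}


def in_window(hour, start, end):
    """True if hour (0-23) is in [start, end); the window may wrap past midnight."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


def may_water_new_grass(hour, method):
    """Apply the two time windows quoted in the APPLICATION_APPROVED message."""
    if method in HAND_METHODS:
        return in_window(hour, 17, 9)   # 5 p.m. to 9 a.m.
    return in_window(hour, 19, 23)      # 7 p.m. to 11 p.m., other conserving means


def review_application(purpose):
    if purpose in PROHIBITED_USES:
        return "APPLICATION_NOT_APPROVED"
    if purpose == "watering_new_grass_at_non_working_hours":
        return "APPLICATION_APPROVED"
    return "NEEDS_REVIEW"  # hypothetical fallback for purposes not in the logs


print(review_application("filling_swimming_pools"))
print(review_application("watering_new_grass_at_non_working_hours"))
print(may_water_new_grass(6, "bucket"), may_water_new_grass(12, "bucket"))
```

The wrap-around check matters because the 5 p.m. to 9 a.m. window crosses midnight, so a simple `start <= hour < end` comparison would reject every hour.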

Appendix D: The standards of the indices for determining drought severity

D.1 Precipitation Deficit Drought Indicators

Deficit as percent of normal precipitation, by drought stage

Duration of deficit
accumulation (months)    Watch    Warning    Emergency
3                        25       35         45
4                        20       30
5                        20       30         40
6                        20       30         40
7                        18.5     28.5       38.5
8                        17.5     27.5       37.5
9                        16.5     26.5       36.5
10                       15       25         35
11                       15       25         35
12                       15       25         35

(source: www.dep.state.pa.us/dep/subject/hotopics/drought/facts/FS2472DroughtMgmtInPA.htm )
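
The precipitation indicator in D.1 amounts to a lookup: for a given accumulation period, the deficit is compared against the watch, warning, and emergency percentages. A minimal Python sketch of that lookup is given below. The function name, the "normal" label for deficits under the watch level, and the use of "greater than or equal" as the trigger are assumptions; the emergency value for the 4-month period is absent from the table as reproduced here and is therefore left undefined rather than guessed.

```python
# months -> (watch, warning, emergency) deficit as percent of normal precipitation
THRESHOLDS = {
    3: (25, 35, 45),
    4: (20, 30, None),   # emergency value not present in the table as reproduced
    5: (20, 30, 40),
    6: (20, 30, 40),
    7: (18.5, 28.5, 38.5),
    8: (17.5, 27.5, 37.5),
    9: (16.5, 26.5, 36.5),
    10: (15, 25, 35),
    11: (15, 25, 35),
    12: (15, 25, 35),
}
STAGES = ("watch", "warning", "emergency")


def precipitation_stage(months, deficit_percent):
    """Return the most severe stage whose threshold the deficit reaches.

    A missing threshold (None) is skipped, so the 4-month period can
    never report "emergency" in this sketch.
    """
    stage = "normal"
    for name, limit in zip(STAGES, THRESHOLDS[months]):
        if limit is not None and deficit_percent >= limit:
            stage = name
    return stage


print(precipitation_stage(3, 30))     # 30% deficit over 3 months
print(precipitation_stage(7, 28.5))   # deficit exactly at the warning level
print(precipitation_stage(12, 10))    # small deficit over 12 months
```

This covers only the precipitation indicator; the stream flow, groundwater, PHDI, and reservoir criteria of D.2 and D.3 are not modelled.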

D.2 Drought triggering criteria for the stream flows, groundwater levels, and PHDI

(sources: Smith (1998) and www.dep.state.pa.us )

D.3 Drought triggering criteria for the reservoir in Central Pennsylvania

(Percentage (%) of usable reservoir storage. Sources: derived from Smith 1998, p98)

CHAOQING YU
120 Westway Apt. 103
Greenbelt, MD 20770
Phone: (216) 262-5455
Email: [email protected]

Education:

2005 Research Associate, Department of Geography, University of Maryland, MD, USA
2001 - 2005 Ph.D., GIScience, the Pennsylvania State University, PA, USA
1999 - 2001 M.A., GIS and Remote Sensing, Kent State University, Kent, OH, USA
1994 - 1997 M.S., Physical Geography, Chinese Academy of Sciences, Beijing, China
1990 - 1994 B.S., Physical Geography, the Southwest Normal University, Chongqing, China

Professional Experience

1997-1998 Engineer. The Information Center of the State Environmental Protection Administration (SEPA) of China, Beijing.