<<

SPATIAL PROBLEM SOLVING FOR DIAGRAMMATIC REASONING

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the

Graduate School of The Ohio State University

By

Bonny Banerjee, M.S.

* * * * *

The Ohio State University 2007

Dissertation Committee:

Dr. Balakrishnan Chandrasekaran, Adviser

Dr. John R. Josephson

Dr. Tamal K. Dey

Approved by

______________________
Adviser

Graduate Program in Computer Science and Engineering

Copyright by

Bonny Banerjee

2007

ABSTRACT

Diagrammatic reasoning (DR) is pervasive in human problem solving as a powerful adjunct to symbolic reasoning based on language-like representations. However, research in artificial intelligence (AI) is overwhelmingly based on symbolic representations, with proportionately scant attention to DR. This dissertation is a contribution to building artificial agents that can create and use diagrams as part of their problem solving. The work is in a framework in which DR is modeled as a process in which subtasks are solved, as appropriate, either by inference from symbolic representations or by information perceived from a diagram, and subtasks may also act on the diagram, i.e., create or modify objects in the diagram. The perceptions and actions are in fact domain- and task-specific 2D spatial problems defined in terms of properties and relations involving diagrammatic objects. Most DR systems built so far are task-specific, and their developers as a rule have hand-crafted the required perceptions and actions.

Our goal is the development of a general, i.e., domain- and task-independent, capability that takes specifications of perceptions and actions and automatically executes them. Thus, the purpose of this dissertation is to investigate:

1. A language for a human problem solver to communicate a wide variety of 2D spatial problems relevant to DR, and

2. A general domain-independent framework of underlying representations and reasoning strategies suitable for efficiently solving spatial problems without human intervention.

This dissertation will present a high-level language that is extensible, human-usable, and expressive enough to describe a wide variety of spatial problems in terms of constraints. The constraints are specified in first-order logic over the real domain using a vocabulary of objects, properties, relations and actions. Two general and independent strategies -- constraint satisfaction and spatial search -- are developed for automatically solving the spatial problems specified in that language. Several ideas about how to make these strategies computationally efficient are proposed and illustrated by examples. A traditional AI problem solver is augmented with this spatial problem solver for reasoning with diagrams in different domains for real-world applications. The utility of the framework is judged by the expressiveness of the language, and the generality and efficiency of the two strategies.


Dedicated to my wonderful family

ACKNOWLEDGMENTS

I gratefully acknowledge the guidance and support rendered to me by my

adviser Prof. B. Chandrasekaran (Chandra) in walking me through the problem,

developing the concepts related to the subject of diagrammatic reasoning and

representation, providing valuable suggestions and preparing this dissertation.

I am also indebted to Dr. John R. Josephson for helping me with discussions

and valuable suggestions related to my work. Thanks are due to Prof. Tamal K. Dey

for being on my dissertation committee and helping me with research issues from time

to time. Thanks are also due to Unmesh Kurup and Vivek Bharathan for helping me in

many different ways, especially through lively intellectual discussions. Several others in the Department of Computer Science & Engineering, whose names have not been mentioned, have also been very helpful. I wish to thank each of them and extend my sincerest apologies for overlooking their contribution.

Finally, I will always remain deeply indebted to my family for providing the support that made this dissertation possible.

The research reported in this dissertation was supported by participation in the

Advanced Decision Architectures Collaborative Technology Alliance sponsored by the U.S. Army Research Laboratory under Cooperative Agreement DAAD19-01-2-0009.

VITA

November, 1977 Born – Calcutta (Kolkata), India

July, 2000 B.E. Electronics & Telecommunication Engineering Jadavpur University, India.

August, 2002 M.S. Electrical Engineering The Ohio State University, Columbus, USA

October, 2001 – June, 2007 Graduate Research Associate Laboratory for Artificial Intelligence Research, Department of Computer Science & Engineering, The Ohio State University, Columbus, USA.

PUBLICATIONS

B. Banerjee, "String tightening as a self-organizing phenomenon." IEEE Transactions on Neural Networks, 18(5):1463-1471, (2007).

B. Banerjee and B. Chandrasekaran, "A constraint satisfaction framework for visual problem solving." Trends in Constraint Programming, F. Benhamou, N. Jussien and B. O'Sullivan, Editors, Hermes Science, Chapter 26, (2007).

B. Banerjee, "A layered abductive inference framework for diagramming group motions." Special Issue of Logic Journal of IGPL: Abduction, Practical Reasoning, and Creative Inferences in Science, L. Magnani, Editor, 14(2):363-378, Oxford University Press, (2006).

B. Chandrasekaran, U. Kurup, B. Banerjee, J. R. Josephson and R. Winkler, "An architecture for problem solving with diagrams." Diagrammatic Representation and Inference, A. Blackwell, K. Marriott and A. Shimojima, Editors, Lecture Notes in Artificial Intelligence 2980:151-165, Berlin: Springer-Verlag, (2004).

B. Banerjee, "Recognition of partially occluded shapes using a neural optimization network." Machine Graphics & Vision, Institute of Computer Science of the Polish Academy of Sciences, 13(1/2):3-23, (2004).

FIELDS OF STUDY

Major Field: Computer Science and Engineering Specialization: Artificial Intelligence

Minor Fields: Cognitive Science,

TABLE OF CONTENTS

Abstract

Dedication

Acknowledgments

Vita

List of Figures

Chapters:

1. Introduction
   1.1 Diagrammatic reasoning as a problem solving activity
   1.2 What do we mean by a diagram?
   1.3 Perceptions and actions in diagrammatic reasoning
   1.4 The problem
   1.5 Contributions
   1.6 Organization of the dissertation

2. A specification language
   2.1 Vocabulary
   2.2 Specification language
   2.3 Discussion

3. A constraint satisfaction framework
   3.1 Overall algorithm
   3.2 Modeling language
   3.3 Mapping to a similar problem
   3.4 Memory organization
   3.5 Computational complexity
   3.6 An example
   3.7 Discussion

4. A spatial search framework
   4.1 The overall idea
   4.2 Underlying representation
   4.3 The core algorithm
   4.4 Enhancing the efficiency of visual search
   4.5 Computational complexity
   4.6 Discussion

5. Applications
   5.1 Entity re-identification
   5.2 Ambush analysis
   5.3 Euclidean geometry theorem proving
   5.4 Discussion

6. Conclusions
   6.1 Evaluations
   6.2 Contributions
   6.3 Future research

Bibliography

Index

LIST OF FIGURES

1.1 A general purpose diagrammatic reasoning architecture

1.2 Diagrammatic reasoning by an army commander

1.3 Examples of graphs understood by SKETCHY

1.4 An example of a deflected frame analysis by REDRAW

1.5 An example of a geometry theorem demonstrated by ARCHIMEDES

1.6 An example of a mathematical theorem proven by DIAMOND

1.7 An example of ambush analysis by GeoRep

1.8 An example of route planning in an urban scenario by Chandrasekaran et al.'s diagrammatic reasoning architecture

2.1 The BehindCurve as a decision problem

2.2 The BehindCurve as a function problem

2.3 The FurthestBehindCurve as an optimization problem

3.1 Flow diagram of the spatial problem solver using constraint satisfaction

3.2 Hierarchical problem classification in memory

3.3 Parse tree for a subproblem of the BehindCurve problem

4.1 An example of abstraction for efficient computation

4.2 Solving the BehindCurve problem by spatial search

5.1 Problem solving for entity re-identification

5.2 Ambush analysis

5.3 Diagrammatic proof of Pythagoras' theorem

CHAPTER 1

INTRODUCTION

1.1 Diagrammatic reasoning as a problem solving activity

This dissertation is a contribution to building problem solving agents in artificial intelligence (AI) that use diagrams, much as people do, but most of AI does not, given the almost exclusive emphasis in AI on language-like or predicate-symbolic representations. Diagrammatic reasoning (DR) is an emerging area of research in AI [Chandrasekaran et al., 2002, 2004, 2005; Ferguson, 1994; Ferguson and Forbus, 2000, 1998; Glasgow et al., 1995; Jamnik, 2001; Lindsay, 1998; Narayanan and Chandrasekaran, 1991; Pisan, 1995; Tessler et al., 1995], logic [Allwein and Barwise, 1999; Barwise and Etchemendy, 1998], and psychology [Pinker, 1990; Trickett and Trafton, 2006; Tversky, 2000], and is fast becoming a multidisciplinary research area (see the DIAGRAMS 2000, 2002, 2004, 2006 conference proceedings). While all research in DR is in one way or another dealing with diagrams, different research issues are addressed by different researchers. The research reported in this dissertation considers DR as a problem solving activity in which an agent makes use of two forms of representation – a spatial representation in the form of 2D diagrams and a symbolic representation that contains information in a predicate-symbolic form similar to logic and natural language. A schematic DR architecture, as proposed by Chandrasekaran et al. [Chandrasekaran et al., 2002, 2004, 2005], is illustrated in Fig. 1.1.

[Fig. 1.1 components: a Problem; a traditional AI problem solver holding inference rules and symbolic information; a Spatial Problem Solver that receives spatial problems in the specification language and returns their solutions; the Diagram; and the Solution.]

Fig. 1.1: A general purpose diagrammatic reasoning architecture.

The DR architecture shares the idea of problem solving as search in a problem state space [Laird et al., 1986; Newell, 1990]. Recall that in this approach, starting from an initial state, the agent applies operators to bring about state transitions to reach the goal state. A goal is either reached or decomposed into subgoals by the use of general and domain knowledge. Reaching a goal or subgoal requires information, which is generated in the traditional problem solving architectures (e.g., SOAR [Laird et al., 1987], ACT-R [Anderson, 1993]) by inference using a predicate-symbolic representation. In the DR architecture, the agent can extract information from diagrams by applying perception-like operations in addition to inference using the predicate-symbolic representation to reach the goal/subgoal. The agent can also create or modify objects in a diagram that propose new states from which the goal might be reached with subsequent perceptions and inferences.

To illustrate our conceptualization of DR, let us consider a real world problem. An army commander, planning strategic operations, uses a terrain map to chalk out a path for his troops to safely travel from one base camp location L1 to another L2 within a given time. The only information he has is regarding the nature of the terrain (e.g., slow-go or no-go regions, the altitude of different parts of the terrain, the speed at which his troops can travel in different kinds of terrain) and an estimate of the maximum firepower range of

the enemy. The commander, being a veteran in the field, is well aware of the possibility that his troops might be ambushed along any path by the enemy, who might be hiding in the neighboring regions. His problem solving might proceed as follows. A diagram consisting of the part of the terrain map of interest for this particular problem is given, along with the peripheries of the no-go regions and the two points, L1 and L2 (Fig. 1.2(a)).

The commander draws one of the shortest paths from L1 to L2 maintaining a maximum distance from the neighboring no-go regions (Fig. 1.2(b)). He knows what kinds of spatial relations between points on the route and the points where the enemy could be hiding correspond to ambush potential. He then uses that knowledge to perceive (and mark) the portions of the path that are prone to ambush due to enemies hiding behind the neighboring no-go regions (Fig. 1.2(c)). If no such portion is found, the path is inferred to be safe. If the length of a safe path can be traversed in the given time, it is considered a suitable path for the operation. If the path drawn is not safe or does not satisfy the time constraint, another path is drawn (Fig. 1.2(d)) and analyzed. This procedure continues until all paths have been exhausted. If a suitable path is still not found, the least risky path might be considered for the operation. In the worst case, the commander might infer that this operation is not possible.

In the above example, it is noteworthy how the problem solver (the commander) opportunistically brings together symbolic knowledge (such as the firepower range of enemies) and perception and action on a diagram to solve a real world problem. Such a phenomenon is characteristic of DR whenever it is used to solve problems in numerous different domains, be it physics, thermodynamics, economics, geometry, civil engineering, computer-aided design, or military. The perceptions and actions require solving purely spatial problems with no involvement of domain knowledge. These spatial problems can be described in terms of primitive diagrammatic objects, such as points, curves, and regions, and spatial properties (e.g., length of a curve) and relations (e.g., point on a curve) involving them. For example, perceiving the portions of a path prone to ambush requires computing the set of points q on a curve (the path) c such that q is within a specified distance (the firepower range) d from some point p and the line segment {p, q} intersects a region r (i.e., p is behind r with respect to q). Formally, this can be written as:

(a) The given diagram consisting of two points, L1 and L2, and three region obstacles. (b) One of the shortest paths drawn between L1 and L2 avoiding the obstacles. (c) Portions of the path prone to ambush are perceived and marked. (d) Another path is drawn and will be analyzed.

Fig. 1.2: Diagrammatic reasoning by an army commander for finding a safe path for transporting his troops from L1 to L2 within a given time.

RiskyPortionsofPath(q, c, r, d) ≡ On(q, c) ∧ ∃p, isaPoint(p) ∧ Intersect({p, q}, r) ∧ Distance(p, q) ≤ d

This dissertation is an investigation of a general and efficient framework for spatial problem solving to aid perceptions and actions in DR.

1.2 What do we mean by a diagram?

For our purposes, a diagram is an abstract data structure consisting of a set of labeled primitive objects (points, curves, regions), whose spatiality is relevant to reasoning, along with their spatial information [Chandrasekaran et al., 2002, 2004, 2005]. This definition of a diagram supports the functional representation of a diagram in an agent. A diagram on an external medium (piece of paper, computer screen, etc.) is at one level an image consisting of pixels with color intensities. At another level, it is a collection of diagrammatic objects in an interpreted representation relation to objects, properties and relations in some domain of interest. An abstract diagram, as we define it, is different from a physical external diagram, where point and curve objects are actually physically drawn as regions on some physical medium. The abstract diagram is ideal, i.e., one where the intended points are in fact dimensionless points, curves have no thickness, etc. Similarly, a physical diagram has marks that correspond to labels, which in an abstract diagram are symbols associated with the diagrammatic objects, without themselves having a spatial representation. One can thus think of our functional notion of a diagram as the internal representation for an agent, irrespective of whether the diagram is abstracted from an image of an external diagram or constructed otherwise, such as from memory. It is noteworthy that objects in an abstract diagram can be reproduced in a diagram on an external medium for the purposes of visualization or interaction.

This dissertation is concerned with the problem of using perception for extracting information from, or action to create or modify objects in, such abstract diagrams. Only diagrams that are line drawings with no color or intensity variation, which form a large class of diagrams in everyday use, will be considered in this dissertation. In the rest of the dissertation, the term "diagram" will refer to abstract diagrams only, unless otherwise stated.

1.3 Perceptions and actions in diagrammatic reasoning

In the last couple of decades, numerous DR systems have been built for different problems in different domains. Mandatory requirements of any such system are the abilities to obtain information from a diagram and to modify or create objects in a diagram. These abilities in general require solving a large variety of spatial problems. In the following, some well-known DR systems built for different tasks in different domains, in which a problem solving agent uses diagrams, are reviewed. This review will help us realize the role of perception and action in DR, and the spatial problems involved in such perceptions and actions.

SKETCHY [Pisan, 1995] is a computer implementation of a model of graph understanding. It recognizes a small set of primitive objects - points, lines, regions - and a vocabulary of properties and relations that includes coordinate at point, right of, above, inside, steeper, bigger, vertical, change in slope, touches, intersects, on line, on border, forms border, etc. for representing conceptual relationships in domains such as thermodynamics and economics. A domain translator is responsible for converting domain-specific conceptual questions into domain-independent graphical relations. Examples of perception from a supply-demand graph in economics include how price affects the supply, demand, and market price of the product, which requires solving spatial problems such as "At what point is supply equal to demand?" (specified as computing the intersection of two curves), "What is the price for the supply line when the quantity is 350?" (specified as computing a point on a curve one of whose coordinates is given), "Are the quantity and price directly proportional?" (specified as checking whether the slope of a curve between two points is a positive constant or not), "Are the quantity and price inversely proportional?" (specified as checking whether the slope of a curve between two points is a negative constant or not), etc. Actions in this model are not required due to the nature of its task. Examples of graphs understood by SKETCHY are shown in Fig. 1.3.

The REDRAW system [Tessler et al., 1995] combines diagrammatic and symbolic reasoning to qualitatively determine the deflected shape of a frame structure under a

(a) Graph from economics. (b) Graph from thermodynamics.

Fig. 1.3: Examples of graphs understood by SKETCHY.

Fig. 1.4: An example of a deflected frame analysis (from civil engineering) by REDRAW.

load, a structural analysis problem in civil engineering. It uses a vocabulary of properties and relations including get-angular-displacement, get-displacement, symmetrical-p, connected-to, near, left, above, etc., and actions including rotate, bend, translate, smooth, etc., on three kinds of diagrammatic objects – lines, splines, circles. Though most properties, relations and actions are domain-independent, some, such as bend, reflect the assumptions implicit in the domain and the task. Perceptions and actions are called inspection and manipulation operators in the system. The underlying representation is a combination of a grid and Cartesian coordinates – shapes are represented using the grid, where each element in the grid corresponds to a point in the diagram, while lines are represented by a set of coordinate points. Examples of perception and action include deflecting a beam in the same direction as the load, checking whether a beam and column are perpendicular at a particular rigid joint, etc., which require solving spatial problems such as "Bend Beam3 in the negative direction of the y-axis" (specified as computing a curve with a given slope), "Make the angle between Beam3 and Column3 at Joint3 90 degrees without modifying Beam3" (specified as computing a curve such that it makes a particular angle at a given point with a given curve), "Get the angle between Beam3 and Column3 at the ends connected by Joint3" (specified as computing the angle between two curves at a given point), etc. An example of a deflected frame analysis by REDRAW is shown in Fig. 1.4.

The ARCHIMEDES system [Lindsay, 1998] assists a human in demonstrating theorems in Euclidean geometry by modifying/creating diagrams according to his instructions and thereafter perceiving/inferring from the diagram. It operates on two basic diagrammatic objects - points and line segments - and composes more complicated objects, such as square, triangle, path, etc., out of them. The underlying representation is array- or grid-based. The perceptions, called retrieval processes, are of different classes, such as verify a relationship, test for a condition, etc. The actions, called construction processes, are also of different classes, such as create an object with certain properties, transform an object, etc. Executing the perceptions and actions requires solving spatial problems, such as create a segment parallel to a given segment through a given point, rotate an object and

check whether it coincides with another object, etc. An example of a geometry theorem demonstrated by ARCHIMEDES is shown in Fig. 1.5.

Fig. 1.5: An example of a geometry theorem demonstrated by ARCHIMEDES.

DIAMOND [Jamnik, 2001], a system for proving mathematical theorems, uses a sequence of actions on diagrams, assisted by a human, to prove specific ground instances and then generalizes by induction. It uses a mixture of Cartesian and topological representations to represent a dot (equivalent to a point in the Cartesian representation) as a primitive object in the discrete space, and a line and an area as primitive objects in the continuous space. Elementary objects, such as row, column, ell, and frame, are constructed from dots, while derived objects, such as square, triangle, rectangle, etc., are constructed from the elementary or other derived objects. The vocabulary consists of atomic or one-step operations (e.g., rotate, translate, cut, join, project from 3D to 2D, remove, insert a segment, etc.). Spatial problems in this system are composite operations composed from the atomic ones, such as draw a right-angled triangle, translate a triangle, etc. The system does not need to execute perceptions as information from a diagram is perceived by a

Fig. 1.6: An example of a mathematical theorem proven by DIAMOND.

human, who decides what actions are to be applied during the proof search. An example of a mathematical theorem proven by DIAMOND is shown in Fig. 1.6.

GeoRep [Ferguson and Forbus, 2000] takes as input a line drawing in a vector graphics representation and creates a predicate calculus representation of the drawing's spatial relations. Five primitive shape types are recognized, namely line segments, circular arcs, circles and ellipses, splines (open and closed), and positioned text. Properties and relations, such as proximity detection, orientation detection (e.g., horizontal, vertical, above, beside), parallelism, connectivity (e.g., detecting corner, intersection, mid-connection, touch), etc., are deployed to accomplish its task. It is worthy of mention that GeoRep has an abstraction, namely grouping based on similarity. The underlying representation is vector graphics or line drawings. Systems such as MAGI [Ferguson, 1994], JUXTA [Ferguson and Forbus, 1998], and COADD are built using GeoRep for symmetry detection, critiquing diagrams based on their captions, and producing a description of the units, areas, and tasks from a course of action diagram, respectively. GeoRep, due to the limitation of its task, does not need to execute any action. Examples of spatial problems in

GeoRep include figuring out which cup contains more liquid (specified as comparing the polygons representing the cups to see if one cup is taller or wider), determining whether a figure is symmetric or not (specified as checking whether one polygon is congruent to the reflection of the other polygon), etc. An example of ambush analysis by GeoRep is shown in Fig. 1.7.

Fig. 1.7: An example of ambush analysis by GeoRep.

Chandrasekaran et al.'s DR architecture [Chandrasekaran et al., 2002, 2004, 2005] is a generic proposal about multi-modal reasoning and problem solving, currently being experimented with on toy problems (e.g., blocks world) and real world military problems (e.g., ambush detection, entity re-identification). A human provides the broad problem solving strategy for a given class of problems; given a specific problem from that class, the problem solver calls the perceptions and actions accordingly. The system contains a vocabulary of properties, relations and actions that operate on three types of diagrammatic objects - points, curves, regions. Perceptions are categorized into four classes – emergent

Fig. 1.8: An example of safe route planning in an urban scenario by Chandrasekaran et al.'s diagrammatic reasoning architecture.

object recognition (e.g., determination of an intersection point), relational (e.g., inside), object property extraction (e.g., length), and abstraction (e.g., grouping). Actions include object transformation (e.g., rotate), object creation with certain properties (e.g., computing a curve (path) between two points avoiding regions (obstacles)), etc. See [Banerjee and Chandrasekaran, 2004] for more examples. Examples of spatial problems include computing the portion of a path (curve) that lies within the firepower range of the enemies and hidden from the neighboring obstacles (regions), computing a safe route (curve) avoiding obstacles (regions), etc. An example of safe route planning in an urban scenario by Chandrasekaran et al.'s DR architecture is shown in Fig. 1.8.

The preceding discussion leads to the observation that all DR systems require perceiving information from and/or acting on diagrams. It is noteworthy that there is no difference between a perception and an action from a computational point of view. The only difference is that for an action, a diagrammatic object is drawn in a diagram while

nothing is drawn in case of a perception. The commonality between the two is that for computing the solution of a perception or action, a domain-independent spatial problem needs to be solved. After computing the solution, if the problem solver1 modifies the diagram to incorporate the solution, it becomes an action; otherwise it remains a perception. For example, the portion of a path prone to ambush is a diagrammatic object. If the problem solver draws the object in a diagram, the problem of computing the ambush-prone portion of a path is an action; otherwise it is a perception. Irrespective of which it is, computing the ambush-prone portion of a path requires solving a domain-independent spatial problem (which I earlier defined as RiskyPortionsofPath). Thus the terms "perception" and "action" come with additional requirements which are completely irrelevant to the SPS. It is the responsibility of the problem solver to determine whether a spatial problem is a perception or an action based on problem solving goals. What is passed on to the SPS are domain-independent spatial problems, and the SPS should have no clue as to whether a spatial problem is a perception or an action because that makes no difference to how it solves the spatial problem. Since every perception and action requires solving a domain-independent spatial problem, a DR system requires solving a large variety of non-trivial domain-independent spatial problems. These spatial problems can be described in terms of primitive diagrammatic objects, such as points, curves, and regions, and spatial properties and relations involving them.

1.4 The problem

How are these spatial problems solved in a DR system? Typically, the human building a DR system identifies a priori the problem solving steps and a set of spatial problems, and efficiently programs a procedure for solving each spatial problem. If the problem solving steps need to be altered in future and as a result, a new spatial problem arises,

1The reader should always keep the DR architecture in mind. As shown in Fig. 1.1, there are two problem solvers – the main problem solver which will be always referred to as the ”problem solver” (this might be a human) and the spatial problem solver which will be referred to as the ”SPS” (this strictly has no human intervention). The problem solver is responsible for the entire problem solving strategy including converting domain-specific perceptions and actions into domain-independent spatial problems. The SPS is responsible only for solving the domain-independent spatial problems that it receives from the problem solver. It is extremely important to not get confused between the roles played by the two.

the human has to re-program the procedure for obtaining its solution. Thus, code needs to be written for solving each spatial problem. Clearly, this is inconvenient and time consuming in building a DR system, and does not allow fast and smooth experimentation with different problem solving steps for the same problem. These drawbacks are further magnified when the goal is to build a general purpose DR system, which requires solving a large variety of spatial problems. This leads us to our quest for developing a general and efficient spatial problem solver (SPS) that can communicate with the problem solver and solve spatial problems without human intervention.

The goal of this dissertation is to investigate:

1. A language for a human problem solver to communicate a large variety of spatial problems relevant to DR, and

2. A general domain-independent framework of underlying representations and reasoning strategies suitable for efficiently solving spatial problems without human intervention.

Only diagrams that are line drawings with no color or intensity variation will be considered in this dissertation. Such diagrams form a substantial class of diagrams in everyday use.

1.5 Contributions

The research reported in this dissertation contributes to building a general purpose DR system. In particular:

1. A high-level, finite, extensible, human-usable, and expressive language will be proposed in which a wide variety of spatial problems can be specified as constraints in first-order logic over the real domain using a vocabulary of objects, properties, relations and actions.

2. Two general and independent strategies – constraint satisfaction and spatial search – will be developed for autonomously solving the spatial problems specified in that language.

Constraint-based languages have been in use for quite some time in the constraint satisfaction community (see [Sturm and Weispfenning, 1998; Weispfenning, 2001] for example). However, such languages are close to machine language, devoid of any vocabulary of objects, properties, relations or actions, and problems are specified in terms of algebraic equations/inequalities. The proposed language supports the specification of a large variety of 2D spatial problems using predicates from a small vocabulary, thereby making the specification more elegant, understandable and easy to use. Recurring problems can be stored in the vocabulary and reused for specifying more complicated problems, thereby letting the vocabulary grow naturally and relieving the problem solver of much work by not requiring him to dig deep into an ocean of equations/inequalities each time he wishes to specify a problem.

Constraint satisfaction strategies2 have been in use for solving spatial problems since Sutherland's Sketchpad [Sutherland, 1963]. That the use of quantifiers can significantly enhance the expressiveness of problem specification has been well known; however, eliminating quantifiers is a severely compute-intensive task. As a result, instead of aiming for generality, researchers have concentrated on developing quantifier elimination algorithms for small classes of problems, or on providing only partial solutions. In this dissertation, I propose an algorithm that, for solving a problem, decomposes it into subproblems, and stores new subproblems to be reused later. As we will see, this algorithm significantly reduces the complexity of quantifier elimination without compromising on generality.

A large body of research has been dedicated to algebraically involved constraint satisfaction strategies in the real domain. The spatial search strategy, a radically different approach from algebraic constraint satisfaction, is developed for spatial problem solving; it accepts a problem in the same specification language, eliminates quantifiers, and solves constraints to compute the solution without human intervention. The efficiency of this strategy has been enhanced by viewing a diagram at multiple levels

2The term "constraint satisfaction" is being used to refer to a standard algebraic approach that the well-established constraint satisfaction community follows (see proceedings of the annual International Conference on Principles and Practice of Constraint Programming and the CONSTRAINTS journal). While the goal of both the strategies in this dissertation is to compute a solution that satisfies the constraints, they do so very differently.

of resolution and cleverly filtering out space in the diagram at each level to reduce the amount of space and the number of objects on which constraints have to be checked for satisfaction in order to solve spatial problems.

1.6 Organization of the dissertation

The rest of the dissertation is organized as follows. In the next chapter, I describe a vocabulary of objects, properties, relations and actions that has been proposed in our group's earlier research. I discuss a specification language for a problem solver to specify 2D spatial problems to the SPS using that vocabulary. Given in the specification language, a spatial problem in its general form can be defined as a quantified constraint satisfaction problem (QCSP). Depending on the nature of the solution, the spatial problems are classified into three classes - decision, function, and optimization. It is observed that the computational bottleneck in solving a non-trivial spatial problem is quantifier elimination.

Chapter 3 investigates a constraint satisfaction strategy for efficiently solving spatial problems, specified in the specification language, without human intervention. Since a QCSP is PSPACE-complete, a fast and general algorithm for solving QCSPs is not possible. The QCSPs of interest in this dissertation are limited to 2D spatial problems. Is it possible to devise a faster algorithm for solving this limited class of QCSPs? It will be shown that by taking the help of previously solved problems, this limited class of QCSPs can be solved in low-order polynomial time.

Chapter 4 proposes a spatial search strategy for efficiently solving spatial problems, specified in the same specification language, without human intervention. Unlike the constraint satisfaction strategy, the search for the solution is executed by searching the space in the diagram. For solving real world problems, an exhaustive search is often required which becomes computationally very intensive. Is it possible to devise an algorithm to reduce the complexity of this exhaustive search? An algorithm will be proposed that indeed reduces the computational cost of exhaustive spatial search to less than 2% (in the empirical case).

Chapter 5 illustrates the expressiveness of the specification language and the efficiency of the two strategies by using the proposed frameworks for executing perceptions and actions in reasoning with diagrams. Three applications - entity re-identification, ambush analysis and proving Pythagoras' theorem - are chosen from two different domains - military and Euclidean geometry - to showcase the capabilities. Finally, Chapter 6 discusses the contributions of this dissertation and avenues for future research.

CHAPTER 2

A SPECIFICATION LANGUAGE

In this chapter, I discuss a high-level language that is finite, extensible, human-usable, and expressive enough to describe a wide variety of 2D spatial problems relevant to diagrammatic reasoning (DR). The problems specified in this language will be accepted as input by the spatial problem solver (SPS) and will be solved without human intervention. The specification language should be independent of the SPS, i.e., the problem specification should remain unchanged even if the underlying representation and reasoning strategy of the SPS change.

2.1 Vocabulary

In our group's earlier work [Banerjee and Chandrasekaran, 2004; Chandrasekaran et al., 2002, 2004, 2005], a general vocabulary of objects, properties, relations and actions has been identified based on their wide usage in DR for expressing a variety of real world problems in different domains. The same vocabulary will be used in this dissertation. This vocabulary is not claimed to be complete, and the addition of new properties, relations and actions is allowed when a problem cannot be easily expressed using the existing ones. The observation is that a human does not encounter new objects, properties, relations or actions very often but quite often solves new spatial problems involving properties of, relations among, and actions on a few different kinds of objects. Our vocabulary is an endeavor to capture these objects, properties, relations and actions. In the next section, we will see how a language for specifying a wide variety of spatial problems can be constructed using this vocabulary.

Diagrammatic objects. The vocabulary contains three kinds of diagrammatic objects – points, curves, regions. A point is the basic diagrammatic object. The other objects are defined in terms of a set of points. The declarations isaPoint(p), isaCurve(c), isaRegion(r) impose the constraints that p, c, r are a point, a curve, and a region, respectively, and allow them to be the argument of only the appropriate properties, relations and actions. If no such declaration is made, a variable is assumed to be a real number. Moreover, isaPoint(p) assigns a unique pair of coordinates to p, isaCurve(c) assigns a finite sequence of points to c (represented piecewise linearly or by pixels), and isaRegion(r) assigns a finite sequence of points representing the periphery of r. Also, each occurrence of p, c, r is replaced with those assignments before spatial problem solving. Further, the SPS can be asked to recognize the kind of diagrammatic object obtained as the solution to a spatial problem by using the function Recognize(<Object>, <ObjectType>). For example, the set of all points behind a curve c with respect to a given point p can be a region object or a curve object depending on the nature of c and its location with respect to p. In order to recognize the output, the system colors the corresponding set of pixels on the diagram, where each pixel corresponds to a point. The boundary pixels of the output are determined. If it is a curve, the starting and ending points of the boundary will be different. If it is a region, they will be the same and the inside of the closed curve (periphery) will be colored. The output might be a number of regions or curves, in which case each of them is identified. The Recognize function also helps to extract a curve or the periphery of a region as a piecewise linear curve from a complicated logical combination of algebraic equations and inequalities.

Properties. Associated with each kind of object are a few properties – location of a point; location, closedness and length of a curve; and location, area and periphery of a region, where the periphery of a region refers to its boundary curve. The user can also define particular shapes (e.g., circle, triangle, annulus, etc.) for curves and regions as appropriate

for reasoning in his domain. Different shapes might have their own specific properties, such as the radius of a circle, the height of a triangle, etc., which can be easily associated with the objects in our vocabulary by the user. DR also requires solving spatial problems concerning a discrete set of points. For such problems, properties such as Centroid(S) and Variance(S), where S is a set of points, are included in the vocabulary.

Relations. The vocabulary also contains a few widely used relations involving objects, such as Distance(p1, p2), Angle(p1, p2, p3), Leftof(p1, p2), Topof(p1, p2), Collinear(p1, p2, p3), Between(p1, p2, p3) where p1, p2, p3 are points, Intersect(c1, c2), Touches(c1, c2), Subcurveof(c1, c2) where c1, c2 are curves, On(p, c) where p is a point and c a curve, Inside(p, r) where p is a point and r a region, Subregionof(r1, r2) where r1, r2 are regions, etc.

Actions. Further, there is a set of actions for identifying emergent objects, modifying existing objects, or creating new objects. For example, IntersectionPoint(c1, c2) returns all points of intersection of two curves c1 and c2, Translate(o, tx, ty) returns a translation of object o by tx units along the x-axis and ty units along the y-axis, Rotate(o, c, θ) returns a rotation of the object o with respect to point c as center by θ degrees in the anti-clockwise direction, Reflect(o, {a, b}) returns a reflection of object o with respect to the line segment {a, b} (i.e., from point a to point b), and Scale(o, c, sx, sy) returns a scaling of the object o with respect to point c by sx units along the x-axis and sy units along the y-axis.
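For concreteness, the sketch below shows one plausible rendering of the three kinds of diagrammatic objects and two of the actions (Translate, Rotate) as plain data structures and functions. It is an illustrative Python sketch only, not the dissertation's implementation; the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]

@dataclass
class Point:                 # a labeled point: a single coordinate pair
    label: str
    xy: Coord

@dataclass
class Curve:                 # a curve: a finite sequence of points (piecewise linear)
    label: str
    points: List[Coord]

@dataclass
class Region:                # a region: the sequence of points forming its periphery
    label: str
    periphery: List[Coord]

def translate(points: List[Coord], tx: float, ty: float) -> List[Coord]:
    """Translate(o, tx, ty): shift every point of an object by (tx, ty)."""
    return [(x + tx, y + ty) for x, y in points]

def rotate(points: List[Coord], c: Coord, theta_deg: float) -> List[Coord]:
    """Rotate(o, c, theta): rotate an object about center c by theta degrees,
    anti-clockwise."""
    t = math.radians(theta_deg)
    cx, cy = c
    return [(cx + (x - cx) * math.cos(t) - (y - cy) * math.sin(t),
             cy + (x - cx) * math.sin(t) + (y - cy) * math.cos(t))
            for x, y in points]

if __name__ == "__main__":
    path = Curve("c1", [(0.0, 0.0), (2.0, 1.0), (4.0, 0.5)])
    moved = Curve("c2", translate(path.points, 1.0, -0.5))
    turned = Curve("c3", rotate(path.points, (0.0, 0.0), 90.0))
    print(moved.points, turned.points)
```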

2.2 Specification language

This is the language in which a problem solving agent (human or artificial) describes a spatial problem to the SPS. The internal representations of objects, properties, relations, actions, and the problem-solving strategies are hidden from the agent. The specification language remains unchanged even if the underlying representation or problem-solving strategy is changed. I propose a functional constraint logic programming language as the specification language. Besides a vocabulary of objects, properties, relations and actions, it recognizes a set of mathematical/logical operators {+, −, ×, ÷, <, >, =, ≠, ∧, ∨, ¬, ⇒, ∪, ∩, ∈, ⊂, ∃, ∀} and a set of functions including Maximize(f, {x, y, ...}, C) (or Minimize(f, {x, y, ...}, C)), which maximizes (or minimizes) the function f with respect to the variables {x, y, ...} satisfying the logical combination of constraints C (which might involve quantifiers) and returns the maximum (or minimum) value of f along with the conditions on the variables.

Constraint Satisfaction Problem. An instance of a constraint satisfaction problem (CSP) consists of a tuple <V, D, C> where V is a finite set of variables, D is a domain, and C = {C1, ..., Ck} is a set of constraints. A constraint Ci consists of a pair <Si, Ri> where Si is a list of mi variables and Ri is an mi-ary relation over the domain D. The question is to decide whether or not there is an assignment mapping each variable to a domain element such that all the constraints are satisfied. An example of a CSP is as follows:

Intersect(c, r) ≡ On(p, c) ∧ Inside(p, r) where p is a point, c is a curve, and r is a region, all in 2D space (ℝ²). In this example, there are two constraints:

<{p, c}, On>
<{p, r}, Inside>

Further, V = {p} and D = ℝ². The variables c, r are given. The question here is to decide whether or not there is an assignment mapping p to an element of ℝ² such that both the constraints are satisfied. If there is, then the curve c and region r intersect each other; otherwise they do not. Hence the problem is labeled Intersect(c, r).
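To make the example concrete, the sketch below decides a specific instance of Intersect(c, r) by brute force: the curve is a polyline, the region is a polygon given by its periphery, and the constraint on p is checked by sampling points on the curve and testing Inside with a ray-casting test. This is an illustrative Python sketch under those assumptions, not the SPS's constraint solving machinery; the sampling step is an arbitrary choice.

```python
from typing import List, Tuple

Coord = Tuple[float, float]

def sample_curve(curve: List[Coord], step: float = 0.05) -> List[Coord]:
    """Enumerate candidate points p with On(p, c): sample the polyline densely."""
    pts: List[Coord] = []
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        n = max(1, int(((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 / step))
        pts += [(x0 + (x1 - x0) * i / n, y0 + (y1 - y0) * i / n) for i in range(n)]
    pts.append(curve[-1])
    return pts

def inside(p: Coord, periphery: List[Coord]) -> bool:
    """Inside(p, r): ray-casting point-in-polygon test against the region periphery."""
    x, y = p
    hit = False
    for (x0, y0), (x1, y1) in zip(periphery, periphery[1:] + periphery[:1]):
        if (y0 > y) != (y1 > y) and x < x0 + (y - y0) * (x1 - x0) / (y1 - y0):
            hit = not hit
    return hit

def intersect(c: List[Coord], r: List[Coord]) -> bool:
    """Intersect(c, r): is there a sampled p with On(p, c) and Inside(p, r)?"""
    return any(inside(p, r) for p in sample_curve(c))

if __name__ == "__main__":
    curve = [(0.0, 0.0), (4.0, 0.0)]                             # a horizontal polyline
    region = [(1.0, -1.0), (2.0, -1.0), (2.0, 1.0), (1.0, 1.0)]  # a square region
    print(intersect(curve, region))                              # True: the curve crosses it
```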

Quantified Constraint Satisfaction Problem. All of the variables in a CSP can be thought of as being implicitly existentially quantified. The CSP involves deciding whether or not there exists an assignment to each of the variables such that all given constraints are satisfied. For example, the CSP Intersect(c, r) can be written as follows:

Intersect(c, r) ≡ ∃p, On(p, c) ∧ Inside(p, r)

A useful generalization of the CSP is the quantified constraint satisfaction problem (QCSP), where variables may be both existentially and universally quantified. An instance of the QCSP consists of a quantified formula in first-order logic, which consists of an ordered list of variables with associated quantifiers along with a set of constraints. A QCSP can be expressed as follows:

φ(v1, ..., vm) ≡ Q(xn, ..., x1) φ′(v1, ..., vm, x1, ..., xn)

Q(xn, ..., x1) ≡ Qn xn, ..., Q1 x1

where Qi ∈ {∀, ∃}, {x1, ..., xn} is the set of quantified variables, {v1, ..., vm} is the set of free variables, V = {v1, ..., vm, x1, ..., xn}, and φ′ is a quantifier-free expression called the matrix. Such a representation of a quantified expression φ, where it is written as a sequence of quantifiers followed by the matrix, is referred to as prenex form. An example of a QCSP is as follows:

Subcurveof(c1, c) ≡ ∀p, On(p, c1) ⇒ On(p, c)

where c1, c are curves in ℝ².

Decision, Function and Optimization problems. In the proposed specification language, a spatial problem φ is expressed as a QCSP where V consists of variables of type point, curve or region and D = ℝ². Solving a spatial problem involves:

1. When there are no free variables in V (i.e., all variables in V are quantified), deciding whether or not there exists a mapping from V to D satisfying C.

2. When there are free variables in V, computing the conditions on the free variables such that a mapping from V to D satisfying C exists.

Thus, a spatial problem can be classified as a decision or a function or an optimization problem in the real domain. The first case constitutes a decision problem – the question is to decide whether or not there is an assignment mapping each variable to a domain element such that all constraints are satisfied, and hence it yields a True or False solution. The second case constitutes a function problem, which involves computing the diagrammatic object(s) described by the conditions on the free variables. If a spatial problem requires computing the "best" mapping from V to D satisfying C, it is classified as an optimization problem.

Let us consider an example. Given a curve c and two points p, q, the spatial problem BehindCurve(q, c, p) is defined as deciding whether or not q is behind c with respect to p. This might be specified as deciding whether or not the curve c and line segment {p, q} intersect. Thus,

BehindCurve(q, c, p) ≡ Intersect(c, {p, q})

For particular instances of q, p, c, the solution to this problem is True or False, hence it is a decision problem (see Fig. 2.1). For particular instances of p, c, and generalized coordinates of q, i.e., q ← {x, y}1, the solution to the same problem is a logical combination of conditions involving x and y, which when plotted constitutes a region object (see Fig. 2.2). Hence, it becomes a function problem. While a decision problem merely requires checking whether or not a given instance of an object satisfies the constraints, a function problem requires computing all conditions for a general object to satisfy the constraints. Hence, a function problem is more difficult than a decision problem.
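For intuition, the sketch below evaluates BehindCurve with the curve treated as a polyline: a specific q gives the decision problem, while sweeping q over a grid of generalized coordinates approximates the region object of the function problem (the shaded region of Fig. 2.2). This is an illustrative brute-force Python sketch, not the SPS's symbolic treatment; the grid step and bounds are arbitrary choices.

```python
from typing import List, Tuple

Coord = Tuple[float, float]

def _ccw(a: Coord, b: Coord, c: Coord) -> float:
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_intersect(p1: Coord, p2: Coord, q1: Coord, q2: Coord) -> bool:
    """True if segment p1-p2 properly crosses segment q1-q2."""
    d1, d2 = _ccw(q1, q2, p1), _ccw(q1, q2, p2)
    d3, d4 = _ccw(p1, p2, q1), _ccw(p1, p2, q2)
    return (d1 * d2 < 0) and (d3 * d4 < 0)

def behind_curve(q: Coord, c: List[Coord], p: Coord) -> bool:
    """Decision problem: BehindCurve(q, c, p), i.e. Intersect(c, {p, q})."""
    return any(segments_intersect(p, q, a, b) for a, b in zip(c, c[1:]))

def behind_curve_region(c: List[Coord], p: Coord, step: float = 0.25,
                        bounds=(-5.0, 5.0)) -> List[Coord]:
    """Function problem: approximate the set of all q behind c w.r.t. p by
    testing every node of a grid (a crude stand-in for symbolic conditions)."""
    lo, hi = bounds
    n = int((hi - lo) / step)
    grid = [(lo + i * step, lo + j * step) for i in range(n + 1) for j in range(n + 1)]
    return [q for q in grid if behind_curve(q, c, p)]

if __name__ == "__main__":
    c = [(0.0, -2.0), (0.0, 2.0)]          # a vertical curve (a wall)
    p = (-1.0, 0.0)                        # the observer
    print(behind_curve((2.0, 0.0), c, p))  # True: the wall blocks the segment p-q
    print(behind_curve((-2.0, 0.0), c, p)) # False: q is on the observer's side
    print(len(behind_curve_region(c, p)))  # number of sampled grid points behind c
```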

Let us now slightly modify the problem. Given a curve c and two points p, q, the spatial problem FurthestBehindCurve(q, c, p) is defined as deciding whether or not q is the furthest point behind c with respect to p. This might be specified as deciding whether or not q lies behind c with respect to p and the distance between p and q is maximum. Thus,

FurthestBehindCurve(q, c, p) ≡ BehindCurve(q, c, p) ∧ ∀b, isaPoint(b) ∧ (BehindCurve(b, c, p) ⇒ Distance(b, p) ≤ Distance(q, p))

Again, for particular instances of q, p, c, the solution to this problem is True or False, hence it is a decision problem. For particular instances of p, c, and generalized coordinates of q, i.e., q ← {x, y}, the solution to the same problem is a logical combination of conditions involving x and y, which when plotted constitutes a single point object, assuming there is only one furthest point behind c with respect to p, which depends on the nature of c and how the Distance function is defined (see Fig. 2.3). Though this looks like a function problem, it is actually an optimization problem.

1Since a point in the SPS is represented by a set of two real numbers as its coordinates, {} has been used instead of the conventional ().

Fig. 2.1: The BehindCurve as a decision problem. One of the points q is behind c with respect to p while the other one is not.

Fig. 2.2: The BehindCurve as a function problem. The shaded region r is behind c with respect to p.

Fig. 2.3: The FurthestBehindCurve as an optimization problem. The point q is the furthest point behind c with respect to p.

An alternative way of specifying the same problem FurthestBehindCurve(q, c, p) is by explicitly asking to maximize the distance between p and q where q satisfies the constraint BehindCurve(q, c, p), written as:

FurthestBehindCurve(q, c, p) ≡ Maximize(Distance(q, p), {q}, BehindCurve(q, c, p))

This outputs the conditions involving x and y, which constitute a single point object. The Maximize (or Minimize) function assumes that the pool of candidates from which to choose the best are those that satisfy the set of constraints. If the Maximize (or Minimize) function is not used, this fact has to be stated explicitly, which makes the specification more cumbersome and more difficult to come up with. On the flip side, the specification of a problem using the Maximize (or Minimize) function cannot be used as a decision problem. That is, whether or not a particular instance of an object is the best candidate that satisfies the constraints cannot be computed from this specification, unlike the former specification. A problem of this type, which computes the best candidate out of a pool of candidates, is called an optimization problem. Thus, an optimization problem is a function problem that also requires maximization (or minimization) of a function stated either explicitly or implicitly. Hence, it is a more difficult problem than a function problem.

Thus, a spatial problem φ is a mapping from a set of diagrammatic objects O satisfying a logical combination of constraints C to a set of booleans {True, False} or real numbers ℝ or diagrammatic objects O′, i.e.,

φ : O → {True, False} ∪ ℝ ∪ O′
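To illustrate the optimization case on the running example, the sketch below approximates Maximize(Distance(q, p), {q}, BehindCurve(q, c, p)) by filtering a finite pool of candidate points with a brute-force BehindCurve test and taking the one furthest from p. It is a hypothetical, discretized stand-in for the symbolic Maximize function, with the curve treated as a polyline and candidates drawn from a grid.

```python
from typing import List, Tuple

Coord = Tuple[float, float]

def _cross(o: Coord, a: Coord, b: Coord) -> float:
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _crosses(p: Coord, q: Coord, a: Coord, b: Coord) -> bool:
    return _cross(a, b, p) * _cross(a, b, q) < 0 and _cross(p, q, a) * _cross(p, q, b) < 0

def behind_curve(q: Coord, c: List[Coord], p: Coord) -> bool:
    """BehindCurve(q, c, p): the segment p-q crosses some segment of the polyline c."""
    return any(_crosses(p, q, a, b) for a, b in zip(c, c[1:]))

def furthest_behind_curve(c: List[Coord], p: Coord, candidates: List[Coord]) -> Coord:
    """Maximize(Distance(q, p), {q}, BehindCurve(q, c, p)) over a finite candidate pool:
    keep only candidates satisfying the constraint, then take the one furthest from p."""
    feasible = [q for q in candidates if behind_curve(q, c, p)]
    return max(feasible, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

if __name__ == "__main__":
    c = [(0.0, -2.0), (0.0, 2.0)]      # a vertical curve (a wall)
    p = (-1.0, 0.0)                    # the observer
    pool = [(x * 0.5, y * 0.5) for x in range(-8, 9) for y in range(-8, 9)]
    print(furthest_behind_curve(c, p, pool))   # a corner of the grid on the far side
```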

2.3 Discussion

My goal in this chapter was to develop a language for humans to communicate a wide variety of spatial problems relevant to DR. A high-level specification language that is finite, extensible, human-usable, and expressive enough to describe a wide variety of 2D spatial problems as constraints specified in first-order logic over the real domain, using a vocabulary of objects, properties, relations and actions, was developed. In the next two chapters, we will see how an SPS can be designed that accepts problems in this specification language and solves them without human intervention.

CHAPTER 3

A CONSTRAINT SATISFACTION FRAMEWORK

In this chapter, I discuss a constraint satisfaction strategy and the corresponding underlying representation using which the spatial problem solver (SPS) solves spatial problems without human intervention. In this strategy, a spatial problem is transformed from the specification language to a modeling language and then subjected to constraint solving and quantifier elimination for computing the solution. Unfortunately, the complexity of quantifier elimination is, in general, doubly exponential in the number of blocks of variables delimited by alternations of the existential and universal quantifiers [Davenport and Heintz, 1988]. I show how this complexity can be considerably reduced by taking the help of previously solved problems, which requires intelligent memory design, effective problem representation, and a function for mapping one problem to another.

3.1 Overall algorithm

It is well-known that constraint satisfaction is a convenient framework for modeling search problems. As defined in the last chapter, an instance of the constraint satisfaction problem (CSP) is in general NP-complete while the quantified constraint satisfaction problem (QCSP) is in general PSPACE-complete. The complexity class PSPACE is believed to be much larger than NP. For more details on complexity issues concerning QCSPs, refer to [Chen, 2004].

The Tarski-Seidenberg theorem [Tarski, 1951, 1998] states that for every first-order formula with equality and inequality predicates, multiplication and addition, over the real field there exists an equivalent quantifier-free formula. Furthermore, there is an explicit algorithm to compute this quantifier-free formula. The primary task in solving a spatial problem is quantifier elimination. Given a QCSP where the constraints are defined in terms of a logical combination of algebraic expressions, the complexity of quantifier elimination is inherently doubly exponential. That is, a doubly exponential time complexity for the quantifier elimination problem is unavoidable (see [Davenport and Heintz, 1988] and section 11.4 of [Basu et al., 2003] for proof). Theoretically, the best complexity achieved so far is O(s^((l+1)Π(ki+1)) d^((l+1)Πki)), where s is the number of polynomials, their maximum degree is d and coefficients are real, l is the number of free variables, ki is the number of variables in the ith quantifier block, and k = Σki is the number of quantified variables [Basu et al., 2003]. However, this algorithm is too complicated to yet have a practical implementation. The most general and elaborately implemented method for real quantifier elimination is cylindrical algebraic decomposition (CAD) [Collins and Hong, 1991], the complexity of which is O((sd)^(O(1)^(k−1))). Another implemented method, quantifier elimination by virtual substitution [Weispfenning, 1988], is restricted to formulas in which the quantified variables occur at most quadratically. The complexity of this method is doubly exponential in the number of blocks of variables delimited by alternations of the existential and universal quantifiers. Thus, while there exist general algorithms (see [Dolzmann et al., 1998] for a review) for quantifier elimination, for large real world problems they soon become too time-consuming.

In view of this, the goal of my design of the SPS is to strive for efficiency without com- promising generality. The idea is to bypass the general quantifier elimination algorithms as much as possible, either by taking the help of previously solved similar problems in memory to obtain the solution or by using a set of more practical algorithms each of which is developed for a limited class of problems. The next couple of paragraphs describe the overall control mechanism of the algorithm.

Given a spatial problem φ in the specification language, the SPS replaces numerical values in the problem by symbolic variables, and then transforms the symbolic problem from the specification to a modeling language (to be described shortly) by progressively replacing objects/predicates by the base objects/predicates in their internal definitions. If a definition cannot be found, it flags an error and halts until one is provided. As a first step of solving, the SPS decomposes φ into disjunctions and/or conjunctions of subproblems φi in prenex form.

As we will see later, all of these subproblems φi are similar to each other in the sense that if one of them can be solved, the solution to any of the others can be computed from it. Next, the SPS searches the memory for problems similar to φi. The memory contains symbolic problems and their corresponding quantifier-free symbolic solutions. If φi can be mapped to one of these problems, its solution is readily obtained by reverse-mapping from the corresponding symbolic solution in memory. Obtaining a solution in this way completely bypasses the quantifier elimination process, which is the computational bottleneck of the proposed approach, thereby reducing the computational costs considerably. If the SPS cannot map φi to any problem in memory, it sends φi to the problem classifier, which classifies it and sends it to the appropriate quantifier elimination algorithm. The problem classifier and the combination of quantifier elimination algorithms have been borrowed from Mathematica [Wolfram, 2003]. Once the SPS solves a new subproblem, the subproblem and its solution are stored in memory so that the solution can be used when a similar problem is encountered in the future. Thus, the SPS grows more efficient as it solves more problems. Finally, the SPS computes the solution of the given problem φ by combining the solutions of all its subproblems. See Fig. 3.1 for a flow diagram of the SPS.
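To make the control flow concrete, the following Python-style sketch outlines this loop. It is only an illustration of the control mechanism described above, not the actual implementation (which relies on Mathematica for the quantifier elimination step); all the helper names (to_modeling_language, decompose, find_similar, variable_mapping, map_solution, quantifier_eliminate, combine) are hypothetical.

def solve_spatial_problem(phi, memory):
    # Replace numerical values by symbolic variables and expand vocabulary
    # terms into base objects/predicates (the modeling language).
    symbolic = to_modeling_language(phi)
    # Decompose into prenex-form subproblems; by the theorem in section 3.3,
    # these subproblems are all similar to one another.
    subproblems = decompose(symbolic)
    first = subproblems[0]
    match = memory.find_similar(first)
    if match is not None:
        # Bypass quantifier elimination: reverse-map the stored solution.
        stored_solution, mapping = match
        first_solution = map_solution(stored_solution, mapping)
    else:
        # Fall back to the problem classifier + quantifier elimination.
        first_solution = quantifier_eliminate(first)
        memory.store(first, first_solution)
    # Solutions of the other subproblems follow by variable mapping.
    rest_solutions = [map_solution(first_solution, variable_mapping(first, sp))
                      for sp in subproblems[1:]]
    # Combine per the conjunctive/disjunctive structure of the decomposition.
    return combine([first_solution] + rest_solutions)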

Unfortunately, for some problems, quantifiers cannot be eliminated symbolically in reasonable time. The SPS tries for a prescribed time, after which it resorts to more practical methods, such as techniques especially suited for low-degree polynomials [Dolzmann et al., 1998; Loos and Weispfenning, 1993] and approximate methods [Lasaruk and Sturm, 2006; Ratschan, 2006] that obtain a subset of the solution sufficient for immediate purposes. For optimization problems, the SPS uses numerical techniques [Nelder and Mead, 1965; Storn and Price, 1997] to obtain a solution that is sufficiently close to the exact solution in much less time than exact techniques.

Fig. 3.1: Flow diagram of the spatial problem solver using constraint satisfaction. The problem φ is converted from the specification language to the modeling language, decomposed into prenex-form subproblems, and the first subproblem is matched against memory; it is solved either by variable mapping from a stored solution or by the problem classifier and the combination of constraint solvers and quantifier elimination algorithms, after which the solutions of the other subproblems are computed from it and combined.

Once a subproblem is deemed symbolically unsolvable within the prescribed time, its specification is stored in memory so that, in the future, a similar problem can be directly subjected to the practical methods, thereby saving the prescribed time. The algorithms mentioned above are implemented in Mathematica [Wolfram, 2003], which I have been using to implement the "problem classifier and combination of constraint solvers and quantifier elimination algorithms" module of the SPS (see Fig. 3.1). Since none of these algorithms are my original contribution, I will not elaborate on them further in this dissertation.

3.2 Modeling language

This is the language in which a problem is described in terms of the underlying representations of objects/properties/relations/actions in a form that can be readily subjected to algebraic manipulation. A point is represented as a pair {x, y}, x, y ∈ ℝ, of its coordinates. A curve c, represented piecewise linearly, is specified by a sequence of points {p1, p2, ..., pn}, where #(c) = n is the number of points (vertices) in c, the i-th point is c[i] = pi, and the i-th line segment is lsi = {c[i], c[i+1]}. A line segment ls is specified by its pair of terminal points {p, q}. The x- and y-coordinates of ls are parameterized as

fx(ls, t) = ls[1].x + t(ls[2].x − ls[1].x)
fy(ls, t) = ls[1].y + t(ls[2].y − ls[1].y)

where t is a parameter, 0 ≤ t ≤ 1, ls[1] = p and ls[2] = q. A region r is specified by its periphery. The periphery is represented as a piecewise linear closed curve. Internally, a region is triangulated (computable in linear time [Chazelle, 1991]) with the aim of reducing and simplifying computations. Thus, internally, there are three base objects – point, line segment, and triangular region.
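A minimal sketch of these base objects in Python, assuming the same conventions (the fields p and q below correspond to ls[1] and ls[2]); the class and field names are mine, not the dissertation's:

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

@dataclass
class LineSegment:
    p: Point                      # terminal point ls[1]
    q: Point                      # terminal point ls[2]

    def fx(self, t: float) -> float:
        # x-coordinate at parameter t, 0 <= t <= 1
        return self.p.x + t * (self.q.x - self.p.x)

    def fy(self, t: float) -> float:
        # y-coordinate at parameter t, 0 <= t <= 1
        return self.p.y + t * (self.q.y - self.p.y)

# A curve is a sequence of Points; a region is a closed piecewise-linear
# periphery that is internally triangulated into (Point, Point, Point) tuples.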

The properties, relations and actions included in the vocabulary are internally represented in terms of base properties/relations/actions (or predicates). A base predicate is one which accepts only the base objects as arguments. Examples of base predicates include Distance(p1, p2), Area(△), Angle(p1, p2, p3), Leftof(p1, p2), Collinear(p1, p2, p3), Between(p1, p2, p3), and On(p, ls), where p, p1, p2, p3 are points, ls is a line segment, and △ is a triangular region. For example, an important relation in the vocabulary, Intersect(c1, c2) where c1, c2 are curves, is defined in terms of the base objects/properties/relations as follows:

Intersect(c1, c2) ≡ ∨_{i←1}^{#(c1)−1} ∨_{j←1}^{#(c2)−1} Intersect({c1[i], c1[i+1]}, {c2[j], c2[j+1]})

Intersect(ls1, ls2) ≡ ∃q, isaPoint(q) ∧ On(q, ls1) ∧ On(q, ls2)

On(p, ls) ≡ ∃t, 0 ≤ t ≤ 1 ∧ fx(ls, t) = p.x ∧ fy(ls, t) = p.y

Another important relation in the vocabulary, Inside(p, r) where p is a point and r is a region, is defined in terms of the base objects/properties/relations as follows:

Inside(p, r) ≡ ∨_{i←1}^{#△(r)} Inside(p, △(r)[i])

Inside(p, △) ≡ Leftof(p, {△[1], △[2]}) ∧ Leftof(p, {△[2], △[3]}) ∧ Leftof(p, {△[3], △[1]})

Leftof(p, ls) ≡ Area({ls[1], ls[2], p}) > 0

Area(△) ≡ Σ_{i←1}^{n} (△[i].x × △[i+1\n].y − △[i+1\n].x × △[i].y)

where #△(r) is the number of triangles in region r after triangulation, △(r)[i] is the i-th triangle of r, △[i] is the i-th point on the periphery of triangular region △, and i+1\n denotes addition modulo n, so that the last point wraps around to the first.
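Continuing the sketch above, the base predicates Leftof, Inside(p, △) and Inside(p, r) translate directly into code; this version assumes the triangle vertices are listed counter-clockwise, so that a positive signed area means "left of":

def area2(a: Point, b: Point, c: Point) -> float:
    # Twice the signed area of triangle (a, b, c); positive iff counter-clockwise.
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y)

def leftof(p: Point, ls: LineSegment) -> bool:
    # Leftof(p, ls) ≡ Area({ls[1], ls[2], p}) > 0
    return area2(ls.p, ls.q, p) > 0

def inside_triangle(p: Point, tri) -> bool:
    a, b, c = tri
    return (leftof(p, LineSegment(a, b)) and
            leftof(p, LineSegment(b, c)) and
            leftof(p, LineSegment(c, a)))

def inside_region(p: Point, triangles) -> bool:
    # Inside(p, r) is a disjunction over the triangles of the triangulated region r.
    return any(inside_triangle(p, tri) for tri in triangles)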

3.3 Mapping to a similar problem

I define two problems (quantified expressions) to be similar if there exists a one-to-one correspondence between their free variables such that the solution (equivalent quantifier-free expression) of one can be obtained by replacing the corresponding variables in the solution of the other. Given two similar problems, φ1 and φ2, and the solution ψ1 of φ1, the goal is to construct a one-to-one mapping Γ between the variables of φ1 and φ2 such that the solution of φ2 can be obtained by replacing the variables in ψ1 by the corresponding variables, thereby completely bypassing the quantifier elimination process – the computational bottleneck of spatial problem solving.

For any two arbitrary problems, the task is to decide whether or not they are similar, and if they are, to compute the mapping Γ between their variables. By definition, if two problems are similar, the mapping will exist. However, the mapping might not be computable in reasonable time, in which case it will not serve our purpose. It is important to observe that if two problems are similar, then they are logically equivalent and vice versa. Unfortunately, equivalence checking for logical expressions is NP-complete [Dershowitz, 1990]. Thus, for our purposes, equivalence checking cannot be used to determine similarity.

I use the following idea. Instead of logical equivalence, it is checked whether or not two problems are structurally equivalent, i.e., whether or not their parse trees are isomorphic. If they are, then the mapping Γ, if it exists, is computed between the variables of the two problems. It is noteworthy that structural equivalence, unlike logical equivalence, does not by itself imply that a mapping exists, simply because the base predicates and base objects occurring at the corresponding nodes of two isomorphic trees might not be the same. Though this idea saves a lot of computation time, it comes with a limitation: two logical expressions might be similar and yet not structurally equivalent. For example, the expressions (P ∧ ¬P) ∨ Q and Q, where P and Q are base predicates, are logically equivalent, hence similar, but not structurally equivalent. In such cases, the idea will fail to recognize their similarity. However, whenever two problems are structurally equivalent and there exists a mapping Γ between their variables, they are similar. Thus, the condition used here is sufficient to determine similarity but not necessary. Given that this idea is efficient (as we will see soon) and is widely applicable for our purposes, I will continue to use it for bypassing quantifier elimination until a better idea can be formulated.

Problem features. Let φ′ be the quantifier-free expression when φ is expressed in prenex form, i.e.,

φ(v1, ..., vm) ≡ Q(xn, ..., x1) φ′(v1, ..., vm, x1, ..., xn)

where Q(xn, ..., x1) is a sequence of quantified variables (∀xi or ∃xi), no variable appears more than once in Q, and Q contains no redundant variables. A quantifier block qb of Q is a maximal contiguous subsequence of Q in which every variable has the same quantifier type. The quantifier blocks are ordered by their sequence of appearance in Q: qb1 ≤ qb2 iff qb1 is equal to or appears before qb2 in Q. Each quantified variable xi in φ′ appears in some quantifier block qb(xi), and the ordering of the quantifier blocks imposes a partial order on the quantified variables. The variables in the same quantifier block are unordered.

Let φ1 ≡ Q1 φ1′ and φ2 ≡ Q2 φ2′, and let τ1 and τ2 be the parse trees of φ1′ and φ2′ respectively. Two trees, τ1 and τ2, are isomorphic if there exists a bijection χ : τ1 → τ2 that preserves adjacency and the root vertex, i.e.,

χ(u) is adjacent to χ(w) ⇔ u is adjacent to w, and

χ(root(τ1)) = root(τ2)

It follows that two isomorphic trees have the same number of levels and the same number of vertices at each level. Let l be the number of levels in τ and νi the number of vertices at level i. The function ξ, defined as

ξ(⟨λ1, ..., λκ⟩) = Π_{i←1}^{κ} ρi^{λi}

where λi is an integer, ⟨⟩ denotes a sequence, and ρi is the i-th smallest prime number, maps a sequence of integers to a unique integer. From a problem φ, a tuple ϑ(φ) can be constructed as follows:

ϑ(φ) = ⟨ l, ξ(# vertices at different levels of the parse tree), # quantifier blocks, order of quantifier blocks, ξ(# variables in different quantifier blocks) ⟩
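As a rough illustration of how ξ and ϑ can be computed, assuming parse trees are given as nested (label, children) tuples and quantifier blocks as (quantifier, variables) pairs; vertices_per_level is a hypothetical helper returning the number of vertices at each level of the tree:

def first_primes(k):
    # The k smallest primes, by trial division against the primes found so far.
    primes, n = [], 2
    while len(primes) < k:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def xi(seq):
    # ξ(<λ1, ..., λκ>) = Π ρ_i^{λ_i}: encode an integer sequence as one integer.
    value = 1
    for prime, power in zip(first_primes(len(seq)), seq):
        value *= prime ** power
    return value

def theta(parse_tree, quantifier_blocks):
    levels = vertices_per_level(parse_tree)                # e.g. [1, 3] for Fig. 3.3
    return (len(levels),                                   # number of levels l
            xi(levels),                                    # ξ(# vertices per level)
            len(quantifier_blocks),                        # number of quantifier blocks
            tuple(q for q, _ in quantifier_blocks),        # order of the blocks
            xi([len(vs) for _, vs in quantifier_blocks]))  # ξ(# variables per block)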

The algorithm. Given two problems φ1 and φ2, the SPS computes ϑ(φ1) and ϑ(φ2). If ϑ(φ1) = ϑ(φ2), the SPS checks whether or not τ1 and τ2 are isomorphic and the operators and base predicates occurring at each pair of corresponding nodes are identical; otherwise it concludes that φ1 and φ2 are not similar. During the process of determining isomorphism, the SPS maps the free variables of τ1 to those of τ2. Two trees might be isomorphic but the contents of their nodes, i.e., the base predicates and operators, might not be the same, in which case no mapping of the free variables exists. In order to map the variables, the trees are lexicographically sorted: starting from the root node, the contents of the children of each node are lexicographically sorted. The sorted trees are then matched and all possible mappings of the variables are stored. While matching the trees, mappings are added or updated (i.e., mappings that create conflicts are deleted). Finally, if a mapping exists, the solution of φ2 is obtained by replacing the variables in ψ1 by the corresponding variables.
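The sketch below shows one way this sort-and-match step could be realized, assuming each parse-tree node is a (label, children) pair and variables are leaves whose labels start with "?". It pairs children by their canonical (variable-blind, lexicographically sorted) forms, which is a simplification of the procedure above: when several siblings share the same canonical form, it may miss a mapping that a more exhaustive matcher would find.

def canonical(node):
    # Canonical form of a subtree: variable names abstracted to "?",
    # children sorted lexicographically at every level.
    label, children = node
    key = "?" if label.startswith("?") else label
    return (key, tuple(sorted(canonical(c) for c in children)))

def variable_mapping(t1, t2):
    # Returns a dict mapping variables of t1 to variables of t2 if the sorted
    # trees match node by node with identical operators and base predicates;
    # returns None otherwise.
    if canonical(t1) != canonical(t2):
        return None
    mapping = {}

    def walk(a, b):
        la, ca = a
        lb, cb = b
        if la.startswith("?") and lb.startswith("?"):
            if mapping.get(la, lb) != lb:
                return False               # conflicting correspondence
            mapping[la] = lb
        elif la != lb:
            return False                   # node contents differ
        ca_sorted = sorted(ca, key=canonical)
        cb_sorted = sorted(cb, key=canonical)
        return all(walk(x, y) for x, y in zip(ca_sorted, cb_sorted))

    return mapping if walk(t1, t2) else None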

Theorem: All subproblems obtained by decomposing a problem are always similar.

Proof: In the proposed framework, while a curve is represented by an arbitrary number of points, a line segment is always represented by its two end points. Similarly, while the periphery of a region is represented by an arbitrary number of points, the periphery of a triangular region is always represented by three points. Hence, two line segments or two triangular regions are always represented similarly and differ only in the coordinates of their constituent points, unlike two curves or two regions. The base predicates are defined in terms of base objects – points, line segments, and triangular regions. Thus, when a predicate is defined as conjunctions or disjunctions of base predicates, those base predicates are always similar. Decomposition of a problem into subproblems merely replaces one or more of its predicates by similar base predicates. Hence, all the subproblems have to be similar. ∎

3.4 Memory organization

Memory in the SPS is hierarchically organized and stores problems in disjoint classes based progressively on a problem's features in ϑ – the number of levels l of its parse tree, the number of vertices at different levels captured by the function ξ, the number of quantifier blocks, the order of quantifier blocks, and the number of variables in different quantifier blocks, also captured by ξ (see Fig. 3.2). After decomposing a problem into subproblems and computing their ϑ, if the subproblems have the same value for ϑ, the SPS checks whether or not their parse trees are isomorphic and a mapping exists between their variables. Since the memory hierarchy has a constant height, insertion of a problem or searching for a potential class of similar problems can be executed in constant time. Also, the features that classify the problems are discriminative enough to create a large number of classes (leaf nodes), each class containing only a few problems, thereby reducing search to the few problems belonging to a class.
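For lookup purposes, a flat dictionary keyed by the full tuple ϑ behaves like the five-level hierarchy of Fig. 3.2 (each level of the hierarchy corresponds to one component of the key). A sketch, reusing theta and variable_mapping from the earlier sketches:

class ProblemMemory:
    def __init__(self):
        self.classes = {}                  # ϑ tuple -> list of stored problems

    def store(self, parse_tree, theta_tuple, solution):
        self.classes.setdefault(theta_tuple, []).append((parse_tree, solution))

    def find_similar(self, parse_tree, theta_tuple):
        # Only the few problems sharing the same ϑ are tested for isomorphism
        # and a variable mapping; a hit returns (stored solution, mapping).
        for stored_tree, solution in self.classes.get(theta_tuple, []):
            mapping = variable_mapping(stored_tree, parse_tree)
            if mapping is not None:
                return solution, mapping
        return None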

Fig. 3.2: Hierarchical problem classification in memory. Problems are classified level by level on the number of levels l of the parse tree, ξ(# vertices at different levels of the parse tree), the number of quantifier blocks, the order of the quantifier blocks, and ξ(# variables in different quantifier blocks); the leaves of the hierarchy hold the problems.

3.5 Computational complexity

A parse tree can be constructed in time O(t), where t is the number of base predicates and operators in φ′. Computing ϑ requires O(t) time, and checking whether two trees are isomorphic requires O(t) time [Aho et al., 1974]. Now, let us analyze the complexity of variable mapping. The operators occupy the non-leaf nodes of the parse tree while the base predicates occupy the leaf nodes. Lexicographically sorting a tree requires lexicographically sorting the contents of the children of each non-leaf node in the tree. Let t′ be the number of operators and t″_i the number of children of the i-th operator. Thus, Σ_{i←1}^{t′} t″_i = t − 1. Note that since each base predicate is always followed by an operator, t = κt′ where κ is a constant. Lexicographically sorting the list of contents of the children of a node requires O(t″_i log t″_i) time, so the total time required for repeating this process over all non-leaf nodes is O(Σ_{i←1}^{t′} t″_i log t″_i). Since the average number of children per node is (1/t′) Σ_{i←1}^{t′} t″_i = (t−1)/t′, the total time required for variable mapping reduces to O(Σ_{i←1}^{t′} ((t−1)/t′) log((t−1)/t′)) = O(t log κ) = O(t).

Let there be n subproblems of a problem. In a problem, some of the predicates are base predicates while the rest are not; the latter can be decomposed into base predicates, thereby decomposing the problem into subproblems. Each of these subproblems inherits the base predicates of the problem and also includes the new base predicates obtained by decomposing the non-base predicates. Let α be the number of polynomials in the base predicates of a problem and β the number of polynomials due to the newly obtained base predicates in a subproblem. Since all subproblems are similar, each of them has α + β polynomials. The total number of polynomials s in a problem where objects are represented piecewise linearly is O(α + βn).

Let d be the maximum degree of any polynomial in a subproblem. Since all subproblems are similar, each of them will have maximum degree d. The maximum degree of polynomials in the whole problem will also be d if objects are represented piecewise linearly, in which case d ≤ 2. If the objects are not represented piecewise linearly, the degree will be much larger than two, which might lead to a situation where the problem might not even be solvable. It is well known that a general polynomial equation of degree five or higher has no solution in radicals. In real-world application domains, such as the military domain, where arbitrarily shaped objects occur in abundance and the particularities of their shapes are often vital (e.g., in the BehindCurve problem), polynomials of degree five or more are abundant as well. Our effort is concentrated on representations that will not render a problem in any application unsolvable. Hence, we will not consider continuous representations of objects.

Let k be the number of quantified variables in a problem. Then each subproblem also has k quantified variables. The computational complexity of using CAD for solving the whole problem is O(((α + βn)d)^{O(1)^{k−1}}) while that for solving a single subproblem is O(((α + β)d)^{O(1)^{k−1}}). Note that the latter complexity is independent of n. Further,

((α + βn)d)^{O(1)^{k−1}} ≫ n((α + β)d)^{O(1)^{k−1}},

i.e., it would be more efficient to solve each subproblem using CAD than to solve the whole problem using the same algorithm. But we can do even better.

Since all subproblems are similar, it suffices to solve only one subproblem and compute the solutions of the others by mapping their variables to it. The complexity of this approach is O(((α+β)d)^{O(1)^{k−1}} + (n−1)t). Since the number of operators is of the order of the number of base predicates and each base predicate is defined in terms of at least one polynomial, t = O(s) = O(α + βn). Thus, the complexity of solving a problem using our SPS is O(((α + β)d)^{O(1)^{k−1}} + (n − 1)(α + βn)). It can be seen that

n((α + β)d)^{O(1)^{k−1}} > ((α + β)d)^{O(1)^{k−1}} + (n − 1)(α + βn),

i.e., it is more efficient to solve a problem by variable mapping than to solve each subproblem using CAD. Can we do any better?

When a problem is encountered by the SPS for the first time, it is solved by decomposing it into subproblems, solving the first subproblem using a standard quantifier elimination algorithm (such as CAD) and then obtaining the solutions of the rest of the subproblems by mapping their variables to those of the first subproblem. Since the subproblem and its solution are stored in memory, if a similar subproblem is encountered in the future, the SPS bypasses the quantifier elimination algorithm and solves it by variable mapping. If there are m problems to search in memory with the same value of ϑ, the time required to map variables to a similar problem is O(t), and there are n subproblems, the worst case complexity for solving the problem is

O(mt + ((α+β)d)^{O(1)^{k−1}} + (n−1)t) = O(((α+β)d)^{O(1)^{k−1}} + (m+n−1)(α+βn))

when a problem similar to the first subproblem cannot be found in memory, where m is typically a small integer. The best case complexity is

O((m + n − 1)t) = O((m + n − 1)(α + βn))

when the first subproblem finds a similar problem in memory. Compared to the complexity of solving the whole problem using a standard quantifier elimination algorithm (such as CAD), even our worst case scenario offers considerable savings. Moreover, as the SPS solves more problems, the probability of encountering similar problems in memory increases, leading towards the best case scenario, which incurs a complexity that is a low-order polynomial as compared to doubly exponential.

3.6 An example

To illustrate the problem solving process, let us consider the function problem BehindCurve(q, c, p). For a point p ← {a, b} and a curve c ← {p1, p2, ..., pn}, where pi ← {xi, yi} is a point, the decomposition of the problem proceeds as follows:

φ ≡ BehindCurve(q, c, p) ≡ Intersect(c, {p, q})

≡ ∨_{i=1}^{n−1} Intersect({pi, pi+1}, {p, q})
≡ ∨_{i=1}^{n−1} (∃a, isaPoint(a) ∧ On(a, {pi, pi+1}) ∧ On(a, {p, q}))
≡ ∨_{i=1}^{n−1} (Qi φi′)
≡ ∨_{i=1}^{n−1} φi

Thus there are (n − 1) subproblems φi, where

φi′ ≡ isaPoint(a) ∧ On(a, {pi, pi+1}) ∧ On(a, {p, q})

Qi ≡ ∃a

Fig. 3.3: Parse tree for a subproblem of the BehindCurve problem: the root is the conjunction ∧ with the three base predicates isaPoint(a), On(a, {p1, p2}) and On(a, {p, q}) as its children.

Hence, from Fig. 3.3, ϑ(φi) = ⟨2, 2^1·3^3, 1, ⟨∃⟩, 2^1⟩ for i = 1, 2, ..., n − 1. By the theorem, all the φi's are similar since they are subproblems of the same problem. If the SPS does not find a problem in memory similar to the first subproblem φ1, φ1 is sent to the problem classifier, which sends it to the appropriate quantifier elimination algorithm. The problem definition, its tuple ϑ, its parse tree, and its solution are then stored in memory as follows:

φ1(q, p1, p2, p) ≡ ∃a, isaPoint(a) ∧ On(a, {p1, p2}) ∧ On(a, {p, q})

φ1({x, y}, {x1, y1}, {x2, y2}, {px, py})
≡ (px − x < 0 ∧ px − x1 ≤ 0 ∧ x1 − x ≤ 0 ∧ py·x1 − py·x + px·y − x1·y − px·y1 + x·y1 = 0)
∨ (x − px < 0 ∧ x1 − px ≤ 0 ∧ x − x1 ≤ 0 ∧ py·x1 − py·x + px·y − x1·y − px·y1 + x·y1 = 0)
∨ ...

where the arguments of φ1 are the free variables. The other subproblems are solved by replacing the variables in the solution of φ1 by the mapped variables. If a problem similar to φ1 is found in memory, φ1 too is solved by variable mapping, just like the other subproblems.

3.7 Discussion

My goal in this chapter was to develop a constraint satisfaction strategy for solving 2D spatial problems without human intervention that is efficient without compromising generality. The spatial problems were specified in the specification language described in chapter 2. It was shown how the constraint satisfaction approach can be made computationally efficient by storing problems and their solutions in memory, so that when a similar problem is encountered in the future, it can be solved by mapping the solution of a similar, previously solved problem in memory. Thus, the SPS grows smarter as it solves more problems. It is noteworthy that, even though I used the CAD algorithm for quantifier elimination and compared the complexity results with those of CAD, this approach is by no means limited to any particular algorithm. The efficiency of any quantifier elimination algorithm can be enhanced by using the ideas of problem decomposition and variable mapping discussed in this chapter. In the next chapter, I will present an entirely different approach for designing the SPS with the same functionalities.

CHAPTER 4

A SPATIAL SEARCH FRAMEWORK

In the last chapter, I discussed a constraint satisfaction framework for spatial problem solving that accepts a problem specified in a formal human-usable language, transforms the problem into an appropriate algebraic modeling language, and utilizes a control strategy for opportunistically using different algorithms to solve the problem efficiently. The search for the solution is primarily executed by the algebraically involved quantifier elimination, constraint solving and optimization algorithms. A large body of research has been dedicated to constraint satisfaction strategies that solve problems represented using algebraic equations and inequalities in the real domain. Comparatively little effort has gone into investigating alternative strategies for constraint satisfaction.

In this chapter, I investigate a radically different strategy, that of spatial search, for spatial problem solving by satisfying constraints. The observation is that humans, even though inefficient in algebraic manipulations, solve a wide variety of non-trivial spatial problems every day, most of them quite efficiently, by searching for the solution in the space of the diagram aided by commonsense knowledge of geometry. The goal in this chapter is to develop a common framework of spatial search that can be utilized in solving a wide variety of spatial problems relevant to diagrammatic reasoning (DR), and to automate their solution. The spatial problems are specified in the same formal specification language (as discussed in chapter 2) in terms of the properties of, relations among and actions on diagrammatic objects, and the spatial problem solver (SPS) does not possess any domain-specific knowledge. The spatial search is executed using three simple operators. The complexity of solving problems by spatial search is O(N^k), where N is the number of pixels in a diagram and k is the number of quantified variables. This exponential complexity can be considerably reduced by viewing a diagram at multiple levels of resolution and filtering out space in the diagram in a problem-dependent manner, thereby restricting search to a considerably smaller space and a small number of objects in that space.

4.1 The overall idea

As we have seen in chapter 2, function and optimization problems can be viewed as decision problems. Solving the corresponding function problem requires solving the decision problem for every point in the diagram. Only the points for which the decision problem outputs True are included in the solution set. An optimization problem requires choosing the best solution out of a pool of solutions. The pool of solutions is computed by solving the corresponding decision problem for every point in the diagram. The best solution is chosen by a separate algorithm. Thus, theoretically, if a decision problem can be solved, function and optimization problems involving that decision problem can also be solved.

Since the number of points that constitute any diagram is infinite, how can one check in finite time whether or not a decision problem outputs True for each one of them? Clearly, such a task cannot be accomplished in finite time. However, if the space in a diagram is discretized into a finite set of pixels where each pixel corresponds to a point, the task can be accomplished in finite time. This reduction in time complexity comes at the cost of precision. The precision of the solution is dependent on the maximum resolution of the diagram. As resolution increases, computational time increases. However, if the resolution can be varied in a problem-dependent manner such that only specific parts of the diagram are viewed at high resolution while the rest are viewed at low resolution, a significant amount of computational time can be saved without compromising precision. We will soon see how this idea can be implemented.

4.2 Underlying representation

For implementing the spatial search strategy, the space in a diagram is discretized by imposing an array of square pixels (see Fig. 4.2(a) for an example). Each pixel is indexed and maintains a list of all diagrammatic objects that occupy it, while each diagrammatic object maintains a list of all pixels that it occupies. This data structure, consisting of the pixel representation and the object representation, helps implement the properties, relations and actions in the vocabulary (as described in section 2.1) very effectively. For example, computing the relation Inside(p, r) requires checking whether the pixel occupied by the point p is also occupied by the region r. The relation Collinear(a, b, c) is computed as follows: an activation is spread (i.e., pixels are colored) along a straight line from c through b; if the pixel occupied by a or any of its immediate neighbors is colored, the three points are inferred to be collinear. Similarly, for computing the relation Between(b, a, c) (i.e., point b is between points a and c), an activation is swept along an area starting from c towards b; if the pixel occupied by a is colored before that of b, or is not colored at all, Between(b, a, c) is declared false, otherwise it is declared true. Properties such as Length(c) and Area(r) are computed by counting the number of pixels occupied by curve c and region r respectively. Further, an action such as Translate(o, tx, ty) is accomplished by drawing a new object o′, every point of which has been moved tx pixels along the x-axis and ty pixels along the y-axis. The important distinction to note here is that, unlike the constraint satisfaction strategy, which searches for the solution to a problem in the abstract space of algebraic equations and inequalities, the spatial search strategy searches for the solution in the space of the diagram. Each time an object is introduced, modified or deleted, the pixels are updated. This requires a computational cost proportional to the number of pixels occupied by the object.
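A minimal sketch of this two-way data structure in Python; the class and method names are mine, and objects are identified simply by ids:

class Diagram:
    # Square-pixel discretization with cross-referenced occupancy lists:
    # each pixel knows the objects on it, each object knows its pixels.
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixel_objects = {}            # (i, j) -> set of object ids
        self.object_pixels = {}            # object id -> set of (i, j)

    def pixels(self):
        return ((i, j) for i in range(self.width) for j in range(self.height))

    def add_object(self, obj_id, pixels):
        self.object_pixels[obj_id] = set(pixels)
        for px in pixels:
            self.pixel_objects.setdefault(px, set()).add(obj_id)

    def inside(self, point_pixel, region_id):
        # Inside(p, r): the pixel occupied by p is also occupied by region r.
        return region_id in self.pixel_objects.get(point_pixel, set())

    def length(self, curve_id):
        # Length(c) ~ number of pixels occupied by the curve c.
        return len(self.object_pixels.get(curve_id, set()))

    def translate(self, obj_id, new_id, tx, ty):
        # Translate(o, tx, ty): draw a new object o' shifted by (tx, ty) pixels.
        moved = {(i + tx, j + ty) for (i, j) in self.object_pixels[obj_id]}
        self.add_object(new_id, moved)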

4.3 The core algorithm

In the constraint satisfaction approach, a problem was transformed into a quantified logical combination of algebraic equations and inequalities, which was sent to the quantifier elimination algorithm to eliminate quantifiers. In the spatial search approach, the task of eliminating quantifiers is accomplished by searching in the space of the diagram. The SPS knows how to search when the different kinds of quantifiers occur. Whether a spatial problem is a decision or a function problem is interpreted from the problem specification by the SPS – if the specification contains free variables (a variable is a point in 2D space), it is a function problem, the solution of which will be a set of pixels (this set might be empty), while it is a decision problem if the specification is devoid of free variables, the solution of which will be True/False. For example, the BehindCurve(q, c, p) problem, where q, p are points and c is a curve, is specified as:

BehindCurve(q, c, p) ≡ ∃a, isaPoint(a) ∧ On(a, c) ∧ On(a, {p, q})

This is a decision problem when it is evaluated for a particular point q, such as BehindCurve({2, 3}, {{−5, 10}, {−5, 5}, {0, 0}, {2, −5}, {10, −10}}, {−10, 0}), where q ← {2, 3}, c ← {{−5, 10}, {−5, 5}, {0, 0}, {2, −5}, {10, −10}}, and p ← {−10, 0}. The solution is True if q is behind c with respect to p, which is the case for this particular instance of q, c, p. The same problem is a function problem when it is evaluated for a generic point q ← {x, y}, where x, y are free variables, such as BehindCurve({x, y}, {{−5, 10}, {−5, 5}, {0, 0}, {2, −5}, {10, −10}}, {−10, 0}). For this particular instance of c and p, the solution is a region object in the diagram consisting of all pixels that are behind c with respect to p.

In a decision problem, when an existential quantifier occurs in the specification, i.e., the problem is of the form φ ≡ ∃x, φ′(x) where φ′(x) is the quantifier-free expression, the SPS searches all pixels in the diagram until it finds one that evaluates φ′ to True, when it halts and outputs True. If no such pixel is found, the SPS outputs False. The pixels being searched correspond to the existentially quantified point variable. For example, in the BehindCurve problem, the SPS searches for the point a. Let us consider an operator F∃ that, given a diagram D, a quantifier-free expression φ′(x) such that φ ≡ ∃x, φ′(x), and the existentially quantified variable x, computes the solution to the decision problem φ, as follows:


F∃:
1. for i ← each pixel in D
2.   if φ′(i) = True,
3.     return True;
4. return False;

The application of the F∃ operator is written as F∃ ∘ (D, φ′(x), {}, ∃x), where the empty set {} indicates that there are no free variables, hence this is a decision problem. A function problem requires two searches – for each pixel in the diagram, the SPS needs to solve the decision problem. The pixels for which the decision problem evaluates to True constitute the solution. The operator F collects all those pixels from a diagram that satisfy the decision problem. If φ(v) ≡ ∃x, φ′(v, x) is a function problem and v is the free variable, operator F computes the solution to the function problem φ as follows:

F:
1. S ← {};
2. for i ← each pixel in D
3.   if F∃ ∘ (D, φ′(i, x), {}, ∃x) = True,
4.     S ← S ∪ {i};
5. return S;

This is written as F ∘ F∃ ∘ (D, φ′(v, x), {v}, ∃x).

In a decision problem, when a universal quantifier occurs in the specification, i.e., the problem is of the form φ ≡ ∀x, φ′(x) where φ′(x) is the quantifier-free expression, the SPS searches all pixels in the diagram until it finds one that evaluates φ′ to False, when it halts and outputs False. If no such pixel is found, the SPS outputs True. The pixels being searched correspond to the universally quantified point variable. Let us consider an operator F∀ that, given a diagram D, a quantifier-free expression φ′(x) such that φ ≡ ∀x, φ′(x), and the universally quantified variable x, computes the solution to the decision problem φ, as follows:

F∀:
1. for i ← each pixel in D
2.   if φ′(i) = False,
3.     return False;
4. return True;

The application of the F∀ operator is written as F∀ ∘ (D, φ′(x), {}, ∀x). If φ(v) ≡ ∀x, φ′(v, x) is a function problem, where v is the free variable, operator F computes the solution to the function problem φ(v) as follows:

F:
1. S ← {};
2. for i ← each pixel in D
3.   if F∀ ∘ (D, φ′(i, x), {}, ∀x) = True,
4.     S ← S ∪ {i};
5. return S;

This is written as F ∘ F∀ ∘ (D, φ′(v, x), {v}, ∀x).

When there is more than one quantified variable in a decision problem φ ≡ Q(xn, ..., x1) φ′(x1, ..., xn), it is first expressed in prenex form to extract φ′(x1, ..., xn). Then the quantifiers are eliminated by spatial search, which is achieved by the successive application of the operators F∃ or F∀ corresponding to the quantifiers ∃ or ∀. For example, a decision problem of the form φ ≡ ∃x2∀x1, φ′(x1, x2) is implemented as follows:

1. for i ← each pixel in D
2.   if F∀ ∘ (D, φ′(x1, i), {}, ∀x1) = True,
3.     return True;
4. return False;

This can be written as F∃ ∘ F∀ ∘ (D, φ′(x1, x2), {}, ∃x2∀x1). The solution to the corresponding function problem is obtained by F ∘ F∃ ∘ F∀ ∘ (D, φ′(v, x1, x2), {v}, ∃x2∀x1).

Since a diagram is always drawn on a 2D surface, there can be at most one free variable in any problem. However, there can be an arbitrary number of quantified variables with the quantifiers occurring in arbitrary order. In general, a decision problem is of the form φ ≡ Q(xn, ..., x1) φ′(x1, ..., xn), where Q(xn, ..., x1) ≡ Qnxn, ..., Q1x1, Qi being the quantifier for the variable xi, Qi ∈ {∃, ∀}. Such a problem can be solved as follows:

FQn ∘ ... ∘ FQ1 ∘ (D, φ′(x1, ..., xn−1, xn), {}, Q(xn, ..., x1))

Similarly, a function problem in DR is in general of the form φ(v) ≡ Q(xn, ..., x1) φ′(v, x1, ..., xn), where v is the free variable. Such a problem can be solved as follows:

F ∘ FQn ∘ ... ∘ FQ1 ∘ (D, φ′(v, x1, ..., xn), {v}, Q(xn, ..., x1))

Given the three operators F∃, F∀ and F, it is easy to automate the process of eliminating quantifiers and obtaining the solution by spatial search. However, it is easy to see that the complexity of this approach is O(N^k), where N is the number of pixels in the diagram and k is the total number of free and quantified variables. At a reasonably high resolution, which is required for domains that value precision, such as the military domain, the computational cost is considerable. The next section will explore ideas to reduce this cost.
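The composition of these operators amounts to nested loops over the pixels, one loop per variable, which is where the O(N^k) complexity comes from. A compact recursive sketch, assuming the Diagram.pixels() iterator from the earlier sketch and representing φ′ as a Python predicate over a variable binding:

def solve_decision(diagram, prefix, phi, binding=None):
    # prefix is the prenex quantifier prefix, outermost first,
    # e.g. [('∃', 'x2'), ('∀', 'x1')]; phi(binding) evaluates φ′.
    binding = dict(binding or {})
    if not prefix:
        return phi(binding)
    quantifier, var = prefix[0]
    results = (solve_decision(diagram, prefix[1:], phi, {**binding, var: px})
               for px in diagram.pixels())
    return any(results) if quantifier == '∃' else all(results)

def solve_function(diagram, free_var, prefix, phi):
    # F: collect every pixel whose decision problem evaluates to True.
    return {px for px in diagram.pixels()
            if solve_decision(diagram, prefix, phi, {free_var: px})}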

Proof of correctness. By induction, I will prove that the decision problem φ ≡ Q(xn, ..., x1) φ′(x1, ..., xn), where Q(xn, ..., x1) ≡ Qnxn, ..., Q1x1, Qi ∈ {∃, ∀}, can be solved by the application of the two operators F∃ and F∀ in the following sequence: FQn ∘ ... ∘ FQ1 ∘ (D, φ′(x1, ..., xn), {}, Q(xn, ..., x1)). Using that, I will prove that the function problem φ(v) ≡ Q(xn, ..., x1) φ′(v, x1, ..., xn), where v is a free variable, can be solved by the application of the three operators F, F∃ and F∀ in the following sequence: F ∘ FQn ∘ ... ∘ FQ1 ∘ (D, φ′(v, x1, ..., xn), {v}, Q(xn, ..., x1)).

For n = 1, the decision problem is φ ≡ ∃x1, φ′(x1) or φ ≡ ∀x1, φ′(x1). We have seen before that these problems can be solved by F∃ ∘ (D, φ′(x1), {}, ∃x1) and F∀ ∘ (D, φ′(x1), {}, ∀x1) respectively. Thus the proposition holds for n = 1.

For n = 2, the decision problem φ can have four cases: φ ≡ ∃x2∃x1, φ′(x1, x2), or φ ≡ ∀x2∃x1, φ′(x1, x2), or φ ≡ ∃x2∀x1, φ′(x1, x2), or φ ≡ ∀x2∀x1, φ′(x1, x2). The algorithm for solving the first case is as follows:

φ ≡ ∃x2∃x1, φ′(x1, x2)
1. for i ← each pixel in D
2.   if F∃ ∘ (D, φ′(x1, i), {}, ∃x1) = True,
3.     return True;
4. return False;

which is the same as F∃ ∘ F∃ ∘ (D, φ′(x1, x2), {}, ∃x2∃x1). Similarly,

φ ≡ ∀x2∃x1, φ′(x1, x2)
1. for i ← each pixel in D
2.   if F∃ ∘ (D, φ′(x1, i), {}, ∃x1) = False,
3.     return False;
4. return True;

which is the same as F∀ ∘ F∃ ∘ (D, φ′(x1, x2), {}, ∀x2∃x1). Again,

φ ≡ ∃x2∀x1, φ′(x1, x2)
1. for i ← each pixel in D
2.   if F∀ ∘ (D, φ′(x1, i), {}, ∀x1) = True,
3.     return True;
4. return False;

which is the same as F∃ ∘ F∀ ∘ (D, φ′(x1, x2), {}, ∃x2∀x1). Lastly,

φ ≡ ∀x2∀x1, φ′(x1, x2)
1. for i ← each pixel in D
2.   if F∀ ∘ (D, φ′(x1, i), {}, ∀x1) = False,
3.     return False;
4. return True;

which is the same as F∀ ∘ F∀ ∘ (D, φ′(x1, x2), {}, ∀x2∀x1). Thus, the proposition holds for n = 2.

Let us assume that the proposition holds for n = m. Therefore, the decision problem φ ≡ Q(xm, ..., x1) φ′(x1, ..., xm) can be solved by the application of the two operators F∃ and F∀ in the following sequence: FQm ∘ ... ∘ FQ1 ∘ (D, φ′(x1, ..., xm), {}, Q(xm, ..., x1)). Now let us consider the proposition for n = m + 1. The decision problem φ ≡ Q(xm+1, ..., x1) φ′(x1, ..., xm+1) can have two cases: φ ≡ ∃xm+1 Q(xm, ..., x1) φ′(x1, ..., xm+1) or φ ≡ ∀xm+1 Q(xm, ..., x1) φ′(x1, ..., xm+1). The algorithm for the first case is as follows:

φ ≡ ∃xm+1 Q(xm, ..., x1) φ′(x1, ..., xm+1)
1. for i ← each pixel in D
2.   if FQm ∘ ... ∘ FQ1 ∘ (D, φ′(x1, ..., xm, i), {}, Q(xm, ..., x1)) = True,
3.     return True;
4. return False;

which is the same as F∃ ∘ FQm ∘ ... ∘ FQ1 ∘ (D, φ′(x1, ..., xm+1), {}, ∃xm+1 Q(xm, ..., x1)). The algorithm for the second case is as follows:

φ ≡ ∀xm+1 Q(xm, ..., x1) φ′(x1, ..., xm+1)
1. for i ← each pixel in D
2.   if FQm ∘ ... ∘ FQ1 ∘ (D, φ′(x1, ..., xm, i), {}, Q(xm, ..., x1)) = False,
3.     return False;
4. return True;

which is the same as F∀ ∘ FQm ∘ ... ∘ FQ1 ∘ (D, φ′(x1, ..., xm+1), {}, ∀xm+1 Q(xm, ..., x1)). Thus, given that the proposition holds for n = m, it also holds for n = m + 1. But it holds for n = 1, 2. Hence it holds for n = 3, 4, .... Hence the proof follows. ∎

For a function problem φ(v) ≡ Q(xn, ..., x1) φ′(v, x1, ..., xn), the algorithm for its solution is as follows:

φ(v) ≡ Q(xn, ..., x1) φ′(v, x1, ..., xn)
1. S ← {};
2. for i ← each pixel in D
3.   if FQn ∘ ... ∘ FQ1 ∘ (D, φ′(i, x1, ..., xn), {}, Q(xn, ..., x1)) = True,
4.     S ← S ∪ {i};
5. return S;

which is the same as F ∘ FQn ∘ ... ∘ FQ1 ∘ (D, φ′(v, x1, ..., xn), {v}, Q(xn, ..., x1)). Thus, the proposition holds for function problems as well. ∎

So far, my discussion has been limited to decision and function problems. An optimization problem is recognized from its specification, which starts with the keyword Minimize or Maximize (see section 2.2). The quantifiers in the constraints of an optimization problem are eliminated in the same way as for function problems, as described above, the output being one or more diagrammatic objects. The optimization function is then minimized or maximized. Though I firmly believe the function can be optimized using the spatial search strategy, a complete theory of how this can be done for arbitrary functions has not yet been developed and implemented. This is left as part of future research.

4.4 Enhancing the efficiency of spatial search

The aim of this dissertation has been to come up with general and efficient strategies for spatial problem solving. It is a pertinent question to ask how the human visual system executes searches for such a wide variety of spatial problems so efficiently. It is well known that the human visual system executes much, if not all, of its computation in parallel, and hence its computational time is far less than that of a sequential implementation. This issue is extremely important and has been a major line of research in the neural network community. However, it is beyond the scope of this dissertation. In addition to parallel computation, the problem guides the human visual system to reduce computation by –

Fig. 4.1: An example of abstraction for efficient computation. (a) Path obtained when closely placed smaller obstacles are abstracted into a larger region. (b) Actual shortest path avoiding all obstacles.

1. Viewing a diagram at multiple levels of resolution. For example, when computing the shortest path between two points s and e, smaller regions lying in close proximity are perceived as one single region, and the suboptimal path so obtained is later refined around the smaller regions at a higher resolution to obtain the optimal shortest path (see Fig. 4.1(a), 4.1(b)).

2. Filtering out space and restricting search to a small space and a small number of objects in that space. For example, when computing the region behind a curve c with respect to a point p (see Fig. 2.2), only certain areas require more computation than others. Which areas require more computation and which less depends on the particular problem, and not only on the configuration of objects in the diagram.²

²A well-known data structure in computer science called the quadtree is used to sample space based on the density of objects occupying it. The space is hierarchically represented using pixel-like elements of varying sizes. Parts of the space where objects are sparse are represented by bigger elements, while densely populated parts are represented by smaller elements. A quadtree is inadequate for enhancing the efficiency of spatial search because it does not sample space in a problem-dependent manner. That is, for a particular diagram, the space decomposition using a quadtree will always be the same irrespective of what the problem is, which is not desirable for guiding spatial search.

For example, in the BehindCurve problem, the space in the diagram that might contain the boundaries of the region behind curve c with respect to the point p receives more computation time. The rest of the space in the diagram can be classified into two classes – one that is clearly behind c with respect to p and one that is clearly not. Computing these two classes approximately takes less time, and once they have been computed, the space belonging to them is filtered out (i.e., does not receive any more computation time) and all remaining computation is devoted to computing the precise boundary of the region behind c with respect to p.

I will now show how these two useful strategies can be implemented by imposing a variable resolution on the diagram and checking for satisfaction of the properties/relations/actions in the problem specification. The space in the diagram is discretized by imposing an array of square elements on the diagram, where the size of each element in the array is chosen as the problem requires (see Fig. 4.2(a) for an example). Instead of choosing small pixels, i.e., a high resolution, initially the size of each pixel is chosen to be large. Later, if the problem demands, each pixel is subdivided into smaller pixels, thereby increasing the resolution. As described in section 4.2, at any resolution, each pixel is indexed and maintains a list of all diagrammatic objects that occupy it, while each diagrammatic object maintains a list of all pixels that it occupies. When a pixel is checked for satisfaction of constraints, it falls into one of three classes – those that satisfy the constraints, those that do not, and those that require further investigation. Only the pixels belonging to the third class are subdivided into smaller pixels and checked again for satisfaction of constraints, classifying them in turn into the three classes. This procedure continues until a satisfactory solution is obtained or a maximum resolution is reached. It is easy to see that, in this implementation, computation time is restricted to the space in the diagram that requires it and is not wasted by processing the entire diagram uniformly. Further, different resolutions provide different levels of abstraction, thereby allowing abstractions such as grouping based on proximity to be computed naturally with no extra computation. The important issue in this implementation is how to determine the size of a pixel. As a first approximation, the length of a side of a pixel is initialized to the average distance between any two points in the diagram, where the points might be independent diagrammatic objects or constituents of other diagrammatic objects, such as curves or regions. If required, such as for grouping, the size of the pixels might even be increased. When a pixel is subdivided, the size of the resulting pixels should ideally be determined according to the needs of the problem. For example, in the BehindCurve problem, each pixel was subdivided into nine smaller pixels, since reaching the maximum resolution was the goal (see Fig. 4.2). In the shortest path finding example of Fig. 4.1, however, each pixel was subdivided so as to ungroup the group of small region obstacles, and hence, during subdivision, the size of each pixel was chosen so as to contain only the smallest region obstacle (see Fig. 4.1(a), 4.1(b)). Determining the degree of resolution in a problem-dependent manner is crucial and requires further research towards a systematic strategy.

Fig. 4.2: Solving the BehindCurve(p, c) problem by spatial search. (a) An array is imposed on the diagram. (b) The partially behind array elements. (c) Subdividing the partially behind array elements into smaller elements. (d) The partially behind array elements at the higher resolution.
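A schematic version of this refinement loop, assuming hypothetical classify and subdivide callbacks: classify returns one of three verdicts for a (possibly coarse) pixel, and subdivide splits an undecided pixel into d smaller ones.

SATISFIED, VIOLATED, UNDECIDED = "satisfied", "violated", "undecided"

def refine(initial_pixels, classify, subdivide, max_steps):
    solution, frontier = [], list(initial_pixels)
    for _ in range(max_steps):
        undecided = []
        for px in frontier:
            verdict = classify(px)
            if verdict == SATISFIED:
                solution.append(px)        # clearly satisfies the constraints
            elif verdict == UNDECIDED:
                undecided.append(px)       # needs a closer look
            # VIOLATED pixels are filtered out and never revisited
        if not undecided:
            break
        # Only undecided pixels are subdivided; the rest of the diagram is
        # never processed at the higher resolution.
        frontier = [child for px in undecided for child in subdivide(px)]
    return solution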

4.5 Computational complexity

Since the resolution is increased in a problem-dependent manner, it is difficult to ascertain the absolute reduction in computational costs. However, an average-case analysis will provide an idea of the efficiency achieved. Two kinds of computational costs are involved – the updating cost incurred by updating the data structure each time the resolution is changed, and the processing cost incurred by checking every pixel for satisfaction of the decision problem. The goal in this analysis is to count the total number of pixels updated and processed by the naive spatial search strategy and compare it with the number of pixels updated and processed by the resolution-based enhanced strategy. Let n be the number of pixels in a diagram initially at the lowest resolution, N the number of pixels finally at the highest resolution, s the average number of steps required to go from the lowest to the highest resolution, m the average fraction of pixels that require further investigation at any step (m ≤ 1), d the average number of pixels each pixel is subdivided into at each step (i.e., the factor of increase in resolution), and k the total number of variables in the specification of the spatial problem being solved.

Therefore, N = nd^s. The updating cost is linear in the number of pixels. In the naive spatial search, the SPS updates N pixels and processes N^k pixels. In the resolution-based enhanced approach, the SPS updates n pixels and processes n^k pixels in the first step. Only a fraction m of the n pixels require further investigation. Each of these nm pixels is further subdivided into d pixels, hence the number of pixels to update is nmd and the number to process is n^k(md)^k in the second step. Similarly, in the third step, the number of pixels to update is n(md)^2 and the number to process is n^k(md)^{2k}. Thus, the total number of pixels updated in s steps in the enhanced approach is

n + nmd + n(md)^2 + ... + n(md)^{s−1} = O(n(md)^s) = O(N m^s)

assuming md > 1. Thus, the reduction in updating cost is a factor of m^s, m ≤ 1, s > 1, compared to the naive approach. If md ≤ 1, an upper bound on the updating cost is O(n) = O(N/d^s), which is a cost reduction by a factor of 1/d^s, d ≥ 1.

The processing cost, determined by the total number of pixels processed in s steps in the enhanced approach, is

n^k + n^k(md)^k + n^k(md)^{2k} + ... + n^k(md)^{(s−1)k} = O(n^k(md)^{sk}) = O(N^k m^{sk})

assuming md > 1. Thus, there is a reduction in processing cost by a factor of m^{sk}, m ≤ 1, s > 1, k ≥ 1, compared to the naive approach. If md ≤ 1, an upper bound on the total number of pixels processed is O(n^k) = O(N^k/d^{sk}), which is a cost reduction by a factor of 1/d^{sk}, d ≥ 1.

For the BehindCurve problem (Fig. 4.2), the following parameters were used: m ← 0.5, d ← 9, s ← 3, n ← 48, N ← 35000, k ← 2. For the enhanced approach, the updating cost was 0.125 times and the processing cost 0.0156 times the respective costs of the naive approach. That is, using the enhanced approach, in the entire process the SPS updated only 4375 pixels as compared to 35000, i.e., only 12.5%, and processed only 4375^2 pixels as compared to 35000^2, i.e., only 1.56%. This example provides an idea of the gain in computational costs achieved by the resolution-based enhancement.
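These figures follow directly from the formulas above (with N = n·d^s = 48·9^3 ≈ 35000), as a quick check shows:

m, d, s, n, k = 0.5, 9, 3, 48, 2
N = n * d ** s                 # 34992, i.e. roughly 35000 pixels at full resolution
print(m ** s)                  # 0.125    -> ~4375 of 35000 pixels updated
print(m ** (s * k))            # 0.015625 -> ~4375^2 of 35000^2 pixels processed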

4.6 Discussion

My goal in this chapter was to develop a spatial search strategy for solving 2D spatial problems without human intervention that is efficient without compromising generality. The spatial problems were specified in the specification language described in chapter 2. Procedures for solving a small vocabulary of properties, relations and actions were manually implemented and stored. The problems were solved by searching for the solution in the space of the diagram using three operators. It was shown how the spatial search approach can be made computationally efficient by viewing a diagram at multiple levels of resolution, and by filtering out space and restricting search to a small space and a small number of objects in that space in a problem-dependent manner. The correctness and efficiency of the strategy were analyzed.

In the next chapter, we will see some of the applications of the spatial problem solving framework in solving real-world problems, where the problems will be specified using the specification language and the SPS will solve the problems without human intervention and diagrammatically represent the outcome whenever appropriate.

CHAPTER 5

APPLICATIONS

This chapter will illustrate how the proposed spatial problem solving framework can be deployed for specifying perceptions and actions by a human according to the needs of diagrammatic reasoning (DR), and for executing them by solving the corresponding spatial problems without human intervention. Three applications will be considered from two very different domains – entity re-identification and ambush analysis, which are deemed very important in the military domain, and theorem proving in Euclidean geometry. The subproblems into which the spatial problem solver (SPS) autonomously decomposes each spatial problem will be shown. Problems in the military domain involve a wide variety of objects with arbitrary properties and relations, and hence help to illustrate the expressiveness of the specification language and the efficiency and generality of the SPS.

5.1 Entity re-identification

The entity re-identification problem arises in the US Army's All-Source Analysis System (ASAS). The prototypical task is to decide whether a newly sighted entity, given its time, location, and partial identity information (e.g., enemy/friendly, tank/soldier, etc.), is one of the entities in the database sighted and identified earlier, or a new entity. The presence of different kinds of obstacles, such as no-go regions, enemy locations, sensors, etc., and the maximum speed of entities in the terrain constrain the possibilities of the outcome. In the following, a simple version of the problem is considered to illustrate how the task is solved using DR and the spatial problems involved therein.

Let T3 be an entity newly sighted at time t3 located at point p3 while T1, T2 are the two entities that were located at points p1, p2 when last sighted at times t1, t2 respectively.

T1 and T2 were retrieved from the database as candidates for T3 based on their partial identity information. Also, in the area of interest, there are four enemy regions or obstacles {r1, r2, r3, r4} with a given firepower/sight range d of the enemy, as shown in Fig. 5.1(a). The human problem solver wants to know whether there exists a path between points p1 and p3 safely avoiding the obstacles, and whether that path can be traversed in time t3 − t1. He decomposes this problem into three spatial problems as follows:

1. Does there exist a path between p1 and p3 safely avoiding the obstacles?

2. Compute the shortest path from p1 to p3 safely avoiding the obstacles.

3. Is it possible for T1 to traverse the shortest path within time t3 − t1?

Given a set of regions {r1, r2, ..., rn} and a firepower/sight range d, the spatial problem isaSafeRegion({r1, r2, ..., rn}, d, q), when posed as a function problem, computes the set of all points q that lie outside, and at a minimum distance of d units from, the given regions. This constitutes the region within which the entities can navigate safely. Thus, the problem is specified as:

isaSafeRegion({r1, r2, ..., rn}, d, q) ≡ ∧_{i←1}^{n} (¬Inside(q, ri) ∧ ∀a, isaPoint(a) ∧ (Inside(a, ri) ⇒ Distance(q, a) ≥ d))

≡ ∧_{i←1}^{n} ∧_{j←1}^{#△(ri)} ∀a, ¬Inside(q, △(ri)[j]) ∧ isaPoint(a) ∧ (¬Inside(a, △(ri)[j]) ∨ Distance(q, a) ≥ d)

When solved using the constraint satisfaction strategy, the decomposition of the problem by the SPS is as shown above. Each of the subproblems is solved as described in section 3.1. When solved using the spatial search strategy, the problem specification is converted into prenex form and then each pixel in the diagram is checked for satisfaction of the constraints, starting from a low resolution and gradually increasing it, as described in section 4.1. The solution, a region object, which is the safe region, is shown shaded while the obstacles are shown in black in Fig. 5.1(b).

A path can exist between p1 and p3 lying entirely within the safe region if and only if p1 and p3 lie within a contiguous region. In order to figure out whether the safe region is contiguous or not, the Recognize function described in section 2.1 is used. The function call Recognize(isaSafeRegion({r1, r2, r3, r4}, 10, {x, y}), Region), where d ← 10, produces the periphery of the safe region as a piecewise-linear closed curve. Then it is determined whether the points p1 and p3 lie inside the periphery of that region or not. If they do, then a path exists between them. If the safe region is not contiguous and the output of the Recognize function is a number of peripheries, then it is checked whether p1 and p3 lie inside any one of them. If they lie inside different regions, then a path joining them is not possible. For the case shown in Fig. 5.1(b), the Recognize function returns only one periphery, within which p1 and p3 both lie.
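One standard way to carry out the point-inside-periphery test (not necessarily how the SPS implements it) is ray casting over the piecewise-linear closed curve; p1 and p3 can be joined by a safe path only if they test inside the same periphery:

def inside_periphery(p, periphery):
    # periphery: list of (x, y) vertices of a piecewise-linear closed curve.
    # Count crossings of a horizontal ray from p; an odd count means inside.
    x, y = p
    inside = False
    n = len(periphery)
    for i in range(n):
        (x1, y1), (x2, y2) = periphery[i], periphery[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge straddles the ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside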

The next problem is to compute the shortest path from p1 to p3 lying entirely within the safe region. Computing the shortest path between two points s and e avoiding a set of regions R ← {r1, r2, ..., rn} can be posed as an optimization problem. Since the domain is piecewise linear, the shortest path c will either be a single line segment or pass through some points lying on the peripheries of the regions. So the problem reduces to finding a sequence of points from the peripheries of the regions in R such that the length of c is minimized, where c is assumed to be a sequence of N + 2 points, N being the number of all points on the peripheries of all regions in R. Thus,

FindPointsonShortestPath(s, e, R) ≡ Minimize(Length(c), c, isaCurve(c) ∧_{i←1}^{N+2} c[i] ∈ Flatten(R, 2) ∧_{i←1}^{m} ¬Intersect(c, R[i]))

where Flatten(R, 2) is the set of all points on the peripheries of all regions in R. The shortest path is obtained by eliminating consecutive points in c that are not unique. For our case, R ← Recognize(isaSafeRegion({r1, r2, r3, r4}, 10, {x, y}), Region), and hence Flatten(R, 2) outputs all points on the periphery of the safe region.

Fig. 5.1: Problem solving for entity re-identification. (a) Obstacles and entities. (b) Contiguous safe region. (c) Shortest paths from T1 and T2 to T3. (d) Shortest paths avoiding newly added sensors.

The function part of the problem, i.e., isaCurve(c) ∧_{i←1}^{N+2} c[i] ∈ Flatten(R, 2) ∧_{i←1}^{m} ¬Intersect(c, R[i]), is solved by the constraint satisfaction and spatial search strategies, and the solution is then handed over to the optimization algorithms. The optimization algorithms produce the minimum length of the curve (path) c along with the points on c. The same problem solving procedure is carried out for T2 and T3. The shortest paths from p1 and p2 to p3 lying within the safe region are shown in Fig. 5.1(c). It is noteworthy that the approach used for the FindPointsonShortestPath problem can also be used to compute the shortest path maintaining a safe distance from obstacles for a mobile robot, as considered in [Weispfenning, 2001].
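The dissertation hands this minimization to general optimization algorithms; for piecewise-linear obstacles the same minimum can alternatively be computed with a visibility-graph search, sketched below under the assumption of a hypothetical blocked(a, b) test that reports whether the straight segment between two points leaves the safe region:

import heapq, math

def shortest_safe_path(s, e, periphery_points, blocked):
    # Nodes: s (index 0), e (index 1) and all periphery points; edges connect
    # every mutually visible pair, weighted by Euclidean distance (Dijkstra).
    nodes = [s, e] + list(periphery_points)
    dist, prev = {0: 0.0}, {}
    heap = [(0.0, 0)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist.get(u, math.inf):
            continue                       # stale queue entry
        if u == 1:
            break                          # reached e with the shortest distance
        for v in range(len(nodes)):
            if v == u or blocked(nodes[u], nodes[v]):
                continue
            d_v = d_u + math.dist(nodes[u], nodes[v])
            if d_v < dist.get(v, math.inf):
                dist[v], prev[v] = d_v, u
                heapq.heappush(heap, (d_v, v))
    if 1 not in dist:
        return None                        # e is unreachable within the safe region
    path, v = [nodes[1]], 1
    while v != 0:
        v = prev[v]
        path.append(nodes[v])
    return path[::-1]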

As it turns out, both shortest paths can be traversed by the respective entities within time t3 − t1. The sensors database reports that there were two sensors in the area of interest but neither of them reported any sighting. The problem solver figures out, using the property Intersect, that the shortest paths pass through the sensors, and wants to know whether there exist alternate paths for T1 and T2 to reach p3. The entire procedure is repeated with the sensors and their areas of coverage treated as obstacles, and the resulting shortest paths are shown in Fig. 5.1(d). This time it turns out that the shortest path from p1 to p3 cannot be traversed in time t3 − t1 while that from p2 to p3 can be. Hence, the problem solver identifies T3 as T2.

5.2 Ambush analysis

There are two main factors – range of firepower and range of sight – that determine the area covered by a military unit. The presence of terrain features, such as mountains, limits these factors and allows units to hide from opponents. Such hidden units not only enjoy the advantage of concealing their resources and intentions from the opponents, but can also ambush opponents traveling along a path that is within the sight and firepower range of the hidden units, catching them unawares. Thus, it is of utmost importance for any military unit to determine a priori the areas or portions of a path prone to ambush before traversing them. In this application, given a curve or region as a hiding place and the firepower and sight ranges, I show how the regions and the portions of a path prone to ambush can be automatically computed using the proposed framework.

Given a curve c and the firepower and sight range d, the spatial problem RiskyRegion(c, d, q), when posed as a function problem, computes the set of all points covered by that range from c. Thus, the problem specification is:

RiskyRegion(c, d, q) ≡ ∃a, isaPoint(a) ∧ On(a, c) ∧ Distance(a, q) ≤ d
≡ ∨_{i←1}^{#(c)−1} ∃a, isaPoint(a) ∧ On(a, {c[i], c[i+1]}) ∧ Distance(a, q) ≤ d

The decomposition into subproblems used by the constraint satisfaction approach is shown in the second line above. The problem can also be solved using the spatial search strategy. The solution is the shaded region shown in Fig. 5.2 with respect to the curve c2 and a particular value of d. The problem RiskyRegion(r, d, q) for a region r can be specified by replacing the predicate On(a, c) with Inside(a, r).


Fig. 5.2: Ambush analysis: Parts of the path c1 inside the shaded region are prone to ambush due to the presence of enemies at c2.
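A minimal sketch of the membership test underlying RiskyRegion, assuming the curve is given as a list of (x, y) vertices of a piecewise-linear curve; the helper names point_segment_distance and in_risky_region are illustrative, not part of the proposed vocabulary:

    import math

    def point_segment_distance(q, a, b):
        """Euclidean distance from point q to segment ab."""
        ax, ay = a; bx, by = b; qx, qy = q
        dx, dy = bx - ax, by - ay
        if dx == 0 and dy == 0:
            return math.hypot(qx - ax, qy - ay)
        # Project q onto the segment, clamping the parameter to [0, 1].
        t = max(0.0, min(1.0, ((qx - ax) * dx + (qy - ay) * dy) / (dx * dx + dy * dy)))
        return math.hypot(qx - (ax + t * dx), qy - (ay + t * dy))

    def in_risky_region(q, curve, d):
        """q is risky iff it lies within distance d of at least one
        segment c[i]c[i+1] of the hiding curve."""
        return any(point_segment_distance(q, curve[i], curve[i + 1]) <= d
                   for i in range(len(curve) - 1))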

Given a curve c1 as a path, a curve c2 for hiding, and a range d, the problem RiskyPath(c1, c2, d, q), posed as a function problem, is defined as the parts of c1 covered by that range from c2, i.e., as the set of all points that lie on c1 and also inside the risky region. Thus,

RiskyPath(c1, c2, d, q) ≡ On(q, c1) ∧ Inside(q, r2)
≡ ∨_{i←1}^{#(c1)−1} ∨_{j←1}^{#△(r2)} On(q, {c1[i], c1[i+1]}) ∧ Inside(q, △(r2)[j])

where r2 ← RiskyRegion(c2, 10, {x, y}), d ← 10, and Periphery(r2) ≡ Recognize(r2, Region). Again, this problem is solved by the constraint satisfaction as well as the spatial search strategies. The solutions are the parts of c1 inside the shaded region shown in Fig. 5.2. The problem RiskyPath(c1, r, d, q) for a region r can be defined similarly. The region behind c2 where the enemies might be hiding is the set of all points that are behind c2 with respect to each point on the risky parts of the path c1. Thus, if c3 ← RiskyPath(c1, c2, d, q) and r3 ← BehindCurve(a, c2, q), we have

BehindCurvewrtRiskyPath(c3, c2, q) ≡ ∀a, isaPoint(a) ∧ On(a, c3) ⇒ Inside(q, r3)
≡ ∨_{i←1}^{#(c3)−1} ∨_{j←1}^{#△(r3)} ∀a, isaPoint(a) ∧ (¬On(a, {c3[i], c3[i+1]}) ∨ Inside(q, △(r3)[j]))

5.3 Euclidean geometry theorem proving

Diagrammatically proving theorems in Euclidean geometry requires the problem solver to construct or modify objects in a diagram and to perceive information from the diagram, guided by problem solving goals. The proposed framework solves the spatial problems that arise from the need to execute these perceptions and actions on a diagram. The strategy for a diagrammatic proof is developed by a human problem solver who specifies the spatial problems to the SPS using the proposed specification language. Based on the solution of a spatial problem, the next step is taken by the problem solver. It is important to note that deducing non-spatial information is done by the problem solver and not by the SPS. To illustrate the entire process and the efficacy of the proposed framework, let us consider the diagrammatic proof of Pythagoras theorem.

Pythagoras theorem states that in any right-angled triangle, the sum of the squares of the two sides is equal to the square of the hypotenuse. In order to prove the theorem diagrammatically, the problem solver first needs to execute a set of actions for drawing points and line segments between pairs of points, where some of the points have to be computed by solving spatial problems. The problem solver chooses points p1, p2, p3 as the three vertices of a right-angled triangle, right-angled at p2. Without loss of generality, he chooses the coordinates of points p2, p1, p3 as {0, 0}, {0, a}, {b, 0} respectively and assumes the distance between points p1 and p3 to be c (see Fig. 5.3). Thus, the proof will be complete if the problem solver can show a^2 + b^2 = c^2. In order to compute the rest of the points pi, 4 ≤ i ≤ 8, that he deems useful for proving the theorem, he specifies the following spatial problems, as function problems, to the SPS:

ComputePoint(p2, p3, p4) ≡ Collinear(p2, p3, p4) ∧ Distance(p3, p4) = a
ComputePoint(p2, p1, p8) ≡ Collinear(p2, p1, p8) ∧ Distance(p1, p8) = b
ComputePoint(p1, p8, p6) ≡ Angle(p1, p8, p6) = 90° ∧ Distance(p8, p6) = a + b
ComputePoint(p6, p8, p7) ≡ Collinear(p8, p7, p6) ∧ Distance(p8, p7) = a
ComputePoint(p4, p6, p5) ≡ Collinear(p4, p5, p6) ∧ Distance(p4, p5) = b

The solutions to the problems, as returned by the SPS, are (in order): p4 ← {a + b, 0}, p8 ← {0, a + b}, p6 ← {a + b, a + b}, p7 ← {a, a + b}, p5 ← {a + b, b}.
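A small sketch of how these ComputePoint problems could be solved numerically once a and b are bound to concrete values (the problem solver later uses a ← 5, b ← 12); the helper extend_along is hypothetical and only illustrates the underlying geometry, not the SPS's symbolic solution:

    import math

    def extend_along(origin, toward, dist):
        """Point at signed distance `dist` from `origin` along the direction
        from `origin` toward `toward`, realizing a collinearity constraint
        combined with a distance constraint."""
        ox, oy = origin; tx, ty = toward
        norm = math.hypot(tx - ox, ty - oy)
        return (ox + dist * (tx - ox) / norm, oy + dist * (ty - oy) / norm)

    a, b = 5.0, 12.0
    p2, p1, p3 = (0.0, 0.0), (0.0, a), (b, 0.0)
    p4 = extend_along(p3, p2, -a)     # beyond p3, away from p2: {a+b, 0}
    p8 = extend_along(p1, p2, -b)     # beyond p1, away from p2: {0, a+b}
    p6 = (p4[0], p8[1])               # right angle at p8, distance a+b (uses the
                                      # axis alignment of this construction): {a+b, a+b}
    p7 = extend_along(p8, p6, a)      # on segment p8p6 at distance a: {a, a+b}
    p5 = extend_along(p4, p6, b)      # on segment p4p6 at distance b: {a+b, b}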

The problem solver's next task is to draw the points pi, 1 ≤ i ≤ 8, and the line segments p2p4, p4p6, p6p8, p8p2, p1p3, p3p5, p5p7, p7p1. The coordinates of all points are now known to him, after spatial problem solving, in terms of the abstract coordinates a, b. However, the points cannot be drawn unless a and b are assigned real values. The problem solver makes the assignment a ← 5 units and b ← 12 units. He knows that this assignment is arbitrary and does not play any role in his proof, except that it allows a real diagram to be drawn from the abstract coordinates.

Fig. 5.3: Diagrammatic proof of Pythagoras theorem.

The real diagram is important because, at every step in problem solving, the perception of information arranged locally in the diagram helps reduce the search needed to figure out the relevant pieces of knowledge to apply or the step to take next. For example, the triangles p1p2p3, p3p4p5, p5p6p7, p7p8p1 seem to be congruent in the diagram (see Fig. 5.3), and if they really are, the next step in the proof can be to show that the sum of the areas of the smaller square and the four triangles is equal to the area of the larger square. Here, the diagram facilitated the perception that the triangles are congruent, which filtered out many possible next steps, thereby reducing the search in the mind of the problem solver to determine which step to take next.

The SPS draws the points pi, 1 ≤ i ≤ 8, at the respective coordinates, assuming a ← 5 and b ← 12 units, and then draws the line segments p2p4, p4p6, p6p8, p8p2, p1p3, p3p5, p5p7, p7p1. From the given information and that obtained from the diagram, the problem solver deduces the following in sequence:

Distance(p6, p7) = b
isaSquare({p2, p4, p6, p8})
Area(p2p4p6p8) = (a + b)^2
Distance(p5, p6) = a
Congruent(p1p2p3, p7p8p1)
Distance(p1, p7) = c
Congruent(p1p2p3, p3p4p5)
Distance(p3, p5) = c
Congruent(p3p4p5, p5p6p7)
Distance(p5, p7) = c
isaSquare({p1, p3, p5, p7})
Area(p1p3p5p7) = c^2

It is noteworthy that perception plays a very interesting role in determining congruence. The relation Congruent is defined as a rule. For example, the side-side-side congruence is defined as: for any two triangles abc and pqr, if Distance(a, b) = Distance(p, q), Distance(b, c) = Distance(q, r) and Distance(c, a) = Distance(r, p), then Congruent(abc, pqr). Given any two arbitrary triangles, one can write a program that searches through all possible mutual configurations of the two triangles to check these preconditions and, if they are met, declares congruence. However, such a program will be inefficient as it will, in the worst case, have to search through all possible mutual configurations of the two triangles. Human perception, on the other hand, filters out some of these configurations and starts by considering the one that "seems" to be the best candidate. One mutual configuration seems to be a better candidate than another for various qualitative reasons. For example, the perception that one angle of a triangle seems much more acute than another angle, for a particular configuration of both triangles, makes that mutual configuration a good candidate for formally exploring the congruence of the two triangles. Conversely, the perception that, in a particular configuration, one of the angles of a triangle is much more acute than the corresponding angle in the other triangle helps to confidently rule out their congruence in that mutual configuration without even formally checking for it (assuming the diagram is perfectly drawn and general enough). These are very important properties of high-level vision that contribute to making spatial problem solving for DR very efficient.
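A minimal sketch of the side-side-side rule just described, assuming each triangle is given as three (x, y) vertices; the sorted-side-length pre-filter stands in for the perceptual filtering of unpromising configurations and is an illustrative shortcut, not the architecture's perception mechanism:

    import math
    from itertools import permutations

    def side_lengths(tri):
        a, b, c = tri
        return (math.dist(a, b), math.dist(b, c), math.dist(c, a))

    def congruent_sss(t1, t2, tol=1e-6):
        """Side-side-side congruence over all vertex correspondences."""
        # Perception-like pre-filter: if the multisets of side lengths differ,
        # no correspondence can work, so rule congruence out immediately.
        if any(abs(x - y) > tol for x, y in zip(sorted(side_lengths(t1)),
                                                sorted(side_lengths(t2)))):
            return False
        # Otherwise check the rule for each correspondence (a,b,c) <-> (p,q,r).
        a, b, c = t1
        for p, q, r in permutations(t2):
            if (abs(math.dist(a, b) - math.dist(p, q)) <= tol and
                    abs(math.dist(b, c) - math.dist(q, r)) <= tol and
                    abs(math.dist(c, a) - math.dist(r, p)) <= tol):
                return True
        return False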

The next step for the problem solver is to perceive the information

Area(p2p4p6p8) = Area(p1p2p3) + Area(p3p4p5) + Area(p5p6p7) + Area(p7p8p1) + Area(p1p3p5p7)

In order to accomplish this perception, the problem solver has to solve the spatial problem of deciding whether there exists a point that lies inside at least one of the objects on the right hand side of the above expression (i.e., the smaller square or the four triangles) but not inside the object on the left hand side (i.e., the larger square), or lies inside the object on the left hand side but not inside any of the objects on the right hand side. If such a point exists, then the above expression is false; otherwise it is true. The spatial problem, posed as a decision problem, is specified to the SPS as:

∃q, isaPoint(q) ∧ (((Inside(q, p1p2p3) ∨ Inside(q, p3p4p5) ∨ Inside(q, p5p6p7) ∨ Inside(q, p7p8p1) ∨ Inside(q, p1p3p5p7)) ∧ ¬Inside(q, p2p4p6p8)) ∨ (Inside(q, p2p4p6p8) ∧ ¬Inside(q, p1p2p3) ∧ ¬Inside(q, p3p4p5) ∧ ¬Inside(q, p5p6p7) ∧ ¬Inside(q, p7p8p1) ∧ ¬Inside(q, p1p3p5p7)))

When solved using the constraint satisfaction strategy, this problem is further decomposed into subproblems. The solution returned by the SPS is False. When using the spatial search strategy, the SPS could not find any pixel for q, even at the highest resolution, that satisfied the constraints; hence it returned False. The problem solver infers that the above expression is True.
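A schematic sketch of this existential check by pixel search, under the assumption that each region is given as a convex polygon with counter-clockwise vertices (the squares and triangles here are convex) and that candidate points come from a grid of pixel centers; the grid-scan and the half-plane containment test are illustrative simplifications of the SPS's resolution-based search:

    def inside_convex(q, poly, eps=1e-9):
        """q lies inside (or on the boundary of) a convex polygon whose
        vertices are listed counter-clockwise."""
        x, y = q
        n = len(poly)
        for i in range(n):
            x1, y1 = poly[i]
            x2, y2 = poly[(i + 1) % n]
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) < -eps:
                return False
        return True

    def witness_exists(big_square, parts, xs, ys):
        """Scan grid points (the cross product of xs and ys, i.e. the pixel
        centers at the current resolution) for a point in the symmetric
        difference of the big square and the union of the parts.
        Returning None corresponds to the SPS's answer False."""
        for x in xs:
            for y in ys:
                q = (x, y)
                in_big = inside_convex(q, big_square)
                in_parts = any(inside_convex(q, p) for p in parts)
                if in_big != in_parts:
                    return q
        return None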

Next, the problem solver wants to perceive the areas of each of the four triangular regions and the two square regions, which he accomplishes by calling the property Area for each of the regions. Plugging the areas into the respective places in the expression, the problem solver gets (a + b)^2 = 4 · (1/2)ab + c^2, or a^2 + b^2 = c^2. This concludes his proof of Pythagoras theorem.

5.4 Discussion

My goal in this chapter was to illustrate the expressiveness of the specification language and the efficiency of the two strategies by using the proposed framework to execute perceptions and actions in reasoning with diagrams. Real world problems involving complicated diagrammatic objects from three applications – entity re-identification, ambush analysis and proving Pythagoras theorem – were chosen from two different domains – military and Euclidean geometry – to showcase the capabilities. The spatial problems corresponding to the perceptions and actions were specified in the specification language. The subproblems obtained by decomposing the problems using the constraint satisfaction strategy were shown. Finally, the diagrammatic objects obtained by solving the spatial problems using both strategies were illustrated diagrammatically.

CHAPTER 6

CONCLUSIONS

Reasoning with a diagram requires the problem solver to opportunistically interact with it by abstracting or perceiving information from it or acting on it. Executing these abstractions, perceptions and actions requires solving a wide variety of non-trivial spatial problems that can be defined in terms of properties of, relations among, or actions on diagrammatic objects. A large number of domain-specific diagrammatic reasoning (DR) systems have been built in which a human developer writes efficient algorithms for solving pre-determined spatial problems by exploiting domain- and task-specific representations and reasoning strategies. The larger goal of our group's research has been to build a general purpose DR system, for which a very wide variety of spatial problems, which cannot be determined a priori, need to be solved. The goal of this dissertation was to investigate:

1. A language for a problem solver (human in our case) to communicate a wide variety of spatial problems relevant to DR, and

2. A general domain-independent framework of underlying representations and reasoning strategies suitable for efficiently solving spatial problems without human intervention.

This chapter discusses how well the goal was achieved, the contributions made in this dissertation, and avenues for future research.

6.1 Evaluation

Expressiveness of the specification language. A high-level language was proposed for humans to specify spatial problems to the spatial problem solver (SPS). This language is a functional constraint logic programming language in which the constraints are specified in first-order logic. In order to accommodate domains that involve arbitrarily shaped objects, such as the military domain, the object representation was chosen to be piecewise-linear as opposed to continuous. A vocabulary of predicates, encompassing objects, properties, relations, and actions, from our group's earlier research was used in the language. The diagrammatic objects were of three general types – points, curves, regions. Each object type had its own set of properties, while a few widely used relations and actions involving different objects were identified after having solved a variety of problems using diagrams in a number of different domains. The vocabulary was not claimed to be complete, and new predicates were allowed if a problem could not be specified easily using the existing ones. Thus, the vocabulary was rich enough to specify any spatial problem relevant to DR. First-order logic has long been recognized as one of the most expressive languages in which a human can formally communicate with a machine. Since the goal was to communicate problems, a declarative language was desired. Other formal programming languages, such as LISP, C/C++ and Java, do not provide the facility to communicate in a declarative manner, unlike first-order predicate logic. Thus, the proposed specification language consists of a language as expressive as first-order predicate logic in conjunction with an open-ended, rich vocabulary of objects, properties, relations and actions. The expressiveness of this language was illustrated by specifying various problems from two very different domains – military and Euclidean geometry.

Correctness of the SPS. A general domain-independent SPS was designed that can accept any problem from the specification language and solve it without human intervention. Two strategies were proposed – constraint satisfaction and spatial search. The major component of the constraint satisfaction strategy was a set of algebraic quantifier elimination algorithms, which can be classified into two classes – one class of algorithms that are slow and general but provide exact and complete solutions, and another class that are fast but provide either partial or approximate solutions. All of these algorithms are well known in the constraint satisfaction community, and their correctness has already been proven in the literature, references to which have been made at the relevant places. The correctness of the proposed mechanism for solving problems by using similar previously solved problems in memory has been discussed in section 3.3.

The spatial search strategy implemented quantifier elimination directly from the definitions of the quantifiers – when an existential quantifier occurred, the SPS searched for a pixel that satisfied the constraints, while all pixels were checked for satisfaction of the constraints when a universal quantifier occurred. Thus, provided the resolution was high enough, the search would always output the correct solution. The resolution-based enhancement required starting at a low resolution and selectively increasing the resolution, in which case the search still followed the definitions.
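A schematic sketch of this definitional treatment of quantifiers over a pixel grid, assuming the constraint is available as a Python predicate over (x, y) points and that refinement divides each pixel into a 3×3 block as in the current implementation; the function names and the uniform refinement policy are illustrative, not the SPS's implementation:

    def grid(bounds, step):
        """Pixel centers covering the rectangular bounds at the given step."""
        (xmin, ymin), (xmax, ymax) = bounds
        xs = [xmin + step / 2 + i * step for i in range(int((xmax - xmin) / step))]
        ys = [ymin + step / 2 + j * step for j in range(int((ymax - ymin) / step))]
        return [(x, y) for x in xs for y in ys]

    def exists_pixel(constraint, bounds, step, min_step):
        """Existential quantifier: search for one satisfying pixel,
        refining the resolution until min_step if nothing is found."""
        while step >= min_step:
            for q in grid(bounds, step):
                if constraint(q):
                    return q
            step /= 3          # each pixel split into a 3x3 block
        return None

    def forall_pixels(constraint, bounds, step):
        """Universal quantifier: every pixel at this resolution must
        satisfy the constraint."""
        return all(constraint(q) for q in grid(bounds, step))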

Generality of the SPS. Spatial problems were classified into three classes – decision, function and optimization. In this dissertation, decision and function problems were solved by both strategies. Satisfaction of the constraints in an optimization problem was treated similarly to a function problem, the solution of which was provided to optimization algorithms to figure out the best solution out of a pool of solutions by minimizing or maximizing some function. For the constraint satisfaction strategy, well-known optimization algorithms were used, while for the spatial search strategy this was left as part of future research. Since all problems can be classified as decision, function or optimization problems, any problem that can be specified in the specification language can be solved by the proposed strategies. No domain-specific knowledge or heuristic was used in the system. The enhancements made to the two strategies were general enough to be useful for all problems. The generality of the framework was illustrated by successfully solving a number of real world spatial problems from two very different domains without any human intervention.

Computational efficiency of the SPS. The constraint satisfaction strategy used state of the art algorithms for quantifier elimination, which are the most efficient known in the field; thus the SPS can already be claimed to be as efficient as the state of the art for its purposes. However, I made significant enhancements to its operation by solving problems using solutions of previously solved problems stored in memory. This reduced the computational complexity of quantifier elimination, the bottleneck of the approach, from doubly exponential to low-order polynomial.

The spatial search strategy was significantly slower, as it had to process an exponential number of pixels to reach the solution. The proposed resolution-based enhancement significantly reduced this computational cost, including the updating cost and the processing cost. All the spatial problems solved using these two strategies, from different domains, were solved in significantly less time than required by the state of the art algorithms.

6.2 Contributions

Proposal for a general framework for solving spatial problems relevant to DR. Reasoning with diagrams requires solving non-trivial spatial problems in order to perceive information from, or act on, a diagram. For the last couple of decades, a number of domain-specific DR systems have been built in which the domain- and application-specific spatial problems are pre-determined and hand-coded into the system by a human. This significantly reduces the power of a problem solver – that of open-ended exploration of the state space by opportunistic use of perceptual and knowledge resources – and reduces it to a mere algorithm. The larger goal of our group's research is to build a general purpose multi-modal problem solver, of which a general purpose DR system is the first step. A pre-requisite for such a system is an SPS that can execute perceptions and actions at the behest of the problem solver without human intervention. In this dissertation, I proposed a general framework for spatial problem solving consisting of a high-level language that is finite, extensible, human-usable, and expressive enough to describe a wide variety of spatial problems, together with general strategies for solving those spatial problems. The spatial problems were defined in terms of constraints specified in first-order logic over the real domain using a vocabulary of objects, properties, relations and actions. Two general and independent strategies – constraint satisfaction and spatial search – were developed for autonomously solving the spatial problems specified in the language and diagrammatically representing the outcome whenever appropriate. Several ideas about how to make these strategies computationally efficient were proposed and illustrated by a number of examples in two very different domains – military and Euclidean geometry. The proposed framework will help a traditional problem solver deal with the heterogeneity of spatial and linguistic representations of real world problems and exploit that heterogeneity for inferring solutions efficiently.

A language that grows. Constraint-based languages have been in use for quite some time in the constraint satisfaction community. However, such languages are very close to machine language, devoid of any vocabulary of objects, properties, relations or actions, and problems are specified in terms of algebraic equations and inequalities. This makes it difficult for a user to specify a problem, as he has to dig deep into an ocean of equations and inequalities and cannot communicate naturally in terms of high-level predicates. For any non-trivial problem, even his well-thought-out specification will be cumbersome, and hence will make little sense to another person. While a problem can be used in specifying another problem (for example, Intersect can be used in specifying BehindCurve), this is not possible unless the specification of the problem being used is understandable. This inability to reuse pre-specified problems in specifying new problems is a serious impediment to the language growing naturally, as every new problem has to be defined from scratch. One of the most well-known systems incorporating symbolic algorithms for solving problems in first-order logic is REDLOG [Sturm and Weispfenning, 1998; Weispfenning, 2001], which has been used in a number of applications but without any vocabulary. The BehindCurve(q, c, p) problem, for example, where p ← {px, py}, c ← {p1, p2, ..., pn}, pi ← {pi,x, pi,y}, q ← {x, y}, will be specified in REDLOG as:

BehindCurve({x, y}, {{p1,x, p1,y}, {p2,x, p2,y}, ..., {pn,x, pn,y}}, {px, py}) ≡
∃ax, ay, t, 0 ≤ t ≤ 1 ∧ px + t(x − px) = ax ∧ py + t(y − py) = ay ∧ ∨_{i←1}^{n−1} (∃ti, 0 ≤ ti ≤ 1 ∧ pi,x + ti(pi+1,x − pi,x) = ax ∧ pi,y + ti(pi+1,y − pi,y) = ay)

Examples in REDLOG are available at http://www.algebra.fim.uni-passau.de/redlog/examples/.

while in my proposed framework, the same problem will be specified as:

BehindCurve({x, y}, {{p1,x, p1,y}, {p2,x, p2,y}, ..., {pn,x, pn,y}}, {px, py}) ≡ Intersect({{p1,x, p1,y}, {p2,x, p2,y}, ..., {pn,x, pn,y}}, {{px, py}, {x, y}})

Though solving the problem from either specification produces the same solution, it is clear which specification is more elegant, understandable and easy to use. Thus, one of the major contributions of this dissertation is a language, along with a vocabulary of high-level predicates, that can grow naturally as more problems are specified.

Faster quantifier elimination without compromising generality. It is well known that constraint satisfaction is a convenient framework for modeling search problems. The use of quantifiers significantly enhances the expressiveness of a language for modeling such problems; in fact, many real world problems cannot be modeled without quantifiers. Solving these problems requires quantifier elimination which, however, has been proven to be computationally very expensive (doubly exponential in the number of alternating blocks of quantified variables). Hence, modeling problems as quantified constraint satisfaction and solving them has remained limited to a few domains and applications. Researchers have investigated two classes of faster quantifier elimination algorithms – algorithms that are reasonably general but provide only a partial or approximate solution, thereby compromising the quality of the solution for speed, and algorithms that provide complete solutions but only for a limited class of problems, thereby compromising generality for speed. In this dissertation, given that the domain of representation of diagrammatic objects is piecewise linear, I proposed a strategy for re-using the solutions of previously solved similar problems stored in memory to compute the solution of a new problem. This strategy completely bypasses the quantifier elimination process, thereby reducing the worst case complexity from doubly exponential to low-order polynomial. The notable feature of this strategy is that it achieves speed without compromising either the generality of the quantifier elimination algorithms or the quality of their solutions, and it lets the SPS grow smarter as it solves more problems.
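A toy sketch of the memory idea, assuming a solved problem can be reduced to a structural signature (predicate names and quantifier pattern, with concrete parameter values abstracted away) under which its stored parametric solution can be reused for a new instance; the signature function, the cache, and the callables passed to solve are illustrative placeholders, and section 3.3 describes the actual mechanism:

    class SolutionMemory:
        """Cache of solved problem specifications keyed by structural signature."""

        def __init__(self):
            self._memory = {}

        @staticmethod
        def signature(spec):
            """Reduce a specification (a nested tuple of predicate names and
            argument slots) to a hashable structural key, replacing concrete
            parameter values with a placeholder slot."""
            if isinstance(spec, tuple):
                return tuple(SolutionMemory.signature(s) for s in spec)
            return spec if isinstance(spec, str) else "_"

        def solve(self, spec, params, qe_solver, reuse):
            key = self.signature(spec)
            if key in self._memory:
                # Re-instantiate the stored parametric solution: polynomial cost.
                return reuse(self._memory[key], params)
            solution = qe_solver(spec, params)   # doubly exponential in general
            self._memory[key] = solution
            return solution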

A spatial search strategy for quantifier elimination. Problems specified in a formal language, such as first-order predicate logic, have typically been solved using algebraically involved constraint satisfaction and quantifier elimination algorithms, as such algorithms can naturally accept problems specified in such languages. The use of spatial search to implement quantified constraint satisfaction is relatively novel. The advantage of this strategy is its simplicity and generality. All decision and function problems are handled in the same way by the application of the three operators (F, F∃, F∀) in some order, which is determined automatically by the SPS from the order of the quantifiers in the specification. The complexity of quantifier elimination in the constraint satisfaction strategy depends on the number of polynomials and their highest order, while that of the spatial search strategy depends on the number of pixels processed and updated. One limitation of the spatial search strategy is that problems cannot be solved symbolically, and hence the solutions cannot be stored for future use, unlike in the constraint satisfaction approach. Hence each problem has to be solved from scratch, irrespective of whether any similar problem has been previously solved.

6.3 Future research

Optimization by spatial search. In this dissertation, optimization problems were not completely solved using the spatial search strategy. The functional part, i.e., the constraint solving and quantifier elimination, was solved by spatial search, and its solution formed the pool of candidates out of which the best candidate has to be chosen as the final solution by minimizing or maximizing some function. I believe that a satisfactory general strategy for choosing the best solution by optimizing an arbitrary function can be developed that incorporates spatial search; this forms a promising avenue for future research.

Selecting the right level of resolution in a problem-dependent manner. One of the fundamental ideas by which the spatial search strategy can be made efficient is varying the resolution according to problem solving needs. Given a particular resolution, which resolution to choose next is crucial for this idea to be effective. In the current implementation, each pixel is subdivided into nine pixels when an increase in resolution is required. Though this gives satisfactory results in many cases, as we have seen, the efficiency can be further enhanced by choosing the resolution depending on the task and the objects involved.

An SPS that can suggest the next step in problem solving. Researchers [Larkin and Simon, 1987] have argued that one of the most important benefits of using diagrams is that the information is so spatially arranged that perception can pick up cues that suggest the next subproblem to be considered during problem solving. One such instance occurred in the Pythagoras theorem proving example, where the perception that the triangles seemed to be congruent in the diagram laid the way for the rest of the proof. Whether or not the triangles were really congruent had to be verified formally, but still, a huge amount of search was saved due to the cue provided by perception. What are the general principles by which vision picks up and suggests such information? Clearly, these are goal-dependent; for example, vision would not have suggested that the triangles are of the same color, because color is irrelevant for this particular goal. Are the relevant properties and relations extracted in a bottom-up or top-down manner? What is the role of the high level problem solver, if any? Though these issues have not been considered in this dissertation, they definitely require further investigation, as they form the crux of any general and efficient DR system.

BIBLIOGRAPHY

A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.

G. Allwein and J. Barwise. Logical reasoning with diagrams. Journal of Logic, Language and Information, 8(3):387–390, 1999.

J. R. Anderson. Rules of the Mind. Lawrence Erlbaum Associates, Hillsdale, NJ, 1993.

B. Banerjee and B. Chandrasekaran. Perceptual and action routines in diagrammatic reasoning for entity-reidentification. In Proc. 24th Army Science Conference, FL, 2004.

J. Barwise and J. Etchemendy. A computational architecture for heterogeneous reasoning. In I. Gilboa, editor, Proc. 7th Conference on Theoretical Aspects of Rationality and Knowledge, pages 1–27. Morgan Kaufmann, 1998.

S. Basu, R. Pollack, and M.-F. Roy. Algorithms in Real Algebraic Geometry. Springer-Verlag, 2003.

B. Chandrasekaran, J. R. Josephson, B. Banerjee, U. Kurup, and R. Winkler. Diagrammatic reasoning in support of situation understanding and planning. In Proc. 23rd Army Science Conference, FL, 2002.

B. Chandrasekaran, U. Kurup, B. Banerjee, J. R. Josephson, and R. Winkler. An architecture for problem solving with diagrams. In A. Blackwell, K. Marriott, and A. Shimojima, editors, Diagrammatic Representation and Inference, Lecture Notes in Artificial Intelligence, volume 2980, pages 151–165. Berlin: Springer-Verlag, 2004.

B. Chandrasekaran, U. Kurup, and B. Banerjee. A diagrammatic reasoning architecture: Design, implementation and experiments. In Proc. AAAI Spring Symposium, Reasoning with Mental and External Diagrams: Computational Modeling and Spatial Assistance, pages 108–113, Stanford University, CA, 2005.

B. Chazelle. Triangulating a simple polygon in linear time. Discrete and Computational Geometry, 6:485–524, 1991.

H. Chen. The Computational Complexity of Quantified Constraint Satisfaction. PhD thesis, Cornell University, Dept. of Computer Science, 2004.

G. E. Collins and H. Hong. Partial cylindrical algebraic decomposition for quantifier elimination. Journal of Symbolic Computation, 12(3):299–328, 1991.

J. H. Davenport and J. Heintz. Real quantifier elimination is doubly exponential. Journal of Symbolic Computation, 5(1–2):29–35, 1988.

N. Dershowitz. Rewrite systems. In Handbook of Theoretical Computer Science. Elsevier, Dordrecht, The Netherlands, 1990.

A. Dolzmann, T. Sturm, and V. Weispfenning. Real quantifier elimination in practice. In B. H. Matzat, G.-M. Greuel, and G. Hiss, editors, Algorithmic Algebra and Number Theory, pages 221–247. Springer, Berlin, 1998.

R. W. Ferguson. Magi: Analogy-based encoding using symmetry and regularity. In Proc. 16th Annual Conference of the Cognitive Science Society, pages 283–288, Atlanta, 1994.

R. W. Ferguson and K. D. Forbus. Georep: A flexible tool for spatial representation of line drawings. In Proc. 18th Natl. Conference on AI, pages 510–516, Austin, Texas, 2000.

R. W. Ferguson and K. D. Forbus. Telling juxtapositions: Using repetition and alignable difference in diagram understanding. In K. Holyoak, D. Gentner, and B. Kokinov, editors, Advances in Analogy Research, pages 109–117. Sofia: New Bulgarian University, 1998.

J. Glasgow, N. H. Narayanan, and B. Chandrasekaran. Diagrammatic Reasoning: Cognitive and Computational Perspectives. AAAI Press, 1995.

M. Jamnik. Mathematical Reasoning with Diagrams: From Intuition to Automation. CSLI Press, Stanford University, CA, 2001.

J. E. Laird, P. S. Rosenbloom, and A. Newell. Universal Subgoaling and Chunking. Kluwer Academic Publishers, 1986.

J. E. Laird, A. Newell, and P. S. Rosenbloom. SOAR: An architecture for general intelligence. Artificial Intelligence, 33:1–64, 1987.

J. Larkin and H. A. Simon. Why a diagram is (sometimes) worth 10,000 words. Cognitive Science, 11:65–99, 1987.

A. Lasaruk and T. Sturm. Weak quantifier elimination for the full linear theory of the integers. A uniform generalization of Presburger arithmetic. Technical Report MIP-0604, FMI, Universität Passau, Germany, April 2006.

R. K. Lindsay. Using diagrams to understand geometry. Computational Intelligence, 14(2):238–272, 1998.

R. Loos and V. Weispfenning. Applying linear quantifier elimination. Computer Journal, 36(5):450–461, 1993.

N. H. Narayanan and B. Chandrasekaran. Reasoning visually about spatial interactions. In Proc. 12th International Joint Conference on Artificial Intelligence, pages 360–365, Sydney, Australia, 1991.

J. A. Nelder and R. Mead. A simplex method for function minimization. Computer Journal, 7:308–313, 1965.

A. Newell. Unified Theories of Cognition. Harvard University Press, Cambridge, MA, 1990.

S. Pinker. A theory of graph comprehension. In R. Freedle, editor, Artificial Intelligence and the Future of Testing, pages 73–126. Lawrence Erlbaum, Hillsdale, NJ, 1990.

Y. Pisan. A visual routines based model of graph understanding. In Proc. 17th Annual Conference of the Cognitive Science Society, Pittsburgh: Erlbaum, 1995.

S. Ratschan. Efficient solving of quantified inequality constraints over the real numbers. ACM Transactions on Computational Logic, 7(4):723–748, 2006.

R. Storn and K. Price. Differential evolution: A simple and efficient adaptive scheme for global optimization over continuous spaces. Journal of Global Optimization, 11:341–359, 1997.

T. Sturm and V. Weispfenning. Computational geometry problems in REDLOG. In D. Wang, editor, Automated Deduction in Geometry, Lecture Notes in AI, volume 1360, pages 58–86. Springer-Verlag, 1998.

I. E. Sutherland. Sketchpad: A man-machine graphical communication system. In Proc. Spring Joint Computer Conference, pages 329–346, 1963.

A. Tarski. A Decision Method for Elementary Algebra and Geometry. University of California Press, Berkeley, CA, 1951.

A. Tarski. A decision method for elementary algebra and geometry. In B. F. Caviness and J. R. Johnson, editors, Texts and Monographs in Symbolic Computation. Springer-Verlag, Vienna, 1998.

S. Tessler, Y. Iwasaki, and K. Law. Qualitative structural analysis using diagrammatic reasoning. In Proc. 14th Intl. Joint Conference on AI, pages 885–893, Montreal, 1995.

S. B. Tricket and J. G. Trafton. Toward a comprehensive model of graph comprehension: Making the case for spatial cognition. In D. Barker-Plummer, R. Cox, and N. Swoboda, editors, Diagrammatic Representation and Inference, Lecture Notes in Artificial Intelligence, volume 4045, pages 286–300. Berlin: Springer-Verlag, 2006.

B. Tversky. Some ways that maps and diagrams communicate. In C. Freksa, W. Brauer, C. Habel, and K. F. Wender, editors, Spatial Cognition II: Integrating Abstract Theories, Empirical Studies, Formal Methods, and Practical Applications, Lecture Notes in Computer Science, volume 1849, pages 72–79. Berlin: Springer-Verlag, 2000.

V. Weispfenning. The complexity of linear problems in fields. Journal of Symbolic Computation, 5(1–2):3–27, 1988.

V. Weispfenning. Semilinear motion planning in REDLOG. Applicable Algebra in Engineering, Communication and Computing, 12:455–475, 2001.

S. Wolfram. The Mathematica Book. Available online at http://documents.wolfram.com/, 5th edition, 2003.
