SFB/TR 8: I1-[OntoSpace] OntoSpace Project Report University of Bremen Germany

General Ontology Baseline

Deliverable D1 I1-[OntoSpace]; Workpackage 1 Scott Farrar and John Bateman

December 2005 Version: 2.0

http://www.sfbtr8.uni-bremen.de/I1 e-mail: {farrar,bateman}@uni-bremen.de

Roadmap for baseline deliverables D1-D4

The baseline position on ontology, ontology construction, and ontology use adopted in the project I1-[OntoSpace] is set out in a sequence of four deliverables (D1-D4). Each provides an introduction to the respective states of the art and describes the positions within these that I1-[OntoSpace] is adopting for its own work or as proposals for ontology construction within the SFB/TR Ontology Working Group generally. The baseline is made up of the following components:

D1 Ontology as such and the principal approaches and methodologies currently available for general ontology construction;

D2 The ontologies of space: approaches to representing space that have been taken in ontology and in qualitative spatial representation and reasoning;

D3 The ontologies motivated by and for language: approaches to representing the kinds of distinctions that treatments of natural language require—particularly but not exclusively those required for spatial language;

D4 Inter-Ontology mappings and structuring devices: approaches to constructing ontologies out of submodules and to relating such submodules in systematic ways.

This is the first of these deliverables and introduces the notions of ontology as such. Deliverables D1, D2 and D3 are results of Workpackage 1 as described in the I1-OntoSpace project proposal; D4 is a result of Workpackage 3 and cooperation with project I4-[SPIN]. In general, we will describe deliverables either by the long form ‘I1-[OntoSpace]:D1’ or, when there is no need for disambiguation, the short form ‘D1’.

Note: We maintain an extensive and regularly updated web portal for our ontology activities as well as pointers to all kinds of ontologies at the Bremen Ontology Research Group website: http://www.fb10.uni-bremen.de/ontology

Abstract

This is a general overview document setting out the basics of ontology design for Project I1-[OntoSpace] and the SFB. First, a discussion of the major design parameters is given in order to familiarize SFB-members with the state of ontological engineering and the issues involved. We then select several key ontologies for discussion along the lines of the parameters introduced. We conclude by setting out our starting assumptions for the ontology designs that will be employed within the project OntoSpace. These assumptions are also proposals for ontology design within the SFB as a whole.

Acknowledgements

The Cooperative Research Center for Spatial Cognition (Sonderforschungsbereich/Transregio SFB/TR8) of the Universities of Bremen and Freiburg is funded by the Deutsche Forschungsgemeinschaft (DFG), whose support we gratefully acknowledge. We also acknowledge useful comments on earlier drafts, especially from Doug Foxvog and also from Thora Tenbrink. Finally, we thank Claudio Masolo and Stefano Borgo for inspiration and advice on all things ontological.

Contents

1 Introduction: starting points

2 Basic Dimensions of Ontology Building
   2.1 Philosophical approaches
   2.2 Meta-level decisions
      2.2.1 Subsumption, classes and instances
      2.2.2 Set theory and mereology
      2.2.3 3D and 4D views of reality
      2.2.4 Granularity and Scale
   2.3 Representations
   2.4 Method
   2.5 Computational instantiations of ontologies

3 Sowa’s Ontology
   3.1 Upper ontology and basics of Sowa’s ontology
   3.2 Relations and roles
   3.3 Abstractions
   3.4 Processes
   3.5 Representation
   3.6 Summary and discussion of Sowa’s ontology

4 SUMO
   4.1 SUMO basics and upper ontology
   4.2 Mereology in SUMO
   4.3 Representation
   4.4 Summary and discussion of SUMO

5 Smartkom Ontology
   5.1 Smartkom upper level
   5.2 Smartkom roles
   5.3 Smartkom types
   5.4 Smartkom relations
   5.5 Summary and discussion

6 OpenCyc
   6.1 OpenCyc basics and upper ontology
   6.2 Microtheories
   6.3 Mereology in OpenCyc
   6.4 Representation
   6.5 Summary and discussion of OpenCyc

7 DOLCE
   7.1 DOLCE basics and upper ontology
   7.2 Qualities
   7.3 Primitive relations
   7.4 Representation
   7.5 Summary and discussion of DOLCE

8 BFO
   8.1 Philosophical underpinnings of BFO
   8.2 SNAP
   8.3 SPAN
   8.4 Trans-ontological relations
   8.5 Representation
   8.6 Summary and discussion of BFO

9 General Ontology Language: GOL
   9.1 Basic approach of GOL
   9.2 General Formal Ontology: GFO
      9.2.1 Categories
      9.2.2 Classes
      9.2.3 Concrete entities
   9.3 GFO relations
   9.4 Summary and discussion

10 The D&S extension to DOLCE

11 Conclusions and recommendations

I Appendix: Parthood basics

List of Figures

1 A basic AI approach: Russell and Norvig’s upper ontology
2 The 4D representation of a car gaining and losing a wheel (West, 2002a)
3 Expressivity hierarchy for ALC classes of description logics
4 Sowa’s upper level ontology lattice
5 Roles in Sowa’s ontology
6 Thematic roles or participants in Sowa’s ontology
7 The process taxonomy in Sowa’s ontology
8 An example conceptual graph
9 The SUMO top-level categories
10 Overall modular organisation of SUMO
11 Various subrelations of ‘part’ in SUMO
12 The upper level ontology used in the Smartkom project
13 The process taxonomy of the Smartkom ontology
14 A portion of the ‘AbstractRepresentationalObject’ taxonomy
15 Smartkom ‘LocationType’ taxonomy
16 Smartkom ‘PhysicalObjectType’ taxonomy
17 OpenCyc’s upper ontology
18 Taxonomy of predicates concerning microtheories
19 Taxonomy of OpenCyc microtheories
20 Parts taxonomy
21 Physical parts taxonomy
22 DOLCE taxonomy: taken from Masolo et al. (2002: 9)
23 Quality and quality spaces (taken from Masolo et al., 2002: 12)
24 DOLCE ontological dependencies (taken from Masolo et al., 2002: 22)
25 SNAP top-level categories
26 SPAN top-level categories
27 The top-level categories of the General Formal Ontology
28 The relation taxonomy of General Formal Ontology
29 Description and situation extension to DOLCE (taken from Gangemi and Mika, 2003, figure 1)
30 Hierarchy of mereologies according to strength of commitments; inclusion follows the connecting lines upwards (from Casati and Varzi, 1999, p48)

List of Tables

1 Levels of description suggested by Guarino (1995:632)
2 Some dependences between ontological meta-properties according to the OntoClean methodology
3 Examples of subsumption problems that violate the OntoClean methodology (Guarino and Welty, 2001, Table 3)
4 Axioms of the basic ground mereology

I1-[OntoSpace]:D1

1 Introduction: starting points

This is a general overview document setting out the basics of ontology design that we are proposing for our research projects carried out within the scope of the Collaborative Research Center for Spatial Cognition (SFB/TR8). First, a discussion of the major design parameters is given in order to familiarize SFB-members with the state of ontological engineering and the issues involved. We then select several key ontologies for discussion along the lines of the parameters introduced. Our discussion mainly covers current candidates for standardization and re-use within the ontology community, with particular attention to those that we judge to be of relevance for the SFB. Our particular aim is to set out the starting assumptions for the ontology designs that we will employ within the project OntoSpace; we also consider these as guidelines to be adopted by the Ontology Working Group.

The ontologies that we select for review include ontologies from Artificial Intelligence (e.g., Russell and Norvig, SUMO, Cyc, Smartkom), from philosophy, particularly formal ontology (e.g., Sowa, BFO, GOL), as well as recent drives to combine these with cognitive concerns (e.g., DOLCE). They therefore range from ‘AI-friendly’ efforts such as those of SUMO to the rather more philosophically inclined. We hope that our discussion makes all of them equally accessible, while at the same time bringing out what in particular they can contribute to our own research concerns of developing a broadly based ontological approach to the problems of spatial representation, action and interaction. Our discussion is aimed at readers who wish to gain an overview of what the current ontological options on the market are and of the uses to which those ontologies can be put. For all areas, we give extensive references to more detailed sources of information.

The first question that we address is a very basic one: why do we want or need an ontology?
There has been a steady progression in attempts to understand and model intelligent behavior towards building increasingly explicit and regularized representations of the world in which such behavior is to take place. Early ‘knowledge-intensive’ approaches within AI and cognitive science explored how systems could be constructed to use explicit knowledge representations. Such representations started with more or less ad hoc data structures whose interpretation was often open to question. The classic article of Woods (1975), ‘What’s in a link?’, started a more critical approach to knowledge representation, in which knowledge representations were to be placed on much firmer foundations than hitherto. This was taken further in discussions such as those of Brachman (1977) and Newell (1982). Newell proposed a new ‘level of description’ in computational systems (the knowledge level) which, separated from questions of logical formalism or implementation, was to be responsible for capturing statements about the knowledge of a system. This level was to be subject to its own rules, regulations and methodologies.

The use of ontologies in current computational systems is a step further in this direction of increasing clarity concerning what is being modelled. It also represents an opening up of contacts with a broader range of philosophical concerns than were evident in early AI. This is only to be expected and is a sign that computational modelling has become more sophisticated; thinkers have been attempting to understand the world and human intelligence for a long time and it would have been curious indeed if AI and cognitive science had by chance found a way of avoiding all the difficult issues that have been unearthed during that history.

A more concrete feeling for the role of ontology can be obtained by taking some of the examples used by Guarino (1995) in his extension of the presumed ‘levels of description’ beyond the logical and knowledge levels of Newell and Brachman to take in ontology per se. We will return briefly to these levels more technically below, but here we can use them to show the tasks that an ontology is intended for. We reproduce Guarino’s levels in Table 1 for convenience; we concentrate on the first three levels described: the logical, the epistemological and the ontological.

Level            Primitives              Interpretation  Main feature
Logical          Predicates, functions   Arbitrary       Formalization
Epistemological  Structuring functions   Arbitrary       Structure
Ontological      Ontological relations   Constrained     Meaning
Conceptual       Conceptual relations    Subjective      Conceptualization
Linguistic       Linguistic terms        Subjective      Language dependency

Table 1: Levels of description suggested by Guarino (1995:632)

At, in some sense, the most basic level, we have representations expressed at the logical level or, as Guarino (1995) also describes it, the level of formalization. Primitives at the level of logic are the predicates, propositions, functions, and logical operators. This is to be understood as ‘basic’, however, only in the sense of being the most neutral, the most noncommittal. For example, if we wish to represent the fact that there is some red ball we could write:

    ∃x ball(x) ∧ red(x)

According to such a representation there is, then, no ontological (or any other) difference between ‘ball’ and ‘red’ or, to make this clearer perhaps, between ‘ballness’ and ‘redness’. Logically they are both unary predicates.
On a somewhat deeper analysis, however, we might be unhappy with such a representation in that it quickly becomes clear with richer domains that there actually are several important differences between ‘ballness’ and ‘redness’ that are lost in the purely logical representation.

At the next of Guarino’s levels of representation, the epistemological level, one then seeks to draw distinctions familiar from accounts of knowledge representation: for example, when modelling the world of this red ball, we make certain decisions concerning the concepts that are involved and which relations these concepts can enter into. This is now the basic vocabulary adopted in specifications expressed in terms of description logics because of their origins in work on knowledge representation (see Section 2.3 below). Thus, we would seek here, presumably, to state that actually a ‘ball’ is a concept and ‘redness’ is some property that instances of such concepts can carry; e.g., ‘red’ is the value of a ‘hasColor’ relation. This level then also offers a natural place for imposing more structure on the information being represented; the principal function of ‘frame languages’, the early ancestors of today’s description logics and knowledge representation languages, was to form appropriate bundles of knowledge, augmenting the flat representations of the logical level with structuring primitives such as concepts, roles, slots and frames.

However, and this is Guarino’s main point here, this level of representation is also, in an important sense, arbitrary: there is no specified methodological reason why the knowledge modeller had to make the decision above concerning the ball and the color in that way. For reasons internal to some particular domain or application, the ‘knowledge engineer’ may have decided after all that both ‘red’ and ‘ball’ are indeed concepts, or that there is no concept ‘red’, just a further subconcept ‘red ball’, or even that ‘red’ is the concept and ‘ball’ is the value of some role. The latter would then make the implicit commitment to color being the most important organizing category for the model: i.e., there is a color red and that color ‘has objectness’ ball. In general it is not possible to forbid such a representational choice: in a database of colors it might be exactly the kind of organization that is wanted, colors and examples of things that possess those colors. But what is important is that consequences follow from the modelling decision: certain inferences and generalizations are going to be easier to set up on the basis of one representation than they are in another.
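The arbitrariness of the epistemological level can be made concrete with a minimal sketch in Python. The class names, the `has_objectness` field and the two query functions are illustrative assumptions, not part of Guarino's account; the sketch only shows that two of the modelling choices discussed above encode the same red ball, while each makes a different query the natural one.

```python
from dataclasses import dataclass


# Choice 1: 'Ball' is the concept, 'red' is the value of a hasColor role.
@dataclass(frozen=True)
class Ball:
    has_color: str


# Choice 2: 'Color' is the concept, 'ball' is the value of a role
# (hypothetically named has_objectness, echoing the text).
@dataclass(frozen=True)
class Color:
    name: str
    has_objectness: str


def balls_with_color(balls, color):
    """Natural query under choice 1: which balls carry a given color?"""
    return [b for b in balls if b.has_color == color]


def objects_of_color(colors, name):
    """Natural query under choice 2: which kinds of thing exemplify a color?"""
    return [c.has_objectness for c in colors if c.name == name]


balls = [Ball("red"), Ball("blue")]
colors = [Color("red", "ball"), Color("red", "apple"), Color("blue", "ball")]

red_balls = balls_with_color(balls, "red")
red_things = objects_of_color(colors, "red")
```

Neither encoding is forbidden at this level; the difference lies in which generalizations (over balls or over colors) come for free.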
This is where Guarino appeals to his new further level, the ontological level, to help clarify and motivate particular modelling decisions over others. Here principles are set out for the kinds of formal properties that we demand of ‘concepts’ and the kinds of formal properties that we demand of ‘roles’; this is then intended to lead to more consistent modelling decisions being taken for domains as a whole and, as a consequence, more reliable and re-usable representations. This level of representation is, for Guarino and most who explore ontologies, no longer ‘arbitrary’: ontology is very much constrained. Ontological choices are no longer free modelling decisions but have instead to reflect, in the most traditional senses of Ontology, the world. That is, we no longer have a free choice concerning our treatment of ‘balls’ and ‘redness’ because these are anchored in a real world of physical objects and physical qualities and their possible interrelationships.

Thus, on the one hand, an ontology is a way of making explicit those commitments and structural necessities that follow from the fact that we are modelling not knowledge in the abstract, but concrete objects, qualities, relations and events of the known world subject to a rich web of non-arbitrary constraints. Moreover, on the other hand, an ontology is also one way of specifying explicitly just what follows from particular kinds of modelling decisions: to state that ‘ballness’ is a quality that inheres in certain colors would be a very strong ontological statement and many consequences would follow from it; the choice is no longer arbitrary. An ontology therefore establishes a methodology and set of principles for deciding in what way entities, relations, activities and so on are to be captured in a formal representation.

This also impacts on a further area where ontologies are very much at the center of attention: re-usability and knowledge sharing. Without an ontology, the chances of agreement across differing domains modelled by differing groups are much reduced; since any ontology is to be motivated by its anchoring in the world, it is generally hoped that knowledge representations that respect ontological decisions will provide a more robust basis for inter-operability and information sharing. This is one of the intuitions underlying the present prominence of ontologies within discussions of the Semantic Web (Berners-Lee, Hendler & Lassila 2001).

The value of having a well-defined ontological level can also be seen in most of the traditional problem areas of knowledge representation in AI and relates back to some of the earliest work in the area. Early computational ontologists, such as Pat Hayes (cf. Hayes 1979, Hayes 1985b, Hayes 1985a), began setting out formal systems intended to capture the ‘naive physics’ of the everyday world. Formal systems of this type attempted to set out rules of inference that would enable a computational system employing them to understand that the world it inhabits and about which it is to communicate has certain essential properties and structures: properties and structures that are not accidental or contingent, but necessary to its constitution. This is potentially of fundamental importance for addressing such hard problems in AI as the frame problem: i.e., the problem of how a system is to know just which aspects of the world change following some action and which do not. For planning and related domains this is a crucial question and there is an entire literature on methods and techniques for restricting its consequences.
Solutions proposed ranged from adopting maximally conservative hypotheses, such as nothing changes that is not explicitly mentioned in the effects fields of action representations, to less restricted inferencing approaches, where a planning system will try to infer just what must have changed or could have changed and what not.

The most obvious role of ontology here is that it determines a dense structure of things that will not change: regardless of what kind of action is carried out (excluding for the moment acts of magic) physical objects will not cease to have positions in space and time, they will not cease to bear certain physical attributes, etc. These ontological commitments should be made explicit so that they do not have the same status as other kinds of ‘knowledge’ represented by a system. That is, there is a very special kind of relationship between a ball and its redness that it is the job of an ontology to set out. Then, going further, ontology is not restricted to describing objects. Ontology is equally concerned with activities, events, states and other temporally sophisticated entities. These too have certain necessary structures, which it is the job of ontology and Ontological Engineering to set out.

Guarino offers the following further illustration of the role of ontology in this endeavor, which we can use to emphasize once again some of the points at issue:

“. . . let us suppose the following theory is true at a time t.

    A  Pen(a)
    B  Functions(a)
    C  Pen(x) ∧ Functions(x) ⊃ writes(x)

Suppose now that ¬writes(a) is true at a time t′; which of the assertions A, B, C should we retract in order to avoid the inconsistency of our theory?” (Guarino 1995, p637)

The possibilities turn on the kinds of ontological commitments that we have in place. We need to know that assertions A and C have a different status to that of B. It may well be that the pen stops working and so does not write; this is the commonsense and correct answer, i.e., B should be retracted. What is less sensible would be the logically equivalent retraction of A, i.e., that the object in question stops being a pen or, perhaps even worse, of C, that pens have suddenly stopped having the function of writing. The first would be merely magical; the second a radical change in the social order! Such differences can of course be addressed by assigning the assertions different statuses in a purely logical account, but the general point remains: whenever and however this is done, this is to start building in ontological commitments concerning the entities and relations represented. And this is the proper task of ontology construction: just how are situations to be represented so as to bring out those arrangements which are necessarily the case and those which are merely contingent?

As we shall see in this deliverable, however, the construction of ontologies is not an easy matter and there are a number of not entirely compatible suggestions about how it should be achieved. Moreover, although we will argue that the adoption of an ontological level of description brings with it a range of very practical benefits, such as increased re-usability of knowledge, improved reliability and consistency in modelling decisions, etc., what we take as an absolute starting point is more theoretical: we believe that such levels of representation must form a crucial component of all attempts to construct explicit models of our understanding of the world and of intelligent behavior within that world.
All such attempts make ontological commitments—our task in these baseline ontology deliverables is to bring out the space of possibilities that are available for making these commitments explicit and arguable, and for relating them to one another.
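The point made with the pen example, that assertions A and C have a different status to that of B, can be sketched as follows. This is only an illustration under stated assumptions: the `Status` labels and the `retraction_candidates` helper are hypothetical and are not Guarino's formalism; they merely show how recording an ontological status for each assertion singles out B as the one to retract.

```python
from enum import Enum


class Status(Enum):
    NECESSARY = "necessary"    # holds by virtue of what the entity is
    CONTINGENT = "contingent"  # may change without the entity changing kind


# The three assertions of the pen theory, each tagged with an assumed status.
theory = {
    "A": ("Pen(a)", Status.NECESSARY),
    "B": ("Functions(a)", Status.CONTINGENT),
    "C": ("Pen(x) & Functions(x) -> writes(x)", Status.NECESSARY),
}


def retraction_candidates(assertions):
    """Return labels of assertions that may be given up when an
    inconsistency arises: only the contingent ones."""
    return sorted(
        label
        for label, (_, status) in assertions.items()
        if status is Status.CONTINGENT
    )


candidates = retraction_candidates(theory)
```

Once ¬writes(a) is observed, a purely logical account offers A, B and C as equally good candidates; with the statuses made explicit, only B remains.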

2 Basic Dimensions of Ontology Building

The construction of ontologies for artificial intelligence, knowledge-based software and information systems may vary according to a number of basic dimensions. In this section, we present an overview of the most important of these dimensions, many of them having their origins in Ontology from the philosophical tradition. We can classify these dimensions as follows:

• What does the ontology take itself to be describing or capturing?
• What basic dimensions of choice are there for considering the content that is to be captured in the ontology?
• What formal tools are selected for that representation?

Much of this discussion draws on standard introductions or characterizations of ontology in the literature; we draw particularly on Guarino (1994), Masolo, Borgo, Gangemi, Guarino & Oltramari (2003), Smith (1998) and Smith (forthcoming). There is also a question of what evidence is admissible for motivating the content of ontologies; this blends into discussions of particular types of ontologies, such as, most relevantly for OntoSpace, linguistic ontologies. We will, however, discuss this aspect of ontology construction separately in our deliverables concerning linguistic ontologies and their construction as such.

2.1 Philosophical approaches

What the elements of an ontology are supposed to represent is in some respects a subjective question. In the classical sense (commonly attributed to Husserl in his Logical Investigations), Ontology as a philosophical pursuit is intended as a description of what exists in the material world, an account of absolute reality independently of any mind that perceives it. That is, the elements of Ontology are independent of any human mind that happens to also exist in the material world. This approach is known as ontological realism (Grenon & Smith 2004). The work of a formal ontology in the realist tradition, then, is to lay out “a priori distinctions” (i) “among the entities of the world (physical objects, events, regions, quantities of matter)” and (ii) “among the meta-level categories used to model the world (concepts, properties, qualities, states, roles, parts)” (Guarino 1995, p5).

In contrast to this, ontologies may also be pursued as representations of concepts as grasped by some cognitive agent, human or artificial. This cognitively-oriented approach has been a major focus of ontologists in the AI tradition, especially with regard to domain-specific ontologies. Much current work in practical ontology building assumes a cognitivist, non-realist approach. An often cited upper ontology that is in line with the conceptualist, AI tradition is that of Russell and Norvig found in the popular AI textbook, Russell & Norvig (2003) (see Figure 1 for a graphical overview of the kinds of categories contained (Russell & Norvig 2003, p321)). In this sense, an ontology is “an explicit specification of a conceptualization” where a “conceptualization” is the “set of objects, concepts, and other entities that are assumed to exist” in some domain (Gruber 1995). This second approach does not necessarily commit itself to describing what is actually real in the world, but only to what is important for some domain.

Figure 1: A basic AI approach: Russell and Norvig’s upper ontology

This non-commitment to reality can naturally be freed even of any cognitivist underpinnings. In some recent discussions of ontology, use of the term has become equivalent to ‘domain modelling’ in a knowledge engineering sense; in the words of Gangemi & Mika (2003): “Ontologies, as discussed in Artificial Intelligence, are formal, partial specifications of an agreement over the description of a domain.”

For some degree of disambiguation, we will write ontologies that come with a realist assumption as Ontologies (with a ‘large O’), ontologies with cognitive or linguistic motivation as ontologies (with a ‘small o’), and the others we will not describe as ontologies at all. We will ourselves attempt to skirt issues of existence so as to remain for present purposes consistent with a broader range of approaches.

2.2 Meta-level decisions

Whichever of the main approaches is taken, both realist and cognitivist ontologies must give an account of the most basic, domain-independent entities found in the “world”, as well as the categories which are particular to some domain of inquiry (often called a material ontology). The most basic level is known variously as an upper model, an upper ontology, or, following Husserl, a formal ontology (Smith forthcoming), etc. Our discussion here focuses on this foundational ‘component’ of an ontology: we will term it an upper ontology in line with several current attempts to achieve consensus concerning this ‘most general’ level of ontology, without committing to, or necessarily agreeing with, those efforts in general. Our precise points of agreement or disagreement will be clarified where they arise.

There are several basic divisions that can be made concerning how this upper level is to be organized and these have ramifications for all subsequent design decisions. It is therefore useful to present these explicitly at the outset so that we can make use of them when considering the individual ontologies we discuss below.

2.2.1 Subsumption, classes and instances

In the description of particular ontologies, authors often rely heavily on a presentation of the basic taxonomy of entities in the world to focus the discussion. The formation of this taxonomy may not be the most important dimension of ontology design; however, it provides a backbone from which domain specialists can then extend upper ontologies. For this reason we will first go into some detail concerning the formation of taxonomies and their theoretical importance.

Mathematically, taxonomies are lattice structures containing ontological classes as nodes. We shall use the term ‘class’ to refer to those entities in an ontology that represent the concepts or categories found in the world that is being modelled. Within the taxonomy classes are related by the partial-ordering relation of subsumption. That is, subsumption is reflexive, transitive and anti-symmetric. Various ontologies use the terms ‘subclass’ or ‘subsumes’ to represent the subsumption relation.

Taxonomies then do not include instances of classes; however, presentations of taxonomies may in fact display instances as leaves of the same tree. Formally, the instantiation relation is separate from subsumption and so should not have any bearing on the lattice structure of the taxonomy. The relation of instantiation holds between classes and instances. For example, the class Dog might be instantiated by the individual Fido. Note also that if Dog is subsumed by Animal, then Fido is also an instance of Animal. In this case, then, Dog and Animal are classes whose members are individuals. An individual is something that cannot be instantiated, e.g., Fido. Instancehood is a relationship something bears to something else (a class), while individuality is intrinsic.
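The separation of subsumption from instantiation described above can be sketched in a few lines of Python. This is a toy illustration only: the intermediate class `Mammal` and the dictionary layout are assumptions added to show transitivity; the text itself mentions only Dog, Animal and Fido.

```python
# Direct subsumption links: class -> set of directly subsuming classes.
# 'Mammal' is a hypothetical intermediate class added for illustration.
SUBCLASS_OF = {
    "Dog": {"Mammal"},
    "Mammal": {"Animal"},
    "Animal": set(),
}

# Direct instantiation links, kept separate from the taxonomy itself.
INSTANCE_OF = {"Fido": {"Dog"}}


def subsumed_by(cls, ancestor):
    """Reflexive, transitive closure of the direct subsumption links."""
    if cls == ancestor:
        return True
    return any(subsumed_by(parent, ancestor) for parent in SUBCLASS_OF.get(cls, ()))


def is_instance(individual, cls):
    """An individual instantiates every class that subsumes one of its
    directly asserted classes."""
    return any(subsumed_by(c, cls) for c in INSTANCE_OF.get(individual, ()))
```

On this toy taxonomy `subsumed_by("Dog", "Animal")` and `is_instance("Fido", "Animal")` both hold, while `subsumed_by("Animal", "Dog")` does not, reflecting anti-symmetry for distinct classes. Fido appears only in `INSTANCE_OF`, so it plays no part in the ordering of classes.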
It should also be noted here in passing that just what kinds of individuals there are will depend on the ontology considered: Cyc, for example, adopts a very broad range of possible individuals, including notions such as relations, functions, rules, and groups.

Systems for describing ontologies can take some differing positions on these basic issues. Some systems only include classes of individuals (“first-order classes”), as in our example above. In ontologies which do not permit classes of classes, the terms ‘instance’ and ‘concept’ or ‘class’ are often used to refer to individuals and classes of individuals. Other systems go further and allow classes (and taxonomies) of classes as well. In some cases, the class-of-class structures can become exceedingly complex: we see this for the Cyc and OpenCyc ontology discussed below (cf. Section 6). In such ontologies the (higher-order) class BiologicalSpecies may be instantiated by the class Dog. If BiologicalSpecies is subsumed by OrganismType then Dog is also an instance of OrganismType. In this spirit a system might support a configuration as follows:[1] the class BiologicalTaxon has BiologicalKingdom, BiologicalPhylum, BiologicalOrder,... BiologicalSpecies as instances. The class BiologicalSpecies has Dodo, HomoSapiens, and AmanitaMuscaria as instances. The class HomoSapiens has PersonA and PersonB as instances. Of these, only the last two are then individuals, although all of them are instances of something, BiologicalTaxon being an instance of (the Class) Class.

[1] Thanks to Doug Foxvog (pers. comm.) for this example and for the clarifying notes made in this subsection.
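The class-of-classes configuration in this example can be written down as a small sketch. The data structure is an assumption made for illustration: each entity is given a single direct class, and the set of entities that can themselves be instantiated is declared explicitly, since a class with no recorded instances (e.g., Dodo) is still a class.

```python
# Each entity mapped to the class it directly instantiates (from the example).
INSTANCE_OF = {
    "BiologicalTaxon": "Class",
    "BiologicalKingdom": "BiologicalTaxon",
    "BiologicalPhylum": "BiologicalTaxon",
    "BiologicalOrder": "BiologicalTaxon",
    "BiologicalSpecies": "BiologicalTaxon",
    "Dodo": "BiologicalSpecies",
    "HomoSapiens": "BiologicalSpecies",
    "AmanitaMuscaria": "BiologicalSpecies",
    "PersonA": "HomoSapiens",
    "PersonB": "HomoSapiens",
}

# Entities that can be instantiated. Declared explicitly rather than derived
# from recorded instances: individuality is intrinsic, not an accident of data.
CLASSES = set(INSTANCE_OF.values()) | {
    "BiologicalKingdom", "BiologicalPhylum", "BiologicalOrder",
    "Dodo", "AmanitaMuscaria",
}


def individuals():
    """Entities that are instances of something but cannot be instantiated."""
    return sorted(e for e in INSTANCE_OF if e not in CLASSES)


def instantiation_chain(entity):
    """Follow the direct instance-of links upwards from an entity."""
    chain = [entity]
    while chain[-1] in INSTANCE_OF:
        chain.append(INSTANCE_OF[chain[-1]])
    return chain
```

In this sketch `individuals()` yields only PersonA and PersonB, and `instantiation_chain("PersonA")` passes through HomoSapiens, BiologicalSpecies and BiologicalTaxon before reaching Class.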

2.2.2 Set theory and mereology

In the design of ontologies, the choice of formal modelling tools is one of the most important design aspects, both from a theoretical and a practical point of view. A fundamental distinction exists between those ontologies that use as the primary modelling tool some flavor of set theory and those that use a mereology.[2] Set theory is a powerful modelling device used extensively in mathematics and formal concept analysis. In an ontology that relies primarily on set theory, the basic set operations (union, intersection, subset, etc.) are used for deriving the ontological classes. This does not necessarily imply that such ontologies equate classes with sets of individuals (Degen, Heller, Herre & Smith 2001); this assumption is, however, central to some ontologies, for example the SUMO ontology (Niles & Pease 2001b). As will be discussed in Section 6, the top class in OpenCyc is a set built up from more fundamental sets, which are in turn built up from yet more fundamental sets. In such an ontology a basic distinction is made between those entities which are sets and those entities which are individuals, that is, not set-theoretic in nature. Individuals in this sense are sometimes called urelements (Smith 1998). Singleton sets containing urelements are the leaf nodes (in the taxonomy) of ontologies that equate sets with classes.

Some ontologists argue further that the mathematical notion of set membership should also not be confused with instantiation. Gangemi, Guarino, Masolo & Oltramari (2001) use the following example to illustrate the difference between set membership and instantiation. Consider two possible interpretations of “Socrates is a man”:

1. Socrates belongs to the class of all human beings;

2. Socrates exhibits the property of being a man;

Then:

“Usually, in mathematics, the two views are assumed to be equivalent, and a predicate is taken as coinciding with the set of entities that satisfy it. This view is however too simplistic, since in Tarskian semantics set membership is taken as a basis to decide the truth value of property instantiation, so the former notion is independent from the latter. The existence of a mapping between the two relations does not justify their identification: one thing is a set, another thing is a property common to the elements of a set.” (Gangemi, Guarino, Masolo & Oltramari 2001, p3)

subsection.

2 Set theory and mereology are the most basic modelling tools that ontologists have at their disposal. However, other formal devices are available to supplement these, including topology, category theory, and formal algebra (Smith forthcoming).
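The distinction drawn in the quotation above can be made concrete with a small sketch. It is only an illustration under our own assumptions: the toy domain, the attribute names and the second property (`is_featherless_biped`) are hypothetical and do not come from Gangemi et al. The point shown is that two properties may share one extension without thereby being the same property.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Individual:
    """A toy individual with a few attributes (all hypothetical)."""
    name: str
    species: str
    legs: int
    feathered: bool


SOCRATES = Individual("Socrates", "human", 2, False)
PLATO = Individual("Plato", "human", 2, False)
FIDO = Individual("Fido", "dog", 4, False)
DOMAIN: FrozenSet[Individual] = frozenset({SOCRATES, PLATO, FIDO})


# Two distinct properties, each stated intensionally as a predicate.
def is_man(x: Individual) -> bool:
    return x.species == "human"


def is_featherless_biped(x: Individual) -> bool:
    return x.legs == 2 and not x.feathered


def extension(prop: Callable[[Individual], bool]) -> FrozenSet[Individual]:
    """The set of individuals in DOMAIN that satisfy the property."""
    return frozenset(x for x in DOMAIN if prop(x))


# Interpretation 1: Socrates belongs to a set.
socrates_is_member = SOCRATES in extension(is_man)
# Interpretation 2: Socrates exhibits a property.
socrates_has_property = is_man(SOCRATES)

# In this domain the two properties pick out the same set ...
same_extension = extension(is_man) == extension(is_featherless_biped)
# ... yet they remain two different properties.
same_property = is_man is is_featherless_biped
```

In this toy domain both interpretations of “Socrates is a man” come out true, and `same_extension` holds while `same_property` does not, which is one way of reading the claim that a mapping between the two relations does not justify identifying them.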

Arguments for using set theory include the fact that it has well-understood mathematical properties and lends itself to modelling both closed worlds, i.e., when basic urelements can be identified (Smith 1998), and infinite domains. Also, set theory allows ontologies to be easily partitioned into particular sub-ontologies (simply by using the notion of subset). Depending on the task, reasoning can then be limited to a particular subset of a knowledge base, resulting in increased efficiency. Since sets may be defined extensionally, ontologies based on set theory can include arbitrary folk or scientific taxonomies on top of the basic backbone taxonomy. This possibility follows from that of being able to construct higher order sets, i.e., sets of sets. This issue will be taken up in some detail in Section 6.

The use of set theory as the basic modelling tool is rejected, in particular, by realists (e.g., Ingarden, Chisholm, Smith), who consider the use of set theory (an abstraction) to be simply wrong given that ontology is a description of reality (not an abstraction) (Smith 1998). One line of argumentation is that realists are concerned with ontology at the mesoscopic level, the level at which humans experience the world. At this level, there is no clear starting point for defining the urelements that are necessary as the basis of a set-theoretic approach (Gangemi, Guarino, Masolo & Oltramari 2001, Smith 1998).

“The application of set theory to a subject-matter presupposes the isolation of some basic level of urelements in such a way as to make possible a simulation of all structures appearing on higher levels by means of sets of successively higher types.” (Smith 1998)

Furthermore, purely set-theoretical accounts ignore the relations within their individual entities, or urelements, that are not fundamentally set-theoretic in nature (Degen et al. 2001). Relations between urelements as such are the major focus of ontologies relying on set theory as the basic modelling tool.

In contrast to set theory, mereology concerns the relationship between parts and wholes; that is, mereology seeks to identify the parts that particular entities may possess, and how those parts are related to one another (Smith 1997). It does not have the notion of a distinguished ‘element’ necessary to set theory. An ontology that uses mereology as its primary modelling device takes as primitive the basic mereological relation part. This relation is used to derive other basic axioms of mereology which hold universally over all kinds of entities. The relation proper-part, for example, is used to describe the relation between a chair and one of its legs, but also the relation between Fred’s running of the Boston marathon and one of his individual strides. Mereology, however, is not used to describe the relation between Fred’s stride and his leg. In the history of ontology, it was notably Husserl who first used a kind of mereology as a starting point for his investigations of ontology (Smith 1997).

One claimed advantage of using mereology instead of set theory is that a domain can be modelled without knowing exactly what the atoms are or even without needing to assume that there are any atoms at all (Smith forthcoming, Gangemi, Guarino, Masolo & Oltramari 2001). Another compelling reason to use mereology, at least for ontologies of reality, is that the mereological approach avoids having to discuss the intended meaning of the ontology in terms of set-theoretic semantics; Smith (1995), for example, provides a critique of set-theoretic semantics in the context of formal ontology.

One disadvantage of mereologically-based accounts of ontology, when considering the practical implementation and use of such ontologies for computational modelling, is, however, the expressiveness of the formal language used to define them. These accounts tend to require at least first-order logic. Finding a bridge between the standard definitions used for ontologies, which draw on expressive formalisms, and the tractable representations to be employed in our computational components is itself a major research task of the SFB. (This issue is taken up in more detail in Section 2.3.)

Mereology is, moreover, very rarely considered sufficient in its own right. Especially in the area of concern to the SFB, spatial representation, it is clear that the parthood relationship is not capable of capturing some of the most significant aspects of spatial configurations. For this reason, mereology is commonly extended with topological notions, giving mereotopological accounts (Smith 1996). This extension can, however, be done in a number of ways (cf., e.g., Casati & Varzi 1999, Heller & Herre 2003); we return to this aspect of ontology design in our deliverables concerning spatial ontologies.
Mereology is weaker in terms of its mathematical modelling capabilities when compared to set theory (Smith forthcoming). However, Lewis (1991) has shown that set theory can be derived from mereology with the addition of the singleton operator (see Smith & Brogaard (2002) for a discussion). Atomic entities can then be defined in a mereology-based ontology as individuals that have no proper parts, and then function analogously to the urelements of a set-theoretically-based ontology. Atoms do not necessarily have to be independent entities, i.e., entities existing by virtue of their own existence. They can also be dependent entities, such as Color, which does not exist without some entity that has color as a property.

Detailed introductions to mereology, its history and variants are given in, for example, Simons (1987) and Casati & Varzi (1999). We also provide in Appendix I a condensed introduction and point out some of the problems that arise when differing notions of ‘part of’ are combined.
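The notions of part, proper part and atom used in this subsection can be sketched over a small finite domain. This is only a toy under stated assumptions: the facts about a chair are hypothetical, and parthood is taken to be the reflexive and transitive closure of some directly asserted part facts, as in standard presentations of basic mereology. It is not the axiomatization of any particular ontology discussed in this report.

```python
from itertools import product

# Hypothetical, directly asserted (part, whole) facts.
DIRECT_PARTS = {("leg", "chair"), ("seat", "chair"), ("screw", "leg")}


def parthood(direct):
    """Return the entities and the reflexive, transitive closure of `direct`."""
    entities = {e for pair in direct for e in pair}
    part = {(e, e) for e in entities} | set(direct)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that the set can grow inside the loop.
        for (a, b), (c, d) in product(list(part), repeat=2):
            if b == c and (a, d) not in part:
                part.add((a, d))
                changed = True
    return entities, part


ENTITIES, PART = parthood(DIRECT_PARTS)


def is_part(x, y):
    return (x, y) in PART


def is_proper_part(x, y):
    """x is part of y and distinct from y."""
    return x != y and is_part(x, y)


def overlaps(x, y):
    """x and y share at least one part."""
    return any(is_part(z, x) and is_part(z, y) for z in ENTITIES)


def is_atom(x):
    """An atom is an entity with no proper parts."""
    return not any(is_proper_part(z, x) for z in ENTITIES)
```

Here `is_proper_part("screw", "chair")` holds by transitivity, and `is_atom("screw")` holds because nothing is recorded as a proper part of the screw. Note that atomhood in this sketch is relative to what has been asserted: nothing forces the domain to bottom out in atoms, which mirrors the claimed advantage that a domain can be modelled without fixing its atoms in advance.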

2.2.3 3D and 4D views of reality

Another basic parameter that affects the nature of an upper ontology is whether the world is viewed from the 3D (endurantist) or the 4D (perdurantist) perspective. (See Loux (1998) for an introduction to the issue.) In a 3D-ontology, time and space are treated completely separately. The objects or events defined in the ontology have no necessary time component and therefore have to be indexed separately to, e.g., time points or intervals, if they are to be linked to time. This is one natural way of implying that the same object can exist over time, i.e., it endures: the same object may be identified at time point t1 as well as at time point t2. The temporal entities required are defined in the ontology additionally. Spatial entities then have other spatial entities as parts, whereas temporal entities have other temporal entities as parts.

In contrast, the world in a 4D-ontology is viewed as one unified “time worm”. That is, entities are naturally viewed as continually changing since they necessarily include an extension in time as well as an extension in space: i.e., “time is just another dimension, in addition to and analogous to the three spatial dimensions” (Bittner & Smith forthcoming, p8). Objects are said to have various stages or phases of existence throughout time, that is, they unfold through time and never exist in full at any one moment in time (Bittner & Smith forthcoming, Heller 1991).

Figure 2: The 4D representation of a car gaining and losing a wheel (West, 2002a)

A good illustration of the feel of a 4D, perdurantist, account is shown in Figure 2, taken from West (2002a). The graph shows the three spatial dimensions collapsed to a single vertical axis and the time dimension running along the horizontal axis. An area on the graph is therefore a fragment of 4D space-time. The area Car1 represents a ‘car’ existing in time and taking up some space. The lower area Wheel1 represents a ‘wheel’, similarly existing in time and space. Both entities are necessarily seen as space-time worms. The fact that on the 1st January 2001 the car in question received the wheel in question and then kept that wheel until 5/4/2001 is shown by the shaded areas of the graph. S1 is then the temporal part of the car that corresponds to the car when it has the wheel Wheel1, while S2 is the temporal part of the wheel corresponding to that wheel while it was part of the car.
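The configuration of Figure 2 can be sketched as follows. The sketch rests on several assumptions of ours that are not in West (2002a): each space-time worm is approximated as a finite set of (component, time step) cells, the time steps are arbitrary integers rather than the dates in the figure, and parthood between worms is approximated by the subset relation on cells.

```python
def temporal_part(worm, start, end):
    """The slice of a worm lying between two time steps (inclusive)."""
    return frozenset((c, t) for (c, t) in worm if start <= t <= end)


# Hypothetical timeline: the wheel exists at steps 0..9, the car at
# steps 1..8, and the wheel is mounted on the car during steps 3..6.
MOUNTED_FROM, MOUNTED_TO = 3, 6

WHEEL1 = frozenset(("wheel", t) for t in range(0, 10))
CAR1 = frozenset(
    {("chassis", t) for t in range(1, 9)}
    | {("wheel", t) for t in range(MOUNTED_FROM, MOUNTED_TO + 1)}
)

# S1: the car while it has the wheel; S2: the wheel while it is on the car.
S1 = temporal_part(CAR1, MOUNTED_FROM, MOUNTED_TO)
S2 = temporal_part(WHEEL1, MOUNTED_FROM, MOUNTED_TO)

# `<` on frozensets is the proper-subset test, used here as a stand-in
# for proper parthood between space-time regions.
s1_in_car = S1 < CAR1
s2_in_wheel = S2 < WHEEL1
s2_in_s1 = S2 < S1
wheel_in_car = WHEEL1 <= CAR1
```

Under these assumptions S1 is a proper part of Car1, S2 is a proper part of both Wheel1 and S1, and the wheel as a whole is not part of the car, since it also exists outside the interval during which it is mounted.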
The ontology we mentioned at the outset above, the AI ontology of Russell & Norvig (2003, p321), is one of the more commonly cited 4D ontologies. In the upper ontology given in Figure 1 above, the 4D aspect of the ontology is captured by the notion of a ‘generalized event’. A generalized event is some chunk of multidimensional spacetime.

Returning to Figure 2, some important regularities can be extracted from this representation building just on the basic notion of parthood, i.e., of mereology, as introduced above. First, the 4D entity S1 is a proper part of the entity Car1. Second, the 4D entity S2 is a proper part of the entity Wheel1. That is, part of the life of the car is the portion of its life (S1) that it spent using that wheel; similarly for the wheel. Moreover, the 4D entity S2 is a proper part of the 4D entity S1: that is, the life of the car and the wheel overlap for the interval indicated and during that interval the wheel is a proper part of the car, both spatially and temporally.

4D-ontologies are typically pursued most fervently by those who need representations of activities, processes and change. Complex issues arise, however, particularly with respect to the identity of entities. Individual time-space worms need to be labelled and fixed so that we can keep track of them when they combine, as was the case with the car and the wheel. In addition, there are further questions concerning whether it is always a time-space worm that is involved in some statement or not: for the perdurantist, there is no choice. For example, West (2002b) describes the relationship of marriage as an ‘unchanging property of related states’ that relates two 4D entities: portions of the lives of the entities who are married. Within a 4D ontology then, the statement:

married(JohnA, JaneA)

is necessary, where “JohnA” and “JaneA” are temporal proper parts of the 4D entities that are John’s and Jane’s lives in toto (analogous to S1 and S2 in the car and wheel example). This can be subjected to some criticism: i.e., is it really just portions of these persons’ lives that are related in the marriage relationship or something more? Is it perhaps a relationship between the persons as such, as legal entities perhaps (one that happens to hold as long as they are married)? Some of these arguments revolve around the linguistic awkwardness of discussing a marriage between portions of lives rather than between people; others attempt to get a bit more deeply at the intuitions underlying that feeling of awkwardness.

Further difficulties arise when we address distinctions concerning the material out of which something is made and that which is made out of that material: for example, a lump of metal and a statue made out of that metal. Here we have, arguably, the same 4D worm, but different identity conditions. The statue can be destroyed far more easily than the metal can be destroyed. This leads to areas of discussion that are not specifically the problem of 4D ontologies, however. We will take up these issues again below, with respect to some of the particular positions that have been taken within the ontologies and the ontological positions that we discuss. We then return to the 3D/4D distinction in our Deliverable D2 where it is also relevant, naturally, for the approaches to space and the location of objects and events in space over time.

2.2.4 Granularity and Scale

A further important dimension of ontology design at the metalevel that we will mention here is that of granularity. Granularity refers to the level of detail an ontology addresses. An ontology for geography would likely provide a much higher level of granularity for the features of the earth than would an anthropological ontology concerned with the basic objects of human material culture. It is the job of the ontologist to develop basic ontological tools that can be applied to any domain. It is then the job of the scientist to extend the basic ontology to a high degree of granularity in a particular field.

Granularity is a major issue in ontology. One approach is to partition an ontology into separate domains of varying degrees of granularity. This approach, based on set theory, is followed by OpenCyc and will be discussed in Section 6.2. However, Smith & Brogaard (2002) and Bittner & Smith (2001) have developed a theory of granular partitions by appealing to mereological principles, discussed in Section 8.

The notion of granularity should not be confused with another important dimension of ontology design, that of scale. Scale pertains to the relative size of the objects being modelled. An ontology may describe reality at the microscopic, mesoscopic, or geographic level (Grenon & Smith 2004). For example, an ontology for quantum physics is at a microscopic level suitable for describing the movements of electrons and molecules, but wholly inappropriate for the mesoscopic human level where the movements of vehicles and humans are important, or for the geographic level where the salient objects of inquiry are mountains, rivers, and oceans. Of course scale as such is not limited to the three levels mentioned above. Most current ontological work concerns itself with a range of scales concerned with the everyday ‘world’ of human action and interaction. This is motivated in various ways: e.g.,

“There is physical structure on the scale of millimicrons at one extreme and on the scale of light years at another. But surely the appropriate scale for animals is the intermediate one of millimeters to kilometers, and it is appropriate because the world and the animal are then comparable.” (Gibson 1966, p21f)

Both notions, granularity and scale, are in theory independent of one another, as one can have a low granularity description of a high-scale object. Conversely, galaxies and all the particular sub-classes of galaxies can be discriminated as finely as some astrophysicist wants, while only one level of scale is relevant.

A further, rather different classification of granularity as such is proposed by Sowa (2000, p122). Sowa divides granularity according to three distinct sources: actuality, epistemic and intentional. Actuality is the granularity imposed by the ‘world itself’, the granularity of atoms and other entities revealed by nuclear physics. Epistemic granularity is that imposed by the extent of our knowledge as observers; such observation is always subject to limitations, either of the senses or of observational equipment. These can always be in error or introduce inaccuracies or biases. And intentional granularity arises due to us as agents focusing on particular aspects of the world rather than others for particular purposes. Thus:

“A mining engineer . . . may treat a slurry of coal in water as continuous when it is being transported through a chute, but lumpy when the coal is being separated from the water. The engineer, who sees that the slurry is epistemically lumpy, may treat it as intentionally continuous, even though at the atomic level both the water and the coal are actually discrete.” (Sowa 2000, p122)

This places granularity very much as an observer’s phenomenon and there is an open question as to what this then has to do with ontology, or at least with Ontology. As we shall see below, differing positions have been taken on this issue as well; particularly in Smith’s Basic Formal Ontology (BFO: Section 8), we see a claim very much at odds with Sowa’s proposal, one which does not assign ontological reality to the atomic level and the rest to observational inadequacy or focused attention.

2.3 Representations

A matter of immediate practical importance in ontology building is the formal language in which the ontology is expressed, as the computational properties of the formalism will affect the usability of all the subsequent modelling decisions. This issue is especially crucial with respect to the goals of the SFB. In order to capture the complexity of the modelling pursued in individual projects of the SFB, the formalism(s) used to model the shared ontology may need to be highly expressive. Moreover, pure philosophical inquiry into Ontology usually opts for some kind of first-order language with which to formalize its discussion, placing issues of implementation or computational use in the background. It is commonly stated that any formalism that is less expressive would not be appropriate for the kinds of statements that need to be made when constructing an ontology.

An ontology is then represented in some logical formalism, called a modelling language. A modelling language is used to represent the elements in the intended domain of discourse. Depending on the approach, there may be another language, called an ontology meta-language, used to describe the modelling language itself. The constants of the meta-language are used as predicates in the modelling language. The particular view of the world expressed by the ontology does not (or should not) depend on the modelling language (Guarino 1998), although the expressivity of the modelling language may limit what aspects of the ontology are representable. For example, consider the case of an ontology that includes ternary relations. This ontology cannot be modelled directly using a language (e.g., a description logic) containing only binary relations and so some kind of transformation between the two forms is necessary.

Two classes of logic will be considered here for use as a modelling language: the classes of first-order languages and of description logics. Moving between these, i.e., from a more expressive modelling language to a less expressive formalism, requires further compromises and omissions of various kinds. A smooth migration path from more expressive formalisms to more restricted formalisms, as well as the possible application of more powerful reasoning tools, is an active area of concern of the SFB Project I4-[SPIN]. Thus, in the current document, we will only provide a description of those parts of the available formalisms that relate to the construction of ontologies and their use as modelling languages, leaving the full treatment of this issue to later work with I4-[SPIN].

To begin, one first-order language common in ontology work is the Knowledge Interchange Format (KIF: Geneserith & Fikes 1992). KIF is essentially a prefix (Lisp-like) version of first-order logic, i.e., (predicate argument argument ...). As its name implies, KIF is intended for the exchange of knowledge between disparate computer systems. It was not intended for use within a single computer system, although it may be employed for that purpose (Geneserith & Fikes 1992). KIF also serves as a kind of interlingua between various knowledge representation systems, e.g., between Classic and Loom (Common Logic Working Group 2003). KIF consists of a machine-readable syntax which can express first-order logic, plus a declarative semantics. The following shows an example of a possible expression in KIF:

(defrelation character (?x) :=
  (exists ((?n natural-number))
    (and (>= ?n 0)
         (< ?n 128)
         (= (code-char ?n) ?x))))
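Read informally, the KIF definition above says that something is a character exactly when some natural number n with 0 ≤ n < 128 has it as its character code value. A minimal procedural paraphrase is sketched below; it assumes that Python's built-in `chr` can stand in for KIF's `code-char`, which is our assumption and not part of the KIF specification.

```python
def is_character(x) -> bool:
    """True if there exists n with 0 <= n < 128 such that chr(n) == x.

    `range(128)` plays the role of the bounded existential quantifier
    over natural numbers; `chr` is a stand-in for KIF's `code-char`.
    """
    return any(chr(n) == x for n in range(128))
```

The KIF expression is declarative: it states a condition without saying how to test it. The Python function is one possible procedure for testing that condition over a finite range.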

For its semantic interpretation, elements of KIF are interpreted according to some universe of discourse; the semantics of KIF is in general quite standard when compared to the semantics of other first-order logics (Geneserith & Fikes 1992). Although users may define arbitrary universes of discourse, KIF was not intended at the outset as a language for ontology building. Except for some required semantic entities, it is almost as ontologically neutral as standard first-order formalisms (see Degen et al. (2001) for a discussion). The required entities, which are inherited into any application using KIF by definition, act like basic datatypes for ensuing KIF expressions and include:

• words: words are themselves objects in the universe of discourse, along with:
• the things that they represent,
• all sets of objects in the universe of discourse,
• complex numbers,
• finite lists of objects in the universe of discourse,
• bottom: a distinguished object that occurs as the value of various functions when applied to arguments for which the function makes no sense.

(Geneserith & Fikes 1992)

KIF also has certain quasi-ontological notions built in. For example, KIF makes a fundamental distinction between individuals and sets, which is also a basic axiom found in some ontologies (Cycorp 2004d, Degen et al. 2001). Many ontologies have now been constructed in KIF; moreover, the exchange and re-use methodology of Ontolingua (Farquhar, Fikes & Rice 1996), of which we will say more in our deliverable D4, is based on KIF and provides a library of ontology modules. As a consequence, KIF is now considered by many as a de facto standard for ontology specification.

There are, however, several further competing first-order formalisms used for ontology construction. CycL is one such language and is the language used to represent the OpenCyc ontology; we present a description of this variant in Section 6. Also, SUO-KIF (IEEE 2003), a subset of KIF with a few additions for ontology design, is used by the developers of the SUMO ontology (Niles & Pease 2001b); we discuss this ontology in Section 4. There are, therefore, several first-order languages which, although structurally identical to KIF, are used by different communities. An alternative to this proliferation of KIF-like languages, and hence the need for translations between them, has been proposed in the Common Logic (CL) standard (Common Logic Working Group 2003). CL is not currently a fully specified language, like KIF or CycL, but instead consists of an abstract syntax that may have many different instantiations depending on purpose. That is, KIF itself, CycL, SUO-KIF, etc., can be considered instantiations of CL (Common Logic Working Group 2003). We show here one example of an expression in the abstract syntax, followed by an equivalent expression in an instantiated KIF-like language:

CL: UnivQuant(v1, Cond(App(Boy,v1), ExQuant(v2, Conj(App(Girl,v2), App(Kissed,v1,v2)))))

KIF-like: (forall (?x) (=> (Boy ?x) (exists (?y) (and (Girl ?y) (Kissed ?x ?y)))))
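The idea of one abstract syntax with several concrete instantiations can be sketched as a small syntax tree with two renderers. The class names below simply mirror the constructors in the example above. The renderers `to_kif` and `to_infix` are hypothetical helpers of ours and are not taken from the Common Logic specification, and the sketch assumes that every argument of an application is a variable.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class App:
    predicate: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Conj:
    left: object
    right: object


@dataclass(frozen=True)
class Cond:
    antecedent: object
    consequent: object


@dataclass(frozen=True)
class UnivQuant:
    var: str
    body: object


@dataclass(frozen=True)
class ExQuant:
    var: str
    body: object


def to_kif(f) -> str:
    """Render a formula in a KIF-like prefix syntax."""
    if isinstance(f, App):
        return "(" + " ".join([f.predicate] + ["?" + a for a in f.args]) + ")"
    if isinstance(f, Conj):
        return f"(and {to_kif(f.left)} {to_kif(f.right)})"
    if isinstance(f, Cond):
        return f"(=> {to_kif(f.antecedent)} {to_kif(f.consequent)})"
    if isinstance(f, UnivQuant):
        return f"(forall (?{f.var}) {to_kif(f.body)})"
    if isinstance(f, ExQuant):
        return f"(exists (?{f.var}) {to_kif(f.body)})"
    raise TypeError(f"not a formula: {f!r}")


def to_infix(f) -> str:
    """Render the same formula in a conventional infix notation."""
    if isinstance(f, App):
        return f"{f.predicate}({', '.join(f.args)})"
    if isinstance(f, Conj):
        return f"({to_infix(f.left)} ∧ {to_infix(f.right)})"
    if isinstance(f, Cond):
        return f"({to_infix(f.antecedent)} → {to_infix(f.consequent)})"
    if isinstance(f, UnivQuant):
        return f"∀{f.var} {to_infix(f.body)}"
    if isinstance(f, ExQuant):
        return f"∃{f.var} {to_infix(f.body)}"
    raise TypeError(f"not a formula: {f!r}")


EXAMPLE = UnivQuant("v1", Cond(
    App("Boy", ("v1",)),
    ExQuant("v2", Conj(App("Girl", ("v2",)), App("Kissed", ("v1", "v2")))),
))
```

Both renderers walk the same tree, so any semantics defined on the tree is shared by every concrete notation produced from it; this is the sense in which each instantiation inherits the semantics of the abstract syntax.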

The primary advantage of such an approach is that a standard first-order semantics can be applied to the abstract syntax, and every instantiation of the syntax inherits that semantics. The CL specification also contains a proposal for an XML syntax instantiation (Common Logic Working Group 2003).

CL is still in its early stages of development, however, and is therefore not immediately usable for ontology construction within the SFB. Moreover, an additional consideration arising out of the formalization plans with Project I4-[SPIN] is the lack of explicit typing within CL; this differs from the approach to specification taken with the first-order language CASL (cf. Astesiano, Bidoit, Krieg-Brückner, Kirchner, Mosses, Sannella & Tarlecki 2002), which is proposed for ontology construction within the SFB as part of the research program within I4-[SPIN]. Given the already highly developed nature of CASL in terms of specification and formal tools, it is at present unlikely that the Common Logic initiative will become a serious contender within the current phase of the SFB.

Although some ontology specifications required for the SFB may require a first-order language, this should not be an automatic and necessary overhead for all the ontology work and its computational instantiations in the SFB. The ontologies themselves should not be a computational bottleneck that impedes basic research and implementation within the SFB. They must instead act to facilitate research, particularly with respect to the sharing of knowledge between disparate spatial and linguistic components. For this purpose, it is also useful to consider the use of less expressive formalisms as modelling languages. This approach has a long tradition in computational ontological engineering where the formalism of choice is description logic (DL: Baader, Calvanese, McGuinness, Nardi & Patel-Schneider 2003). More recently, DLs have also come into focus in the knowledge engineering and ontology literature due to the rise in popularity of object-oriented design and proposals for the intended functionality of the Semantic Web (Burners-Lee et al. 2001).

A description logic is a much less rich, but highly structured subset of first-order logic that buys improved computational complexity properties at the cost of expressivity. Statements within DLs are expressed at the level of predicates, that is, there are no variables. Thus, the following example definition in a DL:

Mother ≡ Woman ⊓ ∃hasChild.Person

can be glossed as: “The class Mother is defined as the intersection of the class of Woman and the class having at least one hasChild role whose value is restricted to the class Person”. Statements in a description logic are therefore formulae with one free variable, which is then omitted. The basic description logic, called ALC, allows statements to be made with the following constructions:

DL construct     description
concept names    unary predicates
role names       binary predicates
¬C               negation
C ⊓ D            conjunction
C ⊔ D            disjunction
∃R.C             existence of a role R value-restricted to be filled by concepts of type C
∀R.C             all roles of type R value-restricted to be filled by concepts of type C
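The constructs in the table can be given a concrete reading by evaluating them over a small finite interpretation. The following sketch assumes the standard set-based semantics of ALC (concepts denote subsets of a domain, roles denote sets of pairs); the individuals and the class and method names are hypothetical and serve only as illustration. It is a model checker for one fixed interpretation, not a DL reasoner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    c: object


@dataclass(frozen=True)
class And:
    c: object
    d: object


@dataclass(frozen=True)
class Or:
    c: object
    d: object


@dataclass(frozen=True)
class Exists:
    role: str
    c: object


@dataclass(frozen=True)
class Forall:
    role: str
    c: object


class Interpretation:
    """A finite domain with extensions for concept names and role names."""

    def __init__(self, domain, concepts, roles):
        self.domain = frozenset(domain)
        self.concepts = concepts  # concept name -> set of individuals
        self.roles = roles        # role name -> set of (subject, filler)

    def ext(self, concept):
        """Return the set of individuals that satisfy `concept`."""
        if isinstance(concept, Atom):
            return frozenset(self.concepts.get(concept.name, ()))
        if isinstance(concept, Not):
            return self.domain - self.ext(concept.c)
        if isinstance(concept, And):
            return self.ext(concept.c) & self.ext(concept.d)
        if isinstance(concept, Or):
            return self.ext(concept.c) | self.ext(concept.d)
        if isinstance(concept, (Exists, Forall)):
            fillers = self.ext(concept.c)
            pairs = self.roles.get(concept.role, set())
            # Exists: some role filler is in C; Forall: every filler is.
            quantifier = any if isinstance(concept, Exists) else all
            return frozenset(
                x for x in self.domain
                if quantifier(y in fillers for (s, y) in pairs if s == x)
            )
        raise TypeError(f"not a concept: {concept!r}")


I = Interpretation(
    domain={"anna", "ben", "carla", "rex"},
    concepts={"Woman": {"anna", "carla"}, "Person": {"anna", "ben", "carla"}},
    roles={"hasChild": {("anna", "ben"), ("carla", "rex")}},
)

MOTHER = And(Atom("Woman"), Exists("hasChild", Atom("Person")))
```

In this interpretation `I.ext(MOTHER)` contains only `"anna"`: `"carla"` is a Woman with a hasChild filler, but that filler is not in Person. The individual `x` ranged over in `ext` is the single free variable that the variable-free DL notation leaves implicit. Note also that `Forall` holds vacuously of individuals with no fillers for the role at all.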

The terminology of ‘concepts’ and ‘roles’ shows the origin of description logic work in knowledge representation and the early so-called frame-based languages that were used for knowledge representation (Baader et al. 2003). In such languages, information is gathered together into ‘frames’ (concepts), each particular type of which admits of a specified set of possible attributes (roles). Within a description logic specification, classes and roles are separated from individuals by partitioning the knowledge base into a Tbox for classes and roles and an Abox for individuals. Concepts may then have particular ‘individuals’, or ‘instances’. This is precisely the kind of structuring typical of Guarino’s (1995) ‘epistemological level’ that we discussed in Section 1 above.

Description logics provide a formally well-defined and, now, well understood basis for the kinds of specifications and reasoning commonly required in basic knowledge representation tasks. Most of the descriptive relevance of these frameworks then comes from the basic axiom constructors that allow subsumption relationships to be built up between concepts (for ALC and above) and between roles (for the description logic SHIQ and above: see below). One benefit of using a DL is then that certain ontological machinery is already in place: the various structures (class, role, individual) and statements of their subsumption relationships are built into the language. These are exactly the kinds of constructions that ontologies frequently require for describing various domains of discourse. It is then very natural and straightforward to specify, for example, that a ball is a type of physical object, that entities have the subtypes physical object and abstract object, and that all physical objects may have a role ‘hasColor’ with value-restriction ‘Color’: i.e., the basic statements that are, as we shall see, typical of ontologies.

Although description logic is attractive in some respects, as we mentioned above, ontology builders often argue that it is not powerful enough and must of necessity lead to descriptions that, at best, distort the nature of the ontological statements that are being made, e.g.:

“Using such a language for specifying foundational ontologies would be nonsensical: because of their very goals and nature, these ontologies need an expressive language, in order to suitably characterize their intended models” (Masolo et al. 2003, p6).

Although most settings in which ontologies are to be applied, such as current visions of the Semantic Web (Burners-Lee et al. 2001), do not envision going beyond description logics,3 this need not apply to the SFB. As a general methodological strategy, therefore, we will always seek to make the level of expressivity required for an ontology module explicit so as to support transformations across formalisms and computational instantiations varying in their expressivity.

3 There are a few exceptions to this, e.g., the approach adopted in the ontology SUMO (cf. Section 4). The willingness to go beyond description logics is currently increasing, however, as reasoners for more expressive formalisms are improving rapidly in performance.

2.4 Method

The relationship between an ontology modelling language and a metalanguage can be made to do some useful work by setting out clear methodological criteria for how ontology construction may proceed in terms of properties that need to hold at the meta-level. A successful example of this can be found in the OntoClean methodology (cf., e.g., Guarino & Welty 2004, Guarino & Welty 2002), which uses the notion of a meta-language extensively in its definition of meta-properties. Meta-properties are properties of properties, not of objects in the world, and are used to constrain ontology development and to evaluate particular proposed ontological organizations. The meta-properties particularly important for OntoClean are: rigidity, identity, unity and dependence. Rigidity refers to essential properties, i.e., properties that an entity cannot lose without ceasing to be itself; identity refers to criteria for discriminating entities from each other or for recognizing when one has a particular kind of entity; unity refers to the ‘wholeness’ of an entity, whether it has parts, boundaries and so on; and dependence reflects whether an entity can exist independently or whether it needs to be ‘carried’ by another (e.g., the color of an object is dependent for its existence on that object, the hole of a doughnut is dependent for its existence on that doughnut, etc.).

Ensuring that these meta-properties collectively hold provides a methodology for constructing well-formed taxonomies; violations of the meta-properties are more or less strong indications that a poor modelling decision may have been made. That is, while the methodology does not determine what is the correct choice, it does make it clear what consequences follow from any particular modelling choice, thereby supporting consistent modelling decisions. Something of the implications of the methodology for ontology construction can then be seen in the following brief characterization:

“... if a property holds necessarily for all the instances of a certain concept, of course its negation cannot hold necessarily for all the instances of a subsumed concept. This means that, if F is a certain formal property, anti-F cannot subsume F: anti-rigidity cannot subsume rigidity, anti-unity cannot subsume unity, and anti-extensionality cannot subsume extensionality.” (Gangemi, Guarino & Oltramari 2001, p288)

Thus, the concepts ‘person’ and ‘student’ differ in that the former is rigid (i.e., every person is essentially a person in that they cannot stop being a person without ceasing to be that individual) and the latter is anti-rigid (i.e., every student can also possibly be a non-student without ceasing to be the individual they are). This allows us to see that subsuming ‘person’ under ‘student’ would be a poor ontological choice. Some combinations of meta-properties and their use to define types and roles are illustrated in Table 2. OntoClean has been used most extensively in the design of the DOLCE ontology discussed in Section 7 below, and has also been applied to re-organize somewhat less organized knowledge sources such as the psycholinguistically and lexically motivated WordNet (e.g., Gangemi, Guarino & Oltramari 2001). We will address its application to linguistically motivated ontologies in our deliverable I1-[OntoSpace]:D3. In the meantime, Table 3, taken from Guarino & Welty (2001), shows some subsumptions that look acceptable at first glance but in fact violate the OntoClean recommendations. A more careful consideration of these examples indeed reveals certain inconsistencies in the modelling.4

4 Of the ontologies mentioned in this table, most are ‘linguistic ontologies’. Pangloss (Hovy & Knight

Metaproperty
Rigidity   Identity   Notional dependence   Example
+          +          -                     type ‘person’
˜          inherits   +                     material role ‘student’
˜          supplies   +                     formal role ‘Part’

Key: +: metaproperty holds, ˜: anti-metaproperty holds, notional dependence: property entails a further property holding of a distinct entity (e.g., student→teacher)

Table 2: Some dependences between ontological meta-properties according to the OntoClean methodology

Problematic case                                                       Source
A physical object is an amount of matter                               Pangloss
An amount of matter is a physical object                               WordNet
An organization is a group                                             WordNet
An organization is both a social being and a group                     Cyc
A place is a physical object                                           Mikrokosmos, WordNet
A window is both an artifact and a place                               Mikrokosmos
A person is both a physical object and a living being                  Pangloss
An animal is both a solid tangible thing and a perceptual agent        Cyc
A car is both a solid tangible thing and a physical device             Cyc
A communicative event is a physical, a mental, and a social event      Mikrokosmos

Table 3: Examples of subsumption problems that violate the OntoClean methodology (Guarino and Welty, 2001, Table 3)

This is generally because the linguistic form ‘is a’ is being used in a variety of, not always mutually consistent, senses.
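The rigidity constraint quoted above (an anti-rigid property cannot subsume a rigid one) lends itself to a mechanical check. The following is only a minimal sketch in Python, not part of the OntoClean tooling: the tag strings, the function name `rigidity_violations` and the encoding of a taxonomy as (parent, child) pairs are all assumptions made for illustration.

```python
# Rigidity tags; the strings "+R" and "~R" are illustrative labels only.
RIGID, ANTI_RIGID = "+R", "~R"


def rigidity_violations(rigidity, subsumptions):
    """Return the (parent, child) pairs in which an anti-rigid parent
    subsumes a rigid child, i.e. the case the constraint rules out."""
    return [
        (parent, child)
        for parent, child in subsumptions
        if rigidity.get(parent) == ANTI_RIGID and rigidity.get(child) == RIGID
    ]


# Hypothetical tagging of the two concepts discussed in the text.
rigidity = {"person": RIGID, "student": ANTI_RIGID}

# 'person' subsumes 'student': no violation.
acceptable = [("person", "student")]
# 'student' subsumes 'person': the poor modelling choice.
problematic = [("student", "person")]

print(rigidity_violations(rigidity, acceptable))
print(rigidity_violations(rigidity, problematic))
```

A check of this kind only flags candidate problems; as noted above, the methodology makes the consequences of a modelling choice visible rather than dictating the choice itself.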

2.5 Computational instantiations of ontologies

We conclude our overview of ontology basics by discussing implementation issues for the representations introduced in Section 2.3. Like the representations themselves, implementations fall into two categories: first-order theorem provers and description logic classifiers. Although there seem to be no first-order theorem provers that take arbitrary KIF files

1993) and Mikrokosmos (Carlson & Nirenburg 1992, Nirenburg & Raskin 2001) are ontologies developed primarily for machine translation, WordNet (Miller 1990) for psychologically motivated lexical semantic research.

as direct input, there are a number of systems that support KIF-like languages. The Sigma Knowledge System5, for example, used by the SUMO community, is one such theorem prover and ontology browser which takes SUO-KIF files as input. Sigma uses the successful first-order theorem prover Vampire6 (Riazanov & Voronkov 2002). Also, the OpenCyc system includes a version of the Cyc Inference Engine and the Cyc Knowledge Base Browser7, which takes CycL files as its input. Experience with these approaches to reasoning with ontologies is still, however, somewhat limited outside of their particular (sometimes internal) user communities. Moreover, the increasing concern with interchange formats, re-usability and standards means that reasoners developed within other communities can now also be considered for ontology work; this is part of work begun in the SFB/TR8 project I4-[SPIN]. In contrast to these new developments, description logics have a long history of application for ontology or ontology-like representation and reasoning; this is due precisely to their origins in knowledge representation:

“Because Description Logics are a KR formalism, and since in KR one usually assumes that a KR system should always answer the queries of a user in reasonable time, the reasoning procedures DL researchers are interested in are decision procedures, i.e., unlike, e.g., first-order theorem provers, these procedures should always terminate, both for positive and for negative answers.” (Baader & Nutt 2003, p48)

Thus a particular benefit of using a DL is to be found in the relative efficiency of its reasoning. DLs facilitate certain kinds of reasoning required by knowledge systems, particularly determining the subsumption hierarchy of DL classes and whether or not individuals are instances of a particular class (Baader & Nutt 2003, p47–48).

The description logic known as ALC (Schmidt-Schauß & Smolka 1991) is the smallest propositionally closed DL and all more expressive description logics extend this in expressivity by adding constructions such as qualified number restrictions on roles, i.e., that the number of roles present for a concept can be greater or lesser than some specified number, inverse roles, transitivity of roles, and subsumption hierarchies over roles. The drive to increased expressivity must be combined in a trade-off against processability. Each increase in expressivity brings with it a potential cost in computational complexity; inopportune combinations of properties (e.g., number restrictions on transitive roles: Horrocks, Sattler & Tobies (1999)) can push the resulting logic over into undecidability. The development of description logics can thus be seen as going hand-in-hand with the development of ever more effective techniques for processing logics with increasing expressivity. A description logic is employed in order to support automated reasoning and, therefore, its actual computational behavior is extremely important.

5 Sigma is available under a GNU license at https://sourceforge.net/projects/sigmakee/.
6 Vampire is available at http://www.cs.man.ac.uk/~riazanoa/Vampire/.
7 This is available at http://www.opencyc.com.

The logics resulting from the kinds of extensions just mentioned are traditionally described using a naming scheme in which each extension adds its own distinctive label to the base name ALC. The expressivity hierarchy of the resulting logics can then be shown as illustrated in Figure 3.

SHOIN(D) = OWL-DL

ALCQHIR+(D) = SHIQ(D)        SHOQ(D)

ALCQHIR+ = SHIQ

ALCQI

ALCI, ALCN, ...

ALC

Key: I: inverses; N: number restrictions; Q: qualified restrictions; H: role hierarchies; R+: transitivity over roles; D: domains of specified datatypes.

Figure 3: Expressivity hierarchy for ALC classes of description logics

While earlier studies in description logics concerned themselves with computational complexity in the area of ALC and below (cf. Donini, Lenzerini, Nardi & Nutt 1995), the advent of effective optimization techniques for dealing with the logics of higher expressivity has moved attention towards the top of the hierarchy. Even though all of the logics with the expressivity of ALC and above are computationally intractable for satisfiability and subsumption, this only reflects ‘worst-case’ behavior. Sophisticated optimization techniques are now available which have already produced systems which boast practical tractability for the more expressive logics.

Since these more expressive logics regularly combine many features, making the ALC-naming convention increasingly unwieldy, Horrocks et al. (1999) introduce an alternative naming convention based on S (due to its close formal relationship to the propositional modal logic S4(m)). The logic S is ALC extended by transitivity over primitive roles (i.e., ALCR+) and is the basis of the SH-family of description logics. Several of these have played a fundamental role in shaping both OWL and its immediate forerunners.

The implementations for these expressive description logics include, for SHIQ and SHIQ(D), the knowledge representation systems RACER8 (Haarslev & Möller 2001) and FaCT9 (Fast Classification of Terminologies; Horrocks 1998). A list of available reasoners is maintained at http://www.cs.man.ac.uk/~sattler/reasoners.html.
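The relation between the two naming conventions (ALC plus feature labels, and the S-based shorthand) can be made concrete with a small sketch. The Python below is purely illustrative: the function name `expand` and the set-of-labels encoding are our own assumptions, and the label O for specified individuals is read off the names SHOQ(D) and SHOIN(D) rather than taken from the key to Figure 3.

```python
# Single-letter feature labels; I, N, Q, H follow the key to Figure 3.
# "O" (individuals) is an assumption read off the names SHOQ(D)/SHOIN(D).
KNOWN_LETTERS = {"I", "N", "Q", "H", "O"}


def expand(name):
    """Expand a logic name such as 'SHIQ(D)' or 'ALCQHIR+' into the set
    of feature labels it adds to the base logic ALC."""
    features = set()
    if name.startswith("ALC"):
        rest = name[3:]
    elif name.startswith("S"):
        features.add("R+")  # S is ALC extended by transitive roles
        rest = name[1:]
    else:
        raise ValueError(f"unrecognised base in {name!r}")
    if rest.endswith("(D)"):
        features.add("D")  # datatypes
        rest = rest[:-3]
    if "R+" in rest:
        features.add("R+")
        rest = rest.replace("R+", "")
    for letter in rest:
        if letter not in KNOWN_LETTERS:
            raise ValueError(f"unknown feature label {letter!r} in {name!r}")
        features.add(letter)
    return frozenset(features)


# The equivalences shown in Figure 3 and stated in the text.
print(expand("SHIQ") == expand("ALCQHIR+"))
print(expand("SHIQ(D)") == expand("ALCQHIR+(D)"))
print(expand("S") == expand("ALCR+"))
```

Comparing label sets in this way only mirrors the naming scheme; it says nothing about the actual complexity or decidability of the logics named.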

8 Racer is available free for research purposes via an educational license by following the instructions at: http://www.racer-systems.com/products/download/education.phtml. Commercial use is provided by Racer Systems.
9 Various versions of FaCT are available at http://www.cs.man.ac.uk/~horrocks/FaCT/.

The language proposed for ontologies for the Semantic Web, OWL, also includes a description logic variant, OWL-DL, that is slightly more expressive than SHIQ in that it includes a constructor for building concepts out of finite sets of specified individuals. It also assumes a fixed set of datatypes inherited from XML schemas and RDF (Smith, Welty & McGuinness 2004). It is probable that efficient systems for processing this extended logic, SHOIN(D), will also become available in the near future, probably as extensions of the existing tools; for example, extensions for FaCT have already been undertaken in this direction via the logic SHOQ(D) (Horrocks & Sattler 2001), which omits inverses (for complexity reasons) but includes individuals. Moreover, the more restricted version of OWL, OWL-Lite, also corresponds directly to a description logic, SHIF(D), and is already supported.10 A detailed overview of these developments and their interrelationships is given by Horrocks, Patel-Schneider & van Harmelen (2003).

In addition, due to the recent interest in description logic formalisms triggered by their intended application within the Semantic Web, there are now an increasing number of authoring and other related utilities available. One very popular ontology editor and visualization tool is Protege11 (Noy, Sintek, Decker, Crubezy, Fergerson & Musen 2001). Protege is particularly useful as it has the ability to create and store ontologies in various formats, including OWL, DAML+OIL, UML, and other XML-based formats. Although not initially intended as a classifier front-end, Protege can utilize the RACER system to some degree. The exact relationship is still under exploration within I1-[OntoSpace]. Another popular editing tool is OilEd12 (Bechhofer, Horrocks, Goble & Stevens 2001). OilEd comes packaged, at the moment, with the FaCT reasoner; OilEd uses the DIG protocol (Bechhofer 2003) to access FaCT, Racer, and other reasoning systems.
The DIG protocol enables reasoners to handle multiple client connections and is specified in XML.

As a starting point for ontology definitions within I1-[OntoSpace], we originally decided to use RACER as a reasoning engine with specifications written in OWL-DL (naturally restricted to the expressivity of SHIQ for the time being) using the Protege ontology tool. We are now also exploring a variety of reasoners as their availability grows as a response to providing semantic web compatible reasoning capabilities. The use of DL-centered representations is being explored primarily for the baseline linguistic ontologies as we do not yet have conclusive evidence in this area that a first-order language is required; we return to this more concretely in our deliverable on linguistic ontologies. Explorations of more expressive languages will then be made in cooperation with I4-[SPIN] and will be based on CASL; first descriptions of this work involving the use of CASL for ontology specification have now appeared as Lüttich & Mossakowski (2004).

10 Progress here is rapid and presents a moving target for a document such as this: at the date of going to press, both FaCT++ and Pellet (http://www.mindswap.org/2003/pellet/index.shtml) support the logic SHOIQ (Horrocks & Sattler 2005), i.e., the expressivity of OWL-DL plus qualified cardinality restrictions in DL terminology. In our own work, therefore, we are orienting to description logics of various specified degrees of expressivity and are relying on the existence of standard interfaces, such as the OWL-API or the DIG-API, to maintain maximal modularity.
11 The latest version of the Protege tool is maintained at http://protege.stanford.edu/.
12 OilEd is freely available from http://oiled.man.ac.uk/download.shtml.

Figure 4: Sowa’s upper level ontology lattice

3 Sowa’s Ontology

We begin the discussion of individual ontologies with one that is described at length in John Sowa’s popular textbook, Knowledge Representation (Sowa 2000). Inspired by philosophers ranging from Aristotle to Peirce and Whitehead, Sowa created this ontology to incorporate some of the major ideas from Ontology as a philosophical discipline and to merge these ideas into a computational, AI-oriented artifact. Specifically, Sowa describes the ontology using conceptual graphs (Sowa 1983). The upper ontology along with some important content issues and the conceptual graph formalism will be covered in this section.

3.1 Upper ontology and basics of Sowa’s ontology

Sowa’s upper level, shown in Figure 4, is in the form of a lattice, which is a mathematical structure consisting of a set of concepts and a partial-ordering relation, in this case the subsumption relation. The top category ⊤ is often referred to as the ‘universal type’. The lattice is bounded below by ⊥, or the ‘absurd type’. In other words, the top subsumes every category in the lattice, while the bottom is subsumed by every category in the lattice. In between are the categories of interest. The uppermost categories in Sowa’s ontology are the ‘primitives’, those subsumed directly by ⊤. These fall into three groups: the triad Independent, Relative, and Mediating; the dichotomy of Heraclitus, Physical and Abstract; and Whitehead’s distinction of Continuant and Occurrent (Sowa 2000, p67). The notion of the triad that has had such an influence on Sowa’s work draws philosophical

roots ranging from Aristotle and the Scholastics to Kant, Hegel, and Whitehead (Sowa 2000, p57). It was popularized by C. S. Peirce in the late 19th century as ‘firstness’, ‘secondness’, and ‘thirdness’. Sowa uses Kant’s terminology (independent, relative, mediating) for the triad in the upper level ontology. All things that are of type Independent are characterized by their intrinsic nature, independent of any external relationships. Examples include Human or Raining. No other concept is required for their definitions. The category Relative, on the other hand, subsumes all entities which must stand in some relation to another in order to exist, e.g., Pet, which requires the existence of an owner of type Human, or Mother, which requires the existence of an offspring. The third member of the triad is Mediating, which subsumes those categories that bring Independent and Relative into some kind of relation, e.g., Marriage, which brings Wife and Husband into a relation, or LegalSystem, which brings Attorney and Client into a relation.

As for the dichotomy of Heraclitus, Sowa points out the Physical/Abstract distinction is much older (Sowa 2000, p67) and corresponds to the Greek “physis (nature) and logos (word, reason, or speech)” (Sowa 2000, p67). The category Physical subsumes all entities that exist in time and space, such as rocks, people, and other tangibles. The category Abstract, on the other hand, subsumes those entities that have no location in time or space, for example intangibles such as colors, propositions, and mathematical objects.

The remaining primitives, Continuant and Occurrent, are taken from Whitehead’s categories, which he proposed to distinguish “enduring objects, which have a stable identity over some period of time,” from “the constantly perishing occasions, whose successive stages may not resemble one another” (Sowa 2000, p71).
Thus, a Continuant has only spatial parts, while an Occurrent has parts which are ‘stages’ and may exist at different times (Sowa 2000, pp500–501). The end result is that the lattice structure of Sowa’s ontology possesses a strong symmetry over its categories.

The non-primitive categories of the lattice on the second level, those not directly subsumed by ⊤, are generated by the product of the various primitives. For example, by the multiplicative combination of Independent–Relative–Mediating with Physical–Abstract, the categories of Actuality, Form, Prehension, Proposition, Nexus, and Intention are generated (see Figure 4). An Actuality is an independent physical entity whose existence is not contingent upon any other entity, e.g., a person or the process raining. Form is abstract information independent of any embodiment, such as a schema (geometrical forms or syntactic structures of sentences) or scripts (e.g., computer programs or recipes). A Prehension is a physical entity that exists by virtue of its relation to some other physical entity, including knots in a string or joints between bones (object-object relation) or dogs barking or an apple being eaten (object-event relation). A Proposition is an abstraction relative to some other entity or entities, such as in a logical proposition or a drawing which includes a pointer to the object it represents. A Nexus is a physical entity mediating two or more other entities, such as an action that relates an agent and a patient. Finally, there is Intention, which is an abstraction considered to mediate other entities such as reasons or purposes. When these resulting second-level categories are then combined with Continuant–Occurrent, the lower categories in the final row of Figure 4 are generated. The combinatorial effect is

Figure 5: Roles in Sowa’s ontology

that each of the lower categories inherits a combination of properties from its subsuming categories. As such, Sowa indicates that, “no fixed collection of distinctions or categories is likely to be adequate for describing all things for all time” (Sowa 2000, p75). The lattice methodology leaves open the possibility of omitting one or more primitives or including other primitives, and thus generating a different lattice and keeping the theory open. Because the lattice in Figure 4 is already crowded with connecting lines, ten of the upper level categories are not included. In total, using the maximum number of combinations of primitives, the upper level contains 37 categories including ⊤ and ⊥ (Sowa 2000, p73). The most important categories not shown in Figure 4 are described in the remainder of this section.
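The multiplicative construction of the second-level and lower categories can be illustrated with a short sketch. The Python below is our own illustration, not Sowa's formalization: the variable names are assumptions, the six second-level names are those listed in this section, and the lowest categories are left as unnamed (category, primitive) pairs rather than given Sowa's labels.

```python
from itertools import product

# The three groups of primitives described in this section.
TRIAD = ("Independent", "Relative", "Mediating")
NATURE = ("Physical", "Abstract")
DURATION = ("Continuant", "Occurrent")

# Names of the second-level categories, as given in the text.
SECOND_LEVEL = {
    ("Independent", "Physical"): "Actuality",
    ("Independent", "Abstract"): "Form",
    ("Relative", "Physical"): "Prehension",
    ("Relative", "Abstract"): "Proposition",
    ("Mediating", "Physical"): "Nexus",
    ("Mediating", "Abstract"): "Intention",
}

# Second level: every combination of a triad member with Physical/Abstract.
second_level = [SECOND_LEVEL[pair] for pair in product(TRIAD, NATURE)]

# Final row of Figure 4: each second-level category combined with
# Continuant/Occurrent (left unnamed here).
lower_level = [
    (SECOND_LEVEL[(t, n)], d) for t, n, d in product(TRIAD, NATURE, DURATION)
]

print(second_level)
print(len(lower_level))
```

Dropping or adding a primitive group in this sketch changes the generated inventory accordingly, which is the sense in which the lattice methodology keeps the theory open.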

3.2 Relations and roles

In Sowa’s ontology there is only one primitive relation, Has. This dyadic primitive, however, is used to transform instances of Role (a subclass of Actuality, as shown in Figure 4) into various non-primitive relations. A role “characterizes some entity by some role it plays in relationship to another entity” (Sowa 2000, p80). For example, Potato is an entity that can play the role of Food. Roles must stand in dyadic relations to other entities. Roles are classified as in Figure 5. The classification of roles is based on the notion of a ‘prehension’, which describes some object in relationship to some other object (Sowa 2000, p501). A prehension is exemplified by the action described by the verb chase in A cat is chasing a mouse, where the cat and the mouse are bound together in an interlocking relationship, or a prehension (Sowa 2000, p270). From Figure 5 a PrehendingEntity is that which plays a role of some type. A PrehendedEntity is the role type itself. Either of these entities

Figure 6: Thematic roles or participants in Sowa’s ontology

can be ‘intrinsic’ or ‘extrinsic’, thus generating the next level of roles. A Composite is an intrinsic prehending entity that “bears a relationship to each component within itself. Its subtypes are distinguished by the kind of prehension: a whole is made up of its parts; and a Substrate...is the underlying material that supports dependent properties, such as size, weight, shape, or color” (Sowa 2000, p87–88). A Correlative is an extrinsic prehending or prehended entity which “bears a relationship to something outside itself. Examples include mother and child, lawyer and client, or employer and employee. A correlative could be considered the prehending entity of one prehension or the prehended entity of the converse prehension” (Sowa 2000, p88). A Component is an intrinsic prehended entity which “bears a relationship to the composite in which it inheres. Its subtypes include Part, whose existence is independent of the whole, and property, which cannot exist without some substrate” (Sowa 2000, p88).

A special kind of role is the ‘thematic role’, or that which participates in an event. A notion already familiar in linguistics, thematic roles are called Participant in the upper ontology. Figure 6 shows the various types of Participant (Sowa 2000, p506). First of all it should be noted that Participant is subsumed by Part (Figure 5). Participants are generated according to the method noted in Section 3.1, that is, by combinations of the dichotomies Determinant–Immanent and Source–Product. Determinant captures the notion of ‘control’, as compared to entities which are Immanent, that is, present throughout the event but with no control over the outcome. Source and Product refer respectively to those entities that are present at the beginning and end of the event. Neither Source nor Product needs to be present throughout the event. Sowa (2000) gives a detailed account of various participant types and provides compelling examples.
We mention only a few to give a sense of how familiar thematic roles are categorized (Sowa 2000, p508–510). Subsumed under Initiator is the familiar notion of Agent, or the active animate entity that voluntarily initiates the action. Subsumed under Resource is the notion of Path, defined as a resource of a spatial nexus. Subsumed under Goal is Experiencer, or the active animate goal of an experience process. And finally, subsumed under Essence is Theme, or that which is moved, said, or experienced, but not structurally changed. We say more concerning linguistic ‘thematic roles’ in our deliverable D3.

Figure 7: The process taxonomy in Sowa’s ontology

3.3 Abstractions

Sowa’s ontology is also notable because it gives a thorough account of abstract entities. Sowa (2000) gives particular attention to one type of abstraction, that of Form. Forms, like all abstractions, are independent of time and space, “[b]ut they can be used to characterize physical entities that do” (Sowa 2000, p90). As noted above, forms divide into Schema and Script, the former being all forms and patterns of time-stable objects and the latter denoting forms of dynamically changing processes. One kind of schema is the SpatialForm, e.g., the familiar shapes of natural and idealized objects. Spatial forms can be contrasted with the notion of Arrangement, or schemas with no spatial dimension, e.g., “numbers, sets, lists, algebras, grammars, and the data structures of computer science” (Sowa 2000, p90). Also included as arrangements are the syntactic forms of natural languages, programming languages, and symbolic logic. Scripts include abstract forms that are in flux. One type of script is KineticForm, which includes the information on a reel of motion picture film or the patterns and equations for generating motion in virtual reality. Another type of script is Procedure, which includes time- or sequence-dependent specifications of actions such as computer programs, musical scores and recipes.

3.4 Processes

The lattice structure of the upper ontology places Process on an equal footing with Object. Whereas objects retain their identity over time, processes are time-unstable. Processes are classified according to their starting and stopping points and the kinds of changes that are associated with them, as shown in Figure 7. A ContinuousProcess is one in which changes occur continuously, as in natural processes, e.g., ‘erosion’ or a ‘party’. That is, a continuous process has at least one unspecified boundary (starting or stopping) point. An Initiation has a clear starting point but no specified stopping point. Cessation has the opposite characteristics. And a Continuation has neither a starting nor a stopping point. A DiscreteProcess is an idealized

process that has clear boundaries, e.g., a ‘traffic light’s changing from red to green’ or ‘a rock’s falling’. With discrete processes, changes occur all at once. An Event is one kind of discrete process, that clearly has beginning and end points. Finally, a State is a type of discrete process where no change at all occurs, but boundaries are clearly defined.
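The classification just described depends on only three questions: whether a starting point is specified, whether a stopping point is specified, and, for discrete processes, whether any change occurs. A minimal Python sketch of this decision procedure follows; the function name `classify_process` and its boolean parameters are assumptions introduced for illustration and do not appear in Sowa's account.

```python
def classify_process(has_start, has_stop, changes=True):
    """Return (supertype, subtype) for a process, following the distinctions
    described in this section. `changes` is only consulted when both
    boundary points are specified."""
    if has_start and has_stop:
        # Clear boundaries: a discrete process.
        return ("DiscreteProcess", "Event" if changes else "State")
    # At least one boundary unspecified: a continuous process.
    if has_start:
        return ("ContinuousProcess", "Initiation")
    if has_stop:
        return ("ContinuousProcess", "Cessation")
    return ("ContinuousProcess", "Continuation")


print(classify_process(has_start=True, has_stop=True))
print(classify_process(has_start=True, has_stop=True, changes=False))
print(classify_process(has_start=True, has_stop=False))
print(classify_process(has_start=False, has_stop=False))
```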

3.5 Representation

To represent the ontology Sowa uses conceptual graph (CG) notation (Sowa 1983, Sowa 2000). Based on Peirce’s ‘existential graphs’ (Sowa 2000, p23), the formalism is comprised of a graphically based language of nodes and labelled arcs. The language was originally developed “to simplify the mapping to and from natural languages” (Sowa 2000, p25). An example of a conceptual graph is given in Figure 8. A CG has no variables. Boxes in a

Figure 8: An example conceptual graph

CG represent concepts, and circles are conceptual relations between concepts. In predicate calculus, the graph in Figure 8 would be translated as (∃x : Cat)(∃y : Mat)on(x, y). There is also a linear notation for CG, e.g., ([Cat] −→ (On) −→ [Mat]).
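The correspondence between a simple conceptual graph, its predicate calculus translation and its linear notation can be sketched as follows. This is an illustrative Python fragment under our own assumptions: it handles only a single dyadic relation between two concepts, the function names are hypothetical, and the arrow is written in ASCII as `->`.

```python
def to_predicate_calculus(source, relation, target):
    """Translate a graph with two concept nodes and one relation node into
    an existentially quantified formula; the relation name is lower-cased
    as in the example in the text."""
    return f"(∃x : {source})(∃y : {target}){relation.lower()}(x, y)"


def to_linear_notation(source, relation, target):
    """Render the same graph in linear form: concepts in square brackets,
    the conceptual relation in parentheses."""
    return f"[{source}] -> ({relation}) -> [{target}]"


# The graph of Figure 8: a cat on a mat.
print(to_predicate_calculus("Cat", "On", "Mat"))
print(to_linear_notation("Cat", "On", "Mat"))
```

Note that the variables x and y are introduced only by the translation; the graph itself, as stated above, has none.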

3.6 Summary and discussion of Sowa’s ontology

Sowa’s ontology illustrates the close ties of ontology in knowledge engineering with Ontology from philosophy. A key component of Sowa’s ontology is the mathematically attractive lattice structure of the upper level. This approach contrasts with others that we will overview in this deliverable in that the inventory of categories in Sowa’s ontology is not fixed, but derived from a set of primitives and a multiplicative process. The construction of various relations from the primitive Has relation is particularly interesting from the perspective of natural language, in that the resulting roles appear to be relatively close to the semantic structure of natural language in comparison to most other ontologies we review. Another characteristic that distinguishes Sowa’s ontology from the others is the focus on abstract entities, an area often insufficiently covered.

Figure 9: The SUMO top-level categories

4 SUMO

The Suggested Upper Merged Ontology (SUMO: Niles & Pease 2001b, Niles & Pease 2001a) is in many respects the most ‘AI-friendly’ of the ontologies with which we will be concerned and so provides a gentle start to our discussions of particular ontologies coming from the computational perspective. It is one of three starter documents currently under consideration by the IEEE working group for a Standard Upper Ontology (SUO). The top level of SUMO, showing the subsumption relationships between its most general classes, is as shown in Figure 9. Each SUMO category is defined, on the one hand, by its placement in this inheritance hierarchy, ranging from most general to most specific and, on the other hand, by associated axioms. The current version of SUMO13 at the time of writing (version 1.73) contained 1770 terms, 7271 axioms, and 1238 rules.14 Providing an ontology in this form is to attempt to successively decompose the possible ‘entities’ that make up the world or a domain of interest. Entity must here be understood in as neutral, or all-inclusive, a manner as possible: it is a place-holder for anything that exists in a domain. Formally it simply corresponds to the top node in an inheritance lattice. Suggesting just how the domain is decomposed is then the job that any adopted methodology for ontology construction should perform. The basic division within SUMO

13 See http://www.ontologyportal.org for the latest version.
14 Combined with the domain ontologies, SUMO contains over 20,000 terms and 60,000 axioms.

is then clearly made between two ‘subworlds’: the world of physical entities and the world of abstract entities. These two highest categories of the SUMO hierarchy are distinguished from one another by their associated properties: physical objects are objects that necessarily exist in time and space, while abstract objects do not. Going further, we see that the physical world is again divided among those entities that are labelled as objects and those that are labelled as processes. The precise meaning of these terms again needs to be taken from the axiomatic definitions provided for the concepts rather than intuitions about what distinction may be intended, although such intuitions are of course useful in gaining a rough impression of an intended organization. A convincing upper level ontology should provide clearly motivated positions for any kind of entity that we wish to ‘place’ within the ontology; to the extent that this is not possible, or is problematic, there remain questions to be answered for that ontology.

4.1 SUMO basics and upper ontology

In addition to the subsumption organization indicated in Figure 9, SUMO is claimed to be a modular ontology in that it is also divided, albeit informally, into domains; informally, this means that the divisions are marked by comments in the definition file rather than explicitly represented structures of the kind we will see in OpenCyc. Figure 10, taken from the SUMO documentation, lists the most general modules assumed. The structural ontology, for example, incorporates the base ontology and vice versa. The structural ontology contains the specification of the basic constructs used in building the backbone taxonomy of categories and relations. For example, the basic ontological structuring relations, subclass and instance, are defined here. The subclass predicate is used to structure the taxonomy of categories, while the instance relation is used to declare individuals. The instance relation is defined set theoretically, somewhat circularly, as follows:

(instance instance BinaryPredicate)
(domain instance 1 Entity)
(domain instance 2 SetOrClass)
(documentation instance "An object is an instance of a SetOrClass if it is
included in that SetOrClass. An individual may be an instance of many classes,
some of which may be subclasses of others. Thus, there is no assumption in the
meaning of instance about specificity or uniqueness.")

                      STRUCTURAL ONTOLOGY
                               +
                               |
                               +
                         BASE ONTOLOGY
                       /     |     |     \
                      /      |     |      \
                     +       +     +       +
      SET/CLASS THEORY   NUMERIC  TEMPORAL  MEREOTOPOLOGY
             /               |        |           |
            +                +        +           +
         GRAPH            MEASURE  PROCESSES +--+ OBJECTS
                                       +             +
                                        \           /
                                         \         /
                                          +       +
                                          QUALITIES

Figure 10: Overall modular organisation of SUMO

The SUMO axiom for the properties of physical objects is then expressed in SUO-KIF (see Section 4.3) as follows:

(<=>
  (instance ?PHYS Physical)
  (exists (?LOC ?TIME)
    (and
      (located ?PHYS ?LOC)
      (time ?PHYS ?TIME))))

This states that if (and only if) some entity represented by the variable ?PHYS is an instance of the SUMO class Physical, then there exist both a location and a time such that the physical object is located at the location and occurs/exists at the time. Both located and time are SUMO predicates defined elsewhere in the ontology. The condition is also stated in this case to be a biconditional by virtue of the ‘<=>’ symbol. Abstract objects in contrast are defined as having neither time nor location. Since both location and time are defined elsewhere in the ontology, the axioms serve to bind together definitions not only vertically (i.e., along the axis of subsumption) but also horizontally (i.e., across classes in the hierarchy). This latter is a crucial component of any richer ontology and makes the difference between a simple taxonomy and a strongly constraining model of some domain.
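To make the force of the biconditional concrete, the axiom can be checked against a small finite model. The Python below is a sketch under our own assumptions: the function name `satisfies_physical_axiom`, the encoding of `located` and `time` as sets of pairs, and the toy entities are all hypothetical and are not part of SUMO or Sigma.

```python
def satisfies_physical_axiom(entities, physical, located, time):
    """Check, over a finite model, that an entity is an instance of Physical
    if and only if it has some location and some time.

    `physical` is a set of entities; `located` and `time` are sets of
    (entity, value) pairs standing in for the SUMO predicates."""
    for entity in entities:
        has_location = any(e == entity for e, _ in located)
        has_time = any(e == entity for e, _ in time)
        if (entity in physical) != (has_location and has_time):
            return False
    return True


# A toy model: a rock is physical, the number seven is not.
entities = {"rock", "seven"}
physical = {"rock"}
located = {("rock", "Bremen")}
time = {("rock", "2005")}
print(satisfies_physical_axiom(entities, physical, located, time))

# Giving 'seven' a location and a time without declaring it Physical
# violates the right-to-left direction of the biconditional.
print(satisfies_physical_axiom(
    entities, physical,
    located | {("seven", "Bremen")},
    time | {("seven", "2005")},
))
```

The second call shows why the choice of ‘<=>’ rather than a simple implication matters: with a plain conditional from left to right, the second model would be admitted.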

4.2 Mereology in SUMO

The SUMO knowledge base states that it incorporates “the relatively noncontroversial elements of Smith’s and Guarino’s respective mereotopologies” (SUMO 2003, comments). The basic relation for mereology is part, in terms of which “all other mereological relations are defined” (SUMO 2003, part). The relevant axioms are:

(instance part SpatialRelation)
(instance part PartialOrderingRelation)
(domain part 1 Object)
(domain part 2 Object)

As an instance of PartialOrderingRelation, SUMO part resembles the primitive mereological relations found in ontologies based on mereology (cf. Appendix I). However, its signature spans only tangible, spatial entities, i.e., Objects. Such a restriction is not typical of mereology in general. Even so, part has several subrelations, as shown in Figure 11. The following lists other instances of SpatialRelation, most of which pertain to mereotopology rather than to mereology proper.

• connected: used for objects that meet or overlap spatially.

• connects: “the bridge” relation holding among 3 objects.

• part: the most general mereological relation.

• partlyLocated: relation between anything that is spatially located at the spatial extent of some object and that object.

• between: a simple spatial relation holding among 3 objects.

• traverses: used when 2 objects overlap spatially.

• distance: relation between a length and 2 objects.

• hole: relation between a fillable body and the surface of some object.

part
    member
    component
    piece
    properPart
    superficialPart
        surface
        bottom
        top
        side

Figure 11: Various subrelations of ‘part’ in SUMO

The axiomatization of these relations picks out components of their standard mereotopological definitions, which we discuss further in deliverable I1-[OntoSpace]:D2.

4.3 Representation

SUMO is formalized in a slightly adapted form of KIF (Genesereth 1991, Genesereth & Fikes 1992) (refer to Section 2.3) called SUO-KIF (IEEE 2003). The following additions pertaining to ontology construction are given in the SUO-KIF documentation:

• (instance ?X ?Y) - ?X is an individual which is a member of class ?Y

• (subclass ?X ?Y) - ?X is a class which is a subclass of class ?Y.

• subclass is transitive and instance is transitive through subclass:

(=>
    (and
        (instance ?X ?Y)
        (subclass ?Y ?Z))
    (instance ?X ?Z))

(=>
    (and
        (subclass ?X ?Y)
        (subclass ?Y ?Z))
    (subclass ?X ?Z))
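The effect of these two axioms can be sketched procedurally. The following Python fragment is a minimal illustration, not the Sigma reasoner or any part of SUO-KIF: it computes the transitive closure of a set of subclass facts and then propagates instance facts upwards through it. The taxonomy fragment and the individual MyHammer are illustrative assumptions.

```python
# Illustrative facts only; (X, Y) in SUBCLASS stands for (subclass X Y),
# and (X, Y) in INSTANCE stands for (instance X Y).
SUBCLASS = {("Artifact", "Object"), ("Object", "Physical"), ("Physical", "Entity")}
INSTANCE = {("MyHammer", "Artifact")}


def subclass_closure(pairs):
    """Apply (subclass X Y) and (subclass Y Z) => (subclass X Z) to a fixpoint."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (y2, z) in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure


def instance_closure(instances, subclasses):
    """Apply (instance X Y) and (subclass Y Z) => (instance X Z)."""
    closed_subclasses = subclass_closure(subclasses)
    closed = set(instances)
    for (x, y) in instances:
        for (y2, z) in closed_subclasses:
            if y == y2:
                closed.add((x, z))
    return closed
```

Note that instance itself is not made transitive by these axioms: an individual inherits class membership upwards along subclass, but nothing follows from a class being an instance of some further class.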

As mentioned in Section 2.5 above, the developers of SUMO also provide a reasoner for SUO-KIF, called Sigma.

4.4 Summary and discussion of SUMO

SUMO is an easily accessible ontology which attempts to combine the work of many researchers; it is, consequently, a melange of categories and design. We find concepts and axioms taken, broadly, from work on time such as Allen (1984), work on spatial entities such as holes from Casati & Varzi (1994), aspects of DOLCE and BFO, suggestions from Sowa (2000, Chapter 2), and many more. Thus, whereas SUMO is intended to have ecumenical appeal, a clear motivation for its ontological design decisions does not exist. Regarding the various parameters of ontology design laid out in Section 2, SUMO is a mixed ontology containing elements of realism, but also cognitively specific categories. SUMO is based on set theory, and has a limited mereological vocabulary; it is a 3-D ontology separating Object, Process, and Time; and, finally, it is heterogeneously granular, but is most detailed at the mesoscopic level.

5 Smartkom Ontology

In this section we describe the Smartkom ontology, a major component in the Smartkom project (Wahlster 2001). As the subtitle of the project suggests (Dialog-based Human-Technology Interaction by Coordinated Analysis and Generation of Multiple Modalities), the project’s aim is a multimodal system that merges the advantages of “dialogic communication with the advantages of a mixture of graphic control surfaces and gestures”.15 The input to the system includes prosodic speech information, gestural information, and facial expressions and their associated emotional states (Gurevych, Porzel & Malaka 2004, p2). Instead of creating a variety of knowledge representation formats tailored to the specific tasks of the various components, which is typical of many NLP systems, the Smartkom project employs the ontology as a common, language-independent knowledge source. The ontology consists of 730 concepts and 200 relations (Gurevych et al. 2004, p2), and is implemented in the Ontology Inference Layer (OIL) language (Fensel, Horrocks, Van Harmelen, Decker, Erdmann & Klein 2000) and used with the FACT system (Horrocks 1998); it lies accordingly within the expressive power of description logic. The Smartkom ontology was first tailored to the tourism domain but has since been expanded to include other domains in the Smartkom project, e.g., new media and program guides (Gurevych et al. 2004, p2).

15 Quote taken from the Smartkom website, http://www.smartkom.org.

Figure 12: The upper level ontology used in the Smartkom project

5.1 Smartkom upper level

Taking inspiration from the ontology of Russell & Norvig (2003) and the methodology of Guarino & Welty (2000), the authors of the Smartkom ontology constructed the upper level shown in Figure 12. The Smartkom upper level divides all concepts in the ontology into either Type or Role. A Type subsumes everything that possesses ‘primary ontological status’, or concepts that are independent of how they are applied (Gurevych et al. 2004, p4). That is, types always have the same ontological status independent of the particular context. The class Type contrasts with Role, which covers anything that can take on a different role in a specific situation, event or process. For example, a building has primary ontological status, but it can be a hospital, school, or railway station depending on the situation, just as someone can be a mother or teacher in a particular situation but remain a person in all cases. The Smartkom authors, then, place great importance on determining the ontological status of each sort of entity.

Figure 13: The process taxonomy of the Smartkom ontology

5.2 Smartkom roles

Role is the most general class subsuming all other roles that an entity can play in a particular domain. Instead of dividing roles into object and process roles, Role is divided into Event and AbstractEvent. That is, Smartkom assumes, after Russell & Norvig (1995), a 4D ontology. An event is any entity that exists in space and time, e.g., any physical object or process. It is by Event that the classes of PhysicalObject and Process are subsumed. These classes subsume familiar physical objects, e.g., Artifact and Agent. Also, Smartkom gives an extensive inventory of processes, as shown in Figure 13. A GeneralProcess refers to very general processes, e.g., duplication, imitation, or repetition. A MentalProcess refers to those which are cognitive, emotional, or perceptual, such as making plans. A PhysicalProcess includes motions, transactions or controlling processes. A SocialProcess refers to communication or instruction. These constructs show much similarity to those found within the ‘frame semantics’ (Fillmore 1968, Fillmore 1982) developed within the FrameNet project (Baker, Fillmore & Lowe 1998), to which we will return in more detail in our deliverable D3.

Turning to abstract events, these classes, either an AbstractObject or an AbstractProcess, do not exist relative to any conceptualization of space or time. Subclasses of AbstractObject include AbstractNumber, AbstractRepresentationalObject, and Set. Abstract representational objects are highly differentiated in the Smartkom ontology. For example, relations, directions and attributes are subsumed under this class (Gurevych & Porzel 2004). Attributes, for example, are “[t]he most abstract properties of living and non-living beings such as for example gender, used to define objects and beings” (Gurevych & Porzel 2004). Directions and spatial relations are largely based on English prepositions, and we will return to this portion of the ontology in our deliverable D2.

Figure 14: A portion of the ‘AbstractRepresentationalObject’ taxonomy (classes shown: AbstractRepresentationalObject, Angle, Attribute, Attitude, Genre, Language, Property, PhysiologicalProperty, Quality, Style, Direction, Relation, SpatialRelation, Scalar, TemporalRelation, Scale)

5.3 Smartkom types

The types in Smartkom, as opposed to the roles mentioned in Section 5.2, are those entities that have primary ontological status. Shown in Figure 12, subclasses include AbstractObjectType, AbstractProcessType, LocationType, PhysicalObjectType, and ProcessType. Particularly detailed are the classes LocationType and PhysicalObjectType. Location types are the spatial and temporal locations where something can be. Finally, the class PhysicalObjectType subsumes the non-role counterpart of PhysicalObject from Figure 12.

5.4 Smartkom relations

Figure 15: Smartkom ‘LocationType’ taxonomy (classes shown: LocationType, SpatialEntity, SpaceInterval, Height, Length, Width, SpacePoint, TemporalEntity, TimeInterval, TimePoint)

Figure 16: Smartkom ‘PhysicalObjectType’ taxonomy (classes shown: PhysicalObjectType, CollectionType, LivingType, GroupType, AgentType, VegetationType, NonLivingType, SubstrateType)

Given that OIL is a rather impoverished language in comparison to other formalisms we address in this deliverable, classes are specified according to how they stand in relation to other classes. Relations in OIL are ‘slots’, essentially binary relations between concepts. In the specification of Process, for example, the following slots are necessary: begin-time, end-time, and state. Other slots include the traditional thematic roles: has-agent, has-theme, has-experiencer, has-instrument, has-location, etc.

As part of the OIL metalanguage, slots are classified as purely relational, functional, or transitive. Purely relational slots are the binary relations mentioned in the previous paragraph. Functional slots include has-age, required for PersonType, and entrance-fee, required for Building. A transitive slot is, for example, has-member, which has the inverse is-member-of. Furthermore, slots are organized into hierarchies: the has-agent slot, for instance, subsumes has-buyer and has-cognizer.
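The slot characteristics just described (slot hierarchies, inverses, transitivity and functionality) can be sketched as simple operations over triples. The following Python fragment is only an illustration of the intended semantics under our own encoding; it is not OIL syntax and not the FACT reasoner, and the individuals (purchase1, anna, league, club, bob) are hypothetical.

```python
# Slot metadata taken from the examples in the text.
SUPER_SLOT = {"has-buyer": "has-agent", "has-cognizer": "has-agent"}
INVERSE = {"has-member": "is-member-of"}
TRANSITIVE = {"has-member"}
FUNCTIONAL = {"has-age", "entrance-fee"}


def saturate(facts):
    """Close a set of (slot, subject, object) triples under the slot metadata."""
    facts = set(facts)
    changed = True
    while changed:
        new = set()
        for slot, subj, obj in facts:
            # A filler of a sub-slot is also a filler of its super-slot.
            if slot in SUPER_SLOT:
                new.add((SUPER_SLOT[slot], subj, obj))
            # An inverse slot holds with the arguments swapped.
            if slot in INVERSE:
                new.add((INVERSE[slot], obj, subj))
            # A transitive slot chains through intermediate fillers.
            if slot in TRANSITIVE:
                for slot2, subj2, obj2 in facts:
                    if slot2 == slot and subj2 == obj:
                        new.add((slot, subj, obj2))
        changed = not new <= facts
        facts |= new
    return facts


def functional_violations(facts):
    """Return (slot, subject) pairs where a functional slot has two different fillers."""
    first_filler = {}
    violations = set()
    for slot, subj, obj in facts:
        if slot in FUNCTIONAL:
            if first_filler.setdefault((slot, subj), obj) != obj:
                violations.add((slot, subj))
    return violations
```

For instance, asserting only that anna is the has-buyer of purchase1 is enough for a query over has-agent to find her, which is the practical benefit of organizing slots into a hierarchy.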

5.5 Summary and discussion

Smartkom should be considered a ‘light-’ or perhaps ‘medium-weight’ ontology (cf. SUMO above or DOLCE below) as it only provides partial definitions for its classes and relations. This is due of course to the modeling constraints imposed by the OIL formalism. Smartkom is notable, however, because it puts emphasis on roles that entities can play (Role) as distinguished from primary ontological entities, subsumed by Type. This design feature of the ontology is argued by Gurevych et al. (2004) to aid in the generation of a semantic representation for natural language utterances and will be considered as a far more central component of ontologies as we continue.

6 OpenCyc

OpenCyc (Cycorp 2004d) is the open source version of the Cyc project (Lenat & Guha 1989), the latter of which is claimed to be the world’s largest and most complete commonsense reasoning system. Developed by Cycorp, OpenCyc includes modules for speech understanding, database integration, rapid development of ontologies, and tools for processing e-mail. According to the website, release 1.0 of OpenCyc will contain 6,000 classes, relations, and individuals in the upper ontology, accompanied by 60,000 assertions about those 6,000 entities. The release will contain the OpenCyc Inferencing Engine and the OpenCyc Knowledge Base Browser, along with a number of other tools for programmers and knowledge engineers. The following is an overview of OpenCyc 0.7.0 with respect to its major conceptual modelling choices plus a short preview of the spatially related categories. It should be noted that this version is significantly impoverished as compared to the promised 1.0 version.

The overview which follows is based largely on the contents of OpenCyc 0.7.0. Specific entries will be cited as follows: the label of the given entry (class or individual) will be given within the actual citation, e.g., (Cycorp 2004d, Dog). Other OpenCyc-related documentation will be cited, including the original documentation of the Cyc Project (Cycorp 2004b), the Cyc 101 Tutorial (Cycorp 2004a) and The Ontological Engineer’s Handbook (Cycorp 2004c). It should be noted that the original documentation of the Cyc project (Cycorp 2004b) has been officially replaced with material on the OpenCyc website. However, at the time of writing this section, both sets of documentation were used since the documentation of the original Cyc project was more detailed, albeit not “edited for correctness” (Cycorp 2004b).

6.1 OpenCyc basics and upper ontology

OpenCyc is a massive knowledge base (KB). Before discussing its contents and the basic modelling decisions, it is necessary to explain a few of its basic assumptions pertaining to the meta-level, the highest level of design. First of all, everything in OpenCyc is either meant to represent a set, e.g., the set of all people, or an individual, e.g., some specific person, Joe Smith. Thing is the top class in the OpenCyc taxonomy and represents the universal set. OpenCyc, therefore, is firmly grounded in set theory, and not mereology. “Every thing in the Cyc ontology – every Individual (of any kind), every Set-Mathematical, and every Collection – is an instance of...Thing. Similarly, every collection is a subcollection of...Thing” (Cycorp 2004d, Thing). Thing is partitioned into Individual and SetOrCollection. “Individual is the collection of all individuals: things that are not sets or collections” (Cycorp 2004d, Individual). They may include notions such as relations, functions, and rules. That is, instances of Individual do not have members as sets do; they are urelements of the domain. They may, however, have parts. Individual is disjoint with SetOrCollection, whose documentation reads:

“Something is an instance of SetOrCollection just in case it is a collection (i.e. an instance of Collection) or a mathematical set (i.e. an instance of Set-Mathematical). Instances of Set-Mathematical and instances of Collection (and thus instances of SetOrCollection) share some basic common features. All instances of Collection and all instances of Set-Mathematical (and thus all instances of SetOrCollection) are abstract entities, lacking spatial and temporal properties. Nearly all instances of Collection (except “empty” collections) and nearly all instances of Set-Mathematical (except the empty set; see TheEmptySet) have “elements” (i.e. instances or members...); hence set-or-collections may stand to one another in generalized set-theoretic relations such as subsetOf and disjointWith...” (Cycorp 2004d, SetOrCollection)

That is, SetOrCollection is further divided into Set-Mathematical, i.e., mathematically defined sets specified by enumerating their members, and Collection. Collection is documented as:

“The collection of all Cyc collections. Cyc collections are natural kinds or classes, as opposed to mathematical sets; their instances have some common attribute(s). Each Cyc collection is like a set in so far as it may have elements, subsets, and supersets, and may not have parts or spatial or temporal properties.” (Cycorp 2004d, Collection)

Furthermore, it is stated: “The criteria for membership in a Collection need not be stated explicitly in the KB” (Cycorp 2004b). That is, the use of collections of collections contrasts with overt feature-value systems such as those employed in SUMO and DOLCE.

Thing
    SetOrCollection
        Collection
            FirstOrderCollection
            ConventionalCollectionType
        Set-Mathematical
    Individual

Figure 17: OpenCyc’s upper ontology

OpenCyc has a number of distinct ontological structuring relations. The most basic structuring relation is genls, as defined in definition (1).

(1) The formula (genls X Y) means “Every instance of collection X is also an instance of collection Y”.

This is the relation of subsumption which is basic to many other ontologies, not just OpenCyc, as discussed in Section 2.2. Figure 17 shows the basic taxonomic relations among the categories mentioned thus far, structured using the genls relationship. This is not to be confused with the instantiation relation, which is represented in OpenCyc using the predicate isa.

Also for structuring reasons, OpenCyc makes much use of the notion of a ‘collection of collections’ at its highest ontological level. In a collection of collections, the contained collections are viewed themselves as instances of Collection (but not as instances of the OpenCyc constant Individual).

“There are cases where it is useful to make assertions about a class of collections, so we can reify these classes as constants which are collections of collections. These constants are both instances of Collection and subsets of Collection, since all their instances are also instances of Collection.” (Cycorp 2004b)
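The difference between genls and isa, and the way a collection can itself be treated as an instance, can be made concrete with a small sketch. The Python fragment below is our own illustration and not the OpenCyc inference engine; the facts about Fido, Dog and Animal are assumed for the example.

```python
# Illustrative facts: GENLS maps a collection to its direct generalizations,
# ISA maps a term to the collections it is directly asserted to be an instance of.
GENLS = {"Dog": {"Animal"}, "Animal": {"Thing"}, "Collection": {"Thing"}}
ISA = {"Fido": {"Dog"}, "Dog": {"Collection"}, "Animal": {"Collection"}}


def all_genls(collection):
    """Return the collection together with everything that subsumes it via genls."""
    seen, stack = set(), [collection]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return seen


def all_isa(term):
    """Return every collection the term is an instance of.

    Instance membership is inherited upwards along genls, but isa is not
    chained with itself: being an instance of an instance implies nothing.
    """
    result = set()
    for collection in ISA.get(term, ()):
        result |= all_genls(collection)
    return result
```

In this toy KB, Fido is an instance of Dog, Animal and Thing, whereas Dog, viewed as an object in its own right, is an instance of Collection. Fido does not thereby become an instance of Collection.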

Two important subcategories of Collection are ConventionalCollectionType and FirstOrderCollection. The names of collections of collections usually end in -Type. ConventionalCollectionType

“...corresponds to a category in some agreed-upon or conventional classification system...used by people. In such systems, a change or reclassification is possible by a decision of an authority, or by a changed social agreement or custom, without changing the intrinsic natures of the actual objects in the category. ConventionalClassificationTypes include categories in biological taxonomy, standard classifications in data dictionaries and thesauri, cultural taboo classes, military doctrinal classes, and named calendar intervals.” (Cycorp 2004b)

An example of ConventionalCollectionType in Cyc is FoodGroup. FirstOrderCollection on the other hand is much more general and subsumes many important collections of collections:

FirstOrderCollection
    AgentTypeByEmotionalState
    AnimalCapabilityType
    BaseWordFormTypeByEndingPhonemeType
    BeliefSystemType
    FeelingType
    FormalProductType
    HumanCapabilityType
    MatterTypeByPhysicalState
    MeasurableScalarIntervalType
    MicrotheoryType
    ObjectType
    ProductType
    RelationshipType
    SituationType
    StuffType
    TransportationEventByVehicleType

Collections of collections simply allow a particular collection to be regarded as an instance. This is useful for two reasons, as the OpenCyc documentation points out.

The first concerns predicate argument typing. Consider the following assertion:16

(concentration PortionOfLemonade001 Sucrose (GramsPerMilliliter 0.1))

The specification for the concentration relation includes the following assertion that “enforces the minimal constraint that concentration can only refer to kinds of tangible stuff in its second argument” (Cycorp 2004b).

(arg2Isa concentration TangibleStuffCompositionType)

That is, the argument types of a predicate can be defined so as to be restricted to particular specified collections of collections. The second concerns the ability to create arbitrary classification hierarchies within the KB, that is, to “provide additional structure to the genls hierarchy” (Cycorp 2004b). “... there are many ways we could divide TangibleThing up into subsets (that is, ‘subcollections’)” (Cycorp 2004b).

Collections of collections are related to their members by the structuring predicate typeGenls. For example, in

(typeGenls DogBreed Dog)

every instance of DogBreed is a type of Dog. Thus, the entire class DogBreed is used as a conventional classification type in a particular context.

Finally, there is also a structuring predicate genlPreds, which resembles subrelation in SUMO (discussed in Section 4) and allows structures similar to the role hierarchies of description logics. Other structuring relations in OpenCyc are covered in the next section. The structuring predicate genlPreds is used, for example, to structure instances of Predicate into a taxonomy:

awareOf
awareOfProp
hasEmotionAbout
likes-Generic
hasEmotionAboutProposition
desires
goals
intends

16 See naming convention notes in Section 6.4.
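The two uses of collections of collections discussed in this section, argument typing via arg2Isa and classification via typeGenls, can be sketched as follows. This is a minimal Python illustration under our own encoding, not Cyc code: the isa facts for Sucrose, Poodle and Beagle are assumed for the example, and Tuesday with its type DayOfWeekType is a hypothetical ill-typed argument.

```python
# Assumed isa facts: term -> collections of collections it is an instance of.
ISA = {
    "Sucrose": {"TangibleStuffCompositionType"},
    "Poodle": {"DogBreed"},
    "Beagle": {"DogBreed"},
    "Tuesday": {"DayOfWeekType"},  # hypothetical, used as a badly typed argument
}
# (arg2Isa concentration TangibleStuffCompositionType), keyed by (predicate, position).
ARG_ISA = {("concentration", 2): "TangibleStuffCompositionType"}
# (typeGenls DogBreed Dog)
TYPE_GENLS = {"DogBreed": "Dog"}


def arg_violations(assertion):
    """Return (position, argument, required type) for each ill-typed argument."""
    predicate, *args = assertion
    problems = []
    for position, arg in enumerate(args, start=1):
        required = ARG_ISA.get((predicate, position))
        if required is not None and required not in ISA.get(arg, set()):
            problems.append((position, arg, required))
    return problems


def genls_from_type_genls():
    """Derive (genls X C) for every instance X of T whenever (typeGenls T C) holds."""
    return {
        (instance, collection)
        for type_collection, collection in TYPE_GENLS.items()
        for instance, types in ISA.items()
        if type_collection in types
    }
```

With these facts, the lemonade assertion given above passes the check, while substituting Tuesday for Sucrose is reported as a violation in the second argument position; typeGenls yields that Poodle and Beagle are each subsumed by Dog.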

6.2 Microtheories

Another meta-level construct in OpenCyc is the microtheory.17 The use of microtheories is essentially a way of dividing a KB into partitions, clearly an essential feature for an ontology the size of Cyc. Each partition is based on a set of shared assumptions, a field of study, and/or a source of knowledge. For example, in one microtheory, Table may be declared to be a solid object, but in another, it could be made up mostly of space. The former is from a human point of view, whereas the second is from the point of view of particle physics. This is OpenCyc’s solution to the context and granularity issues, mentioned in Section 2 above. One benefit of the microtheory approach is that every assertion in OpenCyc can be indexed and then identified in some reasoning process. Thus, OpenCyc may include a number of contradictory assertions across different microtheories, as in the above example, and still produce valid inferences. Within a given microtheory, however, knowledge must be consistent, i.e., there should be no contradictions. Further benefits are claimed to include easier, more focused knowledge entry, faster reasoning, and better handling of global consistency. Every assertion in OpenCyc is part of some microtheory. OpenCyc uses the ist relation to assert that some formula is true in a given microtheory (MT), as shown in example (2).

(2) (ist FORMULA MT)

Microtheories form an explicit hierarchy of domains within the OpenCyc KB, enabling microtheories to inherit sets of assertions from other microtheories. Microtheories are related to one another by means of the predicate genlMt, which is precisely analogous to the relation genls for Collections (see Figure 18). The most general microtheory available for reasoning in OpenCyc is the BaseKB: “Only the most general, context-independent information belongs in the BaseKB.” This information is therefore global and must be valid for any reasoning process in OpenCyc. There is also a special predicate, genlMt-Vocabulary, which relates a microtheory to a set of vocabulary items used within that microtheory. This notion is not fully described in the OpenCyc documentation; it is defined simply as “the collection of all microtheories which specify the vocabulary for some topic, but have no rules or other non-definitional assertions” (Cycorp 2004d, genlMt-Vocabulary).

OpenCyc contains hundreds of separate microtheories. As an example, some of the key microtheories related to spatial reasoning are given in Figure 19. Note that the relation for this tree is genlMt, and not genls. The SpatialGMt is:

“the GeneralMicrotheory for describing the most general spatial vocabulary. Terms in SpatialGMt are generally topologically invariant. The specMt GeometryGMt is used for geometrical shapes, angles, properties. Other spatial terms are described in NaiveSpatialMt.” (Cycorp 2004d, SpatialGMt)

17 This construct is not yet fully described in the OpenCyc documentation, so the material here will be incomplete.

genlPreds
    genlMt
        nearestGenlMt
        genlMt-Vocabulary
    ...

Figure 18: Taxonomy of predicates concerning microtheories

The NaiveSpatialMt:

“provides concepts and rules to represent the natural way we reason about spatial relations. It deals with representing the number of sides about an object...and the conclusions we can draw from these objects being in static or dynamic situations with regards to only themselves or other objects.” (Cycorp 2004d, NaiveSpatialMt)

The NaivePhysicsMt microtheory is:

“an instance of both TheoryMicrotheory and GeneralMicrotheory. The NaivePhysicsMt contains information about the behavior that physical bodies (instances of PartiallyTangible) undergo in a wide variety of physical situations (where the situations in question are usually instances of PhysicalEvent).” (Cycorp 2004d, NaivePhysicsMt)

We return to these spatial aspects in more detail in our deliverable on spatial ontologies, I1-[OntoSpace]:D2.

Microtheory
    NaivePhysics
    SpatialGMt
    ObjectPhysicalCharacteristicsMt
    NaturalGeographyMt
    PathSystemsMt

Figure 19: Taxonomy of OpenCyc microtheories
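The microtheory mechanism described in this section, with assertions indexed by microtheory via ist and inherited along genlMt, can be sketched in a few lines. The following Python fragment is only an illustration of the idea, not the OpenCyc implementation: the microtheory ParticlePhysicsMt and the predicates solid and mostlyEmptySpace are hypothetical names chosen to mirror the Table example above.

```python
# genlMt links: microtheory -> more general microtheories it inherits from.
GENL_MT = {
    "NaivePhysicsMt": {"BaseKB"},
    "ParticlePhysicsMt": {"BaseKB"},  # hypothetical microtheory
}
# Assertions indexed by the microtheory in which they are stated.
ASSERTIONS = {
    "BaseKB": {("isa", "Table", "Collection")},
    "NaivePhysicsMt": {("solid", "Table")},                # hypothetical predicate
    "ParticlePhysicsMt": {("mostlyEmptySpace", "Table")},  # hypothetical predicate
}


def visible_mts(mt):
    """Return the microtheory itself plus everything reachable via genlMt."""
    seen, stack = set(), [mt]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENL_MT.get(current, ()))
    return seen


def ist(formula, mt):
    """True if the formula is asserted in mt or in a microtheory that mt inherits from."""
    return any(formula in ASSERTIONS.get(m, set()) for m in visible_mts(mt))
```

The two incompatible views of Table never meet in a single reasoning context, because neither microtheory inherits from the other; only the content of BaseKB is visible from both.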

6.3 Mereology in OpenCyc

Even though OpenCyc is based on set theory, it has a substantial inventory of mereological relations. The basic relation is parts which relates individuals to their parts in the very general sense typical of mereology (cf. Appendix I); the only restriction on the arguments of parts is that they be instances of Individual.

“This predicate relates individuals to their (individual) “parts”, where this is understood in a very broad sense that includes spatial parts, temporal parts, “conceptual” parts, members of groups, and so forth. (parts WHOLE PART) means that PART is in some sense a part of WHOLE. Note that PART need not be a proper part of WHOLE: parts is reflexive (see ReflexiveBinaryPredicate) . . . ” (Cycorp 2004d, parts)

The relation parts is additionally an instance of AntiSymmetricBinaryRelation and TransitiveBinaryPredicate. This illustrates the way OpenCyc defines the characteristics (meta-properties) of predicates, that is, by asserting that the predicates are instances of particular collections, e.g., TransitiveBinaryPredicate. Parts therefore resembles the primitive mereological relations found in ontologies based on mereology. Furthermore, it has many specializations covering several microtheories, a partial taxonomy of which is given in Figure 20. The subrelations do not necessarily inherit all the properties of parts; that is, they are not necessarily instances of AntiSymmetricBinaryRelation, ReflexiveBinaryRelation, and TransitiveBinaryPredicate. For example, constituents is not transitive. From the NaivePhysicsVocabularyMt microtheory, the relation physicalParts has a number of specializations of note, as shown in Figure 21. There is much philosophical discussion concerning some of the ‘entities’ entailed by this description. For example, to what extent ‘cavities’ can be considered as parts, whether they have walls or not, is discussed at length by Casati & Varzi (1994). The precise meaning of these relations listed here therefore depends, as always, on the axiomatization that they are subject to.
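What it means for parts to be reflexive, antisymmetric and transitive, and for a specialization such as constituents to lack one of these meta-properties, can be checked mechanically on a small example. The Python sketch below is our own illustration; the individuals Car, Engine and Piston are hypothetical, and pairs are written in the (WHOLE, PART) order used by the parts predicate.

```python
from itertools import product


def reflexive_transitive_closure(pairs, domain):
    """Smallest reflexive and transitive relation over `domain` containing `pairs`."""
    relation = set(pairs) | {(x, x) for x in domain}
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(relation), repeat=2):
            if b == c and (a, d) not in relation:
                relation.add((a, d))
                changed = True
    return relation


def is_reflexive(relation, domain):
    return all((x, x) in relation for x in domain)


def is_antisymmetric(relation):
    return all(a == b for (a, b) in relation if (b, a) in relation)


def is_transitive(relation):
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


DOMAIN = {"Car", "Engine", "Piston"}              # hypothetical individuals
DIRECT = {("Car", "Engine"), ("Engine", "Piston")}

# `parts` modelled as the reflexive-transitive closure of the direct facts;
# a non-transitive specialization is modelled by the direct facts alone.
PARTS = reflexive_transitive_closure(DIRECT, DOMAIN)
CONSTITUENTS = set(DIRECT)
```

Here PARTS satisfies all three properties and so forms a partial order, whereas CONSTITUENTS fails the transitivity check because the Piston is not recorded as a direct constituent of the Car.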

6.4 Representation

parts
compositeParts [BaseKB]
containsPortals [NaivePhysicsVocabularyMt]
intangibleParts [BaseKB]
containsInformation [InformationTerminologyMt]
intangibleComponent [BaseKB]
quantitySubsumes [BaseKB]
subInformation [BaseKB]
subInfoStructures [InformationTerminologyVocabularyMt]
subLists [BaseKB]
subSituations [BaseKB]
physicalDecompositions [NaivePhysicsVocabularyMt]
constituents [NaivePhysicsVocabularyMt]
physicalParts [NaivePhysicsVocabularyMt]
physicalPortions [NaivePhysicsVocabularyMt]
subRegions [NaiveSpatialVocabularyMt]
borderSubRegions [BordersVocabularyMt]
geographicalSubRegions [GeographicalRegionGVocabularyMt]
internalSubRegions [NaiveSpatialVocabularyMt]
timeSlices [BaseKB]
subAbstractions [BaseKB]
subProcesses [BaseKB]

Figure 20: Parts taxonomy

physicalParts [NaivePhysicsVocabularyMt]
cavityHasWall [NaivePhysicsVocabularyMt]
containsCavityWithWalls [NaivePhysicsVocabularyMt]
externalParts [NaivePhysicsVocabularyMt]
surfaceParts [NaivePhysicsVocabularyMt]
internalParts [NaivePhysicsVocabularyMt]
objectHasVisualMarks [NaivePhysicsVocabularyMt]
physicalExtent [NaivePhysicsVocabularyMt]
portalHasCovering [NaivePhysicsVocabularyMt]

Figure 21: Physical parts taxonomy

OpenCyc is represented in a first-order language developed specifically for the Cyc project called CycL (Cycorp 2004c). This language, however, closely resembles KIF in that it is also Lisp-like and many of the naming conventions are similar. Unlike KIF, all constants in CycL are preceded by the character sequence “#$”, e.g., #$Mountain, #$isa, etc. As with KIF, variable names are preceded with a question mark, e.g., ?Hat, or ?X. Logical connectives are written out in full: #$not, #$and, #$or, and #$implies. This definitional language is at least first-order. The reasoner supplied for this representation language was mentioned in Section 2.5 above.
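The surface conventions just described are simple enough to be recognized with a few lines of code. The following Python sketch reads a Lisp-like expression into nested lists and separates constants (prefixed with #$) from variables (prefixed with ?). It is an illustration of the notation only, covers neither strings nor the full CycL grammar, and the example rule is hypothetical.

```python
import re

# A token is an opening parenthesis, a closing parenthesis, or a run of
# characters containing neither whitespace nor parentheses.
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def parse(text):
    """Parse one parenthesized expression into nested Python lists of tokens."""
    tokens = TOKEN.findall(text)
    position = 0

    def read():
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            items = []
            while tokens[position] != ")":
                items.append(read())
            position += 1  # skip the closing parenthesis
            return items
        return token

    return read()


def symbols(tree, prefix):
    """Collect all atoms in the parsed tree that start with the given prefix."""
    if isinstance(tree, list):
        found = set()
        for item in tree:
            found |= symbols(item, prefix)
        return found
    return {tree} if tree.startswith(prefix) else set()


# Hypothetical rule written in the CycL style described above.
RULE = "(#$implies (#$isa ?X #$Dog) (#$isa ?X #$Animal))"
```

Calling symbols(parse(RULE), "?") collects the variables of the rule, and the same call with the prefix "#$" collects its constants, including the connective #$implies.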

6.5 Summary and discussion of OpenCyc

In summary, OpenCyc 0.7.0 is a massive knowledge base with nearly 6,000 categories. Future releases are intended to incorporate 60,000 assertions to describe the categories. OpenCyc’s meta-level assumptions have been discussed focusing mainly on its use of microtheories to partition the KB into usable chunks and its use of collections of collections to provide rich taxonomic structure of the KB. Regarding the various parameters of ontology design laid out in Section 2, OpenCyc is a mixed ontology containing elements of realism but also a number of cognitively specific categories, e.g., NaiveSpatialMt. OpenCyc is anchored in set theory, but has an expressive vocabulary for describing mereology. Furthermore, OpenCyc is a 3-D ontology. Finally, OpenCyc contains a high degree of granularity, though certainly some domains are more fine grained than others.

The impact of the notion of microtheory on OpenCyc’s system design should not be overlooked. Whereas the whole of OpenCyc may seem, at first, too large to be useful to any domain, “zooming in” to the level of an individual microtheory might allow for considerable usability of OpenCyc. Assertions which are peripheral to some domain may be ignored for specific applications. One problem with OpenCyc’s microtheory approach, however, is that the motivation for proposing new microtheories seems unconstrained. As was mentioned earlier, microtheories are motivated according to shared assumptions, a field of study, and/or a source of knowledge. That is, there does not seem to be a formal way of saying that some microtheory should not be added to the system. Not only new theories of knowledge, but any new approach to an established theory may be grounds for adding yet another microtheory. This issue also raises the question of why there is one monolithic OpenCyc KB in the first place. The effectiveness of the approach could be measured by the ‘bushiness’ of the microtheory hierarchy: if this becomes too flat, then significant re-use of knowledge is clearly not being achieved.18

The current release of OpenCyc is also axiomatically impoverished, e.g., features of Collections are often left implicit. As mentioned above, this is even specified in the OpenCyc documentation: “The criteria for membership in a Collection need not be stated explicitly in the KB.” The reason for this is not at all clear:

“As with any Cyc constant, an instance of Collection should be created only if it is expected to have some purpose or utility. Moreover, the ‘best’ collections to create are the ones which are impossible to define precisely, yet about which there are rules and other things to say. E.g., ‘WhiteCat’ is not a good element of Collection to create, because it’s easy to define with other Cyc concepts, and there’s not much to say about the collection of white cats; but ‘WhiteCollarWorker’ could be a good instance of Collection, because it is hard to define exactly, yet there are many things to say about it.” (Cycorp 2004a, p1)

18 The result would then presumably resemble more closely the Multiple Source Ontology of Martin (2003), an approach to ontology combination that explicitly avoids tight integration.

There may be a connection to be drawn here with defining concepts as ‘primitive’ in a knowledge representation language based on description logic. This is used when a concept is to be defined at a certain place in a subsumption lattice without requiring automatic classification of where the concept should actually fit based on axioms that are defined for it. The definition is then analogous to definition by dividing into sets of alternative choices rather than by providing explicit identification criteria. Nevertheless, this approach to ontology construction is not immediately reconcilable with more formal principles of ontology building, whereby categories should be well defined (i.e., axiomatised) given the modelling tools available.
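The contrast between defined and primitive concepts can be illustrated with a toy classifier. This is only a minimal sketch under our own assumptions: individuals are reduced to sets of atomic features, and the names Concept, classify and CONCEPTS are hypothetical and belong neither to OpenCyc nor to any description logic system.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Concept:
    name: str
    # Necessary-and-sufficient features; None marks the concept as primitive.
    definition: Optional[FrozenSet[str]] = None


def classify(features: Set[str], asserted: Set[str],
             concepts: List[Concept]) -> Set[str]:
    """Return the names of all concepts an individual is known to fall under."""
    inferred = set(asserted)  # primitive memberships can only be asserted
    for concept in concepts:
        # A defined concept is recognised automatically from its definition.
        if concept.definition is not None and concept.definition <= features:
            inferred.add(concept.name)
    return inferred


CONCEPTS = [
    Concept("WhiteCat", frozenset({"Cat", "White"})),  # defined concept
    Concept("WhiteCollarWorker"),                      # primitive concept
]

tom = classify({"Cat", "White"}, asserted=set(), concepts=CONCEPTS)
alice = classify({"Person", "Employed"}, asserted=set(), concepts=CONCEPTS)
bob = classify({"Person", "Employed"}, asserted={"WhiteCollarWorker"},
               concepts=CONCEPTS)
```

In this sketch the individual tom is recognised as a WhiteCat without anyone stating so, whereas alice and bob have identical features and differ only in what has been asserted about them: membership in a primitive concept cannot be inferred.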

7 DOLCE

The Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE: Masolo, Borgo, Gangemi, Guarino, Oltramari & Schneider 2002, Masolo et al. 2003) was originally part of the WonderWeb project19 whose aim is the development of foundational ontology libraries for the Semantic Web. DOLCE as part of that larger effort is meant to be used

“for comparing and elucidating the relationships with other future modules of the library, and also for clarifying the hidden assumptions underlying existing ontologies or linguistic resources such as WordNet”. (Masolo et al. 2002, p8)

That is, DOLCE is not meant to be a universal or standard ontology. Also, the authors state that they

“...do not commit to a strictly referentialist metaphysics related to the intrinsic nature of the world: rather, the categories we introduce here are thought of as cognitive artifacts ultimately depending on human perception, cultural imprints and social conventions (a sort of ‘cognitive’ metaphysics)” (Masolo et al. 2002, p8).

DOLCE is freely available and has emerged from the work of several researchers associated with the Laboratory for Applied Ontology (Trento/Rome, Italy).20

19 IST Project 2001-33052 WonderWeb: Ontology Infrastructure for the Semantic Web
20 See: http://www.loa-cnr.it/DOLCE.html

Figure 22: DOLCE taxonomy: taken from Masolo et al. (2002: 9)

7.1 DOLCE basics and upper ontology

DOLCE’s upper level, shown in Figure 22, is constructed based on the principles set out in the OntoClean methodology (Guarino & Welty 2002, Guarino & Welty 2004) mentioned in Section 2.4. For example, the categories in the upper ontology are established based on the notion of a rigid property in the meta-language construct. A rigid property is one that must hold in order for some entity to exist. The most fundamental division made in DOLCE is between entities that unfold in time, called perdurants, and entities which are present ‘all-at-once’ in time, called endurants.

“Endurants are wholly present (i.e., all their proper parts are present) at any time they are present. Perdurants, on the other hand, just extend in time by accumulating different temporal parts, so that, at any time they are present, they are only partially present, in the sense that some of their proper temporal parts (e.g., their previous or future phases) may be not present.” (Masolo et al. 2002, 10).

We turn to the spatial aspects of endurants in DOLCE in the deliverable that focuses on spatial ontology in particular. However, it is worth mentioning that DOLCE makes a distinction between those Endurants having spatiotemporal properties, subsumed by PhysicalEndurant, versus those which do not, subsumed by NonphysicalEndurant. The latter reflects the cognitive bias of DOLCE and includes categories such as SocialObject, SocialAgent and MentalObject. To date these categories are the least articulated of the categories in DOLCE and form an active area of research for its developers.

Perdurants are classified according to their temporal ‘shape’ paralleling the Vendler classes in linguistics (Vendler 1967), namely, as events, processes, and states. Perdurants have temporal or spatial parts:

“For instance, the first movement of (an execution of) a symphony is a temporal part of it. On the other side, the playing performed by the left side of the orchestra is a spatial part. In both cases, these parts are occurrences themselves. We assume that objects cannot be parts of occurrences, but rather they participate in them.” (Masolo et al. 2003, 24)

7.2 Qualities

DOLCE’s treatment of Quality deserves special mention as it represents a significant extension over the approaches to qualities in SUMO and OpenCyc. DOLCE separates actual qualities from their abstract values, which are defined and located in quality space. Instances of Quality (non-abstractions) are said to “inhere” in their associated hosts, but their values are defined as elements in abstract quality space. This is inspired by the work on conceptual spaces of Gärdenfors (2000). The difference between a Quality and an element in quality space may be explained by reference to a distinction observable in the natural language sentences given in (3-5).

(3) This rose is red.

(4) The color of this rose is red.

(5) The color of this rose changed from red to brown.

The word red in (3) is used as an adjective and refers to a particular instance of Quality. This is the color that the particular rose described has. This quality of color is uniquely that of this particular rose. In contrast, the red of (4) is being used as a nominal rather than an adjective and refers instead to an instance of a color quale, or position in abstract color space, subsumed under the concept Region. That this is necessary is argued by the existence of statements such as (5): here it makes little sense to say that it is the color ‘red’ itself that is changing from red to brown. It is only the particular instantial color that the rose has that is taking up different color values within the abstract color region. We need then to keep these two layers of the account separate: the instantial quality and the quale that it takes as value.

Figure 23: Quality and quality spaces (taken from Masolo et al., 2002: 12)

This relationship between a host, its quality, and a quale is shown well in Figure 23 from Masolo et al. (2002, p12). This kind of graphical representation brings out additional information to the largely taxonomic views of ontologies that we have seen so far. The categories from the taxonomy are shown here as boxes and subsumption is represented by spatial inclusion in the diagram. Thus, a ‘rose’ is a subclass of a Non-Agentive Physical Object, which is itself a subclass of Physical Object. The particular rose that we are discussing in (3-5) is then the dot labelled rose#1. By inheritance, this entity is an instance of a concept ‘Red Object’. In addition to this subsumption information, we also have the ‘horizontal’, cross-links established by the axiomatization. This is shown as arrows relating particular boxes.

Thus, the rose itself stands in the quality relationship qtC with an instance of Color, which is a kind of Physical Quality. This instance is the precise and particular color that rose#1 has. The value of this particular color is then given by the further quale relationship with one of the colors defined in the Color Region, which is itself a subclass of Physical Region. This relationship is also indexed by time and so can change. What then occurs is that the particular color that is the color of the rose is linked with a different element in the color region. The color ‘red’ does not then change; and the particular color that is the rose’s color is always that rose’s color. Making sure these kinds of entities do not come apart in embarrassingly non-realistic ways is one of the most important tasks that an explicit ontology has to perform. The solution proposed here in DOLCE is therefore very important. Insufficient axiomatizations can leave the rose and its color leading independent lives (analogously to doughnuts and their holes leading independent lives), which does not then offer a good fit to ‘reality’.
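The three-way relationship between host, quality and quale can be made concrete in a few lines of code. The following is a minimal sketch, not DOLCE's axiomatization: the class names Quale, Quality and PhysicalObject, the use of integers as time indices, and the method names are all our own assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Quale:
    """A position in an abstract quality space, e.g. 'red' in the color region."""
    space: str
    name: str


@dataclass(eq=False)
class Quality:
    """An individual quality; it inheres in one host and never migrates."""
    kind: str
    host: "PhysicalObject" = field(repr=False)
    quale_at: Dict[int, Quale] = field(default_factory=dict)

    def set_quale(self, t: int, quale: Quale) -> None:
        # The quale relation is indexed by time, so each time gets its own value.
        if quale.space != self.kind:
            raise ValueError("quale belongs to a different quality space")
        self.quale_at[t] = quale


@dataclass(eq=False)
class PhysicalObject:
    name: str
    qualities: Dict[str, Quality] = field(default_factory=dict)

    def add_quality(self, kind: str) -> Quality:
        quality = Quality(kind=kind, host=self)
        self.qualities[kind] = quality
        return quality


RED = Quale("color", "red")
BROWN = Quale("color", "brown")

rose = PhysicalObject("rose#1")
rose_color = rose.add_quality("color")  # the color of this particular rose
rose_color.set_quale(1, RED)            # at time 1 the rose is red
rose_color.set_quale(2, BROWN)          # at time 2 the same quality is brown
```

Sentence (5) corresponds to the two set_quale calls: the individual quality rose_color stays attached to rose#1 throughout, the quale RED is immutable, and only the time-indexed link between them changes.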

Qualities are treated separately again from endurants and perdurants and take up a further equal position in the taxonomy (cf. Figure 22). This is because qualities are not viewed as having parts and so cannot fall under the general mereological axioms that hold for both endurants and perdurants. Quality Regions, in contrast, may be discussed in terms of parthood and are placed under Abstract.

7.3 Primitive relations

DOLCE emphasizes a number of primitive relations that span “multiple application domains”. This means that they are taken to be applicable in many different areas where an ontology needs to be provided. Using these definitions then provides a set of basic building blocks, each of which has a well-defined and understood set of properties. The particular groups of axioms provided are explicitly proposed as being suitable for re-use when constructing alternative ontologies. The primitive relations and their areas include:

• Parthood: “x is part of y”
• Temporary Parthood: “x is part of y during t”
• Constitution: “x constitutes y”
• Participation: “x participates in y during t”
• Quality: “x is a quality of y”
• Quale: “x is the quale of y (during t)”
• Dependence: “x depends on y” (of which there are many distinct kinds)

Many of the axioms involved here have the effect of binding together ‘horizontally’ the categories shown in the taxonomy of Figure 22. This is one beneficial consequence of applying the OntoClean methodology referred to above: the typical knowledge representation problem of ‘isa-overloading’, whereby the subsumption relationship is made to carry a variety of different meanings (often implicitly because the modelling has not been considered sufficiently closely), is avoided and so a richer set of relationships needs to be imposed in addition to the pure subsumption taxonomy. Achieving some degree of ontological clarity concerning these relationships has itself been a major contribution of the DOLCE ontology. This is brought out well by the diagram shown in Figure 24, which again uses the nested box representation adopted above to explain qualities. In this case, however, we see all of the major categories of the DOLCE taxonomy and the kinds of dependency relationships that obtain between them. Each of the dependency relations individuated involves a particular axiomatization that both constrains the configurations possible and allows particular inferences to be drawn (Masolo et al. 2002, pp32-33). We will make use of some of these dependency relations in later deliverables.

Key: MSD: mutual specific dependence; OSD: one-sided specific dependence; OGD: one-sided generic dependence; GK: constant generic constitution; PGD: partial generic spatial dependence.

Figure 24: DOLCE ontological dependencies (taken from Masolo et al., 2002:22)

7.4 Representation

We can see, therefore, that the definitional style of DOLCE is to provide a densely interwoven collection of formal building blocks. Particular axioms are defined that lock constructs into a web of necessarily following consequences. All of the constructs proposed are thereby embedded in an interrelated collection of axioms that strongly constrain possible extensions. These foundational axioms centre primarily around mereology, ontological dependence, temporal and spatial inclusion and constitution.

Several forms of the DOLCE ontology have been specified. The most comprehensive specification uses some aspects of a simple variety of modal logic and keeps itself within first-order logic by employing definition ‘schema’ or ‘macros’ that expand over the finite set of categories employed by the account. This account is not then directly usable for practical reasoning, although experiments with first-order theorem provers would be very interesting. For application, DOLCE adopts a compromise between the power of full first-order logic and a description logic. The developmental methodology followed with DOLCE is described more precisely by Masolo et al. (2003, p6) in terms of the following steps:

1. Describe a foundational ontology on paper, using a full first-order logic with modality;

2. Isolate the part of the axiomatization that can be expressed in the semantic web language, OWL, and implement it;

3. Add the remaining part in the form of KIF comments attached to OWL concepts.

At the moment DOLCE in its full form is available in logical notation in a non-computable readable form and in a variant expressed in KIF (step 1). The OWL form consists of: “simplified translations of Dolce2.0 that do not consider: modality, temporal indexing, relation composition. In addition, different names are adopted for relations that have the same name but different arities in the FOL version.”21 At the time of writing this document, no KIF statements had been attached to the OWL rendering.

7.5 Summary and discussion of DOLCE

According to the dimensions of ontology construction laid out in Section 2, DOLCE has a clear cognitive, non-realist, bias in its philosophical approach. The upper taxonomy is well-defined and based on the OntoClean methodology (Section 2.4 and Guarino & Welty (2004)). DOLCE is firmly based on mereology with little reliance on set theory for its conceptual distinctions. Also, DOLCE assumes an endurantist perspective evinced by its separation of Endurant from Perdurant.

21 Text taken from http://www.loa-cnr.it/DOLCE.html

8 BFO

The Basic Formal Ontology (BFO) is the ontological framework being developed by the researchers at the Institute for Formal Ontology and Medical Information Science (IFOMIS) at the University of the Saarland (and formerly at the Leipzig University), Germany. The BFO framework will actually consist of several ontologies. The intent of IFOMIS is to develop two basic formal ontologies, SNAP and SPAN, and then to extend these to develop a number of material ontologies, especially in the domains of medicine and the geo-sciences. The main description of BFO can be found in Smith & Grenon (2004), Bittner & Smith (2003b), and Smith (forthcoming).

8.1 Philosophical underpinnings of BFO

Before delving into the individual ontologies, some remarks on the philosophical underpinnings of BFO as a whole are in order. First of all, the methodology behind BFO is, in Smith’s terms: “realist, perspectivalist, fallibilist and adequatist” (as outlined by Smith (forthcoming)). The following passage from Grenon & Smith (2004) defines perspectivalism, one of the most important features of BFO with respect to the other ontologies described in this document:

“Perspectivalism maintains that there may be alternative, equally legitimate perspectives on reality. But perspectivalism is constrained by realism: thus it does not amount to the thesis that just any view of reality is legitimate. To establish which views are legitimate we must weigh them against their ability to survive critical tests when confronted with reality, for example via scientific experiments. Those perspectives which survive are deemed to be transparent to reality. This is however in a way that is always subject to further correction...” (Grenon & Smith 2004, p138)

The main outcome of adopting such a perspectivalist stance is that the various perspectives on reality are encoded in separate ontologies. There are two basic perspectives: the snapshot perspective described in SNAP ontologies, and the time-span perspective described by SPAN ontologies. This bifurcation of perspective is not simply a separation based on the 3-D/4-D paradigm. The key assumption behind BFO is, rather, that reality actually consists of two very different sorts of entities, continuants and occurrents. Continuants are said to endure through time, maintaining their identity over any given time points or intervals. Rocks, shapes, and people are examples of continuants. Occurrents on the other hand unfold through time.

“This means that each portion of the time during which an occurrent occurs can be associated with a corresponding temporal portion of the occurrent. This is because occurrents exist only in their successive temporal parts or phases.” (Bittner & Smith 2003a, p2).

Examples include “your smiling, her walking, the landing of an aircraft, the passage of a rainstorm over a forest, the rotting of fallen leaves” (Grenon & Smith 2004, 140). The main insight here is that no single inventory of categories and relations—no single ontology—can account for both SNAP and SPAN entities, as the two views of reality are incompatible. Thus, Grenon & Smith (2004) reject at the outset a reductionist 3-D/4-D paradigm by declaring that the whole of reality can never be accounted for by embracing either of the two (3-D/4-D) approaches:

“...we here depart from Quine, who propounded an exclusively four-dimensionalistic ontology. Processes are precisely occurrent entities in which substances and other SNAP entities participate. Processes are the entities revealed when we adopt a certain special (SPAN) perspective on reality. But the orthogonal (SNAP) perspective is no less faithful to what exists.” (Grenon & Smith 2004).

We illustrated an exclusively four-dimensional ontology in the examples drawn from West in Section 2.2.3 above. A primary motivation for this fundamental separation into two kinds of ontology is that, as a consequence, BFO can use a straightforward mereological parthood relationship that always holds and supports the usual non-problematic cases of parthood in both realms: the SNAP and the SPAN. This counters one traditional problem discussed with parthood concerning what happens when some entity loses an inessential part: such as, e.g., when a cat loses its tail (Simons 1987). In the world before the losing of the tail the cat has certain parts; in the world after the losing of the tail, the parts of the cat are different. One cannot therefore say that the tail is part of the cat without invoking a temporal index. The BFO SNAP/SPAN distinction is intended to escape this dilemma because the two situations, before and after the losing of the tail, are in distinct SNAP ontologies; and within each of these ontologies, parthood is well-defined and atemporal: either the cat has the tail or it does not.22
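The cat example can be sketched in code to show what an atemporal parthood relation inside each snapshot amounts to. This is a minimal illustration under our own assumptions: the class SnapOntology, its field names and the integer index attached to each snapshot as a whole are hypothetical and are not taken from the BFO specification.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class SnapOntology:
    """One snapshot of reality; parthood inside it takes no time argument."""
    time_index: int                       # belongs to the snapshot as a whole
    parthood: FrozenSet[Tuple[str, str]]  # (part, whole) pairs

    def part_of(self, part: str, whole: str) -> bool:
        # Either the pair holds in this snapshot or it does not.
        return (part, whole) in self.parthood


# Two distinct snapshots: before and after the cat loses its tail.
before = SnapOntology(0, frozenset({("tail", "cat"), ("head", "cat")}))
after = SnapOntology(1, frozenset({("head", "cat")}))
```

The question 'is the tail part of the cat?' has no answer in general, but before.part_of("tail", "cat") and after.part_of("tail", "cat") each have a definite one, without a temporal index on the relation itself.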

8.2 SNAP

SNAP ontologies are ontologies where time is not present: they contain a snapshot view of reality. The upper taxonomy of SNAP is given in Figure 25.

Figure 25: SNAP top-level categories

Within any SNAP ontology the essential ontological relationship of parthood behaves simply and reliably: either we are dealing with a snapshot in which the cat has its tail, or a snapshot where the cat has lost its tail; in each ontology all of the current parts of an endurant are present. The SNAP entities “have no temporal parts”. Such entities then endure by being present in sequences of SNAP ontologies: the ‘same’ entity may thus occur in different part-whole relationships only in different snapshots. Each SNAP ontology as a whole has a time-index; the entities making up a given SNAP ontology do not, however. This ensures that each individual SNAP ontology is timeless.

22 We will return to this very productive strategy of restricting the scope within which the part relation may be applied in our discussions of space in Deliverable D2.

8.3 SPAN

SPAN ontologies are intrinsically temporal: they always involve 4-D views of reality. Time is inherently and explicitly contained within any SPAN ontology and so its perdurants, called ‘occurrents’ in BFO, are inherently temporal: they are the space-time worms seen in the figures of Section 2.2.3 above.

“In regard to SPAN, a form of methodological eternalism is appropriate, i.e., of the philosophical doctrine according to which all times (past, present, and future) exist on a par [. . . ] an all-encompassing SPAN ontology has time itself as constituent.” (Grenon & Smith 2004)

In the upper taxonomy of SPAN shown in Figure 26, then, time itself is the maximal TimeRegion. In contrast with SNAP entities, those of SPAN do have temporal parts. This then is the main distinguishing feature, since in a SNAP ontology, time is always explicitly outside the ontology itself.

Within a SPAN ontology, BFO makes a distinction between Processes and Events. Processes are:

“...those extended processual entities which are self-connected wholes. On the one hand, they have beginnings and endings corresponding to real discontinuities, which are their bona fide boundaries. On the other hand, they involve no temporal or spatiotemporal gaps in their interiors. In particular, a given process may not be occurring at two distinct times without occurring also at every time in the interval between them.” (Grenon & Smith 2004, p153)
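The last condition in this quotation, the absence of temporal gaps, can be paraphrased as a first-order constraint. The following is only our sketch: the predicate occursAt and the ordering < on times are notational assumptions introduced here, not symbols used by Grenon & Smith (2004).

```latex
\forall p \,\forall t_1 \,\forall t \,\forall t_2 \;
\Bigl( \mathit{Process}(p) \land \mathit{occursAt}(p, t_1) \land \mathit{occursAt}(p, t_2)
       \land t_1 < t \land t < t_2 \Bigr)
\rightarrow \mathit{occursAt}(p, t)
```

Read in this way, the set of times at which a process occurs is convex with respect to the temporal ordering.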

Events on the other hand are instantaneous boundaries of processes and transitions within processes. Examples include: “the beginning of a conflict; the ceasing to exist of a country as a result of annexation; the detaching of a portion of rock as a result of erosion.” (Grenon & Smith 2004, p154)

Figure 26: SPAN top-level categories

8.4 Trans-ontological relations

Grenon & Smith (2004) reject the notion that reality can be described in a single ontology. They argue for the necessity of trans-ontological relations between various SNAP and SPAN entities. Grenon & Smith (2004) cite “four main types of binary trans-ontological relations”. The four main types are listed here, each given by its signature together with an example:

(SNAP, SPAN), e.g., participant, as in ‘the jumper is a participant in the jumping event’

(SNAP, SNAP), e.g., part, as in ‘the leg was part of the table’

(SPAN, SNAP), e.g., creation, as in ‘burning creates ashes’

(SPAN, SPAN), e.g., as in ‘the poor driving caused an accident’

Grenon & Smith (2004) emphasize that these relations are those which are strictly trans-ontological, as opposed to intra-ontological. Two entities in the same SPAN ontology cannot be related trans-ontologically.
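The notion of a relation signature, and the requirement that the relata of a trans-ontological relation come from different ontologies, can be sketched as follows. The class Entity, the string identifiers for ontologies and both helper functions are hypothetical names chosen for illustration; they do not reproduce any BFO formalization.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str      # "SNAP" or "SPAN"
    ontology: str  # hypothetical identifier of the ontology containing the entity


def signature(a: Entity, b: Entity) -> Tuple[str, str]:
    """The pair of ontology kinds that a binary relation between a and b spans."""
    return (a.kind, b.kind)


def is_trans_ontological(a: Entity, b: Entity) -> bool:
    """True only if the relata belong to different ontologies."""
    return a.ontology != b.ontology


jumper = Entity("jumper", "SNAP", "snap@t1")
jumping = Entity("jumping", "SPAN", "span")
driving = Entity("poor driving", "SPAN", "span")
accident = Entity("accident", "SPAN", "span")
```

Here signature(jumper, jumping) yields ("SNAP", "SPAN"), matching the participation example, while is_trans_ontological(driving, accident) is False if both occurrents are placed in one and the same SPAN ontology, as they are in this toy setup.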

8.5 Representation

The specification of the BFO is mostly within a variant of mereology extended by topology within a first-order language. The formal trans-ontological relationships are expressed similarly, although they are not seen as additional existing entities but, instead, as formal ‘glue’ for holding the framework together. These relationships are expressed in terms of an additional set of logical axioms.

The insistence within BFO on two distinct kinds of ontology, SNAP and SPAN, while simplifying the local (i.e., intra-ontology) accounts of parthood, naturally leads to additional complexity elsewhere in the account. In particular, the problem is raised of how to track the identity of entities across distinct SNAP ontologies: how is it formally specified that the cat with a tail in the SNAP ontology S_i is the ‘same’ cat as that without a tail in the SNAP ontology S_{i+1}? To solve this, the account needs to apply its trans-ontology relations in order to relate the distinct cats to a single cat-life: this ‘cat-life’ is then a SPAN-entity that can be systematically related to as many SNAP ontologies as necessary. The precise formalisation of this model is not yet clarified, although a beginning has been set out in Masolo et al. (2003).

8.6 Summary and discussion of BFO

The BFO framework is a prime example of ontological realism. It also demonstrates the use of mereology as a basic tool, rather than set theory. The main difference between BFO and the other ontologies surveyed is the separation of SNAP entities from SPAN entities. In this respect BFO diverges considerably from the reductionist 3-D/4-D paradigm discussed in Section 2. The notion of a trans-ontological relation introduced to bridge the SNAP-SPAN divide is also a rich area for further ontological investigation and formalization: of particular interest here will be areas such as those of granularity and roles.

In contrast to both SUMO and DOLCE, BFO is situated within the traditional concerns of Ontology proper: that is, for Smith, BFO is a characterisation of what exists, of what is real. It is not cognitively oriented like DOLCE, nor engineering-oriented like SUMO. Nevertheless, and precisely as is the case with DOLCE, Smith argues strongly for the practical relevance of this approach. In Smith (forthcoming), for example, he sets out a convincing case for allowing the ‘world’ and its properties to take on far more of the workload in the design of intelligent systems: many things need not be represented as ‘mental’ properties because the world is already structured in diverse and detailed ways that make such mental representation unnecessary. There are important and suggestive lines of connection to be drawn here with notions of ecological psychology and situated computing, which both similarly place much more load on the world as a supplier of structured information directly rather than insisting on representations of that information internally to some cognitive agent. Smith also makes it clear, however, that there will need to be several distinct ontologies of the world simultaneously: the world of what exists cannot be reduced to a single all-encompassing ontology.
The SNAP/SPAN division is the first strong indication of this: both are ‘true’ but one cannot be reduced to the other.

Whereas BFO is not yet represented in any machine-readable formalism, an attempt at beginning a formalization has been made within the context of the WonderWeb project along the same lines as that pursued for DOLCE (Masolo et al. 2003). There are, however, significant differences between DOLCE and BFO particularly in the areas of spatial representation. BFO, for example, does not group spatial information along with other kinds of ‘qualities’ as DOLCE does (cf. Figure 22) but separates these out into basic categories in their own right as 3D (continuant) Sites (cf. Figure 25) and 4D (occurrent) Settings (cf. Figure 26). Both are distinguished from ‘space as such’, which is a continuant Spatial Region or the spatial component of an occurrent Spacetime region. The precise nature of sites and settings and their relationships to space and time will be taken up in more detail in deliverable I1-[OntoSpace]:D2.

9 General Ontology Language: GOL

Several institutes at the University of Leipzig have been working on an ontology language and an upper level ontology that is intended to serve as a basic organizing structure for their work on the representation of medical information, in particular, that undertaken by the research group in Ontologies in Medicine (Onto-Med). The General Ontology Language (GOL) was first described in Degen et al. (2001) and Heller & Herre (2003), then more recently in Heller & Herre (2004) which expands GOL. GOL shares some of its development history with BFO (Section 8) but is now proceeding as an independent development. GOL “is intended to be a formal framework for building and representing ontologies. The main purpose of GOL is to provide a library of formalized and axiomatized top-level ontologies which can be used as a framework for building more specific ontologies” (Heller & Herre 2004, p8). The first top-level ontology, also described in Heller & Herre (2004), is the General Formal Ontology (GFO). The following section describes the basics of GOL and then provides details of GFO.

9.1 Basic approach of GOL

Concerning the philosophical approach, the ontologies described with GOL, e.g., the GFO, do not commit to realism, conceptualism or nominalism. This is because GOL allows for elements of the material world (immanent universals), elements of the mental world (conceptual structures), and elements of the socio-semiotic world (symbolic structures). In other words, GOL captures three different ‘levels of abstraction’, or strata. Each level is characterized by an integrated system of categories. The material level is characterized by the categories of the basic sciences: biology, chemistry, and physics. The mental level is characterized by categories of awareness (cognitive science) and personality (the phenomenon of will and reaction to experiences). The social level “captures phenomena of communication, of economic and legal realities, language, science, technology, and morals etc.” (Heller & Herre 2004, p17).

GOL belongs to the branch of formal ontology that adopts set theory as its basic construction mechanism; however, just as some ontologies based on mereology include notions of sets, GOL is prepared to admit further direct relationships among set members apart from their participation in sets; Degen et al. (2001) claim this to be a basic problem arising out of the extensionalism of standard set-theoretic accounts. Thus, the basic elements of sets, called Urelements, have an additional rich ontological classification. The precise organization of GOL is still under development and some changes are apparent between specifications: in some respects the more recent specification (Heller & Herre 2003, Heller & Herre 2004) now appears more closely related to BFO and DOLCE than the earlier version represented in Degen et al. (2001); in particular, as we shall see, ‘time-and-space’ entities are now separated out into their own ontological category.23 Our description here will generally follow the more recent version, although the conclusions from Degen et al. (2001) concerning other ontologies will also be drawn upon below.

Although GOL is claimed to be ‘endurantist’ (i.e., committed to a 3D+T modeling of the world rather than a 4D model) there is a crucial relation between objects and their existence in and persistence through time. The persistence of individual objects is guaranteed by relating time-fixed individuals to a corresponding universal: e.g., individual substances are related to an abstract substance which is a universal; Heller and Herre state that this is similar to practice in conceptual modeling. The relation of particular endurants to corresponding processes is a significant feature of the ontology and recurs for several categories: for example, in addition to the substance-process relating substances and time, there are also moment-processes. This can, therefore, be seen as a contribution to the formal inter-ontology relationships proposed by Smith and colleagues for BFO. In all of the definitions of these concepts, the category of boundary plays a central role.

23 Formerly they were kinds of individuals.

9.2 General Formal Ontology: GFO

Despite some similarities, GFO defines a range of basic ontological categories that differ in their organization in various ways from the ontologies seen so far. At the very top level, GFO distinguishes among categories, classes, and concrete entities. The top-level categories of GFO are shown in Figure 27. The following three sections will address each of the entities given in Figure 27, organized under the headings of ‘categories’, ‘classes’, and ‘concrete entities’.

9.2.1 Categories

A category is something that “can be predicated of other entities and they are expressed and represented by terms of a language” (Heller & Herre 2004, p13). There are three kinds of categories: immanent universals (also called universals), conceptual structures (sometimes referred to as concepts), and symbolic structures. Immanent universals are constituents of the real world. Conceptual structures are present in the mind. And symbolic structures are signs or text instantiated by tokens. By positing these three sorts of categories, GFO embraces respectively aspects of realism, conceptualism, and nominalism, but does not claim to commit to any of them. Immanent universals are the first sort of category. “We assume that the immanent univer- sals exist in the individuals (in re) but not independently from them, thus, our view of the immanent universals is Aristotelian in spirit...” (Heller & Herre 2004, p16). Persistants are universals whose instances are presentials—those individual that exist in space and are time stable (see below). Properties are universals that are existentially dependent on indi- viduals. The has-property relation is the formal means whereby properties are related to individuals, though Heller & Herre (2004) leave open the possibility of property ascription I1-[OntoSpace]:D1 67

Figure 27: The top-level categories of the General Formal Ontology

to universals. Some examples of properties include size, shape, color, severity, goodness, and badness. Consider the properties expressed in the following: “the size of a cabinet” and “a big cabinet”. First of all, GFO labels the cabinet as the property bearer. The first phrase expresses a property on the property bearer whereas the latter expresses a property value. “A property value reflects the relationship between the property of x and the same property as exhibited by another entity y. Like properties, property values are considered to be universals” (Heller & Herre 2004, p30). Property values are grouped into measuring systems which, for a given property, can be instantiated in a number of ways according to granularity. This is similar to the DOLCE approach seen earlier, both being inspired by Gärdenfors (2000). From the examples above, size refers to a universal, but GFO also provides a means to refer to particular properties that only single individuals bear, e.g., a particular cabinet having a particular size, “the size of that cabinet”. These entities are called qualities and their particular values are called quality values, both of which are subsumed by concrete entity. The treatment here is similar to the DOLCE approach seen in Section 7, but DOLCE only considers particulars, that is, qualities and quality values in the language of GFO. Qualities are classified according to what kinds of entities bear them, e.g., qualities of physical structure, qualities of processes, and qualities of other qualities.

The remaining sorts of categories are conceptual structures and symbolic structures. The only specification of conceptual structures is that they are concepts as grasped by some agent. And even symbolic structure is only loosely defined as “...signs or texts which may be instantiated by tokens.” The three sorts of categories are related in a very specific way: “an immanent category is captured by a concept which is denoted by a symbolic structure. Texts and symbolic structures may be communicated by their instances which are physical tokens” (Heller & Herre 2004, p13). This triadic organization echoes the traditional ‘meaning triangle’ of Ogden & Richards (1923) as discussed in Sowa (2000).

9.2.2 Classes

The next top level entity is class. Classes are essentially sets, since “[c]lasses and the membership relation satisfy the principle of extensionality: two classes are equal if they have the same members. Extensionality is obviously not true for categories and the instantiation relation” (Heller & Herre 2004, p13). As opposed to categories and concrete entities, classes are not urelements.
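The contrast between extensional classes and non-extensional categories can be illustrated with a minimal sketch. It is not part of GFO itself: the `Category` type, the `instances_of` helper and the example individuals are hypothetical names introduced only for illustration, and Python sets stand in for GFO classes.

```python
from dataclasses import dataclass

# Classes behave like sets: two classes with the same members are equal.
class_a = frozenset({"socrates", "plato"})
class_b = frozenset({"plato", "socrates"})


@dataclass(frozen=True)
class Category:
    """Hypothetical stand-in for a category, identified by its label
    (its intension), not by whatever happens to instantiate it."""
    label: str


def instances_of(category, instantiation):
    """Collect the individuals related to `category` by instantiation."""
    return frozenset(ind for ind, cat in instantiation if cat == category)


featherless_biped = Category("featherless biped")
rational_animal = Category("rational animal")

# Illustrative instantiation relation: pairs of (individual, category).
instantiation = {
    ("socrates", featherless_biped),
    ("plato", featherless_biped),
    ("socrates", rational_animal),
    ("plato", rational_animal),
}

# The two categories have the same instances ...
same_extension = (
    instances_of(featherless_biped, instantiation)
    == instances_of(rational_animal, instantiation)
)
# ... yet they remain distinct categories.
same_category = featherless_biped == rational_animal
```

In this sketch `class_a == class_b` holds because membership alone fixes the identity of a class, whereas `same_extension` is true while `same_category` is false: sharing all instances does not make two categories identical.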

9.2.3 Concrete entities

Concrete entities are the non-instantiable particulars, which include space-time entities and individuals. A space-time entity covers all spatial or temporal regions and their boundaries. Chronoids and topoids are particular types of temporal and spatial regions respectively. Chronoids are temporal intervals with boundaries, while topoids are connected spatial regions having boundaries too, as well as a certain mereotopological structure (Heller & Herre 2004, p19). We will pick up a description of topoids in our deliverable D2.

As for individuals, these are the entities located in space and time. Accordingly individuals are classified according to how they relate to space and time. A presential, for example, exists wholly at a time boundary. One type of presential is the physical structure. Physical structures are bearers of qualities and have a higher degree of independence than, for example, qualities that inhere in them (Heller & Herre 2004, p24). A physical structure consists of an amount of substrate which “...may be understood as a special persistant whose instances are distinct amounts at certain time-boundaries” (Heller & Herre 2004, p24). Finally, physical structures have spatial extension and location in space (see D2). The other kind of presential is the configuration which is a complex unit (a conglomeration) of physical structures, qualities and relators. The only restriction on configuration is the presence of at least one physical object. Configurations are either simple or non-simple. Simple configurations have one physical object and only qualities that inhere in that object, while non-simple configurations have more than one physical object and a number of relators between the objects. Another type of configuration is the situation. This is described thus:

“A situation is a special configuration which can be comprehended as a whole and satisfies certain conditions of unity imposed by certain universals, relations and categories associated with the situation. Hence, situations satisfy a condition of unity and comprehensibility. This implies that situations are related to minds and that the psychological stratum is involved in them. Situations present the most complex presentials of the world. In the realm of presentials they have the highest degree of independence.” (Heller & Herre 2004, p50).

Turning now to occurrents, these entities are extended and located in time: intuitively, they are things that happened—processes. Examples include things like diseases such as rhinitis, actions such as writing a letter, a state such as sitting in front of a computer, etc. There are three sorts of occurrents that GFO includes: generalized process, history, and change. A generalized process is an aggregate of processes, such as a lecture series made up of individual processes. Processes are classified by their temporal “shape”, for example as continuous or discrete and whether or not there is change of state involved. Since presentials depend on processes, there are also special kinds of processes defined and upon which configurations and situations depend: these are termed configuroids and situoids respectively. Situoids are considered to be the most complex, integrated and independent entities of the world; all other entities are embedded into corresponding situoids. Situoids then have determinate temporal and spatial extents provided by chronoids and topoids respectively. This relationship is called framing. Heller and Herre offer the following example:

“An example of a situoid is John’s kissing Mary in a certain environment which contains the substances John and Mary and a relational moment kiss connecting them. Taken in isolation, however, these entities do not yet form a situoid; we have to add a certain environment consisting of further entities and a location to get a comprehensible whole: John and Mary may be sitting on a bench or walking through a park.” (Heller & Herre 2003, p10).

We note in passing here that this opens up not only a connection with Barwise and Perry but with the entire literature on ‘context’ in linguistics and pragmatics: this will be addressed again when we turn to linguistic ontologies in D3. Heller and Herre note that the topoid framing a situoid is a fiat object (in the sense defined by Smith (2001), i.e., brought into being by communication or convention) and it is here that notions of culture and pragmatics become particularly inescapable when concrete descriptions are to be attempted. As Degen et al. explain:

“A situoid always [involves] a certain cut through reality, which means: a certain granularity and point of view (Bittner & Smith 2003b). To capture this idea we assume that every situoid s has associated with it a certain finite number of universals, which are (roughly) those universals which we need to grasp in order to grasp the situoid itself.” (Degen et al. 2001, p37).

Remaining occurrents include history and change. A history is a derived notion consisting of process boundaries, a temporally structured collection of processes that occur together. For example, consider a history of patient data: measuring the temperature of a patient, determining their weight, and determining their blood pressure. Finally, change refers to a pair of coincident process boundaries: intuitively, the instant that something happens, or moment. “For example, the immatriculation of a student is a change. It comprises two process boundaries, one terminating the process of the application at university, one beginning the process of studying” (Heller & Herre 2004, p34). Extrinsic changes are discontinuous and instantaneous; intrinsic changes can be understood to be continuous. Thus, the discrete processes and continuous processes mentioned earlier are definable in terms of extrinsic and intrinsic changes respectively.
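The notion of change as a pair of coincident process boundaries can be made concrete with a small sketch based on the immatriculation example. This is an illustration under simplifying assumptions rather than GFO's own formalization: the `ProcessBoundary` type and the `is_change` function are hypothetical, and coincidence is approximated here by equality of a numeric time stamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessBoundary:
    """Hypothetical record for the boundary of a process."""
    process: str   # name of the process the boundary belongs to
    kind: str      # "start" or "end"
    time: float    # assumed numeric time stamp (a simplification)


def is_change(first: ProcessBoundary, second: ProcessBoundary) -> bool:
    """Treat a pair of boundaries as a change when the first terminates
    one process, the second begins a different process, and the two
    boundaries coincide in time."""
    return (
        first.kind == "end"
        and second.kind == "start"
        and first.process != second.process
        and first.time == second.time
    )


# The immatriculation example: the application ends exactly when
# studying begins.
application_end = ProcessBoundary("application at university", "end", 10.0)
studying_start = ProcessBoundary("studying", "start", 10.0)
immatriculation = is_change(application_end, studying_start)
```

Here `immatriculation` evaluates to true, while a pair such as the start of studying followed by a much later boundary would not count as a change under these assumptions.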

9.3 GFO relations

The basic relations of GFO are set out in the taxonomy in Figure 28, organized by the type of relata. What is particularly different from the ontologies described so far is that part-of is actually an abstract relation with a variety of subrelations classified by the sort of the whole in question.

Figure 28: The relation taxonomy of General Formal Ontology

9.4 Summary and discussion

GOL represents an ambitious ontology effort comparable to that of DOLCE. Degen et al. (2001) list several requirements for the GOL project in general:

“An upper-level ontology must, we hold, satisfy the following criteria: it must include at least the three ontological categories: individuals, universals, and sets, together with a system of relations and predicates containing the basic relations described [above]. These form the necessary core of every ontology. It will need to be extended by further basic relations, including those treating space, time and shape as well as topological relations such as boundary and connectedness.” (Degen et al. 2001, p43)

With GOL we see a sound starting point for ontology building in general with a clearly stated philosophy and methodology. Particularly relevant for robotics operating in real environments through the use of natural language are GOL’s suggestions concerning how aspects of realism (real world), conceptualism (agents) and nominalism (language) can be combined into a unified framework. Also relevant to the goals of the SFB is the notion of topoids and chronoids and how they relate to presentials. This topic will be taken up again in our deliverable D2 in the analysis of space.

10 The D&S extension to DOLCE

The notion of roles and the necessity for variable or flexible descriptions of events, which have recurred in many of our discussions of issues in the ontologies above, present a particularly important area for ontology development. This flexibility is one that must be placed on a more secure formal footing. We will see in deliverable D2 that the ability to make varied ontological statements about space and entities ‘situated’ in space will make this capability particularly crucial. As our last ontology to be described in this overview of approaches, therefore, we focus on an effort that has been undertaken on the basis of the DOLCE ontology introduced above. This work, which is now finding a growing range of adherents and applications, is the Descriptions and Situations (D&S) framework developed by Gangemi & Mika (2003) as a foundational ontology extension. Since this extension is likely to play an increasingly important role in ontology construction, we sketch its main features here.

The starting point for this extension to DOLCE is an acceptance of a direction of development in ontology design in which there is a growing range of ‘black box ontologies’ that need to be related for applications. Since the deeper semantics of such ontologies may not be available in an accessible fashion, Gangemi and Mika propose placing this semantics elsewhere by means of “a mechanism to mime the human cognitive ability to contextualize our ontological commitments, even when we have scanty evidence of them.” (Gangemi & Mika 2003) This mechanism relies on a technical notion of reification applied to contexts: thus contexts become reified objects that can be referred to elsewhere in the account. The essential position of the D&S extension then involves the development and use of

“. . . an ontology of contexts, called Descriptions and Situations (D&S), which provides a principled approach to context reification through a clear separation of states-of-affairs and their interpretation based on a non-physical context, called a description. The ontology of descriptions also offers a situation-description template and reification rules for the principal categories of the DOLCE foundational ontology.” (Gangemi & Mika 2003)

The need for a proper treatment of descriptions is motivated by Gangemi and Mika by drawing on the clear importance of ‘non-physical objects’ that are active in many domains but appear to receive second-class treatment ontologically. These non-physical objects, such as social institutions, organizations, laws, norms and so on, are themselves interrelated in complex ways and clearly benefit from an ontological treatment that makes those relationships clear. It is less than satisfactory, therefore, for their treatment to lag so far behind the description of physical objects. Providing for such entities ontologically then supports their use for defining the intended meanings of ontological components.

As an extension to DOLCE, the D&S framework follows the same definitional methodology and provides an axiomatization of the entities and relations that form the theory. These axioms define the central notion of a context as a basic, first-order and first-class entity. The essential components here are the following:

• a state of affairs, made up of any non-empty set of assertions coherent within a specified first order theory, called a ground ontology.

• a description, seen as an entity that ‘partly represents’ some theory (whether or not formalized) T that can be “conceived” by some agent. Such descriptions can be entities in the ground ontology.

• a situation, ‘constituted by’ the entities and relations that are mentioned in the assertions from some state of affairs and partly representing a model for the theory T according to the axioms of the ground ontology.

With these entities in place, Gangemi and Mika propose that,

“Intuitively, when a description is applied to a state of affairs, some structure (a ‘situation’) emerges (this reflects the cognitive structuring process . . . ). The emerging structure is not necessarily equivalent to the actual structure.” (Gangemi & Mika 2003)

This provides an ontologically founded notion of epistemological layering:

“Epistemological layering consists of assuming that any logical structure Li (either formal or capable of being at least partly formalised) is built upon a structure SoA that it describes according to a theory Ti (either formal or capable of being at least partly formalised). In other words, Ti describes what kind of ontological commitment Li is supposed to have within the epistemological layer that is shared by the encoder of an ontology.” (Gangemi & Mika 2003)

Particularly appealing in this account is the flexibility that is associated with being able to ‘choose’ which ontological components are to serve as the ground ontology. When this varies, we have varying perspectives on the world just as we have seen necessary in several of the accounts described above. The notion of ground ontology that is relied upon in the account is sufficiently general to range over many candidates. This is the sense in which the D&S addition is a generic module that can be used by ontologies of various kinds.

All that is assumed is that the ground ontology has at least one unary predicate and one n-ary predicate whose universe is restricted to the unary predicate. D&S then adds to this, as indicated informally above, two unary predicates, D (Description) and S (Situation), and a binary predicate satisfies (inverse: satisfiedBy) that holds between S and a subset of D, called the situation description (SD). The category of descriptions is added to the assumed ontology in some position compatible with non-physical entities depending on intentional agents: if DOLCE is assumed as the ontology extended, then a suitable home for descriptions is offered by Non-physical endurant. If there is no such suitable category offered, then the ontology is extended simply by descriptions.

Particularly important for our intended use of ontologies is then a further transformation that the addition of D&S induces on the ontology extended. This Gangemi and Mika call

Figure 29: Description and situation extension to DOLCE (taken from Gangemi and Mika, 2003, figure 1)

the functional structure or selectional structure. By means of this mechanism, the D&S extension takes a significant step towards dealing with the crucial notion of roles and role fillers in a very general way. It is largely this feature of the approach that has caused it to receive considerable attention; previous attempts to provide a notion of functional role have not so far achieved an appropriate anchoring of roles and their properties within a generic ontology. The D&S framework is a considerable advance in this direction. Formally:

“For each most general predicate $P_i$ in [an ontology augmented by D&S], there exists a predicate $P_i^D$ subsumed by D (but disjoint from SD), and between each pair $P_i$, $P_i^D$ the selects binary predicate may hold when an instance of $P_i$ is a constituent of a situation.” (Gangemi & Mika 2003)

This establishes a dual structure (the description) whose behavior mirrors that of the structure described. For each predicate that is in the setting of a situation satisfying a situation description, that predicate $P_i$ is selected by the description component $P_i^D$. As a result, categories from the ground ontology can find ‘functional descriptions’ or equivalents in a particular description. For example, a Perdurant from the DOLCE ontology can be ‘selected by’ a Course of Event in the D&S extension, Endurant by a Functional Role, and a Region by a Parameter. Gangemi & Mika (2003) present a UML graph of their extension and its relation to DOLCE; this is shown here in Figure 29. Gangemi & Mika (2003) propose several further interesting inter-relationships between

the descriptions and the ground ontology provided by DOLCE and go on to discuss an application of their account to the ontology of communication; this is, however, more relevant to our deliverable D3 and so we will not discuss it further here. There is also still much to consider with respect to D&S and its consequences for other components of an ontology such as DOLCE. Gangemi and Mika do not describe qualities, for example, an area that we will see to be central in our discussion of spatial ontologies in our deliverable D2. Finally, Masolo, Vieu, Bottazzi, Catenacci, Ferrario, Gangemi & Guarino (2004) build on the account and develop it further towards a useful treatment of social roles. We also expect this to be very significant for future developments.
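The pairing of ground categories with description components described in this section can be sketched informally as a lookup table. The sketch below only restates the three pairings mentioned in the text; the names `SELECTING_COMPONENT` and `may_select` are hypothetical, and the function is a simplified reading of the condition that selects may hold when an instance of the ground category is a constituent of a situation. It is not Gangemi and Mika's axiomatization.

```python
# Ground (DOLCE) category -> D&S description component that may select it,
# following the three pairings given in the text.
SELECTING_COMPONENT = {
    "Perdurant": "Course of Event",
    "Endurant": "Functional Role",
    "Region": "Parameter",
}


def may_select(component: str, ground_category: str,
               is_constituent_of_situation: bool) -> bool:
    """Return True if `component` may select an instance of
    `ground_category`. As a simplification, selection is only allowed
    when the instance is a constituent of some situation."""
    if not is_constituent_of_situation:
        return False
    return SELECTING_COMPONENT.get(ground_category) == component


# A functional role may select an endurant that takes part in a situation.
role_selects_endurant = may_select("Functional Role", "Endurant", True)
# A parameter is not paired with perdurants in this table.
parameter_selects_perdurant = may_select("Parameter", "Perdurant", True)
```

Keeping the pairing in a separate table mirrors the idea of a dual structure: the description side can be varied without touching the ground categories themselves.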

11 Conclusions and recommendations

In this section, we draw together briefly some of the lessons and points derived from our discussion of the currently available ontologies and the methods used in their production. Although the number of ontologies is steadily growing and there are many that we have not discussed, those discussed here are the ones that we see as most significant in approach and, grouped together, they cover the range of theoretical positions currently being pursued. We have seen that there are directions in ontology development which adopt quite different degrees of formal power: some attempt to stay within the general area of description logics, others emphasize the need for more expressive languages. In our own work, based on these experiences, we will begin by drawing formally on two lines of development for representing ontologies:

• We will explore the use of a first-order axiomatization of the kind illustrated most clearly in the DOLCE ontology above. In particular, we will focus, in cooperation with the SFB project I4-[SPIN], on the use of the algebraic specification language CASL (Astesiano et al. 2002) for such first-order specifications.

• We will also pursue specifications that stay within the expressivity of description logics, particularly that supported by the Web Ontology Language variant OWL-DL. Specifications in these terms are scheduled for early use within the SFB (e.g., to support the linguistic processing as quickly as possible: cf. our deliverable D3) and will be executed with standard description logic representation systems, such as Racer, Pellet, Loom, etc.

The axiomatizations themselves will rely on a taxonomic subsumption backbone of ontological categories. We will consider to what extent a mereological foundation is necessary or beneficial here.

In terms of content, we will orient ourselves according to the framework suggested by the DOLCE ontology. We find this to be both the most rigorously axiomatized and the broadest in scope (if not in detail) of the ontologies we have seen above, while at the same time allowing precisely the degree of freedom for further development that we need in those areas to receive detailed attention in our own work on ontology. This means that we can explore such areas in more detail as necessary for our concrete tasks, with strong support from DOLCE but without too restrictive prior structuring. Both the position of space and the position of language are such areas and our approaches to these will be taken up in our deliverables D2 and D3 respectively. The DOLCE axiomatization provides very useful guidelines for just how more refined treatments can be added to the generic foundation and so contrasts with approaches which already have commitments to treatments of language and space built in.

We will also adopt in our specification work, while building on the above, a number of ‘orienting problem areas’ or methods that we expect will be particularly important in shaping our results. These can be listed as follows.

1. The OntoClean methodology (cf. Section 2.4) and its use of the metaproperties of rigidity, identity, unity and dependence will be applied to all our specifications. This will involve reassessing existing fragments (e.g., the ‘linguistic’ ontologies, cf. D3) in terms of their conformance.

2. The notion of identity will be considered particularly carefully. Identity appears as one of the metaproperties of the OntoClean methodology in two guises: discrimination and recognition. The former, which is meant as ‘identity proper’ within OntoClean, is made up of criteria by which elements can be determined to be the same or not; i.e., given two individuals, identity criteria allow us to ascertain whether these two individuals are actually one and the same. The latter are criteria which allow us to know whether we have an example of some particular category or not. Both of these kinds of criteria are used by Guarino & Welty (2004) when allocating their identity metaproperty, although they are clearly very different. The notion of identity within 3D+T ontologies is treated by allowing predicates that are indexed by time to refer to individuals. Thus, an individual x may have some part P at time t1 but not at time t2: but the individual remains. This simple state of affairs hides a host of problems, however, as we see in the need to introduce particular ‘part-of’ relations that may or may not refer to time. Simply to deal with this issue raises questions such as whether some parts are ‘significant’ parts or not (i.e., if an individual loses a significant part, then perhaps it is not the same individual anymore), or whether some parts are in fact ‘too small’ to matter or not (e.g., objects losing and gaining atoms at the quantum level). The notion of identity also raises questions within 4D ontologies. Here we need to know whether an individual observed at time t1 is the ‘same’ individual that is observed at time t2: and the only way this can be done within the 4D space-time world is to say that the two observations refer to the same space-time worm. For this to be possible, the space-time worm itself needs to be identifiable.

In an attempt to deal with some of these issues in a clean fashion, we will explicitly explore the multi-perspectival view proposed for the BFO discussed above: that is, we will require both spacetime worms (SPAN) and 3D entities (SNAP). Identity will be pursued as definable in terms of a formal relationship of individual SNAP entities to individual SPAN entities (i.e., the ‘lives’ of the SNAP entities). This leads naturally to notions of continuous change and conceptual neighborhoods that have been addressed in the spatial reasoning and representation community (cf. Freksa 1991, Cohn & Hazariki 2001) and to which we return in deliverable D2.

3. The notion of granularity is, we suspect, also a crucial organizing feature that may have many consequences for how we specify our ontologies. In Smith’s BFO, any ontology has to commit explicitly to a given granularity and this is part of the specification. This appears to us to be crucial, not only for general ontological work but also for concrete spatial and linguistic problems. Moreover, the ability to move between various granularities will also be important: if our ontology specifications can include granularity at a general level as part of their basic mechanisms, then we will have a very useful tool in place for the modelling of explicit situations where differing granularities may be applied. Granularity will therefore be explored early on in our axiomatizations. Its relation to possibly varying ‘descriptions’ in the D&S extension to DOLCE also needs to be clarified.

4. Possibly related to granularity is the notion of perspective. Here we need to consider the ontological consequences of allowing perspectives as such: to what extent is something simply a ‘point of view’ and to what extent is it a proper component of an ontology. Under Sowa’s (2000) view as mentioned briefly in Section 2.2 above, even basic distinctions such as the continuant/occurrent opposition and granularity come down to a question of an observational perspective and are divorced from the ‘real world’: any ontology using non-atomic (in the quantum sense!) descriptions is then for Sowa automatically not a realist Ontology but a cognitive/perceptual representation as, for example, DOLCE claims itself to be. For Smith and the BFO, on the other hand, many alternative perspectives are equally ‘real’ and belong, therefore, to Ontology. There is something very appealing in this latter view that we will investigate further; in Smith’s terms, for example, if one needs to argue that everyday objects such as chairs and tables are not ‘real’ (e.g., that they are only perceptually constructed on the basis of information given by our senses), then one may have a problem. Is the chair that one is sitting on merely a result of observational inadequacies (cf. Sowa) or is it something more? And if so, what? The issues quickly become complex in this area but we will not be able to ignore them. Particularly when we deal with language and the social entities that arise in normal interaction, questions of perspective and ‘social reality’ will always be present. In particular, we will need to consider differing perspectives in our work on human-robot interaction: here a basic assumption is that a robot interactant and a human user have very different perspectives on a situation; both may be valid, however.

The distinct perspectives are largely due to the very different perceptual abilities of the agents involved. Communication between the two agents will nevertheless have to reconcile these differences without seeing one as a ‘correction’ or approximation for the other: they are qualitatively, not just quantitatively, different.

5. The requirements of different perspectives, different granularities, SNAP and SPAN distinctions, different domains (space, time, language, robot and human joint actions, goals and purposes) and so on raise very forcibly questions of ontology architectures and modularity. This will be taken up particularly in our deliverable D4. Very few of the ontologies presented in this deliverable have had much to say about this issue. The Cyc ontology and its explicit microtheories is perhaps the most developed in this respect but, as we saw, the criteria for microtheory creation and their precise styles of interaction are less clear. The trans-ontology formal relations of Smith’s BFO also address certain issues involving interacting ontology modules, but their formalization is still in its early stages. A refined view of modularity will be essential for organizing the distinct ontological components that we currently envisage.
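The time-indexed treatment of parthood mentioned under point 2 above can be illustrated with a minimal sketch. The fact base, the identifiers and the helper functions are all hypothetical and chosen only for illustration; the sketch simply shows an individual that has a part at t1 but not at t2 while remaining the same individual, as 3D+T accounts assume.

```python
# Time-indexed parthood facts of the form (part, whole, time).
# All identifiers are illustrative.
PART_FACTS = {
    ("leg_a", "table_1", "t1"),
    ("leg_b", "table_1", "t1"),
    ("leg_b", "table_1", "t2"),
}


def has_part(whole, part, time, facts=PART_FACTS):
    """True if `part` is recorded as a part of `whole` at `time`."""
    return (part, whole, time) in facts


def parts_at(whole, time, facts=PART_FACTS):
    """All parts recorded for `whole` at `time`."""
    return {p for (p, w, t) in facts if w == whole and t == time}


# table_1 has leg_a at t1 but not at t2; the identifier table_1, and so
# the individual it names, stays the same across both times.
lost_leg = has_part("table_1", "leg_a", "t1") and not has_part("table_1", "leg_a", "t2")
```

Whether the loss of `leg_a` should count as the loss of a ‘significant’ part, and so threaten identity, is exactly the kind of question that such a bare representation leaves open.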

The ontology components that we will develop will seek to represent real situations that occur in SFB application scenarios, drawing always on the general ontological domains formalized rather than creating ad hoc specialized representations. The precise nature of the relationships between the general ontological domains and the particular situations modelled will be investigated on the basis of the concrete applications and their requirements. As a measure of overall coverage, we will use the, in some cases quite extensive, collections of categories proposed in ontologies such as Cyc and SUMO (and, as we shall see in Deliverable D3, WordNet) as checklists: in the areas of relevance to spatial reasoning, representation and interaction our specifications will need to cover a large proportion of the categories that have been proposed hitherto. We expect, however, that our specifications will differ from those original proposals in their ontological organization as we apply the results of our investigations in the orienting problem areas (1)-(5) above.

References

Allen, J. F. (1984), ‘Towards a general theory of action and time’, Artificial Intelligence 23, 123–154.

Astesiano, E., Bidoit, M., Krieg-Brückner, B., Kirchner, H., Mosses, P. D., Sannella, D. & Tarlecki, A. (2002), ‘CASL - the Common Algebraic Specification Language’, Theoretical Computer Science 286, 153–196. Special issue on Abstract Data Types.

Baader, F., Calvanese, D., McGuinness, D., Nardi, D. & Patel-Schneider, P., eds (2003), The Description Logic Handbook, Cambridge University Press.

Baader, F. & Nutt, W. (2003), Basic description logics, in F. Baader, D. Calvanese, D. McGuinness, D. Nardi & P. Patel-Schneider, eds, ‘The Description Logic Handbook’, Cambridge University Press, chapter 2, pp. 47–100.

Baker, C. F., Fillmore, C. J. & Lowe, J. B. (1998), The Berkeley FrameNet Project, in ‘Proceedings of the ACL/COLING-98’, Montreal, Quebec.

Bechhofer, S. (2003), The DIG description logic interface: DIG/1.1, Technical report, University of Manchester.

Bechhofer, S., Horrocks, I., Goble, C. & Stevens, R. (2001), OilEd: A reason-able ontology editor for the semantic web, in ‘KI-2001: Advances in Artificial Intelligence’, number 2174 in ‘LNAI’, Springer, Berlin, Heidelberg, pp. 396–408.

Bittner, T. & Smith, B. (2001), A taxonomy of granular partitions, in ‘Proceedings of the Conference on Spatial Information Theory - COSIT 2001’, Lecture Notes in Computer Science, Springer-Verlag, Berlin-Heidelberg, pp. 28–43. http://people.ifomis.uni-leipzig.de/thomas.bittner/tp.pdf

Bittner, T. & Smith, B. (2003a), Granular spatio-temporal ontologies, in ‘2003 AAAI Symposium: Foundations and Applications of Spatio-Temporal Reasoning (FASTR)’. http://people.ifomis.uni-leipzig.de/thomas.bittner/SSS303TBittner.pdf

Bittner, T. & Smith, B. (2003b), A theory of granular partitions, in M. Duckham, M. F. Goodchild & M. F. Worboys, eds, ‘Foundations of Geographic Information Science’, Taylor and Francis, London, pp. 117–151. http://wings.buffalo.edu/philosophy/faculty/smith/articles/partitions.pdf

Bittner, T. & Smith, B. (forthcoming), ‘Formal ontologies for space and time’. http://ontology.buffalo.edu/geo/sto.pdf

Brachman, R. J. (1977), ‘What’s in a concept: Structural foundations for semantic networks’, International Journal of Man-Machine Studies 9, 127–152.

Berners-Lee, T., Hendler, J. & Lassila, O. (2001), ‘The semantic web’, Scientific American 284(5).

Carlson, L. & Nirenburg, S. (1992), Practical world modelling for NLP applications, in ‘Proceedings of the Third Conference on Applied Natural Language Processing’, Association for Computational Linguistics, pp. 235–236.

Casati, R. & Varzi, A. C. (1994), Holes and other superficialities, MIT Press (Bradford Books), Cambridge, MA and London.

Casati, R. & Varzi, A. C. (1999), Parts and places: the structures of spatial representation, MIT Press (Bradford Books), Cambridge, MA and London.

Chaffin, R. & Herrmann, D. J. (1988), The nature of semantic relations, in M. Evens, ed., ‘Relational Models of the Lexicon’, Cambridge University Press, Cambridge.

Cohn, A. & Hazarika, S. (2001), ‘Qualitative spatial representation and reasoning: an overview’, Fundamenta Informaticae 43, 2–32.

Common Logic Working Group (2003), Common logic: Abstract syntax and semantics, Technical report. http://cl.tamu.edu

Cycorp (2004a), Cyc 101 tutorial, Technical report. http://www.opencyc.com/doc/tut/

Cycorp (2004b), Cyc tutorial, Technical report. http://www.cyc.com/cycdoc/course/collections-module.html

Cycorp (2004c), Ontological engineer’s handbook v. 0.7, Technical report. http://www.cyc.com/doc/handbook/oe/oe-handbook-toc-opencyc.html

Cycorp (2004d), Opencyc 0.7.0, Technical report. http://www.opencyc.org

Degen, W., Heller, B., Herre, H. & Smith, B. (2001), GOL: A general ontology language, in C. Welty & B. Smith, eds, ‘Proceedings of the 2nd International Conference on Formal Ontology in Information Systems’, ACM Press, New York, pp. 34–45.

Donini, F. M., Lenzerini, M., Nardi, D. & Nutt, W. (1995), The complexity of concept languages, Technical Report RR-95-07, Deutsches Forschungszentrum für Künstliche Intelligenz, Germany. http://citeseer.nj.nec.com/donini91complexity.html

Donnelly, M. (2003), Layered mereotopology, in ‘IJCAI 2003 – Eighteenth International Joint Conference on Artificial Intelligence’. http://ontology.buffalo.edu/medo/Donnelly IJCAI03.pdf

Donnelly, M. (2005), ‘Relative places’, Applied Ontology 1(1), 55–75.

Donnelly, M. & Smith, B. (2003), Layers: A new approach to locating objects in space, in W. Kuhn, M. F. Worboys & S. Timpf, eds, ‘Spatial Information Theory: Foundations of Geographic Information Science’, number 2825 in ‘Lecture Notes in Computer Science’, Springer, Berlin, pp. 50–65. http://ontology.buffalo.edu/geo/Layers.pdf

Farquhar, A., Fikes, R. & Rice, J. (1996), The Ontolingua server: a tool for collaborative ontology construction, in ‘Proceedings of the 10th Banff Knowledge Acquisition for Knowledge Based Systems Workshop (KAW95)’, Banff, Canada. http://ontolingua.stanford.edu/

Fensel, D., Horrocks, I., Van Harmelen, F., Decker, S., Erdmann, M. & Klein, M. (2000), OIL in a nutshell, in R. Dieng, ed., ‘Proceedings of the 12th. European Workshop on Knowledge Acquisition, Modeling and Management (EKAW-00)’, number 1937 in ‘Lecture Notes in Artificial Intelligence’, Springer-Verlag.

Fillmore, C. (1982), Frame semantics, in Linguistics Society of Korea, ed., ‘Linguistics in the morning calm’, Hanshin, Seoul, pp. 111–137.

Fillmore, C. J. (1968), The case for case, in E. Bach & R. T. Harms, eds, ‘Universals in Linguistic Theory’, Holt, Rinehart and Winston, New York.

Freksa, C. (1991), Conceptual neighborhood and its role in temporal and spatial reasoning, in M. Singh & L. Travé-Massuyès, eds, ‘Decision Support Systems and Qualitative Reasoning’, North-Holland, Amsterdam, pp. 181–187.

Gangemi, A., Guarino, N., Masolo, C. & Oltramari, A. (2001), Understanding top-level ontological distinctions, in ‘Proceedings of IJCAI 2001 workshop on Ontologies and Information Sharing’.

Gangemi, A., Guarino, N. & Oltramari, A. (2001), Conceptual analysis of lexical taxonomies: the case of WordNet top-level, in C. Welty & B. Smith, eds, ‘Proceedings of the 2nd International Conference on Formal Ontology in Information Systems’, ACM Press, New York, pp. 285–296.

Gangemi, A. & Mika, P. (2003), Understanding the Semantic Web through descriptions and situations, in ‘Proceedings of ODBASE 2003’.

Gärdenfors, P. (2000), Conceptual spaces: the geometry of thought, MIT Press, Cambridge, MA.

Genesereth, M. (1991), Knowledge interchange format, in J. Allen, ed., ‘Proceedings of the 2nd International Conference on the Principles of Knowledge Representation and Reasoning (KR-91)’, Morgan Kaufmann, pp. 238–249. http://logic.stanford.edu/kif/kif.html

Genesereth, M. & Fikes, R. (1992), Knowledge interchange format, version 3.0. reference manual, Computer Science Department, Technical Report Logic 92-1, Stanford University.

Gerstl, P. & Pribbenow, S. (1995), ‘Midwinters, end games, and body parts: a classification of part-whole relations’, International Journal of Human-Computer Studies 43(5/6), 865–890.

Gibson, J. (1966), The senses considered as perceptual systems, Allen and Unwin, London.

Grenon, P. & Smith, B. (2004), ‘SNAP and SPAN: Towards dynamic spatial ontology’, Spatial Cognition and Computation 4(1), 69–103.

Gruber, T. (1995), ‘Toward principles for the design of ontologies used for knowledge sharing’, International Journal of Human-Computer Studies 43(5/6), 907–928.

Guarino, N. (1994), The ontological level, in R. Casati, B. Smith & G. White, eds, ‘Philosophy and the Cognitive Sciences’, Hölder-Pichler-Tempsky, Vienna.

Guarino, N. (1995), ‘Formal ontology, conceptual analysis and knowledge representation’, International Journal of Human-Computer Studies 43(5/6), 625–640.

Guarino, N. (1998), Formal ontology and information systems, in N. Guarino, ed., ‘Formal Ontology in Information Systems’, IOS Press, Amsterdam, pp. 3–18.

Guarino, N. & Welty, C. (2000), A formal ontology of properties, in ‘Knowledge Engineering and Knowledge Management: Methods, Models and Tools. 12th International Conference, EKAW2000’, Springer Verlag, France, pp. 97–112.

Guarino, N. & Welty, C. (2001), Identity and subsumption, in R. Green, C. Bean & S. Myaeng, eds, ‘The semantics of relationships: an interdisciplinary reader’, Kluwer Academic, Dordrecht.

Guarino, N. & Welty, C. (2002), ‘Evaluating ontological decisions with OntoClean’, Communications of the ACM 45(2), 61–65.

Guarino, N. & Welty, C. (2004), An overview of OntoClean, in S. Staab & R. Studer, eds, ‘Handbook on Ontologies’, Springer-Verlag, Heidelberg and Berlin.

Gurevych, I. & Porzel, R. (2004), SmartKom ontology. OIL file: Release 1.0.

Gurevych, I., Porzel, R. & Malaka, R. (2004), The SmartKom ontology, in W. Wahlster, ed., ‘SmartKom—Foundations of Multimodal Dialogue Systems’, Springer, Berlin.

Haarslev, V. & Möller, R. (2001), RACER system description, in ‘International Joint Conference on Automated Reasoning (IJCAR’2001)’, number 2083 in ‘Lecture Notes in Computer Science’, Springer-Verlag, Berlin, pp. 701–712. http://citeseer.nj.nec.com/haarslev01racer.html

Hahn, U., Schulz, S. & Romacker, M. (1999), Partonomic reasoning as taxonomic reasoning in medicine, in ‘Proceedings of the 16th National Conference on Artificial Intelligence & 11th Innovative Applications of Artificial Intelligence Conference (AAAI’99/IAAI’99)’, AAAI, Menlo Park, CA: AAAI Press, Cambridge, MA: MIT Press, pp. 271–276.

Hayes, P. J. (1979), The naive physics manifesto, in D. Michie, ed., ‘Expert systems in the microelectronic age’, Edinburgh University Press, Edinburgh, Scotland.

Hayes, P. J. (1985a), Naive physics I: ontology for liquids, in J. R. Hobbs & R. C. Moore, eds, ‘Formal theories of the commonsense world’, Ablex Publishing Corporation, New Jersey, pp. 71–108.

Hayes, P. J. (1985b), The second naive physics manifesto, in J. R. Hobbs & R. C. Moore, eds, ‘Formal theories of the commonsense world’, Ablex Publishing Corporation, New Jersey.

Heller, B. & Herre, H. (2003), Formal ontology and principles of GOL, OntoMed Report 1, Institute for Medical Informatics, Statistics and Epidemiology (IMISE), University of Leipzig, Germany. http://www.onto-med.de

Heller, B. & Herre, H. (2004), General ontological language GOL: A formal framework for building and representing ontologies, Technical Report 7, Institute for Medical Informatics, Leipzig, Germany. In collaboration with Patryk Burek, Frank Loebe and Hannes Michalek.

Heller, M. (1991), The ontology of physical objects: four-dimensional hunks of matter, Cambridge University Press, Cambridge.

Horrocks, I. (1998), Using an expressive description logic: FaCT or fiction?, in ‘Proceedings of the 6th International Conference on Principles of Knowledge Representation and Reasoning (KR’98)’, pp. 636–647.

Horrocks, I., Patel-Schneider, P. F. & van Harmelen, F. (2003), ‘From SHIQ and RDF to OWL: The making of a web ontology language’, Journal of Web Semantics 1(1).

Horrocks, I. & Sattler, U. (2001), Ontology reasoning in the SHOQ(D) description logic, in ‘Proceedings of the International Joint Conference in Artificial Intelligence (IJCAI’2001)’, Seattle, USA.

Horrocks, I. & Sattler, U. (2005), A tableaux decision procedure for SHOIQ, in ‘Proc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI 2005)’. http://www.cs.man.ac.uk/ horrocks/Publications/download/2005/HoSa05a.pdf

Horrocks, I., Sattler, U. & Tobies, S. (1999), Practical reasoning for expressive description logics, in H. Ganzinger, D. McAllester & A. Voronkov, eds, ‘Proceedings of the 6th. International Conference on Logic for Programming and Automated Reasoning (LPAR’99)’, number 1705 in ‘Lecture Notes in Artificial Intelligence’, Springer-Verlag, Berlin, pp. 161–180.

Hovy, E. H. & Knight, K. (1993), Motivating shared knowledge resources: an example from the Pangloss collaboration, in ‘Proceedings of IJCAI Workshop on Knowledge Sharing and Information Interchange’, International Joint Conference on Artificial Intelligence.

IEEE (2003), Standard upper ontology knowledge interchange format, Technical report. http://suo.ieee.org/suo-kif.html

Lenat, D. & Guha, R. V. (1989), Building large knowledge-based systems: representation and inference in the CYC project, Addison-Wesley Publishers, New York.

Lewis, D. (1991), Parts of classes, Blackwell, Oxford.

Loux, M. (1998), Metaphysics: A contemporary introduction, Routledge, London.

Lüttich, K. & Mossakowski, T. (2004), Specification of ontologies in CASL, in A. C. Varzi & L. Vieu, eds, ‘Proceedings of the International Conference on Formal Ontology in Information Systems (FOIS-2004)’, IOS Press, Amsterdam, pp. 140–150.

Martin, P. (2003), Knowledge representation, sharing and retrieval on the web, in N. Zhong, J. Liu & Y. Yao, eds, ‘Web Intelligence’, Springer-Verlag, Heidelberg and Berlin. http://www.webkb.org/doc/papers/wi02/

Masolo, C., Borgo, S., Gangemi, A., Guarino, N. & Oltramari, A. (2003), Ontologies library (final), WonderWeb Deliverable D18, ISTC-CNR, Padova, Italy.

Masolo, C., Borgo, S., Gangemi, A., Guarino, N., Oltramari, A. & Schneider, L. (2002), The WonderWeb library of foundational ontologies: preliminary report, WonderWeb Deliverable D17, ISTC-CNR, Padova, Italy.

Masolo, C., Vieu, L., Bottazzi, E., Catenacci, C., Ferrario, R., Gangemi, A. & Guarino, N. (2004), Social roles and their descriptions, in D. Dubois, C. Welty & M.-A. Williams, eds, ‘Proceedings of the Ninth International Conference on the Principles of Knowledge Representation and Reasoning (KR2004)’, Whistler, BC, Canada, pp. 267–277. http://magic.it.uts.edu.au/KR2004/

Miller, G. (1990), ‘WordNet: an online lexical database’, International Journal of Lexicography 3(4). ftp://ftp.cogsci.princeton.edu/pub/wordnet/5papers.pdf

Newell, A. (1982), ‘The knowledge level’, Artificial Intelligence pp. 87–127.

Niles, I. & Pease, A. (2001a), Origins of the Standard Upper Merged Ontology: A Proposal for the IEEE Standard Upper Ontology, in ‘Working Notes of the IJCAI-2001 Workshop on the IEEE Standard Upper Ontology’, pp. 37–42. http://projects.teknowlege.com/IJCAI01/index.html

Niles, I. & Pease, A. (2001b), Toward a standard upper ontology, in C. Welty & B. Smith, eds, ‘Proceedings of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001)’, Association for Computing Machinery, Ogunquit, Maine. http://projects.teknowledge.com/HPKB/Publications/FOIS.pdf

Nirenburg, S. & Raskin, V. (2001), Ontological semantics, formal ontology, and ambiguity, in C. Welty & B. Smith, eds, ‘Proceedings of the 2nd International Conference on Formal Ontology in Information Systems’, ACM Press, New York, pp. 151–161.

Noy, N. F., Sintek, M., Decker, S., Crubezy, M., Fergerson, R. W. & Musen, M. A. (2001), ‘Creating Semantic Web contents with protege-2000’, IEEE Intelligent Systems 16(2), 60–71.

Ogden, C. & Richards, I. (1923), The Meaning of Meaning, Harcourt, Brace, and Co., Inc.

Riazanov, A. & Voronkov, A. (2002), ‘The design and implementation of VAMPIRE’, AI Communications: Special issue on CASC 15(2), 91–110.

Russell, S. J. & Norvig, P. (1995), Artificial Intelligence: a modern approach, Prentice Hall, Englewood Cliffs, N.J.

Russell, S. J. & Norvig, P. (2003), Artificial Intelligence: a modern approach, 2nd international edn, Prentice Hall, Upper Saddle River, N.J., chapter 10: Knowledge Representation, pp. 320–340.

Schmidt-Schauß, M. & Smolka, G. (1991), ‘Attributive concept descriptions with complements’, Artificial Intelligence 48(1), 1–26.

Schulz, S. & Hahn, U. (2001a), Mereotopological reasoning about parts and (w)holes in bio-ontologies, in C. Welty & B. Smith, eds, ‘Proceedings of the 2nd International Conference on Formal Ontology in Information Systems’, ACM Press, New York, pp. 210–221.

Schulz, S. & Hahn, U. (2001b), Necessary parts and wholes in bio-ontologies, in D. Fensel, F. Giunchiglia, D. McGuinness & M.-A. Williams, eds, ‘Proceedings of Principles of Knowledge Representation and Reasoning: Proceedings of the 8th International Conference (KR2002)’, Morgan Kaufmann, San Francisco, CA, pp. 387–394.

Simons, P. (1987), Parts: a study in ontology, Clarendon Press, Oxford.

Smith, B. (1995), ‘Formal ontology, common sense and cognitive science’, International Journal of Human-Computer Studies 43(5/6), 641–668.

Smith, B. (1996), ‘Mereotopology: a theory of parts and boundaries’, Data and knowledge engineering 20, 287–303.

Smith, B. (1997), ‘On substances, accidents and universals: In defense of a constituent ontology’, Philosophical Papers 27, 105–127.

Smith, B. (1998), Basic concepts of formal ontology, in N. Guarino, ed., ‘Formal Ontology in Information Systems’, IOS Press, Amsterdam, pp. 19–28.

Smith, B. (2001), ‘Fiat objects’, Topoi 20(2), 131–148. http://ontology.buffalo.edu/smith/articles/fiat.htm

Smith, B. (forthcoming), Ontology and information systems, in ‘Stanford Encyclopedia of Philosophy’.

Smith, B. & Brogaard, B. (2002), ‘Quantum mereotopology’, Annals of Mathematics and Artificial Intelligence 35(1-2), 153–175. http://ontology.buffalo.edu/smith/articles/QM.htm

Smith, B. & Grenon, P. (2004), ‘The cornucopia of formal-ontological relations’, Dialectica 58(3), 279–296. http://wings.buffalo.edu/philosophy/ontology/smith/articles/cornucopia.pdf

Smith, M. K., Welty, C. & McGuinness, D. L. (2004), OWL web ontology language guide, Technical Report 20040210, World Wide Web Consortium. http://www.w3.org/TR/owl-guide/

Sowa, J. F. (1983), ‘Generating language from conceptual graphs’, Computers and Mathematics with Applications 9(1), 29–43.

Sowa, J. F. (2000), Knowledge Representation: logical, philosophical, and computational foundations, Brooks/Cole, Pacific Grove, CA.

SUMO (2003), ‘The suggested upper merged ontology’, Teknowledge. Version 1.60. http://ontology.teknowledge.com/

Vendler, Z. (1967), Linguistics in Philosophy, Cornell University Press, Ithaca.

Wahlster, W. (2001), SmartKom: Multimodal communication with a life-like character, in ‘Proc. of the 7th European Conference on Speech Communication and Technology - Eurospeech 2001’, Vol. 3, pp. 1547–1550.

West, M. (2002a), Information modelling: An analysis of the uses and meanings of associations, Technical report, Shell Information Technology International Ltd, UK. http://www.matthew-west.org.uk/Documents/InformationModellingPDT2002.pdf

West, M. (2002b), A spatio-temporal model of activity and state, Technical report, Shell Information Technology International Ltd, UK. Written as preparation for the National Science Foundation ACTOR 2002 conference. http://www.matthew-west.org.uk/Documents/Spatio-temporal-Paradigm.pdf

Winston, M. E., Chaffin, R. & Herrmann, D. J. (1987), ‘A taxonomy of part-whole relations’, Cognitive Science 11(4), 417–444.

Woods, W. A. (1975), What’s in a link: foundations for semantic networks, in D. G. Bobrow & A. M. Collins, eds, ‘Representation and Understanding: studies in cognitive science’, Academic Press, New York. Also reprinted in Readings in Knowledge Representation, Ronald J. Brachman and Hector J. Levesque (eds.), Morgan Kaufmann Publishers Inc., 1985, pp. 217–241.

I Appendix: Parthood basics

This appendix very briefly describes the basis of mereology, the formal theory of parts and wholes, as it is used for ontology construction. In addition, we draw attention to the fact that there are many notions of ‘parthood’ and that differing ‘mereologies’ may only go so far in covering these. For a more complete and grounded description of the constructs described here, as well as references to the original sources of mereology in mathematics, the reader is referred particularly to Casati & Varzi (1999, Chapter 3) and Simons (1987).

The formal approach to parts constructs an axiomatic system which, by virtue of increasing constraints, serves to define a range of possible intended interpretations of the notion of ‘part’. Definitions are usually given in first-order logic with equality. The kinds of definitions shown here reoccur in many of the individual ontologies that consider the ‘parthood’ relation; we have seen this concretely above with each of the ontologies discussed.

The starting point for definition is the Ground Mereology (M), made up of a single binary parthood relation P and the properties of reflexivity, antisymmetry and transitivity. Thus parthood is simply a partial ordering over some domain of objects. The objects related by parthood in mereology are typically taken to be ‘regions’ of some dimensionality: illustrations are usually given for the 2D case, but the definitions are not intended to be limited to just two dimensions.

This ground mereology supports definitions of overlap, proper part, over-crossing and proper overlap. It also supports the corresponding formal notions of underlap, under-crossing and proper underlap. Overlap is defined, for example, simply as:

Oxy =df ∃z(Pzx ∧ Pzy)

i.e., two regions x and y overlap when they have a common part z. Underlap is then the corresponding formula, formed by reversing the arguments of the part relations:

Uxy =df ∃z(Pxz ∧ Pyz)

i.e., two regions x and y underlap when they are both parts of a common whole. The other definitions are given in Table 4. Because part can be defined in terms of proper part (plus identity), the same mereology could also be constructed taking proper part as primitive rather than part.

Not just any partial ordering provides a reasonable interpretation of parthood, however, and so it is the job of a good mereology to further constrain what is being said so that reasonable interpretations of parthood follow. Opinions differ on how that can best be done. A very basic extension is the following:

PPxy → ∃z(Pzy ∧ ¬Ozx)

i.e., that if some region y has a proper part x, then it also has a further part z that does not overlap with x. Casati & Varzi (1999) refer to this as ‘weak supplementation’ which, together with the axioms of Table 4, gives rise to the Minimal Mereology (MM); other writers, such as Simons (1987), regard this as so basic to the meaning of ‘part’ that it is listed alongside the axioms given above.

overlap          Oxy  =df ∃z(Pzx ∧ Pzy)
underlap         Uxy  =df ∃z(Pxz ∧ Pyz)
proper part      PPxy =df Pxy ∧ ¬Pyx
over-crossing    OXxy =df Oxy ∧ ¬Pxy
under-crossing   UXxy =df Uxy ∧ ¬Pyx
proper overlap   POxy =df OXxy ∧ OXyx
proper underlap  PUxy =df UXxy ∧ UXyx

Table 4: Axioms of the basic ground mereology

A stronger form of supplementation is:

¬Pyx → ∃z(Pzy ∧ ¬Ozx)

i.e., if y is not a part of x then y has some part z which does not overlap with x. The stronger form entails the weaker form but not vice versa. The combination of the ground axioms with the stronger form of supplementation is called Extensional Mereology (EM). One reason for this name is that it can be proved that non-atomic objects with the same proper parts are identical. Identity is thus made to follow from a sharing of parts or, alternatively, an object is exhaustively defined by its constituent parts. This is already an interpretation of parthood that may or may not be considered ‘natural’. Interactions with time and the possibility that objects may change their parts are immediately rendered non-straightforward.

An alternative path for constraining the notion of part, i.e., of strengthening the theory beyond a partial order, is to provide axioms that insist that the domain of discourse remains ‘closed’ under certain operations, i.e., when the operations are carried out, their results obey all the rules for regions that their operands obeyed. Thus, however far one applies the operations, one stays within the same mereological world. These operations are defined formally in terms of sums and products; mereological sum is also often called fusion.
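As a purely illustrative aside, the ground-mereology definitions and the two supplementation principles can be checked mechanically on small finite models. The following Python sketch is not taken from any of the cited sources; the class `Model` and the two example models are hypothetical names introduced here. The first model assumes that regions are the non-empty subsets of three atoms, with the subset relation as parthood; the second has two distinct wholes with exactly the same two atomic proper parts. Under these assumptions, the second model satisfies weak but not strong supplementation, which illustrates why the entailment runs in one direction only.

```python
from itertools import combinations


class Model:
    """Finite domain plus a parthood relation given as (part, whole) pairs."""

    def __init__(self, domain, pairs):
        self.D = list(domain)
        # Reflexivity is added here; antisymmetry and transitivity are checked.
        self.pairs = set(pairs) | {(x, x) for x in self.D}

    def P(self, x, y):   # x is part of y
        return (x, y) in self.pairs

    def PP(self, x, y):  # proper part: Pxy and not Pyx
        return self.P(x, y) and not self.P(y, x)

    def O(self, x, y):   # overlap: x and y have a common part
        return any(self.P(z, x) and self.P(z, y) for z in self.D)

    def is_partial_order(self):
        D = self.D
        antisym = all(x == y or not (self.P(x, y) and self.P(y, x))
                      for x in D for y in D)
        trans = all(self.P(x, z) or not (self.P(x, y) and self.P(y, z))
                    for x in D for y in D for z in D)
        return antisym and trans

    def _supplement(self, x, y):
        # some part z of y that does not overlap x
        return any(self.P(z, y) and not self.O(z, x) for z in self.D)

    def weak_supplementation(self):
        # PPxy -> exists z (Pzy and not Ozx)
        return all(not self.PP(x, y) or self._supplement(x, y)
                   for x in self.D for y in self.D)

    def strong_supplementation(self):
        # not Pyx -> exists z (Pzy and not Ozx)
        return all(self.P(y, x) or self._supplement(x, y)
                   for x in self.D for y in self.D)


def powerset_model(atoms):
    """Assumed example: non-empty subsets of atoms, parthood = subset."""
    regions = [frozenset(c) for r in range(1, len(atoms) + 1)
               for c in combinations(atoms, r)]
    return Model(regions, {(x, y) for x in regions for y in regions if x <= y})


def non_extensional_model():
    """Assumed example: distinct wholes x and y made of the same atoms a, b."""
    return Model("abxy", {("a", "x"), ("b", "x"), ("a", "y"), ("b", "y")})


for m in (powerset_model([1, 2, 3]), non_extensional_model()):
    print(m.is_partial_order(), m.weak_supplementation(),
          m.strong_supplementation())
# prints "True True True" for the first model and "True True False" for the second
```

In the second model the wholes x and y share all their proper parts yet remain distinct, which is exactly the situation that extensionality rules out.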
Sums can be defined using underlap and overlap, and products can be defined using overlap and part:

Uxy → ∃z∀w(Owz ↔ (Owx ∨ Owy))
Oxy → ∃z∀w(Pwz ↔ (Pwx ∧ Pwy))

With these axioms added to the basic mereology M, the result is the Closure Mereology (CM). When combined with the minimal or extensional mereologies, the result is a corresponding closed form (CMM and CEM respectively). Under CEM it can be proved that “any two underlapping have a unique mereological sum, and any two overlappings have a unique product” (Casati & Varzi 1999, p43).

Figure 30: Hierarchy of mereologies according to strength of commitments; inclusion follows the connecting lines upwards (from Casati and Varzi, 1999, p48)

One final step is to add so-called arbitrary fusions, i.e., fusions over indefinitely many entities. This gives Classical Mereology or General Mereology (GM). It can be defined with an axiom schema of the form:

∃xφ → ∃z∀y(Oyz ↔ ∃x(φ ∧ Oyx))
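The closure axioms and the fusion schema can be explored on a finite model in a similar, purely illustrative way. The sketch below assumes a model whose regions are the non-empty subsets of three atoms, with the subset relation as parthood; this choice and all function names are hypothetical and not drawn from the cited sources. Because the domain is finite, the schema variable φ is approximated by enumerating every non-empty family of regions.

```python
from itertools import combinations

ATOMS = (1, 2, 3)  # assumed atoms; any small finite set would do
# Regions are the non-empty subsets of ATOMS (the empty set is not a region).
REGIONS = [frozenset(c) for r in range(1, len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]


def P(x, y):  # parthood, modelled as the subset relation
    return x <= y


def O(x, y):  # overlap: a common part exists
    return any(P(z, x) and P(z, y) for z in REGIONS)


def U(x, y):  # underlap: a common whole exists
    return any(P(x, z) and P(y, z) for z in REGIONS)


def sums(x, y):
    # every z such that, for all w, Owz <-> (Owx or Owy)
    return [z for z in REGIONS
            if all(O(w, z) == (O(w, x) or O(w, y)) for w in REGIONS)]


def products(x, y):
    # every z such that, for all w, Pwz <-> (Pwx and Pwy)
    return [z for z in REGIONS
            if all(P(w, z) == (P(w, x) and P(w, y)) for w in REGIONS)]


def fusions(family):
    # every z such that, for all y, Oyz <-> some x in the family has Oyx
    return [z for z in REGIONS
            if all(O(y, z) == any(O(y, x) for x in family) for y in REGIONS)]


def families():
    # all non-empty families of regions, standing in for a satisfiable phi
    for r in range(1, len(REGIONS) + 1):
        yield from combinations(REGIONS, r)


# Closure: underlapping pairs have one sum, overlapping pairs one product.
unique_sums = all(len(sums(x, y)) == 1
                  for x in REGIONS for y in REGIONS if U(x, y))
unique_products = all(len(products(x, y)) == 1
                      for x in REGIONS for y in REGIONS if O(x, y))
# Arbitrary fusions: every non-empty family has exactly one fusion.
unique_fusions = all(len(fusions(f)) == 1 for f in families())
# No region is part of every other region.
no_zero = not any(all(P(z, x) for x in REGIONS) for z in REGIONS)

print(unique_sums, unique_products, unique_fusions, no_zero)
# prints: True True True True
```

In this assumed model the sum coincides with set union and the product with set intersection; disjoint regions have no product because the empty set is not a region.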

Adding in the extensional component then yields General Extensional Mereology (GEM), which is known to be a complete Boolean algebra without a zero element. The hierarchy of mereologies described so far is shown graphically in Figure 30, taken from Casati & Varzi (1999, Figure 3.2, p48).

The question remains, however, to what extent this construction corresponds to the notion of ‘part’. There are a number of discussions of problems that arise when applying the framework. One, alluded to above, is the nature of time: if the identity of objects is given by their parts, then many objects that are taken to be the same object by ‘commonsense’ are suddenly revealed to be anything but the same object mereologically (e.g., the cat with or without its tail). This is one reason why, for example, in the definition of the General Ontology Language (cf. Heller & Herre (2003, p14) and Section 9), the following distinct kinds of parthood need to be listed:

part(x, y)      x is part of y
tpart(x, y)     x is temporal part of y
spart(x, y)     x is spatial part of y
cpart(x, y)     x is constituent-part of y (y contains x)
part-eq(x, y)   the reflexive version of part
tpart-eq(x, y)  the reflexive version of tpart
spart-eq(x, y)  the reflexive version of spart
cpart-eq(x, y)  the reflexive version of cpart

The DOLCE ontology has a more restricted range of part-relations, essentially part as such and temporal part, defined thus:

Parthood            P(x, y) → (AB(x) ∨ PD(x)) ∧ (AB(y) ∨ PD(y))
Temporary Parthood  P(x, y, t) → (ED(x) ∧ ED(y) ∧ T(t))

These definitions just type the arguments of the predicates: that is, parthood can hold over two Abstracts (AB) or over two Perdurants (PD), while temporary parthood relates two Endurants (ED) and a time. This reflects the definitions of Perdurants as being ‘entirely present in time’ (thus, they are not compatible with an additional temporal specification), of Abstracts as not being located in time, and of Endurants as being necessarily time-indexed. Other listings of possible parthood relationships have already been shown for OpenCyc and SUMO above.

In terms of representation, the standard accounts of mereology adopt a first-order formalization. There have also been attempts, however, to construct functioning mereological reasoning systems with languages of lower expressivity; Hahn, Schulz & Romacker (1999) and Schulz & Hahn (2001b), for example, show a part-whole modelling in terms of a description logic for the purposes of ontological engineering in medicine. They also extend this to mereotopology in Schulz & Hahn (2001a), of which we will hear more in deliverable D2.

Broader discussions of the kinds of parts that need to be recognized have drawn on linguistic and psychological evidence. In their study of how people describe and use ‘parts’, for example, Winston, Chaffin & Herrmann (1987) defined six types of part-whole relations: component-object (e.g., pedal:bike), member-collection (e.g., tree:forest), portion-mass (e.g., slice:cake), stuff-object (e.g., steel:bike), feature-activity (e.g., paying:shopping), and place-area (e.g., oasis:desert). In a similar vein, Chaffin & Herrmann (1988) add to this list phase-process (e.g., adolescence:growing-up). Each of these allows rather different kinds of inferences to follow.

Moreover, despite this already rather long list, Gerstl & Pribbenow (1995) go further, criticizing Winston et al. and Chaffin and Herrmann somewhat for an overly linguistic bias, and propose “a (language-independent) classification” that is also domain-independent “in that it does not differentiate between physical objects, situations or abstract entities making up a whole” (Gerstl & Pribbenow 1995, p887). Their account distinguishes three distinct kinds of wholes that of necessity induce differing kinds of part-whole relations. The basic organizing feature is given by whether the wholes are to be considered as heterogeneous, uniform or homogeneous. Heterogeneous wholes allow their parts to be distinguished from each other by virtue of the differing roles they may play (e.g., according to function) for the whole; this gives rise to complex/components relationships, such as the car and its engine. Uniform wholes have parts, but those parts are not distinguishable (this is probably the closest to the pure notion of part within mereology); this gives rise to collection/element relationships, such as a bag of peas and the individual peas within it. Finally, homogeneous wholes do not allow particular parts to be identified at all and give rise to mass/quantity relationships, such as 2 litres of the water in the basin.

Gerstl and Pribbenow also state that the part-whole relation applied can depend on the view that is being taken of an entity: for example, 3 kg of the sand in the sandpit vs. 100 grains of the sand in the sandpit; or the ships of the fleet vs. the flagship of the fleet. This question of relating ontological distinctions to choices of perspective is one that has to be considered carefully if ontology construction as such is not to fall back to becoming a matter of arbitrary modelling. As Smith writes: there may be many perspectives, but not all perspectives are true, i.e., real.

We also see a number of approaches in which the scope within which mereological part is to be applicable is restricted in order to provide a more adequate ontological treatment. Particularly significant for us here are Donnelly’s restrictions of parthood to operate within ontological layers (cf. Donnelly 2003, Donnelly & Smith 2003) and within location-complexes (cf. Donnelly 2005): we will explore both of these spatial extensions in more depth in our deliverable D2.
In short, there is still much to be done in clarifying just what kinds of parts there are and how (and where) they are best to be modelled. This notwithstanding, a mereological basis such as that summarized here and adopted in most of the more formal ontologies provides a solid foundation for further investigations. Certainly, separating out basic ontological relations from different usages of the linguistic term ‘part’ represents a necessary step in clarifying the issues.

Index

Abox, 19
arbitrary fusions, 89

classes, 8
Classical Mereology, 89
Closure Mereology, 88
Common Logic, 17
conceptual spaces, 53
context, 72
continuants, 58
Cyc Inference Engine, 22
Cyc Knowledge Base Browser, 22
CycL, 17

dependence, 20
description, 73
description logic, 18
Descriptions and Situations, 72
DIG, 24

endurantist, 11
endurants, 52
epistemological layering, 73
epistemological level, 2
essentially, 20
Extensional Mereology, 88
extensionalism, 65

FaCT, 23
frame problem, 4
frame-based languages, 18
framing, 69
functional structure, 74
fusion, 88

General Extensional Mereology, 89
General Mereology, 89
granular partitions, 14
granularity, 14
Ground Mereology, 87
ground ontology, 72

heterogeneous, 91
homogeneous, 91

identity, 20
instances, 8
instantiation, 8
inverse roles, 22

Knowledge Interchange Format, 16
knowledge level, 1
knowledge sharing, 4

lattice, 25
layers, 91
linguistic ontologies, 6
location-complexes, 91
logical level, 2

mereology, 10, 87
mereotopological, 11
meta-properties, 20
microtheory, 46
Minimal Mereology, 88
modelling language, 15
modularity, 78
Multiple Source Ontology, 50

occurrents, 58
OilEd, 24
OntoClean, 19
Ontolingua, 17
Ontological Engineering, 4
ontological level, 3
ontological realism, 6
Ontologies, 7
ontologies, 7
ontology meta-language, 15

perdurantist, 11
perdurants, 52
practical tractability, 23
Protege, 24

qualified number restrictions, 22
quality space, 53

RACER, 23
re-usability, 4
reification, 72
rigidity, 20

S4(m), 23
scale, 14
selectional structure, 74
Sigma Knowledge System, 22
situation, 73
state of affairs, 72
subsumption, 8
subsumption hierarchies over roles, 22
SUO-KIF, 17

taxonomy, 8
Tbox, 18
trans-ontological relations, 63
transitivity, 22

uniform, 91
unity, 20
upper ontology, 7
urelements, 9
Urelements, 65

Vampire, 22