Toward formalizing ologs: linguistic structures, instantiations, and mappings

MARCO A. PÉREZ, Department of Mathematics, Massachusetts Institute of Technology, [email protected]
DAVID I. SPIVAK, Department of Mathematics, Massachusetts Institute of Technology, [email protected]

October 17, 2018

arXiv:1503.08326v2 [math.CT] 22 Jun 2015

Abstract

We define the notion of linguistic structure on a small category, in order to provide a more formal description of ontology logs, also known as ologs, introduced in [20] by R. E. Kent and the second author. In particular, we construct a bicategory Eng, of English noun phrases and verb phrases, endorsed as functional by varying sets of authors. An olog is then defined as a lax functor to Eng. We then present a new notion of linguistic functor, which extends the notion of meaningful functors defined in [16]. Finally, we discuss the relationship between ologs and databases in this context.

Contents

1 Introduction
2 English language as a bicategory
2.1 Nouns, verbs, and functional sentences
2.2 Some postulates about endorsement
2.3 The bicategory Eng of English expressions
2.4 The bicategory InstEng of instantiated English expressions
3 Ologs, instantiated ologs, and mappings between them
3.1 Linguistic structures
3.2 Mappings between ologs
3.3 Instantiated linguistic structures
3.4 Instantiated functors
References

1 Introduction

The theory of ontology logs (ologs for short) was introduced by Robert Kent and the second author in their paper [20], as a framework for knowledge representation. Ologs are basically mathematical categories that have been wrapped in natural-language English. They have been applied in several branches of science and engineering [3, 5, 8, 9, 13, 19] as a tool for various kinds of formal modeling.

Typically, a person who wishes to record and document some of her knowledge or ideas will do so in prose; for example, a scientist publishes ideas in the form of research papers. Ologs offer the ability to express complex ideas using a special type of diagram. Namely, the objects of study and the relationships between them can be represented as the objects and arrows in a category. The difference between an olog and a category is that an olog has additional structure: each object is labeled with a noun phrase and each arrow is labeled with a verb phrase, so that reading source-arrow-target yields an English sentence. These labels must satisfy certain rules, giving a set-theoretic semantics to the olog, which in turn allows it to serve a dual role as a database schema. There is a formula for composing sentences end-to-end into a new sentence when following a path of arrows through an olog, and a pair of equivalent paths (also known as a commutative diagram) in the category is understood as a declared fact equating the two English sentences.

The idea of regarding English sentences as mappings is not new. One interesting approach to this matter is given by the notion of conceptual metaphor [22], which links one idea (a source domain) to another (a target domain) in order to better understand it. More formal approaches to language, knowledge, and information modeling have been developed by other authors, such as D. Kartsaklis, M. Sadrzadeh, S. Pulman, and B. Coecke. In [12], for instance, they use notions from category theory, such as compact closed categories and strongly monoidal functors, to study meaning in natural language.

The primary goal of this paper is to present a reformulation of the previous concept of olog, which leads to a categorical and linguistic description of mappings between ologs.
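The labeling of objects with noun phrases and arrows with verb phrases, described above, can be illustrated with a minimal Python sketch. The `Arrow` class, the helper names, the example phrase "a person", and the choice to join consecutive arrows with "which" are all assumptions made for illustration; the precise composition formula is the one given in [20] and later in this paper, not the one hard-coded here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Arrow:
    """One labeled arrow of an olog (hypothetical representation)."""
    source: str  # noun phrase labeling the source object
    label: str   # verb phrase labeling the arrow
    target: str  # noun phrase labeling the target object


def read_arrow(arrow: Arrow) -> str:
    """Read source-arrow-target as a single English sentence."""
    return f"{arrow.source} {arrow.label} {arrow.target}"


def read_path(path: Sequence[Arrow]) -> str:
    """Read a composable path of arrows end-to-end.

    Joining with ', which' is an assumed convention for this sketch only.
    """
    if not path:
        raise ValueError("a path needs at least one arrow")
    for first, second in zip(path, path[1:]):
        if first.target != second.source:
            raise ValueError("arrows are not composable")
    sentence = read_arrow(path[0])
    for arrow in path[1:]:
        sentence += f", which {arrow.label} {arrow.target}"
    return sentence


# Hypothetical two-arrow path through a small olog.
is_person = Arrow("a man", "is", "a person")
has_weight = Arrow("a person", "has as weight (in kilograms)",
                   "a number between 20 and 500")
path_sentence = read_path([is_person, has_weight])
```

In this sketch a declared fact would correspond to asserting that two such path sentences with the same start and end describe the same function; checking that assertion is a matter of endorsement by authors, not something the code can decide.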
In our reformulation, any type, aspect, or fact (in the sense of [20]) in an olog must be endorsed by a set of people, who understand its meaning, i.e., the way it indicates a set or a function. This cultural understanding of the English language is captured by a certain bicategory Eng. The above idea that ologs are categories wrapped in English will be formalized by saying that ologs are categories mapping to Eng. Our notion of mapping between ologs falls out of that structure. The result is that a map between ologs is not just a functor between their underlying categories; it is a functor that respects the linguistic description on each node and arrow of the ologs. For example, there is an obvious functor F (namely the identity) between the underlying categories of the following ologs:

\[
\begin{array}{ccc}
\ulcorner\text{a man}\urcorner & & \ulcorner\text{a woman}\urcorner \\
\Big\downarrow{\scriptstyle\ \text{is}} & \xrightarrow{\quad\text{``?''}\quad} & \Big\downarrow{\scriptstyle\ \text{has as weight (in kilograms)}} \\
\ulcorner\text{an object}\urcorner & & \ulcorner\text{a number between 20 and 500}\urcorner
\end{array}
\tag{1}
\]

But would this functor F mean anything? We will rephrase this question in § 3.2 as follows: “does there exist an author who is willing to endorse a linguistic structure on F?” In this paper we explain the difference between a (linguistic) map of ologs and a mere functor between their underlying categories. We will address this particular case (1) in Example 3.2.6.

To some extent, this issue is considered in [16], where the authors introduce the concept of meaningful functor. One limitation of their approach is that it depends on a strong assumption, namely that every olog C is equipped with a functor I : C → Set, and that this functor somehow controls the meaning of the olog. Such a set-valued functor is called an instantiation, a term coming from database theory [17]. The idea is that I represents a kind of database of examples, or instances, for the various types, aspects, and facts in the olog.
For example, if an author is writing an olog C describing a familiar real-world situation, then for a type c (e.g., c = cat) the set I(c) represents all the examples of c (all cats) known by the author. Somewhat strangely, however, there is no requirement in [16] that the database functor I should in any way correspond to the linguistic structure on the olog. A similar issue exists for functors between ologs in [16].

In this paper, we remedy these issues. First, we allow ologs to exist without being instantiated; that is, we disentangle ologs and their instantiations. This way, the set of documented examples can evolve over time, without changing the olog to which they refer. On the other hand, we add a constraint to instantiations: for a set-valued functor to count as an instantiation of an olog C, it must conform to the linguistic structure, the labelings, on C. The same goes for mappings between ologs: in order for a functor to count as a mapping between ologs, it must conform to the linguistic structures involved.

We also take more care to explain the relationship between an olog and its authors. We introduce the concept of endorsement: an author can endorse that a certain concept or relationship between concepts makes sense, that a certain fact is true, etc.

Here is an expert-level view of this paper. There is a bicategory Eng that denotes the English language as it divides into noun phrases, which indicate sets, and verb phrases, which indicate functions, where all of this “indicating” is decided solely by speakers. Given that such a bicategory Eng exists, an olog is just a small category C and a lax functor L : C → Eng, called a linguistic structure on C. Allowing the base category to vary, we get a fibration Olog → Cat, where Olog denotes the category of ologs and Cat is the category of small categories.
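The separation between an olog and its instantiation can be sketched in Python as follows. Everything named here (the dictionaries, the function `is_functional`, the labels "a cat", "a person", "has as owner", and the example data) is hypothetical and chosen for illustration. The check covers only the set-theoretic side, namely that each aspect is recorded as a total function from I(source) to I(target); whether the recorded examples conform to the English labels depends on author endorsement and is not something this code attempts to verify.

```python
# The olog itself: types (objects) and aspects (arrows), with hypothetical labels.
types = ["a cat", "a person"]
aspects = {"has as owner": ("a cat", "a person")}

# An instantiation I, kept separate from the olog:
# a finite set of examples per type and a lookup table per aspect.
instance_sets = {
    "a cat": {"Tom", "Felix"},
    "a person": {"Alice", "Bob"},
}
instance_maps = {
    "has as owner": {"Tom": "Alice", "Felix": "Bob"},
}


def is_functional(aspects, instance_sets, instance_maps) -> bool:
    """Return True if every aspect is a total function I(source) -> I(target)."""
    for name, (source, target) in aspects.items():
        table = instance_maps.get(name, {})
        # Every example of the source type must have exactly one recorded value.
        if set(table) != instance_sets[source]:
            return False
        # Every recorded value must be an example of the target type.
        if not set(table.values()) <= instance_sets[target]:
            return False
    return True


ok_before = is_functional(aspects, instance_sets, instance_maps)
```

Adding a new cat to `instance_sets` (together with its owner in `instance_maps`) changes the instantiation while leaving `types` and `aspects` untouched, which mirrors the point that documented examples can evolve without changing the olog.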
Instantiated ologs are defined similarly: there is a bicategory InstEng ⊆ Eng × Set in which the noun phrase associated to each object is exemplified by a set, and each verb phrase associated to a morphism is exemplified by a function. An instantiated olog is a lax functor C → InstEng, and we again have a fibration InstOlog → Cat. All of this will be explained in the main sections of the paper.

This paper is organized as follows. In § 2 we introduce the bicategory Eng of English language. We present the notion of author endorsement for noun phrases, verb phrases, and equivalence between sentences, along with a list of linguistic guidelines needed to design any olog within Eng. § 3 introduces the category Olog of ologs, as well as the category InstOlog of instantiated ologs, by defining mappings between ologs and between instantiated ologs. To do so, we introduce the notion of linguistic functors and instantiated functors, as mentioned above. The latter of these is an adaptation of the “meaningful functor” notion defined in [16].

Most of the assertions on the bicategory Eng have a linguistic element and so are not purely mathematical. However, with the help of several linguistic postulates (mainly found in § 2.2), fairly formal proofs are possible. Once Eng is given, the rest is straightforward category theory, thus giving the theory of ologs a more solid theoretical basis. As the paper is mainly written for a non-mathematical audience, all proofs are given as prose arguments, merged with the text.

Background and Notation

We will assume the reader is familiar with some basic concepts from category theory, such as opposite categories, isomorphisms, functors, and natural transformations. Readers without category-theoretic background may still benefit from reading the less categorical definitions and results, skimming the category theory, and trying to digest the examples. The books [1, 18] are good sources for category theory, with many illustrations.
No previous knowledge of bicategories, lax functors, or lax transformations is needed; everything we say about these topics will be spelled out in concrete terms. Interested readers can consult [14, Chapter 1] for a brief review of this material.
