Implementingperceptuals

Total Page:16

File Type:pdf, Size:1020Kb

Implementingperceptuals View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Göteborgs universitets publikationer - e-publicering och e-arkiv DEPARTMENT OF PHILOSOPHY, LINGUISTICS AND THEORY OF SCIENCE IMPLEMENTING PERCEPTUAL SEMANTICS IN TYPE THEORY WITH RECORDS (TTR) Arild Matsson Master’s Thesis 30 credits Programme Master’s Programme in Language Technology Level Advanced level Semester and year Spring, 2018 Supervisors Simon Dobnik and Staffan Larsson Examiner Peter Ljunglöf Keywords type theory, image recognition, perceptual semantics, visual question answering, spatial relations, artificial intelligence Abstract Type Theory with Records (TTR) provides accounts of a wide range of semantic and linguistic phenomena in a single framework. This work proposes a TTR model of perception and language. Utilizing PyTTR, a Python implementation of TTR, the model is then implemented as an executable script. Over pure Python programming, TTR provides a transparent formal specification. The implementation is evaluated in a basic visual question answering (VQA) use case scenario. The results show that an implementation of a TTR model can account for multi-modal knowledge representation and work in a VQA setting. Acknowledgements Huge thanks to my supervisors and to professor Robin Cooper, all of whom have provided significant help through this process. Contents 1 Introduction ............................................ 1 1.1 Contribution of this thesis ................................. 1 2 Background ............................................ 2 2.1 Image recognition and object detection ........................... 2 2.2 Computational semantics .................................. 2 2.2.1 Type theory in linguistics .............................. 3 2.3 Perceptual semantics .................................... 3 2.4 Type Theory with Records (TTR) ............................. 4 2.4.1 Overview of TTR .................................. 4 2.5 Visual question answering (VQA) ............................. 6 3 Method ............................................... 6 3.1 PyTTR: Programming with TTR .............................. 6 3.2 Object detection with YOLO ................................ 6 3.3 Objects and perception with TTR ............................. 7 3.4 Spatial relations ....................................... 7 3.5 Language and visual question answering (VQA) ...................... 8 4 Results ............................................... 9 4.1 TTR model ......................................... 9 4.1.1 Object detection ................................... 9 4.1.2 Individuation .................................... 10 4.1.3 Spatial relation classification ............................ 11 4.1.4 Beliefs ........................................ 12 4.1.5 Language ...................................... 12 4.1.6 Agent ........................................ 12 4.2 Combining situation types ................................. 16 4.3 Parsing language to TTR .................................. 17 4.3.1 Parsing to first-order logic (FOL) .......................... 19 4.3.2 Translating FOL to TTR .............................. 19 3 4.4 The relabel-subtype relation ................................ 20 4.4.1 Subtype relabeling algorithm ............................ 21 4.4.2 Restrictions of the algorithm ............................ 22 4.4.3 Implementation ................................... 22 4.5 Additions to PyTTR .................................... 22 4.6 Demonstration ....................................... 24 5 Discussion ............................................. 24 5.1 Subtype check ....................................... 24 5.2 PyTTR ........................................... 24 5.3 Inference-first ........................................ 24 6 Conclusions ............................................ 26 6.1 Suitability .......................................... 26 6.2 Benefits ........................................... 26 6.3 Can it tell us something about semantics? ......................... 26 6.4 Future work ......................................... 26 A Code ................................................ 31 1 Introduction Having computers understand visual input is desirable in several areas. A domestic assistant robot may use a camera to navigate and identify useful objects in a home. Driver-less cars need to be able to read road signs and track other moving vehicles. Web crawlers may extract information from images alongside text on the web. This kind of understanding involves processing sensory (such as visual) input on a cognitive level. Low- level image processing may include tasks such as prominent color extraction, edge detection and visual pattern recognition. Higher-level processing, however, includes identifying objects, their properties and their relations to each other. This information can then be used for language understanding, reasoning, prediction and other cognitive processes. Making the connection between sensory input and cognitive categories is what concerns the field of perceptual semantics (Pustejovsky, 1990). Humans use language to communicate information. Thus it is useful to add linguistic capacities to a per- ceptual system. With vision and language connected, a robot can talk about what it sees, and descriptions can be automatically generated for images found on the web. Image caption generation is indeed a popular task for evaluating computer vision systems. Another one is visual question answering (VQA) (Antol et al., 2017), where the system is expected to generate answers to natural-language questions in the context of a given image. The connection between different modes of information, such as vision and language, requires a model of semantic representation. Formal models such as first-order logic (FOL) have long been of choice, but recent developments have seen data-driven approaches excel in some cases. Briefly put, the former kind tends to deliver deep structures of information in narrow domains, while the latter more easily covers wide domains, but with shallow information content (Dobnik & Kelleher, 2017). A recent contribution that combines several branches in formal systems is Type Theory with Records (TTR) (Cooper, 2005a, 2016). TTR is implemented in Python as PyTTR (Cooper, 2017). 1.1 Contribution of this thesis The main purpose of this thesis is to extend the basic implementation of TTR (PyTTR, Cooper (2017)) to apply it for the first time in a practical task relating vision and language, in particular VQA. The questions that this research raises are: 1. To what degree is (a) TTR as a theoretical framework and (b) its existing practical implementation suited to connect existing vision and language systems? 2. What are the benefits of using TTR this way for (a) vision and language systems and (b) visual question answering? 3. What can connecting vision and language systems tell us about semantics (and TTR)? To explore these questions, a model will be formulated in TTR and implemented using PyTTR. The model will benefit from builing on past proposals; especially relevant are Dobnik & Cooper (2013) and Dobnik & Cooper (2017). As a limitation for the VQA task, the language domain is restricted to polar (yes/no) questions. 1 The theoretical background is summarized in Section 2. In Section 3, the strategies and techniques used for the implementation are described. The implementation is then presented in Section 4. In Section 5, the results are discussed in relation to the questions above. Finally, some conclusions are made in Section 6. 2 Background This section will highlight some important pieces of the history of past research in relevant fields. 2.1 Image recognition and object detection In image recognition, visual data is analyzed in order to detect and classify objects. A wide range of models have been developed to solve this task. Some focus only on the detection of objects (Blaschko & Lampert, 2008), some only on classification (e.g. ResNet, He et al., 2016), and others attempt to solve the full problem in an integrated fashion (Redmon et al., 2016; He et al., 2017). Like in machine learning in general, and other kinds of data processing, significant values from the input data (the image) are collected in a process known as feature extraction. Various types of features exist for image processing. For one, Scale-invariant feature transform (SIFT) is a technique where significant locations of an image are used to extract keypoints (Lowe, 1999). Classification can then be performed by comparing the keypoints of a target image to those in a database. With deep neural networks, however, the need for prior feature extraction is generally eliminated (He et al., 2016, 2017). Neural networks that process image data typically contain convolutional network layers. Color image data is highly dimensional, typically represented as a 2D matrix of pixels, where each pixel itself specifies a quantity of each of three basic colors. Convolutional layers are used to divide the image into smaller, overlapping segments, thus capturing locational aspects of the data. This way, the dimensionality can be reduced to a single, one-dimensional vector. 2.2 Computational semantics Semantics is the study of meaning. Computational semantics is concerned with how to represent meaning digitally, and use it to perform semantic parsing, inference and other tasks (Blackburn & Bos, 2003). A well-established and largely capable formalism for expressing and operating on propositions is first-order logic (FOL). In FOL, the
Recommended publications
  • How to Do Things with Types⇤
    Joint Proceedings NLCS’14 & NLSR 2014, July 17-18, 2014, Vienna, Austria How to do things with types⇤ Robin Cooper University of Gothenburg Abstract We present a theory of type acts based on type theory relating to a general theory of action which can be used as a foundation for a theory of linguistic acts. Contents 1 Introduction 1 2 Type acts 2 3 Coordinating interaction in games 5 4 Conclusion 9 1 Introduction The title of this paper picks up, of course, on the seminal book How to do things with words by J.L. Austin [Aus62], which laid the foundation for speech act theory. In recent work on dialogue such as [Gin12] it is proposed that it is important for linguistic analysis to take seriously a notion of linguistic acts. To formulate this, Ginzburg uses TTR (“Type Theory with Records”) as developed in [Coo12]. The idea was also very much a part of the original work on situation semantics [BP83] which has also served as an inspiration for the work on TTR. TTR was also originally inspired by work on constructive type theory [ML84, NPS90] in which there is a notion of judgement that an object a is of a type T , a : T .1 We shall think of judgements as a kind of type act and propose that there are other type acts in addition to judgements. We shall also follow [Ran94] in including types of events (or more generally, situations) in a type theory suitable for building a theory of action and of linguistic meaning. ⇤The work for this paper was supported in part by VR project 2009-1569, SAICD.
    [Show full text]
  • Department of Philosophy, Linguistics and Theory of Science on 2018-01-25 to Be Valid from 2018-01-25
    DEPARTMENT OF PHILOSOPHY, LINGUISTICS AND THEORY OF SCIENCE Type Theory with Records: From Perception to Communication, 7,5 higher education credits Type Theory with Records: Från perception till kommunikation, 7,5 högskolepoäng Third Cycle/Forskarnivå Confirmation The course syllabus was confirmed by the Department of Philosophy, Linguistics and Theory of Science on 2018-01-25 to be valid from 2018-01-25. Entry requirements General and specific entry requirements for third-cycle education according to Admissions Regulations and the general syllabus for the subject area at hand. Learning outcomes After completion of the course the doctoral student is expected to be able to : Knowledge and understanding demonstrate enhanced knowledge and understanding of the subject areas of the course, discuss in detail key theories and issues in one or more of the subject areas of the course, Skills and abilities offer a detailed analysis of issues, lines of arguments or methods from the literature, select and define issues, lines of arguments or methods from the literature that are suitable for a short critical essay or a conference paper, Judgement and approach critically discuss, orally or in writing, questions, lines of arguments or methods that are used in the literature, critically assess the validity of the studied lines of argument. Course content The course introduces TTR, a Type Theory with Records, as a framework for natural language grammar and interaction. We are taking a dialogical view of semantics. The course covers the formal foundations of TTR as well as TTR accounts of perception, intensionality, information exchange, grammar (syntax and semantics), quantification, modality and other linguistic phenomena.
    [Show full text]
  • Witness-Loaded and Witness-Free Demonstratives
    Witness-loaded and Witness-free Demonstratives Andy Lücking Goethe University Frankfurt, Germany To Appear 2017 Prefinal draft, cite only full chapter Abstract According to current theories of demonstratives, both discourse referentially (en- dophoric) and real-world referentially (exophoric) uses of demonstrative noun phrases (DemNPs) obey the same mode of reference. Based on the clarification potential of DemNPs and on data on bridging and deferred reference it is argued that only ex- ophoric DemNPs allow for the identification of a demonstratum, while endophoric ones do not. Furthermore, the view that discourse reference does not involve a demonstration act is taken and, hence, contrary to standard assumption, the claim is made that both uses follow different modes of reference. In order to maintain a unified analysis of DemNPs, it is argued to spell out their semantics in terms of a grammar-dialog interface, where demonstratives and demonstration acts contribute to processing instructions for reference management. In this system, exophoric Dem- NPs are modeled as witness-loaded referential expressions, while endophoric Dem- NPs remain witness-free. A final claim is that the witness gives rise to manifold perceptual classifications, which in turn license indirect reference. The analysis is implemented in Type Theory with Records (which provides the notion of a witness) within Ginzburg’s dialog framework called KoS. The dynamics of demonstratives is captured by a set of rules that govern their processing in dialog. Keywords: demonstratives, demonstration, reference, deferred reference, wit- nesses, dialog 1 Introduction Demonstrative noun phrases (DemNPs) like this painting can be used in two ways: exophorically and endophorically (the latter can be further distinguished into anaphoric and cataphoric).
    [Show full text]
  • Theoretical Linguistics, Vol
    Edinburgh Research Explorer Action-based grammar Citation for published version: Kempson, R, Cann, R, Gregoromichelaki, E & Chatzikyriakidis, S 2017, 'Action-based grammar', Theoretical Linguistics, vol. 43, no. 1-2, pp. 141-167. https://doi.org/10.1515/tl-2017-0012 Digital Object Identifier (DOI): 10.1515/tl-2017-0012 Link: Link to publication record in Edinburgh Research Explorer Document Version: Publisher's PDF, also known as Version of record Published In: Theoretical Linguistics General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 29. Sep. 2021 Theoretical Linguistics 2017; 43(1-2): 141–167 Ruth Kempson, Ronnie Cann, Eleni Gregoromichelaki and Stergios Chatzikyriakidis Action-Based Grammar DOI 10.1515/tl-2017-0012 1 Competence-performance and levels of description First of all we would like to thank the commentators for their efforts that resulted in thoughtful and significant comments about our account.1 We are delighted that the main thesis we presented – the inclusion of interactive language use within the remit of grammars – seems to have received an unexpected consensus from such diverse perspectives.
    [Show full text]
  • Modelling Language, Action, and Perception in Type Theory with Records
    Modelling Language, Action, and Perception in Type Theory with Records Simon Dobnik, Robin Cooper, and Staffan Larsson Department of Philosophy, Linguistics and Theory of Science University of Gothenburg, Box 200, 405 30 G¨oteborg, Sweden [email protected], {staffan.larsson,robin.cooper}@ling.gu.se http://www.flov.gu.se Abstract. Formal models of natural language semantics using TTR (Type Theory with Records) attempt to relate natural language to per- ception, modelled as classification of objects and events by types which are available as resources to an agent. We argue that this is better suited for representing the meaning of spatial descriptions in the context of agent modelling than traditional formal semantic models which do not relate spatial concepts to perceptual apparatus. Spatial descriptions in- clude perceptual, conceptual and discourse knowledge which we represent all in a single framework. Being a general framework for modelling both linguistic and non-linguistic cognition, TTR is more suitable for the mod- elling of situated conversational agents in robotics and virtual environ- ments where interoperability between language, action and perception is required. The perceptual systems gain access to abstract conceptual meaning representations of language while the latter can be justified in action and perception. Keywords: language, action, perception, formal semantics, spatial de- scriptions, learning and classification. 1 Introduction Classical model-theoretic semantics [1] does not deal with the connection be- tween linguistic meaning and perception or action. It does not provide a theory of non-linguistic perception and it is not obvious how to extend it to take account of this. Another feature of classical approaches to model theory is that they treat natural language as being essentially similar to formal languages in, among other things, assuming that there is a fixed interpretation function which mediates be- tween natural language and the world.
    [Show full text]
  • Reference and Perceptual Memory in Situated Dialog
    Referring to the recently seen: reference and perceptual memory in situated dialog John D. Kelleher Simon Dobnik ADAPT Research Centre CLASP and FLOV ICE Research Institute University of Gotenburg, Sweden Technological University Dublin [email protected] [email protected] Abstract From theoretical linguistic and cognitive perspectives, situated dialog systems are interesting as they provide ideal test-beds for investigating the interaction between language and perception. At the same time there are a growing number of practical applications, for example robotic systems and driver-less cars, where spoken interfaces, capable of situated dialog, promise many advantages. To date, however much of the work on situated dialog has focused resolving anaphoric or exophoric references. This paper, by con- trast, opens up the question of how perceptual memory and linguistic refer- ences interact, and the challenges that this poses to computational models of perceptually grounded dialog. 1 Introduction Situated language is spoken from a particular point of view within a shared per- ceptual context [6]. In an era where we are witnessing a proliferation of sensors that enable computer systems to perceive the world, effective computational mod- els of situated dialog have a growing number of practical applications, consider applications in human-robot interaction in personal assistants, driver-less car inter- faces that allow interaction with a passenger in language , and so on. From a more fundamental science perspective, computational models of situated dialog provide arXiv:1903.09866v1 [cs.HC] 23 Mar 2019 a test-bed for theories of cognition and language, in particular those dealing with the binding/fusion of language and perception in the interactive setting involving human conversational partners and ever-changing environment.
    [Show full text]
  • Mass/Count Variation: a Mereological, Two-Dimensional Semantics
    Baltic International Yearbook of Cognition, Logic and Communication Volume 11 Number: Cognitive, Semantic and Crosslinguistic Approaches Article 11 2016 Mass/Count Variation: A Mereological, Two-Dimensional Semantics Peter R. Sutton Heinrich Heine University Düsseldorf, Germany Hana Filip Heinrich Heine University Düsseldorf, Germany Follow this and additional works at: https://newprairiepress.org/biyclc Part of the Semantics and Pragmatics Commons This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License. Recommended Citation Sutton, Peter R. and Filip, Hana (2016) "Mass/Count Variation: A Mereological, Two-Dimensional Semantics," Baltic International Yearbook of Cognition, Logic and Communication: Vol. 11. https://doi.org/ 10.4148/1944-3676.1110 This Proceeding of the Symposium for Cognition, Logic and Communication is brought to you for free and open access by the Conferences at New Prairie Press. It has been accepted for inclusion in Baltic International Yearbook of Cognition, Logic and Communication by an authorized administrator of New Prairie Press. For more information, please contact [email protected]. Mass/Count Variation 2 The Baltic International Yearbook of The other index is a precisification index (inspired by Chierchia 2010). Cognition, Logic and Communication At some precisifications, the set of entities that count as ‘one’ can be excluded from a noun’s denotation leading to mass encoding. We com- December 2016 Volume 11: Number: Cognitive, Semantic and bine these indices into a two dimensional semantics that not only mod- Crosslinguistic Approaches els the semantics of a number of classes of nouns, but can also explain pages 1-45 DOI: http://dx.doi.org/10.4148/1944-3676.1110 and predict the puzzling phenomenon of mass/count variation.
    [Show full text]
  • Indexicals As Weak Descriptors
    Indexicals as Weak Descriptors Andy Lücking Goethe University Frankfurt Abstract Indexicals have a couple of uses that are in conflict with the traditional view that they directly refer to indices in the utterance situation. But how do they refer instead? It is argued that indexicals have both an indexical and a descriptive aspect – why they are called weak descriptors here. The indexical aspect anchors them in the actual situation of utterance, the weak descriptive aspect sin- gles out the referent. Descriptive uses of “today” are then attributed to calendric coercion which is triggered by qunatificational elements. This account provides a grammatically motivated formal link to descriptive uses. With regard to some uses of “I”, a tentative contiguity rule is proposed as the reference rule for the first person pronoun, which is oriented along recent hearer-oriented accounts in philosophy, but finally has to be criticized. 1 Descriptive Indexicals Indexicals have descriptive uses as exemplified in (1a) (taken from Nunberg, 2004, p. 265): (1) a. Today is always the biggest party day of the year. b. *November 1, 2000 is always the biggest party day of the year. According to Nunberg (2004), today in (1a) is interpreted as picking out a day type or day property instead of referring to a concrete day, since “[. ] the interpretations of these uses of indexicals are the very things that their linguistic meanings pick out of the context.” (p. 272). The full date in (1b), to the contrary, refers to a particular day and has no such type or property reading. The interpretations of both sentences in (1) diverge, even if both sentences are produced on November 1, 2000.
    [Show full text]
  • Semantic Theories of Questions∗
    Semantic theories of questions∗ Floris Roelofsen October 17, 2018 Summary This survey article discusses two basic issues that semantic theories of ques- tions face. The first is how to conceptualise and formally represent the semantic content of questions. This issue arises in particular because the standard truth-conditional notion of meaning, which has been fruitful in the analysis of declarative statements, is not applicable to questions. The second issue is how questions, when embedded in a declarative statement (e.g., in Bill wonders who called) contribute to the truth-conditional content of that state- ment. Several ways in which these issues have been addressed in the literature are discussed and compared. Keywords semantics of questions, alternative semantics, partition semantics, inquisitive semantics, interrogative complements, clause-embedding predicates. 1 Introduction Formal semantic theories traditionally construe the meaning of a sentence in terms of its truth-conditions: to know the meaning of a sentence is to know the conditions under which the sentence is true, and those under which it is false. This notion of meaning has been very fruitful in the analysis of declarative statements, but it does not straightforwardly apply to questions, which are not naturally thought of as being true or false in a given situation. Thus, the most basic issue that semantic theories of questions need to address is the following: • Issue 1: How should the semantic content of a question be conceptualised, and how should it be represented formally? Questions are not only of interest in their own right. They can also be part of declarative statements and as such contribute to the truth-conditional content of such statements.
    [Show full text]
  • Types Theory and Lexical Semantics Robin Cooper, Christian Retoré
    Extended abstracts of the ESSLLI 2015 workshop TYTLES: Types Theory and Lexical Semantics Robin Cooper, Christian Retoré To cite this version: Robin Cooper, Christian Retoré. Extended abstracts of the ESSLLI 2015 workshop TYTLES: Types Theory and Lexical Semantics. Aug 2015, Barcelona, Spain. 2015. hal-01584832v2 HAL Id: hal-01584832 https://hal.archives-ouvertes.fr/hal-01584832v2 Submitted on 24 Sep 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Extended abstracts of the ESSLLI 2015 workshop TYTLES: Types Theory and Lexical Semantics Workshops organizers / programme committee co-chairs: Robin Cooper & Christian Retoré Barcelona, 3-7 August 2015 Those are the extended abstracts of the TYTLES (Barcelona, August 3-7 2015) workshop together with the slides that were projected during the introduction and the conclusion. Some authors may publish alternative versions later on elsewhere. Program committee: • Robin Cooper (University of Gothenburg, Co-Chair), • Christian Retoré (Université de Montpellier, & LIRMM Co-Chair) • Alexandra Arapinis (CNR,
    [Show full text]
  • Ellipsis in Head-Driven Phrase Structure Grammar
    Ellipsis in Head-Driven Phrase Structure Grammar Jonathan Ginzburg and Philip Miller∗ Universite´ Paris Diderot, Laboratoire de Linguistique Formelle (UMR 7110) and EA 3697 CLILLAC-ARP June 16, 2017 1 Introduction 1.1 General properties of ellipsis The characterizing feature of ellipsis is that elements of semantic content are obtained in the absence of any corresponding syntactic form. The syntax thus appears to be incomplete. More specifically, the implicit semantic content is recovered from elements of the linguistic and non-linguistic context. In this sense, ellipsis is similar to anaphora, except that there is no overt anaphoric element involved. The elements present in an elliptical clause are predicates, arguments or adjuncts of what is omitted. It is the presence of these elements that makes it possible to recognize the ellipsis. It is in principle always possible to reanalyse any elliptical phenomenon as a case of anaphora, either by hypothesizing an unpronounced pro-form in the elliptical site (zero-anaphora) or by considering the licenser to be anaphoric (e.g., considering the pre-elliptical auxiliary to be an anaphor in Verb Phrase ellipsis (VPE), cf., Schachter 1978; ∗This piece is dedicated to the memory of our friend, teacher, and collaborator, Ivan Sag. Ivan was originally intended as a co-author of this piece, but died before it was drafted. His inspiration and many ideas are evident, though the chapter is much the poorer due to his absence. We would like to thank Doug Arnold, Bob Borsley, Rui Chaves, Jong-Bok Kim, Jason Merchant, Franc¸ois Mouret, Stefan Muller,¨ Enric Vallduv´ı, and Tom Wasow, as well as two anonymous reviewers for comments on the first version.
    [Show full text]
  • Interfacing Language, Spatial Perception and Cognition in Type Theory with Records
    Interfacing language, spatial perception and cognition in Type Theory with Records Simon Dobnik and Robin Cooper Department of Philosophy, Linguistics & Theory of Science (FLOV) Centre for Linguistic Theory and Studies in Probability (CLASP) University of Gothenburg, Sweden abstract We argue that computational modelling of perception, action, lan- Keywords: spatial guage, and cognition introduces several requirements of a formal se- language, Type mantic theory and its practical implementations in situated dialogue Theory with Records (TTR), agents. Using examples of semantic representations of spatial descrip- computational tions we show how Type Theory with Records (TTR) satisfies these framework requirements and provides a promising knowledge representation sys- tem for situated agents. 1 introduction In this paper, we consider the treatment of spatial language from the perspective of a robot learning spatial concepts and classifying situa- tions according to the spatial relations holding between objects while interacting with a human conversational partner. We start from our ex- perience of building such agents and a conclusion that there is a need for a unified knowledge representation system that connects theories of meaning from formal semantics to practical implementations. We suggest that the type-theoretic notion of judgement is important and that a type theory such as TTR (Type Theory with Records) is advan- tageous because it can be used to model both the low level perceptual judgements of the robot as well as the conceptual judgements associ- ated with spatial relations. This is distinct from previous approaches Journal of Language Modelling Vol 5, No 2 (2017), pp. 273–301 Simon Dobnik, Robin Cooper (discussed in Section 3) where the low level perceptual processing is carried out in an entirely different system to that used for semantic processing.
    [Show full text]