Extracting Common Sense Knowledge from Text for Robot Planning

Peter Kaiser1 Mike Lewis2 Ronald P. A. Petrick2 Tamim Asfour1 Mark Steedman2

Abstract— Autonomous robots often require domain knowledge to act intelligently in their environment. This is particularly true for robots that use automated planning techniques, which require symbolic representations of the operating environment and the robot's capabilities. However, the task of specifying domain knowledge by hand is tedious and prone to error. As a result, we aim to automate the process of acquiring general common sense knowledge of objects, relations, and actions, by extracting such information from large amounts of natural language text, written by humans for human readers. We present two methods for knowledge acquisition, requiring only limited human input, which focus on the inference of spatial relations from text. Although our approach is applicable to a range of domains and information, we only consider one type of knowledge here, namely object locations in a kitchen environment. As a proof of concept, we test our approach using an automated planner and show how the addition of common sense knowledge can improve the quality of the generated plans.

Fig. 1: The humanoid robots ARMAR-IIIa (left) and ARMAR-IIIb working in a kitchen environment ([5], [6]).

I. INTRODUCTION AND RELATED WORK

Autonomous robots that use automated planning to make decisions about how to act in the world require symbolic representations of the robot's environment and the actions the robot is able to perform. Such models can be aided by the presence of common sense knowledge, which may help guide the planner to build higher quality plans, compared with the absence of such information. In particular, knowledge about default locations of objects (the juice is in the refrigerator) or the most suitable tool for an action (knives are used for cutting) could help the planner decide which actions are more appropriate in a given context.

For example, if a robot needs a certain object for a task, it can typically employ one of two strategies in the absence of prior domain knowledge: the robot can ask a human for the location of the object, or the robot can search the domain in an attempt to locate the object itself. Both techniques are potentially time consuming and prevent the immediate deployment of autonomous robots in unknown environments. By contrast, the techniques proposed in this paper allow the robot to consider likely locations for an object, informed by common sense knowledge. This potentially improves plan quality, by avoiding exhaustive search, and does not require the aid of a human either to inform the robot directly or to encode the necessary domain knowledge a priori.

While it is not possible to automatically generate all the domain knowledge that could possibly be required, we propose two methods for learning useful elements of domain knowledge based on information gathered from natural language texts. These methods will provide the set of object and action types for the domain, as well as certain relations between entities of these types, of the kind that are commonly used in planning. As an evaluation, we build a domain for a robot working in a kitchen environment (see Fig. 1) and infer spatial relations between objects in this domain. We then show how the induced knowledge can be used by an automated planning system. (The generated symbols will not be grounded in the robot's internal model; however, approaches to establish these links given the names of objects or actions are available, e.g., [1], [2], [3] and [4].)

The extraction of spatial relations from natural language has been studied in the context of understanding commands and directions given to robots in natural language (e.g., [7], [8], [9]). In contrast to approaches based on annotated corpora of command executions or route instructions, or the use of knowledge bases like Open Mind Common Sense [10], explicitly created for artificial intelligence applications, we extract relevant relations from large amounts of text written by humans for humans. The techniques used in [11], [12], [13] to extract action-tool relations to disambiguate visual interpretations of kitchen actions are related. In [14], spatial relations are inferred based on search engine queries and common sense.

In the following, we describe a process for learning domain ontologies (Section II) and for extracting relations (Section III). The last two sections evaluate both methods (Section IV) and describe how the resulting knowledge can be used in an automated planning system (Section V).

II. AUTOMATIC DOMAIN ONTOLOGY LEARNING

1 Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany {peter.kaiser, asfour}@kit.edu
2 School of Informatics, University of Edinburgh, Edinburgh, Scotland, United Kingdom {mike.lewis, rpetrick, steedman}@inf.ed.ac.uk

In this section, we propose a method for automatically learning a domain ontology D (a set of symbols that refer to a robot's environment or capabilities) with very little human input. The method can be configured to learn a domain of objects or actions. With robotic planning in mind, it is crucial that in either case the contained symbols are not too abstract. In terms of a kitchen environment, interesting objects might be saucepan, refrigerator or apple, while abstract terms like minute or temperature that do not directly refer to objects are avoided. Similarly, we focus on actions that are directly applied to objects, like knead, open or screw, and ignore more abstract actions like have or think.

Automatic domain ontology learning is based on a domain-defining corpus CD, which contains texts concerning the environment that the domain should model. For example, a compilation of recipes is a good domain-defining corpus for a kitchen environment. Note that these texts have been written by humans for human readers, and no effort is taken to make them more suitable for CD. However, CD needs to be part-of-speech (POS) tagged and possibly parsed if compound nouns are to be appropriately recognized.1

1 We use the Stanford Parser [15] to do this.

Fig. 2: The process of domain ontology learning (left) and relation extraction (right). The ontology resulting from the first method can be used as input to relation extraction.

The domain-defining corpus CD is used to retrieve an initial vocabulary V, which is then filtered for abstract symbols. Depending on the type of symbol that this vocabulary is meant to model, only nouns or verbs are included in V. In the first step, CD is analyzed for word frequency and the k most frequent words are extracted (see Alg. 1). Only words with a part-of-speech tag (POS-tag) equal to p ∈ {noun, verb} are considered. The resulting vocabulary V is then filtered according to the score Θ(w, p), which expresses the concreteness of a word w.

Algorithm 1: learnDomainOntology(CD, p, k, Θmin)
1  V ← mostFrequentWords(CD, p, k)
2  D ← {w ∈ V : Θ(w, p) ≥ Θmin}
3  return D

Fig. 2 gives an overview of the domain ontology learning process. Additionally, it shows details about the relation extraction procedure that will be discussed below, and the interoperability between the two methods. In the following section, we discuss the concreteness score Θ in detail.

A. The Concreteness Θ

Having a measure of concreteness is necessary for filtering out symbols that are too abstract to play a role in our target domain. In particular, the score Θ(w, p) expresses the concreteness of a word w with POS-tag p using the lexical database WordNet ([16], [17]). For nouns, WordNet features an ontology that differentiates between physical and abstract entities. However, a word can have different meanings, some of which could be abstract and others not. WordNet solves this issue by working on word-senses rather than on words. For a word w with a sense2 s from the set S(w) of possible senses of w, we can compute a Boolean indicator c_{w,s} that tells us if s is a physical meaning of w:

    c_{w,s} = { 1, if s is a physical meaning of w
              { 0, otherwise.                                   (1)

2 WordNet numbers the different word-senses, so S(w) ⊂ N.

WordNet also features a frequency measure f_{w,s} that indicates how often a word w was encountered in the sense of s, based on a reference corpus. As we are not doing semantic parsing on CD, we do not know which of the possible senses of w is true. However, we can compute a weighted average of the concreteness of the different meanings of w, weighting each word-sense with its likelihood:

    Θ(w) = ( Σ_{s ∈ S(w)} f_{w,s} · c_{w,s} ) / ( Σ_{s ∈ S(w)} f_{w,s} ).    (2)

As a byproduct, Θ can only have nonzero values for words that are contained in WordNet, which filters out misspelled words or parsing errors.

As there is no suitable differentiation in WordNet's ontology for verbs, we cannot apply the exact same approach there. However, WordNet features a rough clustering of verbs that we use to define the filter. We set c_{w,s} to 1 if the verb w with sense s is in one of the following categories: verb.change, verb.contact, verb.creation or verb.motion.

III. RELATION EXTRACTION

The second technique we propose for information acquisition deals with relations between symbols, defined using syntactic patterns. Such patterns capture the syntactic contexts that describe the relevant relations, as well as the relations' arguments. For example, the pattern

    ((#object, noun), (in, prep), (#location, noun))            (3)

describes a prepositional relation between two nouns using the preposition in. The pattern also defines two classes,3 #object and #location, which stand for the two arguments of the relation. Given the above syntactic pattern, two types of questions are relevant in this work:

• Class inference: Is a symbol w more likely an object or a location?
• Relation inference: What is the most likely location for a symbol w?

3 In examples we use a hash to indicate a classname.

The acquisition of relational information is interesting for endowing a robot with initial knowledge of its environment. Our main application for this method is the extraction of spatial relations in a kitchen setting, such as the locations of common objects. However, the proposed method is not constrained to objects and locations, and we will show different use cases in the evaluation in Section IV-C.

The relation extraction process works in two phases:

• In the crawling phase, the text sources are searched for predefined syntactic patterns. Words falling into classes defined in the patterns are counted. The counts are compiled into a set of distributions.
• In the query phase, information can be queried from the distributions computed in the crawling phase. Different kinds of queries are possible.

The foundation for relation extraction is the domain-independent corpus CI. In contrast to the domain-defining corpus CD, CI contains unrestricted text. Because it is rare for common sense information to be explicitly expressed, the size of CI is crucial. We assume that the domain-independent corpus is dependency parsed, i.e., consists of syntactic dependency paths of the kind shown in Fig. 3. A further discussion of CI is given in Section IV-C.

Fig. 3: A dependency path for the fragment milk in refrigerator contains the words, their respective POS-tags and the syntactic relations between them.4

4 NN - noun, IN - preposition, dobj - direct object, prep - preposition, pobj - prepositional object

In the following sections, we give a formal definition of a syntactic pattern and explain the two phases in further detail. Fig. 2 gives an overview of relation extraction.

A. Syntactic Patterns

The goal of the crawling phase is to search large amounts of text for syntactic patterns predefined by the user. These patterns are designed to specify a relation between classes of words. For example, pattern (3) describes a spatial relation between the two classes #object and #location. The fragment milk in refrigerator would match the pattern and would result in the assignments #object=milk and #location=refrigerator.

Formally, a syntactic pattern is defined as a sequence of tuples containing a symbol s_i and a POS-tag p_i:

    Π = ((s_1, p_1), ···, (s_k, p_k)).                          (4)

When matching the pattern to a sequence of words, each tuple will match exactly one word of the sequence. The condition for a match depends on the symbol s_i:

• If s_i is a word, the i-th tuple matches this exact word with POS-tag p_i.
• If s_i is a classname, the i-th tuple matches all words from D having the POS-tag p_i.

We will use the predicates isclass(s_i) and isword(s_i) to distinguish between the two possible meanings of the symbol s_i.

The search for matches happens on word-sequences. Such sequences can represent sentences or, as is the case for our corpus CI, dependency paths. A word-sequence contains words w_i together with their respective POS-tags t_i:

    Σ = ((w_1, t_1), ···, (w_n, t_n)).                          (5)

Alg. 2 decides if an element (s, p) from a syntactic pattern matches an element (w, t) from a word-sequence. If the symbol s is a class, it only checks if the word w is part of the domain ontology D_p that contains the valid words with POS-tag p. If s is a word, it must equal w. In both cases, the POS-tags p and t have to match.

Algorithm 2: match((s, p), (w, t), D)
1  if isclass(s) then
2    return p = t ∧ w ∈ D_p
3  else if isword(s) then
4    return p = t ∧ s = w
5  end

Alg. 3 describes the matching process for a complete syntactic pattern (using Alg. 2). If a match is found, the class configuration is returned as a set of class assignments, i.e., class-word pairs. For example, using pattern (3), the fragment milk in refrigerator results in the class configuration:

    K = {(#object, milk), (#location, refrigerator)}.           (6)

Algorithm 3: configuration(Σ, Π, D)
1  for i = 1, ···, |Σ| − |Π| + 1 do
2    I ← {0, ···, |Π| − 1}
3    if match(Π_{j+1}, Σ_{i+j}, D) ∀j ∈ I then
4      return {(s_{j+1}, w_{i+j}) : j ∈ I, isclass(s_{j+1})}
5    end
6  end
7  return ∅

B. The Crawling Phase

In the crawling phase, CI is searched for pattern matches using Alg. 2 and Alg. 3. Two different distributions are then computed based on the resulting class configurations:

• The Relation Distribution DR counts the occurrences of class configurations (e.g., (6)). DR is suitable for answering the question: How likely is a class configuration for the relation induced by pattern Π?
• The Class Distribution DC counts the occurrences of individual class assignments. It is suitable for answering the question: How likely is a class for a given word?

Alg. 4 shows how DR and DC are computed given a set of dependency paths S and a syntactic pattern Π.

Algorithm 4: computeDistribution(S, Π, D)
1   DR ← Empty Relation Distribution
2   DC ← Empty Class Distribution
3   foreach Σ = ((w_1, t_1), ···, (w_n, t_n)) ∈ S do
4     K ← configuration(Σ, Π, D)
5     if K ≠ ∅ then
6       DR[K] ← DR[K] + 1
7       foreach (c, w) ∈ K do
8         DC[(c, w)] ← DC[(c, w)] + 1
9       end
10    end
11  end
12  return (DR, DC)

C. The Query Phase

The query phase uses the distributions DR and DC to compute pseudo-probabilities for class assignments.

A class query γ(c, w) approximates the probability of a word w falling into a class c. If Γ = {c_1, ···, c_l} is the set of defined classes, the class query can be formulated as:

    γ(c, w) = DC[(c, w)] / ( Σ_{x ∈ Γ} DC[(x, w)] ).            (7)

A relation query ρ(Q, c∗) approximates the probability of a relation with class assignments Q = {(c_1, w_1), ···, (c_l, w_l)}, normalizing over the possible values of the class c∗. With Q∗ = {(c, w) ∈ Q : c ≠ c∗}, the relation query can be formulated as:

    ρ(Q, c∗) = DR[Q] / ( Σ_{v ∈ D} DR[Q∗ ∪ {(c∗, v)}] ).        (8)

In the evaluation, we will consider both types of queries.

IV. EVALUATION

To evaluate the proposed methods of domain learning and relation extraction, we first show that it is possible to use a specialized corpus to generate a domain ontology of entity types that matches people's expectations for the kitchen environment. We then use another, more general text corpus to infer spatial relations and action-tool relations for those entities. These components are independent: we show in Section V that hand specification of the domain entities by a human expert can aid the automated extraction processes.

A. Prerequisites

To learn a domain ontology using the proposed method, the two text corpora CD and CI must first be defined.

1) The domain-defining Corpus CD: This corpus is used to generate an initial vocabulary by analysing word frequencies. CD should therefore be reasonably large but, more importantly, should contain descriptions of common objects and actions from the desired domain. For a kitchen environment, we chose to build CD from a set of about 11,000 recipes,5 with a total size of 19.5 MB.

5 From http://www.ehow.com.

2) The domain-independent Corpus CI: The domain-independent corpus is used to sort entities into different classes according to the results of syntactic pattern matches. CI does not need to be a different corpus than CD, but it is difficult to extract reliable information on rare relations from small corpora. This is especially true for common sense knowledge that is rarely explicitly expressed. Hence, CI should be extensive. As it is often difficult to gather large amounts of text about a specific topic, it is useful to separate CI from CD, and use a large standard corpus for CI.

We use the Google Books Ngrams Corpus [18], in the following referred to as the Google Corpus, which contains a representation of 3.5 million English books with about 345 billion words in total. The corpus is already parsed, tagged and frequency counted. The Google Corpus does not work on sentences, but on syntactic ngrams (Fig. 3), which are subpaths of the dependency paths that are n content words long. We use the corpus in its arcs form, which contains syntactic ngrams with two content words (n = 2) plus possible non-content-words like prepositions or conjunctions. However, the proposed methods can also be used in combination with corpora containing longer syntactic ngrams.

B. Domain Ontology Learning

Using the corpora mentioned above, we can run the method for automatic domain ontology learning. Generating a domain ontology for nouns using parameter values of k = 300, Θmin = 0.35 results in an ontology of 198 words, of which the 80 most frequent are listed in Table I. The 20 most frequent nouns that were part of the initial vocabulary but did not pass the concreteness filter are listed in Table II. Analogously, the 80 most frequent actions from a domain ontology learnt from verbs using the parameters k = 300, Θmin = 0.2 are depicted in Table III. The full domain ontology contains 206 verbs. The 20 most frequent verbs that did not pass the concreteness filter are listed in Table IV.

Results show that for objects, as well as actions, the generated domain ontologies are reasonable, but contain obvious mistakes. For example, the concrete noun cream was rejected while abstract nouns like top and bottom were included. The reason for this is the diversity of possible word senses present in WordNet, which can mislead the filter Θ.

To evaluate the strength of the domain learning method, we asked four people6 to manually extract kitchen-related objects and actions from the sets of the 300 most frequent nouns and verbs from CD. Fig. 4 shows the F1-scores of the automatically learnt domain ontologies for objects and

6 Native speakers of English, not involved in the research.

TABLE I: Automatic domain ontology learning (objects)
TABLE III: Automatic domain ontology learning (actions)

Table I, k = 300, Θmin = 0.35:
 1 wine      21 milk          41 container   61 salad
 2 water     22 bottle        42 home        62 tea
 3 meat      23 fruit         43 bag         63 grill
 4 bowl      24 pot           44 garlic      64 center
 5 sugar     25 dough         45 skillet     65 soup
 6 mixture   26 glass         46 hand        66 alcohol
 7 pan       27 side          47 lid         67 coffee
 8 oil       28 pepper        48 onion       68 beer
 9 top       29 meal          49 skin        69 sheet
10 oven      30 flour         50 saucepan    70 world
11 salt      31 fish          51 egg         71 diet
12 dish      32 refrigerator  52 beef        72 freezer
13 cheese    33 drink         53 layer       73 blender
14 cup       34 chocolate     54 piece       74 batter
15 butter    35 turkey        55 liquid      75 pasta
16 chicken   36 bottom        56 spoon       76 pork
17 juice     37 cake          57 surface     77 addition
18 bread     38 place         58 restaurant  78 dinner
19 rice      39 ice           59 fat         79 vodka
20 sauce     40 knife         60 plate       80 powder

Table III, k = 300, Θmin = 0.2:
 1 add       21 cool      41 come         61 stick
 2 make      22 fill      42 press        62 beat
 3 place     23 leave     43 freeze       63 clean
 4 remove    24 go        44 garnish      64 begin
 5 cook      25 bring     45 pick         65 burn
 6 pour      26 hold      46 open         66 spread
 7 stir      27 reduce    47 slice        67 replace
 8 do        28 follow    48 become       68 whisk
 9 put       29 heat      49 refrigerate  69 boil
10 take      30 pan       50 soak         70 produce
11 get       31 sprinkle  51 dip          71 preheat
12 turn      32 dry       52 form         72 squeeze
13 set       33 start     53 shake        73 chill
14 cut       34 melt      54 cause        74 top
15 cover     35 sit       55 pull         75 peel
16 mix       36 chop      56 break        76 fit
17 combine   37 drain     57 wash         77 move
18 create    38 rinse     58 simmer       78 coat
19 prepare   39 blend     59 lay          79 increase
20 bake      40 roll      60 transfer     80 seal
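The concreteness filter that produced these ontologies (Alg. 1, step 2, using Eq. (2)) can be sketched in a few lines. The word-sense inventory below is invented toy data for illustration; the paper derives c_{w,s} and f_{w,s} from WordNet rather than from a hand-written table.

```python
# Illustrative sketch of Alg. 1's filter step and Eq. (2); the sense
# inventory below is invented toy data, not WordNet's actual senses.

# For each word: a list of (f_ws, c_ws) pairs, i.e., sense frequency
# and a 0/1 flag marking whether that sense is physical.
SENSES = {
    "saucepan":    [(10, 1)],            # single, physical sense
    "temperature": [(25, 0), (3, 0)],    # only abstract senses
    "top":         [(12, 0), (4, 1)],    # mixed senses mislead the filter
}

def theta(word):
    """Eq. (2): frequency-weighted average concreteness over all senses."""
    senses = SENSES.get(word, [])
    total = sum(f for f, _ in senses)
    if total == 0:
        return 0.0  # not in WordNet: misspellings and parse errors drop out
    return sum(f * c for f, c in senses) / total

def learn_domain_ontology(vocabulary, theta_min):
    """Alg. 1, line 2: keep the words whose concreteness passes Θmin."""
    return {w for w in vocabulary if theta(w) >= theta_min}

print(learn_domain_ontology({"saucepan", "temperature", "top", "xzzy"}, 0.35))
# only 'saucepan' survives: 'top' scores 4/16 = 0.25 < 0.35
```

With this toy data, the sketch reproduces the failure mode noted above: a word with mixed senses, like top, is averaged down below the threshold even though one of its senses is physical.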

TABLE II: Not part of the object domain ontology
k = 300, Θmin = 0.35:
 1 time    6 taste    11 temperature  16 type
 2 flavor  7 way      12 day          17 tbsp
 3 heat    8 variety  13 process      18 color
 4 recipe  9 cream    14 boil         19 hour
 5 food   10 amount   15 cooking      20 half

TABLE IV: Not part of the action domain ontology
k = 300, Θmin = 0.2:
 1 be     6 keep   11 eat     16 check
 2 use    7 let    12 choose  17 enjoy
 3 have   8 allow  13 need    18 give
 4 serve  9 try    14 help    19 see
 5 show  10 find   15 buy     20 want
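The candidate lists that these rejected words came from are produced by the frequency step of Alg. 1 (mostFrequentWords), which reduces to counting POS-tagged tokens. The tagged mini-corpus here is an invented stand-in for the recipe corpus CD, which in the paper is tagged by the Stanford Parser.

```python
# Sketch of mostFrequentWords from Alg. 1 on an invented, already
# POS-tagged mini-corpus (the real C_D is far larger and parser-tagged).
from collections import Counter

def most_frequent_words(tagged_corpus, pos, k):
    """Count tokens carrying the requested POS-tag and keep the top k."""
    counts = Counter(w for w, t in tagged_corpus if t == pos)
    return [w for w, _ in counts.most_common(k)]

corpus = [("chop", "verb"), ("the", "det"), ("onion", "noun"),
          ("stir", "verb"), ("the", "det"), ("sauce", "noun"),
          ("stir", "verb"), ("again", "adv"), ("sauce", "noun")]

print(most_frequent_words(corpus, "noun", 2))   # ['sauce', 'onion']
```

Only after this purely frequency-based step does the concreteness filter Θ decide which of the k candidates survive, which is why frequent but abstract words such as time or be appear in the rejection tables above.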

actions, using different values for Θmin, compared to the domains created by the human participants. The results show that enabling the concreteness filter (Θmin > 0) significantly increases the quality of the resulting domain for nouns as well as for verbs. The results also show that values of roughly Θmin > 0.5 produce a filter that is too restrictive. While in the case of nouns the restrictive filter still produces a better domain than if no filter is applied, this is not true for verbs: the quality of a verb domain drops dramatically the more restrictive the filter gets. The reason for the difference between the two plots is that verbs often have a variety of possible meanings. By contrast, nouns usually have a predominant interpretation, at least in terms of the differentiation between physical and abstract meanings. This is also reflected in the fact that the participants found it significantly harder to create a domain of actions than to create a domain of nouns. (We can also tune domain generation to work in a more restrictive way, e.g., by using the F0.5 measure instead of F1 to emphasize precision over recall.)

C. Inference

We now evaluate the relation and class inference mechanisms described in Section III. To illustrate the capabilities of these methods, we generated the results using the manually created domain ontology as a gold standard. We additionally show how false positives can affect the process by using the automatically learnt domain ontology. The parameter Θmin can be determined in practice by generating and evaluating an ontology for a subset of the initial vocabulary using plots similar to Fig. 4. Different syntactic patterns can be used to conduct different kinds of inference. The following sections show examples of possible queries.

1) Location Inference: A good use of knowledge acquisition is the exploration of spatial relations between objects and locations using prepositional contexts. For instance, pattern (3) matches fragments where two nouns, #object and #location, are linked by the preposition in. This pattern can be used in combination with the above object ontology (Table I) to infer spatial relations in a kitchen environment.

Table V shows the most likely locations for the ten most frequent objects from the automatically learnt domain ontology.7 Note that for generating the results we used pattern (3) combined with three similar patterns using the prepositions on, at and from. Table V presents two sets of locations for each object: the upper, highlighted rows refer to the manually created domain ontology and the lower, non-highlighted rows refer to the automatically learnt domain ontology in Table I. Results from the automatically learnt domain ontologies are more noisy, and distracting terms like side or bottom have not been filtered out.

7 We consider top and oven not to be objects.

TABLE VI: Results for tool inference
Highlighted rows: Manually created domain ontology. Non-highlighted rows: Automatically learnt domain ontology.

action  first           second        third
cut     knife / 0.80    fork / 0.01   machine / 0.01
        knife / 0.68    hand / 0.04   world / 0.03
flip    spatula / 0.89  spoon / 0.06  fork / 0.03
        spatula / 0.65  hand / 0.24   spoon / 0.05
mash    fork / 0.58     spoon / 0.16  butter / 0.09
        fork / 0.59     spoon / 0.16  butter / 0.09
stir    spoon / 0.50    fork / 0.20   spatula / 0.08
        spoon / 0.48    fork / 0.19   spatula / 0.08

The results demonstrate that the system is able to infer typical locations for objects. However, two problems constrain its performance. First, the automatically learnt domain ontology does not contain typical locations like cupboard or drawer, because these words do not frequently appear in the initial vocabulary. Second, the system is not able to differentiate between container objects like pot or pan, and actual locations like refrigerator or oven (i.e., objects that have a fixed position in the kitchen). Improving the domain entity specification by using more diverse but relevant domain-specific corpora is the subject of ongoing research. In Section V
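The location and tool inference queries evaluated here chain together the matching of Alg. 3, the counting of Alg. 4, and the queries of Eqs. (7) and (8). A condensed, runnable sketch on a handful of toy dependency paths (the paths and the domain ontology are invented; the real system runs over the Google Corpus):

```python
# Condensed sketch of Algs. 2-4 and Eqs. (7)-(8) on invented toy data.
from collections import Counter

PATTERN = [("#object", "noun"), ("in", "prep"), ("#location", "noun")]
DOMAIN = {"noun": {"milk", "juice", "refrigerator", "bowl"}}

def configuration(sigma, pi, domain):
    """Algs. 2 + 3 (condensed): class assignments of the first match."""
    for i in range(len(sigma) - len(pi) + 1):
        window = sigma[i:i + len(pi)]
        def ok(elem, word):
            (s, p), (w, t) = elem, word
            if s.startswith("#"):                       # classname
                return p == t and w in domain.get(p, set())
            return p == t and s == w                    # literal word
        if all(ok(pi[j], window[j]) for j in range(len(pi))):
            return frozenset((s, w) for (s, _), (w, _) in zip(pi, window)
                             if s.startswith("#"))
    return frozenset()

def compute_distributions(paths, pi, domain):
    """Alg. 4: relation distribution D_R and class distribution D_C."""
    d_r, d_c = Counter(), Counter()
    for sigma in paths:
        k = configuration(sigma, pi, domain)
        if k:
            d_r[k] += 1
            d_c.update(k)
    return d_r, d_c

def class_query(c, w, classes, d_c):
    """Eq. (7): gamma(c, w)."""
    total = sum(d_c[(x, w)] for x in classes)
    return d_c[(c, w)] / total if total else 0.0

def relation_query(q, c_star, values, d_r):
    """Eq. (8): rho(Q, c*), normalized over possible values of c*."""
    q_rest = {(c, w) for (c, w) in q if c != c_star}
    total = sum(d_r[frozenset(q_rest | {(c_star, v)})] for v in values)
    return d_r[frozenset(q)] / total if total else 0.0

paths = [
    [("milk", "noun"), ("in", "prep"), ("refrigerator", "noun")],
    [("milk", "noun"), ("in", "prep"), ("refrigerator", "noun")],
    [("milk", "noun"), ("in", "prep"), ("bowl", "noun")],
    [("juice", "noun"), ("in", "prep"), ("refrigerator", "noun")],
]
D_R, D_C = compute_distributions(paths, PATTERN, DOMAIN)

# Location inference: 2 of the 3 matches for milk put it in the refrigerator.
q = {("#object", "milk"), ("#location", "refrigerator")}
print(relation_query(q, "#location", DOMAIN["noun"], D_R))  # 0.6666666666666666
```

The same machinery answers class queries, e.g. whether refrigerator behaves more like an object or a location, which is the kind of question evaluated for Table VII.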

we show the effect of more helpful entity specification.

2) Tool Inference: A similar approach that also includes the action domain ontology uses the preposition with to infer relations between actions and tools. The following syntactic pattern matches a verb #action and a noun #tool from the respective domain ontologies, linked together by with:

    ((#action, verb), (with, prep), (#tool, noun)).             (9)

Table VI shows the three most probable tools for different actions from the kitchen domain. The results are shown for both the manually created domain ontology (upper rows, highlighted) and the automatically learnt one (lower rows).

3) Class Inference: Another possible result the system can compute is the probability that a word falls into a certain class of a syntactic pattern. For example, given the above pattern (3), the system can approximate the probability that a word names an object or a location (Table VII). These results can be used to improve the location inference results, e.g., by dropping words that seem unlikely to name a location.

4) Computation Time: In this work, the domain ontology learning and distribution computation steps are considered to be run offline. However, we note that the computation time for these steps depends heavily on the sizes and representations of CD and CI.8 Processing the Google Corpus requires especially high computational power. On the other hand, the inference step consists of simple lookups in precomputed distributions, and can therefore be done online.

8 The Google Arcs Corpus contains 38G of compressed text.

Fig. 4: Automatically learnt domain ontologies are evaluated using different values of Θmin, by comparing them to domain ontologies created manually by human participants. (Plot: F1-score for objects and for actions against the concreteness filter threshold Θmin.)

TABLE VII: Results for class inference
Manually created domain ontology (left), automatically learnt domain ontology (right)

symbol    object  location  |  object  location
wine      0.80    0.20      |  0.85    0.15
water     0.48    0.52      |  0.56    0.44
meat      0.77    0.23      |  0.81    0.19
bowl      0.11    0.89      |  0.16    0.84
sugar     0.93    0.07      |  0.95    0.05
mixture   0.72    0.28      |  0.77    0.23
pan       0.17    0.83      |  0.18    0.82
oil       0.82    0.18      |  0.83    0.17
oven      0.12    0.88      |  0.11    0.89
salt      0.95    0.05      |  0.96    0.04

TABLE V: Results for location inference
Highlighted rows: Manually created domain ontology. Non-highlighted rows: Automatically learnt domain ontology.
refr. - refrigerator, scp. - saucepan, swc. - sandwich, kit. - kitchen

object   first           second         third           fourth
wine     glass / 0.20    bottle / 0.20  table / 0.15    cup / 0.14
         glass / 0.13    bottle / 0.13  table / 0.10    cup / 0.09
water    surface / 0.09  bottle / 0.07  water / 0.07    glass / 0.07
         bottom / 0.06   side / 0.06    surface / 0.05  bottle / 0.04
meat     table / 0.08    pan / 0.06     swc. / 0.05     pot / 0.05
         diet / 0.11     table / 0.05   pan / 0.04      swc. / 0.04
bowl     table / 0.51    refr. / 0.06   stem / 0.05     kit. / 0.05
         table / 0.27    hand / 0.20    top / 0.06      side / 0.05
sugar    water / 0.15    bowl / 0.15    scp. / 0.10     milk / 0.08
         water / 0.13    bowl / 0.13    scp. / 0.08     milk / 0.06
mixture  pan / 0.10      bowl / 0.08    water / 0.08    dish / 0.08
         top / 0.10      pan / 0.07     bowl / 0.05     water / 0.05
pan      oven / 0.40     stove / 0.19   rack / 0.18     pan / 0.02
         oven / 0.29     stove / 0.14   rack / 0.13     hand / 0.07
oil      skillet / 0.22  pan / 0.19     scp. / 0.08     board / 0.07
         skillet / 0.19  pan / 0.16     scp. / 0.07     board / 0.06
salt     water / 0.40    bowl / 0.17    scp. / 0.05     food / 0.04
         water / 0.33    bowl / 0.14    diet / 0.06     scp. / 0.04
dish     table / 0.30    oven / 0.24    menu / 0.11     pan / 0.03
         table / 0.20    oven / 0.16    hand / 0.12     top / 0.04

V. PLANNING WITH COMMON SENSE KNOWLEDGE

In this section we show how the domain knowledge induced by the processes described above can be used with an automated planning system to improve the quality of generated plans. We have chosen to use the PKS (Planning with Knowledge and Sensing) planner [19], [20] for this task, since PKS has previously been deployed in robot environments like the one in Fig. 1 [21]. However, one of the strengths of the above approach is that it is not planner (or domain) dependent, and the method we outline for PKS can be adapted to a range of different planners and domains.

As an example scenario, we will focus on the use of spatial relations in a small kitchen domain. The domain contains the entities cereal, counter, cup, cupboard, juice, plate, refrigerator and stove. Table VIII shows the results of the location inference method. We will first postprocess this data for planning by considering the entity juice.

TABLE VIII: Location inference for a small domain
Omitted values are zero.

object  counter  cup   cupboard  dishwasher  juice  plate  refrigerator  stove
cereal  1.00
cup     0.25     0.33  0.13      0.01        0.18   0.02   0.08
juice   0.02     0.43                        0.21   0.08   0.26
plate   0.14     0.10  0.07      0.58        0.06   0.06

A. Postprocessing

Given the initial domain of objects, and using pattern (3), we can approximate the probability of an object o being spatially related to a location l by issuing a relation query:

    P(loc = l | obj = o) = ρ({(obj, o), (loc, l)}, loc).        (10)

To put these results into a suitable form for planning, we introduce the predicate at and output the computed likelihoods for pairs of objects. The resulting relations that are extracted, and their likelihoods, are shown in the top half of Table IX.

TABLE IX: Extracted and postprocessed relations

Extracted Relations        Likelihood
at(juice, cup)             0.43
at(juice, refrigerator)    0.27
at(juice, juice)           0.21
at(juice, plate)           0.08
at(juice, counter)         0.02

Postprocessed Relations    Likelihood
at(applejuice, fridge)     0.27
at(applejuice, counter)    0.02
at(orangejuice, fridge)    0.27
at(orangejuice, counter)   0.02

The postprocessor must now refine the results, possibly making use of additional information about the structure of the planning domain and the types of objects that are available. Refinement can be done in three possible ways:

1) Symbol Mapping: A word that describes an object in natural language may not necessarily match the symbol name for that object in the planning domain. This is currently corrected by an appropriate mapping process that uses a dictionary of likely synonyms. E.g., the word refrigerator may be mapped to fridge.
2) Type Filtering: Many planners have the concept of object types, which enables us to filter out relations that have entity arguments of the incorrect type. Assuming the planning domain provides us with a type location that is required for the second argument of at, the postprocessor can then remove the extracted relations at(juice, cup), at(juice, juice), and at(juice, plate), since the entities cup, juice, and plate are not locations.
3) Instantiation: The symbols extracted by our processes will often refer to classes of objects, rather than the specific object identifiers used by the planner. Making use of type information in the planning domain, the postprocessor can instantiate objects of the appropriate types from the extracted relational information. For instance, the class juice might be instantiated into two objects, applejuice and orangejuice. These objects can subsequently be substituted into any relation that contains the appropriate class type.

The final set of postprocessed relations from our example is shown in the bottom half of Table IX. We note that the necessity and possibility of applying these postprocessing steps depends on the nature of the planning domain. Furthermore, the information that is needed to perform the postprocessing, i.e., the symbol mapping table or the type information, needs to be manually encoded in the planning domain.

Given the postprocessed set of relations, the final step is to decide how this information will be included in the planning domain. For planners that work with probabilistic representations, the relation/likelihood information could be directly encoded. For planners like PKS that do not deal with probabilities, there are two main possibilities:

1) The most probable location for each object could be encoded as a single fact in the planner's knowledge, i.e., at(applejuice, fridge) and at(orangejuice, fridge).
2) Some or all of the most probable locations could be encoded as a disjunction of possible alternatives, i.e., at(applejuice, fridge) | at(applejuice, counter), and at(orangejuice, fridge) | at(orangejuice, counter).

Depending on the domain, either form may be appropriate.

B. Plan Generation

Consider the task of finding the apple juice container in the kitchen. In the absence of precise information as to the object's location, but knowing there are various places in the kitchen where objects could be located (e.g., counter, cupboard, fridge, stove), a planner could potentially build a plan for a robot to exhaustively check all locations: move-robot-to-counter, check-for-apple-juice, if not present move-robot-to-cupboard, check-for-apple-juice, if not present move-robot-to-fridge, etc., until all locations have been checked. If the robot does not have information-gathering capabilities to check for the apple juice in a particular location, the planner may not be able to generate such a plan at all.

With the availability of more certain information about the location of the apple juice, the planner can potentially eliminate some parts of the plan (e.g., by ignoring certain locations), or at least prioritise certain likely locations over others, resulting in higher quality plans. For instance, in the case that the planner had the knowledge at(applejuice, fridge), resulting from the above relation extraction process, then the planner could build the simple plan move-robot-to-fridge, under the assumption that the extracted information was true. Similarly, if the planner had the disjunctive information at(applejuice, fridge) | at(applejuice, counter), then the planner could build the plan: move-robot-to-fridge, check-for-apple-juice, if not present move-robot-to-counter. Again, this plan improves on the exhaustive search plan by only considering the most likely locations for the apple juice, resulting from the extracted relational information.

One inherent danger when dealing with common sense knowledge is that the plans that are built from such information alone may ultimately fail to achieve their goals in the real world. For instance, even though relation extraction provides us with likely locations for the apple juice, there is no guarantee that this is the way the robot's world is

[3] K. Welke, P. Kaiser, A. Kozlov, N. Adermann, T. Asfour, M. Lewis, and M. Steedman, "Grounded spatial symbols for task planning based on experience," in 13th IEEE/RAS International Conference on Humanoid Robots (Humanoids), 2013.
actually configured. (E.g., another robot may have left the [4] A. Kasper, R. Becher, P. Steinhaus, and R. Dillmann, “Developing and apple juice on the stove.) However, such information does analyzing intuitive modes for interactive object modeling,” in ICMI ’07: Proceedings of the 9th international conference on Multimodal give us a starting point for building plans, in the absence of interfaces. New York, NY, USA: ACM, 2007, pp. 74–81. more certain information, and can also aid plan execution [5] T. Asfour, K. Regenstein, P. Azad, J. Schroder,¨ N. Vahrenkamp, monitoring to guide replanning activities in the case of plan and R. Dillmann, “ARMAR-III: An integrated humanoid platform for sensory-motor control,” in IEEE International Conference on failure. (E.g., if a plan built using common sense knowledge Humanoid Robots (Humanoids), 2006, pp. 169–175. fails to locate the apple juice, fall back to the exhaustive [6] T. Asfour, P. Azad, N. Vahrenkamp, K. Regenstein, A. Bierbaum, search plan for the locations that haven’t been checked.) K. Welke, J. Schroder,¨ and R. Dillmann, “Toward humanoid manip- ulation in human-centred environments,” Robotics and Autonomous Finally, we note that the use of common sense knowledge Systems, vol. 56, no. 1, pp. 54–65, 2008. may improve the efficiency of plan generation, since in [7] S. Tellex, T. Kollar, S. Dickerson, M. Walter, A. Banerjee, S. Teller, general more specific information helps constrain the plan and N. Roy, “Understanding natural language commands for robotic navigation and mobile manipulation,” in Proceedings of the 25th generation process. However, plan generation time is both National Conference on Artificial Intelligence. AAAI, 2011, pp. domain and planner dependent, and it is difficult to quantify 1507–1514. any improvements without experimentation. (E.g., planning [8] T. Kollar, S. Tellex, D. Roy, and N. 
Roy, “Toward understanding natural language directions,” in Proceedings of the 5th International time went from 0.003s to 0.001s in our small examples.) Conference on Human-Robot Interaction (HRI). IEEE, 2010, pp. VI.CONCLUSIONAND FUTURE WORK 259–266. [9] D. Chen and R. Mooney, “Learning to interpret natural language We have presented two techniques for reducing the amount navigation instructions from observations,” in Proceedings of the 25th of prior, hardcoded knowledge that is necessary for building AAAI Conference on Artificial Intelligence (AAAI-2011), 2011, pp. 859–865. a robotic planning domain. Using the methods described [10] P. Singh, T. Lin, E. T. Mueller, G. Lim, T. Perkins, and W. L. Zhu, here, a domain ontology of object and action types can be “Open mind common sense: Knowledge acquisition from the general defined automatically, over which user-defined relations can public,” in On the Move to Meaningful Systems 2002: CoopIS, DOA, and ODBASE. Springer, 2002, pp. 1223–1237. be inferred automatically from sources of natural language [11] C. Teo, Y. Yang, H. Daume´ III, C. Fermuller,¨ and Y. Aloimonos, “A text. The resulting representation of common sense domain corpus-guided framework for robotic visual perception,” in Workshop knowledge has been tested using an automated planning on Language-Action Tools for Cognitive Artificial Agents, held at the 25th National Conference on Artificial Intelligence. San Francisco: system, improving the quality of the generated plans. AAAI, 2011, pp. 36–42. As future work, we are exploring a number of improve- [12] ——, “Toward a Watson that sees: Language-guided action recogni- tion for robots,” in IEEE International Conference on Robotics and ments to our techniques. First, more specialized corpora CI , Automation. St. Paul, MN: IEEE, 2012, pp. 374–381. longer syntactic patterns, or databases of common sense [13] M. Tamosiunaite, I. Markelic, T. Kulvicius, and F. 
Worg¨ otter,¨ “Gen- knowledge might help in overcoming the sparsity of com- eralizing objects by analyzing language,” in 11th International Con- mon sense information in text sources. Second, the location ference on Humanoid Robots (Humanoids). IEEE/RAS, 2011, pp. 557–563. inference does not perform any checks for plausibility. While [14] K. Zhou, M. Zillich, H. Zender, and M. Vincze, “Web mining driven the class inference will help in filtering results that are not object locality knowledge acquisition for efficient robot behavior,” locations at all, additional methods are needed to differentiate in 2012 International Conference on Intelligent Robots and Systems (IROS). IEEE/RSJ, 2012, pp. 3962–3969. between locations for temporary storage and locations for [15] D. Klein and C. D. Manning, “Accurate unlexicalized parsing,” Pro- long-term storage. Another interesting improvement would ceedings of the 41st Annual Meeting on Association for Computational be the generalization of inferred relations to still missing Linguistics ACL 03, vol. 1, pp. 423–430, 2003. [16] G. A. Miller, “WordNet: a lexical database for English,” Communica- knowledge. For example one could conclude by analyzing tions of the ACM, vol. 38, pp. 39–41, 1995. text sources that bowl and dish are conceptually similar and [17] C. Fellbaum, WordNet: An Electronic Lexical Database. Cambridge, therefore apply relations inferred for bowls also to dishes. MA: MIT Press, 1998. [18] Y. Goldberg and J. Orwant, “A dataset of syntactic-ngrams over time Finally, we are investigating the application of our methods from a very large corpus of english books,” in Second Joint Conference to robot domains other than the kitchen environment. on Lexical and Computational Semantics, 2013, pp. 241–247. [19] R. P. A. Petrick and F. 
Bacchus, “A knowledge-based approach to ACKNOWLEDGMENT planning with incomplete information and sensing,” in International The research leading to these results received funding from Conference on Artificial Intelligence Planning and Scheduling (AIPS- 2002), 2002, pp. 212–221. the European Union’s 7th Framework Programme FP7/2007- [20] ——, “Extending the knowledge-based approach to planning with 2013, under grant agreement No270273 (Xperience). incomplete information and sensing,” in International Conference on Automated Planning and Scheduling (ICAPS 2004), 2004, pp. 2–11. REFERENCES [21] R. Petrick, N. Adermann, T. Asfour, M. Steedman, and R. Dillmann, “Connecting knowledge-level planning and task execution on a hu- [1] M. Ternoth, U. Klank, D. Pangercic, and M. Beetz, “Web-enabled manoid robot using Object-Action Complexes,” in Proceedings of the robots,” Robotics & Automation Magazine, vol. 18, no. 2, pp. 58–68, International Conference on Cognitive Systems (CogSys 2010), 2010. 2011. [2] M. Waibel, M. Beetz, J. Civera, R. D’Andrea, J. Elfring, D. Galvez- Lopez, K. Haussermann, R. Janssen, J. Montiel, A. Perzylo, B. Schiessle, M. Tenorth, O. Zweigle, and R. van de Molengraft, “Roboearth,” Robotics Automation Magazine, IEEE, vol. 18, no. 2, pp. 69–82, 2011.
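The three postprocessing steps of Section A (symbol mapping, type filtering, and instantiation) can be sketched as a small pipeline over the extracted relations. This is a minimal illustration, not the authors' implementation: the dictionaries below simply encode the example from Table IX, and the function name `postprocess` is our own.

```python
"""Sketch of the postprocessing pipeline from Sec. A (illustrative only)."""

# Extracted likelihoods for the entity "juice" (top half of Table IX).
extracted = {
    ("juice", "cup"): 0.43,
    ("juice", "refrigerator"): 0.27,
    ("juice", "juice"): 0.21,
    ("juice", "plate"): 0.08,
    ("juice", "counter"): 0.02,
}

# 1) Symbol mapping: natural-language words -> planner symbol names.
synonyms = {"refrigerator": "fridge"}

# 2) Type filtering: only symbols of type "location" may fill the
#    second argument of at/2.
locations = {"fridge", "counter", "cupboard", "stove", "dishwasher"}

# 3) Instantiation: object classes expand to concrete planner objects.
instances = {"juice": ["applejuice", "orangejuice"]}

def postprocess(relations):
    out = {}
    for (obj, loc), p in relations.items():
        loc = synonyms.get(loc, loc)            # symbol mapping
        if loc not in locations:                # type filtering
            continue
        for inst in instances.get(obj, [obj]):  # instantiation
            out[(inst, loc)] = p
    return out

for (obj, loc), p in sorted(postprocess(extracted).items()):
    print(f"at({obj}, {loc}) {p:.2f}")
```

Running the sketch reproduces the bottom half of Table IX: the relations involving cup, juice, and plate are filtered out, refrigerator is mapped to fridge, and the class juice is expanded into applejuice and orangejuice.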
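The two non-probabilistic encodings discussed at the end of Section A (a single most-likely fact, or a disjunction of alternatives) can likewise be sketched. The string format of the facts follows the paper's notation; the function names `best_fact` and `disjunction` are our own illustrative choices, not part of PKS.

```python
"""Sketch of the two encoding options for non-probabilistic planners."""

# Postprocessed likelihoods (bottom half of Table IX).
postprocessed = {
    ("applejuice", "fridge"): 0.27,
    ("applejuice", "counter"): 0.02,
    ("orangejuice", "fridge"): 0.27,
    ("orangejuice", "counter"): 0.02,
}

def best_fact(obj):
    # Option 1: keep only the most probable location as a known fact.
    loc, _ = max(((l, p) for (o, l), p in postprocessed.items() if o == obj),
                 key=lambda lp: lp[1])
    return f"at({obj}, {loc})"

def disjunction(obj):
    # Option 2: keep all candidate locations as a disjunction,
    # ordered from most to least likely.
    cands = sorted(((l, p) for (o, l), p in postprocessed.items() if o == obj),
                   key=lambda lp: -lp[1])
    return " | ".join(f"at({obj}, {l})" for l, _ in cands)

print(best_fact("applejuice"))    # at(applejuice, fridge)
print(disjunction("applejuice"))  # at(applejuice, fridge) | at(applejuice, counter)
```

As the paper notes, which form is appropriate depends on the domain; the disjunction retains more of the extracted information at the cost of a larger planning state.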
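The exhaustive and knowledge-informed search plans of Section B differ only in which candidate locations are considered, and in what order. The following sketch makes that explicit; the flat list-of-strings plan representation is an illustrative assumption (a real planner such as PKS would produce a structured conditional plan), while the action names follow the text.

```python
"""Sketch of conditional search-plan construction from Sec. B."""

ALL_LOCATIONS = ["counter", "cupboard", "fridge", "stove"]

def search_plan(obj, likely=None):
    # Without common sense knowledge, exhaustively check every location;
    # with it, check only the likely locations, most probable first.
    candidates = likely if likely else ALL_LOCATIONS
    steps = []
    for i, loc in enumerate(candidates):
        if i > 0:
            steps.append("if not present")
        steps.append(f"move-robot-to-{loc}")
        steps.append(f"check-for-{obj}")
    return steps

# Exhaustive plan (no common sense knowledge available):
print(search_plan("apple-juice"))

# Plan from the disjunction at(applejuice, fridge) | at(applejuice, counter):
print(search_plan("apple-juice", ["fridge", "counter"]))
```

The second call yields the shorter plan from the paper (fridge first, then counter on failure), and the first can serve as the fallback plan mentioned for execution monitoring.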
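The generalization idea sketched in the conclusion (transferring relations inferred for bowls to dishes) could, for instance, be realised with a distributional similarity measure over syntactic contexts. Everything below is an illustrative assumption on our part: the toy context counts, the cosine measure, the 0.9 threshold, and the `transfer` function are not from the paper.

```python
"""Sketch of relation transfer between distributionally similar words."""

from math import sqrt

# Toy counts of (word, syntactic context) co-occurrences.
contexts = {
    "bowl":  {"in the X": 10, "wash the X": 6, "X of cereal": 4},
    "dish":  {"in the X": 9,  "wash the X": 7, "X of cereal": 3},
    "stove": {"on the X": 12, "turn on the X": 5},
}

def cosine(u, v):
    # Cosine similarity of two sparse count vectors.
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in keys)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def transfer(relations, word, new_word, threshold=0.9):
    # Copy a word's relations over to a sufficiently similar word.
    if cosine(contexts[word], contexts[new_word]) < threshold:
        return {}
    return {(new_word, loc): p for (w, loc), p in relations.items() if w == word}

rels = {("bowl", "cupboard"): 0.6}
print(transfer(rels, "bowl", "dish"))   # {('dish', 'cupboard'): 0.6}
print(transfer(rels, "bowl", "stove"))  # {}
```

A lexical resource such as WordNet [16], [17], already used elsewhere in this line of work, would be a natural alternative source of the similarity judgement.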