Using Statistical Techniques and WordNet to Reason with Noisy Data

Rakesh Gupta and Mykel J. Kochenderfer∗
Honda Research Institute USA, Inc.
800 California Street, Suite 300
Mountain View, CA 94041
[email protected], [email protected]

∗Currently at Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom.
Copyright © 2004, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

We collected data from non-experts over the web to create a common sense knowledge base for indoor home and office environments. In this paper, we discuss how we use statistical data dimension reduction and clustering techniques to determine consensus in the knowledge base. We explain the use of Latent Semantic Indexing in finding consensus. These statistical techniques make our system robust to noisy data in the knowledge base. Our work contrasts with traditional AI systems, which are typically brittle as well as difficult to extend due to handcrafted pieces of knowledge in their knowledge bases. We then discuss how the WordNet hypernym hierarchy is used to generalize knowledge and perform inference about objects not in the knowledge base. WordNet also makes the reasoning system robust to vocabulary differences among people by using synonyms.

Introduction

The dominant view of intelligent reasoning, derived from mathematical logic, approaches intelligent reasoning as a form of calculation, typically deduction. These symbolic systems are domain dependent. As knowledge increases, it becomes increasingly difficult to keep the database consistent with prior handcrafted knowledge. Systems like Cyc (Guha et al. 1990) have resorted to partitioning the database into consistent knowledge chunks based on context (called microtheories). However, such systems are handcrafted by a relatively small number of experts, and construction of the system requires examination of all the existing items in the relevant piece of the database for consistency. This consistency check requires exponential time in the number of items in the database.

Statistical techniques have been used extensively in text retrieval and data mining to work with large collections of text documents (Baeza-Yates & Ribeiro-Neto 1999). We use these statistical techniques at the sentence and phrase level to address the issue of making these noisy databases consistent. In our work, statistical data clustering techniques such as Latent Semantic Indexing (LSI) are used to remove inconsistencies in the knowledge base and find consensus data. The derived database can then be used to perform practical reasoning using Belief-Desire-Intention (BDI) architectures (Rao & Georgeff 1995; Wooldridge 1999).

It is important to emphasize that our system is not attempting to find the right answer but the one that reflects the majority consensus opinion. Without performing any consistency checking, we handle inconsistencies in two ways. Firstly, we do not allow users to explicitly enter negations (e.g. "a toilet is not found in the kitchen") into the knowledge base. Thus, we do not have assertions of a proposition and its negation in our knowledge base. Secondly, we focus on resolving high-level inconsistencies in the knowledge base by consensus knowledge. For example, 15 people say that a refrigerator is used to store food and one person says that it is used to provide filtered water. The latter is not the primary use of the refrigerator, and it is discarded by consensus even though it is valid knowledge.

One of the strengths of our approach is the restriction of the domain (to indoor home and office environments), which makes our knowledge base dense enough to be statistically usable for inferencing. In the past, the available data has not been dense enough to leverage statistical techniques. Knowledge bases including the Open Mind Common Sense project at the MIT Media Lab (Liu, Lieberman, & Selker 2003; Eagle, Singh, & Pentland 2003), ThoughtTreasure (Mueller 1998) and Cyc (Guha et al. 1990) have attempted to capture much broader but sparse human common sense knowledge.

Objective

The objective of our work is to make indoor mobile robots that work in environments like homes and offices more intelligent through common sense. We want to endow these robots with some amount of general knowledge to use as a basis to accomplish their tasks and interact with people and their environment.

Mobile robots in homes and offices will be expected to accomplish tasks within their environment to satisfy the perceived desires and requests of their users using common sense. Common sense does not require expert knowledge, and hence it may be gathered from non-specialist netizens in the same fashion as the projects associated with the rather successful Open Mind Initiative pioneered by David Stork (Stork 1999; 2000). The raw data collected over the web is processed using statistical techniques and WordNet to populate an initial robot knowledge base. In the future, these robots will be able to learn through human interaction and other methods and add to their knowledge bases online.

In the next section, we describe our framework for capturing template data using Open Mind and our database schema. We then explain how we use Latent Semantic Indexing to prune our database to data reflecting consensus. We then discuss how we use WordNet to generalize our knowledge and to further make it independent of vocabulary. Finally, we discuss our conclusions and future work.

Knowledge Capture Framework

In this section, we describe the common sense knowledge base, the various implications used, and the database schema. We also provide examples of template-based forms for collecting common sense knowledge.

To capture relations, it would be easiest to have the users enter statements of the form (object, property), but this would be unnatural for the user. The other extreme would be to accept phrases in natural language and later parse them using lemmatization and part-of-speech tagging. We decided to take an intermediate approach using sentence templates, where the user is prompted with a natural language sentence with some blanks. Users enter words or phrases to complete these sentences (as in many of the other Open Mind Initiative projects).

Relational Common Sense Knowledge

We first collected the names of objects commonly found in homes and offices. We started with a collection of photographs of such objects, and users identified them by entering their corresponding names into a form. Users were also prompted to enter new object names by forms like "An object found in the office is ___."

Once some objects were collected, we asked users to identify various properties of an object. For example, the user might be prompted "A microwave is often ___." Or, the user might be prompted "People sometimes desire that a cup of coffee is ___." From this data, the system built a collection of object attributes.

From these statements, the system constructed questions about causality. For example, a form asked "A plant is healthy when a ___ is ___." The response of the user was then transformed into an implication, added to the set F, and stored in the system's database.

We also gathered information about indicators of desires. For example, a form asked "You might want a dvd player to be plugged into a television if you notice that your ___ has become ___." The entries for the blanks were stored as an implication in the set I.

Common sense knowledge about the location of common objects is required to perform actions on objects in the home or office. For example, if we want the robot to make a cup of coffee, it would be useful for the robot to know that coffee makers tend to be found in the kitchen and that milk is usually stored in the refrigerator. We therefore had activities where netizens could indicate typical object locations and pairs of objects that are generally found together.

Other common sense knowledge we collected included knowledge about uses of different objects and activities of people. All this data was collected over the web from non-experts through forms hosted on the Open Mind website. More details about Open Mind Indoor Common Sense data collection are described in Gupta & Kochenderfer (2004).

Knowledge Base Implications

The framework of this work is object-centric. It is assumed that the desires of the users and the actions taken by the robot are grounded in the properties of objects in the world.

In our system, a statement is a pair φ = (o, p) where o is the object and p is a verb or adjective. For example, a user may wish that her coffee be heated. In this case, the object of the desire is coffee and the desired property is heated. The representation of this statement would be (coffee, heated). Statements can also be thought of as actions. For example, (coffee, heated) represents the action that causes coffee to be heated. The robot is capable of explicitly executing some set of actions represented in our system as statements.

One statement implying (or causing) another is represented as φ1 → φ2. For example, we might have the implication (trash, in trash can) → (house, clean). We shall denote the collection of all implications of this form by F.

One statement can indicate a desire represented by a second statement. Symbolically, this implication is represented as φ1 →d φ2. For example, we might have the implication (stomach, growling) →d (human, fed). We shall denote the collection of all such implications used to anticipate human desires by I.

This knowledge base can be used for common sense and practical reasoning using Belief-Desire-Intention (BDI) theory. BDI was originally developed by Bratman (1987) and focuses on the role that intentions play in practical reasoning. It is founded upon an established theory of rational action in humans. Beliefs correspond to information the agent has about the world. Given F, sensor observations, and a belief revision function, current beliefs can be determined. Desires represent states of affairs that the agent would (in an ideal world) wish to be brought about. Current beliefs and I provide a list of current desires. In addition, humans may also explicitly assign goals (desires) to the robot. Intentions represent desires that the agent has committed to achieving. Desires can be filtered by a deliberation function to give current intentions that are then acted upon by the robot.

Database Schema

Open Mind data collection is schematically shown in figure 1. All knowledge entered into the website goes through a review process by an administrator. Once accepted, these structured responses are transformed into relations and saved in the database. These relations are then used to generate new sentence templates.

Figure 1: Schematic for Open Mind data collection.

A MySQL database is used to store the data collected from the website. Raw sentences are archived in a table with the following schema:

Entries(id, userid, date, sentence, status, activity, form)

When a user enters new knowledge, its status is uncommitted. An administrator is later able to commit (or reject) the entry to the database. Committing an entry transforms the sentence and populates one or more of the following relations:

Objects(id, name)
Statements(id, obj, prop, desire)
Uses(id, obj, vp)
Causes(id, obj1, prop1, obj2, prop2)
Desires(id, obj1, prop1, obj2, prop2)
Rooms(id, name)
Proximity(id, obj1, obj2)
Locations(id, obj, room)

We use statistical techniques like LSI to clean up the raw knowledge base. Once the data is cleaned up and compacted using WordNet, these relations are used to answer user-posed text queries about object locations and their use in various tasks.

Latent Semantic Indexing for finding consensus

Luhn (1961) proposed the idea of notional families to group together words of similar and related meaning. We can view Latent Semantic Indexing (LSI) as a similarity measure that makes judgements based on word co-occurrence. Such co-occurrence has been used as an indicator of topic relatedness in document analysis (Deerwester et al. 1990; Scott & Matwin 1999), but in our work, we use it to analyze phrases. In latent semantic space, phrases that are semantically similar according to the co-occurrence analysis are clustered together. We select the primary cluster as the one reflecting consensus.

To prepare the data for LSI, we first run a spell-check on the web-collected data and correct spellings. We then analyze the sentences to remove frequent words using a stop-word list to reduce the feature set size, and then we perform stemming. Stemming maps word variants to the same feature (e.g. cooking, cooked, and cooks map to the same word cook). We then apply LSI via Singular Value Decomposition (SVD) to find the most relevant sentence cluster. The computation of SVD is quadratic in the rank of the term-by-sentence matrix and cubic in the number of singular values that are computed (Manning & Schuetze 2001).

For example, the following is the raw data in our database collected for the template "A cup is used to ___":

get the right amount of an ingredient when cooking
drink tea
hold tea
drink out of
hold a drink
serve tea
hold coffee
drink coffe from
drink coffee
drink from
drink
drink and hold tea
holding tea
drink from
hold coffee
store items
wake up in the morning
store dishes
keep awake

The entry coffe is replaced by the word coffee in the spell-check phase. The stop words of, a, from, when, up, in, the and an get filtered out. Stemming maps hold and holding to the same root word hold. Thus, after the preprocessing steps we get the following terms:

1. hold
2. tea
3. drink
4. out
5. serve
6. coffee
7. get
8. right
9. amount
10. ingredient
11. cook
12. store

13. item
14. wake
15. morning
16. dish
17. keep
18. awake

This leads to the 19 phrase × 18 term matrix in figure 2. The first phrase indicates a use for a cup, and the last three are nonsense options. The SVD decomposition clusters phrases 2–16 in the clusters represented by the first two singular values. The first phrase shows up in the cluster with the third highest singular value, and the remaining nonsense phrases fall in the clusters with the remaining smaller singular values. Thus LSI helps us select one of the relevant phrases from the 2–16 range. A phrase from this range can be selected randomly.

Figure 2: 19 phrase × 18 term matrix for SVD. Each row corresponds to a particular phrase (e.g. "get the right amount of an ingredient when cooking"), and the 1's and 0's correspond to the presence or absence of a particular term (e.g. "hold").

It is important to point out that even though we used LSI for determining consensus, any of the other techniques for data dimension reduction and clustering, such as Spectral Clustering and Principal Component Analysis (PCA), could be similarly applied. While the LSI method deals nicely with the synonymy problem by clustering synonyms together, it offers only a partial solution to the polysemy problem. That is, a word with more than one entirely different meaning (e.g. bank) is represented as a weighted average of the different meanings. If none of the real meanings is like the average meaning, this may cause a severe distortion. In our case, polysemy is not a serious issue because the restriction of the domain heavily cuts down the number of terms with multiple relevant meanings in the domain.

Using WordNet for knowledge compaction and vocabulary independence

The WordNet lexical database is used to compact the knowledge base and to allow the making of inferences about objects that do not themselves exist in the knowledge base but whose hypernyms or synonyms do.

To illustrate knowledge compaction, suppose our database contains the fact that spoons are usually found in the kitchen. Hypernymy is a linguistic term for the is-a relationship; e.g., since a spoon is a type of cutlery, cutlery is a hypernym of spoon, as shown in the WordNet hierarchy for the noun tableware in Figure 3. In this example, we would ask whether all types of cutlery are usually found in the kitchen. If the consensus answer is yes, we would replace the knowledge about spoon locations with knowledge about cutlery. We use the disambiguated sense to determine the appropriate hypernym to query the user.

Knowledge compaction not only saves space but also makes our knowledge more general. For example, initially our knowledge base may have the knowledge that "forks are usually found in the kitchen" and "spoons are usually found in the kitchen". With knowledge compaction, we remove the knowledge about forks and spoons and replace it with cutlery knowledge. This saves storage space as well as allows us to infer that table knives are found in the kitchen. We could not have inferred anything about table knives before this knowledge compaction.

Since the relations are structured and interconnected, rule-based inference may be applied to produce sentences that were not explicitly entered into the database. For example, if the database contains the facts that a spoon is found near a bowl and a bowl is generally found in the dining room, the inference engine would be able to infer that a spoon may be found in the dining room with a certain degree of confidence. Our knowledge base has been used in conjunction with WordNet to infer the location of objects that were not found in the database. Users may issue commands using objects that do not explicitly exist in the knowledge base but that appear in the WordNet hypernym hierarchy of a known object.
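The compaction step described above can be sketched in a few lines. This is a minimal illustration only: the hand-coded HYPERNYM table stands in for a WordNet lookup, the dict of facts stands in for the Locations(obj, room) relation, and the agreement check stands in for the consensus "yes" obtained from users; all names here are illustrative, not the system's actual code.

```python
# Toy is-a links (child -> parent), standing in for WordNet hypernymy.
HYPERNYM = {
    "spoon": "cutlery",
    "fork": "cutlery",
    "table knife": "cutlery",
    "cutlery": "tableware",
}

def compact(locations):
    """Replace per-object location facts with a single fact about their
    common hypernym when every recorded hyponym agrees on the room
    (a stand-in for the consensus answer described in the text)."""
    compacted = dict(locations)
    for parent in {HYPERNYM[o] for o in locations if o in HYPERNYM}:
        kids = [k for k, p in HYPERNYM.items() if p == parent]
        rooms = {locations[k] for k in kids if k in locations}
        if len(rooms) == 1:
            for k in kids:
                compacted.pop(k, None)   # drop the per-object facts
            compacted[parent] = rooms.pop()  # keep one fact at the hypernym
    return compacted

def locate(obj, locations):
    """Look up an object's room, walking up the hypernym chain if needed."""
    while obj is not None:
        if obj in locations:
            return locations[obj]
        obj = HYPERNYM.get(obj)
    return None

facts = {"fork": "kitchen", "spoon": "kitchen"}
compacted = compact(facts)               # {'cutlery': 'kitchen'}
print(locate("table knife", compacted))  # kitchen (inferred via cutlery)
```

After compaction, the fork and spoon facts collapse into one cutlery fact, and a query about table knives, which was never entered, succeeds by walking up to cutlery, mirroring the table-knife inference in the text.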

Figure 3: Example of a WordNet hierarchy for the noun tableware.
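The hypernym-based location lookup just described can be sketched as follows. This is a hypothetical illustration: the HYPERNYM link stands in for a WordNet query, and the PROXIMITY and LOCATIONS structures stand in for the Proximity(obj1, obj2) and Locations(obj, room) relations; the facts mirror the carafe example discussed in the text.

```python
# Toy stand-ins for the knowledge-base relations and the WordNet link.
HYPERNYM = {"carafe": "bottle"}            # carafe is-a bottle (WordNet)
PROXIMITY = [("bottle", "refrigerator")]   # objects found together
LOCATIONS = {"refrigerator": "kitchen"}    # object -> room facts

def where(obj):
    """Resolve a location: direct facts first, then proximity, then hypernym."""
    if obj in LOCATIONS:
        return LOCATIONS[obj]
    # An object found together with another is assumed to share its room.
    for a, b in PROXIMITY:
        if obj == a and b in LOCATIONS:
            return LOCATIONS[b]
        if obj == b and a in LOCATIONS:
            return LOCATIONS[a]
    # Fall back to the hypernym, as with carafe -> bottle.
    if obj in HYPERNYM:
        return where(HYPERNYM[obj])
    return None

print(where("carafe"))  # kitchen
```

The query for carafe finds no direct fact, climbs to its hypernym bottle, and then answers kitchen through bottle's proximity to the refrigerator, which is the chain of inference the carafe example below walks through.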

Figure 4 shows an example of inferencing where the query is: Where might I find a carafe? From the Open Mind Indoor Common Sense database, after LSI we have:

• Bottle and refrigerator are found together.
• A refrigerator is found in the kitchen.

We can use the WordNet hierarchy and the knowledge entered by netizens to prune irrelevant senses of the word bottle in the context of indoor environments. From this pruned set of senses, we can use the information from WordNet that bottle is a hypernym of carafe. We can therefore infer that a carafe is found in the same place as a bottle, which is the kitchen.

Figure 4: Inference using Open Mind and WordNet.

As another example, the object patchwork was never entered into the knowledge base. However, our inference engine was able to observe that a patchwork is a hyponym of quilt. The knowledge base contained the fact that quilts are typically found in bedrooms and living rooms, and so the system was able to infer that patchworks are found in bedrooms and living rooms.

There is a distinction between hypernyms and hyponyms in inferencing with WordNet. If a queried word is not present in the database but its hypernym is, then we can proceed using the information we have in the database about the hypernym. However, if the queried word is not present in the database but its hyponym is, it is not necessarily sound to use the information contained in the database about the hyponym. For example, if we do not know anything about pets but we know that the hyponym dog has the property of having four legs, we cannot simply deduce that pets have four legs, only that pets can have four legs (they might instead have two legs).

Besides knowledge compaction, it is important that our common sense system provide a degree of vocabulary independence, since two people typically choose the same name for a single well-known object less than 20% of the time (Deerwester et al. 1990). Voorhees (1994) expanded queries manually to include synonyms of terms to improve performance in text retrieval tasks. An example of using synonyms is the query "Where can I find a chalice?" The word chalice does not exist in our knowledge base, but its synonym goblet does. This allows the inference that a chalice can be found in the kitchen.

Conclusions and Further Work

We extracted common sense knowledge as relations from structured sentence inputs obtained over the web. Focusing on a restricted indoor home and office environment domain supplied us with a dense common sense knowledge base that was processed using statistical data clustering techniques to clean up the noisy data as well as to determine consensus. We further processed the knowledge base using the WordNet lexical database to generalize data by replacing implications associated with existing objects by implications associated with their hypernyms. WordNet was also used to provide vocabulary independence in queries.

This framework is general enough to be extended to other restricted environments like hospitals and airports. Outdoor common sense knowledge might be used to determine when it is safe to cross roads. In hospitals and airports, common sense may be used to determine when to offer help to people in carrying objects or to answer queries about locations of objects or places in the environment.

Further work includes refinements to LSI, including phrase finding and methods to handle negation and disjunction in queries. Another improvement would be to incorporate low-frequency words that are informative but are filtered out. We are also interested in expanding this work to learn new knowledge online.

On the interface side, future work includes replacing the text interface with a ___ system. We will also explore human-robot dialogue so that the robot can ask questions if it is not able to understand new information. It will be an interesting challenge to extend the system to understand simple spatial descriptions such as "Peter's room is to the right" or "the object on the table is the projector."

This work can be extended to perform practical reasoning and action selection using the BDI architecture. Another possible path is the use of teleo-reactive programs (Nilsson 1994) in accomplishing tasks in dynamic environments.

Acknowledgments

This work was done while Mykel Kochenderfer was a summer intern at Honda Research Institute USA, Inc. Nils Nilsson reviewed an earlier version of this paper and provided invaluable feedback. Thanks are also due to the anonymous reviewers and the users of the Open Mind Indoor Common Sense website for their data and feedback.

References

Baeza-Yates, R., and Ribeiro-Neto, B. 1999. Modern Information Retrieval. New York: Addison Wesley Longman.

Bratman, M. 1987. Intention, Plans, and Practical Reason. Cambridge, MA: Harvard University Press.

Deerwester, S. C.; Dumais, S. T.; Landauer, T. K.; Furnas, G. W.; and Harshman, R. A. 1990. Indexing by latent semantic analysis. Journal of the American Society of Information Science 41(6):391–407.

Eagle, N.; Singh, P.; and Pentland, A. 2003. Common sense conversations: Understanding casual conversation using a common sense database. In Artificial Intelligence, Information Access, and Mobile Computing Workshop at the 18th International Joint Conference on Artificial Intelligence (IJCAI).

Guha, R. V.; Lenat, D. B.; Pittman, K.; Pratt, D.; and Shepherd, M. 1990. Cyc: A midterm report. Communications of the ACM 33(8):391–407.

Gupta, R., and Kochenderfer, M. J. 2004. Common sense data acquisition for indoor mobile robots. In Nineteenth National Conference on Artificial Intelligence (AAAI-04).

Liu, H.; Lieberman, H.; and Selker, T. 2003. A model of textual affect sensing using real-world knowledge. In Proceedings of the Seventh International Conference on Intelligent User Interfaces (IUI 2003), 125–132.

Luhn, H. P. 1961. The automatic derivation of information retrieval encodement from machine readable text. Information Retrieval and Machine Translation 3(2):1021–1028.

Manning, C. D., and Schuetze, H. 2001. Foundations of Statistical Natural Language Processing. Cambridge, Massachusetts: MIT Press. Chapter: Topics in Information Retrieval: Latent Semantic Indexing.

Mueller, E. T. 1998. Natural Language Processing with ThoughtTreasure. New York: Signiform.

Nilsson, N. J. 1994. Teleo-reactive programs for agent control. Journal of Artificial Intelligence Research 1:139–158.

Rao, A. S., and Georgeff, M. P. 1995. BDI-agents: from theory to practice. In Proceedings of the First International Conference on Multiagent Systems.

Scott, S., and Matwin, S. 1999. Feature engineering for text classification. In Bratko, I., and Dzeroski, S., eds., Proceedings of ICML-99, 16th International Conference on Machine Learning, 379–388. Bled, Slovenia: Morgan Kaufmann.

Stork, D. G. 1999. The Open Mind Initiative. IEEE Expert Systems and Their Applications 14(3):19–20.

Stork, D. G. 2000. Open data collection for training intelligent software in the Open Mind Initiative. In Proceedings of Engineering Intelligent Systems (EIS 2000).

Voorhees, E. M. 1994. Query expansion using lexical-semantic relations. In Proceedings of SIGIR 94, 61–69.

Wooldridge, M. 1999. Intelligent agents. In Weiss, G., ed., Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. Cambridge, MA: MIT Press. 27–78.