MASARYK UNIVERSITY FACULTY OF INFORMATICS

Improving NLP Systems with Common Sense Knowledge and Reasoning

PH.D. THESIS PROPOSAL

Zuzana Nevěřilová

Brno, September 2010

Advisor: doc. PhDr. Karel Pala, CSc. Signature: ......

Contents

1 Introduction
  1.1 Common Sense Definitions
  1.2 Cognitive Science Contribution to Automated NLU
  1.3 Knowledge Representation and Inference
  1.4 Common Sense Reasoning and Context Sensitivity
  1.5 Limited Success of Existing Common Sense Applications and Intelligent Agents
  1.6 Motivation
2 Current State-of-art
  2.1 Resources of Common Sense Knowledge
    2.1.1 Encyclopedias and Explanatory Dictionaries
    2.1.2 Ontologies
    2.1.3 Special Collections of Common Sense Knowledge
    2.1.4 Neural Networks
  2.2 Computer Programs that Use Common Sense
3 Present Results
  3.1 Visualization
  3.2 Collecting Common Sense Propositions
  3.3 Czech Verb Valency Lexicon VerbaLex
  3.4 Publication Overview
4 Aims of the Dissertation
  4.1 Evaluation of Resources of Common Sense Propositions
  4.2 Application
  4.3 Evaluation
  4.4 Time Schedule

Chapter 1 Introduction

Within more than 50 years of computational linguistics, different aspects of natural language understanding (NLU) have been studied. From grammar construction, which initially seemed to resolve the problem, computational linguists moved on to complex, multi-level natural language processing (NLP) systems. In the NLP framework, natural language is generally decomposed into smaller units such as phonemes, morphemes, words, phrases, sentences and discourses. According to Frege's principle (also known as the Principle of compositionality [Janssen, 2001]) “the meaning of an expression is a function of the meaning of its parts and of the syntactic rule by which they are combined.” In computational linguistics this principle is widely accepted and it is the basis of generic NLP systems. According to [Johnson-Laird and Miller, 1976] “understanding the meaning of a sentence depends on knowing how to translate it into the information-processing routines it calls for.” The approach of analysing language at each of these levels has been widely adopted by computer scientists. Actually, there are two very different principles known by the name “Frege's principle”. The Principle of compositionality is widely accepted (and implemented), whereas the second (also called the Context principle: “Never ask for the meaning of a word in isolation, but only in the context of a sentence” [Janssen, 2001]) is accepted only by some linguists (e.g. Corpus Pattern Analysis [Pustejovsky et al., 2004]). According to [Allen, 1995] an NLU system has to use considerable knowledge about the language itself, about the context the discourse is held in and about the general world. Miller and Johnson-Laird in [Johnson-Laird and Miller, 1976] describe the need for context as:

Efforts to put some sensible construction on what another person is saying are usually aided by knowledge of the context in which he says it. The context provides a pool of shared information on which both parties to a conversation can draw. The information, both contextual and general, that a speaker believes his listener shares with him constitutes the cognitive background of this utterance.

Researchers in artificial intelligence (AI) such as Lieberman, Lenat or Minsky agree that NLU is conditional on real-world knowledge (in this work called common sense knowledge). Currently, there is a huge effort in developing collections of common sense propositions (for definitions see below) as well as reasoners over these collections. However, in practice not many applications exist and not many applications have been published. It even seems that statistical methods are widespread, while research on common sense collections and common sense reasoning has remained experimental for years. The following sections try to clarify the problem and the incomplete results of the work done so far.

1.1 Common Sense Definitions

First of all, the term common sense has to be defined. It can be defined from different points of view: from the view of linguistics, cognitive science, artificial intelligence or computer science. For this reason, three definitions are provided: “Common sense includes commonsense knowledge – the kinds of facts and concepts that most of us know – but also the commonsense reasoning skills which people use for applying their knowledge. We each use terms like commonsense for the things that we expect other people to know and regard as obvious” [Minsky, 2006]. Common sense is simply shared knowledge. In human communication this shared knowledge is not mentioned because it is expected to be known to all participants of the communication. If this expectation is exaggerated, it leads to communication misunderstandings. This happens very often in human-computer communication. On the other hand, if the expectation is underestimated, the conversation is boring. According to [Minsky, 1986] “common sense is not a simple thing. Instead, it is an immense society of hard-earned practical ideas–multitudes of life-earned rules and exceptions, dispositions and tendencies, balances and checks.” Adults cannot recall their own process of learning the basic facts and rules. That is why it is called common sense. This knowledge, acquired in childhood and improved during the whole life, comprises [Minsky, 2006]:


• Social rules. For example, inanimate objects do not move themselves; they have to be pushed, pulled or carried. Those actions are considered inappropriate if applied to a person.
• Economic rules. Every action leads to questions about how much effort and time one should spend comparing the costs of alternative solutions.
• Conversational skills. People usually know how to keep track of the topic, their conversational goals and their social roles. Everybody has to guess what his/her addressees already know – repeating things one already knows is annoying (see also communication maxims in [Grice, 1989]).
• Sensory and motor skills. These skills are usually not called “commonsensical”, but the (in)ability to physically do something is undoubtedly a part of planning future actions.
• Self-knowledge. A model of one's own abilities is necessary for planning.

Common sense has many relations to the physical world, people's usual abilities as well as emotions. Minsky in [Minsky, 2006] states that “emotions are certain ways to think that we use to increase our resourcefulness”. Moreover, Minsky is convinced that purely logical, rational thinking does not exist because our minds are always affected by our assumptions, intentions and values of life. Barry Smith in [Smith, 1995] describes common sense as “on one hand a certain set of processes of natural cognition–of speaking, reasoning, seeing, and so on. On the other hand common sense is a system of beliefs (or folk physics and folk psychology). Over against both of these is the world of common sense, the world of objects to which the processes of natural cognition and the corresponding belief-contents standardly relate.” Common sense propositions are not always related to scientific or even real-world observations (e.g. propositions such as “natural gas smells”, “an oasis is a calm place”). According to [Smith, 1995], common sense is not considered to be a single, coherent object of scientific observation (similarly to natural language). Its beliefs are context-dependent and this dependency is in principle unlimitedly nuanced. Moreover, there is not a single “world” to which natural cognition can relate.


1.2 Cognitive Science Contribution to Automated NLU

Cognitive science is an interdisciplinary study of mind and intelligence. Its start is probably in psychology, where cognitivism is, in part, a synthesis of earlier forms of psychological analysis. It emphasizes internal mental processes, but it has come to use precise quantitative analysis to study how people learn and think [Sternberg, 2002]. Connectionism is a subfield of cognitive science, neuroscience and artificial intelligence that has attracted the interest of computer scientists since the 1980s. The basic idea of connectionism is that mental models are represented by networks of simple units and “the key to knowledge representation lies in the connections among various nodes, not in each individual node” [Sternberg, 2002]. Cognitive science tries to discover how the human mind and memory work by means of observations of human behavior. Apart from observations of people with impairments (e.g. aphasia or autism), there are several generic experiments that support hypotheses about how the human brain works. Semantic priming is one such external piece of evidence about human memory storage, retrieval and organization. According to [McNamara, 2005] “priming is an improvement in performance in a perceptual or cognitive task, relative to an appropriate baseline, produced by context or prior experience. Semantic priming refers to the improvement in speed or accuracy to respond to a stimulus, such as a word or picture, when it is preceded by a semantically related stimulus (e.g. cat-dog) relative to when it is preceded by a semantically unrelated stimulus (e.g. table-dog).” Semantic priming is used as a tool to investigate some aspects of perception and cognition. Semantic priming has different models where the stimulus and semantic relations are represented by different means. In the distributed network model, concepts are represented as patterns of activation; in spreading activation, concepts are represented by nodes. Closely connected to associative networks (see section 2.1.2), several types of spreading activation models exist: Collins and Loftus' model and the ACT* models are the best known. The spreading activation model is convenient for the purpose of this work, since the data at our disposal are mostly associative networks or related resources. Currently the spreading activation model is used in information retrieval and similar tasks such as recommendation systems (e.g. [Huang et al., 2004]). According to [Crestani, 1997] spreading activation (SA) consists of a sequence of iterations, where one iteration consists of one or more pulses and a termination check.

Each pulse consists of preadjustment, spreading and postadjustment; the preadjustment and postadjustment phases are optional. The algorithm starts with one or more activated nodes (representing a concept) in the network and during the iterations activation is spread through relations between nodes (a small sketch of this iteration is given after the example list below). In [Kleb and Abecker, 2010], spreading activation is used for entity disambiguation in Resource Description Framework (RDF) ontologies. RDF is intended as a standard format for the Semantic Web. This approach does not need any pre-learned knowledge to identify the most likely ontology elements given a particular context. The Bubble Lexicon [Liu, 2003] uses spreading activation to express the meaning of words: “a word only becomes coherently meaningful in a bubble lexicon as a result of simulation (graph traversal) via spreading activation from the origin node, toward some destination. This helps to exclude meaning attachments which are irrelevant in the current context, to hammer down a more coherent meaning”. Liu illustrates the Bubble Lexicon on different meanings of “fast car”. Resulting from network traversal, “fast car” is:

• car – top speed – fast: The car whose top speed is fast.
• car – drive – speed – fast: The car that can be driven at a speed that is fast.
• car – tires – rating – fast: The car whose tires have a rating that is fast.
• car – paint – drying time – fast: The car whose paint has a drying time that is fast.
• etc.
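To make the iteration described above concrete, the following sketch implements a minimal spreading activation pass over a small associative network. It is only an illustration under simplified assumptions (one pulse per iteration, no preadjustment or postadjustment); the node names, edge weights, decay factor and threshold are invented for the example and do not come from any of the cited systems.

from collections import defaultdict

def spread_activation(graph, seeds, iterations=2, decay=0.5, firing_threshold=0.1):
    # graph: node -> list of (neighbour, edge weight); seeds: node -> initial activation
    activation = defaultdict(float, seeds)
    for _ in range(iterations):                        # termination check: fixed number of pulses
        pulse = defaultdict(float)
        for node, value in activation.items():
            if value < firing_threshold:               # only sufficiently activated nodes fire
                continue
            for neighbour, weight in graph.get(node, []):
                pulse[neighbour] += value * weight * decay     # spreading phase
        for node, value in pulse.items():
            activation[node] = max(activation[node], value)    # keep the strongest activation seen
    return dict(activation)

# A toy associative network around "car"; edges and weights are invented.
graph = {
    "car": [("top speed", 0.9), ("tires", 0.6), ("paint", 0.3)],
    "top speed": [("fast", 0.9)],
    "tires": [("rating", 0.8)],
}
print(spread_activation(graph, {"car": 1.0}))

After two pulses the activation reaching “fast” via “top speed” is the strongest of the newly reached nodes, which corresponds to the first (and most natural) reading of “fast car” in Liu's list.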

1.3 Knowledge Representation and Inference

Some automated NLU systems use the knowledge representation (KR) approach. The goal of KR is to represent knowledge and infer new knowledge. KR is a formal description that often uses a special KR language. Some researchers prefer mathematical formalisms such as lambda calculus or first order logic, others such as [Bobrow and Winograd, 1976] propose a complex system that integrates procedural knowledge with a broad base of declarative forms. According to Collins and Quillian in [Collins and Quillian, 1969] it is assumed that people do not store all pieces of knowledge (e.g. dogs have four legs, cats have four legs). It is assumed that the capacity of human memory is finite and not very large and thus people need to save space. From this point of view it is much more efficient to keep in memory only general

knowledge about types (e.g. mammals have four legs, dogs are mammals) and infer new pieces of knowledge on demand (e.g. dogs have four legs). Computer programs with KR usually use this notion. Computer programs store, retrieve and organize knowledge using knowledge bases (KB). Conclusions based on general facts and inference rules are distinguished into two types [Allen, 1995]:

• entailment – a formula that is true given the formulas in the KB
• implication – a conclusion that is drawn from the KB but can be explicitly denied

Generally, inference is distinguished into three types [Allen, 1995]:

• deduction – what can be logically proved
• induction – learning generalities from examples
• abduction – inferring causes from effects

While deduction is the most precise, the other inference techniques (which can sometimes lead to false conclusions) are very common in natural language.
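The economy-of-storage idea from Collins and Quillian can be illustrated with a few lines of code: a property is stored only at the most general applicable type, and a query deduces it by climbing the is-a hierarchy. The concepts and properties below are invented for the example; this is a sketch of the principle, not of any cited system.

IS_A = {"dog": "mammal", "cat": "mammal", "mammal": "animal"}
HAS = {"mammal": {"legs": 4}, "animal": {"alive": True}}

def lookup(concept, prop):
    # Deduce a property stored only once, at the most general applicable type.
    while concept is not None:
        if prop in HAS.get(concept, {}):
            return HAS[concept][prop]
        concept = IS_A.get(concept)   # move to the supertype
    return None                       # nothing stored and nothing deducible

print(lookup("dog", "legs"))   # 4, deduced via dog -> mammal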

1.4 Common Sense Reasoning and Context Sensitivity

In the previous section different inference types were mentioned. However, in practice, existing common sense reasoning systems do not use these types of inference. According to [Liu and Singh, 2004] “There seems to be a divergence between the logical approach to reasoning and what is known about how people reason. . . . Most notably, commonsense knowledge is largely defeasible and context-sensitive.” For the purpose of this dissertation, context sensitivity means the fact that a word or word expression has different meanings and connotations in different contexts. Henry Lieberman [Lieberman, 2002] shows, in the examples below, the perplexity that can originate from inference:

Birds can fly. Tweety is a bird. Thus, Tweety can fly. – OK
Cheap apartments are rare. Rare things are expensive. Thus, cheap apartments are expensive. – OOPS!

Lieberman and Selker in [Lieberman and Selker, 2000] describe the ways of common sense reasoning in a particular context. While typical computer programs are context-independent, the authors explain the necessity

of context-sensitive computing for the evolution of “intelligent” software. In section 2.1.3 different approaches to common sense reasoning are described and most of them are (in some way) context-sensitive. According to [Pala, 1999], the knowledge needed to understand a discourse is of two kinds:

• specific knowledge of the current situation
• general knowledge of the world

Specific knowledge is also known as the communication situation, described in [Pala, 1999] as a vector

$K_s = \langle o_1, o_2, \ldots, o_n, m_i, p_j, l_k, t_m \rangle$, where $o_1, o_2, \ldots, o_n$ are objects of the discourse, $m_i$ is the speaker (writer), $p_j$ is the addressee of the communication, $l_k$ is the location and $t_m$ is the time interval of the communication. General knowledge concerns mostly information about types and inference rules, while specific knowledge concerns mostly information about individuals. Context sensitivity is here described by the vector $K_s$.
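If the communication situation is to be passed around inside a program, the vector can be encoded as a simple record. The field names below are my own reading of Pala's notation, not an established data structure, and the filled-in values are invented.

from dataclasses import dataclass, field
from typing import List

@dataclass
class CommunicationSituation:
    # K_s = <o_1 ... o_n, m_i, p_j, l_k, t_m> from [Pala, 1999]
    objects: List[str] = field(default_factory=list)   # o_1 ... o_n, objects of the discourse
    speaker: str = ""                                   # m_i, the speaker or writer
    addressee: str = ""                                 # p_j
    location: str = ""                                  # l_k
    time_interval: str = ""                             # t_m

ks = CommunicationSituation(objects=["lunch", "beer"], speaker="narrator",
                            addressee="reader", location="olive grove",
                            time_interval="around noon")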

1.5 Limited Success of Existing Common Sense Applications and Intelligent Agents

According to [Lieberman et al., 2004], common sense applications such as question-answering systems were relatively unsuccessful for two reasons. First, the system has to be fast, otherwise users give up on using the system because of its lack of interactivity. Second, collections of common sense propositions are huge, but nonetheless sparse. Therefore, there are situations not covered by the system. If the system tries to reason about such a situation, it leads to serious misunderstandings. Lieberman et al. propose using common sense in “fail-soft” applications such as intelligent agents. The difference is that such an agent helps users only where it is able to. “If a common sense inference turns out wrong, the user is often no worse off than they would be without any assistance” [Lieberman et al., 2004]. According to [Russell et al., 1996] “an agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators”. In the case of software agents, usual computer inputs (e.g. keystrokes, incoming packets) are sensory inputs and usual outputs (e.g. display, printer) are actuators. Intelligent agents are described as

agents that take the best possible action in a situation. Therefore, a performance measure is needed for each agent. Programs that are intelligent agents are context-sensitive (they “act upon an environment”) and do not work in the traditional input–output iteration. Intelligent agents can act on their own as well as not react at all to a user's signal in case they do not have enough knowledge. Ira Rudowsky in [Rudowsky, 2004] summarizes that the key features of an agent are autonomy and the ability to act without human intervention. A simple agent example is a thermostat for a heater. On the other hand, the Remote Agent of NASA's Deep Space 1 is an example of a complex agent:

Remote Agent will formulate its own plans, by combining the high level goals provided by the operations team with its detailed knowledge of both the condition of the spacecraft and how to control it. It then executes that plan, constantly monitoring its progress. [Rudowsky, 2004]

1.6 Motivation

With these ideas in mind, I plan to build an application extending a bigger NLP system with common sense propositions and newly inferred propositions. The application should be “fail-soft” in the sense that it helps only where it can. The application has to deal with the innate vagueness of common sense and its results will be vague as well. Methods used for achieving such a goal comprise those of cognitive science (such as semantic priming), artificial intelligence (AI) and computational linguistics (such as knowledge representation, inference, semantic networks and ontologies). To be more explicit, the application's contribution can range from semantic disambiguation, topic spotting and metonymy resolution1 to affect sensing. It will not be perfect or complete in any of these areas, but an improvement can be achieved. The structure of this thesis proposal follows. In chapter 2 the state of the art is described. It starts with enumerating existing resources and discusses their content with respect to common sense propositions. Section 2.2 describes computer programs with common sense. Chapter 3 summarizes my present results related to the topic. Chapter 4 describes the aims of the dissertation and a time schedule.

1. Metonymy is a figure of speech in which a concept is referred to by the name of something associated with it, e.g. “Prague” in some contexts means “the Czech government”.

Chapter 2 Current State-of-art

Currently, a lot of common sense research concentrates on creating resources of common sense knowledge. However, the results of the second phase of this research – the applications – are not numerous. To be more precise, present applications mainly serve to support or disconfirm a hypothesis of computer science or cognitive psychology and are therefore rather experimental. In the first part of this chapter different types of resources are introduced, while in the second part some representative applications are mentioned.

2.1 Resources of Common Sense Knowledge

One of the basic properties of common sense knowledge is that it is rarely mentioned. As was stated in section 1.1, in human communication there is no need to mention knowledge that is common. Therefore such knowledge is mentioned only in inappropriate communication, in communication with people we do not expect to share the knowledge (mainly children) or, sometimes, in collections of knowledge. The following summary shortly describes some resources of common sense knowledge. The selection below presents the best representatives of their categories. Some of the resources are long-term projects and provide large-scale data, some are suspended but contain remarkable results or ideas.

2.1.1 Encyclopedias and Explanatory Dictionaries

Encyclopedias and explanatory dictionaries (EDs) are useful knowledge resources for humans. Generally, encyclopedias contain information about individuals (people, places, organizations, events, phenomena) and EDs contain information about classes. Usually an encyclopedia or ED entry consists of genus proximum (family) and differentia specifica (distinguishers from siblings).


Both encyclopedias and EDs partially contain common sense knowledge, but there are differences: there should be no folk physics or folk psychology (as described in 1.1). The definitions do not have to be complete from the point of view of a computer program, because some facts are obvious for humans. Moreover, encyclopedias and EDs contain information in natural language, thus it has to be converted to a machine understandable form where possible1. Using encyclopedias and EDs as common sense resources for an NLP system can be criticized on the ground that these resources contain more information than an average human knows. In practice, this is not a disadvantage. The program should not be confused by the huge amount of information; the situation is similar to a scholar reading a popular scientific text and getting bored.

2.1.2 Ontologies

The term ontology currently appears in many domains. In this work, ontology means “. . . a set of representational primitives with which to model a domain of knowledge or discourse. The representational primitives are typically classes (or sets), attributes (or properties), and relationships (or relations among class members). The definitions of the representational primitives include information about their meaning and constraints on their logically consistent application.” [Gruber, 2009] The term semantic network was introduced by M. Ross Quillian [Collins and Quillian, 1969]. Initially it meant a network of concepts2, their attributes and the hierarchical sub-superclass relationship. Each concept is represented by a node and nodes are connected via “is-a” and “instance-of” links. Currently, the term semantic network is often used in a more general sense: nodes represent information items and links represent sometimes undefined or unlabeled relations. This type of network is sometimes called an associative network [Crestani, 1997]. Semantic (or associative) networks are considered to be one formal representation of ontologies, while frames and scripts are another. Frames were introduced by Marvin Minsky [Minsky, 1986] and according to him are “representations of stereotyped situations. Attached to each frame are several kinds of information.

1. N.B. that there is a significant difference between a machine readable format (e.g. XML) and a machine understandable format (e.g. a knowledge representation language).
2. Note that a concept is not a language expression; word expressions are only labels for concepts.

Some of this information is about how to use the frame3. Some is about what one can expect to happen next. Some is about what to do if these expectations are not confirmed.” The idea of scripts also came in the 1970s, from Roger Schank [Schank and Abelson, 1977]. A script is a representation of a stereotypical sequence of actions that are temporally ordered. Currently, the best-known ontologies include Princeton WordNet, EuroWordNet and FrameNet. Within the Czech context a closely related work is worth mentioning: the verb valency lexicon VerbaLex. Princeton WordNet (PWN) [Fellbaum, 1998] is a lexical database. Its basic units – synsets (synonymic sets) – are organized in hierarchies (using semantic relations such as hypo/hypernymy, holo/meronymy etc.). EuroWordNet (EWN) [Vossen, 1998] is a follow-up project that extended PWN with several languages (including Czech) and an inter-lingual index (ILI) that connects synsets with the same meaning across languages. The basic advantage of the WordNets from the point of view of this dissertation is that the units and relations between them are understandable for humans and ready to use for computer programs. A possible drawback of the WordNets is the relation sparsity of the data: e.g. in the Czech WordNet (CzWN) there are 28 201 synsets, but only 34 267 relations between them4. FrameNet is an on-line lexical resource based on frame semantics and supported by corpus evidence. In 2006 it contained more than 10 000 lexical units (word–meaning pairs) in nearly 800 hierarchically related semantic frames [Ruppenhofer et al., 2006]. Typically the frame-evoking lexical unit is a verb and the frame elements are its dependents. There are several advantages of FrameNet for the purpose of this dissertation: FrameNet is intended for annotating continuous texts. It recognizes named entities (words that do not have to be in a frame, e.g. names) and lexical units. It can find the appropriate frame for these lexical units. The annotation is accessible on-line [Ruppenhofer et al., 2006]. FrameNet is available for the English language only. VerbaLex cannot be considered an ontology, since there are no relations between frames yet. However, it is presently the most complete lexicon of verb valencies for the Czech language. In 2006 [Hlaváčková, 2007] it contained 35 302 verb meanings (of 11 306 verb lemmata) within 28 418 verb frames with slots describing appropriate semantic roles5.

3. Author’s note: this kind of information is often called meta-information 4. The numbers were taken from European Association’s Catalogue, accessed 4-September-2010, on-line from http://catalog.elra.info/product_ info.php?products_id=1089. 5. Also known as thematic roles, semantic roles were introduced in mid-1960’s as a way of classifying the arguments of natural language predicates [Gruber, 1965, Fillmore, 1968].

Semantic roles such as Agent, Location or Patient form an important part of VerbaLex. The lexicon uses two layers of semantic roles; the second layer relates the role to PWN, e.g. Agent is restricted to PWN person:1. VerbaLex thus represents basic knowledge about super-types (e.g. the agent of buying is a person). Together with PWN, VerbaLex provides information about subtypes (e.g. a girl (subclass of PWN “person”) bought new shoes (subclass of PWN “goods”)). VerbaLex is made by linguists and therefore contains much linguistic information (cases, aspects, idioms). Currently VerbaLex is planned to be extended with inter-frame relations such as precondition, effect or decomposition [Nevěřilová, 2009]. With such relations it could be inferred that a particular verb frame has a particular relationship to other frames. For example, by setting the relation “effect” between the verb frames “přijít” (to arrive) and “nacházet se” (to be found somewhere), a program could infer that the Agent arriving at a Location can be found at that Location afterwards. Since there are a lot of frames, this relation setting cannot be done manually. In [Nevěřilová, 2010], verb frames with similar slots or the same verb class (based on Levin's classes [Levin, 1993]) are grouped so that the relations can be set between whole groups of verbs. This is expected to be done semi-automatically, but so far it is a work in progress (see also chapter 3). Materna and Pala in [Materna and Pala, 2010] establish a relationship between FrameNet and VerbaLex. Using their semi-automatic approach, FrameNet frames (including frame-to-frame relations) can be linked to VerbaLex frames. This method could be used as a core for building a FrameNet in Czech. Ontologies have a lot in common with encyclopedias and explanatory dictionaries. All of these resources contain (in some form) ontological categories (categories of possible objects meant [Thomasson, 2009]). Encyclopedias and EDs usually contain a domain assignment of an entry (often in the form of a field label), while in ontologies a relation has-domain is established. Lexical information (such as derivatives) is more frequent in EDs and WordNets. The advantage of encyclopedias and EDs is tradition: they exist in all European languages; in Czech there are three EDs (PSJČ [Havránek et al., 1957], SSJČ [Havránek et al., 1971] and SSČ [Filipec and Daneš, 1995], all available with the DebDict tool [Horák et al., 2006]). On the other hand, ontologies are made to be readable for humans and understandable for computers. If encyclopedias and EDs are used as a resource of common sense propositions, some conversions to machine understandable formats will be needed.
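The planned “effect” relation between verb frames can be thought of as a rewrite rule over role-labelled propositions: when the text instantiates the first frame, a proposition instantiating the second frame may be added. The sketch below only illustrates this idea; the frame encoding and slot names are simplified stand-ins, not the actual VerbaLex format.

# Hypothetical, simplified frames: a verb lemma plus role -> filler slots.
EFFECT_RULES = {
    # "přijít" (to arrive) has the effect "nacházet se" (to be located somewhere)
    "přijít": ("nacházet se", ["AGENT", "LOCATION"]),
}

def infer_effects(proposition):
    # proposition = (verb, {role: filler}); returns the inferred effect propositions
    verb, slots = proposition
    inferred = []
    if verb in EFFECT_RULES:
        effect_verb, needed_roles = EFFECT_RULES[verb]
        if all(role in slots for role in needed_roles):
            inferred.append((effect_verb, {r: slots[r] for r in needed_roles}))
    return inferred

print(infer_effects(("přijít", {"AGENT": "kameník", "LOCATION": "olivovník"})))
# [('nacházet se', {'AGENT': 'kameník', 'LOCATION': 'olivovník'})]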


2.1.3 Special Collections of Common Sense Knowledge

In recent years researchers have concentrated on creating collections of common sense propositions especially for future computer programs to use. They hope that these special collections will contain the knowledge that is not mentioned in encyclopedias, EDs or ontologies. This section contains a short description of representative special collections and a summary.

CyC CyC is a large handcrafted knowledge base and inference machine. In 2007 it contained roughly 3 million propositions (called assertions) [Lieberman, 2007]; thanks to the inference machine, millions more can be inferred. According to [CyC, 2010], several forms of reasoning are used, including general logical deduction (modus ponens, modus tollens, quantifiers) and inference mechanisms typical for AI such as inheritance or automatic classification. CyC puts each of its assertions into one or more contexts (called microtheories). CyC does not use numeric certainty factors unless statistics is available. Instead, it uses assertions about assertions, e.g. assertion A is less likely than assertion B. Several applications based on CyC exist, others are in development. More information about the project is available in [Lenat, 1995]. Cycorp Inc., the owner of CyC, provides two versions of the database for free: OpenCyC and ResearchCyC.

ThoughtTreasure ThoughtTreasure contains over 25 000 concepts organized in a hierarchy and about 100 scripts [Schank and Abelson, 1977] describing common situations such as eating in a restaurant. While inspired by CyC, ThoughtTreasure is different: for representation it uses logic, finite automata, grids and scripts; propositions are represented in a relational database format [Mueller, 2003]. ThoughtTreasure contains both encyclopedic facts and knowledge about the language. It contains near-synonyms in English and French. Beyond the declarative knowledge it also contains agents: emotional agents, understanding agents, planning agents, etc. Thanks to these agents ThoughtTreasure can process a story: it can confirm facts that correlate with expectations (e.g. “A is an enemy of B”, “A hates B”) and it can find oddities such as “A is an enemy of B”, “A helps B”. ThoughtTreasure's development stopped in 2000. However, the idea of multiple agents is very inspiring.


ConceptNet ConceptNet [Arnold et al., 2009] is based on the Open Mind Common Sense project. It is an open-source, multilingual semantic network of common sense knowledge. It allows common sense reasoning thanks to Divisi, a tool for reasoning by analogy. In ConceptNet, three kinds of flexible inference are used: context finding, inference chaining and conceptual analogy [Liu and Singh, 2004]. The “reasoning is associative and thus not as expressive, exact, or certain as logical inferences, but it is much more straightforward to perform, and useful for reasoning practically over text”.

Games for Collecting Common Sense Propositions While the development of specialized collections of common sense propositions is an expensive and long-term task, another approach is coming into fashion: collecting common sense propositions by means of games. Here, the contributors are anonymous Internet users and their intention is not to build a collection of common sense propositions but simply to have fun. Von Ahn calls these games “games with a purpose” (GWAP) [von Ahn, 2006]. Different games with different purposes exist, e.g. 20 Questions [Speer et al., 2009], Verbosity [von Ahn et al., 2006], X-plain [Nevěřilová, 2010b] (see also chapter 3) or JeuxDeMots [Lafourcade and Joubert, 2009]. The common features of these games are:

• the game rules are straightforward so that most people can understand them
• the task can be simple, such as categorization, or difficult, such as explanation of a word
• the game has a time limit (this is one of the features that distinguishes a game from an annotation task [Vickrey et al., 2008])
• the collection contains errors, but is still considered quite reliable thanks to measuring agreement among players

Summary In the previous paragraphs, different approaches to collecting common sense propositions were introduced. Several attributes of these collections can be observed and compared:

• created by experts (CyC, ThoughtTreasure) or by the public (ConceptNet, GWAP)
• contain words (ConceptNet, GWAP) or concepts (CyC, ThoughtTreasure)


• contain inference (CyC, ThoughtTreasure, ConceptNet) or are simply a collection (GWAP)
• language: CyC, ThoughtTreasure, Verbosity and ConceptNet are in English. ConceptNet and ThoughtTreasure are in French as well. JeuxDeMots is in French. X-plain is in Czech
• size: CyC contains millions of propositions, ThoughtTreasure contains tens of thousands, ConceptNet contains 1.6 million relations between more than 300 000 nodes, GWAP contain thousands of propositions. Numeric comparison gives only an approximate notion of the size, since all these resources use different knowledge representations.

2.1.4 Neural Networks

Neural networks are only moderately relevant to the dissertation topic. However, there is one project worth mentioning: a 20 questions game (the computer asks the human at most 20 yes/no questions and tries to guess what object the human has been thinking about). The program at http://www.20q.net is based on a neural network and by 2004 the website had accumulated about 10 million synaptic associations [Kelly, 2007]. It had a 73 % success rate of guessing. In August 2010 the game had been played over 77 million times. The neural network filled with data from players provides a lot of common sense knowledge in several languages including Czech. To my knowledge there is not much scientific information about this project (no reviewed articles published), but the application is interesting.

2.2 Computer Programs that Use Common Sense

The history of adding common sense to computer programs is coeval with that of artificial intelligence. The Loebner Prize for artificial intelligence is the first formal instantiation of a Turing Test [Turing, 1950]. Each year an annual prize of $2 000 and a bronze medal is awarded to the most human-like computer; however, nobody has been awarded the Grand Prize of $100 000 and a gold medal yet. In this section I try to describe the progress of computer programs written for NLP. The following enumeration tries to describe some representatives chronologically. These computer programs are worth mentioning because they test cognitive science, computational linguistics and AI hypotheses in the real world.


SHRDLU SHRDLU [Wasserman, 1985] was a program made by Terry Winograd in 1972. It dealt with a specific domain – the block world – and was concerned both with physical object representation and natural language understanding. The novelty was in using three components: a syntactic parser, a semantic processor and a “logical deductive segment that figured out the users' requests and answer questions” [Wasserman, 1985]. SHRDLU used semantic networks for the representation of associations.

MARGIE One of the first attempts at (general) natural language understanding programs was MARGIE [Schank et al., 1973] – a set of programs that “displays its understanding by paraphrasing the text it reads and by answering questions about them” [Garnham, 1988]. It consisted of three modules: a semantic parser, an inference mechanism that uses knowledge about the world, and a response generator. As a semantic representation it used the theory of conceptual dependency (language independent, primitive based), which was new at that time.

GUS While SHRDLU and MARGIE were experimental and did not have much application to real-world situations, GUS was designed to provide information on airline flight schedules [Wasserman, 1985]. GUS used the frame concept for semantic representation.

IPP IPP uses a frame representation scheme as well. It uses high-level representational structures called Memory Organization Packets (MOPs). MOPs can change dynamically during the analysis. IPP's main contribution is its generalization capability – it recognizes similarities and differences between events stored in MOPs and builds other MOPs that are used as stereotypical knowledge [Wasserman, 1985].

SensiCal SensiCal (a smart calendar) [Mueller, 2000a] is an application that looks for conflicts in calendar entries. It uses common sense propositions such as “a steakhouse is a restaurant”, “steaks are served in steakhouses”, “steak is made from meat”, “vegetarians avoid meat”. Then it can detect conflicts such as “dinner with Lin in Frank's Steakhouse”, “Lin is vegetarian” and warn the user: “You are taking Lin who is vegetarian to a steakhouse?”.
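A toy version of the SensiCal check can be written over a handful of (subject, relation, object) propositions; everything below (the fact list, the relation names, the rule) is invented for illustration and is not SensiCal's actual knowledge base.

FACTS = {
    ("steakhouse", "serves", "steak"),
    ("steak", "made_from", "meat"),
    ("Lin", "is", "vegetarian"),
    ("vegetarian", "avoids", "meat"),
}

def food_conflict(person, venue_type):
    # Warn if the venue serves something made from an ingredient the person avoids.
    avoided = {o for s, r, o in FACTS if r == "avoids" and (person, "is", s) in FACTS}
    served = {o for s, r, o in FACTS if s == venue_type and r == "serves"}
    for dish in served:
        ingredients = {o for s, r, o in FACTS if s == dish and r == "made_from"}
        clash = ingredients & avoided
        if clash:
            return "You are taking %s, who avoids %s, to a %s?" % (person, ", ".join(clash), venue_type)
    return None

print(food_conflict("Lin", "steakhouse"))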

NewsForms NewsForms and NewsExtract [Mueller, 2000b] are another pair of applications that use common sense knowledge to extract information from

newspapers. They can extract information about people, organizations, locations, earnings, injuries, legal events, medical findings, wars, weather etc.

Story Understanding with ThoughtTreasure There is a large effort in building story understanding programs. Mueller [Mueller, 2002] describes story understanding as simulation:

• A simulation is a sequence of states
• A state is a snapshot of the mental world of each story character and the physical world

Mueller’s project ThoughtTreasure maintains this concept by using sev- eral understanding agents that “decide whether a given input concept makes sense given the current context. Each understanding agent returns a make sense rating that indicates how much an input concept makes sense along with reasons why that concept makes or does not make sense” [Mueller, 2003]. ThoughtTreasure can resolve oddities in emotions or inter- personal relations. Eric T. Mueller in [Mueller, 2003] describes an example: Jacques is an enemy of Franc¸ois. [enemy-of Francois Jacques] Jacques succeeded at being elected President of France. [succeeded-goal Jacques [President-of France Jacques]] Franc¸ois is happy for Jacques. [happy-for Francois Jacques] ThoughtTreasure’s understanding agents (UA) will resolve the situation as follows: ----UA_Emotion---- makes sense because [succeeded-goal Jacques [President-of France Jacques]] does not make sense because [enemy-of Francois Jacques] True, he succeeds at being the president of France. But I thought that he was an enemy of Franc¸ois.

Gossip Galore Recent applications are often developed as intelligent agents (see 1.5). The conversation agent Gossip Galore [Cheng et al., 2009] can chat with users about popular musicians, bands, actors etc. It can reveal relationships among these people. It uses a textual interface for simple questions and answers and a visual interface for displaying relationships among celebrities.


Siri Siri’s Personal Assistant for iPhone [Schonfeld, 2010] is able to recog- nize speech and find information about events, transports, weather, restau- rants etc. depending on the context (time, date and place). Apart from ticket reservations, weather forecasts etc. it uses data from Wolfram|Alpha (a question-answer system which aim is to make all systematic knowledge computable and accessible [Wolfram|Alpha, 2010]). The application has a conversational interface.

Other Applications In [Lieberman et al., 2004] several other applications of common sense are described: telling stories with ARIA (an application that semi-automatically annotates photographs), common sense word completion (an application useful namely on mobile devices), and affective classification of text (Empathy Buddy). Most of these applications seem to be prototypes.

Summary Although the evolution of “intelligent” software has quite a long history, experimental applications still predominate over practical ones6. The applications listed above use different approaches, but it seems that some approaches are better than others, namely the use of several different resources of information and the use of several types of semantic representation (frames, networks etc.). Almost all the programs use some type of reasoning. However, it seems that a cautious approach that sometimes infers nothing is better than inferring oddities.

6. However, my current knowledge need not cover all the work done (especially organizations such as DARPA do not publish their results).

Chapter 3 Present Results

This chapter summarizes my work on different topics related to the thesis.

3.1 Visualization

My master's thesis [Nevěřilová, 2005] and the following publications [Nevěřilová, 2005, Gregar et al., 2006, Nevěřilová, 2010a] concern visualization of associative networks. Although the topic seems to be only moderately relevant to the dissertation, the benefits for the dissertation are as follows:

• the work accumulates relevant knowledge about data representation and associative networks
• every visualization is preceded by a deep study of the data structure being visualized
• visual representation may be one of the intended outputs of the dissertation

Further work has been done on the visualization application and currently (2010) it is able to visualize not only lexicons, but any (associative) network. Visualization of semantic (associative) networks has been described in the paper [Nevěřilová, 2005]. In practice, several data structures have been visualized:

• library data and citations (presented in “XML-based flexible visualization of networks: Visual Browser”)
• e-learning data [Gregar et al., 2006]
• search results and related elements of a digital mathematics library [Nevěřilová, 2010a]

3.2 Collecting Common Sense Propositions

The paper [Nevěřilová, 2010b] presents a way of collecting common sense propositions by means of a game. It is intended for one player; the second player is a computer program.

Its “intelligence” is based on the growing collection and on the Word Sketch Engine [Kilgarriff et al., 2004]. The resulting collection is in the form of triples (subject, relation, object) and the agreement among players is considered as well. So far more than 3 000 unique triples have been collected.
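One straightforward way to exploit the agreement among players is to keep the triples as counts and accept only those entered independently more than once. The sketch below shows such an aggregation; it is an illustration of the idea, not the storage format actually used by X-plain, and the example triples are invented.

from collections import Counter

def aggregate(raw_triples, min_agreement=2):
    # raw_triples: (subject, relation, object) tuples as entered by individual players.
    # Keep only triples confirmed by at least min_agreement contributions.
    counts = Counter((s.lower(), r, o.lower()) for s, r, o in raw_triples)
    return {triple: n for triple, n in counts.items() if n >= min_agreement}

raw = [("pivo", "is_a", "nápoj"), ("Pivo", "is_a", "nápoj"), ("pivo", "has_property", "studené")]
print(aggregate(raw))   # {('pivo', 'is_a', 'nápoj'): 2}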

3.3 Czech Verb Valency Lexicon VerbaLex

The paper [Nevěřilová, 2009] presents the usage of two major, linguist-made lexical resources of the Czech language: WordNet and VerbaLex. First, a conversion to RDF was made. Afterwards, a Prolog program was used to analyze Czech language inputs. In the second part of the article an extension to the current VerbaLex is proposed. Possible pitfalls are discussed. In the conclusion, the side effect of this work is emphasized: important feedback for the authors and administrators of both lexical resources. Successive work is presented in the paper [Nevěřilová, 2010] (to be published in the Proceedings of TSD). VerbaLex frames are grouped by two criteria. The former is the pattern made of semantic roles present in a frame, the latter is a verb classification based on Levin's verb classes [Levin, 1993] (adapted for the Czech language). The main purpose of this work is to group appropriate verbs together and apply inference rules to whole groups instead of individual verbs.

3.4 Publication Overview

• Nevěřilová, Z. (2005). Vizuální lexikon. Master's thesis, Masaryk University in Brno.
• Nevěřilová, Z. (2005). Visual Browser: A Tool for Visualizing Ontologies. In Proceedings of I-KNOW'05, pages 453–461, Graz, Austria. Know-Center in coop. with Graz Uni, Joanneum Research and Springer Pub. Co.
• Gregar, T., Nevěřilová, Z., Rambousek, A. and Pitner, T. (2006). Vizualizace znalostí v e-learningu. In SCO 2006, Sharable Content Objects, 3. ročník konference o elektronické podpoře výuky, pages 39–45, Brno. Masarykova univerzita.
• Nevěřilová, Z. (2009). Exploring and Extending Czech WordNet and VerbaLex. In Proceedings of the RASLAN Workshop 2009, Brno. Masaryk University.
• Nevěřilová, Z. (2010). Implementing Dynamic Visualization as an Alternative Interface to a Digital Mathematics Library. In Towards a Digital Mathematics Library, Proceedings, pages 63–68, Brno.
• Nevěřilová, Z. (2010). X-plain – a Game that Collects Common Sense Propositions. In Proceedings of the 7th International Workshop on Natural Language Processing and Cognitive Science NLPCS 2010, pages 47–52, Portugal.
• Nevěřilová, Z. (2010). Semantic Role Patterns and Verb Classes in Verb Valency Lexicon. In Proceedings of the 13th International Conference on Text, Speech and Dialogue TSD 2010, pages 147–153, Brno.

Chapter 4 Aims of the Dissertation

The aim of this work is to build an application for Czech that adds common sense knowledge to text analysis. The application is going to perform story understanding and is intended as a module in a larger NLP application, if one exists. Its purpose is to reveal relations between semantic units of the text (mentioned as well as unmentioned). In the next stage the application is going to use common sense reasoning techniques to infer propositions that are not mentioned in the text, but are expected to be known by the addressee of the discourse. A possible output of a story understanding application is shown on a short story:

Kameník nechal [jejich] jídlo pod nedalekým olivovníkem. Byl parný den, ale pivo bylo naštěstí ještě studené. [Altmann, 2005] (The stonemason had left their lunch under a nearby olive tree. It was a hot day but fortunately the beer was still cold.)

The program has to output (some of) the following relations:

• parný den – dopoledne (hot day – morning)
• olivovník – stín – studené pivo (olive tree – shadow – cold beer)
• jídlo – pivo (lunch – beer)
• studené pivo – kameníkovo štěstí (cold beer – stonemason's happiness)

The program can also infer, for example, that:

• kameník byl venku (the stonemason was outside) – sure
• kameník byl s někým (the stonemason was with someone) – sure
• slunce svítilo (the sun was shining) – probable
• bylo dopoledne (it was morning) – probable
• bylo léto (it happened in summer) – probable


• kameník i jeho druh pracovali (the stonemason and his companion were about to work) – possible
• kameník měl s sebou nářadí (there were stonemason's tools) – possible
• pivo bylo součástí jídla (the beer is a part of the lunch) – probable
• příběh se odehrál v Řecku nebo v Itálii (the scene took place in Greece or Italy) – possible

With the help of CzWN, relations from the hypo/hypernymy hierarchy (such as “beer is a beverage”) can be obtained. With the help of VerbaLex the semantic roles can be established (e.g. “the stonemason is the agent”, while “the lunch is the patient” of the action of “leaving”). With the current state of the art of the NLP tools available for the Czech language we are not yet able to obtain the relations stated in the stonemason example. To my knowledge there is so far no common sense reasoning tool suitable for the Czech language. However, this dissertation is going to establish relations between different semantic units of the text as well as provide common sense reasoning (a rough sketch of how the two lexical resources could be combined is given after the list of steps below). The work is going to include three steps:

• evaluation of existing resources of common sense propositions, possibly proposing new ones (see below)
• creating an application that will support discourse understanding (see section 4.2)
• evaluating the usefulness of the application (see section 4.3)
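As a first, rough sketch of how CzWN and VerbaLex could work together in the planned application, the code below climbs a hypernymy chain and assigns a semantic role whose restriction matches some hypernym. The lexical entries, the frame and the role restrictions are illustrative placeholders, not actual CzWN or VerbaLex data.

# Toy hypernymy chain (stand-in for Czech WordNet) and one valency frame (stand-in for VerbaLex).
HYPERNYMS = {"pivo": "nápoj", "nápoj": "tekutina", "kameník": "osoba"}
FRAME_NECHAT = {"verb": "nechat", "roles": {"AGENT": "osoba", "PATIENT": "jídlo"}}

def hypernym_chain(word):
    chain = [word]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain

def assign_role(word, frame):
    # Assign the first role whose restriction appears somewhere in the word's hypernym chain.
    chain = set(hypernym_chain(word))
    for role, restriction in frame["roles"].items():
        if restriction in chain:
            return role
    return None   # better no answer than a wrong one

print(hypernym_chain("pivo"))                 # ['pivo', 'nápoj', 'tekutina']
print(assign_role("kameník", FRAME_NECHAT))   # 'AGENT'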

Since the time for writing a dissertation is limited, the application has to be considered a prototype. Therefore, it does not need to cover all general knowledge, but can be limited to a domain, e.g. fairy tales or Czech political commentaries. Since the application is considered to be context-sensitive, a future extension towards a general domain actually means adding domains (and their contexts). Story understanding has a positive impact on several subfields of NLP such as:

• semantic disambiguation – word meanings can be distinguished depending on the particular context
• topic spotting – if the story is (at least partially) understood, it will be possible to detect (different) topics of the text
• metonymy resolution – thanks to relations established between semantic units, metonymy can be detected and resolved
• affect sensing – if the story is understood, emotions of the participants will be detected


4.1 Evaluation of Resources of Common Sense Propositions

In section 2.1, different kinds of resources of common sense propositions were presented. However, it is not certain that using all of them will be useful. Most resources are in English and it is not easy to distinguish what type of knowledge is “common to all people” and what knowledge is “common to all English speakers”. In other words, there are differences in habits, culture, geography, religion etc. and we do not know where the border lies between “common” common sense and “national” common sense. This fact is closely related to hypotheses about linguistic relativity. For the purpose of this dissertation I use the following rule: knowledge about individuals is considered “national” and knowledge about classes is considered to be “common”. There are counterexamples like the class “bread” that denotes a different object in Great Britain and in the Czech Republic. Other counterexamples are individuals like “Ludwig van Beethoven” or “Jupiter” that are probably known with similar connotations in Great Britain as well as in the Czech Republic. The evaluation will concentrate on this aspect of the resources. Building new resources of common sense propositions is a time-consuming task. However, there are ways to collect at least some data purely from Czech language users. X-plain (see section 2.1.3) is the first published work with this purpose related to this dissertation. In future, other collections can be built.

4.2 Application

Traditionally, a generic NLP system contains modules for morphological, syntactic, semantic etc. analyses. An input is processed using these modules sequentially. Madeleine Bates in [Bates, 1995] proposes a more efficient scheme where natural language input is processed as a series of independent processes, each of which contributes to an overall understanding of the input. This approach has some advantages: it can handle failures of some components (e.g. semantic analysis can continue even if there is an unknown word in the input) and it is possible to add a component. Bates calls this process “understanding search” (see figure 4.1). The Bates scheme inspires the application design for two reasons. First, similarly to modules in an NLP system, there are going to be several modules, each working with a different resource of common sense. Second, similarly to the contributions of different analyses to the overall NLP process in the Bates scheme, modules in the application should contribute to the overall result.


Figure 4.1: Traditional scheme vs. Bates “understanding search”

The future application has to deal with different formats of the resources, therefore several conversion tools will be needed. Then, every module can use a different data format and different processing methods. The resulting application will support story understanding by simulation (as explained in 2.2). According to [Mueller, 2002], the purposes of a common sense application are:

• to fill in missing information
• to point out potential problems

Again, according to [Mueller, 2002], each input updates the simulation in such a way that it is possible to read a story, maintain a model of the story and answer questions. The use of intelligent agents (see 1.5) makes it possible to have different agents for understanding different aspects of the story: story structure, space, time, actors, emotions, conflicts or oddities in the story etc., as well as for dealing with linguistic features of the text (morphological derivations, multi-word expressions, proper names). The advantage of using intelligent agents is that an agent does not suggest a relation if it has not found any. In other words, “I do not know” is considered to be a better answer than a false suggestion. Agents act depending on the environment (context sensitivity). Thus the application will be context-sensitive, either by encoding the current context into a vector (see section 1.4), or by spreading activation (see section 1.2). The use of intelligent agents promises high precision accompanied by poor recall (at least at the beginning). In future, self-learning agents can be used to improve the recall.
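The “I do not know is better than a false suggestion” policy maps naturally onto agents that return either a list of relations or nothing at all. The interface below is only a design sketch with invented class and relation names, not a commitment to a particular implementation.

from typing import Optional, List, Tuple

Relation = Tuple[str, str, str]

class UnderstandingAgent:
    # Base class: an agent suggests relations only when it has actually found something.
    def suggest(self, sentence: str, context: dict) -> Optional[List[Relation]]:
        raise NotImplementedError

class TimeAgent(UnderstandingAgent):
    def suggest(self, sentence, context):
        if "parný den" in sentence:                        # "hot day"
            return [("parný den", "suggests", "léto")]     # -> summer
        return None                                        # "I do not know"

relations = []
for agent in [TimeAgent()]:
    result = agent.suggest("Byl parný den.", context={})
    if result is not None:        # silence is acceptable; a wrong guess is not
        relations.extend(result)
print(relations)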


4.3 Evaluation

The evaluation of the usefulness of the application is always a potential pitfall. Evaluation of the application is going to proceed on two levels. First, every agent is going to have an appropriate performance measure (see [Russell et al., 1996]). Second, the overall performance of the application has to be evaluated. The best way to evaluate outputs of the story understanding application is to compare them with a hand-annotated corpus. To my knowledge, there is no such corpus for Czech, but The New York Times Annotated Corpus [Sandhaus, 2008] is a promising work and can be inspiring for a Czech corpus. The evaluation can be done this way if an annotated corpus of Czech exists. If there is no suitable Czech corpus, another way of evaluation will be to compare outputs of the story understanding application with hand-made annotations and measure multiple-annotator agreement.
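If the fallback evaluation with several human annotators is used, agreement can be quantified, for example, with Cohen's kappa. The sketch below computes kappa for two annotators over nominal labels; the labels themselves are invented for the example.

from collections import Counter

def cohens_kappa(labels_a, labels_b):
    # kappa = (p_o - p_e) / (1 - p_e), observed vs. chance agreement of two annotators
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[l] * counts_b[l] for l in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["probable", "sure", "possible", "probable"]
b = ["probable", "sure", "probable", "probable"]
print(round(cohens_kappa(a, b), 2))   # 0.56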

4.4 Time Schedule

September 2010–January 2011
• Examination of resources of common sense propositions.
• Creating own resources for the Czech language.

January 2011
• Defense of this proposal.
• Doctoral exam.

January 2011–August 2011
• Design and implementation of a story understanding application.
• Building modules for different resources of common sense propositions.
• Creating different intelligent agents.
• Writing the Ph.D. thesis.

September 2011–August 2012
• Evaluation of the story understanding application.
• Evaluation of different intelligent agents.
• Writing the Ph.D. thesis.
• Submitting the thesis.

Bibliography

[Allen, 1995] Allen, J. (1995). Natural Language Understanding (2nd ed.). Benjamin-Cummings Publishing Co., Inc., Redwood City, CA, USA.

[Altmann, 2005] Altmann, G. (2005). Výstup na babylonskou věž. Triáda.

[Arnold et al., 2009] Arnold, K., Alonso, J., Havasi, C., and Speer, R. (2009). ConceptNet. [Retrieved on-line from http://conceptnet.media.mit.edu/; accessed 29-July-2009].

[Bates, 1995] Bates, M. (1995). Models of natural language understanding. In Proceedings of the National Academy of Sciences, volume 92, pages 9977–9982, USA.

[Bobrow and Winograd, 1976] Bobrow, D. G. and Winograd, T. A. (1976). An overview of KRL, a knowledge representation language. Technical report, Stanford University, Stanford, CA, USA.

[Cheng et al., 2009] Cheng, X., Adolphs, P., Xu, F., Uszkoreit, H., and Li, H. (2009). Gossip Galore: a Self-Learning Agent for Exchanging Pop Trivia. In EACL ’09: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics: Demonstrations Session, pages 13–16, Morristown, NJ, USA. Association for Computational Linguistics.

[Collins and Quillian, 1969] Collins, A. M. and Quillian, M. R. (1969). Retrieval Time from Semantic Memory. Journal of Verbal Learning and Verbal Behavior, 8(2):240–247.

[Crestani, 1997] Crestani, F. (1997). Application of Spreading Activation Techniques in Information Retrieval. Artif. Intell. Rev., 11(6):453–482.

[CyC, 2010] CyC (2010). How does Cyc reason? [Retrieved on-line from http://www.cyc.com/cyc/technology/technology/whatiscyc_dir/howdoescycreason; accessed 10-August-2010].


[Fellbaum, 1998] Fellbaum, C. (1998). WordNet: An Electronic Lexical Database (Language, Speech, and Communication). The MIT Press.

[Filipec and Daneš, 1995] Filipec, J. and Daneš, F. (1995). Slovník spisovné češtiny (Dictionary of Literary Czech, SSČ). Academia, Praha, 1st edition. Electronic version, LEDA, Praha.

[Fillmore, 1968] Fillmore, C. J. (1968). The Case for Case. In Bach, E. and Harms, R., editors, Universals in Linguistic Theory, pages 1–88, New York. Holt, Rinehart, and Winston.

[Garnham, 1988] Garnham, A. (1988). Artificial Intelligence: an Introduction. Taylor & Francis, London, New York.

[Gregar et al., 2006] Gregar, T., Nevěřilová, Z., Rambousek, A., and Pitner, T. (2006). Vizualizace znalostí v e-learningu. In SCO 2006, Sharable Content Objects, 3. ročník konference o elektronické podpoře výuky, pages 39–45, Brno. Masarykova univerzita.

[Grice, 1989] Grice, H. P. (1989). Studies in the Way of Words. Harvard University Press, Cambridge, Massachusetts, London, England.

[Gruber, 1965] Gruber, J. S. (1965). Studies in Lexical Relations. PhD thesis, MIT, Cambridge, MA.

[Gruber, 2009] Gruber, T. (2009). Ontology. In Liu, L. and Özsu, M. T., editors, Encyclopedia of Database Systems, pages 1963–1965. Springer Verlag.

[Havránek et al., 1971] Havránek, B., Bělič, J., Helcl, M., Jedlička, A., Křístek, V., and Trávníček, F. (1960–1971). Slovník spisovného jazyka českého (Dictionary of Written Czech, SSJČ). Academia, Praha, 1st edition. Electronic version, created in the Institute of Czech Language, Czech Academy of Sciences Prague in cooperation with Faculty of Informatics, Masaryk University Brno.

[Havránek et al., 1957] Havránek, B., Hujer, O., Smetánka, E., Weingart, M., Šmilauer, V., and Získal, A. (1935–1957). Příruční slovník jazyka českého (Reference Dictionary of Czech Language, PSJČ). Státní pedagogické nakladatelství/SPN, Praha. Electronic version, created in the Institute of Czech Language, Czech Academy of Sciences Prague in cooperation with Faculty of Informatics, Masaryk University Brno.


[Hlaváčková, 2007] Hlaváčková, D. (2007). Databáze slovesných valenčních rámců VerbaLex. PhD thesis, Masarykova univerzita, Filozofická fakulta, Ústav českého jazyka.

[Horák et al., 2006] Horák, A., Pala, K., Rambousek, A., and Rychlý, P. (2006). New Clients for Dictionary Writing on the DEB Platform. In DWS 2006: Proceedings of the Fourth International Workshop on Dictionary Writing Systems, pages 17–23. Lexical Computing Ltd., Torino, Italy.

[Huang et al., 2004] Huang, Z., Chen, H., and Zeng, D. (2004). Applying Associative Retrieval Techniques to Alleviate the Sparsity Problem in Collaborative Filtering. ACM Trans. Inf. Syst., 22(1):116–142.

[Janssen, 2001] Janssen, T. M. V. (2001). Frege, contextuality and compositionality. J. of Logic, Lang. and Inf., 10(1):115–136.

[Johnson-Laird and Miller, 1976] Johnson-Laird, P. N. and Miller, G. A. (1976). Language and Perception. The Belknap Press of Harvard University Press, Cambridge, Massachusetts.

[Kelly, 2007] Kelly, K. (2007). Kevin Kelly’s cool tools – 20q. [Retrieved on-line from http://www.kk.org/cooltools/archives/000725.php; accessed 30-October-2007].

[Kilgarriff et al., 2004] Kilgarriff, A., Rychlý, P., Smrž, P., and Tugwell, D. (2004). The Sketch Engine. In Proceedings of the Eleventh EURALEX International Congress, pages 105–116.

[Kleb and Abecker, 2010] Kleb, J. and Abecker, A. (2010). Entity Reference Resolution via Spreading Activation on RDF-Graphs. In The Semantic Web: Research and Applications, volume 6088/2010, pages 152–166, Berlin, Heidelberg. Springer.

[Lafourcade and Joubert, 2009] Lafourcade, M. and Joubert, A. (2009). Similitude entre les sens d'usage d'un terme dans un réseau lexical. Traitement Automatique des Langues (TAL), 50. [unpublished].

[Lenat, 1995] Lenat, D. B. (1995). CyC: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11):33–38.

[Levin, 1993] Levin, B. (1993). English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, Chicago, IL.


[Lieberman, 2002] Lieberman, H. (2002). MAS.964 Common Sense Reasoning for Interactive Applications. [Retrieved on-line from http://ocw.mit.edu; accessed 11-January-2006]. License: Creative Commons BY-NC-SA.

[Lieberman, 2007] Lieberman, H. (2007). Back into equilibrium: Balancing the ordinary and the extraordinary. Out of Balance: New Frontiers in Science, Art and Thought.

[Lieberman et al., 2004] Lieberman, H., Liu, H., Singh, P., and Barry, B. (2004). Beating Common Sense into Interactive Applications. AI Magazine, 25:63–76.

[Lieberman and Selker, 2000] Lieberman, H. and Selker, T. (2000). Out of Context: Computer Systems that Adapt to, and Learn from, Context. IBM Syst. J., 39(3-4):617–632.

[Liu, 2003] Liu, H. (2003). Unpacking Meaning from Words: A Context-centered Approach to Computational Lexicon Design. In Proc CONTEXT 2003, pages 218–232, Berlin. Heidelberg. Springer-Verlag.

[Liu and Singh, 2004] Liu, H. and Singh, P. (2004). Commonsense Reasoning in and over Natural Language. In Proceedings of the 8th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES-2004), Berlin. Heidelberg. Springer-Verlag.

[Materna and Pala, 2010] Materna, J. and Pala, K. (2010). Using Ontologies for Semi-automatic Linking VerbaLex with FrameNet. In Proceedings of LREC, pages 3331–3337.

[McNamara, 2005] McNamara, T. P. (2005). Semantic Priming: Perspectives from Memory and Word Recognition. Psychology Press, New York. Hove.

[Minsky, 1986] Minsky, M. (1986). The Society of Mind. Simon & Schuster, Inc., New York, NY, USA.

[Minsky, 2006] Minsky, M. (2006). The Emotion Machine. Simon & Schuster, Inc., New York, NY, USA.

[Mueller, 2000a] Mueller, E. T. (2000a). A Calendar with Common Sense. In IUI ’00: Proceedings of the 5th international conference on Intelligent user interfaces, pages 198–201, New York, NY, USA. ACM.


[Mueller, 2000b] Mueller, E. T. (2000b). Making News Understandable to Computers. CoRR, cs.IR/0003001.

[Mueller, 2002] Mueller, E. T. (2002). ThoughtTreasure, the Hard Common Sense Problem, and Applications of Common Sense. [Retrieved on-line from http://ocw.mit.edu; accessed 21-July-2010]. License: Creative Commons BY-NC-SA.

[Mueller, 2003] Mueller, E. T. (2003). ThoughtTreasure: A natural language/commonsense platform. [Retrieved on-line from http://alumni.media.mit.edu/~mueller/papers/tt.html; accessed 9-November-2009].

[Nevěřilová, 2005] Nevěřilová, Z. (2005). Visual Browser: A Tool for Visualising Ontologies. In Proceedings of I-KNOW'05, pages 453–461, Graz, Austria. Know-Center in coop. with Graz Uni, Joanneum Research and Springer Pub. Co.

[Nevěřilová, 2005] Nevěřilová, Z. (2005). Vizuální lexikon. Master's thesis, Masaryk University in Brno.

[Nevěřilová, 2009] Nevěřilová, Z. (2009). Exploring and Extending Czech WordNet and VerbaLex. In Proceedings of the RASLAN Workshop 2009, Brno. Masaryk University.

[Nevěřilová, 2010] Nevěřilová, Z. (2010). Semantic Role Patterns and Verb Classes in Verb Valency Lexicon. In Proceedings of 13th International Conference on Text, Speech and Dialogue TSD 2010, Brno. Masaryk University.

[Nevěřilová, 2010a] Nevěřilová, Z. (2010a). Implementing Dynamic Visualization as an Alternative Interface to a Digital Mathematics Library. In Towards a Digital Mathematics Library, Proceedings, pages 63–68, Brno.

[Nevěřilová, 2010b] Nevěřilová, Z. (2010b). X-plain – a Game that Collects Common Sense Propositions. In Proceedings of the 7th International Workshop on Natural Language Processing and Cognitive Science NLPCS 2010, pages 47–52, Portugal.

[Pala, 1999] Pala, K. (1999). Zásady pozitivní komunikace. In Švandová, B. and Jelínek, M., editors, Argumentace a umění komunikovat, page 75. PedF MU Brno.


[Pustejovsky et al., 2004] Pustejovsky, J., Hanks, P., and Rumshisky, A. (2004). Automated induction of sense in context. In COLING, pages 924–931, Geneva, Switzerland.

[Rudowsky, 2004] Rudowsky, I. (2004). Intelligent Agents – A Tutorial. In Proceedings of the Americas Conference on Information Systems, pages 4588–4595, New York, New York.

[Ruppenhofer et al., 2006] Ruppenhofer, J., Ellsworth, M., Petruck, M. R. L., Johnson, C. R., and Scheffczyk, J. (2006). FrameNet II: Extended Theory and Practice. Technical report, ICSI.

[Russell et al., 1996] Russell, S. J., Norvig, P., Candy, J. F., Malik, J. M., and Edwards, D. D. (1996). Artificial intelligence: a modern approach. Prentice-Hall, Inc., Upper Saddle River, NJ, USA.

[Sandhaus, 2008] Sandhaus, E. (2008). The New York Times Annotated Corpus.

[Schank and Abelson, 1977] Schank, R. C. and Abelson, R. P. (1977). Scripts, Plans, Goals, and Understanding: An Inquiry Into Human Knowledge Structures (Artificial Intelligence). Lawrence Erlbaum Associates, 1 edition.

[Schank et al., 1973] Schank, R. C., Goldman, N., Rieger, C. J., and Riesbeck, C. (1973). MARGIE: Memory, Analysis, Response Generation, and Inference on English. In IJCAI'73: Proceedings of the 3rd international joint conference on Artificial intelligence, pages 255–261, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

[Schonfeld, 2010] Schonfeld, E. (2010). Siri's iPhone App Puts A Personal Assistant In Your Pocket. TechCrunch. [Retrieved on-line from http://techcrunch.com/2010/02/04/siri-iphone-personal-assistant/; accessed 10-August-2010].

[Smith, 1995] Smith, B. (1995). Formal Ontology, Common Sense and Cognitive Science. International Journal of Human-Computer Studies, pages 641–667.

[Speer et al., 2009] Speer, R., Krishnamurthy, J., Havasi, C., Smith, D., Lieberman, H., and Arnold, K. (2009). An Interface for Targeted Collection of Common Sense Knowledge using a Mixture Model. In IUI '09: Proceedings of the 13th international conference on Intelligent user interfaces, pages 137–146, New York, NY, USA. ACM.


[Sternberg, 2002] Sternberg, R. J. (2002). Kognitivní psychologie. Portál.

[Thomasson, 2009] Thomasson, A. (2009). Categories. In Stanford Encyclopedia of Philosophy. [Retrieved on-line from http://plato.stanford.edu/entries/categories/; accessed 21-August-2010].

[Turing, 1950] Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59:433–460.

[Vickrey et al., 2008] Vickrey, D., Bronzan, A., Choi, W., Kumar, A., Turner-Maier, J., Wang, A., and Koller, D. (2008). Online Word Games for Semantic Data Collection. In EMNLP ’08: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 533–542, Morristown, NJ, USA. Association for Computational Linguistics.

[von Ahn, 2006] von Ahn, L. (2006). Games with a Purpose. Computer, 39(6):92–94.

[von Ahn et al., 2006] von Ahn, L., Kedia, M., and Blum, M. (2006). Verbosity: a game for collecting common-sense facts. In CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems, pages 75–78, New York, NY, USA. ACM.

[Vossen, 1998] Vossen, P. (1998). EuroWordNet – A Multilingual Database with Lexical Semantic Networks. Kluwer Academic Publishers, Dordrecht.

[Wasserman, 1985] Wasserman, K. (1985). Physical Object Representation and Generalization: A Survey of Programs for Semantic-Based Natural Language Processing. AI Magazine, 5(4):28–42.

[Wolfram|Alpha, 2010] Wolfram|Alpha (2010). Wolfram alpha. [Retrieved on-line from http://www.wolframalpha.com/; accessed 3-August-2010].

Appendix

Publications

1. Nevěřilová, Z. (2009). Exploring and Extending Czech WordNet and VerbaLex. In Proceedings of the RASLAN Workshop 2009, Brno. Masaryk University.

2. Nevěřilová, Z. (2010). Implementing Dynamic Visualization as an Alternative Interface to a Digital Mathematics Library. In Towards a Digital Mathematics Library, Proceedings, pages 63–68, Brno.

3. Nevěřilová, Z. (2010). X-plain – a Game that Collects Common Sense Propositions. In Proceedings of the 7th International Workshop on Natural Language Processing and Cognitive Science NLPCS 2010, pages 47–52, Portugal.

4. Nevěřilová, Z. (2010). Semantic Role Patterns and Verb Classes in Verb Valency Lexicon. In Proceedings of the 13th International Conference on Text, Speech and Dialogue TSD 2010, pages 147–153, Brno.

Exploring and Extending Czech WordNet and VerbaLex

Zuzana Nevěřilová

Masaryk University, Faculty of Informatics, Botanická 68a, 602 00 Brno, Czech Republic, [email protected]

Abstract. This paper presents the usage of two major linguist-made lexical resources for the Czech language: WordNet and VerbaLex. First, a conversion to RDF was made. Afterwards, a Prolog program was used to analyse Czech language inputs. In the second part of the article an extension to the current VerbaLex is proposed. Possible pitfalls are discussed. In the conclusion, we emphasize the side effect of this work: important feedback for the authors and administrators of both lexical resources.

Key words: VerbaLex, WordNet, semantic analysis, RDF, Prolog

1 Introduction

Since 2005 a database of verb valency frames has been created. This database, VerbaLex [5], has the form of a frame-based lexical resource: it consists of verb valency frames with slots. Each slot contains two levels of semantic information:
– semantic role, such as agent, patient, instrument
– value restriction in the form of the bottommost hypernym, specified by a literal and a sense number in Princeton WordNet [4] (e.g. person:1)
Czech WordNet (CZWN) started as part of the EuroWordNet [10] project in 1998 and it is still being actively developed. VerbaLex and CZWN are two large linguist-made resources for the Czech language. These resources can be and are expected to be used together thanks to the fact that in CZWN the IDs of synsets are linked to their translations in Princeton WordNet. This article shows how these resources can be used for semantic analysis of sentences and proposes an extension that can add background knowledge to these sentences. This background knowledge is considered necessary for semantic discourse analysis [9]. For verb frame identification, semantic role assignment and subsequent inference, SWI-Prolog and RDF were used. In the experiments we deliberately omit syntactic analysis of the sentences and use only the base form of nouns (singular nominative). We expect that syntactic analysis could improve the results notably. In practice, the intersection of our results and those of syntactic analysis will be used.

2 Data Formats and the Program

Both CZWN and VerbaLex are stored in their own formats in the form of XML. For the purpose of their connection and inference, both data sources were converted to RDF [3] (in the form of XML). The conversion was done through XSL templates, since this is portable and easy to maintain (in case of slight changes in the structure of the XMLs). The conversion does not cover all aspects of VerbaLex or CZWN. To keep the size of the data reasonable, some features such as examples, human-readable definitions etc. were omitted. In VerbaLex there is no ID for a frame, so during the conversion one is added for each verb frame. The ID consists of one of the lemmata (where Czech accents are replaced by capitals), the sense number and the frame number (generated during the conversion). The ID has the form of a URI according to the RDF specification [7]. After experiments with RDF reasoners, Prolog with the rdf_db module was chosen for inference. The advantages of this solution are:

– Prolog is able to work with large data. VerbaLex comes with more than 212 000 RDF triples, CZWN with nearly 100 000.
– It is possible to insert inference rules into the program and not into the data. The most resource-consuming relation is hyperonymy, because it is a transitive relation. Since RDF is not able to handle transitivity, it would be necessary to use some kind of OWL [8], accompanied by an enormous increase in the number of RDF triples. Hyperonymy is handled in the Prolog program and thus the number of RDF triples stays fixed (a minimal sketch follows this list).
– With an appropriate Prolog module, a web interface can be made straightforwardly.
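A minimal sketch of this arrangement in SWI-Prolog follows (the file names and the namespace/property name for hyperonymy are assumptions made for illustration); the transitive closure is computed at query time, so the converted data stays small:

:- use_module(library(semweb/rdf_db)).

:- rdf_register_prefix(czwn, 'http://nlp.fi.muni.cz/czwn#').

% load the converted resources (file names are assumptions)
load_resources :-
    rdf_load('verbalex.rdf'),
    rdf_load('czwn.rdf').

% direct hyperonymy is a plain RDF triple ...
hypernym_of(Hyponym, Hypernym) :-
    rdf(Hyponym, czwn:hypernym, Hypernym).
% ... and transitivity is a rule instead of extra triples
hypernym_of(Hyponym, Hypernym) :-
    rdf(Hyponym, czwn:hypernym, Middle),
    hypernym_of(Middle, Hypernym).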

3 Finding Semantics through Verb Frames

Since this work does not concern syntactic analysis, almost no grammatical information is available for the analysis. The input is simple: the verb and a list of nouns in their base form (singular nominative). In our analysis of a sentence, we can identify 3 kinds of bearers of the meaning:

– nouns occurring in the sentence identify hypernyms occurring in the verb frame
– semantic roles that the nouns play
– the verb frame structure, especially the number, semantic role and occupancy of other slots

The output contains the ID of a verb frame and the nouns of the list with their semantic roles assigned:

?- find_roles('přicestovat', ['ministr', 'zastávka'], FrameID, Roles).
FrameID = 'http://nlp.fi.muni.cz/verbalex#pRicestovat_1_2',
Roles = [ (ministr, 'AG', kdo1, obl), (zastávka, 'LOC', čeho2, opt)] ;

The input: verb přicestovat (arrive) and the nouns ministr (minister) and zastávka (station). The resulting role assignment: minister as AG(ent) and nominative animate (kdo1), obl(igatory) value of the slot, station as LOC(ation) inanimate genitive (čeho2), opt(ional).

3.1 Features, Problems and Solutions

The result of the analysis brings the following advantages:

– appropriate verb meaning recognition
– frame identification
– semantic role assignment
– grammatical information (cases)

It is necessary to keep in mind that the result is a set. In the case above, this set has only one element. The following problems can occur during the analysis:
– verb not found in VerbaLex. This is not expected to occur often, since VerbaLex contains 19 360 valency frames for more than 10 000 verbs [6]. But if this case occurs, the analysis brings no result.
– word from the list not found in CZWN. This occurs in almost every sentence, since CZWN is much smaller than Princeton WordNet. Moreover, it does not contain proper names at all. The instant solution is to take subsets of the input set and try to assign as many nouns as possible (sketched below). A long-term solution consists of improving CZWN and using other resources for proper names.
– no suitable frame for the list of words. In VerbaLex, only common use is encoded. In some cases, language users do not follow the common use. This occurs rarely. Most often there are words not related to the verb (e.g. parts of noun phrases) or nouns contained in adverbial phrases. The solution is again to take subsets of the input set.
– no suitable hypernym for a word. This turned out to be the most difficult problem. It seems that there is not much consensus about the bottommost hypernyms in frame slots. For example, the verb koupit (to buy) has the OBJ(ect) slot value goods:1. But the object of buying can be almost any object or even an animal. Thus it seems that the value of the OBJ slot should be object:1. In this case the verb frame will not offer much information.
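A minimal sketch of the subset fallback mentioned above, built on top of the find_roles/4 predicate described earlier (subset_of_length/3 is a helper introduced only for this illustration); larger subsets of the noun list are tried first, so that as many nouns as possible keep a role assignment:

% try progressively smaller subsets of the noun list until the analysis succeeds
find_roles_subset(Verb, Nouns, FrameID, Roles) :-
    length(Nouns, N),
    between(0, N, Dropped),
    Keep is N - Dropped,                 % largest subsets first
    subset_of_length(Nouns, Keep, Subset),
    find_roles(Verb, Subset, FrameID, Roles).

% subsets of a given length, preserving the original order of the nouns
subset_of_length(_, 0, []).
subset_of_length([Noun|Rest], K, [Noun|Subset]) :-
    K > 0, K1 is K - 1,
    subset_of_length(Rest, K1, Subset).
subset_of_length([_|Rest], K, Subset) :-
    K > 0,
    subset_of_length(Rest, K, Subset).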

4 Proposed Extension of VerbaLex

VerbaLex is a frame-based lexical resource. Like other resources, such as FrameNet [2], it contains slots describing typical situations (in this case noun phrases related to the verb), with restrictions on their values (in this case WordNet hypernyms). Contrary to FrameNet, VerbaLex frames are not related to each other; there is no hierarchy among the frames.

According to [1] it makes sense that frame information should be inherited through a type hierarchy. A frame-based representation can also be used to encode additional information not mentioned in the sentences. This underlying knowledge is believed to be very useful in interpreting language. In particular, knowledge about causality is very important. Frame-based knowledge representations consist at least of:
– preconditions
– effects
– decompositions
FrameNet, as a representative of large frame-based resources, contains even more types of relations. The proposed extension rests in introducing these three relations to the frames. The Prolog program was extended so that it supports inference rules. These inference rules have the form of another RDF document (encoded in XML) and are related to VerbaLex through RDF IDs. The only information contained in a rule is: the type of relation (precondition, effect, decomposition), the relation to another frame and a mapping between the slots.

In the inference-rule XML for this example (not reproduced here), the effect of přicestovat (arrive) is nacházet se (be located). The mapping is done from AG(ent) to ENT(ity) and from LOC(ation) to another LOC(ation). Note that in the example above the grammatical change occurs on the basis of VerbaLex information; no other information is needed in the inference rule. With these data the program is able to output:

?- find_effect('přicestovat', ['ministr', 'zastávka'], FrameID, Roles).
FrameID = 'http://nlp.fi.muni.cz/verbalex#nachAzet_se_1_1',
Roles = [ (ministr, 'ENT', kdo1, obl), (zastávka, 'LOC', čem6, opt)] .

The input: verb přicestovat (arrive) and the nouns ministr (minister) and zastávka (station). With the inference rule that přicestovat (arrive) has the effect of nacházet se (be located): minister as ENT(ity) and nominative animate (kdo1), obl(igatory) value of the slot, station as LOC(ation) inanimate locative (čem6), opt(ional).
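For illustration only (the thesis stores the inference rules in RDF/XML; the Prolog representation below is a simplified stand-in), the same effect rule and its application can be sketched on top of find_roles/4:

% one effect rule: source frame, target frame and the mapping between roles
effect_rule('http://nlp.fi.muni.cz/verbalex#pRicestovat_1_2',
            'http://nlp.fi.muni.cz/verbalex#nachAzet_se_1_1',
            ['AG'-'ENT', 'LOC'-'LOC']).

% analyse the sentence, look up an effect rule and rename the roles;
% the grammatical cases of the target frame (e.g. čem6) are omitted in this
% sketch; in the real program they come from the target frame's slots
find_effect(Verb, Nouns, TargetFrame, TargetRoles) :-
    find_roles(Verb, Nouns, SourceFrame, SourceRoles),
    effect_rule(SourceFrame, TargetFrame, Mapping),
    findall(Noun-NewRole,
            ( member((Noun, Role, _, _), SourceRoles),
              member(Role-NewRole, Mapping) ),
            TargetRoles).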

In addition to the features mentioned above, the result of the inference brings the following:

– new frame identification
– change of role assignment (AG → ENT)
– change of grammatical information (čeho2 → čem6)

4.1 Problems and Solutions

The main problem of this extension is how to build the set of inference rules effectively. The proposal is to group verbs according to the structure of their frames and assign rules depending on which group each verb joins. For example, a LOC(ation) slot with genitive indicates that one of the role representants (either AG(ent) or PAT(ient)) changes LOC(ation). In most cases, s/he either starts or stops being placed in that LOC(ation). Verbs fulfilling this structure are the verbs of motion [5] such as dorazit, přicestovat (arrive), dojíždět (commute), the verbs of sending and carrying such as cpát (crowd), and verbs of spatial configuration such as klesat, svažovat se (slope down). This grouping can lead to a semi-automatically created set of inference rules.

4.2 Introducing New Entities and New Roles to the Discourse

According to [1], knowledge about usual situations in which actions occur is useful for language interpretation. Moreover, if these situations are defined, the knowledge reveals new objects that do not have to be mentioned, but exist in the discourse. For example, buying something involves four objects: the buyer, the seller, the object and an amount of money. Even if the money is not mentioned in the discourse, it is contained in it. The decomposition of buying is:

– buyer gives money to seller
– seller gives object to buyer

Moreover, agents in the discourse can play new roles. Every living person can be a buyer or a seller, but during the act of buying, the AG(ent) has the role of buyer (the buyer is not a new entity in the discourse, but a new role of an entity previously mentioned). In future work we will concentrate on encoding these new entities and roles into inference rules so that they can be used in discourse semantic analysis.
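A minimal sketch of how such a decomposition could be written down (the predicate names and the representation are hypothetical, introduced only for this illustration); the rule makes the implicit participants (seller, money) and the new role (buyer) explicit:

% hypothetical decomposition of "buy": the buyer and the object come from the
% analysed sentence, the seller and the money are new entities in the discourse
decomposition(buy(Buyer, Object),
              [ gives(Buyer, Money, Seller),     % buyer gives money to seller
                gives(Seller, Object, Buyer) ],  % seller gives object to buyer
              [ new_role(Buyer, buyer),
                new_entity(Seller, seller),
                new_entity(Money, amount_of_money) ]).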

5 Conclusion

We have introduced a Prolog program that is able to analyse the verb and nouns occurring in a sentence. The analysis acquires the following information:

– valency frame identification
– semantic role assignment
– grammatical information

We have proposed an extension to VerbaLex that can imply new propositions. The main problem is how to build an appropriate set of rules. With this extension we can even introduce new objects into the discourse or assign new roles to the agents previously mentioned. This background knowledge is believed to be useful for language interpretation. A side effect of this analysis is that, when run on corpus sentences, it offers important feedback to the authors and administrators of VerbaLex and CZWN; namely, the choice of the bottommost hypernym in VerbaLex slots can be checked.

References

1. James Allen. Natural Language Understanding (2nd ed.). Benjamin-Cummings Publishing Co., Inc., Redwood City, CA, USA, 1995.
2. Collin F. Baker and Charles J. Fillmore. FrameNet, 2009. [Online; accessed 30-July-2009].
3. Dave Beckett and Brian McBride. RDF/XML syntax specification, February 2004.
4. Christiane Fellbaum. WordNet: An Electronic Lexical Database (Language, Speech, and Communication). The MIT Press, May 1998.
5. Dana Hlaváčková. Databáze slovesných valenčních rámců VerbaLex. Master's thesis, Masarykova univerzita, Filozofická fakulta, Ústav českého jazyka, 2007.
6. Dana Hlaváčková. Počet lemmat v synsetech VerbaLexu. In After Half a Century of Slavonic Natural Language Processing, Brno, Czech Republic, 2009. Tribun EU.
7. Ora Lassila and Ralph R. Swick. Resource Description Framework (RDF) model and syntax specification, 1999.
8. Deborah L. McGuinness and Frank van Harmelen. OWL Web Ontology Language Overview.
9. Teun A. van Dijk. Semantic discourse analysis. In Handbook of Discourse Analysis: Dimensions of Discourse, volume 2, London, 1985. Academic Press.
10. Piek Vossen. EuroWordNet – a multilingual database with lexical semantic networks, 1998.

Implementing Dynamic Visualization as an Alternative Interface to a Digital Mathematics Library

Zuzana Nevěřilová

Faculty of Informatics, Masaryk University, Botanická 68a, 602 00 Brno, Czech Republic, [email protected]

Abstract. This paper presents an alternative interface for browsing in the Czech Digital Mathematics Library (DML-CZ) using our Visual Browser web browsing tool. Using dynamic visualization, we have created a tool for browsing the library graphically. Visualization can help users orient themselves in complex data and at the same time reveal sometimes unexpected relationships among units; it at least speeds up browsing. This work follows the metadata processing undertaken on DML-CZ and visualizes all reasonable and useful relationships among journals, issues, articles, authors, classification, keywords, references and similar articles. We converted metadata to RDF and use a Visual Browser Java Applet that runs in a web browser. We describe briefly the metadata nature, then the server and client side of the visualization including data formats and conversions. There follows a description of the interaction between visual and textual interfaces.

1 Introduction

This paper presents a dynamic visual interface for browsing the Czech Digital Mathematics Library (DML-CZ) as an alternative to a textual listing. We are offering the interface to the ongoing EuDML project1. The DML-CZ currently contains more than 28,000 articles in 11 journals, 5 proceedings series and 28 monographs [5]. Users usually do not browse within such a vast amount of data; rather, they search for titles or authors. On the search results page users can see the number of search results and the list of articles. When clicking on an article, the information listed below is shown:

– bibliographic information about the article (author, title, serial, year, Mathematics Subject Classification (MSC) [6], . . . )
– preview of the article and link to the PDF
– link to similar articles
– references with links to articles where possible.

1 The European Digital Mathematics Library – http://www.eudml.eu/

The great strength of the DML-CZ interface is that it finds similar articles in search results. Three methods for calculating similarities are used [11] and the percentages are expressed graphically. This is so far the only information that is visualized. Nevertheless, according to [10] a good visualization helps accelerate the cognitive process, since the eyes can pick up details of the visualization and keep a holistic overview at the same time. Visualization is most suitable for complex and relatively sparse data, and this is precisely the case of library data. Google has started to offer a graphical interface for search results in addition to the standard view: their so-called Wonder wheel has both plain text and timelines. Information seekers who tend to use it are likely to appreciate it beyond Google searches. The structure of the paper is as follows. In Section 2 we describe the server side including the data formats provided by the server. Section 3 briefly describes the Visual Browser and shows the interaction between the Visual Browser and the textual listing on the web page. Section 4 contains both the conclusion and the future development that the dynamic visual interface may undergo.

2 Server Side

Since the amount of data in DML-CZ is very large, a client-server architecture is the most suitable. The server has to store the data, provide a method for its retrieval and quickly return a small amount of the data requested.

2.1 Data Formats

Because the client side uses RDF [1], the server also has to provide this format. We had to convert the existing XML format of metadata to RDF. This conversion required the following steps:
– selecting only the appropriate data for visualization (some information is omitted)
– assigning IDs to articles, issues, journals and authors
– adding short titles for the visualization
– conversion of the lang attribute according to RFC 3066 sec. 2.3 [9]
– adding information about similar articles
– adding MSC labels.

2.2 RDF Server

The Joseki RDF Server2 was used. It offers SPARQL [8] as a query language. Joseki was selected because of the Jena Framework3 used in the client. Nevertheless, the server side can be substituted by any other RDF server if needed. The data is stored in a relational database.

2 http://joseki.sourceforge.net/
3 http://jena.sourceforge.net/
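For illustration only (the prototype client is a Java applet; the dml: vocabulary, the endpoint location and the predicate below are hypothetical), a SPARQL SELECT against such an endpoint can be issued, for instance, from SWI-Prolog's sparql_client library:

:- use_module(library(semweb/sparql_client)).

% titles of articles by a given author; results come back as row/1 terms
article_title_by_author(AuthorName, Title) :-
    format(atom(Query),
           'PREFIX dml: <http://example.org/dml#> \c
            SELECT ?title WHERE { \c
              ?article dml:author ?person . \c
              ?person dml:name "~w" . \c
              ?article dml:title ?title }', [AuthorName]),
    sparql_query(Query, row(Title),
                 [ host(localhost), port(2020), path('/dml/query') ]).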

3 Client Side

On the client side two interfaces are used: a traditional textual interface (a list of authors and articles) and the Visual Browser [7]. The latter is a tool for the dynamic (animated) visualization of RDF graphs. It provides flexible visualization thanks to the two-layer architecture:

– first layer: the data stored in RDF (whether in RDF/XML, N3 [2] or Turtle [4]);
– second layer: perspective of view, an XML description of the graphic representation of nodes and edges of the graph.

The visualization of different types of data is described below. The Visual Browser exists either as a standalone Java application or as a Java applet. The applet can communicate with textual parts of the search results page. The interaction between the Java applet and the web page was made through AJAX4 plus JavaScript to communicate with the applet. Submitting the search field or browsing data in one of the interfaces results in a SPARQL query. The server evaluates the query and returns an RDF graph. An XSLT [3] conversion is made and the result is returned as a list of authors and titles. The communication between the applet and the web page is bi-directional: clicking on a name or title in the list renders a set of nodes and edges in the visual interface, and a set of nodes and edges can be displayed as a list of authors and articles. We expect users to type (part of) a name or title in the search box. Then users can browse either the more familiar textual interface (as they are used to), or the visual one. Conversely, when viewing a particular subgraph in the Visual Browser, users can click to have it appear in the textual interface as seen in Figure 1.

Fig. 1. Visual and textual interfaces to search results within DML-CZ. It allows users to choose how they browse the results. For this purpose, it is necessary that the interfaces are able to communicate.

4 Asynchronous JavaScript And XML

In this visualization, nodes represent units such as authors, articles, issues, journals, keywords and MSC classes. Different classes of units are represented by different colours and shapes. The mapping from logical entities to their visual attributes is fully configurable in Visual Browser. Edges represent authors and their articles, articles in issues, issues in journals, as well as links between similar articles. Some of these relations are structural (e.g. articles in issues), some are semantic (e.g. classification of articles), and some have both aspects (authors of articles). We have to evaluate users' behaviour to decide what types of relations are useful for browsing. Even though we expect semantic relations to be more important than the structural ones, we display both types of relations. Similarly to the visualization of nodes, the appearance of an edge (its colour, shape and length) distinguishes different classes of edges. Thanks to small internal windows that open when the mouse is held over a node, short texts can be displayed. This can be helpful when displaying titles or even abstracts as seen in Figure 2.

Fig. 2. Visualization of texts: when the mouse is held over a node, more information is shown

The current interface for DML-CZ provides information about semantically similar articles. Similarities have been pre-calculated using three different methods [11]. Similar articles are connected with edges of different lengths; the shorter the edge, the more similar two articles are (see Figure 3). Scientific articles usually cite other sources. These citations (references) are related to a topic mentioned in the article and therefore they are useful for users who have already read the article and are looking for further reading. The required state will be that users can browse references to articles regardless of the repository these articles come from. Achieving a high coverage of at least articles' metadata is one of the major goals of the EuDML project.

Fig. 3. Visualization of similarities: the length of the edge is also a bearer of meaning; with edge labels displayed, one can also see similarities expressed by numbers

4 Conclusion and Future Work

We have presented an alternative to the current DML-CZ interface. Visual interfaces are more attractive and can help orientation in complex data such as library records. So far it is experimental, but we plan to include it in the official DML-CZ site and offer it to the EuDML project. Future work comprises monitoring users' preferences regarding the interfaces and their possible feedback. We expect that users will need time to get used to the visual interface, since it is far from the traditional way of browsing. But we hope that users will appreciate the holistic overview of complex information. Our immediate plans include working on the design of the search result interfaces. For this, users' feedback will be necessary. We also have to test the RDF Server on the significant loads that are expected within DML-CZ and EuDML. These conditions seem necessary for usability within EuDML. A working prototype can be seen at

http://nlp.fi.muni.cz/~xpopelk/dml/VBApplet.html.

Acknowledgments

This research has been partially supported by the grant registration no. 1ET200190513 of the Academy of Sciences of the Czech Republic (DML-CZ), and by EU project # 250503 in CIP-ICT-PSP.2009.2.4 (EuDML).

References

1. Beckett, D. and McBride, B.: RDF/XML Syntax Specification (February 2004), http://www.w3.org/TR/rdf-syntax-grammar/

2. Berners-Lee, T.: Notation 3 (2008), http://www.w3.org/DesignIssues/Notation3
3. Clark, J.: XSL Transformations (XSLT) (1999), http://www.w3.org/TR/xslt
4. David Beckett, T. B.-L.: Turtle – Terse RDF Triple Language (2008), http://www.w3.org/TeamSubmission/turtle/
5. DML: Czech digital mathematics library – news (2010), http://dml.cz/news, retrieved April 20, 2010 from http://dml.cz/news
6. Ion, P. and Eilbeck, C.: Mathematics Subject Classification 2010 (2010), http://msc2010.org
7. Nevěřilová, Z.: Visual Browser: A Tool for Visualising Ontologies. In: Proceedings of I-KNOW'05, pp. 453–461. Know-Center in coop. with Graz Uni, Joanneum Research and Springer Pub. Co., Graz, Austria (2005)
8. Prud'hommeaux, E. and Seaborne, A.: SPARQL Query Language for RDF (2008), http://www.w3.org/TR/rdf-sparql-query
9. RFC3066: Tags for the Identification of Languages (January 2001), http://potaroo.net/ieft/idref/rfc3066
10. Tufte, E.: Envisioning Information. Graphics Press (1990)
11. Řehůřek, R. and Sojka, P.: Automated Classification and Categorization of Mathematical Knowledge. In Autexier, S., Campbell, J., Rubio, J., Sorge, V., Suzuki, M., Wiedijk, F. (eds.): Intelligent Computer Mathematics – Proceedings of 7th International Conference on Mathematical Knowledge Management MKM 2008. Lecture Notes in Computer Science LNCS/LNAI, vol. 5144, pp. 543–557. Springer-Verlag, Berlin, Heidelberg (July 2008)

X-plain – a Game that Collects Common Sense Propositions

Abstract. Common sense knowledge is very important for some NLP tasks, but it is hard to extract from existing linguistic resources. Thus specialized collections of common sense propositions are created. This paper presents one of the ways of making such a collection for the Czech language. We have created a cooperative game where a computer program plays together with a human. The purpose of the game is to describe a word with short sentences to the co-player. While the human player is expected to use his/her common sense, the computer program uses word sketches. The paper describes the game in detail, its background, and discusses the need for motivation and game policy. It also discusses the quality and coverage of the collection.

1 Introduction

Common sense knowledge is considered crucial for some NLP tasks. In principle there are two approaches to collecting common sense data: collections made by experts and collections made by volunteers. Both approaches, and the many variants between them, differ in several aspects such as cost, quality and coverage. We present a project where common sense propositions are collected by means of a game. In this article a Czech version of the game is presented; however, the principle can be used for other languages as well. The game presented in this paper is named X-plain. Players are at the same time contributors to the database of common sense propositions. Section 2 describes common sense and explains the need for collecting it. In section 3 we describe the principle of the game. Section 4 describes in detail how the computer program can play together with a human. In section 5 we discuss the quality of the collected data and the contribution policy. We have to expect that the database will be error prone and that different contributors have different reliability. We propose future work in section 6.

2 Common Sense and How to Collect it

Common sense is often described as a huge set of processes of natural cognition and a system of beliefs that people share. Common sense does not always correspond to scientific or even real-world observation; rather, it is a set of assumptions about the real world [Smith, 1995]. Inherently, common sense propositions are not easy to collect. Therefore specialized collections of common sense exist. Well-known projects include CyC [Lenat, 1995], ThoughtTreasure [Mueller, 2003] (expert-made) or the Open Mind Common Sense Initiative [Stork, 2007] (volunteer-made). The game Verbosity [von Ahn et al., 2006] proposes another way of collecting common sense propositions. All mentioned projects contain mainly data in the English language. This paper reports on a game similar to Verbosity, but with a different engine. Its main purpose is to create a collection of common sense propositions in the Czech language.

3 Game Principle

X-plain has an analogy in board games such as "Taboo™". It is a cooperative game for two players. The principle is that a random word (called the secret word) is displayed to one player (narrator) and s/he has to explain it to the second player (guesser). The guesser has to say (or write down) the exact word. In X-plain the guesser tries to guess the word with an apparently unlimited number of tries. When s/he is successful, the score is increased and next turn the roles swap. When the narrator is not able to describe the secret word or the guesser is not able to reveal it, they can pass on the word. Next turn the roles swap but the score stays unchanged. The game is time limited to 3 minutes. In X-plain there are different relation types that together with the secret word and the object make sentence templates, e.g. X is_kind_of Y. Currently there are the following relation types:
– can_have_property
– has_part
– is_part_of
– is_a_type_of
– is_used_for
– can_be_used_for
– can_be_likely_found
– is_the_opposite_of
– is_similar_to
– is_related_to
– using

At first, relations were selected according to Verbosity. Afterwards the list was adapted to Sketch Engine outputs (see subsection 4.2). These relations are considered easy to understand; however, it seems that players attach significant importance to secret words and objects and not to relations, see section 5. Secret words were selected randomly; one-meaning words are preferred. The list is continuously adapted, as the words are examined by human players and the Sketch Engine ("difficult words are rejected"), see section 5.

4 Game Background

There is a significant difference between X-plain and Verbosity: in Verbosity two human players (chosen randomly from on-line players) play together, whereas in X-plain a human plays with a computer program. The program has to take the role of the second player. The program's "knowledge" is based upon two resources: previous contributions and word sketches. X-plain is a web-based application whose server side is programmed in PHP1. The client side uses JavaScript and AJAX2 for better comfort. Thus players do not have to install special software. Contributions from human narrators are stored in a MySQL3 database in the form of a triple (subject, relation, object) together with its number of occurrences. Hints given by the computer program are not stored because they result from the database itself or from the Sketch Engine (see subsection 4.1).

Fig. 1. Screenshot (part) from X-plain: the narrator (human) has to describe the word "kometa" (comet). On the left s/he has to fill the following sentence templates: se skládá z (has part); je součástí (is part of); je druh (is a type of); je určená pro/k/na (is used for); se nejčastěji nachází blízko/v/na (can be likely found). S/he types: ". . . se skládá z ohonu" (. . . has part tail). On the right the guesser (computer) tries to guess the secret word: "liška" (fox), "kůň" (horse). On the top right of the screen there is a time countdown.

4.1 Word Sketches

A word sketch [Kilgarriff and Rundell, 2002] is made from a corpus using grammar patterns. It groups together words playing the same grammatical role in sentences. The Sketch Engine [Kilgarriff et al., 2004] is supplied with grammatical relations for the requested language. Grammatical relations for Czech include three types: symmetric, dual and trinary (explained in detail in [Ske, 2010]). For the Czech language the words are in grammatical relations such as:

– coord – words in coordination, typically nouns connected by conjunctions "and", "or". This relation is symmetric.
– prec_ – the word followed by and X. This relation is trinary.
– a_modifier – adjective word modifier. This relation is dual to modifies.

1 http://www.php.net
2 Asynchronous JavaScript And XML
3 http://www.mysql.com

4.2 From Grammar to Semantics

In X-plain the relations in sentence templates are semantic, but in word sketches only grammatical relations exist. Therefore, we propose a set of rules that link grammatical and semantic relations. The idea is similar to grammatical relations in the Sketch Engine: the rules are quite straightforward and the results do not tend to be perfect, but plausible. Since grammatical relations are made for each language, grammar-to-semantics rules are also language dependent. Currently there are grammar-to-semantics rules such as:

is_related_to =⇒ ["coord"]
is_part_of =⇒ ["gen_1"]
has_part_of =⇒ ["gen_2"]
can_have_property =⇒ ["a_modifier"]

The first rule is interpreted as "the relation type is_related_to relates the secret word to all words from the word sketch coord (coordination)". Similarly, reverse grammar-to-semantics rules exist. For symmetric grammatical relations these rules are equal, for dual relations the rules use the dual relation. For trinary relations with a preceding/following preposition the rules interchange the post and prec prefixes. Reverse grammar-to-semantics rules look like the following (a sketch of this mapping is given after the list):

is_related_to =⇒ ["coord"]
is_part_of =⇒ ["gen_2"]
has_part_of =⇒ ["gen_1"]
can_have_property =⇒ ["modifies"]
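A minimal sketch of these rules as Prolog facts (an illustration only, not the PHP implementation used in X-plain); the forward direction maps a semantic relation type to word-sketch grammatical relations, and the reverse direction is derived by swapping each grammatical relation for its dual:

% forward grammar-to-semantics rules: semantic relation -> word-sketch gramrels
semrel_gramrels(is_related_to,     [coord]).
semrel_gramrels(is_part_of,        [gen_1]).
semrel_gramrels(has_part_of,       [gen_2]).
semrel_gramrels(can_have_property, [a_modifier]).

% dual grammatical relations; symmetric relations are their own dual
dual(gen_1, gen_2).
dual(gen_2, gen_1).
dual(a_modifier, modifies).
dual(coord, coord).

% reverse rules are derived instead of being listed separately
reverse_semrel_gramrels(SemRel, Reversed) :-
    semrel_gramrels(SemRel, GramRels),
    maplist(dual, GramRels, Reversed).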

4.3 Use of Word Sketches in the Game

If the program plays the role of narrator, it follows the scenario:

– it looks for an object in the database of contributions (same secret word, same relation type) and creates a result set for each relation type
– it creates a word sketch based on the secret word and uses grammar-to-semantics rules to obtain sets of words for each relation type
– it chooses words randomly from these sets and presents them together with appropriate relation types to the guesser

In the guesser role, the program follows the scenario:

– it looks for a subject that fulfils the conditions (same relation type, same object) in the database of contributions and creates a result set
– it creates a word sketch based on the object and uses reverse grammar-to-semantics rules to obtain sets of words for the relation type
– it chooses words randomly from these sets and tries to guess the secret word

Example The secret word is “kometa” (comet)

– narrator (human) can choose among 5 relation types
– narrator (human) fills template: . . . souvisí s vesmírem (. . . is related to space)

relation 1            relation 2      % of occurrences
is_similar_to         is_related_to   1.08
is_similar_to         is_a_type_of    0.51
is_related_to         is_part_of      0.47
is_related_to         is_a_type_of    0.47
can_be_likely_found   is_part_of      0.47

Table 1. Relations that are used with the same subjects and objects: X relation 1 Y and X relation 2 Y

– guesser (computer) gets "hvězda" (star) as a result from the database and a set of words from word sketches: aviation, solar, astronomy, science, galaxy, human, . . . The guesser chooses the following words: astronomie (astronomy), věda (science), hvězda (star)
– narrator (human) fills template: . . . může mít vlastnost Halleyova (. . . can have property Halley's)
– guesser (computer) gets no results from the database, but one result from word sketches: "kometa" (comet). The guesser types the word "kometa" (comet) – success! (players score points)

So far human players score points 1.24 times more often than the computer. To make the computer program more successful, we can rank the results from the database and word sketches and not choose randomly but consider the frequency. On the other hand, the more successful the computer is, the fewer propositions we collect.

5 Game Policy and Quality of Contributions

A simple measure of the quality of contributions is agreement. Since collecting common sense propositions is not a scientific endeavour, we do not need to collect the "truth". All we need is the usage. Where a proposition repeats from different contributors, it means that several players think the same way about the secret word. Players are playing with a time limit, so they often write the first idea that comes to their mind. When collecting common sense propositions, this is an advantage. On the other hand, the time limit can lead to many spelling errors. In the data we have already collected (about 2200 propositions), the relation type is often misused. For example, in the database we can find records such as: X is_similar_to Y and X is_opposite_of Y. This need not be an error in all cases; however, we cannot weight the relation type the same as the secret word or the object. Table 1 shows what types of relations (the most occurring cases) are used with the same subject and object and their occurrence ratio in the whole collection. An important aspect of the collection is the coverage. We can observe that some words are passed very often with no propositions: either they are not understood by players or they are "hard" to explain. Table 2 shows words that are poorly covered and their categorization. The majority of them are abstract words and we can assume that these words are difficult to explain.

word        translation          number of unsuccessful guesses  category
zpronevěra  fraud                5                               abstract words
zkouška     exam/testing         4                               abstract words, polysemes
myslivost   woodcraft            3                               domain specific terms
nemocný     sick/invalid         3                               polysemes
vztah       relation             3                               abstract words
copyright   copyright            2                               abstract words
demokracie  democracy            2                               abstract words
důvod       reason               2                               abstract words
fanoušek    fan                  2                               other
gang        gang                 2                               other
guvernér    governor/proconsul   2                               polysemes
hrana       edge/angle/knell     2                               polysemes
lesák       woodlander           2                               domain specific terms

Table 2. Words difficult to explain for humans and their categorization. The number of unsuccessful guesses takes into account only games where the human player gives at least some clue.

6 Conclusion and Future Work

This paper describes another approach to linguistic data collection. It is designed mainly for collecting common sense propositions in the Czech language. Czech is a minor language, thus we cannot expect millions of propositions within a few months as with GWAP [von Ahn, 2006]. We are strongly interested in players' motivation. The game history is available for each game, so we can identify the words that are hard to explain (many passes, few propositions) or conversely the words that are easy to explain (best scored guesses). Further analysis should answer the question why some words are "easy" and others are not. We have to carefully choose the words for each level so that players stay motivated. As the number of players grows, we have to find automatic or semi-automatic ways to discover "hostile" contributors. Although Czech uses diacritics, some web users are used to writing without them. We have to observe the data in the long term and decide afterwards how to handle this problem. Either we can count the words without diacritics as doublets, or we can try to add diacritics. The major contribution of this work is the method of collecting common sense propositions in Czech. We have to evaluate the reliability of the collection over time. We expect that a plausible number of common sense propositions will be collected over time.

References

[Ske, 2010] Corpus querying and grammar writing for the Sketch Engine. Retrieved March 2, 2010 from http://trac.sketchengine.co.uk/wiki/SkE/CorpusQuerying.
[Kilgarriff and Rundell, 2002] Kilgarriff, A. and Rundell, M. (2002). Lexical profiling software and its lexicographic applications - a case study. In Proceedings of the Tenth EURALEX International Congress, pages 807–818.
[Kilgarriff et al., 2004] Kilgarriff, A., Rychlý, P., Smrž, P., and Tugwell, D. (2004). The Sketch Engine. In Proceedings of the Eleventh EURALEX International Congress, pages 105–116.
[Lenat, 1995] Lenat, D. B. (1995). CYC: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11):33–38.
[Mueller, 2003] Mueller, E. T. (2003). ThoughtTreasure: A natural language/commonsense platform. Retrieved November 9, 2009 from http://alumni.media.mit.edu/~mueller/papers/tt.html.
[Smith, 1995] Smith, B. (1995). Formal ontology, common sense and cognitive science. International Journal of Human-Computer Studies, pages 641–667.
[Stork, 2001] Stork, D. G. (2001). Toward a computational theory of data acquisition and truthing. In COLT '01/EuroCOLT '01: Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory, pages 194–207, London, UK. Springer-Verlag.
[Stork, 2007] Stork, D. G. (2007). Open Mind Initiative – about. Retrieved October 28, 2007 from http://openmind.org.
[von Ahn, 2006] von Ahn, L. (2006). Games with a purpose. Computer, 39(6):92–94.
[von Ahn et al., 2006] von Ahn, L., Kedia, M., and Blum, M. (2006). Verbosity: a game for collecting common-sense facts. In CHI '06: Proceedings of the SIGCHI conference on Human Factors in computing systems, pages 75–78, New York, NY, USA. ACM.

Semantic Role Patterns and Verb Classes in Verb Valency Lexicon

Zuzana Nevěřilová

Faculty of Informatics, Masaryk University, Botanická 68a, Brno 602 00

Abstract. For the Czech language there is a large valency frame lexicon: VerbaLex. It contains verbs, slots related to the verbs and information about the semantic roles each slot plays. This paper discusses observations made on VerbaLex frames related to verb classification. It shows that for particular classes of verbs (e.g. verbs describing weather) some semantic role patterns are typical. It also tries to reveal these patterns in not so obvious cases. Currently, verb frames in VerbaLex are not interconnected. This paper outlines a way we can make such connections. We expect that verb frames of the same class or with the same semantic role patterns are semantically close and therefore propose similar types of interconnection. We expect to create a relatively small set of inference rules that influence a large number of verb frames.

1 Introduction

A valency frame lexicon consists of the following units:

– verb – a word and its synonyms describing an action, event or state
– verb frame – syntactic and semantic description of sentence constituents dependent on the verb
– slot – description of each dependent constituent

A valency frame lexicon serves not only as a syntactic description of verb dependent constituents, but also helps to describe or predict their semantic roles. We consider that the meaning of a sentence is composed of the meanings of its constituents and the syntactic structure the constituents form. Due to this we can use verb frames for semantic disambiguation. Moreover, if we are able to semantically disambiguate sentences in a discourse, we will be able to put relations of known types between constituents in the sentences (e.g. cause–effect). In this paper we study the valency frame lexicon of Czech verbs VerbaLex. We construct the semantic role patterns and compare them with verb classification. We briefly introduce VerbaLex in section 2. Section 3 defines and describes semantic role patterns in detail. In section 4 we describe verb classes and evaluate the VerbaLex data w.r.t. semantic role patterns. In section 5 we describe the generalization as the result of the two approaches. We discuss the possibility of interconnecting VerbaLex frames. Section 6 gives a conclusion and proposes future work.

2 VerbaLex

VerbaLex [Hlaváčková, 2007] is a valency frame lexicon built for the Czech language. Currently it contains 19 360 verb frames for more than 10 000 verbs [Hlaváčková, 2009]. Semantic information is available on two levels:

– semantic role (also known as thematic role or thematic relation) that a sentence constituent plays w.r.t. the action or state. The concept is based on [Gruber, 1965] and is currently widely used (with some changes). VerbaLex contains 33 semantic roles such as agent, patient, location or substance.
– semantic restriction on a hypernym (e.g. person). This second level is related to WordNet's hypernyms [Fellbaum, 1998] (e.g. person:1, where person is a literal and 1 is the sense number).

Moreover, a list of grammatical features such as preposition and grammatical case is present for each slot. VerbaLex was built by lexicographers, independently of corpus information. It differs from VALLEX [Lopatková et al., 2006] mainly in its size and structure. VALLEX is closely related to the Prague Dependency Treebank, while VerbaLex was built independently from it. VALLEX has no relation to WordNet; it contains only thematic roles (called functors in VALLEX). Moreover, the function of these functors is different. Verb frames in the two lexicons are not compatible.

3 Semantic Role Patterns

3.1 First Level Semantic Role Patterns

This paper concerns not single verbs, but groups of verbs that are expected to be semantically close. We define the 1st level semantic role pattern for a particular verb frame as a tuple P = (R1, ..., Rn). One of the elements of P is always the verb (marked as VERB later in this paper); the other Ri are the semantic roles assigned to this verb frame. We made observations on the 1st level semantic role patterns and their frequency in VerbaLex. Table 1 shows the most frequent patterns with example verbs. We can see that in some cases the verbs are semantically close, while in other cases there is no perceivable semantic closeness. This feature depends significantly on the type of semantic role: the more specific it is, the more relatedness among the verbs we can observe. For example, patterns containing communication (COM) group semantically close verbs, while patterns containing abstract object (OBJ) embody different groups of verbs. Note that only the first level (semantic roles) of VerbaLex is considered here and no grammatical features are taken into account.
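The frequencies shown in Table 1 can in principle be reproduced with a few lines of code. The sketch below assumes each frame has already been reduced to a tuple of role labels, with the verb marked as VERB; the three sample frames are invented, not real VerbaLex data.

from collections import Counter

# Each frame is reduced to its 1st level semantic role pattern: a tuple of role
# labels in which the verb itself appears as "VERB".
frames = [
    ("AG", "VERB", "PAT"),
    ("AG", "VERB", "PAT"),
    ("AG", "VERB", "OBJ"),
]

pattern_counts = Counter(frames)
for pattern, count in pattern_counts.most_common():
    print(count, pattern)
# 2 ('AG', 'VERB', 'PAT')
# 1 ('AG', 'VERB', 'OBJ')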

3.2 Second Level Semantic Role Patterns

Similarly to 1st level semantic role patterns, we define the 2nd level semantic role pattern for a particular verb frame as a tuple of pairs P = ((R1, W1), ..., (Rn, Wn)). Here again Ri is a semantic role (one of them being the verb) and Wi is the WordNet hypernym that restricts the sentence constituent. Note that Wi need not be present for every semantic role; in such cases it is marked as ε.

semantic role pattern   # of frames   example verbs                        translation
(AG,VERB,PAT)           1049          bodat, ovládnout, štěkat             sting, dominate, bark
(AG,VERB,OBJ)           866           bouchat, klovat, kácet               knock, (bird) peck, lumber
(AG,VERB,ACT)           788           detekovat, kazit, litovat            detect, destroy, be sorry
(AG,VERB,PAT,ACT)       444           blahopřát, tázat se                  compliment, ask
(AG,VERB,ART)           403           obarvit, kompilovat, koupit          color, compile, buy
(AG,VERB,LOC)           394           uzavřít, chvátat                     close, rush
(AG,VERB,COM)           388           analyzovat, psát, klábosit           analyse, write, chat
(AG,VERB,STATE)         339           adaptovat se, dosáhnout, objasnit    adapt, achieve, clarify
(AG,VERB,KNOW)          297           bádat, konvertovat                   research, convert
(AG,VERB,SUBS)          295           bagrovat, pít, vařit                 dig, drink, cook
(AG,VERB,ENT)           279           krást, podobat se                    steal, resemble
(AG,VERB,OBJ,OBJ)       261           doplnit, rozeznat                    supplement, distinguish

Table 1. 12 most frequent 1st level semantic role patterns with the number of occurrences (of the corresponding verb frames) in VerbaLex.

Table 2 shows the most frequent second level semantic role patterns.
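The 2nd level pattern of a single frame can be read off in much the same way as the 1st level one. The short sketch below pairs each role with its restriction and inserts ε where no restriction is present; the sample slots are invented, not real VerbaLex data.

EPSILON = "ε"

def second_level_pattern(slots):
    """slots: list of (role, restriction-or-None) pairs for one verb frame."""
    return tuple((role, restriction if restriction is not None else EPSILON)
                 for role, restriction in slots)

sample = [("AG", "person:1"), ("VERB", None), ("PAT", "person:1")]
print(second_level_pattern(sample))
# (('AG', 'person:1'), ('VERB', 'ε'), ('PAT', 'person:1'))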

semantic role pattern                                                    # of frames
((AG, person:1), (VERB, ε), (PAT, person:1))                             800
((AG, person:1), (VERB, ε), (OBJ, object:1))                             553
((AG, person:1), (VERB, ε), (ACT, act:2))                                543
((AG, person:1), (VERB, ε), (PAT, person:1), (ACT, act:2))               299
((AG, person:1), (VERB, ε), (DPHR, ε))                                   242
((AG, person:1), (VERB, ε), (STATE, state:4))                            224
((AG, person:1), (VERB, ε), (LOC, location:1))                           176
((AG, person:1), (VERB, ε), (EVEN, event:1))                             171
((AG, person:1), (VERB, ε))                                              170
((AG, person:1), (VERB, ε), (OBJ, object:1), (OBJ, object:1))            147
((AG, person:1), (VERB, ε), (PAT, person:1), (DPHR, ε))                  134
((AG, person:1), (VERB, ε), (INFO, info:1))                              127
((AG, person:1), (VERB, ε), (ART, artifact:1))                           120
((AG, person:1), (VERB, ε), (PAT, person:1), (OBJ, object:1))            116

Table 2. 14 most frequent 2nd level semantic role patterns with the number of occurrences (of the corresponding verb frames) in VerbaLex.

4 Verb Classes

Verb classes, defined “in terms of shared meaning components and similar syntactic behavior of words” [Kipper et al., 2008], are useful because they allow generalizations. Since there are thousands of verbs, we prefer processing whole classes instead of single verbs. So far 5 638 verbs have been classified, which makes up about 25 % of all verbs in the lexicon. The classification is based on [Schuler, 2005] (VerbNet’s classification is based on Levin’s classes of English verbs), but adapted for Czech. Table 3 shows the relation between semantic role patterns and verb classes. Although there are many patterns for each verb class (only those with frequency greater than 5 are shown), we can see some similarities. For example, a pattern is often a subset of another pattern in the same verb class. Or, in other words, a particular semantic relation is contained in (almost) every pattern for a particular verb class. Note that only classified verbs were considered, so the numbers of occurrences relate to about 25 % of the data.
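The comparison behind Table 3 can be sketched as follows: patterns are tabulated per verb class, and a simple containment test captures the observation that one pattern is often a subset of another within the same class. The sample data and the helper names are assumptions for illustration only.

from collections import Counter, defaultdict

# Invented (pattern, verb class) pairs standing in for classified VerbaLex frames.
classified = [
    (("AG", "VERB", "OBJ", "SUBS"), "butter-9"),
    (("AG", "VERB", "SUBS"), "butter-9"),
    (("AG", "VERB", "PAT"), "judgement-33.2"),
    (("AG", "VERB", "PAT", "ACT"), "judgement-33.2"),
]

# Tabulate pattern frequencies per verb class, as in Table 3.
per_class = defaultdict(Counter)
for pattern, verb_class in classified:
    per_class[verb_class][pattern] += 1

def is_subpattern(shorter, longer):
    """True if every role of `shorter` occurs in `longer` at least as often."""
    need, have = Counter(shorter), Counter(longer)
    return all(have[role] >= n for role, n in need.items())

for verb_class, patterns in per_class.items():
    print(verb_class, patterns.most_common())

# The observation that one pattern is often contained in another of the same class:
print(is_subpattern(("AG", "VERB", "SUBS"), ("AG", "VERB", "OBJ", "SUBS")))  # True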

5 Generalization

In the previous sections we have described semantic role patterns and verb classes in detail. The main difference is the manner in which they were acquired. In the case of semantic role patterns, we deduce the semantic closeness only from the semantic roles of the verb-dependent constituents. Therefore some of the verb groups are semantically closer than others. Verb classes were made by linguists to group semantically close verbs together. Although the “behavior” of the verb was observed, the main decision feature is the meaning. We expect that both semantic role patterns and verb classes can serve for the extension of VerbaLex.

In the case of 2nd level semantic role patterns we observe that the semantic closeness is more pronounced than when using only the 1st level. For example, verbs with the pattern ((AG, person:1), (VERB, ε), (PAT, person:1)) describe the interaction of a human with another human (or humans). It is therefore meaningful to ask the following questions:

– was the action positive or negative for the agent (AG)?
– was the action positive or negative for the patient (PAT)?
– does the patient (PAT) have to be at the same place at the same time as the agent (AG)?
– does the patient (PAT) have to be in physical contact with the agent (AG)?

For verbs with the pattern ((PART, bodypart:1), (VERB, ε), (PAT, person:1)), describing the action or state of body parts, there are the following expectations:

– the body part (PART) is part of the patient (PAT)
– there is something unusual with the patient’s (PAT) body part (PART)

The following question is meaningful:

– was the action positive or negative for the patient (PAT)?

Similarly, we can construct sets of expectations and meaningful questions for verb classes.

no. of occurrences in VerbaLex   semantic role pattern       verb class
22                               (AG,VERB,OBJ,SUBS)          butter-9
15                               (AG,VERB,ART)
15                               (AG,VERB,OBJ)
10                               (AG,VERB,PAT,SUBS)
 9                               (AG,VERB,PAT,PART,SUBS)
 9                               (AG,VERB,SUBS)
20                               (AG,VERB,PAT)               judgement-33.2
11                               (AG,VERB,PAT,ACT)
 9                               (AG,VERB,PAT)
 8                               (AG,VERB,ACT)
 6                               (AG,VERB,ATTR)
15                               (AG,VERB,OBJ)               remove-10.1
 9                               (AG,VERB,OBJ,LOC)
 8                               (AG,VERB,OBJ,OBJ)
 7                               (AG,VERB,OBJ,LOC,LOC)
 7                               (AG,VERB,PAT)
15                               (AG,VERB,OBJ)               bodyinternalmotion-49
 9                               (AG,VERB,PART)
15                               (AG,VERB,ACT)               want-32
10                               (AG,VERB,PAT)
 6                               (AG,VERB,ABS)
 6                               (AG,VERB,OBJ)
13                               (AG,VERB,PAT,PART)          spank-18.3
12                               (AG,VERB,PAT)
11                               (AG,VERB,PAT,INS)
13                               (AG,VERB,OBJ)               fill-9
 8                               (AG,VERB,PAT,ART)
 7                               (AG,VERB,OBJ,OBJ)
13                               (AG,VERB,KNOW)              discover-82
11                               (AG,VERB,ACT)
11                               (AG,VERB,STATE)
11                               (ENT,VERB)                  animalsounds-38
 8                               (AG,VERB)
10                               (AG,VERB,PAT)               see-30.1
10                               (AG,VERB,OBJ)
 6                               (AG,VERB,EVEN)

Table 3. 10 most frequent verb classes with their semantic role patterns.

5.1 Extending VerbaLex with Inference Rules

VerbaLex is a frame-based lexical resource. It contains slots describing typical situations (in this case sentence constituents dependent on the verb), with restrictions on their values (in this case WordNet hypernyms). Contrary to other frame-based resources such as FrameNet [Baker and Fillmore, 2009], VerbaLex frames are not related to each other; there is no hierarchy among the frames. Following [Nevěřilová, 2009], we would like to extend VerbaLex with interconnections between frames. At first, three relation types come into consideration: precondition, effect and decomposition. With the described generalization we can create small sets of inference rules that interconnect large sets of verbs. A rule can look as follows:

((PART, bodypart:1), (VERB, ε), (PAT, person:1))
        implies
((AG, person:1), (VERB_feel, ε), (FEEL, feeling:1), (REAS, reason:1))

Fig. 1. A rule stating that someone is feeling something because of the action/state of his/her body part. Arrows show which sentence constituents refer to the same entity.
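To make the intended reading of such a rule concrete, the sketch below encodes it as plain data with explicit coreference links standing in for the arrows. The chosen data layout and the particular slot-to-slot links are our own assumptions, not a formalization.

# An informal encoding of the rule in Fig. 1. The "corefer" links stand for the
# arrows in the figure; which slot maps to which is one plausible reading only.
rule = {
    "if":      (("PART", "bodypart:1"), ("VERB", "ε"), ("PAT", "person:1")),
    "implies": (("AG", "person:1"), ("VERB_feel", "ε"),
                ("FEEL", "feeling:1"), ("REAS", "reason:1")),
    # slot index in "if" -> slot index in "implies"
    "corefer": {2: 0,   # the patient (PAT) is the one who feels (AG)
                0: 3},  # the body part (PART) is the reason (REAS) for the feeling
}

def apply_rule(rule, matched_fillers):
    """Given fillers for the "if" slots, return the fillers implied for the new frame."""
    implied = {}
    for src, dst in rule["corefer"].items():
        implied[rule["implies"][dst][0]] = matched_fillers[src]
    return implied

# The patient "Petr" with the body part "hlava" (head) implies that Petr feels
# something because of his head:
print(apply_rule(rule, {0: "hlava", 2: "Petr"}))   # {'AG': 'Petr', 'REAS': 'hlava'}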

The rules outlined above aim to transform VerbaLex from a collection of separate verb frames into a semantic network with inference, as Chang, Narayanan and Petruck did within FrameNet [Chang et al., 2002]. Obviously, the example above is only intuitive, not formalized. We also have to consider what is lost and gained when processing groups of verbs instead of single verbs. The main advantage is efficiency: with a relatively small number of rules we can interconnect a large number of frames. The disadvantage of this approach is that it cannot handle exceptions to these rules, and it cannot handle cases where a verb is badly annotated (either it has wrong roles in its slots, or a wrong verb class). On the other hand, a side effect of this work is checking VerbaLex. We can spot “suspicious” verb frames (e.g. verbs that are in a particular class but do not follow the patterns of other verbs of the same class) and check manually whether they are annotated appropriately.
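One way this check could be sketched is to flag frames whose pattern is rare within their own verb class. The data format, the threshold and the sample verbs below are illustrative assumptions only.

from collections import Counter, defaultdict

def suspicious_frames(classified, min_share=0.1):
    """Flag frames whose pattern covers less than `min_share` of its verb class.

    `classified` is an iterable of (verb, pattern, verb_class) triples.
    """
    classified = list(classified)
    per_class, totals = defaultdict(Counter), Counter()
    for _, pattern, verb_class in classified:
        per_class[verb_class][pattern] += 1
        totals[verb_class] += 1
    return [(verb, verb_class, pattern)
            for verb, pattern, verb_class in classified
            if per_class[verb_class][pattern] / totals[verb_class] < min_share]

# Invented data: 19 frames follow the dominant pattern of the class, one does not.
data = [("bodat", ("AG", "VERB", "PAT"), "spank-18.3")] * 19 \
     + [("mávat", ("AG", "VERB", "LOC"), "spank-18.3")]
print(suspicious_frames(data))   # the single LOC frame is flagged for manual checking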

6 Conclusion and Future Work

This paper studies the structure of the verb frames found in the Czech valency frame lexicon VerbaLex. Two approaches to grouping verbs are shown: semantic role patterns extracted from VerbaLex and verb classification. We expect that verbs with the same semantic role patterns are semantically close. Therefore, the patterns are compared to verb classes (based on VerbNet’s classification). The purpose of this work is to make useful generalizations about verb groups. Thanks to these generalizations we can apply a small number of inference rules to a large number of verbs. A side effect of this work is the verification of VerbaLex. In the future we have to concentrate on the formal representation of inference rules. Future work also concerns the construction of rules that can interconnect the verb frames in VerbaLex. If this work succeeds, VerbaLex will acquire a new dimension of meaning representation for the Czech language.

Acknowledgments

This work has been partly supported by the Ministry of Education of the Czech Republic within the Center of basic research LC536.

References

[Baker and Fillmore, 2009] Baker, C. F. and Fillmore, C. J. (2009). FrameNet. [Online; accessed 30-July-2009].
[Chang et al., 2002] Chang, N., Narayanan, S., and Petruck, M. R. L. (2002). From frames to inference. In Proceedings of the First International Workshop on Scalable Natural Language Understanding.
[Fellbaum, 1998] Fellbaum, C. (1998). WordNet: An Electronic Lexical Database (Language, Speech, and Communication). The MIT Press.
[Gruber, 1965] Gruber, J. S. (1965). Studies in Lexical Relations. PhD thesis, MIT, Cambridge, MA.
[Hlaváčková, 2007] Hlaváčková, D. (2007). Databáze slovesných valenčních rámců VerbaLex. PhD thesis, Masarykova univerzita, Filozofická fakulta, Ústav českého jazyka.
[Hlaváčková, 2009] Hlaváčková, D. (2009). Počet lemmat v synsetech VerbaLexu. In After Half a Century of Slavonic Natural Language Processing, Brno, Czech Republic. Tribun EU.
[Kipper et al., 2008] Kipper, K., Korhonen, A., Ryant, N., and Palmer, M. (2008). A large-scale classification of English verbs. Language Resources and Evaluation, volume 42, pages 21–40. Springer Netherlands.
[Lopatková et al., 2006] Lopatková, M., Žabokrtský, Z., and Benešová, V. (2006). Valency lexicon of Czech verbs VALLEX 2.0. Technical Report 34, ÚFAL MFF UK.
[Nevěřilová, 2009] Nevěřilová, Z. (2009). Exploring and Extending Czech WordNet and VerbaLex. In Proceedings of the RASLAN Workshop 2009, Brno. Masaryk University.
[Schuler, 2005] Schuler, K. K. (2005). VerbNet: A Broad-Coverage, Comprehensive Verb Lexicon. PhD thesis, University of Pennsylvania.