From: AAAI-87 Proceedings. Copyright ©1987, AAAI (www.aaai.org). All rights reserved.

Inference In Text Understanding

Peter Norvig
Computer Science Dept., Evans Hall
University of California, Berkeley
Berkeley, CA 94720

Abstract

The problem of deciding what was implied by a written text, of "reading between the lines," is the problem of inference. To extract proper inferences from a text requires a great deal of general knowledge on the part of the reader. Past approaches have often postulated an algorithm tuned to process a particular kind of knowledge structure (such as a script, or a plan). An alternative, unified approach is proposed. The algorithm recognizes six very general classes of inference, classes that are not dependent on individual knowledge structures, but instead rely on patterns of connectivity between concepts. The complexity has been effectively shifted from the algorithm to the knowledge base; new kinds of knowledge structures can be added without modifying the algorithm.

1. Introduction

The reader of a text is faced with a formidable task: recognizing the individual words of the text, deciding how they are structured into sentences, determining the explicit meaning of each sentence, and also making inferences about the likely implicit meaning of each sentence, and the implicit connections between sentences. An inference is defined to be any assertion which the reader comes to believe to be true as a result of reading the text, but which was not previously believed by the reader, and was not stated explicitly in the text. Note that inferences need not follow logically or necessarily from the text; the reader can jump to conclusions that seem likely but are not 100% certain.

In the past, there have been a variety of programs that handled inferences at the sentential and inter-sentential level. However, there has been a tendency to create new algorithms every time a new knowledge structure is proposed. For example, from the Yale school we see one program, MARGIE [Schank et al., 1973], that handled single-sentence inferences. Another program, SAM [Cullingford, 1978], was introduced to process stories referring to scripts, and yet another, PAM [Wilensky, 1978], dealt with plan/goal interactions. But in going from one program to the next a new algorithm always replaced the old one; it was not possible to incorporate previous results except by re-implementing them in the new formalism. Even individual researchers have been prone to introduce a series of distinct systems. Thus, we see Charniak going from demon-based [Charniak, 1972] to frame-based [Charniak, 1978] to marker-passer based [Charniak, 1986] systems. Granger has gone from a plan/goal based system [Granger, 1980] to a spreading activation model [Granger, Eiselt and Holbrook, 1984]. One could say that the researchers gained experience, but the programs did not. Both these researchers ended up with systems that are similar to the one outlined here.

This work was supported in part by National Science Foundation grant IST-8208602 and by Defense Advanced Research Projects Agency contract N00039-84-C-0089.

2. The Algorithm

I have implemented an inferencing algorithm in a program called FAUSTUS (Fact Activated Unified STory Understanding System). A preliminary version of this system was described in [Norvig, 1983], and a complete account is given in [Norvig, 1986]. The program is designed to handle a variety of texts, and to handle new subject matter by adding new knowledge rather than by changing the algorithm or adding new inference rules. Thus, the algorithm must work at a very general level. The algorithm makes use of six inference classes, which are described in terms of the primitives of the representation language introduced below. The algorithm itself can be broken into steps as follows:

Step 0: Construct a knowledge base defining general concepts like actions, locations, and physical objects, as well as specific concepts like bicycles and tax deductions. The same knowledge base is applied to all texts, whereas steps 1-5 apply to an individual text.

Step 1: Construct a semantic representation of the next piece of the input text. Various conceptual analyzers (parsers) have been used for this, but the process will not be addressed in this paper. Occasionally the resulting representation is vague, and FAUSTUS resolves some ambiguities in the input using two kinds of non-marker-passing inferences: relation classification and relation constraint.

Step 2: Pass markers from each concept in the semantic representation of the input text to adjacent nodes, following along links in the semantic net. Markers start out with a given amount of marker energy, and are spread recursively through the network, spawning new markers with less energy, and stopping when the energy value hits zero. (Each of the primitive link types in KODIAK has an energy cost associated with it.) Each marker points back to the marker that spawned it, so we can always trace the marker path from a given marker back to the original concept that initiated marker passing.

Step 3: Suggest inferences based on marker collisions. When two or more markers are passed to the same concept, a marker collision is said to have occurred. For each collision, look at the sequence of primitive link types along which markers were passed. This is called the path shape. If it matches one of six pre-defined path shapes then an inference is suggested. Suggestions are kept in a list called the agenda, rather than being evaluated immediately. Note that inferences are suggested solely on the basis of primitive link types, and are independent of the actual concepts mentioned in the text. The power of the algorithm comes from defining the right set of pre-defined path shapes (and associated suggestions).

For example, an elaboration collision can occur at concept X when one marker path starts at Y and goes up any number of H links to X, and another marker path starts at Z and goes up some H links, out across an S link to a slot, then along an R link to the category the filler of that slot must be, and then possibly along H links to X. (The primitive link types H, S, and R are defined in the next section.) Given a collision, FAUSTUS first looks at the shape of the two path-halves. If either shape is not one of the three named shapes, then no inference can come of the collision. Even if both halves are named shapes, a suggestion is made only if the halves combine to one of the six inference classes, and the suggestion is accepted only if certain inference-class-specific criteria are met.

Step 4: Evaluate potential inferences on the agenda. The result can be either making the suggested inference, rejecting it, or deferring the decision by keeping the suggestion on the agenda. If there is explicit contradictory evidence, an inference can be rejected immediately. If there are multiple potential inferences competing with one another, as when there are several possible referents for a pronoun, then if none of them is more plausible than the others, the decision is deferred. If there is no reason to reject or defer, then the suggested inference is accepted.

Step 5: Repeat steps 1-4 for each piece of the text.

Step 6: At the end of the text there may be some suggested inferences remaining on the agenda. Evaluate them to see if they lead to any more inferences.

The rest of the paper will be devoted to showing a range of examples, and demonstrating that the general inference classes used by FAUSTUS can duplicate inferences made by various special-purpose algorithms.
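The machinery of Steps 2 and 3 can be sketched concretely. The following is a minimal illustration, not FAUSTUS itself: the semantic net is a dictionary of typed links, the per-link energy costs are invented for the example, and the toy concepts (lawyer, client, person) stand in for the real knowledge base.

```python
# Sketch of Steps 2-3: energy-bounded marker passing over a typed
# semantic net, plus collision detection.  Link energy costs and the
# toy net are illustrative assumptions, not FAUSTUS's actual values.

LINK_COST = {'H': 1, 'S': 2, 'R': 2}    # assumed cost per primitive link type

class Marker:
    def __init__(self, concept, energy, parent=None, link_type=None):
        self.concept = concept          # node this marker sits on
        self.energy = energy            # remaining marker energy
        self.parent = parent            # marker that spawned this one
        self.link_type = link_type      # link traversed to get here

    def path(self):
        """Trace back to the origin, returning the list of link types."""
        shape, m = [], self
        while m.parent is not None:
            shape.append(m.link_type)
            m = m.parent
        return list(reversed(shape))

def spread(origin, net, energy):
    """Recursively spawn markers until the energy value hits zero."""
    markers = []
    def go(marker):
        markers.append(marker)
        for link_type, target in net.get(marker.concept, []):
            remaining = marker.energy - LINK_COST[link_type]
            if remaining >= 0:
                go(Marker(target, remaining, marker, link_type))
    go(Marker(origin, energy))
    return markers

def collisions(marker_sets):
    """Concepts reached by markers spread from two or more distinct origins."""
    reached = {}
    for origin, markers in marker_sets.items():
        for m in markers:
            reached.setdefault(m.concept, {})[origin] = m
    return {c: by for c, by in reached.items() if len(by) > 1}

# Toy net fragment: both 'lawyer' and 'client' are kinds of 'person'.
net = {
    'lawyer': [('H', 'person')],
    'client': [('H', 'person')],
}
sets = {c: spread(c, net, 5) for c in ('lawyer', 'client')}
hit = collisions(sets)
print(sorted(hit))                      # ['person']
print(hit['person']['lawyer'].path())   # ['H']
```

A collision like this one, where both path-halves consist of H links only, is the connectivity pattern behind reference-resolution suggestions; in the full system the two path shapes would be matched against the pre-defined shapes and the resulting suggestion placed on the agenda rather than acted on immediately.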
The knowledge base is modeled in the KODIAK representation language, a semantic net-based formalism with a fixed set of well-defined primitive links. We present a simplified version for expository reasons; see [Wilensky, 1986] for more details. KODIAK resembles KL-ONE [Brachman and Schmolze, 1985], and continues the renaissance of spreading activation approaches spurred by [Fahlman, 1979].

3. Describing Inference Classes

FAUSTUS has six marker-passing inference classes. These classes will be more meaningful after examples are provided below, but I want to emphasize their formal definition in terms of path shapes. Each inference class is characterized in terms of the shapes of the two path-halves which lead to the marker collision. There are three path-half shapes, which are defined in terms of primitive link types: H for hierarchical links between a concept and one of its super-categories, S for links between a concept and one of its "slots" or relations, and R for a link indicating a range restriction on a relation. A * marks indefinite repetition, and a -1 marks traversal of an inverse link.

Path Name    Path Shape
Elaboration  origin -> H* -> S -> R -> H* -> collision
Ref          origin -> H* -> collision
Filler       origin -> S-1 -> S -> H* -> collision
Constraint   origin -> H* -> R-1 -> collision
View         origin -> H* -> V -> H* -> R-1 -> collision

Inference Class       Path 1       Path 2
Elaboration           Ref          Elaboration
Double Elaboration    Elaboration  Elaboration
Reference Resolution  Ref          Ref
Concretion            Elaboration  Filler
Relation Concretion   Elaboration  Filler
View Application      Constraint   View

4. Quillian's TLC

One of the first inferencing programs was the Teachable Language Comprehender, or TLC [Quillian, 1969], which took as input single noun phrases or simple sentences, and related them to what was already stored in semantic memory. For example, given the input "lawyer for the client," the program could output "at this point we are discussing a lawyer who is employed by a client who is represented or advised by this lawyer in a legal matter." The examples given in [Quillian, 1969] show an ability to find the main relation between two concepts, but not to go beyond that. One problem with TLC was that it ignores the grammatical relations between concepts until the last moment, when it applies "form tests" to rule out certain inferences. For the purposes of generating inferences, TLC treats the input as if it had been just "Lawyer. Client". Quillian suggests this could lead to a potential problem. He presents the following examples:

lawyer for the enemy     enemy of the lawyer
lawyer for the wife      wife of the lawyer
lawyer for the client    client of the lawyer

In all the examples on the left hand side, the lawyer is employed by someone. However, among the examples on the right hand side, only the last should include the employment relation as part of the interpretation. While he suggests a solution in general terms, Quillian admits that TLC as it stood could not handle these examples.

FAUSTUS has a better way of combining information from syntax and semantics. Both TLC and FAUSTUS suggest inferences by spreading markers from components of the input, and looking for collisions. The difference is that TLC used syntactic relations only as a filter to eliminate certain suggestions, while FAUSTUS incorporates the meaning of these relations into the representation before spreading markers. Even vague relations denoted by for and of are represented as full-fledged concepts, and are the source of marker-passing.
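Because the path shapes are regular patterns over link-type sequences, checking them mechanically is straightforward. The sketch below is my own illustration, not FAUSTUS's code: a path-half is a list of link types (with '~R' standing in for R-1), and only three of the shapes and three of the class combinations are encoded.

```python
# Sketch: classify marker-path halves by shape, then map a pair of
# shapes to an inference class.  '~X' encodes inverse traversal (X-1).
import re

PATH_SHAPES = {
    'Elaboration': r'(H )*S R( H)*',   # origin -> H* -> S -> R -> H* -> collision
    'Ref':         r'H( H)*',          # origin -> H* -> collision (at least one link here)
    'Constraint':  r'(H )*~R',         # origin -> H* -> R-1 -> collision
}

def shape_of(path):
    """Return the named shape matching a path-half, or None."""
    encoded = ' '.join(path)
    for name, pattern in PATH_SHAPES.items():
        if re.fullmatch(pattern, encoded):
            return name
    return None

# A few of the shape combinations that suggest an inference class.
# (A frozenset loses which half is which; that is fine for a sketch.)
INFERENCE_CLASSES = {
    frozenset(['Ref']):                'Reference Resolution',  # Ref + Ref
    frozenset(['Ref', 'Elaboration']): 'Elaboration',
    frozenset(['Elaboration']):        'Double Elaboration',
}

def suggest(path1, path2):
    """Suggest an inference class for a collision, or None."""
    s1, s2 = shape_of(path1), shape_of(path2)
    if s1 is None or s2 is None:
        return None                    # unnamed shape: no inference
    return INFERENCE_CLASSES.get(frozenset([s1, s2]))

print(suggest(['H'], ['H', 'H']))             # Reference Resolution
print(suggest(['H'], ['H', 'S', 'R', 'H']))   # Elaboration
print(suggest(['S', 'R'], ['H', 'S', 'R']))   # Double Elaboration
```

The point the paper makes is visible in the code: the concepts along the path never appear here, only the link types, so the same small matcher serves every knowledge structure in the net.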

Trace output #1 below shows that FAUSTUS can find a connection between lawyer and client without the for relation, just like TLC. Markers originating at the representations of "lawyer" and "client" collide at the concept employing-event. The shape of the marker path indicates that this is a double elaboration path, and the suggested inference, that the lawyer employs the client, is eventually accepted.

In output #2 a non-marker-passing inference first classifies the for as an employed-by relation, because a lawyer is defined as a professional-service-provider, which includes an employed-by slot as a specialization of the for slot. This classification means the enemy must be classified as an employer. Once this is done, FAUSTUS can suggest the employing-event that mediates between an employee and an employer, just as it did in #1. Finally, in #3, the of is left with the vague interpretation related-to, so the enemy does not get classified as an employer, and no employment event is suggested.

Quillian #1
[1] Lawyer.
Rep: (LAWYER)
[2] Client.
Rep: (CLIENT)
Inferring: there is a EMPLOYING-EVENT such that
    the CLIENT is the EMPLOY-ER of it and
    the LAWYER is the EMPLOY-EE of it.
This is a DOUBLE-ELABORATION inference.

Quillian #2
[1] lawyer for the enemy
Rep: (LAWYER (FOR = THE ENEMY))
Inferring: a FOR of the LAWYER is the EMPLOYED-BY
This is a RELATION-CLASSIFICATION inference.
Inferring: there is a EMPLOYING-EVENT such that
    the ENEMY is the EMPLOY-ER of it and
    the LAWYER is the EMPLOY-EE of it.
This is a DOUBLE-ELABORATION inference.

Quillian #3
[1] enemy of the lawyer
Rep: (ENEMY (OF = THE LAWYER))
Inferring: a OF of the ENEMY is probably a RELATED-TO
This is a RELATION-CONCRETION inference.

It should be noted that [Charniak, 1986] has a marker-passing mechanism that also improves on Quillian, and is in many ways similar to FAUSTUS. Charniak integrates parsing, while FAUSTUS does not, but FAUSTUS has a larger knowledge base (about 1000 concepts compared to about 75). Another key difference is that Charniak uses marker strength to make decisions, while FAUSTUS only uses markers to find suggestions, and evaluates them with other means.

5. Script Based Inferences

The SAM (Script Applier Mechanism) program [Cullingford, 1978] was built to account for stories that refer to stereotypical situations, such as eating at a restaurant. A new algorithm was needed because Conceptual Dependency couldn't represent scripts directly. In KODIAK, there are no arbitrary distinctions between "primitive acts" and complex events, so eating-at-a-restaurant is just another event, much like eating or walking, except that it involves multiple agents and multiple sub-steps, with relations between the steps. Consider the following example:

The Waiter
[1] John was eating at a restaurant with Mary.
Rep: (EATING (ACTOR = JOHN) (SETTING = A RESTAURANT) (WITH = MARY))
Inferring: a WITH of the EATING is probably the ACCOMPANIER
    because Mary fits it best.
This is a RELATION-CONCRETION inference.
Inferring: the EATING is a EAT-AT-RESTAURANT.
This is a CONCRETION inference.
[2] The waiter spilled soup all over her.
Rep: (SPILLING (ACTOR = THE WAITER) (PATIENT = THE SOUP) (RECIPIENT = HER))
Inferring: there is a EAT-AT-RESTAURANT such that
    the SOUP is the FOOD-ROLE of it and
    the RESTAURANT is the SETTING of it.
This is a DOUBLE-ELABORATION inference.
Inferring: there is a EATING such that
    the SOUP is the EATEN of it and
    it is the PURPOSE of the RESTAURANT.
This is a DOUBLE-ELABORATION inference.
Inferring: there is a EAT-AT-RESTAURANT such that
    the WAITER is the WAITER-ROLE of it and
    the SOUP is the FOOD-ROLE of it.
This is a DOUBLE-ELABORATION inference.

The set of inferences seems reasonable, but it is instructive to contrast them with the inferences SAM would have made. SAM would first notice the word restaurant and fetch the restaurant script. From there it would match the script against the input, filling in all possible information about restaurants with either an input or a default value, and ignoring input that didn't match the script. FAUSTUS does not mark words like restaurant or waiter as keywords. Instead it is able to use information associated with these words only when appropriate, to find connections to events in the text. Thus, FAUSTUS could handle John walked past a restaurant without inferring that he ordered, ate, and paid for a meal.
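The concretion inference in the trace above (reclassifying the EATING as an EAT-AT-RESTAURANT once a restaurant SETTING is seen) amounts to moving an event down the category hierarchy when its slot fillers satisfy a specialization's constraints. A minimal sketch, with an invented hierarchy and invented slot names:

```python
# Sketch of a concretion inference: re-classify an event as a more
# specific category when its slot fillers satisfy that category's
# constraints.  The hierarchy and slots are illustrative, not KODIAK's.

ISA = {                        # child -> parent (H links)
    'eat-at-restaurant': 'eating',
    'restaurant': 'location',
    'home': 'location',
}

def isa(concept, category):
    """True if concept equals category or lies below it in the hierarchy."""
    while concept is not None:
        if concept == category:
            return True
        concept = ISA.get(concept)
    return False

# Specializations of each category, with the constraints that license them.
SPECIALIZATIONS = {
    'eating': [('eat-at-restaurant', {'setting': 'restaurant'})],
}

def concrete(category, fillers):
    """Return a more specific category if the fillers license one."""
    for specific, constraints in SPECIALIZATIONS.get(category, []):
        if all(isa(fillers.get(slot), required)
               for slot, required in constraints.items()):
            return specific
    return category

print(concrete('eating', {'actor': 'john', 'setting': 'restaurant'}))
# eat-at-restaurant
print(concrete('eating', {'actor': 'john', 'setting': 'home'}))
# eating  (no concretion: the setting is not a restaurant)
```

Note that this check is keyword-free, which is what lets the equivalent of John walked past a restaurant pass through untouched: a walking event has no slot whose filler licenses the restaurant specialization.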

6. Plan/Goal Based Inferences

In the previous section we saw that FAUSTUS was able to make what have been called "script-based inferences" without any explicit script-processing control structure. This was enabled partially by adding causal information to the representation of script-like events. The theory of plans and goals as they relate to story understanding, specifically the work of Wilensky [Wilensky, 1978], was also an attempt to use causal information to understand stories that could not be comprehended using scripts alone. Consider story (4):

(4a) John was lost.
(4b) He pulled over to a farmer by the side of the road.
(4c) He asked him where he was.

Wilensky's PAM program processed this story as follows: from (4a) it infers that John will have the goal of knowing where he is. From that it infers he is trying to go somewhere, and that going somewhere is often instrumental to doing something there. From (4b) PAM infers that John wanted to be near the farmer, because he wanted to use the farmer for some purpose. Finally (4c) is processed. It is recognized that asking is a plan for knowing, and since it is known that John has the goal of knowing where he is, there is a match, and (4c) is explained. As a side effect of matching, the three pronouns in (4c) are disambiguated. Besides resolving the pronouns, the two key inferences are that John has the goal of finding out where he is, and that asking the farmer is a plan to achieve that goal.

In FAUSTUS, we can arrive at the same interpretation of the story by a very different method. (4a) does not generate any expectations, as it would in PAM, and FAUSTUS cannot find a connection between (4a) and (4b), although it does resolve the pronominal reference, because John is the only possible candidate. Finally, in (4c), FAUSTUS makes the two main inferences. The program recognizes that being near the farmer is related to asking him a question by a precondition relation (and resolves the pronominal references while making this connection). FAUSTUS could find this connection because both the asking and the being-near are explicit inputs. The other connection is a little trickier. The goal of knowing where one is was not an explicit input, but "where he was" is part of (4c), and there is a collision between paths starting from the representation of that phrase and another path starting from the asking that leads to the creation of the plan-for between John's asking where he is and his hypothetical knowing where he is.

The important conclusion, as far as FAUSTUS is concerned, is that both script- and goal-based processing can be reproduced by a system that has no explicit processing mechanism aimed at one type of story or another, but just looks for connections in the input as they relate to what is known in memory. For both scripts and goals, this involves defining situations largely in terms of their causal structure.

7. Coherence Relation Based Inferences

In this section we turn to inferences based on coherence relations, as exemplified by this example proposed by Kay and Fillmore [Kay, 1981]:

(5) A hiker bought a pair of boots from a cobbler.

From the definition of buying one could infer that the hiker now owns the boots that previously belonged to the cobbler and the cobbler now has some money that previously belonged to the hiker. However, a more complete understanding of (5) should include the inference that the transaction probably took place in the cobbler's store, and that the hiker will probably use the boots in his avocation, rather than, say, give them as a gift to his sister. The first of these can be derived from concretion inferences once we have described what goes on at a shoe store. The problem is that we want to describe this in a neutral manner - to describe not "buying at a shoe store," which would be useless for "selling at a shoe store" or "paying for goods at a shoe store," but rather the general "shoe store transaction." This is done by using the commercial-event absolute, which dominates store-transaction on the one hand, and buying, selling and paying on the other. Each of these last three is also dominated by action. Assertions are made to indicate that the buyer of buying is both the actor of the action and the merchant of the commercial-event. The next step is to define shoe-store-transaction as a kind of store-transaction where the merchandise is constrained to be shoes. With that done, we get the following:

The Cobbler and the Hiker
[1] A cobbler sold a pair of boots to a hiker.
Rep: (SELLING (ACTOR = A COBBLER) (PATIENT = A BOOT) (RECIPIENT = A HIKER))
Inferring: the SELLING is a SHOE-STORE-TRANSACTION.
This is a CONCRETION inference.
Inferring: there is a WALKING such that
    it is the PURPOSE of the BOOT and
    the HIKER is the OBJECT-MOVED of it.
This is a DOUBLE-ELABORATION inference.

The program concludes that a selling involving shoes is a shoe store transaction, and although it was not printed, this means that it takes place in a shoe store, and the seller is an employee of the store. The second inference is based on a collision at the concept walking. The purpose of boots is walking, and the walking is to be done by the hiker, because that's what they do. Note that the representation is not sophisticated enough to distinguish between actual events and potential future events like this one.

8. Unexpected Inferences

One hallmark of an AI program is to generate output that was not expected by the program's developer. The following text shows an example of this:

The President
[1] The president discussed Nicaragua.
Rep: (DISCUSSING (ACTOR = THE PRESIDENT) (CONTENT = NICARAGUA))
[2] He spoke for an hour.
Rep: (TALKING (ACTOR = HE) (DURATION = AN HOUR))
Inferring: 'HE' must be a PERSON, because it is the TALKER
This is a RELATION-CONSTRAINT inference.
Inferring: 'HE' refers to the PRESIDENT.
This is a REFERENCE inference.
Inferring: the NICARAGUA is a COUNTRY such that
    it is the HABITAT of 'HE' and
    it is the COUNTRY of the PRESIDENT.
This is a DOUBLE-ELABORATION inference.
Inferring: the TALKING refers to the DISCUSSING.
This is a REFERENCE inference.

This example was meant to illustrate action/action co-reference. The talking in the second sentence refers to the same event as the discussing in the first sentence, but neither event is explicitly marked as definite or indefinite. FAUSTUS is able to make the inference that the two actions are co-referential, using the same mechanism that works for pronouns. The idea of treating actions under a theory of reference is discussed in [Lockman and Klappholz, 1980]. FAUSTUS correctly finds the coreference between the two actions, and infers that 'he' refers to the president.

But FAUSTUS also infers that Nicaragua is the president's home or habitat and is the country of his constituency. This makes a certain amount of sense, since presidents must have such things, and Nicaragua is the only country mentioned. Of course, this was unexpected; we interpret the text as referring to the president of the United States because we are living in the U.S., and our president is a salient figure. Given FAUSTUS's lack of context, the inference is quite reasonable.

9. Conclusion

We have shown that a general marker-passing algorithm with a small number of inference classes can process a wide variety of texts. FAUSTUS shifts the complexity from the algorithm to the knowledge base to handle examples that other systems could do only by introducing specialized algorithms.

References

[Brachman and Schmolze, 1985] Ronald J. Brachman and James G. Schmolze, An overview of the KL-ONE knowledge representation system, Cognitive Science 9, 2 (1985), 171-216.
[Charniak, 1972] Eugene Charniak, Toward a Model of Children's Story Comprehension, AI Tech. Rep. 266, MIT AI Lab, Cambridge, MA, 1972.
[Charniak, 1978] Eugene Charniak, On the use of framed knowledge in language comprehension, Artificial Intelligence 11, 3 (1978), 225-266.
[Charniak, 1986] Eugene Charniak, A neat theory of marker passing, Proceedings of the Fifth National Conference on Artificial Intelligence, 1986, 584-588.
[Cullingford, 1978] R. E. Cullingford, Script Application: Computer understanding of newspaper stories, Research Report #116, Yale University Computer Science Dept., 1978.
[Fahlman, 1979] Scott E. Fahlman, NETL: A System for Representing and Using Real-World Knowledge, MIT Press, Cambridge, 1979.
[Granger, 1980] Richard H. Granger, Adaptive Understanding: Correcting Erroneous Inferences, Research Report 171, Yale Univ. Dept. of Computer Science, New Haven, CT, 1980.
[Granger, Eiselt and Holbrook, 1984] Richard H. Granger, Kurt P. Eiselt and Jennifer K. Holbrook, Parsing with parallelism: a spreading-activation model of inference processing during text understanding, Technical Report #228, Dept. of Information and Computer Science, UC Irvine, 1984.
[Kay, 1981] Paul Kay, Three Properties of the Ideal Reader, Berkeley Cognitive Science Program, 1981.
[Lockman and Klappholz, 1980] Abe Lockman and A. David Klappholz, Toward a procedural model of contextual reference resolution, Discourse Processes 3 (1980), 25-71.
[Norvig, 1983] Peter Norvig, Frame Activated Inferences in a Story Understanding Program, Proceedings of the 8th Int. Joint Conf. on AI, Karlsruhe, West Germany, 1983.
[Norvig, 1986] Peter Norvig, A Unified Theory of Inference for Text Understanding, Ph.D. Thesis, University of California, Berkeley, 1986.
[Quillian, 1969] M. Ross Quillian, The teachable language comprehender: A simulation program and theory of language, Communications of the ACM, 1969, 459-476.
[Schank et al., 1973] Roger C. Schank, Neil Goldman, Charles Rieger and Christopher K. Riesbeck, MARGIE: Memory, analysis, response generation, and inference in English, 3rd Int. Joint Conf. on AI, 1973, 255-261.
[Wilensky, 1978] Robert W. Wilensky, Understanding Goal-based Stories, Yale University Computer Science Research Report, New Haven, CT, 1978.
[Wilensky, 1986] Robert Wilensky, Some Problems and Proposals for Knowledge Representation, Report No. UCB/Computer Science Dept. 86/294, Computer Science Division, UC Berkeley, 1986.
