review articles

DOI:10.1145/2701413 AI has seen great advances of many kinds recently, but there is one critical area where progress has been extremely slow: ordinary commonsense.

BY ERNEST DAVIS AND GARY MARCUS Commonsense Reasoning and Commonsense Knowledge in

WHO IS TALLER, Prince William or his baby son Prince key insights

George? Can you make a salad out of a polyester shirt? ˽˽ To achieve human-level performance in domains such as natural language If you stick a pin into a carrot, does it make a hole processing, vision, and robotics, basic knowledge of the commonsense world— in the carrot or in the pin? These types of questions time, space, physical interactions, people, may seem silly, but many intelligent tasks, such as and so on—will be necessary. understanding texts, computer vision, planning, and ˽˽ Although a few forms of commonsense reasoning, such as taxonomic reasoning scientific reasoning require the same kinds of real- and temporal reasoning are well world knowledge and reasoning abilities. For instance, understood, progress has been slow. ˽˽ Extant techniques for implementing if you see a six-foot-tall person holding a two-foot-tall commonsense include logical analysis, handcrafting large knowledge bases, person in his arms, and you are told they are father Web mining, and crowdsourcing. Each of these is valuable, but none by itself is a and son, you do not have to ask which is which. If you full solution.

need to make a salad for dinner and are out of lettuce, ˽˽ Intelligent machines need not replicate you do not waste time considering improvising by human cognition directly, but a better understanding of human commonsense taking a shirt of the closet and cutting might be a good place to start.

92 COMMUNICATIONS OF THE ACM | SEPTEMBER 2015 | VOL. 58 | NO. 9 it up. If you read the text, “I stuck a pin In this article, we argue that common- that are comparatively easy to acquire, a in a carrot; when I pulled the pin out, it sense reasoning is important in many AI substantial fraction can only be resolved had a hole,” you need not consider the tasks, from text understanding to com- using a rich understanding of the world. possibility “it” refers to the pin. puter vision, planning and reasoning, A well-known example from Terry Wino- To take another example, con- and discuss four specific problems grad48 is the pair of sentences “The city sider what happens when we watch where substantial progress has been council refused the demonstrators a per- a movie, putting together infor- made. We consider why the problem mit because they feared violence,” vs.“… mation about the motivations of in its general form is so difficult and because they advocated violence.” To de- fictional characters we have met why progress has been so slow, and termine that “they” in the first sentence only moments before. Anyone survey various techniques that have refers to the council if the verb is “feared,” who has seen the unforgettable been attempted. but refers to the demonstrators if the horse’s head scene in The Godfather verb is “advocated” demands knowledge immediately realizes what is going on. Commonsense in Intelligent Tasks about the characteristic relations of city It is not just it is unusual to see a sev- The importance of real-world knowl- councils and demonstrators to violence; ered horse head, it is clear Tom Hagen edge for natural language processing, no purely linguistic clue suffices.a is sending Jack Woltz a message—if and in particular for disambiguation of I can decapitate your horse, I can de- all kinds, was discussed as early as 1960, 3 a Such pairs of sentences are known as “Wino- capitate you; cooperate, or else. For by Bar-Hillel, in the context of machine grad schemas” after this example; a collection now, such inferences lie far beyond translation. Although some ambigui- of many such examples can be found at http:// 32 ILLUSTRATION BY PETER CROWTHER ASSOCIATES BY ILLUSTRATION anything in artificial intelligence. ties can be resolved using simple rules cs.nyu.edu/faculty/davise/papers/WS.html.

SEPTEMBER 2015 | VOL. 58 | NO. 9 | COMMUNICATIONS OF THE ACM 93 review articles

Machine translation likewise often tasks can be carried out purely in terms matches the partial view of the chair involves problems of ambiguity that of manipulating individual words or on the near side of the table. can only be resolved by achieving an short phrases, without attempting any The viewer infers the existence of actual understanding of the text—and deeper understanding; commonsense objects that are not in the image at all. bringing real-world knowledge to bear. is evaded, in order to focus on short- There is a table under the yellow table- Google Translate often does a fine job term results, but it is difficult to see cloth. The scissors and other items of resolving ambiguities by using nearby how human-level understanding can hanging on the board in the back are words; for instance, in translating the be achieved without greater attention presumably supported by pegs or two sentences “The electrician is work- to commonsense. hooks. There is presumably also a hot ing” and “The telephone is working” into , the “Jeopardy”-playing water knob for the faucet occluded by German, it correctly translates “work- program, is an exception to the above the dish rack. The viewer also infers how ing” as meaning “laboring,” in the first rule only to a small degree. As de- the objects can be used (sometimes sentence and as meaning “functioning scribed in Kalyanpur,27 commonsense called their “affordances”); for example, correctly” in the second, because in the knowledge and reasoning, particular- the cabinets and shelves can be opened corpus of texts Google has seen, the Ger- ly taxonomic reasoning, geographic by pulling on the handles. (Cabinets, man words for “electrician” and “labor- reasoning, and temporal reasoning, which rotate on joints, have the handle ing” are often found close together, as played some role in Watson’s opera- on one side; shelves, which pull out are the German words for “telephone” tions but only a quite limited one, and straight, have the handle in the center.) and “function correctly.”b However, if they made only a small contribution to Movies would prove even more dif- you give it the sentences “The electrician Watson’s success. The key techniques ficult; few AI programs have even tried. who came to fix the telephone is work- in Watson are mostly of the same fla- The Godfather scene mentioned earlier ing,” and “The telephone on the desk is vor as those used in programs like is one example, but almost any movie working,” interspersing several words Web search engines: there is a large contains dozens or hundreds of mo- between the critical element (for exam- collection of extremely sophisticated ments that cannot be understood ple, between electrician and working), and highly tuned rules for match- simply by matching still images to the translations of the longer sentences ing words and phrases in the ques- memorized templates. Understand- say the electrician is functioning prop- tion with snippets of Web documents ing a movie requires a viewer to make erly and the telephone is laboring (Table such as ; for reformulating numerous inferences about the inten- 1). A statistical proxy for commonsense the snippets as an answer in proper tions of characters, the nature of physi- that worked in the simple case fails in form; and for evaluating the quality of cal objects, and so forth. In the current the more complex case. proposed possible answers. There is state of the art, it is not feasible even to Almost without exception, current no evidence that Watson is anything attempt to build a program that will be computer programs to carry out lan- like a general-purpose solution to the able to do this reasoning; the most that guage tasks succeed to the extent the commonsense problem. can be done is to track characters and Computer vision. Similar issues identify basic actions like standing up, arise in computer vision. Consider the sitting down, and opening a door.4 b Google Translate is a moving target; this par- ticular example was carried out on 6/9/2015, photograph of Julia Child’s kitchen Robotic manipulation. The need but translations of individual sentences (Figure 1): Many of the objects that for commonsense reasoning in au- change rapidly—not always for the better on are small or partially seen, such as the tonomous robots working in an un- individual sentences. Indeed, the same query metal bowls in the shelf on the left, controlled environment is self-evident, given minutes apart can give different results. most conspicuously in the need to have Changing the target language, or making the cold water knob for the faucet, the seemingly inconsequential changes to the round metal knobs on the cabinets, the robot react to unanticipated events sentence, can also change how a given ambi- the dishwasher, and the chairs at the appropriately. If a guest asks a waiter- guity is resolved, for no discernible reason. table seen from the side, are only rec- robot for a glass of wine at a party, and Our broader point here is not to dissect Google ognizable in context; the isolated im- the robot sees the glass he is picked up Translate per se, but to note it is unrealistic to expect fully reliable disambiguation in the ab- age would be difficult to identify. The is cracked, or has a dead cockroach at sence of a deep understanding of the text and top of the chair on the far side of the the bottom, the robot should not simply relevant domain knowledge. table is only identifiable because it pour the wine into the glass and serve it. If a cat runs in front of a house-clean- Table 1. Lexical ambiguity and Google Translate. We have highlighted the translation of ing robot, the robot should neither run the word “working.” The German word “arbeitet” means “labors;” “funktioniert” means “functions correctly.” it over nor sweep it up nor put it away on a shelf. These things seem obvious, but ensuring a robot avoids mistakes of this English original Google translation kind is very challenging. The electrician is working. Der Electriker arbeitet. The electrician that came to fix Der Elektriker, die auf das Telefon zu Successes in Automated the telephone is working. beheben kam funktioniert. Commonsense Reasoning The telephone is working. Das Telefon funktioniert. Substantial progress in automated The telephone on the desk is working. Das Telefon auf dem Schreibtisch arbeitet. commonsense reasoning has been made in four areas: reasoning about

94 COMMUNICATIONS OF THE ACM | SEPTEMBER 2015 | VOL. 58 | NO. 9 review articles taxonomic categories, reasoning WithHeresy, and so on. These over- Incident are uncertain. about time, reasoning about actions lap, and it is not clear which of these Simple taxonomic structures such and change, and the sign calculus. In are best viewed as taxonomic catego- as those illustrated here are often each of these areas there exists a well- ries and which are better viewed as used in AI programs. For example, understood theory that can account properties. In taxonomizing more ab- WordNet34 is a widely used resource for some broad range of common- stract categories, choosing and delim- that includes a taxonomy whose el- sense inferences. iting categories becomes more prob- ements are meanings of English Taxonomic reasoning. A taxonomy lematic; for instance, in constructing words. As we will discuss later, Web is a collection of categories and in- a taxonomy for a theory of narrative, mining systems that collect common- dividuals, and the relations between the membership, relations, and defi- sense knowledge from Web docu- them. (Taxonomies are also known as nitions of categories like Event, Ac- ments tend to be largely focused on semantic networks.) For instance, Fig- tion, Process, Development, and taxonomic relations, and more suc- ure 2 shows a taxonomy of a few catego- ries of animals and individuals. Figure 1. Julia Child’s kitchen. Photograph by Matthew Bisanz. There are three basic relations: ˲˲ An individual is an instance of a cate- gory. For instance, the individual Lassie is an instance of the category Dog. ˲˲ One category is a subset of another. For instance Dog is a subset of Mammal. ˲˲ Two categories are disjoint. For in- stance Dog is disjoint from Cat. Figure 2 does not indicate the dis- jointness relations. Categories can also be tagged with properties. For instance, Mammal is tagged as Furry. One form of inference in a taxonomy is transitivity. Since Lassie is an in- stance of Dog and Dog is a subset of Mammal, it follows that Lassie is an instance of Mammal. Another form of inference is inheritance. Since Lassie is an instance of Dog, which is a subset of Mammal and Mammal is marked with property Furry, it follows that Dog and Lassie have property Furry. A variant of this is default inheritance; a category can be marked with a charac- Figure 2. Taxonomy. teristic but not universal property, and a subcategory or instance will inherit Animal the property unless it is specifically canceled. For instance, Bird has the default property CanFly, which is in- Mammal Furry Bird CanFly herited by Robin but not by Penguin. The standard taxonomy of the ani- mal kingdom is particularly simple in structure. The categories are generally Dog Cat Robin Penguin sharply demarcated. The taxonomy is tree-structured, meaning given any two categories, either they are disjoint or one is a subcategory of the other. Lassie Tweety Other taxonomies are less straight- forward. For instance, in a semantic Node Types Link Types network for categories of people, the Category Subcategory individual GalileoGalilei is si- Individual Instance multaneously a Physicist, an As- Property PropertyOf tronomer, a ProfessorOfMath- Cancel ematics, a WriterInItalian, a NativeOfPisa, a PersonCharged-

SEPTEMBER 2015 | VOL. 58 | NO. 9 | COMMUNICATIONS OF THE ACM 95 review articles

largely understood. Moreover, a great deal is known about extensions to these domains, including: Understanding a Classic ˲˲ Continuous domains, where change is continuous. Scene from The Godfather ˲˲ Simultaneous events. Hagen foresaw, while he was planning the operation, that ˲˲ Probabilistic events, whose out- Woltz would realize that come depends partly on chance. Hagen arranged for the placing of the head in Woltz’s bed ˲˲ Multiple agent domains, where in order to make Woltz realize that agents may be cooperative, indepen- Hagen could easily arrange to have him killed dent, or antagonistic. if he does not accede to Hagen’s demands. ˲˲ Imperfect knowledge domains, where actions can be carried out with the purpose of gathering information, cessful in gathering taxonomic rela- plex and their interpretation is con- and (in the multiagent case) where co- tions than in gathering other kinds of text dependent. Temporal reasoning operative agents must communicate knowledge. Many specialized taxono- was used to some extent in the Wat- information. mies have been developed in domains son “Jeopardy!”-playing program to ˲˲ Decision theory: Comparing dif- such as medicine40 and genomics.21 exclude answers that would be a mis- ferent courses of action in terms of the More broadly, the en- match in terms of date.27 However, expected utility. terprise is largely aimed at developing many important temporal relations The primary successful applications architectures for large-scale taxono- are not explicitly stated in texts, they of these kinds of theories has been to mies for Web applications. are inferred; and the process of infer- high-level planning,42 and to some ex- A number of sophisticated exten- ence can be difficult. Basic tasks like tent to robotic planning, for example, sions of the basic inheritance archi- assigning timestamps to events in Ferrein et al.16 tecture described here have also been news stories cannot be currently done The situation calculus uses a developed. Perhaps the most powerful with any high degree of accuracy.47 branching model of time, because it and widely used of these is description Action and change. Another area was primarily developed to charac- logic.2 Description logics provide tracta- of commonsense reasoning that is terize planning, in which one must ble constructs for describing concepts well understood is the theory of ac- consider alternative possible actions. and the relations between concepts, tion, events, and change. In particular, However, it does not work well for nar- grounded in a well-defined logical for- there are very well established repre- rative interpretation, since it treats malism. They have been applied exten- sentational and reasoning techniques events as atomic and requires the or- sively in practice, most notably in the for domains that satisfy the following der of events be known. For narrative semantic Web ontology language OWL. constraints:42 interpretation, the event calculus37 Temporal reasoning. Represent- ˲˲ Events are atomic. That is, one is more suitable. The event calculus ing knowledge and automating rea- event occurs at a time, and the reason- can express many of the temporal re- soning about times, durations, and er need only consider the state of the lations that arise in narratives; how- time intervals is a largely solved prob- world at the beginning and the end of ever, only limited success has been lem.17 For instance, if one knows that the event, not the intermediate states obtained so far in applying it in the in- Mozart was born earlier and died while the event is in progress. terpretation of natural language texts. younger than Beethoven, one can ˲˲ Every change in the world is the re- Moreover, since it uses a linear model infer that Mozart died earlier than sult of an event. of time, it is not suitable for planning. Beethoven. If one knows the Battle ˲˲ Events are deterministic; that is, Many important issues remain un- of Trenton occurred during the Revo- the state of the world at the end of the solved, however, such as the problem lutionary War, the Battle of Gettys- event is fully determined by the state of integrating action descriptions at burg occurred during the Civil War, of the world at the beginning plus the different levels of abstraction. The pro- and the Revolutionary War was over specification of the event. cess of cooking dinner, for instance, before the Civil War started, then ˲˲ Single actor. There is only a single may involve such actions as “Getting to one can infer the Battle of Trenton actor, and the only events are either his know my significant other’s parents,” occurred before the Battle of Gettys- actions or exogenous events in the ex- “Cooking dinner for four,” “Cooking burg. The inferences involved here ternal environment. pasta primavera,” “Chopping a zuc- in almost all cases reduce to solving ˲˲ Perfect knowledge. The entire rel- chini,” “Cutting once through the systems of linear inequalities, usually evant state of the world at the start, zucchini”, and “With the right hand, small and of a very simple form. and all exogenous events are known or grasping the knife by the handle, Integrating such reasoning with can be calculated. blade downward, and lowering it at specific applications, such as natu- For domains that satisfy these con- about one foot per second through the ral language interpretation, has been straints, the problem of representation center of the zucchini, while, with the much more problematic. Natural lan- and important forms of reasoning, left hand, grasping the zucchini and guage expressions for time are com- such as prediction and planning, are holding it against the cutting board.”

96 COMMUNICATIONS OF THE ACM | SEPTEMBER 2015 | VOL. 58 | NO. 9 review articles

Reasoning about how these differ- make any significant use of automat- ent kinds of actions interrelate—for ed commonsense reasoning. Systems example, if you manage to slice off a like Google Translate use statistical finger while cutting the zucchini, your information culled from large datas- prospective parents-in-law may not be ets as a sort of distant proxy for com- impressed—is substantially unsolved. Current computer monsense knowledge, but beyond Qualitative reasoning. One type of programs to carry that sort of crude proxy, common- commonsense reasoning that has been sense reasoning is largely absent. In analyzed with particular success is out language tasks large part, that is because nobody has known as qualitative reasoning. In its succeed to the yet come close to producing a satisfac- simplest form, qualitative reasoning is tory commonsense reasoner. There about the direction of change in interre- extent the tasks are five major obstacles. lated quantities. If the price of an object First, many of the domains involved goes up then (usually, other things being can be carried out in commonsense reasoning are only equal) the number sold will go down. If purely in terms partially understood or virtually un- the temperature of gas in a closed con- touched. We are far from a complete tainer goes up, then the pressure will of manipulating understanding of domains such as go up. If an ecosystem contains foxes individual words physical processes, knowledge and and rabbits and the number of foxes communication, plans and goals, and decreases, then the death rate of the rab- or short phrases, interpersonal interactions. In domains bits will decrease (in the short term). without attempting such as the commonsense understand- An early version of this theory was ing of biology, of social institutions, formulated by Johan de Kleer11 for any deeper or of other aspects of folk psychology, analyzing an object moving on a roller understanding. little work of any kind has been done. coaster. Later more sophisticated forms Second, situations that seem were developed in parallel by de Kleer straightforward can turn out, on ex- and Brown12 for analyzing electronic cir- amination, to have considerable cuits; by Forbus18 for analyzing varieties logical complexity. For example, of physical processes; and by Kuipers30 consider the horse’s head scene in as a mathematical formalism. The Godfather. The box on the previ- This theory has been applied in ous page illustrates the viewer’s un- many domains, from physics to engi- derstanding of the scene. We have a neering, biology, ecology, and engi- statement with embeddings of three neering. It has also served as the basis mental states (“foresaw,” “realize,” for a number of practical programs, in- “realize”), a teleological connec- cluding text understanding;29 analogi- tion (“in order”), two hypotheticals cal mapping and geometric reason- (“could arrange” and “does not ac- ing;33 failure analysis in automotive cede”) and a highly complex tempo- electrical systems;41 and generating ral/causal structure. control strategies for printing.20 Some aspects of these kinds of re- For problems within the scope of the lations have been extensively studied representation, the reasoning mecha- and are well understood. However, nism works well. However, there are there are many aspects of these rela- many problems in physical reasoning, tions where we do not know, even in particularly those involving substantial principle, how they can be represent- geometric reasoning, that cannot be ed in a form usable by computers or represented in this way, and therefore how to characterize correct reason- lie outside the scope of this reasoning ing about them. For example, there mechanism. For example, you want are theories of knowledge that do a to be able to reason a basketball will good job of representing what differ- roll smoothly in any direction, where- ent players know about the deal in a as a football can roll smoothly if its poker game, and what each player long axis is horizontal but cannot roll knows about what the other players smoothly end-over-end. This involves know, because one can reasonably reasoning about the interactions of all idealize all the players as being able three spatial dimensions together. to completely think through the situ- ation. However, if you want to model Challenges in Automating a teacher thinking about what his Commonsense Reasoning students do not understand, and how As of 2014, few commercial systems they can be made to understand, then

SEPTEMBER 2015 | VOL. 58 | NO. 9 | COMMUNICATIONS OF THE ACM 97 review articles

that is a much more difficult problem, easily, because a comparatively small and one for which we currently do not number of common categories in- have a workable solution. Moreover, clude most of the instances. On the even when the problems of represen- other hand, it is often very difficult to tation and inference have been solved attain high quality results, because a in principle, the problem of carrying There is significant fraction of the problems out reasoning efficiently remains. no evidence that arise correspond to very infre- Third, commonsense reasoning quent categories. The result is the pat- almost always involves plausible rea- that IBM’s Watson tern of progress often seen in AI: Rapid soning; that is, coming to conclu- is anything like progress at the start of research up to sions that are reasonable given what a mediocre level, followed by slower is known, but not guaranteed to be a general-purpose and slower improvement. (Of course, correct. Plausible reasoning has been for any given application, partial suc- extensively studied for many years,23 solution to cess may be acceptable or indeed ex- and many theories have been devel- the commonsense tremely valuable; and high quality per- oped, including probabilistic reason- formance may be unnecessary.) ing,38 belief revision,39 and default problem. We conjecture that long-tail phe- reasoning or non-monotonic logic.5 nomena are pervasive in common- However, overall we do not seem to sense reasoning, both in terms of the be very close to a comprehensive solu- frequency with which a fact appears tion. Plausible reasoning takes many in knowledge sources (for example, different forms, including using un- texts) and in terms of the frequency reliable data; using rules whose con- with which it is needed for a reason- clusions are likely but not certain; ing task. For instance, as discussed, a default assumptions; assuming one’s robot waiter needs to realize it should information is complete; reasoning not serve a drink in a glass with a dead from missing information; reason- cockroach; but how often is that men- ing from similar cases; reasoning tioned in any text, and how often will from typical cases; and others. How the robot need to know that fact?d to do all these forms of reasoning Fifth, in formulating knowledge it acceptably well in all commonsense is often difficult to discern the proper situations and how to integrate these level of abstraction. Recall the exam- different kinds of reasoning are very ple of sticking a pin into a carrot and much unsolved problems. the task of reasoning that this action Fourth, in many domains, a small may well create a hole in the carrot but number of examples are highly fre- not create a hole in the pin. Before it quent, while there is a “long tail” of a encounters this particular example, vast number of highly infrequent ex- an automated reasoner presumably amples. In natural language text, for would not specifically know a fact example, some trigrams (for example, specific to pins and carrots; at best it “of the year”) are very frequent, but might know a more general rulee or many other possible trigrams, such theory about creating holes by stick- as “moldy blueberry soda” or “gym- ing sharp objects into other objects. nasts writing novels” are immediately The question is, how broadly should understandable, yet vanishingly rare.c such rules should be formulated? Long tail phenomena also appear in Should such roles cover driving nails many other corpora, such as labeled into wood, driving staples into paper, sets of images.43 driving a spade into the ground, push- The effect of long-tail distributions on AI research can be pernicious. On d Presumably an intelligent robot would not the one hand, promising preliminary necessarily know that specific fact in advance results for a given task can be gotten at all, but rather would infer it when necessary. However, accomplishing that involves describ- ing the knowledge that supports the inference c Google reports no instances of either of these and building the powerful quoted phrases as of June 9, 2015. These are that carries out the inference. not difficult to find in natural text; for example, e Positing that the reasoner is not using rules one recent book review we examined con- at all, but instead is using an instance-based tained at least eight trigrams (not containing theory does not eliminate the problem. proper nouns) with zero Google hits other than Rather, it changes the problem to the ques- the article itself. A systematic study of n-gram tion of how to formulate the features to be distribution can be found in Allison et al.1 used for comparison.

98 COMMUNICATIONS OF THE ACM | SEPTEMBER 2015 | VOL. 58 | NO. 9 review articles ing your finger through a knitted mit- We consider these in turn. are drawing on is largely unverbalized ten, or putting a pin into water (which Research in commonsense reason- and the reasoning processes largely creates a hole that is immediately ing addresses a number of different unavailable to introspection. An auto- filled in)? Or must there be individual objectives: mated reasoner will have to have com- rules for each domain? Nobody has ˲˲ Reasoning architecture. The devel- parable abilities. yet presented a general solution to opment of general-purpose data struc- ˲˲ Breadth. Attaining powerful com- this problem. tures for encoding knowledge and al- monsense reasoning will require a A final reason for the slow progress gorithms and techniques for carrying large body of knowledge. in automating commonsense knowl- out reasoning. (A closely related issue ˲˲ Independence of experts. Paying ex- edge is both methodological10 and so- is the representation of the meaning of perts to hand-code a large knowledge ciological. Piecemeal commonsense natural language sentences.45) base is slow and expensive. Assembling knowledge (for example, specific ˲˲ Plausible inference; drawing provi- the either automati- facts) is relatively easy to acquire, sional or uncertain conclusions. cally or by drawing on the knowledge but often of little use, because of the ˲˲ Range of reasoning modes. Incor- of non-experts is much more efficient. long-tail phenomenon discussed porating a variety of different modes ˲˲ Applications. To be useful, the com- previously. Consequently, there may of inference, such as explanation, gen- monsense reasoner must serve the not be much value in being able to do eralization, abstraction, analogy, and needs of applications and must inter- a little commonsense reasoning. The simulation. face with them smoothly. payoff in a complete commonsense ˲˲ Painstaking analysis of fundamen- ˲˲ Cognitive modeling. Theories of reasoner would be large, especially tal domains. In doing commonsense commonsense in a domain like robotics, but that reasoning, people are able to do com- accurately describe commonsense rea- payoff may only be realized once a plex reasoning about basic domains soning in people. large fraction of the project has been such as time, space, naïve physics, and The different approaches to auto- completed. By contrast, the natural naïve psychology. The knowledge they mating commonsense reasoning have incentives in software development favor projects where there are payoffs Figure 3. Taxonomy of approaches to commonsense reasoning. at every stage; projects that require huge initial investments are much Commonsense less appealing. Reasoning

Approaches and Techniques As with most areas of AI, the study of Web Mining Knowledge based Crowd Sourcing commonsense reasoning is largely divided into knowledge-based ap- NELL, KnowItAll ConceptNet, proaches and approaches based on OpenMind machine learning over large data corpora (almost always text corpora) Mathematical Informal Large-scale with only limited interaction be- Situation calculus, Scripts, tween the two kinds of approaches. Region connection calculus, Frames, There are also crowdsourcing ap- Qualitative process theory Case-based reasoning proaches, which attempt to con- struct a knowledge base by somehow combining the collective knowledge and participation of many non-ex- Table 2. Approaches and typical objectives. pert people. Knowledge-based ap- proaches can in turn be divided into Web Crowd approaches based on mathematical Math-based Informal Large-scale mining sourcing logic or some other mathematical Architecture Substantial Little Substantial Moderate Little formalism; informal approaches, Plausible reasoning Substantial Moderate Substantial Little Little antipathetic to mathematical formal- Range of reasoning Moderate Substantial Moderate Little Little ism, and sometimes based on theo- modes ries from cognitive psychology; and Painstaking Substantial Little Moderate Little Little large-scale approaches, which may fundamentals be more or less mathematical or in- Breadth Little Moderate Substantial Substantial Substantial formal, but in any case are chiefly tar- Independence of Little Little Little Substantial Substantial experts geted at collecting a lot of knowledge Concern with Moderate Substantial Substantial Moderate Moderate (Figure 3). A particularly successful applications form of mathematically grounded Cognitive modeling Little Substantial Little Little Moderate commonsense reasoning is qualita- tive reasoning, described previously.

SEPTEMBER 2015 | VOL. 58 | NO. 9 | COMMUNICATIONS OF THE ACM 99 review articles

often emphasized different objectives, known as the “Cheating Husbands” The most important contribu- as sketched in Table 2. problem), a well-known brain teaser in tion of the informal approach has Knowledge-based approaches. In the theory of knowledge, has been ana- been the analysis of a broad class knowledge-based approaches, experts lyzed in a dozen different variants in a of types of inference. For instance, carefully analyze the characteristics of half-dozen different epistemic logics; Minsky’s frame paper35 discusses the inferences needed to do reasoning but we do not know how to represent an important form of inference, in in a particular domain or for a particular the complex interpersonal interactions which a particular complex indi- task and the knowledge those inferences between Hagen and Woltz in the horse’s vidual, such as a particular house, is depend on. They handcraft representa- head scene, let alone how to automate matched against a known structure, tions that are adequate to express this reasoning about them. such as the known characteristics of knowledge and inference engines that Unlike the other approaches to houses in general. Schank’s theory are capable of carrying out the reasoning. commonsense reasoning, much of the of scripts44 addresses this in the im- Mathematically grounded approach- work in this approach is purely theoret- portant special case of structured es. Of the four successes of common- ical; the end result is a published paper collections of events. Likewise, sense reasoning enumerated in this ar- rather than an implemented program. reasoning by analogy22,26 and case- ticle, all but taxonomic reasoning largely Theoretical work of this kind is evalu- based reasoning28 have been much derive from theories that are grounded ated, either in terms of meta-theorems more extensively studied in informal in mathematics or mathematical logic. (for example, soundness, complete- frameworks than in mathematically (Taxonomic representations are too ness, computational complexity), or grounded frameworks. ubiquitous to be associated with any in terms of interesting examples of It should be observed that in com- single approach.) However, many di- commonsense inferences the theory puter programming, informal ap- rections of work within this approach supports. These criteria are often tech- proaches are very common. Many are marked by a large body of theory nically demanding; however, their rela- large and successful programs—text and a disappointing paucity of practi- tion to the advancement of the state of editors, operating systems shells, cal applications or even of convincing the art is almost always indirect, and all and so on—are not based on any potential applications. The work on too often nonexistent. overarching mathematical or statis- qualitative spatial reasoning6 illustrates Overall, the work is also limited tical model; they are written ad hoc this tendency vividly. There has been ac- in terms of the scope of domains and by the seat of the pants. This is cer- tive work in this area for more than 20 reasoning techniques that have been tainly not an inherently implausible years, and more than 1,000 research pa- considered. Again and again, research approach to AI. The major hazard of pers have been published, but very little in this paradigm has fixated on a small work in this approach is that theories of this connects to any commonsense number of examples and forms of can become very nebulous, and that reasoning problem that might ever knowledge, and generated vast collec- research can devolve into little more arise. Similar gaps between theory and tions of papers dealing with these, leav- than the collection of striking anec- practice arise in other domains as well. ing all other issues neglected. dotes and the construction of dem- The “Muddy Children” problemf (also Informal knowledge-based approaches. onstration programs that work on a In the informal knowledge-based ap- handful of examples. f Alice, Bob, and Carol are playing together. Dad proach, theories of representation and Large-scale approaches. There have says to them, “At least one of you has mud on reasoning are based substantially on been a number of attempts to con- your forehead. Alice, is your forehead muddy?” intuition and anecdotal data, and to a struct very large knowledge bases of Alice answers, “I don’t know.” Dad asks, “Bob, is your forehead muddy?” Bob answers, significant but substantially lesser ex- commonsense knowledge by hand. “I don’t know.” Carol then says, “My forehead tent on results from empirical behav- The largest of these is the CYC pro- is muddy.” Explain. ioral psychology. gram. This was initiated in 1984 by Doug Lenat, who has led the project Table 3. Facts recently learned by NELL (6/11/2015). throughout its existence. Its initial proposed methodology was to en- code the knowledge in 400 sample Fact Confidence articles in a one-volume desk en- federica_fontana is a director 91.5 cyclopedia together with all the illustrations_of_swollen_lymph_nodes is a lymph node 90.3 implicit background knowledge a lake_triangle is a lake 100.0 reader would need to understand the louis_pasteur_and_robert_koch is a scientist 99.6 articles (hence, the name).31 It was Illinois_governor_george_ryan is a politician 99.8 initially planned as a 10-year project, stephen is a person who moved to the state california 100.0 but continues to this day. In the last louis_armstrong is a musician who plays the trumpet 99.6 decade, Cycorp has released steadily cbs_early_show is a company in the economic sector of news 93.8 increasing portions of the knowledge knxv is a TV station in the city phoenix 100.0 base for public or research use. The broncos is a sports team that plays against new_york_jets 100.0 most recent public version, OpenCyc 4.0, released in June 2012 contains 239,000 concepts and 2,039,000 facts,

100 COMMUNICATIONS OF THE ACM | SEPTEMBER 2015 | VOL. 58 | NO. 9 review articles mostly taxonomic. ResearchCyc, of MTs being as deep as 50 levels in which is available to be licensed for some domains. research purposes, contains 500,000 2. There are many redundant sub- concepts and 5,000,000 facts. type relationships that make it difficult A number of successful applications to determine its taxonomical structure. of CYC to AI tasks have been reported,8 If a cat runs in front 3. Some of the MTs are almost emp- including Web query expansion and of a house-cleaning ty but difficult to discard. refinement,7 ,9 and 4. Not all the MTs follow a standard intelligence analysis.19 robot, the robot representation of knowledge.” CYC is often mentioned as a suc- should neither run They further report a large collec- cess of the knowledge-based ap- tion of usability problems including proach to AI; for instance Dennett13 it over nor sweep problems in understandability, learn- writes, “CYC is certainly the most im- ability, portability, reliability, compli- pressive AI implementation of some- it up. These things ance with standards, and interface to thing like a language of thought.” seem obvious, but other systems. More broadly, CYC has However, it is in fact very difficult for had comparatively little impact on AI an outsider to determine what has ensuring a robot research—much less than less sophis- been accomplished here. In its first avoids mistakes ticated online resources as Wikipedia 15 years, CYC published astonishingly or WordNet. little. Since about 2002, somewhat of this kind is very Web mining. During the last de- more has been published, but still challenging. cade, many projects have attempted very little, considering the size of the to use Web mining techniques to ex- project. No systematic evaluation of tract commonsense knowledge from the contents, capacities, and limita- Web documents. A few notable exam- tions of CYC has been published.g ples, of many: It is not, for example, at all clear The KnowItAll program14 collect- what fraction of CYC actually deals ed instances of categories by mining with commonsense inference, and lists in texts. For instance, if the sys- what fraction deals with specialized tem reads a document containing a applications such as medical records phrase like “pianists such as Evgeny or terrorism. It is even less clear what Kissin, Yuja Wang, and Anastasia Gro- fraction of commonsense knowledge moglasova” then the system can infer of any kind is in CYC. The objective these people are members of the cat- of representing 400 encyclopedia egory Pianist. If the system later en- articles seems to have been silently counters a text with the phrase “Yuja abandoned at a fairly early date; this Wang, Anastasia Gromoglasova, and may have been a wise decision; but Lyubov Gromoglasova,” it can infer it would be interesting to know how that Lyubov Gromoglasova may also close we are, 30 years and 239,000 be a pianist. (This technique was first concepts later, to achieving it; or, if proposed by Marti Hearst;25 hence this is not an reasonable measure, patterns like “W’s such as X,Y,Z” are what has been accomplished in terms known as “Hearst patterns.”) More of commonsense reasoning by any recently, the Probase system,50 using other measure. There are not even very similar techniques, has automatically many specific examples of common- compiled a taxonomy of 2.8 million sense reasoning carried out by CYC concepts and 12 million isA relations, that have been published. with 92% accuracy. There have been conflicting reports The Never-Ending Language Learn- about the usability of CYC from outside er (NELL)h program36 has been steadily scientists who have tried to work with accumulating a knowledge base of it. Conesa et al.8 report that CYC is poor- facts since January 2010. These in- ly organized and very difficult to use: clude relations between individuals, “The Microtheory Taxonomy (MT) taxonomic relations between catego- in ResearchCyc is not very usable for ries, and general rules about catego- several reasons: ries. As of January 2015, it has accu- 1. There are over 20,000 MTs in mulated 89 million candidate facts, of CYC with the taxonomical structure which it has high confidence in about two million. However, the facts are of g A number of organizations have done private evaluations but the results were not published. h http://rtw.ml.cmu.edu/rtw/

SEPTEMBER 2015 | VOL. 58 | NO. 9 | COMMUNICATIONS OF THE ACM 101 review articles

very uneven quality (Table 3). The tax- All of these programs are impres- Net 3.5 shown in Figure 4. Even in onomy created by NELL is much more sive—it is remarkable you can get so this small network, we see many of accurate, but it is extremely lopsided. far just relying on patterns of words, the kinds of inconsistencies most fa- As of June 9, 2015 there are 9,047 in- with almost no knowledge of larger- mously pointed out by Woods.49 Some stances of “amphibian” but zero in- scale syntax, and no knowledge at of these links always hold (for ex- stances of “poem.” all of semantics or of the relation of ample, “eat HasSubevent swallow”), Moreover, the knowledge collected these words to external reality. Still, some hold frequently (for example, in Web mining systems tends to suf- they seem unlikely to suffice for the “person Desires eat”) and some only fer from severe confusions and incon- kinds of commonsense reasoning occasionally (“person AtLocation sistencies. For example, in the Open discussed earlier. restaurant”). Some—like “cake AtLo- Information Extraction systemi,15 Crowd sourcing. Some attempts cation oven” and “cake ReceivesAc- the query “What is made of wood?” have been made to use crowd-sourc- tion eat”—cannot be true simulta- receives 258 answers (as of June 9, ing techniques to compile knowledge neously. The node “cook” is used to 2015) of which the top 20 are: “The bases of commonsense knowledge.24 mean a profession in the link “cook frame,” “the buildings,” “Furniture,” Many interesting facts can be collect- isA person” and an activity in “oven “The handle,” “Most houses,” “The ed this way, and the worst of the prob- UsedFor cook” (and in “person Ca- body,” “the table,” “Chair,” “This lems we have noted in Web mining pableOf cook”). Both cook and cake one,” “The case,” “The structure,” systems are avoided. For example, the are “UsedFor satisfy-hunger,” but in “The board,” “the pieces,” “roof,” query “What is made of wood?” gets entirely different ways. (Imagine a ro- “toy,” “all,” “the set,” and “Arrow,” mostly reasonable answers. The top 10 bot who, in a well-intentioned effort Though some of these are reason- are: “paper,” “stick,” “table,” “chair,” at satisfying the hunger of its own able (“furniture,” “chair”), some are “pencil,” “tree,” “furniture,” “house,” owner, fricassees a cook.) On a tech- hopelessly vague (“the pieces”) and “picture frame,” and “tree be plant.”k nical side, the restriction to two-place some are meaningless (“this one,” However, what one does not get is relations also limits the expressiv- “all”). The query “What is located in the analysis of fundamental domains ity in important ways. For example, wood?” gets the answers “The ceme- and the careful distinguishings of there is a link “restaurant UsedFor tery,” “The Best Western Willis,” “the different meanings and categories satisfy-hunger,” another link might cabins,” “The park,” “The Lewis and needed to support reliable reason- easily specify, “restaurant UsedFor Clark Law Review,” “this semi-wilder- ing. For example, naïve users will make-money.” But in this representa- ness camp,” “R&R Bayview,” “‘s Voice not work out systematic theories of tional system there would be no way School” [sic], and so on. Obviously, time and space; it will be difficult to specify the fact the hunger satisfac- these answers are mostly useless. A to get them to follow, systematically tion and money making have distinct more subtle error is that OIE does not and reliably, theories the system de- beneficiaries (the customers vs. the distinguish between “wood” the ma- signers have worked out. As a result, owner). All of this could be fixed post- terial (the meaning of the answers to facts get entered into the knowledge hoc by professional knowledge engi- the first query) and “wood” meaning base without the critical distinctions neers, but at enormous cost, and it is forest (the meaning of the answers to needed for reasoning. Instead, one not clear whether crowds could be ef- the second query).j winds up with a bit of mess. ficiently trained to do adequate work Consider, for example, the frag- that would avoid these troubles. i http://openie.cs.washington.edu/ ment of the crowd-sourced Concept j Thanks to Leora Morgenstern for helpful dis- Going Forward cussions. k This test was carried out in November 2014. We doubt any silver bullet will eas- ily solve all the problems of common- Figure 4. Concepts and relations in ConceptNet (from Havesi) sense reasoning. As Table 2 suggests, each of the existing approaches has

oven follow distinctive merits, hence research in all UsedFor recipe CausesDesire these directions should presumably be cook continued. In addition, we would urge UsedFor UsedFor AtLocation satisfy the following: hunger Benchmarks. There may be no single UsedFor AtLocation IsA restaurant CapableOf perfect set of benchmark problems, UsedFor CreatedBy bake but as yet there is essentially none at AtLocation IsA cake dessert HasProperty all, nor anything like an agreed-upon MotivatedByGoal Desires HasProperty ReceivesAction evaluation metric; benchmarks and swallow person survive sweet evaluation marks would serve to move Desires MotivatedByGoal CapableOf the field forward. HasSubevent Evaluating CYC. The field might Desires eat well benefit if CYC were systemati- cally described and evaluated. If CYC has solved some significant fraction

102 COMMUNICATIONS OF THE ACM | SEPTEMBER 2015 | VOL. 58 | NO. 9 review articles of commonsense reasoning, then translation of languges. Advances in Computers. F. knowledge to overcome brittleness Alt, Ed. Academic Press, New York, 1960, 91–163. and knowledge acquisition bottlenecks. AI Magazine it is critical to know that, both as a 4. Bojanowski, P., Lajugie, R., Bach, F., Laptev, I., Ponce, 6, 4 (1985), 65–85. useful tool, and as a starting point J., Schmid, C. and Sivic, J. Weakly supervised action 32. Levesque, H., Davis, E. and Morgenstern, L. labeling in videos under ordering constraints. ECCV The Winograd schema challenge. Principles of for further research. If CYC has run (2014), 628–643. Knowledge Representation and Reasoning, 2012. into difficulties, it would be useful 5. Brewka, G., Niemelli, I. and Truszczynski, M. 33. Lovett, A., Tomei, E., Forbus, K. and Usher, J. Solving Nonmonotonic reasoning. Handbook of Knowledge geometric analogy problems through two-stage to learn from the mistakes that were Representation. F. van Harmelen, V. Lifschitz and B. analogical mapping. Cognitive Science 33, 7 (2009), made. If CYC is entirely useless, then Porter, Eds. Elsevier, Amsterdam, 2008, 239–284. 1192–1231. 6. Cohn, A. and Renz, J. Qualitative spatial reasoning. 34. Miller, G. WordNet: A lexical database for English. researchers can at least stop worrying Handbook of Knowledge Representation. F. van Commun. ACM 38, 11 (Nov. 1995), 39–41. about whether they are reinventing Harmelen, V. Lifschitz and B. Porter, Eds. Elsevier, 35. Minsky, M. A framework for representing knowledge. Amsterdam, 2007, 551–597. The Psychology of Computer Vision. P. Winston, Ed. the wheel. 7. Conesa, J., Storey, V. and Sugumaran, V. Improving McGraw Hill, New York, 1975. Integration. It is important to at- Web-query processing through semantic knowledge. 36. Mitchell, T., Cohen, W., Hruschka, E., Talukdar, P., Data and 66, 1 (2008), 18–34. Betteridge, J. and Carlson, A. Never Ending Learning. tempt to combine the strengths of 8. Conesa, J., Storey, V. and Sugumaran, V. (2010). AAAI, 2015. the various approaches to AI. It would Usability of upper level ontologies: The case of 37. Mueller, E. Commonsense Reasoning. Morgan ResearchCyc. Data and Knowledge Engineering 69 Kaufmann, San Francisco, CA, 2006. be useful, for instance, to integrate a (2010), 343–356. 38. Pearl, J. Probabilistic Reasoning in Intelligent 9. Curtis, J., Matthews, G., & Baxter, D. On the Systems: Networks of Plausible Inference. Morgan careful analysis of fundamental do- effective use of Cyc in a question answering system. Kaufmann, San Mateo, CA, 1988. mains developed in a mathematically IJCAI Workshop on Knowledge and Reasoning for 39. Peppas, P. Belief revision. Handbook of Knowledge Answering Questions, 2005. Representation. F. Van Harmelen, V. Lifschitz, and B. grounded theory with the interpreta- 10. Davis, E. The naive physics perplex. AI Magazine 19, Porter, Eds. Elsevier, Amsterdam, 2008, 317–359. tion of facts accumulated by a Web 4 (1998), 51–79; http://www.aaai.org/ojs/index.php/ 40. Pisanelli, D. Ontologies in Medicine. IOS Press, aimagazine/article/view/1424/1323 Amsterdam, 2004. mining program; or to see how facts 11. de Kleer, J. Qualitative and quantitative knowledge 41. Price, C., Pugh, D., Wilson, M. and Snooke, N. (1995). gathered from Web mining can con- in classical mechanics. MIT AI Lab, 1975. The FLAME system: Automating electrical failure 12. de Kleer, J. and Brown, J. A qualitative physics mode and effects analysis (FEMA). In Proceedings of strain the development of mathemati- based on confluences.Qualitative Reasoning about the IEEE Reliability and Maintainability Symposium, cally grounded theories. Physical Systems. D. Bobrow, Ed. MIT Press, (1995), 90–95. Cambridge, MA, 1985, 7–84. 42. Reiter, R. Knowledge in Action: Logical Foundations Alternative modes of reasoning. 13. Dennett, D. Intuition Pumps and Other Tools for for Specifying and Implementing Dynamical Neat theories of reasoning have tend- Thinking. Norton, 2013. Systems. MIT Press, Cambridge, MA, 2001. 14. Etzioni, O. et al. Web-scale extraction in KnowItAll 43. Russell, B., Torralba, A., Murphy, K. and Freeman, ed to focus on essentially deductive (preliminary results). In Proceedings of the 13th W. Labelme: A database and Web-based tool for reasoning (including deduction us- International Conference on World Wide Web, image annotation. Intern. J. Computer Vision 77, (2004), 100–110. Retrieved from http://dl.acm.org/ 1–3 (2008), 157–173. ing default rules). Large-scale knowl- citation.cfm?id=988687 44. Schank, R. and Abelson, R. Scripts, Plans, Goals, and edge bases and Web mining have 15. Etizoni, O., Fader, A., Christensen, J., Soderland, Understanding. Lawrence Erlbaum, Hillsdale, NJ, S. and Mausam. Open information extraction: The 1977. focused on taxonomic and statisti- second generation. IJCAI, 2011, 3–10). 45. Schubert, L. Semantic Representation. AAAI, 2015. cal reasoning. However, common- 16. Ferrein, A., Fritz, C., and Lakemeyer, G. Using Golog 46. Shepard, B. et al. A knowledge-base approach for deliberation and team coordination in robotic to network security: Applying Cyc in the domain sense reasoning involves many dif- soccer. Kuntzliche Intelligenz 19, 1 (2005), 24–30. of network risk assessment. Association for the ferent forms of reasoning including 17. Fisher, M. Temporal representation and reasoning. Advancement of Artificial Intelligence, (2005), Handbook of Knowledge Representation. F. Van 1563–1568. reasoning by analogy; frame-based Harmelen, V. Lifschitz, and B. Porter, Ed. Elsevier, 47. Surdeanu, M. Overview of the tac2013 knowledge 35 Amsterdam, 2008, 513–550. base population evaluation: English slot filling and th reasoning in the sense of Minsky; 18. Forbus, K. Qualitative process theory. Qualitative temporal slot filling. InProceedings of the 6 Text abstraction; conjecture; and reason- Reasoning about Physical Systems. D. Bobrow, Ed. Analysis Conference (2013). MIT Press, Cambridge, MA, 1985, 85–168. 48. Winograd, T. Understanding Natural Language. ing to the best explanation. There 19. Forbus, K., Birnbaum, L., Wagner, E., Baker, J. and Academic Press, New York, 1972. is substantial literature in some of Witbrock, M. Analogy, intelligent IR, and knowledge 49. Woods, W. What’s in a link: Foundations for semantic integration for intelligence analysis: Situation networks. Representation and Understanding: these areas in cognitive science and tracking and the whodunit problem. In Proceedings Studies in Cognitive Science. D. Bobrow and A. informal AI approaches, but much of the International Conference on Intelligence Collins, Eds. Academic Press, New York, 1975. Analysis (2005). 50. Wu, W., Li, H., Wang, H. and Zhu, K.Q. Probase: A remains to be done to integrate them 20. Fromherz, M., Bobrow, D. and de Kleer, J. Model-based probabilistic taxonomy for text understanding. In with more mainstream approaches. computing for design and control of reconfigurable Proceedings of ACM SIGMOD, 2012. systems. AI Magazine 24, 4 (2003), 120. Cognitive science. Intelligent ma- 21. Gene Ontology Consortium. The Gene Ontology (GO) chines need not replicate human tech- database and informatics resource. Nucleic Acids Ernest Davis ([email protected]) is a professor in the Research 32 (suppl. 1), 2004, D258–D261. Department of Computer Science at New York University, niques, but a better understanding 22. Gentner, D. and Forbus, K. Computational models of New York, NY. of human commonsense reasoning analogy. WIREs Cognitive Science 2 (2011), 266–276. Gary Marcus ([email protected]) is a professor of 23. Halpern, J. Reasoning about Uncertainty. MIT Press, psychology and neural science at New York University, might be a good place to start. Cambridge, MA, 2003. and CEO and co-founder of Geometric Intelligence, Inc., Acknowledgments. Thanks to 24. Havasi, C., Pustjekovsky, J., Speer, R. and Lieberman, a machine learning startup. H. Digital intuition: Applying common sense using Leora Morgenstern, Ken Forbus, and dimensionality reduction. IEEE Intelligent Systems William Jarrold for helpful feedback, 24, 4 (2009), 24–35; doi:dx.doi.org/10.1109/ © 2015 ACM 0001-0782/15/09 $15.00. MIS.2009.72 Ron Brachman for useful informa- 25. Hearst, M. Automatic acquisition of hyponyms from tion, and Thomas Wies for checking large text corpora. ACL, 1992, 539–545. 26. Hofstadter, D. and Sander, E. Surfaces and Essences: our German. Analogy as the Fuel and Fire of Thinking. Basic Books, New York, 2013. 27. Kalyanpur, A. Structured data and inference in References DeepQA. IBM Journal of Research and Development 1. Allison, B., Guthrie, D. and Guthrie, L. Another look 53, 3–4 (2012), 10:1–14. at the data sparsity problem. In Proceedings of the 28. Kolodner, J. Case-Based Reasoning. Morgan Watch the authors discuss 9th International Conference on Text, Speech, and Kaufmann, San Mateo, CA, 1993. their work in this exclusive Dialogue, (2006), 327–334). 29. Kuehne, S. Understanding natural language Communications video. 2. Baader, F., Horrocks, I. and Sattler, U. Descriptions description of physical phenomena. Northwestern http://cacm.acm.org/ logics. Handbook of Knowledge Representation. University, Evanston, IL, 2004. videos/commonsense- F. van Harmelen, V. Lifschitz and B. Porter, Eds. 30. Kuipers, B. Qualitative simulation. Artificial reasoning-and- Elsevier, Amsterdam, 2008, 135–179. Intelligence 29 (1986), 289–338. commonsense-knowledge- 3. Bar-Hillel, Y. The present status of automatic 31. Lenat, D., Prakash, M. and Shepherd, M. CYC: Using in-artificial-intelligence

SEPTEMBER 2015 | VOL. 58 | NO. 9 | COMMUNICATIONS OF THE ACM 103