
AI Magazine Volume 26 Number 4 (2006) (© AAAI)

IAAI/AI Magazine 2006 Robert Engelmore Award Address

What Do We Know about Knowledge?

Bruce G. Buchanan

■ Intelligent systems need knowledge. However, the simple equation “knowledge is power” leaves three major questions unanswered. First, what do we mean by “knowledge”; second, what do we mean by “power”; and third, what do we mean by “is”? In this article, I will examine the first of these questions. In particular I will focus on some of the milestones in understanding the nature of knowledge and some of what we have learned from 50 years of AI research. The discipline and detail required to write programs that use knowledge have given us some valuable lessons for implementing the knowledge principle, one of which is to make our programs as flexible as we can.

Copyright © 2006, American Association for Artificial Intelligence. All rights reserved. ISSN 0738-4602. Winter 2006.

Thank you for this distinguished award and the opportunity to share some thoughts with you.¹ What I would like to give you in this article² are some of the principles guiding the implementation of knowledge-based systems that follow from work in philosophy and AI. Many of them are well known, but they can serve as reminders of the difficulty of implementing the “knowledge is power” principle.³ I wish to clarify the knowledge principle and try to increase our understanding of what programmers and program designers need to do to make the knowledge principle work in practice.

The “knowledge is power” principle is most closely associated with Francis Bacon, from his 1597 tract on heresies: “Nam et ipsa scientia potestas est.” (“In and of itself, knowledge is power.”) Incidentally, Bacon was probably as much interested in the political power to be gained from knowledge as the power to understand nature,⁴ but the power of knowledge is much the same.

Bacon was among the first of the modern philosophers to separate the concept of scientific knowledge from knowledge gained through the two dominant methods for attaining knowledge in his day: magic and religious revelation. The essential difference for him, as for us, is that knowledge gained through experiment is replicable by others.

Although all the empirical sciences rely on the replication of observations and experiments, AI has been slow to embrace this principle.⁵ Programs demonstrating research in AI are often too large and not well enough documented to allow replication or sharing. Applications programs, however, are designed to be used by others outside the research lab and thus are more amenable to multiple runs in diverse conditions. Thus they have the potential to provide experimental data demonstrating strengths, weaknesses, and benefits.

Contributions before AI

The knowledge principle predates Bacon. For example, it was pretty clearly articulated in Biblical times: “A man of knowledge increaseth strength” (Proverbs 24: 5).

Socrates, Plato, Aristotle, and other early Greek philosophers based their lives on acquiring and transferring knowledge. In the course of teaching, they sought to understand the nature of knowledge and how we can establish knowledge of the natural world.

Socrates is famous for pointing out the value of knowledge and seeking truth, as in “… that which we desire to have, and to impart to others, [is] expert knowledge….” (Plato, Phaedrus 270d).

He was also fond of pointing out how little we actually know, and was put to death, essentially, for pointing that out to everyone:

  When I conversed with him I came to see that, though a great many persons, and most of all he himself, thought that he was wise, yet he was not wise. Then I tried to prove to him that he was not wise, though he fancied that he was. … I thought to myself, “I am wiser than this man: neither of us knows anything that is really worth knowing, but he thinks that he has knowledge when he has not, while I, having no knowledge, do not think that I have. I seem, at any rate, to be a little wiser than he is on this point: I do not think that I know what I do not know.” Next I went to another man… (Plato, Apology VI:22).

Plato, Socrates’s pupil and Aristotle’s mentor, was the first to pose the question in writing of what we mean when we say that a person knows something about the world (Cornford 1935). He was distinguishing empirical knowledge, lacking complete certainty, from the certain knowledge of mathematics. The whole dialogue, The Theaetetus, is worth reading but, if I may oversimplify the conclusion, Plato, speaking for Socrates, concludes that person S knows empirical proposition p if and only if:

  S believes p
  p is true (otherwise it is a false belief, not a belief that is known)
  S can provide a rationale for why p is true (which Plato calls giving an account).

The last condition has been modified by philosophers in recent years to read that S is justified in believing p. This modification preserves the requirement of a rationale but removes the onus of providing it from S. That is, the belief is justified, but S does not need to be the one providing the justification. But, of course, philosophers are not at all in agreement about what constitutes a proper justification or rationale. One view that would seem to be relevant for AI is that either the belief or the justification is formed through a reliable cognitive process. S didn’t just come to believe p through a series of bad inferences or guessing based on the wrong evidence (Steup 2006).

Aristotle continued the search for knowledge, extending the methodology in two important ways beyond the rational discussion of Plato and the mathematics of Pythagoras. His term for science, incidentally, was “natural philosophy,” which was used by scientists as late as Newton to describe their own work. One of Aristotle’s most lasting contributions was showing the importance of knowledge gained through observation, as opposed to pure reason. Aristotle wrote at least 31 treatises describing every aspect of the natural world and offering physical explanations of many phenomena.

Aristotle also advanced the rational tradition of Plato and Pythagoras by developing a logic that captures many forms of symbolic argument, which was powerful enough to survive two thousand years. He demonstrated the expressive power of simple propositions, “A is B,” along with quantification, “All A’s are B’s,” or “Some A’s are B’s.” He also established rules of symbolic inference for combining quantified propositions.

Euclid’s geometry firmly established the concept of rigorous proof within mathematics. Some of the Greek philosophers’ contributions to our concept of knowledge are highlighted in table 1.

In the intervening several centuries before the Middle Ages and the rise of modern science in the West,⁶ the search for knowledge was overwhelmed by the power of the Christian church to make new knowledge fit with established dogma. The resulting dark ages should be a reminder to all of us that knowledge-based systems should not merely perpetuate the established dogma of an organization.

The English theologian and philosopher Robert Grosseteste (1170–1253) is known for emphasizing the role of mathematics in understanding the natural world. Galileo later underscored this principle when he wrote that the “book of nature” is written in “the language of mathematics.” Grosseteste is also credited with establishing the experimental method as a path to knowledge in his own experimental work on the refraction of light.

William of Ockham was the most influential philosopher of the 14th century. His two major contributions to the study of knowledge that are relevant to AI were nominalism and his insistence on parsimony. With nominalism, he argued that what we know is expressed in language. As a pragmatic principle in AI programming, that translates roughly into the principle that if someone can accurately describe how they solve a problem, then a program can be written to solve it. The principle of parsimony, now known as Occam’s Razor, states that plurality should not be assumed without necessity. In other words, explanations that mention fewer entities and mechanisms should be preferred to more complex ones.

So, by the time modern science was getting started, several important ideas about knowledge had already been clearly established by the ancient Greeks. Medieval philosophers reinforced and added to the early ideas, as shown in table 2.


Pythagoras | Mathematics holds the key to correct descriptions of the world.
Socrates | Seeking knowledge is good. Knowing what we don’t know (metaknowledge) is valuable.
Socrates (Plato) | Empirical knowledge is true belief with an account: beliefs have to be justified to be called knowledge.
Aristotle | Observation is a legitimate source of knowledge. Symbolic logic is a means of increasing our store of knowledge through valid inference: Knowledge beyond mathematics can be proved.
Euclid | New knowledge can be derived by rigorous proof.

Table 1. Some Contributions of Early Greek Philosophers to Our Understanding of the Concept of Knowledge.

R. Grosseteste | Mathematics is essential for knowledge of the natural world. Knowledge can be established experimentally.
William of Ockham | Knowledge is linguistic. Simpler expressions of knowledge are preferable.

Table 2. Some Contributions of Medieval Philosophers to Our Understanding of the Concept of Knowledge.

Skipping ahead a few more centuries into the 15th and 16th centuries, philosophers continued to investigate scientific questions by elaborating the earlier themes and by making new distinctions. By the 1500s many people were using observation and experimentation (that is, the scientific method) to produce new knowledge about the natural world. In the 17th and 18th centuries scientific discoveries were made at an unprecedented rate (Pledge 1939) and the foundations of modern science were clearly established.

Some of the most relevant principles and ideas to come out of the early modern work are shown in table 3.

Francis Bacon’s epigram “knowledge is power” is where we started. Bacon clearly articulated what is now known as the scientific method, the steps of empirical investigation that lead to new knowledge, as opposed to deductive logic “as acting too confusedly and letting nature slip out of its hands.” The old logic, he said, “serves rather to fix and give stability to the errors which have their foundation in commonly received notions than to help the search after truth” (Bacon 1620). He is also known for his description of science as a multistep process that may involve teams of people in specialized tasks, such as data gathering. He emphasized planned experiments as an essential step in the inductive process.

René Descartes is most known today for his work on algebra and geometry, but he also wrote about the scientific method (Descartes 1637). In the Discourse on the Method of Rightly Conducting the Reason and Seeking Truth in the Sciences he advocates accepting as knowledge only “clear and distinct” ideas. He articulated the divide-and-conquer strategy of problem solving and advocated enumerating all possibilities in considering solutions to a problem so as not to overlook any. Both are powerful ideas for AI.

Sir Isaac Newton’s scientific and mathematical contributions, of course, are awe-inspiring. Perhaps less appreciated is his emphasis on clear writing, starting with publication in a contemporary language (English) instead of Latin. As noted by one historian, “Newton established for the first time in any subject more concrete than [mathematics] the strictly unadorned expository style of Euclid, an important advance on the cumbrous and pretentious habits until then current” (Pledge 1939, p. 66).

Newton’s method, described in 1669, is one of successive approximations, which he applied to solving polynomial equations and others have applied to more complex functions.


F. Bacon | Knowledge is power. Planned experiments facilitate scientific investigations. Scientific questions can be investigated by large teams.
R. Descartes | Solve problems by divide-and-conquer. Make complete enumerations to be sure nothing is overlooked.
I. Newton | Clear exposition facilitates knowledge transfer. Iterative refinement is a powerful heuristic for solving hard problems. Crucial experiments can be defined to refute hypotheses, or, if negative, help support them.
T. Bayes | Prior probabilities are important pieces of knowledge. Laws of probability can establish the conditional probability of an event given the evidence.

Table 3. Some Contributions from the Early Modern Period to Our Understanding of the Concept of Knowledge.
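Newton’s method of successive approximations can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the function names `newton_root` and `heron_sqrt`, the tolerance, and the iteration limit are all hypothetical choices, not anything from Newton’s text. It shows the general refinement step and, as a special case, the square-root rule that results from applying that step to x² − n.

```python
def newton_root(f, df, x0, tolerance=1e-12, max_iterations=50):
    """Refine a guess x0 toward a root of f by successive approximations.

    f and df (the derivative of f) are supplied by the caller; the
    tolerance and iteration limit are illustrative defaults.
    """
    x = x0
    for _ in range(max_iterations):
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("derivative vanished; try another starting guess")
        x_next = x - f(x) / slope          # the refinement step
        if abs(x_next - x) <= tolerance * max(1.0, abs(x_next)):
            return x_next
        x = x_next
    return x                                # best estimate after the budget runs out


def heron_sqrt(n):
    """Square root as a special case: the step becomes x <- (x + n/x) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0.0
    return newton_root(lambda x: x * x - n, lambda x: 2 * x, x0=max(n, 1.0))


# A cubic polynomial, refined from a rough starting guess of 2.
cubic = lambda x: x**3 - 2 * x - 5
cubic_root = newton_root(cubic, lambda x: 3 * x**2 - 2, x0=2.0)
```

Each pass reuses the previous answer as the next starting point, which is the sense in which iterative refinement serves as a general heuristic and not only a numerical trick.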

The method of successive approximations was also used by Heron of Alexandria (Hero) to find square roots in the first century.⁷ His method was one of the first clearly articulated principles of heuristic reasoning, which can be applied to symbolic reasoning as well as mathematics.

Newton added to the concept of scientific reasoning the idea of the “crucial experiment” (Boorstin 1985). This kind of experiment establishes a critical question whose answer, once observed, can falsify a hypothesis. A negative outcome cannot prove the hypothesis, but the absence of refutations is widely accepted as support. Although there are methodological difficulties with the concept, it is still a powerful idea in every science.

Bayes’s Theorem is particularly important because it establishes the probability of a hypothesis H conditional on evidential support E (Joyce 2006). For the first time, scientists had a way to quantify the degree to which evidence supports a hypothesis and a way to revise a degree of belief incrementally as new evidence becomes available. A cornerstone of Bayesian probabilities is that the prior probability of the evidence is an essential starting point.

Some additional principles we can draw on from the early modern period are shown in table 4 without comment.

During the last hundred years or so, philosophers and scientists have continued to add to the list of things we need to pay attention to in implementing the knowledge principle (see table 5). One of the towering giants of the century was Bertrand Russell.

Bertrand Russell and Alfred North Whitehead developed the first-order predicate logic and showed its power for expressing much of what we know. But we also have seen that it needs to be augmented, or replaced, with other tools for representation and reasoning. For example, nonmonotonic reasoning does not obey the laws of first-order logic because new information can negate previously believed propositions. Kurt Gödel’s proof that there are some true statements that cannot be proved in first-order logic is not relevant for building knowledge-based systems and may be totally irrelevant to the question of whether AI is possible (contrary to what others have claimed). However, it does underscore the need to consider metalevel knowledge when building an intelligent system.

Russell also showed us that we, or a program, can get into serious logical difficulties when we describe things self-referentially. When Epimenides the Cretan says “All Cretans are liars,” we don’t know how to assess the truth or falsity of his statement. Also, when we define a class in terms of membership in the same class we lose the ability to assess class membership. For example, Russell’s famous barber paradox concerns a barber who shaves all and only those men who do not shave themselves. Logic does not let us answer the question whether the barber himself is in that set or not. Paradoxes such as these are strong reminders that we do not yet know everything about managing knowledge (Quine 1966).

Because of Gottlob Frege’s work on intensional predicates, we know that statements about beliefs and knowledge, among other things, do not follow the logical rules allowing substitution of equals for equals. Consider:

  John believes Aristotle invented logic.
  Aristotle was Plato’s pupil.

. . . but it is not necessarily true that:

  John believes Plato’s pupil invented logic
  (because John does not know the relationship between Aristotle and Plato).

Therefore, when we build knowledge-based systems, keeping track of what the program knows and believes is not a simple matter of deduction.


G. Leibniz | An object is completely defined by its features (attribute-value pairs).
I. Kant | Our own conceptual framework determines what we can know about objects in the world.
D. Hume | Knowledge of the natural world is never completely certain. Much of what we know rests on the assumption that the future will be like the past.
J. S. Mill | General laws can be discovered from data. Causality is determined, in part, by systematic variation of an independent variable with observed variation in a dependent variable shortly afterward.

Table 4. Additional Contributions from the Early Modern Period to Our Understanding of the Concept of Knowledge.

As John McCarthy and others have pointed out, it can be enormously complicated to keep track of a computer’s (or a person’s) set of beliefs, and revise the set as more information becomes available. Jon Doyle’s work on truth-maintenance provides some implementation guidelines to start with.

Carl Hempel clarified the concept of an explanation in a way that is very relevant to a program’s offering explanations of its beliefs. Acceptable explanations of a phenomenon P, such as a rainbow,⁸ under the deductive-nomological model of explanation, follow the form:

  Why P? (for example, why is there a rainbow?)
  If A is observed then P follows
  (If the light reflects and refracts off water droplets …)
  A is observed
  (These conditions exist)
  Therefore a rainbow is to be expected

Hempel’s model turns out to be intuitively acceptable for the users of knowledge-based systems who want a rationale for a program’s advice. It was, in fact, one of the guiding principles behind Mycin’s explanation system (Buchanan and Shortliffe 1984).

Thomas Kuhn introduced the concept of a paradigm into discussions of knowledge, extending Kant’s idea that our beliefs and knowledge structures are framed within a conceptual framework (Kuhn 1962). For Kuhn, a paradigm in science includes not just the vocabulary and conceptual framework of a domain, but the shared assumptions, accepted methods, and appropriate instrumentation of the social community working within that domain. In this sense, the paradigm defines both the legitimacy of questions and the acceptability of proposed answers.⁹ When implementing a knowledge-based system, we now understand the importance of making both the ontology and the assumptions as explicit as we can.

Nelson Goodman has examined the kinds of statements that make good general laws: the sorts of statements that describe causal relations and can be used in Hempel’s model of explanation. Pure descriptions of things in the world that “just happen” to share a common property may fit into a generalization, but they are not good candidates for scientific laws. In Aristotle’s terms, these are accidental predicates. For example, the assertion, “All people in this room at this hour today are interested in AI,” may be a true description (more or less) of everyone here, but it only “happens” to be true. We could not use it, for example, to predict that a janitor in the room waiting for me to finish is interested in AI. Goodman and others have found difficulties in trying to define criteria for the kinds of predicates that are appropriate for scientific laws. But one criterion we can use operationally in knowledge-based systems and machine learning systems is that the predicates used appropriately in general laws are predicates whose semantics have allowed such use in the past. So a predicate like “reflects light” can be tagged as potentially useful for a theory, while an accidental predicate like “is in this room at this hour” would not be tagged.

There are (at least) two other concepts from contemporary writers outside of AI that are useful for us to remember in building knowledge-based systems. George Polya (1945) introduced the concept of heuristics into our paradigm of reasoning, referring to rules that are not always true but can be very powerful when they are. He was not referring to probabilistic arguments leading to a conclusion that is probably true or about measuring a heuristic’s degree of applicability, but to the importance of the rules themselves in making progress toward solving hard problems. One of the sources of ideas for Polya was a diagram of the situation. In a geometry problem, for example, if two line segments look equal in the diagram, it might be useful to try to prove that they are in the general case. For us, the main lessons in Polya’s work are that reasoning programs need heuristics and heuristics can be made explicit.

G. Frege | Belief statements are intensional.
B. Russell | Predicate logic works. Self-referential statements are problematic.
C. G. Hempel | Explanations appeal to general laws.

Table 5. Some Contributions from the Last Hundred Years to Our Understanding of the Concept of Knowledge.

Michael Polanyi emphasizes that much of the useful knowledge a person has cannot be made explicit; in his terms, it is tacit knowledge. In building expert systems by acquiring knowledge from experts, we often run into areas of expertise that are not easily articulated. We might say the expert’s knowledge has been compiled. It is good to be cautioned that such knowledge is not easy to acquire. Asking an expert, “How do you solve this kind of problem?” may lead nowhere. But we have also found another tack to be useful in eliciting knowledge: when an expert seems to be struggling to articulate what he or she does, turn the question around and ask, “For this kind of problem, what would you tell an assistant to do?”

Psychologists and economists have also given us some considerations relevant to the implementation of knowledge-based systems. Kahneman, Slovic, and Tversky (1982) collected numerous data showing that human reasoning under uncertainty is not strictly Bayesian. For example, reasoning about a new case often involves comparisons with the most recent cases seen. And people tend to ignore reversion to the mean when assessing their own performance on a task; instead they (we) believe that our best performance has become the new norm and we’re disappointed with below-peak output. One lesson to draw from this work is that AI programs need not outperform the best experts to be beneficial: they may help prevent mistakes on the parts of practitioners, thus raising the lowest common denominator among people performing a task.

The well-known Pareto Principle states that most effects come from relatively few causes. In quantitative terms, it is known as the 80/20 rule: 80 percent of the benefit from performing a task comes from about 20 percent of the effort. It is an important design consideration for applications programs. A designer can demonstrate an early prototype with high credibility by carefully choosing which 20 percent of the relevant knowledge to put into a system.

Herb Simon’s Nobel Prize–winning work in economics on bounded rationality established the fundamental principle of AI: finding a satisfactory solution is often a better use of resources than searching longer for the optimal solution. This, of course, is the principle of satisficing. Don’t leave home without it!

Simon’s work with William Chase (1973) demonstrated that expert problem solvers store and use their knowledge in chunks. For example, expert chess players use high-level descriptions, such as king-side attack, instead of descriptions of the individual pieces on the board.

Some recent contributions to our understanding of the concept of knowledge are shown in table 6.

AI WORK

With the invention of digital computers, Alan Turing began the empirical study of how much and what kinds of intellectual work computers can do. The early work marks the beginning of AI and what is now known as computational philosophy (Thagard 1988). That is, writing programs has become a new way to study the nature of knowledge.

There are several instances of programs using specialized knowledge to achieve high performance before 1970. Arthur Samuel’s checker player used knowledge from experts. Geoffrey Clarkson’s simulation of an investment trust officer was essentially a collection of rules from one expert. Richard Greenblatt’s mid-1960s chess-playing program, MacHack, was based on clearly articulated principles of good chess play, such as the value of controlling the center. Both Jim Slagle’s and Joel Moses’s programs for solving integration problems symbolically and Tony Hearn’s program for algebraic simplification used considerable knowledge of the tricks mathematicians use.


T. Kuhn | Scientists work within paradigms (ontology plus assumptions plus…).
N. Goodman | Accidental predicates do not make good general laws.
G. Polya | Heuristics are a powerful form of knowledge.
M. Polanyi | Much of what we know is tacit.
D. Kahneman, P. Slovic, and A. Tversky | Humans are not Bayesians.
V. Pareto | 80 percent of the benefit often results from 20 percent of the effort.
H. Simon | Satisficing is often better than optimizing.
H. Simon and W. Chase | Knowledge is chunked.

Table 6. Some Important Recent Contributions to Our Understanding of the Concept of Knowledge.

But Edward Feigenbaum’s 1968 address to the International Federation for Information Processing (IFIP) Congress was the first clear statement of the knowledge principle in the computer science literature (Feigenbaum 1968), describing the role of knowledge in problem solving, Dendral’s problem in chemistry in particular. We believe that Dendral, and then Mycin, were the first to fully embrace the knowledge principle. We were deliberately seeking knowledge from experts because we were not ourselves very knowledgeable about the domains. We tried to be systematic about the methods we used for eliciting knowledge from experts (Scott, Clayton, and Gibson 1991). We tried to be disciplined about representing knowledge in simple, uniform data structures that could be understood and edited from the outside or used by the program to explain its reasoning. And we separated the parts of the program that held the knowledge from the parts that reasoned with it.

Guidelines for implementing knowledge-based systems have been suggested before (Davis 1989, Buchanan and Smith 1988), and those are still relevant to building systems. Here I wish to focus specifically on some of the things we have learned about knowledge through observing and experimenting with knowledge-based systems. Since most of these items are familiar to most of you, I present in table 7 the collection as reminders without much explanation.

Existence of Knowledge

First, of course, it is essential that we choose problems for which there is knowledge. If no person knows how to solve a problem, then it is dangerous to promise a computer program that solves it. Predicting terrorist attacks with any precision is a good research problem, for example, but it is not a problem to solve with a deadline.

On the other hand, if it is easy to teach someone how to solve a problem, there is little point in writing a program to solve it. Finding powdered Tang in a grocery store might be facilitated with a table of synonyms and aisle labels, but it hardly requires much inference unless you are in the wrong store.

Search

Newton had argued that it was impossible to enumerate hypotheses for interesting problems in science (called “induction by enumeration”) because there can be infinite numbers of them. Turing, however, showed that defining a search space, even an infinite one, can provide a conceptual starting point for implementing heuristic search. Defining a start state and a successor function allows a program, in principle, to generate hypotheses and test them one by one until a solution is found or time runs out. Heuristics allow a program to focus on likely parts of the space.

Simon’s principle of satisficing works well with heuristic search because it generally broadens the definition of the goal beyond the single-best state. Doug Lenat’s AM program broadened the concept of the search space by allowing the generator to be a plausible move generator, as opposed to an in-principle complete generator as in Dendral.
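The ingredients named in the Search discussion (a start state, a successor function, a test applied to each generated candidate, a heuristic that directs attention, and a time limit) can be put together in a few lines. The following Python is a minimal sketch under assumptions of my own: the function name `heuristic_search`, its parameters, and the toy integer search space are all hypothetical and are not drawn from Dendral, AM, or any other program mentioned here.

```python
import heapq
from itertools import count


def heuristic_search(start, successors, is_satisfactory, score, max_steps=10_000):
    """Best-first generate-and-test over a possibly infinite space.

    successors(state) generates candidate states, score(state) is the
    heuristic (lower means more promising), and is_satisfactory(state)
    is a satisficing test: any good-enough state ends the search.
    """
    tie_breaker = count()                      # keeps heap entries comparable
    frontier = [(score(start), next(tie_breaker), start)]
    seen = {start}
    for _ in range(max_steps):                 # "until ... time runs out"
        if not frontier:
            return None
        _, _, state = heapq.heappop(frontier)  # most promising candidate first
        if is_satisfactory(state):
            return state
        for candidate in successors(state):
            if candidate not in seen:
                seen.add(candidate)
                heapq.heappush(frontier, (score(candidate), next(tie_breaker), candidate))
    return None


# Toy infinite space: integers reachable from 1 by adding one or doubling.
TARGET = 100
found = heuristic_search(
    start=1,
    successors=lambda n: (n + 1, n * 2),
    is_satisfactory=lambda n: abs(TARGET - n) <= 3,   # good enough, not optimal
    score=lambda n: abs(TARGET - n),
)
```

Loosening `is_satisfactory` widens the set of acceptable goal states, which is how satisficing shortens the search; tightening it to an exact match forces the program to look longer.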


1 | Knowledge of the task domain must exist.
2 | Defining a search space is essential.
3 | Explicit knowledge structures facilitate system building.
4 | Common sense can be defined or assumed.

Table 7. Some Conclusions about Knowledge from Early Work in AI.

Explicit Representation

John McCarthy’s early paper “Programs with Common Sense” (McCarthy 1958) argues for representing knowledge explicitly in declarative statements so that it can be changed easily from the outside. Although the distinction between declarative and procedural knowledge is not a sharp one, users of programs are quick to tell us when editing a knowledge base is easy or hard.

Common Sense

Programs that solve real-world problems must deal with numerous assumptions about the way the world works. McCarthy has argued that a program needs access to the deductive consequences of a large axiom set. Lenat and Feigenbaum have argued that common sense can be encoded in millions of individual facts and rules that keep a program from looking silly. After McCarthy criticized Mycin for failing to know that amniocentesis would not be performed on a male patient or that dead people don’t need antibiotics, we came to realize that we had an effective operational way of dealing with common sense other than trying to encode it. Namely, we assumed that users of a knowledge-based system would themselves have some common sense. Doctors and nurses using Mycin, for example, could be presumed to know when their patients are already dead.

One important class of knowledge-based system shares the load of problem solving so that the system assists a person as a partner but depends on the human user to provide additional knowledge of the problem area. In addition to users providing some of the common sense, users are likely to have important input when it comes to values and ethical considerations. That is, depending on the context of use, a program may be able to depend on a human partner to provide commonsense knowledge.

Additional lessons about knowledge relevant to AI are presented in table 8.

Solved Cases, Generalizations, and Prototypes

Arthur Samuel’s checker-playing program clearly demonstrated the power of rote learning. If a problem has been successfully solved, look up the solution and keep adding to the library of solved cases to become smarter with experience. Case-based reasoning (CBR) is another way to use previously solved cases. Instead of finding an exact match for a case in a library and transferring a solution directly, as in rote learning, CBR systems find similar matches and modify their solutions.

Marvin Minsky and Roger Schank have advocated encoding frames of prototypical objects and scripts of prototypical events. Knowledge about prototypical members of a class helps us fill in what we need to know, understand what we don’t know, and make good guesses about missing information. Knowledge about how complex situations typically unfold in time helps us in just the same ways. Both frames and scripts can give a program expectations and default values for filling in missing information.

Items of Knowledge

Three decades of experience with knowledge engineering has taught us that knowledge does not come in prepackaged “nuggets.” Eliciting knowledge from experts is not so much like mining for gold as like coauthoring a textbook. It is a social process whose success depends on the personalities and interpersonal skills of both the knowledge engineer and expert.

We also know that the chunks of knowledge in a knowledge base are not independent, and therefore large sets of knowledge items can interact in ways that are not completely predictable. For this reason, it is useful to collect items into nearly independent groups, where the interactions within a group are easier to see and the interactions across groups are infrequent (table 9).

Rules

Emil Post proved that simple conditional rules together with modus ponens constitute a logically complete representation. However, one lesson from work with rule-based systems is that rules are more powerful when augmented with additional knowledge structures, such as definitions and class-subclass hierarchies.

Whatever the data structures that are used to represent knowledge, however, an important lesson learned with some pain is to use the concepts and vocabulary of the end users. Experts in the home office or academe do not necessarily speak the same language as users in the field.
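The point in the Rules discussion about conditional rules, modus ponens, and class-subclass hierarchies can be made concrete with a small sketch. The Python below is an illustration only: the names `forward_chain`, `expand_with_hierarchy`, `SUBCLASS_OF`, and `RULES`, the `kind:value` fact encoding, and the bird example are all invented for this purpose and do not reproduce the rule format of Mycin or any other system.

```python
# Hypothetical class-subclass hierarchy: child -> parent.
SUBCLASS_OF = {"sparrow": "bird", "bird": "animal"}

# Hypothetical conditional rules: (premises, conclusion).
RULES = [
    (("is_a:bird", "has:wings"), "can:fly"),
    (("can:fly",), "can:travel_far"),
]


def ancestors(cls):
    """Every class above cls in the hierarchy, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def expand_with_hierarchy(facts):
    """Add the class memberships implied by each 'is_a:<class>' fact."""
    expanded = set(facts)
    for fact in facts:
        kind, _, value = fact.partition(":")
        if kind == "is_a":
            expanded.update(f"is_a:{parent}" for parent in ancestors(value))
    return expanded


def forward_chain(facts, rules):
    """Apply modus ponens repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


observed = {"is_a:sparrow", "has:wings"}
rules_only = forward_chain(observed, RULES)
rules_plus_hierarchy = forward_chain(expand_with_hierarchy(observed), RULES)
```

With the rules alone nothing fires, because no rule mentions sparrows. Once the hierarchy supplies the fact that a sparrow is a bird, the same two rules reach both conclusions, so one rule written at the class level covers every subclass.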

42 AI MAGAZINE Articles necessarily speak the same language as users in the . It is also critical that users provide knowledge about the work environment. There 5 Knowledge of past cases ispowerful. is a danger that the expert providing the 6 Generalizations and prototypes arepowerful. knowledge is unfamiliar with the constraints 7 Nearly-independent chunkscanbedefined. and assumptions implicit in the context of the actual work place. 8 Elicitation of knowledge issocial. Hierarchies Aristotle’s insights on organizing knowledge Table 8. Additional Lessons about Knowledge Relevant to AI. hierarchically have been implemented in semantic networks, frames, prototypes, and object-oriented systems in general. Knowing the class to which an object belongs gives us considerable information about the object. 9 Rules, hierarchies, and definitions complement one another. Thus we save time and space by knowing about 10 Default knowledge adds breadth and power. classes. Moreover, we gain the ability to make plausible inferences about an object in the 11 Uncertainty and incompleteness are ubiquitous. absence of details, since it is usually safe to 12 Metadata facilitates debugging and editing. assume information inherited from a class is correct unless we’re told otherwise.

Default Knowledge Table 9. Additional Lessons about Knowledge Relevant to AI. Giving a program a set of defaults, or a way to determine them, broadens the range of prob- incrementally. Often more than one expert lems the program can deal with. Even in the and more than one knowledge engineer are absence of specific values for some attributes, a involved in the construction and modification. program can find plausible solutions if it can For these reasons, it is good practice to tag the make reasonable guesses about the missing val- identifiable chunks of knowledge with addi- ues. Default values may be stored in proto- tional information. In Mycin, for example, it types, hierarchies, definitions, or lists. was useful to note the author of the chunk, the Uncertainty date it was added, subsequent modification dates, literature references, and reasons why The knowledge we use for problem solving is the chunk was added. frequently uncertain and incomplete. Also, the More lessons about knowledge relevant to AI information we have about the objects and are presented in table 10. events named in any individual problem is uncertain and incomplete. Yet we, and the sys- Constraints tems we build, are expected to provide satisfac- Many problems require synthesizing a solution tory solutions with the best information avail- that satisfies a large number of constraints. able. Fabricating information to fit a Scheduling problems, configuration problems, preconceived conclusion is not scientifically and design problems all have this character, acceptable. Therefore, implementing the each with powerful models for how to solve knowledge principle requires implementing a them. Each constraint is a specific piece of means of dealing with uncertain and incom- knowledge about the problem that can be plete knowledge. examined and reasoned about. 
For example, a The simplest way to deal with this inconven- program can decide to postpone trying to sat- ience is to ignore it: act as if everything known isfy a constraint until more information to the program is categorically certain. For sim- becomes available if the constraint is an explic- ple problems this is often sufficient. Currently, it chunk of knowledge and not a complicated the most popular way of dealing with uncer- set of program statements. tainty is to assign probabilities to what is known and combine them with Bayes’s Theo- Temporal Relations rem. Nonprobabilistic methods, like contin- James Allen’s interval , based on Rus- gency tables, fuzzy logic, and Mycin’s certainty sell’s interval definition of time, is a workable factors, have also been used successfully. set of inference rules for reasoning about the relative times of events. It should be in every Attached Metadata knowledge engineer’s toolbox for problems in A large knowledge base needs to be constructed which temporal relationships are important.
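As a minimal sketch of the kind of interval reasoning mentioned above, the following Python fragment classifies which of Allen's thirteen basic relations holds between two time intervals. The names Interval and allen_relation are hypothetical, introduced here only for illustration. The sketch covers classification alone and leaves out the composition table that full temporal inference would also need.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A time interval whose start is strictly before its end (illustrative type)."""
    start: float
    end: float


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the basic Allen relation that holds from a to b."""
    if not (a.start < a.end and b.start < b.end):
        raise ValueError("each interval must have start < end")
    # Disjoint intervals, with or without a gap between them.
    if a.end < b.start:
        return "before"
    if b.end < a.start:
        return "after"
    if a.end == b.start:
        return "meets"
    if b.end == a.start:
        return "met-by"
    # Intervals that share one or both endpoints.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    # One interval strictly inside the other.
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a.start < b.start else "overlapped-by"
```

For two events where the first ends exactly as the second begins, such as allen_relation(Interval(8, 9), Interval(9, 11)), the function returns "meets". Keeping the relation as an explicit, named value rather than burying the comparisons in program logic is in the spirit of the explicit representation argued for in this article.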


13  Constraint-based reasoning is powerful.
14  Temporal reasoning may be required.
15  Analogies may help.
16  Diagrams aid in elicitation, and in future reasoning.

Table 10. More Lessons about Knowledge Relevant to AI.

Analogies

Analogical reasoning can be a powerful tool, but we still do not understand how to tell a good analogy from a bad one. For example, is the Iraq war like the Vietnam war or not? There is no question that knowledge of similar situations can be helpful, but for the  case-based reasoning seems to be more tightly constrained than broader analogical reasoning.

Diagrams

Polya advocated drawing a diagram to help us understand a problem and see relevant interrelationships among objects. Unquestionably, a diagram contains information that is useful; as we say, a picture is worth a thousand words. Gelernter’s geometry program did use diagrams, and a few others have, but tools for using diagrams are still missing from the knowledge engineer’s toolbox.

Table 11 lists still more lessons about knowledge that are relevant to AI.

17  Knowledge can be used opportunistically.
18  Redundancy is often helpful.
19  Knowledge sharing adds power.

Table 11. Still More Lessons about Knowledge Relevant to AI.

Opportunistic Use

Oliver Selfridge’s Pandemonium program and subsequent work with the blackboard model show that knowledge need not be used in a fixed sequence. It makes good sense to use the items of knowledge that have the most to contribute to a problem at each moment. In scheduling and planning problems, for instance, some constraints are more useful as more of their preconditions are met, which means either waiting until more information is available or forcing the program to chase after the preconditions.

Redundancy

Contrary to Occam’s Razor, there are good reasons to encode the same information redundantly in a program. For one thing, we cannot achieve total independence of evidence statements in real-world systems because the components are interrelated. Many things can cause a fever, for example, and a fever can be evidence for many conditions. Also, the actual description of a problem is generally missing some items of data. If there are multiple paths for making inferences, the missing data will not hurt as much.

Shared Knowledge

Reid Smith has suggested that the simple statement of the knowledge principle be expanded to:

    Power = Knowledge^Shared

where the exponent indicates the number of people or machines whose knowledge is brought to the problem.10 This is an excellent point. It fits with the common wisdom that two heads are better than one. A community of more or less independent problem solvers is a powerful model, if the environment fosters and rewards sharing. Selfridge made an early demonstration of the power of this model in his Pandemonium program, and it has been generalized in the blackboard model, but only recently has it become common in the business world. For example, an article in The Economist (May 8, 2003) describes a survey of knowledge management that “found that the best-performing companies were far more likely than the worst-performing ones to use creative techniques for acquiring, processing and distributing knowledge.”

The benefits of collaboration among machines have been demonstrated in a few research projects, but collaborative problem solving through knowledge sharing is still far from commonplace in AI. We have a lot to learn about implementing collaborative programs. Parallel searching in search engines like Google enables rapid response, but it is more a result of divide-and-conquer than it is of better thinking through collaboration.

What Does a Programmer Have To Do?

Every problem-solving or decision-making program requires knowledge of the problem area. When we write a program to calculate an area as length times width, we are codifying a piece of knowledge from geometry in the procedure.
However, the discipline of knowledge-based programming goes beyond “hard wiring” relevant knowledge in procedures. Actually implementing the knowledge principle in a program requires codifying the knowledge in ways that it can be examined and changed easily. Ideally, the program can even explain what knowledge it uses to solve a particular problem and why it uses it.

In our description of the Mycin experiments, Shortliffe and I attempted to find a one-word summary of what made Mycin work as well as it did (Buchanan and Shortliffe 1984). The one word was flexibility. In the design and implementation of the program, we looked for ways of incorporating knowledge about the domain of medicine and the task of diagnosis in the most flexible ways we could. In particular, we wanted the knowledge to be represented to allow straightforward credit assignment, easy examination and modification, and user-friendly explanations of why the program believed what it did. One simple reason why this is desirable is that complex knowledge-based systems need to be constructed through a process of iterative refinement (recall Newton’s method).

Staying flexible seems to have three components at the implementation level that I believe facilitate constructing and fielding successful knowledge-based systems (table 12): (1) explicit representation of knowledge within the program using the vocabulary and procedures of practitioners in the task domain, and relating them under a coherent model of the task domain; (2) explicit reasoning steps that the program can play back (“explain”) in an understandable fashion; and (3) modular, nearly independent chunks of knowledge, whether that knowledge is codified in procedures or data structures.

1  Explicit representation of knowledge, within the users’ conceptual framework.
2  Explicit reasoning steps.
3  Modular, nearly-independent chunks of knowledge.

Table 12. Operationalizing Flexibility. A Primary Lesson from the Mycin Experiments (Buchanan and Shortliffe 1984).

Conclusions

On the assumption that an audience will remember only seven plus-or-minus two things, I have three take-home lessons.

First, Bacon is right: knowledge is power. Philosophers have given us many distinctions in what we know about knowledge. For example, what kinds of evidence count as justifications for a belief, what kinds of knowledge there are, and how distinguishing metalevel and object-level knowledge clarifies a problem. Perhaps the most important lesson is that real-world problems require real-world knowledge. Problems are hard when relevant knowledge is not neatly codified. Knowledge of objects and processes in the world is often approximate, ambiguous, incomplete, and wrong. That is one of the most important reasons why human problem solvers require considerable training and experience to solve real-world problems and why assistance from computers is welcome. It is also one of the most important reasons why precise, logical formulations of real-world problems are often inaccurate. And it is the reason why satisficing is a good strategy.

Second, Feigenbaum is right: more knowledge is more power. The more kinds of things a program knows about in a domain, the greater the scope of its capabilities. By using class hierarchies, there is considerable economy in adding knowledge about new instances.

Including more knowledge of the objects and events in a problem domain and more knowledge of special cases will increase the power of a program. Computer scientists have shown how to implement many specialized kinds of knowledge needed to solve real-world problems. These include spatial and temporal relationships, conceptual hierarchies, generalizations with and without uncertainty, exceptions, and analogies.

Third, discipline is required to maintain flexibility: implementing the knowledge principle requires making knowledge explicit and modular. Large knowledge bases are constructed iteratively, with an experimental cycle of testing and modifying. Without the flexibility to examine and modify a program’s knowledge, we are unable to determine what to change.


Worse yet, without the flexibility to examine and modify a program’s knowledge, we are at the mercy of the program’s power to take the program’s advice without understanding how it is justified. Plato said that’s a bad idea.

Let me conclude with a Native American proverb that I hope program designers and future AI programs will be able to operationalize:

    If we wonder often, the gift of knowledge will come.

Acknowledgements

I am indebted to many colleagues who have, over the years, given me the benefit of their own knowledge of philosophy and AI. Please credit them with trying. In particular, Ed Feigenbaum and Reid Smith provided substantial comments on an earlier draft and, among other things, encouraged me to simplify. Phil Cohen provided several comments and the reminder about Russell’s interval definition of time.

Notes

1. This article is based on the Robert S. Engelmore Award lecture, which I was honored to present at the Innovative Applications of Artificial Intelligence conference held in July 2006 in Boston, Massachusetts.

2. As many of you know, Bob Engelmore was dedicated to the clear exposition of what we know and equally dedicated to keeping silent when we don’t have the facts. Many times when I began hand-waving, Bob would just give me an incredulous look. So it was with some misgivings that I chose the somewhat immodest title, “What Do We Know about Knowledge?”

3. This is admittedly a whirlwind tour, with more reminders than explanations. I hope it not only will remind us of some of the intellectual roots of AI but may also help us create more robust systems in the future.

4. This was anticipated by the Greek poet Menander, who wrote, “In many ways the saying ‘Know thyself’ is not well said. It were more practical to say ‘Know other people’” (Thrasyleon Fragment).

5. There are, of course, notable exceptions of good experimental science and clear documentation in AI. See, for example, Leake (2006). The case for AI as an experimental science is made in Buchanan (1994).

6. Western science owes much to the Arab world, as acknowledged, for example, by White (1967): “The distinctive Western tradition of science, in fact, began in the late 11th century with a massive movement to translate Arabic and Greek scientific works into Latin.”

7. See Answers: Newton’s Method, www.answers.com/topic/newton-s-method (2006).

8. See the National Center for Atmospheric Research (http://eo.ucar.edu/rainbows/).

9. Bob Engelmore and I were once buying food in a health food store for a backpacking trip. We assumed that the powdered drink Tang would be shelved with other fruit drinks and asked an earnest young clerk where to find it. It was not a legitimate question in that store and we ended up with a lecture on food additives. Thereafter, the word Tang was shorthand for discovering a context in which a lecture is a proper response to a question, rather than a simple answer.

10. See R. G. Smith, www.rgsmithassociates.com/Power.htm.

References

Bacon, F. 1597. Religious Meditations, Of Heresies. In Essayes. Religious Meditations. Places of Perswasion and Disswasion. Seene and Allowed (The Colours of Good and Evil). London: John Windet.

Bacon, F. 1620. The New Organon. London: Cambridge University Press, 2000.

Boorstin, D. 1985. The Discoverers. New York: Vintage Books.

Buchanan, B. G. 1994. The Role of Experimentation in Artificial Intelligence. Philosophical Transactions of the Royal Society 349(1689): 153–166.

Buchanan, B. G., and Shortliffe, E. H. 1984. Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. New York: Addison Wesley.

Buchanan, B. G., and Smith, R. G. 1988. Fundamentals of Expert Systems. In Annual Review of Computer Science 3, 23–58. Palo Alto, CA: Annual Reviews, Inc.

Cornford, F. 1935. Plato’s Theory of Knowledge: The Theaetetus and the Sophist of Plato. London: Routledge and Kegan Paul.

Davis, R. 1989. Expert Systems: How Far Can They Go? Part One. AI Magazine 10(1): 61–67.

Descartes, R. 1637. Discourse on the Method of Rightly Conducting the Reason and Seeking Truth in the Sciences. Translated by John Veitch. Edinburgh: W. Blackwood and Sons, 1870.

Feigenbaum, E. A. 1968. Artificial Intelligence: Themes in the Second Decade. IFIP Congress (2), 1008–1024. Laxenburg, Austria: International Federation for Information Processing Congress.

Joyce, J. M. 2006. Bayes’ Theorem. Stanford Encyclopedia of Philosophy. Stanford, CA: CSLI, Stanford University. plato.stanford.edu/entries/bayes-theorem/#3 (June 2006).

Kahneman, D.; Slovic, P.; and Tversky, A., eds. 1982. Judgment under Uncertainty: Heuristics and Biases. Cambridge University Press.

Kuhn, T. 1962. The Structure of Scientific Revolutions. Chicago: Univ. Chicago Press.

Leake, D.; and Sorriamurthi, R. 2004. Case Dispatching Versus Case-Based Merging: When MCBR Matters. International Journal of Artificial Intelligence Tools 13(1): 237–254.

McCarthy, J. 1958. Programs with Common Sense. In Proceedings of the Teddington Conference on the Mechanization of Thought Processes, 77–84. London: Her Majesty’s Stationery Office.

Pledge, H. T. 1939. Science Since 1500. London: His Majesty’s Stationery Office.

Polya, G. 1945. How To Solve It. Princeton, NJ: Princeton University Press.

Quine, W. V. O. 1966. The Ways of Paradox and Other Essays. Cambridge: Harvard University Press.

Scott, A. C.; Clayton, J.; and Gibson, E. 1991. A Practical Guide to Knowledge Acquisition. Boston: Addison Wesley.

Simon, H. A., and Chase, W. G. 1973. Skill in Chess. American Scientist 61(4): 394–403.

Steup, M. 2006. The Analysis of Knowledge. Stanford Encyclopedia of Philosophy. Stanford, CA: CSLI, Stanford University. plato.stanford.edu/entries/knowledge-analysis/#JTB.

Thagard, P. 1988. Computational Philosophy of Science. Cambridge, MA: The MIT Press.

White, Jr., L. 1967. The Historical Roots of Our Ecologic Crisis. Science 155(3767), (10 March, 1967): 1203.

Bruce G. Buchanan was a founding member of AAAI, secretary-treasurer from 1986–1992, and president from 1999–2001. He received a B.A. in mathematics from Ohio Wesleyan University (1961) and M.S. and Ph.D. degrees in philosophy from Michigan State University (1966). He is university professor emeritus at the University of Pittsburgh, where he held joint appointments with the Departments of Computer Science, Philosophy, and Medicine and the Intelligent Systems Program. He is a fellow of the American Association for Artificial Intelligence (AAAI), a fellow of the American College of Medical Informatics, and a member of the National Academy of Science Institute of Medicine. His e-mail address is [email protected].
