Artificial Intelligence 47 (1991) 57-77
Elsevier

Rigor mortis: a response to Nilsson's "Logic and artificial intelligence"

Lawrence Birnbaum
The Institute for the Learning Sciences and Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60201, USA

Received June 1988
Revised December 1989

Abstract

Birnbaum, L., Rigor mortis: a response to Nilsson's "Logic and artificial intelligence", Artificial Intelligence 47 (1991) 57-77.

Logicism has contributed greatly to progress in AI by emphasizing the central role of mental content and representational vocabulary in intelligent systems. Unfortunately, the logicists' dream of a completely use-independent characterization of knowledge has drawn their attention away from these fundamental AI problems, leading instead to a concentration on purely formalistic issues in deductive inference and model-theoretic "semantics". In addition, their failure to resist the lure of formalistic modes of expression has unnecessarily curtailed the prospects for intellectual interaction with other AI researchers.

1. Introduction

A good friend recently told me about a discussion he had with one of his colleagues about what to teach in a one-semester introductory artificial intelligence course for graduate students, given the rather limited time available in such a course. He proposed what he assumed were generally agreed to be central issues and techniques in AI--credit assignment in learning, means-ends analysis in problem-solving, the representation and use of abstract planning knowledge in the form of critics, and so on. Somewhat to his surprise, in his colleague's view one of the most important things for budding AI scientists to learn was ... Herbrand's theorem.

I recount this anecdote here for two reasons. First, I suspect that most of us would have been surprised, as my friend was, by this response, and by the rather puzzling set of scientific and educational priorities that it reveals. It would, of course, be unfair to conclude that the scientific world-view displayed in this anecdote is representative of the logicist position in AI, at least as that position is portrayed by Nils Nilsson. Nevertheless, it is worth bearing in mind that, as the anecdote makes clear, this debate is not only--perhaps not even primarily--a technical one, but rather a question of scientific priorities.

The second reason I recount this anecdote here is to illustrate that when it comes to making a point, a good story is almost always more useful than a lot of abstract argumentation. We often lean more heavily on personal experience and specific stories than on "book learning" or other abstract knowledge, and we often draw general conclusions from a single experience, even when we know we shouldn't. Human reasoning is powerfully affected by concrete images, by illustrative anecdotes, and by memorable experiences. Of course, artificial intelligence is not psychology: Our goal is not to mimic human thought and behavior in minute detail. Nevertheless, such a pervasive characteristic of human thought cannot simply be ignored as if it were prima facie irrelevant to our own work. Although I can't imagine that anyone in AI seriously disagrees with the proposition that there are probably sound functional reasons why human thinking is the way it is, the point nevertheless often seems to need repeating. The role of concrete cases in reasoning is something that many of us think is an important piece of the puzzle of both artificial and human intelligence; it is also a good example of the kind of question that logicists never seem to address.

Of course, it is not necessarily fatal to the logicist enterprise that it addresses only a portion of the problems involved in artificial intelligence: Who would have expected otherwise? The answer, I'm afraid, is the logicists themselves. Despite Nilsson's rather sensible observation that "successful AI systems of the future will probably draw on a combination of techniques ...." logicists do not seem to view logicism as just one approach among many: They see it as the universal framework in terms of which everything else in AI must ultimately be expressed. For example, in his response to McDermott's [19] "A critique of pure reason", Nilsson [22] asserts that "While all AI researchers would acknowledge the general importance of procedures and procedural knowledge (as distinguished from declarative knowledge), they would seem to have no grounds for a special claim on those topics as a subset of computer science." In other words, in Nilsson's view AI is to be distinguished as a sub-area of computer science in general not by the problems it investigates--language understanding, learning, vision, planning, and so on--but by the methods it uses, and in fact by one and only one aspect of those methods, the use of declarative representations. Since Nilsson further makes it clear that in his view the use of declarative representations must, ultimately, entail embracing all of the apparatus of logic, the implication of this assertion is fairly obvious: Anything that doesn't fit the logicist paradigm--visual perception, goals, memory and indexing, attention, emotions, the control of physical movement, and so on--may be very nice computer science, thank you, but it isn't AI. This is not, it should be clear, a scientific proposition, but rather a political one, and as such its role in our field deserves careful scrutiny.

In addition to these sorts of political concerns, however, there are also good technical reasons for doubting the utility--or at the very least, the special utility--of logic, strictly speaking, as a framework for knowledge representation. I say "strictly speaking" because, in a broad sense, any computational scheme for knowledge representation and reasoning could be considered some form of logic, even connectionism. Moreover, most work in AI shares the logicist commitment to the centrality of explicit symbolic representations in mental processes. How then does logicism differ from other approaches to AI? In my view, it is distinguished by two additional commitments.

The first is its emphasis on sound, deductive inference, in the belief--for the most part implicit--that such inference plays a privileged role in mental life (see [19]). As a result of this emphasis, logicists tend to ignore other sorts of reasoning that seem quite central to intelligent behavior--probabilistic reasoning, reasoning from examples or by analogy, and reasoning based on the formation of faulty but useful conjectures and their subsequent elaboration and debugging, to name a few--or else, attempt (unsuccessfully, in my view) to re-cast plausible inference of the simplest sort as some species of sound inference.

The second distinguishing feature of logicism is the presumption that model-theoretic "semantics" is somehow central to knowledge representation in AI.¹ The primary justification for this presumption is its putative explanatory role, namely, that it is necessary in order to correctly characterize what it means for an agent to know or believe something, and thus to specify precisely what a given agent knows or believes. This, I take it, is Nilsson's argument.

Towards this claim, non-logicists seem to be of two minds: Some disagree with it--and I will explain why later in this paper--while others just don't see what difference it makes. In the face of the latter reaction--indifference--logicists often take another tack, and attempt to justify their preoccupation with model-theoretic "semantics" on methodological, rather than explanatory, grounds. In this vein, they stress its utility as a heuristic aid in solving knowledge representation problems, or to enable the AI researcher to prove things about his program, and so on. Whether the things that can be so proven are of any particular interest is debatable (see [4] for some arguments as to why proving that programs meet certain formal specifications is unlikely to be either possible or useful in software engineering; it seems doubtful that AI systems will prove more tractable in this regard). Whether such a "semantics" is in fact heuristically useful in solving representation problems is more difficult to debate, since that is a matter of personal taste. People should, obviously, work in the manner that they find most congenial and productive.

¹ I place scare quotes around "semantics" in this sense to emphasize that in logic this is a technical term, and that the theory to which it refers may or may not turn out to be a correct account of the meaning of mental representations.

2. The good news

The above criticisms notwithstanding--and I will take them up in greater detail shortly--it cannot be denied that logic and logicists have contributed a great deal to AI. Perhaps the logicists' most important contribution has been to focus attention on the fact that it is the content of our knowledge, and the concepts in terms of which that knowledge is expressed--what logicists refer to as "ontological" issues--that lie at the heart of our ability to think. Their arguments have played a key role in legitimizing the study of representations from the perspective of how well they capture certain contents, independently of the details of the particular processes that might employ them. Hayes's [11] "Naive physics manifesto", in particular, is a persuasive and historically important defense of this idea. As Nilsson puts it, "The most important part of "the AI problem" involves inventing an appropriate conceptualization ...."

In fairness, however, the credit for this insight cannot be assigned solely to the logicists: Very much the same point can and has been made without any special commitment to logic, for example by Feigenbaum [6], Schank and Abelson [27], and Newell [21], among others. Moreover, as Nilsson acknowledges, "the [logicist] approach to AI carries with it no special insights into what conceptualizations to use."

The technical apparatus of logic itself--a clear and unambiguous syntax, the "ability to formulate disjunctions, negations, and universally and existentially quantified" expressions--indisputably plays an important role in AI, as does the technology of mechanical theorem proving. Although I am sympathetic with the view that it is a mistake to attempt to embed "scruffy" thinking in "neat" systems²--and that the real question is how "neat" thinking can emerge from a "scruffy" system--expressive apparatus of the sort provided by logic seems indispensable, especially when it comes to representing abstract concepts.³ On the other hand, Nilsson's implicit criticism of knowledge representation schemes that lack this technical apparatus probably misses the point. Much research that might be criticized on these grounds has simply been directed towards other issues, primarily issues of content and conceptual vocabulary, or of memory organization and efficient access for a particular set of tasks.

² See [1] for an enlightening discussion of these terms.
³ Which aren't necessarily very technical or abstruse: Try representing the concept of "helping" in propositional logic.

Finally, there can be no question that logicists have led the battle for declarative representation in AI, a battle that they have largely won--though again, not entirely single-handedly. A particularly compelling argument, attributed by Nilsson to McCarthy, is that declaratively represented knowledge "[can] be used by the machine even for purposes unforeseen by the machine's designer ...." Putting this in somewhat different terms, there can be no question that cross-domain and -purpose application of knowledge is an important functional constraint on representations, and that declarative representations seem to meet this requirement better than anything else we know of. However, putting it this way makes it clear that what is at stake here is a type of learning--the ability to apply knowledge acquired in one situation, for one purpose, to other, different situations and purposes. Yet, rather oddly, Nilsson, and logicists generally, pay no attention to this issue. There are several reasons for this, but the main one seems to be their attachment to sound inference and model-theoretic "semantics". For example, in his discussion of inference, Nilsson argues that

Often ... the new sentence constructed from ones already in memory does not tell us anything new about the world. All of the models of [the sentences already in memory] are also models of [the new sentence]. Thus, adding [the new sentence] to [memory] does not reduce the set of models. What the new sentence tells us was already implicitly said by the sentences from which it was constructed.

Indeed, logicists sometimes go so far as to assert that sound inference cannot, in principle, generate any new knowledge. On this account of what knowledge is, or of what makes it "new", the problem of applying lessons learned in one domain for one purpose to other domains and other purposes doesn't exist, because it is already solved. Unfortunately, if this isn't true--and it strikes me as a rather dubious proposition on which to bet a research program--then we must conclude that the problem cannot even be properly characterized within the logicist framework. However, if the problem of cross-domain and -purpose application of knowledge cannot even be characterized within the logicist framework, then we have good reason to doubt that logic, as construed within that framework, in fact appropriately addresses the functional issues raised by the problem. Since, as Nilsson himself argues, this problem provides the fundamental justification for declarative representations in the first place, the logicists have some explaining to do.
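To make the quoted point concrete, here is a minimal worked sketch of the sort of deduction logicists have in mind; the predicates, constant, and knowledge base are my own illustration rather than anything drawn from Nilsson's paper:

```latex
% My illustration, not Nilsson's: a sound deduction adds a sentence
% without eliminating any model of the knowledge base.
\begin{align*}
\mathrm{KB} &= \{\,\forall x\,(\mathit{Box}(x)\supset\mathit{Green}(x)),\ \mathit{Box}(b)\,\}\\
\mathrm{KB} &\vdash \mathit{Green}(b)
   \quad\text{(universal instantiation, modus ponens)}\\
\mathrm{Mod}\bigl(\mathrm{KB}\cup\{\mathit{Green}(b)\}\bigr) &= \mathrm{Mod}(\mathrm{KB})
   \quad\text{(no model is eliminated)}
\end{align*}
```

In this technical sense the deduced sentence adds nothing that was not "already implicitly said"; the question raised above is whether that sense does justice to the kind of learning involved in applying knowledge acquired for one purpose to another.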

3. The bad news

What drives logicists to adopt the obviously unrealistic position that inference does not change what an agent knows? It is their devotion to model-theoretic "semantics". And what motivates this devotion? Nilsson puts it as follows:

Those designers who would claim that their machines possess declarative knowledge about the world are obliged to say something about what that claim means. The fact that a machine's knowledge base has an expression in it like (∀x)Box(x) ⊃ Green(x), for example, doesn't by itself justify the claim that the machine believes all boxes are green.

This is uncontroversial, as far as it goes. The leap of faith in the logicist program is the presumption that, in saying something about what it means for a machine to have beliefs, AI is obliged to reiterate a theory of how logical symbols are to be interpreted, developed over the last century in logic and mathematics for fundamentally different purposes--in particular, for proving the soundness and completeness of inference methods. Nilsson offers no argument why this should be so. But McDermott [19] is a bit more open about how the logicists arrived at this position: "The notation we use ... must have a semantics; so it must have a Tarskian semantics, because there is no other candidate"--or to put it another way: "You have a better theory?"

I must admit that I do not have a better theory--at least, not one that would satisfy the logicists. But the absence of an alternative theory does not make a bad theory good. I find it difficult to understand the zeal with which logicists embrace and defend a theory that has so many problematic implications. Trying to define "knowledge" and "belief" at our current stage in theorizing about the mind is like biologists trying to define "life" a hundred years ago. Rather than seeing this as a complicated puzzle to be resolved by artificial intelligence and other cognitive sciences as they progress, logicists assume that the question has a simple, definitive answer, that logic has provided this definitive answer, and that all AI has to do is work out the details.

The obvious alternative to a model-theoretic "semantics" is a functional semantics, based on the idea that representations get their meaning by virtue of their causal role in the mental processes of the organism, and ultimately, in perception and action. On such a view the meaning of a term is not tied to the inferences that it could in principle license, but to those that it actually licenses in practice. The concept "prime number" does not mean the same thing to me as it does to a number theorist; and its meaning for me would change if I studied some number theory. The problem with such an approach, from a logicist perspective, is that a theory of meaning based on functional role doesn't allow for a specification of what an organism knows independently of what it does. This is exactly right. Moreover, such a situation does not preclude applying the same knowledge in different domains, and for different purposes. Indeed, it seems clear that Nilsson's chief argument for the logicist position is based on a false dichotomy: Just because knowledge cannot be specified in complete independence of use doesn't mean that it can't be specified in a way that enables its application to many different uses.

But for logicists, it isn't sufficient that the ability to apply the same knowledge in different domains, and for different purposes, be a functional constraint on mental representations--something that is, all other things being equal, to be desired. Such a formulation implies that generality of this sort might be involved in "grubby" engineering trade-offs (to use McDermott's colorful description), and this is exactly what logicism is trying to escape in the first place. As Dennett [5] cannily observes, logicism is the chief representative, within AI, of a belief in the "dignity and purity of the Crystalline Mind," and of the concomitant notion that psychology must be more like physics than like engineering or biology.
The upshot is that logicists believe there is no alternative to embracing model-theoretic "semantics" for mental representations. The major stumbling block in any straightforward application of this approach is, as has often been noted, consistency--or more precisely the lack of it (see, e.g., [12, 20]).⁴ If the beliefs of an organism are inconsistent, then there is no model of those beliefs. This is problematic for a number of reasons. First, it means that the content of the organism's knowledge, which Nilsson asserts should be characterized as the "intended" model of the sentences representing that knowledge, cannot in fact be so characterized: In the technical sense, there are no such models. It follows that whatever the relationship between the organism's representations and the content they express, that relationship cannot be described by a model-theoretic "semantics". The second problem, of course, is that if the inference processes of the organism are to be construed as some form of logical deduction, then if its beliefs are inconsistent, anything at all can be deduced. Because its beliefs have no model, all models of its beliefs are also models of any other belief it might entertain.

Inconsistency poses such a severe problem for model-theoretic "semantics" because of its extremely "holistic" nature, in that the meaning of an individual symbol in a logical theory is determined by the set of models consistent with the entire theory in which it appears. As Hayes [11] puts it, "a token [in a formal theory] means a concept if, in every possible model of the formalisation taken as a whole, that token denotes an entity which one would agree was a satisfactory instantiation of the concept in the possible state of affairs represented by the model." There is a certain intuitive appeal to this holism; indeed, in any functional approach to semantics, the meaning of a symbol similarly depends upon the entire system in which it is embedded. The problem is that model-theoretic "semantics" takes the holism of meaning to its extreme: Either a theory is completely consistent, or it has no models, and hence no way to determine meaning at all. A knowledge base with a single bug t