AI Magazine Volume 26 Number 4 (2006)(2005) (© AAAI) 25th Anniversary Issue If Not Turing’s Test, Then What?

Paul R. Cohen

■ If it is true that good problems produce good sci- Ignore It, and Maybe ence, then it will be worthwhile to identify good It Will Go Away… problems, and even more worthwhile to discover the attributes that make them good problems. This Blay Whitby (1996) offers this humorous histo- discovery process is necessarily empirical, so we ex- ry of the Turing test: amine several challenge problems, beginning with 1950–1966: A source of inspiration to all con- Turing’s famous test, and more than a dozen attrib- cerned with AI. utes that challenge problems might have. We are 1966–1973: A distraction from some more led to a contrast between research strategies—the promising avenues of AI research. successful “divide and conquer” strategy and the promising but largely untested “developmental” 1973–1990: By now a source of distraction strategy—and we conclude that good challenge mainly to philosophers, rather than AI workers. problems encourage the latter strategy. 1990: Consigned to history. Perhaps Whitby is right, and Turing’s test should be forgotten as quickly as possible and should not be taught in schools. Plenty of peo- ple have tried to get rid of it. They argue that Turing’s Test: The First Challenge the test is methodologically flawed and is based in bad philosophy, that it exposes cultural bias- More than fifty years ago, pro- es and naïveté about what Turing calls the posed a clever test of the proposition that ma- “programming” required to pass the test. Yet chines can think (Turing 1950). He wanted the the test still stands as a grand challenge for ar- proposition to be an empirical, one and he par- tificial , it is part of how we define ticularly wanted to avoid haggling over what it ourselves as a field, it won’t go away, and, if it means for anything to think. did, what would take its place? We now ask the question, ‘What will happen Turing’s test is not irrelevant, though its role when a machine takes the part of [the man] in has changed over the years. Robert French’s this game?’ Will the interrogator decide wrong- (2000) history of the test treats it as an indica- ly as often when the game is played like this as tor of attitudes toward AI. French notes that he does when the game is played between a among AI researchers, the question is no man and a woman? These questions replace our longer, “What should we do to pass the test?” original, “Can machines think?” but, “Why can’t we pass it?” This shift in atti- More recently, the test has taken slightly differ- tudes—from hubris to a gnawing worry that AI ent forms. Most contemporary versions ask is on the wrong track—is accompanied by an- simply whether the interrogator can be fooled other, which, paradoxically, requires even more into identifying the machine as human, not encompassing and challenging tests. The test is necessarily a man or a woman. too behavioral—the critics say—too oriented to There are many published arguments about language, too symbolic, not grounded in the Turing’s paper, and I want to look at three kinds physical world, and so on. We needn’t go into of argument. One kind says Turing’s test is irrel- the details of these arguments to see that Tur- evant; another concerns the philosophy of ma- ing’s test continues to influence the debate on chines that think; the third is methodological. what AI can or should do.

Copyright © 2005, American Association for Artificial Intelligence. All rights reserved. 0738-4602-2005 / $2.00 WINTER 2005 61 25th Anniversary Issue

There is only one sense in which Turing’s test good, in the sense of helping AI researchers is irrelevant: almost nobody thinks we should make progress. Turing’s test has some of these devote any effort in the foreseeable future to good attributes, as well as some really bad ones. trying to pass it. In every other sense, as a his- The one thing everyone likes about the Tur- torical challenge, a long-term goal for AI, a ing test is its proxy function, the idea that the philosophical problem, a methodological case test is a proxy for a great many, wide-ranging study, and an indicator of attitudes in AI, the intellectual capabilities. Dennett puts it this Turing test remains relevant. way: “Nothing could possibly pass the Turing test by Turing the Philosopher winning without being able Would Turing very much that his test no to perform indefinitely many other intelligent longer has the role he intended? If we take Tur- actions. … [Turing’s] test was so severe, he ing at his word, then it is not clear that he ever thought, that nothing that could pass it fair and intended his test to be attempted: square would disappoint us in other quarters.” (Dennett 1998) There are already a number of digital computers in working order, and it may be asked, ‘Why No one in AI claims to be able to cover such not try the experiment straight away?…’ The a wide range of human intellectual capabilities. short answer is that we are not asking whether We don’t say, for instance, “Nothing could pos- all digital computers would do well in the game sibly perform well on the UCI machine learn- nor whether the computers at present available ing test problems without being able to per- would do well, but whether there are imagin- form indefinitely many other intelligent able computers which would do well. actions.” Nor do we think word sense disam- Daniel Dennett thinks Turing intended the biguation, obstacle avoidance, image segmen- test as “a conversational show-stopper,” yet the tation, expert systems, or beating the world philosophical debate over Turing’s test is ironi- chess champion are proxies for indefinitely I am cally complicated. As Dennett says, “Alas, many other intelligent actions, as Turing’s test confident that philosophers—amateur and professional— is. It is valuable to be reminded of the breadth if we pose the have instead taken Turing’s proposal as a pre- of human intellect, especially as our field frac- text for just the sort of definitional haggling tures into subdisciplines, and I suppose one right sorts of and interminable arguing about imaginary methodological contribution of Turing’s test is challenges, counterexamples he was hoping to squelch” to remind us to aim for broad, not narrow com- (Dennett 1998). petence. However, many find it easier and then we will Philosophers wouldn’t be interested if Turing more productive to specialize, and, even make good hadn’t been talking about intentional attributes though we all know about Turing’s test and of machines—beliefs, goals, states of knowl- many of us consider it a worthy goal, it isn’t progress in AI. edge, and so on—and because we in AI are enough to encourage us to develop broad, gen- about building machines with intentional at- eral AI systems. tributes, philosophers will always have some- So in a way, the Turing test is impotent: It thing to say about what we do. However, even has not convinced AI researchers to try to pass if the preponderance of philosophical opinion it. Paradoxically, although the proxy function was that machines can’t think, it probably is the test’s most attractive feature, it puts the wouldn’t affect the work we do. Who among us cookie jar on a shelf so high that nobody reach- would stop doing AI if someone proved that es for it. Indeed, as Pat Hayes and Ken Ford machines can’t think? I would like to know point out, “The Turing Test is now taken to be whether there is life elsewhere in the universe; simply a rather fancy way of stating that the I think the question is important, but it doesn’t goal of AI is to make an artificial human being” affect my work, and neither does the question (Hayes and Ford 1995). of whether machines can think. Consequently, A second notable methodological failing of at least in this article, I am unconcerned with Turing’s test is that it pushes many aspects of philosophical arguments about whether ma- intelligence into one test that has a yes or no chines can think. answer. This isn’t necessary. We could follow the lead of the multiple move- Turing’s Test as Methodology ment in cognitive and devise tests Instead I will focus on a different, entirely of different sorts of intelligence. In fact, Tur- methodological question: Which attributes of ing’s test is not even very complete, when tests for the intentional capabilities of machines viewed in terms of, say, Howard Gardner’s cat- lead to more capable machines? I am confident alog of intelligences (Gardner 1983). It focused that if we pose the right sorts of challenges, mostly on logical, linguistic, and interpersonal then we will make good progress in AI. This ar- intelligence, not on intrapersonal, bodily- ticle is really about what makes challenges kinesthetic, naturalist, musical, and visual-spa-

62 AI MAGAZINE 25th Anniversary Issue tial intelligence (rounding out the eight in sidered. First, it is all or nothing: it gives no in- Gardner’s catalog). dication as to what a partial success might look Robert French goes further and criticizes the like. Second, it gives no direct indications as to test for its focus on culturally oriented human how success might be achieved” (Whitby intelligence: “The Test provides a guarantee not 1996). And Dennett notes the asymmetry of of intelligence but of culturally-oriented hu- the test: “Failure on the Turing test does not man intelligence” (French 2000). The test also predict failure on ... others, but success would says nothing about neonatal or infant intelli- surely predict success” (Dennett 1998). At- gence—which I think are worth tempting the test is a bit like failing a job inter- and emulating. In fact, to the extent that Tur- view: Were my qualifications suspect? Was it ing provided an operational definition of intel- something I said? Was my shirt too garish? All ligence with his test, it was a kind of middle- I have is a rejection letter—the same content- class English intellectual, midcentury dinner free letter that all but one other candidate party kind of intelligence and represents nei- got—and I have no idea how to improve my ther the pinnacle nor the broad plain of human chances next time. intellectual capability. So let’s recognize the Turing test for what it A third failing of the test is that it cannot be is: A goal, not a test. Tests are diagnostic, and passed today. Tests should be challenging, but specific, and predictive, and Turing’s test is nei- tests that cannot be passed provide no informa- ther of the first two and arguably isn’t predic- tion. Periodically the organiza- tive, either. Turing’s test is not a challenge like tion conducts a sort of Turing test. Consider going to the moon, because one can see how to the following brief snippet of a dialogue I had get to the moon and one can test progress at with one of the entrants: every step along the way. The main functions PC: Where is your nose? of Turing’s test are these: To substitute tests of behavior for squabbles about definitions of in- Entrant: Oh, my nose? Why are you so interest- ed? telligence, and to remind us of the enormous breadth of human intellect. The first point is So let’s PC: Is it near your eyes? accepted by pretty much everyone in the AI recognize Entrant: Such is Human Perversity. community, the second seems not to withstand the Turing PC: Is your nose between your mouth and your the social and academic pressure to specialize. eyes? So now we must move on to other tests, test for what Entrant: I can’t tell you that. which, I hope, have fewer methodological it is: A goal, flaws; tests that work for us. PC: How many hands do I have? not a test. Entrant: I don’t understand why you have, and furthermore, I don’t really care. New Challenges PC: My daughter’s name is Allegra. What is my daughter’s name? Two disclaimers: First, artificial intelligence and do not lack challenge prob- Entrant: Is she a big part of your problem? lems, nor do we lack the imagination to pro- It takes only a moment to see I am not con- vide new ones. This section is primarily about versing with a human being. My point isn’t to attributes of challenge problems, not about the make fun of the people who try to win the problems, themselves. Second, assertions about Loebner Prize, nor do I think this snippet is the the utility or goodness of particular attributes best that we can do with today’s technology. are merely conjectures and are subject to em- My point is that even the very best technology pirical review. Now I will describe four prob- in AI today would not bring us anywhere close lems that illustrate conjectured good attributes to passing the Turing test, and this has a very of challenge problems. bad consequence: Few AI researchers try to pass the test. Challenge 1: Robot Soccer Said more positively, a good test is only Invented by Alan Mackworth in the early 1990s slightly out of reach, and the path to success is to challenge the simplifying assumptions of at least partly clear. good old-fashioned AI (Mackworth 1993), ro- Not only is Turing’s goal remote, but at- bot soccer is now a worldwide movement. No tempts to pass his test are not diagnostic: They other AI activity has involved so many people don’t tell us what to do to pass the test next at universities, corporations, primary and sec- time. Blay Whitby puts it this way: “If the Tur- ondary schools, and members of the public. ing test is read as something like an opera- What makes robot soccer a good challenge tional definition of intelligence, then two very problem? Clearly the problem itself is exciting, important defects of such a test must be con- the competitions are wild, and students stay up

WINTER 2005 63 25th Anniversary Issue

late working on their hardware and software. also would be responsible for scoring the es- Much of the success of the robot soccer move- says. ment is due to wise early decisions and contin- As a challenge problem, Handy Andy has uing good management. The community has a several good attributes, some of which it shares clear and easily stated fifty-year goal: to beat with robot soccer. Turing’s test requires simul- the human world champion soccer team. Each taneous achievement of many cognitive func- year, the community elects a steering commit- tions and doesn’t offer partial credit to subsets tee to moderate debate on how to modify the of these functions. In contrast, robot soccer rules and tasks and league structure for the presents a graduated series of challenges: it gets coming year’s competition. It is the responsibil- harder each year but is never out of reach. The ity of this committee to steer the community same is true of the Handy Andy challenge. In toward its ultimate goal in manageable steps. the first year, one might expect weak compre- The bar is raised each year, but never too high; hension of the query, minimal understanding for instance, this year there will be no special of web pages, and reports merely cobbled to- lighting over the soccer pitches. gether from online sources. Later, one expects From the first, competitions were open to all, better comprehension of queries and web and the first challenges could be accomplished. pages, perhaps a clarification dialog with the The cost of entry was relatively low: those who user, and some organization of the report. had robots used them, those who didn’t played Looking further, one envisions strong compre- in the simulation league. The first tabletop hension and not merely assembly of reports games were played on a misshapen pitch—a but some original writing. The first level is common ping-pong table—so participants within striking distance of current information would not have to build special tables. Al- retrieval and text summarization methods. Un- A defining though robotic soccer seems to offer an endless like the Turing test—an all-or-nothing chal- series of research challenges, its evaluation cri- lenge of heroic proportions—we begin with feature of the terion is familiar to any child: win the game! technology that is available today and proceed Handy Andy The competitions are enormously motivating step-by-step toward the ultimate challenge. and bring in thousands of spectators (for exam- Because a graduated series of challenges be- challenge, one ple, 150,000 at the 2004 Japan Open). Two gins with today’s technology, we do not require it shares with hundred Junior League teams participated in a preparatory period to build prerequisites, the Lisbon competition, helping to ensure ro- such as sufficient commonsense knowledge Turing’s test, botic soccer’s future. bases or unrestricted natural language under- is its It isn’t all fun and games: RoboCup teams are standing. This is a strong methodological point universal encouraged to submit technical papers to a because those who wait for prerequisites usual- symposium. The best paper receives the Robo- ly cannot predict when they will materialize, scope. Cup Scientific Challenge Award. and in AI things usually take longer than ex- pected. The approach in Handy Andy and ro- Challenge 2: Handy Andy bot soccer is to come as you are and develop new As ABC News recently reported, people find in- technology over the years in response to in- genious ways to support themselves in college: creasingly stringent challenges. “For the defenders of academic integrity, their The five-page requirement of the Handy nemesis comes in the form of a bright college Andy challenge is arbitrary—it could be three student at an Eastern university with a 3.78 pages or ten—but the required length should GPA. Andy—not his real name—writes term be sufficient for the system to make telling mis- papers for his fellow students, at rates of up to takes. A test that satisfies the ample rope require- $25 a page.” ment provides systems enough rope to hang Here, then, is the Handy Andy challenge: themselves. The Turing test has this attribute Produce a five-page report on any subject. One can and so does robot soccer. administer this test in vivo, for instance, as a A defining feature of the Handy Andy chal- service on the World Wide Web; or in a compe- lenge, one it shares with Turing’s test, is its uni- tition. One can imagine a contest in which ar- versal scope. You can ask about the poetry of tificial agents go against invited humans—stu- Jane Austen, how to buy penny stocks, why the dents and professionals—in a variety of leagues druids wore woad, or ideas for keeping kids or tracks. Some leagues would be appropriate busy on long car trips. Whatever you ask, you for children. All the contestants would be re- get five pages back. quired to produce three essays in the course of, The universality criterion entails something say, three hours, and all would have access to about evaluation: we would rather have a sys- the web. The essay subjects would be designed tem produce crummy reports on any subject with help from education professionals, who than excellent reports on a carefully selected,

64 AI MAGAZINE 25th Anniversary Issue narrow range of subjects. Said differently, the bilities that, collectively, are required for Tur- challenge is first and foremost to handle any ing’s test. Howard Gardner’s inventory of mul- subject and only secondarily to produce excel- tiple intelligences is one place to look for these lent reports. If we can handle any subject, then capabilities. However, it isn’t clear how to test we can imagine how a system might improve whether machines have them. Another place the quality of its reports. On the other hand, to look is elementary school. Every third-grader half a century of AI engineering leaves me skep- is expected to master the skills in table 1. All of tical that we will achieve the universality crite- them can be tested, although some tests will in- rion if we start by trying to produce excellent volve subjective judgments. Here is what my reports about a tiny selection of subjects. It’s daughter wrote for her “convincing letter” as- time to grasp the nettle and go for all subjects, signment: even if we do it poorly. Dear Disney, The web already exists, already has near uni- It disturbs me greatly that in every movie you versal coverage, so we can achieve the univer- make with a dragon, the dragon gets killed by a sality criterion by making good use of the knight. Please, if you could change that, it knowledge the web contains. Our challenge is would be a great happiness to me. The Dragon not to build a universal knowledge base but to is my school mascot. The dragon isn't really make better use of the one that already exists. bad, he/she is just made bad by the villan [sic]. The dragon is not the one who should be killed. The challenge Challenge 3: Never-Ending For example, Sleeping Beauty, the dragon is un- itself Language Learning der the villaness's [sic] power, so it is not neccis- ariliy [sic] bad or evil. Please change that. should test Proposed by Murray Burke in 2002, this chal- Your sad and disturbed writer, lenge takes up a theme of Lenat and Feigen- important baum’s (1987) paper “On the Thresholds of Allegra. cognitive Knowledge.” That paper suggested knowledge- Although grading these things is subjective, based systems would eventually know enough there are many diagnostic criteria for good let- functions. to read online sources and, at that point, would ters: The author must assert a position (stop It should “go critical” and quickly master the world’s killing the dragons) and reasons for it (the drag- knowledge. There are no good estimates of on is my school mascot, and dragons aren’t in- emphasize when this might happen. Burke’s proposal was trinsically bad). Extra points might be given for comprehen- to focus on the bootstrapping relationship be- tact, for suggesting that the recipient of the let- tween learning to read and reading to learn. ter isn’t malicious, just confused (the dragon sion, We always must worry that challenge prob- isn’t the one who should be killed, you got it semantics, lems reward clever engineering more than sci- wrong, Disney!) and entific research. Robot soccer has been criti- knowledge. cized on these grounds. Among its many Criteria for Good Challenges positive attributes, never-ending language It should learning presents us with some fascinating sci- You, the reader, probably have several ideas for entific hypotheses. One states that we have challenge problems. Here are some practical require done enough work on the semantics of a core suggestions for refining these ideas and making problem of English to bootstrap the acquisition of the them work on a large scale. The success of ro- solving. whole language. Another hypothesis is that bot soccer suggests starting with easily under- learning by reading provides sufficient infor- stood long-term goals (such as beating the hu- mation to extend an ontology of concepts and man world soccer team) and an organization so drive the bootstrapping. Both hypotheses whose job is to steer research and development could be wrong; for example, some people in the direction of these goals. The challenge think that the meanings of concepts must be should be administered frequently, every few grounded in interaction with the physical weeks or months, and the rules should be world and that no amount of reading can make changed at roughly the same frequency to dri- up for a lack of grounding. In any case, it is ve progress toward the long-term goals. worth knowing whether one can learn what The challenge itself should test important one needs to understand text from text itself. cognitive functions. It should emphasize com- prehension, semantics, and knowledge. It Challenge 4: The Virtual Third Grader should require problem solving. It should not One answer to the question, “if not the Turing “drop the user at approximately the right loca- test, then what?” was suggested by David Gun- tion in information space and leave him to ning in 2004: If we cannot pass the Turing test fend for himself,” as once today, then perhaps we should set up a “cogni- put it. tive decathlon” or “qualifying trials” of capa- A good challenge has simple success criteria.

WINTER 2005 65 25th Anniversary Issue

stead of subjectively, slowly, and by hand. The challenge should have a kind of monot- onicity to it, allowing one to build on previous Understand and follow instructions work in one’s own laboratory and in others’. This “no throwaways” principle goes hand-in- Communicate innaturallanguage hand with the idea of a graduated series of (for example, dialog) challenges, each slightly out of reach, each pro- Learnandexercise procedures viding ample rope for systems to hang them- (for example, long division, outlining a report) selves, yet leading to the challenge’s long-term goals. It follows from these principles that the Read for content challenge itself should be easily modified, by (for example, show that one gets the main pointsofastory) changing rules, initial conditions, require- Learnbybeing told ments for success, and so on. (for example, life was hard for the pioneers) A successful challenge captures the hearts and of the research community. Popular Common sense inference games and competitions are good choices, pro- (for example, few people wanted tobepioneers) vided that they require new science. The cost of and learning from commonsense inference entry should be low; students should be able to scrape together sufficient resources to partici- Understand math storyproblems and solve them correctly pate, and the organizations that manage chal- Master a lot of facts (math facts, historyfacts,andsoon). lenges should make grants of money and Masterymeansusing the facts to answer questions equipment as appropriate. All participants and solve problems. should share their technologies so that new participants can start with “last year’s model” Prioritize and have a chance of doing well. (for example, choose one book over another, In addition to these pragmatic and, I expect, decide which problems todoonatest) uncontroversial suggestions, I would like to Explainsomething suggest three others which are not so obviously (for example, why plants need light) right. First, Turing proposed his test to answer the Makeaconvincing argument question “Can machines think?” but this does (for example, why recess should be longer) not mean a challenge for AI must provide evi- Make upandwrite a storyaboutanassigned subject dence for or against the proposition that com- (for example, Thanksgiving) puters have intentional states and behaviors. I do not think we have any chance of testing this proposition. There are no objective characteri- zations of human intentional states, and the states of machines can be described in many ways, from the states of registers up to what Newell called the knowledge level. It is at least technically challenging and perhaps impossi- Table 1. Third-Grade Skills (thanks to Carole Beal). ble to establish correspondences between ill- specified human intentional states and ma- chine states, so the proposition that machines “have” intentional states probably cannot be tested. Perhaps the most we can require of chal- lenge problems is that they include tasks that humans describe in intentional terms. Second, in any given challenge, we should However an attempt is scored, one should get accept poor performance but insist on univer- specific, diagnostic feedback to help one under- sal coverage. I admit that it is hard to define stand exactly what worked and what didn’t. universal coverage, but examples are easily Scoring should be transparent so one can see found or imagined: Reading and comprehend- exactly why the attempt got the score it did. If ing any book suitable for five year olds; produc- possible, scoring should be objective, automat- ing an expository essay on any subject; going ic, and easily repeated. For instance, the ma- up the high street to several stores for the chine community experienced a week’s shopping; playing Trivial Pursuit; creat- jump in productivity once could ing a reading list for any undergraduate essay be scored automatically, sometimes daily, in- subject; learning classifiers for a thousand data

66 AI MAGAZINE 25th Anniversary Issue sets without manually retuning the Conclusion References learner’s parameters for each; playing Dennett, D. 1998. Can Machines Think? In In answer to the question, “if not the any two-person strategy game well Brainchildren, Essays on Designing Minds. with minimal training; beating the Turing Test, then what,” AI researchers Cambridge, MA: MIT Press. haven’t been sitting around waiting world champion soccer team. Each of French, R. 2000. The Turing Test: The First these problems requires a wide range for something better; they have been Fifty Years. Trends in Cognitive Sciences 4(3): of capabilities, or has a great many very inventive. There are challenge 115–121. nonredundant instances, or both. One problems in planning, e-commerce, Gardner, H. 1983. Frames of Mind: The The- could not claim success by solving on- knowledge discovery from databases, ory of Multiple Intelligences. New York: Basic ly a part of one of these problems or , game playing, and numerous Books. only a handful of possible problem in- competitions in aspects of natural lan- Hayes, P., and Ford, K. 1995. Turing Test stances. What should we call a pro- guage. Some are more successful or en- Considered Harmful. In Proceedings of the gram that plays chess brilliantly? His- gaging than others, and I have dis- Fourteenth International Joint Conference on Artificial Intelligence, p. 972. San Francisco: tory! What should we call a program cussed some attributes of problems Morgan Kaufmann Publishers. that plays any two-person strategy that might explain these differences. Lenat, D. B., and Feigenbaum, E. A. 1987. game, albeit poorly? A good start! A My goal has been to identify attributes of good challenge problems so that we On the Thresholds of Knowledge. In Pro- program that analyzes the plot of can have more. Many of these efforts ceedings of the Tenth International Joint Con- Romeo and Juliet? History! A program are not supported directly by govern- ference on , 1173–1182. that summarizes the plot of any chil- San Francisco: Morgan Kaufmann Publish- ment, they are the efforts of individu- dren’s book, albeit poorly? A good ers. als and volunteers. Perhaps you can start! Poor performance and universal Mackworth, A. 1993. On Seeing Robots. In see an opportunity to organize some- scope are preferred to good perfor- : Systems, Theory and Appli- thing similar in your area of AI. mance and narrow scope. cations, ed. A. Basu and X. Li, 1–13. Singa- My third and final point is related to Acknowledgments pore: World Scientific Press, 1993. Reprint- the last one. Challenge problems ed in Mind Readings, ed. P. Thagard. This article is based on an invited talk Cambridge, MA: MIT Press, 1998. should foster what I’ll call a develop- at the 2004 National Conference on mental research strategy instead of the Turing, A. 1950. Computing Machinery Artificial Intelligence. While preparing and Intelligence. Mind 59(236):433–460. more traditional and generally suc- the talk I asked for opinions and sug- cessful divide and conquer strategy. The Whitby, B. R. 1996. The Turing Test: AI’s gestions from many people who wrote Biggest Blind Alley? In Machines and word developmental reminds us that sometimes lengthy and thoughtful as- Thought: The , Vol. 1., children do many things poorly, yet sessments and suggestions. In particu- ed. P. Millican and A. Clark. Oxford: Oxford they are complete, competent agents lar, Manuela Veloso, Yolanda Gil, and University Press. (www.informatics.sus- who learn from each other, and adults, Carole Beal helped me very much. sex.ac.uk/users/blayw/tt.html.) and books, and , and play- One fortunate result of the talk was Paul Cohen is with the ing, and physical maturation, and oth- that Edward A. Feigenbaum disagreed er ways, besides. In children we see Intelligent Systems Divi- with some of what I said and told me sion at USC's Informa- gradually increasing competence so. Our discussions helped me better tion Sciences Institute across many domains. In AI we usually understand some issues and refine and he is a Research Pro- see deep competence in narrow do- what I wanted to say. The Defense Ad- fessor of Computer Sci- mains, but there are exceptions: robot- vanced Research Projects Agency ence at USC. He received ic soccer teams have played soccer (DARPA), especially Ron Brachman, his Ph.D. from Stanford every year since the competitions be- David Gunning, Murray Burke, and University in computer science and psychology in 1983 and his gan. If the organizers had followed the Barbara Yoon, have supported this traditional divide and conquer strate- M.S. and B.A. in psychology from the Uni- work (cooperative agreement number versity of California at Los Angeles and the gy, then the first few annual competi- F30602-01-1-0583) as they have sup- University of California at San Diego, re- tions would have tested bits and ported many other activities to review spectively. He served as a councilor of the pieces—vision, navigation, communi- where AI is heading and to help steer American Association for Artificial cation, and control—and we probably it in appropriate directions. I would Intelligence (AAAI) from 1991 to 1994 and would still be waiting to see a com- like to thank David Aha, Michael was elected in 1993 as a fellow of AAAI. plete robotic team play an entire Berthold, Jim Blythe, Tom Dietterich, game. Despite the success of divide- Mike Genesereth, Jim Hendler, and-conquer in many sciences, I don’t Lynette Hirschmann, Jerry Hobbs, Ed think it is a good strategy for AI. Ro- Hovy, Adele Howe, Kevin Knight, Alan botic soccer followed the other, devel- Mackworth Daniel Marcu, Pat Langley, opmental strategy, and required com- Tom Mitchell, Natasha Noy, Tim plete, integrated systems to solve the Oates, Beatrice Oshika, Steve Smith, whole problem. Competent these sys- Milind Tambe, David Waltz, and Mo- tems were not, but competence came hammed Zaki for their discussions and with time, as it does to children. help.

WINTER 2005 67