If Not Turing's Test, Then What?

AI Magazine Volume 26 Number 4 (2006)(2005) (© AAAI) 25th Anniversary Issue If Not Turing’s Test, Then What? Paul R. Cohen ■ If it is true that good problems produce good sci- Ignore It, and Maybe ence, then it will be worthwhile to identify good It Will Go Away… problems, and even more worthwhile to discover the attributes that make them good problems. This Blay Whitby (1996) offers this humorous histo- discovery process is necessarily empirical, so we ex- ry of the Turing test: amine several challenge problems, beginning with 1950–1966: A source of inspiration to all con- Turing’s famous test, and more than a dozen attrib- cerned with AI. utes that challenge problems might have. We are 1966–1973: A distraction from some more led to a contrast between research strategies—the promising avenues of AI research. successful “divide and conquer” strategy and the promising but largely untested “developmental” 1973–1990: By now a source of distraction strategy—and we conclude that good challenge mainly to philosophers, rather than AI workers. problems encourage the latter strategy. 1990: Consigned to history. Perhaps Whitby is right, and Turing’s test should be forgotten as quickly as possible and should not be taught in schools. Plenty of peo- ple have tried to get rid of it. They argue that Turing’s Test: The First Challenge the test is methodologically flawed and is based in bad philosophy, that it exposes cultural bias- More than fifty years ago, Alan Turing pro- es and naïveté about what Turing calls the posed a clever test of the proposition that ma- “programming” required to pass the test. Yet chines can think (Turing 1950). He wanted the the test still stands as a grand challenge for ar- proposition to be an empirical, one and he par- tificial intelligence, it is part of how we define ticularly wanted to avoid haggling over what it ourselves as a field, it won’t go away, and, if it means for anything to think. did, what would take its place? We now ask the question, ‘What will happen Turing’s test is not irrelevant, though its role when a machine takes the part of [the man] in has changed over the years. Robert French’s this game?’ Will the interrogator decide wrong- (2000) history of the test treats it as an indica- ly as often when the game is played like this as tor of attitudes toward AI. French notes that he does when the game is played between a among AI researchers, the question is no man and a woman? These questions replace our longer, “What should we do to pass the test?” original, “Can machines think?” but, “Why can’t we pass it?” This shift in atti- More recently, the test has taken slightly differ- tudes—from hubris to a gnawing worry that AI ent forms. Most contemporary versions ask is on the wrong track—is accompanied by an- simply whether the interrogator can be fooled other, which, paradoxically, requires even more into identifying the machine as human, not encompassing and challenging tests. The test is necessarily a man or a woman. too behavioral—the critics say—too oriented to There are many published arguments about language, too symbolic, not grounded in the Turing’s paper, and I want to look at three kinds physical world, and so on. We needn’t go into of argument. One kind says Turing’s test is irrel- the details of these arguments to see that Tur- evant; another concerns the philosophy of ma- ing’s test continues to influence the debate on chines that think; the third is methodological. what AI can or should do. Copyright © 2005, American Association for Artificial Intelligence. All rights reserved. 0738-4602-2005 / $2.00 WINTER 2005 61 25th Anniversary Issue There is only one sense in which Turing’s test good, in the sense of helping AI researchers is irrelevant: almost nobody thinks we should make progress. Turing’s test has some of these devote any effort in the foreseeable future to good attributes, as well as some really bad ones. trying to pass it. In every other sense, as a his- The one thing everyone likes about the Tur- torical challenge, a long-term goal for AI, a ing test is its proxy function, the idea that the philosophical problem, a methodological case test is a proxy for a great many, wide-ranging study, and an indicator of attitudes in AI, the intellectual capabilities. Dennett puts it this Turing test remains relevant. way: “Nothing could possibly pass the Turing test by Turing the Philosopher winning the imitation game without being able Would Turing mind very much that his test no to perform indefinitely many other intelligent longer has the role he intended? If we take Tur- actions. … [Turing’s] test was so severe, he ing at his word, then it is not clear that he ever thought, that nothing that could pass it fair and intended his test to be attempted: square would disappoint us in other quarters.” (Dennett 1998) There are already a number of digital computers in working order, and it may be asked, ‘Why No one in AI claims to be able to cover such not try the experiment straight away?…’ The a wide range of human intellectual capabilities. short answer is that we are not asking whether We don’t say, for instance, “Nothing could pos- all digital computers would do well in the game sibly perform well on the UCI machine learn- nor whether the computers at present available ing test problems without being able to per- would do well, but whether there are imagin- form indefinitely many other intelligent able computers which would do well. actions.” Nor do we think word sense disam- Daniel Dennett thinks Turing intended the biguation, obstacle avoidance, image segmen- test as “a conversational show-stopper,” yet the tation, expert systems, or beating the world philosophical debate over Turing’s test is ironi- chess champion are proxies for indefinitely I am cally complicated. As Dennett says, “Alas, many other intelligent actions, as Turing’s test confident that philosophers—amateur and professional— is. It is valuable to be reminded of the breadth if we pose the have instead taken Turing’s proposal as a pre- of human intellect, especially as our field frac- text for just the sort of definitional haggling tures into subdisciplines, and I suppose one right sorts of and interminable arguing about imaginary methodological contribution of Turing’s test is challenges, counterexamples he was hoping to squelch” to remind us to aim for broad, not narrow com- (Dennett 1998). petence. However, many find it easier and then we will Philosophers wouldn’t be interested if Turing more productive to specialize, and, even make good hadn’t been talking about intentional attributes though we all know about Turing’s test and of machines—beliefs, goals, states of knowl- many of us consider it a worthy goal, it isn’t progress in AI. edge, and so on—and because we in AI are enough to encourage us to develop broad, gen- about building machines with intentional at- eral AI systems. tributes, philosophers will always have some- So in a way, the Turing test is impotent: It thing to say about what we do. However, even has not convinced AI researchers to try to pass if the preponderance of philosophical opinion it. Paradoxically, although the proxy function was that machines can’t think, it probably is the test’s most attractive feature, it puts the wouldn’t affect the work we do. Who among us cookie jar on a shelf so high that nobody reach- would stop doing AI if someone proved that es for it. Indeed, as Pat Hayes and Ken Ford machines can’t think? I would like to know point out, “The Turing Test is now taken to be whether there is life elsewhere in the universe; simply a rather fancy way of stating that the I think the question is important, but it doesn’t goal of AI is to make an artificial human being” affect my work, and neither does the question (Hayes and Ford 1995). of whether machines can think. Consequently, A second notable methodological failing of at least in this article, I am unconcerned with Turing’s test is that it pushes many aspects of philosophical arguments about whether ma- intelligence into one test that has a yes or no chines can think. answer. This isn’t necessary. We could follow the lead of the multiple intelligences move- Turing’s Test as Methodology ment in cognitive psychology and devise tests Instead I will focus on a different, entirely of different sorts of intelligence. In fact, Tur- methodological question: Which attributes of ing’s test is not even very complete, when tests for the intentional capabilities of machines viewed in terms of, say, Howard Gardner’s cat- lead to more capable machines? I am confident alog of intelligences (Gardner 1983). It focused that if we pose the right sorts of challenges, mostly on logical, linguistic, and interpersonal then we will make good progress in AI. This ar- intelligence, not on intrapersonal, bodily- ticle is really about what makes challenges kinesthetic, naturalist, musical, and visual-spa- 62 AI MAGAZINE 25th Anniversary Issue tial intelligence (rounding out the eight in sidered. First, it is all or nothing: it gives no in- Gardner’s catalog). dication as to what a partial success might look Robert French goes further and criticizes the like. Second, it gives no direct indications as to test for its focus on culturally oriented human how success might be achieved” (Whitby intelligence: “The Test provides a guarantee not 1996).

If Not Turing's Test, Then What?

Much Has Been Written About the Turing Test in the Last Few Years, Some of It

Turing Test Does Not Work in Theory but in Practice

THE TURING TEST RELIES on a MISTAKE ABOUT the BRAIN For

The Turing Test and Other Design Detection Methodologies∗

A Review on Neural Turing Machine

The Future Just Arrived

(GPT-3) in Healthcare Delivery ✉ Diane M

Does the Turing Test Demonstrate Intelligence Or Not?

An Introduction to Bayesian Artificial Intelligence in Medicine

Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation

The Chinese Room: Just Say “No!” to Appear in the Proceedings of the 22Nd Annual Cognitive Science Society Conference, (2000), NJ: LEA

IEEE Paper Template in A4 (V1)