The Age of Intelligent Machines: Can Machines Think?
Can machines think? This has been a conundrum for philosophers for years, but in their fascination with the pure conceptual issues they have for the most part overlooked the real social importance of the answer.
It is of more than academic importance that we learn to think clearly about the actual cognitive powers of computers, for they are now being introduced into a variety of sensitive social roles where their powers will be put to the ultimate test: in a wide variety of areas, we are on the verge of making ourselves dependent upon their cognitive powers. The cost of overestimating them could be enormous.
One of the principal inventors of the computer was the great British mathematician Alan Turing. It was he who first figured out, in highly abstract terms, how to design a programmable computing device, what we now call a universal Turing machine.
All programmable computers in use today are in essence Turing machines. About forty years ago, at the dawn of the computer age, Turing began a classic article “Computing Machinery and Intelligence” with the words “I propose to consider the question, ‘Can machines think?’” but he then went on to say that this was a bad question, a question that leads only to sterile debate and haggling over definitions, a question, as he put it, “too meaningless to deserve discussion.”1
In its place he substituted what he took to be a much better question, a question that would be crisply answerable and intuitively satisfying–in every way an acceptable substitute for the philosophic puzzler with which he began.
First he described a parlor game of sorts, the imitation game, to be played by a man, a woman, and a judge (of either gender). The man and woman are hidden from the judge’s view but are able to communicate with the judge by teletype; the judge’s task is to guess, after a period of questioning each contestant, which interlocutor is the man and which the woman.
The man tries to convince the judge he is the woman, and the woman tries to convince the judge of the truth. The man wins if the judge makes the wrong identification. A little reflection will convince you, I am sure, that aside from lucky breaks, it would take a clever man to convince the judge that he was the woman–on the assumption that the judge is clever too, of course.
Now suppose, Turing said, we replace the man or woman with a computer and give the judge the task of determining which is the human being and which is the computer. Turing proposed that any computer that can regularly or often fool a discerning judge in this game would be intelligent, a computer that thinks, beyond any reasonable doubt.
Now, it is important to realize that failing this test is not supposed to be a sign of lack of intelligence. Many intelligent people, after all, might not be willing or able to play the imitation game, and we should allow computers the same opportunity to decline to prove themselves. This is, then, a one-way test; failing it proves nothing.
Furthermore, Turing was not committing himself to the view (although it is easy to see how one might think he was) that to think is to think just like a human being–any more than he was committing himself to the view that for a man to think, he must think exactly like a woman. Men, women, and computers may all have different ways of thinking.
But surely, he thought, if one can think in one’s own peculiar style well enough to imitate a thinking man or woman, one can think well, indeed. This imagined exercise has come to be known as the Turing test.
It is a sad irony that Turing’s proposal has had exactly the opposite effect on the discussion of that which he intended. Turing didn’t design the test as a useful tool in scientific psychology, a method of confirming or disconfirming scientific theories or evaluating particular models of mental function; he designed it to be nothing more than a philosophical conversation stopper.
He proposed, in the spirit of “Put up or shut up!”, a simple test for thinking that is surely strong enough to satisfy the sternest skeptic (or so he thought). He was saying, in effect, that instead of arguing interminably about the ultimate nature and essence of thinking, we should all agree that whatever that nature is, anything that could pass this test would surely have it; then we could turn to asking how or whether some machine could be designed and built that might pass the test fair and square.
Alas, philosophers, amateur and professional, have instead taken Turing’s proposal as the pretext for just the sort of definitional haggling and interminable arguing about imaginary counter-examples that he was hoping to squelch.
This forty-year preoccupation with the Turing test has been all the more regrettable because it has focused attention on the wrong issues. There are real-world problems that are revealed by considering the strengths and weaknesses of the Turing test, but these have been concealed behind a smoke screen of misguided criticisms. A failure to think imaginatively about the test actually proposed by Turing has led many to underestimate its severity and to confuse it with much less interesting proposals.
So first I want to show that the Turing test, conceived as he conceived it, is (as he thought) quite strong enough as a test of thinking. I defy anyone to improve upon it. But here is the point almost universally overlooked by the literature: there is a common misapplication of the Turing test that often leads to drastic overestimation of the powers of actually existing computer systems. The follies of this familiar sort of thinking about computers can best be brought out by a reconsideration of the Turing test itself.
The insight underlying the Turing test is the same insight that inspires the new practice among symphony orchestras of conducting auditions with an opaque screen between the jury and the musician. What matters in a musician is, obviously, musical ability and only musical ability; such features as sex, hair length, skin color, and weight are strictly irrelevant. Since juries might be biased even innocently and unawares by these irrelevant features, they are carefully screened off so only the essential feature, musicianship, can be examined.
Turing recognized that people might be similarly biased in their judgments of intelligence by whether the contestant had soft skin, warm blood, facial features, hands, and eyes–which are obviously not themselves essential components of intelligence. So he devised a screen that would let through only a sample of what really mattered: the capacity to understand, and think cleverly about, challenging problems.
Perhaps he was inspired by Descartes, who in his Discourse on Method (1637) plausibly argued that there was no more demanding test of human mentality than the capacity to hold an intelligent conversation: “It is indeed conceivable that a machine could be so made that it would utter words, and even words appropriate to the presence of physical acts or objects which cause some change in its organs; as, for example, if it was touched in some spot that it would ask what you wanted to say to it; in another, that it would cry that it was hurt, and so on for similar things. But it could never modify its phrases to reply to the sense of whatever was said in its presence, as even the most stupid men can do.”2
This seemed obvious to Descartes in the seventeenth century, but of course, the fanciest machines he knew were elaborate clockwork figures, not electronic computers. Today it is far from obvious that such machines are impossible, but Descartes’ hunch that ordinary conversation would put as severe a strain on artificial intelligence as any other test was shared by Turing. Of course, there is nothing sacred about the particular conversational game chosen by Turing for his test; it is just a cannily chosen test of more general intelligence.
The assumption Turing was prepared to make was this: Nothing could possibly pass the Turing test by winning the imitation game without being able to perform indefinitely many other clearly intelligent actions. Let us call that assumption the quick-probe assumption.
Turing realized, as anyone would, that there are hundreds and thousands of telling signs of intelligent thinking to be observed in our fellow creatures, and one could, if one wanted, compile a vast battery of different tests to assay the capacity for intelligent thought. But success on his chosen test, he thought, would be highly predictive of success on many other intuitively acceptable tests of intelligence.
Remember, failure on the Turing test does not predict failure on those others, but success would surely predict success. His test was so severe, he thought, that nothing that could pass it fair and square would disappoint us in other quarters. Maybe it wouldn’t do everything we hoped–maybe it wouldn’t appreciate ballet, understand quantum physics, or have a good plan for world peace, but we’d all see that it was surely one of the intelligent, thinking entities in the neighborhood.
Is this high opinion of the Turing test’s severity misguided? Certainly many have thought so, but usually because they have not imagined the test in sufficient detail, and hence have underestimated it. Trying to forestall this skepticism, Turing imagined several lines of questioning that a judge might employ in this game that would be taxing indeed–lines about writing poetry or playing chess. But with thirty years’ experience with the actual talents and foibles of computers behind us, perhaps we can add a few more tough lines of questioning.
Terry Winograd, a leader in AI efforts to produce conversational ability in a computer, draws our attention to a pair of sentences.3 They differ in only one word. The first sentence is this: “The committee denied the group a parade permit because they advocated violence.” Here’s the second sentence: “The committee denied the group a parade permit because they feared violence.”
The difference is just in the verb–“advocated” or “feared.” As Winograd points out, the pronoun “they” in each sentence is officially ambiguous. Both readings of the pronoun are always legal. Thus, we can imagine a world in which governmental committees in charge of parade permits advocate violence in the streets and, for some strange reason, use this as their pretext for denying a parade permit. But the natural, reasonable, intelligent reading of the first sentence is that it’s the group that advocated violence, and of the second, that it’s the committee that feared the violence.
Now if sentences like this are embedded in a conversation, the computer must figure out which reading of the pronoun is meant if it is to respond intelligently. But mere rules of grammar or vocabulary will not fix the right reading. What fixes the right reading for us is knowledge about politics, social circumstances, committees and their attitudes, groups that want to parade, how they tend to behave, and the like. One must know about the world, in short, to make sense of such a sentence.
In the jargon of artificial intelligence, a conversational computer needs lots of world knowledge to do its job. But, it seems, if it is somehow endowed with that world knowledge on many topics, it should be able to do much more with that world knowledge than merely make sense of a conversation containing just that sentence.
The only way, it appears, for a computer to disambiguate that sentence and keep up its end of a conversation that uses that sentence would be for it to have a much more general ability to respond intelligently to information about social and political circumstances and many other topics. Thus, such sentences, by putting a demand on such abilities, are good quick probes. That is, they test for a wider competence.
People typically ignore the prospect of having the judge ask off-the-wall questions in the Turing test, and hence they underestimate the competence a computer would have to have to pass the test. But remember, the rules of the imitation game as Turing presented it permit the judge to ask any question that could be asked of a human being–no holds barred. Suppose, then, we give a contestant in the game this question: An Irishman found a genie in a bottle who offered him two wishes.
“First I’ll have a pint of Guinness,” said the Irishman, and when it appeared, he took several long drinks from it and was delighted to see that the glass filled itself magically as he drank. “What about your second wish?” asked the genie. “Oh well, that’s easy,” said the Irishman. “I’ll have another one of these!” Please explain this story to me, and tell me if there is anything funny or sad about it.
Now even a child could express, even if not eloquently, the understanding that is required to get this joke. But think of how much one has to know and understand about human culture, to put it pompously, to be able to give any account of the point of this joke.
I am not supposing that the computer would have to laugh at, or be amused by, the joke. But if it wants to win the imitation game–and that’s the test, after all–it had better know enough in its own alien, humorless way about human psychology and culture to be able to pretend effectively that it was amused and explain why.