<<

THE YEAR IN COGNITIVE NEUROSCIENCE 2009

The of Language Michael C. Corballis Department of Psychology, University of Auckland, Auckland, New Zealand

Language, whether spoken or signed, can be viewed as a gestural system, evolving from the so-called mirror system in the brain. In nonhuman the gestural system is well developed for the productions and perception of manual action, especially transitive acts involving the grasping of objects. The emergence of in the hominins freed the for the of the mirror system for intransitive acts for communication, initially through the miming of events. With the emergence of the genus from some 2 million years ago, pressures for more complex communication and increased vocabulary size led to the conventionalization of gestures, the loss of iconic representation, and a gradual shift to vocal gestures replacing manual ones— although signed languages are still composed of manual and facial gestures. In parallel with the conventionalization of symbols, languages gained grammatical complexity, perhaps driven by the evolution of episodic memory and mental time travel, which involve combinations of familiar elements—Who did what to whom, when, where, and why? Language is thus adapted to allow us to share episodic structures, whether past, planned, or fictional, and so increase survival fitness.

Key words: language; evolution; gesture; FOXP2; mirror neurons

Introduction mechanical principles, confirming for him, at least, that it must indeed be God-given. Non- Is language evolution “the hardest problem , on the other , were mere in science,” as recently suggested (Christiansen automata, and therefore incapable of human- & Kirby 2003, p. 1)? Certainly, it has had an like language. Descartes has been called the extraordinarily difficult and contentious his- founder of modern philosophy, but his views tory. In early times, at least, part of the diffi- on language clearly left little scope for an evo- culty has been religious, but that in turn might lutionary account. relate to the seemingly miraculous flexibility The publication of Darwin’s Origin of Species and open-endedness of language. In Christian in 1859 should have encouraged a more evo- thought it was proclaimed that language must lutionary approach but was quickly followed in be a gift from God: “In the beginning,” says 1866 by the famous ban on all discussion of St. John in the Bible, “was the Word, and the the by the Linguistic Society Word was with God, and the Word was God.” of Paris. The ban was reiterated in 1872 by the This view constrained philosophical thinking Philological Society of London. These suppres- well into the second millennium AD. In the sive moves may well have been influenced by 17th century,Ren´ e´ Descartes (1985/1647) con- the religious opposition to Darwin’s theory of sidered that language allowed such freedom but may have also been mo- of expression that it could not be reduced to tivated by the apparent uniqueness of human language, the absence of material evidence as to how it evolved, and the speculative nature of any discussion. Address for correspondence: Michael C. Corballis, Department of Psychology, University of Auckland, Auckland, New Zealand. In more recent times, discussion of language [email protected] evolution was further stifled by the views of

The Year in Cognitive Neuroscience 2009: Ann. N.Y. Acad. Sci. 1156: 19–43 (2009). doi: 10.1111/j.1749-6632.2009.04423.x C 2009 New York Academy of Sciences. 19 20 Annals of the New York Academy of Sciences

Noam Chomsky, the most influential linguist of within the first few generations of Homo sapi- the late 20th century. In 1975, he wrote: ens sapiens” (p. 69). The idea that grammatical language evolved gradually is more consistent We know very little about what happens when 1010 neurons are crammed into something the size of a with the notion of grammaticalization (e.g., Heine basketball, with further conditions imposed by the & Kuteva 2007; Hopper & Traugott 1993), specific manner in which this system developed whereby grammar emerges in an incremental over time. It would be a serious error to suppose manner, than with Chomsky’s (1975) view that that all properties, or the interesting structures that all are equipped with an innate uni- evolved, can be ‘explained’ in terms of natural se- lection. (p. 59) versal grammar. In this respect, this article is in accord with growing skepticism about univer- As an avowed Cartesian, Chomsky’s views sal grammar as a useful or viable concept (e.g., were based on the firm opinion that language Christiansen & Chater 2008; Everett 2005, is based on “an entirely different principle” 2007; Tomasello 2003, 2008; Wray 2002). Fi- (1966, p. 77) from communication. Un- nally, I assume that language is fundamentally like Descartes, though, he did not appeal to a gestural system, and evolved from manual divine intervention. gestures rather than animal calls, and indeed it The tide turned, though, with the publi- persists in this form in signed languages. This cation of an influential paper by Pinker and assumption is in accord with the notion that Bloom in 1990. They took special issue with language is an embodied system (e.g., Barsalou Chomsky’s view that language could not have 2008), rather than a system based on amodal evolved through natural selection. Their argu- abstract symbols. ment rested primarily on the simple fact that These assumptions will not find universal fa- language is complex, designed for the commu- vor.They are chosen to support what I see as the nication of propositional structures, and that most plausible account of how language might the only rational explanation for complex struc- have evolved. But if we heed Dobzhansky’s tures lies in natural selection. Even Chomsky (1973, p. 125) famous statement that “[n]othing appears to have been somewhat influenced, in biology makes sense except in the light of evo- since he was coauthor of an article accepting lution,” then a coherent evolutionary account some degree of continuity between human and should itself support the assumptions underly- animal communication, but still insisting on a ing it. critical component unique to humans (Hauser It is convenient, then, to begin with the con- et al. 2002). Pinker and Bloom’s article spurred troversial theory that language evolved from a flurry of books, articles and conferences on manual gestures. the evolution of language, offering such a diver- sity of opinion that one might begin to wonder whether the ban should be restored. In pre- The Gestural Theory senting my own views, then, it may be helpful to establish some preliminary assumptions. History First, pace Chomsky, I assume that language did indeed evolve through natural selection. Despite religious objection to evolution- This makes it at least reasonable to seek pre- ary accounts, the idea that language may cursors to language in our primate forebears. have originated in manual gestures did sur- Second, I assume that language did not ap- face in early writings. In 1644, Bulwer wrote pear abruptly in hominin evolution, as sug- on “the natural language of the hand,” and gested by authors such as Bickerton (1995), who Cordemoy (1668/1972) called gestures “the wrote that “true language, via the emergence first of all languages,” noting that they were of syntax, was a catastrophic event, occurring universal and understood everywhere. These Corballis: The Evolution of Language 21 observations were based partly on the experi- how the child still learns to understand its mother. ences of European traders, who discovered that In general, painful sensations were probably also they could communicate in foreign lands by us- expressed by a gesture that in its turn caused pain (for example, tearing the , beating the breast, ing only bodily gestures. violent distortion and tensing of the facial mus- Some authors found ways to avoid re- cles). Conversely, gestures of pleasure were them- ligious opprobrium through gestural theory. selves pleasurable and were therefore easily suited The Italian philosopher Giambattista Vico to the communication of understanding (laughing (1953/1744) accepted the Biblical story of the as a sign of being tickled, which is pleasurable, then divine but proposed that af- served to express other pleasurable sensations). ter the Flood humans reverted to savagery, and As soon as men understood each other in gesture, language emerged afresh. He suggested that a symbolism of gesture could evolve. I mean, one one of the post-Deluge languages made use could agree on a language of tonal signs, in such a of gestures and physical objects, but later be- way that at first both tone and gesture (which were came vocal through onomatopoeia and inter- joined by tone symbolically) were produced, and jections. Two years later the Abbe´ de Condil- later only the tone. (1986/1878, p. 129) lac (1971/1746) also proposed that language This extract also anticipates the later dis- evolved from manual gestures but, as an or- covery of the mirror system. In 1900, Wilhelm dained priest, he, too, was constrained by reli- Wundt, the founder of the first laboratory of gious doctrine and presented his theory in the experimental psychology at Leipzig in 1879, form of a fable about two children wander- wrote a two-volume work on speech and ar- ing about in the desert after the Flood. They gued that a universal sign language was the ori- began by communicating in manual and bod- gin of all languages. He wrote, though, under ily gestures, which were eventually supplanted the misapprehension that all deaf communities by vocalizations. Condillac’s compatriot Jean- use the same system of signing, and that signed Jacques Rousseau endorsed the gestural theory languages are useful only for basic communi- more openly in an essay published in 1782. cation, and are incapable of communicating In the following century, abstract ideas. We now know that signed lan- made reference to the role of gestures in his guages vary widely from community to com- book The Descent of Man: “I cannot doubt that munity and can have all of the communicative language owes its origins to the imitation and sophistication of speech (e.g., Emmorey 2002; modification of various natural sounds, and Neidle et al. 2000). man’s own distinctive cries, aided by signs and gestures” (1871/1896, p. 86, emphasis added). Modern Developments Shortly afterwards, the philosopher Friedrich Nietzsche chimed in with the following extract In 1973, the anthropologist Gordon W. from his 1878 book Human, All too Human: Hewes presented the gestural theory in more modern dress. He too drew on evidence from Imitation of gesture is older than language, and signed languages but made no claim for a uni- goes on involuntarily even now, when the language of gesture is universally suppressed, and the edu- versal signed language. He also referred to con- cated are taught to control their muscles. The imi- temporary work showing that great were tation of gesture is so strong that we cannot watch unable to learn to speak but could use man- a face in movement without the innervation of our ual gestures in a language-like way, with at own face (one can observe that feigned yawning least moderate success. Gestural theory lan- will evoke natural yawning in the man who ob- guished, though, until the 1990s. The influ- serves it). The imitated gesture led the imitator back to the sensation expressed by the gesture in ential article by Pinker and Bloom (1990) had the body or face of the one being imitated. This is made no mention of Hewes’ work but was fol- how we learned to understand one another; this is lowed by an increasing number of publications 22 Annals of the New York Academy of Sciences that picked up the gestural theme (e.g., Arbib language must have appeared de novo at some 2005; Armstrong 1999; Armstrong et al. 1995; point in hominin evolution, and intermediate Armstrong & Wilcox 2007; Corballis 1991; forms are not available. Nevertheless, the chim- 1992, 2002, 2003; Givon´ 1995; Place 2000; panzee and probably provide the best Pollick & de Waal 2007; Rizzolatti & Arbib available proxies for the communicative abili- 1998; Rizzolatti & Sinigaglia 2008; Ruben ties of the earliest hominins and a starting point 2005; Skoyles 2000; Tomasello 2008). for the construction of evolutionary scenarios. A major impetus has been the growing ev- Consideration of communication capabili- idence that the signed languages of the deaf ties in primates indeed supports the idea that have all of the grammatical and semantic so- language evolved from manual gestures. For phistication of spoken languages, as exempli- one thing, nonhuman primates have little if fied by the fact that university-level instruction any cortical control over vocalization, but ex- at Gallaudet University in Washington, DC, cellent cortical control over the hands and is conducted entirely in American Sign Lan- arms (Ploog 2002). This is illustrated by the guage (ASL). Children exposed to signed lan- fact that attempts over the past half-century to guages from infancy learn them as easily and teach our closest nonhuman relatives, the great naturally as those exposed to speech learn to apes, to speak have been strikingly unsuccess- speak, even going through a stage of manual ful, but relatively good progress has been made “babbling” (Petitto & Marentette 1991). Al- toward teaching them to communicate by a though it has been argued that manual and form of manual sign language, as in the case vocal babbling are evidence for language as of the Washoe (Gardner & Gard- a fundamentally amodal system (Petitto et al. ner 1969), or by pointing to visual symbols on 2004), others have used evidence from sign lan- a keyboard, as in the bonobo Kanzi (Savage- guage to argue that language, whether manual Rumbaugh et al. 1998). These visual forms or vocal, is fundamentally gestural rather than of communication scarcely match the gram- amodal, with manual gestures taking prece- matical sophistication of modern humans, but dence in (Armstrong et al. they are a considerable advance over the re- 1995; Armstrong & Wilcox 2007). At least in stricted nature of the speech sounds that these principle, language could have existed in a form animals make. The human equivalents of pri- consisting entirely of manual and facial ges- mate vocalizations are probably emotionally- tures, comparable to present-day signed lan- based sounds like laughing, crying, grunting, guages. It is perhaps more likely, though, that a or shrieking, rather than words. vocal component was blended in gradually,and Although vocal communication is wide- even today speech is normally accompanied by spread in the animal kingdom, surprisingly few manual gestures (Goldin-Meadow & McNeill species are capable of vocal learning, which is of 1999, McNeill 1985, 1992). course critical to speech. These species include Evidence has also come from the study of elephants, seals, killer whales, and some communication in nonhuman primates, and (Jarvis 2006). Among the primates, according to in particular our closest nonhuman relatives, Jarvis, only humans are vocal learners. Studies and . Such evidence is of nevertheless suggest that primate calls do show course indirect, since present-day chimpanzees limited modifiability, but its basis remains un- and bonobos have themselves evolved since clear, and it is apparent in subtle changes within the last common human–chimpanzee ancestor call types rather than the generation of new call some 6 or 7 million years ago. Given the gen- types (Egnor & Hauser 2004). Where language eral consensus that chimpanzees and bonobos is flexible and conveys propositional informa- are incapable of true language (or have yet to tion about variable features of the world, ani- demonstrate it), there is the added problem that mal calls are typically stereotyped and convey Corballis: The Evolution of Language 23 information in circumscribed contexts. For ex- infants seem already to understand sharing and ample, a study by Pollick and de Waal (2007) shared communication before language itself, showed that both chimpanzees and bonobos at least as manifest as speech, has appeared. communicate more often and more flexibly us- In this, they have already moved beyond the ing bodily gestures than vocalizations. They communicative capacity of the chimpanzee. recorded the incidence of 31 manual gestures Chimpanzees, along with bonobos, are our and 18 facial/vocal gestures over six different closest living nonhuman relatives and provide contexts (affiliative, agonistic, food, grooming, the best estimate of what communication was play,andsex)andfoundthatthefacial/vocal like prior to the emergence of true language. gestures were much more tied to specific con- Tomasello’s work therefore provides strong ev- texts than the manual gestures. Other studies idence that language evolved from manual have shown that the communicative bodily ges- gestures. Intermediate forms can be seen in ges- tures of (Pika et al. 2003), chimpanzees tural requests made by chimpanzees and bono- (Liebal et al. 2004), and bonobos (Pika et al. bos, and in the progression to more collabora- 2005) are subject to social learning and are sen- tive forms of gesture in human infants. sitive to the attentional state of the recipient— both prerequisites for language. The Mirror System Tomasello (2008) nevertheless notes that sen- sitivity to the attentional state of the recipi- The gestural theory was boosted in the 1990s ent is not sufficient for true language, at least by the discovery of mirror neurons, and later of in the Gricean sense of language as a co- a more general mirror system, in the primate operative enterprise directed at a joint goal brain (Arbib 2005; Rizzolatti & Arbib 1998; (Grice 1975). The chimpanzee gestural reper- Rizzolatti & Sinigaglia 2008). Mirror neurons, toire seems largely confined to making requests first recorded in area F5 in the ventral premo- that are essentially self-serving; for example, a tor cortex of the monkey, are activated both chimp may point to a desirable object that is just when the animal makes grasping movements out of reach, with the aim of getting a watching and when it observes another individual mak- human to help. According to Tomasello, true ing the same movements (Rizzolatti et al. 1996). language requires the further step of shared at- Although these neurons responded to manual tention, so that the communicator knows not actions, area F5 is considered the homologue only what the recipient knows or is attending of Broca’s area in humans (Rizzolatti & Arbib to, but knows also that the recipient knows that 1998), an area long associated with the produc- the communicator knows this! This kind of re- tion of speech (Broca 1861). More precisely, cursive mind-reading enables communication Broca’s area in humans can be divided into beyond the making of simple requests to the Brodman areas 44 and 45, with area 44 con- sharing of knowledge, which is one of the dis- sidered the true analogue of area F5. In hu- tinctive properties of language. mans, it is now evident that area 44 is involved Tomasello (2008) goes on to summarize his not only in speech, but also in motor functions and his colleagues’ work on pointing behavior unrelated to speech, including complex hand in human infants, suggesting that shared at- movements, and sensorimotor learning and in- tention and cooperative interchange begins to tegration (Binkofski & Buccino 2004). Indeed it emerge from about one year of age. For exam- has been proposed that “Broca’s area” should ple, 1-year-old infants not only point to objects now be regarded as a collective term, involving in order to request them, but sometimes point many different functions, and no clearly demar- to things that an adult is already looking at, in- cated subdivisions (Lindenberg et al. 2007). dicating the understanding that attention to the It has also become apparent that mirror object is shared. During the second year of life, neurons are part of a more general “mirror 24 Annals of the New York Academy of Sciences system” involving other regions of the brain as underlies the so-called motor theory of speech well as F5. In monkeys, the superior temporal perception, which holds that speech sounds are sulcus (STS) also contains cells that respond to perceived in terms of how they are produced, observed biological actions, including grasping rather than on the basis of acoustic analysis actions (Perrett et al. 1989), although few if any (Liberman et al. 1967). The issue leading to respond when the animal itself performs an ac- the motor theory was not the lack of informa- tion. F5 and STS are connected to area PF in tion in the acoustic signal, but rather the fact the inferior parietal lobule, where there are also that individual phonemes are perceived as in- neurons that respond to both the execution and variant despite extreme variability in the acous- perception of actions. These neurons are now tic signal. Liberman and colleagues proposed known as “PF mirror neurons” (Rizzolatti et al. that invariance lay instead in the articulatory 2001). Other areas, such as the amygdala and gestures; as Galantucci et al. (2006) put it in a orbito-frontal cortex, may also be part of the recent review, “perceiving speech is perceiving mirror system. Moreover, the extended mirror gestures” (p. 361). If this theory is correct, the system in monkeys largely overlaps with the ho- perception of speech might therefore be con- mologues of the cortical circuits in humans that sidered a natural function of the mirror system. are involved in language, leading to the notion The idea of speech as a gestural system has that language is just part of the mirror system it- led to the concept of articulatory phonology self (Fogassi & Ferrari 2007; but see Grodzinsky (Browman & Goldstein 1995; Goldstein et al. 2006, for some caveats). 2006). Speech gestures comprise movements of six articulatory organs, the lips, the velum, the From Hand to Mouth larynx, and the blade, body, and root of the tongue. Each is controlled separately, so that The primate mirror system has to do mainly individual speech units are comprised of dif- with manual gestures, and the signed languages ferent combinations of movements. The distri- of the deaf are also predominantly manual, al- bution of action over these articulators means though facial movements also play a promi- that the elements overlap in time, which makes nent role. But of course the dominant mode of possible the high rates of production and per- present-day language is speech, although move- ception. As support for the motor theory of ments of the hand and face play a secondary speech perception, the speech signal cannot be role in normal conversation (Goldin-Meadow decoded by means of visual representations of & McNeill 1999; McNeill 1985, 1992). If lan- the sound patterns, such as that provided in guage evolved from manual gestures, then, a sound spectrograph, but can be discerned there must have been a switch from hand to in mechanical representations of the gestures mouth during the course of hominin evolution. themselves, through X-rays, magnetic reso- Some have regarded this as a critical weak- nance imaging, and palatography (Studdert- ness of the gestural theory.The linguist Robbins Kennedy 2005). The transition from manual Burling, for example, wrote that “the gestural gesture to speech, then, can be regarded as theory has one nearly fatal flaw. Its sticking one occurring within the gestural domain, with point has always been the switch that would manual gestures gradually replaced by gestures have been needed to move from a visual lan- of the articulatory organs, but with likely over- guage to an audible one” (2005, p. 123). lap at all stages. Part of the answer to this lies in the grow- Although the mirror system in nonhuman ing realization that speech itself is a gestural primates seems not to incorporate vocalization system, so that the switch is not so much from (Rizzolatti & Sinigaglia 2008), it is receptive to vision to audition as from one kind of gesture acoustic as well as visual input. Kohler et al. to another. The notion of speech as gesture (2002) recorded neurons in area F5 of the Corballis: The Evolution of Language 25 monkey that responded to the sounds of man- voice spectra of syllables simultaneously pro- ual actions, such as tearing paper or breaking nounced by the viewer (Gentilucci 2003). In peanuts. Significantly, there was no response to the course of evolution, this mechanism of joint monkey calls. This is consistent with evidence control of hand and mouth could have been in- that vocalizations in nonhuman primates are strumental in the transfer of a communication controlled by the limbic system, rather than by system, based on the mirror system, from move- neocortex (Ploog 2002), and are therefore not ments of the hand to movements of the mouth part of the mirror system. Even in the chim- (Gentilucci & Corballis 2006). panzee, voluntary control of vocalization ap- The relationship between representations of pears to be limited, at best (Goodall 1986). The actions and spoken language is further sup- incorporation of vocalization into the mirror ported by neuroimaging studies, which show system was therefore a critical development in activation of Broca’s area when people make the evolution of speech—which is not to say meaningful arm gestures (Buccino et al. 2001; that it was critical to the development of lan- Decety et al. 1997; Gallagher & Frith 2004; guage itself. Grezes` et al. 1998), or even imagine them Other properties of the primate mirror sys- (Gerardin et al. 2000; Grafton et al. 1996; tem indicate the close links between hand and Hanakawa et al. 2003; Kuhtz-Buschbeck et al. mouth. Some neurons in area F5 in the mon- 2003; Parsons et al. 1995). key fire when the animal makes movements to In evolutionary terms, a gradual shift from grasp an object with either the hand or the hand to mouth for purposes of intentional com- mouth (Rizzolatti et al. 1988). Petrides et al. munication might well have begun with the in- (2005) have identified an area in the monkey creasing involvement of the hands in manufac- brain just rostral to premotor area 6 that is in- ture, and perhaps in transporting belongings volved in control of the orofacial musculature. or booty from one location to another. Man- This area is also considered a homologue of ufactured stone tools, often considered to be part of Broca’s area. The close neural associa- a conceptual advance beyond the opportunis- tions between hand and mouth may be related tic use of sticks or rocks as tools, appear in to eating rather than communication, but later the fossil record from some 2.5 million years exapted for gestural and finally vocal language. ago, perhaps in , a likely precur- The connection between hand and mouth can sor to (Semaw et al. 1997). From also be demonstrated behaviorally in humans. some 1.8 million years, erectus began to migrate In one study, people were instructed to open out of Africa into Asia and later into Europe their mouths while grasping objects, and the (Tattersall 2003), and the Acheulian industry size of the mouth opening increased with the emerged, with large bifacial tools and handaxes size of the grasped object; conversely,when they that seemed to mark a significant advance over opened their hands while grasping objects with the simple flaked tools of the earlier Oldowan their mouths, the size of the hand opening also industry (Gowlett 1992). increased with the size of the object (Gentilucci With the hands increasingly involved in such et al. 2001). activities, the burden of communication may Grasping with the hand also affect the kine- have shifted to the face, which provides suf- matics of speech itself. Grasping larger objects ficient diversity of movement and expression induces selective increases in parameters of lip to act as a signaling device. It also naturally kinematics and voice spectra of syllables pro- conveys emotion and can serve to direct at- nounced simultaneously with action execution tention. Signed languages involve communica- (Gentilucci et al. 2004). Even observing another tive movements of the face as well as of the individual grasping or bringing to the mouth hands (Sutton-Spence & Boyes-Braem 2001), larger objects affects the lip kinematics and the and Muir and Richardson (2005) found that 26 Annals of the New York Academy of Sciences native signers watching discourse in British facial gestures, especially those of the tongue, Sign Language focused mostly on the face and are internal to the mouth and largely hidden mouth, and relatively little on the hands or up- from sight. With the addition of sound through per body. Facial expressions and head move- vibrations of the vocal folds, these hidden ges- ments can turn an affirmative sentence into a tures are potentially recoverable through the negation, or a question. Facial gestures serve to mirror system, in much the same way as the disambiguate hand gestures and provide the vi- primate mirror system responds to the sounds sual equivalent of prosody in speech (Emmorey of actions such as tearing paper or cracking nuts 2002). The face may play a much more promi- (Kohler et al. 2002). Speech, then, is facial ges- nent role in signed languages than has been ture half swallowed, and rendered partly invisi- hitherto recognized and may have been critical ble. Once sound was introduced, though, these in the transition from manual gesture to speech. gestures became accessible through audition, The face also plays a role in the perception not vision. The problem, though, is that vo- of normal speech. Although we can understand calization in nonhuman primates is controlled the radio announcer or the voice on the cell subcortically and appears not to be part of the phone, there is abundant evidence that watch- mirror system, so its incorporation must have ing people speak can aid understanding of what occurred at some point in hominin evolution. they are saying. It can even distort it, as in the One clue as when this might have occurred McGurk effect, in which dubbing sounds onto comes from genetics. A mutation of the fork- a mouth that is saying something different al- head box transcription factor, FOXP2,insome ters what the hearer actually hears (McGurk members of an English family known as the KE & MacDonald 1976). Evidence from an fMRI family has resulted in a severe deficit in vocal ar- study shows that the mirror system is activated ticulation (Watkins et al. 2002). Moreover, the when people watch mouth actions, such as bit- members of the family affected by the muta- ing, lip-smacking, oral movements involved in tion, unlike their unaffected relatives, show no vocalization, when these are performed by peo- activation in Broca’s area while covertly gener- ple, but not when they are performed by a ating verbs (Liegeois´ et al. 2003). This might be monkey or a (Buccino et al. 2004). Actions taken to mean that the FOXP2 gene in humans belonging to the observer’s own motor reper- is involved in the cooption of vocal control by toire are mapped onto the observer’s motor sys- Broca’s area (Corballis 2004a). In songbirds, tem, while those that do not belong are not— knockdown of the FOXP2 gene impairs the im- instead, they are perceived in terms of their vi- itation of song (Haesler et al. 2007), and in- sual properties. Watching speech movements, sertion of the FOXP2 point mutation found in and even stills of a mouth making a speech the KE family into the mouse critically impairs sound, also activates the mirror system, includ- synaptic plasticity and motor learning (Groszer ing Broca’s area (Calvert & Campbell 2003). et al. 2008). These observations are consistent with the Although highly conserved in , the idea that speech may have evolved from visual FOXP2 gene underwent two mutations since displays that included movements of the face. the split between hominin and chimpanzee lines. According to one theoretical estimate, Adding Sound the more recent of these occurred “not less than” 100,000 years ago (Enard et al. 2002), Despite the close association between hand although the error associated with this estimate and mouth in primates, the one missing ingre- makes it not unreasonable to suppose that it dient is vocalization. The incorporation of vo- coincided with the emergence of Homo sapiens calization into the mirror system may have nat- around 170,000 years ago. Contrary evidence, urally followed involvement of the face. Many though, comes from a report that the mutation Corballis: The Evolution of Language 27 is also present in the DNA of a 45,000-year-old tained well within the modern human range Neandertal fossil (Krause et al. 2007), suggest- (Kay et al. 1998), although this has been dis- ing that it goes back as much as 700,000 years puted (DeGusta et al. 1999). A further clue ago to the common ancestor of humans and comes from the finding that the thoracic re- Neandertals (Noonan et al. 2006). But this is gion of the spinal cord is relatively larger in challenged in turn by Coop et al. (2008), who humans than in nonhuman primates, probably used phylogenetic dating of the haplotype to because breathing during speech involves extra reestimate the time of the most recent com- muscles of the thorax and abdomen. Fossil ev- mon ancestor carrying the FOXP2 mutation. idence indicates that this enlargement was not Their answer was 42,000 years ago, with an es- present in the early hominins or even in Homo timated 95% confidence interval from 38,000 ergaster, dating from about 1.6 million years ago to 45,500 years ago. Even allowing for distor- but was present in several Neandertal fossils tions in their assumptions, this is much more (MacLarnon & Hewitt 2004). consistent with the estimate reported by Enard The Neandertals would have been incapable et al. (2002) than with the estimate implied by of fully articulate speech, based on reconstruc- Krause et al. (2007). tions of their vocal tract (D.E. Lieberman 1998; Coop et al. argue that the presence of the P.Lieberman et al. 1972). Robert McCarthy of mutation in Neandertal was more likely due to Florida Atlantic University has recently simu- contamination of the Neandertal DNA, or to lated how the would have sounded low rates of gene flow between human and Ne- when articulating the syllable /i/ (or ee), based andertal, on the assumption that the allele was on the shape of the vocal tract.a One observer globally beneficial. Recent evidence suggests described it as sounding more like a sheep or a that microcephalin, a gene involved in regu- goat than a human. P.Lieberman’s claims have lating brain size, may have entered the human nevertheless been controversial (e.g., Gibson & gene pool through interbreeding with Neander- Jessee 1999), but there is other evidence that tals (Evans et al. 2006), so the reverse possibil- the cranial structure underwent changes subse- ity of FOXP2 entering the late Neandertal gene quent to the split between anatomically modern pool from Homo sapiens is not completely ruled and earlier “archaic” Homo,suchastheNean- out. Our forebears might have been slightly dertals, ,andHomo rhodesien- friendlier toward the Neandertals than is gen- sis, and that these changes on the ability erally thought. to speak. Of course, mutation of the FOXP2 gene need One such change is the shortening of the not have been the only factor in the switch from sphenoid, the central bone of the cranial base manual gesture to speech. Fossil evidence sug- from which the face grows forward, result- gests that the anatomical requirements for fully ing in a flattened face (P. Lieberman 1998). articulate speech were probably not complete D. E. Lieberman et al. (2002) speculate that until the emergence of Homo sapiens.Forexam- this is an adaptation for speech, contributing ple, the hypoglossal canal is much larger in hu- to the unique proportions of the human vo- mans than in great apes, suggesting that the hy- cal tract, in which the horizontal and verti- poglossal nerve, which innervates the tongue, is cal components are roughly equal in length. also much larger in humans, perhaps reflecting This configuration, they argue, improves the the importance of tongued gestures in speech. ability to produce acoustically distinct speech The evidence suggests that the size of the hy- sounds, such as the vowel [i]. It is not seen poglossal canal in early , and perhaps in , was within the range of that in modern great apes, while that of the a It can be found on http://anthropology.net/2008/04/16/ Neandertal and early H. sapiens skulls was con- reconstructing-Neanderthal-vocalizations/. 28 Annals of the New York Academy of Sciences in Neandertal skeletal structure (see also Vleck production through the vocal folds had not yet 1970). Another adaptation unique to H. sapi- been incorporated. ens is neurocranial globularity, defined as the The most persistent advocate of the late roundness of the cranial vault in the sagittal, emergence of speech is P. Lieberman, and as coronal, and transverse planes, which is likely recently as 2007 he summarized a review of to have increased the relative size of the tempo- the evidence as follows: ral and/or frontal lobes relative to other parts of fully human speech first appears in the brain (D.E. Lieberman et al. 2002). These the fossil record in the Upper (about changes may reflect more refined control of ar- 50,000 years ago) and is absent in both Nean- ticulation and also, perhaps, more accurate per- derthals and earlier humans (p. 39). ceptual discrimination of articulated sounds. This statement is not inconsistent with the Indirect support for the late emergence of dating of the most recent FOXP2 mutation, dis- speech comes from African click languages, cussed earlier and is later than the estimate which may be residues of prevocal language. of the origin of click languages, discussed in Click sounds are made entirely in the mouth, the preceding paragraph. It suggests that ar- and some languages have as many as 48 click ticulate speech arose even later than the emer- sounds (Crystal 1997). Clicks could therefore gence of our species, Homo sapiens, thought to have provided sufficient variety to carry a form have originated in East Africa not later than of language prior to the incorporation of vocal- around 120,000 years ago (e.g., Ray et al. ization. Two of the many groups that make ex- 2005). The question of precisely when artic- tensive use of click sounds are the Hadzabe and ulate speech emerged is by no means resolved, San, who are separated geographically by some and P.and D.E. Lieberman’s provocative claims 2000 kilometers, and genetic evidence suggests heighten the challenge to discover unequivocal that the most recent common ancestor of these data. groups goes back to the root of present-day mi- tochondrial DNA lineages, perhaps as early as Why the Switch? 100,000 years ago (Knight et al. 2003), prior to the migration of anatomically modern hu- Given the fairly extensive anatomical and mansoutofAfrica.Thedateofthismigration neurophysiological changes required, along is still uncertain. Mellars (2006) suggests that with the heightened risk choking due to the low- modern humans may have reached Malaysia ering of the larynx, evolutionary pressure for and the Andaman Islands as early as 60,000 the switch must have been strong. Since signed to 65,000 years ago, with migration to Europe languages are as sophisticated linguistically as and the Near East occurring from western or spoken ones (Armstrong et al. 1995; Emmorey southern Asia, rather than from Africa as previ- 2002, Neidle et al. 2000), the pressure was al- ously thought. Those who migrated may have most certainly based on practical rather than already developed vocal speech, leaving behind linguistic considerations. One advantage of African speakers who retained click sounds. speech is that it can be carried on at night, The only known non-African click language or when the line of sight between sender and is Damin, an extinct Australian aboriginal lan- receiver is blocked. Communication at night guage. Homo sapiens may have arrived in Aus- may have been critical to survival in a hunter- tralia as early as 60,000 years ago (Thorne gatherer society. The San, a modern hunter- et al. 1999), not long after they migrated out gatherer society, are known to talk late at night, of Africa. This is not to say that the early Aus- sometimes all through the night, to resolve con- tralians and Africans did not have full vocal flict and share knowledge (Konner 1982). control; rather, click languages may be simply Speech is much less energy-consuming than a vestige of earlier languages in which sound manual gesture. Anecdotal evidence from Corballis: The Evolution of Language 29 courses in sign language suggests that the in- This account, though, does not explain the ex- structors require regular massages in order traordinary complexity and flexibility of hu- to meet the sheer physical demands of sign- man language relative to other forms of animal language expression. In contrast, the physio- communication. logical costs of speech are so low as to be nearly unmeasurable (Russell et al. 1998). In terms of Grammar expenditure of energy, speech adds little to the cost of breathing, which we must do anyway to In parallel with the gradual switch from man- sustain life. ual to vocal mode, language must have ac- But it was perhaps the freeing of the hands quired the distinctive grammatical properties for other adaptive functions, such as carrying that allow us the unlimited freedom of ex- things, and the manufacture of tools, that was pression. Following Chomsky (1975), this has probably the most critical. Vocal language al- commonly been attributed to universal gram- lows people to use tools and at the same time mar, considered to be an innate endowment explain verbally what they are doing, leading unique to humans. More recent developments perhaps to pedagogy (Corballis 2002). Indeed, suggest, though, that grammar arises from this may explain the dramatic rise of more more general cognitive capacities (Christiansen sophisticated tools, bodily ornamentation, art, & Chater 2008, Tomasello 2003, 2008; and perhaps music, in our own species. These Wray 2002). developments have been dubbed a “human revolution” (Mellars & Stringer 1989), dating Mental Time Travel from some 40,000 years ago in Europe, but more recent evidence suggests that the rise One such capacity may be the ability to travel toward technological and behavioral moder- mentally in time. The concept of mental time nity originated in Africa. Two phases of tech- travel was based initially on a distinction, drawn nological innovation in southern Africa have by Tulving (1972), between two forms of mem- been dated at around 70,000 to 75,000 years ory. Semantic memory is our vast storehouse of ago and 60 to 65,000 years ago, respectively facts about the world, the combined dictionary (Jacobs et al. 2008), just preceding the esti- and encyclopedia of the mind, whereas episodic mated date of human dispersal from Africa at memory is the memory for events, the mind’s around 55,000 to 60,000 years ago (Mellars personal diary. Both are regarded as forms of 2006). These developments, associated with what has been called declarative memory—or Homo sapiens, are often attributed to the emer- memory that can be declared—which already gence of language itself, but I have proposed suggests a connection with language. Episodic elsewhere, and suggest here, that the criti- memory, unlike semantic memory, implies a cal innovation was not language, but speech mental journey into the past, as when one men- (Corballis 2004b). tally relives or imagines some past episode. Tul- In summary, the view adopted here is that ving (1972, 1985) proposed that episodic mem- language is a gestural system, based on the pri- ory is unique to humans. mate mirror system. In the course of time, it To some extent, these two forms of memory shifted from a general bodily system to one oc- must be related. Tulving (2002) has argued, for cupying a small region of the body, namely the instance, that the storage of episodic memo- articulatory organs of the vocal tract, and to ries must depend on semantic memories that a system requiring minimal energy. This freed are already in place, but are then related to the rest of the body from obligatory involve- the self in subjectively-sensed time. This allows ment in communication, allowing the hands the experience of an event to be stored sepa- to be used for other manipulative activities. rately from the semantic system. Yet there is 30 Annals of the New York Academy of Sciences also evidence that semantic and episodic mem- the remarkable feature of mental time travel ory can be doubly dissociated. In most cases of in humans may not be its existence but rather amnesia, episodic memory is lost while seman- its flexibility and combinatorial structure. Our tic memory remains largely intact (e.g., Tulving past and future imaginings are typically made et al. 1988). Conversely, people with semantic up of different combinations of elements that dementia, a degenerative neurological disorder are otherwise familiar, such as people, objects, that afflicts some people in late adulthood, show actions, situations, and emotions. severe decline in semantic memory, but their Human language is exquisitely designed episodic memories remain remarkably and sur- to transmit exactly this kind of information prisingly intact (Hodges & Graham 2001). (Corballis & Suddendorf 2007). As Pinker The idea that episodic memory involves re- (2003, p. 27) put it, language seems to have enactment of past episodes can be extrapolated evolved to express “who did what to whom, to the more general idea of mental time travel, when, where, and why,” thus allowing personal the human capacity to travel mentally both experiences to be shared, with consequent ben- forward and backward in time (Suddendorf efits in social bonding and social and practical & Corballis 1997, 2007). Evidence from understanding. The sharing of past, future, and brain imaging that remembering the past indeed imaginary experiences places a much and imagining the future activate a common greater burden on the communication chan- “core” network (e.g., Addis et al. 2007), and nel than if events are experienced only in the patients with amnesia typically have as much present. Events in the present are shared by mu- difficulty imagining future episodes as in tual experience, and it may take only a few sig- recalling past ones (e.g., Hassabis et al. 2007). nals to direct attention, or to convey the impor- Mental time travel into the future does not of tance of some components rather than others. course imply precognition but rather refers to Animals will sometimes use simple signals to the ability to imagine possible future episodes, draw attention to events that their conspecifics whether for the purpose of planning detailed may not be able to see, as when chimpanzees activities or for comparing and evaluating use pant hoot calls to signal the discovery of different strategies. Episodic memory provides food (Goodall 1986), or vervet monkeys use the vocabulary of scenarios that enable us to different calls to warn of different predators envisage particular future scenarios, its survival (Cheney & Seyfarth 1990), but no syntax or value must lie, not in the memory component combinatorial structure is necessary. per se, but rather in what it contributes to The time dimension vastly increases the present and future survival. mental canvas, since reference to different times Just as Tulving proposed episodic memory generally involves different places, different ac- to be uniquely human, so it has been pro- tions, different actors, and so on. In order to posed that only humans have the capacity for represent or refer to episodic elements that are mental time travel (Suddendorf & Corballis not available in the present, we need very large 1997, 2007). This has been challenged (e.g., vocabularies of concepts, as well as of words Clayton et al. 2003, and see commentaries on to represent them. And we need rules to rep- Suddendorf & Corballis 2007), but whether or resent the way in which the elements of an not genuine counter-examples can be found in event are combined, and corresponding rules nonhuman animals there seems little doubt that to convey these combinations to others in the the human ability to conjure past and future form of language. If there is adaptive advantage episodes, and indeed purely imaginary ones, to be gained from mental time travel through exceeds any capacity so far demonstrated in one’s own personal experiences, that advan- nonhuman animals. Even if nonhuman ani- tage can be multiplied by adding the experi- mals can be shown to travel mentally in time, ences of others. It provides information about Corballis: The Evolution of Language 31 possible scenarios, about how individuals be- ancestors produced tools at one site for use at have in different circumstances, about how the another (Hallos 2005). Tool manufacture re- social world works. This perhaps explains the mained fairly static, though, for most of the human predilection for gossip, and also for fic- Pleistocene until the dramatic surge in tech- tion, as in stories, novels, plays, TV soaps, and nological innovation associated with our own the like. Language, then, was born of the value species from around 60,000 to 75,000 years of sharing, and indeed inventing, scenarios in- ago, discussed earlier. volving the interactions of humans with each The increased memory demands due to both other and with the world. mental time travel and language may well have It is likely that the distinctive features of both driven the dramatic increase in brain size as- language and mental time travel evolved during sociated with the Pleistocene. The brain size the Pleistocene. This era is usually dated from of the early hominins was about the same, rel- about 1.8 million years to about 10,000 years ative to body size, as that of the present-day ago (e.g., Janis 1993)—although it has been ar- great apes, but from the emergence of the genus gued that it should be dated from as early as Homo some 2 to 2.5 million years ago it in- 2.58 million years ago (Suc et al. 1997), which creased, and had doubled by about 1.2 million corresponds more closely to the emergence of years ago. It reached a peak, not with Homo the genus Homo. With the global shift to cooler sapiens, but with the Neandertals, who shared a climate after 2.5 million years ago, much of common ancestry with modern humans until southern and eastern Africa probably became about 700,000 years ago (Noonan et al. 2006). more open and sparsely wooded (Foley 1987). In some individual Neandertals, brain capacity This left the hominins not only more exposed to seems to have been as high as 1800 cc, with attack from dangerous predators, such as saber- an average of around 1450 cc. Brain size in our tooth cats, lions, and hyenas, but also obliged to own species, Homo sapiens, is a little lower, with a compete with them as carnivores. The solution present-day average of about 1350 cc (Wood & was not to compete on the same terms, but to Collard 1999). This is about three times the size establish what Tooby and DeVore (1987) called expected for a great of the same body size. the “cognitive niche,” relying on social coop- Of course brain size depends on factors other eration and intelligent planning for survival. than cognitive demands, such as body size. Fos- The problem is that the number of combina- sil evidence suggests that the absolute size of tions of actions, actors, locations, time periods, the human brain has decreased from around implements, and so forth that define episodes 35,000 years ago, but this was paralleled by a becomes very large, and a system of holistic decrease in body size (Ruff et al. 1997). calls to describe those episodes rapidly taxes the Let’s consider in more detail, then, some of perceptual and memory systems. Syntax may the ways in which mental time travel has shaped then have emerged as a series of rules whereby language. episodic elements could be combined. Further evidence that mental time travel Symbols and Mime evolved during the Pleistocene comes from stone tools, which appear to have been trans- In order to communicate about events at ported for repeated use. As we have seen, points in time other than the present, we the relatively primitive Oldowan industry goes require ways of referring to them in absentia. back some 2.5 million years (Semaw et al. Here, the use of manual gesture has some- 1997), but more direct evidence that mental thing of an advantage over speech, especially time travel was involved comes from the recon- as bipedalism emerged, freeing the hands for struction of knapping routines suggesting that, other purposes, including communication. A by the Middle Pleistocene at least, our hominin feature of the hands and arms, once freed from 32 Annals of the New York Academy of Sciences locomotory and postural duties, is that they can ture may still be necessary to establish links in move in four dimensions (three of space and the first place—the child can scarcely learn the one of time), and so mimic real-world events. meaning of the word dog unless someone draws The early hominins, though, were facultative her attention to the animal itself—but there is bipeds, retaining some adaptive features of otherwise no reason why the labels themselves arboreal life, and there is little evidence that need not be based on patterns of sound. Of they were different in cognitive or behavioral course some concepts, such as the moo of a terms from the other great apes. Obligate cow or miaow of a cat, depend on sound rather bipedalism emerged later, with the genus Homo, than sight, and it is not surprising that the words and Donald (1991) suggested that this marked for these sounds tend to be onomatopoeic. An- the beginning of what he called “mimetic cul- other example is zanzara,theevocativeItalian ture,” in which events were related through word for mosquito, and Pinker (2007) notes a mimed movements of the body, with the hands number of newly minted examples: oink, tin- assuming special importance. Communication kle, barf , conk, woofer, tweeter. But most spoken of this sort persists in dance, ballet, and mime, words bear no physical relation to what they and we all resort to mime when trying to com- represent. municate with people who speak a language The Swiss linguist Ferdinand de Saussure different from our own. (1977/1916) wrote of the “arbitrariness of the The modern sign languages of the deaf are sign” as a defining property of language, and also partially dependent on mime, or on di- on this basis it is sometimes supposed that rect copying of real-world actions. It has been signed languages, with their strong founda- estimated, for example, that in Italian Sign tion in iconic representations, are not true lan- Language some 50% of the hand signs and guages. The arbitrariness of words (or mor- 67% of the bodily locations of signs stem phemes) is not so much a necessary property from iconic representations, in which there of language, though, as a matter of expedience, is a degree of spatiotemporal mapping be- and of the constraints imposed by the language tween the sign and its meaning (Pietrandea medium. Speech, for example, requires that the 2002). Emmorey (2002) notes that in ASL some information be linearized, squeezed into a se- signs are purely arbitrary, but many more are quence of sounds that are necessarily limited iconic. For example, the sign for “erase” re- in terms of how they can capture the physi- sembles the action of erasing a blackboard, cal nature of what they represent. The linguist and the sign for “play piano” mimics the ac- Charles Hockett (1978) put it this way: tion of actually playing a piano. But signs can when a representation of some four-dimen- be iconic without being transparently so, and sional hunk of life has to be compressed into the sin- they often cannot be guessed by na¨ıve observers gle dimension of speech, most iconicity is necessar- (Pizzuto & Volterra 2000). They also tend to ily squeezed out. In one-dimensional projections, become less iconic and more arbitrary over an elephant is indistinguishable from a woodshed. time, in the interests of speed, efficiency, and Speech perforce is largely arbitrary, if we speakers the constraints of the communication medium. take pride in that, it is because in 50,000 years or so of talking we have learned to make a virtue of This process is known as conventionalization necessity (pp. 274–275). (Burling 1999). Once the principle of conventionalization is The symbols of signed languages are not established, there is no need for communica- so constrained. The hands and arms can tion to retain an iconic component, or even to mimic the shapes of real-world objects and ac- depend on visual signals. We are quick to learn tions, and to some extent lexical information arbitrary labels, whether for objects, actions, can be delivered in parallel instead of being emotions, or abstract concepts. Manual ges- forced into rigid temporal sequence. Even so, Corballis: The Evolution of Language 33 conventionalization allows signs to be simpli- ditional, continuous and non-continuous, and fied and speeded up, to the point that many of so on. Thus the words walk, walked,andwalking, them lose most or all of their iconic aspect. For along with auxiliaries (e.g., will walk, might have example, the ASL sign for home was once a com- been ), refer to different times or timing bination of the sign for eat, which is a bunched conditions to do with a perambulatory event. hand touching the mouth, and the sign for sleep, Some languages have no tenses as such, but which is a flat hand on the cheek. Now it consists have other ways of indicating time. Chinese, of two quick touches on the cheek, both with for example, has no tenses, but the time of an a bunched handshape, so the original iconic event can be indicated by adverbs, such as to- components are effectively lost (Frishberg morrow, and what are called aspectual markers, 1975). as in a sentence that might be roughly rendered The increasing complexity of human culture as He break his before (Lin 2005). as it developed during the Pleistocene and be- The variety of ways in which time is marked yond no doubt accelerated the drive toward in different languages suggests cultural influ- conventionalization and increased efficiency. It ences rather than the operation of universal has been estimated that the average literate per- grammar. A revealing example comes from the son today knows some 50,000 concepts (Pinker language spoken by the Piraha,˜ a tribe of some 2007), and even with the degrees of freedom 200 people in Brazil, which has only a very re- afforded by hand and arm movements it would stricted way of talking about relative time. This be slow and cumbersome in the extreme to takes the form of two tense-like morphemes represent even the spatial concepts manually— which indicate simply whether an event is in the although we often point or incline the head to present or not. Piraha˜ also includes a few words indicate direction (“He went that way”). Con- serving as temporal markers, such as night, day, ventionalization can be regarded simply as a full moon, and so on. The Pirahaaresaidtolive˜ device, dependent on associative learning, to largely in the present, with no creation myths, streamline the representational system for max- no art or drawing, no individual or collective imum efficiency. Just as signs become more memory for more than two generations past compact, so words tend to become shorter with (Everett 2005). more frequent use. This is captured by Zipf’s One might suppose that the apparent law, which states that the length of a word is paucity of mental time travel in Piraha˜ life re- inversely proportional to its rank in frequency. sults from the lack of ways to express time, as The reason for this is evident from the title of suggested by the Whorfian hypothesis. But the Zipf’s 1949 book, Human Behavior and the Principle reverse is more likely true—that is, the language of Least-Effort. Hence we have the progression of the Piraha˜ adapted to their lack of concern from television to telly to TV , or in my own coun- for time (cf. Christiansen & Chater 2008). The try from universitytovarsity to uni. Piraha˜ language is limited in other ways besides the dearth of time markers. It has no numbers Marking Time or system of counting, no color terms, and may even be said to lack verbs, in the sense of a verb If it is to convey information about episodes, as a linguistic class, the Pirahalearnverbsone˜ language must include some mechanism for by one as individual entities. There is no recur- conveying when in time they occurred, or will sive embedding of clauses (Everett 2005). One occur. In many languages this is accomplished might be tempted to believe that the Piraha˜ by tense markers. In English, for example, verbs suffer from some genetic defect, but this idea describing actions and states are endowed with is rejected by Everett, who describes them as tense to indicate different points in time, as well “some of the brightest, pleasantest, and most as distinctions between conditional and uncon- fun-loving people that I know” (p. 621). 34 Annals of the New York Academy of Sciences

Everett suggests that even these additional (1957) attempted an answer. In an appendix to features derive fundamentally from their very that book, Skinner proposed that Whitehead limited sense of time, supporting the idea that was unconsciously expressing the fear that be- the characteristics of language derive from haviorism might indeed take over, likening it mental time travel. He writes: to a black scorpion that he would not allow to [the] apparently disjointed facts about the Piraha˜ tarnish his philosophy. Skinner’s explanation is language—gaps that are very surprising from just ironic, because it seems to owe more to psycho- about any grammarian’s perspective—ultimately analysis than to behaviorism, and Skinner was derive from a single cultural constraint in Piraha,˜ well known for anti-Freudian views. namely, the restriction of communication to the immediate We now know, largely through the efforts of experience of the interlocutors. (p. 622). Chomsky (1957 1959), that language cannot Everett’s work on the Piraha˜ is understand- be explained in terms of learned sequences. In- ably controversial (see the critique by Nevins stead, it depends on rules. These rules combine et al. 2007 and the response by Everett 2007). words in precise ways to enable us to create and Despite its seeming simplicity, though, Piraha˜ extract an essentially unlimited number of dif- language is rich in morphology and prosody. ferent meanings. As the German philosopher Everett insists that it should not be regarded Gottlob Frege (1980/1914, p. 79) put it: as in any way “primitive” and suggests that it is probably not especially unusual. Other lan- The possibility of our understanding sentences that we have never heard before rests evidently on this, guages of nonliterate peoples may have similar that we can construct the sense of a sentence out characteristics. For example, the Iatmul lan- of parts that correspond to words. guage of New Guinea is also said to have no recursion (Karlsson 2007). Tomasello (2003) The combinatorial structure of sentences, I suggests that theories of language have been suggest, derives in large part from the combi- unduly influenced by the characteristics of writ- natorial structure of episodes, and words pro- ten language, and remarks that “there are very vide the access to the components of episodes. few if any specific grammatical constructions Most of the episodes we witness, remember, or or markers that are universally present in all construct in our minds are combinations of the languages” (p. 5). familiar. Indeed it is generally the combinations that count, rather than the individual elements. In Whitehead’s sentence, the notions of a black Generativity scorpion, falling, and a table are of themselves The most distinctive property of language is of less interest than the unusual combination of that it is generative (Chomsky 1966). We can a scorpion in downward above the very both construct and understand sentences that table at which the two savants sat—and there we have never used or heard before. A classic may have been relief that this unusual event was example comes from the British philosopher not occurring. The manner in which the words Alfred North Whitehead. In 1934 he had been describing such episodes are arranged depends seated at dinner next to B.F. Skinner, who was on the conventions that make up grammar. trying to explain how the principles of behav- One such convention has to do with the or- iorism would change the face of psychology. der in which words are uttered or signed. The Obliged to challenge this view, Whitehead ut- most basic episodes are those involving objects tered the sentence “No black scorpion is falling and actions, so the first “words” were probably upon this table,” and asked Skinner to explain nouns and verbs—an idea that goes back to the the behavioral principles that might have led 19th-century English philologist John Horne himtosaythat.Itwasnotuntilthepublication Tooke (1857), who regarded nouns and verbs of Verbal Behavior 23 years later that Skinner as “necessary words.” The prototypical episode Corballis: The Evolution of Language 35 of someone doing something to someone or Sign Language (ABSL) is an SOV language something else, then, requires one noun to be (Aronoff et al. 2007). In any event, episodes the subject, another to be a verb describing the themselves are typically sequential, and it is nat- action, and another noun to be the object of ural for the sequence of events to be copied into the action. How these are ordered is simply a the language that expresses them. matter of convention. In English, the conven- tion is to place them in the order subject verb object Grammaticalization (SVO). To use a well-chewed example, the sen- tence “dog bites man” means something very Grammar can be regarded as a device different from “man bites dog,” the latter is for making communication more efficient and news, the former simply a personal misfortune. streamlined. For example, many of the words Among the world’s languages, the most com- we use do not refer to actual content, but serve mon word order is SOV, with the verb at the functions that are purely grammatical. These end, but all possible combinations exist among are called function words, and include articles, the world’s languages. such as a and the, prepositions such as at, on, In speaking, we can only utter one word at a or about, and auxiliaries such will in “They will time, so word order can be critical. Some lan- come.” Function words nevertheless almost cer- guages, though, mark the roles played by dif- tainly have their origins in content words, and ferent words with changes to the words them- the process by which content words are stripped selves. In Latin, for example, the subject and of meaning to serve purely grammatical func- object of a sentence are signaled by different tions is known as grammaticalization (Heine & inflections—changes to the end of the word— Kuteva 2007; Hopper & Traugott 1993). A and the words can be reordered without losing classic example is the word have,whichpro- the meaning. So canis virum mordet means “dog gressed from a verb meaning to “seize” or bites man,” while canem vir mordet means “man “grasp” (Latin capere), to one expressing posses- bites dog,” although it would be normal still sion (as in I have a pet porcupine,Latinhabere), to a to place the subject first. The Australian abo- marker of the perfect tense (“I have gone”) and riginal language Walpiri is a more extreme ex- a marker of obligation (“I have to go”). Simi- ample of an inflected language in which word larly, the word will probably progressed from a order makes essentially no difference. Such lan- verb (as in “Do what you will”) to a marker of guages are sometimes called scrambling languages. the future tense (“They will laugh”). Chinese, by contrast, is an example of an iso- Another example comes from the word go. lating language, in which words are not inflected It still carries the meaning of travel, or making and different meanings are created by adding a move from one location from to another, but words or altering word order. English is closer in sentences like “We’re going to have lunch” to being an isolating language than a scram- it has been bleached of content and simply in- bling one. dicates the future. The phrase going to has been Unlike spoken languages, signed languages compressed into the form gonna,asin“We’re are less constrained to present words in se- gonna have lunch,” or even “I’m gonna go.” quence, since they make use of spatial infor- In the US, people make an additional com- mation as well as sequential information, and pression when they say “Let’s go eat,” where different words can be signed simultaneously. we less hungry Kiwis say “Let’s go and eat.” For example, the hand can take the shape of an I’m waiting to hear someone to say “Let’s go object while its movement indicates the action. go-go.”b Even so, the order in which signs are displayed can be critical. In ASL the basic order is SVO, b I have since learned that “Let’s go go-go” is the battle song of the while the newly emerged Al-Sayyid Bedouin Chicago White Sox. 36 Annals of the New York Academy of Sciences

There are other ways in which grammatical- produce meanings that were earlier expressed ization operates to make communication more holistically (see Kirby & Hurford 2002, for streamlined. One has to do with the embed- a review). ding and concatenization of phrases. For ex- ample, the statements “He pushed the door” and “The door opened” can be concatenated Putting It Together into “He pushed the door open.” Statements like “My uncle is generous with money” and In this article, I have suggested two ways in “My uncle helped my sister out” can be con- which language evolved. First, I have argued catenated by embedding the first in the second: that language evolved from intentional manual “My uncle, who is generous with money,helped gestures, with vocal gestures gradually assum- my sister out.” One can also alter the priority ing dominance, perhaps with the emergence of of the two statements by reversing the embed- Homo sapiens within the last 200,000 years. In ding: “My uncle, who helped my sister out, is this sense, language is an extension of the so- generous with money.” called mirror system, whereby primates under- Efficiency can also be improved by breaking stand the actions of others. Second, I have pro- down concepts into component parts, which posed that the evolution of mental time travel can then be recombined to form new concepts. and the awareness of time led to a more com- An interesting example comes from a signed plex form of language for the communication of language. In Nicaragua deaf people were iso- episodes that take place at times other than the lated from one another until the Sandinista present. The demands of referring to episodes government assumed power in 1979 and cre- that are not immediately accessible to the senses ated the first schools for the deaf. Since that led to the construction of grammar, or gram- time, the children in these schools invented mars, which probably took place over the past their own sign language, which has blended 2 million or so years from the emergence of the into the system now called Lenguaje de Sig- large-brained genus Homo. nos Nicaraguense¨ (LSN). In the course of time, Although mental time travel may have set LSN has changed from a system of holistic signs the initial stage for language, the two must to a more combinatorial format. For example, also have co-evolved. Thus Gardenfors¨ (2004) one generation of children were told a story of writes that, in his view, “there has been a co- a cat that swallowed a bowling ball, and then evolution of cooperation about future goals rolled down a steep street in a “waving, wob- and symbolic communication” (p. 243). Lan- bling manner.” The children were then asked guage itself adds to the capacity for mental to sign the motion. Some indicated the motion time travel, since it provides a means by which holistically, moving the hand downward in a people can create the equivalent of episodic waving motion. Others, however, segmented memories in others, and therefore contributes the motion into two signs, one representing to their episodic thinking. By telling you what downward motion and the other representing happened to me, I can effectively create an the waving motion, and this version increased imagined episode in your mind, and this added after the first cohort of children had moved information might help you adapt more effec- through the school (Senghas et al. 2004). tively to future conditions. And by telling you One need not appeal to universal gram- what I am about to do, you may form an image martoexplainhowthiskindofsegmenta- in your own mind, and work out a plan to help tion occurs. Computer simulations have shown me, or perhaps thwart me. that cultural transmission can change a lan- Just when language became grammatical in guage that begins with holistic units into one relation to when speech became the dominant in which sequences of forms are combined to mode has been a matter of conjecture. Arbib Corballis: The Evolution of Language 37

(2005) proposed that language evolved from been more prominent among African tribes manual gestures, but suggested that manual than among those of the industrialized West, as communication did not progress beyond what suggested by the following provocative quote he called protosign. This is the signed equivalent from the 19th-century British explorer, Mary of what Bickerton (1995) called protolanguage, Kingsley (1965/1897): which is effectively language without gram- [African languages are not elaborate enough] to mar, as exemplified by forms of communica- enable a native to state his exact thought. Some tion acquired by captive great apes such as of them are very dependent upon gesture. When Washoe and Kanzi, or by the 2-year-old child. I was with the Fans they frequently said “We will It is roughly equivalent to the mimetic stage go to the fire so that we can see what they say”, proposed by Donald (1991). In Arbib’s view, when any question had to be decided after dark, and the inhabitants of Fernando Po, the Bubis, are the transition was then from protosign to pro- quite unable to converse with each other unless tospeech, and grammatical language evolved they have sufficient light to see the accompanying from there. gestures of the conversation (p. 504). Since the signed languages of the deaf are fully grammatical, there seems no reason in While this may seem condescending, it may principle why gestural language should not wellbethecasethatsomeculturesmay have achieved the status of full language prior make more use of manual gesture than others, to the emergence of speech as the dominant through cultural rather than biological neces- mode. It is perhaps more likely, though, that sity. In suggesting that the African languages the switch from protolanguage to language was she observed were not elaborate, Kingsley also itself gradual and occurred in parallel with the overlooked her own observation that elabora- switch from a primarily manual form of com- tion was provided by manual gestures, not by munication to a primarily vocal one, with vary- spoken words. ing degrees of admixture. The distinction be- I suggested above that grammar and speech tween protolanguage and language is generally evolved independently, but this may not be depicted as all-or-none, perhaps encouraged by completely true. As suggested earlier, speech the notion that grammatical language depends may have freed the hands for manufacture, on the innate, uniquely human endowment vastly increasing the number of objects to be known as universal grammar (e.g., Chomsky named. This may have increased the pressure 1975). But if grammar emerges gradually, as for language to grammaticalize and become suggested earlier, and is culturally rather than more efficient and sophisticated. We saw ear- biologically tuned, then the evolution of gram- lier that the languages of nonliterate societies mar and the switch from manual to vocal modes may be simpler in terms of such features as may have been contemporaneous and largely the recursive embedding of clauses. The rise independent. of may also have increased the de- Just as grammars vary considerably between mand for language to function as a pedagog- cultures, so different cultures may vary in the ical device. To that extent, then, Arbib (2005) extent to which speech dominates. At one ex- may be correct in suggesting that grammati- treme, of course, are the signed languages de- calization accelerated after speech became the veloped in deaf communities. Less extreme dominant mode. are signed languages developed by some na- tive Australian tribes (Kendon 1988), and in the so-called Plains Sign Talk of Native North Summary and Conclusions American tribes (Mithun 1999), in both cases, theses tribes also speak, but use signed language I have argued that language evolved from for special purposes. Signing may also have the mirror system in primates, which provides 38 Annals of the New York Academy of Sciences a platform for both the production and per- must be an innate component, since grammat- ception of intentional bodily acts. This system ical language is universally and uniquely hu- was adapted for communication in hominins in man. The critical question is whether the con- part as a result of bipedalism, which freed the cept of universal grammar is useful in helping us hands for more varied and elaborate gestures. understand the different forms that languages But the climb to true language probably began can take, from manual to vocal, from Piraha˜ with the emergence of the genus Homo,andthe to standard English. It may be more useful to pressure to more cooperative behavior during view the constructive nature of language as the the Pleistocene, when our forebears were forced product of what Locke and Bogin (2006), after from the canopy onto the more open and Marler (1991), called an “instinct for inventive- dangerous environment of the . Mem- ness” that goes beyond language per se. This ory systems evolved from simple learning and instinct may well be uniquely human but is ev- pattern recognition to the storage and retrieval ident in many activities other than language, of particular episodes, enabling more precise including mental time travel, manufacture, art, planning and prediction of the future. Mental music, and other modes of storytelling, such as time travel into past and future also gave rise dance, drama, movies, and television. We are to a sense of the self through time. One con- at once the most articulate and time-conscious sequence may have been an understanding of of all species, and I dare any other species to mortality, leading to the emergence of religions contradict me. promising a life after death. Language was adapted to the sharing of Acknowledgments episodic information, whether based on mem- ory, future plans, or on fiction. Grammatical I thank Thomas Suddendorf and an anony- language, whether signed or spoken, seems mous referee for their constructive comments uniquely adapted for the sharing of this in- on an earlier draft. Others who have con- formation in an efficient, streamlined manner. tributed significantly to my thinking on lan- Along with the development of grammar, vocal guage evolution include Nicola Clayton, Karen elements were increasingly introduced, so that Emmorey, Tecumseh Fitch, Maurizio Gen- with the emergence of our own species, Homo tilucci, Russell Gray, Nicholas Humphrey, Jim sapiens, speech became the dominant mode. Hurford, Giacomo Rizzolatti, and Michael This freed the hands for the development of Studdert-Kennedy—but not all of them agree more sophisticated manufacture and use of with me. tools, as well as other artifacts such as bodily ornamentation, clothing, and perhaps musical instruments. This may have given further impe- Conflicts of Interest tus to the development of sophisticated gram- The author declares no conflicts of interest. mar and the use of language for more exten- sive and varied activities, such as pedagogy or argument. References A recurrent theme of this article has been the cultural shaping of language. The sheer variety Addis, D. R., Wong, A. T., & Schacter, D. L. (2007). of different language structures argues against Remembering the past and imagining the future: Chomsky’s (1975) notion of an innate univer- Common and distinct neural substrates during event sal grammar, or what Pinker (1994) called the construction. Neuropsychologia, 45, 1363–1377. Arbib, M. A. (2005). From monkey-like action recognition “language instinct.” This point has been elab- to human language: An evolutionary framework for orated in the recent article by Christiansen and neurolinguistics. Behavioral & Brain Sciences, 28, 105– Chater (2008). At some level, though, there 168. Corballis: The Evolution of Language 39

Armstrong, D. F. (1999). Original Signs: Gesture, Sign, and the Chomsky, N. (1975). Reflections on Language.NewYork: Source of Language. Washington: Gallaudet University Pantheon. Press. Christiansen, M. H., & Chater, N. (2008). Language as Armstrong, D. F., Stokoe, W. C., & Wilcox, S. E. (1995). shaped by the brain. Behavioral & Brain Sciences, 31, Gesture and the Nature of Language. Cambridge: Cam- 489–558. bridge University Press. Christiansen, M. H., & Kirby, S. (2003). Language evo- Armstrong, D. F., & Wilcox, S. E. (2007). The Gestural Origin lution: The hardest problem in science? In M. H. of Language. Oxford: Oxford University Press. Christiansen & S. Kirby (Eds.), Language Evolution (pp. Aronoff, M., Meir, I., Padden, C. A., et al. (2007). The 1–15). Oxford: Oxford University Press. roots of linguistic organization in a new language. Clayton, N. S., Bussey, T. J., & Dickinson, A. (2003). Can Interaction Studies, 9, 133–153. animals recall the past and plan for the future? Nat. Barsalou, L. W. (2008). Grounded cognition. Ann. Rev. Rev. Neurosci., 4, 685–691. Psych., 59, 617–645. Condillac, E. B. de. (1971). An Essay on the Origin of Human Bickerton, D. (1995). Language and Human Behavior. Seattle: Knowledge. T. Nugent (Tr.), Gainesville, FL: Scholars University of Washington Press. Facsimiles and Reprints. (Work originally published Binkofski, F., & Buccino, G. (2004). Motor functions of 1746). the Broca’s region. Brain & Language, 89, 362–389. Coop, G., Bullaughev, K., Luca, F., & Przeworski, M. Broca, P.(1861). Remarques sur le siegedelafacult` edela´ (2008). The timing of selection of the human FOXP2 parole articulee,´ suivies d’une observation d’aphemie´ gene. Mol. Biol. Evol., 25, 1257–1259. (perte de parole). Bulletin de la Soci´et´e d’Anatomie (Paris), Corballis, M. C. (1991). The Lopsided Ape.NewYork:Ox- 36, 330–357. ford University Press. Browman, C. P., & Goldstein, L. F. (1995). Dynamics and Corballis, M. C. (1992). On the evolution of language and articulatory phonology. In T.van Gelder & R. F.Port generativity. Cognition, 44, 197–226. (Eds.), Mind as Motion (pp. 175–193). Cambridge, Corballis, M. C. (2002). From Hand to Mouth: The Origins of MA: MIT Press. Language. Princeton, NJ: Princeton University Press. Buccino, G., Binkofski, F., Fink, G. R., et al. (2001). Action Corballis, M. C. (2003). From mouth to hand: observation activates premotor and parietal areas Gesture, speech, and the evolution of right- in a somatotopic manner: An fMRI study. Eur. J. handedness. Behavioral & Brain Sciences, 26, 199– Neurosci., 13, 400–404. 260. Buccino, G., Lui, F., Canessa, N., Patteri, I., Lagravi- Corballis, M. C. (2004a). FOXP2 and the mirror system. nese, G., Benuzzi, F., et al. (2004). Neural circuits Trends Cogn. Sci., 8, 95–96. involved in the recognition of actions performed by Corballis, M. C. (2004b). The origins of modernity: Was nonconspecifics: An fMRI study. J. Cogn. Neurosci., autonomous speech the critical factor? Psychol. Rev., 16, 114–126. 111, 543–522. Bulwer, J. (1644). Chirologia: On the Natural Language of the Corballis, M. C., & Suddendorf, T.(2007). Memory, time, Hand. London. and language. In C. Pasternak (Ed.), What Makes Us Burling, R. (1999). Motivation, conventionalization, and Human (pp. 17–36). Oxford, UK: Oneworld Publi- arbitrariness in the origin of language. In B. J. King cations. (Ed.), The Origins of Language: What Nonhuman Primates Cordemoy, G.de. (1972). Discours physique de la parole Can Tell Us (pp. 307–350). Santa Fe, NM: School of [A philosophical discourse concerning speech, con- American Research Press. formable to the Cartesian principles]. Delmar, NY: Burling, R. (2005). The Talking Ape.NewYork:Oxford Scholars’ Facsimiles & Reprints, Inc (Work originally University Press. published 1688). Calvert, G. A., & Campbell, R. (2003). Reading speech Crystal, D. (1997). The Cambridge Encyclopedia of Language from still and moving faces: The neural substrates of (2nd ed.). Cambridge, UK: Cambridge University visible speech. J. Cogn. Neurosci., 15, 57–70. Press. Cheney, D. L., & Seyfarth, R. M. (1990). How Monkeys See Darwin, C. (1859). The Origin of Species. London: John the World. Chicago: University of Chicago Press. Murray. Chomsky, N. (1957). Syntactic Structures. The Hague: Mou- Darwin, C. (1896). The Descent of Man and Selection in Relation ton. to Sex. London: William Clowes. (Work originally Chomsky, N. (1959). A review of B. F. Skinner’s “Verbal published 1871). behavior.” Language, 35, 26–58. Decety, J., Grezes, J., Costes, N., et al. (1997). Brain ac- Chomsky, N. (1966). Cartesian Linguistics: A Chapter in the tivity during observation of actions. Influence of ac- History of Rationalist Thought.NewYork:Harper& tion content and subject’s strategy. Brain, 120, 1763– Row. 1777. 40 Annals of the New York Academy of Sciences

DeGusta, D., Gilbert, W. H., & Turner, S. P. (1999). Hy- Griebel (Eds.), Evolution of Communication Systems (pp. poglossal canal size and hominid speech. Proc. Nat. 237–256). Cambridge, MA: MIT Press. Acad. Sci. USA, 96, 1800–1804. Gardner, R. A., & Gardner, B. T. (1969). Teaching sign Descartes, R. (1985). Discourse on method. In J. Cotting- language to a chimpanzee. Science, 165, 664–672. ham, R. Stootfoff, & D. Murdock (Eds. & tr.), The Gentilucci, M. (2003). Grasp observation influences Philosophical Writings of Descartes. Cambridge: Cam- speech production. Eur. J. Neurosci., 17, 179–184. bridge University Press. (Work originally published Gentilucci, M., Benuzzi, F., Gangitano, M., et al. (2001). 1647). Grasp with hand and mouth: A kinematic study on Dobzhansky, T. (1973). Nothing in biology makes sense healthy subjects. J. Neurophysiol., 86, 1685–1699. except in the light of evolution. Amer. Biol. Teacher, 35, Gentilucci, M., Santunione, P., Roy, A. C., et al. (2004). 125–129. Execution and observation of bringing a fruit to the Donald, M. (1991). Origins of the Modern Mind. Cambridge, mouth affect syllable pronunciation. Eur. J. Neurosci, MA: Harvard University Press. 19, 190–202. Egnor, S. E. R., & Hauser, M. D. (2004). A paradox in the Gentilucci, M., & Corballis, M. C. (2006). From man- evolution of primate vocal learning. Trends Neurosci., ual gesture to speech: A gradual transition. Neurosci. 27, 649–654. Biobehav. Rev., 30, 949–960. Emmorey, K. (2002). Language, Cognition, and Brain: Insights Gerardin, E., Sirigu, A., Lehericy, S., et al. (2000). Par- from Sign Language Research. Hillsdale, NJ: Erlbaum. tially overlapping neural networks for real and imag- Enard, W., Przeworski, M., Fisher, S. E., et al. (2002). ined hand movements. Cerebral Cortex, 10, 1093– Molecular evolution of FOXP2, a gene involved in 1104. speech and language. Nature, 418, 869–871. Gibson, K. R., & Jessee, S. (1999). Language evolution Evans, P. D., Mekel-Bobrov, N., Yallender, E. J., et al. and expansions of multiple neurological processing (2006). Evidence that the adaptive allele of the brain areas. In B. J. King (Ed.), The Origins of Language: What size gene microcephalin introgressed into Homo sapiens Nonhuman Primates Can Tell Us. Santa Fe, NM: School from an archaic Homo lineage. Proc. Nat. Acad. Sci. of American Research Press. USA, 103, 18178–18183. Givon,` T. (1995). Functionalism and Grammar. Philadelphia: Everett, D. L. (2005). Cultural constraints on grammar Benjamins.. and cognition in Piraha.˜ Curr. Anthro., 46, 621– Goldin-Meadow, S., & McNeill, D. (1999). The role of 646. gesture and mimetic representation in making lan- Everett, D. L. (2007). Cultural Constraints on Grammar in Pi- guage the province of speech. In M. C. Corballis & raha:˜ A Reply to Nevins, Pesetsky, and Rodrigues (2007). S. E. G. Lea (Eds.), The Descent of Mind (pp. 155–172). Available online at http://ling.auf.net/lingBuzz/ Oxford, UK: Oxford University Press. 000427 Goldstein, L., Byrd, D., & Saltzman, E. (2006). The role of Fogassi, L., & Ferrari, P.F.(2007). Mirror neurons and the vocal tract gestural action units in understanding the evolution of embodied language. Curr. Direct. Psych. evolution of phonology. In M. A. Arbib (Ed.), Action Sci., 16, 136–141. to Language via the Mirror Neuron System (pp. 215–249). Foley, R. (1987). Another Unique Species: Patterns in Human Cambridge, UK: Cambridge University Press. Evolutionary Ecology. Harlow: Longman Scientific and Goodall, J. (1986). The Chimpanzees of Gombe: Patterns of Technical. Behavior. Harvard University Press. Frege, G. (1980). Letter to Jourdain. In G. Gabriel, H. Gowlett, J. A. J. (1992). Early human mental abilities. Hermes,F.Kambartel,C.Thiel,&A.Veraart(Eds.), In S. Jones, R. Martin, & D. Pilbeam (Eds.), The Philosophical and Mathematical Correspondence (pp. 78– Cambridge Encyclopedia of Human Evolution (pp. 341– 80). Chicago: Chicago University Press (Work origi- 345). Cambridge: Cambridge University Press. nally published 1914) Grafton, S. T., Arbib, M. A., Fadiga, L., et al. (1996). Frishberg, N. (1975). Arbitrariness and iconicity in Amer- Localization of grasp representations in humans by ican Sign Language. Language, 51, 696–719. positron emission tomography. 2. Observation com- Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). pared with imagination. Exp. Brain Res., 112, 103– The motor theory of speech perception reviewed. 111. Psychonom. Bull. Rev., 13, 361–377. Grezes,` J., Costes, N., & Decety, J. (1998). Top-down effect Gallagher, H. L., & Frith, C. D. (2004). Dissociable neural of strategy on the perception of human biological pathways for the perception and recognition of ex- motion: A PET investigation. Cogn. Neuropsych., 15, pressive and instrumental gestures. Neuropsychologia, 553–582. 42, 1725–1736. Grice, P. (1975). Logic and conversation. In P. Cole & J. Gardenfors,¨ P. (2004). Cooperation and the evolution Morgan (Eds.), Syntax and Semantics, Vol. 3: Speech Acts of symbolic communication. In D. K. Oller & U. (pp. 43–58). New York: Academic Press. Corballis: The Evolution of Language 41

Grodzinsky, Y. (2006). The language faculty, Broca’s re- Corisco and Cameroons. London: F. Cass (Work origi- gion, and the mirror system. Cortex, 42, 464–468. nally published 1897). Groszer, M., Keays, D. A., Deacon, R. M. J., et al. (2008). Kirby, S., & Hurford, J. R. (2002). The emergence of lin- Impaired synaptic plasticity and motor learning in guistic structure: An overview of the iterated learning mice with a point mutation implicated in human model. In A. Cangelosi & D. Parisi (Eds.), Simulat- speech deficits. Curr. Biol., 18, 354–362. ing the Evolution of Language (pp. 121–148). London: Haesler, S., Rochefort, C., Georgi, B., et al. (2007). Incom- Springer-Verlag. plete and inaccurate vocal imitation after knockdown Kohler, E., Keysers, C., Umilta, M. A., et al. (2002). Hear- of FoxP2 in songbird basal ganglia nucleus area X. ing sounds, understanding actions: Action represen- PLoS Biol., 5, 2885–2897. tation in mirror neurons. Science, 297, 846–848. Hallos, J. (2005). “15 minutes of fame:” Exploring the tem- Konner, M. (1982). The Tangled : Biological Constraints poral dimensions of Middle Pleistocene lithic tech- on the Human Spirit.NewYork:Harper. nology. J. Hum. Evol., 29, 155–179. Knight, A., Underhill, P. A., Mortensen, H. M., et al. Hanakawa, T.,Immisch, I., Toma, K., et al. (2003). Func- (2003). African Y chromosome and mtDNA diver- tional properties of brain areas associated with mo- gence provides insight into the history of click lan- tor execution and imagery. J. Neurophys., 89, 989– guages. Curr. Biol., 13, 464–473. 1002. Krause, J., Lalueza-Fox, C., Orlando, L., et al. (2007). Hassabis, D., Kumaran, D., Vann, S. D., et al. (2007). The derived FOXP2 variant of modern humans Patients with hippocampal amnesia cannot imagine was shared with Neandertals. Curr. Biol., 17, 1908– new experiences. Proc. Nat. Acad. Sci. USA, 104, 1726– 1912. 1731. Kuhtz-Buschbeck, J. P., Mahnkopf, C., Holzknecht, C., Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The et al. (2003). Effector-independent representations faculty of language: What is it, who has it, and how of simple and complex imagined finger movements: did it evolve? Science, 298, 1569–1579. A combined fMRI and TMS study. Eur. J. Neurosci., Heine, B., & Kuteva, T. (2007). The Genesis of Grammar. 18, 3375–3387. Oxford:OxfordUniversityPress. Liebal, K., Call, J., & Tomasello, M. (2004). Use of gesture Hewes, G. W. (1973). Primate communication and the sequences in chimpanzees. Amer. J. Primat., 64, 377– gestural origins of language. Curr. Anthro., 14, 5–24. 396. Hockett, C. (1978). In search of love’s brow. American Liberman A. M., Cooper F. S., Shankweiler, D. P., et al. Speech, 53, 243–315. (1967). Perception of the speech code. Psychol. Rev., Hodges, J. R. & Graham, K. S. (2001). Episodic memory: 74, 431–461. Insights from semantic dementia. Phil. Trans. R. Soc. Lieberman, D. E. (1998). Sphenoid shortening and the Lond. B: Biol. Sci., 356, 1423–1434. evolution of modern cranial shape. Nature, 393, 158– Hopper, P.J., & Traugott, E. C. (1993). Grammaticalization. 162. Cambridge, UK: Cambridge University Press. Lieberman, D. E., McBratney, B. M., & Krovitz, G. Horne Tooke, J. (1857). Epea Pteroenta or the Diversions of (2002). The evolution and development of cranial Purley. London. form in Homo sapiens. Proc. Nat. Acad. Sci. USA, 99, Jacobs, Z., Roberts, R. G., Galbraith, R. F., et al. (2008). 1134–1139. Ages for the Middle Stone Age of Southern Africa: Lieberman, P.(1998). Eve Spoke: Human Language and Human Implications for human behavior and dispersal. Sci- Evolution.NewYork:W.W.Norton. ence, 322, 733–735. Lieberman, P. (2007). The evolution of human speech. Janis, C. (1993). Victors by default: The mammalian suc- Curr. Anthro., 48, 39–46. cession. In S. J. Gould (Ed.), The Book of Life (pp. Lieberman, P., Crelin, E. S., & Klatt, D. H. (1972). 169–217). New York: W.W. Norton. Phonetic ability and related anatomy of the new- Jarvis, E. D. (2006). Selection for and against vocal learn- born, adult human, Neanderthal man, and the chim- ing in birds and mammals. Ornithol. Sci., 5, 5–14. panzee. Amer. Anthropol., 74, 287–307. Karlsson, F. (2007). Constraints on multiple center- Liegeois,´ F., Baldeweg, T., Connelly, A., et al. (2003). embedding of clauses. Journal of Linguistics, 43, 365– Language fMRI abnormalities associated with 392. FOXP2 gene mutation. Nat. Neurosci., 6, 1230– Kay, R. F., Cartmill, M., & Barlow, M. (1998). The hy- 1237. poglossal canal and the origin of human vocal be- Lin, J-W. (2005). Time in a language without tense: The havior. Proc. Nat. Acad. Sci. USA, 95, 5417–5419. case of Chinese. J. Semantics, 23, 1–53. Kendon, A. (1988). Sign Languages of Aboriginal Australia. Lindenberg, R., Fangerau, H., & Seitz, R. J. (2007). Melbourne: Cambridge University Press. “Broca’s area” as a collective term? Brain & Lan- Kingsley, M. (1965). Travels in West Africa, Congo Fran¸caise, guage, 102, 22–29. 42 Annals of the New York Academy of Sciences

Locke, J. L., & Bogin, B. (2006). Language and life history: of language: Hearing babies acquiring sign lan- A new perspective on the development and evolution guage babble silently on the hands. Cognition, 93, of human language. Behavioral & Brain Sciences, 29, 43–73. 259–325. Pettito, L. A., & Marentette, P. (1991). Babbling in the MacLarnon, A., & Hewitt, G. (2004). Increased breathing manual mode: Evidence for the ontogeny of lan- control: Another factor in the evolution of human guage. Science, 251, 1493–1496. language. Evol. Anthro., 13, 181–197. Pietrandrea, P. (2002). Iconicity and arbitrariness in Ital- Marler, P. (1991). The instinct to learn. In S. Carey & B. ian Sign Language. Sign Language Studies, 2, 296– Gelman (Eds.), The Epigenesis of Mind: Essays on Biology 321. and Cognition. Hillsdale, NJ: Erlbaum. Pika, S., Liebal, K., & Tomasello, M. (2003). Gestural McGurk, H., & MacDonald, J. (1976). Hearing lips and communication in young gorillas ( gorilla): Ges- seeing voices. Nature, 264, 746–748. turalrepertoire,anduse.Amer. J. Primat., 60, 95– McNeill, D. (1985). So you think gestures are nonverbal? 111. Psychol. Rev., 92, 350–371. Pika, S., Liebal, K., & Tomasello, M. (2005). Gestural McNeill, D. (1992). Hand and Mind: What Gestures Reveal communication in subadult bonobos (Pan paniscus): About Thought. Chicago, IL: University of Chicago Repertoire and use. Amer. J. Primat., 65, 39–61. Press, Pinker, S. (1994). TheLanguageInstinct.NewYork:Morrow. Mellars, P. (2006). Going east: New genetic and archaeo- Pinker, S. (2003). Language as an adaptation to the cogni- logical perspectives on the modern human coloniza- tive niche. In M. H. Christiansen, & S. Kirby (Eds.), tion of Eurasia. Science, 313, 796–800. Language evolution (pp. 16–37). Oxford: Oxford Uni- Mellars, P. A., & Stringer, C. B. (Eds.). (1989). The Hu- versity Press. man Revolution: Behavioural and Biological Perspectives on Pinker, S. (2007). The Stuff of Thought. London: Penguin the Origins of Modern Humans. Edinburgh: Edinburgh Books. University Press. Pinker, S., & Bloom, P. (1990). Natural language and Mithun, M. (1999). The Languages of Native North America. natural selection. Behavioral & Brain Sciences, 13, 707– Cambridge: Cambridge University Press. 784. Muir, L. J., & Richardson, J. E. G. (2005). Perception of Pizutto, E., & Volterra, V. (2000). Iconicity and trans- sign language and its application to visual commu- parency in sign languages: A cross-linguistic cross- nications for deaf people. J. Deaf Studies & Deaf Edu., cultural view. In K. Emmorey & H. Lane (Eds.), The 10, 390–401. Signs of Language Revisited: An Anthology to Honor Ursula Neidle, C., Kegl, J., MacLaughlin, D., et al. (2000). The Bellugi and Edward Klima (pp. 261–286). Mahwah, NJ: Syntax of American Sign Language. Cambridge, MA: The Lawrence Erlbaum Associates. MIT Press. Place, U. T. (2000). The role of the hand in the evolution Nevins, A., Pesetsky, D., & Rodrigues, C. (2007). Pi- of language. Psycoloquy, 11(7). raha˜ Exceptionality: A Reassessment. Available online at Ploog, D. (2002). Is the neural basis of vocalisation dif- http://ling.auf.net/lingBuzz/000411 ferent in non-human primates and Homo sapiens? Nietzsche, F. W. (1986). Human, All Too Human: A Book In T. J. Crow (Ed.), The Speciation of Modern Homo for Free Spirits (Trans. R. J. Hollingdale). Cambridge: sapiens (pp. 121–135). Oxford, UK: Oxford Univer- Cambridge University Press. (Work originally pub- sity Press. lished 1878). Pollick, A. S., & de Waal, F. B. M. (2007). Apes gestures Noonan, J. P.,Coop, G., Alessi, J., et al. (2006). Sequencing and language evolution. Proc.Nat.Acad.Sci.USA, 104, and analysis of Neanderthal genomic DNA. Science, 8184–8189. 314, 1113–1118. Ray, N., Currat, M., Berthier, P., & Excoffier, L. (2005). Parsons, L. M., Fox, P. T., Downs, J. H., et al. (1995). Use Recovering the geographic origin of early modern of implicit motor imagery for visual shape discrimi- humans by realistic and spatially explicit simulations. nation as revealed by PET. Nature, 375, 54–58. Genome Res., 15, 1161–1167. Perrett, D. I., Harries, M. H., Bevan, R., et al. (1989). Rizzolatti, G., & Arbib, M. A. (1998). Language within Frameworks of analysis for the neural representation our grasp. Trends Cogn. Sci., 21, 188–194. of animate objects and actions. J. Exp. Biol., 146, Rizzolatti, G., Camardi, R., Fogassi, L., et al. (1988). 87–113. Functional organization of inferior area 6 in the Petrides, M., Cadoret, G., & Mackey, S. (2005). Orofa- macaque monkey. II. Area F5 and the control of cial somatomotor responses in the macaque monkey distal movements. Exp. Brain Res., 71, 491–507. homologue of Broca’s area. Nature, 435, 1325–1328. Rizzolatti, G., Fadiga, L., Gallese, V., et al. (1996). Pre- Petitto,L.A.,Holowka,S.,Sergio,L.E.,etal. motor cortex and the recognition of motor actions. (2004). Baby hands that move to the rhythm Cogn. Brain Res., 3, 131–141. Corballis: The Evolution of Language 43

Rizzolatti, G., Fogassi, L., & Gallese V.(2001). Neurophys- Sutton-Spence, R., & Boyes-Braem, P. (Eds.). (2001). The iological mechanisms underlying the understanding Hands Are the Head of the Mouth: The Mouth as Articu- and imitation of action. Nat. Rev., 2, 661–670. lator in Sign Language. Hamburg, Germany: Signum- Rizzolatti, G., & Sinigaglia, C. (2008). Mirrors in the Brain. Verlag. Oxford:OxfordUniversityPress. Tattersall, I. (2003). Once we were not alone. Scientific Rousseau, J.-J. (1969). Essai sure l’origine des langues.Paris: American, 13(2), 20–27. A. G. Nizet. (Work originally published 1782). Thorne, A., Grun,¨ R., Mortimer, G., et al. (1999). Aus- Ruben, R. J. (2005). Sign language: Its history and contri- tralia’s oldest human remains: Age of the Lake bution to the understanding of the biological nature Mungo human skeleton. J. Hum. Evol., 36, 591– of language. Acta Oto-Laryngolica, 125, 464–467. 612. Ruff, C. B., Trinkaus, E., & Holliday, T. W. (1997). Body Tomasello, M. (2003). Introduction: Some surprises for mass and encephalization in Pleistocene Homo. Sci- psychologists. In M. Tomasello (Ed.), New Psychol- ence, 387, 173–176. ogy of Language: Cognitive and Functional Approaches to Russell, B. A., Cerny, F. J., & Stathopoulos, E. T. (1998). Language structure (pp. 1–14). Mahwah, NJ: Lawrence Effects of varied vocal intensity on ventilation and Erlbaum. energy expenditure in women and men. J. Speech Tomasello, M. (2008). The Origins of Human Communication. Lang. Hear. Res., 41, 239–248. Cambridge, MA: MIT Press. Saussure, F.de. (1916). Cours de linguistique g´en´erale, C. Bally Tooby, J., & DeVore, I. (1987). The reconstruction of and A. Sechehaye (Ed.), with the collaboration of A. hominid evolution through strategic modeling. In Riedlinger. Lausanne and Paris: Payot; Translated as W. G. Kinzey (Ed.), The Evolution of Human Behavior: W.Baskin (1977), Course in General Linguistics. Glasgow: Primate Models (pp. 183–237). Albany, NY: SUNY Fontana/Collins. Press. Savage-Rumbaugh, S., Shanker, S. G., & Taylor, T. J. Tulving, E. (1972). Episodic and semantic memory. In (1998). Apes, Language, and the Human Mind.NewYork: E. Tulving & W. Donaldson (Eds.), Organization of Oxford University Press. Memory (pp. 381–403). New York: Academic Press. Semaw, S., Renne, P., Harris, J. W. K., et al. (1997). 2.5- Tulving, E. (1985). Memory and consciousness. Canadian million-year-old stone tools from Gona, Ethiopia. Psychologist, 26, 1–12. Nature, 385, 333–336. Tulving, E. (2002) Episodic memory: From mind to brain. Senghas, A., Kita, S., & Ozy¯ urek,¨ A. (2004). Children Ann. Rev. Psych., 53, 1–25. creating core properties of language: Evidence from Tulving, E., Schacter, D. L., McLachlan, D. R., et al. an emerging sign language in Nicaragua. Science, 305, (1988). Priming of semantic autobiographical knowl- 1780–1782. edge: A case study of retrograde amnesia. Brain & Skinner, B. F.(1957). Verbal Behavior. New York: Appleton- Cognition, 8, 3–20. Century-Crofts. Vico, G. B. (1953). La Scienza Nova.Bari:Laterza(Work Skoyles, J. R. (2000). Gesture, language origins, and right originally published 1744). handedness. Psycoloquy, 11(24). Vleck, E. (1970). Etude comparative onto-phylogen´ etique´ Studdert-Kennedy, M. (2005). How did language go dis- de l’enfant du Pech-de-L’Aze´ par rapport a` d’autres crete? In M. Tallerman (Ed.), Language Origins: Per- enfants neanderthaliens.´ In D. Feremback (Ed.), spectives on Evolution (pp. 48–67). Oxford, UK: Oxford L’enfant Pech-de-L’Az´e (pp. 149–186). Paris: Masson. University Press. Watkins, K. E., Dronkers, N. F., & Vargha-Khadem F. Suc, J.-P.,Bertini, A., Leroy, S. A. G., et al. (1997). Towards (2002). Behavioural analysis of an inherited speech the lowering of the Pliocene/Pleistocene boundary and language disorder: Comparison with acquired to the Gauss-Matuyama reversal. Quarter. Inter., 40, aphasia. Brain, 125, 452–464. 37–42. Wood, B., & Collard, M. (1999). The human genus. Sci- Suddendorf, T., & Corballis, M. C. (1997). Mental time ence, 284, 65–71. travel and the evolution of the human mind. Gen. Soc. Wray, A. (Ed.) (2002). The Transition to Language. Oxford: Gen. Psych. Mono., 123, 133–167. Oxford University Press. Suddendorf, T., & Corballis, M. C. (2007). The evolution Wundt, W. (1900). Die Sprache (2 Vols.). Leipzig: of foresight: What is mental time travel, and is it Enghelman. unique to humans? Behavioral & Brain Sciences, 30, Zipf, G. K. (1949). Human Behavior and the Principle of Least- 299–351. Effort. New York: Addison-Wesley.