Brain Mechanisms of Acoustic Communication in Humans and Nonhuman Primates: an Evolutionary Perspective
Total Page:16
File Type:pdf, Size:1020Kb
BEHAVIORAL AND BRAIN SCIENCES (2014) 37, 529–604 doi:10.1017/S0140525X13003099 Brain mechanisms of acoustic communication in humans and nonhuman primates: An evolutionary perspective Hermann Ackermann Neurophonetics Group, Centre for Neurology – General Neurology, Hertie Institute for Clinical Brain Research, University of Tuebingen, D-72076 Tuebingen, Germany [email protected] www.hih-tuebingen.de/neurophonetik Steffen R. Hage Neurobiology of Vocal Communication Research Group, Werner Reichardt Centre for Integrative Neuroscience, and Institute for Neurobiology, Department of Biology, University of Tuebingen, D-72076 Tuebingen, Germany [email protected] www.vocalcommunication.de Wolfram Ziegler Clinical Neuropsychology Research Group, City Hospital Munich- Bogenhausen, D-80992 Munich, and Institute of Phonetics and Speech Processing, Ludwig-Maximilians-University, D-80799 Munich, Germany. [email protected] www.ekn.mwn.de Abstract: Any account of “what is special about the human brain” (Passingham 2008) must specify the neural basis of our unique ability to produce speech and delineate how these remarkable motor capabilities could have emerged in our hominin ancestors. Clinical data suggest that the basal ganglia provide a platform for the integration of primate-general mechanisms of acoustic communication with the faculty of articulate speech in humans. Furthermore, neurobiological and paleoanthropological data point at a two-stage model of the phylogenetic evolution of this crucial prerequisite of spoken language: (i) monosynaptic refinement of the projections of motor cortex to the brainstem nuclei that steer laryngeal muscles, presumably, as part of a “phylogenetic trend” associated with increasing brain size during hominin evolution; (ii) subsequent vocal-laryngeal elaboration of cortico-basal ganglia circuitries, driven by human-specific FOXP2 mutations. This concept implies vocal continuity of spoken language evolution at the motor level, elucidating the deep entrenchment of articulate speech into a “nonverbal matrix” (Ingold 1994), which is not accounted for by gestural-origin theories. Moreover, it provides a solution to the question for the adaptive value of the “first word” (Bickerton 2009) since even the earliest and most simple verbal utterances must have increased the versatility of vocal displays afforded by the preceding elaboration of monosynaptic corticobulbar tracts, giving rise to enhanced social cooperation and prestige. At the ontogenetic level, the proposed model assumes age-dependent interactions between the basal ganglia and their cortical targets, similar to vocal learning in some songbirds. In this view, the emergence of articulate speech builds on the “renaissance” of an ancient organizational principle and, hence, may represent an example of “evolutionary tinkering” (Jacob 1977). Keywords: articulate speech; basal ganglia; FOXP2; human evolution; speech acquisition; spoken language; striatum; vocal behavior; vocal learning 1. Introduction: Species-unique (verbal) and troglodytes) and bonobos (Pan paniscus) (Hillix 2007; primate-general (nonverbal) aspects of human Wallman 1992), despite the fact that these species have vocal behavior “notoriously mobile lips and tongues, surely transcending ” 1.1. Nonhuman primates: Speechlessness in the face the human condition (Tuttle 2007, p. 21). As an example, the cross-fostered chimpanzee infant Viki mas- of extensive vocal repertoires and elaborate oral-motor “ ” capabilities tered less than a handful of words even after extensive training. These utterances were not organized as speech- All attempts to teach great apes spoken language have like vocal tract activities, but rather as orofacial manoeuvres failed – even in our closest cousins, the chimpanzees (Pan imposed on a (voiceless) expiratory air stream (Hayes 1951, © Cambridge University Press 2014 0140-525X/14 $40.00 Downloaded from https://www.cambridge.org/core. Graduate Institute, on 30 Jan 2019 at 16:12:11, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms529 . https://doi.org/10.1017/S0140525X13003099 Ackermann et al.: Brain mechanisms of acoustic communication in humans and nonhuman primates p. 67; see Cohen 2010). By contrast, Viki was able to skill- point of language evolution in our species (e.g., Corballis fully imitate manual and even orofacial movement 2002, p. ix; 2003). sequences of her caretakers (Hayes & Hayes 1952) and Tracing back to the 1960s, vocal tract morphology has learned, for example, to blow a whistle (Hayes 1951, been assumed to preclude production of “the full range pp. 77, 89). of human speech sounds” (Lieberman 2006a; 2006b, Nonhuman primates are, nevertheless, equipped with p. 289) and, thereby, to constrain imitation of spoken lan- rich vocal repertoires, related specifically to ongoing guage in nonhuman primates (Lieberman 1968; Lieber- intra-group activities or environmental events (Cheney & man et al. 1969). However, this model cannot account for Seyfarth 1990; 2007). Yet, their calls seem to be linked to the inability of nonhuman primates to produce even the different levels of arousal associated with especially most simple verbal utterances. The complete lack of urgent functions, such as escaping predators, surviving in verbal acoustic communication rather suggests more fights, keeping contact with the group, and searching for crucial cerebral limitations of vocal tract motor control food resources or mating opportunities (Call & Tomasello (Boë et al. 2002; Clegg 2012; Fitch 2000a; 2000b). Accord- 2007; Manser et al. 2002; Seyfarth & Cheney 2003b; Tom- ing to a more recent hypothesis, lip smacking – a rhythmic asello 2008). Several studies point, indeed, at a more elab- facial expression frequently observed in monkeys – might orate “cognitive load” to the vocalizations of monkeys and constitute a precursor of the dynamic organization of apes in terms of subtle audience effects (Wich & de Vries speech syllables (Ghazanfar et al. 2012; MacNeilage 2006), conceptual-semantic information (Zuberbühler 1998). As an important evolutionary step, a phonation 2000a; Zuberbühler et al. 1999), proto-syntactical call con- channel must have been added in order to render lip catenations (Arnold & Zuberbühler 2006; Ouattara et al. smacking an audible behavioral pattern (Ghazanfar et al. 2009), conditionability (Aitken & Wilson 1979; Hage 2013). Hence, this theory calls for a neurophysiological et al. 2013; Sutton et al. 1973; West & Larson 1995), and model of how articulator movements were refined and, the capacity to use distinct calls interchangeably under finally, integrated with equally refined laryngeal move- different conditions (Hage et al. 2013). It remains, ments to create the complex motor skill underlying the pro- however, to be determined whether such communicative duction of speech. skills really represent precursors of higher-order cognitive–linguistic operations. In any case, the motor 1.2. Dual-pathway models of acoustic communication mechanisms of articulate speech appear to lack significant and the enigma of emotive speech prosody vocal antecedents within the primate lineage. This limita- tion of the faculty of acoustic communication is “particular- The calls of nonhuman primates are mediated by a complex ly puzzling because [nonhuman primates] appear to have so network of brainstem components, encompassing a mid- many concepts that could, in principle, be articulated” brain “trigger structure,” located in the periaqueductal (Cheney & Seyfarth 2005, p. 142). As a consequence, the gray (PAG) and adjacent tegmentum, and a pontine vocal manual and facial gestures rather than the vocal calls of pattern generator (Gruber-Dujardin 2010; Hage 2010a; our primate ancestors have been considered the vantage 2010b). In addition to various subcortical limbic areas, the medial wall of the frontal lobes, namely, the cingulate vocalization region and adjacent neocortical areas, also pro- jects to the PAG. This region, presumably, controls higher- order motor aspects of vocalization such as operant call HERMANN ACKERMANN is Professor of Neurological conditioning (e.g., Trachy et al. 1981). By contrast, the Rehabilitation at the Centre for Neurology, Hertie In- stitute for Clinical Brain Research, University of Tue- acoustic implementation of the sound structure of spoken bingen. His research focuses on the cerebral basis of language is bound to a cerebral circuit including the ventro- speech production and speech perception, and he is lateral/insular aspects of the language-dominant frontal the author or coauthor of more than 120 publications lobe and the primary sensorimotor cortex, the basal within the domains of neuropsychology, neurolinguis- ganglia, and cerebellar structures in either hemisphere tics, and neurophonetics. (Ackermann & Riecker 2010a; Ackermann & Ziegler 2010; Ackermann et al. 2010). Given the virtually complete STEFFEN R. HAGE is Head of the Neurobiology of speechlessness of nonhuman primates, the behavioral ana- Vocal Communication Research Group at the Werner logues of acoustic mammalian communication might not be Reichardt Centre for Integrative Neuroscience, Univer- sought within the domain of spoken language, but rather in sity of Tuebingen. He is the author of more than 20 pub- lications within the area of neuroscience, especially the nonverbal affective vocalizations of our species such as laughing, crying, or moaning (Owren et al. 2011). Against neurophysiology