The Evolution of Human Vocal Emotion
Total Page:16
File Type:pdf, Size:1020Kb
EMR0010.1177/1754073920930791Emotion ReviewBryant 930791research-article2020 SPECIAL SECTION: EMOTION IN THE VOICE Emotion Review Vol. 13, No. 1 (January 2021) 25 –33 © The Author(s) 2020 ISSN 1754-0739 DOI:https://doi.org/10.1177/1754073920930791 10.1177/1754073920930791 The Evolution of Human Vocal Emotion https://journals.sagepub.com/home/emr Gregory A. Bryant Department of Communication, Center for Behavior, Evolution, and Culture, University of California, Los Angeles, USA Abstract Vocal affect is a subcomponent of emotion programs that coordinate a variety of physiological and psychological systems. Emotional vocalizations comprise a suite of vocal behaviors shaped by evolution to solve adaptive social communication problems. The acoustic forms of vocal emotions are often explicable with reference to the communicative functions they serve. An adaptationist approach to vocal emotions requires that we distinguish between evolved signals and byproduct cues, and understand vocal affect as a collection of multiple strategic communicative systems subject to the evolutionary dynamics described by signaling theory. We should expect variability across disparate societies in vocal emotion according to culturally evolved pragmatic rules, and universals in vocal production and perception to the extent that form–function relationships are present. Keywords emotion, evolution, signaling, vocal affect Emotional communication is central to social life for many ani- 2001; Pisanski, Cartei, McGettigan, Raine, & Reby, 2016; mals. Beginning with Darwin (1872), comparative analyses Sauter, Eisner, Ekman, & Scott, 2010; Scherer, 1984; Shariff & have documented clear similarities in the structures of affective Tracy, 2011). Here, I approach human vocal emotion from an expressions across species, including facial and vocal emotions. adaptationist perspective, and point to some current issues that More recently, researchers in neuroscience and related fields researchers might consider in their examinations of this ubiqui- have described the phylogenetically conserved mechanisms tous communicative behavior. implementing emotional vocal signaling across mammals, including humans. But relatively less is known about the spe- Emotional Expressions cific functional roles that vocal emotions play in human and nonhuman animal social life. By vocal emotions, I refer to the Emotions can be construed as superordinate programs that coor- modulation of acoustic properties of vocalizations associated dinate physiological and psychological components to guide with affective communication, including emotional nonlinguis- behavior in functionally organized ways (Al-Shawaf, Conroy- tic utterances (e.g., crying and laughter) and affective prosody Beam, Asao, & Buss, 2016; Cosmides & Tooby, 2000). Affective (i.e., pitch, loudness, speech rate, and voice quality), both of expression is one important subcomponent often included in the which interact with verbal content in speech. Vocal signals are activation of an emotional program, and affective vocalizations adaptations, meaning that they are evolved solutions to recur- represent the most phylogenetically ancient channel of emo- rent adaptive problems of communication, shaped by natural tional communication (Bass, Gilland, & Baker, 2008). In the selection. Historically, scholars of emotional expression have study of humans, a long research tradition has explored many largely ignored questions of adaptation, but increasingly, systematic connections between acoustic structure in vocal researchers examining human vocal emotion from a biological behavior and broad emotional categories (e.g., Banse & Scherer, perspective have started focusing more on communicative func- 1996; Juslin & Laukka, 2003; Murray & Arnott, 1993; Russell, tion (e.g., Altenmüller, Schmidt, & Zimmermann, 2013; Bachorowski, & Fernández-Dols, 2003). In the last decade or Bachorowski & Owren, 2003; Briefer, 2012; Bryant & Barrett, so, there has been a fair amount of research demonstrating that 2007, 2008; Cosmides, 1983; Filippi et al., 2017; Keltner & listeners across distant cultures can recognize several emotion Gross, 1999; Laukka & Elfenbein, 2012; Owren & Rendall, categories with some success (Bryant & Barrett, 2008; Cordaro, Corresponding author: Gregory A. Bryant, Department of Communication, Center for Behavior, Evolution, and Culture, University of California, Los Angeles, 2225 Rolfe Hall, Los Angeles, CA 90095-1563, USA. Email: [email protected] 26 Emotion Review Vol. 13 No. 1 Keltner, Tshering, Wangchuk, & Flynn, 2016; Cordaro et al., Signals and Cues 2018; Gendron, Roberson, van der Vyver, & Barrett, 2014; Pell, Monetta, Paulmann, & Kotz, 2009; Sauter et al., 2010; Scherer, The origins of emotional vocal signals can only be understood Banse, & Wallbott, 2001; Thompson & Balkwill, 2006). Even in the communicative context of senders and receivers. Animal moderate consistency across widely varying human societies signaling of any kind involves the strategic production of suggests a systematic mapping between vocal acoustics and behavioral acts or structures that affect receivers by design, affective states (Elfenbein & Ambady, 2002). and these signals work effectively because of coevolved This should come as little surprise given the biological basis responses by receivers, generally for the mutual benefit of of mechanisms underlying vocal behavior. Most generally, both parties (Maynard Smith & Harper, 2003). But not all voice production in humans involves the independent contribu- information transmitted between organisms constitutes signal- tions of vocal source dynamics and supralaryngeal filtering, ing, leaving important empirical questions for researchers described by source–filter theory (Fant, 1960; Titze, 1994). The examining any communicative interactions (Scott-Phillips, voice source involves controlled air flow from the lungs through 2008). Thus, it is important to distinguish between adaptive the glottis (housed in the larynx), which is converted to sound signals, byproduct cues, and deceptive coercion (i.e., costly through the oscillating action of the vocal folds. Air flow causes influence on a receiver). Applying this evolutionary frame- vibration regimes in the vocal folds that result in a generally work to vocal affect in humans and nonhuman animals is a tonal, harmonically rich sound. The voiced sound then travels fundamental theoretical and empirical problem for researchers through the supraglottal vocal tract (the top of the larynx to the studying communicative behavior. end of the mouth and nasal cavities) and is subject to a filtering process. Resonating frequencies resulting from this supralaryn- Form and Function geal filter are called formants, which in different configurations form the basis of vowel sounds. These production components The physical structure of any signal is inherently shaped by its result in many measurable sound parameters that are linked to communicative function. For vocal emotion signals specifically, systematic percepts in human listeners, including emotional this principle has been useful for understanding nonhuman ani- information (Kreiman & Sidtis, 2011; Taylor, Charlton, & Reby, mal vocal communication (Morton, 1977; Owren & Rendall, 2016). Brain mechanisms responsible for vocal emotional pro- 2001), as well as human vocalizations (e.g., Bryant, 2013; duction are evolutionarily conserved, meaning that selection Bryant & Barrett, 2007; Cosmides, 1983; Fernald, 1992). The has maintained functional organization, and we can identify acoustic effects of signals on receivers can be direct, as in the homologies across current species indicating their likely pres- function of interrupting behavior through the use of a loud pro- ence in a shared ancestor (Ackermann, Hage, & Ziegler, 2014; hibitive utterance; or indirect, such as the pairing of a threat Jurgens, 2002; Owren, Amoss, & Rendall, 2011). Consequently, display with an actual attack, which subsequently in future we see the same kinds of perceptible acoustic features in emo- interactions requires only the display to be effective (Owren & tional vocalizations across quite different species (e.g., Briefer, Rendall, 2001). Additionally, this perspective allows us to 2012; Filippi et al., 2017). understand evolutionary convergences in why signals are some- These facts about the nature of vocal production suggest that times structured similarly due to overlaps in the adaptive prob- many aspects of vocal communication are likely to manifest lems they solve. For example, a fear scream produced by a themselves universally across humans, and to some extent scared individual might share perceptible acoustic features with across mammalian species. But humans are particularly cultural an infant wailing (e.g., Sobin & Alpert, 1999). They both tend to beings, and consequently, many behaviors that are rooted in bio- be relatively high in average pitch, pitch variability, and loud- logical systems can appear variably as a function of cultural ness. But the signals have these particular features for some- evolutionary processes. Moreover, people exhibit an extraordi- what different evolutionary reasons. Fear screams evolved from nary ability to nonverbally communicate subtle affective dis- alarm signals: loud, abrupt, and acoustically chaotic displays tinctions in their voices and faces. For example, recent work penetrate noisy environments and are difficult for animals to found that vocalizers can convey up to 24 different distinct habituate to—ideal