The role of familiarity on within-person age judgements from voices

Abstract

Listeners can perceive a person’s age from their voice with above-chance accuracy. Studies have usually established this by asking listeners to directly estimate the age of unfamiliar voices. The recordings used mostly include cross-sectional samples of voices, including people of different ages to cover the age range of interest. Such cross-sectional samples likely not only include cues to age in the sound of the voice but also socio-phonetic cues, encoded in how a person speaks. How accuracy is affected when minimising socio-phonetic cues by sampling the same voice at different timepoints remains largely unknown. Similarly, with the voices in age perception studies being usually unfamiliar to listeners, it is unclear how familiarity with a voice affects age perception. We asked listeners who were either familiar or unfamiliar with a set of four voices to complete an age discrimination task: Listeners heard two recordings of the same person’s voice, recorded 15 years apart, and were asked to indicate in which recording the person was younger. Accuracy for both familiar and unfamiliar listeners was above chance. While familiarity advantages were apparent, accuracy was not particularly high: Familiar and unfamiliar listeners were correct for 68.2% and 62.7% of trials respectively (chance = 50%). Familiarity furthermore interacted with the voices included. Overall, our findings indicate that age perception from voices is not a trivial task at all times, even when listeners are familiar with a voice. We discuss our findings in light of how reliable voice may be as a signal for age.
Keywords: Age perception; voice; familiarity; longitudinal voice sample

Introduction

The sound of our voices changes throughout our life. As a result of these changes, listeners can make perceptual inferences about a person’s age from the voice alone (e.g., Linville, 1996; Moyse, 2014). This ability has most frequently been demonstrated in studies that ask listeners to estimate the exact age of a set of voices, often covering most of the human life span. These studies tend to show significant correlations between the estimated age of a person and their true, chronological age (Braun & Cerrato, 1999; Cerrato et al., 2000; Hughes & Rhodes, 2010; Jiao et al., 2019; Shipp & Hollien, 1969; for a meta-analysis see Hunter et al., 2016). Further studies report that listeners can categorise voices into age ranges with above-chance accuracy (e.g., 20-30 years, 30-40 years, etc., or via labels such as “middle-aged” vs “older”; Braun & Decker, 2014; Linville & Korabic, 1986; Ptacek & Sanders, 1966; Shipp & Hollien, 1969). Another study reported good accuracy for relative age judgements, where listeners heard pairs of voices and were asked to indicate which one was the younger (Pettorino & Giannini, 2011). Overall, accuracy of age estimates is therefore generally well above chance, and/or clear relationships between the estimated and chronological age are apparent. Performance across the different tasks has, however, been influenced by stimulus, listener and speaker characteristics (e.g., Braun & Decker, 2014; Huntley et al., 1969; Linville & Korabic, 1986; Moyse, 2014; Skoog Waller et al., 2015). Notably, although a speaker’s age can be determined with some accuracy, the existing literature suggests that errors in age estimates from voices are on the order of 10 years (see Moyse, 2014, for a review).
These perceptual studies complement findings from studies of acoustic changes in the voice across the lifespan: The most dramatic changes to a person’s voice occur early in life, during childhood and adolescence. Further, potentially more subtle, acoustic changes occur across the adult lifespan (see Hazan, 2017, for a review). For example, the fundamental frequency (F0) of a voice changes throughout adulthood, with the F0 of males increasing in later adulthood and older age, while the F0 of females decreases in later adulthood (Linville, 1998; but see Eichhorn et al., 2018, who do not report changes in F0 for males). In addition to these broad trends across all adults, further factors such as a person’s lifestyle, occupation or their physical and mental health can affect the sound of a person’s voice throughout adulthood. For example, professions that put a strain on a person’s voice may result in a higher prevalence of voice disorders (e.g., teachers, Roy et al., 2004), while smoking can also affect the characteristics of a person’s voice (Gonzalez & Carpi, 2004; Sorensen & Horii, 1982). Listeners can perceive these acoustic changes and make inferences about a person’s age, resulting in the significant relationships between the perception of a speaker’s age and their true age in the studies reviewed above.

The accuracy of age perception from voices has so far almost exclusively been studied using voices drawn from cross-sectional samples, that is, a single recording from each of a large number of speakers of different ages (but see Hunter & Ferguson, 2017). Such cross-sectional samples are on the surface ideal for age perception research, because they enable experimenters to tightly control the stimulus properties (e.g., the content of the stimulus recordings, the recording quality).
When taking a cross-sectional sample, the sampled voices will, however, likely not only vary in how old the voice itself sounds but, depending on the stimuli used, also include socio-phonetic markers that can vary cross-sectionally and can be diagnostic of a person’s age. These markers can take many forms, from speaking in a globally different style to systematic changes in the use and/or prevalence of certain speech sounds. If listeners are able to perceive such tell-tale socio-phonetic cues, age perception judgements become influenced by features that are relatively independent of the sound of the voice itself. One example of the influence of such socio-phonetic cues on age perception is described by Ruch (2018): Listeners perceive a voice recording that includes a phonetic feature primarily used by younger people (aspirated /s/ in Andalusian Spanish) as sounding younger than the very same recording from which this feature had been removed. To minimise the effects of such socio-phonetic cues and to focus on age perception based on the sound of the voice, longitudinal voice samples, in which the same person’s voice has been sampled at different time points, are better suited. To our knowledge, only one study has addressed this issue so far, using longitudinal recordings of a single speaker across 50 years (Hunter & Ferguson, 2017). This study reports that listeners were indeed able to perceive the age of this speaker. The current study set out to build on this initial evidence by using more voices and controlling for effects of recording quality across longitudinal voice samples.

Another question that remains largely unaddressed in the age perception literature is whether familiarity with a voice can aid accurate age perception. Advantages due to being familiar with a voice have been widely reported for a number of perceptual judgements in the voice and speech processing literature.
For example, voice identity perception is less error-prone when listeners are familiar with a voice (voice discrimination: Lavan et al., 2016; Lavan, Kreitewolf, et al., 2020; voice identity sorting: Lavan, Burston & Garrido, 2019; Lavan, Burston, Merriman, et al., 2019; Stevenage et al., 2020). Interestingly, judgements of a person’s social traits (e.g., their perceived trustworthiness or dominance) are not systematically affected by voice familiarity (Lavan et al., 2020). For speech intelligibility, studies also report that listeners are better able to understand what a familiar other is saying in challenging listening situations (e.g., speech perception in noise) than when listening to an unfamiliar voice (Domingo et al., 2019; Holmes et al., 2018; Holmes & Johnsrude, 2019; Johnsrude et al., 2013; Kreitewolf et al., 2017; Nygaard et al., 1994). For age perception, one study also reports some evidence for a familiarity advantage, based on a longitudinal voice sample (Hunter & Ferguson, 2017).

In the current study, we set out to assess to what degree age perception from the voice is possible for within-voice comparisons and to examine whether familiarity with a voice leads to more accurate age perception. For this purpose, we sampled four voices from the TV documentary series Time Team, which was recorded and aired over the course of 20 years. Voice clips included full, semi-spontaneously uttered sentences that were largely uncontrolled in content (without giving away the age of the speakers) and in recording quality, and were thus ecologically valid stimulus materials. Listeners who were either familiar or unfamiliar with Time Team completed an age perception task. Familiarity was thus manipulated via listeners’ knowledge of these four famous voices, depending on whether or not they had watched Time Team.
Listeners completed an age discrimination task, in which they were asked to judge which of two voice clips of a person was recorded when the person was 15 years younger than in the other clip. Based on the previous literature, we predicted that both familiar and unfamiliar listeners would be able to judge vocal age with above-chance accuracy, but that familiar listeners would be more accurate overall.

Methods

Participants

All participants were aged between 18 and 65 years, were primarily native speakers of English (the exceptions being one native speaker of Italian and one native speaker of Norwegian), and had no major self-reported hearing difficulties.