Speech Categorization Is Better Described by Induced Rather Than Evoked Neural Activity

Total Page:16

File Type:pdf, Size:1020Kb

Speech Categorization Is Better Described by Induced Rather Than Evoked Neural Activity bioRxiv preprint doi: https://doi.org/10.1101/2020.10.20.347526; this version posted October 21, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Title: Speech categorization is better described by induced rather than evoked neural activity Md Sultan Mahmud1,2,a, Mohammed Yeasin1,2, and Gavin M. Bidelman2,3,4 1 Department of Electrical and Computer Engineering, University of Memphis, Memphis, TN 38152, USA 2Institute for Intelligent Systems, University of Memphis, Memphis, TN 38152, USA 3School of Communication Sciences & Disorders, University of Memphis, Memphis, TN 38152, USA 4University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN 38103, USA a Email: [email protected] , ORCID: 0000-0002-7361-122X 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.20.347526; this version posted October 21, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. ABSTRACT Categorical perception (CP) describes how the human brain categorizes speech despite inherent acoustic variability. We examined neural correlates of CP in both evoked and induced EEG activity to evaluate which mode best describes the process of speech categorization. Using source reconstructed EEG, we used band-specific evoked and induced neural activity to build parameter optimized support vector machine (SVMs) model to assess how well listeners’ speech categorization could be decoded via whole-brain and hemisphere-specific responses. We found whole-brain evoked β-band activity decoded prototypical from ambiguous speech sounds with ~70% accuracy. However, induced γ-band oscillations showed better decoding of speech categories with ~95% accuracy compared to evoked β-band activity (~70% accuracy). Induced high frequency (γ-band) oscillations dominated CP decoding in the left hemisphere, whereas lower frequency (θ-band) dominated decoding in the right hemisphere. Moreover, feature selection identified 14 brain regions carrying induced activity and 22 regions of evoked activity that were most salient in describing category-level speech representations. Among the areas and neural regimes explored, we found that induced γ-band modulations were most strongly associated with listeners’ behavioral CP. Our data suggest that the category-level organization of speech is dominated by relatively high frequency induced brain rhythms. Keywords: Categorical perception; time-frequency analysis; induced oscillations; gamma-band activity, machine learning; support vector machine (SVM). 2 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.20.347526; this version posted October 21, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 1 I. INTRODUCTION 2 The human brain classifies diverse acoustic information into smaller, more meaningful 3 groupings (Bidelman and Walker, 2017), a process known as categorical perception (CP). CP 4 plays a critical role in auditory perception, speech acquisition, and language processing. Brain 5 imaging studies have shown that neural responses elicited by prototypical speech sounds (i.e., 6 those heard with a strong phonetic category) differentially engage Heschl’s (HG) and inferior 7 frontal (IFG) gyrus compared to ambiguous speech (Bidelman et al., 2013; Bidelman and Lee, 8 2015; Bidelman and Walker, 2017). Previous studies also demonstrate that the N1 and P2 waves 9 of the event-related potentials (ERPs ) are highly sensitive to speech perception and correlate 10 with CP (Alain, 2007; Bidelman et al., 2013; Mankel et al., 2020). These studies demonstrate 11 that evoked activity in the time domain provides a neural correlate of speech categorization. 12 However, ERP studies do not reveal how induced brain activity (so-called neural oscillations) 13 might contribute to this process. 14 The electroencephalogram (EEG) can be divided into evoked (i.e., phase-locked) and 15 induced (i.e., non-phase locked) responses that vary in a frequency-specific manner (Shahin et 16 al., 2009). Evoked responses are largely related to the stimulus, whereas induced responses are 17 additionally linked to different perceptual and cognitive processes that emerge during task 18 engagement. These later brain oscillations (neural rhythms) play an important role in perceptual 19 and cognitive processes and reflect different aspects of speech perception. For example, low 20 frequency [e.g., θ (4-8 Hz) ] bands are associated with syllable segmentation (Luo and Poeppel, 21 2012) whereas α (9-13 Hz) band has been linked with attention (Klimesch, 2012) and speech 22 intelligibility (Dimitrijevic et al., 2017). Several studies report listeners’ speech categorization 23 efficiency varies in accordance with their underlying induced and evoked neural activity 2 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.20.347526; this version posted October 21, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 24 (Bidelman et al., 2013; Bidelman and Alain, 2015; Bidelman and Lee, 2015). For instance, 25 Bidelman assessed correlations between ongoing neural activity (e.g., induced activity) and the 26 slopes of listeners’ identification functions, reflecting the strength of their CP (Bidelman, 2017). 27 Listeners were slower and varied in their classification of more category-ambiguous speech 28 sounds, which covaried with increases in induced γ activity (Bidelman, 2017). Changes in β (14- 29 30 Hz) power are also strongly associated with listeners’ speech identification skills (Bidelman, 30 2015). The β frequency band is linked with auditory template matching (Shahin et al., 2009) 31 between stimuli and internalized representations kept in memory (Bashivan et al., 2014), whereas 32 the higher γ frequency range (>30 Hz) is associated with auditory object construction (Tallon- 33 Baudry and Bertrand, 1999) and local network synchronization (Giraud and Poeppel, 2012; 34 Haenschel et al., 2000; Si et al., 2017). 35 Studies also demonstrate hemispheric asymmetries in neural oscillations. During syllable 36 processing, there is a dominance of γ frequency activity in LH and θ frequency activity in RH 37 (Giraud et al., 2007; Morillon et al., 2012). Other studies show that during speech perception and 38 production, lower frequency bands (3-6 Hz) better correlate with behavioral reaction times than 39 higher frequencies (20-50 Hz) (Yellamsetty and Bidelman, 2018). Moreover, induced γ-band 40 correlates with speech discrimination and perceptual computations during acoustic encoding (Ou 41 and Law, 2018), further suggesting it reflects a neural representation of speech above and beyond 42 evoked activity alone. 43 Still, given the high dimensionality of EEG data, it remains unclear which frequency 44 bands, brain regions, and “modes” of neural function (i.e., evoked vs. induced signaling) are 45 most conducive to describing the neurobiology of speech categorization. To this end, the recent 46 application of machine learning (ML) to neuroscience data might prove useful in identifying the 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.20.347526; this version posted October 21, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 47 most salient features of brain activity that predict human behaviors. ML is an entirely data-driven 48 approach that “decodes” neural data with minimal assumptions on the nature of exact 49 representation or where those representations emerge. Germane to the current study, ML has 50 been successfully applied to decode the speed of listeners’ speech identification (Al-Fahad et al., 51 2020) and related receptive language brain networks (Mahmud et al., 2020) from multichannel 52 EEGs. 53 Departing from previous hypothesis-driven studies (Bidelman, 2017; Bidelman and 54 Alain, 2015; Bidelman and Walker, 2017), the present work used a comprehensive data-driven 55 approach (i.e., stability selection and SVM classifiers) to investigate the neural mechanisms of 56 speech categorization using whole-brain electrophysiological data. Our goals were to evaluate 57 which neural regime [i.e., evoked (phase-synchronized ERP) vs. induced oscillations], frequency 58 bands, and brain regions are most associated with CP using whole-brain activity via a data-driven 59 approach. Based on prior work, we hypothesized that evoked and induced brain responses would 60 both differentiate the degree to which speech sounds carry category-level information (i.e., 61 prototypical vs. ambiguous sounds from an acoustic-phonetic continuum). However, we 62 predicted induced activity would best distinguish category-level speech representations, 63 suggesting a dominance of endogenous brain rhythms in describing the neural underpinnings of 64 CP. 65 II. MATERIALS & METHODS 66 A. Participants 67 Forty-eight young (male: 15, female: 33; aged 18 to 33 years) participated in the study 68 (Bidelman et al., 2020; Bidelman and Walker, 2017; Mankel et al., 2020). All participants had 69 normal hearing sensitivity (i.e., <25 dB HL between 500-2000 Hz) and no previous history of 4 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.20.347526;
Recommended publications
  • Categorical Perception of Natural and Unnatural Categories: Evidence for Innate Category Boundaries1
    UCLA Working Papers in Linguistics, vol.13, December 2005 Papers in Psycholinguistics 2, Okabe & Nielsen (eds.) CATEGORICAL PERCEPTION OF NATURAL AND UNNATURAL CATEGORIES: EVIDENCE FOR INNATE CATEGORY BOUNDARIES1 JILL GILKERSON [email protected] The results reported here support the claim that naturally occurring phonemic contrasts are easier to acquire than unnatural contrasts. Results are reported from two experiments in which English speakers were exposed to non-native phonemic categories using a bi-modal statistical frequency distribution modeled after Maye and Gerken (2000). Half of the participants heard a distribution in which the category boundary was that of the Jordanian Arabic uvular/pharyngeal contrast, while the other half heard a distribution with an unnatural category boundary. Immediately after exposure, participants completed an A-X delayed comparison task, where they were presented with stimuli that crossed category boundaries. Results indicated that participants in the natural training group responded “different” to across-category pairs significantly more often than participants in the unnatural training group. 1. INTRODUCTION The purpose of this research is to investigate whether adult acquisition of non-native phonemic contrasts is solely dependent on general learning principles or whether it is influenced by principles specific to natural language. Theories of innate grammar claim that humans are endowed with a genetically predetermined system specifically designed to facilitate language acquisition (cf. Chomsky 1965, among others). Nativists maintain that humans are “prewired” with sensitivity to linguistically relevant information in the input. Others have suggested that there are no learning mechanisms specific to language acquisition; it is facilitated by an interaction between statistical distribution of elements in the input and general learning mechanisms not specific to language (cf.
    [Show full text]
  • Categorical Perception of Consonants and Vowels: Evidence from a Neurophonetic Model of Speech Production and Perception
    Categorical Perception of Consonants and Vowels: Evidence from a Neurophonetic Model of Speech Production and Perception Bernd J. Kröger, Peter Birkholz, Jim Kannampuzha, and Christiane Neuschaefer-Rube Department of Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital Aachen and RWTH Aachen University, Aachen, Germany {bkroeger,pbirkholz,jkannampuzha,cneuschaefer}@ukaachen.de Abstract. While the behavioral side of categorical perception in speech is al- ready well investigated, little is known concerning its underlying neural mecha- nisms. In this study, a computer-implemented neurophonetic model of speech production and perception is used in order to elucidate the functional neural mechanisms responsible for categorical perception. 20 instances of the model (“virtual listeners/speakers”) underwent a speech acquisition training procedure and then performed behavioral tests, i.e. identification and discrimination expe- riments based on vocalic and CV-syllabic speech stimuli. These virtual listeners showed the expected behavioral results. The inspection of the neural organi- zation of virtual listeners indicated clustering in the case of categorical percep- tion and no clustering in the case of non-categorical (continuous) perception for neurons representing the stimuli. These results highlight a possible neural or- ganization underlying categorical and continuous perception. Keywords: speech perception, categorical perception, identification, discrimi- nation, neural model of speech production. 1 Introduction Categorical perception is an important feature of speech, needed for successfully differentiating and identifying sounds, syllables, or words. Categorical speech per- ception enables humans to map different realizations of one speech sound into one category. This is important in order to achieve a robust discrimination of different speech items, even if these items are realized by different speakers or by different articulations of the same speaker.
    [Show full text]
  • Effects of Sign Language Experience on Categorical Perception of Dynamic ASL Pseudosigns
    Attention, Perception, & Psychophysics 2010, 72 (3), 747-762 doi:10.3758/APP.72.3.747 Effects of sign language experience on categorical perception of dynamic ASL pseudosigns CATHERINE T. BEST University of Western Sydney, Penrith, New South Wales, Australia and Haskins Laboratories, New Haven, Connecticut GAURAV MATHUR Gallaudet University, Washington, D.C. and Haskins Laboratories, New Haven, Connecticut KAREN A. M IRANDA Massachusetts School of Professional Psychology, Boston, Massachusetts AND DIANE LI ll O -MARTIN University of Connecticut, Storrs, Connecticut and Haskins Laboratories, New Haven, Connecticut We investigated effects of sign language experience on deaf and hearing participants’ categorical perception of minimal manual contrast stimuli that met key criteria of speech perception research. A continuum of meaningless dynamic stimuli was created with a morphing approach, which manipulated videorecorded productions of phono- tactically permissible pseudosigns differing between American Sign Language (ASL) handshapes that contrast on a single articulatory dimension (U–V: finger-spreading). AXB discrimination and AXB categorization and goodness ratings on the target items were completed by deaf early (native) signers (DE), deaf late (nonnative) signers (DL), hearing late (L2) signers (HL), and hearing nonsigners (HN). Categorization and goodness functions were less categorical and had different boundaries for DL participants than for DE and HL participants. Shape and level of discrimination functions also differed by ASL experience and hearing status, with DL signers showing better perfor- mance than DE, HL, and especially HN participants, particularly at the U end of the continuum. Although no group displayed a peak in discrimination at the category boundary, thus failing to support classic categorical perception, discrimination was consistent with categorization in other ways that differed among the groups.
    [Show full text]
  • Linguistic Effect on Speech Perception Observed at the Brainstem
    Linguistic effect on speech perception observed at the brainstem T. Christina Zhaoa and Patricia K. Kuhla,1 aInstitute for Learning & Brain Sciences, University of Washington, Seattle, WA 98195 Contributed by Patricia K. Kuhl, May 10, 2018 (sent for review January 5, 2018; reviewed by Bharath Chandrasekaran and Robert J. Zatorre) Linguistic experience affects speech perception from early infancy, as 11 mo of age were observed to be significantly reduced compared previously evidenced by behavioral and brain measures. Current with MMRs observed at 7 mo age (15). research focuses on whether linguistic effects on speech perception The current study set out to examine whether the effects of lin- canbeobservedatanearlierstageinthe neural processing of speech guistic experience on consonant perception already manifest at a (i.e., auditory brainstem). Brainstem responses reflect rapid, auto- deeper and earlier stage than the cortex, namely, at the level of matic, and preattentive encoding of sounds. Positive experiential auditory brainstem. The auditory brainstem encodes acoustic in- effects have been reported by examining the frequency-following formation and relays them to the auditory cortex. Its activity can be response (FFR) component of the complex auditory brainstem recorded noninvasively at the scalp using a simple setup of three response (cABR) in response to sustained high-energy periodic electroencephalographic (EEG) electrodes (16). Using extremely portions of speech sounds (vowels and lexical tones). The current brief stimuli such as clicks or tone pips, the auditory brainstem re- study expands the existing literature by examining the cABR onset sponse (ABR) can be elicited and has been used clinically to assess component in response to transient and low-energy portions of the integrity of the auditory pathway (17).
    [Show full text]
  • Enhanced Sensitivity to Subphonemic Segments in Dyslexia: a New Instance of Allophonic Perception
    brain sciences Article Enhanced Sensitivity to Subphonemic Segments in Dyslexia: A New Instance of Allophonic Perception Willy Serniclaes 1,* and M’ballo Seck 2 1 Speech Perception Lab., CNRS & Paris Descartes University, 75006 Paris, France 2 Human & Artificial Cognition Lab., Paris 8 University, 93526 Saint-Denis, France; [email protected] * Correspondence: [email protected] or [email protected]; Tel.: +33-680-230-886 Received: 28 February 2018; Accepted: 22 March 2018; Published: 26 March 2018 Abstract: Although dyslexia can be individuated in many different ways, it has only three discernable sources: a visual deficit that affects the perception of letters, a phonological deficit that affects the perception of speech sounds, and an audio-visual deficit that disturbs the association of letters with speech sounds. However, the very nature of each of these core deficits remains debatable. The phonological deficit in dyslexia, which is generally attributed to a deficit of phonological awareness, might result from a specific mode of speech perception characterized by the use of allophonic (i.e., subphonemic) units. Here we will summarize the available evidence and present new data in support of the “allophonic theory” of dyslexia. Previous studies have shown that the dyslexia deficit in the categorical perception of phonemic features (e.g., the voicing contrast between /t/ and /d/) is due to the enhanced sensitivity to allophonic features (e.g., the difference between two variants of /d/). Another consequence of allophonic perception is that it should also give rise to an enhanced sensitivity to allophonic segments, such as those that take place within a consonant cluster.
    [Show full text]
  • Categorical Perception in Animal Communication and Decision-Making
    applyparastyle "fig//caption/p[1]" parastyle "FigCapt" Behavioral The official journal of the ISBE Ecology International Society for Behavioral Ecology Downloaded from https://academic.oup.com/beheco/advance-article-abstract/doi/10.1093/beheco/araa004/5740107 by duke university medical ctr user on 18 February 2020 Behavioral Ecology (2020), XX(XX), 1–9. doi:10.1093/beheco/araa004 Invited Review Categorical perception in animal communication and decision-making Patrick A. Green,a,b, Nicholas C. Brandley,c and Stephen Nowickib,d, aCentre for Ecology and Conservation, College of Life and Environmental Sciences, University of Exeter, Stella Turk Building, Treliever Road, Penryn, Cornwall TR10 9FE, UK, bBiology Department, Duke University, Box 90338, Duke University, Durham, NC 27708, USA, cDepartment of Biology, College of Wooster, Ruth W. Williams Hall, 931 College Mall, Wooster, OH 44691, USA, and dDepartment of Neurobiology, Duke University Medical School, 311 Research Drive, Box 3209, Durham, NC 27710, USA Received 14 November 2019; revised 14 January 2020; editorial decision 15 January 2020; accepted 17 January 2020. The information an animal gathers from its environment, including that associated with signals, often varies continuously. Animals may respond to this continuous variation in a physical stimulus as lying in discrete categories rather than along a continuum, a phenom- enon known as categorical perception. Categorical perception was first described in the context of speech and thought to be uniquely associated with human language. Subsequent work has since discovered that categorical perception functions in communication and decision-making across animal taxa, behavioral contexts, and sensory modalities. We begin with an overview of how categorical per- ception functions in speech perception and, then, describe subsequent work illustrating its role in nonhuman animal communication and decision-making.
    [Show full text]
  • Theories of Speech Perception
    Theories of Speech Perception • Motor Theory (Liberman) • Auditory Theory – Close link between perception – Derives from general and production of speech properties of the auditory • Use motor information to system compensate for lack of – Speech perception is not invariants in speech signal species-specific • Determine which articulatory gesture was made, infer phoneme – Human speech perception is an innate, species-specific skill • Because only humans can produce speech, only humans can perceive it as a sequence of phonemes • Speech is special Wilson & friends, 2004 • Perception • Production • /pa/ • /pa/ •/gi/ •/gi/ •Bell • Tap alternate thumbs • Burst of white noise Wilson et al., 2004 • Black areas are premotor and primary motor cortex activated when subjects produced the syllables • White arrows indicate central sulcus • Orange represents areas activated by listening to speech • Extensive activation in superior temporal gyrus • Activation in motor areas involved in speech production (!) Wilson and colleagues, 2004 Is categorical perception innate? Manipulate VOT, Monitor Sucking 4-month-old infants: Eimas et al. (1971) 20 ms 20 ms 0 ms (Different Sides) (Same Side) (Control) Is categorical perception species specific? • Chinchillas exhibit categorical perception as well Chinchilla experiment (Kuhl & Miller experiment) “ba…ba…ba…ba…”“pa…pa…pa…pa…” • Train on end-point “ba” (good), “pa” (bad) • Test on intermediate stimuli • Results: – Chinchillas switched over from staying to running at about the same location as the English b/p
    [Show full text]
  • How Infant Speech Perception Contributes to Language Acquisition Judit Gervain* and Janet F
    Language and Linguistics Compass 2/6 (2008): 1149–1170, 10.1111/j.1749-818x.2008.00089.x How Infant Speech Perception Contributes to Language Acquisition Judit Gervain* and Janet F. Werker University of British Columbia Abstract Perceiving the acoustic signal as a sequence of meaningful linguistic representations is a challenging task, which infants seem to accomplish effortlessly, despite the fact that they do not have a fully developed knowledge of language. The present article takes an integrative approach to infant speech perception, emphasizing how young learners’ perception of speech helps them acquire abstract structural properties of language. We introduce what is known about infants’ perception of language at birth. Then, we will discuss how perception develops during the first 2 years of life and describe some general perceptual mechanisms whose importance for speech perception and language acquisition has recently been established. To conclude, we discuss the implications of these empirical findings for language acquisition. 1. Introduction As part of our everyday life, we routinely interact with children and adults, women and men, as well as speakers using a dialect different from our own. We might talk to them face-to-face or on the phone, in a quiet room or on a busy street. Although the speech signal we receive in these situations can be physically very different (e.g., men have a lower-pitched voice than women or children), we usually have little difficulty understanding what our interlocutors say. Yet, this is no easy task, because the mapping from the acoustic signal to the sounds or words of a language is not straightforward.
    [Show full text]
  • Acquisition of Language Lecture 10 Phonological Development
    Ling 51/Psych 56L: Acquisition of Language Lecture 10 Phonological development III Announcements Be working on the phonological development review questions Be working on HW3 Prelinguistic speech perception Infant hearing Infant hearing is not quite as sensitive as adult hearing - but they can hear quite well and remember what they hear. Ex 1: Fetuses 38 weeks old A loudspeaker was placed 10cm away from the mother’s abdomen. The heart rate of the fetus went up in response to hearing a recording of the mother’s voice, as compared to hearing a recording of a stranger’s voice. Infant hearing Infant hearing is not quite as sensitive as adult hearing - but they can hear quite well and remember what they hear. Ex 2: Newborns Pregnant women read a passage out loud every day for the last 6 weeks of their pregnancy. Their newborns showed a preference for that passage over other passages read by their mothers. Infant hearing Infant hearing is not quite as sensitive as adult hearing - but they can hear quite well and remember what they hear. Ex 3: Newborns (Moon, Lagercrantz, & Kuhl 2012) Swedish and English newborns heard different ambient languages while in the womb (Swedish and English, respectively), and were surprised when they heard non-native vowels only hours after birth. Studying infant speech perception http://www.thelingspace.com/episode-16 https://www.youtube.com/watch?v=3-A9TnuSVa8 beginning through 3:34: High Amplitude Sucking Procedure (HAS) Studying infant speech perception Researchers use indirect measurement techniques. High Amplitude Sucking (HAS) Infants are awake and in a quietly alert state.
    [Show full text]
  • Categorical Effects in the Perception of Faces
    COGNITION ELSEVIER Cognition 57 (1995) 217-239 Categorical effects in the perception of faces James M. Beale*, Frank C. Keil* Department of Psychology, Cornell University, Ithaca, NY 14853, USA Received April 25, 1994, final version accepted February 9, 1995 Abstract These studies suggest categorical perception effects may be much more general than has commonly been believed and can occur in apparently similar ways at dramatically different levels of processing. To test the nature of individual face representations, a linear continuum of "morphed" faces was generated between individual exemplars of familiar faces. In separate categorization, discrimination and "better-likeness" tasks, subjects viewed pairs of faces from these continua. Subjects discriminate most accurately when face-pairs straddle apparent category boundaries; thus individual faces are perceived categorically. A high correlation is found between the familiarity of a face-pair and the magnitude of the categorization effect. Categorical perception therefore is not limited to low-level perceptual continua, but can occur at higher levels and may be acquired through experience as well. I. Introduction The categorization of natural objects in the world, whether they be faces, vehicles or vegetables, is normally thought to involve many levels of processing ranging from general beliefs about the category as a whole to psychophysical invariants that occur across instances. It is normally assumed that categorical perception effects are due to perceptual processes at the psychophysical level. Yet, these low-level categorical perception effects may be indicative of general cognitive processes (Harnad, 1987). This raises the question of whether the techniques for studying low-level perceptual categories might be effectively applied to the study of higher-level categories as well.
    [Show full text]
  • Speech Perception Is ‘Special’ – I.E
    24.963 Linguistic Phonetics Perception - cues Discrimination Identification 100 /b/ /d/ /g/ 0 -6 0 +8 -6 0 +8 HC HC 100 Percent correct 0 Percent /b.d.g./ -6 0 +8 -6 0 +8 PG PG 100 0 -6 0 +8 -6 0 +8 DL DL Image by MIT OCW. Adapted from Liberman, A. M. "Some characteristics of perception in the speech mode." Perception and its Disorders 48 (1970): 238-254. And Liberman, A. M. "Discrimination in speech and nonspeech modes." Cognitive Psychology 2 (1970): 131-157. 1 • Reading for next week: Johnson ch. 9 • Assignments: – Acoustics assignment 4, due 11/3 – Talk to me about a paper topic. 2 Speech perception • The problem faced by the listener: To extract meaning from the acoustic signal. • This involves the recognition of words, which in turn involves discriminating the segmental contrasts of a language. • Much phonetic research in speech perception has been directed toward identifying the perceptual cues that listeners use. 3 Perceptual Cues • Production studies can reveal differences between minimally contrasting words, e.g. aspirated and unaspirated stops in Mandarin differ in VOT. – Are listeners sensitive to these differences in speech perception? • Most direct test of perceptual significance of an acoustic property: manipulate the acoustic property synthetically (or by editing, resynthesis) and see if perceptual response is affected. – E.g. vary VOT in synthetic CV syllables (or by editing natural utterances), and have listeners classify the syllables as voiced or voiceless. 4 VOT as a cue to stop voicing and aspiration contrasts - Lisker & Abramson (1970) • Synthesized a VOT continuum for /Ta/ syllables.
    [Show full text]
  • Speech Perception
    UC Berkeley Phonology Lab Annual Report (2010) Chapter 5 (of Acoustic and Auditory Phonetics, 3rd Edition - in press) Speech perception When you listen to someone speaking you generally focus on understanding their meaning. One famous (in linguistics) way of saying this is that "we speak in order to be heard, in order to be understood" (Jakobson, Fant & Halle, 1952). Our drive, as listeners, to understand the talker leads us to focus on getting the words being said, and not so much on exactly how they are pronounced. But sometimes a pronunciation will jump out at you - somebody says a familiar word in an unfamiliar way and you just have to ask - "is that how you say that?" When we listen to the phonetics of speech - to how the words sound and not just what they mean - we as listeners are engaged in speech perception. In speech perception, listeners focus attention on the sounds of speech and notice phonetic details about pronunciation that are often not noticed at all in normal speech communication. For example, listeners will often not hear, or not seem to hear, a speech error or deliberate mispronunciation in ordinary conversation, but will notice those same errors when instructed to listen for mispronunciations (see Cole, 1973). --------begin sidebar---------------------- Testing mispronunciation detection As you go about your daily routine, try mispronouncing a word every now and then to see if the people you are talking to will notice. For instance, if the conversation is about a biology class you could pronounce it "biolochi". After saying it this way a time or two you could tell your friend about your little experiment and ask if they noticed any mispronounced words.
    [Show full text]