HCS 7367 Speech Perception  Anatomical, Acoustic, Neural Properties  Production and Perception

Phonetics HCS 7367 Speech Perception Anatomical, acoustic, neural properties Production and perception Focuses on continuous/physical properties Dr. Peter Assmann Basic unit: phone (=speech sound) h Fall 2014 Narrow transcription: [ p ] Phonology Phonology Abstract representations mediate between Phonemes: minimum meaning- the speech stream and its interpretation as differentiating units in a language meaningful words Phonemic inventory: list of the phonemes Discrete entities (phonemes, syllables, ...) of a given language Basic unit: phoneme (smallest contrastive American English has ~12 vowel unit of a language) phonemes, 3 diphthongs, ~24 consonants Broad transcription: / p / Phonology Phonology Minimal pairs test: criterion for deciding Allophones: variants (pronunciations) of a whether two sounds belong to the same given phoneme phonemic category. In English, the “p” sound in “pot” is aspirated h If two words or phrases in a given language (pronounced [ p ɔt ]) while the “p” in “spot” is differ only in one phonological segment (e.g. unaspirated (pronounced [ spɔt ]). English “pin” and “bin”) but are perceived as [ph] and [p] are allophones of the phoneme /p/ distinct words, then that sound contrast contains two distinct phonemes (/p/ and /b/). 1 Consonant Perception Phonology What is a consonant? Preliminaries to Jakobson, Fant and Halle (1961). Consonants are sounds produced with a major Speech Analysis. constriction somewhere in the vocal tract. Distinctive feature theory: universal set of (binary) features, combining articulation and perception Chomsky and Halle (1968). The Sound Pattern of English Phonological representations as sequences of segments (phonemes) composed of distinctive features. Two levels: underlying abstract representation and surface phonetic representation English Consonants Consonant production Labio‐ Place of articulation Bilabial Dental Alveolar Palatal Velar Glottal dental American English has 24 consonant phonemes Voicing –+–+–+–+–+–+– The standard phonetic classification system Stop p b t d k g uses three main features to classify the Nasal m n ŋ articulation of consonants: Fricative f v θ ð s z ∫ ʒ h Affricate t∫ ʤ Place of articulation Manner Approximant (w) r jw Lateral l Manner of articulation Ladefoged, P. (1993). A Course in Phonetics. Harcourt Brace: Fort Worth. 3rd Ed. Voicing Consonant classification Stop consonants in English Place of articulation Place of articulation bilabial alveolar velar bilabial, labiodental, dental, alveolar, palatal, velar, glottal /ba/ /da/ /ga/ Manner of articulation voiced stops, nasal, fricatives, affricates, approximant, lateral Voicing /pa/ /ta/ /ka/ Voicing voiceless voiced, voiceless Manner of articulation = stop consonants 2 Consonant production Source‐filter theory for consonants Speech production theory Consonants: sounds produced with a major Source‐filter concept still applies constriction somewhere in the vocal tract. Relatively narrow constriction or closure Source‐filter theory still applies, but there are Secondary sound source in the upper vocal tract complications: 1. secondary sound source within the vocal tract 2. secondary source is aperiodic, turbulent noise (frication) rather than quasi‐periodic like the glottal source 3. secondary source may combine with the glottal source to produce mixed excitation (voiced fricatives) Acoustic tube perturbation and vowel formants Filter properties of stop consonants Source: Stevens (1999, p. 286) After Stevens (1997) velar alveolar 2.0 1.5 labial F2 frequency (kHz) F2 frequency 1.0 2.0 2.5 3.0 F3 frequency (kHz) Area functions: cylindrical tube Source‐filter theory for consonants approximations of the vocal tract Obstruents (fricatives, affricates, stops) Sonorants (nasals, liquids, glides) In obstruents, the constriction can divide the resonating vocal tract into two glottis lips separate cavities. coupled vs. uncoupled interactions with the source 3 Schematic model for fricative /s/ After Kent, Dembowski & Lass (1996) Voiceless Obstruction (maxillary incisors) unaspirated stop consonant Glottis Jet Lips [pɑ] Eddies Back Cavity Front Cavity Anterior constriction Time (ms) Place of articulation Place of articulation What are the acoustic cues for place of What are the acoustic cues for place of articulation (i.e., what acoustic properties do articulation (i.e., what acoustic properties do listeners use to distinguish /p/, /t/, /k/ (and listeners use to distinguish /p/, /t/, /k/ (and /b/, /d/, /g/) from each other)? /b/, /d/, /g/) from each other)? 1. Frequency of burst 2. Formant transitions (F2 and F3) from the burst to the vowel Acoustic cues for place of Formant transitions articulation in stops Burst cues During the closure for a stop consonant the Release burst, frication noise, aspiration noise vocal tract is completely closed, and no sound is emitted. At the moment of release Spectral shape the resonances change rapidly. These Static vs. dynamic changes are called formant transitions. Formant transition cues F2, F3 onset vs. target (vowel) frequency 4 Formant transitions Burst cues The first formant (F1) always increases in The burst is a brief acoustic transient that frequency following the release of a stop occurs at the moment of consonant release. closure. F2 and F3 increases or decreases, Burst intensity is higher for voiceless stops than depending on the place of constriction and for voiced stops the flanking vowels. Burst frequency varies as a function of place voicing onset burst Burst cues Stop consonants (from Kent and Read, 1992) Is consonant place information available when the burst is removed? Closure (stop gap) Release Aspirated Syllable-initial (noise burst) prevocalic stop Unaspirated Transition (formant transitions) / pɑ / Alveolar stop Velar stop ata aka 4 4 Closure Release Formant (stop gap) (noise burst) transitions / ada / Closure Release Formant /aga/ (stop gap) (noise burst) transitions 3 3 •high F2 •high F2 F3 •high F3 •low F3 2 2 F3 F2 F2 Frequency (kHz) Frequency Frequency (kHz) Frequency 1 1 F1 F1 0 0 0 100 200 300 400 500 0 100 200 300 400 500 Time (ms) Time (ms) 5 Bilabial stop Stop consonants apa (from Kent and Read, 1992) 4 Closure Release Formant Transition (stop gap) (noise burst) transitions /aba/ (formant transitions) Syllable-final 3 postvocalic stop Closure •low F2 (stop gap) •low F3 F3 Release (noise burst) 2 or No release (no burst) Frequency (kHz) Frequency 1 F2 F1 / / 0 0 100 200 300 400 500 Time (ms) Alveolar stop Bilabial stop ap 4 4 /ad/ /ab/ 3 3 •high F2 •low F2 F3 •high F3 F3 •low F3 2 2 F2 Frequency (kHz) F2 Frequency (kHz) 1 1 0 0 0 50 100 150 200 250 0 100 200 300 400 Time (ms) Time (ms) Burst cues: /ba/ and /pa/ Burst cues: /da/ and /ta/ /ba/ /pa/ /da/ /ta/ 4 4 4 4 3 3 3 3 2 2 2 2 Frequency (kHz) Frequency (kHz) 1 1 1 1 0 0 0 0 0 50 100 150 200 250 300 350 400 0 50 100 150 200 250 300 350 400 0 50 100 150 200 250 300 0 50 100 150 200 250 300 350 Time (ms) Time (ms) Time (ms) Time (ms) • Bilabials: Burst energy is concentrated at low frequencies • Alveolars: Burst energy is concentrated at high frequencies 6 Burst cues: /ga/ and /ka/ Place of articulation /ga/ /ka/ What are the acoustic cues for place of 4 4 articulation (i.e., what acoustic properties do 3 3 listeners use to distinguish /p/, /t/, /k/ (and 2 2 /b/, /d/, /g/) from each other)? Frequency (kHz) Frequency (kHz) 1 1 0 0 0 50 100 150 200 250 300 0 50 100 150 200 250 300 350 Time (ms) Time (ms) • Velars: Burst energy is concentrated in the mid range Place of articulation Acoustic cues for place of articulation in stops What are the acoustic cues for place of Burst cues articulation (i.e., what acoustic properties do Release burst, frication noise, aspiration noise listeners use to distinguish /p/, /t/, /k/ (and Spectral shape /b/, /d/, /g/) from each other)? Static vs. dynamic 1. Frequency of burst Formant transition cues 2. Formant transitions (F2 and F3) from the F2, F3 onset vs. target (vowel) frequency burst to the vowel Formant transitions Formant transitions The first formant (F1) always increases in During the closure for a stop consonant the vocal tract is completely closed, and no frequency following the release of a stop sound is emitted. At the moment of release closure. F2 and F3 increases or decreases, the resonances change rapidly. These depending on the place of constriction and changes are called formant transitions. the flanking vowels. 7 Burst cues Burst cues The burst is a brief acoustic transient that Is consonant place information available when occurs at the moment of consonant release. the burst is removed? Burst intensity is higher for voiceless stops than for voiced stops Burst frequency varies as a function of place voicing onset burst Invariance problem for stop consonants Invariance problem for stop consonants Liberman, Delattre, and Cooper (1952) NOISE VOWELS BURSTS B C t A 3.5 4K 3.0 3K 2.5 2K 2.0 ppkpk p 1K 1.5 Burst frequency(kHz) i eu c 1.0 burst + transitionless vowels 0.5 pp Cooper, Delattre, Liberman, Borst & Gerstman (1952) JASA 24, 597-606 ie a c ou Majority responses Invariance problem for stop consonants Invariance problem for stop consonants Transition + vowel without burst 8 Categorical Perception Categorical Perception of stop consonants Phoneme boundary (50%) Equal acoustic changes unequal auditory percepts Identification place of articulation of stops: /b/ vs /d/ vs /g/ (labeling) function Discrimination function b d g Liberman, Harris, Hoffman, and Griffith (1957) Journal of Experimental Psychology 54, 358-368 Categorical Perception Segmentation problem for stops Where are the segments? 1. Identification function shows steep "bag" slope (cross‐over) between categories /bæg/ 2. Good between‐category discrimination F3 (near perfect) when the members of the F2 pair belong to different categories F1 3. Poor within‐category discrimination time (near chance) when they are perceived to belong to the same category Same noise ‐ different consonant Different transition ‐ same consonant F 2 1400 Hz 1400 Hz F 1 pea ka dee da 9 Motor Theory K.N.

HCS 7367 Speech Perception  Anatomical, Acoustic, Neural Properties  Production and Perception

Pg. 5 CHAPTER-2: MECHANISM of SPEECH PRODUCTION AND

Phonological Use of the Larynx: a Tutorial Jacqueline Vaissière

Acoustic-Phonetics of Coronal Stops

Vowel Acoustics Reliably Differentiate Three Coronal Stops of Wubuy Across Prosodic Contexts

Relations Between Speech Production and Speech Perception: Some Behavioral and Neurological Observations

Building a Universal Phonetic Model for Zero-Resource Languages

Prestopped Bilabial Trills in Sangtam*

Theories of Speech Perception

Appendix B: Muscles of the Speech Production Mechanism

Saudi Speakers' Perception of the English Bilabial

Illustrating the Production of the International Phonetic Alphabet

Acoustic Characteristics of Aymara Ejectives: a Pilot Study