Cues to Vowels in the Aperiodic Phase of English Plosive Onsets
Kaj Christian Nyman
MPhil
University of York
Language and Linguistic Science
September 2011

Abstract

This thesis addresses the problem of vowel recognition in coarticulatory theory and phonology by assessing how early vowel quality can be recognised from English onset plosives realised with aspiration. Particular attention is paid to aspects of production and perception timing. A gating experiment was used to assess how reliably listeners can recognise English monophthongs. The treatment of coarticulation distinguishes between phonetic and phonological aspects of production and perception, with a clear demarcation between these levels of representation. The results are interpreted through the lens of prosodic phonology, as this framework constrains the grammar more optimally than segmental-phonemic ones and better exemplifies listeners’ sensitivity to the distribution of fine phonetic detail (FPD).

Velar and bilabial onsets give rise to significantly more correct responses than alveolars, which require more precise articulations. High vowels are recognised more reliably than low ones. This result is due to their intrinsically shorter duration, which makes high vowels less variable through time. This perceptual link is proportionate to the total amount of variation in vowel inherent spectral change (VISC), that is, the spectro-temporal variation in formant centre frequencies over the course of a vowel realisation. Nasal rimes give rise to a smaller proportion of correct responses than non-nasal rimes, especially in the context of high and low front vowels: the VISC and changes in vowel height undergone in the context of such articulations, as well as the phonetic consequences of the overall articulatory constellation, shape the resulting percept.
CVCs with non-nasal rimes give rise to more correct responses than CVVs, despite involving more ongoing articulations: the shortness of the vowel in CVCs compensates for this deficit, making perception more robust. Word frequency does not have a significant effect on recognition for any of the syllable types investigated.

Overall, a much larger temporal window than the phoneme is required for the robust processing and perceptual integration of speech. Phonemes alone cannot adequately account for the relationship between the phonetic co-extensiveness of different sounds and feature sharing in speech understanding. Since articulators are in constant motion during production, and consonantal gestures have distinctive coarticulatory influences over vocalic ones, the formant frequencies for both types of sound are in constant flux. This variation reinforces perceptual cohesion and has systematic effects on the mapping of FPD, through which larger structures become audible.

Table of Contents

Abstract
List of Figures
List of Tables
Preface
Acknowledgements
Author’s Declaration
1. Background
1.1 Studying the Perception of Coarticulation within the Context of Vowel Recognition from Aspiration
1.2 Research Questions
1.2.1 Research Questions: Generalisations on this Study and its Relationship with Previous Research
1.3 General Accounts of Coarticulation
1.3.1 A Non-Segmental Structural Definition of Coarticulation: Phonetic vs. Phonological Aspects
1.4 Influences on this Study and its Theoretical Rationale: Theories and Approaches to Coarticulation
1.4.1 FPA and DP
1.4.2 Polysp
1.4.3 The Contributions of Previous Theories: Explaining Vowel Timing Non-Segmentally
1.5 The Application and Use of Terminology in this Study
2. Literature Review
2.1 Vowel Timing
2.1.1 Timing Information: VISC
2.1.2 Articulatory-Phonetic Timing
2.1.3 Functional Timing (Information Encoding)
2.1.4 Non-Linearity in Vowel Perception: Order Effects and Perceptual Confusions
2.2 Vowel Timing and Aspiration in English CV(V)/Cs
2.2.1 The Relationship of Contrast and Representation to Recognition
2.2.2 Structural Variation and Vowel Recognition
2.2.3 FPD and Coarticulatory Direction Effects
2.2.4 Long-Domain Coarticulation and Airflow in CV(C)s
2.3 The Phonological Treatment of Vowel Recognition
2.3.1 A Formal Model for Reconciling Inconsistent Findings on Vowel Recognition Timing: Units and Devices Available
2.4 An Evaluation of the Methods of Earlier Studies
2.5 Secondary Research Questions
2.6 Hypotheses
3. Methodology
3.1 Overview
3.2 Experimental Design and Rationale
3.2.1 Justifications for Choosing the Gating Paradigm
3.3 Participants
3.3.1 Speakers
3.3.2 Listeners
3.3.3 Method of Recruiting Participants
3.4 Materials
3.4.1 Stimuli and Stimulus Structure
3.5 Implementing the Design and Experiment
3.5.1 Recordings
3.5.2 Stimulus Segmentation
3.5.3 Stimulus Presentation and Procedure
3.6 Analysis Method
3.6.1 General Phonetic Aspects of Plosives and the Aperiodic Phase
3.6.2 Plosive-Vowel Transitions in Unaspirated Plosive-Vowel CVs and in Aperiodic Noise
3.6.3 On the Phonetic Properties of Aspiration and Accompanying Formant Transitions into Vowels
3.6.4 Spectro-Temporal Analysis of Production Timing
3.6.5 Statistical Methods and External Analyses of Perception Timing
4. Results
4.1 Overview
4.2 Vowel Timing and Aspiration in CV(V)/C Production
4.2.1 Temporal Dynamics and VISC: the Evolution of Spectral Information in Production
4.2.2 FPD and Coarticulatory Direction Effects
4.2.3 Long-Domain Coarticulation and Airflow: Phonetic Exponency and Structure for [+Nasal] Stimuli
4.3 Vowel Timing and Aspiration in CV(V)/C Perception
4.3.1 Temporal Dynamics and VISC: the Evolution of Spectral Information in Vowel Recognition
4.3.2 FPD and Coarticulatory Direction Effects
4.3.3 Long-Domain Coarticulation and Airflow
4.4 Differences in Vowel Recognition Relating to Nasality
4.5 Perceptual Confusions and Vowel Length
4.5.1 Overall Values Across Time
4.5.2 An Examination of the Results between Gates
4.6 Lexical Frequency
4.7 A Summary of the Results Presented in Chapter 4
4.7.1 Production Results
4.7.2 Perception Results
5. Recognising and Building Representations for Vowels through Time
5.1 Overview
5.2 Extending Our Understanding of the Perception of Coarticulation and Vowel Recognition
5.2.1 A Re-examination of the Hypotheses Presented in Chapter 2
5.2.2 Reconciling the Aims and Results of this Study
5.2.3 Main Findings
5.2.4 The Way Recognition Evolves Through Time
5.2.5 Contrast, Representation and Vowel Recognition
5.2.6 FPD and Coarticulatory Direction Effects
5.2.7 Phonological/Syllable Structure
5.2.8 Long-Domain Coarticulation and Airflow
5.3 General Aspects of a Model of Vowel Recognition
5.3 Projecting Vowel and Syllable Structures Step-by-Step Using Incremental Dynamic Information
5.3.1 Example: abstraction of ‘pea’
5.3.2 Abstraction at Time Slot 2 (Burst Transient with 10ms Vowel Resonance)
5.3.3 Abstraction at Time Slot 3 (Plosive Burst with 20ms Accompanying Vowel Resonance)
5.3.4 Abstraction at Time Slot 4 (burst transient + 30ms vowel resonance)