CODA GLOTTALIZATION IN AMERICAN ENGLISH

Scott Seyfarth and Marc Garellek

University of California, San Diego [email protected], [email protected]

ABSTRACT increased vocal fold constriction or sustained clo- sure. Yet phrase-finally, coda glottalization rates Glottalization of coda /t, p/ is a common process in generally increase [12]. This challenges the en- American English. This study uses acoustic mea- hancement account, since there is often no coarticu- sures to determine when coda glottalization occurs latory voicing at a phrase boundary. in the conversational speech of the Buckeye Cor- However, these findings are based on read speech pus. preceding coda /t, p/ tokens for 40 from two to six speakers, using a relatively narrow speakers were analyzed using H1*–H2*, an acous- selection of segment and word types. Prior identi- tic correlate of glottal constriction. Results indi- fication of coda glottalization has also relied on vi- cate that coda glottalization is more common be- sual inspection of voicing periodicity in waveforms. fore a , and this effect is still found phrase- Yet there is no straightforward relationship between finally, even when phrasal creak is taken into ac- coda glottalization and irregular glottal pulses: coda count. Nonetheless, the process occurs in other en- glottalization by definition involves increased glot- vironments. While we conclude that coda glottaliza- tal constriction, but not all glottal constriction causes tion may occur to enhance the of coda visibly-irregular voicing [3]. Further, irregular voic- /t/ before [20, 28], we argue that this can- ing is not always due to increased glottal constric- not fully explain the phenomenon. tion preceding coda stops. For example, in Ameri- Keywords: glottalization, coda stops, quality can English, vowels can be glottalized when word- initial and stressed [21, 4, 8], and is 1. INTRODUCTION also used to mark ends of phrases and index social identity [15, 9, 24, 23]. In American English, /t/ and /p/ may undergo coda Thus, the goals of the present study are: glottalization [19, 20, 7, 12, 29], which refers to ei- (a) to use an acoustic measure that captures in- ther glottal reinforcement or glottal replacement in creased glottal constriction to test whether prior coda position. Glottal reinforcement occurs when findings on coda glottalization in American English the oral gestures for coda /t, p/ are produced with hold within a large corpus of spontaneous speech; > > simultaneous glottal constriction: [Pt, Pp]. Glottal (b) to evaluate phrase-final rates of coda glottal replacement occurs when the oral gestures for coda replacement, independently of irregular voicing; /t/ are replaced by glottal constriction: [P] or con- (c) to generate hypotheses for why coda glottal- stricted voicing. In American English, glottal rein- ization occurs in American English. forcement is found for both /t/ and /p/ in coda posi- tion, whereas glottal replacement is attested only for 2. CORPUS DATA /t/. Coda glottalization occurs in other varieties of English [25, 5, 17, 1], as well as in other languages The data in this study come from the Buckeye Cor- [7]. For example, glottal replacement is common in pus of spontaneous American English [22], which German [14] and glottal reinforcement is obligatory consists of recorded speech of 40 adults (20 male, in many East and Southeast Asian languages [18, 6]. 20 female) from Ohio. The recordings took place Previous work has shown that coda glottalization in a quiet room and were digitized at 16 kHz with in American English is more common for /t/ than 16-bit resolution. The corpus contains both canon- for /p/, and (phrase-medially) is more likely to oc- ical phonemic and close phonetic transcriptions for cur when the coda precedes a sonorant, such as in each word. The canonical transcriptions were gen- ‘gate number’ [19, 20, 12]. It has been proposed erated by automatic alignment software, which were that glottalization thus serves as an enhancement of then hand-corrected by corpus annotators to create coda voicelessness [28], particularly to prevent voic- the close phonetic transcriptions and segmentation. ing from spreading from the following segment [20]. Words with syllable-final /t, p/ in both the canon- Glottalization weakens or prohibits voicing through ical form and the close transcription were ex- tracted from the corpus. Syllabification was adapted mean, as these suggest erroneous octave jumps. We from [10]. Complex codas (‘kept’, ‘rant’) and also plotted F1–F2 distributions for each type phonetically-voiced stops (/t/ realized as [R]) were by sex, removed tokens > 2.5 SD from each cate- excluded. For complex codas with two stops, glot- gory mean, and additionally hand-pruned a total of talization cannot be attributed to a single stop; for 104 vowels that were likely mistracked. complex codas with sonorant-stop sequences, the Finally, we removed tokens whose H1*–H2* presence of a sonorant could affect the acoustic mea- measurements were > 2.5 SD from the global mean. sures of voice quality (see §3.1). Further, because Of the 5415 remaining tokens, 1549 were followed acoustic glottalization was measured on the vowel by a breath, silence, laugh, or other non-speech preceding /t, p/, we excluded tokens where the tar- noise, and were removed from the present analysis, get vowel was < 50 ms, as short samples are prob- in order to exclude tokens that are likely phrase-final lematic for voice analyses. We also excluded very and where there is no immediately-following seg- long vowels (> 300 ms) and stops (> 150 ms), ment. In total, 3866 tokens and 274 word types are which are likely to be hesitations or disfluencies. included in the analysis of H1*–H2*.

3. STUDY 1: GLOTTALIZATION 3.1.1. Glottal stops ENVIRONMENTS As a secondary measure, we examined separately The first goal was to replicate previous findings, us- the rate of glottal replacement for /t/ codas, which ing acoustic measures, regarding the environments was identified using the annotations in that trigger coda glottalization. The primary variable the close phonetic transcriptions. Hand inspection of interest in these analyses is the segment following suggests that glottal stop annotations were largely the coda stop. Phrase-medially, glottalization is re- accurate: we inspected 1824 tokens labeled as hav- ported to be more prevalent before sonorants than ing a glottal stop, and identified glottal stops as hav- and is predicted to be more prevalent be- ing no [t] formant transitions or release burst and fore voiced obstruents than voiceless ones [20, 12]. irregular voicing localized to the onset and offset This pattern should be observed if glottalization is of the target stop. Of these 1824 tokens, we re- > being used as a strategy to prevent voiceless codas vised only 62 annotations to glottal-reinforced [Pt] from undergoing coarticulatory voicing. because of the presence of a [t] release burst. This dataset included only /t/ codas, since re- 3.1. Methods placement is not attested in American English for /p/. Likely due to the segmentation strategy noted Glottalization was measured using H1*–H2*, which above, H1*–H2* was not correlated with replace- is the difference in amplitude between the first and ment by [P] (rpb = 0.06). Of the 11,594 /t/ tokens second harmonics, corrected for formant frequen- in the dataset, 3762 were followed by a silence or cies and bandwidths. Lower values of H1*–H2* non-speech sound and removed from this analysis, are correlated with increased glottal constriction leaving 7832 tokens and 312 word types. [11, 26, 16], meaning that this acoustic measure should reflect coda glottalization even in the absence 3.2. Models of visually-irregular voicing. H1*–H2* was mea- sured on the vowel preceding each coda stop in our Using lme4 [2], we fit a linear mixed-effects model dataset using VoiceSauce [27], and averaged over to the H1*–H2* data described in §3.1, and a logis- the target vowel’s entire duration. tic model to the glottal stop data in §3.1.1. The vari- For this analysis, we took out 6224 /t/ tokens able of interest—the segment following the stop— transcribed as [P], since the segment boundaries for was Helmert-coded to test two contrasts: voiced vs. [P] were coextensive with regions of glottalization. voiceless obstruents, and obstruents vs. sonorants. These tokens were analyzed separately as described As control variables, we included three factors in §3.1.1, but we found that all results here are qual- that could influence voice quality: absence of a syl- itatively the same with these tokens included. lable onset (favoring word-initial glottalization), po- Since the accuracy of the measurements depends sition of the syllable in the word, and phrasal creak, on correct f 0 and formant tracking, we took several which was identified using the ‘creaky voice’ labels steps to identify mistracked tokens. We excluded provided by corpus annotators. However, when this tokens if f 0 increased by more than 100 Hz (for label was applied only to a local region around a tar- women) or 50 Hz (for men) during the vowel, or get coda (less than twice the nucleus vowel dura- if the mean f 0 was > 2.5 SD from each speaker’s tion), we treated the label as referring to coda glot-   talization rather than phrasal creak. For the model of   H1*–H2*, f 0 was also included as a control. Both   models included maximal converging per-speaker   random effects. Also included were per-word inter-     cepts and slopes for the two Helmert contrasts, and     intercepts for the identity of the following segment.       3.3. Results     Figure 1 shows a summary of the data; fixed-effects  

    estimates for the two models are shown in Table 1.  In the model of H1*–H2*, there is a significant de-  crease in the measure (i.e., more glottal constriction)  when a sonorant follows the coda stop, replicating  the findings in [20, 12]. H1*–H2* is not signifi- cantly lower when the coda stop is followed by a  voiced , relative to a voiceless one.     The model of glottal stop replacement showed   the same pattern: glottal replacement of coda /t/ is Figure 1: Means of H1*–H2* measurements by significantly more likely when there is a following variable. Error bars show bootstrapped 95% CIs. sonorant, but there was not a significant difference between voiced and voiceless obstruents. 4. STUDY 2: GLOTTAL REPLACEMENT BY For the most part, the control variables patterned PHRASE POSITION in expected directions, as shown in Figure 1, al- though in the model they were non-significant (stop Study 1 supported prior work showing that coda place was marginal, p < 0.07). This may be because glottalization (as both glottal constriction and glot- there was insufficient variation in the controls: for tal replacement) is more prevalent preceding a sono- example, less than 4% of tokens were word-medial rant. However, prior work has also reported high (about 75% of the data are monosyllabic function rates of coda glottalization phrase-finally. Why words). might coda glottalization be more common both when the following sound is a sonorant and at the ends of phrases? If coda glottalization occurs pri- H1*–H2* (dB z) /t/ [P] ! marily to enhance voicelessness of the coda stop (by Intercept 0.072 0.465 preventing sonorant voicing from spreading [20]), (0.096)(0.230) then it is unclear why coda glottalization rates would Voiced vs. voiceless obstruent 0.023 0.047 also increase phrase-finally, especially for utterance- (Helmert comparison) (0.019)(0.168) final phrases that are not followed by a speech Sonorant vs. obstruent 0.041** 0.821*** sound. (Helmert comparison) (0.014)(0.112) On the other hand, it is also plausible that phrase- Onset 0.007 0.006 (present: 1, absent: 1) (0.021)(0.112) final coda glottalization is mainly an expression Phrasal creak 0.008 0.352*** of phrasal creak, which is common at the ends (present: 1, absent: 1) (0.037)( 0.069) of prosodic phrases in American English [15, 24]. Coda syllable position 0.020 0.098 Study 2 thus examines the effect of phrase position (word-final: 1, medial: 1) (0.031)(0.110) only on glottal replacement, where the presence of Coronal vs. labial coda stop 0.050 — glottalization is probably not a result of mistaking (p: 1, t: 1) (0.026) — phrasal creak for coda glottalization. Unlike glottal f 0 0.351*** — reinforcement, glottal replacement involves a glottal (Hz, z-score) (0.048) — stop with no coronal formant transitions during the * p < 0.05; ** p < 0.01; *** p < 0.001 (adj. for multiple tests) preceding vowel and no [t] release. Thus, even if a vowel before coda /t/ is creaky because of phrasal Table 1: Model estimates for Study 1 (SEs be- creak rather than coda glottalization, we would still low in parentheses). Italics show units or coding expect to see both formant transitions and a [t] re- scheme. lease. In this study, we also attempt to control for the presence of phrasal creak, as defined in §3.2.      

        

  

  

Figure 2: Proportion of /t/ codas realized as [P], by type of following segment. White bars show the proportion realized as [P] when the /t/ was not followed by a speech sound; all of these codas were phrase-final.

4.1. Coding of phrasal position 5. GENERAL DISCUSSION

Five coders noted whether the target word with a This study tested whether prior findings on coda coda stop was phrase-final (at the end of an inter- glottalization in American English hold across a di- mediate or full intonational phrase). The end of a verse selection of phonological environments and a phrase was identified by lengthening of the phrase- larger number of speakers, using an acoustic corre- final vowel and/or by the presence of a following late of glottal constriction, H1*–H2*, as well as cod- phrase accent, pause, breath, silence, or disfluency. ings for [P]. We find that coda glottalization is more This study reports a preliminary analysis of 6347 common predominantly before sonorants, confirm- /t/ words that have been hand-coded for phrasal po- ing previous findings [20, 12]. On an annotated sub- sition; annotation of the remaining words is cur- set of the corpus, we also find that the sonorant effect rently underway. Figure 2 shows the proportion of still exists phrase-finally, contra [12]. These results /t/ codas that were replaced with glottal stop, by support a glottalized allophone of coda /t/ before phrase position and the following segment type. sonorants (regardless of phrasal position), whose precise phonetic articulation ranges from glottal- > 4.2. Model and Results reinforced [Pt] to a glottal stop [P]. This stems from our finding that both glottal reinforcement and glot- A logistic mixed-effects model was fit to the anno- tal replacement are more common before sonorants. tated glottal replacement data using the procedure, Our results support the claim that coda glottaliza- variables, and coding in §3.2, with two differences. tion occurs when it helps to prevent coarticulatory First, phrasal position was added as a variable, in- anticipatory voicing from a following sonorant [20]. cluding interactions with the type of segment fol- Although coda glottalization rates were not found lowing the coda. Second, the following segment to be different preceding voiced and voiceless ob- variable was re-coded in the model to test three con- struents, voicing is relatively weak for English ob- trasts, based on visualizing the data (Figure 2): ob- struents. Thus, it may be that there is less need to struents vs. sonorants, obstruents vs. utterance-final prevent anticipatory voicing in codas before voiced tokens (those not followed by a speech sound), and obstruents relative to those before sonorants. voiced vs. voiceless obstruents. Utterance-finally, when there is no following In Study 2, there was a marginal overall effect of sound, glottal replacement nonetheless occurs over phrase-final position (b = 0.20, p < 0.07). As in 50% of the time. Additionally, replacement rates be- Study 1, there was more replacement before sono- fore obstruents are somewhat higher phrase-finally rants than obstruents, both phrase-medially (b = than medially. This suggests that there are other con- 1.51, p < 0.001) and phrase-finally (b = 0.83, p < siderations that trigger coda glottalization beyond a 0.001). The sonorant effect was significantly smaller need to prevent coarticulatory voicing, which would phrase-finally (p < 0.001). not be present in final position. For example, it is possible that glottalization serves to enhance the There was no significant difference between voiceless/voiced distinction more generally. Thus, it voiced and voiceless obstruents in either phrase po- may enhance the relative percept of voicelessness in sition. However, phrase-finally, the replacement rate final position, where cues to voicedness are weak- increased more before voiced obstruents than voice- ened and less reliable [13]. less ones (b = 0.29, p < 0.01). Utterance-final to- Nonetheless, even phrase-medially before voice- kens (those not followed by a speech sound) did not less sounds, glottalization is hardly rare (Figure 2). undergo more replacement than phrase-final tokens Further work is needed to better understand why it preceding obstruents (p > 0.5). occurs across such a wide variety of environments. 6. REFERENCES [17] Mees, I., Collins, B. 1999. Cardiff: a real-time study of glottalisation. In: Foulkes, P., Docherty, [1] Ashby, M., Przedlacka, J. 2014. Measuring incom- G., (eds), Urban Voices: Accent Studies in the pleteness: Acoustic correlates of glottal articula- British Isles. London: Arnold 85–202. tions. Journal of the International Phonetic Asso- [18] Michaud, A. 2004. Final and glottal- ciation 44, 283–296. ization: New perspectives from Hanoi Vietnamese. [2] Bates, D., Maechler, M., Bolker, B., Walker, S. Phonetica 61, 119–146. 2014. lme4: Linear mixed-effects models using [19] Pierrehumbert, J. 1994. Knowledge of variation. Eigen and S4. R package version 1.1-7. Papers from the parasession on variation, 30th [3] Blankenship, B. 2002. The timing of nonmodal meeting of the Chicago Linguistic Society Chicago. in vowels. Journal of Phonetics 30, 163– Chicago Linguistic Society 232–256. 191. [20] Pierrehumbert, J. 1995. Prosodic effects on glottal [4] Dilley, L., Shattuck-Hufnagel, S., Ostendorf, M. allophones. In: Fujimura, O., Hirano, M., (eds), 1996. Glottalization of word-initial vowels as a Vocal fold physiology: voice quality control. San function of prosodic structure. Journal of Phonetics Diego: Singular Publishing Group 39–60. 24, 423–444. [21] Pierrehumbert, J., Talkin, D. 1992. of /h/ [5] Docherty, G., Foulkes, P. 1999. Sociophonetic vari- and glottal stop. In: Docherty, G. J., Ladd, D. R., ation in glottals in Newcastle English. Proceedings (eds), Papers in Laboratory Phonology II. Cam- of the International Congress of Phonetic Sciences bridge: Cambridge University Press 90–117. San Francisco. 1037–1040. [22] Pitt, M. A., Dilley, L., , Johnson, K., Kiesling, S., [6] Edmondson, J. A., Chang, Y., Hsieh, F., Huang, Raymond, W., Hume, E., Fosler-Lussier, E. 2007. H. J. 2011. Reinforcing voiceless finals in Tai- Buckeye Corpus of Conversational Speech (2nd re- wanese and Hakka: Laryngoscopic case studies. lease). Department of Psychology, Ohio State Uni- Proceedings of the 17th International Congress of versity Columbus, OH. Phonetic Sciences Hong Kong. [23] Podesva, R. J., Callier, P. 2015. Voice quality and [7] Esling, J. H., Fraser, K. E., Harris, J. G. 2005. Glot- identity. Annual Review of Applied Linguistics 35, tal stop, glottalized resonants, and pharyngeals: 173–194. A reinterpretation with evidence from a laryngo- [24] Redi, L., Shattuck-Hufnagel, S. 2001. Variation in scopic study of Nuuchahnulth (Nootka). Journal the realization of glottalization in normal speakers. of Phonetics 33, 383–410. Journal of Phonetics 29, 407–429. [8] Garellek, M. 2014. Voice quality strengthening and [25] Roach, P. J. 1973. Glottalization of English /p/, /t/, glottalization. Journal of Phonetics 45, 106–113. /k/ and /tS/ – a re-examination. Journal of the In- [9] Garellek, M. 2015. Perception of glottalization and ternational Phonetic Association 3, 10–21. phrase-final creak. Journal of the Acoustical Soci- [26] Samlan, R. A., Story, B. H. 2011. Relation of struc- ety of America 137, 822–831. tural and vibratory kinematics of the vocal folds to [10] Gorman, K. 2013. Generative phonotactics. PhD two acoustic measures of based on thesis University of Pennsylvania. computational modeling. Journal of Speech, Lan- [11] Holmberg, E. B., Hillman, R. E., Perkell, J. S., guage, and Hearing Research 54, 1267–1283. Guiod, P., Goldman, S. L. 1995. Compar- [27] Shue, Y.-L., Keating, P. A., Vicenik, C., Yu, K. isons among aerodynamic, electroglottographic, 2011. VoiceSauce: A program for voice analysis. and acoustic spectral measures of female voice. Proceedings of the International Congress of Pho- Journal of Speech and Hearing Research 38, 1212– netic Sciences Hong Kong. 1846–1849. 1223. [28] Stevens, K., Keyser, S. J. 1989. Primary features [12] Huffman, M. K. 2005. Segmental and prosodic ef- and their enhancement in consonants. Language fects on coda glottalization. Journal of Phonetics 65(1), 86–106. 33, 335–362. [29] Sumner, M., Samuel, A. G. 2005. Perception and [13] Keyser, S. J., Stevens, K. N. 2006. Enhancement representation of regular variation: The case of fi- and Overlap in the Speech Chain. Language 82(1), nal /t/. Journal of Memory and Language 52, 322– 33–63. 338. [14] Kohler, K. J. 1994. Glottal stops and glottalization in German. Data and theory of connected speech processes. Phonetica 51, 38–51. [15] Kreiman, J. 1982. Perception of sentence and para- graph boundaries in natural conversation. Journal of Phonetics 10, 163–175. [16] Kreiman, J., Shue, Y.-L., Chen, G., Iseli, M., Ger- ratt, B. R., Neubauer, J., Alwan, A. 2012. Variabil- ity in the relationships among voice quality, har- monic amplitudes, open quotient, and glottal area waveform shape in sustained phonation. Journal of the Acoustical Society of America 132, 2625–2632.