Usage-Based Effects in Latin American Spanish Syllable
USAGE-BASED EFFECTS IN LATIN AMERICAN SPANISH SYLLABLE-FINAL /s/ LENITION

Michelle Annette Minnick Fox

A DISSERTATION in Linguistics

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

2006

Mark Liberman
Supervisor of Dissertation

Eugene Buckley
Graduate Group Chairperson

Acknowledgments

I am very grateful to my advisor, Mark Liberman, for all of his support and advice in this project, and to committee members Gene Buckley and Gillian Sankoff for their insightful comments and suggestions. Many others have been kind enough to give me help and comments on this work over the past few years. In particular, I thank Rolf Noyer, Bill Labov, Pam Beddor, Susan Garrett, Terri Lander, and Corey Miller, for their insights. I thank Susan Garrett and Doug Chavez for performing the coding of the CallHome corpus. I thank Kazuaki Maeda for all of his help with technical issues.

I am fortunate to have started at Penn with a wonderful group of classmates. I thank Jason Baldridge, Atissa Banuazizi, Patricia Chow, Ron Kim, Eon-Suk Ko, Masato Kobayashi, Eleni Miltsakaki, Kieran Snyder, and Alexander Williams for their camaraderie. For countless enjoyable hours in the phonetics lab, I thank John Bell, Susan Garrett, Eon-Suk Ko, Terri Lander, Kazuaki Maeda, and Amanda Seidl. Several students who started ahead of me have been an inspiration, particularly Corey Miller, Sean Crist, and Beth-Ann Hockey.

It has often been challenging to work on the dissertation from Michigan. I thank the Phondi group at the University of Michigan linguistics department, and especially Pam Beddor, José Benkí, and San Duanmu for their hospitality.

I thank my parents who have been constantly supportive of my pursuit of the PhD.
This work would not have been possible without continuous motivation and help from my mother.

My deepest gratitude is to my husband, Jason, who has shown unsurpassed patience throughout all phases of graduate school, and to my three children born while I have been working on this dissertation, Andrea, Jeremy, and Brandon. Their love and enthusiasm have been an inspiration to me.

ABSTRACT

USAGE-BASED EFFECTS IN LATIN AMERICAN SPANISH SYLLABLE-FINAL /s/ LENITION

Michelle Annette Minnick Fox

Supervisor: Mark Liberman

Previous studies have identified factors that contribute to the weakening and deletion of syllable-final /s/ in Latin American Spanish, including the dialect and sex of the speaker, the phonetic environment and grammatical status of the /s/, and functional considerations. Proponents of a usage-based model of language claim that the structure of language is shaped by how it is used, and therefore factors such as word frequency, word predictability, and the context that words appear in have an effect on the form of language. This dissertation investigates the extent to which these usage-based factors contribute to syllable-final /s/ lenition in addition to the previously identified factors. Automated speech recognition methods were used to code three dependent variables for a corpus of over 50,000 tokens of syllable-final /s/: deletion or retention of /s/, duration of retained /s/, and the spectral center of gravity of retained /s/. Multiple regression was performed for each of the dependent variables, on all of the data combined and on several subsets of the data. For each multiple regression, usage-based factors were added to a base model to determine which of them improve the model.
Word frequency and word predictability based on the following word both have an effect in the expected direction, with more frequent and more predictable words having higher levels of lenition. Word predictability based on the preceding word has the opposite effect, with more predictable words having lower levels of lenition. The phonetic context that words appear in most frequently also has an effect, with words that are more often followed by a consonant having more advanced lenition, even after taking into consideration the actual phonetic context. These usage-based factors contribute both to the categorical process of deletion and the gradient processes of shortening and weakening of articulation. This data supports the claim that these usage-based variables form part of the speaker’s knowledge and that speakers have knowledge of low-level phonetic detail. This study suggests that the extent of lenition may be determined both within the lexical entry as in an exemplar model and by processing during production.
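
One of the three dependent variables named in the abstract is the spectral center of gravity of retained /s/. As an illustration only, the sketch below uses the standard definition of a spectral center of gravity as the power-weighted mean frequency of a spectrum. It is not the measurement procedure used in the dissertation; the function name and the four-bin toy spectra are hypothetical and chosen only to show the arithmetic.

```python
def spectral_center_of_gravity(freqs_hz, power):
    """Return the power-weighted mean frequency of a spectrum, in Hz.

    freqs_hz: center frequency of each spectral bin
    power:    non-negative power in each bin (same length as freqs_hz)
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    # First spectral moment: sum(f * P(f)) / sum(P(f))
    return sum(f * p for f, p in zip(freqs_hz, power)) / total


# Hypothetical toy spectra (made-up values, not data from the study).
freqs = [2000.0, 4000.0, 6000.0, 8000.0]
high_energy = [1, 2, 4, 3]  # energy concentrated in the higher bins
low_energy = [4, 3, 2, 1]   # energy concentrated in the lower bins

print(spectral_center_of_gravity(freqs, high_energy))  # 5800.0
print(spectral_center_of_gravity(freqs, low_energy))   # 4000.0
```

A spectrum whose energy sits in the higher bins yields a higher center of gravity than one whose energy sits in the lower bins, which is why the measure can serve as a gradient index of how an /s/ token was articulated.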
Table of Contents

Acknowledgments
ABSTRACT
Table of Contents
1 Introduction
2 Description of Syllable-final /s/ lenition
  2.1 Extralinguistic factors
    2.1.1 Dialect
    2.1.2 Age, Sex, Social Class
    2.1.3 Speech Style
  2.2 Linguistic factors
    2.2.1 Phonetic environment
    2.2.2 Grammatical category
    2.2.3 Functional considerations
    2.2.4 Number of syllables in the word
    2.2.5 Lexical exceptions
  2.3 Summary
3 Usage-based accounts of variation
  3.1 Frequency
  3.2 Word probability
  3.3 Exemplar theory and lexical representations
  3.4 Usage-based factors in relation to syllable-final /s/ lenition
4 Data Collection
  4.1 Spanish pronunciation lexicon
  4.2 CallHome Spanish
    4.2.1 The LDC Hub5-LVCSR CALLHOME Spanish Speech Corpus
    4.2.2 CallHome Spanish Data Coding
    4.2.3 Categorical Data
    4.2.4 Data Accuracy
  4.3 Broadcast News
    4.3.1 The LDC 1997 HUB-4 Spanish Speech Corpus
    4.3.2 Broadcast News Data Labeling
      4.3.2.1 Data Preparation
      4.3.2.2 Creation of Monophone HMM acoustic models
      4.3.2.3 Issues encountered during the HTK coding procedure
    4.3.3 Categorical Data
    4.3.4 Acoustic Measurements
      4.3.4.1 Duration
      4.3.4.2 Spectral Moments of /s/
      4.3.4.3 Intensity
      4.3.4.4 Preceding vowel formants
      4.3.4.5 Summary
    4.3.5 Data Accuracy
      4.3.5.1 Categorical Data