CROSS-LINGUISTIC COMPARISON OF RHYTHMIC AND PHONOTACTIC SIMILARITY

A DISSERTATION SUBMITTED TO THE GRADUATE DIVISION OF THE UNIVERSITY OF HAWAI‘I AT MĀNOA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY IN LINGUISTICS

DECEMBER 2013

By Diana Stojanović

Dissertation Committee:

Ann M. Peters, Chairperson; Patricia Donegan; Victoria Anderson; Kamil Ud Deen; Kyungim Baek

© 2013, DIANA STOJANOVIĆ


ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to all who provided inspiration, guidance, help, love, and support during my journey at the Department of Linguistics.

Members of my dissertation committee Professors Ann M. Peters, Patricia J. Donegan, Victoria B. Anderson, Kyungim Baek and Kamil Ud Deen;

All professors at the Department of Linguistics and in particular Byron Bender, Bob Blust, Mike Forman, William O’Grady, Ken Rehg, Albert Schütz, David Stampe, Ben Bergen, Katie Drager, Luca Onnis, Yuko Otsuka, and Amy Schafer;

Department secretaries who made the impossible possible: Jen Kanda and Nora Lum;

Classmates and officemates: among many, Kaori Ueki, Yumiko Enyo, Gabriel Correa, Karen Huang, Laura Viana, Tatjana Ilic, Maria Faehndrich, Kathreen Wheeler, and Mie Hiramoto;

East-West Center and in particular Prof. Andrew Mason;

Udacity for teaching me enough Python to support this dissertation;

EWCPA and wonderful neighbors in Hale Kuahine;

Graduate Division, GSO, ISS, and in particular Martha Stuff and Linda Duckworth;

Family away from home Nelda Peterson, Christobel Sanders, Nina, Jo, and Kano;

My wonderful friends: among many, Bosiljka Pajic, Jadranka Bozinovska, Milka Smiljkovic, Svetlana Stanojevic, Aleksandra Petrovic, Branko Stojkovic, Ljiljana Milenkovic, Olga Jaksic, Jelena and Kosta Ilic, Helen Saar, Ange Nariswari, Ina Sebastian, Yoko Sato, and Parichat Jungwiwattanaporn;

My family and in particular my grandparents who spoke different languages and instilled the love for language in me; and my parents who supported me unconditionally;

And my dear husband Turro Wongkaren:

THANK YOU


ABSTRACT

The literature on speech rhythm has focused on three major questions: whether languages have rhythms that can be classified into a small number of types, what the criteria are for membership in each class, and whether the perceived rhythmic similarity between languages can be quantified based on properties found in the speech signal.

Claims have been made that rhythm metrics – simple functions of the durations of vocalic and consonantal stretches in the speech signal – can be used to quantify rhythmic similarity between languages. Despite the wide popularity of these measures, criticisms emerged stating that rhythm metrics reflect differences in syllable structure rather than rhythm.

In this dissertation, I first investigate what kind of similarity is captured via rhythm metrics. Then, I examine the relationship between the assumed rhythm type and language structural complexity as measured by the distributions of 1) consonant-cluster sizes, 2) phonotactic patterns, and 3) word lengths. The materials on which the measures of structural complexity were computed were automatically transcribed from written texts in 21 test languages. The transcriber is implemented in Python using grapheme-to-phoneme rules and simple phonological rules. Complexity measures are calculated using a set of functions, the components of the complexity calculator.

Results show that several rhythm metrics are strongly correlated with phonotactic complexity. In addition, a linear relationship found between some metrics suggests that the information they provide is redundant. These results corroborate and extend results in the literature and suggest that rhythmic similarity must be measured differently. Structural similarity in many cases points to historical language groupings. Similarity of word-final clusters emerges as the factor that most resembles rhythmic classification, although a large body of independent evidence of rhythmic similarity is necessary in order to establish this correspondence with more certainty.

Based on the results of this dissertation and the literature, a possible model of rhythmic similarity based on feature comparison is discussed and contrasted with the current model based on rhythm metrics. This new ‘Union of features’ model is argued to better fit the nature of rhythm perception.


TABLE OF CONTENTS

Acknowledgements ...... iii
Abstract ...... iv
List of tables ...... viii
List of figures ...... x

CHAPTER 1: INTRODUCTION ...... 1

1.1 Rhythm correlates ...... 1
1.2 Rhythm class hypothesis (RCH) ...... 2
1.3 Rhythm metrics ...... 4
1.4 Issues present in the current literature ...... 8
1.5 Questions and approaches used to solve them ...... 9
1.6 Contribution of this Dissertation ...... 11
1.7 Outline ...... 11

CHAPTER 2: BACKGROUND ...... 13

2.1 Durational variability in speech ...... 13
2.2 Phonotactics: Sonority scale and markedness of consonant clusters ...... 20

CHAPTER 3: METHODS ...... 23

3.1 Model of the transcriber & the phonotactic calculator ...... 23
3.2 Raw data assembly ...... 24
3.3 Creating phonemic corpora ...... 24
3.3.1 Choice of grapheme-to-phoneme method ...... 27
3.3.2 Implementation of grapheme-to-phoneme method ...... 28
3.4 Complexity calculator ...... 29
3.4.1 Phonotactic metrics and rhythm metrics ...... 29
3.4.2 Consonant-cluster measures ...... 34
3.4.3 Word-length measures ...... 35


CHAPTER 4: RESULTS ...... 37

4.1 Phonotactic component of Rhythm Metrics ...... 37
4.1.1 Introduction ...... 37
4.1.2 Correlations between the Phonotactic and Rhythm Metrics ...... 38
4.1.3 Classification power of RMs and PMs ...... 43
4.1.4 Language classification based on Phonotactic Metrics ...... 46
4.1.5 Conclusion ...... 45
4.2 Consonant cluster lengths at different positions in the word ...... 50
4.2.1 Word-initial cluster distributions ...... 50
4.2.2 Word-final cluster distributions ...... 52
4.2.3 Word-medial cluster distributions ...... 55
4.2.4 Summary ...... 56
4.3 Phonotactic patterns at different positions in the word ...... 58
4.3.1 Basic sonority (ALT) level ...... 59
4.3.2 Detailed sonority (saltanajc) level ...... 67
4.4 Word length distributions ...... 76
4.5 Variability of measures over different materials ...... 81

CHAPTER 5: GENERAL DISCUSSION AND CONCLUSION...... 89

5.1 Summary ...... 89
5.2 Overview ...... 91
5.3 Limitations of the study ...... 92
5.4 Discussion ...... 95
5.4.1 Additional questions ...... 95
5.4.2 Use of modified speech in addressing questions on rhythmic similarity ...... 98
5.4.3 The nature of rhythm ...... 100
5.4.4 Proposed model of rhythmic similarity ...... 101
5.4.5 Implications/prediction for L2 speech and learning in infants ...... 102
5.5 Conclusion ...... 104


APPENDICES

Appendix 1: Basic properties of the languages from WALS ...... 105
Appendix 2: Texts and transcripts for 21 languages ...... 108
Appendix 3: Values of Rhythm and Phonotactic Metrics ...... 131
Appendix 4: Word-length distributions for 21 languages ...... 134

BIBLIOGRAPHY...... 145


LIST OF TABLES

Table 4.1 Correlation between the consonantal PMs and RMs ...... 38
Table 4.2 Correlation between the vocalic PMs and RMs (long=1) ...... 39
Table 4.3 Correlation between the vocalic PMs and RMs (long=2) ...... 39
Table 4.4 Distribution of word-initial consonant clusters ...... 51
Table 4.5 Distribution of word-final consonant clusters ...... 53
Table 4.6 Distribution of word-medial consonant clusters ...... 55
Table 4.7 Distribution of word-medial consonant clusters (re-arranged) ...... 56
Table 4.8 Language groupings based on word-initial, word-medial, and word-final complexity ...... 57
Table 4.9 Word-initial length-0 and length-1 clusters ...... 60
Table 4.10 Word-initial length-2 clusters ...... 61
Table 4.11 Word-initial length-3 clusters ...... 62
Table 4.12 Word-final length-0 and length-1 clusters ...... 64
Table 4.13 Word-final length-2 clusters ...... 64
Table 4.14 Word-final clusters grouped by sonority ...... 65
Table 4.15 Word-final length-3 clusters ...... 67
Table 4.16 Saltanajc frequencies in 21 languages ...... 68
Table 4.17 Clusters of length-2 in ‘saltanajc’ scale: initial position ...... 70
Table 4.18 Clusters of length-2 in ‘saltanajc’ scale: initial position ...... 71
Table 4.19 Clusters of length-2 in ‘saltanajc’ scale: final position ...... 72
Table 4.20 Clusters of length-2 in ‘saltanajc’ scale: word-final position ...... 73
Table 4.21 Clusters of length-2 in ‘saltanajc’ scale: word-medial position ...... 74
Table 4.22 Variability of phonotactic metrics over different texts ...... 82
Table 4.23 Variability of consonant cluster complexity in word-initial position ...... 84
Table 4.24 Variability of consonant cluster complexity in word-final position ...... 84
Table 4.25 Variability of consonant cluster complexity in word-medial position ...... 85
Table 4.26 Variability of word-length distribution: word tokens ...... 85
Table 4.27 Variability of word-length distribution: lexical items ...... 86
Table A1.1 Phonological properties of test-languages ...... 106
Table A1.2 Morphological properties of test-languages ...... 107


Table A3.1 Rhythm Metrics values ...... 131
Table A3.2 Phonotactic Metrics values (long V = 1) ...... 132
Table A3.3 Phonotactic Metrics values (long V = 2) ...... 133


LIST OF FIGURES

Figure 2.1 Sonority scale (Vennemann 1988) ...... 20
Figure 2.2 saltanaj sonority scale ...... 20
Figure 2.3 ALT sonority scale ...... 21
Figure 2.4 Preferred initial double clusters (Dziubalska-Kołaczyk 2001) ...... 22
Figure 2.5 Preferred medial double clusters (Dziubalska-Kołaczyk 2001) ...... 22
Figure 2.6 Preferred final double clusters (Dziubalska-Kołaczyk 2001) ...... 22
Figure 3.1 Model of the transcriber ...... 23
Figure 3.2 Model of the complexity calculator ...... 24
Figure 3.3 Example: cluster distribution at ALT level ...... 35
Figure 3.4 Example: cluster distribution at saltanajc level ...... 36
Figure 4.1 Correlation between phonotactic (%Vp) and rhythmic (%Vr) percentage of vocalic intervals ...... 40
Figure 4.2 Correlation between phonotactic (∆Cp) and rhythmic (∆Cr) standard deviation of consonantal intervals ...... 40
Figure 4.3 Correlation between phonotactic (∆Vp) and rhythmic (∆Vr) standard deviation of vocalic intervals ...... 41
Figure 4.4 Correlation between phonotactic (Varco-Cp) and rhythmic (Varco-Cr) coefficient of variation of consonantal intervals ...... 41
Figure 4.5 Correlation between phonotactic (Varco-Vp) and rhythmic (Varco-Vr) coefficient of variation of vocalic intervals ...... 42
Figure 4.6 Correlation between phonotactic (nPVI-Vp) and rhythmic (nPVI-Vr) normalized pair-wise variability index of vocalic intervals ...... 42
Figure 4.7 Phonotactic metrics graph (%Vp, ∆Cp) ...... 43
Figure 4.8 Rhythm metrics graph (%Vr, ∆Cr) ...... 44
Figure 4.9 Rhythm metrics graph (rPVI-Cr, nPVI-Vr) ...... 45
Figure 4.10 Phonotactic metrics graph (rPVI-Cp, nPVI-Vp) ...... 45
Figure 4.11 Grouping of 21 languages based on phonotactic %Vp and ∆Cp ...... 47
Figure 4.12 Linear relationship between %Vp and ∆Cp ...... 48


Figure 4.13 Grouping of 21 languages based on phonotactic metrics rPVI-Cp and nPVI-Vp ...... 48
Figure 4.14 Grouping of 21 languages based on phonotactic %Vp and Varco-Vp ...... 49
Figure 4.15 Distribution of word-final clusters based on sonority ...... 66
Figure 4.16 Distribution of word-lengths: lexical items ...... 78
Figure 4.17 Distribution of word-lengths: word tokens ...... 79
Figure 4.18 Average word-length: word tokens ...... 80
Figure 4.19 Average word-length: lexical items ...... 80
Figure 4.20 Stability: Distribution of word lengths (lexical items) ...... 87
Figure 4.21 Stability: Distribution of word-lengths (word tokens) ...... 88
Figure 5.1 An example of a characteristic prosodic sequence ...... 108
Figure A4.1 Distribution of word lengths for Bulgarian ...... 134
Figure A4.2 Distribution of word lengths for Catalan ...... 134
Figure A4.3 Distribution of word lengths for Czech ...... 135
Figure A4.4 Distribution of word lengths for Dutch ...... 135
Figure A4.5 Distribution of word lengths for Estonian ...... 136
Figure A4.6 Distribution of word lengths for German ...... 136
Figure A4.7 Distribution of word lengths for Greek ...... 137
Figure A4.8 Distribution of word lengths for Hawaiian ...... 137
Figure A4.9 Distribution of word lengths for Hungarian ...... 138
Figure A4.10 Distribution of word lengths for Indonesian ...... 138
Figure A4.11 Distribution of word lengths for Italian ...... 139
Figure A4.12 Distribution of word lengths for Japanese ...... 139
Figure A4.13 Distribution of word lengths for Maori ...... 140
Figure A4.14 Distribution of word lengths for Polish ...... 140
Figure A4.15 Distribution of word lengths for Portuguese ...... 141
Figure A4.16 Distribution of word lengths for Russian ...... 141
Figure A4.17 Distribution of word lengths for Samoan ...... 142
Figure A4.18 Distribution of word lengths for Serbian ...... 142
Figure A4.19 Distribution of word lengths for Spanish ...... 143
Figure A4.20 Distribution of word lengths for Tongan ...... 143


Figure A4.21 Distribution of word lengths for Turkish...... 144


CHAPTER 1

INTRODUCTION

The rhythm of speech has been vigorously discussed during the last century, and it continues to be a topic of research in phonetics, phonology, language acquisition, and studies of the similarity between speech and music. In the linguistics literature, research on rhythm first appeared in the description and analysis of poetry and rhythmic meter. The topic gained controversial status following Pike’s 1945 observation that English and French differ in their rhythms, the former being more Morse-code-like and the latter more machine-gun-like. Pike’s statement – however misunderstood, as he does not actually juxtapose the two languages but says that English possibly has two rhythms, one resembling French (Pike 1945) – has generally been interpreted as a dichotomy of two rhythm types, which led to an even stronger claim by Abercrombie (1967) that all spoken languages must belong to one type or the other. At that time, only a small subset of the world’s languages had been considered, so this conjecture was indeed very bold.

Since then, Pike’s and Abercrombie’s statements, later formulated as the Rhythm class hypothesis (RCH), have been studied, refuted, reformulated, and studied again. Approaches have included measuring stress intervals and syllable durations in the speech signal to test isochrony, phonological analysis of prosodic shortening and lengthening, comparisons of phonetic inventories (long vowels and diphthongs) and phonotactics (consonant clustering), and – once the isochrony formulation was abandoned – measuring variability in the physical speech signal.

1.1 Rhythm correlates

Despite the reference to ‘strong and weak’, which suggests there is more to rhythm than ‘long and short’, and despite the fact that most researchers, even if inadvertently, use the term ‘prominent’, they measure only ‘long’; in other words, most quantitative studies are restricted to measures of duration.

One repeating theme we can observe is that quantitative studies of rhythm have been based mostly on durational analysis, seeking variability or alternation of long and short beats. Only a handful of studies on speech rhythm, such as Cumming (2010) and Grabe & Low (2002), have investigated the contribution of other parameters to rhythm, even


though describing prominence often relies on pitch emphasis (sentence prominence in ToBI; Barry et al. 2009) or rate of spectral change (Kochanski et al. 2005). The dominance of the durational parameter in describing rhythm could be related to the following issues: 1) the first description in terms of ‘Morse code’ (supposedly an alternation of longs (dashes) and shorts (dots)) vs. ‘machine gun’ (supposedly a repetition of equal durations); 2) the subsequent formulation of rhythm via isochrony (thus in terms of time) of units; 3) the fact that the expression of prominence is not well defined in terms of physical parameters of the speech signal; and 4) the fact that speech rhythm itself is not always understood the same way across studies and researchers, or, as Cummins aptly put it, ‘Much like God, Tidiness, and Having a Good Time™, the concept of rhythm means many things to many people’ (Cummins 2012). The last two tasks, defining speech rhythm and understanding how the parameters figuring in that definition are expressed in the speech signal, should be taken seriously in future research.

1.2 Rhythm class hypothesis (RCH)

In one of the initial formulations of the RCH, Abercrombie (1967) claimed that the two posited rhythmic classes differ in the type of isochrony the languages exhibit: isochrony of syllables for syllable-timed languages and isochrony of inter-stress intervals for stress-timed languages. This grouping was based on the perception of salient rhythmic differences between languages such as English or Dutch, called stress-timed, and languages such as Spanish or French, called syllable-timed. Two basic proposals were made (Abercrombie 1967): (1) that in stress-timed languages stressed syllables occur at regular intervals, while in syllable-timed languages syllables occur at regular intervals, and (2) that the languages of the world belong to one or the other class. A third class, mora-timed, was later added to accommodate languages such as Japanese that were believed to differ from both of the existing types in that moras occur at regular intervals. Languages included in this group usually have phonological length distinctions.

Because attempts to find evidence of isochrony in the characteristic unit in the speech signal failed (Dauer 1983, Roach 1982), it was proposed that isochrony is a purely perceptual phenomenon (Lehiste 1977) that could not be measured from the acoustic signal. An alternative view of rhythm was put forward by Dauer (1983), who noticed phonological


similarities among the original members of the stress-timed group and the syllable-timed group respectively: languages in the stress-timed group have vowel reduction in unstressed syllables and phonotactics that allows complex syllable structure; syllable-timed languages lack vowel reduction and have simple (C)V(C) syllable structure.1

Dauer’s 1983 model of rhythm posits that rhythm emerges from phonological properties such as syllable structure and vowel reduction, as well as duration of stressed syllables, phonemic vowel length distinction, the effect of intonation on stress, the effect of tone on stress, consonantal phonetic inventory, and function of stress.2

The more of these properties a language has, the more stress-timed it is proposed to be. Languages are thus said to lie on a continuum between prototypically syllable-timed (Japanese) at one end and prototypically stress-timed (English) at the other end of the continuum. This means that various properties combine, possibly with various levels of importance, towards one resultant perception variable: rhythm. Because the listed properties do not always co-occur, in this view two languages can have different properties but be equally syllable- or stress-timed. In this model, called the ‘continuous uni-dimensional model of rhythm’ (Ramus 2002), a strict rhythm-class hypothesis does not hold. Instead, languages form a rhythm continuum.

The continued interest in the status of the RCH, specifically, the interest in determining whether different rhythmic types exist, and if so how properties of each class can be described and measured, likely comes from two postulates. First, languages are believed to be distinguishable based on rhythmic properties alone.

Results of perception experiments with infants (Nazzi et al. 1998, Nazzi and Ramus 2003) and adults (Ramus et al. 2003) reported in the literature seem to show that discrimination of languages is possible when segmental and melodic information is hidden, but only if the languages belong to different rhythm types.

1 Mora-timed languages were not discussed in Dauer 1983; based on the properties of Japanese, Yoruba, and Telugu, the mora-timed group has a syllable structure even simpler than that of the syllable-timed group, namely, mostly (C)V. The structure (C)VC1 exists but allows only a few select consonants in the C1 position.
2 Donegan (personal communication) suggests that the following are also important: tendency to diphthongize, vowel harmony, geminate consonants, vowel-length distinctions, many vs. few vowel-quality distinctions, contour vs. level tone.


Secondly, the speech-processing strategy, namely segmentation into words, used by infants during early language acquisition is suggested to depend on the rhythm class of the first language (Cutler and Norris 1988).

That a processing strategy depends on the rhythmic characteristics of the language rests on several assumptions: 1) languages differ rhythmically; 2) there is a finite number of rhythm types or classes; 3) the characteristics of the rhythm type can be inferred from speech by young infants who have had limited language exposure.

Each of the two postulates represents strong support for the existence of rhythm types, and thus justifies the search for ways to quantify the characteristics of each class. For that justification to stand, however, at least one of the two needs to remain true, that is, without counter-evidence.

The first claim, that languages can be discriminated along the separation of rhythm classes, has recently been challenged. Arvaniti and Ross (2010) fail to replicate language discrimination results for English, German, Greek, Italian, Korean, and Spanish, and suggest that their results ‘cast a doubt in the impressionistic basis of the rhythm class hypothesis.’ However, it is possible that the difficulty of the task involved (different from the original studies) contributed to the lack of consistent results.

I will not discuss the second claim in detail here, but will only note that the burden is on proving that it is exactly speech rhythm, and not some other characteristic along which the languages in the reported experiments differed, that is responsible for the emergence of the segmentation strategy. Experiments in support of this theory have so far been conducted on a limited number of languages.

1.3 Rhythm metrics

An alternative approach to quantifying speech rhythm started with a 1999 study by Ramus et al. in which rhythmic differences are seen as differences in the durational variability of vocalic and consonantal intervals. Numerous studies followed, with varying levels of success in finding empirical evidence for the existence of rhythm classes.

As mentioned in the previous section, both infants and adults discriminate among languages based on rhythmic properties, which supports the view that languages can be grouped into different types. Because of that, the idea of rhythmic classes has persisted


despite the failure to find measurable evidence of isochrony in the speech signal. Renewed interest in finding evidence for rhythmic differences has come with a shift in focus: the distinction of rhythmic classes is no longer based on isochrony, or the lack thereof, among successive units, but on a somewhat more relaxed criterion: the degree of durational variability among such units. Another difference introduced with the new approach involved a change of the unit whose variability is used to characterize rhythm. Instead of syllables and feet (or intervals between two stresses), non-phonological units such as vocalic and consonantal intervals3 were used. The change of unit was motivated by the results of studies on infant perception of rhythm. Namely, Nazzi et al. (1998) found that infants, like adults, perceive rhythmic4 differences between languages. It was shown that infants are able to distinguish speech samples of English from those of French, for instance, but not English from Dutch. Ramus et al. (1999:270) assume that ‘the infant primarily perceives speech as a succession of vowels of variable durations and intensities, alternating with periods of unanalyzed noise (i.e. consonants)’ and suggest that perceiving rhythmic differences need not be based on phonological units such as syllables and feet.

The focus of the new approach was the formulation of a two-dimensional space in which good exemplars of stress-timed and syllable-timed languages would be separated. Such spaces were most often defined by measures that in some way mirror the distinguishing phonological properties. Various measures were introduced in the hope of capturing the crucial differences between the posited rhythm classes.
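These interval-based measures are simple enough to state as code. The following is a minimal sketch in Python (the language used for this dissertation’s other tools) of the standard definitions: %V and the interval standard deviations ∆V/∆C of Ramus et al. (1999), Varco as a rate-normalized standard deviation, and the raw and normalized pair-wise variability indices of Grabe and Low (2002). The function and variable names are mine, not from any published implementation:

```python
from statistics import mean, pstdev

def rhythm_metrics(vocalic, consonantal):
    """Compute common rhythm metrics from interval durations (in seconds).

    vocalic, consonantal: durations of the vocalic and consonantal
    intervals, in the order in which they occur in the utterance.
    """
    # %V: proportion of total duration taken up by vocalic intervals
    pct_v = 100 * sum(vocalic) / (sum(vocalic) + sum(consonantal))
    delta_v = pstdev(vocalic)       # Delta-V
    delta_c = pstdev(consonantal)   # Delta-C
    # Varco-V: Delta-V normalized by the mean vocalic duration
    varco_v = 100 * delta_v / mean(vocalic)
    # rPVI: mean absolute difference between successive intervals
    rpvi_c = mean(abs(a - b) for a, b in zip(consonantal, consonantal[1:]))
    # nPVI: the same, normalized by the mean duration of each pair
    npvi_v = 100 * mean(abs(a - b) / ((a + b) / 2)
                        for a, b in zip(vocalic, vocalic[1:]))
    return {"%V": pct_v, "dV": delta_v, "dC": delta_c,
            "VarcoV": varco_v, "rPVI-C": rpvi_c, "nPVI-V": npvi_v}
```

Note that decisions left open here – population vs. sample standard deviation, the treatment of pauses, how the signal is segmented into intervals in the first place – are exactly the segmentation-dependent choices whose influence on metric values is criticized in the literature discussed below.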

The early results were encouraging in that prototypically stress-timed and syllable-timed languages were mapped into opposite corners of the space (Ramus and Mehler 1999, Ramus et al. 1999, Grabe and Low 2002). In subsequent studies, in which larger numbers of speakers per language and new languages were tested with various speech materials and speech styles, it was found that (1) the empirical results show more support for Dauer’s continuum hypothesis than for the strict rhythm class hypothesis (Grabe 2002), and (2) various factors compromise successful cross-linguistic classification. Several serious problems of the quantitative approach based on rhythm measures include the following: (1)

3 In the literature, these are called vocalic and intervocalic intervals.
4 It was posited that the differences are solely rhythm-based because the samples were filtered to eliminate segmental information.


within-language inter-speaker differences may be larger than between-class differences (Benton et al. 2007); (2) speech rate (Dellwo and Wagner 2003) and speech style (Benton et al. 2007) may affect metric values more than the posited rhythm class; (3) different metrics produce contradictory classifications: for instance, ∆V and ∆C classify Polish differently (Ramus et al. 1999); (4) different studies obtain contradictory results based on the same rhythm metric (Dellwo 2006, White and Mattys 2007); and (5) rhythm metrics depend on the segmentation rules (Stojanovic 2008).

Others (Dellwo and Wagner 2003) report that the discrimination of languages using some rhythm metrics can be explained on the basis of speech rate alone, namely that languages spoken faster are classified as syllable-timed and those spoken more slowly as stress-timed. This is consistent with the finding that speech rates vary cross-linguistically and that languages traditionally labeled as syllable-timed have higher average speech rates. Moreover, some measures vary with speech rate more than others, %V reportedly being more stable across rates than the standard deviations of vocalic and consonantal intervals.

In a subsequent study, Dellwo (2006) shows that another measure, the coefficient of variation of consonantal intervals (a normalized standard deviation), varies more for some languages than for others: in that sample, it varies more for stress-timed English (strangely, at the fastest rate it returns to the value found at slow rates) and German than for syllable-timed French.

Szakay (2008), on the other hand, reports that two varieties of the same language can be affected differently by speech rate. In her study of Maori and Pakeha speakers of English in New Zealand, she finds that Maori English is classified as syllable-timed at all speech rates in the sample (roughly equal to the rates for Pakeha English), while Pakeha English changes from stress-timed at lower speech rates to more syllable-timed at faster rates. Classification in her study is based on the pair-wise variability of vocalic intervals, not on perception. Thus, what changes is the type of variability that the pair-wise index measures, rather than, necessarily, the rhythm type. The findings of the three studies (Dellwo and Wagner 2003, Dellwo 2006, and Szakay 2008) are nevertheless important. They suggest that the timing differences are better observed at slower rates, at least for the two


varieties of English examined. Another important piece of information from these studies is that even so-called normalized rhythm metrics may vary across speech rates.

Recent proposals include another view of stress- vs. syllable-timing. Nolan and Asu (2009), for instance, propose that stress-timing and syllable-timing are independent dimensions exemplified by all languages, so that a single language can express a certain level of stress-timing and another level of syllable-timing.

There is no consensus in the current literature on whether rhythm can be measured from the acoustic signal, or whether it is different from timing (Arvaniti 2009). In fact, some (Pamies Bertrán 1999) propose that speech is not rhythmic at all. Others view rhythm as coupling between nested prosodic units (Cummins 2002).

Loukina et al. (2011) examined 15 rhythm metrics on a large speech corpus of five test languages: British English, French, Greek, Russian, and Mandarin. They found that no metric was successful overall in separating the languages, with different metrics being successful for different pairs. If three metrics were used at a time, however, it was possible to discriminate all five languages.

I will discuss several important conclusions from their study. First, they note that different metrics were necessary for discriminating different pairs of languages, suggesting that languages differ rhythmically in different ways. This observation does not seem problematic, as rhythm possibly varies along several dimensions, with different dimensions captured by different metrics.

Next, the authors conclude that their results do not support an RCH based on the three traditional classes, as all five languages were discriminated, albeit with the use of several metrics. In addition, different metrics produced different groupings. Their results could be interpreted differently, however: not as contradicting the existence of three rhythm classes, but as evidence that the RMs they used are not (all) good classifiers of rhythm, some possibly capturing it well and others not so well.

The authors further mention an important shortcoming of RMs obtained on short speech samples, namely that they differ when calculated for different texts or when the same text is


produced by different speakers. To address this issue, they used a large speech corpus and performed automatic speech segmentation into vocalic and consonantal intervals.

The issue of variation of RMs across materials for the same language (including the samples in their corpus) is an important one; however, it is worth noting that their automatic segmenter had varying success in segmenting vowels and consonants in the corpus. Namely, using their best algorithm (the one most consistent with human labelers), a large proportion (close to 90%), but not all, of voiceless obstruents were classified as consonants. In all languages but English, voiced obstruents were equally likely to be classified as vowels or as consonants (in English, 75% of voiced obstruents were classified as consonants). Sonorants were classified as vowels 77–91% of the time, and vowels were classified as vowels 88–92% of the time. Given this discrepancy in labeling the basic units, it is unlikely that RMs obtained on the automatically labeled set would reproduce the values of RMs obtained in studies that used hand-labeled materials – even if the acoustic data were exactly the same. Loukina et al. rationalize that automatic segmentation is a more consistent process across languages than manual segmentation, as no language-specific rules are used in deciding how to label a particular segment; it also captures similarities between sonorant and voiced segments across languages. However, it is also possible that the similarity between these groups of segments is a result of the algorithm’s reliance on voicing and energy, rather than of these segments actually behaving similarly with respect to rhythm. Whether they do is an interesting empirical question.

1.4 Issues present in the current literature

So, can durational patterning successfully capture rhythm? And how large is the ‘noise’ coming from the phonotactic component? My first question addresses these issues.

Next, in order to understand the effect of phonotactics better, I examine phonotactics at different positions in the syllable. This is an interesting question because onsets and codas are reported to affect the duration of nuclei in different ways. Do languages with similar patterns in syllable onsets tend to be similar rhythmically? Or is it the structure of the coda that has more effect on rhythmic similarity? To answer these questions, I look at phonotactic probability5; that is, I characterize onsets and codas by the likelihood of 0) being empty (as in ‘no’, which has no coda), 1) having one consonant (as in ‘mitt’ /mɪt/), 2) having two consonants (as in ‘pest’ /pɛst/), or 3) having three consonants (as in ‘angst’ /aŋst/).
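These positional likelihoods can be estimated directly from a phonemicized word list. The sketch below is illustrative only: the vowel set and the four sample words are placeholders, and every non-vowel symbol is treated as a consonant.

```python
# Estimate the likelihood that a word-initial (onset) or word-final (coda)
# cluster contains 0, 1, 2, ... consonants, given phonemicized words.
# Illustrative placeholders: the vowel set and the tiny word list.

from collections import Counter

VOWELS = set("aeiouɪɛ")  # placeholder vowel set

def edge_cluster_length(word, final=False):
    """Number of consecutive consonants at the start (or end) of a word."""
    n = 0
    for seg in (reversed(word) if final else word):
        if seg in VOWELS:
            break
        n += 1
    return n

def edge_length_probabilities(words, final=False):
    counts = Counter(edge_cluster_length(w, final) for w in words)
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}

words = ["no", "mɪt", "pɛst", "aŋst"]  # 'no', 'mitt', 'pest', 'angst'
print(edge_length_probabilities(words))              # onset-length likelihoods
print(edge_length_probabilities(words, final=True))  # coda-length likelihoods
```

Run over a full corpus rather than a toy list, these distributions are exactly the positional likelihoods compared across languages.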

While durational patterns in a sample are affected by cluster lengths, similarity between two languages can also be captured by the similarity of the actual phonotactic sequences that occur. This does not refer to the similarity addressed in perception experiments using modified speech, but instead to the comparison of unaltered samples, that is, normal speech. Do languages group into similar classes based on phonotactic patterns? If so, are such groupings related to the posited rhythm classes?

Another interesting question to address is the relation between cluster markedness (defined in Chapter 2) and cluster frequency in the corpora of individual languages. Typologically, less marked clusters are assumed to occur in a larger number of languages. We can check this, although we should be cautious because our language set is possibly too small and too biased towards certain language families. However, we can ask a different question, namely: do less marked clusters occur more frequently than more marked clusters in the corpus of each language? And does the presence of a more marked cluster in a language imply the presence of less marked ones?

Lastly, we are interested in similarity that might stem from word lengths. This parameter is related to durational properties, as it has been shown that in most languages the average syllable duration is reduced when the number of syllables increases; that is, ‘longer words have shorter syllables’. If word length is defined as the number of segments a word comprises, or alternatively as the number of syllables, is there a relationship between the word-length distribution and cluster length in onsets or codas? Between word length and phoneme inventory size? Between word length and rhythm type?

1.5 Questions and approaches used to solve them

In order to address these questions, I first assembled a multilingual phonetic corpus from freely available electronic texts, converting from orthography by applying rules (see Methods).

5 Phonotactic probability refers to the frequency with which phonological segments and sequences of phonological segments occur in words in a given language (Vitevitch and Luce 1999).


My first research question examines the extent to which the current ways of quantifying rhythmic differences are affected by the phonotactic properties of a language. In particular, I focus here on the length of phonotactic sequences, that is, the number of segments in vocalic and consonantal clusters. Neutralizing any differences in duration that come from a segment’s inherent duration and prosodic emphasis, I calculate a set of metrics based on the number of segments instead of real durations and compare them to the durational metrics reported in the literature. I calculate correlations between these phonotactic metrics (PMs) and durational rhythm metrics (RMs); I also examine groupings on the graphs reported in the literature as evidence of languages grouping into rhythm types. The possible outcomes are that 1) the correlation between RMs and PMs is very small, and therefore the effect of the phonotactic component on the RMs is minimal, i.e., RMs are good correlates of ‘pure rhythm’; or 2) the correlation between RMs and PMs is large, and therefore the effect of the phonotactic component on the RMs is large enough to question the use of RMs as a measure of pure rhythm.
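To illustrate what a segment-count-based metric looks like, the sketch below computes analogues of two common durational rhythm metrics, %V and ΔC (Ramus et al. 1999), over a CV-level string in which every segment is given unit duration. This is a simplified stand-in for exposition, not the dissertation’s actual metric set, which is defined in Chapter 4.

```python
# Phonotactic analogues of two durational rhythm metrics, computed over a
# CV-level string by treating every segment as having unit duration:
#   %V     -> proportion of vocalic material (here, a segment count)
#   deltaC -> standard deviation of consonantal interval lengths
# Simplified illustration; not the dissertation's full metric set.

import re
from statistics import pstdev

def percent_v(cv):
    """Percentage of segments that are vocalic."""
    return 100 * cv.count("V") / len(cv)

def delta_c(cv):
    """Variability (population st. dev.) of consonant-cluster lengths."""
    return pstdev(len(run) for run in re.findall(r"C+", cv))

sample = "CVCCVCVVCCCV"
print(percent_v(sample))  # five of twelve segments are vowels
print(delta_c(sample))    # cluster lengths 1, 2, 1, 3 vary around 1.75
```

Correlating such count-based values with their duration-based counterparts is then an ordinary correlation over the language sample.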

My second question addresses the relationship between rhythmic similarity and the length of clusters at different positions in the syllable (onset or coda). However, due to difficulties in syllabification of medial consonant clusters, I have modified the question to examine the relationship between rhythmic similarity and the length of clusters at different positions in the word (word-initial, word-medial, and word-final).

Note that in this question we address only cluster length, that is, whether a cluster consists of zero, one, two, or more segments. We can call the constraints on how many segments can occur in a cluster at each position in the word durational phonotactics, as they are defined by constraints on sequences but ignore segmental qualities; the contribution of segmental qualities to cluster durations is not taken into account in this question. We are interested in whether any particular position (initial, medial, or final) is especially important in explaining rhythmic (perception) similarity. In fact, as we compare cluster-length distributions, this question uses probabilistic durational phonotactics to determine similarity across languages.

My third question then focuses on the actual patterns in the clusters and asks to what degree similarity based on the most common cluster patterns is related to perceived similarity between languages. Here, however, we are not interested in posited rhythmic similarity alone, which is usually tested with segmental qualities hidden (filtered or re-synthesized speech samples). Segmental qualities are now taken into account; this question examines the contribution of the segmental component, or pattern phonotactics, to the perception of overall similarity between two spoken samples.

My fourth question examines the relationship between the nature of clusters and their frequencies in a particular language. Specifically, we ask whether more marked clusters are less frequent than less marked clusters in each language. The definition of cluster markedness is given in Chapter 2.

My fifth and last question is directed toward the relationship between average word length in the sample and cross-linguistic rhythmic similarity, as well as the relationship between average word length in the sample and the cluster length similarity that was examined in question 2.

1.6 Contribution of this Dissertation

This dissertation asks and answers several questions about the interplay of rhythm, phonotactics, and perception. The results will contribute to the fields of phonology and speech perception, in particular to the current discussions about quantifying speech rhythm; specifically, they bear on how the current measures need to be redefined in order to measure rhythmic similarity. The proposed model of rhythmic similarity opens up a significant amount of future work and is a strong alternative to the current geometric model based on rhythm metrics.

The discussion of phonotactic similarity is informative for the field of language typology, especially because it will be related to rhythmic similarity and basic phonological and morphological properties. The use of probabilistic phonotactics allows us to show finer differences in linguistic structure cross-linguistically. It is hoped that the tabulated phonotactic distributions for individual languages, as well as the assembled phonetic corpus, will be used as a reference in other studies.


1.7 Outline

This dissertation is organized as follows. In Chapter 2 I provide background and introduce the concepts and terms that will be used throughout this work. In Chapter 3 I describe my methods: the types of speech materials, the languages used in the study, and the construction of a corpus for 21 languages. In Chapter 4 I present my results in light of each research question posed above. Chapter 5 contains a brief overview of the results, discusses some issues encountered during this study, and situates this work in the current literature on rhythm and phonotactics.


CHAPTER 2

BACKGROUND

In this chapter, I provide background for the two areas related to the dissertation project: 1) durational variability in speech and 2) phonotactics.

2.1 Durational variability in speech

Following the view that rhythm reflects durational patterns in speech, I consider factors that affect the duration of vowels and consonants, as well as of vocalic and intervocalic intervals. In addition, I discuss the effect of speech rate on the relative durations of segments within a phrase and highlight the difference between absolute duration measured from the signal and perceived duration experienced by listeners.

Factors that affect duration

An excellent review of the literature on the various factors that affect segmental durations in English is given in Klatt 1976. In addition to listing durational factors and providing references to the experimental studies that support them, Klatt also relates these factors to perception studies of duration and to listeners’ ability to use durational differences in making linguistic decisions. Here, I adapt the factors listed by Klatt to make comparable rules for vowels and consonants. As a result of this adaptation, some of the rules (rule 4 for vowels and rule 5 for consonants) still need to be experimentally verified.

Vowels

All else being equal, vowel V1 is longer than vowel V2 if: (1) V1 is inherently longer than V2 (for instance, /ɒ/ in /dɒl/ ‘doll’ is longer than /i/ in /dil/ ‘deal’), (2) V1 is phonemically long and V2 is phonemically short (for instance, in Hawaiian, /a:/ in Mānoa /ma:noa/ is longer than /a/ in manu /manu/), (3) V1 is a diphthong and V2 is a monophthong (in English, /ai/ in my is longer than /i/ in me), (4) V1 is a single vowel and V2 is part of a hiatus (/i/ in /hi nouz/ ‘he knows’ is longer than /i/ in /hi ouz/ ‘he owes’), (5) V1 is in a word with fewer following syllables (in English, /ʌ/ in fun /fʌn/ is longer than /ʌ/ in funny /fʌni/, and the V1 of funny is longer than that of funnily /fʌnili/), (6) V1 is in a phrase-final syllable and V2 is not (/ʌ/ in sounds like fun /saunds laik ˈfʌn/ is longer than /ʌ/ in sounds like a fun movie /saunds laik ə ˈfʌn muvi/), (7) V1 is in a stressed and V2 in an unstressed syllable (the first /i/ in /mini/ ‘meany’ is longer than the /i/ in /fʌni/ ‘funny’), (8) V1 is in a word with sentence prominence and V2 is not (/ʊ/ in It’s my BOOK, not album is longer than /ʊ/ in It’s in MY book, not hers), (9) there is a language-specific rule that makes V1 longer (in English, /ɛ/ in /sɛd/ ‘said’ is longer than /ɛ/ in /sɛt/ ‘set’), and (10) V1 is produced at a slower tempo (speech rate) than V2.

Consonants

All else being equal, consonant C1 is longer than consonant C2 if: (1) C1 is inherently longer than C2 (in English, /m/ is longer than /n/ (Umeda 1977)), (2) C1 is phonemically long and C2 is phonemically short (for instance, in Italian, /kk/ in ecco /ekko/ ‘here it is’ is longer than /k/ in eco /eko/ ‘echo’), (3) C1 is a complex consonant and C2 is a simple consonant (in English, /ʧ/ in /ʧɪp/ ‘chip’ is longer than /ʃ/ in /ʃɪp/ ‘ship’), (4) C1 is a single consonant and C2 is part of a cluster (/n/ in /ben/ ‘Ben’ is longer than /n/ in /bent/ ‘bent’), (5) C1 is in a word with fewer syllables (/f/ in fun /fʌn/ is longer than /f/ in funny /fʌni/), (6) C1 is in a phrase-final syllable and C2 is not (/n/ in looks like fun /lʊks laik ˈfʌn/ is longer than /n/ in looks like a fun movie /lʊks laik ə ˈfʌn muvi/), (7) C1 is in a stressed and C2 in an unstressed syllable (the first /m/ in /mimi/ ‘Mimi’ is longer than the second /m/), (8) C1 is in a word with sentence prominence and C2 is not (/b/ in It’s my BOOK, not album is longer than /b/ in It’s in MY book, not hers), (9) there is a language-specific rule that makes C1 longer (in English, /t/ in /sɛt/ ‘set’ is longer than /d/ in /sɛd/ ‘said’), and (10) C1 is produced at a slower tempo (speech rate) than C2.

Modeling segmental duration

The factors causing the effects listed in (1–10) are, respectively: (1) intrinsic duration, (2) phonemic length, (3) complex segment quality, (4) resource-sharing, (5) word length, (6) prosodic phrasing, (7) lexical stress, (8) prosodic prominence, (9) language-specific rules, and (10) speech rate. Some of these 10 factors are universal (1, 6, 10), while others are language-specific and either do not affect duration (5, 7, 9) or are not applicable (2, 3, 4, 8) in other languages. We can also group the factors by their nature into structural, prosodic, and pragmatic. Structural factors include intrinsic duration, phonemic length, complex segment quality, resource-sharing, word length, and language-specific rules (1, 2, 3, 4, 5, 9); prosodic factors include phrasal edge-lengthening, word stress, and sentence focus (6, 7, 8); and pragmatic factors include speech rate (10).
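Such factor-based duration modeling can be sketched as a chain of multiplicative rule factors applied to the compressible part of a segment’s inherent duration, in the spirit of Klatt’s rule system; all the specific numbers below are invented for illustration.

```python
# Sketch of a rule-based segmental duration model in the spirit of Klatt
# (1976): each applicable rule multiplicatively scales the compressible
# part of a segment's inherent duration, while a minimum duration acts
# as an incompressibility floor. All numeric values are invented.

def predict_duration(inherent_ms, min_ms, factors):
    """Duration after applying multiplicative rule factors."""
    percent = 1.0
    for f in factors:
        percent *= f
    return (inherent_ms - min_ms) * percent + min_ms

PHRASE_FINAL = 1.4   # factor (6): phrase-final lengthening
POLYSYLLABIC = 0.85  # factor (5): shortening in a longer word

# /ʌ/ in phrase-final "fun" vs. in "funny" (one more syllable follows)
dur_fun = predict_duration(130, 55, [PHRASE_FINAL])
dur_funny = predict_duration(130, 55, [POLYSYLLABIC])
print(round(dur_fun), round(dur_funny))
```

The floor term captures the observation that segments resist compression below a minimum duration, however many shortening rules apply.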

Next, I consider intervals that consist of more than one vocalic or consonantal phone. In addition to factors (1–10), which affect individual phones, the duration of an interval will be greater if its size, measured in number of phones, is larger. Thus /stɹ/ in /stɹɒŋ/ ‘strong’ is longer than /ɹ/ in /ɹɒŋ/ ‘wrong’, and /i i/ in /hi ˈits/ ‘he eats’ is longer than /i/ in /hi ˈsɪts/ ‘he sits’. Typological effects based on syllable structure are briefly discussed below.

Definition of intervals

VOCALIC INTERVALS.6 Vocalic intervals consist of more than one vowel only when a (C)V syllable is followed by a V(C) syllable, as in he is /hi ˈɪz/ or naïve /na ˈiv/, i.e., where hiatus occurs. Only languages that allow both (C)V (no coda) and V(C) (no onset) syllable types will have hiatuses. In Levelt and van de Vijver 2004, twelve syllable-type inventories are proposed, out of which seven types may have hiatuses. However, some languages have methods to avoid hiatus, through elision (deleting one of the vowels), synaloepha (merging consecutive vowels), or consonant insertion.

Ultimately, the average number of vowels in a vocalic interval depends on the distribution of hiatuses in a given sample. No-coda languages that allow a simple V syllable type, such as Cayuvava, Mazateko (listed in Levelt and van de Vijver 2004), and Hawaiian, are more likely to have higher variability of vocalic intervals, because the maximum size of the vocalic interval and the frequency of hiatuses are higher than in other languages. Languages that have both (C)V and V(C) syllable types but also have one or more closed-syllable types, such as Spanish, Finnish, or English, are likely to have hiatuses of much smaller frequency and size; vocalic interval size is not a significant cause of durational variability in these languages. Finally, in languages in which the obligatory onset principle applies, durational variability will depend only on the factors affecting single vowels.

6 More precisely, a vocalic interval consists of one or more syllable nuclei. For instance, syllabic /r/ in Serbian can be part of a vocalic interval.


INTERVOCALIC INTERVALS. The number of consonants in an intervocalic interval is a function of syllable structure, i.e., of the occurrence of consonant clusters in onsets or codas, and of combinations that occur across word boundaries (he steals /hi stilz/ as well as his team /hɪz tim/). The first factor is determined by the syllable types in a given language, while the second also depends on the way syllables combine to form words and phrases.

In Levelt and van de Vijver 2004, only two of the twelve types do not allow consonant clusters; these are languages like Hua and Cayuvava, which allow only open syllables with no complex onsets. Hawaiian also falls in this group. The other types will show the effect of consonant cluster size on the durational variability of intervocalic intervals. The rule of thumb is that the average size of the consonant cluster will be higher in a language with a higher number of closed-syllable types, and the prediction is that, for example, English and Dutch will have higher durational variability of intervocalic intervals than Spanish.

Clusters and intervals

The term consonant cluster will be used in this dissertation interchangeably with consonantal interval. The emphasis will be on clusters that do not cross word boundaries; in particular, we will consider clusters at the word-initial, word-medial, and word-final positions. They will not be discussed in relation to their position in a syllable.

A note on the definition of cluster: while it is sometimes pointed out that a cluster, or a collection, needs at least two elements (Vennemann 2012), we will use a more general definition, one which allows a cluster of zero elements or one element. This will greatly simplify the characterization of consonant clusters at different positions of the word.

We will call a cluster with no elements a ‘zero-cluster’ and a cluster consisting of one sound/phoneme a ‘one-cluster’. Clusters of n elements will be called n-clusters. For instance, the word ‘end’ has a zero-cluster at the word-initial (onset) position and a 2-cluster at the word-final (coda) position.
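Under this definition, extracting clusters by word position amounts to splitting a phonemicized word on its vowel runs; a minimal sketch, where the vowel set is an illustrative placeholder:

```python
# Split a phonemicized word into consonant clusters by position:
# word-initial, word-medial (between vowels), and word-final.
# Zero-clusters (empty positions) are kept, per the definition above.

import re

VOWELS = "aeiou"  # placeholder vowel set for illustration

def clusters_by_position(word):
    # re.split keeps the (possibly empty) stretches before the first
    # vowel and after the last one, which are exactly the edge clusters.
    runs = re.split(f"[{VOWELS}]+", word)
    return {"initial": runs[0], "medial": runs[1:-1], "final": runs[-1]}

print(clusters_by_position("end"))
# zero-cluster onset, no medial clusters, 2-cluster coda
```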

Long vowels will be considered 1-clusters (single vowels) but their phonotactic duration will have value 1 in one analysis and value 2 in a different analysis. In this way, we distinguish long vowels from hiatus sequences of two identical vowels. Short diphthongs are treated as short vowels and heavy/long diphthongs as long vowels.


Re-synthesized and filtered speech and the perception of rhythmic similarity

When judging the similarity of two speech samples, a listener can rely on a large number of cues, such as segmental frequencies, inferred phonological structure (phonotactic patterns), frequent grammatical words, and prosodic contour, to name a few. For that reason, in perception experiments that seek to establish the rhythmic similarity or difference of two speech samples, modified speech is used in which segmental qualities are masked. Researchers who believe that the melodic component of prosody does not strictly fall under rhythm mask the melodic information as well.

Examples of modified speech include filtered speech and resynthesized speech.

Filtered speech is obtained by removing all frequencies above a certain cutoff frequency Ffilter from the speech signal. This cutoff is usually 400 Hz or 500 Hz. It is assumed that information about segmental qualities is not discernible in filtered samples, and thus judging similarity should not rely on segmental identities, their frequencies, or their phonotactic grouping.

Two issues arise in relation to this assumption. The first relates to the fact that a single Ffilter does not block information on segmental qualities equally in all samples. Namely, depending on the fundamental frequency of the speaker’s voice, more or less information is present in the [0, Ffilter] frequency band.7 Deep voices, those corresponding to low fundamental frequencies and commonly occurring in male speakers, have more information present in [0, Ffilter] than higher-pitched voices, that is, those corresponding to high fundamental frequencies and commonly occurring in female speakers and children. Thus, depending on the speaker, some segmental information can remain in the filtered sample.

Another issue arises with respect to phonotactic information. Namely, even when segmental qualities are sufficiently masked, some phonotactic information – more precisely, its durational component – is still available after filtering. This occurs because filtering removes the energy contained above Ffilter, say 400 Hz; this procedure turns the parts of the speech signal that correspond to consonants into silences, while the parts that correspond to vowels still contain energy. This alternation of tones and silences is assumed to characterize rhythm.8 Thus, the durational variability of silences will be different for languages that allow only one consonant between vowels (such as Maori or Hawaiian) and languages that allow large variation in the number of consonants that can occur between two vowels (such as English or Russian).

7 A frequency band is a frequency interval.
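The durational information that survives filtering can be modeled symbolically: consonantal stretches surface as silences and vocalic stretches as tones, with adjacent segments of the same broad class merging into one interval. In the sketch below, the segment labels and millisecond durations are invented, and all consonants (including sonorants) are mapped to silence, following the simplifying assumption described above.

```python
# Model of what low-pass filtering leaves behind: consonantal intervals
# surface as silences, vocalic intervals as tones; adjacent segments of
# the same broad class merge into one interval.
# Simplification from the text above: every consonant becomes silence.
# The (label, duration-in-ms) sequences are invented for illustration.

def tone_silence_pattern(segments, vowels=set("aeiou")):
    pattern = []  # list of ('tone' | 'silence', duration) pairs
    for seg, dur in segments:
        kind = "tone" if seg in vowels else "silence"
        if pattern and pattern[-1][0] == kind:
            pattern[-1] = (kind, pattern[-1][1] + dur)  # merge same-class runs
        else:
            pattern.append((kind, dur))
    return pattern

# 'strong' vs. 'wrong': the initial cluster yields one long silence
strong = [("s", 110), ("t", 60), ("r", 70), ("o", 140), ("n", 80)]
wrong = [("r", 70), ("o", 140), ("n", 80)]
print(tone_silence_pattern(strong))  # initial 240 ms silence
print(tone_silence_pattern(wrong))   # initial 70 ms silence
```

Comparing the resulting silence-duration distributions across languages is exactly where the cross-linguistic difference described above would show up.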

Finally, regarding what kind of information is present in the filtered sample, I address the question of how filtering changes spectral information, because some experiments point to spectral balance, or intensity-level increments in the higher frequency bands, as a correlate of prominence (Sluijter et al. 1997). Since the energy of different vowel qualities will occupy different frequency bands, filtering may change the relative prominence of vowels: those with more energy above Ffilter will undergo greater prominence reduction compared to vowels with less energy above Ffilter. Thus, information related to energy in the filtered sample is reduced compared to the information present in the original sample.

Similar issues arise in relation to re-synthesized speech, in which every consonant is replaced by one representative consonant, usually /s/, and every vowel is replaced by a representative vowel, usually /a/. The durations of individual segments, or of consonant and vowel sequences as a whole, are preserved. This re-synthesized version is known as ‘sasa’ speech.

We can see that the /s/ portions of ‘sasa’ speech correspond to the silence portions of filtered speech, and are affected by the durational component of phonotactics of a given language. In re-synthesized speech, however, unlike in filtered speech, no segmental quality information corresponding to unmodified speech is present.

There is another reason that filtered speech is considered important in perception experiments. Ramus et al. (1999) claim that newborns perceive speech as an alternation of tones, which correspond to vowels, and unanalyzed noise, which corresponds to consonants, the reason being that they do not yet recognize segmental qualities. They claim that before birth, only prosodic information can be learned by the fetus, as speech in utero is heard as a low-pass filtered variant preserving only rhythmic and melodic information.

8 To my knowledge, this assumption has not been carefully researched in relation to speech rhythm. Whether rhythmic patterns correspond to the durational alternation of tones only, or also include information on the durational alternation of silences, needs to be more thoroughly examined in psycho-acoustic experiments.


As a result, in their studies consonantal intervals are treated as units standing in opposition to vocalic intervals, and consequently the measures (rhythm metrics) that they use are defined over vocalic and consonantal intervals as units.

The assumption that newborns do not have any knowledge of segmental qualities is challenged by research reported in Loukina et al. (2011), citing work done by Gerhardt et al. (1990). In that study, the authors ‘measured the intrauterine acoustic environment of fetal sheep. They found that high frequencies are somewhat attenuated, but with only a single-pole filter. As a result, enough high frequency information remains so a fetus could potentially discriminate among the consonants or among the vowels.’ Newborns may therefore be able to use segmental information in perception experiments which use unmodified speech, and the results of such experiments should be taken with caution: what seems like rhythmic or durational similarity between two languages may in fact be based partially on segmental information. However, results of experiments with newborns which use filtered speech can still be taken as support for similarity that does not depend on segmental qualities.

‘sasa’ and ‘saltanaj’ speech

In addition to the ‘sasa’ type of re-synthesized speech, researchers have used another type of modified speech, known as ‘saltanaj’ speech. As opposed to ‘sasa’ speech, in which every consonant is replaced (re-synthesized) as /s/ with the duration of the original consonant, ‘saltanaj’ speech distinguishes consonants by manner of articulation. In ‘saltanaj’ samples all stops are replaced by /t/, fricatives by /s/, nasals by /n/, liquids by /l/, and glides by /j/.9 In such samples, melodic, rhythmic, and broad phonotactic cues are available, while lexical, syntactic, phonetic, and some phonotactic information is removed. The term and the type of stimuli were first reported in Ramus and Mehler 1999. The representatives were chosen as the ‘most universal in their respective categories’, in other words, the most unmarked segments.

In this dissertation, I use a modified ‘saltanaj’ form to judge broad phonotactic similarity between languages. The form I use is different in that melodic and durational information are not considered. That is, only broad phonotactic patterns are observed, in particular in consonantal clusters. Another difference is the addition of /c/ for affricates. Although some researchers analyze affricates as sequences of a stop and a fricative, they can also be understood as single segments. In this dissertation, I consider them single segments in order to compare phonotactic patterns to the preferred phonotactic patterns predicted in Dziubalska-Kołaczyk 2001.

9 Affricates were not described in the original paper; they were possibly analyzed as a sequence of a stop and a fricative.

Because I analyze similarity based on broad phonotactic patterns, including their frequencies in the text, I believe that my results can be informative in interpreting the results of experiments using ‘saltanaj’ speech samples.
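As one illustration of how pattern-based similarity between languages could be quantified, the sketch below compares cluster-pattern frequency profiles at the saltanaj level using cosine similarity. Both the measure and the toy counts are my own illustrative assumptions, not the similarity definition used later in this dissertation.

```python
# Compare two languages by their broad phonotactic patterns: collect
# cluster-pattern frequencies at the saltanaj level and compute cosine
# similarity between the two frequency profiles.
# The measure and the toy counts are illustrative assumptions.

from math import sqrt

def cosine_similarity(freq_a, freq_b):
    keys = set(freq_a) | set(freq_b)
    dot = sum(freq_a.get(k, 0) * freq_b.get(k, 0) for k in keys)
    norm_a = sqrt(sum(v * v for v in freq_a.values()))
    norm_b = sqrt(sum(v * v for v in freq_b.values()))
    return dot / (norm_a * norm_b)

# Toy word-initial cluster counts at the saltanaj level
lang_a = {"t": 50, "st": 20, "tl": 5}
lang_b = {"t": 55, "st": 15, "nj": 10}
lang_c = {"t": 60}

print(cosine_similarity(lang_a, lang_b))  # profiles largely overlap
print(cosine_similarity(lang_a, lang_c))  # less overlap, lower similarity
```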

2.2 Phonotactics: Sonority scale and markedness of consonant clusters

Phonotactics defines which sequences of segments can comprise words. Phonotactic constraints separate sequences into permissible and non-permissible ones, and they are language-specific. Traditionally, these constraints are formulated based on the sonorities of the segments in the sequence, or more precisely, based on a function of the sonorities (sonority distances) of neighboring segments.

There are several versions of the sonority scale. The one represented in Figure 2.1 is from Vennemann 1988.

low V, mid V, high V, central liquids (r-sounds), lateral liquids (l-sounds), nasals, voiced fricatives, voiceless fricatives, voiced plosives, voiceless plosives

Figure 2.1 Sonority scale (Vennemann 1988)

Finer-grained scales possibly differentiate vowels by frontness, and obstruents and nasals by place of articulation. Less fine-grained scales combine certain groups of sounds. For instance, a possible scale is shown in Figure 2.2:

vowels > glides > liquids > nasals > fricatives > plosives

Figure 2.2 saltanaj sonority scale


This scale corresponds to the saltanaj level of representation (Ramus & Mehler 1999) discussed in the previous section.

Another, more general scale (let us call it the ALT scale) defines only three categories: vowels (A), sonorants (L), and obstruents (T).

vowels > sonorants (glides, liquids, nasals) > obstruents (fricatives, plosives)

Figure 2.3 ALT sonority scale

Sonority is a relative measure: segments in sonority scales are ordered from most to least sonorous, but the exact sonority values assigned to individual segments, or to classes of segments, are somewhat arbitrary; they usually rank segments in increments of 1. Different sonority values are sometimes assigned to classes of segments in order to explain cross-linguistic phonotactic differences.

Phonotactic constraints with respect to syllable structure are defined by the Sonority Sequencing Principle (SSP) (Sievers 1881, Jespersen 1904). SSP states that more sonorous elements should be positioned closer to the nucleus, which forms a sonority peak.

Phonotactic constraints on consonants can be defined with respect to the syllable, as in the SSP; they can also be defined without reference to syllables, by referring to the order of consonants in a cluster at different positions in a word: word-initial, word-medial, and word-final.

One such principle, the Optimal Sonority Distance Principle (OSDP), is presented in Dziubalska-Kołaczyk 2001. Segments are ordered on a sonority scale that can be described as an augmented saltanaj scale, one in which affricates are assigned a sonority between fricatives and plosives. Let us call it the saltanajc scale. The values in this scale are inversely proportional to sonority and range from 0, which is assigned to vowels, to 6, which is assigned to plosives. These values thus represent consonantal strength.

Furthermore, phonotactic constraints on consonant clusters in each word position are defined based on the sonority distances between the comprising elements and the distance between the vowel (nucleus) and the consonant closest to it. These constraints partition the set of all possible consonant clusters at the saltanajc level into preferred initials, preferred medials, and preferred finals, with some overlap allowed between medials and finals.
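The strength values can be tabulated directly from this description. The sketch below computes only pairwise sonority distances, the building block of the OSDP; the full preference formula is given in Dziubalska-Kołaczyk 2001 and is not reproduced here.

```python
# Consonantal strength on the saltanajc scale: values run from 0 (vowels)
# to 6 (plosives), inversely proportional to sonority, with affricates
# placed between fricatives and plosives.
# Only the distance computation is shown; the OSDP preference formula
# itself is in Dziubalska-Kołaczyk 2001.

STRENGTH = {"a": 0, "j": 1, "l": 2, "n": 3, "s": 4, "c": 5, "t": 6}

def sonority_distance(x, y):
    return abs(STRENGTH[x] - STRENGTH[y])

def cluster_distances(cluster):
    """Distances between neighboring members of a cluster."""
    return [sonority_distance(a, b) for a, b in zip(cluster, cluster[1:])]

print(cluster_distances("tj"))  # large distance: a highly preferred initial
print(cluster_distances("tn"))  # smaller distance: a less preferred initial
```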

Here, I list the preferred double clusters at word-initial, word-medial, and word-final position, adapted from Dziubalska-Kołaczyk 2001. I modify the lists slightly compared to the original work in order to represent all clusters at the saltanajc level.

tj > cj > sj=tl > nj=cl > lj=sl=tn

Figure 2.4. Preferred initial double clusters (Dziubalska-Kołaczyk 2001)

tt > cc=tc > ss=cs=ct=ts > nn=sn=sc=cn > ll=nl=ns=st > jj=ln=nc > jl=ls=nt (the last three clusters are preferable in final position as well)

Figure 2.5. Preferred medial double clusters (Dziubalska-Kołaczyk 2001)

jt > jc > js=lt > jn=lc > jl=ls=nt

Figure 2.6. Preferred final double clusters (Dziubalska-Kołaczyk 2001)

Stating constraints as preferences, i.e., in terms of preferred clusters at each position, signals that constraints are sometimes violated. This is well recognized in phonological theory.

To investigate differences in consonant cluster patterns in questions 3 and 4, I will adopt the notion of preferred clusters and the constraints that define them. I will also use the term ‘more marked cluster’ to mean ‘less preferred cluster’ in each position. The notion of cluster markedness will be used to capture cross-linguistic patterns and to test typological tendencies. Namely, I will verify whether more marked (that is, less preferred) clusters are rarer across languages and less frequent within each language.


CHAPTER 3

METHODS

In this chapter, I describe the process of creating the phonemic corpora from the written materials for the 21 sample languages, as well as the phonotactic calculator for the measures used to answer my research questions.

3.1 Model of the transcriber & the complexity calculator

The model of data collection and transformation is presented in Figure 3.1. We start from texts available in electronic form and, after pre-processing, apply two kinds of rules. First, we apply grapheme-to-phoneme rules, which transform the written materials into a broad phonemic form. Next, some language-specific phonological rules are applied to obtain forms closer to spoken speech.

Texts available in electronic form → eliminate punctuation marks and numbers → grapheme-to-phoneme rules → phonological rules

Figure 3.1. Model of the transcriber

At the end of the process shown in Figure 3.1, we have materials at the ‘phoneme level’, that is, words consist of IPA phones and phrases consist of such words. Phrases are assigned based on punctuation marks, as an approximate way to obtain phonological phrases.

Following that, the data is converted into three representational levels:

‘CV’ level, in which all consonantal phonemes are replaced by ‘C’ and all vocalic phonemes, including diphthongs and long vowels, are replaced by ‘V’;

‘ALT’ level, or ‘obstruent–sonorant–vowel’ level, in which sonorant consonant phonemes are replaced by ‘L’, obstruent consonant phonemes by ‘T’, and vowels by ‘A’; and

‘saltanajc’ level, or broad sonority level, in which each phoneme is replaced by the corresponding sonority class representative: ‘t’ for stops, ‘s’ for fricatives, ‘c’ for affricates, ‘n’ for nasals, ‘l’ for liquids, ‘j’ for semi-vowels, and ‘a’ for vowels. This particular sonority scale classifies consonants according to their manner of articulation.
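The three conversions can be sketched as successive table lookups. The phone table below covers only the phones used in the example; it is an illustrative fragment, not the full language-specific mapping used for the corpus.

```python
# Sketch of the representation converter: map an IPA-level string to the
# saltanajc, ALT, and CV levels described above. The phone table is an
# illustrative fragment covering only the phones in the example.

# phone -> saltanajc class representative (t s c n l j a)
SALTANAJC = {
    "p": "t", "t": "t", "k": "t", "b": "t", "d": "t", "g": "t",  # stops
    "f": "s", "s": "s", "z": "s", "ʃ": "s",                      # fricatives
    "ʧ": "c", "ʤ": "c",                                          # affricates
    "m": "n", "n": "n", "ŋ": "n",                                # nasals
    "l": "l", "r": "l",                                          # liquids
    "j": "j", "w": "j",                                          # semi-vowels
    "a": "a", "e": "a", "i": "a", "o": "a", "u": "a",            # vowels
}
TO_ALT = {"t": "T", "s": "T", "c": "T",   # obstruents
          "n": "L", "l": "L", "j": "L",   # sonorants
          "a": "A"}                       # vowels
TO_CV = {"A": "V", "L": "C", "T": "C"}

def convert(ipa):
    salt = "".join(SALTANAJC[p] for p in ipa)
    alt = "".join(TO_ALT[c] for c in salt)
    cv = "".join(TO_CV[c] for c in alt)
    return {"saltanajc": salt, "ALT": alt, "CV": cv}

print(convert("strand"))
```

Each coarser level is derived from the finer one, so the three representations stay mutually consistent by construction.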

This process is accomplished by the ‘Representation converter’, as the first step in the complexity calculator, shown in Figure 3.2.

[Figure: IPA input → Representation converter (producing CCVVC/CVC, TLAAL/TAT, tlaan/sat) → Calculate various measures → Save transcribed materials and results]

Figure 3.2. Model of the complexity calculator

Data in an appropriate level of representation are passed to the next block in which various measures are calculated. Measures used for answering different research questions are defined in the corresponding chapters. These include rhythm metrics, phonotactic metrics (defined in Chapter 4), distributions of consonant cluster lengths, phonotactic patterns, and word-length distributions.

At the end, the resulting metrics are saved in tables as text files; the raw texts, as well as the transformed texts in the three representation levels, are saved as text files in Unicode UTF-8 format (http://www.utf-8.com/).

3.2 Raw data assembly

Languages

An important criterion in the choice of languages was that they could be easily phonemicized from texts, or alternatively have large corpora of transcribed speech. The latter is available in only a few cases, English and French, for instance. For consistency, I opted for the first criterion, that is, creating transcribed speech from the available texts. So, for a language to be useful here, its grapheme-to-phoneme mapping should be simple, or reasonably easily obtainable with the help of phonological transcription rules.

Another criterion was that languages represent several language groups, but at the same time have at least two, and possibly more languages from the same group, so that comparisons based on genetic closeness or distance could be made.

In addition, the languages had to represent all of the putative rhythm classes and preferably have multiple languages in each group. Note that for some languages rhythm type has not been previously assigned in the literature, or a consensus does not exist regarding the type.

Finally, I tried to include languages with different word order, and with various types of syllable complexity and size of phonemic inventory. Some of these criteria are typological correlates, so the criteria might occur in clusters.

English and French were not included because of seemingly complex grapheme-to-phoneme correspondence rules.10

Based on the mentioned criteria, I selected 21 languages from the following language groups for analysis: Slavic (Bulgarian, Czech, Polish, Russian, and Serbian), Germanic (Dutch and German), Romance (Catalan, Italian, Brazilian Portuguese, and Spanish), Uralic (Estonian and Hungarian), Polynesian (Hawaiian, Maori, Samoan, and Tongan), Other11 (Japanese, Turkish, Greek, and Indonesian).

Slavic and Germanic languages are traditionally considered stress-timed, Romance languages syllable-timed, and Japanese a mora-timed language. I will keep these names for the three posited rhythm types because readers are likely familiar with them; this will make the discussion easier.

10 As a work in progress, I am trying to address this issue by obtaining already transcribed samples from the spoken databases for these two languages. 11 Naturally, ‘Other’ does not refer to historical language grouping, but only indicates a subset of languages that are single representatives of their language groups: Greek is an Indo-European isolate; Indonesian belongs to the Austronesian group but is not Polynesian; Turkish belongs to Altaic, and Japanese to Japonic language group.


Due to lack of data on perceptual similarity of Polynesian languages to languages in the three rhythm groups, say, to Dutch, Spanish, and Japanese, I assigned their type based on phonological criteria; they have simple syllable structure and phonemic vowel length, so they were assigned to the mora-timed group. Similarly, Turkish, Greek, and Indonesian are assigned to the syllable-timed group. Lastly, Estonian and Hungarian were assigned to the syllable-timed group, although they have both fixed stress (and could be stress-timed) and phonemic vowel length (and could be mora-timed). However, their heavy coda structure and the lack of vowel reduction suggested that they should not be in either the stress-timed or the mora-timed group.

A table summarizing the basic phonological and morphological properties of these languages is provided in Appendix 1. The data were taken from the World Atlas of Language Structures (WALS; Dryer and Haspelmath 2011, online); where information was not available from WALS, it was filled in based on my personal knowledge of the language or on information from grammar books of the language.

Written materials

Ideally, we would like to perform phonotactic analysis of spoken samples. However, because long samples are needed in order to get stable phonotactic frequencies, and because the manual transcription even for a single language would be prohibitively expensive, I decided to start from written texts that can be performed as told stories – fairy-tales and stories from the Bible – and then automatically transcribe them using grapheme-to-phoneme and phonological rules for each language. I analyze several texts for each language in order to test the stability of phonotactic frequencies.

An advantage of the chosen types of texts is that they are less likely to include borrowed words with foreign phonologies and phonotactics. News articles were avoided for this reason, despite their abundant availability.

3.3 Creating phonemic corpora

Creating a phonemic representation from a written text is called a ‘grapheme-to-phoneme’ procedure. Text-to-speech systems necessarily use conversion of written text to sequences of phonemes. Such processes are also a useful basis for making linguistic inquiries about phonetic and phonological phenomena for which a large amount of data is needed (or preferred) and where the alternative of recording and then transcribing proves to be too time- and effort-consuming… or simply impossible.12

Using this sort of corpus generation is a natural choice for phonotactic questions: intonational and durational information are not required but very long texts/materials are needed in order to calculate type frequencies.

3.3.1 Choice of grapheme-to-phoneme method

There are three main types of grapheme-to-phoneme algorithms. Brief characteristics are summarized here from Bisani and Ney (2008).

The first approach is lookup. It assumes that a dictionary of pairs consisting of a written word and its pronunciations (that is, IPA phonemic representation) is available. To produce such a dictionary is costly and tedious.

The second approach is rule-based conversion. It often incorporates a dictionary or an exception list for the cases where regular rules do not produce correct output. Its drawbacks, according to Bisani and Ney, are that producing rules for a given language can be hard (the interdependence between rules may be complex) and requires specific linguistic skills. Moreover, no matter how carefully designed, such a system is likely to make mistakes with exceptional words (those that use novel spelling, or are recent borrowings with phonological shapes not fitting the host language).

An alternative to dictionary lookup and rule-based conversion, which are two types of knowledge-based methods, is a data driven approach. This approach is based on the principle that ‘given enough examples it should be possible to predict the pronunciation of unseen words purely by analogy’ (Bisani and Ney p.4). According to Bisani and Ney, the benefit of this approach is that it replaces a challenging task (rule making) with an easy one – implementing a learning algorithm and letting the correspondences be found during training.

12 How long does it take to transcribe 20 languages times 30,000 words?


Given the available resources and the nature of the questions asked in this dissertation, I decided to construct rule-based conversion modules, one for each language. Rule-based algorithms allow for application of different levels of phonetic details – broad or narrow linguistic transcription – just by regulating which rules to utilize. Also, they allow us to implement a different dialect by making only minor changes to the rules.

As pointed out by Garcia and Gonzalez (2012) who constructed a similar, albeit somewhat more complex, module for Galician, rule-based approaches work very well for languages with transparent spelling. For other languages, however, morphophonological information is needed in order to achieve correct transcription without long exception lists.

The languages I chose include those with transparent to medium-difficulty writing systems. I was able to transcribe the materials sufficiently well up to the broad sonority, or ‘saltanajc’, level. For languages with the most transparent spelling, the phoneme level was also transcribed reasonably well.

To keep the process reasonably simple, I did not include dictionaries or exception lists in the grapheme-to-phoneme modules. This limited the success of transcription because part-of-speech information and the position of accent in the word (where not fixed) were not available. For instance, I was not able to transcribe the positions of reduced vowels in Dutch with very high accuracy. In those cases, the goal was reduced to limited transcription accuracy: 1) a correct CV sequence, and 2) where possible, vowel qualities (or a class, say {e, ε}), and for consonants minimally the manner of articulation, with place of articulation and voicing where possible.

3.3.2 Implementation of grapheme-to-phoneme method

The transcriber is programmed in Python (http://www.python.org/). Materials are presented in Unicode format and transformed using grapheme-to-phoneme correspondences, correspondences in specific phonetic environments, as well as some phonological rules, such as vowel reduction, vowel lengthening, and place assimilations. Lists of rules used for transforming text to phonemes are listed for each language in Appendix 2.
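The rule application itself can be sketched as an ordered list of regular-expression substitutions, with context-sensitive rules (expressed via regex environments) ordered before plain correspondences. The rules below are hypothetical examples in the spirit of Appendix 2, not the actual tables:

```python
import re

# Hypothetical ordered rule table: (pattern, replacement) pairs.
# Context-sensitive rules and digraphs must precede single-letter rules.
RULES = [
    (r"n(?=[kg])", "ŋ"),   # place assimilation before velars
    (r"ch", "x"),          # digraph handled before bare 'c'
    (r"c", "ts"),
    (r"š", "ʃ"),
]

def transcribe(word, rules=RULES):
    """Apply ordered grapheme-to-phoneme rules to one orthographic word."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word
```

Because substitutions are applied in sequence, rule ordering encodes the interdependencies that Bisani and Ney note can make rule systems hard to design.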

Several issues arise in this process. One is the assignment of diphthongs. Namely, where two vowel phonemes occur next to each other, they can represent a diphthong or a hiatus – a sequence of two vowels in different syllables. Since a precise assignment depends on the particular words involved, and the transcribing method did not include access to a dictionary of each language with word definitions, diphthongs were overcounted. That is, wherever a sequence of two vowels could be interpreted as a diphthong, it was counted as such rather than as a hiatus. Hiatus was only assigned in languages that do not have diphthongs, or where the sequence cannot be interpreted as a diphthong in that language. As a result, the percentage of vowels in the text was undercounted in these cases, since one vowel (a diphthong) was counted instead of two.

Most consonants were precisely transcribed, except for the voicing feature. However, although most vowels were assigned correct qualities, in some cases it was difficult to distinguish certain pairs, such as /o/ and /ɔ/ in some languages.

Finally, since the stress location was not known for languages where stress location in words is variable, not all vowel reductions could be implemented correctly. This should not present a problem for the type of questions we are asking.

3.4 The complexity calculator

The complexity calculator produces all the measures required to answer my research questions. Here I summarize the definitions of these measures in the order of the questions that require them. They include phonotactic and rhythm metrics, cluster length distributions, cluster pattern distributions, and word length distributions.

3.4.1 Phonotactic metrics and rhythm metrics

To investigate the effect of phonotactics on the rhythm metrics, we define Phonotactic Metrics (PMs). Each of the PMs has an analogous durational measure among the Rhythm Metrics (RMs) used in the literature. PMs are defined on phonotactic durations rather than on temporal durations.

The phonotactic duration of any segment is defined to be 1; thus the phonotactic duration of the consonantal interval /str/ is 3, and that of the vocalic interval /a/ is 1. In other words, the phonotactic duration of a vocalic or a consonantal interval is equal to the number of segments it comprises.13

Let us now formally define all the metrics we will use. Definitions of RMs are known in the literature and they are repeated here for the readers’ convenience. PMs are introduced here.

Let a speech fragment be segmented into vocalic intervals {Vi, i=1,2,…,Nv}, and consonantal intervals {Cj, j=1,2,…,Nc}, where Nv denotes the number of vocalic intervals and Nc the number of consonantal intervals. Let d(x) denote the temporal duration of interval x in ms, and L(x) its phonotactic duration, or length. Let us subscript phonotactic metrics with the index ‘p’ and durational or rhythm metrics with the index ‘r’.

Next, we present rhythm metrics (RMs) and phonotactic metrics (PMs) side by side. All the rhythm metrics have been extensively used in the literature (Ramus et al. 1999, Grabe and Low 2002, Dellwo 2006, among others).

PERCENT OF SPEECH OCCUPIED BY VOWELS (%V)

This metric evaluates the proportions of vocalic and consonantal material in a speech sample. Let us denote the PM %V by %Vp, and the corresponding RM by %Vr.

Then,

$$\%V_r = 100 \cdot \frac{\sum_{i=1}^{N_V} d(V_i)}{\sum_{i=1}^{N_V} d(V_i) + \sum_{j=1}^{N_C} d(C_j)} \qquad \%V_p = 100 \cdot \frac{\sum_{i=1}^{N_V} L(V_i)}{\sum_{i=1}^{N_V} L(V_i) + \sum_{j=1}^{N_C} L(C_j)}$$

These can be transformed into a form in which they are functions of the average durations of consonants, $\bar{d}_C$, and vowels, $\bar{d}_V$, in the sample ($\bar{L}_C$ and $\bar{L}_V$ for the average phonotactic durations).

Note that the sum of durations over all intervals is equal to the product of the number of intervals and the average interval duration. This holds for both vowels and consonants. It

13 A special case was considered in some analyses in section 4.1 where long vowels were assigned a phonotactic duration equal to 2.


is also true that the same sum for vowels (or consonants) can be expressed as a product of the number of all vowels (or consonants) and the average duration of a vowel (or a consonant) in the speech fragment. We then have:

$$\%V_r = 100 \cdot \frac{1}{1 + \frac{N_C\,\bar{d}_C}{N_V\,\bar{d}_V}} = 100 \cdot \frac{1}{1 + \frac{nc\,\bar{d}_{C1}}{nv\,\bar{d}_{V1}}} \qquad \%V_p = 100 \cdot \frac{1}{1 + \frac{N_C\,\bar{L}_C}{N_V\,\bar{L}_V}} = 100 \cdot \frac{1}{1 + \frac{nc}{nv}}$$

where $nc$ stands for the number of consonants and $nv$ for the number of vowels in the sample, and $\bar{d}_{C1}$ and $\bar{d}_{V1}$ for the average durations of a single consonant and a single vowel, respectively.

We see from the last pair of formulae that the rhythmic %Vr will change with tempo, depending on the ratio of average consonant duration to average vowel duration.14 The phonotactic %Vp, on the other hand, is invariant and depends only on the ratio of the number of consonants to the number of vowels in the sample.
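Since every segment has phonotactic duration 1, %Vp reduces to a simple count ratio. A minimal sketch, with intervals represented as segment strings (my own illustrative helper):

```python
def percent_v_p(v_intervals, c_intervals):
    """Phonotactic %V: share of vocalic segments among all segments,
    each interval's phonotactic duration being its number of segments."""
    nv = sum(len(v) for v in v_intervals)  # total vocalic length
    nc = sum(len(c) for c in c_intervals)  # total consonantal length
    return 100.0 * nv / (nv + nc)
```

For example, the intervals of /astrand/ segmented as vowels ['a', 'a'] and consonants ['str', 'nd'] give a %Vp of 100·2/7.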

STANDARD DEVIATION OF VOCALIC AND INTERVOCALIC INTERVALS (∆V AND ∆C)

These metrics evaluate the variability of interval durations: a standard deviation of 0 corresponds to intervals uniform in duration, while a large standard deviation corresponds to large variation in interval durations. Durational and phonotactic ∆C are defined as

$$\Delta C_r = \sqrt{\frac{\sum_{j=1}^{N_C} \left(d(C_j) - \bar{d}_C\right)^2}{N_C - 1}} = \bar{d}_C \cdot \sqrt{\frac{\sum_{j=1}^{N_C} \left(\frac{d(C_j)}{\bar{d}_C} - 1\right)^2}{N_C - 1}}$$

and

$$\Delta C_p = \sqrt{\frac{\sum_{j=1}^{N_C} \left(L(C_j) - \bar{L}_C\right)^2}{N_C - 1}} = \bar{L}_C \cdot \sqrt{\frac{\sum_{j=1}^{N_C} \left(\frac{L(C_j)}{\bar{L}_C} - 1\right)^2}{N_C - 1}}$$

14 Both vowels and consonants change duration as tempo increases and decreases, but they change by a different factor.


An often-reported disadvantage of the durational measure ∆Cr (the standard deviation of consonantal intervals) is that it is proportional to the mean value of the intervals.

This means that the durational measures ∆Cr and ∆Vr (standard deviations of consonantal and vocalic intervals) will change when the speech rate changes. However, this issue does not exist for phonotactic measures, since the value depends only on the number of segments and not their durations.15

The phonotactic metric ∆Cp measures complexity of syllable structure, or more precisely, complexity of consonant clusters. For languages in which the basic syllable template is (C)V, ∆Cp will be equal to 0; all consonantal intervals will be the same. Its value will be higher for languages where more complex onsets or codas are allowed, and thus where more complex intervals exist.

The phonotactic metric ∆Vp will be proportional to the incidence of hiatus (consecutive vowels) in the language. If we allow the phonotactic duration of long vowels to be redefined as 2 (note that in the original version the phonotactic duration of short and long vowels is the same, equal to 1), then ∆Vp will also be higher for languages, or speech samples, in which the percentage of long vowels is higher.16 We will calculate the metrics for both cases: length of long vowels defined as 1 and defined as 2.
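A sketch of the phonotactic standard deviation, assuming intervals are given as segment strings so that an interval's length is its number of segments (the helper name is my own):

```python
import statistics

def delta_p(intervals):
    """Phonotactic ΔC or ΔV: sample standard deviation (n-1 denominator)
    of interval lengths, each length = number of segments."""
    return statistics.stdev(len(x) for x in intervals)
```

As the text states, a strict (C)V language gives 0, since every consonantal interval has length 1; alternating singletons and /str/-type clusters give a large value.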

COEFFICIENT OF VARIATION OF VOCALIC AND INTERVOCALIC INTERVALS (VARCOV, VARCOC)

Coefficient of variation is equal to the standard deviation divided by the mean interval duration. RM and PM measures for vocalic intervals are defined as

$$\mathit{VarcoV}_r = \frac{\Delta V_r}{\bar{d}_V} = \sqrt{\frac{\sum_{i=1}^{N_V} \left(\frac{d(V_i)}{\bar{d}_V} - 1\right)^2}{N_V - 1}} \qquad \mathit{VarcoV}_p = \frac{\Delta V_p}{\bar{L}_V} = \sqrt{\frac{\sum_{i=1}^{N_V} \left(\frac{L(V_i)}{\bar{L}_V} - 1\right)^2}{N_V - 1}}$$

Normalizing by the mean value is intended to reduce variations that emerge with change of speech tempo. This normalization method is most successful when all the intervals

15 The only changes happening at faster tempo that will affect ∆Cp and ∆Vp are possible segment deletions. 16 We assume there are always more short vowels than long vowels. Otherwise, variability starts decreasing once the number of long vowels surpasses the number of short vowels in the sample.


increase or decrease proportionally. However, speech rate affects speech segments, vowels and consonants, in different ways; stressed vowels do not stretch and compress the same way as unstressed vowels, and neither do consonants. Therefore, not all vocalic intervals increase or decrease their durations by the same factor. As a result, Varco-Vr still depends on speech tempo, but the magnitude of the tempo effect is smaller than for ∆Cr. On the other hand, as with ∆Cp (and ∆Vp), being a phonotactic metric, Varco-Vp is independent of speech rate.
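A corresponding sketch of the phonotactic coefficient of variation, under the same interval-as-string representation (standard deviation divided by the mean, per the definition above; helper name my own):

```python
import statistics

def varco_p(intervals):
    """Phonotactic Varco: standard deviation of interval lengths
    divided by the mean interval length."""
    lengths = [len(x) for x in intervals]
    return statistics.stdev(lengths) / statistics.mean(lengths)
```

Because both numerator and denominator are built from segment counts, the value is rate-independent by construction.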

RAW PAIR-WISE VARIABILITY INDEX (rPVI)

This metric in its durational variant applies only to consonantal intervals. It measures the variability of intervals with respect to their neighbors. It aims to capture the short-long-short contrast of the prominent (middle) interval.17

$$\text{rPVI-}C_r = \frac{1}{N_C - 1} \sum_{k=1}^{N_C - 1} \left| d(C_k) - d(C_{k+1}) \right|$$

The phonotactic metric rPVI-Cp measures the variability of the lengths of consonant intervals in a sequence. It is larger when short clusters alternate with longer clusters.

$$\text{rPVI-}C_p = \frac{1}{N_C - 1} \sum_{k=1}^{N_C - 1} \left| L(C_k) - L(C_{k+1}) \right|$$
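Both PVI variants operate on successive interval lengths. A sketch of the raw phonotactic version over intervals given as segment strings (helper name my own):

```python
def rpvi_p(intervals):
    """Raw pair-wise variability index over successive consonantal
    interval lengths: mean absolute difference of adjacent lengths."""
    lengths = [len(x) for x in intervals]
    n = len(lengths)
    return sum(abs(lengths[k] - lengths[k + 1]) for k in range(n - 1)) / (n - 1)
```

A sequence like /t/, /str/, /t/ (short-long-short) gives a high value; uniform clusters give 0.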

NORMALIZED PAIR-WISE VARIABILITY INDEX (nPVI)

Among rhythm metrics, a normalized version of the pair-wise variability index has generally been applied to vocalic intervals. The motivation, as with the Varco measures, is to achieve invariance with speech rate changes. The definition of the normalized phonotactic pair-wise variability index is given after its durational counterpart:

$$\text{nPVI-}V_r = \frac{100}{N_V - 1} \sum_{k=1}^{N_V - 1} \frac{\left| d(V_k) - d(V_{k+1}) \right|}{\frac{d(V_k) + d(V_{k+1})}{2}} = \frac{200}{N_V - 1} \sum_{k=1}^{N_V - 1} \frac{\left| \frac{d(V_k)}{d(V_{k+1})} - 1 \right|}{\frac{d(V_k)}{d(V_{k+1})} + 1}$$

17 Or a short-long contrast of a prominent final.


$$\text{nPVI-}V_p = \frac{100}{N_V - 1} \sum_{k=1}^{N_V - 1} \frac{\left| L(V_k) - L(V_{k+1}) \right|}{\frac{L(V_k) + L(V_{k+1})}{2}} = \frac{200}{N_V - 1} \sum_{k=1}^{N_V - 1} \frac{\left| \frac{L(V_k)}{L(V_{k+1})} - 1 \right|}{\frac{L(V_k)}{L(V_{k+1})} + 1}$$

The phonotactic variant measures the variability of vocalic interval lengths, and like ∆Vp, it is affected by the proportion of hiatuses and long vowels (when long vowels are assigned a length of 2). Languages with high nPVI-Vp are those with a high percentage of long vowels. Note that only a few terms in the formula will be non-zero: those that apply to an interval containing a hiatus or a long vowel adjacent to an interval consisting of a single short vowel.
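A sketch of the normalized phonotactic version, again over interval lengths (helper name my own):

```python
def npvi_p(intervals):
    """Normalized pair-wise variability index: each adjacent difference
    is divided by the pair's mean length before averaging."""
    lengths = [len(x) for x in intervals]
    n = len(lengths)
    total = sum(abs(lengths[k] - lengths[k + 1])
                / ((lengths[k] + lengths[k + 1]) / 2.0)
                for k in range(n - 1))
    return 100.0 * total / (n - 1)
```

As the text notes, only pairs mixing a length-1 interval with a longer one (hiatus, or a long vowel counted as 2) contribute non-zero terms.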

3.4.2 Consonant cluster measures

CLUSTER-SIZE MEASURES

Our second question relates to word structure; it asks how many consecutive consonants can occur at the beginning, in the middle, and at the end of a word. In other words, it examines which phonotactic durations (or lengths) of consonant clusters are permitted at these three positions. It also asks how frequently each of these possible lengths occurs in running speech. We represent the frequencies of consonant clusters of different sizes by a discrete distribution.

Remember that in this dissertation a cluster can have length zero, one, two, or more. This convention facilitates the inclusion in the distribution of words that start with a vowel or with a single consonant.

The value of the word-initial (or word-final, or word-medial) distribution for each consonant cluster length n>=0 is the ratio of the number of word-initial (or word-final, or word-medial) clusters of length n to the total number of word-initial (or word-final, or word- medial) clusters in the speech sample. If the sample is large enough, these frequencies correspond to the likelihood of a cluster in that position being of length n.

Note that the number of initial clusters (including those of length zero) and the number of final clusters are each equal to the number of words in the sample. The number of medial clusters can be larger or smaller than the number of words, depending on the proportion of monosyllabic words in the sample. Each disyllabic word contributes one medial cluster, and each trisyllabic word two medial clusters. Monosyllabic words have only word-initial and word-final cluster positions.
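This bookkeeping can be sketched for a single CV-level word as follows; `cluster_lengths` is an illustrative helper, not the dissertation's code:

```python
def cluster_lengths(cv_word):
    """Return (initial, medial_list, final) consonant-cluster lengths for
    one word at the CV level. Zero-length clusters are included, so a
    vowel-initial word has initial cluster 0 and a hiatus yields a
    medial cluster of length 0."""
    first = cv_word.index("V")           # position of the first vowel
    last = cv_word.rindex("V")           # position of the last vowel
    initial = first                      # Cs before the first vowel
    final = len(cv_word) - last - 1      # Cs after the last vowel
    interior = cv_word[first:last + 1]   # span from first to last vowel
    # C-runs between consecutive vowels; empty runs count as length 0
    medial = [len(run) for run in interior.split("V")[1:-1]]
    return initial, medial, final
```

A monosyllable like ‘CVC’ yields no medial clusters, matching the statement that monosyllabic words have only initial and final cluster positions.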

Distributions are produced and tabulated for the three positions in the word for all sample languages. To address the issue of stability, namely, whether these distributions vary significantly across speech samples for the same language, I repeated the calculations for several samples.

CLUSTER PATTERN COMPARISON

Question 3 looks at the similarity across languages in occurrence of length-2 and length-3 clusters. The number of shared highest occurring clusters at each position in the word is used as a measure of similarity.

I compare cluster patterns at two different levels, ALT level and ‘saltanajc’ level, defined in section 3.1. The ALT level is often used in statements of typological universals regarding syllable structure.

The example in Figure 3.3 shows percentages of length-0, length-1, and length-2 initial clusters for Dutch and German at the ALT level:

language   #_A    #LA    #TA    #LL    #LT    #TL   #TT
Dutch      20.8   21.5   50.5   0.0    n.a.   5.3   1.6
German     26.9   11.7   53.1   n.a.   n.a.   5.1   2.2

Figure 3.3 Example: cluster distribution at ALT level

The ALT level captures the general phonotactic constraints present in each language. For instance, in this example we see that obstruents are more likely single initials than sonorants, and that obstruent-sonorant is the best double initial in both languages.

At the ‘saltanajc’ level, I will present length-2 clusters in decreasing order of their frequencies in the text. These are compared to the sonorically preferred clusters of length-2 at the same position. The following is an example for initial length-2 clusters in Dutch.


language    Initial CC clusters
Dutch       tl > sl > st > tj > ss > sn
PREFERRED   tj > cj > sj=tl > nj=cl > lj=sl=tn

Figure 3.4 Example: cluster distribution at ‘saltanajc’ level

We can see that in the Dutch text the sonorically preferred initial clusters are not the most frequent. This is an example of violating the sonority constraint, or of not following the sonority preference. I will also compare languages based on their most frequent clusters, whether or not they match the preferred patterns.

I have chosen to observe consonant sequences at these two representational levels because 1) they are more manageable than the exact phoneme levels, given the number of different possible sequences, and 2) I was able to create these levels for all the languages in this study, while only some have fully correct phonemic levels.18

3.4.3 Word-lengths measures

Word lengths are calculated both in terms of number of syllables per word and number of phonemes per word. The number of syllables is defined as the number of vowels in the word; diphthongs and long vowels count as single vowels, just like short vowels.

Languages are compared based on word length distributions. I will present the text distribution (where a given word is counted the number of times it occurs in the text) as well as the distribution among lexical items (where a given word is counted only once, irrespective of its frequency in the text).
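The two distributions differ only in whether tokens or types are counted. A minimal sketch, assuming a phonemicized input in which diphthongs and long vowels are already single vowel symbols (the vowel set and function names here are illustrative):

```python
from collections import Counter

# Illustrative vowel set for a phonemicized text where diphthongs and
# long vowels appear as single symbols.
VOWELS = set("aeiou")

def length_distributions(words):
    """Token-based (text) and type-based (lexical) relative-frequency
    distributions of syllables per word; syllables = vowels per word."""
    def syllables(word):
        return sum(1 for ch in word if ch in VOWELS)
    text = Counter(syllables(w) for w in words)       # count every token
    lexical = Counter(syllables(w) for w in set(words))  # count each type once
    n_text, n_lex = sum(text.values()), sum(lexical.values())
    return ({k: v / n_text for k, v in text.items()},
            {k: v / n_lex for k, v in lexical.items()})
```

A frequent short word thus weighs more in the text distribution than in the lexical one.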

18 Additional attention is needed in the transcription process in order to implement correct voicing/devoicing processes in all cases; at present, voicing is incorrect in some places.


CHAPTER 4

RESULTS

In this chapter, I provide empirical results that address research questions 1-5. The chapter is divided into five sections, with section 4.3 addressing both questions 3 and 4. Section 4.5 contains stability analysis. All the results were obtained by analyzing the corpus described in Chapter 3. The corpus contains materials for 21 languages spread over several language families.

Section 4.1 examines the phonotactic component of the durational variation of vocalic and consonantal intervals and its effect on the values of rhythm metrics (RMs). Sections 4.2-4.4 examine the structural complexity of words in the test languages and highlight similarity groupings across these factors: consonant cluster lengths (within words), phonotactic patterns, and word lengths. These groupings are then compared to those obtained via rhythm metrics in the literature and to the traditional assignment of rhythm class, as stated in section 3.2.

4.1 Phonotactic component of rhythm metrics

4.1.1 Introduction

In this section, I address the first dissertation question: what is the contribution of phonotactics to the values of various Rhythm Metrics (RMs)? Consequently, how much of the language grouping on the rhythm-metric graphs can be explained by the phonotactic similarities across languages? Lastly, do phonotactic properties alone classify languages more consistently with respect to traditional rhythm type than the rhythm metrics do?

To answer these questions, I calculated Phonotactic Metrics which I defined in Chapter 3. Phonotactic metrics (PMs) are analogous measures to rhythm metrics, based on phonotactic durations, or interval lengths, instead of on temporal durations. They are equal to Rhythm Metrics (RMs) if the contributions of prosody and segmental qualities on durations are ignored.

I measured the effect of phonotactics on Rhythm Metrics by the correlations between the PMs and RMs. Values of the Rhythm Metrics used for comparison were collected from studies in the literature and are presented in Table A3.1 in Appendix 3. Values of the Phonotactic Metrics were calculated based on the equations given in section 3.4.1 and are presented in Table A3.2 (length of long vowel is 1) and Table A3.3 (length of long vowel is 2) in Appendix 3.

Having established how similar the corresponding phonotactic and rhythmic metrics are, say %Vp and %Vr, I compare the language groupings produced by the Rhythm Metrics to the groupings based on Phonotactic Metrics on a subset19 of 21 languages. In addition, I provide the graphs based on Phonotactic Metrics that show groupings of the complete set of 21 languages and discuss how these groups relate to traditional rhythm type assignment.

In section 4.5 I discuss the stability of PMs when computed over different speech materials and over samples of different lengths.

4.1.2 Correlations between the Phonotactic and the Rhythm Metrics

To calculate the Phonotactic Metrics, I considered two models: in the first, the phonotactic duration of each consonant and vowel, including long vowels, equals 1. In the second, the phonotactic durations of consonants and short vowels (including diphthongs) are defined as 1, and the phonotactic durations of long vowels as 2. Correlations were calculated for the subset of languages for which the values of Rhythm Metrics are available in the literature. Values for RMs are reported in Table A3.1, and for PMs in Tables A3.2 and A3.3 in Appendix 3.

Correlations for the consonantal metrics are given in Table 4.1. Note that the correlations are calculated on a different subset of languages for each metric, depending on the values of RMs available in the literature; only Rhythm Metrics values obtained in a single study were used in order to avoid discrepancies due to different segmentation methods.
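The per-metric correlation over the available language pairs can be sketched as follows; this pure-Python Pearson r is an illustrative stand-in for whatever statistics package was actually used, with `None` marking a language whose RM value was not available in the literature:

```python
def pearson_r(xs, ys):
    """Pearson correlation between paired PM and RM values, computed only
    over languages for which both values are available (None = missing)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n          # mean of PM values
    my = sum(y for _, y in pairs) / n          # mean of RM values
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / (sxx * syy) ** 0.5
```

Filtering out missing values first is what produces a different language subset for each metric, as noted above.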

Table 4.1 Correlation between the consonantal PMs and RMs

             rPVI-C   Varco-C   ∆C
correlation  -0.01    0.9       0.98*
p            –        0.1       0.0007

19 The subset consists of those languages for which values of RMs are available in the literature for each metric.


We observe that the standard deviation of consonantal intervals ∆Cr is significantly correlated (r = 0.98, p < 0.001) with its phonotactic counterpart ∆Cp. The rhythmic and phonotactic coefficients of variation of consonantal intervals, Varco-Cr and Varco-Cp, show a tendency towards a linear relationship, but the number of data points for Varco-Cr is too small. Next, we calculate the vocalic metrics when the phonotactic duration of a long vowel is 1.

Table 4.2 Correlation between vocalic PMs and RMs (long=1)

             %V     nPVI-V   Varco-V   ∆V
correlation  0.87   -0.5     0.21      0.09
p            0.02    0.16    0.27      0.92

We see that none of the measures is significantly correlated with its phonotactic component, although percent vowels shows a high coefficient of correlation. Let us then consider model 2, in which long vowels are assigned phonotactic durations of 2. This model accounts for variation caused by the difference between short and long vowels. Results are presented in Table 4.3.

Table 4.3 Correlation between the vocalic PMs and RMs (long=2)

             %V     nPVI-V   Varco-V   ∆V
correlation  0.93   0.36     0.98      0.61
p            0.008  0.38     0.0004    0.15

According to this model, percent vowels %Vr and the coefficient of variation of vocalic intervals Varco-Vr are significantly correlated with their phonotactic components (p < 0.01).

The lack of correlation for the standard deviation of vocalic intervals ∆V is possibly due to the high variability of this measure with speech rate. The pair-wise variability index nPVI-V, on the other hand, might be influenced significantly by prosodic factors. This suggests that nPVI-V might be a good rhythm correlate. We return to this point in the next section.

Correlation graphs for each metric are presented in Figures 4.1-4.6. Rhythm metrics that are highly correlated with their phonotactic counterparts produce a close to linear relationship, as shown by the regression line. On all graphs, traditionally stress-timed languages are represented by a green triangle, syllable-timed languages by a blue circle, and mora-timed languages by a magenta square.


Figure 4.1 Correlation between phonotactic (%Vp) and rhythmic (%Vr) percentage of vocalic intervals

Figure 4.2 Correlation between phonotactic (∆Cp) and rhythmic (∆Cr) standard deviation of consonantal intervals


Figure 4.3 Correlation between phonotactic (∆Vp) and rhythmic (∆Vr) standard deviation of vocalic intervals

Figure 4.4 Correlation between phonotactic (Varco-Cp) and rhythmic (Varco-Cr) coefficient of variation of consonantal intervals


Figure 4.5 Correlation between phonotactic (Varco-Vp) and rhythmic (Varco-Vr) coefficient of variation of vocalic intervals

Figure 4.6 Correlation between phonotactic (nPVI-Vp) and rhythmic (nPVI-Vr) normalized pair-wise variability index of vocalic intervals


4.1.3 Classification power of RMs and PMs

My results for phonotactic metrics are presented graphically; the graphs are analogous to those used in the literature to support classification of languages into three rhythm groups. Figure 4.7 graphs the phonotactic metric %Vp against the standard deviation of consonantal intervals ∆Cp. This can be compared to the graph of actual rhythm metrics %Vr vs. the standard deviation ∆Cr presented in Figure 4.8.

In these graphs, we see languages forming three loose clusters corresponding to the hypothesized rhythm types: stress-timed (green triangle), syllable-timed (blue circle), and mora-timed (magenta square). These domains are demarcated by dashed lines.

We can readily see the similarity between the two graphs even though the values are in a slightly different range for ∆C. We see groupings of languages in the three areas of the graph; these loosely correspond to the notion of stress-timed, syllable-timed, and mora-timed languages.

Figure 4.7 Phonotactic metrics graph (%Vp, ∆Cp)


Figure 4.8 Rhythm metrics graph (%Vr, ∆Cr)

Since phonotactic metrics %Vp and ∆Cp are not based on actual durations, they only reflect the phonotactic but not the prosodic component. Yet, they group languages in the same way as rhythm metrics group them. This suggests that the durational variation in a speech sample that arises from phonotactics is a large component of the overall durational variation. In other words, rhythm metrics (%Vr, ∆Cr) reflect a great deal of phonotactics in addition to some of the rhythmic variations in the sample.

When we compare pair-wise variability graphs, however, a different picture emerges. Results for rhythmic and phonotactic pair-wise variability measures are presented in Figures 4.9 and 4.10.

It is interesting to see in Figure 4.9 that the (rPVI-Cr, nPVI-Vr) pair of RMs does not clearly separate Japanese from the syllable-timed group, whether we look at each of the metrics separately or in combination. Additionally, some stress-timed languages, Bulgarian, Russian, and Czech, fall into the same region as the syllable-timed group and Japanese.


Figure 4.9 Rhythm metrics graph (rPVI-Cr, nPVI-Vr)

Figure 4.10 Phonotactic metrics graph (rPVI-Cp, nPVI-Vp)


The only clear trend in the RM graph (Figure 4.9) is the separation of traditional stress-timed languages (Dutch and German) in the upper half of the graph.

On the PM graph (Figure 4.10), by contrast, the grouping is clear if we assume that Bulgarian, Russian, Catalan, Czech, and Polish all group with German and Dutch. This agrees with the traditional assignments (Slavic and Germanic languages in the same group). It suggests that the lack of correlation between the pair-wise phonotactic and rhythm metrics reflects the classification power of the PMs and the lack of classification power of the RMs.

4.1.4 Language classification based on Phonotactic Metrics

Next, let us look at the language groupings of the complete set of 21 sample languages based on Phonotactic Metrics. Consider the graph defined by (%Vp, ∆Cp) in Figure 4.11.

Figure 4.11 graphs the phonotactic metric %Vp against the standard deviation of consonantal intervals ∆Cp for all 21 languages. A linear relationship between the two variables now emerges, with a strong negative correlation of –0.97. Cassandro et al. (in an unpublished MS referenced in Easterday et al. 2011) postulated a universal linear relationship between the rhythm metrics %Vr and ∆Cr. This correlation may result from the structural properties that the phonotactic metrics %Vp and ∆Cp capture: it expresses a tendency for a language with a large ratio of vowels to consonants to have small variation in the size of its consonant clusters. This can be explained by observing that percent vowels %Vp can be expressed as 100 – %Cp, so a large ratio of vowels to consonants means that the percentage of consonants %Cp will be small. That further implies that the mean consonantal duration will be small, and consequently their standard deviation will be small as well.
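The correlation coefficients reported in this chapter are plain Pearson product-moment coefficients, which can be reproduced with a few lines of Python; this is a generic sketch (the helper name pearson_r is mine, and no actual measurement values are shown):

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # covariance term (unnormalized) and the two standard-deviation terms
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

Feeding the per-language %Vp and ∆Cp values into such a function is what produces a coefficient like the –0.97 reported above.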


Figure 4.11 Grouping of 21 languages based on phonotactic %Vp and ∆Cp

In this figure, we also notice the strong linear dependence between the phonotactic measures: the percentage of vocalic intervals %Vp and the standard deviation of consonantal intervals ∆Cp. The linear trend is marked by a downward-sloping gray line in Figure 4.12.

The Phonotactic Metrics %Vp and ∆Cp classify languages based on their phonotactic complexity. They provide a more detailed description of phonotactic structure than a simple statement about syllable structure: these measures assess the frequencies of vocalic and consonantal clusters in running speech.

Next, we see in Figure 4.13 that the phonotactic pair-wise measures (rPVI-Cp, nPVI-Vp) can achieve a variety of groupings that agree with the traditional rhythm types, with some minor modifications. For instance, Serbian is positioned in the syllable-timed group on many measures, and Hungarian on these measures seems to group with the stress-timed languages. In Figure 4.12, Hungarian also bordered the stress-timed group.


Figure 4.12 Linear relationship between %Vp and ∆Cp

Figure 4.13 Grouping of 21 languages based on phonotactic rPVI-Cp and nPVI-Vp


Finally, in Figure 4.14 we see that the phonotactic metrics %Vp and Varco-Vp also group languages according to their traditional type, except that Bulgarian and Serbian pattern with the syllable-timed languages, while Estonian and Hungarian pattern with the stress-timed.

Figure 4.14 Grouping of 21 languages based on phonotactic %Vp and Varco-Vp

4.1.5 Conclusion

In summary, some phonotactic measures group languages similarly to rhythm metrics. The correlations between phonotactic and rhythmic metrics vary, but they are positive for most metrics. This suggests that the groupings based on RMs are influenced by effects other than the prosodic lengthening and vowel shortening normally associated with different rhythmicity; they are at least partially determined by phonotactics. This leads us to the conclusion that rhythm metrics include durational variability resulting from phonotactics. Procedures for examining rhythmic differences thus need to be based on a different kind of measure, or on different kinds of materials in which phonotactics (or average interval complexity) is comparable across languages.


4.2 Consonant cluster lengths at different positions in the word

Having established that the durational aspects of phonotactics affect the measures that are claimed to quantify rhythmic properties, we now examine the relation between phonotactics and rhythm in more detail. In this section I address question 2, in which we investigate the relationship between rhythmic similarity and the length of clusters at different positions in the word (word-initial, word-medial, and word-final). Note that with this question we address only the durational component of the clusters, that is, whether they consist of zero, one, two, or more segments, and not their segmental quality. We are interested in investigating to what extent any particular position (initial, medial, or final) can explain rhythmic (perceptual) similarity.

As described in Chapter 3, I calculated the distributions of consonant-cluster lengths for each language at each of the three positions in the word. For each language, I provide the distribution over the different lengths of consonant clusters. Values at each point in the distribution (presented in the tables as percentages) are calculated over long texts, approximately 10,000 words in length. The exact number of words in each sample is provided in the tables.
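The position-specific tallies described above can be sketched as follows, assuming each word has already been reduced to a string over 'C' and 'V'; the helper name and the toy input are illustrative, not the actual processing pipeline:

```python
import re
from collections import Counter

def cluster_length_distributions(words):
    """Tally consonant-cluster lengths at word-initial, word-medial, and
    word-final positions; each word is a string over 'C' and 'V'."""
    dist = {"initial": Counter(), "medial": Counter(), "final": Counter()}
    for w in words:
        initial = re.match(r"C*", w).group()    # consonants before the first vowel
        final = re.search(r"C*$", w).group()    # consonants after the last vowel
        dist["initial"][len(initial)] += 1
        dist["final"][len(final)] += 1
        # the stretch between the first and last vowel holds the medial clusters
        core = w[len(initial):len(w) - len(final)]
        for run in re.findall(r"C+", core):
            dist["medial"][len(run)] += 1
    return dist
```

For example, cluster_length_distributions(["CCVC", "VCV", "CVCCV"]) records an initial cluster of length 2, a final cluster of length 1, and a medial cluster of length 2, among others; converting counts to percentages then yields tables like those below.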

4.2.1 Word-initial cluster distributions

Table 4.4 presents the results for the word-initial cluster distributions. All languages in our samples contain vowel-initial words, although, typologically, some languages must obligatorily have at least one consonant at the beginning of the word (i.e., a non-empty syllable onset). Since clusters of length-0 and length-1 exist in all languages, initial clusters of length 0 and 1 have been combined into a single category. The two other categories comprise initial clusters of length-2 and initial clusters of length-3 or higher.

Languages are presented in increasing order of frequency of the first category (length-0 or length-1).


Table 4.4 Distribution of word-initial consonant clusters

language     number of words   length-0 or 1   length-2   length-3 or higher
Russian      10935             66.6            27.4       6.1
Polish       10955             74.6            22.6       2.9
Czech        10642             75.0            22.9       2.1
Bulgarian    10563             83.1            14.8       2.1
Serbian      10412             85.3            13.9       0.8
Greek        10400             89.9            10.0       0.1
German       10375             91.7            7.3        1.0
Dutch        10462             92.7            6.9        0.3
Italian      10514             93.1            6.3        0.7
Catalan      10594             94.2            5.5        0.4
Portuguese   11565             96.6            3.4        0
Spanish      10639             97.0            3.0        0
Turkish      10876             98.8            1.3        0
Indonesian   10872             99.3            0.7        0
Japanese     10868             99.3            0.7        0
Estonian     10808             99.7            0.3        0
Hungarian    10675             99.8            0.1        0
Hawaiian     10703             100             0          0
Maori        10471             100             0          0
Samoan       10681             100             0          0
Tongan       10741             100             0          0

We can see that languages differ markedly in whether they allow more than one consonant at the beginning of the word, and then in the frequency of those length-2 initial clusters. Compare the frequencies in the Slavic languages (Russian, Serbian, Bulgarian, Czech, and Polish: 14-30%) with those of the two Germanic languages (German and Dutch: 7%). Italian and Catalan have 5-6% length-2 clusters as well.

Similarly, four of the Slavic languages, Bulgarian, Polish, Russian (the highest), and Czech, have a non-negligible proportion of length-3 clusters, with very few in Serbian and the Germanic languages. It is because of their similarity with respect to initial consonant clusters (and therefore clusters in general) that Slavic languages have tended to be assigned to the stress-timed language group. However, we will see that the similarity does not extend to word-final clusters. And if initial and final clusters have different effects on language rhythm, then this traditional association of Germanic and Slavic languages might be misguided.

So far, initial cluster distribution correlates well with historical language family. While all traditionally stress-timed languages appear at the upper end of the table, with a high proportion of initial clusters of length-2 or higher, the distributions allow us to make a finer gradation. Namely, we see that in their word-initial cluster distributions Dutch and German appear more similar to the traditionally syllable-timed Italian and Greek.

The complexity of word-initial clusters also does not correlate with the rhythm type based on the phonological property of vowel reduction. Although the Slavic languages have the highest percentage of long initial clusters, they do not – except for Russian – exhibit vowel reduction. Also, while Catalan fits with the Germanic languages by virtue of having vowel reduction, in the similarity of its word-initial clusters it groups with Italian, which is generally assumed to be syllable-timed and in its standard variety does not have vowel reduction.

At the lower end of the table, we find languages from the Polynesian group (Hawaiian, Maori, Samoan, and Tongan) and the Uralic language group (Hungarian and Estonian). They all have simple word-initial clusters.

In sum, Table 4.4 shows a clear grouping by language families, reflecting similarity based on historical relationship. In particular, this grouping classifies the languages of the Germanic group as most similar to the languages of the Romance group. This means that the structure of initial consonant clusters does not correlate well with expected rhythmic similarity. Syllable onsets are sometimes called ‘weightless’ (Hyman 1984) to reflect their irrelevance to the calculation of syllable weight; since syllable weight is related to stress, it seems fitting that initial clusters, which are also syllable onsets, do not correlate with rhythm type.

4.2.2 Word-final cluster distributions

Next, we look at the distribution of word-final consonant clusters. Since the length-0 cluster represents a meaningful category for word-final clusters – namely, it defines the canonical final – we will not combine clusters of length-0 and length-1. However, given that clusters of length-3 or more are rare, we will combine all clusters of length-2 or higher.


Results are presented in Table 4.5. Languages are first ordered by decreasing likelihood of clusters of length-2 or higher; languages that do not allow such clusters are ordered by decreasing likelihood of length-1 clusters.

Table 4.5 Distribution of word-final clusters

language     length-0   length-1   length-2 or higher
German       35         48         18
Hungarian    33         52         15
Estonian     55         34         11
Dutch        37         54         9
Polish       67         31         2
Russian      62         36         2
Catalan      56         43         1
Czech        69         30         1
Turkish      46         54         1
Indonesian   46         55         0
Greek        64         36         0
Spanish      66         34         0
Bulgarian    75         25         0
Portuguese   79         21         0
Serbian      82         18         0
Italian      88         12         0
Japanese     95         6          0
Hawaiian     100        0          0
Maori        100        0          0
Samoan       100        0          0
Tongan       100        0          0

The languages that stand out in this distribution are the Polynesian languages (Maori, Hawaiian, Samoan, and Tongan) plus Japanese, for their low likelihood of non-zero final clusters. Next, we note a group consisting of German, Dutch, Hungarian, and Estonian, all of which have a high likelihood of final clusters of length-2 or higher.

While to a certain degree we again see some resemblance to language families, now it is the Uralic group (Estonian and Hungarian) that patterns with the Germanic group, while some of the Slavic languages (Serbian and Bulgarian) look more like the Romance languages in that length-2 final clusters are infrequent. Other Slavic languages (Russian, Czech, and Polish) appear closer to the Germanic group.

Differences in cluster distributions across words are seen in that, while German and Dutch can have heavy clusters on both sides of a word, the Slavic languages are only onset-heavy, and the Uralic languages are only coda-heavy. Because Dutch, German, Catalan, and Russian exhibit vowel reduction, it seems at first that vowel reduction cannot be related to cluster structure.²⁰ However, because vowel reduction in Catalan seems to be qualitative and does not result in corresponding shortening (Prieto et al. 2012), it is possible that heavy finals correspond only to vowel reduction associated with vowel shortening. If that is true, then similarity in the distribution of final consonant clusters could be said to correspond to rhythm type (presumably, stress timing).

4.2.3 Word-medial cluster distributions

Finally, let us look at the distribution of word-medial clusters. Because medial clusters consist structurally of the coda of a preceding syllable and the onset of the following syllable, they may combine properties of initials and finals in their influence on rhythm.

Since word-medial clusters of length-0 imply that a hiatus occurred, this case can be dealt with when analyzing vowels.²¹ Meaningful categories of word-medial clusters include length-1, length-2, and length-3 or higher. Results are presented in Table 4.6, where languages are ordered by decreasing likelihood of length-3 or higher clusters.

We observe in the table that Dutch, German, Russian, and Catalan have a somewhat higher percentage of length-3 medial clusters (5-9%). The middle group consists of the remaining Slavic, Romance, and Uralic languages, plus Turkish and Indonesian; these languages allow complex clusters of length 3 or higher, but their likelihood is small. The third group consists of Japanese and the Polynesian languages, which do not allow clusters of length 3 or higher and have a low likelihood of length-2 clusters (the only non-zero value is 8% for Japanese).

20 It is probable that the presence of phonemic vowel length in Serbian, Czech, Estonian, and Hungarian prevents vowel reduction.
21 Analysis of vowel clusters is not presented in this work.


Table 4.6 Distribution of word-medial clusters

language     length-1   length-2   length-3 or higher
Russian      60         31         9
Dutch        55         38         8
Catalan      70         24         6
German       62         32         5
Czech        70         27         3
Bulgarian    68         29         3
Spanish      66         30         3
Polish       70         27         3
Hungarian    57         41         2
Greek        77         20         2
Serbian      77         21         2
Italian      56         43         2
Estonian     70         29         2
Portuguese   81         18         1
Turkish      67         33         0
Indonesian   75         24         0
Japanese     92         8          0
Tongan       100        0          0
Samoan       100        0          0
Maori        100        0          0
Hawaiian     100        0          0

This ordering allows us to notice the stress-timed group in the upper part of the table, and the mora-timed group in the lower part. However, the middle group is quite diverse in its distributions. Let us re-arrange the middle group in increasing order of likelihood of length-1 clusters. Results are presented in Table 4.7.

Now we see the languages with simple clusters lower in the table, and those with a higher likelihood of length-2 medial clusters in the upper part, closer to Dutch and German.


Table 4.7 Distribution of word-medial clusters (re-arranged)

language     length-1   length-2   length-3 or higher
Dutch        55         38         8
Russian      60         31         9
German       62         32         5
Catalan      70         24         6
Italian      56         43         2
Hungarian    57         41         2
Turkish      67         33         0
Spanish      66         30         3
Bulgarian    68         29         3
Estonian     70         29         2
Polish       70         27         3
Czech        70         27         3
Indonesian   75         24         0
Serbian      77         21         2
Greek        77         20         2
Portuguese   81         18         1
Japanese     92         8          0
Tongan       100        0          0
Samoan       100        0          0
Maori        100        0          0
Hawaiian     100        0          0

In this new table, we notice that Serbian and Greek have almost identical distributions, as do Czech and Polish. These, however, are not pairs we would expect to be similar: Polish has penultimate stress, while Czech has word-initial stress. Thus, this grouping also does not fully reflect rhythmic similarity, although it does distinguish the stress-timed group (Dutch, German, Russian, and Catalan) from the mora-timed group, with everyone else in the third group.

4.2.4 Summary

Let us now review groupings obtained based on word-initial, word-final, and word-medial cluster distributions. Groupings are shown in Table 4.8.


Table 4.8 Language groupings based on word-initial, word-medial, and word-final complexity

INITIAL      MEDIAL       FINAL
Russian      Russian      German
Polish       Dutch        Hungarian
Czech        Catalan      Estonian
Bulgarian    German       Dutch
Serbian      Czech        Polish
Greek        Bulgarian    Russian
German       Spanish      Catalan
Dutch        Polish       Czech
Italian      Hungarian    Turkish
Catalan      Greek        Indonesian
Portuguese   Serbian      Greek
Spanish      Italian      Spanish
Turkish      Estonian     Bulgarian
Indonesian   Portuguese   Portuguese
Japanese     Turkish      Serbian
Estonian     Indonesian   Italian
Hungarian    Japanese     Japanese
Hawaiian     Tongan       Hawaiian
Maori        Samoan       Maori
Samoan       Maori        Samoan
Tongan       Hawaiian     Tongan

We observe that word-initial cluster complexity mostly reflects historical language groupings. The groupings based on word-medial and word-final cluster distributions both agree with the rhythmic-similarity grouping in that the prototypically stress-timed and prototypically mora-timed languages are placed on opposite sides of the table. The Germanic pair (German and Dutch) groups with Russian and Catalan, which exhibit vowel reduction, in the word-medial grouping; it groups with the coda-heavy Uralic languages, Hungarian and Estonian, in the word-final grouping. Although Hungarian and Estonian tend to be assigned to the syllable-timed group, they both have fixed (word-initial) stress and could possibly be seen as rhythmically similar to the trochaic, stress-timed German and Dutch.

The placement of the Romance languages, which are traditionally assigned to the syllable-timed group, closer together in the word-final distributions (all three are in the lower part of the middle group) makes this grouping a more likely correlate of rhythm than the one based on the distributions of word-medial clusters.

In summary, similarities in word structure, seen through the patterning of consonants at the beginning, in the middle, and at the end of the word, produce different groupings of the 21 languages depending on the position. This suggests that information such as “the language allows consonant clusters”, which is part of certain models of rhythmic similarity (Dauer 1987), may conflate different kinds of properties: those associated with word-initial clusters and those associated with word-final clusters.

Grouping according to word-final clusters most closely resembles the rhythmic grouping based on the traditional rhythm classes and some information on phonological properties (such as vowel reduction). To establish a true correspondence between phonotactic and rhythmic similarity, however, we will need independent evidence of rhythmic similarity for a large number of languages, preferably obtained through perception experiments.

4.3 Phonotactic patterns

When asked to describe differences between languages, listeners usually pay attention to two aspects: prosodic gestalts and characteristic, frequent segments. In imitations, prosodic properties appear as melodic variations, characteristic stress patterns (for instance, word-final stress), and short-long alternations. On the segmental level, listeners often associate Russian with palatal sounds, Polish with clusters involving fricatives, and French with rounded vowels. While such characteristic segments are usually not the most frequent segments of a given language in absolute terms, they are segments that are more frequent in that language than in other languages.

In section 4.2, we looked at the similarities in consonant-cluster lengths as they occur in word-initial, word-medial, and word-final positions. In this section, we zoom into phonotactic properties by examining the types of segments that make up word-initial, word-medial, and word-final clusters. We look for similarity of phonotactic sequences across languages at two different levels:

1) the ‘saltanajc’ level, which is obtained by replacing each segment by a representative segment of the same manner of articulation (‘t’ for stops, ‘s’ for fricatives, ‘c’ for affricates, ‘n’ for nasals, ‘l’ for liquids, ‘j’ for semi-vowels, and ‘a’ for vowels); and

2) the ALT level (A for vowels, L for sonorants, T for obstruents), in which all consonants are divided into two classes: sonorants (nasals, liquids, and semi-vowels) and obstruents (stops, fricatives, and affricates).

We ask two questions related to the phonotactic patterns of consonants cross-linguistically: 1) whether languages that are said to be of the same rhythmic type exhibit similar phonotactic patterns; and 2) whether phonotactic patterns that are defined as ‘better clusters at specific positions in the word’ (according to sonority principles) are more frequent in each language in the sample.

The first question is asked to elucidate whether languages with similar cluster sizes (lengths in number of segments) differ in rhythm when the composition of those clusters differs; for instance, does ‘having more obstruents’ in certain positions affect language rhythm differently from ‘having more sonorants’ in those positions? Such an effect can come about in two ways: 1) a sonorant behaves differently from an obstruent in that it carries pitch and can be perceived as having duration; and 2) the presence of an obstruent affects the duration of the neighboring vowel. The second question is asked in an attempt to identify unusual patterns that can help distinguish one language or one group of languages from another. We ask these research questions at two different levels of representation: ALT (vowel-sonorant-obstruent) and saltanajc (manner-of-articulation representation).
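The two representation levels can be sketched as a pair of lookup tables in Python; the mapping follows the definitions above, while the five-segment mini-inventory at the bottom is hypothetical and for illustration only:

```python
# Manner-class representatives, as defined in the text ('saltanajc').
SALTANAJC = {"stop": "t", "fricative": "s", "affricate": "c",
             "nasal": "n", "liquid": "l", "semivowel": "j", "vowel": "a"}

# Collapse manner letters to the ALT level: obstruents T, sonorants L, vowels A.
ALT = {"t": "T", "s": "T", "c": "T",
       "n": "L", "l": "L", "j": "L",
       "a": "A"}

def to_saltanajc(segments, manner_of):
    """Rewrite a segment sequence at the saltanajc level; manner_of maps
    each segment of the language's inventory to its manner class."""
    return "".join(SALTANAJC[manner_of[seg]] for seg in segments)

def to_alt(saltanajc_form):
    """Collapse a saltanajc form to the ALT level."""
    return "".join(ALT[ch] for ch in saltanajc_form)

# Hypothetical five-segment inventory, for illustration only:
manner_of = {"p": "stop", "r": "liquid", "i": "vowel",
             "m": "nasal", "a": "vowel"}
```

With this inventory, the (invented) form "prima" comes out as "tlana" at the saltanajc level and "TLALA" at the ALT level; a real analysis would require a full per-language inventory.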

4.3.1 Basic sonority (ALT) level

a) word-initial consonant cluster patterns

Results for length-0 and length-1 initial clusters are given in Table 4.9, ordered by the frequency of obstruent-initial words (#TA). The most frequent onset type for each language is shaded. We see that, judged by frequency in the corpora, obstruents are the ‘best’ initials, except in Samoan, which has a large percentage of vowel-initial words. The sonority principle requires an initial cluster to manifest an increase in sonority moving towards the nucleus; by virtue of having an obstruent in initial onset position, this sonority rise is maximized.


Table 4.9 Word-initial length-0 and length-1 clusters

language     #_A    #LA    #TA
Samoan       48.7   29.5   21.8
Russian      18.2   14.3   34.1
Czech        14.5   21.1   39.4
Hungarian    36.9   23.1   39.8
Serbian      22.2   21.4   41.6
Polish       11.6   18.9   44.1
Spanish      33.8   18.9   44.3
Greek        35.0   8.9    46.0
Hawaiian     28.7   24.7   46.6
Bulgarian    17.9   17.9   47.4
Portuguese   33.5   15.7   47.4
Italian      24.6   19.0   49.5
Japanese     22.3   27.4   49.6
Maori        28.3   22.0   49.7
Catalan      21.2   22.8   50.2
Dutch        20.8   21.5   50.5
Estonian     21.0   26.8   51.5
German       26.9   11.7   53.1
Indonesian   16.3   21.9   61.1
Turkish      23.7   10.9   64.2
Tongan       15.0   18.8   66.2

Thus, languages do not differ in whether obstruents or sonorants are preferred in initial position, although they do differ in the percentage of obstruents occupying this position. Ordering by the frequency of obstruents in initial position (as in Table 4.9) does not reveal any meaningful language groupings.

Results for the length-2 word-initial clusters are given in Table 4.10, ordered by the frequency of #TL (initial obstruent-sonorant). The most frequent type is marked in dark green. In all languages except Polish this is obstruent-sonorant, the ideal pattern from the point of view of the sonority principle (with sonority rising from obstruent to sonorant and from sonorant to the vowel).


Table 4.10 Word initial length-2 clusters

language     #TL    #TT    #LL    #LT
Russian      14.4   6.2    6.7    0.1
Czech        12.4   8.6    1.4    0.6
Polish       9.1    13.2   0.4    -
Bulgarian    8.8    5      1      -
Serbian      8.6    4.9    0.5    -
Greek        4.8    4.7    0      0.5
German       5.1    2.2    -      -
Dutch        5.3    1.6    -      -
Italian      4.2    2.1    -      -
Catalan      3.5    0.2    -      -
Portuguese   3.4    -      -      -
Spanish      3      -      -      -
Turkish      1.3    -      -      -
Indonesian   0.4    -      -      0.3
Japanese     0.4    -      0.2    -
Estonian     0.3    0.2    -      0.1
Hungarian    0.1    -      -      -

In Polish, obstruent-obstruent is most frequent, while obstruent-sonorant is second most frequent. Although at this level we are not able to tell which phonemes contribute to each pattern, the overall higher presence of obstruents in Polish (a 2:1 obstruent-to-sonorant ratio, as seen in Table 4.16 in the next section) suggests an overall preference for obstruents in Polish as compared to the other languages in our set.

The second most frequent pattern in all languages in which at least two patterns exist (except Polish, which has TL and TT reversed) is TT, and the third most frequent is LL. According to the sonority principle, these two patterns are the second and third most preferred: sonority is constant, then rises towards the vowel; in TTA the rise in sonority is greater than in LLA (A representing a vowel), giving TT a slight advantage in terms of initial-cluster goodness.

Only Czech allows an LT pattern in word-initial position; this pattern comes from the frequent phoneme cluster /js/. This cluster, which violates the sonority principle (sonority drops from sonorant to obstruent, then rises from obstruent to the vowel), occurs in only one language, and its frequency is only 0.6%. Thus the correlation between ‘preference by the sonority principle’ and frequency in the corpus essentially holds for all our languages. (In Russian the difference between TT and LL is very slight.)

For length-3 clusters, as seen in Table 4.11, most languages have TTL as the most frequent pattern; the exception is Bulgarian, for which TTL is second most frequent, the most frequent being TLL. Both of these patterns have rising sonority from the first consonant in C1C2C3V to the nucleus, so it is not possible to decide which one is ‘better’. Requiring a steeper sonority rise closer to the nucleus, if such a requirement existed, could explain why TTL is the more frequent pattern cross-linguistically.

Table 4.11 Word initial length-3 clusters

language    #CCC
Russian     TTL
Czech       TTL
Bulgarian   TLL > TTL
Polish      TTL > TTT
German      TTL
Italian     TTL
Serbian     TTL
Dutch       TTL

Notice also that the second most frequent pattern in Polish is TTT, again suggesting that obstruents are a preferred segment class in Polish.

We can see that languages obey the universal word- (or syllable-) building sonority principles, with some small exceptions (Polish length-2 clusters), choosing the ‘best’ clusters as the most frequent ones. Moreover, the variety of patterns reflects historical language grouping: the Slavic languages have three likely patterns (TL, TT, and LL), the Germanic languages plus Italian allow two patterns, the Romance languages and Turkish only one pattern, and Japanese, the Austronesian languages, and the Uralic languages have no initials of length greater than 1.

That the dispreferred pattern LT occurs in Czech, and with some very small frequency in Indonesian, agrees with the view of Dziubalska-Kołaczyk (2001) that phonotactic conditions are better understood as preferences rather than as constraints.


We thus conclude, based on the results in Tables 4.9-4.11, that – as in our discussion of the distribution of cluster lengths (section 4.2) – initial clusters better reflect their language group than their language rhythm classification. In particular, Italian – traditionally assumed to be of a different rhythm type than German and Dutch – has a very similar word-initial cluster structure to theirs.

b) word-final consonant cluster patterns

Next, let us look at the word-final cluster patterns. We ask whether at this level of representation, obstruent-sonorant-vowel, we can gain any information compared to the CV level examined in section 4.2. Results for length-0 and length-1 final clusters are presented in Table 4.12.

First, notice that, as opposed to initial position, the most preferred, i.e., most frequent,²² final cluster is the length-0 cluster. This finding agrees with the fact that the canonical syllable (the one present in all languages) is CV; more specifically, across our corpora it is TA (obstruent + vowel).

If we order languages by the percentage of length-2 final clusters, we obtain the same grouping as in section 4.2, based only on cluster lengths. So instead, we group languages based on the sum of the percentages of finals of length 0 (A_#) and finals of length 1 whose final consonant is a sonorant (AL#). The sum of these two frequencies is given in the last column of Table 4.12. This grouping is based on the often-discussed fact that sonorants in some sense behave like vowels: they are voiced, can carry pitch, and often have intensity comparable to that of a vowel. Thus, we have ordered our languages based on the percentage of words that end in a sonorous segment.

We can observe three salient groups. The first consists of languages with heavy finals: Hungarian, Estonian, Dutch, and German. The second consists of languages whose words exclusively end in a sonorous segment: Tongan, Samoan, Maori, Hawaiian, and Japanese. The third group contains the intermediate languages, ranging from the German-like group towards the Hawaiian-like group: Russian, Greek, Indonesian, Turkish, Polish, Catalan, Czech, Portuguese, Spanish, Bulgarian, Serbian, and Italian.

22 In Hungarian, the length-0 final cluster ties with the single-obstruent cluster.
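The A_# + AL# ordering just described is a simple sum-and-sort over the per-language pattern frequencies; a minimal sketch (the helper name is mine, and the three rows are copied from Table 4.12):

```python
def sonorant_final_share(freqs):
    """Percentage of words ending in a sonorous element: length-0 finals
    (A_#) plus final sonorants (AL#)."""
    return freqs.get("A_#", 0) + freqs.get("AL#", 0)

# Values copied from Table 4.12 for three of the 21 languages.
finals = {
    "German":   {"A_#": 35, "AL#": 33, "AT#": 23},
    "Serbian":  {"A_#": 82, "AL#": 8,  "AT#": 10},
    "Hawaiian": {"A_#": 100},
}

# Sort languages by increasing share of sonorous-final words.
ordered = sorted(finals, key=lambda lang: sonorant_final_share(finals[lang]))
```

Applied to all 21 languages, this sort reproduces the ordering of Table 4.12, with the heavy-final languages at the top and the exclusively sonorous-final languages at the bottom.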


Table 4.12 Word-final length-0 and length-1 clusters

Language     A_#   AL#   AT#   A_#+AL#
Hungarian    33    19    33    52
Estonian     55    9     25    64
Dutch        37    30    24    66
German       35    33    23    68
Russian      62    14    22    77
Greek        64    14    22    78
Indonesian   46    33    21    79
Turkish      46    34    20    80
Polish       67    14    17    81
Catalan      57    25    17    83
Czech        69    16    14    85
Portuguese   79    6     15    85
Spanish      66    20    14    86
Bulgarian    75    15    11    89
Serbian      82    8     10    90
Italian      88    11    1     99
Japanese     95    6     -     100
Hawaiian     100   -     -     100
Maori        100   -     -     100
Samoan       100   -     -     100
Tongan       100   -     -     100

Since in this ordering Russian is placed closer to the German-like group, and Serbian and Italian closer to the Hawaiian-like group, with Polish and Catalan – often considered mixed- type languages – somewhere in the middle of the group, it appears that this grouping shows a somewhat better match to perception. Naturally, only results of actual perception tests could confirm or disprove this.

Next, let us look at final clusters of length-2. Results are presented in Table 4.13.

Table 4.13 Word-final length-2 clusters

language    LT#   TT#   LL#   TL#
Hungarian   7.7   6.8   -     -
Estonian    3.4   6.1   1     -
Dutch       6.1   3     -     -
German      5.8   2.4   -     -
Polish      -     1.1   -     0.6
Russian     -     1.2   -     -
Czech       -     0.7   -     -
Turkish     0.7   -     -     -


Only languages that have length-2 final clusters are presented in the table; frequencies smaller than 0.5% are omitted.

Based on sonority, LT should be the preferred final cluster, with sonority dropping from the vowel to the sonorant and from the sonorant to the obstruent. A close second is TT, with a bigger sonority drop from the vowel to the first consonant but flat sonority between the two obstruents. The TT combination occurs in 7 out of the 8 languages that have length-2 final clusters, more than the LT pattern, although in the languages that allow both, the frequencies tend to be higher for LT.

Let us therefore order the languages according to length-2 clusters in a fashion similar to the one used in Table 4.12, namely, by the percentage of words that end in a single obstruent: this percentage is the sum of the percentages of the ALT# length-2 clusters and the AT# length-1 clusters. Results are shown in Table 4.14.

Table 4.14 Word-final clusters grouped by sonority*

language     A_#+AL#   AT#+ALT#   ATT#
Hungarian       52        41        7
Estonian        64        28        6
Dutch           66        30        3
German          68        29        2
Russian         77        22        1
Greek           78        22        -
Indonesian      79        21        -
Turkish         80        20        -
Polish          81        17        1
Catalan         83        17        -
Czech           85        14       0.7
Portuguese      85        15        -
Spanish         86        14        -
Bulgarian       89        11        -
Serbian         90        10        -
Italian         99         1        -
Japanese       100         -        -
Hawaiian       100         -        -
Maori          100         -        -
Samoan         100         -        -
Tongan         100         -        -
*Languages ordered by the frequency of final sonorous segments (L or A)

We now see that the same order describes 3 phenomena across our languages: 1) increase in percentage of words that end in a sonorous element, 2) decrease in percentage of words that end in one obstruent, and 3) decrease in percentage of words that end in two obstruents (for values larger than 1%). The order of the languages based on this distribution of segments is shown in Figure 4.15.

Figure 4.15 Distribution of word-final clusters based on sonority: cross-linguistic comparison

[Stacked bar chart: for each language, the proportions (0-100%) of words ending in a sonorant, in a single obstruent, and in an obstruent cluster; languages ordered as in Table 4.14, from Hungarian to Tongan.]

We see that this ordering corresponds to rhythm grouping to the extent of available perception results. German, Dutch, and Russian are close and so are Japanese and the Polynesian languages. Romance and Slavic languages occupy the middle area, with Catalan and Polish on one side (close to German) and Italian and Serbian on the other side (close to Japanese).

It would be interesting to obtain perception similarity results for all pairs of languages so that the groupings based on phonotactics could be discussed in relation to perceptual (or rhythmic) similarity. At the moment, only some coarse observations can be made regarding this relationship.

Word-final clusters of length-3 are rare across languages and within each language. Results are presented in Table 4.15.


Table 4.15 Word-final clusters of length-3

language     CCC#                frequency (%)
German       LTT > TTT > LLT        0.59
Dutch        LTT > TTT > LLT        0.41
Hungarian    LLT > LTT              0.17
Estonian     LTT                    0.06

The most frequent length-3 final clusters in each language constitute less than 1% of all final clusters. Most languages that allow length-3 final clusters have LTT as the most frequent type; Hungarian has ‘LLT’ as the most frequent, and LTT as the second most frequent, differing in frequency only by 0.01%.

German and Dutch have the highest frequency of length-3 final clusters and a wider variety of patterns. It would be very interesting to see how English behaves on these measures in relation to German and Dutch. English has not been included in the sample so far because of its less transparent grapheme-to-phoneme rules.

We can thus see that with a careful analysis at the obstruent-sonorant level, it is possible not only to capture some universal tendencies, such as which patterns are the best fit for word-initial or word-final consonant clusters, but also to notice some correlation between patterns in word-final position and rhythm grouping.

4.3.2 Detailed sonority (saltanajc) level

At the saltanajc level of representation, each phoneme is replaced by its class (manner of articulation) representative. This will allow us to make more subtle distinctions between phonemes (distinguishing stops from fricatives, for instance) while keeping the number of cluster combinations manageable.
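The class-replacement step can be sketched as follows. The mapping table here is a toy illustration (the actual per-language inventories and class assignments are not reproduced, and the affricate class 'c' is omitted), but it shows the mechanics of converting a phoneme string to its saltanajc representation.

```python
# toy phoneme -> manner-class mapping; real inventories differ per language
MANNER = {
    "t": "ptkbdg",   # stops
    "s": "fvszh",    # fricatives
    "n": "mn",       # nasals
    "l": "lr",       # liquids
    "j": "jw",       # semivowels
    "a": "aeiou",    # vowels
}
PHONEME_TO_CLASS = {p: cls for cls, phones in MANNER.items() for p in phones}

def to_saltanajc(word):
    # symbols outside the toy mapping (e.g. affricates, class 'c') map to '?'
    return "".join(PHONEME_TO_CLASS.get(p, "?") for p in word)
```

For example, `to_saltanajc("stran")` yields `"stlan"`, from which clusters such as initial 'st' can be read off directly.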

We again look at the possible consonant clusters at three positions in a word and compare them against preferred clusters in each position as defined by sonority distance (Dziubalska-Kołaczyk 2001). These preference criteria are proposed to be universal, that is, independent of language. We check for each language 1) whether only preferred clusters occur at each position, 2) whether ‘better’ clusters, determined using the sonority distance principle, occur more frequently in each corpus, and 3) whether ‘better’ clusters occur in a larger number of languages.

We then compare clusters across languages to determine phonotactic similarity. Since the number of possible clusters will affect the absolute frequencies (more combinations, lower frequencies overall), we compare clusters based on their rank. To keep comparisons manageable, we compare the number of allowed clusters in a language and the three most prominent clusters in that language against those in the other languages.
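A minimal sketch of this rank-based comparison might look like the following, assuming per-language cluster-frequency dictionaries. The similarity score and the example frequencies are hypothetical, meant only to illustrate comparing top-ranked clusters and inventory sizes rather than absolute frequencies.

```python
def top_clusters(freqs, k=3):
    # the k most frequent clusters, by descending frequency
    return [c for c, _ in sorted(freqs.items(), key=lambda kv: -kv[1])[:k]]

def similarity(freqs_a, freqs_b, k=3):
    # crude overlap score: shared clusters among each language's top-k,
    # minus a small penalty for differing inventory sizes (hypothetical weighting)
    shared = len(set(top_clusters(freqs_a, k)) & set(top_clusters(freqs_b, k)))
    size_gap = abs(len(freqs_a) - len(freqs_b))
    return shared - 0.1 * size_gap

# illustrative frequencies (percent), not the dissertation's actual values
german = {"ts": 5.0, "st": 3.1, "ss": 2.0, "tl": 1.8, "sl": 1.2}
dutch  = {"tl": 4.0, "sl": 2.5, "st": 2.2}
```

Comparing ranks rather than raw percentages sidesteps the fact that languages with more cluster types necessarily have lower frequencies per type.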

First, let us look at the overall frequencies of saltanajc classes. These are given in Table 4.16. Numbers are rounded to the nearest percent. A value of 0 is given for classes that do not occur, or for those that occur less frequently than 0.5%. Languages are ordered in increasing order of vowel percentage.

Table 4.16 Saltanajc frequencies in 21 languages

language     vowels  stops  fricatives  affricates  nasals  liquids  semivowels
Dutch          39      20       17          0         11       11        3
German         40      18       21          1         15        4        0
Russian        40      20       13          2          8        8       10
Hungarian      41      22       12          3         11        9        2
Czech          42      19       15          2          8       10        3
Polish         42      18       16          5          8        6        6
Estonian       43      21       14          0          9       10        2
Turkish        43      17       10          2         11       13        3
Bulgarian      44      21       13          2          8        9        3
Serbian        44      17       14          3         10        9        2
Indonesian     45      22        7          2         16        8        1
Catalan        45      14       16          0         11       13        0
Italian        46      19        9          2         10       13        0
Spanish        46      17       14          0         10       13        0
Greek          47      16       20          0         10        7        0
Portuguese     50      17       17          3          6        7        0
Japanese       52      18        9          2         12        4        2
Maori          55      20        6          0         11        6        1
Tongan         56      22       10          0          9        3        0
Hawaiian       57      20        6          0         11        5        2
Samoan         62       9        8          0         10       10        0

We see from the table that certain classes of sounds are unexpectedly frequent in some languages. For example, Russian stands out by its high frequency of semivowels; Greek, Portuguese, German, and Dutch by their proportion of fricatives.


These articulatory class frequencies express to a certain extent similarities within a language family. Knowing, for instance, that Portuguese /rr/ is mapped into a fricative in Brazilian Portuguese (studied here), we see that the Romance languages can be characterized by a high frequency of liquids (in particular, rhotics).

Note that while more likely combinations of segments according to sonority principles are expected to guide which cluster types are present in a language corpus, the occurrence of a cluster containing a certain segment will also depend on the frequency of that segment in the language. So, if affricates are rare in a language, that is, they occur with low frequency as singletons (clusters of length-1), then the frequency of a cluster such as ‘lc’ can be expected to be low as well.
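This point can be made concrete with an independence baseline: if segment classes combined at random, a cluster's expected frequency would simply be the product of its class frequencies, so rare affricates alone predict a rare ‘lc’ even before sonority preferences are considered. The numbers below are illustrative, not taken from the corpora.

```python
def expected_cluster_freq(class_freqs, cluster):
    # expected percentage of a cluster under independence of its classes
    p = 1.0
    for cls in cluster:
        p *= class_freqs[cls] / 100.0
    return 100.0 * p

# hypothetical class percentages: liquids 10%, affricates 1%
freqs = {"l": 10.0, "c": 1.0, "t": 20.0}
expected_cluster_freq(freqs, "lc")   # only 0.1% under independence
```

Deviations from this baseline, rather than raw frequencies, would then indicate genuine phonotactic preference or dispreference.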

Results for length-2 clusters converted to ‘saltanajc’ representation level can be viewed in Tables 4.17, 4.19, and 4.21 for the initial, final, and medial position respectively. The sonority-preferred clusters in each position are listed at the bottom of each table in order of predicted preference at that position.

a) word-initial consonant cluster patterns

Table 4.17 presents clusters in word-initial position for languages in which length-2 initial clusters occur. There are 16 such languages in our set, but in only 12 do length-2 clusters of a certain type occur more than 0.5% of the time. Clusters listed in parentheses have frequencies less than 1%.

Sonority-preferred clusters in initial positions are listed at the bottom of the table; they are presented in lighter orange (gray) if they do not occur in the corpus. Preferred clusters are highlighted in orange for each language in which they occur. Clusters that do occur, but are not predicted (not considered preferred in initial positions), are given in blue (dark).

The first thing to notice is that both among the higher frequency clusters and lower frequency clusters (those provided in parentheses) there is a mixture of clusters preferred in the initial position and clusters that are preferred in other positions.

Next, we see that the only predicted preferred cluster in initial position that has not been found in the data is ‘cl’, i.e., an affricate followed by a liquid. One possible explanation for the lack of this cluster is the inherent complexity of affricates, which are thought by some linguists to be sequences of two segments. Indeed, initial clusters with affricates that occur more frequently than 0.5% of the time in our sample are limited to a single pattern in Polish (‘sc’) and Russian (‘ct’). While ‘cl’ is attested in Serbian, as in the words ‘član’ ‘member’, ‘članak’ ‘article’, ‘članarina’ ‘membership fee’, its frequency even in the Serbian corpus is low and does not place it in Table 4.17.

Table 4.17 Clusters of length-2 in ‘saltanajc’ scale: word-initial position

language     Initial CC clusters
Dutch        tl > sl > st > (tj > ss > sn)
German       ts > st > ss > tl > sl > (sn > cs > tn)
Bulgarian    tl > st > sl > ss > tj > sn > nj > (ts > nn > sj > cj)
Czech        tl > sl > tt > ss > st > nn > ts > sn > tj > (js > tn > nl > sj > sc > tc > jt)
Polish       ts > tl > st > ss > sn > tt > sj > sc > sl > tj > (tn > cs > nl > nn)
Russian      tl > nj > tj > st > sj > ss > ct > sl > ts > lj > sn > tt > (lm > ls)
Serbian      tl > st > ss > sl > sn > (ts > nl > tt > nn > tn)
Catalan      tl > sl > (st > nt > ls > ns)
Italian      tl > st > (sl)
Portuguese   tl > sl
Spanish      tl > (sl > tj)
Estonian     (tl)
Hungarian    (tl)
Greek        tl > st > ss > sl > ts > (sn > ns > tt > nt)
Indonesian   (sj > nt > tl)
Japanese     (sj > tj > lj > nj)

PREFERRED tj > cj > sj = tl > nj = cl > lj = sl = tn

Due to the large number of clusters in initial position, let us consider a simplified table from which low frequency clusters have been omitted. This is Table 4.18 in which languages are ordered to reflect the increasing number of patterns.

Notice that cross-linguistically ‘tl’ and ‘sl’ are the most common among the preferred clusters: they occur in all languages that allow length-2 initial clusters, even though in Italian ‘sl’ is in the low frequency group. Thus, by a frequency account, these are the best length-2 initials.


Also, the cluster ‘st’ – which is not OSSP-preferred in initial position – actually occurs in all but two languages, Spanish and Portuguese, with Catalan having it among the low frequency clusters. Another frequent pattern is ‘ss’; this pattern occurs in languages other than Romance, being in the low frequency group in Dutch.

Table 4.18 Clusters of length-2 in ‘saltanajc’ scale: word-initial position (re-arranged)

language     Initial CC clusters
Spanish      tl
Catalan      tl > sl
Portuguese   tl > sl
Italian      tl > st
Dutch        tl > sl > st
German       ts > st > ss > tl > sl
Greek        tl > st > ss > sl > ts
Serbian      tl > st > ss > sl > sn
Bulgarian    tl > st > sl > ss > tj > sn > nj
Czech        tl > sl > tt > ss > st > nn > ts > sn > tj
Polish       ts > tl > st > ss > sn > tt > sj > sc > sl > tj
Russian      tl > nj > tj > st > sj > ss > ct > sl > ts > lj > sn > tt

PREFERRED tj > cj > sj = tl > nj = cl > lj = sl = tn

It is not surprising that the languages within each historical group, Romance, Germanic, or Slavic, pattern together. Five Slavic languages occupy the higher complexity end of the table with many different patterns allowed. This is consistent with having a high frequency of length-2 initial clusters – the results seen in section 4.2. Many of the patterns are shared, although the order of preference (frequency) differs.

Differences in cluster pattern complexity within historical groupings can be noted too. Spanish is the least complex in the Romance group and Serbian – in the Slavic group. Dutch is less complex, i.e., has fewer patterns, than German.

In terms of distinguishing patterns, Slavic languages are the only ones that employ the ‘sn’ initial pattern, and German and Polish are similar because of the high frequency (top-ranked in both languages) of the ‘ts’ (stop followed by fricative) pattern, which is assumed not to be preferred in initial position. Also, the Slavic languages (Serbian excluded) are the only ones that employ patterns with semivowels. However, although such patterns are predicted to be most favorable in initial position, they occur at the lower frequency end in each language.


Since German, Dutch, and Russian, which are considered rhythmically similar, do not have highly similar cluster patterns, phonotactic patterning in length-2 initial clusters does not seem to be related to rhythmic properties, although it reveals a different kind of similarity. Such similarity can be used to distinguish or group languages based on their unaltered samples.

b) word-final consonant cluster patterns

Table 4.19 presents clusters in word-final position for languages in which length-2 final clusters occur. There are only 12 such languages and in only 6 do length-2 clusters of a certain type occur more than 0.5% of the time. Clusters listed in parentheses have frequencies less than 0.5%.

Table 4.19 Clusters of length-2 in ‘saltanajc’ scale: word-final position

language     Final CC clusters
Dutch        nt > st > lt > ls > tt
German       nt > st > ls > ns
Bulgarian    (lt > st > ls)
Czech        st > (nc)
Polish       (sc > tj > st)
Russian      st > (lt > sn)
Serbian      (st > nc > tl)
Catalan      (nt > lt > st)
Estonian     st > ts > nt > lt > ll > (tt > ln)
Hungarian    tt > lt > nt > st > jt > (ll > nc > tj)
Greek        (ts)
Turkish      (lt > st > ns > nc)

PREFERRED jt > jc > js = lt > jn = lc > jl = ls = nt

Preferred clusters in final positions are listed at the bottom of the table; they are presented in lighter orange (gray) if they do not occur in the corpus. Preferred clusters are highlighted in orange for each language in which they occur. Clusters that occur but are not predicted (not considered preferred at final positions) are shown in blue (dark).

First, we can notice the absence of clusters with semivowels, apart from ‘jt’ in Hungarian, although such clusters are predicted to be most favorable in word-final position.


This can be explained partly by the absence of semivowels from the phoneme inventories (I included semivowels as a part of a diphthong in some languages). However, even in Russian, where the proportion of semivowels is high (10%), such clusters do not occur.

The sonority distance criterion for word-final length-2 clusters states that in a VC1C2 word end, the first consonant C1 should be closer in sonority to the vowel V than to the second consonant C2; it is possible that semivowels are easily attracted to form a diphthong with the vowel in the nucleus.

Another predicted cluster that does not occur is ‘lc’. This cannot be explained by the low frequency of affricates because the clusters ‘sc’ and ‘nc’ do occur, albeit with low frequencies.

Clusters that do occur include ‘lt’, ‘nt’, and ‘ls’. In addition, we find clusters that are considered preferred in positions other than final, in particular ‘st’ which occurs with high frequency in most languages.

Similarity seems to run along language family lines. Similar pairs include German and Dutch; Estonian and Hungarian; and Czech and Russian. Languages traditionally labeled syllable-timed group together by virtue of not having length-2 final clusters.

A simplified table for word-final clusters, analogous to Table 4.18 for initial clusters, is presented as Table 4.20. In this table, Dutch, German, Hungarian, and Estonian appear to form a grouping. It is interesting to notice that Dutch, German, Hungarian, and Estonian do have something in common phonologically: salient word stress, albeit in different positions, word-initially for Estonian and Hungarian, and somewhat more variably in German and Dutch.

Table 4.20 Clusters of length-2 in ‘saltanajc’ scale: word-final position (re-arranged)

language     Final CC clusters
Dutch        nt > st > lt > ls > tt
German       nt > st > ls > ns
Estonian     st > ts > nt > lt > ll
Hungarian    tt > lt > nt > st > jt
Czech        st
Russian      st

PREFERRED jt > jc > js = lt > jn = lc > jl = ls = nt


What distinguishes German and Dutch from Estonian and Hungarian is that the most frequent final pattern in German and Dutch is the sonorant-obstruent cluster ‘nt’, which is predicted as a preferred final cluster, while in Estonian and Hungarian the most frequent patterns are the obstruent-obstruent ‘st’ and ‘tt’ respectively.

c) word-medial consonant cluster patterns

Word-medial consonant patterns of length-2 are presented in Table 4.21. Languages are ordered in decreasing order of number of patterns. The frequency of complex medial clusters (length 2 or greater) is given in the last column. Preferred clusters in medial position are given in the bottom row. Those that do not occur in the text are shown in lighter color. Note that there is an overlap between word-medial and word-final preferred clusters.

Table 4.21 Clusters of length-2 in ‘saltanajc’ scale: word-medial position

language     Word-medial clusters of length-2                                     Frequency (%)
Dutch        lt > nt > st > ls > ns > sl > nn > lj > sj > ss > ts > tt > ll > tl      44
Hungarian    tt > lt > nt > st > ss > ln > ts > ls > ll > nn > js > tj                42
Bulgarian    st > tn > tl > lt > sn > nj > ts > ls > sl > lj > ln                     32
Turkish      lt > nt > sl > tl > st > tt > ll > nl > tn > ls > ns                     33
Russian      st > tj > sj > tl > nj > lj > tn > tt > sn > lt                          39
Italian      nt > tt > ll > lt > ss > st > cc > tl > nn > ln                          42
Estonian     st > ll > ts > lt > tt > nt > nn > ln > tl > lj                          30
Polish       st > sn > tl > tn > ts > sc > lt > tt > nt                               28
German       nt > st > lt > tt > ts > ns > nn > nc                                    34
Greek        st > ss > ls > ts > sl > sn > tl > tt                                    21
Czech        st > tl > sn > sl > tn > tt > lt                                         30
Serbian      st > tl > sn > tn > sl                                                   22
Catalan      nt > st > lt > ns > sl                                                   27
Spanish      nt > st > lt > tl > ns                                                   33
Indonesian   nt > lt > nc > ls > ln                                                   22
Portuguese   st > tl > sc > sn                                                        17
Japanese     tt > nt                                                                   8

PREFERRED tt > cc = tc > ss = cs = ct = ts > nn = sn = sc = cn > ll = nl = ns = st > jj = ln = nc > jl = ls = nt (the last three are also preferred in final position)


As before, preferred clusters that occur in the corpora are marked in orange. Clusters that are not predicted as preferred medially but nevertheless occur in the text are presented in black. All reported clusters occur more than 1% of the time.

The most frequent clusters that are predicted in positions other than medial are initial-preferred ‘tl’ and final-preferred ‘lt’, followed by initial-preferred ‘tn’ and ‘sl’. By comparing frequencies within each corpus and occurrence across languages, we notice that while ‘tl’ is a very good initial, it is also a reasonably good medial. Similarly, ‘lt’ is both a good final (as predicted) and a very good medial.

On the other hand, ‘tn’ which is predicted to be preferred in initial position is not a very good initial according to the frequency argument, while it seems to be a reasonably good medial. This suggests that some adjustments of the criteria for preferred position might be needed.

Among the predicted best medials, we find ‘tt’, ‘cc’, and ‘tc’. Indeed, ‘tt’ is one of the most frequent patterns, together with ‘ls’, ‘nt’, ‘st’, ‘ts’, and ‘nn’. However, ‘tc’ is not attested in the corpora at higher than 1% frequency, while ‘cc’ occurs only in Italian, and only as a geminate. This could be related to the previously mentioned concern that affricates should be considered as complex segments and thus ‘tc’ or ‘cc’ should be considered as clusters of length 3 and 4. Alternatively, affricates, which constitute only a small percentage of segments, have proportionally small joint occurrences in clusters. Finally, the high articulatory effort of producing an affricate next to another consonant could be considered as a reason for the low occurrence of clusters that include affricates. Clusters ‘cs’, ‘ct’, and ‘cn’ are also not attested in the corpora.

Finally, although there is a slight tendency of Romance (and therefore ‘syllable-timed’) languages to have a smaller number of phonotactic patterns among length-2 medials, while the traditionally called mora-timed languages have none or very few length-2 patterns, there is no clear way to associate rhythm type with number of phonotactic patterns for the remaining languages. In particular, German and Dutch are quite far apart on the scale of number of patterns, with Italian, Bulgarian and several other languages in between.

There is also no striking similarity in terms of number of possible patterns within language families other than Polynesian and Romance.


In conclusion, looking at phonotactic patterns reveals several properties of cluster and word complexity across corpora. In addition, the obstruent-sonorant-vowel level of representation captures rhythmic similarity better than the more detailed saltanajc level; the saltanajc representation reveals segmental similarities that correlate with similarities within language families.

If vowels and sonorant consonants are grouped together and then the patterns of final clusters compared across corpora, we do find similarities with rhythm type grouping.

Also, the frequency of length-1 and length-0 clusters confirms that the best initial is a single obstruent and the best final is a length-0 cluster, which corresponds to an open syllable. These findings support the fact that CV is the most frequent syllable type cross-linguistically and, in particular, argue for TA (obstruent-vowel) as the best type.

At the more detailed level of sonority representation, we find similarities that reflect language family grouping. Two criteria for similarity can be used: distinguishing patterns, and number of different patterns. This holds for cluster patterns in word-initial and word-final positions, while the patterning is somewhat more complex for word-medial clusters.

Goodness of clusters at the coarser obstruent-sonorant-vowel level is verified, with clusters that are better according to sonority being more frequent both in individual corpora and across corpora.

At the more detailed, saltanajc, level, predicted preferred clusters in each position of the word are loosely supported by the data. Many of the predicted clusters occur at the specified position; however, clusters also occur in non-preferred positions, sometimes with very high frequencies. This suggests that, while criteria based on sonority distance give a good start for a measure of goodness of clusters at certain positions, some modifications are needed in order to account for frequencies in language corpora.

4.4 Word length distributions

In this section we study differences in word length distributions across languages. We then examine whether there is a relation between rhythm type and word length, and between consonant clustering type and word length.


There are two reasons to study word length. One relates to segmentation: if a language consisted only of words of a given length, segmentation into words would be fairly easy, requiring only that one length be found. Variation in word length makes such uniform segmentation impossible. However, speech rate cues might be helpful, as explained later in this section.

Another reason to study word lengths relates to a possibly interesting relation with language rhythm. It has been proposed that languages which regulate stress intervals tend to keep word duration consistent by shortening the durations of syllables in longer words, while languages that regulate syllable intervals tend to let word duration increase proportionally to the number of syllables. However, it has been found that speakers of all languages tend to accommodate long words by uttering them faster. Thus, it is possible that variation in word length contributes to the overall duration variability in speech when words of different lengths (expressed in number of syllables) alternate in a sentence.

Word length, however, whether expressed as the number of syllables or the number of phonemes, is not the only factor expected to influence duration. Any inherent language tendency to downplay (reduce) all but one syllable in a word may also play a role, as will the internal organization of phonemes into syllables.

Thus, as proposed in Chapter 1, we ask the following two questions: 1) Do languages that are considered similar rhythmically have similar word length distributions? 2) Does word-length distribution correlate with cluster length distribution?

In fluent speech we are exposed to words in succession; thus, as in previous chapters, we will look at the set of all word tokens when determining the length distribution. For comparison, we will also look at the distributions of word lengths among lexical items (no word counted more than once).

Because the duration of a word depends both on the number of phonemes comprising the word and the number of syllables into which it can putatively be divided, I present distributions for word-types as well as for word-tokens in terms of both independent variables: number of phonemes and number of syllables.


Overall, we have 4 different variables for each language sample: Frequency of words of N syllables in running text, Frequency of lexical items of N syllables, Frequency of words of N phonemes in running text, and Frequency of lexical items of N phonemes. These have been calculated on the corpus of 21 languages. We are most interested in the frequency in the running text because it corresponds to what we hear in speech and what we use to decide which languages sound more similar.
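The four variables can be computed with a sketch like the following. The vowel-letter syllable counter and the sample words are crude stand-ins for the per-language syllabification actually required, so the code illustrates only the bookkeeping.

```python
from collections import Counter

VOWELS = set("aeiou")

def n_syllables(word):
    # crude syllable count: number of vowel letters (minimum 1);
    # a real analysis would use language-specific syllabification
    return sum(ch in VOWELS for ch in word) or 1

def length_distributions(tokens):
    # the four distributions: tokens vs. lexical items (types),
    # in syllables and in phonemes (here approximated by letters)
    types = set(tokens)
    return {
        "token_syllables": Counter(n_syllables(w) for w in tokens),
        "type_syllables":  Counter(n_syllables(w) for w in types),
        "token_phonemes":  Counter(len(w) for w in tokens),
        "type_phonemes":   Counter(len(w) for w in types),
    }

# hypothetical mini-corpus
dists = length_distributions(["na", "na", "ka", "honu"])
```

Note how the repeated monosyllable "na" inflates the token distribution but counts only once among types, which is exactly the type/token contrast discussed below.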

I present results for the 4 variables for each of the 21 languages in Figures A4.1-A4.21 in Appendix 4. Here, I show the summary figures that group languages with similar behavior. The distribution of word-lengths across lexical items is presented in 4 graphs in Figure 4.16, and the distribution of all words in the text, again in 4 graphs corresponding to 4 different groups, in Figure 4.17.

Figure 4.16 Distribution of word-lengths: lexical items

The first group includes German, Dutch, Catalan, and Maori. Languages in this group have high frequency of monosyllabic lexical items while the most frequent words are disyllabic. Group 4 (bottom right), which includes Estonian, Czech, and Hawaiian, also has disyllables as most frequent, but the monosyllables are not as frequent as in the first group.


In the other two groups (top right and bottom left), most frequent lexical items are tri- syllabic. In the third group (bottom left), the ratio of disyllabic to trisyllabic words is closer to one than in the second group (upper right).

While the most frequent word-lengths among different lexical items are 2 or 3 syllables, in many languages monosyllabic words are so frequent that a small number of them dominates the distribution of word tokens (represented in Figure 4.17). Monosyllables are dominant in running speech (text) for 15 out of the 21 languages in our sample. For the Italian, Polish, and Russian texts, monosyllables are as frequent as disyllables. Only in three languages, Estonian, Turkish, and Indonesian (represented in the bottom left graph), are monosyllables less frequent than disyllables.

Figure 4.17 Distribution of word-lengths: word tokens

Dutch, German, Maori, and Catalan have a high frequency of monosyllabic words among lexical items and among word-tokens (upper left graph in both Figure 4.16 and Figure 4.17). This suggests a possible correlation between having short words in the lexicon and being stress-timed.23

23 The large percentage of monosyllables in Maori is possibly a result of over-counting diphthongs.


To assess the differences in word-length distributions, we can also compare the average word length across languages. Results for word tokens are presented in Figures 4.18 and for lexical items in Figure 4.19.

[Bar chart: average word length measured in number of syllables, word tokens; languages ordered by increasing average length.]

Figure 4.18 Average word-length: word tokens

[Bar chart: average word length measured in number of syllables, lexical items; languages ordered by increasing average length.]

Figure 4.19 Average word-length: lexical items
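Averages over tokens and over lexical items generally differ, since a frequent short word is counted at every occurrence among tokens but only once among types. A sketch with hypothetical per-word syllable counts:

```python
def avg_len(words, count):
    # mean word length under a given per-word length function
    return sum(count(w) for w in words) / len(words)

# hypothetical syllable counts per word
count = {"na": 1, "honu": 2, "aloha": 3}.get
tokens = ["na", "na", "honu", "aloha"]

token_avg = avg_len(tokens, count)              # 'na' counted twice
type_avg = avg_len(sorted(set(tokens)), count)  # each word once
```

Here the token average (1.75) is pulled down relative to the type average (2.0) by the repeated monosyllable, mirroring the contrast between Figures 4.18 and 4.19.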


We notice that in Figure 4.18, where languages are ordered by the average length of word tokens (that is, the average over all words in running text), a potential grouping somewhat resembles traditional rhythm types. Namely, the rightmost five (Japanese and the Polynesian languages, Hawaiian, Samoan, and Tongan) have the longest words, while the leftmost three (Catalan, Dutch, and German) have the shortest words.

However, this should not be interpreted without taking into account other factors. On one hand, long words can be expected when a language has a small phoneme inventory along with simple syllable structure. A small phoneme inventory can explain the existence of long words in the Polynesian languages, but not in Japanese; relatively simple syllable structure can.

Dutch, German, and Catalan all have relatively large phoneme inventories and moderate to complex syllable structures (see Table A1.1 in Appendix 1). In particular, they are distinguished by their large inventories of different vowel qualities (which allows for a large number of nuclei). A large number of consonant qualities in conjunction with complex syllable structure allows for a large number of possible onsets and codas. Thus, meanings can be expressed by words with smaller numbers of syllables.

In addition, agglutinative languages like Turkish, Hungarian, and Estonian have long words due to their agglutinative morphology.

Word lengths then seem to relate to very basic phonological and morphological properties. If word length also groups languages by their rhythm, as Figure 4.18 suggests, then morphology and basic phonological structure together can be said to influence prosodic properties. A relation between rhythm type, morphological properties, and word order is discussed for the Munda and Mon-Khmer languages in Donegan and Stampe (2004), and a relation between word order and syllable complexity in Tokizaki and Yasumoto (2012).

4.5 Variability of measures over different materials

One of the frequently addressed issues concerning rhythm metrics is their variation over different speech materials (see Arvaniti 2012, for instance), among many other factors. Authors have used different kinds of texts, trying to maximally differentiate syllable complexity over different test items (texts), and then report values of rhythm metrics. Arvaniti finds that


variation over different texts can be larger than variation over different languages, thus rendering the value of rhythm metrics for language classification minimal.

It is then important to examine whether phonotactic metrics, as well as the other measures used for classification in this dissertation vary over types of materials as well. Measures we test include phonotactic metrics, word complexity expressed as distributions of word-initial, word-final, and word-medial clusters, and distribution of word lengths.

I chose four representative languages: German, Italian, Hawaiian, and Serbian. German is traditionally considered stress-timed and Italian syllable-timed; Hawaiian is assumed here to be mora-timed due to its simple syllable structure and its vowel length distinction; Serbian is traditionally assumed to be stress-timed, but on many measures examined in this dissertation it was closer to the traditional syllable-timed languages.

In the following, I present tables in which the variables are calculated over 3 different texts: one is the 10,000 word text for which results are reported in Chapter 4. In addition, I used one long and one short text for each language. I present results for each of the three texts, followed by the average over them.

The first table, Table 4.22, reports values of the phonotactic metrics.

Table 4.22 Variability of phonotactic metrics over different texts*

language    Nwords    %V    VarcoV   nPVI-V   stdevC   rPVI-C
German      10000    38.9    49.8    105.8    147.9     74.1
German      16043    39.2    48.7    105.6    145.7     73.5
German        108    37.7    29.4    101.5    153.4     72.2
average              38.6    42.7    104.3    149.0     73.2
Serbian     10000    46.3    52.1    106.9     97.3     44.6
Serbian     29456    46.3    49.2    106.0     93.8     42.8
Serbian        92    46.3    50.1    106.5     89.0     40.7
average              46.3    50.4    106.4     93.3     42.7
Italian     10000    46.3    55.7    108.6    105.1     50.2
Italian     55104    46.2    55.6    108.6    105.4     50.6
Italian      1633    46.3    58.1    109.4    106.8     53.5
average              46.3    56.5    108.8    105.8     51.4
Hawaiian    18476    56.6    77.4    116.5      0.0      0.0
Hawaiian     3052    56.1    72.2    116.4      0.0      0.1
Hawaiian      104    56.6    72.0    117.5      0.0      0.0
average              56.4    73.9    116.8      0.0      0.0
*Results are presented for four test languages: German, Hawaiian, Italian, and Serbian


We see from this table that the values of each metric in each text for a given language are very close to the average value over the three texts, except for the Varco-V value of one of the German texts. On the one hand, this is a text of very short length, only 108 words. It is possible that this text is an outlier; it is also possible that Varco-V needs a sufficiently long text in order to be stable.

It is interesting, however, that we do not see such a discrepancy for the other metrics or for Varco-V in the other languages. This is particularly interesting because rhythm metrics have been reported to vary considerably over different materials (usually short, comparable to the shortest texts reported in Table 4.22). If phonotactic metrics are stable (or similar), and rhythm metrics are variable over different materials of short length (say, about 100 words), then two scenarios are possible: 1) the syllable (or word) complexity in the materials used to show variation does not represent the usual word complexity for that language, that is, it artificially stretches from very simple to very complex; or 2) it is something other than syllable/word complexity that causes the variation. The second scenario would suggest that the durational variation contributed by prosody varies over these different materials and that longer materials need to be used in such studies – not only because of varying syllable complexity (or not at all because of that), but also because of varying prosodic structure within a single language. This second scenario agrees with the view that the rhythmicity of a language cannot be captured in only a few sentences, even though those few sentences may be characteristic of a particular language (or a group of languages).

Next, Tables 4.23-25 report results for the variability over different texts of consonant cluster complexity in word-initial, word-medial, and word-final positions respectively.


Table 4.23 Variability of consonant cluster complexity in word-initial position*

language   Nwords   #_V    #CV    #CCV   #CCCV
German     10000    26.9   64.8    7.3    1.0
German     16043    26.1   65.5    8.2    0.2
German       108    20.4   71.3    6.5    1.9
average             24.5   67.2    7.3    1.0
Serbian    10000    22.2   63.1   13.9    0.8
Serbian    29456    19.6   65.1   14.7    0.6
Serbian       92    16.7   64.8   18.5    0.0
average             19.5   64.3   15.7    0.5
Italian    10000    24.6   68.5    6.3    0.7
Italian    55104    25.4   69.3    5.0    0.4
Italian     1633    24.4   69.8    5.4    0.4
average             24.8   69.2    5.6    0.5
Hawaiian   18476    27.4   72.6    0.0    0.0
Hawaiian    3052    27.2   72.8    0.0    0.0
Hawaiian     104    28.9   71.2    0.0    0.0
average             27.8   72.2    0.0    0.0

*Results are presented for four test languages: German, Hawaiian, Italian, and Serbian.

All the cluster-distribution values look stable, varying minimally from the average, irrespective of the length of the sample. This suggests that phonotactics, or more precisely, the durational phonotactic component defined in Chapter 2, is stable and thus can be easily learned by exposure to a language.

Table 4.24 Variability of consonant cluster complexity in word-medial position

language   Nwords   V_V    VCV    VCCV   VCCCV
German     10000    10.5   55.7   28.9    4.7
German     16043     8.5   57.1   28.5    5.2
German       108     2.0   62.0   30.0    6.0
average              7.0   58.3   29.1    5.3
Serbian    10000     3.9   74.1   20.1    1.8
Serbian    29456     4.3   75.8   18.8    1.0
Serbian       92    13.0   69.6   17.4    0.0
average              7.1   73.2   18.7    0.9
Italian    10000     6.5   52.0   39.8    1.7
Italian    55104     6.1   53.0   38.6    2.3
Italian     1633     7.4   52.7   36.4    3.5
average              6.7   52.6   38.3    2.5
Hawaiian   18476    18.4   81.6    0.0    0.0
Hawaiian    3052    16.1   83.9    0.0    0.0
Hawaiian     104    17.8   82.2    0.0    0.0
average             17.5   82.6    0.0    0.0


Table 4.25 Variability of consonant cluster complexity in word-final position*

language   Nwords   V_#    VC#    VCC#   VCCC#
German     10000    34.6   47.5   15.1    2.7
German     16043    35.6   46.1   15.5    2.7
German       108    38.0   41.7   15.7    4.6
average             36.0   45.1   15.4    3.3
Serbian    10000    81.7   17.9    0.4    0.0
Serbian    29456    81.4   18.3    0.3    0.0
Serbian       92    77.8   22.2    0.0    0.0
average             80.3   19.5    0.2    0.0
Italian    10000    88.2   11.8    0.0    0.0
Italian    55104    87.1   12.8    0.0    0.0
Italian     1633    87.6   12.4    0.0    0.0
average             87.6   12.3    0.0    0.0
Hawaiian   18476   100.0    0.0    0.0    0.0
Hawaiian    3052   100.0    0.0    0.0    0.0
Hawaiian     104   100.0    0.0    0.0    0.0
average            100.0    0.0    0.0    0.0

*Results are presented for four test languages: German, Hawaiian, Italian, and Serbian.
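The per-position percentages in Tables 4.23-4.25 come from tallying consonant-cluster lengths at each word position. A minimal sketch of such a count over words coded as CV strings follows; this is a hypothetical helper written for illustration, not the dissertation's Complexity Calculator:

```python
import re
from collections import Counter

def cluster_profile(words):
    """Tally word-initial, word-medial, and word-final consonant-cluster
    lengths over a list of CV-coded words. A medial count of 0 corresponds
    to the V_V (hiatus) column; initial/final 0 to #_V and V_#."""
    initial, medial, final = Counter(), Counter(), Counter()
    for w in words:
        m = re.match(r'(C*)V', w)          # consonants before the first vowel
        if m:
            initial[len(m.group(1))] += 1
        m = re.search(r'V(C*)$', w)        # consonants after the last vowel
        if m:
            final[len(m.group(1))] += 1
        for mm in re.finditer(r'(?<=V)(C*)(?=V)', w):   # intervocalic runs
            medial[len(mm.group(1))] += 1
    return initial, medial, final
```

Normalizing each Counter by its total and multiplying by 100 yields percentages of the kind reported in the tables.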

Finally, Tables 4.26 and 4.27 report on the variability of word-length distributions over different texts, where word length is defined as the number of syllables in the word.

Table 4.26 Variability of word-length distribution: word tokens*

language   Nwords     1      2      3      4      5
German     10000    66.2   27.8    4.6    1.3    0.1
German     16043    56.6   31.7    8.6    2.7    0.3
German       108    65.7   23.2   10.2    0.9    0.0
average             62.9   27.5    7.8    1.6    0.1
Serbian    10000    36.4   32.2   20.6    9.0    1.7
Serbian    29456    38.0   34.9   19.5    6.6    1.0
Serbian       92    40.2   31.5   20.7    5.4    2.2
average             38.2   32.9   20.2    7.0    1.6
Italian    10000    36.4   36.0   19.4    6.1    1.9
Italian    55104    34.1   32.2   22.5    8.6    2.2
Italian     1633    35.4   35.4   19.2    7.2    2.6
average             35.3   34.5   20.4    7.3    2.3
Hawaiian   18476    48.3   34.2   10.4    4.8    1.4
Hawaiian    3052    46.5   28.1   11.4    4.5    3.6
Hawaiian     104    51.9   26.0    9.6    2.9    4.8
average             48.9   29.4   10.5    4.0    3.3

*Results are presented for four test languages: German, Hawaiian, Italian, and Serbian.


Table 4.27 Variability of word-length distribution: lexical items

language   Nwords     1      2      3      4      5
German     10000    36.0   44.3   15.0    4.3    0.3
German     16043    23.3   42.6   24.1    8.5    1.2
German       108    59.2   29.6    9.9    1.4    0.0
average             39.5   38.8   16.3    4.7    0.5
Serbian    10000     5.0   33.6   37.5   19.5    4.1
Serbian    29456     4.7   33.7   39.9   18.5    3.0
Serbian       92    27.3   36.4   23.6    9.1    3.6
average             12.3   34.6   33.7   15.7    3.6
Italian    10000     3.5   33.6   38.1   18.7    5.4
Italian    55104     1.1   21.2   40.9   27.2    8.2
Italian     1633     7.7   38.9   32.7   15.1    5.3
average              4.1   31.3   37.2   20.4    6.3
Hawaiian   18476     6.9   34.2   27.4   18.6    8.5
Hawaiian    3052     6.7   26.5   18.7   17.0    9.1
Hawaiian     104    30.2   39.6   15.1    5.7    3.8
average             14.6   33.4   20.4   13.8    7.1

Here, we see that the distributions of word-lengths (or, frequencies of words of a specific length) are quite stable when all words in the text are considered, but somewhat less stable when the distribution is calculated over lexical items.
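The two distributions differ only in whether each word is counted once per occurrence (tokens, Table 4.26) or once per distinct form (lexical items, Table 4.27). A sketch, approximating syllable count by the number of vocalic intervals in a CV-coded word (the helper names are hypothetical, not from the dissertation's tools):

```python
import itertools
from collections import Counter

def syllable_count(cv_word):
    """Approximate syllables as the number of vocalic intervals."""
    return sum(1 for seg, _ in itertools.groupby(cv_word) if seg == 'V')

def length_distribution(words, lexical_items=False):
    """Percentage of words of each syllable length; computed over
    distinct forms if lexical_items is True, over tokens otherwise."""
    pool = set(words) if lexical_items else words
    counts = Counter(syllable_count(w) for w in pool)
    total = sum(counts.values())
    return {n: 100.0 * k / total for n, k in counts.items()}
```

Because deduplication shrinks a 100-word text to far fewer distinct items, the type-based distribution is estimated from a much smaller sample, which is why it is the less stable of the two.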

Word length as a function of the number of syllables over lexical items is presented in Figure 4.20. The four panels present data for German, Serbian, Italian, and Hawaiian. In each panel, three distributions are presented; each corresponds to one of the texts in Table 4.27. The line that represents the distribution over the shortest text for each language is blue.

Note that there are fewer distinct lexical items than total words, so in a text of 100 words the number of lexical items is indeed too small to capture the frequencies. Thus, the blue line deviates from the two red lines that correspond to the long texts. However, even for a very short text of around 100 words, the distributions still follow the same shape.


Figure 4.20. Stability: Distribution of word-lengths (lexical items)

Finally, the distributions over word tokens for the same texts and the same four languages are given in the four panels of Figure 4.21. Even the short text now has a distribution that closely follows the distributions over the longer texts.


Figure 4.21. Stability: Distribution of word-lengths (word tokens)

This concludes our brief examination of the stability of the measures used. A more thorough analysis could be performed, which I believe would continue to support the stability of the measures in all languages, given that a certain minimum number of words (or phrases) is taken into account.


CHAPTER 5

GENERAL DISCUSSION AND CONCLUSION

In this chapter, I first give an overview of the results chapter by chapter, with additional discussion of related issues. Next, I summarize theoretical and technical issues that accompanied this work and how improvements can be made in the future. Following that, I reflect on the inter-relations between phonotactics and rhythm and how these two aspects of speech have been interpreted in the literature. Finally, I offer some additional questions that can enrich the present scope of this work.

5.1 Summary

In this dissertation, I examined the relation between the structural properties of words and language (speech) rhythm. The structural properties I considered include the complexity of phonotactic sequences and word lengths, both evaluated based on word frequency in the test materials. The speech rhythm of each language in the sample is assigned based on a) the traditional rhythm type (Pike 1945, Abercrombie 1967), or b) a set of phonological properties (Gill 1986, Dauer 1987, Auer 1993, Pamies-Bertrán 1999). In addition, I examined the relation between the measures that assess phonotactic complexity, phonotactic metrics (PMs), and the measures that have been used to assess rhythm type, rhythm metrics (RMs).

To facilitate this study, I created (close to) phonemic corpora of 21 different languages by designing an automatic transcriber that operates on written materials. I implemented this transcriber in the Python programming language (http://www.python.org/). In addition, I created a word complexity calculator – a set of tools to calculate the distributions of different classes of segments (CV, ALT, ‘saltanajc’), as well as the related statistical and rhythm-related measures24.

I then performed a detailed cross-linguistic comparison of word structure on a set of 21 languages, by examining both the distributions of consonant clusters – their lengths and patterns – at different positions in a word, and the distributions of word-lengths. I found that the similarity among languages differs depending on the criteria/measures used. Some of the measures classify languages in agreement with the rhythm class hypothesis; other measures correlate better with the language family or certain phonological properties.

24 I intend to make these tools available for use in the future, as a post-dissertation project.

To make stronger claims on the relation between these measures and rhythmic similarity, we need more comprehensive data on rhythmic similarity across languages: both in terms of the number of languages for which such similarity is evaluated and in terms of validity of the rhythmic similarity data, preferably obtained via an independent criterion.

I have also shown that several measures that have been used to assess language rhythm type are dominated by the effect of the segmental phonotactic structure of words, as seen through the high correlation between the phonotactic and the rhythm metrics. Some of the rhythm metrics that are not correlated with the corresponding phonotactic metrics fit less well the classification based on traditional types. In addition, the analysis of the interdependence of various rhythm metrics supports the linear relationship between the proportion of vocalic intervals (%V) and the standard deviation of consonantal intervals (∆C) posited by Cassandro et al. (2003).

In Chapter 1, I gave a brief overview of the history of research on speech rhythm and introduced the issues in quantifying rhythmic similarity related to phonotactics. In relation to those issues, I presented my research questions, outlined the methods used, and listed the contributions that my work makes to the field of linguistics. A review of the literature on rhythm and phonotactics, and the definitions of concepts I use in the dissertation are given in Chapter 2; a detailed methodological description of the creation of phonemicized language corpora is presented in Chapter 3.

I organized my study around five questions. These questions and the corresponding summaries of the results are repeated here in order to prepare the reader for the general discussion.


5.2 Overview

First, I examined how the structural complexity of words affects rhythm metrics (RMs) – measures that are claimed to evaluate rhythmic similarity between languages. To do that, I defined phonotactic metrics based on the lengths of vocalic and consonantal clusters in the sample. I established that phonotactic %V and ∆C are highly correlated with their rhythmic equivalents and conclude that rhythm measures largely reflect phonotactic, or word-structural, similarities.

I have also shown in the process that the linear relationship between %V and ∆C proposed by Cassandro et al. (2003) holds for the phonotactic metrics on the set of 21 languages, as witnessed in Figure 4.12, where we see a clear decreasing linear trend. This relationship can also be seen as a positive linear correlation between %C and ∆C (since %C = 100 − %V), which is then easier to explain. Namely, %C is higher in languages with more complex consonant clusters; in those languages, the expected variation in the intervals, measured by ∆C, will also be higher. Similarly, where %C is small, syllable structure is simple and there is almost no variation to be reflected in ∆C.
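The direction of this correlation can be illustrated with a toy computation on phonotactic interval lengths (the numbers below are illustrative only, not values from the corpus):

```python
import statistics

def pc_and_delta_c(c_intervals, v_intervals):
    """%C and the population standard deviation of consonantal interval
    lengths, both computed over phonotactic 'durations'."""
    pc = 100.0 * sum(c_intervals) / (sum(c_intervals) + sum(v_intervals))
    return pc, statistics.pstdev(c_intervals)

# simple, CV-only structure: every consonantal interval has length 1
simple = pc_and_delta_c([1, 1, 1, 1], [1, 1, 1, 1])   # (50.0, 0.0)

# complex clusters raise %C and, with it, the spread of interval lengths
complex_ = pc_and_delta_c([1, 3, 2, 1], [1, 1, 1, 1])
```

Adding longer clusters to a sample necessarily raises the consonantal share of the material and, at the same time, widens the range of interval lengths, which is the intuition behind the positive %C–∆C trend.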

Furthermore, I established that phonotactic pair-wise measures classify languages better than the rhythmic measures. I argued that this is possibly due to a negative correlation of phonotactics with vocalic variability, causing the phonotactic and rhythmic components to pull the metrics in different directions. While rhythmic grouping of languages using phonotactic metrics is more precise, it is based on a different property – phonotactic complexity of words – rather than on rhythm.

In section 4.2, I examined how similarity in the structural complexity of consonant clusters correlates with the perception of two languages as rhythmically similar. I compared consonant-cluster length distributions in three positions in the word: initial, medial, and final. The results of this corpus analysis appear in Tables 4.2.1-3. They reveal significant similarities across languages, but also that similarities based on one position do not necessarily reflect similarities based on the other positions of the cluster. None of the groupings based on cluster similarity reflects the posited rhythm classes exactly, although there is a large overlap between the rhythm-type grouping and the grouping observed based on word-final clusters. More independent evidence of rhythmic similarity is needed for comparison.

Questions 3 and 4 examined the similarity of phonotactic patterns and the relationship between the sonority-based goodness of a cluster and its frequency in the sample. It was shown that at the coarsest sonority level, ALT, the ‘best’ clusters are usually the most frequent, or among the two most frequent. Frequency results from the analysis at the more detailed ‘saltanajc’ level did not agree closely with the preferred clusters proposed by Dziubalska-Kołaczyk, although the more frequent clusters were similar across languages. This suggested that there are factors other than sonority that regulate the markedness of clusters.

Finally, Question 5 examined similarity across languages based on word lengths. Distributions of word lengths were examined over tokens and over types. It was established that the traditional rhythm grouping shows some similarities with the distributions of monosyllabic words, as well as with the frequencies of longer words.

Let me now review the method used to obtain the phonemic corpora for the 21 languages. As described in Chapter 3, I constructed a two-stage process for handling the stories in electronic form. Two modules are responsible for the processing: the Transcriber and the Complexity Calculator.

For each language, the Transcriber first employed text-processing techniques to eliminate punctuation and separate the materials into phrases. Next, a set of phonological rules was applied in order to obtain IPA or close-to-IPA transcriptions. Following that, the material in IPA form was processed by the Complexity Calculator to obtain several levels of representation: CV, ALT, and ‘saltanajc’. Finally, various measures were computed, including the Phonotactic Metrics, statistics of word-initial, word-medial, and word-final clusters, and word-length statistics.
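The two-stage flow can be sketched as follows. The rule table and vowel set below are toy placeholders chosen for illustration; they are not the actual rules used for any of the 21 languages:

```python
import re

# toy grapheme-to-phoneme rules (ordered: longer patterns rewritten first)
RULES = [('sch', 'ʃ'), ('ch', 'x')]
VOWELS = set('aeiou')   # toy inventory

def transcribe(text):
    """Stage 1 (Transcriber): strip punctuation, split into phrases,
    and apply the rewrite rules to obtain a close-to-IPA string."""
    phrases = [p.strip() for p in re.split(r'[.,;:!?]+', text.lower())]
    out = []
    for p in phrases:
        if not p:
            continue
        for src, tgt in RULES:
            p = p.replace(src, tgt)
        out.append(p)
    return out

def to_cv(phrase):
    """Stage 2 (Complexity Calculator): map to the CV level; the ALT and
    'saltanajc' levels would use finer segment classes instead."""
    return ''.join('V' if ch in VOWELS else 'C' for ch in phrase if ch != ' ')
```

From the CV (or ALT, or ‘saltanajc’) strings, the cluster statistics and Phonotactic Metrics are then tallied.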

This approach of ‘creating data’ from written texts has not been extensively employed in phonological studies; however, it appears to have attracted some interest in the past few years (Garcia and González 2012). This dissertation supports efforts to promote this method.

5.3 Limitations of the study

One limitation of this study is the lack of a direct comparison of phonotactic and rhythmic metrics, that is, their comparison on exactly the same phonetic material. Direct comparison was not possible on longer materials, because lengthy transcribed and segmented spoken samples are not available. The approach taken in this work – transcribing written materials – works well for the examination of phonemic and phonotactic structure, which was the main goal of the dissertation. Still, it would be interesting to evaluate rhythm metrics calculated on the same samples, however long, especially when establishing correlations between PMs and RMs. It is possible that additional correlations between corresponding rhythm and phonotactic metrics would be found.

Furthermore, while the achieved level of transcription was deemed sufficient for the calculation of the broad phonotactic levels (‘saltanajc’ and ALT), it did not allow for a comparison of phonotactic sequences/clusters at the phonemic level in all languages. This will be attempted in the continuation of the project.

Next, the rhythm metrics I evaluate in terms of how they reflect phonotactics are all interval-based. In the literature, some authors (for instance, Nolan and Asu 2009) have suggested that the syllable or the foot might be a better unit for rhythm quantification. While this is outside the scope of the present work, it would be an easy extension to calculate phonotactic metrics on syllables for languages in which syllabification is trivial, although it would be somewhat problematic for languages in which it is not. One of my concerns in choosing the interval as the base unit, and in calculating cluster distributions over intervals rather than over onsets and codas, was exactly this problem with syllabification; it led me to study interval, or word, structure. However, given that onsets and codas also span clusters of zero, one, or more consonants, and given that their distributions differ across languages, it is likely that syllable-based phonotactic metrics would group languages based on their similarity in syllabic structure in the same way that interval-based phonotactic metrics group languages based on similarity in overall cluster structure.

Also, in this dissertation, I focused on similarity based on the structure and patterns of consonant clusters. I did not report on the structure of vocalic intervals, leaving this for further analysis. The only place where vocalic interval structure was addressed was in the treatment of long vowels in the computation of the Phonotactic Metrics: a single vocalic phoneme was considered to have a phonotactic duration equal to one or to two. In these comparisons, languages in which phonemic vowel length is not marked in writing had calculated values of the percentage and variability of vocalic intervals lower than the actual ones.

Finally, all vowel sequences that can possibly occur as a diphthong were counted as diphthongs in all cases. Some instances of vowel hiatus were thus misrepresented, which led to a lower percentage of vocalic phonotactic durations and a smaller number of syllables.

In the future, both modules can be improved on. I briefly describe how.

Grapheme-to-phoneme transcriber

As mentioned previously, in some languages there are discrepancies between the materials produced by my Transcriber and the IPA transcriptions of natural spoken speech. One of my future tasks will be to narrow these discrepancies by increasing the number of phonological processes applied during automatic transcription.

I would like to implement my transcriber as a web application so that it can be shared. I have encountered several such applications for a subset of the languages I considered; some were more successful than others in transcribing the test passages I tried. My main goals are 1) to be able to transcribe large amounts of data from large text files, instead of short passages that need to be pasted into a web-page form, and 2) to have multiple languages transcribed using the same general principles, in order to gain consistency in the level of transcription. Improved transcription accuracy will be the first step towards more accurate calculation of the complexities of clusters and words.

Complexity calculator

In this dissertation, I have presented results for analyses at the CV, ALT and ‘saltanajc’ levels. At these levels, we can successfully analyze the structure of consonant clusters, and produce typological generalizations or cross-linguistic comparisons at each of the three levels.

In addition, I implemented but did not report on several other variables that can be useful in structural comparison across languages, such as the average number of syllables and phonemes per word, as well as an analysis of vocalic intervals and nuclei as a distribution of short monophthongs, short diphthongs, long monophthongs, long diphthongs, and syllabic consonants. This set of functions will be added to the Complexity Calculator.

5.4 Discussion

Let us now look at the dissertation questions within the larger area of research on speech rhythm. I will discuss related issues in perception, some methodological deficiencies, and propose a model based on what is known in the literature and what we learned in this dissertation.

5.4.1 Additional questions

I will first discuss several issues related to the perception of rhythm that were not directly addressed by the research questions but are important for the understanding of the overall phenomenon and emerge as possible extensions of the current research.

Segmental and phonotactic contributions to rhythm perception

In section 4.1, I established that some Rhythm Metrics are significantly affected by phonotactics and that some Phonotactic Metrics classify languages according to their traditional rhythm type. These results agree with the proposals, such as Dauer’s (1987), that speech rhythm arises from phonological – and in particular some phonotactic – properties of a language. This suggests that the further understanding of speech rhythm must be informed by how phonotactic differences are perceived.

A detailed perception study of which part of the speech signal contributes to perceived rhythmic similarity is needed. Several questions are informative: 1) Are speech sequences with complex word-initial clusters perceived as different from otherwise identical speech sequences with simple word-initial clusters? 2) Are speech sequences with complex word-final clusters perceived as different from otherwise identical speech sequences with simple word-final clusters? 3) Is the answer to question 1) the same when we consider complexity of syllable onsets instead of word-initial consonant clusters? 4) Is the answer to question 2) the same when we consider complexity of syllable codas instead of word-final consonant clusters?


Answering these questions will allow us to make predictions, based on the phonotactic complexity of a language, about its similarity to other languages. It will also point to the types of measures that can be devised to assess the rhythmic similarity between a pair of languages.

The second important set of questions addresses how different phoneme qualities affect rhythm perception. In sections 4.2–4.3, we saw that the groupings based on similarities of word-initial (or word-final) clusters differ depending on whether we look at distributions that assume a dichotomy between vowels and consonants, or at distributions that assume a dichotomy between obstruents on the one hand and vowels and sonorants on the other.

Studies in the literature on Rhythm Metrics are mostly based on the dichotomy of vocalic and consonantal intervals, although some authors have examined a dichotomy between voiced and voiceless intervals (Dellwo et al. 2007) or between sonorant and non-sonorant intervals (Galves et al. 2002). However, these computational studies did not examine how the different classes of segments contribute to rhythm perception; it was only suggested that such natural classes (voiced, voiceless, sonorant, and obstruent) are easy for listeners to perceive. The sonorant-obstruent dichotomy may, to some degree, correspond to the perception of filtered speech, which is used as stimuli in many perception studies.

Loukina et al. report in their 2011 paper that their machine-learning algorithms often classified sonorant consonants as vowels. This can be explained by the similarity of sonorant consonants and vowels in terms of energy or intensity. However, machine recognition often operates in ways that differ from human perception. The question, then, is: to what extent do human listeners base their perception of rhythmic sequences on energy? Additional uncertainty in interpreting their results arises from the fact that sonorant consonants were classified as vowels only sometimes, not always.

In sum, it is important to know how each consonant and vowel quality contributes to the creation of perceived rhythmic sequence.


Contribution of different prosodic dimensions to rhythm perception

Questions that address the roles of different segmental qualities in forming a rhythmic sequence rely only on durational properties to define rhythm. Another interesting and very important aspect to understand more clearly is the relationship between the different prosodic dimensions and rhythm. As mentioned in the introduction, only a handful of studies have examined the effects of pitch (Cumming 2010, Arvaniti and Rodriquez 2013), spectral balance (Sluijter et al. 1997), and loudness (Kochanski et al. 2005) in relation to the perception of prominence and the perception of rhythm.

There are two types of questions to answer. The first is how pitch and loudness (or energy) affect duration. Some studies in the literature address such questions: Low et al. (2000) considered a pairwise variability index for loudness and showed that it differed between British and Singaporean English, similarly to the durational indices; Lehiste (1976) and Cumming (2010) discussed the relationship between fundamental frequency and duration; and Kochanski et al. (2005) investigated the roles of loudness and fundamental frequency in the prediction of prominence25.

Another, more interesting question is how patterns of long/short (durational dimension), strong/weak (prominence dimension), and high/low (pitch dimension) create an overall rhythmic pattern, and how such sequences are weighed in judgments of similarity.

Two things need to be foreseen and addressed in such research. The first is that there might be cross-linguistic differences in perception based on listeners’ first language, or a combination of the languages they speak; i.e., prior exposure to language(s) might affect which acoustic dimensions are more salient to the listener.

The second is that languages may differ in the number of dimensions they use to create rhythm, as well as in the relation of the prosodic dimensions to one another. Long may always be associated with prominent in some languages, but the two may occur independently in others. When comparing two speech samples, different cues might be used depending on the type of stimuli – perception may utilize the dimensions that maximally differentiate the two samples and thus vary from comparison to comparison. I will return to this issue in section 5.4.4.

25 They found that, in English, loudness is the more important factor.

Contribution of phonological processes to rhythm perception

One of the processes proposed to distinguish stress-timed from syllable-timed languages is vowel reduction. Many models, including the rhythm-metrics approach, consider it to be associated with stress-timed languages, but not with syllable- or mora-timed languages. However, Easterday et al. (2011) report that vowel reduction did not significantly affect the Rhythm Metrics they considered (the percentage of vocalic durations and the standard deviations of vocalic and consonantal intervals).

It is important to know whether this process affects the perception of rhythmic similarity but is not adequately captured by the Rhythm Metrics, or whether the process does not affect perception at all. This question can be restated as whether durational differences are perceived in non-prominent positions.

5.4.2 Use of modified speech in addressing questions on rhythmic similarity

To assess rhythm without segmental and phonotactic influence, studies in the literature have often employed perception experiments with filtered, synthesized, or re-iterant speech as stimuli. Two things are often assumed in such perception experiments: 1) that the judgments of similarity are based on rhythmic properties, and 2) that unmodified and modified samples are rhythmically equivalent.

The second assumption is justified by the following claims: 1) filtering masks segmental qualities and leaves only prosodic cues and broad phonotactic properties (Ramus and Mehler 1999); 2) all consonants are perceived in the same way with respect to rhythm and can thus all be re-synthesized as the same quality, for instance /s/; and 3) imitations using re-iterant speech are rhythmically equivalent to the originals. Here is why each of these might be problematic.


Low-pass filtering

A low-pass filter eliminates the frequency content above the filtering frequency FLP26. Depending on where the energy of an individual segment is situated, more or less of that segment will be eliminated. Most fricatives will be filtered out almost completely, as their energy lies above the filtering frequency, which is usually 400 Hz. Vowel energy will be reduced relative to that of sonorant consonants, whose energy is situated lower in the frequency domain. As a result, the energy ratio of segments will change after filtering. If the perception of prominence, and therefore rhythm, is based on energy, then filtering may change the rhythmic sequence. Tests of rhythmic equivalence between original and filtered stimuli should always accompany such studies.
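The gradual roll-off can be seen even with the crudest filter. Below, a first-order (one-pole) low-pass filter with a 400 Hz cutoff passes a constant (0 Hz) signal almost unchanged but merely attenuates, rather than removes, a component well above the cutoff. This is a toy illustration; the filters used in perception studies are much steeper:

```python
import math

def one_pole_lowpass(samples, f_lp, fs):
    """First-order low-pass filter with cutoff f_lp (Hz) at sample
    rate fs (Hz): y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * math.pi * f_lp)
    alpha = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out
```

Feeding in a constant signal (all ones) at 8 kHz yields output converging to 1.0, while a 4 kHz alternating signal (+1, −1, …) is strongly attenuated but not eliminated, mirroring the point that no real filter cuts off sharply at FLP.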

Re-synthesized speech

In re-synthesized speech, all consonants are synthesized as one quality, usually /s/, and all vowels as a single vowel quality, usually /a/, resulting in ‘sasa’ speech. This means that in the modified form all the consonants from the original speech sequence contribute to rhythm in the same way, resulting in a CV representation. The resulting rhythmic sequence may differ from the one that low-pass filtering produces, which is more similar to a ‘saltanajc’ or ALT representation.
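The segmental side of the ‘sasa’ transformation is trivial to state. A sketch over a toy phoneme inventory (the vowel set here is a placeholder, and real re-synthesis of course also preserves segment durations, which a string mapping cannot show):

```python
VOWELS = set('aeiou')   # toy inventory; real studies start from IPA segments

def sasa(phonemes):
    """Replace every consonant with /s/ and every vowel with /a/,
    keeping only the CV skeleton of the sequence."""
    return ''.join('a' if p in VOWELS else 's' for p in phonemes)
```

Because every consonant collapses to /s/, all distinctions among consonant classes (the ALT or ‘saltanajc’ levels) are lost, which is exactly the contrast with low-pass filtering noted above.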

Re-iterant speech

Re-iterant speech consists of /da/ imitations of every syllable, while the prosody is assumed to be preserved. This method attempts to tap into the speaker's/listener's ability to produce rhythmic imitations of the original stimuli. This type of modified stimulus is likely the most similar to the original unmodified speech sequence; however, the ability of imitators might vary, and some tempo adjustments are likely to occur.

In conclusion, the rhythmic equivalence of modified and unmodified stimuli used in the perception experiments should be tested rather than assumed.

26 Real filters cannot eliminate all frequencies above FLP while keeping all frequencies below it; elimination is instead gradual: the frequencies just below FLP are somewhat attenuated, and those just above FLP are still present and wane gradually.


5.4.3 The nature of rhythm

Is rhythm difficult to define?

One of the challenges for research on speech rhythm is the lack of a clear understanding of, or agreement on, what speech rhythm is. It is difficult to compare two approaches, or the results of different studies, if rhythm is defined using different criteria. It is also difficult to judge whether quantitative measures capture rhythmic differences if we do not agree on what makes two rhythms perceptually similar.

Rhythm in speech is often broadly described as being similar to rhythm in music. This entails the repetition of alternating qualities: short and long, strong and weak. Yet even in music there was not much attention to models of rhythm until very recently (Hofmann-Engl 2002), and such models do not abound.

Historically, rhythm in speech has been associated with different types of isochrony (Pike 1945, Abercrombie 1967); with the durational variability of syllables, feet, or vocalic and inter-vocalic intervals (Ramus et al. 1999, Grabe and Low 2002); with a perceived variability based on phonological properties (Dauer 1983, 1987; Auer 1993; Pamies Bertrán 1999); or with underlying mechanisms that regulate the durational properties of syllables and feet at two levels (Nolan and Asu 2009), or of the word (Schiering et al. 2012), to mention a few approaches.

Is rhythm one-dimensional or multi-dimensional?

Rhythm has been described in the literature both as a one-dimensional property that ranges between two extremes of stress-timed quality (Dauer 1987) and as a multi-dimensional property whose dimensions can vary independently (for example, see Nolan and Asu 2009).

Is rhythm of speech perceived as a small number of classes or as a continuum?

Human perception often categorizes continuous percepts/stimuli. In discussions of rhythm, researchers debate whether the rhythm of language is discrete – that is, whether there is a small number of classes, as posited by the RCH – or whether languages belong to a rhythm continuum.


These questions lead us to a very important discussion point: can we construct a model of rhythmic similarity?

5.4.4 Proposed model of rhythmic similarity

Here I propose a model of speech-rhythm similarity based on the previously discussed effects of phonotactics, prosody, and phonological processes on the perception of rhythmic similarity.

I based the model on Tversky’s (1977) set-theoretical model of similarity, which defines an object as a set of features. These features are then compared to obtain a similarity value. Tversky opposes this model to geometrical models that use distance metrics as estimates of (dis)similarity. He points out that while geometrical models apply better to phenomena like tones or colors, the feature model is more appropriate for judging similarity between faces or countries.

I believe that our perception treats rhythm more like a face than like a color, in that similarity is based on patterns. Let us call this rhythm-perception model the ‘Union of Features’ model. Different features could be proposed, such as the existence of long unaccented elements (phonemic vowel length), high-tone or low-tone prominence, and simple, low-complexity characteristic prosodic patterns. An example of a characteristic prosodic sequence is shown in Figure 5.1.

[(high, short, stressed), (low, long, unstressed), (low, short, unstressed)]

Figure 5.1. An example of a characteristic prosodic sequence

To make a good model, however, the three dimensions (pitch height, duration, and level of prominence) must be allowed to vary independently.

This model represents a slight departure from Tversky’s model in that it allows characteristic prosodic sequences to act as features. The features in Tversky’s model “may correspond to components such as eyes or mouth; they may represent concrete properties such as size or color; and they may reflect abstract attributes such as quality or complexity.”


However, in the present model, the feature may be sequential in nature, as represented in Figure 5.1.

Possible features in the model include phonotactic (presence of long vowels), phonological (vowel reduction), and prosodic (high-tone prominence, phrase/sequence-final lengthening) properties.
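Tversky’s contrast model, on which this proposal rests, can be sketched in a few lines. The feature inventories and weights below are hypothetical illustrations for invented language profiles, not values proposed in this dissertation; note that a sequential feature (a characteristic prosodic sequence, as in Figure 5.1) can enter the feature set as a tuple:

```python
# Sketch of Tversky's (1977) contrast model over rhythm features.
# s(A, B) = theta*f(A & B) - alpha*f(A - B) - beta*f(B - A), with f = set size.
# Feature names and weights below are hypothetical illustrations.
def tversky(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Contrast-model similarity between two feature sets."""
    a, b = set(a), set(b)
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)

# A sequential feature (characteristic prosodic sequence) as a tuple member:
seq = (("high", "short", "stressed"),
       ("low", "long", "unstressed"),
       ("low", "short", "unstressed"))

lang_a = {"phonemic vowel length", "initial stress", seq}
lang_b = {"phonemic vowel length", "initial stress", "simple syllables"}
lang_c = {"penultimate stress", "simple syllables"}

print(tversky(lang_a, lang_b))  # 1.0  (two shared features, one distinctive each)
print(tversky(lang_a, lang_c))  # -2.5 (no shared features)
```

The asymmetric weights alpha and beta would let the model capture direction-dependent similarity judgments, which a symmetric distance metric cannot.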

In Tversky’s terms, the Rhythm Metrics model is a geometric model, with a distance metric – distance on two-dimensional graphs – as a measure of dissimilarity. While it can also be understood as a simplified model in which several dimensions are projected onto one, the RM model does not allow comparison of different dimensions/properties of rhythm.

In fact, even when only durations are observed, the RM model conflates two important properties: phrase-final lengthening and nuclear lengthening. Edwards and Beckman (1988) argue that these are two distinct processes with different articulatory timing. We may also notice that they differ in location: one is associated with the end of a sequence, while the other is variable in position.

The ‘Union of Features’ model represents these two properties as two features and is able to compare two samples on both. This gives it an advantage over the Rhythm Metrics model.

Another feature conflated with phrase-final lengthening and nuclear stress in a simple durational long-short alternation model such as Rhythm Metrics is the existence of long vowels in unstressed positions, one of the properties proposed by Dauer (1987) to give rise to rhythm.

In sum, rhythmicity in the ‘Union of Features’ model can be seen as a repetition of features, whether simple (scalar) or sequential (vector), with no requirement that they occur at equidistant time points. Because features repeat, they can also be characterized by their frequencies.

5.4.5 Implications/predictions for L2 speech and learning in infants

The ‘Union of Features’ model of rhythmic similarity makes interesting predictions for the perception of L2 speakers. Each feature (say, durational patterning) must be operational for the speaker/listener. If listeners are not able to judge a certain feature, say phonemic vowel length, differences might not be perceived; two samples with different rhythms might be perceived by these listeners as the same. Thus, speakers of different languages may parse the rhythmic space in different ways, based on the features available to them.

If true, this would have important implications for infant rhythm perception, in relation to infants’ experience or lack of experience with language-specific features. If, like segmental distinctions, features of rhythmic similarity are available at birth and later retained or lost based on the needs of processing the first language (L1), then infants should be successful in differentiating any two language samples that differ in at least one feature. This may not be true for adult native speakers of a language that does not exemplify the differentiating feature, for whom that feature may have been lost.

If, on the other hand, the features are not universal but learned from exposure, then infants would need time to learn them and, at least at a very early age, would lack the ability to perceive some distinctions that adult listeners can perceive. Which prosodic features are acquired and which are present at birth is an important question. Perception experiments on both infant and adult rhythmic perception in many languages can be formulated to answer it.

5.5 Conclusion

This dissertation asks and answers several questions about the interplay of rhythm, phonotactics, and perception. It points out where the problems in current quantitative approaches lie, and also explains how a particular factor – phonotactics – affects the methodology. The results will thus have an impact on methodology in quantitative studies of rhythm, contributing to the fields of phonetics and phonology, in particular to the ongoing discussions about quantifying speech rhythm. Specifically, they bear on how the current measures need to be redefined in order to measure rhythmic similarity.

This dissertation also examines in detail the relation between rhythmic grouping and specific structural properties: 1) consonant-cluster complexity and 2) word length. These results add to the efforts to understand the relation between prosody and phonotactics, and to some degree between prosody and morphology.


The discussion of (broad) phonotactic similarity – similarity of cluster lengths and cluster patterns – will also be informative for the field of language typology. The approach I took in using probabilistic phonotactics, that is, comparing cluster lengths and patterns across languages in conjunction with their frequencies, is in line with the view that our perception is gradient. The Phonotactic Metrics can in fact be used as a measure of word complexity that is more refined than a simple average cluster length or the general form of a syllabic shell, allowing a more refined comparison of word complexity.
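As a sketch of how such a metric might work, the following computes consonant-cluster-length frequency distributions from a broad transcription and compares two samples. The vowel inventory and the city-block distance used here are illustrative choices, not the dissertation’s actual definitions:

```python
# Illustrative sketch (not the dissertation's actual Phonotactic Metrics):
# extract consonant-cluster lengths from a broad transcription and compare
# two languages' length distributions with a city-block distance.
from collections import Counter
import re

# simplified vowel inventory (an assumption for this sketch)
VOWELS = "aeiou\u0251\u025b\u0254\u0268\u0259"

def cluster_length_dist(transcription):
    """Relative frequencies of consonant-cluster lengths in a transcription."""
    # split on vowels, whitespace, length marks, and punctuation
    clusters = [c for c in re.split("[%s\\s\u02d0\u02c8.,:]+" % VOWELS,
                                    transcription) if c]
    counts = Counter(len(c) for c in clusters)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}

def cluster_distance(p, q):
    """City-block distance between two cluster-length distributions."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

simple = cluster_length_dist("kanaka makua")  # CV-style sample
rich = cluster_length_dist("strax brzo")      # cluster-rich sample
print(cluster_distance(simple, rich))
```

Weighting cluster lengths by their frequencies in this way yields a finer-grained complexity profile than a single average-cluster-length figure.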

Cross-linguistic phonotactic similarity is especially interesting because it is related to rhythmic similarity and to basic phonological and morphological properties. It is hoped that the tables of cluster-length distributions and most frequent clusters for individual languages, as well as the phonetic corpora created for 21 languages, will be of use to others.

The decision to use automatically transcribed orthographic materials as the basis of the study contributes to the effort to make this approach more widely known. Despite the existence of studies like Kučera and Monroe 1968, which analyzed the similarity between Russian, Czech, and German using transcribed corpora, it is only recently that researchers have started to emphasize the use of large corpora for phonetic studies (Loukina et al. 2011) or have become interested in ways of producing such corpora (Garcia and González 2012).

Although the computational part of this dissertation is not yet at a level where it can be meaningfully shared, I hope to facilitate the use of these tools by others in the near future. Specifically, I plan to produce a web application consisting of both a phonemic transcriber and a complexity calculator; the latter could be used independently of the transcriber for speech that is already phonetically transcribed.

Finally, based on this study and the current views in the literature, I propose a feature model of rhythm perception. This model makes predictions regarding which languages will be judged as similar, as well as regarding possible differences in perception based on the listener’s first language.

I also pose several questions for future work – the interplay of segments and rhythm, as well as the interaction of prosodic dimensions in the perception of rhythm – that will contribute to a better understanding of the nature of rhythm and provide details for the proposed rhythm model.

APPENDIX 1

BASIC PROPERTIES OF THE LANGUAGES FROM WALS

Appendix 1 provides two tables adapted from the World Atlas of Language Structures (WALS) (Dryer and Haspelmath 2011). Basic phonological properties are listed in Table A1.1, and basic morphological properties in Table A1.2. Information presented in italics was not present on the website; I added it based on similarity to other languages in the table or on information available in language textbooks.


Table A1.1 Phonological properties of test-languages

Language    | Consonant Inventory | Vowel Quality Inventory | Consonant-Vowel Ratio | Syllable Structure | Fixed Stress Location
Hawaiian    | Small               | Average (5-6)           | Low                   | Simple             | Penultimate
Maori       | Small               | Average (5-6)           | Low                   | Simple             | No fixed stress
Samoan      | Small               | Average (5-6)           | Low                   | Simple             | No fixed stress
Tongan      | Small               | Average (5-6)           | Low                   | Simple             | No fixed stress
Japanese    | Moderately small    | Average (5-6)           | Average               | Moderately complex |
Turkish     | Average             | Large (7-14)            | Average               | Moderately complex | No fixed stress
Greek       | Average             | Average (5-6)           | Average               | Complex            | Antepenultimate
Italian     | Moderately large    | Average (5-6)           | Average               | Moderately complex | No fixed stress
Spanish     | Moderately large    | Average (5-6)           | Average               | Moderately complex | No fixed stress
Catalan     | Average             | Large (7-14)            | Average               | Moderately complex | No fixed stress
Portuguese  | Moderately large    | Average (5-6)           | Average               | Moderately complex | No fixed stress
German      | Average             | Large (7-14)            | Low                   | Complex            | No fixed stress
Dutch       | Average             | Large (7-14)            | Low                   | Complex            | No fixed stress
Russian     | Moderately large    | Average (5-6)           | High                  | Complex            | No fixed stress
Polish      | Moderately large    | Average (5-6)           | High                  | Complex            | Penultimate
Czech       | Moderately large    | Average (5-6)           | High                  | Complex            | Initial
Serbian     | Moderately large    | Average (5-6)           | High                  | Complex            | No fixed stress
Bulgarian   | Large               | Average (5-6)           | Average               | Complex            | No fixed stress
Hungarian   | Moderately large    | Large (7-14)            | Average               | Complex            | Initial
Estonian    | Moderately large    | Average (5-6)           | High                  | Moderately complex | Initial
Indonesian  | Average             | Average (5-6)           | Average               | Complex            | Penultimate


Table A1.2 Morphological properties of test-languages

Language    | Prefixing vs. Suffixing in Inflectional Morphology | Order of Subject (S), Object (O), and Verb (V) | Order of Adposition and Noun Phrase
Bulgarian   | Strongly suffixing | SVO               | pre
Catalan     | Strongly suffixing | SVO               | pre
Czech       | Weakly suffixing   | SVO               | pre
Dutch       | Strongly suffixing | No dominant order | pre
Estonian    | Strongly suffixing | SVO               | post
German      | Strongly suffixing | No dominant order | pre
Greek       | Strongly suffixing | No dominant order | pre
Hawaiian    | Little affixation  | VSO               | pre
Hungarian   | Strongly suffixing | No dominant order | post
Indonesian  | Strongly suffixing | SVO               | pre
Italian     | Strongly suffixing | SVO               | pre
Japanese    | Strongly suffixing | SOV               | post
Maori       | Little affixation  | VSO               | pre
Polish      | Strongly suffixing | SVO               | pre
Portuguese  | Strongly suffixing | SVO               | pre
Russian     | Strongly suffixing | SVO               | pre
Samoan      | Little affixation  | No dominant order | pre
Serbian     | Strongly suffixing | SVO               | pre
Spanish     | Strongly suffixing | SVO               | pre
Tongan      | Little affixation  | No dominant order | pre
Turkish     | Strongly suffixing | SOV               | post


APPENDIX 2

TEXTS AND TRANSCRIPTS FOR 21 LANGUAGES

Appendix 2 contains the following information for each language:
1) List of rules used in the transcription of each language
2) Limitations that exist in the transcriptions of each language
3) Transcription illustration: one paragraph in the original text followed by its broad transcription equivalent

Languages in the test set include the following 21 languages, organized according to historical grouping (http://www.ethnologue.com/):

Uralic: Estonian and Hungarian

Slavic: Bulgarian, Czech, Polish, Russian, and Serbian

Germanic: Dutch and German

Romance: Catalan, Italian, Brazilian Portuguese, and Spanish

Polynesian: Hawaiian, Maori, Samoan, and Tongan

Other (belong to different language groups): Japanese, Turkish, Greek, and Indonesian

To mark diphthongs and triphthongs in IPA notation, I used dots around vowel sequences that are understood to be in the same syllable.

Colons (/:/) are used to mark phrase ends, while the IPA length diacritic /ː/ is used to mark long vowels.
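The per-language rule lists that follow can be read as ordered rewrite rules. A minimal applier might look like the sketch below; the two toy rules are hypothetical examples (loosely Estonian-flavored), not any language’s actual rule set:

```python
# Minimal sketch of an ordered rewrite-rule transcriber, in the spirit of the
# per-language rule lists in this appendix. The rules below are toy examples.
import re

def transcribe(text, rules):
    """Apply (regex pattern, replacement) rules in the listed order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

toy_rules = [
    ("\u00e4", "\u00e6"),                    # grapheme-to-phoneme mapping
    ("\u00fc", "y"),
    ("([aeiou\u00e6y])([aeiou\u00e6y])", r".\1\2."),  # dot-mark vowel pairs as diphthongs
]

print(transcribe("kaunis", toy_rules))  # k.au.nis
print(transcribe("t\u00fctar", toy_rules))    # tytar
```

Because the rules apply in order, grapheme mappings must precede the diphthong-marking rule, mirroring the ordering implicit in the lists below.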


ESTONIAN

Assumed rhythm type: not agreed upon. Estonian has a vowel quantity distinction but also a rich set of consonant clusters. In this work, I assign it to the syllable-timed group.

Short description: Estonian has long vowels and diphthongs. Although some phonologists posit three degrees of vowel length, only two were transcribed here. The super-long degree is considered by some (e.g., Lippus et al. 2009) to be expressed as an interaction of vowel length and pitch contour. Nasality in vowels was not transcribed.

Rules used in transcription:
- grapheme-to-phoneme rules
- sequences of vowels were transcribed as diphthongs

Transcription example (text): Kägu. Ühes peres kasvas kaunis tütarlaps; sirgus nagu osi salus, oli lahke ja virk ja vaga, nii et ümberkaudu kedagi temataolist ei olnud. Isa ja ema hoidsid tütrekest kui silmatera ja armastasid teda üliväga. Seal tuli aga surm ja viis lahke eidekese ära, enne kui tütar neiu- ealiseks sai. Isa ja tütreke leinasid kaua kadunud eite, ning tema hauaküngas seisis alati leinalille-vanikutega kaetud.

Transcription example (IPA): kægu : yhes peres kɑsvɑs k.au.nis tytɑrlɑps : sirgus nɑgu osi sɑlus , oli lɑhke jɑ virk jɑ vɑgɑ , niː et ymberk.au.du kedɑgi temɑt.ao.list .ei. olnud : isɑ jɑ emɑ h.oi.dsid tytrekest k.ui. silmɑterɑ jɑ ɑrmɑstɑsid tedɑ ylivægɑ : s.ea.l tuli ɑgɑ surm jɑ viːs lɑhke .ei.dekese ærɑ , enne k.ui. tytɑr ne.iu. .ea.liseks s.ai. : isɑ jɑ tytreke l.ei.nɑsid k.au.ɑ kɑdunud .ei.te , ning temɑ h.au.ɑkyngɑs s.ei.sis ɑlɑti l.ei.nɑlille vɑnikutegɑ k.ae.tud :


HUNGARIAN

Assumed rhythm type: not agreed upon. Hungarian has a vowel quantity distinction but also a rich set of consonant clusters. In this work, I assign it to the syllable-timed group.

Short description: Hungarian has long vowels; it does not have diphthongs. Long vowels were transcribed.

Rules used in transcription:
- grapheme-to-phoneme rules
- transcribing long vowels in certain environments

Transcription example (text): A vak király. Hol volt, hol nem volt, még az Óperenciás-tengeren is túl volt, volt a világon egy vak király. Mindenféle orvosok-doktorok próbálták meggyógyítani, de mindhiába, egyik sem ment semmire. Maga a király tudott volna ugyan egy orvosságot szemének, de azt senkinek sem mondta meg, hogy mi; akárki kérdezte tőle, csak azt felelte, hogy mihaszna mondja meg, mikor úgyse tudják megszerezni.

Transcription example (IPA): ɑ vɑk kiraːj : hol volt , hol nɛm volt , meːg ɑz oːpɛrɛnʦiaːʃ tɛŋgɛrɛn iʃ tuːl volt , volt ɑ vilaːgon ɛɟ vɑk kiraːj : mindɛnfeːlɛ orvoʃok doktorok proːbaːltaːk mɛɟɟoːɟiːtɑni , dɛ mindhiaːbɑ , ɛɟik ʃɛm mɛnt ʃɛmmirɛ : mɑgɑ ɑ kiraːj tudott volnɑ uɟɑn ɛɟ orvoʃʃaːgot sɛmeːnɛk , dɛ ɑzt ʃɛŋkinɛk ʃɛm mondtɑ mɛg , hoɟ mi : ɑkaːrki keːrdɛztɛ tøːlɛ , ʧɑk ɑzt fɛlɛltɛ , hoɟ mihɑsnɑ mondjɑ mɛg , mikor uːɟʃɛ tudjaːk mɛgsɛrɛzni :


BULGARIAN

Assumed rhythm type: stress-timed, as a member of the Slavic group. See Dimitrova 1997 for some discussion.

Short description: The first step in transcription involved translating Cyrillic letters to IPA phonemes. Since stress does not occur in a fixed position, and I did not have access to stress location within the word, I transcribed all back vowels as if they were stressed. Consonants are devoiced word-finally and before a voiceless consonant. They are voiced before a voiced consonant, except before /v/. Only word-final devoicing is implemented. In accordance with H_IPA (2000), no palatal consonants are assumed. Instead, they are analyzed as consonant+/j/ before back vowels and as not phonemic before front vowels and /j/.

Rules used in transcription:
- grapheme-to-phoneme translation
- word-final consonant devoicing

Discrepancies:
1) voicing of some word-final consonants is incorrect
2) the vowel qualities of unstressed /ɐ/ and /o/ are represented as /a/ or /ɤ/, and as /ɔ/ or /u/, respectively

Neither of the two affects our current analyses.

Transcription example (text): Северният вятър и слънцето се скарали кой от двамата е по-силен. Видели човек, който си вървял по пътя, облечен в дебело палто. Те решили да разрешат спора си, като видят кой пръв ще го накара да си съблече палтото. Първо опитал вятърът, но колкото по-силно духал, толкова по-плътно се увивал пътникът в палтото си. Тогава се появило слънцето и започнало да грее. Скоро човекат усетил топлината му и свалил палтото си. Така вятърът бил принуден да признае, че слънцето е по-силно.

Transcription example (IPA): sɛvɛrnijat vjatɤr i sɬɤnʦɛtɔ sɛ skarali kɔj ɔt dvamata ɛ pɔsilɛn : vidɛli ʧɔvɛk , kɔjtɔ si vɤrvjaɬ pɔpɤtja , ɔblɛʧɛndɛbɛɬɔ paɬtɔ : tɛ rɛʃili da razrɛʃat spɔra si , katɔ vidjat kɔj prɤf ʃtɛ gɔ nakara da si sɤblɛʧɛ paɬtɔtɔ : pɤrvɔ ɔpitaɬ vjatɤrɤt , nɔ kɔɬkɔtɔ pɔsiɬnɔ duxaɬ , tɔɬkɔva pɔpɬɤtnɔ sɛ uvivaɬ pɤtnikɤtpaɬtɔtɔ si : tɔgava sɛ pɔjaviɬɔ sɬɤnʦɛtɔ i zapɔʧnaɬɔ da grɛɛ : skɔrɔ ʧɔvɛkat usɛtiɬ tɔplinata mu i svaliɬ paɬtɔtɔ si : taka vjatɤrɤt biɬ prinudɛn da priznaɛ , ʧɛ sɬɤnʦɛtɔ ɛ pɔsiɬnɔ :


CZECH

Assumed rhythm type: stress-timed

Short description: Possible diphthongs are transcribed as diphthongs; ‘ou’ is represented as a single vowel /.ou./ (dots are used to mark that a sequence is a diphthong). Long vowels are transcribed as long.

Rules used in transcription:
- grapheme-to-phoneme rules
- possible diphthongs are marked as diphthongs
- long vowels (marked in Czech texts) are marked as long
- the alveolar nasal is changed to a palatal nasal before palatal stops
- syllabic /r, l, m/ are marked as nuclei

Issues: the word ‘z’ was accidentally erased; it was supposed to be attached to the next word.

Transcription example (text): O Červené Karkulce.

Byla jednou jedna sladká dívenka, kterou musel milovat každý, jen ji uviděl, ale nejvíce ji milovala její babička, která by jí snesla i modré z nebe. Jednou jí darovala čepeček karkulku z červeného sametu a ten se vnučce tak líbil, že nic jiného nechtěla nosit, a tak jí začali říkat Červená Karkulka.

Transcription example (IPA): o ʧɛrvɛnɛː karkulʦɛ : bɪla jɛdn.ou. jɛdna sladkaː diːvɛŋka , ktɛr.ou. musɛl mɪlovat kaʒdiː , jɛn jɪ uvɪɟɛɛl , alɛ nɛjviːʦɛ jɪ mɪlovala jɛjiː babɪʧka , ktɛraː bɪ jiː snɛsla ɪ modrɛː nɛbɛ : jɛdn.ou. jiː darovala ʧɛpɛʧɛk karkulku ʧɛrvɛnɛːɦo samɛtu a tɛn sɛ vnuʧʦɛ tak liːbɪl , ʒɛ ɲɪʦ jɪnɛːɦo nɛxtjɛla nosɪt , a tak jiː zaʧalɪ řiːkat ʧɛrvɛnaː karkulka :

/ː/ marks a long vowel, while /:/ marks a sentence break.


POLISH

Assumed rhythm type: stress-timed

Short description: No diphthongs or long vowels are assumed; palatalized /k/ and /g/ are single phones.

Rules used in transcription:
- transVowels
- nasal vowel + /p/ transformed to vowel + /mp/
- grapheme(s)-to-phoneme(s) rules
- /n/ place assimilation to palatal stops
- ‘nn’ translated into single /n/; ‘kk’ translated into single /k/
- palatalize before /i/+vowel
- denasalize before /l, w, m, n/
- join independent /z/ and /v/ to the next word

Transcription example (text):

CZERWONY KAPTUREK. Była sobie kiedyś mała, prześliczna dziewczynka. Jej buzia była tak słodka i radosna, że każdy, kto tylko raz na nią spojrzał, od razu musiał ją pokochać. Dziewczynka wraz z rodzicami mieszkała nieopodal lasu. Często odwiedzała babcię, która gotowa była jej przychylić nieba. Babcia mieszkała w niewielkim domku otoczonym przez zielony las. Pewnego dnia jej wnuczka otrzymała od niej prezent- czerwony aksamitny kapturek, który dziewczynka polubiła tak bardzo, że za nic nie chciała się z nim rozstawać i wszędzie nosiła go na swojej cudnej główce! Przez to zaczęto ją nazywać ,,Czerwonym Kapturkiem”.

Transcription example (IPA):

ʧɛrvɔnɨ kapturɛk : bɨwa sɔbiɛ kjɛdɨɕ mawa , pʃɛɕliʧna ɟɛvʧɨŋka : jɛj buʑa bɨwa tak swɔdka i radɔsna , ʒɛ kaʒdɨ , ktɔ tɨlkɔ raz na ɲɔ spɔjʒaw , ɔd razu muɕaw jɔ pɔkɔxaʦ : ɟɛvʧɨŋka vraz zrɔʣiʦami miɛʃkawa ɲɛɔpɔdal lasu : ʧɛ̃stɔ ɔdviɛʣawa babcɛ , ktura gɔtɔva bɨwa jɛj pʃɨxɨliʦ ɲɛba : babca miɛʃkawa vɲɛviɛlkim dɔmku ɔtɔʧɔnɨm pʃɛz ʑɛlɔnɨ las : pɛvnɛgɔ dɲa jɛj vnuʧka ɔtʒɨmawa ɔd ɲɛj prɛzɛnt ʧɛrvɔnɨ aksamitnɨ kapturɛk , kturɨ ɟɛvʧɨŋka pɔlubiwa tak barʣɔ , ʒɛ za niʦ ɲɛ xcawa ɕɛ znim rɔzstavaʦ i vʃɛ̃ɟɛ nɔsiwa gɔ na svɔjɛj ʦudnɛj gwuvʦɛ : pʃɛz tɔ zaʧɛ̃tɔ jɔ nazɨvaʦ , ʧɛrvɔnɨm kapturkjɛm :


RUSSIAN

Assumed rhythm type: stress-timed

Short description: The first step in transcription involved translating Cyrillic letters to IPA phonemes. No long vowels or diphthongs are assumed.

Rules used in transcription:
- grapheme-to-phoneme rules
- palatalize a vowel that occurs after another vowel

Issues: the word ‘в’ (in) was erased by mistake; it was supposed to attach to the next word. There are only 130 cases in the 10,000-word text, which is considered acceptable.

Transcription example (text): Каменный цветок. е одни мраморски на славе были по каменному-то делу. Тоже и в наших заводах, сказывают, это мастерство имели. Та только различка, что наши больше с малахитом вожгались, как его было довольно, и сорт - выше нет. Вот из этого малахиту и выделывали подходяще. Такие, слышь-ко, штучки, что диву дашься: как ему помогло.

Transcription example (IPA): kamjennɨj ʦvjetok : je odni mramorski na slavje bɨli po kamjennomu to djelu : toʒɛ i naʃix zavodax , skazɨvajut , ɛto mastjerstvo imjeli : ta tolko razliʧka , ʧto naʃi bolʃɛmalaxitom voʒgalis , kak jego bɨlo dovolno , i sort , vɨʃɛ njet : vot iz ɛtogo malaxitu i vɨdjelɨvali podxodjaʃe : takije , slɨʃ ko , ʃtuʧki , ʧto divu daʃsja : kak jemu pomoglo :


SERBIAN

Assumed rhythm type: stress-timed

Rules used in transcription:
- grapheme-to-phoneme translation
- place assimilation of /n/ to the velar stops /k/ and /g/ (within a word or across a word boundary)
- the word ‘s’ (shortened form of ‘sa’, ‘with’) is attached to the next word

Transcription example (text): Tri praseta. Nekada davno na obodu jasenove šume rasla su tri praseta. Rođeni od iste krmače bejahu braća odrasla u istom oboru. Kada stasaše nastupio je čas da se osamostale i zasnuju svoj dom. Kako su bili prilično vezani jedan za drugoga odlučiše da svoje nove kuće podignu u susedstvu. Na taj način bi se uvek mogli naći u nevolji jedni drugima.

Transcription example (IPA): tri praseta : nekada davno na obodu jasenove ʃume rasla su tri praseta : roɟeni od iste kɐmaʧe bejaxu braca odrasla u istom oboru : kada stasaʃe nastupio je ʧas da se osamostale i zasnuju svoj dom : kako su bili priliʧno vezani jedan za drugoga odluʧiʃe da svoje nove kuce podignu u susedstvu : na taj naʧin bi se uvek mogli naci u nevoʎi jedni drugima :


DUTCH

Assumed rhythm type: stress-timed

Short description: Long vowels and diphthongs were transcribed.

Rules used in transcription:
- vowel sequences transcribed as diphthongs
- grapheme-to-phoneme rules
- transcribing long vowels as a function of environment
- vowel reduction
- schwa insertion to break long clusters
- consonant-sequence simplification
- double /d, t, s, k, l/ transcribed as single consonants

Transcription example (text): De noordenwind en de zon waren erover aan het redetwisten wie de sterkste was van hun beiden. Juist op dat moment kwam er een reiziger aan, die gehuld was in een warme mantel. Ze kwamen overeen dat degene die het eerst erin zou slagen de reiziger zijn mantel te doen uittrekken de sterkste zou worden geacht. De noordenwind begon toen uit alle macht te blazen, maar hoe harder ie blies, deste dichter trok de reiziger zijn mantel om zich heen; en ten lange leste gaf de noordenwind het op. Daarna begon de zon krachtig te stralen, en hierop trok de reiziger onmiddellijk zijn mantel uit.De noordenwind moest dus wel bekennen dat de zon van hun beiden de sterkste was.

Transcription example (IPA): də noːɾdəʋɪnt ɛn də zɔn ʋɑɾə ɛɾɔːvəɾ aːn ɦɛt ɾɛːdɛtʋɪstə ʋi də stɛɾəkstə ʋɑs fɑn ɦʏn b.ɛi.də : j.œy.st ɔp dɑt mɔːmət kʋɑm əɾ ə ɾ.ɛi.zɪχəɾ aːn , di χɛɦʏlt ʋɑs ɪn ə ʋɑɾmə mɑntɛl : zə kʋɑmə ɔːvɛɾeːn dɑt dɛχɛːnə di ɦɛt eːɾst ɛɾɪn z.ʌu. slɑχə də ɾ.ɛi.zɪχəɾ z.ɛi.n mɑntɛl tə dun uːtɾɛːkə də stɛɾəkstə z.ʌu. ʋɔːɾdə χɛɑχt : də noːɾdəʋɪnt bɛχɔn tun .œy.t ɑlə mɑχ tə blɑzə , maːɾ ɦu ɦɑɾdəɾ i blis , dɛːstə dɪχtəɾ tɾɔk də ɾ.ɛi.zɪχəɾ z.ɛi.n mɑntɛl ɔm zɪχ ɦeːn : ɛn tə lɑŋə lɛːstə χɑf də noːɾdəʋɪnt ɦɛt ɔp : daːɾnɑ bɛχɔn də zɔn kɾɑχtɪχ tə stɾɑlə , ɛn ɦiɾɔp tɾɔk də ɾ.ɛi.zɪχəɾ ɔnmɪdɛl.ɛi.k z.ɛi.n mɑntɛl .œy.t : də noːɾdəʋɪnt must dʏs ʋɛl bɛːkənə dɑt də zɔn fɑn ɦʏn b.ɛi.də də stɛɾəkstə ʋɑs :


GERMAN

Assumed rhythm type: stress-timed

Short description: The diphthongs /ai, oi, au/ were transcribed from the corresponding letter sequences. Combinations of a vowel and a following coda /r/ were considered diphthongs as well. Eight long vowels were transcribed, some contrasting in quality with the corresponding short vowels.

Rules used in transcription:
- grapheme-to-phoneme rules
- vowel sequences transcribed as diphthongs (where an appropriate diphthong exists)
- vowel + /r/ transcribed as diphthongs except in monosyllables
- final schwa + /n/ transcribed as /n/; this should have been transcribed as syllabic /n/
- double letters r, t, l, f, n, s transcribed as single consonants
- word-initial and intervocalic /s/ transcribed as /z/
- a vowel before ‘h’ transcribed as long
- a vowel before a double consonant transcribed as short
- ‘r’ in VrC or Vr# (# denotes a word boundary; V either short or long) becomes vocalic /ɐ/
- word-final ‘mn’ transcribed as /m/
- word-final /t/ erased if the preceding sound is a consonant and the next word starts with an obstruent

Issues:
- final schwa + /n/ transcribed as /n/; this should have been transcribed as syllabic /n/
- some vowel sequences possibly transcribed as diphthongs

Transcription example (text): Einem reichen Manne, dem wurde seine Frau krank, und als sie fühlte, daß ihr Ende herankam, rief sie ihr einziges Töchterlein zu sich ans Bett und sprach "liebes Kind, bleibe fromm und gut, so wird dir der liebe Gott immer beistehen, und ich will vom Himmel auf dich herabblicken, und will um dich sein."

Transcription example (IPA): .aɪ.nem ʁ.aɪ.χən manə , dem v.ʊɐ.də z.aɪ.nə fʁ.aʊ. kʁaŋk , ʊnt als zɪ fyltə , das iɐ endə heʁaŋkam , ʁɪf zɪ iɐ .aɪ.nʦɪgəs tøχt.eɐ.l.aɪ.n ʦʊ zɪç ans bet ʊn ʃpʁaχ lɪbəs kɪnt , bl.aɪ.bə fʁɔmm ʊn gʊt , zɔ vɪɐt dɪɐ d.ɛɐ. lɪbə gɔt ɪmmɐ b.aɪ.sten , ʊnt ɪç vɪl fɔm hɪmml .aʊ.f dɪç heʁabblɪkn , ʊnt vɪl ʊm dɪç z.aɪ.n :


CATALAN

Assumed rhythm type: mixed

Short description: Catalan has diphthongs and triphthongs but not long vowels.

Rules used in transcription:
- grapheme-to-phoneme rules
- diphthongs and triphthongs transcribed from letter sequences (the diphthong /ui/ becomes /wi/ before a consonant and /uj/ after a consonant)
- word stress assigned
- vowel reduction applied
- word-initial fricative voicing before a vowel or voiced consonant
- fricative voicing between vowels
- fricative devoicing before a voiceless consonant
- plosive voicing before a voiced consonant
- plosive devoicing before a vowel or voiceless consonant
- /r/ erased word-finally or before a consonant that is not word-final
- word-final C1C2 transcribed as word-final C1
- place assimilation of /n/
- voiced plosive spirantization
- degemination of /m, n, l, lj/
- hiatus resolution
- /j/ and /k/ that are surrounded by word boundaries are joined to the next word

Transcription example (text): Saltimbanqui. Era un barri portoriqueny, amb rètols en castellà; els carrers eren amples, molt oberts, i les cruïlles separaven pendents oposats. Cinc cantonades més avall, ja començava Central Park: sobre el plànol, a Barcelona, m’havia semblat un lloc per estar-m’hi molt cèntric i adequat, però en el trajecte des de l’aeroport amb el taxi, els gratacels només els havia vist quan havíem travessat el pont, intuïts com un resplendor entre una boira que, malgrat que eren les deu de la nit, feia sensació de matinada.

Transcription example (IPA): səltimbaŋki : eɾə m bari puɾtuɾikeɲ , am rɛtul əŋ kəstəʎa : el kəreɾ eɾən ambləs , mɔl uβəɾt , i les kɾuiʎəs səpəɾaβəm pəndən upuzat : siŋ kəntunaðəz mez əβaʎ , .iə. kumənsaβə sentɾəl paɾ : sɔβɾəl βlanul , ə βəɾsəlɔnə , m əβ.iə. səmblat un ʎɔk pe sta m i mɔl sɛntɾig jəðək.wa.t , pəɾo n əl tɾəektə ðez ðə ləəɾupɔɾ am əl taʃi , el ɣɾətəzel numez el əβ.iə. βis k.wa.n əβ.iə.m tɾəβəzat əl pɔn , intuit kɔm un rəsβləndɔ ntɾə nə βəɾə kə , məlɣɾat keɾən lez ð.ɛw. ðə lə nit , fəə sənsəziɔ ðə mətinaðə :


ITALIAN

Assumed rhythm type: syllable-timed

Short description: The language has diphthongs and triphthongs but no long vowels. In this version all geminates are represented by two consecutive consonantal phonemes. Alternatively, double affricates and stops can be considered single consonants.

Rules used in transcription:
- grapheme-to-phoneme rules
- place assimilation of /n/ before palatal stops
- voicing of intervocalic /s/ into /z/
- triphthongs and then diphthongs are transcribed from vowel sequences

Transcription example (text): I vestiti nuovi dell'imperatore.

Molti anni or sono, viveva un Imperatore, il quale dava tanta importanza alla bellezza ed alla novità dei vestiti, che spendeva per adornarsi la maggior parte de’ suoi danari. Non si curava de’ suoi soldati, non di teatri o di scampagnate, se non in quanto gli servissero di pretesto a far mostra di qualche nuovo vestito. Per ogni ora della giornata, aveva una foggia speciale, e, come degli altri re si dice ordinariamente: è al consiglio, - di lui si diceva sempre: è nello spogliatoio.

Transcription example (IPA): i vestiti n.uo.vi dellimperatore : molti anni or sono , viveva un imperatore , il k.ua.le dava tanta importanʦa alla belleʦʦa ed alla novita dei vestiti , ke spendeva per adornarsi la maʤʤor parte de s.uoi. danari : non si kurava de s.uoi. soldati , non di teatri o di skampaɲate , se non in k.ua.nto ʎi servissero di pretesto a far mostra di k.ua.lke n.uo.vo vestito : per oɲi ora della ʤornata , aveva una foʤʤa speʧale , e , kome deʎi altri re si diʧe ordinar.ia.mente : e al konsiʎo , di lui si diʧeva sempre : e nello spoʎato.io.


PORTUGUESE

Assumed rhythm type: syllable-timed (Brazilian Portuguese rules were applied)

Short description: In Portuguese, I implemented diphthongs and nasalized vowels.

Rules used in transcription:
- grapheme-to-phoneme rules
- vowel sequences transcribed as diphthongs
- intervocalic /s/ voiced to /z/
- word-final /l/ is velarized
- syllable-final /r/ is transcribed as a uvular fricative
- a vowel is nasalized before syllable-final /m/ or /n/

Transcription example (text): O vento norte e o sol discutiam qual dos dois era o mais forte, quando passo um viajante vestido numa capa. Ao vê-lo, poem-se de acordo em como aquele que primeiro conseguisse obrigar o viajante a tirar a capa seria considerado o mais forte. O vento norte começou a soprar com muita fúria, mas quanto mais soprava, mais o viajante se aconchegava à sua capa, até que o vento norte desistiu. O sol brilhou então com todo o esplendor, e imediatamente o viajante tirou a capa. O vento norte teve assim de reconhecer a superioridade do sol.

Issues: Some vowel qualities are transcribed incorrectly. This does not affect our analysis.

Transcription example (IPA): o vẽto noʁʧi i o soɬ ɟiskuʧiã k.ua.ɬ dos d.oi.s era o m.ai.s foʁʧi , k.uã.do paso ũ v.ia.ʒãʧi veʃʧido numa kapa : ao vi lo , po.ĩɐ. si ɟi akoʁdo .ĩɐ. komo akeli ki prim.ei.ro kõsegisi obrigaʁ o v.ia.ʒãʧi a ʧiraʁ a kapa seria kõsiderado o m.ai.s foʁʧi : o vẽto noʁʧi komesou a sopraʁ kõ m.ui.ta fur.ia. , mas k.uã.to m.ai.s soprava , m.ai.s o v.ia.ʒãʧi si akõʃegava a sua kapa , aʧi ki o vẽto noʁʧi deziʃʧ.iu. : o soɬ briʎou ẽt.ɐ̃u. kõ todo o esplẽdoʁ , i imed.ia.tamẽʧi o v.ia.ʒãʧi ʧirou a kapa : o vẽto noʁʧi tevi asĩ ɟi ʁekoɲeseʁ a superioridaɟi do soɬ :

SPANISH

Assumed rhythm type: syllable-timed
Short description: Spanish has diphthongs but no long vowels.

Rules used in transcription:
- Grapheme-to-phoneme rules
- Vowel sequences transcribed as diphthongs (unfortunately, this rule failed to apply in this version)
- /b/, /d/, and /r/ spirantized intervocalically

Issues:
- Intervocalic voicing has not been implemented (but will be in a new version). This should not affect the analysis.
- 'y' before vowels failed to be transcribed as /j/. There are only 85 such cases in 10,000 words, so this error should not significantly affect the frequencies or metrics. CV word-initials were undercounted (<1% of total words) in favor of zero-consonant onsets (vowel initials).

Transcription example (text): Caperucita Roja.

Había una vez una niña muy bonita. Su madre le había hecho una capa roja y la muchachita la llevaba tan a menudo que todo el mundo la llamaba Caperucita Roja. Un día, su madre le pidió que llevase unos pasteles a su abuela que vivía al otro lado del bosque, recomendándole que no se entretuviese por el camino, pues cruzar el bosque era muy peligroso, ya que siempre andaba acechando por allí el lobo.

Transcription example (IPA):

kapeɾuθita roxa : aβia una beθ una niɲa mui bonita : su madre le aβia eʧo una kapa roxa i la muʧaʧita la ʎeβaβa tan a menuðo ke toðo el mundo la ʎamaβa kapeɾuθita roxa : un dia , su madre le piðio ke ʎeβase unos pasteles a su aβuela ke biβia al otɾo laðo del boske , rekomendandole ke no se entɾetuβiese por el kamino , pues kruθar el boske eɾa mui peligroso , ia ke siempɾe andaβa aθeʧando por aʎi el loβo :

HAWAIIAN

Assumed rhythm type: mora-timed
Short description: Hawaiian has long and short vowels, and long and short diphthongs. All were transcribed; diphthongs were possibly overcounted.

Rules used in transcription:
- Grapheme-to-phoneme rules
- Sequences of three vowels transcribed as triphthongs
- Sequences of two vowels transcribed as diphthongs
- Letters corresponding to non-Hawaiian phonemes replaced according to this set of correspondences: {b:p, d:k, v:w, s:k, r:l, f:p, t:k, g:k}
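The loan-letter replacement and the greedy vowel grouping (triphthongs first, then diphthongs) can be sketched as follows. This is a simplified sketch: macron (long) vowels are assumed to be handled separately, and the `.xyz.` marking follows the notation of the appendix examples:

```python
import re

# Replacement of letters for non-Hawaiian phonemes (the correspondence set
# above), plus ʻokina -> glottal stop.
LOAN_MAP = str.maketrans({"b": "p", "d": "k", "v": "w", "s": "k",
                          "r": "l", "f": "p", "t": "k", "g": "k",
                          "ʻ": "ʔ"})

def transcribe_haw(word):
    w = word.lower().translate(LOAN_MAP)
    # mark three-vowel runs first, then remaining two-vowel runs;
    # the lookarounds keep the diphthong rule out of marked triphthongs
    w = re.sub(r"[aeiou]{3}", lambda m: f".{m.group(0)}.", w)
    w = re.sub(r"(?<!\.)[aeiou]{2}(?!\.)", lambda m: f".{m.group(0)}.", w)
    return w
```

So "pau" becomes "p.au.", as in the sample, and a borrowed word such as "rose" maps to "loke" via the correspondence set.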

Transcription example (text): Ka Moʻolelo o nā Kamehameha. No ka Noho aliʻi ʻana o Liholiho ma luna o ke Aupuni, a ua Kapa ʻia ʻo Kamehameha .

1 I ka ʻeiwa a me ka ʻumi paha o nā maka hiki o Keōpūolani, ua holo mai ʻo Kamehameha I mā i ke kaua ma Maui me Kalanikūpule, ke keiki a Kahekili, a ua hoʻouka ke kaua nui ma ʻĪao i Wailuku, a ua ʻauheʻe ʻo Kalanikūpule mā me nā aliʻi a pau o Maui iā Kamehameha, a ʻo Keōpūolani kekahi i ʻauheʻe me Kekuʻiapoiwa, kona makuahine; ma luna o nā pali kūninihi ka hāʻawe ʻana o kona makuahine, a mai make lāua, a e ʻole ka ikaika o ke kahu i ka hāʻawe, pakele ai ko lāua ola.

Transcription example (IPA): ka moʔolelo o nɑː kamehameha : no ka noho aliʔi ʔana o liholiho ma luna o ke .au.puni , a ua kapa ʔia ʔo kamehameha : i ka ʔ.ei.wa a me ka ʔumi paha o nɑː maka hiki o keoːpuːolani , ua holo m.ai. ʔo kamehameha i mɑː i ke k.au.a ma m.au.i me kalanikuːpule , ke k.ei.ki a kahekili , a ua hoʔ.ou.ka ke k.au.a nui ma ʔiː.ao. i w.ai.luku , a ua ʔ.au.heʔe ʔo kalanikuːpule mɑː me nɑː aliʔi a p.au. o m.au.i .iɑː. kamehameha , a ʔo keoːpuːolani kekahi i ʔ.au.heʔe me kekuʔiap.oi.wa , kona makuahine : ma luna o nɑː pali kuːninihi ka hɑːʔawe ʔana o kona makuahine , a m.ai. make l.ɑːu.a , a e ʔole ka ik.ai.ka o ke kahu i ka hɑːʔawe , pakele .ai. ko l.ɑːu.a ola :

MAORI

Assumed rhythm type: mora-timed
Short description: Maori has long and short vowels, and long and short diphthongs. All were transcribed; diphthongs were possibly overcounted.

Rules used in transcription:
- Grapheme-to-phoneme rules
- Sequences of three vowels transcribed as triphthongs
- Sequences of two vowels transcribed as diphthongs

Transcription example (text): Whaitere – te whai ātahu.

E noho ana a Koro Pat i runga i tāna tūru tāwhaowhao, e mātakitaki ana i ngā tamariki e toru e kanikani ana i te taha o te ahi, e tahutahu ana i ngā ngārehu, ka rere atu ki te pō. "I kite koe i tērā?" E tohu whakawaho ana a Koro Pat ki te moana. E rua ngā pākau tapatoru i kitea e pakaru mai ana i waho, e heke ana hoki ki te papaki i te wai. "He whai!" te tioro a ngā tamariki, me te oma atu ki te takutai. Ka tuohu a Kimi ki te tiki kōhatu, ā, e whakareri ana ki te whiu, ka pā te ringa o Koro Pat ki a ia. "Hoi, ka whiua e koe he kōhatu ki tō māmā?" Ka noho pōnānā a Kimi, ka taka te kōhatu i tōna ringa.

Transcription example (IPA): f.ai.tere , te f.ai. ɑːtahu : e noho ana a koro pat i ruŋa i tɑːna tuːru tɑːf.ao.f.ao. , e mɑːtakitaki ana i ŋɑː tamariki e toru e kanikani ana i te taha o te ahi , e tahutahu ana i ŋɑː ŋɑːrehu , ka rere atu ki te poː : i kite k.oe. i teːrɑː : e tohu fakawaho ana a koro pat ki te moana : e r.ua. ŋɑː pɑːk.au. tapatoru i kit.ea. e pakaru m.ai. ana i waho , e heke ana hoki ki te papaki i te w.ai. : he f.ai. : te t.io.ro a ŋɑː tamariki , me te oma atu ki te takut.ai. : ka t.uo.hu a kimi ki te tiki koːhatu , ɑː , e fakareri ana ki te f.iu. , ka pɑː te riŋa o koro pat ki a .ia. : h.oi. , ka f.iu.a e k.oe. he koːhatu ki toː mɑːmɑː : ka noho poːnɑːnɑː a kimi , ka taka te koːhatu i toːna riŋa :

SAMOAN

Assumed rhythm type: mora-timed
Short description: Samoan has phonemic vowel length: five short vowels and five corresponding long vowels. However, length is not marked in the texts I have, so all vowels were transcribed as short. There are seven diphthongs.

Rules used in transcription: - grapheme-to-phoneme rules

Issues: Long vowels were not marked in the texts and were thus transcribed as short. This does not affect the present analysis, but knowing the positions of long vowels would improve the advanced method of RM calculation.

Transcription example (text): A’o gasegase pea le Tupu Tafa’ifa o Fonoti i Mulinu’u Lalogafu’afu’a ma Sepolataemo, sa malaga atu le tama o Aputiputiatoloula ma lona tina o Melegalenu’u e asi le fa’atafa o Fonoti le Tupu, ma sa fa’apea lava fo’i le tele o le atunu’u sa gasolo i ai i le taimi lea. Fai mai sa potopoto ai Tumua e lipoi le gasegase o le Tupu. Na i ai fo’i Fuatino le masiofo a le Tupu fa’atasi ai ma lana tama o Muagututi’a le atali’i o Fonoti

Transcription example (IPA): aʔo ŋaseŋase pea le tupu tafaʔifa o fonoti i mulinuʔu laloŋafuʔafuʔa ma sepolataemo , sa malaŋa atu le tama o aputiputiatol.ou.la ma lona tina o meleŋalenuʔu e asi le faʔatafa o fonoti le tupu , ma sa faʔapea lava foʔi le tele o le atunuʔu sa ŋasolo i .ai. i le t.ai.mi lea : f.ai. m.ai. sa potopoto .ai. tumua e lip.oi. le ŋaseŋase o le tupu : na i .ai. foʔi fuatino le masiofo a le tupu faʔatasi .ai. ma lana tama o muaŋututiʔa le ataliʔi o fonoti :

TONGAN

Assumed rhythm type: mora-timed
Short description: Tongan has long vowels. It is said not to have diphthongs, unlike the other Polynesian languages I transcribed. This yields a higher %V for Tongan than for Hawaiian, Samoan, and Maori, as well as a slightly larger average word length (expressed in number of syllables).

Rules used in transcription: - grapheme-to-phoneme rules

Transcription example (text): Fakafeʻiloakí. Ngaahi Mātuʻa, Kau Faiako, Kau ʻEtivaisa mo e Kau Taki ʻOfeina ʻo e Toʻu Tupú:

Kuo uiuiʻi kimoutolu ʻe he ʻEikí ke mou tokoni ki hono fakaului ʻo e toʻu tupú ki he ongoongoleleí. Ko ha tāpuaki fakaʻofoʻofa moʻoni ia! ʻOku mou maʻu ʻa e faingamālie ke fokotuʻu ha fetuʻutaki tuʻuloa mo e toʻu tupu pelepelengesi kuo fakafalala atu ʻe he ʻEikí ke mou tokangaʻí. ʻI heʻene mahino kiate kimoutolu ʻa ʻenau ngaahi fie maʻu makehé mo e holi ʻo honau lotó, te mou lava ai ʻo tokoni ke nau aʻusia ʻiate kinautolu pē ʻa e ngaahi tāpuaki ʻo hono ako mo moʻui ʻaki ʻo e ongoongoleleí ʻi he ʻaho kotoa pē.

Transcription example (IPA): fakafeʔiloaki : ŋaahi mɑːtuʔa , kau faiako , kau ʔetivaisa mo e kau taki ʔofeina ʔo e toʔu tupu : kuo uiuiʔi kimoutolu ʔe he ʔeiki ke mou tokoni ki hono fakaului ʔo e toʔu tupu ki he oŋooŋolelei : ko ha tɑːpuaki fakaʔofoʔofa moʔoni ia : ʔoku mou maʔu ʔa e faiŋamɑːlie ke fokotuʔu ha fetuʔutaki tuʔuloa mo e toʔu tupu pelepeleŋesi kuo fakafalala atu ʔe he ʔeiki ke mou tokaŋaʔi : ʔi heʔene mahino kiate kimoutolu ʔa ʔenau ŋaahi fie maʔu makehe mo e holi ʔo honau loto , te mou lava ai ʔo tokoni ke nau aʔusia ʔiate kinautolu peː :

JAPANESE

Assumed rhythm type: mora-timed. Japanese is said to be mora-timed with respect to rhythm; because it has phonemic length for both vowels and consonants, it belongs among the quantity languages.

Short description: Japanese has long vowels; it does not have diphthongs.

To transcribe Japanese texts, I used a simple transformation of the syllabaries (hiragana and katakana) into phonemes, combining two identical vowels into a long vowel. A short list of kanji was also interpreted, based on the translations from http://life.ou.edu/stories/. An example transcription is given below. The syllable-final nasal stop was transcribed differently from /n/ in the onset; both are counted as consonants.

Rules used in transcription:
- Kana-to-IPA rules
- Sequences of two identical vowels transcribed as a long vowel
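The two post-processing steps described above (merging identical adjacent vowels into a long vowel, and keeping the moraic nasal distinct from onset /n/) can be sketched as a pass over the kana-to-IPA output. The function name and the "non-prevocalic nasal is moraic" approximation are assumptions for the example:

```python
import re

VOWELS = "aiɯeo"

def postprocess_ja(segments):
    """Merge identical adjacent vowels into a long vowel, and write the
    moraic (syllable-final) nasal as N, distinct from onset /n/, as in
    the sample transcription."""
    s = re.sub(rf"([{VOWELS}])\1", r"\1ː", segments)
    # a nasal not followed by a vowel is taken to be moraic
    s = re.sub(rf"n(?![{VOWELS}])", "N", s)
    return s
```

For example, "oɟiisan" comes out as "oɟiːsaN" and "sentakɯ" as "seNtakɯ", matching the forms in the sample transcription.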

Transcription example (text): ももたろう 。

むかし、 むかし、 ある ところ に おじいさん と おばあさん が いました。 おじいさん が やま へ き を きり に いけば、 おばあさん は 川( か わ ) へ せんたく に でかけます。 「おじいさん、 はよう もどっ て きなされ。」 「おばあさん も き を つけて な。 」 まいにち やさ し く いい あっ て でかけます。

あるひ、 おばあさん が 川 で せんたく を して いたら 、 つんぶらこ つんぶらこ もも が ながれて きました。 ひろっ て たべたら、 なんと も おいしく て ほっ ぺた が おちそう 。 おじいさん にも たべさせて あげたい と おもっ て、 「うまい もも こっちゃ こい。 にがい もも あっ ちゃ いけ。 」 と いっ たら、 どんぶらこ どんぶらこ でっ かい もも が ながれて きました。 おばあさん は よろこんで、 もも を いえ に もって かえり ました。

Transcription example (IPA): momotaroɯ : mɯkaʃi, mɯkaʃi, arɯ tokoro ni oɟiːsaN to obɑːsaN ga imaʃita : oɟiːsaN ga jama he ki o kiri ni ikeba, obɑːsaN ha kawa he seNtakɯ ni dekakemasɯ : oɟiːsaN, hajoɯ modotte kinasare : obɑːsaN mo ki o ʦɯkete na : mainiʧi jasaʃikɯ iː atte dekakemasɯ : arɯhi, obɑːsaN ga kawa de seNtakɯ o ʃite itara, ʦɯNbɯrako ʦɯNbɯrako momo ga nagarete kimaʃita : hirotte tabetara, naNtomo oiʃikɯte hoppeta ga oʧisoɯ : oɟiːsaN nimo tabesasete agetai to omotte, : ɯmai momo koʧʧa koi : nigai momo aʧʧa ike : to ittara, doNbɯrako doNbɯrako dekkai momo ga nagarete kimaʃita : obɑːsaN ha jorokoNde, momo o ie ni motte kaerimaʃita :

TURKISH

Assumed rhythm type: undecided
Short description: Turkish has diphthongs and long vowels. Both were transcribed.

Rules used in transcription:
- Grapheme-to-phoneme rules
- Diphthong transcription
- Palatalization of velar stops
- Velarization of /l/
- Deletion of syllable-final 'soft g'

Transcription example (text): Pinokyo. Bir varmış, bir yokmuş çook eski bir zamanda küçük bir kasabada Geppetto adında ihtiyar bir oyuncakçı yaşarmış. Yaptığı tahtadan oyuncakları satarak geçimini sağlarmış. İhtiyar oyuncakçının hayatta üzüldüğü tek şey bir çocuğunun olmamasıymış. Bir çocuğunun olması için neler vermezmiş ki. Bir gün yeni bir oyuncak yapmak için ormana gidip kütük aramaya başlamış. Derken tam aradığı gibi bir kütüğü bulmuş.

Transcription example (IPA): pinokjo : biɾ vaɾmɯʃ , biɾ jokmuʃ ʧook esci biɾ zamanda cyʧyk biɾ kasabada ɟeppetto adɯnda ihtijaɾ biɾ ojunʤakʧɯ jaʃaɾmɯʃ : japtɯɣɯ tahtadan ojunʤakɬaɾɯ sataɾak ɟeʧimini saːɬaɾmɯʃ : ihtijaɾ ojunʤakʧɯnɯn hajatta yzyldyɣy tek ʃ.ej. biɾ ʧoʤuɣunun oɬmamas.ɯj.mɯʃ : biɾ ʧoʤuɣunun oɬmasɯ iʧin neleɾ veɾmezmiʃ ci : biɾ ɟyn jeni biɾ ojunʤak japmak iʧin oɾmana ɟidip cytyk aɾamaja baʃɬamɯʃ : deɾcen tam aɾadɯɣɯ ɟibi biɾ cytyɣy buɬmuʃ :

GREEK

Assumed rhythm type: syllable-timed
Short description: Texts were given in the Greek alphabet. Diphthongs in Greek were implemented as a hiatus resolution of /oi, ao, uo/.

Rules used in transcription:
- Grapheme-to-phoneme rules
- /n/ and /s/ liaison after an unstressed vowel but before a stressed vowel in the next word
- Stops voiced after /n/, within a word or across a word boundary
- Transcription of some unusual letters
- Vowel deletion as resolution of a hiatus, other than those cases that result in diphthongs
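The cross-word sandhi rule (stops voiced after /n/ over a word boundary) can be sketched over a word list. This is illustrative only: the sample transcription also resegments liaison consonants into the following word, which is not modeled here:

```python
# post-nasal voicing targets (illustrative)
VOICED = {"p": "b", "t": "d", "k": "g"}

def sandhi(words):
    """A word-final /n/ is absorbed and the following word-initial stop is
    voiced, as in the sample ("ton taksiðʝoti" -> "to daksiðʝoti")."""
    out = []
    for w in words:
        if out and out[-1].endswith("n") and w and w[0] in VOICED:
            out[-1] = out[-1][:-1]       # nasal absorbed into the stop
            w = VOICED[w[0]] + w[1:]     # stop voiced
        out.append(w)
    return out
```

So ["tin", "kapa"] becomes ["ti", "gapa"], matching "ti gapa" in the sample transcription below.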

Issues: Voicing of some consonants is possibly incorrect.

Transcription example (text): Ο βοριάς κι ο ήλιος μάλωναν για το ποιος απ’ τους δυο είναι ο δυνατότερος, όταν έτυχε να περάσει από μπροστά τους ένας ταξιδιώτης που φορούσε κάπα. Όταν τον είδαν, ο βοριάς κι ο ήλιος συμφώνησαν ότι όποιος έκανε τον ταξιδιώτη να βγάλει την κάπα του θα θεωρούνταν ο πιο δυνατός. Ο βοριάς άρχισε τότε να φυσάει με μανία, αλλά όσο περισσότερο φυσούσε τόσο περισσότερο τυλιγόταν με την κάπα του ο ταξιδιώτης, ώσπου ο βοριάς κουράστηκε και σταμάτησε να φυσάει. Τότε ο ήλιος άρχισε με τη σειρά του να λάμπει δυνατά και γρήγορα ο ταξιδιώτης ζεστάθηκε κι έβγαλε την κάπα του. Έτσι ο βοριάς αναγκάστηκε να παραδεχτεί ότι ο ήλιος είναι πιο δυνατός απ’ αυτόν.

Transcription example (IPA): o voɾʝas c.oi.ʎos malonan ʝa to pços ap tus ðʝo in o ðinatoteɾos , ota netiçe na peɾasi apo bɾosta tu senas taksiðʝotis pu foɾuse kapa : ota do niðan , o voɾʝas c.oi.ʎo simfonisa noti opço sekane to daksiðʝoti na vɣali ti gapa tu θa θeoɾudan o pço ðinatos : o voɾʝas aɾçise tote na fisai me mania , al.ao.so peɾisoteɾo fisuse toso peɾisoteɾo tiliɣotan me ti gapa t.uo. taksiðʝotis , ospu o voɾʝas kuɾastice ce stamatise na fisai : tote .oi.ʎo saɾçise me ti siɾa tu na labi ðinata ce ɣɾiɣoɾ.ao. taksiðʝotis zestaθice cevɣale ti gapa tu : eʦi o voɾʝas anagastice na paɾaðexti oti .oi.ʎo sine pço ðinatos ap afton :

INDONESIAN

Assumed rhythm type: not reported
Short description: Three diphthongs are reported for Indonesian: /ai/, /au/, and /oi/. Rules were made to transcribe these, but only /au/ occurred in our 10,000-word text, and it occurred only 139 times. This supports the observation that examples of diphthongs are usually given for loanwords or a very limited set of lexical items. In Indonesian, the letter 'e' can be pronounced as a mid vowel /e/ or as a schwa. Since there is no rule determining the grapheme-to-phoneme mapping, I transcribed all 'e' vowels as schwa.

Rules used in transcription:
- Grapheme-to-phoneme rules
- Verbs that have the transitive suffix /i/ were checked so that the final /ai/ is not transcribed as a diphthong (diphthongs do not occur across a morpheme boundary)
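The morpheme-boundary check can be sketched as follows. The verb list here is hypothetical (two example verbs whose final -i is the transitive suffix); in the dissertation the check was done against the actual word list:

```python
import re

# Hypothetical examples of verbs whose final -i is the transitive suffix.
SUFFIXED_I_VERBS = {"menandai", "menghargai"}

def mark_diphthongs(word):
    """Transcribe /ai/, /au/, /oi/ as diphthongs (.xy. notation), but keep
    a stem-final vowel and the suffix /i/ apart, since diphthongs do not
    occur across a morpheme boundary."""
    w = word.lower()
    protect = w in SUFFIXED_I_VERBS and w.endswith("ai")
    body, suffix = (w[:-1], w[-1]) if protect else (w, "")
    body = re.sub(r"a[iu]|oi", lambda m: f".{m.group(0)}.", body)
    return body + suffix
```

A root-internal diphthong such as the /au/ of "pulau" is merged, while the final /a/+/i/ of a suffixed verb like "menandai" is left as hiatus.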

Issues: Vowels /e/ and /ə/ are both represented as /ə/. This should not affect the present analysis.

Transcription example (text): Malin Kundang.

Pada zaman dahulu kala, ada seorang anak bernama Malin Kundang. Ia tinggal di sebuah desa terpencil di pesisir pantai Sumatera Barat bersama ibunya. Ayah Malin Kundang pergi mencari nafkah di negeri seberang dengan mangarungi lautan yang luas. Tapi, entah kenapa sang Ayah tidak pernah kembali ke kampung halamannya. Jadi, ibu Malin Kundang harus menggantikan posisinya untuk mencari nafkah.

Transcription example (IPA): malin kundaŋ : pada zaman dahulu kala , ada səoraŋ anak bərnama malin kundaŋ : ia tiŋgal di səbuah dəsa tərpəncil di pəsisir pantaisumatəra barat bərsama ibuɲa : ajah malin kundaŋ pərgi məncari nafkah di nəgəri səbəraŋ dəŋan maŋaruŋi lautan jaŋ luas : tapi , əntah kənapa saŋ ajah tidak pərnah kəmbali kə kampuŋ halamanɲa : ɟadi , ibu malin kundaŋ harus məŋgantikan posisiɲa untuk məncari nafkah :

APPENDIX 3

VALUES OF RHYTHM AND PHONOTACTIC METRICS

Table A3.1 Rhythm Metrics values

Language     rPVI-Cr  Varco-Cr  ∆Cr   %Vr   nPVI-Vr  Varco-Vr  ∆Vr
Bulgarian    45.0     -         41.0  46.0  47.0     -         33.5
Catalan      67.8     53.0      45.2  45.6  44.6     55.0      36.8
Czech        70.0     61.5      0.0   46.0  45.0     -         -
Dutch        57.4     44.0      53.3  42.3  65.5     65.0      42.3
Estonian     40.0     0.0       52.0  0.0   45.4     -         -
German       55.3     54.0      62.0  39.8  59.7     51.5      -
Greek        59.6     46.8      41.1  48.2  48.7     57.4      -
Hawaiian     -        -         -     -     -        -         -
Hungarian    -        -         -     -     -        -         -
Indonesian   -        -         -     -     -        -         -
Italian      49.3     51.7      48.1  45.2  48.5     55.0      40.0
Japanese     62.5     -         35.6  53.1  40.9     -         40.2
Maori        -        -         -     -     -        -         -
Polish       79.1     -         51.4  41.0  46.6     -         25.1
Portuguese   -        -         -     -     -        -         -
Russian      61.0     -         54.0  -     45.0     -         -
Samoan       -        -         -     -     -        -         -
Serbian      -        -         -     -     -        -         -
Spanish      57.7     50.2      47.4  43.8  29.7     53.3      33.2
Tongan       -        -         -     -     -        -         -
Turkish      67.0     -         53.0  -     47.0     -         -
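Metrics of this family can be computed from consonantal and vocalic interval durations. The sketch below assumes the standard formulations from the rhythm-metrics literature (%V and deltas after Ramus and colleagues, PVIs after Grabe and Low, Varcos after Dellwo); the dissertation's r- and p-subscripted variants differ in how the intervals themselves are derived (acoustic vs. phonotactic), not in these formulas:

```python
import statistics as st

def rhythm_metrics(c_intervals, v_intervals):
    """Standard rhythm metrics from consonantal and vocalic interval
    durations (e.g. in ms). A sketch under the usual definitions."""
    total = sum(c_intervals) + sum(v_intervals)
    pct_v = 100.0 * sum(v_intervals) / total              # %V
    delta_c = st.pstdev(c_intervals)                      # deltaC
    varco_c = 100.0 * delta_c / st.mean(c_intervals)      # rate-normalized
    rpvi_c = st.mean(abs(a - b)                           # raw PVI
                     for a, b in zip(c_intervals, c_intervals[1:]))
    npvi_v = 100.0 * st.mean(2 * abs(a - b) / (a + b)     # normalized PVI
                             for a, b in zip(v_intervals, v_intervals[1:]))
    return {"%V": pct_v, "deltaC": delta_c, "VarcoC": varco_c,
            "rPVI-C": rpvi_c, "nPVI-V": npvi_v}
```

The normalized PVI divides each successive difference by the pair mean, which is what makes nPVI insensitive to overall speaking rate, whereas the raw PVI is not.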

Table A3.2 Phonotactic Metrics values (long V = 1)

Language     rPVI-Cp  Varco-Cp  ∆Cp     %Vp   nPVI-Vp  Varco-Vp  ∆Vp
Bulgarian    54.1     80.2      109.7   43.5  104.4    42.3      45.0
Catalan      51.3     79.0      107.11  43.3  102.9    34.8      36.2
Czech        58.1     83.1      118.4   42.1  103.2    36.5      38.2
Dutch        66.6     88.1      139.7   39.0  103.3    37.1      38.9
Estonian     50.8     79.8      109.1   43.3  104.4    42.3      45.1
German       66.0     86.9      132.0   40.4  105.4    48.1      52.2
Greek        44.2     73.5      94.0    46.7  108.2    54.9      61.6
Hawaiian     0.0      0.0       0.0     56.7  116.7    76.0      95.6
Hungarian    58.0     82.8      119.8   41.0  103.1    35.7      37.3
Indonesian   44.6     74.6      98.7    44.8  106.0    49.0      53.4
Italian      50.2     77.5      105.1   46.3  108.6    55.7      62.7
Japanese     13.5     44.4      47.7    52.1  109.7    59.3      67.8
Maori        0.0      0.0       0.0     55.4  114.4    70.3      85.4
Polish       60.1     84.8      122.7   42.3  105.3    45.1      48.6
Portuguese   31.6     64.4      76.7    50.4  112.9    67.5      80.6
Russian      71.6     89.7      139.7   39.5  103.1    35.5      37.1
Samoan       0.0      0.0       0.0     61.8  137.7    92.3      141.4
Serbian      44.6     75.2      97.3    46.3  106.9    52.1      57.6
Spanish      47.8     76.9      102.4   46.0  108.3    55.0      61.7
Tongan       0.0      0.0       0.0     56.3  120.3    76.6      98.7
Turkish      45.3     74.5      98.1    42.9  102.5    32.4      33.6

Table A3.3 Phonotactic Metrics values (long V = 2)

Language     rPVI-Cp  Varco-Cp  ∆Cp     %Vp   nPVI-Vp  Varco-Vp  ∆Vp
Bulgarian    54.1     80.2      109.7   43.5  104.4    42.3      45.0
Catalan      51.3     79.0      107.11  43.3  102.9    34.81     36.31
Czech        58.1     83.1      118.4   45.6  115.0    66.6      80.6
Dutch        66.6     88.1      139.7   43.0  117.4    69.7      86.6
Estonian     50.8     79.8      109.1   46.0  112.8    67.2      80.2
German       66.0     86.9      132.0   40.8  106.5    51.6      56.6
Greek        44.2     73.5      94.0    46.7  108.2    54.9      61.6
Hawaiian     0.0      0.0       0.0     58.7  124.5    84.4      114.9
Hungarian    58.0     82.8      119.8   45.5  118.9    71.4      89.9
Indonesian   44.6     74.6      98.7    44.8  106.0    49.0      53.4
Italian      50.2     77.5      105.1   46.3  108.6    55.7      62.7
Japanese     13.5     44.4      47.7    52.7  111.4    63.2      73.7
Maori        0.0      0.0       0.0     58.6  127.5    83.4      115.3
Polish       60.1     84.8      122.7   42.3  105.3    45.1      48.6
Portuguese   31.6     64.4      76.7    50.4  112.9    67.5      80.6
Russian      71.6     89.7      139.7   39.5  103.1    35.5      37.1
Samoan       0.0      0.0       0.0     61.8  137.7    92.3      141.4
Serbian      44.6     75.2      97.3    46.3  106.9    52.1      57.6
Spanish      47.8     76.9      102.4   46.0  108.3    55.0      61.7
Tongan       0.0      0.0       0.0     57.0  123.6    79.3      105.3
Turkish      45.3     74.5      98.1    43.1  103.1    36.0      37.6

APPENDIX 4: WORD-LENGTH DISTRIBUTIONS FOR 21 LANGUAGES

Figure A4.1 Distribution of word lengths for Bulgarian

Figure A4.2 Distribution of word lengths for Catalan

Figure A4.3 Distribution of word lengths for Czech

Figure A4.4 Distribution of word lengths for Dutch

Figure A4.5 Distribution of word lengths for Estonian

Figure A4.6 Distribution of word lengths for German

Figure A4.7 Distribution of word lengths for Greek

Figure A4.8 Distribution of word lengths for Hawaiian

Figure A4.9 Distribution of word lengths for Hungarian

Figure A4.10 Distribution of word lengths for Indonesian

Figure A4.11 Distribution of word lengths for Italian

Figure A4.12 Distribution of word lengths for Japanese

Figure A4.13 Distribution of word lengths for Maori

Figure A4.14 Distribution of word lengths for Polish

Figure A4.15 Distribution of word lengths for Portuguese

Figure A4.16 Distribution of word lengths for Russian

Figure A4.17 Distribution of word lengths for Samoan

Figure A4.18 Distribution of word lengths for Serbian

Figure A4.19 Distribution of word lengths for Spanish

Figure A4.20 Distribution of word lengths for Tongan

Figure A4.21 Distribution of word lengths for Turkish

BIBLIOGRAPHY

ABERCROMBIE, DAVID. 1967. Elements of general phonetics. Chicago: Aldine Pub Co. ADSETT, CONNIE R. and MARCHAND, YANNICK. 2010. Syllabic complexity: A computational evaluation of nine European languages. Journal of Quantitative Linguistics, 17.269-90. ARCHER, S.L. and CURTIN, SUZANNE. 2011. Perceiving onset clusters in infancy. Infant Behavior and Development, 34.534-40. ARVANITI, AMALIA. 2009. Rhythm, timing and the timing of rhythm. Phonetica, 66.46- 63. —. 2012. The usefulness of metrics in the quantification of speech rhythm. Journal of Phonetics, 40 351-73. ARVANITI, AMALIA and RODRIQUEZ, TARA. 2013. The role of rhythm class, speaking rate, and F0 in language discrimination. Laboratory Phonology, 4. AUER, PETER. 1993. Is a rhythm-based typology possible? A study of the role of prosody in phonological typology. vol. 21. KontRI Working Paper. Hamburg BARBOSA, A., PLINIO and DA SILVA, WELLINGTON. 2012. A new methodology for comparing speech rhythm structure between utterances: Beyond typological approaches. PROPOR 2012, LNAI 7243, ed. by H. Caseli, 350-61: Coimbra. BARRY, WILLIAM J., ANDREEVA, BISTRA and KOREMAN, JACQUES. 2009. Do rhythm measures reflect perceived rhythm? Phonetica, 66.78-94. BARRY, WILLIAM, ANDREEVA, BISTRA, RUSSO, MICHELA, DIMITROVA, SNEZHINA and KOSTADINOVA, TANJA. 2003. Do rhythm measures tell us anything about language type? Paper presented at The 15th International Congress of Phonetic Sciences, Barcelona, Spain. BENTON, MATTHEW, DOCKENDORF, LIZ, JIN, WENHUA, LIU, YANG and EDMONDSON, JEROLD A. 2007. The continuum of speech rhythm: Computational testing of speech rhythm of large corpora from natural Chinese and English speech. Paper presented at The 16th International Congress of Phonetic Sciences, Saarbrücken. BERTINETTO, PIER MARCO. 1989. Reflections on the dichotomy <> vs. <>. Revue de Phonetique Appliquee.99-130. BERTONCINI, JOSIANE, BIJELJAC-BABIC, RANKA, JUSCZYK, PETER W., KENNEDY, LORI J. and MEHLER, JACQUES. 1988. 
An investigation of young infants perceptual representations of speech sounds. Journal of Experimental Psychology: General, 117.21-33.

BIJELJAC-BABIC, RANKA, BERTONCINI, JOSIANE and MEHLER, JACQUES. 1993. How do four-day-old infants categorize multisyllabic utterances? Developmental Psychology, 29.711-21. BISANI, M. and NEY, H. 2008. Joint-sequence models for graphemeto phoneme conversion. Speech Communication, 50.434-51. BLEVINS, JULIETTE. 1995. The syllable in phonological theory. Handbook of Phonological Theory, ed. by John A. Goldsmith, 206-44. Cambridge, Mass: Blackwell.

145

BRENT, M.R. and CARTWRIGHT, T.A. 1996. Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61.93-125. BROWMAN, CATHERINE P. and GOLDSTEIN, LOUIS. 1989. Articulatory gestures as phonological units. Phonology, 6.201-51. CHRISTOPHE, ANN and MORTON, JOHN. 1998. Is Dutch native English? Linguistic analysis by 2-month olds. Developmental Science, 1.215-19. CUMMING, RUTH. 2008. Should rhythm metrics take account of fundamental frequency? Cambridge Occasional Papers Linguistics, 4.1-16. CUMMINS, FRED. 2002. Speech rhythm and rhythmic taxonomy. Paper presented at Speech Prosody Aix-en-Provence, France. —. 2012. Looking for rhythm in speech. Empirical Musicology Review, 7.28-35. CUTLER, ANN and NORIS, D.G. 1988. The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology Human Perception and Performance, 14.113-321. DAUER, REBECCA M. 1983. Stress-timing and syllable-timing reanalyzed. Journal of Phonetics, 11.51-62. —. 1987. Phonetic and phonological components of language rhythm. Paper presented at The 11th International Congress of Phonetic Sciences. DELLWO, VOLKER. 2006. Rhythm and speech rate: A variation coefficient for DC. Language and Language-processing, ed. by Pawel Karnowski and Imre Syigeti, 231- 41. Frankfurt am Main: Peter Lang. DELLWO, VOLKER and WAGNER, PETRA. 2003. Relations between language rhythm and speech rate. Paper presented at The 15th International Congress of Phonetic Sciences, Barcelona, Spain. DELLWO, VOLKER, FOURCIN, ADRIAN and ABBERTON, EVELYN. 2007. Rhythmical classification of languages based on voice parameters. Paper presented at The 16th International Congress of Phonetic Sciences, Saarbrücken. DIMITROVA, SNEZHINA. 1997. Bulgarian speech rhythm: Stress-timed or syllable-timed? Journal of the International Phonetic Association, 27.27-33. DONEGAN, PATRICIA and STAMPE, DAVID. 2004. Rhythm and the synthetic drift of Munda. 
Yearbook of South Asian languages and linguistics, ed. by Rajendra Singh, 3-36. Berlin: Mouton de Gruyter. DRYER, MATTHEW S. and HASPELMATH, MARTIN. 2011. The World Atlas of Language Structures vol. 2013: Munich: Max Planck Digital Library DZIUBALSKA-KOŁACZYK, KATARZYNA. 2001a. Phonotactic constraints are preferences. Constraints and preferences, ed. by Katarzyna Dziubalska-Kołaczyk, 3- 36. Berlin: Mouton de Gruyter.

—. 2001b. Constraints and preferences: Introduction. Constraints and preferences, ed. by Katarzyna Dziubalska-Kołaczyk, 3-36. Berlin: Mouton de Gruyter. DZIUBALSKA-KOŁACZYK, KATARZYNA, and D. ZIELIŃSKA.2011.Universal phonotactic and morphonotactic preferences in second language acquisition. In K. Dziubalska-Kołaczyk, M. Wrembel, & M. Kul (Eds.). Achievements and perspectives in SLA of speech: New Sounds 2010, 53-63. Frankfurt am Main: Peter Lang. EASTERDAY, SHELECE, TIMM, JASON AND MADDIESON IAN. 2011. The effects of phonological structure on the acoustic correlates of rhythm. ICPhC XVII, Hong

146

Kong. 623-26. EDWARDS, JAN and BECKMAN, MARY E. 1988. Articulatory Timing and the Prosodic Interpretation of Syllable Duration. Phonetica, 45.156-74. FANT, GUNNAR, KRUCKENBERG, ANITA and NORD, LENNART. 1991. Durational correlates of stress in Swedish French and English. Journal of Phonetics, 19.351-65. FLEGE, JAMES EMIL. 1988. Effects of speaking rate on tongue position and velocity of movement in vowel production. Journal of the Acoustical Society of America, 84.901-16. FOWLER, CAROL A. 1979. Perceptual centers in speech production and perception Perception & Psychophysics, 25.375-88. —. 1983. Converging sources of evidence on spoken and perceived rhythms of speech Cyclic production of vowels in monosyllabic stress feet. Journal of Experimental Psychology: General, 112.386-412. FROTA, SONIA and VIGARIO, MARINA. 2001. On the correlates of rhythmic distinctions: The European Brazilian Portuguese case. Probus: International Journal of Latin Romance Linguistics, 13.247-75. GALVES, ANTONIO, GARCIA, JESUS, DUARTE, DENISE and GALVES, CHARLOTTE. 2002. Sonority as a basis for rhythmic class discrimination. Paper presented at Speech Prosody, Aix-en-Provence, France. GARCIA, MARCOS and GONZÁLEZ, ISAAC J. 2012. Automatic phonetic transcription by phonological derivation. PROPOR 2012, LNAI 7243, ed. by H. Caseli, 350-61: Coimbra. GERHARDT, K., ABRAMS, R. and OLIVER, C. . 1990. Sound environment of the fetal sheep. Am. J. Obstet. Gynecol., 162.282-87. GRABE, ESTHER. 2002. Variation adds to prosodic typology. Paper presented at Speech Prosody, Aix-en-Provence, France. GRABE, ESTHER and LOW, E.E. LING. 2002. Durational variability in speech and the Rhythm Class Hypothesis. Laboratory Phonology 7, ed. by Carlos Gussenhoven and Natasha Warner, 515-46. Berlin: Mouton de Gruyter. HIRATA, YUKARI. 2004. Effects of speaking rate on the vowel length distinction in Japanese. Journal of Phonetics, 32.565-89. HOUSE, DAVID. 2012. Response to Fred Cummins: Looking for rhythm in speech. 
Empirical Musicology Review, 7.45-48. HYMAN, LARRY M. 1984. On the weightlessness of syllable onsets. Proceedings of the Tenth Annual Meeting of the Berkeley Linguistics Society, 1-14. INTERNATIONAL PHONETIC ASSOCIATION. 1999. Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet. Cambridge, UK: Cambridge University Press. JANSON, TORE. 1986. Cross-linguistic trends in the frequency of CV sequences. Phonology Yearbook, 3.179-95. JUSCZYK, P.W. and LUCE, P.A. . 1994. Infants sensitivity to phonotactic patterns in the native language Journal of Memory and Language, 33.630-45. JUSCZYK, P.W., FRIEDERICI, A.D., WESSELS, J.M.I., SVENKERUD, V.Y. and JUSCZYK, A.M. . 1993. Infants′ Sensitivity to the Sound Patterns of Native Language Words. Journal of Memory and Language, 32.402-20.

147

KLATT, DENNIS. 1976. Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America, 59.1208-21. KOZASA, TOMOKO. 2005. An acoustic and perceptual investigation of long vowels in Japanese and Pohnpeian, Department of Linguistics, University of Hawaii at Manoa Ph.D. dissertation. KREITMAN, RINA. 2012. On the relations between [sonorant] and [voice]. Consonant clusters and structural complexity, ed. by Philip Hoole, Lasse Bombien, Marianne Pouplier, Christine Mooshammer and Barbara Kuhnert, 33-70. Berlin: Walter De Gruyter. KUČERA, HENRY and K, GEORGE. 1968. Comparative quantitative phonology of Russian Czech and German. New York Elsevier. LEHISTE, ILSE. 1976. Influence of fundamental frequency patterns on the perception of duration. Journal of Phonetics, 4.113-17. —. 1977. Isochrony reconsidered. Journal of Phonetics, 5.253-63. LEVELT, CLARA C. and VAN DE VIJVER, RUBEN. 2004. Syllable types in crosslinguistic and developmental grammars. Fixing priorities: Constraints in phonological acquisition, ed. by René Kager, Joe Pater and Wim Zonneveld, 204-18. Cambridge, UK: Cambridge University Press. LIPPUS, PÄRTEL, PAJUSALU, KARL and ALLIK, JÜRI. 2009. The tonal component of Estonian quantity in native and nonnative perception. Journal of Phonetics, 37.388- 96. LOUKINA, ANNASTASSIA, KOCHANSKI, GREG , ROSNE, BURTON , KEANE, ELINOR and SHIH, CHILIN. 2011. Rhythm measures and dimensions of durational variation in speech. Journal of the Acoustical Society of America, 129. LOW, E.E. LING, GRABE, ESTHER and NOLAN, FRANCIS. 2000. Quantitative characterizations of speech rhythm: Syllable-timing in Singapore English. Language and Speech, 43.377-401. LOW, EE LING, GRABE, ESTHER and NOLAN, FRANCIS. 2000. Quantitative characterizations of speech rhythm: Syllable-timing in Singapore English. Language and Speech, 43.377-401. MARCHAND, YANNICK and ADSETT, CONNIE R. 2009. 
Automatic Syllabification in English: A comparison of different algorithms. Language and Speech, 52.1-27. MURTY, L, OTAKE, L. and CUTLER, A. 2007. Perceptual tests of rhythmic similarity: I. Mora rhythm. Language and Speech, 50.77-99. NAZZI, THIERRY and RAMUS, FRANCK. 2003. Perception and acquisition of linguistic rhythm by infants. Speech Communication, 41.233-43. NAZZI, THIERRY, BERTONCINI, JOSIANE and MEHLER, JACQUES. 1998. Language discrimination by newborns: Toward understanding of the role of rhythm. Journal of Experimental Psychology Human Perception and Performance, 24.756-66. NAZZI, THIERRY, JUSCZYK, PETER W. and JOHNSON, E.K. 2000. Language discrimination by Englishlearning 5-month olds: Effects of rhythm and familiarity. Journal of Memory and Language. NOLAN, FRANCIS and ASU, EVA LIINA. 2009. The pairwise variability index and coexisting rhythms in language. Phonetica, 66.46-63. PAMIES BERTRÁN, ANTONIO. 1999. Prosodic typology: On the dichotomy between stress-timed and syllable-timed languages. Language Design, 2.103-30.

148

PIKE, KENNETH L. 1945. The intonation of American English. Ann Arbor: University of Michigan Press.
PRIETO, PILAR, VANRELL, MARIA DEL MAR, ASTRUC, LLUÏSA, PAYNE, ELINOR and POST, BRECHTJE. 2012. Phonotactic and phrasal properties of speech rhythm: Evidence from Catalan, English and Spanish. Speech Communication, 54.681-702.
RAMUS, FRANCK. 2002. Acoustic correlates of linguistic rhythm: Perspectives. Paper presented at Speech Prosody, Aix-en-Provence, France.
RAMUS, FRANCK and MEHLER, JACQUES. 1999. Language identification with suprasegmental cues: A study based on speech resynthesis. Journal of the Acoustical Society of America, 105.512-21.
RAMUS, FRANCK, NESPOR, MARINA and MEHLER, JACQUES. 1999. Correlates of linguistic rhythm in the speech signal. Cognition, 73.265-92.
RAMUS, FRANCK, DUPOUX, EMMANUEL and MEHLER, JACQUES. 2003. The psychological reality of rhythm classes: Perceptual studies. Paper presented at the 15th International Congress of Phonetic Sciences, Barcelona, Spain.
ROACH, PETER. 1982. On the distinction between “stress-timed” and “syllable-timed” languages. Linguistic controversies, ed. by David Crystal, 73-79. Bungay, Suffolk: The Chaucer Press Ltd.
RUSSO, MICHELA and BARRY, WILLIAM J. 2008. Isochrony reconsidered: Objectifying relations between rhythm measures and speech tempo. Proc. Speech Prosody, Campinas, 419-22.
SADENIEMI, MARKUS, KETTUNEN, KIMMO, LINDH-KNUUTILA, TIINA and HONKELA, TIMO. 2008. Complexity of European Union languages: A comparative approach. Journal of Quantitative Linguistics, 15.185-211.
SAFFRAN, J.R., ASLIN, R.N. and NEWPORT, E.L. 1996. Statistical learning by 8-month-old infants. Science, 274.1926-28.
SALTARELLI, MARIO. 2008. The mora unit in Italian phonology. Folia Linguistica, 17.7-24.
SCHIERING, RENÉ, BICKEL, BALTHASAR and HILDEBRANDT, KRISTINE A. 2012. Stress-timed = word-based? Testing a hypothesis in prosodic typology. STUF – Language Typology and Universals, 65.157-68.
SLUIJTER, AGAATH M.C., VAN HEUVEN, VINCENT J. and PACILLY, JOS J.A. 1997. Spectral balance as a cue in the perception of linguistic stress. Journal of the Acoustical Society of America, 101.503-13.
STEVENS, C., BURNHAM, D., MCPHERSON, G., SCHUBERT, E. and RENWICK, J. (eds.) 2002. Rhythmic similarity: A theoretical and empirical approach. Sydney: Causal Productions.
STOJANOVIC, DIANA. 2008. Impact of segmentation rules on the rhythm metrics. Paper presented at the Acoustical Society of America Meeting, Miami, FL.
STOJANOVIC, DIANA. 2009. Modeling segmentation precision and inter-segmenter variability. Paper presented at the Acoustical Society of America Meeting, San Antonio, TX.
STORKEL, HOLLY L. and ROGERS, MARGARET A. 2000. The effect of probabilistic phonotactics on lexical acquisition. Clinical Linguistics & Phonetics, 14.407-25.
TILSEN, S. and JOHNSON, K. 2008. Low-frequency Fourier analysis of speech rhythm. Journal of the Acoustical Society of America, 124.EL34-EL39.


TOKIZAKI, HISAO and KUWANA, YASUTOMO. 2012. Structural complexity of consonant clusters. Consonant clusters and structural complexity, ed. by Philip Hoole, Lasse Bombien, Marianne Pouplier, Christine Mooshammer and Barbara Kühnert, 71-92. Berlin: Walter de Gruyter.
TORGERSEN, EIVIND NESSA and SZAKAY, ANITA. 2012. An investigation of speech rhythm in London English. Lingua, 122.822-40.
TULLER, BETTY and FOWLER, CAROL A. 1980. Some articulatory correlates of perceptual isochrony. Perception & Psychophysics, 27.277-83.
TVERSKY, AMOS. 1977. Features of similarity. Psychological Review, 84.327-52.
UMEDA, NORIKO. 1977. Consonant duration in American English. Journal of the Acoustical Society of America, 61.846-58.
VENNEMANN, THEO. 1988. Preference laws for syllable structure and the explanation of sound change: With special reference to German, Germanic, Italian, and Latin. Berlin: Mouton de Gruyter.
VENNEMANN, THEO. 2012. Structural complexity of consonant clusters. Consonant clusters and structural complexity, ed. by Philip Hoole, Lasse Bombien, Marianne Pouplier, Christine Mooshammer and Barbara Kühnert, 9-32. Berlin: Walter de Gruyter.
VITEVITCH, M. and LUCE, P.A. 1999. Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40.374-408.
WAGNER, PETRA and DELLWO, VOLKER. 2004. Introducing YARD (Yet Another Rhythm Determination) and re-introducing isochrony to rhythm research. Proc. Speech Prosody.
WHITE, LAURENCE and MATTYS, SVEN L. 2007. Calibrating rhythm: First language and second language studies. Journal of Phonetics, 35.501-22.
WHITE, LAURENCE, MATTYS, SVEN L. and WIGET, L. 2012. Language categorization by adults is based on sensitivity to durational cues, not rhythm class. Journal of Memory and Language, 66.665-79.
