Mutual Intelligibility of Chinese Dialects: An Experimental Approach
Published by
LOT
Janskerkhof 13
3512 BL Utrecht
The Netherlands
phone: +31 30 253 6006
fax: +31 30 253 6406
e-mail: [email protected]
http://www.lotschool.nl

Cover illustration: Map of mainland China with the locations of the target dialects of this study indicated.

ISBN: 978-94-6093-001-0
NUR 616

Copyright © 2009: Chaoju Tang. All rights reserved.

MUTUAL INTELLIGIBILITY OF CHINESE DIALECTS: AN EXPERIMENTAL APPROACH

DOCTORAL THESIS

submitted to obtain the degree of Doctor at Leiden University, on the authority of the Rector Magnificus, Prof. mr. P.F. van der Heijden, in accordance with the decision of the Doctorate Board, to be defended on Tuesday 8 September 2009 at 13:15

by

CHAOJU TANG
born in Chongqing, China, in 1968

Doctoral committee
Supervisor: Prof. dr. Vincent J. van Heuven
Other members:
Prof. dr. Willem F.H. Adelaar
Dr. Yiya Chen
Dr. Charlotte S. Gooskens-Christiansen (Rijksuniversiteit Groningen)
Prof. dr. ir. John Nerbonne (Rijksuniversiteit Groningen)

Contents

Acknowledgments

Chapter One Introduction
1.1 Questions
1.1.1 Dialect versus Language
1.1.2 Resemblance versus Difference
1.1.3 Complex versus Simplex
1.1.4 Intelligibility versus Mutual Intelligibility
1.2 (Mutual) Intelligibility tested experimentally
1.2.1 Functional testing method
1.2.2 Opinion testing method
1.2.3 The application of functional testing and judgment/opinion testing
1.3 Statement of the problem
1.3.1 The choice between functional and opinion testing
1.3.2 Asymmetry between Mandarin and Southern language varieties
1.3.2.1 The classification issue of Sinitic varieties
1.3.2.2 Asymmetrical mutual intelligibility between Sinitic varieties
1.3.3 Predicting mutual intelligibility from structural distance measures
1.3.3.1 Structural measures for European language varieties
1.3.3.2 Structural measures on Chinese language varieties
1.3.3.3 Predicting mutual intelligibility of Sinitic varieties
1.4 Determining the power of functional testing against opinion testing
1.5 Goal of this research
1.6 Summary of research questions
1.7 Research design and plan
1.7.1 Judgment/Opinion tests
1.7.2 Functional tests
1.7.3 Levenshtein distance measure
1.7.4 Other distance measures
1.8 Outline of the dissertation

Chapter Two The Chinese Language Situation
2.1 Introduction
2.2 Taxonomy of Chinese language varieties
2.3 Primary split between Mandarin and non-Mandarin branches
2.3.1 The non-Mandarin branch
2.3.2 The Mandarin branch
2.4 The traditional (sub)grouping of Chinese language varieties
2.5 Structural distance measures on Sinitic language varieties
2.6 Mutual intelligibility between Chinese language varieties
2.7 The popularity of Chinese dialects

Chapter Three Mutual Intelligibility of Chinese Dialects: Opinion Tests
3.1 Introduction
3.2 Method
3.2.1 Materials
3.2.2 Listeners
3.2.3 Procedure
3.2.4 Results
3.2.4.1 Judged intelligibility
3.2.4.2 Judged similarity
3.3 Correlation between judged intelligibility and judged similarity
3.4 Mutual intelligibility within and between Mandarin and non-Mandarin groups
3.5 Conclusions
3.5.1 Asymmetry between Mandarin and Non-Mandarin dialects
3.5.2 Convergence with linguistic taxonomy
3.5.3 Effect of tonal information
3.5.4 Similarity versus intelligibility judgments
3.6 Testing possible artefacts of sound quality: a control experiment
3.6.1 Introduction
3.6.2 Procedure
3.6.3 Results and conclusion

Chapter Four Mutual Intelligibility of Chinese Dialects: Functional Tests
4.1 Introduction
4.2 Functional Experiments
4.2.1 Methods
4.2.1.1 The recordings
4.2.1.1.1 Recording materials: word and sentence selection
4.2.1.1.2 Sound recordings
4.2.1.2 Listening test
4.2.1.2.1 Data segmentation and processing
4.2.1.2.2 Creating CDs
4.2.1.2.3 Answer sheets
4.2.2 Procedure
4.2.3 Results
4.2.3.1 Results from the isolated word intelligibility test
4.2.3.2 Results from the sentence intelligibility test
4.2.3.3 Mutual intelligibility within and between (non-)Mandarin groups
4.3 Correlations between subjective measures
4.3.1 Intelligibility at word and sentence level
4.3.2 Functional tests versus opinion tests
4.4 Discussion
4.5 Conclusion

Chapter Five Collecting objective measures of structural distance
5.1 Introduction
5.2 Measures of lexical affinity
5.2.1 Cheng’s lexical affinity index
5.2.2 Lexical affinity tree versus traditional dialect taxonomy
5.3 Measures of phonological affinity
5.3.1 Introduction
5.3.2 Distance between dialects based on sound inventories
5.3.2.1 Initials
5.3.2.2 Vocalic nuclei
5.3.2.3 Codas
5.3.2.4 Tones
5.3.2.5 Finals
5.3.2.6 Combining initials and codas
5.3.2.7 Concluding remarks
5.3.3 Weighing sound structures by their lexical frequency
5.3.3.1 Lexical frequency of initials in the CASS database
5.3.3.2 Lexical frequency of finals in the CASS database
5.3.3.3 Lexical frequency of codas in the CASS database
5.3.3.4 Lexical frequency of tones in the CASS database
5.3.3.5 Lexical frequency of vocalic nuclei in the CASS database
5.3.3.6 Initials and finals combined in the CASS database
5.3.3.7 Initials, finals and tones combined in the CASS database
5.3.3.8 Concluding remarks on the trees based on the CASS database
5.3.4 Levenshtein distance measures
5.3.4.1 Segmental Levenshtein distance, unweighed
5.3.4.2 Segmental Levenshtein distance, perceptually weighed
5.3.4.3 Tonal distance, unweighed
5.3.4.4 Tonal distance, perceptually weighed
5.3.4.5 Conclusions with respect to Levenshtein distance
5.3.5 Measures published in the literature
5.3.5.1 Phonological affinity based on initials
5.3.5.2 Phonological affinity based on finals
5.3.5.3 Phonological affinity based on tone transcription
5.3.5.4 Phonological affinity based on initials and finals combined
5.3.5.5 Phonological affinity based on segments and tones combined
5.3.5.6 Cheng’s phonological affinity based on correspondence rules
5.4 Conclusions

Chapter Six Predicting mutual intelligibility
6.1 Introduction
6.2 Predicting subjective ratings from objective measures
6.2.1 Single predictors of judgment scores
6.2.2 Multiple predictions of judgment scores
6.2.3 Single predictors of functional scores
6.2.4 Multiple predictions of functional scores
6.3 Conclusions

Chapter Seven Conclusion
7.1 Summary
7.2 Answers to research questions
7.2.1 The correlation between judged (mutual) intelligibility and similarity
7.2.2 Mutual intelligibility within and between (non-)Mandarin dialects
7.2.3 Mutual intelligibility predicted from objective distance measures
7.2.3.1 Correlation between subjective tests
7.2.3.2 Predicting subjective results from objective measures
7.2.3.2.1 Single predictors of judgment and functional scores
7.2.3.2.2 Multiple predictions of judgment and functional scores
7.3 The status of Taiyuan
7.4 Relating mutual intelligibility to traditional Chinese dialect taxonomy
7.5 Remaining questions

References
Samenvatting (summary in Dutch)
Summary in English
摘要 (summary in Chinese)

Appendices (numbered separately by chapter)
3.1 Listener information form
3.2 Proximity matrix generated from Table 3.1 (judged intelligibility based on monotonized speech samples)
3.3 Proximity matrix generated from Table 3.2 (judged intelligibility based on intonated speech samples)
3.4 Proximity matrix generated from Table 3.3 (judged similarity based on monotonized speech samples)
3.5 Proximity matrix generated from Table 3.4 (judged similarity based on intonated speech samples)
4.1 Stimulus words used for semantic classification task (10 categories, 15 instantiations per category)
4.2 Mandarin SPIN sentences in Chinese characters, with Pinyin transliteration (including tone numbers) and English original sentences
5.1a Lexical affinity index (LAI, proportion of cognates shared) for all pairs of listener dialects (across) and speaker dialects (down)
5.1b Proximity matrix generated from Appendix 5.1a (LAI)
5.2a Occurrence of initials (onset consonants) in the phoneme inventories of 15 dialects
5.2b Proximity matrix derived from Appendix 5.2a (initials in phoneme inventory)
5.3a Occurrence of vocalic nuclei in the phoneme inventories of 15 dialects
5.3b Proximity matrix derived from Appendix 5.3a (nuclei in phoneme inventory)
5.4a Occurrence of codas in the phoneme inventories of 15 dialects
5.4b Proximity matrix derived from Appendix 5.4a (codas in phoneme inventory)
5.5a Occurrence of word tones in the sound inventories of 15 dialects
5.5b Proximity matrix derived from Appendix 5.5a (tone inventories)
5.6a Occurrences of finals in 15 dialects
5.6b Proximity matrix derived from Appendix 5.6a (inventory of finals)
5.7a Union of occurrences of initials and codas in 15 dialects
5.7b Proximity matrix derived from Appendix 5.7a (union of initials and codas)
5.8a Lexical frequency of initials (onsets) in 15 dialects counted in the CASS database
5.8b Proximity matrix derived from Appendix 5.8a (lexical frequency of initials in the CASS database)
5.9a Lexical frequency of finals (rhymes) in 15 dialects counted in the CASS database
5.9b Proximity matrix derived from Appendix 5.9a (lexical frequency of finals)
5.10a Lexical frequency of codas in 15 dialects counted in the CASS database
5.10b Proximity matrix derived from Appendix 5.10a.