A Preliminary Study of Nuosu Yi Syllable Frequency in Text

Presented at the International Conference on Yi-Burmese Languages and Linguistics (ICYBLL), Chengdu, Sichuan, November 2012. Revised February 2015.

Dennis Walters, Doerthe Schilken, and Susan Walters SIL International, Group

Abstract

The standard written form of Nuosu Yi includes 1,165 symbols. The symbols correspond to phonemic syllables as spoken in the Shengzha variety of Northern Yi. Aspiring readers of Nuosu Yi text must memorize this sound-symbol correspondence. Studies across languages and writing systems have shown that early study of frequently used symbols can speed learning progress (Dale and Chall 1948; Hu and Catts 1998; Johnson, Smith, and Jensen 1972). While software exists for doing corpus-based analysis of Yi language data (Chen 2010; 2011), until now the literature has lacked an ordered list of frequently used Nuosu syllables. This study lists Nuosu Yi syllables in order of their frequency of occurrence in a body of text. The corpus includes nine texts, containing a total of 23,536 syllables. In the sample, 783 unique syllables occurred at least once. The cumulative usage data show that a person with reading knowledge of 402 symbols (of the available 1,165) could have read 95% of our text sample. The Nuosu Yi syllable frequency data are also compared with Mandarin Chinese syllable frequency (Sung 2005).

Introduction

Since the Yi language syllabary was approved by the China State Council in 1980, its use has been successfully popularized in the Liangshan region (Lewis, Simons, and Fennig 2014; Bradley 2009). Nuosu people take pride in seeing their written language on public signs, in their schools, on television, and in the Liangshan Daily newspaper (Yi language version). The writing system is used in traditional Nuosu culture, and there is a body of popular literature available in bookstores. Yi language material is increasingly available on the Internet as well, and it is clear that many people are studying written Yi language, both in school and informally.

The task for aspiring readers of Nuosu Yi is to memorize the sound-symbol correspondence; most of them find it takes considerable effort and time to learn to read it. Hu and Catts (1998) showed that high frequency symbols are more readily learned than low frequency symbols in logographic orthographies as well as in alphabetic ones. This means that knowing the most frequently used characters can help a teacher to teach effectively and a reader to learn more quickly. Until now, a quantitative study listing the most commonly used Nuosu Yi syllables has not been publicly available. This study lists the syllables that occur in a body of Nuosu Yi text, in order of their frequency of occurrence (Appendix). It is intended to provide a preliminary set of data, to explore and refine the method, and to suggest directions for future research.

Nuosu Yi syllables and symbols

The Nuosu Yi syllabary is based on a traditional writing system used in southern Sichuan Province (Huang 2001; Chen et al. 1985). Since its approval, it has been used, in addition to the national language, in education in the Liangshan region. The syllabary includes 819 basic symbols, plus a syllable iteration character and punctuation.

Unlike an abugida (or alphasyllabary) system, the Nuosu Yi system pairs a single unitary symbol with each basic phonemic syllable. The symbols generally do not have systematic variations that could help a reader memorize the corresponding sounds, except that mid-high tone syllables are formed by adding an inverted breve mark above related basic symbols. There is also a syllable iteration character ꀕ, which stands in for the second occurrence of a reduplicated syllable: ꈀꎭꎭ → ꈀꎭꀕ. Including mid-high tone symbols and the iteration character, there are 1,165 symbols (Table 1). Because of the one-to-one correspondence between syllables and their symbols, we refer to Nuosu “symbols” and “syllables” interchangeably in this paper.

Commonly recognized varieties of Northern Yi include Yinuo and Tianba in the north, Shengzha in the central and southwestern parts of Liangshan, and Suodi and Adur in the south (Bradley 2001; Chen et al. 1985). The standard syllabary is based on phonemic analysis of the Shengzha variety as spoken in the vicinity of Xide. Because it is a phonemic system, native speakers of most Northern Yi varieties find at least an approximate match between the syllables they speak and the symbols in the syllabary.

Walters and Schilken Nuosu Yi Syllable Frequency 2 of 24 ICYBLL 2012

Table 1. Number of standard Yi symbols

Nuosu Symbols             Count
Basic symbols             819
Mid-high tone symbols     345
Iteration symbol “w”      1
Total                     1,165

Traditionally, Nuosu writing was taught in the home by bimos, the keepers and agents of Nuosu traditional religion, equipping their sons, and sometimes their daughters, to use the writing in Nuosu folk culture. Teaching involved memorization of traditional poetry and other texts. A literacy campaign in the 1950s promoted a romanized writing system, not using the traditional Nuosu symbols. While the romanized system was easy to learn, Nuosu people preferred their traditional writing. Further study and development resulted in approval of the character-based Scheme for Standard Yi Writing (China State Council 1980; Chen et al. 1985). After that time, public education in Liangshan began to include a special track using Nuosu Yi language as the medium of instruction for all subjects. Currently, home instruction and school instruction ensure that some Nuosu Yi people become confident readers, yet the proportion is small, and many others still desire to learn to read their own language.

Text Corpus

The text corpus under study was a collection of material readily available to the authors in electronic form. As shown in Table 2, about half the material is narrative, including some transcribed oral material. Another forty percent or so is behavioral, in the form of traditional proverbs and poetry. About five percent of the material is hortatory or expository in Longacre’s (1996) classification. The variety in genre as well as in written versus spoken text gives a measure of balance to the corpus. Still, the present sample has a greater proportion of poetry and proverbs than anything else. Because of this, we might expect a reduced frequency of some function words, and a greater proportion of content words (nouns, verbs, and descriptors) than we would see in a more balanced sample.

Table 2. Text corpus by size and genre

Text            Description                                    Genre                Syllable Count   Proportion of Total
Witch           Folk tale from traditional Nuosu culture       Folk Narrative       2,029            8.6%
Day die         Folk tale from traditional Nuosu culture       Folk Narrative       861              3.7%
Firewood        Young person describes a daily life task.      Personal Narrative   215              0.9%
Flood           Mythical flood account.                        Poetic Narrative     8,525            36.2%
Proverbs        Poetic proverbs.                               Proverbs             10,242           43.5%
Magpie          Old person recounts a childhood experience.    Personal Narrative   328              1.4%
No fight        A teacher warns students not to fight.         Hortatory            187              0.8%
Welcome         A teacher welcomes new students.               Hortatory            180              0.8%
Sewing needle   Adult recounts an experience as a student.     Personal Narrative   969              4.1%
Total count                                                                         23,536           100%
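The genre balance summarized in Table 2 can be checked with simple arithmetic. The following is a minimal sketch, not part of the original study: it copies the syllable counts from Table 2 and totals them by the genre labels shown there. The names `CORPUS` and `share_by_genre` are hypothetical, chosen for illustration.

```python
from collections import defaultdict

# (text, genre, syllable count), copied from Table 2
CORPUS = [
    ("Witch", "Folk Narrative", 2029),
    ("Day die", "Folk Narrative", 861),
    ("Firewood", "Personal Narrative", 215),
    ("Flood", "Poetic Narrative", 8525),
    ("Proverbs", "Proverbs", 10242),
    ("Magpie", "Personal Narrative", 328),
    ("No fight", "Hortatory", 187),
    ("Welcome", "Hortatory", 180),
    ("Sewing needle", "Personal Narrative", 969),
]


def share_by_genre(texts):
    """Return ({genre: fraction of all syllables}, total syllable count)."""
    total = sum(count for _, _, count in texts)
    by_genre = defaultdict(int)
    for _, genre, count in texts:
        by_genre[genre] += count
    shares = {genre: count / total for genre, count in by_genre.items()}
    return shares, total


if __name__ == "__main__":
    shares, total = share_by_genre(CORPUS)
    print(f"Total syllables: {total}")
    # List genres from largest to smallest share of the corpus
    for genre, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{genre:<20}{share:6.1%}")
```

Grouping by the genre labels of Table 2 is only one possible choice; a coarser grouping (for example, all narrative genres together) would need a different key.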

Data Processing

Finding the relative frequency of language symbols is done by combing through volumes of text, listing units found there, counting their occurrences, and storing the results. Afterward, the data may be collated and presented in various ways. Automated techniques for storing and processing text have been available almost since the invention of electronic computers. For Yi language material, Shama (2000) initially used a double-byte encoding, similar to what was done for Chinese characters before Unicode. This scheme allowed for input, storage, editing, and typesetting of Yi language material. Later, Yi characters were included in the GB18030 standard, and in Unicode since version 3.0 (Unicode Consortium 2000). These developments have greatly facilitated computer processing of Yi language data.

Data shown in this study were extracted in the following steps:

• Install Primer (Weber 1999) software and set up a project for the language under study.
• Choose texts and prepare the electronic files.
• Create working copies of data files for analysis. Strip each text of metadata, leaving raw text only.
• For each text, use BabelPad (West 2004) or a similar utility to convert Yi symbols to romanized form with spaces between syllables.
• Place each file in the directory where Primer will expect to find data.
• Use Primer to generate the frequency word list.
• Import the frequency word list to a spreadsheet program.
• Sort the data, record counts, generate histogram, etc.

This workflow yielded the desired data, but with some drawbacks. For example, the iteration character ꀕ appears in our frequency list as number 42, although in the text corpus it actually stands for a number of different syllables. Ideally, the software would automatically identify the reduplicated syllables and correct the counts. Also, Primer’s counting feature expected text data to be presented in romanized form with spaces between counted forms, so we converted the character texts to romanized form.

With newer tools, syllable counts and word counts may be done more simply. PrimerPro (Schroeder 2011), an updated version of Primer, is Unicode compliant and has a graphical user interface. UnicodeCCount (Warfel and White 2011) may produce the ordered frequency list without the need to convert syllabary symbols to romanized form. Alternatively, a skilled programmer could automate the entire process of harvesting electronic data and analyzing it for frequency, as described in Chen (2010, 2011).

Results

As shown in Table 3, the most frequently occurring character in our sample, ꃅ /mu/ ‘do; ADVR’, occurred 707 times. Three hundred eighty syllables did not occur at all in the corpus.

Table 3. Most and fewest occurrences

Highest frequency    707 tokens    1 syllable
Low frequency        1 token       114 syllables
Lowest frequency     0 tokens      380 syllables

In the corpus, 783 unique syllables occurred at least once (Table 4).

Table 4. Percent of syllable inventory represented in sample

Unique syllables appearing in sample    783 of 1,164 available
Percent of total syllabary              67.3%

Table 5 shows the fifteen most frequently occurring Nuosu syllables with corresponding sample glosses.

Table 5. Most frequently occurring Nuosu Yi syllables

Rank   Symbol    Tokens   Gloss/Function
1      ꃅ mu      707      do; ADVR
2      ꆏ ne      631      2S; TOPIC
3      ꌠ su      594      REL
4      ꇬ go      543      LOC; 3S
5      ꀋ ap      514      NEG
6      ꄉ da      422      PERF; because; from
7      ꇁ la      418      come; FUT
8      ꑌ nyi     370      also; sit
9      ꉬ nge     364      COP; five
10     ꆹ li      337      TOPIC; go up
11     ꋌ cy      274      3S; wash
12     ꁧ bbo     257      go; tree
13     ꌺ sse     254      NOM; son
14     ꒉ yy      250      go down; water
15     ꂿ mo      240      see; FUT
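The counts reported above were produced with Primer on romanized text. Purely as an illustration, a comparable tally could be made directly on Unicode text with a short script. The sketch below is not the procedure used in this study. It assumes the input is Unicode Yi text, that the iteration character is U+A015, and that an iteration character repeats the syllable immediately before it; the function names `count_syllables` and `frequency_table` are hypothetical.

```python
from collections import Counter

# U+A015 is the Yi iteration character (romanized "w" in the Appendix)
ITERATION_MARK = "\uA015"


def is_yi_syllable(ch):
    """True for code points in the assigned range of the Yi Syllables block."""
    return "\uA000" <= ch <= "\uA48C"


def count_syllables(text, resolve_iteration=True):
    """Count Yi syllables in text.

    With resolve_iteration=True, each iteration mark is counted as a repeat
    of the syllable just before it (an assumption of this sketch). Any
    non-Yi character, such as a space or punctuation, resets that context.
    """
    counts = Counter()
    previous = None
    for ch in text:
        if not is_yi_syllable(ch):
            previous = None
            continue
        if ch == ITERATION_MARK and resolve_iteration and previous is not None:
            ch = previous
        counts[ch] += 1
        previous = ch
    return counts


def frequency_table(counts):
    """Rows of (syllable, tokens, proportion), highest token count first."""
    total = sum(counts.values())
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(syllable, tokens, tokens / total) for syllable, tokens in rows]


if __name__ == "__main__":
    sample = "ꈀꎭ" + ITERATION_MARK  # the reduplication example from the text
    for syllable, tokens, share in frequency_table(count_syllables(sample)):
        print(syllable, tokens, f"{share:.4%}")
```

Calling `count_syllables(text, resolve_iteration=False)` keeps the iteration character as its own entry, which corresponds to how it appears in the frequency list of this study.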


Figure 1 shows the raw frequencies of Nuosu syllables, sorted from highest to lowest. The syllable /mu/ (rank 1) occurs far more frequently than any other, with a limited set of high frequency syllables following before the curve flattens significantly. At the right-hand end of the graph, 114 syllables occurred only one time in the corpus.

[Figure: “Raw frequency in text”. Vertical axis: Occurrences, 0 to 800. Horizontal axis: All syllables by raw frequency.]

Figure 1. Raw frequencies of Nuosu Yi syllables

The left-hand side of the probability density curve (Figure 2) follows the expected log normal distribution. Beginning from rank 47, the right-hand end of the graph has spikes, showing that multiple syllables have the same rank value. For instance, the syllables /ga/ and /gu/ each occurred 111 times and therefore share rank 47. This effect would likely be less pronounced in a larger text corpus.

[Figure: “X=Random syllable in corpus is rank r”. Vertical axis: P(X), 0.0% to 3.5%. Horizontal axis: Rank r (1=Highest frequency).]

Figure 2. Probability density
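One way to read the shared ranks and the spikes in Figure 2 is this: syllables with equal token counts receive the same rank, and the probability attached to a rank is the combined share of all syllables holding it. The sketch below illustrates that reading only; it is an assumption about how the figure was built, not a description of the authors' spreadsheet. The function names are hypothetical, and the toy counts are the Appendix entries numbered 46 to 50.

```python
from itertools import groupby


def dense_ranks(counts):
    """Group syllables by token count and number the groups 1, 2, 3, ...

    counts maps syllable -> tokens. Returns a list of
    (rank, tokens, [syllables]) with tied syllables sharing one rank.
    """
    ordered = sorted(counts.items(), key=lambda kv: -kv[1])
    ranked = []
    groups = groupby(ordered, key=lambda kv: kv[1])
    for rank, (tokens, group) in enumerate(groups, start=1):
        ranked.append((rank, tokens, [syllable for syllable, _ in group]))
    return ranked


def rank_probability(ranked):
    """Probability that a randomly drawn token belongs to each rank."""
    total = sum(tokens * len(syllables) for _, tokens, syllables in ranked)
    return {
        rank: tokens * len(syllables) / total
        for rank, tokens, syllables in ranked
    }


if __name__ == "__main__":
    toy = {"mat": 112, "ga": 111, "gu": 111, "nzy": 107, "wa": 107}
    ranked = dense_ranks(toy)
    for rank, share in rank_probability(ranked).items():
        print(rank, ranked[rank - 1][2], f"{share:.2%}")
```

In the toy data, the rank shared by /ga/ and /gu/ carries more probability than the rank above it, even though each of the two syllables is individually less frequent than /mat/. That is the kind of spike described for the right-hand part of the curve.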

Our Nuosu Yi corpus failed to instantiate roughly one third of the phonemic syllables of the language. There are several possible explanations for this. The language may have changed, dropping certain syllables since the time the writing system was designed. The data corpus may not represent the language truly, e.g., through genre imbalance or faulty transcription. Or the corpus may simply be too small to instantiate all the syllables. To provide a check on the Nuosu Yi data, we compare it with a Mandarin Chinese corpus of similar size.

Comparison to Chinese syllable frequency

Nuosu Yi has several points of comparison with Mandarin Chinese. Both are isolating languages with relatively little morphology. There is significant overlap in their phonemic segments, and each has four tones. Both are written with characters that represent a single syllable each, and they have a similar number of phonemic syllables. There are about 1,300 phonemic syllables in Mandarin (Lu 2001; San 2005), with approximately 7,000 Chinese characters used to write them down (Yin and Rohsenow 1994:i). Frequency data for Chinese characters were compiled as early as 1928 (Chen, referenced in Yin and Rohsenow 1994:6) and have been used in the design of reading curriculum. However, since multiple Chinese characters can map to a single phonemic syllable, their frequency data cannot be directly compared to the Nuosu Yi syllable frequencies.

Frequency of phonemic syllables in Mandarin was reported by Sung (2005) for a corpus of 30,328 syllables, comparable to what we have done for Nuosu Yi. Sung’s study shows some of the difficulties of machine processing of text. Since the computer cannot judge meaning, it may associate alternate pronunciations with some Chinese characters. For instance, Sung’s data show only 87 occurrences of /bu4/ ‘NEG’ (92nd in Sung’s list), while the corresponding character 不 is in the top ten of recognized Chinese character frequency lists (Da 2004; Ministry of Education 2014).
This could mean that the character 不 was not in the top ten in Sung’s corpus, or perhaps that the character was mapped to the pronunciation /bu2/, reflecting a well-known Mandarin tone change process in which /bu4/ becomes /bu2/ preceding another fourth tone.

Table 6 shows the fifteen most frequently occurring Mandarin syllables compared to the top fifteen from our Nuosu Yi corpus. In both data sets, the most frequent syllables tend to have multiple senses; they tend to carry fundamental, abstract concepts; and many provide overt syntactic structure or discourse level functions, such as grounding (Arlund 2001; Langacker 1991; Dooley and Levinsohn 2001). Functions or glosses that appear in both sets are LOCATIVE, RELATIVIZER, DEFINITE, NEGATIVE, NOMINALIZER, COPULA, and THIRD PERSON SINGULAR.

Table 6. Comparison of High Frequency Syllables in Nuosu Yi and Chinese

Rank   Nuosu Yi Syllable   Fraction   Function/Gloss         Chinese Syllable   Fraction   Function/Gloss
1      mu                  3.0039%    do; ADVR               de5                4.2601%    POSS; NOM; REL; LINK
2      ne                  2.6810%    2S; TOPIC              shi4               2.5785%    COP; business
3      su                  2.5238%    REL; DEF               ren2               1.5959%    person/people
4      go                  2.3071%    LOC                    guo2               1.4376%    country
5      ap                  2.1839%    NEG                    yi1                1.4211%    INDEF; one
6      da                  1.7930%    PERF; because; from    zhong1             1.3453%    LOC; middle; China/Chinese
7      la                  1.7760%    come; FUT              zai4               1.3420%    LOC; exist; again
8      nyi                 1.5721%    also; sit              zhi4               1.1013%    until; control; govern; wisdom
9      nge                 1.5466%    COP; five              gong1              1.0815%    public; work; supply; success
10     li                  1.4318%    TOPIC; go up           you3               1.0485%    exist; have; friend(ship)
11     cy                  1.1642%    3S; other              shi2               0.9760%    time; know; true; ten
12     bbo                 1.0919%    go; CLASSIFIER         ta1                0.9364%    3S; other
13     sse                 1.0792%    NOM; son               han4               0.9002%    Han language or ethnicity
14     yy                  1.0622%    go down; water         bu2                0.8771%    NEG
15     mo                  1.0197%    see; FUT               zhi1               0.8738%    CLASSIFIER; know; carry

As in our Nuosu Yi data (Table 4), the Mandarin corpus instantiates only 837 (66.2%) of 1,265 available syllables, taking Lu’s (2001) syllable count, in a corpus of 30,328 syllables. We might ask what size corpus would be required to instantiate all the phonemic syllables in Yi or in Chinese. For a specific corpus of news articles in Taiwan, Skultety (2012a) projects that a corpus of 53,776 characters would be enough to instantiate all that are likely to ever appear there. This suggests that a large enough, well-balanced corpus of Nuosu Yi or Chinese would likewise instantiate all the phonemic syllables available.

Discussion

Considering the relative frequency of phonemes and words in a number of languages, Zipf (1929; 1935; 1945) consistently observes a log normal distribution, with a limited set of words, syllables, or segments comprising a high proportion of usage, and many unique units on the other end of the scale occurring infrequently. This is the pattern shown in Nuosu Yi as well. In addition, Zipf observes that the highest frequency phonetic forms are simple to produce, and semantically plain, as compared to less frequently used forms. The Nuosu syllable data bear this out.
Of the ten most frequent syllables, seven are function words; four are basic verbs with wide meanings; and two are pronouns. Each of the ten also occurs in multi-syllabic words, giving them a higher usage rate. For example, the more complex forms /mu yot/ ‘make; do; be’ and /syt mu lat mu/ ‘take care of business’ share similar glosses with /mu/ ‘do; make’. As in Mandarin Chinese, Nuosu Yi syllables often carry significant meaning by themselves, yet apart from a specific context, the meaning can be hard to determine. Specific meanings become clearer in multi-syllable words. This gives some functional motivation for Dai and Ling’s (1998:7) finding that most Nuosu words (62.8%) are two-syllable forms, with single-syllable words comprising only 10.7% of Nuosu words. Three- and four-syllable words comprise 11.2% and 14.4% respectively.
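The observation that a limited set of units accounts for a high proportion of usage can be made concrete as cumulative coverage: sort the token counts from highest to lowest, keep a running total, and find how many syllables are needed to reach a given share of the corpus. The sketch below is illustrative only; the function names are hypothetical, and the fifteen counts are those of the most frequent syllables listed in Table 5, with the corpus total of 23,536 taken from Table 2.

```python
from itertools import accumulate


def cumulative_coverage(token_counts, corpus_total):
    """Running share of the corpus covered by the 1, 2, 3, ... most
    frequent syllables, given each syllable's token count."""
    ordered = sorted(token_counts, reverse=True)
    return [running / corpus_total for running in accumulate(ordered)]


def symbols_needed(token_counts, target):
    """Smallest number of top-frequency syllables whose tokens make up at
    least `target` (a fraction between 0 and 1) of all tokens counted."""
    total = sum(token_counts)
    coverage = cumulative_coverage(token_counts, total)
    for n, share in enumerate(coverage, start=1):
        if share >= target:
            return n
    return len(token_counts)


if __name__ == "__main__":
    top15 = [707, 631, 594, 543, 514, 422, 418, 370,
             364, 337, 274, 257, 254, 250, 240]
    coverage = cumulative_coverage(top15, 23536)
    print(f"Top 15 syllables cover {coverage[-1]:.2%} of the corpus")
```

For the fifteen counts above, the final coverage value is 26.24%, which agrees with the cumulative percentage shown for entry 15 in the Appendix. Applying `symbols_needed` to the full list of 783 counts would give figures of the kind reported for the 50%, 75%, 90%, 95%, and 99% levels.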


Application to Language Teaching and Learning

For hundreds of years, teachers have known intuitively that frequently encountered material would be more readily learned and remembered than rare or unfamiliar material. More recently this effect has been corroborated experimentally and used extensively in “readability” formulas to classify text by level of difficulty (Klare 1968; Lorge 1944). Hu and Catts (1998) confirmed that high frequency material is more readily learned in logographic orthographies, such as Chinese and Yi, as well as in alphabetic ones.

Table 7 lists the cumulative frequency of syllables as a percentage of all syllables in the corpus. The data show that a reader who knows the sounds for 401 specific Yi characters would be able to sound out 95% of the syllables in our corpus. The point may be taken that a reader need not memorize all 819 basic characters before beginning to read. A smaller set of high-frequency characters can be enough to begin with.

Table 7. Number of Nuosu Yi syllables comprising portion of sample

Rank (Inclusive)   Number of unique syllables   Usage in sample
1-50               51                           50%
1-94               147                          75%
1-118              292                          90%
1-125              401                          95%
1-131              608                          99%
1-132              783                          100%

Barnwell’s (2003) method of primer construction assumes the need to build up meaningful words, phrases, and sentences from frequently used units. In Yi, the lower level meaning units are syllabary symbols. Words used in the primer would be a carefully chosen mix of content words and function words of high frequency. Some would be single-syllable words while others would be more than one syllable. With syllable frequencies in hand, curriculum designers may be able to improve existing materials or create new curriculum for hopeful readers of Nuosu Yi.

Conclusion

The Nuosu Yi writing system has a growing body of aspiring readers, all of whom face the difficulty of remembering the sounds of 1,165 syllabary symbols. This study (Appendix) lists the characters (syllables) used in a text corpus, arranged from highest to lowest frequency of use. The text corpus includes nine texts in five different genres, containing a total of 23,536 syllables. Through a semi-automated process, all the unique syllables were listed, counted, and sorted from highest to lowest frequency of occurrence. The sorted list is presented as an appendix. It includes 783 symbols of 1,164 available. The most frequently occurring syllable /mu/ occurred 707 times, while 114 unique syllables occurred just one time each. This follows the expected log normal distribution.


Comparing these data with a Mandarin Chinese corpus, a number of parallel effects are observed. The Chinese corpus likewise only instantiates about two thirds of the available phonemic syllables. In each list, the most frequent characters tend to be phonetically simple and semantically broad. We conclude that the results obtained for Nuosu Yi syllable frequency are reasonable and can be of value for development of reading curriculum.

The character frequency data show that by learning a relatively small number of Yi characters, a reader would be able to read much of the corpus. Early readers might study the most frequently encountered characters first, building their skills and confidence, while gradually adding selected mid- and low-frequency characters. It remains for curriculum designers to develop and test Nuosu Yi language materials using the most common characters heavily in the early stages.

In large scale corpus analysis, software tools can be used not only to count characters, but to parse words and tag them, and to present the data in various arrangements. For building a large corpus of Yi language material, optical character recognition (OCR) software would be a helpful addition to the catalog of electronic processing tools already available. This would allow for mass input of Yi language materials that are currently available in print only.

This is a preliminary study, and there is much more that might be done. Ideally, to reduce statistical skewing, this sort of study would examine a much larger text corpus, and would balance the sample by including equal proportions of various text genres. Also, it would be desirable to list frequently used word forms, in addition to isolated syllables.

Abbreviations

1S       First person singular
2S       Second person singular
3S       Third person singular
ADVR     Adverbializer
COP      Copula
DEF      Definite marker
FUT      Future time
INDEF    Indefinite marker
LINK     Linking particle
LOC      Locative
NEG      Negative
NOM      Nominalizer
PERF     Perfective aspect
POSS     Possessive
REL      Relativizer
TOPIC    Topic marker


REFERENCES

Arlund, Pam. 2001. “The Function of Classifiers and Topic Markers in Northern Yi.” In The 34th International Conference on Sino-Tibetan Languages and Linguistics. Kunming, China.
Barnwell, Katherine. 2003. “A Workshop Guide for Primer Construction.” LinguaLinks Library 5.0 plus (vii): 120.
Bradley, David. 2001. “Language Policy for the Yi.” In Perspectives on the Yi of Southwest China, 195–213. Berkeley, California: University of California Press. http://faculty.washington.edu/stevehar/Bradley.pdf.
Bradley, David. 2009. “Language Policy for China’s Minorities: Orthography Development for the Yi.” Written Language & Literacy 12 (2): 170–87. doi:10.1075/wll.12.2.03bra.
Chen, He-qin. 1928. Glossary of Characters Used in Written Language.
Chen, Shilin, Shiming Bian, and Xiuqing Li, eds. 1985. Yiyu jianzhi (A brief description of the Yi language). Zhongguo Shaoshu Minzu Yuyan Jianzhi Congshu. Minzu Chubanshe.
Chen, Shun-qiang. 2010. “The design and implementation of a standard system for Yi Language word frequency.” Journal of Southwest University for Nationalities-Natural Science Edition 36 (4): 644–48.
Chen, Shun-qiang. 2011. “Research of the information processing in the Yi language participle normative.” Journal of Southwest University for Nationalities-Natural Science Edition 37 (1): 158–60. doi:10.3969/j.issn.1003-2483.2011.01.037.
China State Council. 1980. “Yiwen Guifan Fang’an (The Scheme for Standard Yi Writing).”
Dai, Qing-xia, and Fu-xiang Ling, eds. 1998. Yiyu Cihui Xue (Yi Language Vocabulary Studies). Beijing: Central University for Nationalities Press.
Da, Jun. 2004. “Modern Chinese Character Frequency List.” http://lingua.mtsu.edu/chinese-computing/statistics/char/list.php?Which=MO.
Dale, Edgar, and Jeanne S. Chall. 1948. “A Formula for Predicting Readability.” Educational Research Bulletin 27 (1): 11–28.
Dooley, Robert A., and Stephen H. Levinsohn. 2001. Analyzing Discourse: A Manual of Basic Concepts. Dallas: SIL International. http://www.ntslibrary.com/PDF%20Books/Analyzing%20Discourse%20-%20A%20Manual%20of%20Basic%20Concepts.pdf.
Huang, Jian-ming. 2001. Yiwen Wenzi Xue (Study of Yi Language Characters). Beijing: Ethnic Publishing House.
Hu, Chieh-Fang, and Hugh W. Catts. 1998. “The Role of Phonological Processing in Early Reading Ability: What We Can Learn From Chinese.” Scientific Studies of Reading 2 (1): 55–79. doi:10.1207/s1532799xssr0201_3.
Johnson, Dale D., Richard J. Smith, and Kenneth L. Jensen. 1972. “Primary Children’s Recognition of High-Frequency Words.” The Elementary School Journal 73 (3): 162–67.
Klare, George R. 1968. “The Role of Word Frequency in Readability.” Elementary English 45 (1): 12–22.
Langacker, Ronald W. 1991. Foundations of Cognitive Grammar. Vol. 2. Stanford University Press.
Lewis, Paul M., Gary F. Simons, and Charles D. Fennig, eds. 2014. Ethnologue: Languages of the World. 17th, Online version. Dallas, Texas: SIL International. http://www.ethnologue.com/language/iii.
Longacre, Robert E. 1996. The Grammar of Discourse. 2nd ed. New York: Plenum Press. http://dx.doi.org/10.1007/978-1-4899-0162-0.
Lorge, Irving. 1944. “Word Lists as Background for Communication.” Teachers College Record 45 (8): 543–52.
Lu, Wo. 2001. “The Disagreement on Number (Quantity) and Composition Distribution of Modern Chinese Syllables.” Language Teaching and Linguistic Studies 2001 (6). http://www.cnki.com.cn/Article/CJFDTotal-YYJX200106004.htm.
Ministry of Education, Institute of Applied Linguistics. 2014. “CNCORPUS-Public Resources.” 语料库在线. http://www.cncorpus.org/Resources.aspx.
San, Duanmu. 2005. “Chinese (Mandarin), Phonology of.” In Encyclopedia of Language and Linguistics, 2nd ed. Elsevier Publishing House.
Schroeder, Kent. 2011. PrimerPro (version 2.1). MS-Windows. Dallas: SIL International. http://lingtransoft.info/apps/primerpro.
Shama, Layi. 2000. Jisuanji Yiwen Xinxi Chuli (Yi Language Computer Information Processing). Chengdu, China: Sichuan Minzu Chubanshe.
Sung, Dylan. 2005. “Pinyin.” http://www.dylwhs.talktalk.net/scilang/pinyin-stats.htm.
Unicode Consortium. 2000. The Unicode Standard: Version 3.0. Reading, Mass.: Addison-Wesley.
Warfel, Kevin, and S.E. White. 2011. UnicodeCCount. Dallas: SIL International. http://scripts.sil.org/UnicodeCharacterCount.
Weber, David. 1999. Primer (version 1.0). MS-DOS. English. Dallas: SIL International.
West, Andrew. 2004. BabelPad (version 1.4.3). MS-Windows. http://www.babelstone.co.uk/software/babelpad.html.
Yin, Bin-yong, and John R. Rohsenow. 1994. Xiandai Hanzi (Modern Chinese Characters). Beijing: Sinolingua.
Zipf, George Kingsley. 1929. “Relative Frequency as a Determinant of Phonetic Change.” Harvard Studies in Classical Philology 40 (January): 1–95. doi:10.2307/310585.
Zipf, George Kingsley. 1935. The Psycho-Biology of Language: An Introduction to Dynamic Philology. Vol. ix. Oxford, England: Houghton, Mifflin.
Zipf, George Kingsley. 1945. “The Meaning-Frequency Relationship of Words.” The Journal of General Psychology 33 (2): 251–56. doi:10.1080/00221309.1945.10544509.


Appendix: Nuosu Yi Syllables in Order of Frequency in Text

Seq Syll Rom Tokens Cum % Proportion Seq Syll Rom Tokens Cum % Proportion

1 ꃅ mu 707 3.00% 3.0039% 33 ꐯ jjy 154 40.35% 0.6543%

2 ꆏ ne 631 5.68% 2.6810% 34 ꉪ ngop 144 40.96% 0.6118%

3 ꌠ su 594 8.21% 2.5238% 35 ꈨ gge 133 41.52% 0.5651%

4 ꇬ go 543 10.52% 2.3071% 36 ꃀ mop 131 42.08% 0.5566%

5 ꀋ ap 514 12.70% 2.1839% 37 ꑳ yi 130 42.63% 0.5523%

6 ꄉ da 422 14.49% 1.7930% 38 ꌊ six 128 43.18% 0.5438%

7 ꇁ la 418 16.27% 1.7760% 39 ꄸ ddi 127 43.72% 0.5396%

8 ꑌ nyi 370 17.84% 1.5721% 40 ꐛ jjip 125 44.25% 0.5311%

9 ꉬ nge 364 19.39% 1.5466% 41 ꂶ max 124 44.77% 0.5269%

10 ꆹ li 337 20.82% 1.4318% 42 ꀕ w 123 45.30% 0.5226%

11 ꋌ cy 274 21.98% 1.1642% 43 ꉐ hxa 117 45.79% 0.4971%

12 ꁧ bbo 257 23.08% 1.0919% 44 ꈬ ggu 116 46.29% 0.4929%

13 ꌺ sse 254 24.15% 1.0792% 45 ꊿ co 114 46.77% 0.4844%

14 ꒉ yy 250 25.22% 1.0622% 46 ꂵ mat 112 47.25% 0.4759%

15 ꂿ mo 240 26.24% 1.0197% 47 ꇤ ga 111 47.72% 0.4716%

16 ꑟ xi 218 27.16% 0.9262% 48 ꇴ gu 111 48.19% 0.4716%

17 ꏮ jo 212 28.06% 0.9007% 49 ꌅ nzy 107 48.64% 0.4546%

18 ꌦ sy 210 28.96% 0.8923% 50 ꊂ wa 107 49.10% 0.4546%

19 ꂷ ma 208 29.84% 0.8838% 51 ꈌ ke 106 49.55% 0.4504%

20 ꇈ lox 206 30.71% 0.8753% 52 ꏃ shyp 105 50.00% 0.4461%

21 ꀊ a 201 31.57% 0.8540% 53 ꇮ get 100 50.42% 0.4249%

22 ꀉ ax 198 32.41% 0.8413% 54 ꑍ nyip 100 50.85% 0.4249%

23 ꂸ map 194 33.23% 0.8243% 55 ꃴ vut 100 51.27% 0.4249%

24 ꀐ ox 186 34.02% 0.7903% 56 ꎹ shep 97 51.68% 0.4121%

25 ꌕ suo 177 34.78% 0.7520% 57 ꁮ bbu 96 52.09% 0.4079%

26 ꋍ cyp 175 35.52% 0.7435% 58 ꐧ jjut 92 52.48% 0.3909%

27 ꄷ ddix 173 36.26% 0.7350% 59 ꐥ jjo 91 52.87% 0.3866%

28 ꐚ jji 166 36.96% 0.7053% 60 ꅉ dde 89 53.25% 0.3781%

29 ꑠ xip 165 37.66% 0.7011% 61 ꏭ jox 89 53.62% 0.3781%

30 ꇉ lo 164 38.36% 0.6968% 62 ꐊ qo 87 53.99% 0.3696%

31 ꄚ tit 158 39.03% 0.6713% 63 ꏂ shy 87 54.36% 0.3696%

32 ꉢ nga 156 39.69% 0.6628% 64 ꉎ hxat 86 54.73% 0.3654%


Seq Syll Rom Tokens Cum % Proportion Seq Syll Rom Tokens Cum % Proportion

65 ꑴ yip 86 55.09% 0.3654% 100 ꁨ bbop 60 65.43% 0.2549%

66 ꎭ sha 85 55.46% 0.3611% 101 ꀀ it 60 65.69% 0.2549%

67 ꀱ bur 82 55.80% 0.3484% 102 ꅊ ddep 59 65.94% 0.2507%

68 ꋻ nzop 82 56.15% 0.3484% 103 ꉆ hxit 59 66.19% 0.2507%

69 ꌋ si 82 56.50% 0.3484% 104 ꆍ nop 58 66.43% 0.2464%

70 ꐮ jjyx 80 56.84% 0.3399% 105 ꊈ wo 58 66.68% 0.2464%

71 ꇖ ly 77 57.17% 0.3272% 106 ꋒ zzi 58 66.93% 0.2464%

72 ꀑ o 77 57.49% 0.3272% 107 ꇫ gox 57 67.17% 0.2422%

73 ꎼ shu 76 57.82% 0.3229% 108 ꃪ vat 57 67.41% 0.2422%

74 ꌧ syp 75 58.14% 0.3187% 109 ꅍ ddu 56 67.65% 0.2379%

75 ꈭ ggup 74 58.45% 0.3144% 110 ꉘ hxo 56 67.89% 0.2379%

76 ꈍ kep 74 58.77% 0.3144% 111 ꃵ vux 56 68.13% 0.2379%

77 ꁏ pur 74 59.08% 0.3144% 112 ꑿ yo 55 68.36% 0.2337%

78 ꌐ sat 70 59.38% 0.2974% 113 ꄿ dda 54 68.59% 0.2294%

79 ꏦ jie 69 59.67% 0.2932% 114 ꉜ hxep 53 68.81% 0.2252%

80 ꃶ vu 69 59.96% 0.2932% 115 ꊌ wep 53 69.04% 0.2252%

81 ꍈ zha 69 60.26% 0.2932% 116 ꄰ tut 51 69.26% 0.2167%

82 ꅪ hni 68 60.55% 0.2889% 117 ꐈ qot 50 69.47% 0.2124%

83 ꆈ nuo 67 60.83% 0.2847% 118 ꇨ guo 48 69.67% 0.2039%

84 ꄡ tat 67 61.11% 0.2847% 119 ꄜ ti 48 69.88% 0.2039%

85 ꑷ yie 67 61.40% 0.2847% 120 ꇌ le 47 70.08% 0.1997%

86 ꄂ di 66 61.68% 0.2804% 121 ꑓ nyuo 47 70.28% 0.1997%

87 ꏸ jy 66 61.96% 0.2804% 122 ꐨ jjux 46 70.47% 0.1954%

88 ꊰ ci 65 62.24% 0.2762% 123 ꃆ mup 46 70.67% 0.1954%

89 ꋋ cyx 65 62.51% 0.2762% 124 ꃋ my 46 70.86% 0.1954%

90 ꅐ ddur 65 62.79% 0.2762% 125 ꑐ nyie 46 71.06% 0.1954%

91 ꎴ sho 65 63.07% 0.2762% 126 ꄹ ddip 45 71.25% 0.1912%

92 ꑭ xy 64 63.34% 0.2719% 127 ꌷ sso 45 71.44% 0.1912%

93 ꄓ dep 63 63.60% 0.2677% 128 ꊪ zy 45 71.63% 0.1912%

94 ꉉ hxip 63 63.87% 0.2677% 129 ꋦ zzur 45 71.82% 0.1912%

95 ꏢ ji 63 64.14% 0.2677% 130 ꉂ mgu 44 72.01% 0.1869%

96 ꅇ ddop 61 64.40% 0.2592% 131 ꇐ lu 43 72.19% 0.1827%

97 ꇇ lot 61 64.66% 0.2592% 132 ꇅ luo 43 72.37% 0.1827%

98 ꏿ qip 61 64.92% 0.2592% 133 ꐎ qu 43 72.56% 0.1827%

99 ꌌ sip 61 65.18% 0.2592% 134 ꋠ zze 43 72.74% 0.1827%

135 ꉠ ngat 42 72.92% 0.1785% 170 ꈎ kut 33 78.42% 0.1402%

136 ꆗ hlit 41 73.09% 0.1742% 171 ꆺ lip 33 78.56% 0.1402%

137 ꑊ nyit 41 73.27% 0.1742% 172 ꅑ ndit 33 78.70% 0.1402%

138 ꑞ xix 41 73.44% 0.1742% 173 ꑘ nyop 33 78.85% 0.1402%

139 ꉈ hxi 40 73.61% 0.1700% 174 ꑻ yuo 33 78.99% 0.1402%

140 ꑲ yix 40 73.78% 0.1700% 175 ꍔ zhep 33 79.13% 0.1402%

141 ꀃ ip 39 73.95% 0.1657% 176 ꆀ nip 32 79.26% 0.1360%

142 ꆿ lat 39 74.11% 0.1657% 177 ꌡ sup 32 79.40% 0.1360%

143 ꅔ ndip 39 74.28% 0.1657% 178 ꁱ bbur 31 79.53% 0.1317%

144 ꄮ te 39 74.44% 0.1657% 179 ꂓ hmi 31 79.66% 0.1317%

145 ꒃ yu 39 74.61% 0.1657% 180 ꆽ lie 31 79.79% 0.1317%

146 ꇊ lop 38 74.77% 0.1615% 181 ꀮ bu 30 79.92% 0.1275%

147 ꃺ vyt 38 74.93% 0.1615% 182 ꄀ dit 30 80.05% 0.1275%

148 ꋓ zzip 38 75.09% 0.1615% 183 ꏤ jiet 30 80.18% 0.1275%

149 ꐰ jjyp 37 75.25% 0.1572% 184 ꁈ po 30 80.30% 0.1275%

150 ꁌ pu 37 75.41% 0.1572% 185 ꌿ ssup 30 80.43% 0.1275%

151 ꐔ qy 37 75.57% 0.1572% 186 ꃬ va 30 80.56% 0.1275%

152 ꄩ tot 37 75.72% 0.1572% 187 ꏾ qi 29 80.68% 0.1232%

153 ꃰ vo 37 75.88% 0.1572% 188 ꋧ zzyt 29 80.80% 0.1232%

154 ꋩ zzy 37 76.04% 0.1572% 189 ꀨ bop 28 80.92% 0.1190%

155 ꁵ bbyp 36 76.19% 0.1530% 190 ꆪ hlep 28 81.04% 0.1190%

156 ꇰ ge 36 76.34% 0.1530% 191 ꈴ mga 28 81.16% 0.1190%

157 ꈜ gga 36 76.50% 0.1530% 192 ꀒ op 28 81.28% 0.1190%

158 ꇭ gop 36 76.65% 0.1530% 193 ꄐ dop 27 81.39% 0.1147%

159 ꉌ hxie 36 76.80% 0.1530% 194 ꐙ jjix 27 81.51% 0.1147%

160 ꌒ sa 36 76.95% 0.1530% 195 ꃹ vur 27 81.62% 0.1147%

161 ꎷ shex 36 77.11% 0.1530% 196 ꊩ zyx 27 81.74% 0.1147%

162 ꊭ zyr 36 77.26% 0.1530% 197 ꄻ ddie 26 81.85% 0.1105%

163 ꁯ bbup 35 77.41% 0.1487% 198 ꉇ hxix 26 81.96% 0.1105%

164 ꎔ nrat 35 77.56% 0.1487% 199 ꀂ i 26 82.07% 0.1105%

165 ꑋ nyix 35 77.71% 0.1487% 200 ꇓ lur 26 82.18% 0.1105%

166 ꊁ wax 35 77.86% 0.1487% 201 ꉡ ngax 26 82.29% 0.1105%

167 ꈩ ggep 34 78.00% 0.1445% 202 ꊨ zyt 26 82.40% 0.1105%

168 ꌩ syr 34 78.14% 0.1445% 203 ꁬ bbut 25 82.51% 0.1062%

169 ꉻ ho 33 78.28% 0.1402% 204 ꋊ cyt 25 82.61% 0.1062%

205 ꈚ ggat 25 82.72% 0.1062% 240 ꎰ shuo 21 86.14% 0.0892%

206 ꈧ ggex 25 82.83% 0.1062% 241 ꎺ shut 21 86.23% 0.0892%

207 ꏡ jix 25 82.93% 0.1062% 242 ꃱ vop 21 86.31% 0.0892%

208 ꈁ ka 25 83.04% 0.1062% 243 ꊇ wox 21 86.40% 0.0892%

209 ꈐ ku 25 83.14% 0.1062% 244 ꀵ byp 20 86.49% 0.0850%

210 ꂴ miep 25 83.25% 0.1062% 245 ꉛ hxe 20 86.57% 0.0850%

211 ꆫ hlut 24 83.35% 0.1020% 246 ꏣ jip 20 86.66% 0.0850%

212 ꅲ hna 24 83.46% 0.1020% 247 ꏯ jop 20 86.74% 0.0850%

213 ꅷ hnox 24 83.56% 0.1020% 248 ꏪ juo 20 86.83% 0.0850%

214 ꈈ ko 24 83.66% 0.1020% 249 ꇂ lap 20 86.91% 0.0850%

215 ꈾ mge 24 83.76% 0.1020% 250 ꊏ zi 20 87.00% 0.0850%

216 ꆅ na 24 83.86% 0.1020% 251 ꄔ dut 19 87.08% 0.0807%

217 ꅝ ndo 24 83.96% 0.1020% 252 ꆳ hly 19 87.16% 0.0807%

218 ꎸ she 24 84.07% 0.1020% 253 ꉖ hxot 19 87.24% 0.0807%

219 ꃨ vie 24 84.17% 0.1020% 254 ꐋ qop 19 87.32% 0.0807%

220 ꃮ vot 24 84.27% 0.1020% 255 ꑮ xyp 19 87.40% 0.0807%

221 ꒊ yyp 24 84.37% 0.1020% 256 ꉒ hxuot 18 87.48% 0.0765%

222 ꂪ hmy 23 84.47% 0.0977% 257 ꀁ ix 18 87.56% 0.0765%

223 ꂱ mip 23 84.57% 0.0977% 258 ꇗ lyp 18 87.63% 0.0765%

224 ꑱ yit 23 84.67% 0.0977% 259 ꇔ lyt 18 87.71% 0.0765%

225 ꊐ zip 23 84.76% 0.0977% 260 ꃈ mur 18 87.78% 0.0765%

226 ꀘ bi 22 84.86% 0.0935% 261 ꅥ ndup 18 87.86% 0.0765%

227 ꈤ ggo 22 84.95% 0.0935% 262 ꋽ nze 18 87.94% 0.0765%

228 ꈄ kuo 22 85.04% 0.0935% 263 ꎵ shop 18 88.01% 0.0765%

229 ꈻ mgo 22 85.14% 0.0935% 264 ꌸ ssop 18 88.09% 0.0765%

230 ꂰ mi 22 85.23% 0.0935% 265 ꄯ tep 18 88.17% 0.0765%

231 ꃥ vip 22 85.32% 0.0935% 266 ꋇ cup 17 88.24% 0.0722%

232 ꃢ vit 22 85.42% 0.0935% 267 ꂇ nbu 17 88.31% 0.0722%

233 ꑵ yiet 22 85.51% 0.0935% 268 ꁁ pa 17 88.38% 0.0722%

234 ꃚ fu 21 85.60% 0.0892% 269 ꐒ qyt 17 88.46% 0.0722%

235 ꇱ gep 21 85.69% 0.0892% 270 ꍇ zhax 17 88.53% 0.0722%

236 ꂯ mix 21 85.78% 0.0892% 271 ꇷ gur 16 88.60% 0.0680%

237 ꅽ nit 21 85.87% 0.0892% 272 ꅸ hnop 16 88.66% 0.0680%

238 ꑎ nyiet 21 85.96% 0.0892% 273 ꐩ jju 16 88.73% 0.0680%

239 ꐂ qie 21 86.05% 0.0892% 274 ꃃ mut 16 88.80% 0.0680%

275 ꃄ mux 16 88.87% 0.0680% 310 ꐭ jjyt 13 91.04% 0.0552%

276 ꆎ nex 16 88.94% 0.0680% 311 ꇙ lyr 13 91.09% 0.0552%

277 ꑆ njy 16 89.00% 0.0680% 312 ꈱ mgie 13 91.15% 0.0552%

278 ꁍ pup 16 89.07% 0.0680% 313 ꂃ nbo 13 91.20% 0.0552%

279 ꏽ qix 16 89.14% 0.0680% 314 ꅞ ndop 13 91.26% 0.0552%

280 ꌉ sit 16 89.21% 0.0680% 315 ꎧ nry 13 91.31% 0.0552%

281 ꌞ sut 16 89.28% 0.0680% 316 ꋲ nzie 13 91.37% 0.0552%

282 ꊛ zot 16 89.34% 0.0680% 317 ꀺ pi 13 91.42% 0.0552%

283 ꀞ bat 15 89.41% 0.0637% 318 ꐕ qyp 13 91.48% 0.0552%

284 ꍶ chyt 15 89.47% 0.0637% 319 ꊝ zo 13 91.53% 0.0552%

285 ꊾ cox 15 89.54% 0.0637% 320 ꋚ zza 13 91.59% 0.0552%

286 ꈯ ggur 15 89.60% 0.0637% 321 ꋖ zzie 13 91.64% 0.0552%

287 ꆜ hlie 15 89.66% 0.0637% 322 ꀖ bit 12 91.69% 0.0510%

288 ꐞ jjie 15 89.73% 0.0637% 323 ꊼ cuop 12 91.74% 0.0510%

289 ꈿ mgep 15 89.79% 0.0637% 324 ꄶ ddit 12 91.80% 0.0510%

290 ꂾ mox 15 89.85% 0.0637% 325 ꇵ gup 12 91.85% 0.0510%

291 ꀿ pat 15 89.92% 0.0637% 326 ꉷ huo 12 91.90% 0.0510%

292 ꏅ shyr 15 89.98% 0.0637% 327 ꇽ kie 12 91.95% 0.0510%

293 ꄧ tuo 15 90.05% 0.0637% 328 ꈓ kur 12 92.00% 0.0510%

294 ꁦ bbox 14 90.10% 0.0595% 329 ꑇ njyp 12 92.05% 0.0510%

295 ꁴ bby 14 90.16% 0.0595% 330 ꑣ xie 12 92.10% 0.0510%

296 ꀧ bo 14 90.22% 0.0595% 331 ꑶ yiex 12 92.15% 0.0510%

297 ꋉ cur 14 90.28% 0.0595% 332 ꍠ zhyr 12 92.20% 0.0510%

298 ꃘ fut 14 90.34% 0.0595% 333 ꊙ zuo 12 92.25% 0.0510%

299 ꈹ mgot 14 90.40% 0.0595% 334 ꁳ bbyx 11 92.30% 0.0467%

300 ꉩ ngo 14 90.46% 0.0595% 335 ꄙ dur 11 92.35% 0.0467%

301 ꁉ pop 14 90.52% 0.0595% 336 ꈥ ggop 11 92.39% 0.0467%

302 ꎿ shur 14 90.58% 0.0595% 337 ꉙ hxop 11 92.44% 0.0467%

303 ꏁ shyx 14 90.64% 0.0595% 338 ꀄ iet 11 92.49% 0.0467%

304 ꄵ tur 14 90.70% 0.0595% 339 ꇸ kit 11 92.53% 0.0467%

305 ꊋ we 14 90.76% 0.0595% 340 ꐺ njuo 11 92.58% 0.0467%

306 ꇢ gat 13 90.81% 0.0552% 341 ꀏ ot 11 92.63% 0.0467%

307 ꇯ gex 13 90.87% 0.0552% 342 ꐓ qyx 11 92.68% 0.0467%

308 ꉱ hat 13 90.92% 0.0552% 343 ꎍ rrur 11 92.72% 0.0467%

309 ꉔ hxuo 13 90.98% 0.0552% 344 ꎽ shup 11 92.77% 0.0467%

345 ꌱ ssat 11 92.82% 0.0467% 380 ꌬ ssi 9 94.26% 0.0382%

346 ꌶ ssox 11 92.86% 0.0467% 381 ꒈ yyx 9 94.30% 0.0382%

347 ꄝ tip 11 92.91% 0.0467% 382 ꍆ zhat 9 94.34% 0.0382%

348 ꍋ zhuo 11 92.96% 0.0467% 383 ꋪ zzyp 9 94.38% 0.0382%

349 ꍸ chy 10 93.00% 0.0425% 384 ꁠ bba 8 94.41% 0.0340%

350 ꇣ gax 10 93.04% 0.0425% 385 ꁙ bbip 8 94.45% 0.0340%

351 ꈘ ggie 10 93.08% 0.0425% 386 ꊸ ca 8 94.48% 0.0340%

352 ꈢ ggot 10 93.13% 0.0425% 387 ꅀ ddap 8 94.51% 0.0340%

353 ꆧ hlop 10 93.17% 0.0425% 388 ꄏ do 8 94.55% 0.0340%

354 ꐦ jjop 10 93.21% 0.0425% 389 ꃐ fip 8 94.58% 0.0340%

355 ꏶ jyt 10 93.25% 0.0425% 390 ꇜ gi 8 94.62% 0.0340%

356 ꈀ kax 10 93.30% 0.0425% 391 ꆮ hlup 8 94.65% 0.0340%

357 ꇍ lep 10 93.34% 0.0425% 392 ꂘ hmat 8 94.68% 0.0340%

358 ꈼ mgop 10 93.38% 0.0425% 393 ꂑ hmit 8 94.72% 0.0340%

359 ꉣ ngap 10 93.42% 0.0425% 394 ꅹ hnex 8 94.75% 0.0340%

360 ꐷ njie 10 93.47% 0.0425% 395 ꉹ hot 8 94.79% 0.0340%

361 ꆉ nuop 10 93.51% 0.0425% 396 ꀆ ie 8 94.82% 0.0340%

362 ꏈ ra 10 93.55% 0.0425% 397 ꆷ lit 8 94.85% 0.0340%

363 ꎆ rre 10 93.59% 0.0425% 398 ꉅ mgur 8 94.89% 0.0340%

364 ꎻ shux 10 93.64% 0.0425% 399 ꎝ nre 8 94.92% 0.0340%

365 ꌳ ssa 10 93.68% 0.0425% 400 ꐑ qur 8 94.96% 0.0340%

366 ꃤ vi 10 93.72% 0.0425% 401 ꌵ ssot 8 94.99% 0.0340%

367 ꃷ vup 10 93.76% 0.0425% 402 ꃼ vy 8 95.02% 0.0340%

368 ꍝ zhy 10 93.81% 0.0425% 403 ꀜ bie 7 95.05% 0.0297%

369 ꍣ cha 9 93.84% 0.0382% 404 ꍯ che 7 95.08% 0.0297%

370 ꄎ dox 9 93.88% 0.0382% 405 ꍬ chop 7 95.11% 0.0297%

371 ꆰ hlur 9 93.92% 0.0382% 406 ꍻ chyr 7 95.14% 0.0297%

372 ꉗ hxox 9 93.96% 0.0382% 407 ꊴ cie 7 95.17% 0.0297%

373 ꇿ kat 9 94.00% 0.0382% 408 ꇧ guox 7 95.20% 0.0297%

374 ꉃ mgup 9 94.03% 0.0382% 409 ꅳ hnap 7 95.23% 0.0297%

375 ꂮ mit 9 94.07% 0.0382% 410 ꉼ hop 7 95.26% 0.0297%

376 ꆊ not 9 94.11% 0.0382% 411 ꐪ jjup 7 95.29% 0.0297%

377 ꎤ nrur 9 94.15% 0.0382% 412 ꏲ ju 7 95.32% 0.0297%

378 ꎐ rry 9 94.19% 0.0382% 413 ꏨ juot 7 95.35% 0.0297%

379 ꎓ rryr 9 94.23% 0.0382% 414 ꏹ jyp 7 95.38% 0.0297%

415 ꈏ kux 7 95.41% 0.0297% 450 ꑃ njur 6 96.35% 0.0255%

416 ꆸ lix 7 95.44% 0.0297% 451 ꑗ nyo 6 96.38% 0.0255%

417 ꎞ nrep 7 95.47% 0.0297% 452 ꌃ nzyt 6 96.40% 0.0255%

418 ꑙ nyut 7 95.50% 0.0297% 453 ꁕ pyr 6 96.43% 0.0255%

419 ꐆ quo 7 95.53% 0.0297% 454 ꁐ pyt 6 96.45% 0.0255%

420 ꎇ rrep 7 95.56% 0.0297% 455 ꐏ qup 6 96.48% 0.0255%

421 ꎳ shox 7 95.59% 0.0297% 456 ꎂ rro 6 96.50% 0.0255%

422 ꌗ sot 7 95.62% 0.0297% 457 ꏜ ry 6 96.53% 0.0255%

423 ꌻ ssep 7 95.65% 0.0297% 458 ꍅ ssyr 6 96.55% 0.0255%

424 ꍂ ssy 7 95.68% 0.0297% 459 ꌥ syx 6 96.58% 0.0255%

425 ꍀ ssyt 7 95.71% 0.0297% 460 ꑬ xyx 6 96.61% 0.0255%

426 ꋐ zzit 7 95.74% 0.0297% 461 ꑾ yox 6 96.63% 0.0255%

427 ꀠ ba 6 95.76% 0.0255% 462 ꒇ yyt 6 96.66% 0.0255%

428 ꁘ bbi 6 95.79% 0.0255% 463 ꍑ zhet 6 96.68% 0.0255%

429 ꁭ bbux 6 95.81% 0.0255% 464 ꋨ zzyx 6 96.71% 0.0255%

430 ꄽ ddat 6 95.84% 0.0255% 465 ꁥ bbot 5 96.73% 0.0212%

431 ꄃ dip 6 95.87% 0.0255% 466 ꁰ bburx 5 96.75% 0.0212%

432 ꈝ ggap 6 95.89% 0.0255% 467 ꀙ bip 5 96.77% 0.0212%

433 ꈪ ggut 6 95.92% 0.0255% 468 ꀥ bot 5 96.79% 0.0212%

434 ꆱ hlyt 6 95.94% 0.0255% 469 ꀯ bup 5 96.81% 0.0212%

435 ꂥ hmu 6 95.97% 0.0255% 470 ꀬ but 5 96.83% 0.0212%

436 ꂫ hmyp 6 95.99% 0.0255% 471 ꊯ cix 5 96.86% 0.0212%

437 ꅫ hnip 6 96.02% 0.0255% 472 ꄖ du 5 96.88% 0.0212%

438 ꅶ hnot 6 96.04% 0.0255% 473 ꇳ gux 5 96.90% 0.0212%

439 ꐬ jjur 6 96.07% 0.0255% 474 ꆠ hla 5 96.92% 0.0212%

440 ꈂ kap 6 96.10% 0.0255% 475 ꆞ hlat 5 96.94% 0.0212%

441 ꇑ lup 6 96.12% 0.0255% 476 ꆩ hle 5 96.96% 0.0212%

442 ꇎ lut 6 96.15% 0.0255% 477 ꆴ hlyp 5 96.98% 0.0212%

443 ꇘ lyrx 6 96.17% 0.0255% 478 ꉚ hxex 5 97.00% 0.0212%

444 ꃉ myt 6 96.20% 0.0255% 479 ꐣ jjot 5 97.03% 0.0212%

445 ꂁ nbot 6 96.22% 0.0255% 480 ꇀ lax 5 97.05% 0.0212%

446 ꂂ nbox 6 96.25% 0.0255% 481 ꈲ mgat 5 97.07% 0.0212%

447 ꂋ nbyt 6 96.27% 0.0255% 482 ꂽ mot 5 97.09% 0.0212%

448 ꉨ ngox 6 96.30% 0.0255% 483 ꂼ muop 5 97.11% 0.0212%

449 ꐳ nji 6 96.32% 0.0255% 484 ꆄ nax 5 97.13% 0.0212%

485 ꂅ nbut 5 97.15% 0.0212% 520 ꏳ jup 4 97.84% 0.0170%

486 ꅙ nda 5 97.17% 0.0212% 521 ꇋ lex 4 97.86% 0.0170%

487 ꐱ njit 5 97.20% 0.0212% 522 ꂻ muo 4 97.88% 0.0170%

488 ꎪ nryr 5 97.22% 0.0212% 523 ꃌ myp 4 97.89% 0.0170%

489 ꆓ nu 5 97.24% 0.0212% 524 ꁻ nbie 4 97.91% 0.0170%

490 ꋯ nzi 5 97.26% 0.0212% 525 ꂊ nbur 4 97.93% 0.0170%

491 ꋺ nzox 5 97.28% 0.0212% 526 ꅠ nde 4 97.94% 0.0170%

492 ꐍ qux 5 97.30% 0.0212% 527 ꅡ ndep 4 97.96% 0.0170%

493 ꎫ shat 5 97.32% 0.0212% 528 ꅜ ndox 4 97.98% 0.0170%

494 ꎬ shax 5 97.34% 0.0212% 529 ꅧ ndur 4 97.99% 0.0170%

495 ꌤ syt 5 97.37% 0.0212% 530 ꆐ nep 4 98.01% 0.0170%

496 ꄟ tie 5 97.39% 0.0212% 531 ꉫ ngex 4 98.03% 0.0170%

497 ꄲ tu 5 97.41% 0.0212% 532 ꅿ ni 4 98.05% 0.0170%

498 ꄳ tup 5 97.43% 0.0212% 533 ꆂ nie 4 98.06% 0.0170%

499 ꑝ xit 5 97.45% 0.0212% 534 ꋮ nzix 4 98.08% 0.0170%

500 ꒀ yop 5 97.47% 0.0212% 535 ꌈ nzyr 4 98.10% 0.0170%

501 ꑽ yot 5 97.49% 0.0212% 536 ꀻ pip 4 98.11% 0.0170%

502 ꒌ yyr 5 97.51% 0.0212% 537 ꑰ xyr 4 98.13% 0.0170%

503 ꊎ zix 5 97.54% 0.0212% 538 ꒆ yur 4 98.15% 0.0170%

504 ꊤ zu 5 97.56% 0.0212% 539 ꊖ za 4 98.16% 0.0170%

505 ꊫ zyp 5 97.58% 0.0212% 540 ꁡ bbap 3 98.18% 0.0127%

506 ꋑ zzix 5 97.60% 0.0212% 541 ꀣ buo 3 98.19% 0.0127%

507 ꋬ zzyr 5 97.62% 0.0212% 542 ꍤ chap 3 98.20% 0.0127%

508 ꀈ at 4 97.64% 0.0170% 543 ꍫ cho 3 98.22% 0.0127%

509 ꈔ ggit 4 97.65% 0.0170% 544 ꍲ chu 3 98.23% 0.0127%

510 ꈫ ggux 4 97.67% 0.0170% 545 ꊱ cip 3 98.24% 0.0127%

511 ꇠ gie 4 97.69% 0.0170% 546 ꋀ cop 3 98.25% 0.0127%

512 ꇪ got 4 97.71% 0.0170% 547 ꄅ die 3 98.27% 0.0127%

513 ꆚ hlip 4 97.72% 0.0170% 548 ꄁ dix 3 98.28% 0.0127%

514 ꆲ hlyx 4 97.74% 0.0170% 549 ꃝ fur 3 98.29% 0.0127%

515 ꂨ hmur 4 97.76% 0.0170% 550 ꈞ gguot 3 98.30% 0.0127%

516 ꅼ hnut 4 97.77% 0.0170% 551 ꉾ he 3 98.32% 0.0127%

517 ꉏ hxax 4 97.79% 0.0170% 552 ꆶ hlyr 3 98.33% 0.0127%

518 ꐟ jjiep 4 97.81% 0.0170% 553 ꂛ hmap 3 98.34% 0.0127%

519 ꏬ jot 4 97.82% 0.0170% 554 ꂡ hmo 3 98.36% 0.0127%

555 ꂦ hmup 3 98.37% 0.0127% 590 ꊉ wop 3 98.81% 0.0127%

556 ꅰ hnat 3 98.38% 0.0127% 591 ꍘ zhup 3 98.83% 0.0127%

557 ꅺ hne 3 98.39% 0.0127% 592 ꍞ zhyp 3 98.84% 0.0127%

558 ꉍ hxiep 3 98.41% 0.0127% 593 ꋝ zzo 3 98.85% 0.0127%

559 ꉊ hxiet 3 98.42% 0.0127% 594 ꋤ zzup 3 98.87% 0.0127%

560 ꏥ jiex 3 98.43% 0.0127% 595 ꋥ zzurx 3 98.88% 0.0127%

561 ꐤ jjox 3 98.44% 0.0127% 596 ꀗ bix 2 98.89% 0.0085%

562 ꐡ jjuo 3 98.46% 0.0127% 597 ꀦ box 2 98.90% 0.0085%

563 ꈊ ket 3 98.47% 0.0127% 598 ꀴ by 2 98.90% 0.0085%

564 ꈋ kex 3 98.48% 0.0127% 599 ꀳ byx 2 98.91% 0.0085%

565 ꈉ kop 3 98.50% 0.0127% 600 ꋃ cep 2 98.92% 0.0085%

566 ꆻ liet 3 98.51% 0.0127% 601 ꍰ chep 2 98.93% 0.0085%

567 ꆼ liex 3 98.52% 0.0127% 602 ꍩ chot 2 98.94% 0.0085%

568 ꉀ mgut 3 98.53% 0.0127% 603 ꍹ chyp 2 98.95% 0.0085%

569 ꂳ mie 3 98.55% 0.0127% 604 ꊮ cit 2 98.95% 0.0085%

570 ꁿ nba 3 98.56% 0.0127% 605 ꋏ cyr 2 98.96% 0.0085%

571 ꂄ nbop 3 98.57% 0.0127% 606 ꄍ dot 2 98.97% 0.0085%

572 ꅗ ndat 3 98.59% 0.0127% 607 ꄌ duo 2 98.98% 0.0085%

573 ꅤ ndu 3 98.60% 0.0127% 608 ? et 2 98.99% 0.0085%

574 ꐴ njip 3 98.61% 0.0127% 609 ꃏ fi 2 99.00% 0.0085%

575 ꋴ nzat 3 98.62% 0.0127% 610 ꇥ gap 2 99.01% 0.0085%

576 ꁄ puo 3 98.64% 0.0127% 611 ꈣ ggox 2 99.01% 0.0085%

577 ꏓ rep 3 98.65% 0.0127% 612 ꇩ guop 2 99.02% 0.0085%

578 ꎑ rryp 3 98.66% 0.0127% 613 ꇲ gut 2 99.03% 0.0085%

579 ꎎ rryt 3 98.67% 0.0127% 614 ꉳ ha 2 99.04% 0.0085%

580 ꏋ ruo 3 98.69% 0.0127% 615 ꉽ hex 2 99.05% 0.0085%

581 ꏝ ryp 3 98.70% 0.0127% 616 ꆨ hlex 2 99.06% 0.0085%

582 ꎶ shet 3 98.71% 0.0127% 617 ꆘ hlix 2 99.07% 0.0085%

583 ꌙ so 3 98.73% 0.0127% 618 ꂢ hmop 2 99.07% 0.0085%

584 ꌭ ssip 3 98.74% 0.0127% 619 ꂣ hmut 2 99.08% 0.0085%

585 ꌟ sux 3 98.75% 0.0127% 620 ꂭ hmyr 2 99.09% 0.0085%

586 ꄣ ta 3 98.76% 0.0127% 621 ꅱ hnax 2 99.10% 0.0085%

587 ꀍ uo 3 98.78% 0.0127% 622 ꅩ hnix 2 99.11% 0.0085%

588 ꃫ vax 3 98.79% 0.0127% 623 ꉑ hxap 2 99.12% 0.0085%

589 ꃳ vep 3 98.80% 0.0127% 624 ꀅ iex 2 99.12% 0.0085%

625 ꐜ jjiet 2 99.13% 0.0085% 660 ꃦ viet 2 99.43% 0.0085%

626 ꏻ jyr 2 99.14% 0.0085% 661 ꃣ vix 2 99.44% 0.0085%

627 ꏷ jyx 2 99.15% 0.0085% 662 ꃯ vox 2 99.45% 0.0085%

628 ꇺ ki 2 99.16% 0.0085% 663 ꑪ xop 2 99.46% 0.0085%

629 ꈑ kup 2 99.17% 0.0085% 664 ꑸ yiep 2 99.46% 0.0085%

630 ꇆ luop 2 99.18% 0.0085% 665 ꑼ yuop 2 99.47% 0.0085%

631 ꉁ mgux 2 99.18% 0.0085% 666 ꊔ zat 2 99.48% 0.0085%

632 ꂎ nbyp 2 99.19% 0.0085% 667 ꍓ zhe 2 99.49% 0.0085%

633 ꅘ ndax 2 99.20% 0.0085% 668 ꍐ zhop 2 99.50% 0.0085%

634 ꅓ ndi 2 99.21% 0.0085% 669 ꍕ zhut 2 99.51% 0.0085%

635 ꐻ njot 2 99.22% 0.0085% 670 ꋙ zzax 2 99.52% 0.0085%

636 ꎖ nra 2 99.23% 0.0085% 671 ꀡ bap 1 99.52% 0.0042%

637 ꆔ nup 2 99.24% 0.0085% 672 ꁜ bbie 1 99.52% 0.0042%

638 ꑑ nyiep 2 99.24% 0.0085% 673 ꀛ biex 1 99.53% 0.0042%

639 ꑖ nyox 2 99.25% 0.0085% 674 ꀤ buop 1 99.53% 0.0042%

640 ꑛ nyu 2 99.26% 0.0085% 675 ꍢ chax 1 99.54% 0.0042%

641 ꋶ nza 2 99.27% 0.0085% 676 ꍳ chup 1 99.54% 0.0042%

642 ꌀ nzup 2 99.28% 0.0085% 677 ꍵ chur 1 99.55% 0.0042%

643 ꌂ nzur 2 99.29% 0.0085% 678 ꍴ churx 1 99.55% 0.0042%

644 ꁋ pux 2 99.29% 0.0085% 679 ꊳ ciex 1 99.55% 0.0042%

645 ꏼ qit 2 99.30% 0.0085% 680 ꊽ cot 1 99.56% 0.0042%

646 ꏍ rot 2 99.31% 0.0085% 681 ꋆ cu 1 99.56% 0.0042%

647 ꎀ rrot 2 99.32% 0.0085% 682 ꋅ cux 1 99.57% 0.0042%

648 ꎏ rryx 2 99.33% 0.0085% 683 ꄇ dat 1 99.57% 0.0042%

649 ꏖ ru 2 99.34% 0.0085% 684 ꄈ dax 1 99.58% 0.0042%

650 ꏟ ryr 2 99.35% 0.0085% 685 ꄾ ddax 1 99.58% 0.0042%

651 ꏄ shyrx 2 99.35% 0.0085% 686 ꄺ ddiex 1 99.58% 0.0042%

652 ꏀ shyt 2 99.36% 0.0085% 687 ꅆ ddo 1 99.59% 0.0042%

653 ꌹ ssex 2 99.37% 0.0085% 688 ꅂ dduo 1 99.59% 0.0042%

654 ꌪ ssit 2 99.38% 0.0085% 689 ꅏ ddurx 1 99.60% 0.0042%

655 ꌢ surx 2 99.39% 0.0085% 690 ꄒ de 1 99.60% 0.0042%

656 ꄛ tix 2 99.40% 0.0085% 691 ꄑ dex 1 99.60% 0.0042%

657 ꄴ turx 2 99.41% 0.0085% 692 ꄕ dux 1 99.61% 0.0042%

658 ꀎ uop 2 99.41% 0.0085% 693 ꃓ fa 1 99.61% 0.0042%

659 ꃭ vap 2 99.42% 0.0085% 694 ꃛ fup 1 99.62% 0.0042%

695 ꃙ fux 1 99.62% 0.0042% 730 ꆌ no 1 99.77% 0.0042%

696 ꈛ ggax 1 99.63% 0.0042% 731 ꎗ nrap 1 99.77% 0.0042%

697 ꇝ gip 1 99.63% 0.0042% 732 ꆇ nuox 1 99.78% 0.0042%

698 ꇶ gurx 1 99.63% 0.0042% 733 ꆖ nur 1 99.78% 0.0042%

699 ꉮ hit 1 99.64% 0.0042% 734 ꆑ nut 1 99.79% 0.0042%

700 ꆦ hlo 1 99.64% 0.0042% 735 ꑕ nyot 1 99.79% 0.0042%

701 ꂗ hmiep 1 99.65% 0.0042% 736 ꋰ nzip 1 99.80% 0.0042%

702 ꂔ hmip 1 99.65% 0.0042% 737 ꁂ pap 1 99.80% 0.0042%

703 ꂤ hmux 1 99.66% 0.0042% 738 ꀽ pie 1 99.80% 0.0042%

704 ꉺ hox 1 99.66% 0.0042% 739 ꀸ pit 1 99.81% 0.0042%

705 ꉶ huox 1 99.66% 0.0042% 740 ꁆ pot 1 99.81% 0.0042%

706 ꉓ hxuox 1 99.67% 0.0042% 741 ꁇ pox 1 99.82% 0.0042%

707 ꀇ iep 1 99.67% 0.0042% 742 ꁒ py 1 99.82% 0.0042%

708 ꏠ jit 1 99.68% 0.0042% 743 ꐇ quop 1 99.83% 0.0042%

709 ꐘ jjit 1 99.68% 0.0042% 744 ꐗ qyr 1 99.83% 0.0042%

710 ꐠ jjuox 1 99.69% 0.0042% 745 ꏆ rat 1 99.83% 0.0042%

711 ꏰ jut 1 99.69% 0.0042% 746 ꏒ re 1 99.84% 0.0042%

712 ꏱ jux 1 99.69% 0.0042% 747 ꎁ rrox 1 99.84% 0.0042%

713 ꇹ kix 1 99.70% 0.0042% 748 ꍿ rruo 1 99.85% 0.0042%

714 ꈆ kot 1 99.70% 0.0042% 749 ꎋ rrup 1 99.85% 0.0042%

715 ꈇ kox 1 99.71% 0.0042% 750 ꏙ rur 1 99.86% 0.0042%

716 ꈃ kuox 1 99.71% 0.0042% 751 ꏛ ryx 1 99.86% 0.0042%

717 ꆾ liep 1 99.72% 0.0042% 752 ꌓ sap 1 99.86% 0.0042%

718 ꈵ mgap 1 99.72% 0.0042% 753 ꌑ sax 1 99.87% 0.0042%

719 ꈽ mgex 1 99.72% 0.0042% 754 ꌜ se 1 99.87% 0.0042%

720 ꈷ mguo 1 99.73% 0.0042% 755 ꎲ shot 1 99.88% 0.0042%

721 ꉄ mgurx 1 99.73% 0.0042% 756 ꎯ shuox 1 99.88% 0.0042%

722 ꆆ nap 1 99.74% 0.0042% 757 ꎾ shurx 1 99.89% 0.0042%

723 ꂍ nby 1 99.74% 0.0042% 758 ꌏ siep 1 99.89% 0.0042%

724 ꉞ ngie 1 99.75% 0.0042% 759 ꌴ ssap 1 99.89% 0.0042%

725 ꉧ ngot 1 99.75% 0.0042% 760 ꌯ ssie 1 99.90% 0.0042%

726 ꉥ nguox 1 99.75% 0.0042% 761 ꌰ ssiep 1 99.90% 0.0042%

727 ꆃ niep 1 99.76% 0.0042% 762 ꌾ ssu 1 99.91% 0.0042%

728 ꐾ njop 1 99.76% 0.0042% 763 ꌖ suop 1 99.91% 0.0042%

729 ꑄ njyt 1 99.77% 0.0042% 764 ꌣ sur 1 99.92% 0.0042%

765 ꄢ tax 1 99.92% 0.0042% 776 ꍚ zhur 1 99.97% 0.0042%

766 ? tsyr 1 99.92% 0.0042% 777 ꊍ zit 1 99.97% 0.0042%

767 ꀌ uox 1 99.93% 0.0042% 778 ꊜ zox 1 99.97% 0.0042%

768 ꃲ vex 1 99.93% 0.0042% 779 ꊥ zup 1 99.98% 0.0042%

769 ꃸ vurx 1 99.94% 0.0042% 780 ꊧ zur 1 99.98% 0.0042%

770 ꃻ vyx 1 99.94% 0.0042% 781 ꊢ zut 1 99.99% 0.0042%

771 ꊀ wat 1 99.94% 0.0042% 782 ꋟ zzex 1 99.99% 0.0042%

772 ꊊ wex 1 99.95% 0.0042% 783 ꋞ zzop 1 100.00% 0.0042%

773 ꑡ xiet 1 99.95% 0.0042% 784 ꋣ zzu 1 100.00% 0.0042%

774 ꑩ xo 1 99.96% 0.0042%

775 ꍍ zhot 1 99.96% 0.0042%

Total tokens: 23,536
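
The Cum % and Proportion columns follow arithmetically from the Tokens column and the 23,536-token total given at the end of the table. The sketch below is not the authors' tooling; it only illustrates one way such columns could be derived from raw counts. The names `frequency_rows`, `symbols_needed`, and `toy_counts` are hypothetical, and the alphabetical tie-break is an assumption inferred from how rows with equal counts are ordered in the table.

```python
from itertools import accumulate

# Corpus size as given at the end of the table.
TOTAL_TOKENS = 23_536


def frequency_rows(counts, total=None):
    """Build (seq, rom, tokens, cum_pct, proportion_pct) rows.

    counts maps a romanized syllable to its token count. Rows are sorted by
    descending count; ties are broken alphabetically by romanization, which
    is an assumption based on how tied rows appear in the table.
    """
    if total is None:
        total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    running = accumulate(tokens for _, tokens in ordered)
    rows = []
    for seq, ((rom, tokens), cum) in enumerate(zip(ordered, running), start=1):
        rows.append((seq, rom, tokens, 100 * cum / total, 100 * tokens / total))
    return rows


def symbols_needed(rows, target_pct):
    """Return the smallest Seq whose cumulative percentage reaches target_pct."""
    for seq, _rom, _tokens, cum_pct, _prop in rows:
        if cum_pct >= target_pct:
            return seq
    return None


# Toy counts for illustration only; these are not corpus data.
toy_counts = {"a": 5, "b": 3, "c": 2}
toy_rows = frequency_rows(toy_counts)
print(toy_rows)
print(symbols_needed(toy_rows, 80))

# Proportion for a syllable with 156 tokens, as in row 32 of the table.
print(f"{100 * 156 / TOTAL_TOKENS:.4f}%")
```

Applied to the full table, the same lookup would return the first row whose Cum % reaches a chosen target; in the table above, the cumulative percentage first passes 95% at row 402 (95.02%).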
