A Preliminary Study of Nuosu Yi Syllable Frequency in Text Revised 2015 February

Presented at the International Conference on Yi-Burmese Languages and Linguistics (ICYBLL), Chengdu, Sichuan, China, 2012 November

Dennis Walters, Doerthe Schilken, and Susan Walters
SIL International, East Asia Group

Abstract

The standard written form of Nuosu Yi includes 1,165 symbols. The symbols correspond to phonemic syllables as spoken in the Shengzha variety of Northern Yi. Aspiring readers of Nuosu Yi text must memorize this sound-symbol correspondence. Studies across languages and writing systems have shown that early study of frequently used symbols can speed learning progress (Dale and Chall 1948; Hu and Catts 1998; Johnson, Smith, and Jensen 1972). While software exists for doing corpus-based analysis of Yi language data (Chen 2010; 2011), until now the literature lacked an ordered list of frequently used Nuosu syllables. This study lists Nuosu Yi syllables in order of their frequency of occurrence in a body of text. The corpus includes nine texts containing a total of 23,536 syllables. In the sample, 783 unique syllables occurred at least once. The cumulative usage data show that a person with reading knowledge of 402 symbols (of the available 1,165) could have read 95% of our text sample. The Nuosu Yi syllable frequency data are also compared with Mandarin Chinese syllable frequency (Sung 2005).

Introduction

Since the Yi language syllabary was approved by the China State Council in 1980, its use has been successfully popularized in the Liangshan region (Lewis, Simons, and Fennig 2014; Bradley 2009). Nuosu people take pride in seeing their written language on public signs, in their schools, on television, and in the Liangshan Daily newspaper (Yi language version).
The writing system is used in traditional Nuosu culture, and there is a body of popular literature available in bookstores. Yi language material is increasingly available on the Internet as well, and it is clear that many people are studying written Yi language, both in school and informally.

The task for aspiring readers of Nuosu Yi is to memorize the sound-symbol correspondence; most of them find that learning to read takes considerable effort and time. Hu and Catts (1998) showed that high frequency symbols are more readily learned than low frequency symbols in logographic orthographies as well as in alphabetic ones. This means that knowing the most frequently used characters can help a teacher to teach effectively and a reader to learn more quickly. Until now, a quantitative study listing the most commonly used Nuosu Yi syllables has not been publicly available. This study lists the syllables that occur in a body of Nuosu Yi text in order of their frequency of occurrence (Appendix). It is intended to provide a preliminary set of data, to explore and refine the method, and to suggest directions for future research.

Nuosu Yi syllables and symbols

The Nuosu Yi syllabary is based on a traditional writing system used among the Yi people of southern Sichuan Province (Huang 2001; Chen et al. 1985). Since its approval, it has been used, in addition to the national language, in education in the Liangshan region. The syllabary includes 819 basic symbols, plus a syllable iteration character and punctuation. Unlike an abugida (or alphasyllabary) system, the Nuosu Yi system pairs a single unitary symbol with each basic phonemic syllable. The symbols generally do not have systematic variations that could help a reader memorize the corresponding sounds, except that mid-high tone syllables are formed by adding an inverted breve mark above related basic symbols.
There is also a syllable iteration character ꀕ, which stands in for the second occurrence of a reduplicated syllable: ꈀꎭꎭ → ꈀꎭꀕ. Including mid-high tone symbols and the iteration character, there are 1,165 symbols (Table 1). Because of the one-to-one correspondence between syllables and their symbols, we refer to Nuosu “symbols” and “syllables” interchangeably in this paper.

Commonly recognized varieties of Northern Yi include Yinuo and Tianba in the north, Shengzha in the central and southwestern parts of Liangshan, and Suodi and Adur in the south (Bradley 2001; Chen et al. 1985). The standard syllabary is based on phonemic analysis of the Shengzha variety as spoken in the vicinity of Xide. Because it is a phonemic system, native speakers of most Northern Yi varieties find at least an approximate match between the syllables they speak and the symbols in the syllabary.

Table 1. Number of standard Yi symbols

Nuosu Symbols              Count
Basic symbols                819
Mid high tone symbols        345
Iteration symbol “w”           1
Total                      1,165

Traditionally, Nuosu writing was taught in the home by bimos, the keepers and agents of Nuosu traditional religion, equipping their sons, and sometimes their daughters, to use the writing in Nuosu folk culture. Teaching involved memorization of traditional poetry and other texts. A literacy campaign in the 1950s promoted a romanized writing system, not using the traditional Nuosu symbols. While the romanized system was easy to learn, Nuosu people preferred their traditional writing. Further study and development resulted in approval of the character-based Scheme for Standard Yi Writing (China State Council 1980; Chen et al. 1985). After that time, public education in Liangshan began to include a special track using Nuosu Yi language as the medium of instruction for all subjects.
Currently, home instruction and school instruction ensure that some Nuosu Yi people become confident readers, yet the proportion is small, and many others still desire to learn to read their own language.

Text Corpus

The text corpus under study was a collection of material readily available to the authors in electronic form. As shown in Table 2, about half the material is narrative, including some transcribed oral material. Another forty percent or so is behavioral, in the form of traditional proverbs and poetry. About five percent of the material is hortatory or expository in Longacre’s (1996) classification. The variety in genre as well as in written versus spoken text gives a measure of balance to the corpus. Still, the present sample has a greater proportion of poetry and proverbs than anything else. Because of this, we might expect a reduced frequency of some function words, and a greater proportion of content words (nouns, verbs, and descriptors) than we would see in a more balanced sample.

Table 2. Text corpus by size and genre

Text            Description                                   Genre                Syllable Count   Proportion of Total
Witch           Folk tale from traditional Nuosu culture      Folk Narrative                2,029                  8.6%
Day die         Folk tale from traditional Nuosu culture      Folk Narrative                  861                  3.7%
Firewood        Young person describes a daily life task.     Personal Narrative              215                  0.9%
Flood           Mythical flood account.                       Poetic Narrative              8,525                 36.2%
Proverbs        Poetic proverbs.                              Proverbs                     10,242                 43.5%
Magpie          Old person recounts a childhood experience.   Personal Narrative              328                  1.4%
No fight        A teacher warns students not to fight.        Hortatory                       187                  0.8%
Welcome         A teacher welcomes new students.              Hortatory                       180                  0.8%
Sewing needle   Adult recounts an experience as a student.    Personal Narrative              969                  4.1%
Total count                                                                                23,536                  100%

Data Processing

Finding the relative frequency of language symbols is done by combing through volumes of text, listing units found there, counting their occurrences, and storing the results.
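As a minimal sketch of the listing, counting, and storing just described, the following Python fragment counts Yi symbols directly in Unicode text. It is an illustration only, not the workflow used in this study: the function names are hypothetical, the sample string is built from symbols quoted in this paper, and the code assumes that every symbol of interest lies in the Unicode Yi Syllables block (U+A000 to U+A48C, 1,165 code points).

```python
from __future__ import annotations

from collections import Counter

# Assumption: the standard symbols are exactly the code points of the
# Unicode Yi Syllables block, U+A000..U+A48C (1,165 code points).
YI_FIRST, YI_LAST = 0xA000, 0xA48C


def is_yi_symbol(ch: str) -> bool:
    """Return True if ch is a code point in the Yi Syllables block."""
    return YI_FIRST <= ord(ch) <= YI_LAST


def count_symbols(text: str) -> Counter:
    """List the Yi symbols found in text and count their occurrences."""
    return Counter(ch for ch in text if is_yi_symbol(ch))


def frequency_list(counts: Counter) -> list[tuple[str, int]]:
    """Order results by descending count, breaking ties by code point."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical sample text; spaces and other characters are ignored.
    sample = "ꈀꎭꎭ ꃅ ꈀꎭꀕ"
    for symbol, n in frequency_list(count_symbols(sample)):
        print(f"U+{ord(symbol):04X}\t{symbol}\t{n}")
```

Counting code points rather than space-delimited romanized forms avoids the conversion step, at the cost of treating the iteration character as an ordinary symbol.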
Afterward, the data may be collated and presented in various ways. Automated techniques for storing and processing text have been available almost since the invention of electronic computers. For Yi language material, Shama (2000) initially used a double-byte encoding, similar to what was done for Chinese characters before Unicode. This scheme allowed for input, storage, editing, and typesetting of Yi language material. Later, Yi characters were included in the GB18030 standard, and they have been in Unicode since version 3.0 (Unicode Consortium 2000). These developments have greatly facilitated computer processing of Yi language data.

Data shown in this study were extracted in the following steps:

1. Install Primer (Weber 1999) software and set up a project for the language under study.
2. Choose texts and prepare the electronic files.
3. Create working copies of data files for analysis.
4. Strip each text of metadata, leaving raw text only.
5. For each text, use BabelPad (West 2004) or a similar utility to convert Yi symbols to romanized form with a space between syllables.
6. Place each file in the directory where Primer will expect to find data.
7. Use Primer to generate the frequency word list.
8. Import the frequency word list to a spreadsheet program.
9. Sort the data, record counts, generate histogram, etc.

This workflow yielded the desired data, but with some drawbacks. For example, the iteration character ꀕ appears in our frequency list as number 42, although in the text corpus it actually stands for a number of different characters. Ideally, the software would automatically identify the reduplicated syllables and correct the counts. Also, Primer’s counting feature expected text data to be presented in romanized form with spaces between counted forms, so we converted the character texts to romanized form. With newer tools, syllable counts and word counts may be done more simply.
PrimerPro (Schroeder 2011), an updated version of Primer, is Unicode compliant and has a graphical user interface. UnicodeCCount (Warfel and White 2011) may produce the ordered frequency list without the need to convert syllabary symbols to romanized form. Alternatively, a skilled programmer could automate the entire process of harvesting electronic data and analyzing it for frequency, as described in Chen (2010, 2011).

Results

As shown in Table 3, the most frequently occurring character in our sample, ꃅ /mu/ ‘do; ADVR’, occurred 707 times.
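For scale, 707 occurrences among the 23,536 syllables of the corpus is a relative frequency of about 3.0%. The Python sketch below suggests how the correction for the iteration character and a cumulative coverage figure might be computed. It is not the procedure used in this study: the function names are hypothetical, the iteration character is assumed to be Unicode U+A015, and it is assumed to follow directly after the symbol it repeats.

```python
from collections import Counter

ITERATION_MARK = "\ua015"  # assumed code point of the iteration character
YI_FIRST, YI_LAST = 0xA000, 0xA48C  # Unicode Yi Syllables block


def is_yi_symbol(ch):
    """Return True if ch is a code point in the Yi Syllables block."""
    return YI_FIRST <= ord(ch) <= YI_LAST


def expand_iteration(text):
    """Replace each iteration mark with the Yi symbol directly before it.

    A mark with no Yi symbol immediately before it is left unchanged.
    """
    out = []
    for ch in text:
        prev = out[-1] if out else ""
        if ch == ITERATION_MARK and prev and is_yi_symbol(prev):
            out.append(prev)
        else:
            out.append(ch)
    return "".join(out)


def symbols_needed(counts, target):
    """Return how many of the most frequent symbols are needed for their
    combined count to reach the given share (0..1) of all occurrences."""
    total = sum(counts.values())
    running = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        if running >= target * total:
            return rank
    return len(counts)


if __name__ == "__main__":
    # Relative frequency of the top symbol, from figures given in the paper.
    print(f"{707 / 23536:.1%}")  # prints 3.0%

    # Hypothetical sample built from symbols quoted in this paper.
    expanded = expand_iteration("ꈀꎭꀕ ꃅ")
    counts = Counter(ch for ch in expanded if is_yi_symbol(ch))
    print(symbols_needed(counts, 0.5))
```

Applied to a full frequency list, `symbols_needed(counts, 0.95)` would give the kind of figure reported in the abstract, where 402 symbols cover 95% of the sample.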