A Comparative Study in Anêm and Lusi
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
arXiv:2011.02128v1 [cs.CL] 4 Nov 2020
Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis. Sashi Novitasari (1), Andros Tjandra (1), Sakriani Sakti (1, 2), Satoshi Nakamura (1, 2). (1) Nara Institute of Science and Technology, Japan; (2) RIKEN Center for Advanced Intelligence Project AIP, Japan. fsashi.novitasari.si3, tjandra.ai6, ssakti,[email protected] Abstract: Even though over seven hundred ethnic languages are spoken in Indonesia, the technology available to support communication within indigenous communities, as well as with people outside the villages, remains limited. As a result, indigenous communities still face isolation due to cultural barriers, and languages continue to disappear. To accelerate communication, speech-to-speech translation (S2ST) technology is one approach that can overcome language barriers. However, S2ST systems require machine translation (MT), speech recognition (ASR), and synthesis (TTS) that rely heavily on supervised training and a broad set of language resources that can be difficult to collect from ethnic communities. Recently, a machine speech chain mechanism was proposed to enable ASR and TTS to assist each other in semi-supervised learning. The framework was initially implemented only for monolingual languages. In this study, we focus on developing speech recognition and synthesis for these Indonesian ethnic languages: Javanese, Sundanese, Balinese, and Bataks. We first separately train ASR and TTS of standard Indonesian in supervised training. We then develop ASR and TTS of ethnic languages by utilizing Indonesian ASR and TTS in a cross-lingual machine speech chain framework with only text or only speech data, removing the need for paired speech-text data of those ethnic languages. Keywords: Indonesian ethnic languages, cross-lingual approach, machine speech chain, speech recognition and synthesis. -
HOW UNIVERSAL IS AGENT-FIRST? EVIDENCE FROM SYMMETRICAL VOICE LANGUAGES
HOW UNIVERSAL IS AGENT-FIRST? EVIDENCE FROM SYMMETRICAL VOICE LANGUAGES. Sonja Riesberg (Universität zu Köln and The Australian National University), Kurt Malcher (Universität zu Köln), Nikolaus P. Himmelmann (Universität zu Köln). Agents have been claimed to be universally more prominent than verbal arguments with other thematic roles. Perhaps the strongest claim in this regard is that agents have a privileged role in language processing, specifically that there is a universal bias for the first unmarked argument in an utterance to be interpreted as an agent. Symmetrical voice languages such as many western Austronesian languages challenge claims about agent prominence in various ways. Inter alia, most of these languages allow for both ‘agent-first’ and ‘undergoer-first’ orders in basic transitive constructions. We argue, however, that they still provide evidence for a universal ‘agent-first’ principle. Inasmuch as these languages allow for word-order variation beyond the basic set of default patterns, such variation will always result in an agent-first order. Variation options in which undergoers are in first position are not attested. The fact that not all transitive constructions are agent-first is due to the fact that there are competing ordering biases, such as the principles dictating that word order follows constituency or the person hierarchy, as also illustrated with Austronesian data. Keywords: agent prominence, person prominence, word order, symmetrical voice, western Austronesian. 1. Introduction. Natural languages show a strong tendency to place agent arguments before other verbal arguments, as seen, inter alia, in the strong predominance of word-order types in which S is placed before O (SOV, SVO, and VSO account for more than 80% of the basic word orders attested in the languages of the world; see Dryer 2013). -
Undergoer Voice in Borneo: Penan, Punan, Kenyah and Kayan
Undergoer Voice in Borneo: Penan, Punan, Kenyah and Kayan languages. Antonia SORIENTE, University of Naples “L’Orientale”; Max Planck Institute for Evolutionary Anthropology-Jakarta. This paper describes the morphosyntactic characteristics of a few languages in Borneo, which belong to the North Borneo phylum. It is a typological sketch of how these languages express undergoer voice. It is based on data from Penan Benalui, Punan Tubu’, Punan Malinau in East Kalimantan Province, and from two Kenyah languages as well as secondary source data from Kayanic languages in East Kalimantan and in Sarawak (Malaysia). Another aim of this paper is to explore how the morphosyntactic features of North Borneo languages might shed light on the linguistic subgrouping of Borneo’s heterogeneous hunter-gatherer groups, broadly referred to as ‘Penan’ in Sarawak and ‘Punan’ in Kalimantan. 1. The North Borneo languages. The island of Borneo is home to a great variety of languages and language groups. One of the main groups is the North Borneo phylum that is part of a still larger Greater North Borneo (GNB) subgroup (Blust 2010) that includes all languages of Borneo except the Barito languages of southeast Kalimantan (and Malagasy) (see Table 1). According to Blust (2010), this subgroup includes, in addition to Bornean languages, various languages outside Borneo, namely, Malayo-Chamic, Moken, Rejang, and Sundanese. The languages of this study belong to different subgroups within the North Borneo phylum. They include the North Sarawakan subgroup with (1) languages that are spoken by hunter-gatherers (Penan Benalui (a Western Penan dialect), Punan Tubu’, and Punan Malinau), and (2) languages that are spoken by agriculturalists, that is Òma Lóngh and Lebu’ Kulit Kenyah (belonging respectively to the Upper Pujungan and Wahau Kenyah subgroups in Ethnologue 2009) as well as the Kayan languages Uma’ Pu (Baram Kayan), Busang, Hwang Tring and Long Gleaat (Kayan Bahau). -
Languages of Malaysia (Sarawak)
Ethnologue report for Malaysia (Sarawak)
Languages of Malaysia (Sarawak)
Malaysia (Sarawak). 1,294,000 (1979). Information mainly from A. A. Cense and E. M. Uhlenbeck 1958; R. Blust 1974; P. Sercombe 1997. The number of languages listed for Malaysia (Sarawak) is 47. Of those, 46 are living languages and 1 is extinct.
Living languages
Balau [blg] 5,000 (1981 Wurm and Hattori). Southwest Sarawak, southeast of Simunjan. Alternate names: Bala'u. Classification: Austronesian, Malayo-Polynesian, Malayic, Malayic-Dayak, Ibanic.
Berawan [lod] 870 (1981 Wurm and Hattori). Tutoh and Baram rivers in the north. Dialects: Batu Bla (Batu Belah), Long Pata, Long Jegan, West Berawan, Long Terawan. It may be two languages: West Berawan and Long Terawan, versus East-Central Berawan: Batu Belah, Long Teru, and Long Jegan (Blust 1974). Classification: Austronesian, Malayo-Polynesian, Northwest, North Sarawakan, Berawan-Lower Baram, Berawan.
Biatah [bth] 21,219 in Malaysia (2000 WCD). Population total all countries: 29,703. Sarawak, 1st Division, Kuching District, 10 villages. Also spoken in Indonesia (Kalimantan). Alternate names: Kuap, Quop, Bikuab, Sentah. Dialects: Siburan, Stang (Sitaang, Bisitaang), Tibia. Speakers cannot understand Bukar Sadong, Silakau, or Bidayuh from Indonesia. Lexical similarity 71% with Singgi. Classification: Austronesian, Malayo-Polynesian, Land Dayak.
Bintulu [bny] 4,200 (1981 Wurm and Hattori). Northeast coast around Sibuti, west of Niah, around Bintulu, and two enclaves west. Dialects: Could also be classified as a Baram-Tinjar Subgroup or as an isolate within the Rejang-Baram Group. Blust classifies as an isolate with North Sarawakan. Not close to other languages. -
Languages of Southeast Asia
[Map: Languages of Southeast Asia. The extracted text consists only of language-name labels from the map, e.g. Amdo Tibetan, Khams Tibetan, Hmong Njua, Wusa Nasu, Wunai Bunu, Khamti, Lisu, Gelao, Tshangla, Nocte Naga.] -
ISO 639-3 New Code Request
ISO 639-3 Registration Authority. Request for New Language Code Element in ISO 639-3. This form is to be used in conjunction with a “Request for Change to ISO 639-3 Language Code” form. Date: 2012-6-27. Name of Primary Requester: Brian Paris. E-mail address: lr-socioling at sil dot org dot pg. Names, affiliations and email addresses of additional supporters of this request: John Brownie, SIL-PNG Sociolinguistic Consultant: lr-socioling at sil dot org dot pg. Associated Change request number: 2012-141 (completed by Registration Authority). Tentative assignment of new identifier: hrc (completed by Registration Authority). PLEASE NOTE: This completed form will become part of the public record of this change request and the history of the ISO 639-3 code set. Use Shift-Enter to insert a new line in a form field (where allowed). 1. NAMES and IDENTIFICATION a) Preferred name of language for code element denotation: Niwer Mil b) Autonym (self-name) for this language: c) Common alternate names and spellings of language, and any established abbreviations: Tangga d) Reason for preferred name: This is the name the people use for themselves and their language. e) Name and approximate population of ethnic group or community who use this language (complete individual language currently in use): Niwer Mil 6300 (2000 National Census) f) Preferred three letter identifier, if available: hrc. Your suggestion will be taken into account, but the Registration Authority will determine the identifier to be proposed. The identifier is not intended to be an abbreviation for a name of the language, but to serve as a device to identify a given language uniquely. -
GOO-80-02119 392P
DOCUMENT RESUME. ED 228 863. FL 013 634. AUTHOR: Hatfield, Deborah H.; And Others. TITLE: A Survey of Materials for the Study of the Uncommonly Taught Languages: Supplement, 1976-1981. INSTITUTION: Center for Applied Linguistics, Washington, D.C. SPONS AGENCY: Department of Education, Washington, D.C. Div. of International Education. PUB DATE: Jul 82. CONTRACT: GOO-79-03415; GOO-80-02119. NOTE: 392p.; For related documents, see ED 130 537-538, ED 132 833-835, ED 132 860, and ED 166 949-950. PUB TYPE: Reference Materials, Bibliographies (131). EDRS PRICE: MF01/PC16 Plus Postage. DESCRIPTORS: Annotated Bibliographies; Dictionaries; *Instructional Materials; Postsecondary Education; *Second Language Instruction; Textbooks; *Uncommonly Taught Languages. ABSTRACT: This annotated bibliography is a supplement to the previous survey published in 1976. It covers languages and language groups in the following divisions: (1) Western Europe/Pidgins and Creoles (European-based); (2) Eastern Europe and the Soviet Union; (3) the Middle East and North Africa; (4) South Asia; (5) Eastern Asia; (6) Sub-Saharan Africa; (7) Southeast Asia and the Pacific; and (8) North, Central, and South America. The primary emphasis of the bibliography is on materials for the use of the adult learner whose native language is English. Under each language heading, the items are arranged as follows: teaching materials, readers, grammars, and dictionaries. The annotations are descriptive. Whenever possible, each entry contains standard bibliographical information, including notations about reprints and accompanying tapes/records -
TEK Transnational Ethnic Connections
Ethnic Power Relations (EPR) Dataset Family. EPR-TEK: Transnational ethnic connections. Transborder Ethnic Kin (TEK) Groups Atlas, Version 2021. Seraina Rüegger*, Vanessa Kellerhals, Sarah Däscher and Lukas Dick. Please cite as: Rüegger, Seraina, Kellerhals, Vanessa, Däscher, Sarah and Lukas Dick. 2021. Transborder Ethnic Kin (TEK) Groups Atlas. Online: https://icr.ethz.ch/data/epr/tek/. Accessed: [Date]. *Corresponding author. Email: [email protected]. Description: The Transborder Ethnic Kin (TEK) groups Atlas provides a brief description of all ethnic kin groups that live spread across two or more states. Each group comment indicates the name of the group, lists the countries where the group is, or was, politically relevant at some point in time since 1946, and describes the group's common identifier. Transborder ethnic kin groups are ethnic groups that have transnational connections across at least two states, because their settlement area is split by an international border. The TEK dataset identifies trans-border ethnic groups based on a matching of all ethnic groups included in the EPR dataset (Cederman, Wimmer, and Min 2010; Vogt et al. 2015). The EPR-TEK Dataset constitutes a research-ready version of all TEK groups covering 1946 until 2021 in table format (Vogt et al. 2015). It can be downloaded at: https://icr.ethz.ch/data/epr/tek/. References: Cederman, Lars-Erik, Andreas Wimmer, and Brian Min (2010). "Why Do Ethnic Groups Rebel? New Data and Analysis". In: World Politics 62.1, pp. 87-119. Vogt, Manuel et al. (2015). "Integrating Data on Ethnicity, Geography, and Conflict: The Ethnic Power Relations Dataset Family". -
Ancient DNA from Guam and the Peopling of the Pacific
Ancient DNA from Guam and the peopling of the Pacific. Irina Pugach (a), Alexander Hübner (a, b), Hsiao-chun Hung (c), Matthias Meyer (a), Mike T. Carson (d, 1), and Mark Stoneking (a, 1). (a) Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D04103 Leipzig, Germany; (b) Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, D04103 Leipzig, Germany; (c) Department of Archaeology and Natural History, Australian National University, Canberra, ACT 2601, Australia; and (d) Micronesian Area Research Center, University of Guam, 96923 Mangilao, Guam. This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected in 2020. Contributed by Mark Stoneking, November 12, 2020 (sent for review October 26, 2020; reviewed by Geoffrey K. Chambers, Elizabeth A. Matisoo-Smith, and Glenn R. Summerhayes).
Humans reached the Mariana Islands in the western Pacific by ∼3,500 y ago, contemporaneous with or even earlier than the initial peopling of Polynesia. They crossed more than 2,000 km of open ocean to get there, whereas voyages of similar length did not occur anywhere else until more than 2,000 y later. Yet, the settlement of Polynesia has received far more attention than the settlement of the Marianas. There is uncertainty over both the origin of the first colonizers of the Marianas (with different lines of evidence suggesting variously the Philippines, Indonesia, New Guinea, or the Bismarck Archipelago) as well as what, if any, rela-
... by Polynesian ancestors until they ventured into eastern Polynesia within the past 1,000 y (2, 6). Where these intrepid voyagers originated, and how they relate to Polynesians, are open questions. Mariana Islanders are unusual in many respects when compared with other Micronesians and Polynesians. Chamorro, the indigenous language of Guam, is classified as a Western Malayo-Polynesian language within the Austronesian language family, along with the languages of western Indonesia (the islands west of Wallace’s Line) (Fig. -
Library of Congress Subject Headings for the Pacific Islands
Library of Congress Subject Headings for the Pacific Islands. First compiled by Nancy Sack and Gwen Sinclair. Updated by Nancy Sack. Current to January 2020.
Background: An inquiry from a librarian in Micronesia about how to identify subject headings for the Pacific islands highlighted the need for a list of authorized Library of Congress subject headings that are uniquely relevant to the Pacific islands or that are important to the social, economic, or cultural life of the islands. We reasoned that compiling all of the existing subject headings would reveal the extent to which additional subjects may need to be established or updated and we wish to encourage librarians in the Pacific area to contribute new and changed subject headings through the Hawai‘i/Pacific subject headings funnel, coordinated at the University of Hawai‘i at Mānoa. We captured headings developed for the Pacific, including those for ethnic groups, World War II battles, languages, literatures, place names, traditional religions, etc. Headings for subjects important to the politics, economy, social life, and culture of the Pacific region, such as agricultural products and cultural sites, were also included.
Scope: Topics related to Australia, New Zealand, and Hawai‘i would predominate in our compilation had they been included. Accordingly, we focused on the Pacific islands in Melanesia, Micronesia, and Polynesia (excluding Hawai‘i and New Zealand). Island groups in other parts of the Pacific were also excluded. References to broader or related terms having no connection with the Pacific were not included.
Overview: This compilation is modeled on similar publications such as Music Subject Headings: Compiled from Library of Congress Subject Headings and Library of Congress Subject Headings in Jewish Studies. -
Nomenclature Abbreviations
Abbreviations
*: As a prefix, indicates a proto language word
/?/: glottal stop
2′: compound for 3 = 2 + 1 or rarely 1 + 1 + 1 but numeral for 4
2″: distinct numeral for 3 but 4 is a compound, usually 2 + 2, rarely 5 - 1 or 2 + 1 + 1
AN: Austronesian languages
BC or BCE: Before Christ, that is before the Current Era taken as before the period of Christ
BP: Before the present
CE or AD: In the current era, that is after the year of the Lord (Domino/Dominum) Christ
CSQ, MQ: Counting System Questionnaire; Measurement Questionnaire
d.: dialect
IMP: Indigenous Mathematics Project
Manus type: Lean used this to refer to counting systems that used subtraction from 10 such as 7=10-3, 8=10-2, 9=10-1, often with the meaning e.g. for 7 as 3 needed to complete the group
MC: Micronesian
Motu type: Lean used this to refer to counting systems that used pairs such as 6=2x3, 7=2x3+1, 8=2x4, 9=2x4+1
NAN: Non-Austronesian (also called Papuan) languages
NCQ, CQN: Noun, classifier, quantifier; classifier, quantifier, noun
NQC, QCN: Noun, quantifier, classifier; quantifier, classifier, noun
NTM: New Tribes Mission, PNG
PAN: Proto Austronesian
PN: Polynesian
PNG: Papua New Guinea
POC: Proto Oceanic
QC, CQ: Order of quantifier-classifier; classifier-quantifier respectively
SHWNG: South Halmahera West New Guinea (AN Non-Oceanic language of the Central-Eastern Malayo-Polynesian, a subgroup of Proto-Malayo-Polynesian) after Tryon (2006)
SIL: Summer Institute of Linguistics
SOV: Order of words in a sentence: Subject Object Verb
SVO: Order of words in a sentence: Subject Verb Object
TNG: Trans New Guinea Phylum
Nomenclature
The Australian system of numbering is used. -
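The "Manus type" and "Motu type" entries above describe numerals by their arithmetic make-up. As a rough illustration only, the following Python sketch spells out those decompositions for the values the excerpt lists (7 to 9 for Manus type, 6 to 9 for Motu type). The function names and the string notation are hypothetical, chosen here for illustration; they are not taken from the source, and values outside the listed ranges are simply returned unchanged.

```python
def manus_type(n: int) -> str:
    """Hypothetical helper: express 7, 8, 9 as subtraction from 10 (e.g. 7 = 10-3)."""
    if n in (7, 8, 9):
        return f"10-{10 - n}"
    # The excerpt gives no pattern for other values, so leave them as plain numerals.
    return str(n)


def motu_type(n: int) -> str:
    """Hypothetical helper: express 6..9 as pairs (e.g. 6 = 2x3, 7 = 2x3+1)."""
    if n in (6, 7, 8, 9):
        half, remainder = divmod(n, 2)
        return f"2x{half}" + ("+1" if remainder else "")
    # The excerpt gives no pattern for other values, so leave them as plain numerals.
    return str(n)


if __name__ == "__main__":
    for n in range(6, 10):
        print(n, "Manus:", manus_type(n), "Motu:", motu_type(n))
```

The two helpers only restate the arithmetic in the abbreviation entries; they make no claim about how any particular language actually forms these numerals.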
KUOT a Non-Austronesian Language of New Ireland, Papua New Guinea
Topics in the grammar of KUOT a non-Austronesian language of New Ireland, Papua New Guinea Eva Lindström PhD dissertation Department of Linguistics, Stockholm University, 106 91 Stockholm Doctoral dissertation 2002 Department of Linguistics Stockholms universitet 106 91 Stockholm Sweden © 2002 Eva Lindström ISBN 91-7265-459-7 Cover design: Eva Lindström & Anne Gailit Printed by Akademitryck AB, Edsbruk 2002 Abstract This thesis describes certain areas in the grammar of the little-known Kuot language, spoken by some 1,500 people in New Ireland Province in Papua New Guinea. Kuot is an isolate, and is the only non-Austronesian (Papuan) language of that province. The analyses presented here are based on original data from 18 months of linguistic fieldwork. The first chapter provides an overview of Kuot grammar, and gives details of earlier mentions of the language, and of data collection and the fieldwork situa- tion. The second chapter presents information about the prehistory and history of the area, the social system, kinship system and culture of Kuot speakers, as well as dialectal variation and prognosis of survival of the language. Chapter three treats Kuot phonology, with particular emphasis on the factors that govern allophonic variation, and on the expression of word stress and the functions of intonation. Word classes and the criteria used to define them are presented in Chapter four, which also contains a discussion of types of morphemes in Kuot. The last chapter describes in some detail the class of nouns in Kuot, their declensions, non-singular formation, and the properties of grammatical gender. Appendices give the full set of person-marking forms in Kuot, a transcription of a recorded text with interlinear glossing and translation, the Swadesh 100-word list for Kuot, and diagrams of kin relations and terminology.