Map by Steve Huffman; Data from World Language Mapping System 16


[Figure: world language map. The labels are country and territory names (for example Greenland, Russia, China, India, Indonesia) and the names of languages placed at the locations where they are spoken (for example Greenlandic Inuktitut, North Saami, Nenets, Evenki, Kazakh, Northern Uzbek, Lambadi, Maldivian).]