On the Place of Parachi and Ormuri Among the Iranian Languages According to the Data of Annotated Swadesh Lists1
Total Page:16
File Type:pdf, Size:1020Kb

Load more
Recommended publications
-
(And Potential) Language and Linguistic Resources on South Asian Languages
CoRSAL Symposium, University of North Texas, November 17, 2017 Existing (and Potential) Language and Linguistic Resources on South Asian Languages Elena Bashir, The University of Chicago Resources or published lists outside of South Asia Digital Dictionaries of South Asia in Digital South Asia Library (dsal), at the University of Chicago. http://dsal.uchicago.edu/dictionaries/ . Some, mostly older, not under copyright dictionaries. No corpora. Digital Media Archive at University of Chicago https://dma.uchicago.edu/about/about-digital-media-archive Hock & Bashir (eds.) 2016 appendix. Lists 9 electronic corpora, 6 of which are on Sanskrit. The 3 non-Sanskrit entries are: (1) the EMILLE corpus, (2) the Nepali national corpus, and (3) the LDC-IL — Linguistic Data Consortium for Indian Languages Focus on Pakistan Urdu Most work has been done on Urdu, prioritized at government institutions like the Center for Language Engineering at the University of Engineering and Technology in Lahore (CLE). Text corpora: http://cle.org.pk/clestore/index.htm (largest is a 1 million word Urdu corpus from the Urdu Digest. Work on Essential Urdu Linguistic Resources: http://www.cle.org.pk/eulr/ Tagset for Urdu corpus: http://cle.org.pk/Publication/papers/2014/The%20CLE%20Urdu%20POS%20Tagset.pdf Urdu OCR: http://cle.org.pk/clestore/urduocr.htm Sindhi Sindhi is the medium of education in some schools in Sindh Has more institutional backing and consequent research than other languages, especially Panjabi. Sindhi-English dictionary developed jointly by Jennifer Cole at the University of Illinois Urbana- Champaign and Sarmad Hussain at CLE (http://182.180.102.251:8081/sed1/homepage.aspx). -
Tajiki Some Useful Phrases in Tajiki Five Reasons Why You Should Ассалому Алейкум
TAJIKI SOME USEFUL PHRASES IN TAJIKI FIVE REASONS WHY YOU SHOULD ассалому алейкум. LEARN MORE ABOUT TAJIKIS AND [ˌasːaˈlɔmu aˈlɛɪkum] /asah-lomu ah-lay-koom./ THEIR LANGUAGE Hello! 1. Tajiki is spoken as a first or second language by over 8 million people worldwide, but the Hоми шумо? highest population of speakers is located in [ˈnɔmi ʃuˈmɔ] Tajikistan, with significant populations in other /No-mee shoo-moh?/ Central Eurasian countries such as Afghanistan, What is your name? Uzbekistan, and Russia. Номи ман… 2. Tajiki is a member of the Western Iranian branch [ˈnɔmi man …] of the Indo-Iranian languages, and shares many structural similarities to other Persian languages /No-mee man.../ such as Dari and Farsi. My name is… 3. Few people in America can speak or use the Tajiki Шумо чи xeл? Нағз, рахмат. version of Persian. Given the different script and [ʃuˈmɔ ʧi χɛl naʁz ɾaχˈmat] dialectal differences, simply knowing Farsi is not /shoo-moh-chee-khel? Naghz, rah-mat./ enough to fully understand Tajiki. Those who How are you? I’m fine, thank you. study Tajiki can find careers in a variety of fields including translation and interpreting, consulting, Aз вохуриамон шод ҳастам. and foreign service and intelligence. NGOs [az vɔχuˈɾiamɔn ʃɔd χaˈstam] and other enterprises that deal with Tajikistan /Az vo-khu-ri-amon shod has-tam./ desperately need specialists who speak Tajiki. Nice to meet you. 4. The Pamir Mountains which have an elevation Лутфан. / Рахмат. of 23,000 feet are known locally as the “Roof of [lutˈfan] / [ɾaχˈmat] the World”. Mountains make up more than 90 /Loot-fan./ /Rah-mat./ percent of Tajikistan’s territory. -
Linguistics Development Team
Development Team Principal Investigator: Prof. Pramod Pandey Centre for Linguistics / SLL&CS Jawaharlal Nehru University, New Delhi Email: [email protected] Paper Coordinator: Prof. K. S. Nagaraja Department of Linguistics, Deccan College Post-Graduate Research Institute, Pune- 411006, [email protected] Content Writer: Prof. K. S. Nagaraja Prof H. S. Ananthanarayana Content Reviewer: Retd Prof, Department of Linguistics Osmania University, Hyderabad 500007 Paper : Historical and Comparative Linguistics Linguistics Module : Indo-Aryan Language Family Description of Module Subject Name Linguistics Paper Name Historical and Comparative Linguistics Module Title Indo-Aryan Language Family Module ID Lings_P7_M1 Quadrant 1 E-Text Paper : Historical and Comparative Linguistics Linguistics Module : Indo-Aryan Language Family INDO-ARYAN LANGUAGE FAMILY The Indo-Aryan migration theory proposes that the Indo-Aryans migrated from the Central Asian steppes into South Asia during the early part of the 2nd millennium BCE, bringing with them the Indo-Aryan languages. Migration by an Indo-European people was first hypothesized in the late 18th century, following the discovery of the Indo-European language family, when similarities between Western and Indian languages had been noted. Given these similarities, a single source or origin was proposed, which was diffused by migrations from some original homeland. This linguistic argument is supported by archaeological and anthropological research. Genetic research reveals that those migrations form part of a complex genetical puzzle on the origin and spread of the various components of the Indian population. Literary research reveals similarities between various, geographically distinct, Indo-Aryan historical cultures. The Indo-Aryan migrations started in approximately 1800 BCE, after the invention of the war chariot, and also brought Indo-Aryan languages into the Levant and possibly Inner Asia. -
Pashto, Waneci, Ormuri. Sociolinguistic Survey of Northern
SOCIOLINGUISTIC SURVEY OF NORTHERN PAKISTAN VOLUME 4 PASHTO, WANECI, ORMURI Sociolinguistic Survey of Northern Pakistan Volume 1 Languages of Kohistan Volume 2 Languages of Northern Areas Volume 3 Hindko and Gujari Volume 4 Pashto, Waneci, Ormuri Volume 5 Languages of Chitral Series Editor Clare F. O’Leary, Ph.D. Sociolinguistic Survey of Northern Pakistan Volume 4 Pashto Waneci Ormuri Daniel G. Hallberg National Institute of Summer Institute Pakistani Studies of Quaid-i-Azam University Linguistics Copyright © 1992 NIPS and SIL Published by National Institute of Pakistan Studies, Quaid-i-Azam University, Islamabad, Pakistan and Summer Institute of Linguistics, West Eurasia Office Horsleys Green, High Wycombe, BUCKS HP14 3XL United Kingdom First published 1992 Reprinted 2004 ISBN 969-8023-14-3 Price, this volume: Rs.300/- Price, 5-volume set: Rs.1500/- To obtain copies of these volumes within Pakistan, contact: National Institute of Pakistan Studies Quaid-i-Azam University, Islamabad, Pakistan Phone: 92-51-2230791 Fax: 92-51-2230960 To obtain copies of these volumes outside of Pakistan, contact: International Academic Bookstore 7500 West Camp Wisdom Road Dallas, TX 75236, USA Phone: 1-972-708-7404 Fax: 1-972-708-7433 Internet: http://www.sil.org Email: [email protected] REFORMATTING FOR REPRINT BY R. CANDLIN. CONTENTS Preface.............................................................................................................vii Maps................................................................................................................ -
ARTICLE Development of a Gold-Standard Pashto Dataset and a Segmentation App Yan Han and Marek Rychlik
ARTICLE Development of a Gold-standard Pashto Dataset and a Segmentation App Yan Han and Marek Rychlik ABSTRACT The article aims to introduce a gold-standard Pashto dataset and a segmentation app. The Pashto dataset consists of 300 line images and corresponding Pashto text from three selected books. A line image is simply an image consisting of one text line from a scanned page. To our knowledge, this is one of the first open access datasets which directly maps line images to their corresponding text in the Pashto language. We also introduce the development of a segmentation app using textbox expanding algorithms, a different approach to OCR segmentation. The authors discuss the steps to build a Pashto dataset and develop our unique approach to segmentation. The article starts with the nature of the Pashto alphabet and its unique diacritics which require special considerations for segmentation. Needs for datasets and a few available Pashto datasets are reviewed. Criteria of selection of data sources are discussed and three books were selected by our language specialist from the Afghan Digital Repository. The authors review previous segmentation methods and introduce a new approach to segmentation for Pashto content. The segmentation app and results are discussed to show readers how to adjust variables for different books. Our unique segmentation approach uses an expanding textbox method which performs very well given the nature of the Pashto scripts. The app can also be used for Persian and other languages using the Arabic writing system. The dataset can be used for OCR training, OCR testing, and machine learning applications related to content in Pashto. -
Judeo-Persian Literature Chapter Author(S): Vera Basch Moreen Book
Princeton University Press Chapter Title: Judeo-Persian Literature Chapter Author(s): Vera Basch Moreen Book Title: A History of Jewish-Muslim Relations Book Subtitle: From the Origins to the Present Day Book Editor(s): Abdelwahab Meddeb, Benjamin Stora Published by: Princeton University Press. (2013) Stable URL: https://www.jstor.org/stable/j.ctt3fgz64.80 JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at https://about.jstor.org/terms Princeton University Press is collaborating with JSTOR to digitize, preserve and extend access to A History of Jewish-Muslim Relations This content downloaded from 89.176.194.108 on Sun, 12 Apr 2020 13:51:06 UTC All use subject to https://about.jstor.org/terms Judeo- Persian Literature Vera Basch Moreen Jews have lived in Iran for almost three millennia and became profoundly acculturated to many aspects of Iranian life. This phenomenon is particu- larly manifest in the literary sphere, defi ned here broadly to include belles lettres, as well as nonbelletristic (i.e., historical, philosophical, and polemi- cal) writings. Although Iranian Jews spoke Vera Basch Moreen many local dialects and some peculiar Jewish dialects, such as the hybrid lo- Vera Basch Moreen is an independent Torah[i] (Heb. + Pers. -
The Iranian Reflexes of Proto-Iranian *Ns
The Iranian Reflexes of Proto-Iranian *ns Martin Joachim Kümmel, Friedrich-Schiller-Universität Jena [email protected] Abstract1 The obvious cognates of Avestan tąθra- ‘darkness’ in the other Iranian languages generally show no trace of the consonant θ; they all look like reflexes of *tār°. Instead of assuming a different word formation for the non-Avestan words, I propose a solution uniting the obviously corresponding words under a common preform, starting from Proto-Iranian *taNsra-: Before a sonorant *ns was preserved as ns in Avestan (feed- ing the change of tautosyllabic *sr > *θr) but changed to *nh elsewhere, followed by *anhr > *ã(h)r. A parallel case of apparent variation can be explained similarly, namely Avestan pąsnu- ‘ashes’ and its cog- nates. Finally, the general development of Proto-Indo-Iranian *ns in Iranian and its relative chronology is discussed, including word-final *ns, where it is argued that the Avestan accusative plural of a-stems can be derived from *-āns. Keywords: Proto-Iranian, nasals, sibilant, sound change, variation, chronology 1. Introduction The aim of this paper is to discuss some details of the development of the Proto- Iranian (PIr) cluster *ns in the Iranian languages. Before we proceed to do so, it will be useful to recall the most important facts concerning the history of dental-alveolar sibilants in Iranian. 1) PIr had inherited a sibilant *s identical to Old Indo-Aryan (Sanskrit/Vedic) s from Proto-Indo-Iranian (PIIr) *s. This sibilant changed to Common Iranian (CIr) h in most environments, while its voiced allophone z remained stable all the time. -
Language Documentation and Description
Language Documentation and Description ISSN 1740-6234 ___________________________________________ This article appears in: Language Documentation and Description, vol 17. Editor: Peter K. Austin Countering the challenges of globalization faced by endangered languages of North Pakistan ZUBAIR TORWALI Cite this article: Torwali, Zubair. 2020. Countering the challenges of globalization faced by endangered languages of North Pakistan. In Peter K. Austin (ed.) Language Documentation and Description 17, 44- 65. London: EL Publishing. Link to this article: http://www.elpublishing.org/PID/181 This electronic version first published: July 2020 __________________________________________________ This article is published under a Creative Commons License CC-BY-NC (Attribution-NonCommercial). The licence permits users to use, reproduce, disseminate or display the article provided that the author is attributed as the original creator and that the reuse is restricted to non-commercial purposes i.e. research or educational use. See http://creativecommons.org/licenses/by-nc/4.0/ ______________________________________________________ EL Publishing For more EL Publishing articles and services: Website: http://www.elpublishing.org Submissions: http://www.elpublishing.org/submissions Countering the challenges of globalization faced by endangered languages of North Pakistan Zubair Torwali Independent Researcher Summary Indigenous communities living in the mountainous terrain and valleys of the region of Gilgit-Baltistan and upper Khyber Pakhtunkhwa, northern -
The Socio Linguistic Situation and Language Policy of the Autonomous Region of Mountainous Badakhshan: the Case of the Tajik Language*
THE SOCIO LINGUISTIC SITUATION AND LANGUAGE POLICY OF THE AUTONOMOUS REGION OF MOUNTAINOUS BADAKHSHAN: THE CASE OF THE TAJIK LANGUAGE* Leila Dodykhudoeva Institute Of Linguistics, Russian Academy Of Sciences The paper deals with the problem of closely related languages of the Eastern and Western Iranian origin that coexist in a close neighbourhood in a rather compact area of one region of Republic Tajikistan. These are a group of "minor" Pamir languages and state language of Tajikistan - Tajik. The population of the Autonomous Region of Mountainous Badakhshan speaks different Pamir languages. They are: Shughni, Rushani, Khufi, Bartangi, Roshorvi, Sariqoli; Yazghulami; Wakhi; Ishkashimi. These languages have no script and written tradition and are used only as spoken languages in the region. The status of these languages and many other local linguemes is still discussed in Iranology. Nearly all Pamir languages to a certain extent can be called "endangered". Some of these languages, like Yazghulami, Roshorvi, Ishkashimi are included into "The Red Book " (UNESCO 1995) as "endangered". Some of them are extinct. Information on other idioms up to now is not available. These languages live in close cooperation and interaction with the state language of Tajikistan - Tajik. Almost all population of Badakhshan is multilingual or bilingual. The second language is official language of the state - Tajik. This language is used in Badakhshan as the language of education, press, media, and culture. This is the reason why this paper is focused on the status of Tajik language in Republic Tajikistan and particularly in Badakhshan. The Tajik literary language (its oral and written forms) has a long history and rich written traditions. -
M. Witzel (2003) Sintashta, BMAC and the Indo-Iranians. a Query. [Excerpt
M. Witzel (2003) Sintashta, BMAC and the Indo-Iranians. A query. [excerpt from: Linguistic Evidence for Cultural Exchange in Prehistoric Western Central Asia] (to appear in : Sino-Platonic Papers 129) Transhumance, Trickling in, Immigration of Steppe Peoples There is no need to underline that the establishment of a BMAC substrate belt has grave implications for the theory of the immigration of speakers of Indo-Iranian languages into Greater Iran and then into the Panjab. By and large, the body of words taken over into the Indo-Iranian languages in the BMAC area, necessarily by bilingualism, closes the linguistic gap between the Urals and the languages of Greater Iran and India. Uralic and Yeneseian were situated, as many IIr. loan words indicate, to the north of the steppe/taiga boundary of the (Proto-)IIr. speaking territories (§2.1.1). The individual IIr. languages are firmly attested in Greater Iran (Avestan, O.Persian, Median) as well as in the northwestern Indian subcontinent (Rgvedic, Middle Vedic). These materials, mentioned above (§2.1.) and some more materials relating to religion (Witzel forthc. b) indicate an early habitat of Proto- IIr. in the steppes south of the Russian/Siberian taiga belt. The most obvious linguistic proofs of this location are the FU words corresponding to IIr. Arya "self-designation of the IIr. tribes": Pre-Saami *orja > oarji "southwest" (Koivulehto 2001: 248), ārjel "Southerner", and Finnish orja, Votyak var, Syry. ver "slave" (Rédei 1986: 54). In other words, the IIr. speaking area may have included the S. Ural "country of towns" (Petrovka, Sintashta, Arkhaim) dated at c. -
A Grammatical Description of Dameli
Emil Perder A Grammatical Description of Dameli Emil Perder A Grammatical Description of Dameli A Grammatical ISBN 978-91-7447-770-2 Department of Linguistics Doctoral Thesis in Linguistics at Stockholm University, Sweden 2013 A Grammatical Description of Dameli Emil Perder A Grammatical Description of Dameli Emil Perder ©Emil Perder, Stockholm 2013 ISBN 978-91-7447-770-2 Printed in Sweden by Universitetsservice AB, Stockholm 2013 Distributor: Department of Linguistics, Stockholm University ماں َتارف تہ ِاک توحفہ، دامیاں A gift from me, Dameli people Abstract This dissertation aims to provide a grammatical description of Dameli (ISO-639-3: dml), an Indo-Aryan language spoken by approximately 5 000 people in the Domel Valley in Chitral in the Khyber Pakhtunkhwa Province in the North-West of Pakistan. Dameli is a left-branching SOV language with considerable morphological complexity, particularly in the verb, and a complicated system of argument marking. The phonology is relatively rich, with 31 consonant and 16 vowel phonemes. This is the first extensive study of this language. The analysis presented here is based on original data collected primarily between 2003-2008 in cooperation with speakers of the language in Peshawar and Chitral, including the Domel Valley. The core of the data consists of recorded texts and word lists, but questionnaires and paradigms of word forms have also been used. The main emphasis is on describing the features of the language as they appear in texts and other material, rather than on conforming them to any theory, but the analysis is informed by functional analysis and linguistic typology, hypotheses on diachronical developments and comparisons with neighbouring and related languages. -
Internal Classification of Indo-European Languages: Survey
Václav Blažek (Masaryk University of Brno, Czech Republic) On the internal classification of Indo-European languages: Survey The purpose of the present study is to confront most representative models of the internal classification of Indo-European languages and their daughter branches. 0. Indo-European 0.1. In the 19th century the tree-diagram of A. Schleicher (1860) was very popular: Germanic Lithuanian Slavo-Lithuaian Slavic Celtic Indo-European Italo-Celtic Italic Graeco-Italo- -Celtic Albanian Aryo-Graeco- Greek Italo-Celtic Iranian Aryan Indo-Aryan After the discovery of the Indo-European affiliation of the Tocharian A & B languages and the languages of ancient Asia Minor, it is necessary to take them in account. The models of the recent time accept the Anatolian vs. non-Anatolian (‘Indo-European’ in the narrower sense) dichotomy, which was first formulated by E. Sturtevant (1942). Naturally, it is difficult to include the relic languages into the model of any classification, if they are known only from several inscriptions, glosses or even only from proper names. That is why there are so big differences in classification between these scantily recorded languages. For this reason some scholars omit them at all. 0.2. Gamkrelidze & Ivanov (1984, 415) developed the traditional ideas: Greek Armenian Indo- Iranian Balto- -Slavic Germanic Italic Celtic Tocharian Anatolian 0.3. Vladimir Georgiev (1981, 363) included in his Indo-European classification some of the relic languages, plus the languages with a doubtful IE affiliation at all: Tocharian Northern Balto-Slavic Germanic Celtic Ligurian Italic & Venetic Western Illyrian Messapic Siculian Greek & Macedonian Indo-European Central Phrygian Armenian Daco-Mysian & Albanian Eastern Indo-Iranian Thracian Southern = Aegean Pelasgian Palaic Southeast = Hittite; Lydian; Etruscan-Rhaetic; Elymian = Anatolian Luwian; Lycian; Carian; Eteocretan 0.4.