From Oral to Written: a Text-Linguistic Study of Wakhi Narratives

ACTA UNIVERSITATIS UPSALIENSIS
Studia Iranica Upsaliensia 35

From Oral to Written
A Text-linguistic Study of Wakhi Narratives

Jaroslava Obrtelová

A PDF of this book as well as a PDF of Appendix E is available online at: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-381858

Dissertation presented at Uppsala University to be publicly examined in Ihresalen (21-0011), Engelska Parken, Thunbergsvägen 3, Uppsala, Friday, 14 June 2019 at 10:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Docent Henrik Liljegren (Stockholm University).

Abstract

Obrtelová, J. 2019. From Oral to Written. A Text-linguistic Study of Wakhi Narratives. Studia Iranica Upsaliensia 35. 333 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-0664-3.

Wakhi is one of the endangered “Pamir” languages belonging to the East Iranian group of Indo-European. A total of around 72,000 Wakhi speakers live in the border areas of four countries: Tajikistan, Afghanistan, Pakistan and China. This study focuses on the Wakhi spoken in Tajikistan, which for a long time was unwritten. Recently, however, native speakers have made significant efforts to preserve and develop their mother tongue in written form. This study examines textual differences between oral and written narratives in Wakhi from a discourse-pragmatic perspective. It addresses issues relating to the transition from an unwritten to an early-stage written language. Rather than finding a clear boundary between the two, it proposes a continuum with spontaneous narratives at one end and written narratives at the other, and identifies both syntactic and textual differences between them. Oral narratives prepared in advance differ from the spontaneously told ones and share some characteristics with written narratives. The first and most extensive part of the study compares interclausal coordinate and subordinate relations in the oral and written narratives. Coordination in written narratives is usually achieved using unmarked forms and with fewer types of coordinating devices than in oral narratives. Unmarked patterns of complementation are also more frequent in written than in oral narratives. However, temporal subordinate clauses in written narratives favour the post-nuclear form that is described as marked, whereas oral narratives prefer the unmarked pre-nuclear order. The second part analyses significant differences between the two forms of expression from the perspective of their function in the overall structure of narratives. One significant difference concerns story development techniques. Development in oral narratives is marked primarily through a “then”-type conjunction, but in written narratives is signalled through certain forms of references to agents. An annotated corpus of 29 written and 13 oral narratives accompanies the publication (available in electronic format).

Keywords: Wakhi, Iranian languages, text-linguistics, oral narratives, written narratives, discourse analysis

Jaroslava Obrtelová, Department of Linguistics and Philology, Box 635, Uppsala University, SE-75126 Uppsala, Sweden.
© Jaroslava Obrtelová 2019
ISSN 1100-326X
ISBN 978-91-513-0664-3
urn:nbn:se:uu:diva-381858 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-381858)

To my Jan

Contents

List of Tables
List of Figures
Abbreviations
Acknowledgements
Introduction
Research topic
Wakhi in context
An overview of the history of the Pamir area
Classification and genetic relations
Pamir languages in Tajikistan
Geographic and social setting
Sociolinguistic situation of the Pamir languages
History and the current state of language development
Wakhi in Tajikistan
Previous studies on Wakhi
Wakhi language description – selected topics
Verbs
Non-past tense
Past tense
Perfect
Derived finite forms
Non-finite verb forms
Sentence, clause and clause combining in Wakhi
Independent clauses in Wakhi
Light verb constructions
Serial verb constructions
Modal constructions
Multiple repetition of a verb
Subordinate constructions
Complement clauses
Adverbial clauses
Relative clauses
Subordinate clauses with ambiguous interpretations
Coordinate constructions
Corpus of Wakhi narratives
Wakhi narratives
Fieldwork and data gathering
Transcription, translation and glossing
Segmentation of the text
Analysed Wakhi narratives
Theoretical background and methodology
Oral versus written language – overview of the literature
Approaches to text-linguistic discourse analysis
Methodological approach
Interclausal logical relations
Method
Analysis & discussion
Syntactic structure
Coordination
Conjoining
The development marker
Tail-head linkage
Additives
Adversative coordination
Subordination
Complementation and reported speech
Relative clauses
Adverbial clauses
Summary
Narrative structure
Foreground and background
Deictic shift
Deictic shift – ศ༬໦ཟ༝
Deictic shift – ศ༬໦ཟ༝
Deictic shift – ศ༬໦ཟ༝
Deictic shift – summary
Story development techniques
Summary
Conclusion and final remarks
Appendix A: EGIDS
Appendix B: Wakhi Cyrillic and phonemic alphabet
Appendix C: List of Wakhi narratives
Appendix D: Examples of analysed texts
Analysed oral text O_Z Frog
Analysed written text W_Z Frog
References
Appendix E: WakhiTextCorpus, online: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-381858

List of Tables

Personal pronouns and corresponding subject-marking suffixes
Personal pronouns and corresponding subject-marking enclitics
List of analysed Wakhi written narratives
List of analysed Wakhi oral narratives
Distribution of narratives per genre categories
Distribution of narratives per form
Sample of an analysed text
Group: Spontaneous oral stories and their edited written versions
Group: Prepared oral stories and their edited written versions
Group: Spontaneous oral stories and independent written stories
Group: Average number of clauses per sentence
Group: Average number of clauses per sentence
Group: Average number of clauses per sentence
Group: Conjoining strategies in relation to total number of conjoined pairs of independent clauses in the oral and written narratives
Group: Conjoining strategies in relation to total number of conjoined pairs of independent clauses intrasententially
Group: Conjoining strategies in direct speech in relation to total number of independent clauses intrasententially
Group: Conjoining patterns in the oral and written narratives
Group: Conjoining strategies in relation to total number of conjoined pairs of independent clauses in the oral and written narratives
Group: Conjoining strategies in relation to total number of conjoined pairs of independent clauses intrasententially
Group: Conjoining patterns in the oral and written narratives
Group: Conjoining strategies in relation to total number of conjoined pairs of independent clauses in the oral and written narratives
Group: Conjoining strategies in relation to total number of conjoined pairs of independent clauses intrasententially
Group: Conjoining strategies in direct speech in relation to total number of conjoined independent clauses intrasententially
Group: Conjoining patterns in the oral and written narratives
Groups: Conjoining by juxtaposition
Groups: Intrasentential preferences
Group: The DM yan in the oral and written narratives
Group: The DM yan in the foreground and background of the oral and written narratives
Group: Patterns in which the DM yan is found
Group: The DM yan in the oral and written narratives
Group: The DM yan in the oral and written narratives
Group: Patterns in which the DM yan is found
Groups: Occurrences of the DM yan
Group: Tail-head linkage in the oral and written narratives
Group: Tail-head linkage in the oral and written narratives
Groups: Occurrences of tail-head linkage
Group: Total occurrences of additive particles
Group: Occurrences and percentages of the pragmatic functions of the additive particle bϷ in relation to its total occurrences
Group: Total occurrences of additive particles
Group: Total occurrences of additive particles
Group: Occurrences and percentages of the pragmatic functions of the additive particle bϷ in relation to its total occurrences
Groups: Occurrences of additive particles
Groups: Occurrences and percentages of the pragmatic functions of the additive particle bϷ in relation to its total occurrences
Group: Adversative connectives in the oral and written narratives
Group: Adversative connectives in the oral and written narratives
Group: Adversative