New Language Resources for the Pashto Language
Total Pages: 16
File Type:pdf, Size:1020Kb
Recommended publications
-
Grammatical Gender in Hindukush Languages
Grammatical gender in Hindukush languages: An areal-typological study. Julia Lautin. Department of Linguistics. Independent Project for the Degree of Bachelor, 15 HEC, General linguistics. Bachelor's programme in Linguistics, Spring term 2016. Supervisor: Henrik Liljegren. Examiner: Bernhard Wälchli. Expert reviewer: Emil Perder. Project affiliation: “Language contact and relatedness in the Hindukush Region,” a research project supported by the Swedish Research Council (421-2014-631). Abstract: In the mountainous area of the Greater Hindukush in northern Pakistan, north-western Afghanistan and Kashmir, some fifty languages from six different genera are spoken. The languages are at the same time innovative and archaic, and are of great interest for areal-typological research. This study investigates grammatical gender in a 12-language sample in the area from an areal-typological perspective. The results show some intriguing features, including unexpected loss of gender, languages that have developed a gender system based on the semantic category of animacy, and languages where this animacy distinction is present parallel to the inherited gender system based on a masculine/feminine distinction found in many Indo-Aryan languages. Keywords: Grammatical gender, areal-typology, Hindukush, animacy, nominal categories. Grammatiskt genus i Hindukush-språk: En areal-typologisk studie. Summary: This study examines grammatical gender in a number of languages spoken in a mountainous area located in northern Pakistan, north-western Afghanistan and Kashmir. In the area, here called the Greater Hindukush, some 50 different languages from six different language families are spoken. The large number of languages, together with the inaccessible terrain, has meant that the languages are archaic in some respects and innovative in others, which makes it an interesting area for areal-typological research. -
(And Potential) Language and Linguistic Resources on South Asian Languages
CoRSAL Symposium, University of North Texas, November 17, 2017. Existing (and Potential) Language and Linguistic Resources on South Asian Languages. Elena Bashir, The University of Chicago. Resources or published lists outside of South Asia: Digital Dictionaries of South Asia in Digital South Asia Library (dsal), at the University of Chicago, http://dsal.uchicago.edu/dictionaries/ . Some dictionaries, mostly older and not under copyright. No corpora. Digital Media Archive at University of Chicago: https://dma.uchicago.edu/about/about-digital-media-archive Hock & Bashir (eds.) 2016 appendix: lists 9 electronic corpora, 6 of which are on Sanskrit. The 3 non-Sanskrit entries are: (1) the EMILLE corpus, (2) the Nepali national corpus, and (3) the LDC-IL (Linguistic Data Consortium for Indian Languages). Focus on Pakistan. Urdu: Most work has been done on Urdu, prioritized at government institutions like the Center for Language Engineering at the University of Engineering and Technology in Lahore (CLE). Text corpora: http://cle.org.pk/clestore/index.htm (the largest is a 1 million word Urdu corpus from the Urdu Digest). Work on Essential Urdu Linguistic Resources: http://www.cle.org.pk/eulr/ Tagset for Urdu corpus: http://cle.org.pk/Publication/papers/2014/The%20CLE%20Urdu%20POS%20Tagset.pdf Urdu OCR: http://cle.org.pk/clestore/urduocr.htm Sindhi: Sindhi is the medium of education in some schools in Sindh. It has more institutional backing and consequent research than other languages, especially Panjabi. Sindhi-English dictionary developed jointly by Jennifer Cole at the University of Illinois Urbana-Champaign and Sarmad Hussain at CLE (http://182.180.102.251:8081/sed1/homepage.aspx). -
Globalization and Language Policies of Multilingual Societies
Globalization and Language Policies of Multilingual Societies: Some Case Studies of South East Asia. Globalização e políticas linguísticas em sociedades multilíngues: Estudos de caso do sudeste da Asia. Navin Kumar Singh, Northern Arizona University, Flagstaff / USA; Shaoan Zhang, University of Nevada, Las Vegas / USA; Parwez Besmel, Northern Arizona University, Flagstaff / USA. ABSTRACT: Over the past few decades, significant economic and political changes have taken place around the world. These changes have also left a significant mark on language teaching and learning practices across the globe. There is a clear movement towards multilingual practices in the world, which is also evident in the title of the UNESCO 2003 education position paper, “Education in a multilingual world.” Given the long-standing history of multilingual contexts in the Himalayan region and the emergence of the two major global economic power centers of the 21st century, China and India, the language policies and practices of the region have become a matter of great interest for linguists and policy makers around the world. This paper uses case studies to investigate how globalization influences language education policies and practices in multilingual countries. The case studies, drawn from four nations of South East Asia (Afghanistan, China, India, and Nepal), offer insights for other multilingual nations of the world, as they portray the influences of globalization on the language policies and practices of multilingual countries. This paper suggests more research on comparative studies of multilingual education across multilingual nations in the world. KEYWORDS: Language maintenance, globalization, multilingualism, Asian societies. RBLA, Belo Horizonte, v. -
Some Principles of the Use of Macro-Areas Language Dynamics &A
Online Appendix for Harald Hammarström & Mark Donohue (2014), Some Principles of the Use of Macro-Areas, Language Dynamics & Change. The following document lists the languages of the world and their assignment to the macro-areas described in the main body of the paper, as well as the WALS macro-area for languages featured in the WALS 2005 edition. 7160 languages are included, which represent all languages for which we had coordinates available [1]. Every language is given with its ISO-639-3 code (if it has one) for proper identification. The mapping between WALS languages and ISO-codes was done by using the mapping downloadable from the 2011 online WALS edition [2] (because a number of errors in the mapping were corrected for the 2011 edition). 38 WALS languages are not given an ISO-code in the 2011 mapping; 36 of these have been assigned their appropriate ISO-code based on the sources the WALS lists for the respective language. This was not possible for Tasmanian (WALS-code: tsm) because the WALS mixes data from very different Tasmanian languages, and for Kualan (WALS-code: kua) because no source is given. 17 WALS-languages were assigned ISO-codes which have subsequently been retired; these have been assigned their appropriate updated ISO-code. In many cases, a WALS-language is mapped to several ISO-codes. As this has no bearing on the assignment to macro-areas, multiple mappings have been retained. [1] There are another couple of hundred languages which are attested but for which our database currently lacks coordinates. -
Pashto, Waneci, Ormuri. Sociolinguistic Survey of Northern
SOCIOLINGUISTIC SURVEY OF NORTHERN PAKISTAN, VOLUME 4: PASHTO, WANECI, ORMURI. Sociolinguistic Survey of Northern Pakistan: Volume 1, Languages of Kohistan; Volume 2, Languages of Northern Areas; Volume 3, Hindko and Gujari; Volume 4, Pashto, Waneci, Ormuri; Volume 5, Languages of Chitral. Series Editor: Clare F. O’Leary, Ph.D. Daniel G. Hallberg. National Institute of Pakistani Studies, Quaid-i-Azam University; Summer Institute of Linguistics. Copyright © 1992 NIPS and SIL. Published by National Institute of Pakistan Studies, Quaid-i-Azam University, Islamabad, Pakistan and Summer Institute of Linguistics, West Eurasia Office, Horsleys Green, High Wycombe, BUCKS HP14 3XL, United Kingdom. First published 1992. Reprinted 2004. ISBN 969-8023-14-3. Price, this volume: Rs.300/-. Price, 5-volume set: Rs.1500/-. To obtain copies of these volumes within Pakistan, contact: National Institute of Pakistan Studies, Quaid-i-Azam University, Islamabad, Pakistan. Phone: 92-51-2230791. Fax: 92-51-2230960. To obtain copies of these volumes outside of Pakistan, contact: International Academic Bookstore, 7500 West Camp Wisdom Road, Dallas, TX 75236, USA. Phone: 1-972-708-7404. Fax: 1-972-708-7433. Internet: http://www.sil.org Reformatting for reprint by R. Candlin. CONTENTS: Preface; Maps -
ARTICLE Development of a Gold-Standard Pashto Dataset and a Segmentation App Yan Han and Marek Rychlik
ARTICLE Development of a Gold-standard Pashto Dataset and a Segmentation App Yan Han and Marek Rychlik ABSTRACT The article aims to introduce a gold-standard Pashto dataset and a segmentation app. The Pashto dataset consists of 300 line images and corresponding Pashto text from three selected books. A line image is simply an image consisting of one text line from a scanned page. To our knowledge, this is one of the first open access datasets which directly maps line images to their corresponding text in the Pashto language. We also introduce the development of a segmentation app using textbox expanding algorithms, a different approach to OCR segmentation. The authors discuss the steps to build a Pashto dataset and develop our unique approach to segmentation. The article starts with the nature of the Pashto alphabet and its unique diacritics which require special considerations for segmentation. Needs for datasets and a few available Pashto datasets are reviewed. Criteria of selection of data sources are discussed and three books were selected by our language specialist from the Afghan Digital Repository. The authors review previous segmentation methods and introduce a new approach to segmentation for Pashto content. The segmentation app and results are discussed to show readers how to adjust variables for different books. Our unique segmentation approach uses an expanding textbox method which performs very well given the nature of the Pashto scripts. The app can also be used for Persian and other languages using the Arabic writing system. The dataset can be used for OCR training, OCR testing, and machine learning applications related to content in Pashto. -
Breaking Bread Dinner Series: COVID-19 Special Event Kneading Community at Home Virtual Watch Party: the Breadwinner June 5
Breaking Bread Dinner Series: COVID-19 Special Event. Kneading Community at Home. Virtual Watch Party: The Breadwinner, June 5, 2020. Educational Guide. Contents: Interactive Guide: Before (what to know before watching The Breadwinner); Viewing Guide: The Breadwinner (themes to consider, discuss, and explore); Interactive Guide: After (activities to do after watching The Breadwinner); Flavors of Home (recipes from cultures of ReEstablish Richmond clients); Recommended for Adults (resources that explore refugee and immigrant experiences); Recommended for Children and Youth (resources that explore refugee and immigrant experiences); Timeline (the history of refugee resettlement in Richmond). Interactive Guide: Before. What to know before watching The Breadwinner. Watch the movie trailer: http://thebreadwinner.com/ The setting of the story: The story takes place in the city of Kabul in 2001, at the end of the Taliban regime. Kabul is the capital of Afghanistan, located between Uzbekistan, Tajikistan, and Pakistan. To learn more about Kabul: https://www.britannica.com/place/Kabul To learn more about specific, important places shown in the film: https://thebreadwinner2017.weebly.com/setting.html Languages spoken by the characters in the film: Dari and Pashto are the official languages of Afghanistan, but there are many regional and minor languages as well. In The Breadwinner, you may hear some words that are unfamiliar to you. • Jan: means “dear” and is used after a person’s name as a way to be polite and friendly • Parvana: means “butterfly” and is a common female name in Afghanistan • Salaam: means “peace” and is the customary way of greeting someone https://www.worldatlas.com/articles/what-languages-are-spoken-in-afghanistan.html Food sold in the markets and served in the home: People from Afghanistan are known for their hospitality and beautiful food. -
Feasibility Study of Introducing Pashto Language As a Medium of Instruction in the Government Primary Schools of Khyber
FEASIBILITY STUDY OF INTRODUCING PASHTO LANGUAGE AS A MEDIUM OF INSTRUCTION IN GOVERNMENT PRIMARY SCHOOLS OF KHYBER PAKHTUNKHWA. By Abdul Basit Siddiqui. Registration No. 091-NUN-0056. Submitted in partial fulfillment of the requirements of the degree of Doctor of Philosophy in Education. Department of Education, Faculty of Arts and Social Sciences, Northern University, Nowshera (Pakistan), 2014. DEDICATION: To my dear parents, whose continuous support, encouragement and persistent prayers have been the real source of all my achievements. TABLE OF CONTENTS: ACKNOWLEDGEMENT; ABSTRACT; Chapter 1: INTRODUCTION; 1.1 STATEMENT OF THE PROBLEM; 1.2 OBJECTIVES OF THE STUDY; 1.3 HYPOTHESIS OF THE STUDY; 1.4 SIGNIFICANCE OF THE STUDY; 1.5 DELIMITATION OF THE STUDY; 1.6 METHOD AND PROCEDURE; 1.6.1 Population; 1.6.2 Sample; 1.6.3 Research Instruments; 1.6.4 Data Collection; 1.6.5 Analysis of Data; Chapter 2: REVIEW OF RELATED LITERATURE; 2.1 ALL CREATURES OF THE UNIVERSE COMMUNICATE; 2.2 LANGUAGE ESTABLISHES THE SUPERIORITY OF HUMAN BEINGS OVER OTHER SPECIES OF THE WORLD; 2.3 DEFINITIONS; 2.3.1 Mother Tongue / First Language; 2.3.2 Second Language (L2); 2.3.3 Foreign Language; 2.3.4 Medium of Instruction; 2.3.5 Mother Tongue as a Medium of Instruction; 2.4 HOW CHILDREN LEARN THEIR MOTHER TONGUE; 2.5 IMPORTANT CHARACTERISTICS FOR A LANGUAGE ADOPTED AS MEDIUM OF INSTRUCTION; 2.6 CONDITIONS FOR THE SELECTION OF DESIRABLE TEXT FOR LANGUAGE; 2.7 THEORIES ABOUT LEARNING (MOTHER) LANGUAGE; 2.8 ORIGIN OF PAKHTUN -
Pashto Language & Identity Formation in Pakistan
Pashto Language & Identity Formation in Pakistan (Contemporary South Asia, July 1995, Vol 4, Issue 2, p151-20). Tariq Rahman (Associate Professor of Linguistics, National Institute of Pakistan Studies, Quaid-i-Azam University, Islamabad, Pakistan). Contents: 1 Linguistic and Ethnic Situation; 1.1 In Afghanistan; 1.2 In Pakistan; 2 Pashto and Pakhtun identity; 2.1 Imperialist mistrust of Pashto; 2.2 Pre-partition efforts to promote Pashto; 2.3 Journalistic and literary activities in Pashto; 2.4 Pashto and politics in pre-partition NWFP; 2.5 Pashto in Swat; 3 Pashto in Pakistan; 3.1 The political background; 3.2 The status of Pashto; 3.3 The politics of Pashto; 4 Conclusion; References. Abstract: Traces out the history of the movement to increase the use of the Pashto language in the domains of power in Pakistan. Relationship of the movement with ethnic politics; Linguistic and ethnic situation in Afghanistan; Pashto and Pakhtun identity; Attitude of the Pakistani ruling elite towards Pashto. Pashto, a language belonging to the Iranian branch of the Indo-European language family, has more than 25 million native speakers. Of these, 16 to 17 million live in Pakistan and 8 to 9 million in Afghanistan [1]. Pashto is the official language in Afghanistan, along with Dari (Afghan Persian), but in Pakistan it is not used in the domains of power (administration, military, judiciary, commerce, education and research) in any significant way. The activists of the Pashto language movement of Pakistan have been striving to increase the use of the language in these domains, i.e. -
Language Documentation and Description
Language Documentation and Description, ISSN 1740-6234. This article appears in: Language Documentation and Description, vol 17. Editor: Peter K. Austin. Countering the challenges of globalization faced by endangered languages of North Pakistan. ZUBAIR TORWALI. Cite this article: Torwali, Zubair. 2020. Countering the challenges of globalization faced by endangered languages of North Pakistan. In Peter K. Austin (ed.) Language Documentation and Description 17, 44-65. London: EL Publishing. Link to this article: http://www.elpublishing.org/PID/181 This electronic version first published: July 2020. This article is published under a Creative Commons License CC-BY-NC (Attribution-NonCommercial). The licence permits users to use, reproduce, disseminate or display the article provided that the author is attributed as the original creator and that the reuse is restricted to non-commercial purposes, i.e. research or educational use. See http://creativecommons.org/licenses/by-nc/4.0/ EL Publishing. For more EL Publishing articles and services: Website: http://www.elpublishing.org Submissions: http://www.elpublishing.org/submissions Zubair Torwali, Independent Researcher. Summary: Indigenous communities living in the mountainous terrain and valleys of the region of Gilgit-Baltistan and upper Khyber Pakhtunkhwa, northern -
A Micro-Typological Study of Pashai Varieties in Afghanistan
A micro-typological study of Pashai varieties in Afghanistan. Vanessa Quasnik. Department of Linguistics. Independent Project for a Master’s degree, 30hp. Master’s program (120hp), Spring term 2019. Supervisor: Henrik Liljegren. En mikrotypologisk studie av Pashai varieteter i Afghanistan. Summary: The Hindu Kush region stretches from Afghanistan across Pakistan to northern India and is home to what are commonly called the Dardic languages. The Dardic languages belong to the Indo-Aryan languages, which in isolation and through contact have developed or preserved features that are no longer found in Indo-Aryan languages outside the region. In the ongoing project “Language contact and relatedness in the Hindu Kush region”, data were collected from more than 50 languages, including nine varieties of the north-western Indo-Aryan language Pashai, which is spoken in western Afghanistan. A cognate analysis and an analysis of phonological, morphological, syntactic and lexical features were carried out. The cognate analysis shows that the Pashai varieties form clusters: a western group of the three western varieties and an eastern group of the six eastern varieties. The structural analysis shows a more varied picture of three potential clusters: a group of the two westernmost varieties, a north-eastern group, and a central group consisting of one western variety and two south-eastern varieties. Some features considered to be shared by the languages of the region can also be observed in all Pashai varieties, such as subject-object-verb order and postpositions. Keywords: Hindu Kush, Indo-Aryan languages, microtypology, Pashai. A micro-typological study of Pashai varieties in Afghanistan. Abstract: The Hindu Kush region stretches from Afghanistan over Pakistan to North India and is home to what is commonly known as the Dardic languages. The Dardic languages are a group of Indo-Aryan languages that have in isolation and under contact developed or retained features that cannot be found in Indo-Aryan languages outside the region. -
Reimagine a F G H a N I S T a N
REIMAGINE AFGHANISTAN: An Initiative by Raisina House. INTRODUCTION: Afghanistan equals culture, heritage, music, poetry, spirituality, food and so much more. The country has witnessed continuous violence for more than four decades, and this has in turn overshadowed the rich cultural heritage possessed by the country, which has evolved through millennia of cultural interaction and evolution. Reimagine Afghanistan, as a digital magazine, is an attempt by Raisina House to explore and portray that hidden side of Afghanistan, one that is almost always overlooked by the mainstream media: the side that is humane. Afghanistan is rich in cultural heritage that has seen millennia of construction and destruction but has managed to evolve for the better through the ages. Issued as part of our vision project "Rejuvenate Afghanistan", the magazine is an attempt to change the existing perception of Afghanistan as a country and a society, bringing forward that there is more to the country than meets the eye. So do join us in this journey to explore the people, lifestyle, art, food and music of this adventure called Afghanistan. CONTENTS: Afghanistan Country Profile; People, Ethnicity & Language of Afghanistan; Art of Afghanistan; Artists of Afghanistan; Wood Carving in Afghanistan; Glass Blowing in Afghanistan; Carpets of Afghanistan; Ceramic Ware of Afghanistan; Famous Recipes of Afghanistan; Afghani Poetry; Architecture of Afghanistan; Reimagining Afghanistan Through Cinema; Afghani Movie Recommendation. ABOUT AFGHANISTAN. Afghanistan Country Profile: The Islamic Republic of Afghanistan is a landlocked country situated at the crossroads of Western, Central, and Southern Asia and is at the heart of the continent.