Building a 50M Corpus of Tajik Language
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Tajiki Some Useful Phrases in Tajiki Five Reasons Why You Should Ассалому Алейкум
TAJIKI SOME USEFUL PHRASES IN TAJIKI FIVE REASONS WHY YOU SHOULD ассалому алейкум. LEARN MORE ABOUT TAJIKIS AND [ˌasːaˈlɔmu aˈlɛɪkum] /asah-lomu ah-lay-koom./ THEIR LANGUAGE Hello! 1. Tajiki is spoken as a first or second language by over 8 million people worldwide, but the Hоми шумо? highest population of speakers is located in [ˈnɔmi ʃuˈmɔ] Tajikistan, with significant populations in other /No-mee shoo-moh?/ Central Eurasian countries such as Afghanistan, What is your name? Uzbekistan, and Russia. Номи ман… 2. Tajiki is a member of the Western Iranian branch [ˈnɔmi man …] of the Indo-Iranian languages, and shares many structural similarities to other Persian languages /No-mee man.../ such as Dari and Farsi. My name is… 3. Few people in America can speak or use the Tajiki Шумо чи xeл? Нағз, рахмат. version of Persian. Given the different script and [ʃuˈmɔ ʧi χɛl naʁz ɾaχˈmat] dialectal differences, simply knowing Farsi is not /shoo-moh-chee-khel? Naghz, rah-mat./ enough to fully understand Tajiki. Those who How are you? I’m fine, thank you. study Tajiki can find careers in a variety of fields including translation and interpreting, consulting, Aз вохуриамон шод ҳастам. and foreign service and intelligence. NGOs [az vɔχuˈɾiamɔn ʃɔd χaˈstam] and other enterprises that deal with Tajikistan /Az vo-khu-ri-amon shod has-tam./ desperately need specialists who speak Tajiki. Nice to meet you. 4. The Pamir Mountains which have an elevation Лутфан. / Рахмат. of 23,000 feet are known locally as the “Roof of [lutˈfan] / [ɾaχˈmat] the World”. Mountains make up more than 90 /Loot-fan./ /Rah-mat./ percent of Tajikistan’s territory. -
View / Download 7.3 Mb
Between Shanghai and Mecca: Diaspora and Diplomacy of Chinese Muslims in the Twentieth Century by Janice Hyeju Jeong Department of History Duke University Date:_______________________ Approved: ___________________________ Engseng Ho, Advisor ___________________________ Prasenjit Duara, Advisor ___________________________ Nicole Barnes ___________________________ Adam Mestyan ___________________________ Cemil Aydin Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of History in the Graduate School of Duke University 2019 ABSTRACT Between Shanghai and Mecca: Diaspora and Diplomacy of Chinese Muslims in the Twentieth Century by Janice Hyeju Jeong Department of History Duke University Date:_______________________ Approved: ___________________________ Engseng Ho, Advisor ___________________________ Prasenjit Duara, Advisor ___________________________ Nicole Barnes ___________________________ Adam Mestyan ___________________________ Cemil Aydin An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, in the Department of History in the Graduate School of Duke University 2019 Copyright by Janice Hyeju Jeong 2019 Abstract While China’s recent Belt and the Road Initiative and its expansion across Eurasia is garnering public and scholarly attention, this dissertation recasts the space of Eurasia as one connected through historic Islamic networks between Mecca and China. Specifically, I show that eruptions of -
Technical Reference Manual for the Standardization of Geographical Names United Nations Group of Experts on Geographical Names
ST/ESA/STAT/SER.M/87 Department of Economic and Social Affairs Statistics Division Technical reference manual for the standardization of geographical names United Nations Group of Experts on Geographical Names United Nations New York, 2007 The Department of Economic and Social Affairs of the United Nations Secretariat is a vital interface between global policies in the economic, social and environmental spheres and national action. The Department works in three main interlinked areas: (i) it compiles, generates and analyses a wide range of economic, social and environmental data and information on which Member States of the United Nations draw to review common problems and to take stock of policy options; (ii) it facilitates the negotiations of Member States in many intergovernmental bodies on joint courses of action to address ongoing or emerging global challenges; and (iii) it advises interested Governments on the ways and means of translating policy frameworks developed in United Nations conferences and summits into programmes at the country level and, through technical assistance, helps build national capacities. NOTE The designations employed and the presentation of material in the present publication do not imply the expression of any opinion whatsoever on the part of the Secretariat of the United Nations concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. The term “country” as used in the text of this publication also refers, as appropriate, to territories or areas. Symbols of United Nations documents are composed of capital letters combined with figures. ST/ESA/STAT/SER.M/87 UNITED NATIONS PUBLICATION Sales No. -
Shughni (Edelman & Dodykhudoeva).Pdf
DEMO : Purchase from www.A-PDF.com to remove the watermark CHAPTER FOURTEEN B SHUGHNI D. (Joy) I. Edelman and Leila R. Dodykhlldoel'a 1 INTRODUCTION 1.1 Overview The Shughni, or Shughnani, ethnic group, ethnonym xuynl, xuynl'tnl, populates the mountain valleys of the We st Pamir. Administratively, the Shughni-speaking area is part of the Mountainous Badakhshan Autonomous Region (Taj ik Viluyali /viukltlori Klihistoni Badakhsholl) of the Republic of Tajikistan, with its major center of Khorog, Taj . Khorugh (370 30/N, 710 3l/E), and of the adjacent Badakhshan Province of Afghanistan. In Tajikistan, the Shughn(an)i live along the right bank of the longitudinal stretch of the river Panj from (Zewar) Dasht in the North to Darmorakht in the south, as well as along the valleys of its eastern tributaries, the Ghund (rlll1d, Taj ik ru nt) and the Shahdara (Xa"tdarii), which meet at Khorog. They also constitute the major population group in the high mountain valley of Baju(w)dara (Bafli (l I>}darii) to the north of Khorog. Small, compact groups are also fo und in central Tajikistan, including Khatlon, Romit, Kofarnikhon, and other regions. In Afghanistan, the Shughn(an)i have also compact settlements, mainly on the left bank of the river Panj in Badakhshan Province. A sizeable Shughn(an)i-speaking com munity is also fo und in Kabul (cf. Nawata 1979) and in Faizabad, the capital of Afghan Badakhshan. Linguistically, the Shughn(an)i language, endonym (xlIyn (I.ln)l, Xll)i/1(lln)J zil'. XUYIUII1 zil'), belongs to the Shughn(an)i-Rushani sub-group of the North Pamir languages. -
Birth of Tajikistan : National Identity and the Origins of the Republic
THE BIRTH OF TAJIKISTAN i THE BIRTH OF TAJIKISTAN ii THE BIRTH OF TAJIKISTAN For Suzanne Published in 2007 by I.B.Tauris & Co Ltd 6 Salem Road, London W2 4BU 175 Fifth Avenue, New York NY 10010 www.ibtauris.com In the United States of America and Canada distributed by Palgrave Macmillan a division of St. Martin's Press, 175 Fifth Avenue, New York NY 10010 Copyright © Paul Bergne The right of Paul Bergne to be identified as the author of this work has been asserted by the author in accordance with the Copyright, Designs and Patent Act 1988. All rights reserved. Except for brief quotations in a review, this book, or any part thereof, may not be reproduced, stored in or introduced into a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher. International Library of Central Asian Studies 1 ISBN: 978 1 84511 283 7 A full CIP record for this book is available from the British Library A full CIP record is available from the Library of Congress Library of Congress Catalog Card Number: available Printed and bound in India by Replika Press Pvt. Ltd From camera-ready copy edited and supplied by the author THE BIRTH OF TAJIKISTAN v CONTENTS Abbreviations vii Transliteration ix Acknowledgements xi Maps. Central Asia c 1929 xii Central Asia c 1919 xiv Introduction 1 1. Central Asian Identities before 1917 3 2. The Turkic Ascendancy 15 3. The Revolution and After 20 4. The Road to Soviet Power 28 5. -
The Socio Linguistic Situation and Language Policy of the Autonomous Region of Mountainous Badakhshan: the Case of the Tajik Language*
THE SOCIO LINGUISTIC SITUATION AND LANGUAGE POLICY OF THE AUTONOMOUS REGION OF MOUNTAINOUS BADAKHSHAN: THE CASE OF THE TAJIK LANGUAGE* Leila Dodykhudoeva Institute Of Linguistics, Russian Academy Of Sciences The paper deals with the problem of closely related languages of the Eastern and Western Iranian origin that coexist in a close neighbourhood in a rather compact area of one region of Republic Tajikistan. These are a group of "minor" Pamir languages and state language of Tajikistan - Tajik. The population of the Autonomous Region of Mountainous Badakhshan speaks different Pamir languages. They are: Shughni, Rushani, Khufi, Bartangi, Roshorvi, Sariqoli; Yazghulami; Wakhi; Ishkashimi. These languages have no script and written tradition and are used only as spoken languages in the region. The status of these languages and many other local linguemes is still discussed in Iranology. Nearly all Pamir languages to a certain extent can be called "endangered". Some of these languages, like Yazghulami, Roshorvi, Ishkashimi are included into "The Red Book " (UNESCO 1995) as "endangered". Some of them are extinct. Information on other idioms up to now is not available. These languages live in close cooperation and interaction with the state language of Tajikistan - Tajik. Almost all population of Badakhshan is multilingual or bilingual. The second language is official language of the state - Tajik. This language is used in Badakhshan as the language of education, press, media, and culture. This is the reason why this paper is focused on the status of Tajik language in Republic Tajikistan and particularly in Badakhshan. The Tajik literary language (its oral and written forms) has a long history and rich written traditions. -
5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721
Internet Engineering Task Force (IETF) P. Faltstrom, Ed. Request for Comments: 5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721 The Unicode Code Points and Internationalized Domain Names for Applications (IDNA) Abstract This document specifies rules for deciding whether a code point, considered in isolation or in context, is a candidate for inclusion in an Internationalized Domain Name (IDN). It is part of the specification of Internationalizing Domain Names in Applications 2008 (IDNA2008). Status of This Memo This is an Internet Standards Track document. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5892. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. -
AIX Globalization
AIX Version 7.1 AIX globalization IBM Note Before using this information and the product it supports, read the information in “Notices” on page 233 . This edition applies to AIX Version 7.1 and to all subsequent releases and modifications until otherwise indicated in new editions. © Copyright International Business Machines Corporation 2010, 2018. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents About this document............................................................................................vii Highlighting.................................................................................................................................................vii Case-sensitivity in AIX................................................................................................................................vii ISO 9000.....................................................................................................................................................vii AIX globalization...................................................................................................1 What's new...................................................................................................................................................1 Separation of messages from programs..................................................................................................... 1 Conversion between code sets............................................................................................................. -
Tehran, Vienna Keen on Exchanging National Archive Documents
Art & Culture November 23, 2019 3 This Day in History (November 23) Tehran, Vienna Keen on Exchanging Today is Saturday; 2nd of the Iranian month of Azar 1398 solar hijri; corresponding to 25th of the Islamic month of Rabi al-Awwal 1441 lunar hijri; and November 23, 2019, of the Christian Gregorian Calendar. National Archive Documents 1918 solar years ago, on this day in the year 101 AD, present day Romania was occupied by the Roman Empire. This land was ruled by the Romans until its conquest for publishing a 162-year-old Elsewhere in his remarks, by the Ottoman Turks in 1453. Meanwhile, parts of Romania were also occupied by cooperation document between Zarafshan pointed to the documents Austria for a while till the year 1877, in which Romania emerged as independent. Iran and Austria. between Iran and Austria in the Romania covers an area of 237500 sq km. Its capital is Bucharest. 1005 lunar years ago, on this day in 436 AH, the great scholar and jurisprudent, Zarafshan made the remarks at field of cooperation of the two Seyyed Ali Ibn Hussain, popularly known as Sharif Murtaza, passed away at the the venue of National Archive countries in different economic age of 81 in his hometown Baghdad. He was born in a family descended on both of Iran on Wednesday in the sectors, technology transfer, in sides from Prophet Mohammad (blessings of God upon him and his progeny). His inaugural ceremony of Iranian aviation and transport industries, th father Hussain was 5 in line of descent from Imam Musa al-Kazem (AS), while and Austrian documents in the import and export activities and his mother, Fatema – a scion of the family that had carved out an independent state in Tabaristan on the Caspian Sea coast of Iran – was a descendant of Imam presence of Austrian ambassador added, “the old documents of the Zain al-Abedin (AS). -
An Acoustic, Historical, and Developmental Analysis Of
AN ACOUSTIC, HISTORICAL, AND DEVELOPMENTAL ANALYSIS OF SARIKOL TAJIK DIPHTHONGS by PAMELA S. ARLUND Presented to the Faculty of the Graduate School of The University of Texas at Arlington in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY THE UNIVERSITY OF TEXAS AT ARLINGTON December 2006 Copyright © by Pamela S. Arlund 2006 All Rights Reserved ACKNOWLEDGEMENTS I am thankful to all those people who have encouraged me along the way to make the time for this and to not give up. I would like to thank my Mom and Dad, Tom and Judy Arlund, first and foremost. They weren’t always sure what to do with their daughter who loved books more than playing, but they always encouraged my desire for knowledge. They may not know any Tajiks, but the Tajiks know Mom and Dad through me. Thank you to all the people in China. The students, faculty and foreign affairs staff at Xinjiang University have always been so gracious and affirming. The Tajik people themselves are so willing and eager to have their language studied. I hope this work lives up to their hopes and expectations. Thank you to Dr. Jerold Edmondson whose assistance helped me to take a whole new approach to the problem of diphthongs. There are so many people in so many far-flung places who have encouraged me. Thanks to all those at Metro in Kansas City, at Hillside in Perth, at New City in New Zealand and at Colleyville in Texas. Your love, encouragement, and support have kept me going on this project. -
Alphabets, Letters and Diacritics in European Languages (As They Appear in Geography)
1 Vigleik Leira (Norway): [email protected] Alphabets, Letters and Diacritics in European Languages (as they appear in Geography) To the best of my knowledge English seems to be the only language which makes use of a "clean" Latin alphabet, i.d. there is no use of diacritics or special letters of any kind. All the other languages based on Latin letters employ, to a larger or lesser degree, some diacritics and/or some special letters. The survey below is purely literal. It has nothing to say on the pronunciation of the different letters. Information on the phonetic/phonemic values of the graphic entities must be sought elsewhere, in language specific descriptions. The 26 letters a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z may be considered the standard European alphabet. In this article the word diacritic is used with this meaning: any sign placed above, through or below a standard letter (among the 26 given above); disregarding the cases where the resulting letter (e.g. å in Norwegian) is considered an ordinary letter in the alphabet of the language where it is used. Albanian The alphabet (36 letters): a, b, c, ç, d, dh, e, ë, f, g, gj, h, i, j, k, l, ll, m, n, nj, o, p, q, r, rr, s, sh, t, th, u, v, x, xh, y, z, zh. Missing standard letter: w. Letters with diacritics: ç, ë. Sequences treated as one letter: dh, gj, ll, rr, sh, th, xh, zh. -
Kampf Um Wort Und Schrift
Vandenhoeck & Ruprecht Kampf um Wort Veröffentlichungen des Instituts für Europäische Geschichte Mainz Beiheft 90 und Schrift Russifizierung in Osteuropa Nach den Teilungen Polens und der Eroberung des Kaukasus im 19.–20. Jahrhundert und Zentralasiens im 18./19. Jahrhundert erhielt das Zarenreich Kontrolle über alte Kulturräume, die es im Zuge der Koloniali- sierung zu assimilieren versuchte. Diese Versuche erfolgten nicht Herausgegeben von Zaur Gasimov zuletzt mittels der Sprachpolitik. Russisch sollte im Bildungs- und Behördenwesen im gesamten Imperium Verbreitung finden, andere Sprachen sollten verdrängt werden. Diese Russifizierung lässt sich Schrift und Wort um Kampf von einer kurzen Phase der »Verwurzelung« unter Lenin bis weit ins 20. Jahrhundert nachverfolgen. Erst im Zuge der Perestrojka wurde die sowjetische Sprachpolitik öffentlich kritisiert: Die einzelnen Republiken konnten durch neue Sprachgesetze ein Aussterben der lokalen Sprachen verhindern. Der Herausgeber Dr. Zaur Gasimov ist Wissenschaftlicher Mitarbeiter am Leibniz-Institut für Europäische Geschichte in Mainz. Zaur Gasimov (Hg.) (Hg.) Gasimov Zaur Vandenhoeck & Ruprecht www.v-r.de 9 783525 101223 V UUMS_Gasimov_VIEG_v2MS_Gasimov_VIEG_v2 1 005.03.125.03.12 115:095:09 Veröffentlichungen des Instituts für Europäische Geschichte Mainz Abteilung für Universalgeschichte Herausgegeben von Johannes Paulmann Beiheft 90 Vandenhoeck & Ruprecht Kampf um Wort und Schrift Russifizierung in Osteuropa im 19.–20. Jahrhundert Herausgegeben von Zaur Gasimov Vandenhoeck & Ruprecht Bibliografische Information der Deutschen Nationalbibliothek Die Deutsche Nationalbibliothek verzeichnet diese Publikation in der Deutschen Nationalbibliografie; detaillierte bibliografische Daten sind im Internet über http://dnb.d-nb.de abrufbar. ISBN (Print) 978-3-525-10122-3 ISBN (OA) 978-3-666-10122-9 https://doi.org/10.13109/9783666101229 © 2012, Vandenhoeck & Ruprecht GmbH & Co.