Bridging the Script and Lexical Barrier Between Hindi and Urdu
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Technical Reference Manual for the Standardization of Geographical Names United Nations Group of Experts on Geographical Names
ST/ESA/STAT/SER.M/87 Department of Economic and Social Affairs Statistics Division Technical reference manual for the standardization of geographical names United Nations Group of Experts on Geographical Names United Nations New York, 2007 The Department of Economic and Social Affairs of the United Nations Secretariat is a vital interface between global policies in the economic, social and environmental spheres and national action. The Department works in three main interlinked areas: (i) it compiles, generates and analyses a wide range of economic, social and environmental data and information on which Member States of the United Nations draw to review common problems and to take stock of policy options; (ii) it facilitates the negotiations of Member States in many intergovernmental bodies on joint courses of action to address ongoing or emerging global challenges; and (iii) it advises interested Governments on the ways and means of translating policy frameworks developed in United Nations conferences and summits into programmes at the country level and, through technical assistance, helps build national capacities. NOTE The designations employed and the presentation of material in the present publication do not imply the expression of any opinion whatsoever on the part of the Secretariat of the United Nations concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. The term “country” as used in the text of this publication also refers, as appropriate, to territories or areas. Symbols of United Nations documents are composed of capital letters combined with figures. ST/ESA/STAT/SER.M/87 UNITED NATIONS PUBLICATION Sales No. -
Arabic Alphabet - Wikipedia, the Free Encyclopedia Arabic Alphabet from Wikipedia, the Free Encyclopedia
2/14/13 Arabic alphabet - Wikipedia, the free encyclopedia Arabic alphabet From Wikipedia, the free encyclopedia َأﺑْ َﺠ ِﺪﯾﱠﺔ َﻋ َﺮﺑِﯿﱠﺔ :The Arabic alphabet (Arabic ’abjadiyyah ‘arabiyyah) or Arabic abjad is Arabic abjad the Arabic script as it is codified for writing the Arabic language. It is written from right to left, in a cursive style, and includes 28 letters. Because letters usually[1] stand for consonants, it is classified as an abjad. Type Abjad Languages Arabic Time 400 to the present period Parent Proto-Sinaitic systems Phoenician Aramaic Syriac Nabataean Arabic abjad Child N'Ko alphabet systems ISO 15924 Arab, 160 Direction Right-to-left Unicode Arabic alias Unicode U+0600 to U+06FF range (http://www.unicode.org/charts/PDF/U0600.pdf) U+0750 to U+077F (http://www.unicode.org/charts/PDF/U0750.pdf) U+08A0 to U+08FF (http://www.unicode.org/charts/PDF/U08A0.pdf) U+FB50 to U+FDFF (http://www.unicode.org/charts/PDF/UFB50.pdf) U+FE70 to U+FEFF (http://www.unicode.org/charts/PDF/UFE70.pdf) U+1EE00 to U+1EEFF (http://www.unicode.org/charts/PDF/U1EE00.pdf) Note: This page may contain IPA phonetic symbols. Arabic alphabet ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع en.wikipedia.org/wiki/Arabic_alphabet 1/20 2/14/13 Arabic alphabet - Wikipedia, the free encyclopedia غ ف ق ك ل م ن ه و ي History · Transliteration ء Diacritics · Hamza Numerals · Numeration V · T · E (//en.wikipedia.org/w/index.php?title=Template:Arabic_alphabet&action=edit) Contents 1 Consonants 1.1 Alphabetical order 1.2 Letter forms 1.2.1 Table of basic letters 1.2.2 Further notes -
Shahmukhi to Gurmukhi Transliteration System: a Corpus Based Approach
Shahmukhi to Gurmukhi Transliteration System: A Corpus based Approach Tejinder Singh Saini1 and Gurpreet Singh Lehal2 1 Advanced Centre for Technical Development of Punjabi Language, Literature & Culture, Punjabi University, Patiala 147 002, Punjab, India [email protected] http://www.advancedcentrepunjabi.org 2 Department of Computer Science, Punjabi University, Patiala 147 002, Punjab, India [email protected] Abstract. This research paper describes a corpus based transliteration system for Punjabi language. The existence of two scripts for Punjabi language has created a script barrier between the Punjabi literature written in India and in Pakistan. This research project has developed a new system for the first time of its kind for Shahmukhi script of Punjabi language. The proposed system for Shahmukhi to Gurmukhi transliteration has been implemented with various research techniques based on language corpus. The corpus analysis program has been run on both Shahmukhi and Gurmukhi corpora for generating statistical data for different types like character, word and n-gram frequencies. This statistical analysis is used in different phases of transliteration. Potentially, all members of the substantial Punjabi community will benefit vastly from this transliteration system. 1 Introduction One of the great challenges before Information Technology is to overcome language barriers dividing the mankind so that everyone can communicate with everyone else on the planet in real time. South Asia is one of those unique parts of the world where a single language is written in different scripts. This is the case, for example, with Punjabi language spoken by tens of millions of people but written in Indian East Punjab (20 million) in Gurmukhi script (a left to right script based on Devanagari) and in Pakistani West Punjab (80 million), written in Shahmukhi script (a right to left script based on Arabic), and by a growing number of Punjabis (2 million) in the EU and the US in the Roman script. -
Orthographic Transparency and the Ottoman Abjad Maithili Jais
Orthographic Transparency and the Ottoman Abjad Maithili Jais University of Florida Spring 2018 I. Introduction In 2014, the debate over whether Ottoman Turkish was to be taught in schools or not was once again brought to the forefront of Turkish society and the Turkish conscience, as Erdogan began to push for Ottoman Turkish to be taught in all high schools across the country (Yeginsu, 2014). This became an obsession of a news topic for media in the West as well as in Turkey. Turkey’s tumultuous history with politics inevitably led this proposal of teaching Ottoman Turkish in all high schools to become a hotbed of controversy and debate. For all those who are perfectly contented to let bygones be bygones, there are many who assert that the Ottoman Turkish alphabet is still relevant and important. In fact, though this may be a personal anecdote, there are still certainly people who believe that the Ottoman script is, or was, superior to the Latin alphabet with which modern Turkish is written. This thesis does not aim to undertake a task so grand as sussing out which of the two was more appropriate for Turkish. No, such a task would be a behemoth for this paper. Instead, it aims to answer the question, “How?” Rather, “How was the Arabic script moulded to fit Turkish and to what consequence?” Often the claim that one script it superior to another suggests inherent judgement of value, but of the few claims seen circulating Facebook on the efficacy of the Ottoman script, it seems some believe that it represented Turkish more accurately and efficiently. -
The Secret of Letters: Chronograms in Urdu Literary Culture1
Edebiyˆat, 2003, Vol. 13, No. 2, pp. 147–158 The Secret of Letters: Chronograms in Urdu Literary Culture1 Mehr Afshan Farooqi University of Virginia Letters of the alphabet are more than symbols on a page. They provide an opening into new creative possibilities, new levels of understanding, and new worlds of experience. In mature literary traditions, the “literal meaning” of literal meaning can encompass a variety of arcane uses of letters, both in their mode as a graphemic entity and as a phonemic activity. Letters carry hidden meanings in literary languages at once assigned and intrinsic: the numeric and prophetic, the cryptic and esoteric, and the historic and commemoratory. In most literary traditions there appears to be at least a threefold value system assigned to letters: letters can be seen as phonetic signs, they have a semantic value, and they also have a numerical value. Each of the 28 letters of the Arabic alphabet can be used as a numeral. When used numerically, the letters of the alphabet have a special order, which is called the abjad or abujad. Abjad is an acronym referring to alif, be, j¯ım, d¯al, the first four letters in the numerical order which, in the system most widely used, runs from alif to ghain. The abjad order organizes the 28 characters of the Arabic alphabet into eight groups in a linear series: abjad, havvaz, hutt¯ı, kalaman, sa`fas, qarashat, sakhkha˙˙ z, zazzagh.2 In nearly every area where˙ ¨the¨ Arabic script ˙ was adopted, the abjad¨ ˙ ˙system gained popularity. Within the vast area in which the Arabic script was used, two abjad systems developed. -
Middle East-I 9 Modern and Liturgical Scripts
The Unicode® Standard Version 13.0 – Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trade- mark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries. The authors and publisher have taken care in the preparation of this specification, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. © 2020 Unicode, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction. For information regarding permissions, inquire at http://www.unicode.org/reporting.html. For information about the Unicode terms of use, please see http://www.unicode.org/copyright.html. The Unicode Standard / the Unicode Consortium; edited by the Unicode Consortium. — Version 13.0. Includes index. ISBN 978-1-936213-26-9 (http://www.unicode.org/versions/Unicode13.0.0/) 1. -
Blin Orthography: a History and an Assessment
Blin Orthography: A History and an Assessment Paul D. Fallon University of Mary Washington 1.Introduction1 Blin2 is a Central Cushitic language spoken by an estimated 90,000 in Eritrea, concentrated in the ‘Anseba region around Keren. Blin speakers comprise roughly 2% of the Eritrean population, and Blin is one of nine national languages in Eritrea. Most speakers are bilingual in the Semitic languages Tigrinya or Tigre, and many know Amharic, Arabic, and/or English as well. Abbebe (2001) is an excellent assessment of Blin language vitality and sociolinguistics. The language policy of the Eritrean government is to encourage mother-tongue education for native speakers of each of its nine ethnolinguistic groups through primary school (Chefena, Kroon and Walters 1999). When this was implemented for the Blin in 1997, the government provided Blin curricular materials in the language using a new Roman-based alphabet. This overturned a 110-year tradition of writing Blin in Ethiopic script. This paper will focus on the history of writing in Blin, and examine the linguistic and sociolinguistic factors of each writing system. For more on language planning in Blin in general, see Fallon (2006). Before proceeding, I will adopt the following definitions from Daniels (2001). An alphabet is a writing system “in which each character stands for a consonant or a vowel” (44). A syllabary is a system “in which each character stands for a syllable” (43), in contrast to the system used in Ethiopian Semitic languages, an abugida, “in which each character stands for a consonant accompanied by a particular vowel, usually /a/, and other vowels (or no vowel) are indicated by consistent additions to the consonant symbols” (44). -
The Arabic Alphabet Free
FREE THE ARABIC ALPHABET PDF Nicholas Awde,Putros Samano | 95 pages | 30 Apr 2007 | SAQI BOOKS | 9780863569548 | English | London, United Kingdom Your Complete Guide to the Arabic Alphabet | OptiLingo The Arabic script evolved from the Nabataean Aramaic script. The Aramaic language has fewer consonants than Arabic, so during the 7th century new Arabic letters were created by adding dots to existing letters in order to avoid ambiguities. Further diacritics indicating short vowels were introduced, but are only generally used to ensure the Qur'an was read aloud without mistakes. Each Arabic speaking country or region also has its own The Arabic Alphabet of colloquial spoken Arabic. These colloquial varieties of Arabic appear in written form in some The Arabic Alphabet, cartoons and comics, plays and personal letters. There are also translations of the bible into most varieties of The Arabic Alphabet Arabic. Arabic has also been written with the HebrewSyriac and Latin scripts. The The Arabic Alphabet of consonants used above is the ISO version of There are various other ways of transliterating Arabic. When chatting online some Arabic speakers write in the Latin alphabet use the following letters:. These numerals are those used when writing Arabic and are written from left to right. The term 'Arabic numerals' is also used to refer to 1, 2, 3, etc. For a full list of all varieties of colloquial Arabic click here format: Excel, 20K. All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act The Arabic Alphabet one another in a spirit of brotherhood. -
Proposal for Arabic Script Root Zone LGR
Proposal for Arabic Script Root Zone LGR LGR Version: 1 Authors: Task Force on Arabic Script IDN (TF-AIDN) https://community.icann.org/display/MES/TF-AIDN+Work+Space Date: 18 November 2015 Document Version: 3.4 Contents General Information ..................................................................................................................................... 2 1 Script and Languages Covered .............................................................................................................. 2 2 Process Undertaken for Developing the Proposal ................................................................................ 4 2.1 Team diversity and process .......................................................................................................... 4 2.2 Analysis of code point repertoire.................................................................................................. 6 2.3 Analysis of code point variants ..................................................................................................... 8 3 Code Point Repertoire......................................................................................................................... 10 3.1 Summary of code point repertoire included and excluded ........................................................ 10 3.2 Code point repertoire included .................................................................................................. 12 4 Final Recommendation of Variants for Top Level Domains (TLDs) .................................................... -
Arabic Romanization Table
Arabic Letters of the Alphabet Initial Medial Final Alone Romanization (omit (see Note 1 ا ﺎ ﺎ ا b ﺏ ﺐ ﺒ ﺑ t ﺕ ﺖ ﺘ ﺗ th ﺙ ﺚ ﺜ ﺛ j ﺝ ﺞ ﺠ ﺟ ḥ ﺡ ﺢ ﺤ ﺣ kh ﺥ ﺦ ﺨ ﺧ d ﺩ ﺪ ﺪ ﺩ dh ﺫ ﺬ ﺬ ﺫ r ﺭ ﺮ ﺮ ﺭ z ﺯ ﺰ ﺰ ﺯ s ﺱ ﺲ ﺴ ﺳ sh ﺵ ﺶ ﺸ ﺷ ṣ ﺹ ﺺ ﺼ ﺻ ḍ ﺽ ﺾ ﻀ ﺿ ṭ ﻁ ﻂ ﻄ ﻃ ẓ ﻅ ﻆ ﻈ ﻇ (ayn) ‘ ﻉ ﻊ ﻌ ﻋ gh ﻍ ﻎ ﻐ ﻏ (f (see Note 2 ﻑ ﻒ ﻔ ﻓ (q (see Note 2 ﻕ ﻖ ﻘ ﻗ k ﻙ ﻚ ﻜ ﻛ l ﻝ ﻞ ﻠ ﻟ m ﻡ ﻢ ﻤ ﻣ n ﻥ ﻦ ﻨ ﻧ (h (see Note 3 ة ، ه ـﺔ ، ﻪ ﻬ ﻫ w ﻭ ﻮ ﻮ ﻭ y ي ي ﻴ ﻳ Vowels and Diphthongs ī ◌ِ ﻯ (ā (see Rule 5 ◌َا a ◌َ aw ْ◌َ ﻭ ((á (see Rule 6(a ◌َ ﻯ u ◌ُ ay ◌َ ﻯْ ū ◌ُ ﻭ i ◌ِ Letters Representing Non-Arabic Consonants This list is not exhaustive. It should be noted that a letter in this group may have more than one phonetic value, depending on the country or area where it is used, and that the romanization will vary accordingly. v ڤ ch چ g گ v ۋ zh چ ñ ڴ v ڥ zh ژ p پ Notes 1. -
Urdu Alphabet
Urdu alphabet The Urdu alphabet is the right-to-left alphabet used for the Urdu language. It is Urdu alphabet a modification of the Persian alphabet known as Perso-Arabic, which is itself a اردو ﺗ ﺠﯽ derivative of the Arabic alphabet. The Urdu alphabet has up to 58 letters.[1] With 39 basic letters and no distinct letter cases, the Urdu alphabet is typically written in the calligraphic Nastaʿlīq script, whereas Arabic is more commonly in the Naskh style. Usually, bare transliterations of Urdu into Roman letters (called Roman Urdu) omit many phonemic elements that have no equivalent in English or other languages commonly written in the Latin script. The National Language Authority of Pakistan has developed a number of systems with specific notations Example of writing in the Urdu to signify non-English sounds, but these can only be properly read by someone alphabet: Urdu already familiar with the loan letters. Type Abjad Languages Urdu, Balti, Burushaski, others Contents Parent Proto-Sinaitic systems History Phoenician Nastaʿlīq Alphabet Aramaic Differences from Persian alphabet Retroflex letters Nabataean Vowels Arabic Vowel chart Alif Perso-Arabic Wāʾo Ye Urdu The 2 he's alphabet Ayn Nun Ghunnah Unicode U+0600 to U+06FF range Hamza U+0750 to U+077F Diacritics U+FB50 to U+FDFF Iẓāfat U+FE70 to U+FEFF Computers and the Urdu alphabet Encoding Urdu in Unicode Software Romanization standards and systems See also References Sources External links History The Urdu language emerged as a distinct register of Hindustani well before the Partition of India. It is distinguished most by its extensive Persian influences (Persian having been the official language of the Mughal government and the most prominent lingua franca of the Indian subcontinent for several centuries before the solidification of British colonial rule during the 19th century). -
[PDF] Beginners Guide to Arabic
An Arabic learning PDF with a crash course on the Arabic alphabet and a 33-page special addendum covering secrets of the Arabic language embedded within its grammar, vocabulary, etymology and everywhere else The Beginner's Guide To Arabic by Mohtanick Jamil ALLOW US TO HELP YOU OUT OF THE BEGINNER PHASE AND LEARN SOME REAL ARABIC. www.LearnArabicOnline.com The Beginner’s Guide to Arabic By Mohtanick Jamil GUIDE TO STUDYING ARABIC 3 WHY STUDY ARABIC 3 HOW TO STUDY ARABIC 6 WHERE TO STUDY ARABIC 9 WHAT YOU NEED BEFORE YOU START 10 THE ARABIC ALPHABET 11 INTRODUCTION TO THE ALPHABET 11 THE LETTERS 14 THE VOWELS 23 SOME BASIC VOCABULARY 26 RESOURCES FOR LEARNING ARABIC 33 ONLINE 33 RECOMMENDED BOOKS 35 JOIN OUR 3-PART MINI-CLASS 37 WHY IS ARABIC HARD? 37 HERE’S WHAT YOU’LL LEARN 38 OUR NEWSLETTERS 41 © Shariah Program Inc. FREE VIDEO SERIES Reveals How to Read Arabic With Understanding in 21 Days! Guide to Studying Arabic Why Study Arabic Arabic is spoken as a mother tongue by between 250 and 400 million people across 25 countries. Over a billion people can read the script even if they can’t understand the language. And Arabic happens to be one of the official languages of the United Nations. Therefore, many people learn the language for formal reasons. At about 1,500 years old, Arabic also happens to be a very old language. It was the language of scholarship throughout the rule of the Islamic empires – a period of well over 1,000 years from the 7th century right down to the 19th and even 20th.