Breathy Nasals and /Nh/ Clusters in Bengali, Hindi, and Marathi Christina M
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
A Practical Sanskrit Introductory
A Practical Sanskrit Intro ductory This print le is available from ftpftpnacaczawiknersktintropsjan Preface This course of fteen lessons is intended to lift the Englishsp eaking studentwho knows nothing of Sanskrit to the level where he can intelligently apply Monier DhatuPat ha Williams dictionary and the to the study of the scriptures The rst ve lessons cover the pronunciation of the basic Sanskrit alphab et Devanagar together with its written form in b oth and transliterated Roman ash cards are included as an aid The notes on pronunciation are largely descriptive based on mouth p osition and eort with similar English Received Pronunciation sounds oered where p ossible The next four lessons describ e vowel emb ellishments to the consonants the principles of conjunct consonants Devanagar and additions to and variations in the alphab et Lessons ten and sandhi eleven present in grid form and explain their principles in sound The next three lessons p enetrate MonierWilliams dictionary through its four levels of alphab etical order and suggest strategies for nding dicult words The artha DhatuPat ha last lesson shows the extraction of the from the and the application of this and the dictionary to the study of the scriptures In addition to the primary course the rst eleven lessons include a B section whichintro duces the student to the principles of sentence structure in this fully inected language Six declension paradigms and class conjugation in the present tense are used with a minimal vo cabulary of nineteen words In the B part of -
Proposal for a Gurmukhi Script Root Zone Label Generation Ruleset (LGR)
Proposal for a Gurmukhi Script Root Zone Label Generation Ruleset (LGR) LGR Version: 3.0 Date: 2019-04-22 Document version: 2.7 Authors: Neo-Brahmi Generation Panel [NBGP] 1. General Information/ Overview/ Abstract This document lays down the Label Generation Ruleset for Gurmukhi script. Three main components of the Gurmukhi Script LGR i.e. Code point repertoire, Variants and Whole Label Evaluation Rules have been described in detail here. All these components have been incorporated in a machine-readable format in the accompanying XML file named "proposal-gurmukhi-lgr-22apr19-en.xml". In addition, a document named “gurmukhi-test-labels-22apr19-en.txt” has been provided. It provides a list of labels which can produce variants as laid down in Section 6 of this document and it also provides valid and invalid labels as per the Whole Label Evaluation laid down in Section 7. 2. Script for which the LGR is proposed ISO 15924 Code: Guru ISO 15924 Key N°: 310 ISO 15924 English Name: Gurmukhi Latin transliteration of native script name: gurmukhī Native name of the script: ਗੁਰਮੁਖੀ Maximal Starting Repertoire [MSR] version: 4 1 3. Background on Script and Principal Languages Using It 3.1. The Evolution of the Script Like most of the North Indian writing systems, the Gurmukhi script is a descendant of the Brahmi script. The Proto-Gurmukhi letters evolved through the Gupta script from 4th to 8th century, followed by the Sharda script from 8th century onwards and finally adapted their archaic form in the Devasesha stage of the later Sharda script, dated between the 10th and 14th centuries. -
Recognition of Online Handwritten Gurmukhi Strokes Using Support Vector Machine a Thesis
Recognition of Online Handwritten Gurmukhi Strokes using Support Vector Machine A Thesis Submitted in partial fulfillment of the requirements for the award of the degree of Master of Technology Submitted by Rahul Agrawal (Roll No. 601003022) Under the supervision of Dr. R. K. Sharma Professor School of Mathematics and Computer Applications Thapar University Patiala School of Mathematics and Computer Applications Thapar University Patiala – 147004 (Punjab), INDIA June 2012 (i) ABSTRACT Pen-based interfaces are becoming more and more popular and play an important role in human-computer interaction. This popularity of such interfaces has created interest of lot of researchers in online handwriting recognition. Online handwriting recognition contains both temporal stroke information and spatial shape information. Online handwriting recognition systems are expected to exhibit better performance than offline handwriting recognition systems. Our research work presented in this thesis is to recognize strokes written in Gurmukhi script using Support Vector Machine (SVM). The system developed here is a writer independent system. First chapter of this thesis report consist of a brief introduction to handwriting recognition system and some basic differences between offline and online handwriting systems. It also includes various issues that one can face during development during online handwriting recognition systems. A brief introduction about Gurmukhi script has also been given in this chapter In the last section detailed literature survey starting from the 1979 has also been given. Second chapter gives detailed information about stroke capturing, preprocessing of stroke and feature extraction. These phases are considered to be backbone of any online handwriting recognition system. Recognition techniques that have been used in this study are discussed in chapter three. -
Tai Lü / ᦺᦑᦟᦹᧉ Tai Lùe Romanization: KNAB 2012
Institute of the Estonian Language KNAB: Place Names Database 2012-10-11 Tai Lü / ᦺᦑᦟᦹᧉ Tai Lùe romanization: KNAB 2012 I. Consonant characters 1 ᦀ ’a 13 ᦌ sa 25 ᦘ pha 37 ᦤ da A 2 ᦁ a 14 ᦍ ya 26 ᦙ ma 38 ᦥ ba A 3 ᦂ k’a 15 ᦎ t’a 27 ᦚ f’a 39 ᦦ kw’a 4 ᦃ kh’a 16 ᦏ th’a 28 ᦛ v’a 40 ᦧ khw’a 5 ᦄ ng’a 17 ᦐ n’a 29 ᦜ l’a 41 ᦨ kwa 6 ᦅ ka 18 ᦑ ta 30 ᦝ fa 42 ᦩ khwa A 7 ᦆ kha 19 ᦒ tha 31 ᦞ va 43 ᦪ sw’a A A 8 ᦇ nga 20 ᦓ na 32 ᦟ la 44 ᦫ swa 9 ᦈ ts’a 21 ᦔ p’a 33 ᦠ h’a 45 ᧞ lae A 10 ᦉ s’a 22 ᦕ ph’a 34 ᦡ d’a 46 ᧟ laew A 11 ᦊ y’a 23 ᦖ m’a 35 ᦢ b’a 12 ᦋ tsa 24 ᦗ pa 36 ᦣ ha A Syllable-final forms of these characters: ᧅ -k, ᧂ -ng, ᧃ -n, ᧄ -m, ᧁ -u, ᧆ -d, ᧇ -b. See also Note D to Table II. II. Vowel characters (ᦀ stands for any consonant character) C 1 ᦀ a 6 ᦀᦴ u 11 ᦀᦹ ue 16 ᦀᦽ oi A 2 ᦰ ( ) 7 ᦵᦀ e 12 ᦵᦀᦲ oe 17 ᦀᦾ awy 3 ᦀᦱ aa 8 ᦶᦀ ae 13 ᦺᦀ ai 18 ᦀᦿ uei 4 ᦀᦲ i 9 ᦷᦀ o 14 ᦀᦻ aai 19 ᦀᧀ oei B D 5 ᦀᦳ ŭ,u 10 ᦀᦸ aw 15 ᦀᦼ ui A Indicates vowel shortness in the following cases: ᦀᦲᦰ ĭ [i], ᦵᦀᦰ ĕ [e], ᦶᦀᦰ ăe [ ∎ ], ᦷᦀᦰ ŏ [o], ᦀᦸᦰ ăw [ ], ᦀᦹᦰ ŭe [ ɯ ], ᦵᦀᦲᦰ ŏe [ ]. -
Aspirated and Unaspirated Voiceless Consonants in Old Tibetan*
LANGUAGE AND LINGUISTICS 8.2:471-493, 2007 2007-0-008-002-000248-1 Aspirated and Unaspirated Voiceless Consonants * in Old Tibetan Nathan W. Hill Harvard University Although Tibetan orthography distinguishes aspirated and unaspirated voiceless consonants, various authors have viewed this distinction as not phonemic. An examination of the unaspirated voiceless initials in the Old Tibetan Inscriptions, together with unaspirated voiceless consonants in several Tibetan dialects confirms that aspiration was either not phonemic in Old Tibetan, or only just emerging as a distinction due to loan words. The data examined also affords evidence for the nature of the phonetic word in Old Tibetan. Key words: Tibetan orthography, aspiration, Old Tibetan 1. Introduction The distribution between voiceless aspirated and voiceless unaspirated stops in Written Tibetan is nearly complementary. This fact has been marshalled in support of the reconstruction of only two stop series (voiced and voiceless) in Proto-Tibeto- Burman. However, in order for Tibetan data to support a two-way manner distinction in Proto-Tibeto-Burman it is necessary to demonstrate that the three-way distinction of voiced, voiceless aspirated, and voiceless unaspirated in Written Tibetan is derivable from an earlier two-way voicing distinction. Those environments in which voiceless aspirated and unaspirated stops are not complimentary must be thoroughly explained. The Tibetan script distinguishes the unaspirated consonant series k, c, t, p, ts from the aspirated consonant series kh, ch, th, ph, tsh. The combination of letters hr and lh are not to be regarded as representing aspiration but rather the voiceless counterparts to r and l respectively (Hahn 1973:434). -
Preaspiration in Modern and Old Mongolian
Jan-Olof Svantesson & Anastasia Karlsson Lund University Preaspiration in Modern and Old Mongolian Background The problem of how to analyse the two contrasting manners of articulation for Mongolian stops and affricates has been debated since Mongolian phonetics first came to be studied. To date, there remains no consensus about the phonetic correlates of the two manners of articulation found in the literature. In phone- mic transcriptions, the two series have often been rendered with symbols for voiceless and voiced consonants. The Cyrillic Mongolian script also treats them in this way. Most Mongolists have used terms such as fortis ~ lenis, strong ~ weak, or tense ~ lax; see the review in our book, The Phonology of Mongolian (Svantesson et al. 2005: 220–221). In that book, we presented data showing that the difference is one of the presence vs. the absence of aspiration. Aspirated consonants are preaspirated in all positions except utterance-initially, while they are postaspirated word-in- itially. Here we will present more data to support this analysis. (An analysis of this kind was already proposed by the Finland-Swedish scholar John Ramstedt in his pioneering 1902 description of Halh Mongolian phonetics.) The stop and affricate phonemes are shown in (1). Minimal pairs show- ing that they contrast are given in Svantesson et al. (2005: 26–27). In phonemic transcription, we write aspirated stops and affricates as /pʰ/, /tʰ/, etc.; although the aspiration sign is written after the consonant, it is intended as a symbol for (pre- or post-) aspiration in general. In close phonetic transcription, we differen- tiate between pre- and postaspiration [ʰt, tʰ]. -
Devan¯Agar¯I for TEX Version 2.17.1
Devanagar¯ ¯ı for TEX Version 2.17.1 Anshuman Pandey 6 March 2019 Contents 1 Introduction 2 2 Project Information 3 3 Producing Devan¯agar¯ıText with TEX 3 3.1 Macros and Font Definition Files . 3 3.2 Text Delimiters . 4 3.3 Example Input Files . 4 4 Input Encoding 4 4.1 Supplemental Notes . 4 5 The Preprocessor 5 5.1 Preprocessor Directives . 7 5.2 Protecting Text from Conversion . 9 5.3 Embedding Roman Text within Devan¯agar¯ıText . 9 5.4 Breaking Pre-Defined Conjuncts . 9 5.5 Supported LATEX Commands . 9 5.6 Using Custom LATEX Commands . 10 6 Devan¯agar¯ıFonts 10 6.1 Bombay-Style Fonts . 11 6.2 Calcutta-Style Fonts . 11 6.3 Nepali-Style Fonts . 11 6.4 Devan¯agar¯ıPen Fonts . 11 6.5 Default Devan¯agar¯ıFont (LATEX Only) . 12 6.6 PostScript Type 1 . 12 7 Special Topics 12 7.1 Delimiter Scope . 12 7.2 Line Spacing . 13 7.3 Hyphenation . 13 7.4 Captions and Date Formats (LATEX only) . 13 7.5 Customizing the date and captions (LATEX only) . 14 7.6 Using dvnAgrF in Sections and References (LATEX only) . 15 7.7 Devan¯agar¯ıand Arabic Numerals . 15 7.8 Devan¯agar¯ıPage Numbers and Other Counters (LATEX only) . 15 1 7.9 Category Codes . 16 8 Using Devan¯agar¯ıin X E LATEXand luaLATEX 16 8.1 Using Hindi with Polyglossia . 17 9 Using Hindi with babel 18 9.1 Installation . 18 9.2 Usage . 18 9.3 Language attributes . 19 9.3.1 Attribute modernhindi . -
An Introduction to Indic Scripts
An Introduction to Indic Scripts Richard Ishida W3C [email protected] HTML version: http://www.w3.org/2002/Talks/09-ri-indic/indic-paper.html PDF version: http://www.w3.org/2002/Talks/09-ri-indic/indic-paper.pdf Introduction This paper provides an introduction to the major Indic scripts used on the Indian mainland. Those addressed in this paper include specifically Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. I have used XHTML encoded in UTF-8 for the base version of this paper. Most of the XHTML file can be viewed if you are running Windows XP with all associated Indic font and rendering support, and the Arial Unicode MS font. For examples that require complex rendering in scripts not yet supported by this configuration, such as Bengali, Oriya, and Malayalam, I have used non- Unicode fonts supplied with Gamma's Unitype. To view all fonts as intended without the above you can view the PDF file whose URL is given above. Although the Indic scripts are often described as similar, there is a large amount of variation at the detailed implementation level. To provide a detailed account of how each Indic script implements particular features on a letter by letter basis would require too much time and space for the task at hand. Nevertheless, despite the detail variations, the basic mechanisms are to a large extent the same, and at the general level there is a great deal of similarity between these scripts. It is certainly possible to structure a discussion of the relevant features along the same lines for each of the scripts in the set. -
Middle Indo-Aryan “Aspirate” Clusters Revisited1
MIDDLE INDO-ARYAN “ASPIRATE” CLUSTERS REVISITED1 Hans Henrich Hock The issue of the fate of Sanskrit clusters with sibilant + stop (whether oral or nasal) and with [h] + sonorant has been revived through a paper by Palaschke & Dressler (1999). Focusing on the developments of Sanskrit sibilant + stop clusters (see e.g. (1)), they propose a two-step process through which these become Middle Indo-Aryan geminated stops with postaspiration. (1) Skt. asti > MIAr. atthi ‘is’ The first step in the development (see (2) below), which they consider a “natural process”, namely a lenition, backgrounding, or weakening, could have resulted in preaspirated stops. However, preaspiration is “prevented […] in terms of system adequacy” by a “paradigmatic prelexical process, within the framework of Natural Phonology [… which] fits the universal scale of naturalness presented inter alia in Hurch (1988: 61–63)”. See example (3). (2) s ≤ h (3) Ch > hC ‘Postaspirated consonants are more likely to be expected than preaspirated ones.’ In support of their account they refer to Hurch (1986: 62) who cites examples of a supposedly similar process in the Spanish of Seville, where coda s becomes h; see (4). According to Palaschke & Dressler (62–63), Hurch assumes that a process of metathesis converts preaspiration to postaspiration. In this context, it is important to emphasize that this process is 1 *Revised version of a paper read at the 2003 South Asian Languages Analysis (SALA) Roundtable at Austin, TX. I am grateful for comments by participants in the Roundtable, especially Elena Bashir and Robert King. Part of the paper was also presented, in revised form, at the 2004 Annual Meeting of the American Oriental Society. -
Tibetan Romanization Table
Tibetan Comment [LRH1]: Transliteration revisions are highlighted below in light-gray or otherwise noted in a comment. ALA-LC romanization of Tibetan letters follows the principles of the Wylie transliteration system, as described by Turrell Wylie (1959). Diacritical marks are used for those letters representing Indic or other non-Tibetan languages, and parallel the use of these marks when transcribing their counterpart letters in Sanskrit. These are applied for the sake of consistency, and to reflect international publishing standards. Accordingly, romanize words of non-Tibetan origin systematically (following this table) in all cases, even though the word may derive from Sanskrit or another language. When Tibetan is written in another script (e.g., ʼPhags-pa) the corresponding letters in that script are also romanized according to this table. Consonants (see Notes 1-3) Vernacular Romanization Vernacular Romanization Vernacular Romanization ka da zha ཀ་ ད་ ཞ་ kha na za ཁ་ ན་ ཟ་ Comment [LH2]: While the current ALA-LC ga pa ’a table stipulates that an apostrophe should be used, ག་ པ་ འ་ this revision proposal recommends that the long- nga pha ya standing defacto LC practice of using an alif (U+02BC) be continued and explicitly stipulated in ང་ ཕ་ ཡ་ the Table. See accompanying Narrative for details. ca ba ra ཅ་ བ་ ར་ cha ma la ཆ་ མ་ ལ་ ja tsa sha ཇ་ ཙ་ ཤ་ nya tsha sa ཉ་ ཚ་ ས་ ta dza ha ཏ་ ཛ་ ཧ་ tha wa a ཐ་ ཝ་ ཨ་ Vowels and Diphthongs (see Notes 4 and 5) ཨི་ i ཨཱི་ ī རྀ་ r̥ ཨུ་ u ཨཱུ་ ū རཱྀ་ r̥̄ ཨེ་ e ཨཻ་ ai ལྀ་ ḷ ཨོ་ o ཨཽ་ au ལཱྀ ḹ ā ཨཱ་ Other Letters or Diacritical Marks Used in Words of Non-Tibetan Origin (see Notes 6 and 7) ṭa gha ḍha ཊ་ གྷ་ ཌྷ་ ṭha jha anusvāra ṃ Comment [LH3]: This letter combination does ཋ་ ཇྷ་ ◌ ཾ not occur in Tibetan texts, and has been deprecated from the Unicode Standard. -
The Gurmukhi Astrolabe of the Maharaja of Patiala
Indian Journal of History of Science, 47.1 (2012) 63-92 THE GURMUKHI ASTROLABE OF THE MAHARAJA OF PATIALA SREERAMULA RAJESWARA SARMA* (Received 14 August 2011; revised 23 January 2012) Even though the telescope was widely in use in India, the production of naked eye observational instruments continued throughout the nineteenth century. In Punjab, in particular, several instrument makers produced astrolabes, celestial globes and other instruments of observation, some inscribed in Sanskrit, some in Persian and yet some others in English. A new feature was the production of instruments with legends in Punjabi language and Gurmukhi script. Three such instruments are known so far. One of these is a splendid astrolabe produced for the Maharaja of Patiala in 1850. This paper offers a full technical description of this Gurmukhi astrolabe and discusses the geographical and astrological data engraved on it. Attention is drawn to the similarities and differences between this astrolabe and the other astrolabes produced in India. Key words: Astrolabe, Astrological tables, Bha– lu– mal, – Geographical gazetteer, Gurmukhi script, Kursi , Maharaja of Patiala, – – Malayendu Su– ri, Rahim Bakhsh, Rete, Rishikes, Shadow squares, Star pointers, Zodiac signs INTRODUCTION For the historian of science and technology in India, the nineteenth century is an interesting period when traditional sciences and technologies continued to exist side by side with modern western science and technology before ultimately giving way to the latter. In the field of astronomical -
Internationalized Domain Names-Punjabi
Draft Policy Document for INTERNATIONALIZED DOMAIN NAMES Language: PUNJABI (PANJABI) 1 RECORD OF CHANGES *A - ADDED M - MODIFIED D - DELETED PAGES A* COMPLIANCE VERSION DATE AFFECTED M TITLE OR BRIEF VERSION OF NUMBER D DESCRIPTION MAIN POLICY DOCUMENT 1.0 20/11/09 Whole M Language Specific 1.5 Document Policy Document for PANJABI 1.1 22/12/2010 Page No. 5, 6, M Remove Explicit 1.6 7, 20 Halant from the consonant syllable, ccTLD added 1.2 05/08/2013 Whole A,M Modified as per Document IDNA 2008, Restriction rules added and modified. 1.3 21/11/2014 Page No 9 A Modified rule related to Addak 2 Table of Contents 1. AUGMENTED BACKUS-NAUR FORMALISM (ABNF) ..................................... 4 1.1 Declaration of variables ............................................................................................ 4 1.2 ABNF Operators ....................................................................................................... 4 1.3 The Vowel Sequence ................................................................................................ 5 1.4 The Consonant Sequence .......................................................................................... 5 1.5 Sequence ................................................................................................................... 7 1.6 ABNF Applied to the Panjabi IDN ........................................................................... 7 2. RESTRICTION RULES ......................................................................................... 10 3. EXAMPLES