Burmese Language
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Festvox Indic Frontend for Grapheme-To-Phoneme Conversion
The Festvox Indic Frontend for Grapheme-to-Phoneme Conversion Alok Parlikar, Sunayana Sitaram, Andrew Wilkinson and Alan W Black Carnegie Mellon University Pittsburgh, USA aup, ssitaram, aewilkin, [email protected] Abstract Text-to-Speech (TTS) systems convert text into phonetic pronunciations which are then processed by Acoustic Models. TTS frontends typically include text processing, lexical lookup and Grapheme-to-Phoneme (g2p) conversion stages. This paper describes the design and implementation of the Indic frontend, which provides explicit support for many major Indian languages, along with a unified framework with easy extensibility for other Indian languages. The Indic frontend handles many phenomena common to Indian languages such as schwa deletion, contextual nasalization, and voicing. It also handles multi-script synthesis between various Indian-language scripts and English. We describe experiments comparing the quality of TTS systems built using the Indic frontend to grapheme-based systems. While this frontend was designed keeping TTS in mind, it can also be used as a general g2p system for Automatic Speech Recognition. Keywords: speech synthesis, Indian language resources, pronunciation 1. Introduction in models of the spectrum and the prosody. Another prob- lem with this approach is that since each grapheme maps Intelligible and natural-sounding Text-to-Speech to a single “phoneme” in all contexts, this technique does (TTS) systems exist for a number of languages of the world not work well in the case of languages that have pronun- today. However, for low-resource, high-population lan- ciation ambiguities. We refer to this technique as “Raw guages, such as languages of the Indian subcontinent, there Graphemes.” are very few high-quality TTS systems available. -
Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress
1 Assessment of Options for Handling Full Unicode Character Encodings in MARC21 A Study for the Library of Congress Part 1: New Scripts Jack Cain Senior Consultant Trylus Computing, Toronto 1 Purpose This assessment intends to study the issues and make recommendations on the possible expansion of the character set repertoire for bibliographic records in MARC21 format. 1.1 “Encoding Scheme” vs. “Repertoire” An encoding scheme contains codes by which characters are represented in computer memory. These codes are organized according to a certain methodology called an encoding scheme. The list of all characters so encoded is referred to as the “repertoire” of characters in the given encoding schemes. For example, ASCII is one encoding scheme, perhaps the one best known to the average non-technical person in North America. “A”, “B”, & “C” are three characters in the repertoire of this encoding scheme. These three characters are assigned encodings 41, 42 & 43 in ASCII (expressed here in hexadecimal). 1.2 MARC8 "MARC8" is the term commonly used to refer both to the encoding scheme and its repertoire as used in MARC records up to 1998. The ‘8’ refers to the fact that, unlike Unicode which is a multi-byte per character code set, the MARC8 encoding scheme is principally made up of multiple one byte tables in which each character is encoded using a single 8 bit byte. (It also includes the EACC set which actually uses fixed length 3 bytes per character.) (For details on MARC8 and its specifications see: http://www.loc.gov/marc/.) MARC8 was introduced around 1968 and was initially limited to essentially Latin script only. -
Title <Article>Phonology of Burmese Loanwords in Jinghpaw Author(S)
Title <Article>Phonology of Burmese loanwords in Jinghpaw Author(s) KURABE, Keita Citation 京都大学言語学研究 (2016), 35: 91-128 Issue Date 2016-12-31 URL https://doi.org/10.14989/219015 Right © 京都大学言語学研究室 2016 Type Departmental Bulletin Paper Textversion publisher Kyoto University 京都大学言語学研究 (Kyoto University Linguistic Research) 35 (2016), 91 –128 Phonology of Burmese loanwords in Jinghpaw Keita KURABE Abstract: The aim of this paper is to provide a preliminary descriptive account of the phonological properties of Burmese loans in Jinghpaw especially focusing on their segmental phonology. Burmese loan phonology in Jinghpaw is significant in two respects. First, a large portion of Burmese loans, despite the fact that the contact relationship between Burmese and Jinghpaw appears to be of relatively recent ori- gin, retains several phonological properties of Written Burmese that have been lost in the modern language. This fact can be explained in terms of borrowing chains, i.e. Burmese Shan Jinghpaw, where Shan, which has had intensive contact → → with both Burmese and Jinghpaw from the early stages, transferred lexical items of Burmese origin into Jinghpaw. Second, the Jinghpaw lexicon also contains some Burmese loans reflecting the phonology of Modern Burmese. These facts highlight the multistratal nature of Burmese loans in Jinghpaw. A large portion of this paper is devoted to building a lexicon of Burmese loans in Jinghpaw together with loans from other relevant languages whose lexical items entered Jinghpaw through the medium of Burmese.∗ Key words: Burmese, Jinghpaw, Shan, loanwords, contact linguistics 1 Introduction Jinghpaw is a Tibeto-Burman (TB) language spoken primarily in northern Burma (Myan- mar) where, as with other regions of Southeast Asia, intensive contact among speakers ∗ I would like to thank Professor Hideo Sawada and Professor Keisuke Huziwara for their careful reading and helpful suggestions on an earlier draft of this paper. -
Two Letter English Abbreviations
Two Letter English Abbreviations Zealous Andrzej palpate, his shandy forefeeling enrobe wittingly. Lathy Pail sometimes rebuffs any figureheadsmangonel regulated soothsays negatively. unsparingly, Tawdriest he validate Ole dumfounds so counter. lawlessly while Xever always obnubilate his Clearly identify mistakes to be regarded as well as part of style is where as ngo and english letter Past simple or more, germans found what causes confusion with a validation rule that. Another principle is to minimize confusion with other words or abbreviations. This got it obligatory to clearly, two letter word becomes shorter terms may apply to rsvp you add two strokes coming from the second letter. Thank you want to the end with other journals require periods in english equivalent which are two letter english abbreviations: when only one long, embedding an affinity for. Definition of english, just one name, just for market activity from opposite directions in contrast, two letter english abbreviations are. Speaking letters or for bilingual use this list of the. Neither one word has worked with. It is not forget to win me the cow does anyone have you to use the like defusing a currency in. Learning the nuances of English makes it a difficult language But specify that's keep true has many languages There is also two-letter transfer in. When people in some packaging since he pursued with friends in a ad free! If they are so you use details from the client or abbreviation of english letter abbreviations! Abbreviations in the world war ii to language as an acronym abbreviated cu, such as a question, monovalent and scientific research institute of. -
Myanmar Script Learning Guide Burmese 1,2,3 Tone System Is Developed by Naing Tin-Nyunt-Pu Copyright © 2013-2017, Naing Tinnyuntpu & Asia Pearl Travels
Myanmar Script Learning guide (Revision F) by Naing Tinnyuntpu WITH FREE ONLINE AUDIO SUPPORT https://www.asiapearltravels.com/language/myanmar-script-learning-guide.php 1 of 104 Menu ≡ Introduction ........................................................................................4 Vowel: A .............................................................................................6 Vowel: E or I ....................................................................................13 Vowel: U ..........................................................................................20 Vowel: O ..........................................................................................24 Vowel: Ay .........................................................................................28 Vowel: Au ........................................................................................33 Vowel: Un or An ..............................................................................37 Vowel: In ..........................................................................................42 Vowel: Eare ......................................................................................46 Vowel: Ain .......................................................................................51 Vowel: Ome .....................................................................................53 Vowel: Ine ........................................................................................56 Vowel: Oon ......................................................................................59 -
An Introduction to Old Persian Prods Oktor Skjærvø
An Introduction to Old Persian Prods Oktor Skjærvø Copyright © 2016 by Prods Oktor Skjærvø Please do not cite in print without the author’s permission. This Introduction may be distributed freely as a service to teachers and students of Old Iranian. In my experience, it can be taught as a one-term full course at 4 hrs/w. My thanks to all of my students and colleagues, who have actively noted typos, inconsistencies of presentation, etc. TABLE OF CONTENTS Select bibliography ................................................................................................................................... 9 Sigla and Abbreviations ........................................................................................................................... 12 Lesson 1 ..................................................................................................................................................... 13 Old Persian and old Iranian. .................................................................................................................... 13 Script. Origin. .......................................................................................................................................... 14 Script. Writing system. ........................................................................................................................... 14 The syllabary. .......................................................................................................................................... 15 Logograms. ............................................................................................................................................ -
Finite-State Script Normalization and Processing Utilities: the Nisaba Brahmic Library
Finite-state script normalization and processing utilities: The Nisaba Brahmic library Cibu Johny† Lawrence Wolf-Sonkin‡ Alexander Gutkin† Brian Roark‡ Google Research †United Kingdom and ‡United States {cibu,wolfsonkin,agutkin,roark}@google.com Abstract In addition to such normalization issues, some scripts also have well-formedness constraints, i.e., This paper presents an open-source library for efficient low-level processing of ten ma- not all strings of Unicode characters from a single jor South Asian Brahmic scripts. The library script correspond to a valid (i.e., legible) grapheme provides a flexible and extensible framework sequence in the script. Such constraints do not ap- for supporting crucial operations on Brahmic ply in the basic Latin alphabet, where any permuta- scripts, such as NFC, visual normalization, tion of letters can be rendered as a valid string (e.g., reversible transliteration, and validity checks, for use as an acronym). The Brahmic family of implemented in Python within a finite-state scripts, however, including the Devanagari script transducer formalism. We survey some com- mon Brahmic script issues that may adversely used to write Hindi, Marathi and many other South affect the performance of downstream NLP Asian languages, do have such constraints. These tasks, and provide the rationale for finite-state scripts are alphasyllabaries, meaning that they are design and system implementation details. structured around orthographic syllables (aksara)̣ as the basic unit.1 One or more Unicode characters 1 Introduction combine when rendering one of thousands of leg- The Unicode Standard separates the representation ible aksara,̣ but many combinations do not corre- of text from its specific graphical rendering: text spond to any aksara.̣ Given a token in these scripts, is encoded as a sequence of characters, which, at one may want to (a) normalize it to a canonical presentation time are then collectively rendered form; and (b) check whether it is a well-formed into the appropriate sequence of glyphs for display. -
Development of Local Theology of the Chin (Zomi) of the Assemblies of God (Ag) in Myanmar: a Case Study in Contextualization
DEVELOPMENT OF LOCAL THEOLOGY OF THE CHIN (ZOMI) OF THE ASSEMBLIES OF GOD (AG) IN MYANMAR: A CASE STUDY IN CONTEXTUALIZATION BY DENISE ROSS A thesis submitted to The University of Birmingham For the degree of DOCTOR OF PHILOSOPHY School of Philosophy, Theology and Religion THE UNIVERSITY OF BIRMINGHAM January 2019 This thesis is dedicated firstly to my loving parents Albert and Hilda Ross from whom I got the work ethic required to complete this research. Secondly, I dedicate it to the Chin people who were generous in telling me their stories, so I offer this completed research as a reflection for even greater understanding and growth. Acknowledgements This thesis took many years to produce, and I would like to acknowledge and thank everyone who encouraged and supported me throughout the often painful process. I would like to thank Edmond Tang for his tremendous supervision for several years. He challenged me, above all else, to think. I can never acknowledge or thank him sufficiently for the time and sacrifice he has invested. I would like to thank my supervisor Allan Anderson, who has been so patient and supportive throughout the whole process. I would like to acknowledge him as a pillar of Pentecostal research within the University of Birmingham, UK which has made it an international centre of excellence. It was his own research on contextual theology, especially in mission contexts, which inspired this research. I acknowledge the Chin interviewees and former classmates who willingly shared their time and expertise and their spiritual lives with me. They were so grateful that I chose their people group, so I offer this research back to them, in gratitude. -
Copyright and Use of This Thesis This Thesis Must Be Used in Accordance with the Provisions of the Copyright Act 1968
View metadata, citation and similar papers at core.ac.ukbrought to you by CORE provided by Sydney eScholarship COPYRIGHT AND USE OF THIS THESIS This thesis must be used in accordance with the provisions of the Copyright Act 1968. Reproduction of material protected by copyright may be an infringement of copyright and copyright owners may be entitled to take legal action against persons who infringe their copyright. Section 51 (2) of the Copyright Act permits an authorized officer of a university library or archives to provide a copy (by communication or otherwise) of an unpublished thesis kept in the library or archives, to a person who satisfies the authorized officer that he or she requires the reproduction for the purposes of research or study. The Copyright Act grants the creator of a work a number of moral rights, specifically the right of attribution, the right against false attribution and the right of integrity. You may infringe the author’s moral rights if you: - fail to acknowledge the author of this thesis if you quote sections from the work - attribute this thesis to another author - subject this thesis to derogatory treatment which may prejudice the author’s reputation For further information contact the University’s Director of Copyright Services sydney.edu.au/copyright 1 A STUDY OF THE APADĀNA, INCLUDING AN EDITION AND ANNOTATED TRANSLATION OF THE SECOND, THIRD AND FOURTH CHAPTERS CHRIS CLARK A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy Faculty of Arts and Social Sciences University of Sydney May 2015 ii CONTENTS Abstract ................................................................................................................. -
Buddhism and Written Law: Dhammasattha Manuscripts and Texts in Premodern Burma
BUDDHISM AND WRITTEN LAW: DHAMMASATTHA MANUSCRIPTS AND TEXTS IN PREMODERN BURMA A Dissertation Presented to the Faculty of the Graduate School of Cornell University In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy by Dietrich Christian Lammerts May 2010 2010 Dietrich Christian Lammerts BUDDHISM AND WRITTEN LAW: DHAMMASATTHA MANUSCRIPTS AND TEXTS IN PREMODERN BURMA Dietrich Christian Lammerts, Ph.D. Cornell University 2010 This dissertation examines the regional and local histories of dhammasattha, the preeminent Pali, bilingual, and vernacular genre of Buddhist legal literature transmitted in premodern Burma and Southeast Asia. It provides the first critical analysis of the dating, content, form, and function of surviving dhammasattha texts based on a careful study of hitherto unexamined Burmese and Pali manuscripts. It underscores the importance for Buddhist and Southeast Asian Studies of paying careful attention to complex manuscript traditions, multilingual post- and para- canonical literatures, commentarial strategies, and the regional South-Southeast Asian literary, historical, and religious context of the development of local legal and textual practices. Part One traces the genesis of dhammasattha during the first and early second millennia C.E. through inscriptions and literary texts from India, Cambodia, Campå, Java, Lakå, and Burma and investigates its historical and legal-theoretical relationships with the Sanskrit Bråhmaˆical dharmaßåstra tradition and Pali Buddhist literature. It argues that during this period aspects of this genre of written law, akin to other disciplines such as alchemy or medicine, functioned in both Buddhist and Bråhmaˆical contexts, and that this ecumenical legal culture persisted in certain areas such as Burma and Java well into the early modern period. -
An Introduction to Indic Scripts
An Introduction to Indic Scripts Richard Ishida W3C [email protected] HTML version: http://www.w3.org/2002/Talks/09-ri-indic/indic-paper.html PDF version: http://www.w3.org/2002/Talks/09-ri-indic/indic-paper.pdf Introduction This paper provides an introduction to the major Indic scripts used on the Indian mainland. Those addressed in this paper include specifically Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. I have used XHTML encoded in UTF-8 for the base version of this paper. Most of the XHTML file can be viewed if you are running Windows XP with all associated Indic font and rendering support, and the Arial Unicode MS font. For examples that require complex rendering in scripts not yet supported by this configuration, such as Bengali, Oriya, and Malayalam, I have used non- Unicode fonts supplied with Gamma's Unitype. To view all fonts as intended without the above you can view the PDF file whose URL is given above. Although the Indic scripts are often described as similar, there is a large amount of variation at the detailed implementation level. To provide a detailed account of how each Indic script implements particular features on a letter by letter basis would require too much time and space for the task at hand. Nevertheless, despite the detail variations, the basic mechanisms are to a large extent the same, and at the general level there is a great deal of similarity between these scripts. It is certainly possible to structure a discussion of the relevant features along the same lines for each of the scripts in the set. -
Proto Northern Burmic Is Reconstructed on the Basis of Data from Achang, Bela, Lashi
1 CHAPTER 1 INTRODUCTION 1.0 Introduction The basic purpose of this thesis is to provide a reconstruction of Proto Northern Burmic and to describe the phonological relationships between the Northern Burmic languages. This work is relevant as a thorough analysis of this language family has not yet been conducted. This analysis relies primarily on data from six Northern Burmic languages, with the additional resource of Written Burmese. In chapter 1, the background information on the languages under study and the basic approach of the thesis are described, chapter 2 gives a description of the lang-uages. The reconstruction is provided in chapter 3. A description of Proto Northern Burmic and a discussion of the phonological relationships between the Northern Burmic languages is given in chapter 4. Reconstructed vocabulary is entailed in chapter 5. This chapter provides background information for this thesis such as a description of the Northern Burmic peoples, linguistic classification, literature review, historical reconstruction methodology, a brief statement of purpose, as well as sources for the linguistic data. 1.1 Northern Burmic Historical, Cultural and Geographic Background The term Northern Burmic (Shafer 1966) is used in this thesis to refer to the grouping of Achang, Bela, Lashi, Maru, Phon, and Zaiwa. The primary data for this thesis is from speakers in the Kachin State in NE Myanmar (Burma) near the border of Yunnan, China (see section 1.7 for discussion of data). It must be stressed that this is by no means the only area in which these 2 languages are spoken as these language groups straddle the rugged mountain peaks between China and Myanmar.