Brahmi-Net: a Transliteration and Script Conversion System for Languages of the Indian Subcontinent

Anoop Kunchukuttan∗ (IIT Bombay), Ratish Puduppully∗† (IIIT Hyderabad, ratish.surendran@research.iiit.ac.in), Pushpak Bhattacharyya (IIT Bombay)

∗ These authors contributed equally to this project.
† Work done while the author was at IIT Bombay.

Abstract

We present Brahmi-Net, an online system for transliteration and script conversion for all major Indian language pairs (306 pairs). The system covers 13 Indo-Aryan languages, 4 Dravidian languages and English. For training the transliteration systems, we mined parallel transliteration corpora from parallel translation corpora using an unsupervised method and trained statistical transliteration systems using the mined corpora. Languages which do not have parallel corpora are supported by transliteration through a bridge language. Our script conversion system supports conversion between all Brahmi-derived scripts as well as the ITRANS romanization scheme. For this, we leverage co-ordinated Unicode ranges between Indic scripts and use an extended ITRANS encoding for transliterating between English and Indic scripts. The system also provides top-k transliterations and simultaneous transliteration into multiple output languages. We provide a Python as well as a REST API to access these services. The API and the mined transliteration corpus are made available for research use under an open source license.

1 Introduction

The Indian subcontinent is home to some of the most widely spoken languages of the world. It is unique in terms of the large number of scripts used for writing these languages. Most of these are abugida scripts derived from the Brahmi script. Brahmi is one of the oldest writing systems of the Indian subcontinent and can be dated to at least the 3rd century B.C.E. In addition, Arabic-derived and Roman scripts are also used for some languages. Given the diversity of languages and scripts, transliteration and script conversion are extremely important to enable effective communication.

The goal of script conversion is to represent the source script accurately in the target script, without loss of phonetic information. It is useful for reading manuscripts, signboards, etc. accurately. It can serve as a useful tool for linguists, NLP researchers, etc. whose research is multilingual in nature. Script conversion enables reading text written in foreign scripts accurately in a user's native script. On the other hand, transliteration aims to conform to the phonology of the target language, while being close to the source language phonetics. Transliteration is needed for phonetic input systems, cross-lingual information retrieval, question-answering, machine translation and other cross-lingual applications.

Brahmi-Net is a general purpose transliteration and script conversion system that aims to provide solutions for South Asian scripts and languages. While transliteration and script conversion are challenging given the scale and diversity, we leverage the commonality in the phonetics and the scriptural systems of these languages. The major features of Brahmi-Net are:

1. It supports 18 languages and 306 language pairs for statistical transliteration. The supported languages cover 13 Indo-Aryan languages (Assamese, Bengali, Gujarati, Hindi, Konkani, Marathi, Nepali, Odia, Punjabi, Sanskrit, Sindhi, Sinhala, Urdu), 4 Dravidian languages (Kannada, Malayalam, Tamil, Telugu) and English. To the best of our knowledge, no other system covers as many languages and scripts.

2. It supports script conversion among the following 10 scripts used by major Indo-Aryan and Dravidian languages: Bengali, Gujarati, Kannada, Malayalam, Odia, Punjabi, Devanagari, Sinhala, Tamil and Telugu. Some of these scripts are used for writing multiple languages. Devanagari is used for writing Hindi, Sanskrit, Marathi, Nepali, Konkani and Sindhi. The Bengali script is also used for writing Assamese. Also, Sanskrit has historically been written in many of the above-mentioned scripts.

3. The system also supports an extended ITRANS transliteration scheme for romanization of the Indic scripts.

4. The transliteration and script conversion systems are accessible via an online portal. Some additional features include the ability to simultaneously view transliterations to all available languages and the top-k best transliterations.

5. An Application Programming Interface (API) is available as a Python package and a REST interface for easy integration of the transliteration and script conversion systems into other applications requiring transliteration services.

6. As part of the project, parallel transliteration corpora have been mined for transliteration between 110 language pairs for the following 11 languages: Bengali, Gujarati, Hindi, Konkani, Marathi, Punjabi, Urdu, Malayalam, Tamil, Telugu and English. The parallel transliteration corpora comprise 1,694,576 word pairs across all language pairs, which is roughly 15,000 mined pairs per language pair.

2 Script Conversion

Our script conversion engine contains two rule-based systems: one for script conversion amongst scripts of the Brahmi family, and the other for romanization of Brahmi scripts.

2.1 Among scripts of the Brahmi family

Each Brahmi-derived Indian language script has been allocated a distinct codepoint range in the Unicode standard. These scripts have a similar character inventory, but different glyphs. Hence, the first 85 characters in each Unicode block are in the same order and position, on a script by script basis. Our script conversion method simply maps the codepoints between the two scripts.

The Tamil script is different from other scripts since it uses the characters for unvoiced, unaspirated plosives for representing voiced and/or aspirated plosives. When converting into the Tamil script, we substitute all voiced and/or aspirated plosives by the corresponding unvoiced, unaspirated plosive in the Tamil script. For Sinhala, we do an explicit mapping between the characters since the Unicode ranges are not coordinated.

This simple script conversion scheme accounts for a vast majority of the characters. However, there are some characters which do not have equivalents in other scripts, an issue we have not addressed so far. For instance, the Dravidian scripts do not have the nukta character.

2.2 Between a Roman transliteration scheme and scripts from the Brahmi family

We chose ITRANS¹ as our transliteration scheme since: (i) it can be entered using Roman keyboard characters, (ii) the Roman character mappings map to Indic script characters in a phonetically intuitive fashion. The official ITRANS specification is limited to the Devanagari script. We have added a few extensions to account for some characters found only in non-Devanagari scripts. Our extended encoding is backward compatible with ITRANS. We convert Devanagari to ITRANS using Alan Little's Python module². For romanization of other scripts, we use Devanagari as a pivot script and use the inter-Brahmi script converter mentioned in Section 2.1.

¹ http://www.aczoom.com/itrans/
² http://www.alanlittle.org/projects/transliterator/transliterator.html

3 Transliteration

Though Indian language scripts are phonetic and largely unambiguous, script conversion is not a substitute for transliteration, which needs to account for the target language phonology and orthographic conventions. The main challenges that machine transliteration systems encounter are: script specifications, missing sounds, transliteration variants, language of origin, etc. (Karimi et al., 2011). A summary of the challenges specific to Indian languages is described by Antony, P. J. and Soman, K.P. (2011).

3.1 Transliteration Mining

Statistical transliteration can address these challenges by learning transliteration divergences from a parallel transliteration corpus. For most Indian language pairs, parallel transliteration corpora are not publicly available. Hence, we mine transliteration corpora for 110 language pairs from the ILCI corpus, a parallel translation corpus of 11 Indian languages (Jha, 2012). Transliteration pairs are mined using the unsupervised approach proposed by Sajjad et al. (2012) and implemented in the Moses SMT system (Durrani et al., 2014). Their approach models parallel translation corpus generation as a generative process comprising an interpolation of a transliteration and a non-transliteration process. The parameters of

[...] system is trained on them.

We used a hybrid approach for transliteration involving languages for which we could not mine a parallel transliteration corpus. Source languages which cannot be statistically transliterated are first transliterated into a phonetically close language (bridge language) using the above-mentioned rule-based system. The bridge language is then transliterated into the target language using statistical transliteration. Similarly, for target languages which cannot be statistically transliterated, the source is first statistically transliterated into a phonetically close language, followed by rule-based transliteration into the target language.

4 Brahmi-Net Interface

Brahmi-Net is accessible via a web interface as well as an API. We describe these interfaces in this section.

4.1 Web Interface

The purpose of the Web interface is to allow users quick access to transliteration and script conversion services. They can also choose to see the translitera-