17.1 Philippine Scripts
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Cross-Language Framework for Word Recognition and Spotting of Indic Scripts
Cross-language Framework for Word Recognition and Spotting of Indic Scripts aAyan Kumar Bhunia, bPartha Pratim Roy*, aAkash Mohta, cUmapada Pal aDept. of ECE, Institute of Engineering & Management, Kolkata, India bDept. of CSE, Indian Institute of Technology Roorkee India cCVPR Unit, Indian Statistical Institute, Kolkata, India bemail: [email protected], TEL: +91-1332-284816 Abstract Handwritten word recognition and spotting of low-resource scripts are difficult as sufficient training data is not available and it is often expensive for collecting data of such scripts. This paper presents a novel cross language platform for handwritten word recognition and spotting for such low-resource scripts where training is performed with a sufficiently large dataset of an available script (considered as source script) and testing is done on other scripts (considered as target script). Training with one source script and testing with another script to have a reasonable result is not easy in handwriting domain due to the complex nature of handwriting variability among scripts. Also it is difficult in mapping between source and target characters when they appear in cursive word images. The proposed Indic cross language framework exploits a large resource of dataset for training and uses it for recognizing and spotting text of other target scripts where sufficient amount of training data is not available. Since, Indic scripts are mostly written in 3 zones, namely, upper, middle and lower, we employ zone-wise character (or component) mapping for efficient learning purpose. The performance of our cross- language framework depends on the extent of similarity between the source and target scripts. -
Transforming Sundanese Script: from Palm Leaf to Digital Typography
Transforming Sundanese Script: From Palm Leaf to Digital Typography PRESENTER’S IDENTITY HERE Name: Agung Zainal M. Raden, Rustopo, Timbul Haryono Affiliation: Program Doktor ISI Surakarta Abstract Code: ABS-ICOLLITE-20026 Transforming Sundanese Script: From Palm Leaf to Digital Typography AGUNG ZAINAL MUTTAKIN RADEN, RUSTOPO, TIMBUL HARYONO ABSTRACT The impact of globalisation is the lost of local culture, transformation is an attempt to offset the global culture. This article will discuss the transformation process from the Sundanese script contained in palm leaf media to the modern Sundanese script in the form of digital typography. The method used is transformation, which can be applied to rediscover the ancient Sundanese script within the new form known as the modern Sundanese script that it is relevant to modern society. Transformation aims to maintain local culture from global cultural domination. This article discovers the way Sundanese people reinvent their identity through the transformation from ancient Sundanese script to modern Sundanese script by designing a new form of script in order to follow the global technological developments. Keywords: Sundanese script, digital typography, transformation, reinventing, globalisation INTRODUCTION Sundanese ancient handwriting in palm leaf manuscripts is one of the cultural heritage that provides rich knowledge about past, recent, and the future of the Sundanese. The Sundanese script is a manifestation of Sundanese artefacts that contain many symbols and values Digitisation is the process of transforming analogue material into binary electronic (digital) form, especially for storage and use in a computer The dataset consists of three type of data: annotation at word level, annotation at character level, and binarised images Unicode is a universal character encoding standard used for representation of text for computer processing METHODS Transforming: Aims to reinvent an old form of tradition so that it fits into and suits contemporary lifestyles. -
Introduction
INTRODUCTION ’ P”(Bha.t.tikāvya) is one of the boldest “B experiments in classical literature. In the formal genre of “great poem” (mahākāvya) it incorprates two of the most powerful Sanskrit traditions, the “Ramáyana” and Pánini’s grammar, and several other minor themes. In this one rich mix of science and art, Bhatti created both a po- etic retelling of the adventures of Rama and a compendium of examples of grammar, metrics, the Prakrit language and rhetoric. As literature, his composition stands comparison with the best of Sanskrit poetry, in particular cantos , and . “Bhatti’s Poem” provides a comprehensive exem- plification of Sanskrit grammar in use and a good introduc- tion to the science (śāstra) of poetics or rhetoric (alamk. āra, lit. ornament). It also gives a taster of the Prakrit language (a major component in every Sanskrit drama) in an easily ac- cessible form. Finally it tells the compelling story of Prince Rama in simple elegant Sanskrit: this is the “Ramáyana” faithfully retold. e learned Indian curriculum in late classical times had at its heart a system of grammatical study and linguistic analysis. e core text for this study was the notoriously difficult “Eight Books” (A.s.tādhyāyī) of Pánini, the sine qua non of learning composed in the fourth century , and arguably the most remarkable and indeed foundational text in the history of linguistics. Not only is the “Eight Books” a description of a language unmatched in totality for any language until the nineteenth century, but it is also pre- sented in the most compact form possible through the use xix of an elaborate and sophisticated metalanguage, again un- known anywhere else in linguistics before modern times. -
Ka И @И Ka M Л @Л Ga Н @Н Ga M М @М Nga О @О Ca П
ISO/IEC JTC1/SC2/WG2 N3319R L2/07-295R 2007-09-11 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation Internationale de Normalisation Международная организация по стандартизации Doc Type: Working Group Document Title: Proposal for encoding the Javanese script in the UCS Source: Michael Everson, SEI (Universal Scripts Project) Status: Individual Contribution Action: For consideration by JTC1/SC2/WG2 and UTC Replaces: N3292 Date: 2007-09-11 1. Introduction. The Javanese script, or aksara Jawa, is used for writing the Javanese language, the native language of one of the peoples of Java, known locally as basa Jawa. It is a descendent of the ancient Brahmi script of India, and so has many similarities with modern scripts of South Asia and Southeast Asia which are also members of that family. The Javanese script is also used for writing Sanskrit, Jawa Kuna (a kind of Sanskritized Javanese), and Kawi, as well as the Sundanese language, also spoken on the island of Java, and the Sasak language, spoken on the island of Lombok. Javanese script was in current use in Java until about 1945; in 1928 Bahasa Indonesia was made the national language of Indonesia and its influence eclipsed that of other languages and their scripts. Traditional Javanese texts are written on palm leaves; books of these bound together are called lontar, a word which derives from ron ‘leaf’ and tal ‘palm’. 2.1. Consonant letters. Consonants have an inherent -a vowel sound. Consonants combine with following consonants in the usual Brahmic fashion: the inherent vowel is “killed” by the PANGKON, and the follow- ing consonant is subjoined or postfixed, often with a change in shape: §£ ndha = § NA + @¿ PANGKON + £ DA-MAHAPRANA; üù n. -
IDENTIMRS Hawaii; *Ilocano
DOCUMENT MUNE ED 206 189 FL 012 493 AUTHOR Opinaldo, Eulanda; And Others TITLE Ilokano Language Program Guide. TNSTITOTION Hawaii State Dept. of Education, Honolulu.Office of Instructional Services. PUB DATE Jun 81 NOTE 163p. LANGUAGE English: Ilocano !DRS PRICE MR01/PC07 Plus Postage. DESCRIPTORS Cultural Education: Educational Objectives; Elementary Secondary Education: Hawaiians; Lesson Plans: Malayo Polynesian Languages: *Second Language Instruction: State Curriculum Guides; *Teaohing Guides: Teaching Methods IDENTIMRS Hawaii; *Ilocano ABSTRACT This guide expresses the philosophy, goals,and objectives, and outlines the scope and sequence ofIlocano language instruction at various levels for the public schoolsof Hawaii. It serves as a resource document for theteacher of Ilocano in that it identifies essential skills, suggests areas ofemphasis, points out possible problem areas, and proposes solutions tothose problems. It also presents a short history of the Ilocanolangua de, its phonology and grammar, selected language teaching andevaluation strategies, and an outline of the curriculum for LevelsI and II. he four appendices include the grammar of Ilocano, asample lesson, useful classroom expressions, and references. (Author/AMH) ***************************,******************************************* * * Reproductions supplied by EDRS are the best that canbe made * * from the original document. *********************************************************************** \*. SPANISH Cn rn "PERMISSION TO REPRODUCE THIS MATERIAL HAS BEEN GRANTED -
SC22/WG20 N896 L2/01-476 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation Internationale De Normalisation
SC22/WG20 N896 L2/01-476 Universal multiple-octet coded character set International organization for standardization Organisation internationale de normalisation Title: Ordering rules for Khmer Source: Kent Karlsson Date: 2001-12-19 Status: Expert contribution Document type: Working group document Action: For consideration by the UTC and JTC 1/SC 22/WG 20 1 Introduction The Khmer script in Unicode/10646 uses conjoining characters, just like the Indic scripts and Hangul. An alternative that is suitable (and much more elegant) for Khmer, but not for Indic scripts, would have been to have combining- below (and sometimes a bit to the side) consonants and combining-below independent vowels, similar to the combining-above Latin letters recently encoded, as well as how Tibetan is handled. However that is not the chosen solution, which instead uses a combining character (COENG) that makes characters conjoin (glue) like it is done for Indic (Brahmic) scripts and like Hangul Jamo, though the latter does not have a separate gluer character. In the Khmer script, using the COENG based approach, the words are formed from orthographic syllables, where an orthographic syllable has the following structure [add ligation control?]: Khmer-syllable ::= (K H)* K M* where K is a Khmer consonant (most with an inherent vowel that is pronounced only if there is no consonant, independent vowel, or dependent vowel following it in the orthographic syllable) or a Khmer independent vowel, H is the invisible Khmer conjoint former COENG, M is a combining character (including COENG, though that would be a misspelling), in particular a combining Khmer vowel (noted A below) or modifier sign. -
A Practical Sanskrit Introductory
A Practical Sanskrit Intro ductory This print le is available from ftpftpnacaczawiknersktintropsjan Preface This course of fteen lessons is intended to lift the Englishsp eaking studentwho knows nothing of Sanskrit to the level where he can intelligently apply Monier DhatuPat ha Williams dictionary and the to the study of the scriptures The rst ve lessons cover the pronunciation of the basic Sanskrit alphab et Devanagar together with its written form in b oth and transliterated Roman ash cards are included as an aid The notes on pronunciation are largely descriptive based on mouth p osition and eort with similar English Received Pronunciation sounds oered where p ossible The next four lessons describ e vowel emb ellishments to the consonants the principles of conjunct consonants Devanagar and additions to and variations in the alphab et Lessons ten and sandhi eleven present in grid form and explain their principles in sound The next three lessons p enetrate MonierWilliams dictionary through its four levels of alphab etical order and suggest strategies for nding dicult words The artha DhatuPat ha last lesson shows the extraction of the from the and the application of this and the dictionary to the study of the scriptures In addition to the primary course the rst eleven lessons include a B section whichintro duces the student to the principles of sentence structure in this fully inected language Six declension paradigms and class conjugation in the present tense are used with a minimal vo cabulary of nineteen words In the B part of -
Proposal for a Gurmukhi Script Root Zone Label Generation Ruleset (LGR)
Proposal for a Gurmukhi Script Root Zone Label Generation Ruleset (LGR) LGR Version: 3.0 Date: 2019-04-22 Document version: 2.7 Authors: Neo-Brahmi Generation Panel [NBGP] 1. General Information/ Overview/ Abstract This document lays down the Label Generation Ruleset for Gurmukhi script. Three main components of the Gurmukhi Script LGR i.e. Code point repertoire, Variants and Whole Label Evaluation Rules have been described in detail here. All these components have been incorporated in a machine-readable format in the accompanying XML file named "proposal-gurmukhi-lgr-22apr19-en.xml". In addition, a document named “gurmukhi-test-labels-22apr19-en.txt” has been provided. It provides a list of labels which can produce variants as laid down in Section 6 of this document and it also provides valid and invalid labels as per the Whole Label Evaluation laid down in Section 7. 2. Script for which the LGR is proposed ISO 15924 Code: Guru ISO 15924 Key N°: 310 ISO 15924 English Name: Gurmukhi Latin transliteration of native script name: gurmukhī Native name of the script: ਗੁਰਮੁਖੀ Maximal Starting Repertoire [MSR] version: 4 1 3. Background on Script and Principal Languages Using It 3.1. The Evolution of the Script Like most of the North Indian writing systems, the Gurmukhi script is a descendant of the Brahmi script. The Proto-Gurmukhi letters evolved through the Gupta script from 4th to 8th century, followed by the Sharda script from 8th century onwards and finally adapted their archaic form in the Devasesha stage of the later Sharda script, dated between the 10th and 14th centuries. -
The Festvox Indic Frontend for Grapheme-To-Phoneme Conversion
The Festvox Indic Frontend for Grapheme-to-Phoneme Conversion Alok Parlikar, Sunayana Sitaram, Andrew Wilkinson and Alan W Black Carnegie Mellon University Pittsburgh, USA aup, ssitaram, aewilkin, [email protected] Abstract Text-to-Speech (TTS) systems convert text into phonetic pronunciations which are then processed by Acoustic Models. TTS frontends typically include text processing, lexical lookup and Grapheme-to-Phoneme (g2p) conversion stages. This paper describes the design and implementation of the Indic frontend, which provides explicit support for many major Indian languages, along with a unified framework with easy extensibility for other Indian languages. The Indic frontend handles many phenomena common to Indian languages such as schwa deletion, contextual nasalization, and voicing. It also handles multi-script synthesis between various Indian-language scripts and English. We describe experiments comparing the quality of TTS systems built using the Indic frontend to grapheme-based systems. While this frontend was designed keeping TTS in mind, it can also be used as a general g2p system for Automatic Speech Recognition. Keywords: speech synthesis, Indian language resources, pronunciation 1. Introduction in models of the spectrum and the prosody. Another prob- lem with this approach is that since each grapheme maps Intelligible and natural-sounding Text-to-Speech to a single “phoneme” in all contexts, this technique does (TTS) systems exist for a number of languages of the world not work well in the case of languages that have pronun- today. However, for low-resource, high-population lan- ciation ambiguities. We refer to this technique as “Raw guages, such as languages of the Indian subcontinent, there Graphemes.” are very few high-quality TTS systems available. -
Devanagari in Luatex
Devanagari in LuaTeX Using OpenType features Ivo Geradts Kai Eigner 1. About us 2. Foreign scripts in TeX, past and present 3. OpenType features 4. OpenType rendering engines 5. ConTeXt engine 6. Example 1: Latin with mark and mkmk 7. Example 2: Arabic 8. Example 3: Devanagari 1. About us TAT Zetwerk offers academic typesetting services Specialties: - critical editions - foreign scripts PlainTeX, for two reasons: - our exotic needs - well documented 2. Foreign scripts in TeX, past and present Past: 8-bit PostScript fonts; e.g. approach for classical greek: - encoding: α = a, β = b, γ = g - ligature table: σ = s, ς = /s, ἄ = a)' - multitude of font-related files: tfm, vf, pfb, enc Present: new TeX-engines XeTeX and LuaTeX can handle OpenType fonts containing Unicode glyph table: - ἄ is unicode position 1F04 3. OpenType features OpenType fonts contain more information than traditional fonts - tfm: bounding boxes, kerning, ligaturen, etc. - OpenType additions: GPOS, GSUB, marks Introduction of XeTeX and LuaTeX removes old limitations: - vocalized Hebrew, Arabic - CJK - Devanagari 4. OpenType rendering engines OpenType fonts require rendering engine, such as: - Uniscribe (Windows) - AAT (OS X); XeTeX on OS X - Graphite (Windows and Linux); XeTeX on all platforms - ConTeXt engine: collection of Lua-scripts attached to various callbacks related to font calls and processing of node list 5. ConTeXt engine - independent of operating system - readable and modifiable - work in progress 6. Example 1: Latin script with mark and mkmk 7. Example 2: Arabic 8. Example 3: Devanagari - Introduction: what is Devanagari? - Devanagari and OpenType - Examples Devanagari Script - Sanskrit (old Indic) - Hindi - Nepali Abugida (consonant–vowel sequences are written as a unit) क = “ka” ◌ क् = “k” क + ् (halant) ◌ क = “ku” क + ु (matra) Mahabharata Syllable structure - Consonants: e.g. -
Kharosthi Manuscripts: a Window on Gandharan Buddhism*
KHAROSTHI MANUSCRIPTS: A WINDOW ON GANDHARAN BUDDHISM* Andrew GLASS INTRODUCTION In the present article I offer a sketch of Gandharan Buddhism in the centuries around the turn of the common era by looking at various kinds of evidence which speak to us across the centuries. In doing so I hope to shed a little light on an important stage in the transmission of Buddhism as it spread from India, through Gandhara and Central Asia to China, Korea, and ultimately Japan. In particular, I will focus on the several collections of Kharo~thi manuscripts most of which are quite new to scholarship, the vast majority of these having been discovered only in the past ten years. I will also take a detailed look at the contents of one of these manuscripts in order to illustrate connections with other text collections in Pali and Chinese. Gandharan Buddhism is itself a large topic, which cannot be adequately described within the scope of the present article. I will therefore confine my observations to the period in which the Kharo~thi script was used as a literary medium, that is, from the time of Asoka in the middle of the third century B.C. until about the third century A.D., which I refer to as the Kharo~thi Period. In addition to looking at the new manuscript materials, other forms of evidence such as inscriptions, art and architecture will be touched upon, as they provide many complementary insights into the Buddhist culture of Gandhara. The travel accounts of the Chinese pilgrims * This article is based on a paper presented at Nagoya University on April 22nd 2004. -
Inclusion and Cultural Preservation for the Ifugao People
421 Journal of Southeast Asian Human Rights, Vol.2 No. 2 December 2018. pp. 421-447 doi: 10.19184/jseahr.v2i2.8232 © University of Jember & Indonesian Consortium for Human Rights Lecturers Inclusion and Cultural Preservation for the Ifugao People Ellisiah U. Jocson Managing Director, OneLife Foundation Inc. (OLFI), M.A.Ed Candidate, University of the Philippines, Diliman Abstract This study seeks to offer insight into the paradox between two ideologies that are currently being promoted in Philippine society and identify the relationship of both towards the indigenous community of the Ifugao in the country. Inclusion is a growing trend in many areas, such as education, business, and development. However, there is ambiguity in terms of educating and promoting inclusion for indigenous groups, particularly in the Philippines. Mandates to promote cultural preservation also present limits to the ability of indigenous people to partake in the cultures of mainstream society. The Ifugao, together with other indigenous tribes in the Philippines, are at a state of disadvantage due to the discrepancies between the rights that they receive relative to the more urbanized areas of the country. The desire to preserve the Ifugao culture and to become inclusive in delivering equal rights and services create divided vantages that seem to present a rift and dilemma deciding which ideology to promulgate. Apart from these imbalances, the stance of the Ifugao regarding this matter is unclear, particularly if they observe and follow a central principle. Given that the notion of inclusion is to accommodate everyone regardless of “race, gender, disability, ethnicity, social class, and religion,” it is highly imperative to provide clarity to this issue and identify what actions to take.