Document Image Analysis and Text Recognition on Khmer Historical Manuscripts

Total Page:16

File Type:pdf, Size:1020Kb

Document Image Analysis and Text Recognition on Khmer Historical Manuscripts Document Image Analysis and Text Recognition on Khmer Historical Manuscripts A thesis submitted in partial fulfillment of the requirements for the degree of Docteur en sciences de l’ingénieur et technologie, Université catholique de Louvain January 2020 DONA VALY Thesis committee: Prof. Michel Verleysen (UCLouvain), supervisor Prof. Jean-Pierre Raskin (UCLouvain), chairperson Prof. John Lee (UCLouvain), secretary Prof. Bernard Gosselin (UMons, Belgium) Dr. Sophea Chhun (ITC, Cambodia) List of Figures Figure 2.1: A collection of Khmer palm leaf manuscripts ................... 8 Figure 2.2: Palm leaves are being placed to dry ................................... 9 Figure 2.3: A black mixture is used to make the engraved text easy to read ..................................................................................................... 10 Figure 2.4: Several types of deformations and defects found in palm leaf manuscripts .................................................................................. 13 Figure 2.5: Example of similarity between certain Khmer characters 14 Figure 2.6: Examples of double-decker and triple-decker clusters of Khmer consonants .............................................................................. 15 Figure 2.7: Some examples of ligatures between certain letters which form new shapes ................................................................................. 16 Figure 2.8: Irregular sequential ordering of symbols in a word. The different order of the position of the characters in the text image (top) and the Unicode sequence of the text (bottom) .................................. 17 Figure 2.9: Simulation of how the word “TABLE” could have been written in Khmer style (vowel ‘A’ composed of two parts which are placed on top and on the left side of ‘T’, consonant ‘L’ written as a sub-from under ‘B’, and vowel ‘E’ placed under ‘B’ and the sub-f . 17 Figure 3.1: (a) Sigmoid, (b) Tanh, (c) ReLU, (d) Leaky ReLU ......... 28 Figure 3.2: Architecture of a simple CNN ......................................... 31 Figure 3.3: Architecture of a simple RNN (a) rolled, (b) unrolled .... 33 Figure 3.4: Architecture of an LSTM cell .......................................... 35 Figure 3.5: Architecture of a simple Bidirectional-RNN ................... 37 Figure 3.6: Axes representing the 4 possible directions in 2D-RNN . 39 Figure 4.1: Samples of digitized images from top to bottom: EFEO, BI, and NL .......................................................................................... 45 Figure 4.2: A set up for our digitization campaign ............................ 46 i Figure 4.3: Sample images of our digitization campaign (from top to bottom: Phnom Penh, Kandal, and Siem Reap) ................................. 46 Figure 4.4: Annotation of individual character dataset ...................... 48 Figure 4.5: Examples of (a) characters containing multiple parts and (b) merged shapes ............................................................................... 49 Figure 4.6: Application of inpainting technique on a character image patch (a) input image, (b) inpainting mask using polygon boundary, (c) result .............................................................................................. 49 Figure 4.7: Order sequence of characters in a word ........................... 50 Figure 4.8: Annotation of word dataset .............................................. 50 Figure 4.9: (a) Sample Khmer text, (b) Word separation, (c) Sub- syllable separation .............................................................................. 52 Figure 4.10: Construction of line segmentation ground truth ............ 53 Figure 4.11: Sample of an xml file storing annotation information of a manuscript page .................................................................................. 54 Figure 4.12: Samples of annotated character patch images ............... 55 Figure 4.13: Samples of annotated word patch images ...................... 55 Figure 4.14: Balinese palm leaf manuscripts ..................................... 57 Figure 4.15: Sundanese palm leaf manuscript .................................... 57 Figure 5.3: Khmer manuscript with binarized ground truth image .... 61 Figure 5.3: Balinese manuscript with binarized ground truth image . 61 Figure 5.3: Sundanese manuscript with binarized ground truth image ............................................................................................................ 61 Figure 5.4: Binarization of Khmer manuscript with ICFHR G1 method ............................................................................................................ 65 Figure 5.5: Binarization of Balinese manuscript with ICFHR G2 method ................................................................................................ 65 Figure 5.6: Binarization of Sundanese manuscript with Niblack’s method ................................................................................................ 65 Figure 5.7: Overview of the proposed line segmentation pipeline ..... 66 ii Figure 5.8: (a) Original Image, (b) Edge map using Canny edge detection, (c) Stroke map .................................................................... 68 Figure 5.9: (a) Smooth projection profiles and the estimated medial points (red dots), (b) Adaption of the medial points to surrounding connected components (black dots) ................................................... 71 Figure 5.10: An example of an optimal path going from the start state 1 to the goal state ........................................................................ 74 Figure 5.11: Segmentation results of the proposed approach (pairs of whole segmented manuscript page and zoomed out area with medial seams marked in red) .......................................................................... 76 Figure 6.1: Overview of the workflow of the text recognition module ............................................................................................................ 79 Figure 6.2: CNN based network ......................................................... 81 Figure 6.3: Column wise LSTM network .......................................... 81 Figure 6.4: Column wise and row wise LSTM .................................. 81 Figure 6.5: A combination of convolutional and recurrent neural network ............................................................................................... 82 Figure 6.6: (a). Original image, (b). Gray scaled and resized, (c). Normalized ......................................................................................... 83 Figure 6.7: Sample images of Balinese characters ............................. 84 Figure 6.8: Sample images of Sundanese characters ......................... 84 Figure 6.9: (a). Original word image patch, (b). Annotated character information in the word: polygon boundaries of all characters, (c). Glyph-class map ................................................................................. 87 Figure 6.10: General architecture of the networks in the first trial .... 88 Figure 6.11: Architecture of the 1D-LSTM layer (Trial-Net1) .......... 89 Figure 6.12: Architecture of the 2D-LSTM layer (Trial-Net2) .......... 89 Figure 6.13: (a). Initial sample order, (b). Sort by the width of each sample, (c). Pad each sample to the maximum width in the batch, (d). Shuffle batch order ............................................................................. 90 iii Figure 7.6: (a). Original word image, (b). Ground truth GCM, (c). Result predicted by the Trial-Net1, (d). Result predicted by the Trial- Net2 .................................................................................................... 91 Figure 7.1: Overview of the architecture of proposed word recognition model .................................................................................................. 94 Figure 7.2: Detailed architecture of the GCM decoder ...................... 98 Figure 7.3: Overview of the architecture of the SubSyl-Net ............ 102 Figure 7.4: Modified version of GCM, (a) Raw image patch I, (b) Map I' containing polygon boundaries of all glyphs, (c) Downsampling I' by applying nearest neighbor interpolation to obtain the new GCM 103 Figure 7.5: Detailed architecture of the inception layer. Values in the parenthesis are the numbers of filters (corresponding respectively to the first and the second inception blocks in the SubSyl-Net ............ 104 Figure 7.7: Sample results from the SubSyl-Net showing the predicted GCM and the attention map at each time step (the region highlighted in red is where the decoder attends to) ............................................. 117 iv List of Tables Table 4.1: Number of digital Image collections available in various establishments .................................................................................... 44 Table 4.2: Collection of digitized palm leaf manuscripts from our digitization campaign ......................................................................... 45 Table 4.3: Collection of palm leaf manuscript images from different sources composing SleukRith Set ...................................................... 47 Table 4.4 Summary of the statistics of the SleukRith Set .................. 54 Table 5.1: Palm leaf manuscript datasets for binarization task .......... 60 Table 5.2: Experimental results for the binarization task ................... 62 Table 5.3 Result of the performance evaluation of line segmentation methods
Recommended publications
  • Transforming Sundanese Script: from Palm Leaf to Digital Typography
    Transforming Sundanese Script: From Palm Leaf to Digital Typography PRESENTER’S IDENTITY HERE Name: Agung Zainal M. Raden, Rustopo, Timbul Haryono Affiliation: Program Doktor ISI Surakarta Abstract Code: ABS-ICOLLITE-20026 Transforming Sundanese Script: From Palm Leaf to Digital Typography AGUNG ZAINAL MUTTAKIN RADEN, RUSTOPO, TIMBUL HARYONO ABSTRACT The impact of globalisation is the lost of local culture, transformation is an attempt to offset the global culture. This article will discuss the transformation process from the Sundanese script contained in palm leaf media to the modern Sundanese script in the form of digital typography. The method used is transformation, which can be applied to rediscover the ancient Sundanese script within the new form known as the modern Sundanese script that it is relevant to modern society. Transformation aims to maintain local culture from global cultural domination. This article discovers the way Sundanese people reinvent their identity through the transformation from ancient Sundanese script to modern Sundanese script by designing a new form of script in order to follow the global technological developments. Keywords: Sundanese script, digital typography, transformation, reinventing, globalisation INTRODUCTION Sundanese ancient handwriting in palm leaf manuscripts is one of the cultural heritage that provides rich knowledge about past, recent, and the future of the Sundanese. The Sundanese script is a manifestation of Sundanese artefacts that contain many symbols and values Digitisation is the process of transforming analogue material into binary electronic (digital) form, especially for storage and use in a computer The dataset consists of three type of data: annotation at word level, annotation at character level, and binarised images Unicode is a universal character encoding standard used for representation of text for computer processing METHODS Transforming: Aims to reinvent an old form of tradition so that it fits into and suits contemporary lifestyles.
    [Show full text]
  • SC22/WG20 N896 L2/01-476 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation Internationale De Normalisation
    SC22/WG20 N896 L2/01-476 Universal multiple-octet coded character set International organization for standardization Organisation internationale de normalisation Title: Ordering rules for Khmer Source: Kent Karlsson Date: 2001-12-19 Status: Expert contribution Document type: Working group document Action: For consideration by the UTC and JTC 1/SC 22/WG 20 1 Introduction The Khmer script in Unicode/10646 uses conjoining characters, just like the Indic scripts and Hangul. An alternative that is suitable (and much more elegant) for Khmer, but not for Indic scripts, would have been to have combining- below (and sometimes a bit to the side) consonants and combining-below independent vowels, similar to the combining-above Latin letters recently encoded, as well as how Tibetan is handled. However that is not the chosen solution, which instead uses a combining character (COENG) that makes characters conjoin (glue) like it is done for Indic (Brahmic) scripts and like Hangul Jamo, though the latter does not have a separate gluer character. In the Khmer script, using the COENG based approach, the words are formed from orthographic syllables, where an orthographic syllable has the following structure [add ligation control?]: Khmer-syllable ::= (K H)* K M* where K is a Khmer consonant (most with an inherent vowel that is pronounced only if there is no consonant, independent vowel, or dependent vowel following it in the orthographic syllable) or a Khmer independent vowel, H is the invisible Khmer conjoint former COENG, M is a combining character (including COENG, though that would be a misspelling), in particular a combining Khmer vowel (noted A below) or modifier sign.
    [Show full text]
  • The Development of Iqra' Lampung Script Teaching Materials for Primary School Levels in Bandar Lampung City
    INTERNATIONAL JOURNAL OF EDUCATIONAL STUDIES IN SOCIAL SCIENCES 2021, VOL. 1, NO. 1, 38 – 43 OPEN ACCESS The development of Iqra' Lampung script teaching materials for primary school levels in Bandar Lampung city Ali Imron1 1Department of History Education, University of Lampung, Bandar Lampung, Indonesia ABSTRACT KEYWORDS The Lampung Provincial Government requires all primary and secondary schools to Language; Lampung script: provide Lampung language lessons and introduction to Lampung script. Some learning; teaching material; students only recognize the Lampung Khaganga script, which amounts to 20 primary school characters, while the combination of the main character and 12 characters that produces 560 syllables of the Lampung script is not yet fully understood. In fact, knowledge of Lampung script is still lacking, even some students do not understand Lampung script, it can be said that learning Lampung script at elementary school level has not been successful. This could be due to a lack of modules (teaching materials) that support learning. The research aims to develop Lampung language teaching materials in Lampung script material in the form of Iqra' Lampung script as teaching material for elementary school level. This study adopted a research and development (R&D) method proposed by Gall, Borg & Gall (2007). The work step procedure includes: 1) preliminary study, 2) product development, 3) product validation test and revision. Based on the research that has been done, it can be concluded that the teaching material (module) is feasible because the average value is 4.588 from the teacher validator and 4.446 from the lecturer validator. If it is based on a scale of 5 it can be concluded that the Iqra' Lampung script teaching material has a very good value of teachers and good grades from expert lecturers in terms of material aspects, language, presentation, the effect of teaching materials on learning strategies, and overall appearance feasibility.
    [Show full text]
  • Development Production Line the Short Story
    Development Production Line The Short Story Jene Jasper Copyright © 2007-2018 freedumbytes.dev.net (Free Dumb Bytes) Published 3 July 2018 4.0-beta Edition While every precaution has been taken in the preparation of this installation manual, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To get an idea of the Development Production Line take a look at the following Application Integration overview and Maven vs SonarQube Quality Assurance reports comparison. 1. Operating System ......................................................................................................... 1 1.1. Windows ........................................................................................................... 1 1.1.1. Resources ................................................................................................ 1 1.1.2. Desktop .................................................................................................. 1 1.1.3. Explorer .................................................................................................. 1 1.1.4. Windows 7 Start Menu ................................................................................ 2 1.1.5. Task Manager replacement ........................................................................... 3 1.1.6. Resource Monitor .....................................................................................
    [Show full text]
  • Khmer Phonetics & Phonology: Theoretical Implications for ESL Instruction
    Running Head: KHMER PHONETICS AND PHONOLOGY 1 Khmer Phonetics & Phonology: Theoretical Implications for ESL Instruction Alex Donley A Senior Thesis submitted in partial fulfillment of the requirements for graduation in the Honors Program Liberty University Spring 2020 KHMER PHONETICS AND PHONOLOGY 2 Acceptance of Senior Honors Thesis This Senior Honors Thesis is accepted in partial fulfillment of the requirements for graduation from the Honors Program of Liberty University. ______________________________ Jaeshil Kim, Ph.D. Thesis Chair ______________________________ Stephanie Blankenship, Ed.D. Committee Member ______________________________ David Schweitzer, Ph.D. Assistant Honors Director ______________________________ Date KHMER PHONETICS AND PHONOLOGY 3 Abstract This thesis develops an approach to English teaching for Khmer-speaking students that centers on Khmer phonetics and phonology. Cambodia has a strong demand for English instruction, but consistently underperforms next to other nations in terms of proficiency. A significant reason for Cambodia’s skill gap is the lack of research into linguistic hurdles Khmer speakers face when learning English. This paper aims to bridge Khmer and English with an understanding of the speech systems that both languages use before turning to the unique challenges Khmer speakers must overcome based on the tenets of L1 Transfer Theory. It closes by outlining strategies for English teachers to build the comprehensibility and confidence of their Khmer-speaking students. Keywords: Khmer, English, phonetics, phonology, transfer, ESL KHMER PHONETICS AND PHONOLOGY 4 Khmer Phonetics and Phonology: Theoretical Implications for ESL Instruction Introduction This thesis develops an approach to English teaching for Khmer-speaking students that is grounded in a thorough understanding of Khmer phonetics and phonology.
    [Show full text]
  • Khmer Romanization Table
    KHMER Consonants ʻAksar Mul script ʻAksar Mul script Full Full Full Full Form Subscript form Subscript Romanization Form Subscript form Subscript Romanization ក ◌ក ក ◌្ក k ទ ◌ទ ទ ◌្ទ d ខ ◌ខ ខ ◌្ខ kh ធ ◌ធ ធ ◌្ធ dh គ ◌គ គ ◌្គ g ន ◌ន ន ◌្ន n ឃ ◌ឃ ឃ ◌ឃ gh ប ◌ប ប ◌ប p ង ◌ង ង ◌្ង ng ផ ◌ផ ផ ◌្ផ ph ច ◌ច ច ◌្ច c ព ◌ព ព ◌្ព b ឆ ◌ឆ ឆ ◌្ឆ ch ភ ◌ភ ភ ◌្ភ bh ជ ◌ជ ជ ◌្ជ j ម ◌ម ម ◌្ម m ឈ ◌ឈ ឈ ◌ឈ jh យ ◌យ យ ◌យ y ញ ◌ញ or ញ ញ ◌្ញ or ញ ñ រ ្រ◌ រ ្រ◌ r ដ ◌្ត ដ ◌្ត ṭ ល ◌្ល ល ◌្ល l ឋ ◌្ឋ ឋ ◌្ឋ ṭh វ ◌្វ វ ◌្វ v ឌ ◌្ឌ ឌ ◌្ឌ ḍ ឝ ◌គ ឝ ◌្គ ś * ឍ ◌ ឍ ◌ឍ ḍh ឞ ◌ ឞ ◌្ឞ s*̣ ណ ◌្ណ ណ ◌្ណ ṇ ស ◌ ស ◌ស s ត ◌្ត ត ◌្ត t ហ ◌្ហ ហ ◌្ហ h ḷ (l with ថ ◌ថ ថ ◌្ថ th ឡ - ឡ - dot below) ‛ ʹ (ayn + អ ◌្អ អ ◌្អ soft sign) * Not used since the mid-17th century and is mainly used for Pali and Sanskrit transliteration. Vowels Independent Romanization Independent Romanization ឥ i ឭ ḷ ឦ ī ឮ ḹ ឧ u ឯ ae ឩ ū ឰ ai ឪ ýu ឱ o ឫ ṛ ឳ au ឬ ṝ Dependent Romanization Dependent Romanization ◌◌ ʹaʹ ែ◌ ʹae ◌ា ʹā ៃ◌ ʹai ◌ិ ʹi េ◌ ʹo ◌ី ʹī េ◌ ʹau ◌ឹ ʹẏ ◌ុ◌ំ ʹuṃ ȳ ṃ ◌ឺ ʹ ◌ំ ʹa ◌ុ ʹu ◌ា◌ំ ʹāṃ ◌ូ ʹū ◌ះ ʹaḥ ◌ួ ʹua ◌ិ◌ះ ʹiḥ ʹẏ េ◌ើ ʹoe ◌ឹ◌ះ ḥ ẏ ʹu េ◌ឿ ʹ a ◌ុ◌ះ ḥ េ◌ៀ ʹia េ◌◌ះ ʹeḥ េ◌ ʹe េ◌◌ះ ʹoaḥ Diacritical marks Vernacular Alternative Romanization ◌៉ ◌ុ ″ (hard sign) ◌៊ ◌ុ ′ (soft sign (prime)) ◌៌ r ◌៍ ̊ (circle above) (see Note 5) ◌៎ ’ (alif) ◌៏ ʻ (ayn) ◌៱ ˙ (dot above) (see Note 5) ◌័◌ ă (breve) ◌ៈ à (combining grave accent) ◌◌់ á (combining acute accent) ◌ា◌់ â (modified letter circumflex) Notes 1.
    [Show full text]
  • Iso/Iec Jtc1/Sc2/Wg2 N3022 L2/06-002
    ISO/IEC JTC1/SC2/WG2 N3022 L2/06-002 2006-01-09 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation Internationale de Normalisation Международная организация по стандартизации Doc Type: Working Group Document Title: Proposal for encoding the Sundanese script in the BMP of the UCS Source: Michael Everson Status: Individual Contribution Action: For consideration by JTC1/SC2/WG2 and UTC Date: 2006-01-09 The Sundanese script, or aksara Sunda, is used for writing the Sundanese language, one of the languages of the island of Java in Indonesia. It is a descendent of the ancient Brahmi script of India, and so has many similarities with modern scripts of South Asia and Southeast Asia which are also members of that family. The script has official support and is taught in schools and is used on road signs. The SIL Ethnologue indicates that there are 27,000,000 Sundanese. Sundanese has been written in a number of scripts. Pallawa or Pra-Nagari was first used in West Java to write Sanskrit from the fifth to eighth centuries CE, and from Pallawa was derived Sunda Kuna or Old Sundanese which was used in the Sunda Kingdom from the 14th to 18th centuries. Both Javanese and Arabic script were used from the 17th to 19th centuries and the 17th to the mid-20th centuries respectively. Latin script has had currency since the 20th century. The modern Sundanese script, called Sunda Baku or Official Sundanese, was made official in 1996. The modern script itself was derived from Old Sundanese, the earliest example of which is the Prasasti Kawali stone (see Figure 1).
    [Show full text]
  • Incipient Tonogenesis in Phnom Penh Khmer: Acoustic and Perceptual Studies
    Incipient tonogenesis in Phnom Penh Khmer: Acoustic and perceptual studies James P. Kirby School of Philosophy, Psychology and Language Sciences, The University of Edinburgh, Dugald Stewart Building, 3 Charles Street, Edinburgh, EH8 9AD, Scotland (U.K.) Abstract Unlike many languages of Southeast Asia, Khmer (Cambodian) is not a tone lan- guage. However, in the colloquial speech of the capital Phnom Penh, /r/ is lost in onsets, reportedly supplanted by a range of other acoustic cues such as aspira- tion, a falling- or low-rising f0 contour, breathy voice quality, and in some cases diphthongization, e.g. /krA:/ ‘poor’ > [k`OA], [khˇOA],[kOA], /kru:/ ‘teacher’ > [k`u:], ¨¨ [khˇu:], [ku:]. This paper presents the results of production and perception studies ¨ designed to shed light on this unusual sound change. Acoustic evidence shows that colloquial /CrV/ forms differ from reading pronunciation forms in terms of VOT, f0, and spectral balance measures, while a pair of perceptual studies demon- strate that f0 is a sufficient cue for listeners to distinguish underlying /CrV/-initial from /CV/-initial forms, but that F1 is not. I suggest that this sound change may have arisen via the perceptual reanalysis of changes in spectral balance, coupled with the coarticulatory influence of the dorsal gesture for /r/. Key words: Khmer; tonogenesis; voice quality; sound change; phonologization ∗Tel: +44 (0)131 650 3952; fax: +44 (0)131 651 3190. Email address: [email protected] (James P. Kirby) Preprint submitted to Journal of Phonetics January 8, 2014 1. Introduction Khmer, the national language of Cambodia, is notable for being a non-tonal language in what may be the most ‘tone-prone’ area of the world (Matisoff, 1973).
    [Show full text]
  • Brāhmaṇa As an Honorific in 'Indianized' Southeast Asia. a Linguistic Approach
    Brāhmaṇa as an honorific in ’Indianized’ Southeast Asia. A Linguistic Approach Frederic Pain To cite this version: Frederic Pain. Brāhmaṇa as an honorific in ’Indianized’ Southeast Asia. A Linguistic Approach. 2014. hal-01009585 HAL Id: hal-01009585 https://hal.archives-ouvertes.fr/hal-01009585 Preprint submitted on 18 Jun 2014 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. PRE-PRINT VERSION | 1 """B"BBBRĀHMARĀHMARĀHMAṆṆṆṆAAAA"""" ASASAS AN HONORIFIC INININ ‘I‘I‘INDIANIZED‘INDIANIZEDNDIANIZED’’ MAINLAND SSSOUTHEASTSOUTHEAST ASIA A LINGUISTIC APPROACH111 Frédéric Pain Academia Sinica (Institute of Linguistics, Taipei) KULeuven (Belgium) – LACITO (France) With the editorial assistance of Jonathan Evans, Academia Sinica Abstract. This article demonstrates that the Old Khmer b/vraḥ originates from a syllabic reduction of Sanskrit brāhmaṇa via monosyllabization, a widespread diachronic phenomenon among the Mon-Khmer languages of Mainland Southeast Asia and will show that this term must have been originally used as an honorific for deities and, consequently for royalty. We then respectfully disagree with two other current hypotheses explaining the etymology of this word, that is b/vraḥ is an autochthonous Mon-Khmer word or, according to the second hypothesis, that b/vraḥ originates in the Sanskrit/ Pāli word vara- "excellent, splendid, noble".
    [Show full text]
  • On the History of Thai Scripts Hans Penth
    Dr. Hans Penth giving his address to the audience or the Oriental Dr. Navavan Bandhumedha talking about the different Tai Hotel on 10 July 1986. Languages at the Oriental Hotel on 10 July 1986. On the History of Thai scripts Hans Penth (From lecture delivered during a meeting of groups were living scattered over a vast area, that the Siam Society, held at the Oriental Hotel, on borrowing of local Man letters by local Thai groups Thursday, 10 July 1986) had the effect to produce, right from the outset, a number of slightly different local proto-Thai scripts Dr. Pe nth pointed out in his introductory which were spread out over a large geographical remarks that the history of the alphabets used by area. the Thais throughout Southeast Asia was still insufficiently known but there were many theories. (3) In the decades around 1300 A.D., efforts He hoped he would be excused if he too come up to reform the old proto-Thai script were made in with certain ideas concerning the origin and both Sukhothai and Chiang Mai. These efforts mark development of the scripts used by tlie Thais, from the turning point from proto-Thai script to modern Assam in the west to Tongking in the east, and from Thai script. Yunnan in the north to Thailand in the south, and (4) Also during the decades before or around referred listeners for further information to his article 1300 A.D., the Thais of Sukhothai and the Thais included in H.R.H. Galyani Vadhana's book on of Chiang Mai did something else that was identical: Yunnan.
    [Show full text]
  • A STUDY of the NAMES of MONUMENTS in ANGKOR (Cambodia)
    A STUDY OF THE NAMES OF MONUMENTS IN ANGKOR (Cambodia) NHIM Sotheavin Sophia Asia Center for Research and Human Development, Sophia University Introduction This article aims at clarifying the concept of Khmer culture by specifically explaining the meanings of the names of the monuments in Angkor, names that have existed within the Khmer cultural community.1 Many works on Angkor history have been researched in different fields, such as the evolution of arts and architecture, through a systematic analysis of monuments and archaeological excavation analysis, and the most crucial are based on Cambodian epigraphy. My work however is meant to shed light on Angkor cultural history by studying the names of the monuments, and I intend to do so by searching for the original names that are found in ancient and middle period inscriptions, as well as those appearing in the oral tradition. This study also seeks to undertake a thorough verification of the condition and shape of the monuments, as well as the mode of affixation of names for them by the local inhabitants. I also wish to focus on certain crucial errors, as well as the insufficiency of earlier studies on the subject. To begin with, the books written in foreign languages often have mistakes in the vocabulary involved in the etymology of Khmer temples. Some researchers are not very familiar with the Khmer language, and besides, they might not have visited the site very often, or possibly also they did not pay too much attention to the oral tradition related to these ruins, a tradition that might be known to the village elders.
    [Show full text]
  • (RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9Th Floor
    Registry Services Evaluation Policy (RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9th Floor, A-Wing Gopal Palace, NehruNagar Ahmedabad, Gujarat 380015 Request Details Case Number: 00874461 This service request should be used to submit a Registry Services Evaluation Policy (RSEP) request. An RSEP is required to add, modify or remove Registry Services for a TLD. More information about the process is available at https://www.icann.org/resources/pages/rsep-2014- 02-19-en Complete the information requested below. All answers marked with a red asterisk are required. Click the Save button to save your work and click the Submit button to submit to ICANN. PROPOSED SERVICE 1. Name of Proposed Service Removal of IDN Languages for .OOO 2. Technical description of Proposed Service. If additional information needs to be considered, attach one PDF file Infibeam Incorporation Limited (“infibeam”) the Registry Operator for the .OOO TLD, intends to change its Registry Service Provider for the .OOO TLD to CentralNic Limited. Accordingly, Infibeam seeks to remove the following IDN languages from Exhibit A of the .OOO New gTLD Registry Agreement: - Armenian script - Avestan script - Azerbaijani language - Balinese script - Bamum script - Batak script - Belarusian language - Bengali script - Bopomofo script - Brahmi script - Buginese script - Buhid script - Bulgarian language - Canadian Aboriginal script - Carian script - Cham script - Cherokee script - Coptic script - Croatian language - Cuneiform script - Devanagari script
    [Show full text]