Development of a Punjabi to English Transliteration System

International Journal of Computer Science and Communication, Vol. 2, No. 2, July-December 2011, pp. 521-526

Kamal Deep1 and Vishal Goyal2
1Department of Computer Science, Punjabi University, Patiala, India. E-mail: [email protected]
2Assistant Professor, Department of Computer Science, Punjabi University, Patiala, India. E-mail: [email protected]

ABSTRACT

Machine transliteration has gained prime importance as a supporting tool for machine translation and cross-language information retrieval, especially when proper names and technical terms are involved. The performance of machine translation and cross-language information retrieval depends heavily on accurate transliteration of named entities. Hence, the transliteration model must aim to preserve the phonetic structure of words as closely as possible. This paper addresses the problem of transliterating Punjabi to English using a rule-based approach. The proposed transliteration scheme uses a grapheme-based method to model the transliteration problem. This technique has demonstrated transliteration from Punjabi to English for common names and achieved an accuracy of 93.22%.

1. INTRODUCTION

Transliteration is the process of replacing words in a source language with their approximate phonetic or spelling equivalents in a target language. Commonly, transliteration is used to translate named entities across languages. Automatic transliteration is helpful for many applications, such as Machine Translation (MT), Cross Language Information Retrieval (CLIR) and Information Extraction (IE). Transliterating a word from the language of its origin to a foreign language is called Forward Transliteration, while transliterating a loan word written in a foreign language back to the language of its origin is called Backward Transliteration. This paper addresses the problem of forward transliteration of person names from Punjabi to English.

The remainder of this paper is organized as follows. Section 2 describes the related work. Section 3 introduces the English and Punjabi languages. Section 4 describes the Transliteration System Architecture. Experimental Results and Error Analysis are discussed in Section 5. Finally, Section 6 concludes the paper.

2. RELATED WORK

Several approaches have been proposed for name transliteration. In grapheme-based approaches (Ψ_G), transliteration is viewed as a process of mapping a grapheme sequence from a source language to a target language, ignoring the phoneme-level processes. In contrast, in phoneme-based approaches (Ψ_P), the transliteration key is the pronunciation, or source phoneme, rather than the spelling, or source grapheme. This approach is basically a source grapheme-to-source phoneme transformation followed by a source phoneme-to-target grapheme transformation. Hybrid approaches (Ψ_H) simply combine the grapheme-based transliteration probability (Pr(Ψ_G)) and the phoneme-based transliteration probability (Pr(Ψ_P)) using linear interpolation.

Vijaya, VP, Shivapratap and KP CEN [1] developed an English to Tamil transliteration system using WEKA. It is a rule-based system that uses the J48 decision tree classifier of WEKA for classification purposes. The transliteration process consists of four phases: preprocessing, feature extraction, training and transliteration. The accuracy of this system was tested with 1000 English names that were not in the corpus. The transliteration model produced an exact Tamil transliteration of English words with an accuracy of 84.82%.

Chinnakotla, Damani and Satoskar [2] developed transliteration systems for resource-scarce languages. They developed rule-based systems for Hindi to English, English to Hindi, and Persian to English transliteration tasks. They used CSM (Character Sequence Modeling) on the source side for word origin identification, a manually generated non-probabilistic character mapping rule base for generating transliteration candidates, and then CSM again on the target side for ranking the generated candidates. The overall efficiency using the CRF (Conditional Random Field) approach is 67.0% for English to Hindi, 70.7% for Hindi to English and 48.0% for Persian to English.

Lehal and Singh [3] developed a Shahmukhi to Gurmukhi transliteration system based on a corpus approach. In this system, script mappings are done first, covering Simple Consonants, Aspirated Consonants (AC), Vowels, and other Diacritical Marks or Symbols. The system is divided into two phases. The first phase performs pre-processing and rule-based transliteration tasks, and the second phase performs post-processing. The overall accuracy of the system has been reported to be 91.37%.

Malik [4] developed the Punjabi Machine Transliteration (PMT) system, which is rule-based. PMT has been used for Shahmukhi to Gurmukhi transliteration. PMT preserves both the phonetics and the meaning of the transliterated word. The primary limitation of this system is that it works only on input data that has been manually edited for missing vowels or diacritical marks (the basic ambiguity of written Arabic script), which limits its practical use. The accuracy of the system has been reported to be 98.95%.

Verma [5] developed a Gurmukhi to Roman transliteration system named GTrans. He surveyed existing Roman-Indic script transliteration techniques and developed a transliteration scheme based on ISO 15919 and ALA-LC. It is a rule-based system. He has also done reverse transliteration from Gurmukhi to Roman. The overall accuracy of the system has been reported to be 98.43%.

Hong, Kim, Lee and Chang [6] developed an English-Korean name transliteration system using the hybrid approach. In the transliteration process, first, a phrase-based SMT model with some factored translation features is used. Second, they expanded the base system by applying web-based n-best re-ranking of the results. Third, they applied a pronouncing dictionary-based method to the base system, which utilizes pronunciation symbols and is motivated by linguistic knowledge. Finally, a phonics-based method is applied, which was originally designed for teaching speakers of English to read and write that language. The experimental results of using three n-best re-ranking techniques showed that web-based re-ranking is a useful method. Their standard run and best standard run achieved accuracies of 45.1% and 78.5%, respectively.

Ali and Ijaz [7] developed an English to Urdu transliteration system based on mapping rules. The whole process has three steps. In the first step, mapping rules are used to generate Urdu text from the English transcription. English text is converted to Urdu using both English pronunciation and mapping rules. In the second step, Urdu syllabification is applied to the English transcription. Consonants and vowels are combined to make syllables, and breaking up a word into syllables is known as syllabification. To improve the system's accuracy, they applied the Urduization Rules [...]

[...] and its corresponding phoneme have been aligned phonetically. Second, English words are transliterated into Korean words through several steps. Using an English pronunciation dictionary (P-DIC), the system assigns a pronunciation to a given English word. If the word is not found in P-DIC, the system checks whether it has a complex word form. To detect a complex word form, the given English word is divided into two words (word+word) using the entries of P-DIC. If both of them are in P-DIC, the system can assign a pronunciation to the given word; otherwise the system must estimate the pronunciation. Then, the system checks whether the English word is of Greek origin. Because E-K transliteration for English words of Greek origin differs from that for pure English words, it is important to detect this. The pronunciation of English words that are not registered in P-DIC is estimated in the next step. Finally, Korean transliterated words are generated using conversion rules. Evaluation was performed through Word Accuracy (WA) and Character Accuracy (CA). This system has reported an accuracy of 90.82% for WA and 56% for CA.

Yaser and Knight [9] developed an Arabic to English transliteration system based on sound and spelling mapping using finite state machines. They combined the phonetic-based model and the spelling-based model into a single transliteration model. For testing they used a development data set and a blind data set. The overall accuracy was reported to be 53.66% on the development data set and 61% on the blind data set. The reason for the higher accuracy on the blind data set was that it consists mostly of highly frequent names of prominent politicians, whereas the development set also contains names of writers and less common political figures.

3. PUNJABI & ENGLISH LANGUAGE

In this section we discuss the Punjabi and English languages.

3.1 Punjabi Language

The Punjabi language is written in the Gurmukhi script. The Gurmukhi script was derived from the Sharada script and standardized by Guru Angad Dev in the 16th century. It was designed to write the Punjabi language. The meaning of Gurmukhi is “from the mouth of the Guru”. The Gurmukhi (or Punjabi) alphabet contains thirty-five distinct letters. These are: [...]
Recommended publications
  • The Origins, Evolution and Decline of the Khojki Script
    The origins, evolution and decline of the Khojki script. Juan Bruce. Dissertation submitted in partial fulfilment of the requirements for the Master of Arts in Typeface Design, University of Reading, 2015. Abstract: The Khojki script is an Indian script whose origins are in Sindh (now southern Pakistan), a region that has witnessed the conflict between Islam and Hinduism for more than 1,200 years. After the gradual occupation of the region by Muslims from the 8th century onwards, the region underwent significant cultural changes. This dissertation reviews the history of the script and the different uses that it took on among the Khoja people since Muslim missionaries began their activities in Sindh communities in the 14th century. It questions the origins of the Khojas and exposes the impact that their transition from a Hindu merchant caste to a broader Muslim community had on the development of the script. During this process of transformation, a rich and complex creed, known as Satpanth, resulted from the blend of these cultures. The study also considers the roots of the Khojki writing system, especially the modernization that the script went through in order to suit more sophisticated means of expression. As a result, through recording the religious Satpanth literature, Khojki evolved and left behind its mercantile features, insufficient for this purpose. Through comparative analysis of printed Khojki texts, this dissertation examines the use of the script in Bombay at the beginning of the 20th century in the shape of Khoja Ismaili literature.
  • Shahmukhi to Gurmukhi Transliteration System: a Corpus Based Approach
    Shahmukhi to Gurmukhi Transliteration System: A Corpus based Approach Tejinder Singh Saini1 and Gurpreet Singh Lehal2 1 Advanced Centre for Technical Development of Punjabi Language, Literature & Culture, Punjabi University, Patiala 147 002, Punjab, India [email protected] http://www.advancedcentrepunjabi.org 2 Department of Computer Science, Punjabi University, Patiala 147 002, Punjab, India [email protected] Abstract. This research paper describes a corpus based transliteration system for Punjabi language. The existence of two scripts for Punjabi language has created a script barrier between the Punjabi literature written in India and in Pakistan. This research project has developed a new system for the first time of its kind for Shahmukhi script of Punjabi language. The proposed system for Shahmukhi to Gurmukhi transliteration has been implemented with various research techniques based on language corpus. The corpus analysis program has been run on both Shahmukhi and Gurmukhi corpora for generating statistical data for different types like character, word and n-gram frequencies. This statistical analysis is used in different phases of transliteration. Potentially, all members of the substantial Punjabi community will benefit vastly from this transliteration system. 1 Introduction One of the great challenges before Information Technology is to overcome language barriers dividing the mankind so that everyone can communicate with everyone else on the planet in real time. South Asia is one of those unique parts of the world where a single language is written in different scripts. 
This is the case, for example, with Punjabi language spoken by tens of millions of people but written in Indian East Punjab (20 million) in Gurmukhi script (a left to right script based on Devanagari) and in Pakistani West Punjab (80 million), written in Shahmukhi script (a right to left script based on Arabic), and by a growing number of Punjabis (2 million) in the EU and the US in the Roman script.
  • Ist National Digital Workshop on Sharada Script Learning
    Ist National Digital Workshop on Sharada Script Learning (Level 1 course) (Under the aegis of MOU between MIEF & KSU), 13th-20th July, 2020, 4-6 pm, on Zoom App Platform. Registration: https://bit.ly/3gNS3Kh. Background: About Sharada Script. Among the Western Himalayan scripts, the Sharada alphabet has a place of pride. Evolved from north western Brahmi a millennium ago in the 9th century A.D., it remained in popular use for several centuries in an extensive area of Western Himalayas including North Western Frontier Province, Dardistan, Kashmir, Jammu, Ladakh and Himachal Pradesh. Later it got restricted to Kashmir only, where it was the principal means of writing until the 20th century. This form was widely used for writing Vedic texts. The epigraphic and literary records written in this script, that have been found in these regions, have thrown light on many facets of the history and culture of the areas of their provenance. The Gurmukhi script was developed from Sarada. Like the Brahmi and the Kharoshti in the ancient period, the Sharada script in the early medieval period formed a vital link in the chain of communication of ideas, knowledge, and culture among the states comprised in the Western Himalayan region. Medium of Training: English/Hindi/Sanskrit/Kannad. Who Can Take Part: Any interested person who may or may not be knowing Sanskrit language/Devnagari script; Sanskrit knowing scholars/Academicians/Researchers/Philosphers etc. E-Certificate of participation will be given to all participants (speakers / experts / panelists / delegates). Program Schedule: 13 July 2020, Day 1: 4.00-04.30 pm Inaugural session; 4.30-5.00 pm History about development of Sharada, Spkr Prof Sushma Devi Gupta Sanskrit
  • Praagaash February 2019.Cdr
    For Private Circulation Only. Praagaash, Net-journal of 'Zaan'. Vol 4 : No. 2, February 2019. Cover: Srinagar City River, painting by Kapil Kaul. In this issue: Editorial (T.N.Dhar 'Kundan'); Literature & Litterateurs: Samad Mir - The Sufi Poet of Kashmir; History (P.L.Ganju): History of Two Ancient Capitals of Kashmir; Peculiar Kashmiri Words You May Not Know; My Medical Journey (Dr. K.L.Chowdhury): A Doctor's Deportment; Language (Prof R.N.Bhat): Writing Systems in India; Adventures (Ajay Dhar): My Polar Adventure - 5; Our Mothertongue (M.K.Raina): On Kashmiri Language; Kashmir Shaivism (T.N.Dhar Kundan): Abhinavgupta - The Pride of Kashmir. Editorial - T.N.Dhar 'Kundan', Editor Praagaash: We are grateful to our readers for their continued support and for appreciating the standard maintained in this e-journal. As already stated, our endeavour has been and will be to highlight the richness of our culture and the beauty of our tradition. As is well known our language is one of the prominent factors of our culture. It is our duty, therefore, to propagate it so that our younger generation does not get disconnected from our roots. Appreciating the practicality of this task, we have been advocating use of Devanagari script, in addition to the Nastaliq recognised by the state.
  • Some Interesting Facts, Myths and History of Mathematics
    International Journal of Mathematics and Statistics Invention (IJMSI) E-ISSN: 2321-4767 P-ISSN: 2321-4759 www.ijmsi.org Volume 4 Issue 6 || August. 2016 || PP-54-68 Some Interesting Facts, Myths and History of Mathematics Singh Prashant1 1(Department of Computer Science, Institute of Science, Banaras Hindu University) ABSTRACT : This paper deals with primary concepts and fallacies of mathematics which many a times students and even teachers ignore. Also this paper comprises of history of mathematical symbols, notations and methods of calculating time. I have also included some ancient techniques of solving mathematical real time problems. This paper is a confluence of various traditional mathematical techniques and their implementation in modern mathematics. I. INTRODUCTION I have heard my father saying that “Mathematics is the only genuine subject as it does not change with boundary of countries”. It is lucrative just because of its simplicity. Galileo once said, “Mathematics is the language with which God wrote the Universe.” He was precise in calling mathematics a language, because like any dialect, mathematics has its own rubrics, formulas, and nuances. In precise, the symbols used in mathematics are quite unique to its field and are profoundly engrained in history. The following will give an ephemeral history of some of the greatest well-known symbols employed by mathematics. Categorized by discipline within the subject, each section has its own interesting subculture surrounding it. Arithmetic is the most rudimentary part of mathematics and covers addition, subtraction, multiplication, and the division of numbers. One category of numbers are the integers, -n, …, -3, -2, -1, 0, 1, 2, 3, …, n, where we say that n is in Z. The capital letter Z is written to represent integers and comes from the German word, Zahlen, meaning numbers.
  • Evolution of Script in India
    Evolution of script in India December 4, 2018 Manifest pedagogy UPSC in recent times has been asking tangential questions surrounding a personality. This is being done by linking dimensions in the syllabus with the personality. Iravatham Mahadevan which was in news last week. His contributions to scripts particularly Harappan Script and Brahmi script was immense. So the issue of growth of language and script become a relevant topic. In news Death of Iravatham Mahadevan an Indian epigraphist with expertise in Tamil-brahmi and Indus Valley script. Placing it in syllabus Indian culture will cover the salient aspects of Art Forms, Literature and Architecture from ancient to modern times. Dimensions 1. Difference between language and script 2. Indus valley script and the unending debate on its decipherment 3. The prominence of Brahmi script 4. The evolution of various scripts of India from Brahmi. 5. Modern Indian scripts. Content A language usually refers to the spoken language, a method of communication. A script refers to a collection of characters used to write one or more languages. A language is a method of communication. Scripts are writing systems that allow the transcription of a language, via alphabet sets. Indus script After the pictographic and petroglyph representations of early man the first evidence of a writing system can be seen in the Indus valley civilization. The earliest evidence of which is found on the pottery and pot shreds of Rahman Dheri and these potter’s marks, engraved or painted, are strikingly similar to those appearing in the Mature Indus symbol system. Later the writing system can be seen on the seals and sealings of Harappan period.
  • Know-Kashmir-2.Pdf
    Know Kashmir Author: Deepali Patwadkar © 2019 Deepali Patwadkar Published – Bharatiya Saur Jyeshtha 1941 | June 2019 Coverpage – Way to Amarnath Caves Note – All images are from the net. E-book from – Kalaa-pushpa www.facebook.com/kalaapushpa The views and opinions expressed are those of the author(s) and do not necessarily reflect the official policy or position of Kalaa-pushpa. Any content provided by author is their opinion, and are not intended to malign any religion, ethnic group, club, organization, company, individual or anyone or anything. Sharada Desh Sharada Peeth, the ancient centre of learning. Kashmir was known as Sharada Desh. Sharada Devi नमस्ते शारदे देवी काश्मीरपुरवासिनि त्वामहं प्रार्थये नित्यं विद्यादानं च देहि मे ॥ I bow to you, O Goddess Sharada, the goddess of knowledge, who dwells in Kashmir! I pray to you, bestow upon me, the gift of knowledge! Devi Sharada in Kashmiri peherav Sharada Devotees Kashmir produced some of the best pieces of literature in India. When Al-Beruni came to India with Muhammad Ghazni, he noted with astonishment – India has thousands of books written in Sharada Script. The literature of Kashmir includes - Ashwaghosha’s Buddha Charita of 2nd century is a Sanskrit Mahakavya. Adi Shankara wrote his famous Soundarya Lahari when he visited Kashmir in 8th century. Kalhana’s Rajatarangini of 12th Century is the history of Kashmir. And, Somadeva’s KathaSaritSagar, 11th Century. The stories of Vikram & Vetal, Sinhasan Battishi and the stories of Panchatantra are some of the stories from KathaSaritSagar. Sharada Script The Bakshali Manuscript written in Sharada script.
  • Galaxy: International Multidisciplinary Research Journal the Criterion: an International Journal in English Vol
    ISSN 2278-9529 Galaxy: International Multidisciplinary Research Journal www.galaxyimrj.com The Criterion: An International Journal in English Vol. 8, Issue-VI, December 2017 ISSN: 0976-8165 Generating Dogri Morphological Analyzer Using Apertium Tool: An Overview Sunil Kumar Senior Resource Person (Academics) National Translation Mission Central Institute of Indian Languages, Mysore Article History: Submitted-06/12/2017, Revised-13/12/2017, Accepted-15/12/2017, Published-31/12/2017. Abstract: Computational morphology is a subfield of computational linguistics (also called “natural language processing” or language engineering). Computational morphology concerns itself with computer applications that analyze words in a given text, such as determining whether a given word is verb or a noun. Almost all practical applications that deal with natural language must have a morphological component. After all, an application must first recognize the word in question before analyzing it syntactically, semantically, or whatever the case may be. The term morphology is generally attributed to the Johann Wolfgang von Goethe (1749–1832) a German poet, playwright, novelist, and philosopher who coined it early in the nineteenth century in a biological context. Its etymology is Greek: morph- means ‘shape, form’, and morphology is the study of form or forms. In linguistics morphology is the study of the smallest grammatical units of language, and of their formation into words, including composition, derivation and inflection.
  • Encoding of Vedic Characters Used in Non-Devanagari Scripts
    Encoding of Vedic characters used in non-Devanagari scripts Srinidhi, Tumakuru, Karnataka, India [email protected] Date: 27 March 2015 The Vedic Unicode proposals had dealt with the Vedic characters used in Devanagari script only. The Grantha proposal had proposed the encoding of Samavedic characters and Vedic Anusvaras and other characters and Tirhuta had proposed the encoding of a Vedic Anusvara. Apart from this there were no efforts of encoding Vedic characters used in non-Devanagari scripts. There is a need of encoding the characters which are seen in both manuscripts and prints. An encoding for these Vedic characters in the UCS will certainly promote their usage among native users, scholars and manuscriptologists. This is a preliminary document which gives brief description of some signs used in non-Devanagari scripts. It also seeks feedback from scholars, native users and experts on encoding these signs. The number of characters used is much more since only very few manuscripts and books are available online. It is to be noted not all scripts which are used to write Sanskrit are used to write Vedas. In general the scripts used for Buddhist Sanskrit religious texts, such as Tibetan, Siddham and Thai, are not used to write Vedas. The following scripts are used to write Vedas 1. Bengali/Assamese 2. Devanagari 3. Grantha 4. Gujarati 5. Kannada 6. Malayalam 7. Nandinagari 8. Newar 9. Odia 10. Sharada 11. Telugu 12. Tigalari 13. Tirhuta Many of the existing characters are used most commonly in other scripts, Devanagari sign Udatta (mainly for Svarita), Devanagari sign Anudatta and Vedic tone double Svarita.
  • 4403 2014-01-28
    ISO/IEC JTC 1/SC 2 N____ ISO/IEC JTC 1/SC 2/WG 2 N4403 2014-01-28 ISO/IEC JTC 1/SC 2/WG 2 Universal Coded Character Set (UCS) - ISO/IEC 10646 Secretariat: ANSI DOC TYPE: Meeting minutes TITLE: Unconfirmed minutes of WG 2 meeting 61 Holiday Inn, Vilnius, Lithuania; 2013-06-10/14 SOURCE: V.S. Umamaheswaran, Recording Secretary, Michel Suignard, Acting Meeting Convener and Mike Ksar, Convener PROJECT: JTC 1.02.18 – ISO/IEC 10646 STATUS: SC 2/WG 2 participants are requested to review the attached unconfirmed minutes, act on appropriate noted action items, and to send any comments or corrections to the convener as soon as possible but no later than the due date below. ACTION ID: ACT DUE DATE: 2014-02-17 DISTRIBUTION: SC 2/WG 2 members and Liaison organizations MEDIUM: Acrobat PDF file NO. OF PAGES: 64 (including cover sheet) 1 Opening and roll call Input document: 4405 2nd Call for meeting 61 in Vilnius, Lithuania; Mike Ksar; 2013-04-23 The meeting was opened at 10:05h.
  • Some Linguistic Features of the Old Kashmiri Language of the Bāṇāsurakathā
    Acta Orientalia Academiae Scientiarum Hung. Volume 71 (3), 351 – 367 (2018) DOI: 10.1556/062.2018.71.3.7 SOME LINGUISTIC FEATURES OF THE OLD KASHMIRI LANGUAGE OF THE BĀṆĀSURAKATHĀ SAARTJE VERBEKE Ghent University/Research Foundation Flanders (FWO) Blandijnberg 2, B-9000 Gent, Belgium e-mail: [email protected] The Bāṇāsurakathā is a sharada manuscript in Old Kashmiri composed by Avtar Bhatt, dated between the 14th and 16th centuries. It retells the love story of the demon Bāṇa’s daughter Uṣā with Krishna’s grandson Aniruddha, and the ensuing fight between Bāṇa and Krishna, as it is found in the Harivaṃśapurāṇa. This paper focuses on the linguistic features of the Old Kashmiri language in which this manuscript is composed. Old Kashmiri belongs to the Early New Indo-Aryan language stage, a stage crucial for a number of syntactic developments which determined the Indo-Aryan languages of today. First, the language found in the Bāṇāsurakathā is situated among the attestations of Old Kashmiri found in other manuscripts. The language is younger than that of the Mahānaya-Prakāśa, but older than the language used in the Lallā-Vākyāni. Second, a number of linguistic features of Old Kashmiri are presented, such as the case marking and the verb agreement. Third, the paper focuses on the phenomenon of pronominal suffixation, well known in Modern Kashmiri, but not present in Apabhraṃśa. It is shown that the first traces of pronominal suffixation already existed in the Bāṇāsurakathā, but their use was not yet grammatically fixed. Key words: Old Kashmiri, Bāṇāsurakathā, case marking, agreement, linguistics, literature.
  • Proposal to Encode the Sharada Script in ISO/IEC 10646
    ISO/IEC JTC1/SC2/WG2 N3595 L2/09-074R 2009-03-25 Proposal to Encode the Sharada Script in ISO/IEC 10646 Anshuman Pandey University of Michigan Ann Arbor, Michigan, U.S.A. [email protected] March 25, 2009 Contents: Proposal Summary Form; 1 Introduction; 2 Background; 3 Characters Proposed (3.1 Character Inventory; 3.2 Characters Not Proposed; 3.3 Basis for Character Shapes); 4 The Writing System (4.1 General Features; 4.2 Distinguishing Features; 4.3 Consonant-Vowel Ligatures; 4.4 Consonant Conjuncts; 4.5 Nasalization; 4.6 Special Characters; 4.7 Punctuation; 4.8 Digits; 4.9 Variant Forms of Characters; 4.10 Homoglyphic Characters); 5 Implementation (5.1 Encoding Model; 5.2 Collation; 5.3 Character Properties); 6 References. List of Tables: 1 Glyph chart for Sharada; 2 Names list for Sharada; 3 Comparison of hand-written Sharada consonants with digitized forms; 4 Comparison of hand-written Sharada vowels with digitized forms.