Automatic Phonetic Transcription of Tamil in Roman Script
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Intelligence System for Tamil Vattezhuttuoptical
Mr R.Vinoth et al. / International Journal of Computer Science & Engineering Technology (IJCSET) INTELLIGENCE SYSTEM FOR TAMIL VATTEZHUTTUOPTICAL CHARACTER RCOGNITION Mr R.Vinoth Assistant Professor, Department of Information Technology Agni college of Technology, Chennai, India [email protected] Rajesh R. UG Student, Department of Information Technology Agni college of Technology, Chennai, India [email protected] Yoganandhan P. UG Student, Department of Information Technology Agni college of Technology, Chennai, India [email protected] Abstract--A system that involves character recognition and information retrieval of Palm Leaf Manuscript. The conversion of ancient Tamil to the present Tamil digital text format. Various algorithms were used to find the OCR for different languages, Ancient letter conversion still possess a big challenge. Because Image recognition technology has reached near-perfection when it comes to scanning Tamil text. The proposed system overcomes such a situation by converting all the palm manuscripts into Tamil digital text format. Though the Tamil scripts are difficult to understand. We are using this approach to solve the existing problems and convert it to Tamil digital text. Keyword - Vatteluttu Tamil (VT); Data set; Character recognition; Neural Network. I. INTRODUCTION Tamil language is one of the longest surviving classical languages in the world. Tamilnadu is a place, where the Palm Leaf Manuscript has been preserved. There are some difficulties to preserve the Palm Leaf Manuscript. So, we need to preserve the Palm Leaf Manuscript by converting to the form of digital text format. Computers and Smart devices are used by mostof them now a day. So, this system helps to convert and preserve in a fine manner. -
Comment on L2/13-065: Grantha Characters from Other Indic Blocks Such As Devanagari, Tamil, Bengali Are NOT the Solution
Comment on L2/13-065: Grantha characters from other Indic blocks such as Devanagari, Tamil, Bengali are NOT the solution Dr. Naga Ganesan ([email protected]) Here is a brief comment on L2/13-065 which suggests Devanagari block for Grantha characters to write both Aryan and Dravidian languages. From L2/13-065, which suggests Devanagari block for writing Sanskrit and Dravidian languages, we read: “Similar dedicated characters were provided to cater different fonts as URUDU MARATHI SINDHI MARWARI KASHMERE etc for special conditions and needs within the shared DEVANAGARI range. There are provisions made to suit Dravidian specials also as short E short O LLLA RRA NNNA with new created glyptic forms in rhythm with DEVANAGARI glyphs (with suitable combining ortho-graphic features inside DEVANAGARI) for same special functions. These will help some aspirant proposers as a fall out who want to write every Indian language contents in Grantham when it is unified with DEVANAGARI because those features were already taken care of by the presence in that DEVANAGARI. Even few objections seen in some docs claim them as Tamil characters to induce a bug inside Grantham like letters from GoTN and others become tangential because it is going to use the existing Sanskrit properties inside shared DEVANAGARI and also with entirely new different glyph forms nothing connected with Tamil. In my doc I have shown rhythmic non-confusable glyphs for Grantham LLLA RRA NNNA for which I believe there cannot be any varied opinions even by GoTN etc which is very much as Grantham and not like Tamil form at all to confuse (as provided in proposals).” Grantha in Devanagari block to write Sanskrit and Dravidian languages will not work. -
The Journal of the Music Academy Madras Devoted to the Advancement of the Science and Art of Music
The Journal of Music Academy Madras ISSN. 0970-3101 Publication by THE MUSIC ACADEMY MADRAS Sangita Sampradaya Pradarsini of Subbarama Dikshitar (Tamil) Part I, II & III each 150.00 Part – IV 50.00 Part – V 180.00 The Journal Sangita Sampradaya Pradarsini of Subbarama Dikshitar of (English) Volume – I 750.00 Volume – II 900.00 The Music Academy Madras Volume – III 900.00 Devoted to the Advancement of the Science and Art of Music Volume – IV 650.00 Volume – V 750.00 Vol. 89 2018 Appendix (A & B) Veena Seshannavin Uruppadigal (in Tamil) 250.00 ŸÊ„¢U fl‚ÊÁ◊ flÒ∑ȧá∆U Ÿ ÿÊÁªNÔUŒÿ ⁄UflÊÒ– Ragas of Sangita Saramrta – T.V. Subba Rao & ◊jQÊ— ÿòÊ ªÊÿÁãà ÃòÊ ÁÃDÊÁ◊ ŸÊ⁄UŒH Dr. S.R. Janakiraman (in English) 50.00 “I dwell not in Vaikunta, nor in the hearts of Yogins, not in the Sun; Lakshana Gitas – Dr. S.R. Janakiraman 50.00 (but) where my Bhaktas sing, there be I, Narada !” Narada Bhakti Sutra The Chaturdandi Prakasika of Venkatamakhin 50.00 (Sanskrit Text with supplement) E Krishna Iyer Centenary Issue 25.00 Professor Sambamoorthy, the Visionary Musicologist 150.00 By Brahma EDITOR Sriram V. Raga Lakshanangal – Dr. S.R. Janakiraman (in Tamil) Volume – I, II & III each 150.00 VOL. 89 – 2018 VOL. COMPUPRINT • 2811 6768 Published by N. Murali on behalf The Music Academy Madras at New No. 168, TTK Road, Royapettah, Chennai 600 014 and Printed by N. Subramanian at Sudarsan Graphics Offset Press, 14, Neelakanta Metha Street, T. Nagar, Chennai 600 014. Editor : V. Sriram. THE MUSIC ACADEMY MADRAS ISSN. -
Library Stock.Pdf
Acc.No. Title Author Publication Samuhyashasthra Vidyarthimithram, 1 Vidhyabyasathinte saithadhika Rajeshwary V.P Kottayam Adisthanangal-1 Samuhyashasthra Vidyarthimithram, 2 Vidhyabyasathinte saithadhika Rajeshwary V.P Kottayam Adisthanangal-1 Samaniashasthra Vidhyabyasathinte Vidyarthimithram, 3 saithadhika Rajeshwary V.P Kottayam Adisthanangal-1 Samaniashasthra Vidhyabyasathinte Vidyarthimithram, 4 saithadhika Rajeshwary V.P Kottayam Adisthanangal-1 Adhunika Vidhyabyasathinte Vidyarthimithram, 5 Rajeshwary V.P Saithathikadisthanangal-2 Kottayam Adhunika Vidhyabyasathinte Vidyarthimithram, 6 Rajeshwary V.P Saithathikadisthanangal-2 Kottayam MalayalaBasha Bodhanathinte Vidyarthimithram, 7 Saithathikadisthanam (Pradhesika Rajapan Nair P.K Kottayam Bashabothanam-1) MalayalaBasha Bodhanathinte Vidyarthimithram, 8 Saithathikadisthanam (Pradhesika Rajapan Nair P.K Kottayam Bashabothanam-1) Samuhyashasthra Vidyarthimithram, 9 Vidhyabyasathinte saithadhika Thomas R.S Kottayam Adisthanangal-2 Samuhyashasthra Vidyarthimithram, 10 Vidhyabyasathinte saithadhika Thomas R.S Kottayam Adisthanangal-2 Samaniashasthra Vidhyabyasathinte Vidyarthimithram, 11 saithadhika Rajeshwary V.P Kottayam Adisthanangal-2 Samaniashasthra Vidhyabyasathinte Vidyarthimithram, 12 saithadhika Rajeshwary V.P Kottayam Adisthanangal-2 Samaniashasthra Vidhyabyasathinte Vidyarthimithram, 13 saithadhika Rajeshwary V.P Kottayam Adisthanangal-2 MalayalaBasha Bodhanathinte Vidyarthimithram, 14 Saithathikadisthanam (Pradhesika Rajapan Nair P.K Kottayam Bashabothanam-2) MalayalaBasha -
Language Transliteration in Indian Languages – a Lexicon Parsing Approach
LANGUAGE TRANSLITERATION IN INDIAN LANGUAGES – A LEXICON PARSING APPROACH SUBMITTED BY JISHA T.E. Assistant Professor, Department of Computer Science, Mary Matha Arts And Science College, Vemom P O, Mananthavady A Minor Research Project Report Submitted to University Grants Commission SWRO, Bangalore 1 ABSTRACT Language, ability to speak, write and communicate is one of the most fundamental aspects of human behaviour. As the study of human-languages developed the concept of communicating with non-human devices was investigated. This is the origin of natural language processing (NLP). Natural language processing (NLP) is a subfield of Artificial Intelligence and Computational Linguistics. It studies the problems of automated generation and understanding of natural human languages. A 'Natural Language' (NL) is any of the languages naturally used by humans. It is not an artificial or man- made language such as a programming language. 'Natural language processing' (NLP) is a convenient description for all attempts to use computers to process natural language. The goal of the Natural Language Processing (NLP) group is to design and build software that will analyze, understand, and generate languages that humans use naturally, so that eventually you will be able to address your computer as though you were addressing another person. The last 50 years of research in the field of Natural Language Processing is that, various kinds of knowledge about the language can be extracted through the help of constructing the formal models or theories. The tools of work in NLP are grammar formalisms, algorithms and data structures, formalism for representing world knowledge, reasoning mechanisms. Many of these have been taken from and inherit results from Computer Science, Artificial Intelligence, Linguistics, Logic, and Philosophy. -
Dravidian Letters in Tamil Grantha Script Some Notes in Their History of Use
Dravidian Letters in Tamil Grantha Script Some Notes in Their History of Use Dr. Naga Ganesan ([email protected]) 1.0 Introduction Earlier, in L2/11-011 with the title, Diacritic Marks for Short e & o Vowels (Dravidian and Vedic) in Devanagari (North India) and Grantha (South India), Vedic vowels, short e & o in Devanagari are recommended to be transcribed into Grantha in Unicode using the corresponding letters as in Govt. of India (GoI) proposal. The archaic Dravidian short /e/ & /o/ typically use a “dot” (puLLi) diacritic on long /e/ & /o/ vowels in south India, and this was originally proposed in the Grantha proposal after consultations with epigraphists and Tamil Grantha scholars. Reference: the Table of Dravidian letters in Grantha proposal by Dr. Swaran Lata, Govt. of India letter, pg. 8, L2/11-011 is given: This document is to add some further inputs about the history of Tamil written in Grantha script. Grantha Tamil or Tamil Grantha, by definition, can handle writing both Indo- Aryan (Sanskrit and its derived languages) and Dravidian such as Tamil language. Both Tamil script with explicit virama, and Tamil Grantha script with implicit virama grew up together in their history, that is the reason about 30 letters are very similar between the two scripts. For example, page 37, A. C. Burnell, Elements of South Indian Paleography, 1874. “The origin of this Tamil alphabet is apparent at first sight; it is a brahmanical adaptation of the Grantha letters corresponding to the old VaTTezuttu, from which, however, the last four signs (LLL, LL, RR and NNN) have been retained.” For the co-development of Tamil and Grantha scripts exchanging letter forms is dealt in B. -
Stotras in Sanskrit, Tamil, Malayalam and Krithis Addressed To
Stotras in Sanskrit, Tamil, Malayalam and Krithis addressed to Guruvayurappan This document has 10 sanskrit stotras, 5 Tamil prayers ,56 Malayalam prayers and 8 Krithis Table of Contents 1. Stotras in Sanskrit, Tamil, Malayalam and Krithis addressed to Guruvayurappan .......... 1 2. Guruvayu puresa Pancha Rathnam(sanskrit)........................................................................... 5 3. Sri Guruvatha pureesa stotram .................................................................................................. 7 4. Sampoorna roga nivarana stotra of Guryvayurappan ......................................................... 9 5. Solve all your problems by worshiping Lord Guruvayurappan ........................................ 13 6. Guruvayu puresa Bhujanga stotra ............................................................................................ 17 7. Guruvayupuresa Suprabatham ................................................................................................. 25 8. Gurvayurappan Suprabatham .................................................................................................. 31 9. Guruvayurappan Pratha Smaranam ...................................................................................... 37 10. Sri Guruvatha puresa Ashtotharam ........................................................................................ 39 11. Sampoorna roga nivarana stotra of Guryvayurappan ....................................................... 45 12. Srimad Narayaneeyamrutham(Tamil summary of Narayaneeyam) -
(RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9Th Floor
Registry Services Evaluation Policy (RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9th Floor, A-Wing Gopal Palace, NehruNagar Ahmedabad, Gujarat 380015 Request Details Case Number: 00874461 This service request should be used to submit a Registry Services Evaluation Policy (RSEP) request. An RSEP is required to add, modify or remove Registry Services for a TLD. More information about the process is available at https://www.icann.org/resources/pages/rsep-2014- 02-19-en Complete the information requested below. All answers marked with a red asterisk are required. Click the Save button to save your work and click the Submit button to submit to ICANN. PROPOSED SERVICE 1. Name of Proposed Service Removal of IDN Languages for .OOO 2. Technical description of Proposed Service. If additional information needs to be considered, attach one PDF file Infibeam Incorporation Limited (“infibeam”) the Registry Operator for the .OOO TLD, intends to change its Registry Service Provider for the .OOO TLD to CentralNic Limited. Accordingly, Infibeam seeks to remove the following IDN languages from Exhibit A of the .OOO New gTLD Registry Agreement: - Armenian script - Avestan script - Azerbaijani language - Balinese script - Bamum script - Batak script - Belarusian language - Bengali script - Bopomofo script - Brahmi script - Buginese script - Buhid script - Bulgarian language - Canadian Aboriginal script - Carian script - Cham script - Cherokee script - Coptic script - Croatian language - Cuneiform script - Devanagari script -
Proposal to Encode Characters for Extended Tamil §1. Introduction
Proposal to encode characters for Extended Tamil Shriramana Sharma, jamadagni-at-gmail-dot-com, India 2010-Jul-10 This document requests the encoding of characters which are required to enable the Tamil script to support the writing of Sanskrit. It repeats some relevant sections from my Grantha proposal L2/09-372 and my feedback document L2/10-085. It also contains some more information and citations and is a formal proposal for the encoding of those characters. §1. Introduction It is well known that the Tamil script has an insufficient character repertoire to represent the Sanskrit language. Sanskrit can be and is written and printed quite naturally in most other (major and some minor) Indian scripts, which have the required number of characters. However, it cannot be written in plain Tamil script without some contrived extensions acting in the capacity of diacritical marks, just as it cannot be written in the Latin script without the usage of diacritical marks (as prescribed by ISO 15919 or IAST). Another option is to import written forms from another script (to be specific, Grantha, Tamil’s closest Sanskrit-capable relative) to cover the remaining unsupported sounds. We call this version of Tamil that has been extended to support Sanskrit as Extended Tamil. It should be noted that this is neither a distinct script from Tamil nor can it be considered a mixture of two scripts (Tamil and Grantha), because the ‘grammar’ of the script (i.e. the orthographic rules, such as those of forming consonant clusters) is still that of Tamil. Thus Extended Tamil may be likened to the IPA extensions to the Latin script to denote sounds that the Latin script does not natively denote or differentiate. -
Krishnadeva, a Fresh Transliteration Scheme
KRISHNADHEVA SCRIPT – Transliteration Scheme for Indian Languages to English N. Krishnamurthy 1. INTRODUCTION For the thousands of years that Devanagari has existed, millions have used it, mainly to chant Sanskrit (or other Devanagari-script based) prayers and scriptures. Devanagari is the script, and it is used for many languages such as Sanskrit, Hindi, and Marathi. Many do not know the Devanagari script, but have access to its transliterations into a language they know. Of these, a number of South Indian languages, such as Kannada, Malayalam and Telugu, based on Devanagari, have very similar alphabet sets with almost identical sounds. To them the transliterated versions would have been faithful reproductions of the original. However for those who did not know such Devanagari-based languages, or who did not have the transliterations available, the next recourse was to use the English language script. The author of this paper learnt Sanskrit at home from a priest as a boy, and had Sanskrit as second language in his undergraduate days. He also studied some Hindu scriptures in the Devanagari script, and formally studied a small portion of the Veda-s through a Guru. Problems with Devanagari-related languages: Many adjustments and accommodations have been made to reflect the sounds of Devanagari, such as the following (see Comparison Table at end): . The International Alphabet of Sanskrit Transliteration (IAST) includes the following conventions: "i", "u", etc. with a bar above it, to represent the long "i", "u", etc. "t", "d", and "s" with a dot underneath each, to represent the sound "t", "d", and "sh", while without the dot, they would represent, "th", "dh" and "s". -
General Historical and Analytical / Writing Systems: Recent Script
9 Writing systems Edited by Elena Bashir 9,1. Introduction By Elena Bashir The relations between spoken language and the visual symbols (graphemes) used to represent it are complex. Orthographies can be thought of as situated on a con- tinuum from “deep” — systems in which there is not a one-to-one correspondence between the sounds of the language and its graphemes — to “shallow” — systems in which the relationship between sounds and graphemes is regular and trans- parent (see Roberts & Joyce 2012 for a recent discussion). In orthographies for Indo-Aryan and Iranian languages based on the Arabic script and writing system, the retention of historical spellings for words of Arabic or Persian origin increases the orthographic depth of these systems. Decisions on how to write a language always carry historical, cultural, and political meaning. Debates about orthography usually focus on such issues rather than on linguistic analysis; this can be seen in Pakistan, for example, in discussions regarding orthography for Kalasha, Wakhi, or Balti, and in Afghanistan regarding Wakhi or Pashai. Questions of orthography are intertwined with language ideology, language planning activities, and goals like literacy or standardization. Woolard 1998, Brandt 2014, and Sebba 2007 are valuable treatments of such issues. In Section 9.2, Stefan Baums discusses the historical development and general characteristics of the (non Perso-Arabic) writing systems used for South Asian languages, and his Section 9.3 deals with recent research on alphasyllabic writing systems, script-related literacy and language-learning studies, representation of South Asian languages in Unicode, and recent debates about the Indus Valley inscriptions. -
Phonetic Characters in Tamil
29 Phonetic Characters in Tamil P.Chellappan Palaniappan Bros., 14, Peters Road, Chennai - 600014 <Email: [email protected]> ___________________________________________________________________________ INTRODUCTION Tamil is a very ancient language with a rich heritage and literature. Over the centuries some changes have been made to the script. These changes consist of not only modification of existing glyphs of the Tamil characters but also introduction of new characters like the Grantha characters. The introduction of Grantha characters was done so that many of the Sanskrit words adopted and used in Tamil could be written in the Tamil script. For any language to survive it should be flexible and willing to adopt words from other languages as and when necessary. A primary example of such a language is the English language which has adopted words from many languages. By doing this the English language has only enriched itself. It has not lost its identity as some may fear. The Tamil script is essentially phonetic / syllable based and unlike the English language it has structured pronunciation rules. When we adopt words from other languages we should be in a position to pronounce them correctly. It is with this perspective in mind , this paper presents the concept of Phonetic characters in Tamil. This paper also deals with the problem of Spoken Tamil versus Written Tamil. NEED We are at a point of time in history when technology has reduced the once 'Huge World' to 'A Global Village'. Communication and Computer technology has made instant communication between peoples of the world possible. International Commerce and travel is the order of the day.