Recognition of Ancient Tamil Characters in Stone Inscription Using Improved Feature Extraction M
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  Intelligence System for Tamil VattezhuttuopticalMr R.Vinoth et al. / International Journal of Computer Science & Engineering Technology (IJCSET) INTELLIGENCE SYSTEM FOR TAMIL VATTEZHUTTUOPTICAL CHARACTER RCOGNITION Mr R.Vinoth Assistant Professor, Department of Information Technology Agni college of Technology, Chennai, India [email protected] Rajesh R. UG Student, Department of Information Technology Agni college of Technology, Chennai, India [email protected] Yoganandhan P. UG Student, Department of Information Technology Agni college of Technology, Chennai, India [email protected] Abstract--A system that involves character recognition and information retrieval of Palm Leaf Manuscript. The conversion of ancient Tamil to the present Tamil digital text format. Various algorithms were used to find the OCR for different languages, Ancient letter conversion still possess a big challenge. Because Image recognition technology has reached near-perfection when it comes to scanning Tamil text. The proposed system overcomes such a situation by converting all the palm manuscripts into Tamil digital text format. Though the Tamil scripts are difficult to understand. We are using this approach to solve the existing problems and convert it to Tamil digital text. Keyword - Vatteluttu Tamil (VT); Data set; Character recognition; Neural Network. I. INTRODUCTION Tamil language is one of the longest surviving classical languages in the world. Tamilnadu is a place, where the Palm Leaf Manuscript has been preserved. There are some difficulties to preserve the Palm Leaf Manuscript. So, we need to preserve the Palm Leaf Manuscript by converting to the form of digital text format. Computers and Smart devices are used by mostof them now a day. So, this system helps to convert and preserve in a fine manner.
- 
												  Elements of South-Indian Palaeography, from the Fourth ToThis is a reproduction of a library book that was digitized by Google as part of an ongoing effort to preserve the information in books and make it universally accessible. https://books.google.com ELEMENTS SOUTH-INDIAN PALfi3&BAPBY FROM THE FOURTH TO THE SEVENTEENTH CENTURY A. D. BEIN1 AN INTRODUCTION TO ?TIK STUDY OF SOUTH-INDIAN INSCRIPTIONS AND MSS. BY A. C. BURNELL HON'. PH. O. OF TUE UNIVERSITY M. K. A, ri'VORE PIS I. A SOClfcTE MANGALORE \ BASEL MISSION BOOK & TRACT DEPOSITORY ft !<3 1874 19 Vi? TRUBNER & Co. 57 & 69 LUDOATE HILL' . ' \jj *£=ggs3|fg r DISTRIBUTION of S INDIAN alphabets up to 1550 a d. ELEMENTS OF SOUTH-INDIAN PALEOGRAPHY FROM THE FOURTH TO THE SEVENTEENTH CENTURY A. D. BEING AN INTRODUCTION TO THE STUDY OF SOUTH-INDIAN INSCRIPTIONS AND MSS. BY A. p. j^URNELL HON. PH. D. OF THE UNIVERSITY OF STRASSBUB.G; M. R. A. S.; MEMBKE DE LA S0CIETE ASIATIQUE, ETC. ETC. MANGALORE PRINTED BY STOLZ & HIRNER, BASEL MISSION PRESS 1874 LONDON TRtlBNER & Co. 57 & 59 LUDGATE HILL 3« w i d m « t als ^'ctdjcn kr §anltekcit fiir Mc i|jm bdic<jcnc JJoctorMvk ttcsc fetlings^kit auf rincm fejjcr mtfrckntcn Jfclk bet 1®4 INTRODUCTION. I trust that this elementary Sketch of South-Indian Palaeography may supply a want long felt by those who are desirous of investigating the real history of the peninsula of India. Trom the beginning of this century (when Buchanan executed the only archaeological survey that has ever been done in even a part of the South of India) up to the present time, a number of well meaning persons have gone about with much simplicity and faith collecting a mass of rubbish which they term traditions and accept as history.
- 
												  General Historical and Analytical / Writing Systems: Recent Script9 Writing systems Edited by Elena Bashir 9,1. Introduction By Elena Bashir The relations between spoken language and the visual symbols (graphemes) used to represent it are complex. Orthographies can be thought of as situated on a con- tinuum from “deep” — systems in which there is not a one-to-one correspondence between the sounds of the language and its graphemes — to “shallow” — systems in which the relationship between sounds and graphemes is regular and trans- parent (see Roberts & Joyce 2012 for a recent discussion). In orthographies for Indo-Aryan and Iranian languages based on the Arabic script and writing system, the retention of historical spellings for words of Arabic or Persian origin increases the orthographic depth of these systems. Decisions on how to write a language always carry historical, cultural, and political meaning. Debates about orthography usually focus on such issues rather than on linguistic analysis; this can be seen in Pakistan, for example, in discussions regarding orthography for Kalasha, Wakhi, or Balti, and in Afghanistan regarding Wakhi or Pashai. Questions of orthography are intertwined with language ideology, language planning activities, and goals like literacy or standardization. Woolard 1998, Brandt 2014, and Sebba 2007 are valuable treatments of such issues. In Section 9.2, Stefan Baums discusses the historical development and general characteristics of the (non Perso-Arabic) writing systems used for South Asian languages, and his Section 9.3 deals with recent research on alphasyllabic writing systems, script-related literacy and language-learning studies, representation of South Asian languages in Unicode, and recent debates about the Indus Valley inscriptions.
- 
												  Proposal for a Malayalam Script Root Zone Label Generation Ruleset (LGR)Proposal for a Malayalam Script Root Zone Label Generation Ruleset (LGR) LGR Version: 4.0 Date: 2020-05-07 Document version: 2.4 Authors: Neo-Brahmi Generation Panel [NBGP] 1. General Information The purpose of this document is to give an overview of the proposed Malayalam LGR in the XML format and the rationale behind the design decisions taken. It includes a discussion of relevant features of the script, the communities or languages using it, the process and methodology used, the repertoire of code points included, variant code point(s), whole label evaluation rules and information on the contributors. The formal specification of the LGR can be found in the accompanying XML document: proposal-malayalam-lgr-07may20-en.xml. Labels for testing can be found in the accompanying text document: malayalam-test-labels-07may20-en.txt This LGR proposal was originally published on April 22, 2019. It has been updated to correct an inconsistency involving the support for conjunct “nta” and to address new cross-script variants for LGR-4. 2. Script for Which the LGR Is Proposed ISO 15924 Code: Mlym ISO 15924 Key N°: 347 ISO 15924 English Name: Malayalam Latin transliteration of native script name: malayāḷaṁ Native name of the script: മലയാളം Maximal Starting Repertoire (MSR) version: MSR-4 3. Background on Script and Principal Languages Using It Malayalam is a Dravidian language with about 38 million speakers spoken mainly in the south west of India, particularly in Kerala, the Lakshadweep Islands and neighbouring states, and also in Bahrain, Fiji, Israel, Malaysia, Qatar, Singapore, UAE and the UK.
- 
												  List of Ancient Indian Scripts - GK Notes for SSC & BankingList of Ancient Indian Scripts - GK Notes for SSC & Banking India is known for its culture, heritage, and history. One of the treasures of India is its ancient scripts. There are a huge number of scripts in India, many of these scripts have not been deciphered yet. Most Indian languages are written in Brahmi-derived scripts, such as Devanagari, Tamil, Telugu, Kannada, Odia etc. Read this article to know about a List of Ancient Indian Scripts. Main Families of Indian Scripts All Indian scripts are known to be derived from Brahmi. There are three main families of Indian scripts: 1. Devanagari - This script is the basis of the languages of northern India and western India: Hindi, Gujarati, Bengali, Marathi, Panjabi, etc. 1 | P a g e 2. Dravidian - This script is the basis of Telugu, Kannada. 3. Grantha - This script is a subsection of the Dravidian languages such as Tamil and Malayalam, although it is not as important as the other two. Ancient Indian Scripts Given below are five ancient Indian scripts that have proven to be very important in the history of India - 1. Indus Script The Indus script is also known as the Harappan script. It uses a combination of symbols which were developed during the Indus Valley Civilization which started around 3500 and ended in 1900 BC. Although this script has not been deciphered yet, many have argued that this script is a predecessor to the Brahmi Script. 2. Brahmi Script Brahmi is a modern name for one of the oldest scripts in the Indian as well as Central Asian writing system.
- 
												  Request to Change Pulli Representation in the ProposedRequest to Change Pulli Representation in the Proposed Vatteluttu Encoding Author: Cibu C Johny ([email protected]) Date: 2021 Feb Action: For consideration by Script Ad Hoc and UTC Abstract The document L2/16-068 proposes to encode Vatteluttu characters as per the Indic model for modern scripts like Tamil. Even though we can appreciate the potential developer conveniences in adopting such an encoding model, modern Indic is not the most intuitive model for certain characters like p uḷḷi . In this document, I request that the p uḷḷi may follow an encoding model where it could be applied to both consonants, vowels and vowel signs, acting as a vowel reducer. Puḷḷi in Vatteluttu The p uḷḷi diacritic appears as a combining dot at the top or top-right corner of a base character. The base can be certain independent vowels, consonants or <consonant, vowel sign> clusters. Fig 1: P uḷḷi on top of KA and K-O in Velvikudi plates (769CE)1 Puḷḷi diacritic, when used, explicitly indicates vowel reduction and this function is analogous to that of  ( ˘ ) in Latin. When  is placed on a vowel sign or on a <consonant, vowel breve puḷḷi  sign> cluster, it explicitly marks the corresponding vowel as a short one. When placed on a consonant, its default vowel is shortened or removed. All are vowel reduction mechanisms. In Fig 1, p uḷḷi represents both vowel removal and reduction in the K-K-O conjunct. The first puḷḷi removes the default vowel /a/ from KA. The second p uḷḷi indicates that the Dependent Vowel Sign O on KA should be reduced.
- 
												  Internationalization and MathInternationalization and Math Test collection Made by ckepper • English • 2 articles • 156 pages Contents Internationalization 1. Arabic alphabet . 3 2. Bengali alphabet . 27 3. Chinese script styles . 47 4. Hebrew language . 54 5. Iotation . 76 6. Malayalam . 80 Math Formulas 7. Maxwell's equations . 102 8. Schrödinger equation . 122 Appendix 9. Article ourS ces and Contributors . 152 10. Image Sources, Licenses and Contributors . 154 Internationalization Arabic alphabet Arabic Alphabet Type Abjad Languages Arabic Time peri- 356 AD to the present od Egyptian • Proto-Sinaitic ◦ Phoenician Parent ▪ Aramaic systems ▪ Syriac ▪ Nabataean ▪ Arabic Al- phabet Arabic alphabet | Article 1 fo 2 3 َْ Direction Right-to-left األ ْب َج ِد َّية :The Arabic alphabet (Arabic ا ْل ُح ُروف al-ʾabjadīyah al-ʿarabīyah, or ا ْل َع َربِ َّية ISO ْ al-ḥurūf al-ʿarabīyah) or Arabic Arab, 160 ال َع َربِ َّية 15924 abjad is the Arabic script as it is codi- Unicode fied for writing Arabic. It is written Arabic alias from right to left in a cursive style and includes 28 letters. Most letters have • U+0600–U+06FF contextual letterforms. Arabic • U+0750–U+077F Originally, the alphabet was an abjad, Arabic Supplement with only consonants, but it is now con- • U+08A0–U+08FF sidered an "impure abjad". As with other Arabic Extended-A abjads, such as the Hebrew alphabet, • U+FB50–U+FDFF scribes later devised means of indicating Unicode Arabic Presentation vowel sounds by separate vowel diacrit- range Forms-A ics. • U+FE70–U+FEFF Arabic Presentation Consonants Forms-B • U+1EE00–U+1EEFF The basic Arabic alphabet contains 28 Arabic Mathematical letters.
- 
												  Proposal for a Malayalam Script Root Zone Label Generation Ruleset (LGR)Proposal for a Malayalam Script Root Zone Label Generation Ruleset (LGR) LGR Version: 3.0 Date: 2019-04-22 Document version: 2.1 Authors: Neo-Brahmi Generation Panel [NBGP] 1. General Information The purpose of this document is to give an overview of the proposed Malayalam LGR in the XML format and the rationale behind the design decisions taken. It includes a discussion of relevant features of the script, the communities or languages using it, the process and methodology used, the repertoire of code points included, variant code point(s), whole label evaluation rules and information on the contributors. The formal specification of the LGR can be found in the accompanying XML document: proposal-malayalam-lgr-22apr19-en.xml. Labels for testing can be found in the accompanying text document: malayalam-test-labels-22apr19-en.txt 2. Script for Which the LGR Is Proposed ISO 15924 Code: Mlym ISO 15924 Key N°: 347 ISO 15924 English Name: Malayalam Latin transliteration of native script name: malayāḷaṁ Native name of the script: മലയാളം Maximal Starting Repertoire (MSR) version: MSR-4 3. Background on Script and Principal Languages Using It Malayalam is a Dravidian language with about 38 million speakers spoken mainly in the south west of India, particularly in Kerala, the Lakshadweep Islands and neighbouring states, and also in Bahrain, Fiji, Israel, Malaysia, Qatar, Singapore, UAE and the UK. Malayalam was first written with the Vatteluttu alphabet (വെ)ഴു,് Vaṭṭeḻuttŭ), which means 'round writing' and developed from the Brahmi script. The oldest known written text in Malayalam is known as the Vazhappalli or Vazhappally inscription, is in the Vatteluttu alphabet and dates from about 830 AD.
- 
												  Preliminary Proposal to Encode Vatteluttu in UnicodeL2/16-068 2016-03-13 Preliminary proposal to encode Vatteluttu in Unicode Anshuman Pandey [email protected] March 13, 2016 1 Introduction This is a preliminary proposal to encode the Vatteluttu script in Unicode. Its purpose is to present a tentative encoding model for the script. Research is ongoing and a formal proposal is forthcoming, which will contain background information and additional details on the script. Feedback is sought from experts. The code points used here are arbitrary and will likely change once the script is allocated to the Unicode Roadmap. 2 Script Details 2.1 Structure The general structure of Vatteluttu is similar to that of Tamil. Each consonant letter possesses an inherent vowel. Independent vowels are written using letters. Non-initial forms of vowels are expressed using com- bining signs. Vowel signs attach to the left, right, or below the base. Some signs consist of two parts, which are written to the left and right of the base. Historically, the script did not possess a virāma or puḷḷi for si- lencing the inherent vowel, but such a sign was introduced at a later time. Consonant clusters are represented linearly. In later sources, each bare consonant in a cluster is marked by placing the virāma above the letter. 2.2 Character repertoire and representative glyphs The proposed repertoire is based upon characters that occur in inscriptions and character inventories pub- lished in secondary sources. The representative glyphs are normalizations of forms given in these sources and have been designed by the proposal author. The glyphs are intended to illustrative, not typographically aesthetic.
- 
												  Proposal for a Malayalam Script Root Zone Label Generation Ruleset (LGR)Proposal for a Malayalam Script Root Zone Label Generation Ruleset (LGR) LGR Version: 3.0 Date: 2018-09-25 Document version: 1.7 Authors: Neo-Brahmi Generation Panel [NBGP] 1. General Information The purpose of this document is to give an overview of the proposed Malayalam LGR in the XML format and the rationale behind the design decisions taken. It includes a discussion of relevant features of the script, the communities or languages using it, the process and methodology used, the repertoire of code points included, variant code point(s), whole label evaluation rules and information on the contributors. The formal specification of the LGR can be found in the accompanying XML document: proposal- malayalam-lgr-25sep18-en.xml. Labels for testing can be found in the accompanying text document: test-labels-malayalam-25sep18-en.txt 2. Script for Which the LGR Is Proposed ISO 15924 Code: Mlym ISO 15924 Key N°: 347 ISO 15924 English Name: Malayalam Latin transliteration of native script name: malayāḷaṁ Native name of the script: മലയാളം Maximal Starting Repertoire (MSR) version: MSR-3 3. Background on Script and Principal Languages Using It Malayalam is a Dravidian language with about 38 million speakers spoken mainly in the south west of India, particularly in Kerala, the Lakshadweep Islands and neighbouring states, and also in Bahrain, Fiji, Israel, Malaysia, Qatar, Singapore, UAE and the UK. Malayalam was first written with the Vatteluttu alphabet (വെഴു,് Vaṭṭeḻuttŭ), which means 'round writing' and developed from the Brahmi script. The oldest known written text in Malayalam is known as the Vazhappalli or Vazhappally inscription, is in the Vatteluttu alphabet and dates from about 830 AD.
- 
												  Development of Scripts in India – a Study© 2020 JETIR April 2020, Volume 7, Issue 4 www.jetir.org (ISSN-2349-5162) Development of scripts in India – A Study *Dr. Rachoti Devaru. Faculty of Sanskrit, University of Mysore, Manasa Gangotri, Mysore. Abstract The present paper intends to explore the applied growth of the evolution of Indian scripts, and also the nature of the development. The linguistic landscape of the subcontinent changed dramatically during the 2nd millennium BCE, so that is is impossible to determine if there is a connection between the IVC script and the next clearly attested script in India, the Brahmi script found in the inscriptions of the Mauryan Emperor Ashoka (ruled 268-232 BCE), especially since they probably represented vastly different, unrelated languages. The sudden appearance of the Brahmi writing system is one of the great mysteries of writing in India, as there is no evidence of inscriptions beforehand. Another script, the (extinct, childless) Kharosthi of northwest Pakistan and Afghanistan seems to be clearly derived from the imperial Aramaic script used by the Persians who ruled over parts of the Indus Valley for two centuries until the arrival of Alexander the Great. It is unclear if the fully developed Brahmi script was invented by the Mauryan Empire as a result of exposure to Aramaic, but this seems unlikely, particularly since there were advanced states in the Ganges valley and a corpus of Vedic literature dating from before the Mauryan period. It is more likely that pre-Mauryan inscriptions may still be discovered, and in fact, some Brahmi inscriptions have been found in Tamil Nadu and Sri Lanka dating to the 6th century BCE.
- 
												  A NOTE on the POSSIBLE RELATIONSHIP of KING RAMA KHAMHAENG's SUKHODAYA SCRIPT of THAILAND to the GRANTHA SCRIPT of SOUTH INDIA by SA NOTE ON THE POSSIBLE RELATIONSHIP OF KING RAMA KHAMHAENG'S SUKHODAYA SCRIPT OF THAILAND TO THE GRANTHA SCRIPT OF SOUTH INDIA by S. Singaravelu University of Malaya The discussion in this paper is based entirely on materials that have already been published in one form or another (such as volumes of inscriptions, grammatical works, general historical works, and learned papers in journals concerning some aspects of South Indian scripts and the Thai scripts). I In other words, the evidence to be cited in this paper does not proceed from any new epigraphical docu ment, whether of South India or of Southeast Asia, which might have remained unpublished or been unknown. Let me now outline briefly, by way of an introduction, some of the important facts concerning the Grantba script of South India of relevance to the subject-matter of this paper. The Grantha Script of South India A.C. Burnell believed that the Grantha script of South India had its origin in the 'Chera Character', so called because it was first used in the Chera kingdom of South India in the early centuries A.D.; and he thought that the Chera character was a variety of 'cave charac ter'. He was also of the opinion that the Grantha character in the early stages of its development was of two main varieties: one that 1) The Select Bibliography, which is given at the end of this paper, is in three sections: In Section (a) are listed general works and papers which have reference to the matter under discussion; Section (b) includes books and papers which are of relevance to the scripts of South India and also scripts of South Indian origin found in inscriptions discovered in various parts of mainland Southeast Asia; Section (c) is made up of volumes of Khmer and Thai inscriptions together with some papers dealing with the alphabets and .