Proposal for a Tamil Script Root Zone Label Generation Rule-Set (LGR)
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Some Principles of the Use of Macro-Areas Language Dynamics &A
Online Appendix for Harald Hammarström & Mark Donohue (2014) Some Principles of the Use of Macro-Areas Language Dynamics & Change Harald Hammarström & Mark Donohue The following document lists the languages of the world and their assignment to the macro-areas described in the main body of the paper as well as the WALS macro-area for languages featured in the WALS 2005 edition. 7160 languages are included, which represent all languages for which we had coordinates available [1]. Every language is given with its ISO-639-3 code (if it has one) for proper identification. The mapping between WALS languages and ISO-codes was done by using the mapping downloadable from the 2011 online WALS edition [2] (because a number of errors in the mapping were corrected for the 2011 edition). 38 WALS languages are not given an ISO-code in the 2011 mapping; 36 of these have been assigned their appropriate ISO-code based on the sources the WALS lists for the respective language. This was not possible for Tasmanian (WALS-code: tsm) because the WALS mixes data from very different Tasmanian languages and for Kualan (WALS-code: kua) because no source is given. 17 WALS-languages were assigned ISO-codes which have subsequently been retired; these have been assigned their appropriate updated ISO-code. In many cases, a WALS-language is mapped to several ISO-codes. As this has no bearing for the assignment to macro-areas, multiple mappings have been retained. [1] There are another couple of hundred languages which are attested but for which our database currently lacks coordinates. -
“Lost in Translation”: a Study of the History of Sri Lankan Literature
Karunakaran / Lost in Translation “Lost in Translation”: A Study of the History of Sri Lankan Literature Shamila Karunakaran Abstract This paper provides an overview of the history of Sri Lankan literature from the ancient texts of the precolonial era to the English translations of postcolonial literature in the modern era. Sri Lanka’s book history is a cultural record of texts that contains “cultural heritage and incorporates everything that has survived” (Chodorow, 2006); however, Tamil language works are written with specific words, ideas, and concepts that are unique to Sri Lankan culture and are “lost in translation” when conveyed in English. Keywords book history, translation iJournal - Journal Vol. 4 No. 1, Fall 2018 INTRODUCTION The phrase “lost in translation” refers to when the translation of a word or phrase does not convey its true or complete meaning due to various factors. This is a common problem when translating non-Western texts for North American and British readership, especially those written in non-Roman scripts. Literature and texts are tangible symbols, containing signified cultural meaning, and they represent varying aspects of an existing international ethnic, social, or linguistic culture or group. Chodorow (2006) likens it to a cultural record of sorts, which he defines as an object that “contains cultural heritage and incorporates everything that has survived” (pg. 373). In particular, those written in South Asian indigenous languages such as Tamil, Sanskrit, Urdu, Sinhalese are written with specific words, ideas, and concepts that are unique to specific culture[s] and cannot be properly conveyed in English translations. -
Honorificity, Indexicality and Their Interaction in Magahi
SPEAKER AND ADDRESSEE IN NATURAL LANGUAGE: HONORIFICITY, INDEXICALITY AND THEIR INTERACTION IN MAGAHI BY DEEPAK ALOK A dissertation submitted to the School of Graduate Studies Rutgers, The State University of New Jersey In partial fulfillment of the requirements For the degree of Doctor of Philosophy Graduate Program in Linguistics Written under the direction of Mark Baker and Veneeta Dayal and approved by New Brunswick, New Jersey October, 2020 ABSTRACT OF THE DISSERTATION Speaker and Addressee in Natural Language: Honorificity, Indexicality and their Interaction in Magahi By Deepak Alok Dissertation Director: Mark Baker and Veneeta Dayal Natural language uses first and second person pronouns to refer to the speaker and addressee. This dissertation takes as its starting point the view that speaker and addressee are also implicated in sentences that do not have such pronouns (Speas and Tenny 2003). It investigates two linguistic phenomena, honorification and indexical shift, and the interactions between them, and shows that these discourse participants have an important role to play. The investigation is based on Magahi, an Eastern Indo-Aryan language spoken mainly in the state of Bihar (India), where these phenomena manifest themselves in ways not previously attested in the literature. The phenomena are analyzed based on the native speaker judgements of the author along with judgements of one more native speaker, and sometimes with others as the occasion has presented itself. Magahi shows a rich honorification system (the encoding of “social status” in grammar) along several interrelated dimensions. Not only 2nd person pronouns but 3rd person pronouns also morphologically mark the honorificity of the referent with respect to the speaker. -
Ajit Kumar Baishya Email-ID
Faculty profile Name: Ajit Kumar Baishya Email-ID: [email protected] Tel. 09435566247 Designation: Professor. Specialization: (a) Sociolinguistics with special reference to Pidgin and Creole Studies (b) Endangered and Minority Languages Present Research Interest: Lesser known languages of Assam, Lingua francas of North-East India. Publications: Book (edited): Bilingualism and North East India, published by The Registrar, Assam University, Silchar, June 2008. Articles: 1. “The Making of Nagamese: A historical Perspective” Journal of Assam University, Silchar, January 2006. 2. “Language Maintenance by the Dimasas of Barak Valley: A Case Study” in Indian Linguistics, Vol. 67, 2006. 3. “Borrowing in Rabha: A Few Observations” in International Journal of Dravidian Linguistics, June 2006. 4. “Word Formation in Contemporary Assamese” in Journal of Assam University, Silchar, January 2007. 5. “Relexification in Nagamese: An Observation” in International Journal of Dravidian Linguistics, June 2007. 6. “Case Markers in Sylheti” in Indian Linguistics, Vol. 68, 2007. 7. “Reduplication in Modern Assamese” in Journal of Assam University, Silchar, January 2008. 8. “Word formation in Dimasa” in International Journal of Dravidian Linguistics, January 2008. 9. “Assamese: An SOV Language” in Indian Linguistics, Vol. 69, 2008. 10. “Sadri: The Lingua Franca” in The Humanities in the Present Context eds by Ramanan Mohan, P. Mohanty, T. Mukherjee, Allied Publishers, Hyderabad, 2009. 11. “Phonology of English Loan Words in Assamese” in Journal of Assam University, Silchar, January 2010. 12. “The Development of Script for Nagamese” in Manuscript and Manuscriptology in India eds by Nandi, S. G. and P. Palit, Kaveri Books, New Delhi, 2010. 13. “Problems of Teaching Assamese as a Second Language” in Literature, Culture and Language Education eds. -
Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress
1 Assessment of Options for Handling Full Unicode Character Encodings in MARC21 A Study for the Library of Congress Part 1: New Scripts Jack Cain Senior Consultant Trylus Computing, Toronto 1 Purpose This assessment intends to study the issues and make recommendations on the possible expansion of the character set repertoire for bibliographic records in MARC21 format. 1.1 “Encoding Scheme” vs. “Repertoire” An encoding scheme contains codes by which characters are represented in computer memory. These codes are organized according to a certain methodology called an encoding scheme. The list of all characters so encoded is referred to as the “repertoire” of characters in the given encoding schemes. For example, ASCII is one encoding scheme, perhaps the one best known to the average non-technical person in North America. “A”, “B”, & “C” are three characters in the repertoire of this encoding scheme. These three characters are assigned encodings 41, 42 & 43 in ASCII (expressed here in hexadecimal). 1.2 MARC8 "MARC8" is the term commonly used to refer both to the encoding scheme and its repertoire as used in MARC records up to 1998. The ‘8’ refers to the fact that, unlike Unicode which is a multi-byte per character code set, the MARC8 encoding scheme is principally made up of multiple one byte tables in which each character is encoded using a single 8 bit byte. (It also includes the EACC set which actually uses fixed length 3 bytes per character.) (For details on MARC8 and its specifications see: http://www.loc.gov/marc/.) MARC8 was introduced around 1968 and was initially limited to essentially Latin script only. -
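The distinction drawn in the excerpt above between a repertoire and an encoding scheme can be illustrated with a minimal Python sketch. The helper name `ascii_codes` and the choice of TAMIL LETTER A (U+0B85) as a non-Latin example are assumptions made for illustration; they do not come from the study itself.

```python
# Repertoire: the list of characters. Encoding scheme: the codes assigned to them.
repertoire = ["A", "B", "C"]


def ascii_codes(chars):
    """Return each character's ASCII code as two-digit uppercase hexadecimal."""
    return {ch: format(ch.encode("ascii")[0], "02X") for ch in chars}


codes = ascii_codes(repertoire)
print(codes)  # the excerpt gives 41, 42 and 43 for A, B and C

# Illustrative assumption: a character outside the ASCII repertoire.
# In the UTF-8 encoding form of Unicode it occupies more than one byte.
tamil_a = "\u0B85"  # TAMIL LETTER A
utf8_len = len(tamil_a.encode("utf-8"))
print(utf8_len)
```

The same character can have different byte lengths under different encoding schemes, which is why the excerpt treats the scheme and the repertoire as separate questions.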
2001 Presented Below Is an Alphabetical Abstract of Languages A
Census 2001
STATEMENT 1: ABSTRACT OF SPEAKERS' STRENGTH OF LANGUAGES AND MOTHER TONGUES - 2001
Presented below is an alphabetical abstract of languages and the mother tongues with speakers' strength of 10,000 and above at the all India level, grouped under each language. There are a total of 122 languages and 234 mother tongues. The 22 languages
PART A - Languages specified in the Eighth Schedule (Scheduled Languages)
Column 1: Name of language and mother tongue(s) grouped under each language
Column 2: Number of persons who returned the language (and the mother tongues grouped under each) as their mother tongue
1 ASSAMESE              13,168,484
    1 Assamese          12,778,735
    Others                 389,749
2 BENGALI               83,369,769
    1 Bengali           82,462,437
    2 Chakma               176,458
    3 Haijong/Hajong        63,188
    4 Rajbangsi             82,570
    Others                 585,116
3 BODO                   1,350,478
    1 Bodo/Boro          1,330,775
    Others                  19,703
4 DOGRI                  2,282,589
Entries 13 to 28 from the second column of the source page (the heading of their language group falls outside this excerpt):
    13 Dhundhari         1,871,130
    14 Garhwali          2,267,314
    15 Gojri               762,332
    16 Harauti           2,462,867
    17 Haryanvi          7,997,192
    18 Hindi           257,919,635
    19 Jaunsari            114,733
    20 Kangri            1,122,843
    21 Khairari             11,937
    22 Khari Boli           47,730
    23 Khortha/Khotta    4,725,927
    24 Kulvi               170,770
    25 Kumauni           2,003,783
    26 Kurmali Thar        425,920
    27 Labani               22,162
    28 Lamani/Lambadi    2,707,562
-
GRAMMAR of OLD TAMIL for STUDENTS 1 St Edition Eva Wilden
GRAMMAR OF OLD TAMIL FOR STUDENTS 1st Edition Eva Wilden To cite this version: Eva Wilden. GRAMMAR OF OLD TAMIL FOR STUDENTS 1st Edition. Eva Wilden. Institut français de Pondichéry; École française d’Extrême-Orient, 137, 2018, Collection Indologie. halshs-01892342v2 HAL Id: halshs-01892342 https://halshs.archives-ouvertes.fr/halshs-01892342v2 Submitted on 24 Jan 2020 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. The Institut Français de Pondichéry (IFP), UMIFRE 21 CNRS-MAE, is a financially autonomous institution under the joint supervision of the Ministry of Foreign Affairs (MAE) and the Centre National de la Recherche Scientifique (CNRS). It is an integral part of that Ministry's network of 27 research centres. Together with the Centre de Sciences Humaines (CSH) in New Delhi, it forms the CNRS unit USR 3330 « Savoirs et Mondes Indiens ». It carries out research, expert advisory and training missions in the humanities and social sciences and in ecology in South and Southeast Asia. It is particularly interested in Indian knowledge and cultural heritage (Sanskrit language and literature, history of religions, Tamil studies…), in contemporary social dynamics, and in the natural ecosystems of South India. -
Proposal to Encode Tamil Fractions and Symbols §1. Thanks §2
Proposal to encode Tamil fractions and symbols Shriramana Sharma, jamadagni-at-gmail-dot-com, India 2012-Jul-17 §1. Thanks I owe much to everyone who helped with this proposal. G Balachandran of Sri Lanka heavily contributed to this proposal by sharing the attestations and data which he had collected over five years back in researching this very same topic. Dr Jean-Luc Chevillard of France and Dr Elmar Kniprath of Germany provided the second majority of attestations. Dr Kalyanasundaram (Switzerland), Dr Jayabarathi (Malaysia), K Ramanraj (Chennai), Vijayaraghavan Vanbakkam (Germany), Dr Rajam (US), Dr Vijayavenugopal (Pondicherry), Mani Manivannan (Chennai/US) and A E Elangovan (Chennai) also helped with attestations. Various members of the CTamil list also contributed to related discussions. Vinodh Rajan of Chennai is my personal sounding board for all my proposals and he contributes in too many ways for me to specify. Deborah Anderson continuously helped out with encouragement and enthusiasm-bolstering :-). I express my sincere thanks to all these people and to anyone else not mentioned (my apologies!). Preparing this has been a lengthy journey, and you all helped me through! §2. Introduction Compared to other Indic scripts/regions, the Tamil-speaking region has employed a larger set of symbols for fractions and abbreviations. These symbols, especially the fractions, have been referred to in various documents already submitted to the UTC, including my Grantha proposal L2/09-372 (p 6). A comprehensive proposal has been desirable for quite some time for the addition of characters to Unicode to enable the textual representation of these rare heritage written forms which are to be found in old Tamil manuscripts and books. -
Vestiges of Indus Civilisation in Old Tamil
Vestiges of Indus Civilisation in Old Tamil Iravatham Mahadevan Indus Research Centre, Roja Muthiah Research Library, Chennai Tamilnadu Endowment Lecture Tamilnadu History Congress 16th Annual Session, 9-11 October 2009 Tiruchirapalli, Tamilnadu Introduction 0.1 It is indeed a great privilege to be invited to deliver the prestigious Tamilnadu Endowment Lecture at the Annual Session of the Tamilnadu History Congress. I am grateful to the Executive and to the General Body of the Congress for the signal honour bestowed on me. I am not a historian. My discipline is epigraphy, in which I have specialised in the rather arcane fields of the Indus Script and Tamil-Brahmi inscriptions. It is rather unusual for an epigraphist to be asked to deliver the keynote address at a History Congress. I am all the more pleased at the recognition accorded to Epigraphy, which is, especially in the case of Tamilnadu, the foundation on which the edifice of history has been raised. 0.2 Let me also at the outset declare my interest. I have two personal reasons to accept the invitation, despite my advanced age and failing health. This session is being held at Tiruchirapalli, where I was born and brought up. I am happy to be back in my home town to participate in these proceedings. I am also eager to share with you some of my recent and still-not-fully-published findings relating to the interpretation of the Indus Script. My studies have gradually led me to the conclusion that the Indus Script is not merely Dravidian linguistically. -
Islamic Legal Cosmopolis and Its Arabic and Malay Microcosmoi
Languages of Law: Islamic Legal Cosmopolis and its Arabic and Malay Microcosmoi MAHMOOD KOORIA Abstract In premodern monsoon Asia, the legal worlds of major and minor traditions formed a cosmopolis of laws which expanded chronologically and geographically. Without necessarily replacing one another, they all coexisted in a larger domain with fluctuating influences over time and place. In this legal cosmopolis, each tradition had its own aggregation of diverse juridical, linguistic and contextual variants. In South and Southeast Asia, Islam has accordingly formed its own cosmopolis of law by incorporating a network of different juridical texts, institutions, jurists and scholars and by the meaningful use of these variants through shared vocabularies and languages. Focusing on the Shāfiʿī School of Islamic law and its major proponents in Malay and Arabic textual productions, this article argues that the intentional choice of a lingua franca contributed to the wider reception and longer sustainability of this particular legal school. The Arabic and Malay microcosmoi thus strengthened the larger cosmopolis of Islamic law through transregional and translinguistic exchanges across legal, cultural and continental borders. Keywords: Shāfiʿī School; Malay and Arabic textual productions. Introduction Law was one of the most important realms that set frameworks for transregional interactions among individuals, communities and institutions in the premodern world. It synchronised structures of exchange through commonly agreed practices and expectations [1]. The written legal treatises, along with unwritten customs, norms and rituals, prescribed ideal forms of social, cultural, economic and political relations to maintain order, peace and affinity in the societies in which they operated. These written and unwritten laws influenced each other in the long run both internally and externally through spatial and temporal concurrences. -
Minority Languages in India
Thomas Benedikter Minority Languages in India An appraisal of the linguistic rights of minorities in India ---------------------------- EURASIA-Net Europe-South Asia Exchange on Supranational (Regional) Policies and Instruments for the Promotion of Human Rights and the Management of Minority Issues 2 Linguistic minorities in India An appraisal of the linguistic rights of minorities in India Bozen/Bolzano, March 2013 This study was originally written for the European Academy of Bolzano/Bozen (EURAC), Institute for Minority Rights, in the frame of the project Europe-South Asia Exchange on Supranational (Regional) Policies and Instruments for the Promotion of Human Rights and the Management of Minority Issues (EURASIA-Net). The publication is based on extensive research in eight Indian States, with the support of the European Academy of Bozen/Bolzano and the Mahanirban Calcutta Research Group, Kolkata. EURASIA-Net Partners Accademia Europea Bolzano/Europäische Akademie Bozen (EURAC) – Bolzano/Bozen (Italy) Brunel University – West London (UK) Johann Wolfgang Goethe-Universität – Frankfurt am Main (Germany) Mahanirban Calcutta Research Group (India) South Asian Forum for Human Rights (Nepal) Democratic Commission of Human Development (Pakistan), and University of Dhaka (Bangladesh) Edited by © Thomas Benedikter 2013 Rights and permissions Copying and/or transmitting parts of this work without prior permission, may be a violation of applicable law. The publishers encourage dissemination of this publication and would be happy to grant permission. -
Malayalam Range: 0D00–0D7F
Malayalam Range: 0D00–0D7F This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 14.0 This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See https://www.unicode.org/errata/ for an up-to-date list of errata. See https://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See https://www.unicode.org/charts/PDF/Unicode-14.0/ for charts showing only the characters added in Unicode 14.0. See https://www.unicode.org/Public/14.0.0/charts/ for a complete archived file of character code charts for Unicode 14.0. Disclaimer These charts are provided as the online reference to the character contents of the Unicode Standard, Version 14.0 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 14.0, online at https://www.unicode.org/versions/Unicode14.0.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, #44, #45, and #50, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See https://www.unicode.org/ucd/ and https://www.unicode.org/reports/ A thorough understanding of the information contained in these additional sources is required for a successful implementation.
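As a small illustration of the block range quoted above (0D00–0D7F), the following Python sketch tests whether a character's code point falls inside the Malayalam block. The function name `in_malayalam_block` and the sample characters are assumptions chosen for the example. As the disclaimer above stresses, a range check of this kind says nothing about script behaviour and is not a substitute for the Unicode Character Database.

```python
# Block boundaries taken from the range stated in the chart title.
MALAYALAM_START = 0x0D00
MALAYALAM_END = 0x0D7F


def in_malayalam_block(ch):
    """Return True if the single character ch lies in the range 0D00-0D7F."""
    return MALAYALAM_START <= ord(ch) <= MALAYALAM_END


# Number of code positions in the block (assigned or not).
block_size = MALAYALAM_END - MALAYALAM_START + 1

samples = {
    "\u0D05": in_malayalam_block("\u0D05"),  # MALAYALAM LETTER A
    "\u0B85": in_malayalam_block("\u0B85"),  # TAMIL LETTER A, a different block
    "A": in_malayalam_block("A"),            # LATIN CAPITAL LETTER A
}
print(block_size, samples)
```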