Proposal to Encode the Kaithi Script in ISO/IEC 10646

Proposal to Encode the Kaithi Script in ISO/IEC 10646 Anshuman Pandey University of Michigan Ann Arbor, Michigan, U.S.A. [email protected] December 13, 2007 Contents Proposal Summary Form i 1 Introduction 1 1.1 Description ..................................... ...... 1 1.2 Justification for Encoding . ........... 1 1.3 Acknowledgments ................................. ...... 2 1.4 ProposalHistory ................................. ....... 2 2 Characters Proposed 3 2.1 CharactersNotProposed . ......... 4 2.2 Basis for Character Shapes . .......... 4 3 Technical Features 8 3.1 Name ............................................ 8 3.2 Classification ................................... ....... 8 3.3 Allocation...................................... ...... 8 3.4 EncodingModel................................... ...... 8 3.5 CharacterProperties. .......... 8 3.6 Collation ....................................... ..... 11 3.7 Typology of Characters . ......... 11 4 Background 13 4.1 Origins ......................................... 13 4.2 Name ............................................ 13 4.3 Definitions...................................... ...... 13 4.4 Languages Written in the Script . ........... 14 4.5 Standardization and Growth . .......... 16 4.6 Decline ......................................... 16 4.7 Usage ........................................... 17 5 Orthography 23 5.1 Distinguishing Features . ........... 23 5.2 Vowels.......................................... 23 5.3 Consonants ...................................... ..... 23 5.4 Nasalization.................................... ....... 25 5.5 Consonant-Vowel Ligatures . ........... 25 5.6 ConsonantConjuncts .... ...... ..... ...... ...... ... ........ 27 5.7 WordBoundaries .................................. ...... 28 5.8 Sentence and Paragraph Boundaries . ............ 29 5.9 Hyphenation..................................... ...... 32 5.10Abbreviation ................................... ....... 32 5.11Enumeration.................................... ....... 33 5.12Nukta .......................................... 34 5.13RuledLines ..................................... ...... 35 5.14Digits ......................................... ..... 36 5.15 NumberFormsandUnitMarks. ......... 36 5.16 RegionalVariants . ..... ...... ..... ...... ...... .. ......... 37 5.17Typefaces ...................................... ...... 37 6 Relationship to Other Scripts 39 6.1 Relationship to Gujarati . ........... 39 6.2 Relationship to Devanagari . ........... 41 6.3 Relationship to Syloti Nagri . ........... 42 6.4 Comparison of Kaithi, Gujarati, and Devanagari . ................. 42 7 References 46 List of Tables 1 Glyph chart and character names and properties for the Kaithiscript. ............... 5 2 Comparison of consonants of metal fonts with the digitized Kaithifont .............. 6 3 Comparison of vowels of metal fonts with the digitized Kaithifont ................ 7 4 Comparison of regional variants of the nasal consonant letters................... 7 5 Statistical comparison of similarity of letters across Kaithi, Gujarati, and Devanagari . 43 6 A comparison of consonants of Kaithi, Gujarati, Devanagari, and Syloti Nagri . 44 7 A comparison of vowels of Kaithi, Gujarati, Devanagari, and Syloti Nagri . 45 8 A comparison of digits of Kaithi, Gujarati, Devanagari, and Syloti Nagri . 45 List of Figures 1 Geographic center of the Kaithi script in South Asia. ................... 15 2 Relationship of Kaithi to selected Nagari-based scripts . ...................... 41 3 A comparison of the three regional forms of Kaithi . ................. 50 4 A list of Kaithi conjuncts used in the Maithili style of Kaithi.................... 51 5 Currency, weights, and measures signs used in Kaithi . .................. 52 6 Specimen of hand-written Bhojpuri style of Kaithi . .................. 53 7 Specimen of hand-written Maithili style of Kaithi . .................. 54 8 Specimen of hand-written Magahi style of Kaithi . ................. 55 9 Excerpt from a specimen of Maithili written in the Magahi style of Kaithi . 56 10 Excerpt from a specimen of Awadhi written in Kaithi . ................. 57 11 Excerpt from a specimen of Bengali written in Kaithi . .................. 58 12 A specimen of Magahi printed in Kaithi type . ............... 59 13 A specimen of Maithili printed in Kaithi type . ................. 60 14 A specimen of Bhojpuri printed in Kaithi type . ................ 61 15 Table of the Kaithi script in the Linguistic Survey of India ..................... 62 16 Table of the Kaithi script from Eastwick . ............... 63 17 Inventory of Kaithi letters . ............. 64 18 Comparison of numerals of Kaithi and other scripts . .................. 64 19 Folios 1b and 2a from the Mahagan¯ . apatistotra in Devanagari and Kaithi . 65 20 Folios 1a and 4a from the Mahagan¯ . apatistotra in Devanagari and Kaithi . 66 21 Excerpt from a plaint from the district court of Patna, Bihar .................... 67 22 Excerpt from a plaint from the district court of Bhagalpur,Bihar.................. 68 23 Excerpt from a statement from the district court of Ranchi,Bihar ................. 69 24 Rent receipt from the former Principality of Seraikella . ...................... 70 25 Title and first pages of the Book of Genesis in Kaithi type . ................... 71 26 Title and first pages of the New Testament in Kaithi type . ................... 72 27 Entries for the ‘Bihari’ languages in The Book of a Thousand Tongues .............. 73 28 A folio from the ”Ekad. ala”¯ manuscript of Miragavat¯ ¯ı ....................... 74 29 A folio from the Tale of Sudama ................................... 75 30 A sanad of Sher Shah Suri in Persian and Kaithi . ....... 76 31 A letter to the Supreme Civil Court of Appeals in Calcutta . .................... 77 32 Comparison of Kaithi, Gujarati, and Devanagari types from the Linguistic Survey of India . 78 33 Comparison of hand-written Kaithi and Gujarati letters . ..................... 79 34 Comparison of Kaithi and Devanagari . .............. 79 35 Comparison of Kaithi and Devanagari metal fonts . ................. 80 36 A comparison of the Kaithi script with the Devanagari and Mahajani . 81 37 Comparison of Kaithi drawn with the headstroke and Devanagari ................. 82 38 Comparison of writing techniques in Kaithi and Devanagari.................... 83 39 Comparison of scripts descended from proto-Bengali . ................... 84 40 Comparison of Kaithi with other scripts used for writing Hindi .................. 85 41 Comparison of Kaithi with other Indic scripts . .................. 86 42 Comparison of Kaithi with other Indic scripts . .................. 87 43 A family tree of north Indic scripts showing Kaithi as a member of the Nagari family . 88 44 The relationship of Kaithi to other Indic scripts . .................... 88 ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS FOR ADDITIONS TO THE REPERTOIRE OF ISO/IEC 106461 Please fill all the sections A, B and C below. Please read Principles and Procedures Document (P & P) from http://www.dkuug.dk/JTC1/SC2/WG2/docs/principles.html for guidelines and details before filling this form. Please ensure you are using the latest Form from http://www.dkuug.dk/JTC1/SC2/WG2/docs/summaryform.html. See also http://www.dkuug.dk/JTC1/SC2/WG2/docs/roadmaps.html for latest Roadmaps. A. Administrative 1. Title: Proposal to Encode the Kaithi Script in ISO/IEC 10646 2. Requester’s name: University of California, Berkeley Script Encoding Initiative (Universal Scripts Project); author: Anshuman Pandey ([email protected]) 3. Requester type (Member Body/Liaison/Individual contribution): Liaison contribution 4. Submission date: December 13, 2007 5. Requester’s reference (if applicable): N/A 6. Choose one of the following: (a) This is a complete proposal: Yes (b) or, More information will be provided later: No B. Technical - General 1. Choose one of the following: (a) This proposal is for a new script (set of characters): Yes i. Proposed name of script: Kaithi (b) The proposal is for addition of character(s) to an existing block: No i. Name of the existing block: N/A 2. Number of characters in proposal: 71 3. Proposed category: C - Major extinct 4. Is a repertoire including character names provided?: Yes (a) If Yes, are the names in accordance with the “character naming guidelines” in Annex L of P&P document?: Yes (b) Are the character shapes attached in a legible form suitable for review?: Yes 5. Who will provide the appropriate computerized font (ordered preference: True Type, or PostScript format) for publishing the standard?: Anshuman Pandey; True Type format (a) If available now, identify source(s) for the font and indicate the tools used: The letters of the digitized Kaithi font are based on normalized forms of letters of Kaithi metal fonts and, in some cases, on written forms. The font was drawn by Anshuman Pandey with Metafont and converted to True Type with FontForge. 6. References: (a) Are references (to other character sets, dictionaries, descriptive texts etc.) provided?: Yes (b) Are published examples of use (such as samples from newspapers, magazines, or other sources) of proposed characters attached?: Yes 7. Special encoding issues: (a) Does the proposaladdress other aspects of character data processing (if applicable) such as input, presentation, sorting, searching, indexing, transliteration etc. (if yes please enclose information)? Yes; see proposal for additional details.. 8. Additional Information: Submitters are invited

Proposal to Encode the Kaithi Script in ISO/IEC 10646

On the Origin of the Indian Brahma Alphabet

Handwriting Recognition in Indian Regional Scripts: a Survey of Ofﬂine Techniques

The Original Pronunciation of Sanskrit Devanàgará – Transliteration – IPA-Symbols

A Manipuri-English Bilingual Electronic Dictionary -Design and Implementation S

Sanskrit Alphabet

5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721

Proposal for a Gujarati Script Root Zone Label Generation Ruleset (LGR)

An Analysis Towards the Development of Electronic Bilingual Dictionary (Manipuri-English) -A Report

The Art of Not Being Scripted So Much. the Politics of Writing Hmong

16-Sanskrit-In-JAPAN.Pdf

Inventory of Romanization Tools

ISO/IEC International Standard 10646-1