JTC1/SC2/WG2 N3412 L2/08-131


Universal Multiple-Octet Coded Character Set
International Organization for Standardization
Organisation Internationale de Normalisation
Международная организация по стандартизации

Doc Type: Working Group Document
Title: Proposal for encoding Last Resort Pictures in Plane 14 of the UCS
Source: Michael Everson
Status: Individual Contribution
Action: For consideration by JTC1/SC2/WG2 and UTC
Date: 2008-04-01

1. Introduction. The Last Resort font is a collection of glyphs which represents types of UCS characters. These glyphs are designed to allow users to recognize that an encoded value is a specific type of UCS character, a Private Use Area character, an unassigned character, or one of the illegal character codes. Apple Computer and SIL are two organizations which have shipped Last Resort fonts for some time now. Recently Apple decided to make its Last Resort font available for public distribution via the Unicode Consortium’s website (sometime after the publication of Unicode 5.1).

One of the principles of a Last Resort font is that it is normally displayed when no other font is available for the character in question. As a result, its glyphs cannot themselves be represented in text. Similarly, the control characters encoded at U+0000-001F cannot be represented in text, since they are intended to “do” things (even if few of them do much on modern operating systems). At U+2400-243F the CONTROL PICTURES block provides glyphs for those characters so that they can be discussed and displayed in text. This facility would be as valuable for the Last Resort functionality as it is for the control characters.

Accordingly, this proposal requests the encoding of a new block, U+E0200-E03FF LAST RESORT PICTURES, in Plane 14 of the UCS. The scheme for encoding in Plane 14 proposed here is block-based. As new blocks are added to the standard, new characters would be added to the Last Resort Pictures block.
This maintenance burden is not very great, since the Consortium’s Last Resort font has to be updated in any case. In fact, maintaining the Plane 14 block could well ensure a more timely updating of the Last Resort font.

2. Character names. The character names proposed begin with the prefix “LR-”, followed by the character range and block name.

3. Unicode Character Properties.

E0200;LR-0000-FFFD PLANE-0 UNDEFINED;So;0;ON;;;;;N;;;;;
E0201;LR-0000-007F BASIC LATIN;So;0;ON;;;;;N;;;;;
E0202;LR-0080-00FF LATIN-1 SUPPLEMENT;So;0;ON;;;;;N;;;;;
E0203;LR-0100-017F LATIN EXTENDED-A;So;0;ON;;;;;N;;;;;
...

4. Issues. The BMP is nearly full, and it is not hard to leave a bit of space between blocks for future expansion. Whether this should be done for the SMP or not may merit some discussion.

A. Administrative
1. Title: Proposal for encoding Last Resort Pictures in Plane 14 of the UCS
2. Requester’s name: Michael Everson
3. Requester type (Member body/Liaison/Individual contribution): Individual contribution.
4. Submission date: 2008-04-01
5. Requester’s reference (if applicable):
6. Choose one of the following:
6a. This is a complete proposal: Yes.
6b. More information will be provided later: No.

B. Technical – General
1. Choose one of the following:
1a. This proposal is for a new script (set of characters): Yes.
1b. Proposed name of script: Last Resort Pictures.
1c. The proposal is for addition of character(s) to an existing block: No.
1d. Name of the existing block:
2. Number of characters in proposal: 201.
3. Proposed category (A-Contemporary; B.1-Specialized (small collection); B.2-Specialized (large collection); C-Major extinct; D-Attested extinct; E-Minor extinct; F-Archaic Hieroglyphic or Ideographic; G-Obscure or questionable usage symbols): Category G.
4a. Is a repertoire including character names provided? Yes.
4b. If YES, are the names in accordance with the “character naming guidelines” in Annex L of P&P document? Yes.
4c. Are the character shapes attached in a legible form suitable for review? Yes.
5a. Who will provide the appropriate computerized font (ordered preference: True Type, or PostScript format) for publishing the standard? Michael Everson.
5b. If available now, identify source(s) for the font (include address, e-mail, ftp-site, etc.) and indicate the tools used: Michael Everson, Fontographer.
6a. Are references (to other character sets, dictionaries, descriptive texts etc.) provided? Yes.
6b. Are published examples of use (such as samples from newspapers, magazines, or other sources) of proposed characters attached? Yes.
7. Does the proposal address other aspects of character data processing (if applicable) such as input, presentation, sorting, searching, indexing, transliteration etc. (if yes please enclose information)? Yes.
8. Submitters are invited to provide any additional information about Properties of the proposed Character(s) or Script that will assist in correct understanding of and correct linguistic processing of the proposed character(s) or script. Examples of such properties are: Casing information, Numeric information, Currency information, Display behaviour information such as line breaks, widths etc., Combining behaviour, Spacing behaviour, Directional behaviour, Default Collation behaviour, relevance in Mark Up contexts, Compatibility equivalence and other Unicode normalization related information. See the Unicode standard at http://www.unicode.org for such information on other scripts. Also see Unicode Character Database http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html and associated Unicode Technical Reports for information needed for consideration by the Unicode Technical Committee for inclusion in the Unicode Standard. See above.

C. Technical – Justification
1. Has this proposal for addition of character(s) been submitted before? If YES, explain. No.
2a. Has contact been made to members of the user community (for example: National Body, user groups of the script or characters, other experts, etc.)? No.
2b. If YES, with whom?
2c. If YES, available relevant documents
3. Information on the user community for the proposed characters (for example: size, demographics, information technology use, or publishing use) is included? No.
4a. The context of use for the proposed characters (type of use; common or rare): Rare.
4b. Reference
5a. Are the proposed characters in current use by the user community? The glyphs are, but they are unavailable for use in text.
5b. If YES, where? In Last Resort fonts.
6a. After giving due considerations to the principles in the P&P document must the proposed characters be entirely in the BMP? No.
6b. If YES, is a rationale provided?
6c. If YES, reference
7. Should the proposed characters be kept together in a contiguous range (rather than being scattered)? Yes.
8a. Can any of the proposed characters be considered a presentation form of an existing character or character sequence? No.
8b. If YES, is a rationale for its inclusion provided?
8c. If YES, reference
9a. Can any of the proposed characters be encoded using a composed character sequence of either existing characters or other proposed characters? No.
9b. If YES, is a rationale for its inclusion provided?
9c. If YES, reference
10a. Can any of the proposed character(s) be considered to be similar (in appearance or function) to an existing character? No.
10b. If YES, is a rationale for its inclusion provided?
10c. If YES, reference
11a. Does the proposal include use of combining characters and/or use of composite sequences (see clauses 4.12 and 4.14 in ISO/IEC 10646-1: 2000)? No.
11b. If YES, is a rationale for such use provided?
11c. If YES, reference
11d. Is a list of composite sequences and their corresponding glyph images (graphic symbols) provided? No.
11e. If YES, reference
12a. Does the proposal contain characters with any special properties such as control function or similar semantics? No.
12b. If YES, describe in detail (include attachment if necessary)
13a. Does the proposal contain any Ideographic compatibility character(s)? No.
13b. If YES, is the equivalent corresponding unified ideographic character(s) identified?

[Code chart: Last Resort Pictures, E0200 to E02FF; columns E020 to E02F, rows 0 to F.]
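The naming convention of section 2 and the property records of section 3 lend themselves to a small worked example. The Python sketch below is illustrative only: it parses the four records quoted in section 3, splits each “LR-” name into its range and block label, and looks up the picture for a given code point. The names `Picture`, `parse_record`, and `picture_for` are hypothetical, and the rule of preferring the narrowest matching range (so that E0200 acts as a fallback for the rest of Plane 0) is an assumption, since the proposal does not spell out a lookup procedure.

```python
import re
from typing import List, NamedTuple, Optional

# The four records quoted in section 3, in the 15-field,
# semicolon-separated layout used by UnicodeData.txt.
RECORDS = """\
E0200;LR-0000-FFFD PLANE-0 UNDEFINED;So;0;ON;;;;;N;;;;;
E0201;LR-0000-007F BASIC LATIN;So;0;ON;;;;;N;;;;;
E0202;LR-0080-00FF LATIN-1 SUPPLEMENT;So;0;ON;;;;;N;;;;;
E0203;LR-0100-017F LATIN EXTENDED-A;So;0;ON;;;;;N;;;;;
"""

# "LR-" prefix, then the character range, then the block name (section 2).
NAME_RE = re.compile(r"^LR-([0-9A-F]{4,6})-([0-9A-F]{4,6}) (.+)$")


class Picture(NamedTuple):
    """Hypothetical record type for one proposed Last Resort picture."""
    code_point: int   # position in the proposed Plane 14 block
    first: int        # first code point of the range the picture stands for
    last: int         # last code point of that range
    label: str        # block name taken from the character name
    category: str     # General_Category field, "So" in the quoted records
    bidi_class: str   # Bidi_Class field, "ON" in the quoted records


def parse_record(line: str) -> Picture:
    """Parse one property record into a Picture."""
    fields = line.split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    match = NAME_RE.match(fields[1])
    if match is None:
        raise ValueError(f"name does not follow the LR- scheme: {fields[1]!r}")
    first, last, label = match.groups()
    return Picture(int(fields[0], 16), int(first, 16), int(last, 16),
                   label, fields[2], fields[4])


def picture_for(cp: int, pictures: List[Picture]) -> Optional[Picture]:
    """Return the picture with the narrowest range containing cp, if any.

    Preferring the narrowest range is an assumption of this sketch.
    """
    candidates = [p for p in pictures if p.first <= cp <= p.last]
    return min(candidates, key=lambda p: p.last - p.first, default=None)


PICTURES = [parse_record(line) for line in RECORDS.splitlines()]
```

With only these four records loaded, `picture_for(ord("A"), PICTURES)` resolves to the picture at E0201, while a BMP code point outside the three listed blocks falls back to E0200 and a code point beyond Plane 0 yields `None`.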