Myanmar Range: 1000–109F


This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 14.0.

This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See https://www.unicode.org/errata/ for an up-to-date list of errata.

See https://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See https://www.unicode.org/charts/PDF/Unicode-14.0/ for charts showing only the characters added in Unicode 14.0. See https://www.unicode.org/Public/14.0.0/charts/ for a complete archived file of character code charts for Unicode 14.0.

Disclaimer

These charts are provided as the online reference to the character contents of the Unicode Standard, Version 14.0 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 14.0, online at https://www.unicode.org/versions/Unicode14.0.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, #44, #45, and #50, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See https://www.unicode.org/ucd/ and https://www.unicode.org/reports/

A thorough understanding of the information contained in these additional sources is required for a successful implementation.

Copying characters from the character code tables or list of character names is not recommended, because for production reasons the PDF files for the code charts cannot guarantee that the correct character codes will always be copied.

Fonts

The shapes of the reference glyphs used in these code charts are not prescriptive. Considerable variation is to be expected in actual fonts. The particular fonts used in these charts were provided to the Unicode Consortium by a number of different font designers, who own the rights to the fonts. See https://www.unicode.org/charts/fonts.html for a list.

Terms of Use

You may freely use these code charts for personal or internal business uses only. You may not incorporate them either wholly or in part into any product or publication, or otherwise distribute them without express written permission from the Unicode Consortium. However, you may provide links to these charts.

The fonts and font data used in production of these code charts may NOT be extracted, or used in any other way in any product or publication, without permission or license granted by the typeface owner(s).

The Unicode Consortium is not liable for errors or omissions in this file or the standard itself. Information on characters added to the Unicode Standard since the publication of the most recent version of the Unicode Standard, as well as on characters currently being considered for addition to the Unicode Standard can be found on the Unicode web site. See https://www.unicode.org/pending/pending.html and https://www.unicode.org/alloc/Pipeline.html.

Copyright © 1991-2021 Unicode, Inc. All rights reserved.
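Because copying glyphs out of the PDF is not recommended, one alternative is to build the characters from their code points. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper names `myanmar_block` and `describe` are hypothetical, and the names returned depend on the Unicode version bundled with the Python release in use, which may differ from 14.0.

```python
import unicodedata

# Range taken from the chart title: Myanmar, 1000-109F.
BLOCK_START, BLOCK_END = 0x1000, 0x109F


def myanmar_block():
    """Yield (code point, character, name) for each code point in the block."""
    for cp in range(BLOCK_START, BLOCK_END + 1):
        ch = chr(cp)
        # The fallback string is used if this Python's Unicode data
        # has no name for the code point.
        yield cp, ch, unicodedata.name(ch, "<no name in this Unicode data>")


def describe(cp):
    """Format one code point the way the names list does: 'U+XXXX NAME'."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


if __name__ == "__main__":
    for cp, ch, name in myanmar_block():
        print(f"{cp:04X} {ch} {name}")
```

Generating the characters with `chr()` avoids depending on whatever code points a PDF viewer places on the clipboard.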
1000 Myanmar 109F

Each cell's code point is the column heading followed by the row digit. Combining marks are shown on a dotted circle (◌). The cell for 1039 has no glyph in this extraction.

  | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109
0 | က | တ | ဠ | ◌ူ | ၀ | ၐ | ◌ၠ | ၰ | ႀ | ႐
1 | ခ | ထ | အ | ◌ေ | ၁ | ၑ | ၡ | ◌ၱ | ႁ | ႑
2 | ဂ | ဒ | ဢ | ◌ဲ | ၂ | ၒ | ◌ၢ | ◌ၲ | ◌ႂ | ႒
3 | ဃ | ဓ | ဣ | ◌ဳ | ၃ | ၓ | ◌ၣ | ◌ၳ | ◌ႃ | ႓
4 | င | န | ဤ | ◌ဴ | ၄ | ၔ | ◌ၤ | ◌ၴ | ◌ႄ | ႔
5 | စ | ပ | ဥ | ◌ဵ | ၅ | ၕ | ၥ | ၵ | ◌ႅ | ႕
6 | ဆ | ဖ | ဦ | ◌ံ | ၆ | ◌ၖ | ၦ | ၶ | ◌ႆ | ႖
7 | ဇ | ဗ | ဧ | ◌့ | ၇ | ◌ၗ | ◌ၧ | ၷ | ◌ႇ | ႗
8 | ဈ | ဘ | ဨ | ◌း | ၈ | ◌ၘ | ◌ၨ | ၸ | ◌ႈ | ႘
9 | ဉ | မ | ဩ |  | ၉ | ◌ၙ | ◌ၩ | ၹ | ◌ႉ | ႙
A | ည | ယ | ဪ | ◌် | ၊ | ၚ | ◌ၪ | ၺ | ◌ႊ | ◌ႚ
B | ဋ | ရ | ◌ါ | ◌ျ | ။ | ၛ | ◌ၫ | ၻ | ◌ႋ | ◌ႛ
C | ဌ | လ | ◌ာ | ◌ြ | ၌ | ၜ | ◌ၬ | ၼ | ◌ႌ | ◌ႜ
D | ဍ | ဝ | ◌ိ | ◌ွ | ၍ | ၝ | ◌ၭ | ၽ | ◌ႍ | ◌ႝ
E | ဎ | သ | ◌ီ | ◌ှ | ၎ | ◌ၞ | ၮ | ၾ | ႎ | ႞
F | ဏ | ဟ | ◌ု | ဿ | ၏ | ◌ၟ | ၯ | ၿ | ◌ႏ | ႟

1000 Myanmar 1054

This script is also known historically as the Burmese script.
Consonants
1000 က MYANMAR LETTER KA
     ⁓ 1000 FE00 dotted form
1001 ခ MYANMAR LETTER KHA
1002 ဂ MYANMAR LETTER GA
     ⁓ 1002 FE00 dotted form
1003 ဃ MYANMAR LETTER GHA
1004 င MYANMAR LETTER NGA
     ⁓ 1004 FE00 dotted form
1005 စ MYANMAR LETTER CA
1006 ဆ MYANMAR LETTER CHA
1007 ဇ MYANMAR LETTER JA
1008 ဈ MYANMAR LETTER JHA
1009 ဉ MYANMAR LETTER NYA
100A ည MYANMAR LETTER NNYA
100B ဋ MYANMAR LETTER TTA
100C ဌ MYANMAR LETTER TTHA
100D ဍ MYANMAR LETTER DDA
100E ဎ MYANMAR LETTER DDHA
100F ဏ MYANMAR LETTER NNA
1010 တ MYANMAR LETTER TA
     ⁓ 1010 FE00 dotted form
1011 ထ MYANMAR LETTER THA
     ⁓ 1011 FE00 dotted form
1012 ဒ MYANMAR LETTER DA
1013 ဓ MYANMAR LETTER DHA
1014 န MYANMAR LETTER NA
1015 ပ MYANMAR LETTER PA
     ⁓ 1015 FE00 dotted form
1016 ဖ MYANMAR LETTER PHA
1017 ဗ MYANMAR LETTER BA
1018 ဘ MYANMAR LETTER BHA
1019 မ MYANMAR LETTER MA
     ⁓ 1019 FE00 dotted form
101A ယ MYANMAR LETTER YA
     ⁓ 101A FE00 dotted form
101B ရ MYANMAR LETTER RA
101C လ MYANMAR LETTER LA
     ⁓ 101C FE00 dotted form
101D ဝ MYANMAR LETTER WA
     ⁓ 101D FE00 dotted form
101E သ MYANMAR LETTER SA
101F ဟ MYANMAR LETTER HA
1020 ဠ MYANMAR LETTER LLA

Independent vowels
1021 အ MYANMAR LETTER A
     • also represents the glottal stop as a consonant
1022 ဢ MYANMAR LETTER SHAN A
     ⁓ 1022 FE00 dotted form
1023 ဣ MYANMAR LETTER I
1024 ဤ MYANMAR LETTER II
1025 ဥ MYANMAR LETTER U
1026 ဦ MYANMAR LETTER UU
     ≡ 1025 ဥ 102E ◌ီ
1027 ဧ MYANMAR LETTER E
1028 ဨ MYANMAR LETTER MON E
1029 ဩ MYANMAR LETTER O
102A ဪ MYANMAR LETTER AU

Dependent vowel signs
102B ◌ါ MYANMAR VOWEL SIGN TALL AA
102C ◌ာ MYANMAR VOWEL SIGN AA
102D ◌ိ MYANMAR VOWEL SIGN I
102E ◌ီ MYANMAR VOWEL SIGN II
102F ◌ု MYANMAR VOWEL SIGN U
1030 ◌ူ MYANMAR VOWEL SIGN UU
1031 ◌ေ MYANMAR VOWEL SIGN E
     • stands to the left of the consonant
     ⁓ 1031 FE00 dotted form
1032 ◌ဲ MYANMAR VOWEL SIGN AI
1033 ◌ဳ MYANMAR VOWEL SIGN MON II
1034 ◌ဴ MYANMAR VOWEL SIGN MON O
1035 ◌ဵ MYANMAR VOWEL SIGN E ABOVE

Various signs
1036 ◌ံ MYANMAR SIGN ANUSVARA
1037 ◌့ MYANMAR SIGN DOT BELOW
     = aukmyit
     • a tone mark
1038 ◌း MYANMAR SIGN VISARGA

Virama and killer
1039 MYANMAR SIGN VIRAMA
103A ◌် MYANMAR SIGN ASAT
     = killer (always rendered visibly)

Dependent consonant signs
103B ◌ျ MYANMAR CONSONANT SIGN MEDIAL YA
103C ◌ြ MYANMAR CONSONANT SIGN MEDIAL RA
103D ◌ွ MYANMAR CONSONANT SIGN MEDIAL WA
103E ◌ှ MYANMAR CONSONANT SIGN MEDIAL HA

Consonant
103F ဿ MYANMAR LETTER GREAT SA

Digits
1040 ၀ MYANMAR DIGIT ZERO
1041 ၁ MYANMAR DIGIT ONE
1042 ၂ MYANMAR DIGIT TWO
1043 ၃ MYANMAR DIGIT THREE
1044 ၄ MYANMAR DIGIT FOUR
1045 ၅ MYANMAR DIGIT FIVE
1046 ၆ MYANMAR DIGIT SIX
1047 ၇ MYANMAR DIGIT SEVEN
1048 ၈ MYANMAR DIGIT EIGHT
1049 ၉ MYANMAR DIGIT NINE

Punctuation
104A ၊ MYANMAR SIGN LITTLE SECTION
     → 0964 । devanagari danda
104B ။ MYANMAR SIGN SECTION
     → 0965 ॥ devanagari double danda

Various signs
104C ၌ MYANMAR SYMBOL LOCATIVE
104D ၍ MYANMAR SYMBOL COMPLETED
104E ၎ MYANMAR SYMBOL AFOREMENTIONED
104F ၏ MYANMAR SYMBOL GENITIVE

Pali and Sanskrit extensions
1050 ၐ MYANMAR LETTER SHA
1051 ၑ MYANMAR LETTER SSA
1052 ၒ MYANMAR LETTER VOCALIC R
1053 ၓ MYANMAR LETTER VOCALIC RR
1054 ၔ MYANMAR LETTER VOCALIC L
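Two notations in the names list above translate directly into string operations: the "≡ 1025 102E" line under 1026 is a canonical decomposition, and each "⁓ … FE00 dotted form" line is a variation sequence made of a base letter followed by U+FE00 (VARIATION SELECTOR-1). The following is a minimal sketch, assuming Python's standard `unicodedata` module; `code_points` and `dotted_form` are hypothetical helper names, and whether a dotted form is actually displayed depends on the font and renderer, which this code does not check.

```python
import unicodedata

VS1 = "\uFE00"  # VARIATION SELECTOR-1, the FE00 in the "dotted form" entries


def code_points(text):
    """Return the code points of text as uppercase hex strings."""
    return [f"{ord(ch):04X}" for ch in text]


def dotted_form(base):
    """Build the variation sequence <base, FE00> listed in the chart.

    This only concatenates the two code points; it does not verify that
    the chart lists a dotted form for the given base letter.
    """
    return base + VS1


# 1026 MYANMAR LETTER UU is canonically equivalent to 1025 + 102E.
uu_decomposed = unicodedata.normalize("NFD", "\u1026")
uu_recomposed = unicodedata.normalize("NFC", "\u1025\u102E")

# Dotted form of 1000 MYANMAR LETTER KA.
ka_dotted = dotted_form("\u1000")

if __name__ == "__main__":
    print(code_points(uu_decomposed))
    print(code_points(uu_recomposed))
    print(code_points(ka_dotted))
```

Comparing strings after normalizing both sides to the same form (NFC or NFD) avoids treating 1026 and the pair 1025 102E as different text.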
1055 Myanmar 109F

1055 ၕ MYANMAR LETTER VOCALIC LL
1056 ◌ၖ MYANMAR VOWEL SIGN VOCALIC R
1057 ◌ၗ MYANMAR VOWEL SIGN VOCALIC RR
1058 ◌ၘ MYANMAR VOWEL SIGN VOCALIC L
1059 ◌ၙ MYANMAR VOWEL SIGN VOCALIC LL

Extensions for Mon
105A ၚ MYANMAR LETTER MON NGA
105B ၛ MYANMAR LETTER MON JHA
105C ၜ MYANMAR LETTER MON BBA
105D ၝ MYANMAR LETTER MON BBE
105E ◌ၞ MYANMAR CONSONANT SIGN MON MEDIAL NA
105F ◌ၟ MYANMAR CONSONANT SIGN MON MEDIAL MA
1060 ◌ၠ MYANMAR CONSONANT SIGN MON MEDIAL LA

Extensions for S'gaw Karen
1061 ၡ MYANMAR LETTER SGAW KAREN SHA
1062 ◌ၢ MYANMAR VOWEL SIGN SGAW KAREN EU
1063 ◌ၣ MYANMAR TONE MARK SGAW KAREN HATHI
1064 ◌ၤ MYANMAR TONE MARK SGAW KAREN KE PHO

Extensions for Western Pwo Karen
1065 ၥ MYANMAR LETTER WESTERN PWO KAREN THA
1066 ၦ MYANMAR LETTER WESTERN PWO KAREN PWA
1067 ◌ၧ MYANMAR VOWEL SIGN WESTERN PWO KAREN EU
1068 ◌ၨ MYANMAR VOWEL SIGN WESTERN PWO KAREN UE
1069 ◌ၩ MYANMAR SIGN WESTERN PWO KAREN TONE-

107C ၼ MYANMAR LETTER SHAN NA
107D ၽ MYANMAR LETTER SHAN PHA
107E ၾ MYANMAR LETTER SHAN FA
107F ၿ MYANMAR LETTER SHAN BA
1080 ႀ MYANMAR LETTER SHAN THA
     ⁓ 1080 FE00 dotted form
1081 ႁ MYANMAR LETTER SHAN HA
1082 ◌ႂ MYANMAR CONSONANT SIGN SHAN MEDIAL WA
1083 ◌ႃ MYANMAR VOWEL SIGN SHAN AA
1084 ◌ႄ MYANMAR VOWEL SIGN SHAN E
1085 ◌ႅ MYANMAR VOWEL SIGN SHAN E ABOVE
1086 ◌ႆ MYANMAR VOWEL SIGN SHAN FINAL Y
1087 ◌ႇ MYANMAR SIGN SHAN TONE-2
1088 ◌ႈ MYANMAR SIGN SHAN TONE-3
1089 ◌ႉ MYANMAR SIGN SHAN TONE-5
108A ◌ႊ MYANMAR SIGN SHAN TONE-6
108B ◌ႋ MYANMAR SIGN SHAN COUNCIL TONE-2
108C ◌ႌ MYANMAR SIGN SHAN COUNCIL TONE-3
108D ◌ႍ MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE

Extensions for Rumai Palaung
108E ႎ MYANMAR LETTER RUMAI PALAUNG FA
108F ◌ႏ MYANMAR SIGN RUMAI PALAUNG TONE-5

Shan digits
1090 ႐ MYANMAR SHAN DIGIT ZERO
1091 ႑ MYANMAR SHAN DIGIT ONE
1092 ႒ MYANMAR SHAN DIGIT TWO
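The block contains two sets of decimal digits: the Myanmar digits starting at 1040 and the Shan digits starting at 1090. The following is a minimal sketch of mapping either set to ASCII digits, assuming Python's standard `unicodedata` module; `to_ascii_digits` is a hypothetical helper name, and the sample strings are illustrative values, not taken from the chart.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace every Unicode decimal digit in text with its ASCII equivalent."""
    out = []
    for ch in text:
        # unicodedata.decimal returns the default (None here) for non-digits.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


# Illustrative inputs built from code points rather than pasted glyphs.
myanmar_2024 = "\u1042\u1040\u1042\u1044"  # MYANMAR DIGIT TWO, ZERO, TWO, FOUR
shan_12 = "\u1091\u1092"                   # MYANMAR SHAN DIGIT ONE, TWO

if __name__ == "__main__":
    print(to_ascii_digits(myanmar_2024))
    print(to_ascii_digits(shan_12))
    # Python's int() also accepts Unicode decimal digits directly.
    print(int(myanmar_2024))
```

Characters that are not decimal digits pass through unchanged, so the helper can be applied to mixed text.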