Unibook Document

Title: Draft additional repertoire for ISO/IEC 10646:2016 (5th ed.) Amendment 1.2
Date: 2016-09-29
L2/16-xxx
WG2 N4770
Source: Michel Suignard, project editor
Status: Project Editor's summary of the character repertoire addition as of September 29 2016
Action: For review by WG2 and UTC experts
Distribution: WG2 and UTC
Replaces:

Status
This document presents a summary of all characters that constitute the tentative new repertoire for ISO/IEC 10646 5th edition Amd1, with code positions, representative glyphs and character names.

Manner of Presentation
The character names and code points shown are the same for Unicode and ISO/IEC 10646, including annotations.

Note to Reviewers
UTC reviewers, please use this document as a summary of UTC review of pending ballots and proposals. WG2 reviewers, please use this document as an aid during disposition of ballot comments.

Contents
This document lists 1178 characters. The following list shows all 29 blocks (existing or new) to which characters are proposed to be added, or which have been affected by other changes documented here. Some glyph updates (U+13430 and U+13431) are not shown in these code charts; they will be made before the amendment goes through ballot.
07C0-07FF Nko. See document L2/15-338
08A0-08FF Arabic Extended-A. See document L2/16-056
0A00-0A7F Gurmukhi. See document L2/16-209
0C80-0CFF Kannada. See document L2/16-031
2B00-2BFF Miscellaneous Symbols and Arrows. See document L2/16-064 L2/16-067 L2/16-080R
2E00-2E7F Supplemental Punctuation. See document L2/15-327R L2/16-220 L2/16-235
3100-312F Bopomofo. See document L2/16-106
A720-A7FF Latin Extended-D. See document L2/15-241 L2/16-032
A8E0-A8FF Devanagari Extended. See document L2/15-335
11080-110CF Kaithi. See document L2/16-097R
11300-1137F Grantha. See document L2/15-256
11700-1173F Ahom. See document L2/15-272
11800-1184F Dogra. See document L2/15-234R
11A00-11A4F Zanabazar Square. See document L2/15-341R L2/15-342
11A50-11AAF Soyombo. See document L2/16-016
11D60-11DAF Gunjala Gondi. See document L2/15-235
11EE0-11EFF Makasar. See document L2/15-233
13430-1343F Egyptian Hieroglyphs Format Controls. See document L2/16-018
16E40-16E9F Medefaidrin. See document L2/16-020
17000-187FF Tangut. See document L2/16-095
18B00-18CFF Khitan Small Script. See document WG2 N4771
1B000-1B0FF Kana Supplement. See document WG2 N4732
1B100-1B12F Kana Extended-A. See document WG2 N4732
1D360-1D37F Counting Rod Numerals. See document L2/15-328
1EC70-1ECBF Indic Siyaq Numbers. See document L2/15-121R2
1F100-1F1FF Enclosed Alphanumeric Supplement. See document L2/16-059
1F200-1F2FF Enclosed Ideographic Supplement. See document WG2 N4766 L2/16-270
1F780-1F7FF Geometric Shapes Extended. See document L2/16-185
1F900-1F9FF Supplemental Symbols and Pictographs. See document L2/16-045

07C0 NKo 07FF

[Code chart: NKo, columns 07C-07F. Code points shown: 07C0-07FA, 07FD-07FF.]

Printed: 30-Sep-2016

Digits
07C0 ߀ NKO DIGIT ZERO
07C1 ߁ NKO DIGIT ONE
07C2 ߂ NKO DIGIT TWO
07C3 ߃ NKO DIGIT THREE
07C4 ߄ NKO DIGIT FOUR
07C5 ߅ NKO DIGIT FIVE
07C6 ߆ NKO DIGIT SIX
07C7 ߇ NKO DIGIT SEVEN
07C8 ߈ NKO DIGIT EIGHT
07C9 ߉ NKO DIGIT NINE

Letters
07CA ߊ NKO LETTER A
07CB ߋ NKO LETTER EE
07CC ߌ NKO LETTER I
07CD ߍ NKO LETTER E
07CE ߎ NKO LETTER U
07CF ߏ NKO LETTER OO
07D0 ߐ NKO LETTER O
07D1 ߑ NKO LETTER DAGBASINNA
07D2 ߒ NKO LETTER N
07D3 ߓ NKO LETTER BA
07D4 ߔ NKO LETTER PA
07D5 ߕ NKO LETTER TA
07D6 ߖ NKO LETTER JA
07D7 ߗ NKO LETTER CHA
07D8 ߘ NKO LETTER DA
07D9 ߙ NKO LETTER RA
07DA ߚ NKO LETTER RRA
07DB ߛ NKO LETTER SA
07DC ߜ NKO LETTER GBA
07DD ߝ NKO LETTER FA
07DE ߞ NKO LETTER KA
07DF ߟ NKO LETTER LA
07E0 ߠ NKO LETTER NA WOLOSO
07E1 ߡ NKO LETTER MA
07E2 ߢ NKO LETTER NYA
07E3 ߣ NKO LETTER NA
07E4 ߤ NKO LETTER HA
07E5 ߥ NKO LETTER WA
07E6 ߦ NKO LETTER YA
07E7 ߧ NKO LETTER NYA WOLOSO

Archaic letters
07E8 ߨ NKO LETTER JONA JA
07E9 ߩ NKO LETTER JONA CHA
07EA ߪ NKO LETTER JONA RA
    → 07D9 ߙ nko letter ra

Tone marks
07EB ߫ NKO COMBINING SHORT HIGH TONE
    → 0304 ◌̄ combining macron
07EC ߬ NKO COMBINING SHORT LOW TONE
    → 0303 ◌̃ combining tilde
07ED ߭ NKO COMBINING SHORT RISING TONE
    → 0307 ◌̇ combining dot above
07EE ߮ NKO COMBINING LONG DESCENDING TONE
    → 0302 ◌̂ combining circumflex accent
07EF ߯ NKO COMBINING LONG HIGH TONE
07F0 ߰ NKO COMBINING LONG LOW TONE
07F1 ߱ NKO COMBINING LONG RISING TONE
07F2 ߲ NKO COMBINING NASALIZATION MARK
    → 0323 ◌̣ combining dot below
07F3 ߳ NKO COMBINING DOUBLE DOT ABOVE
    → 0308 ◌̈ combining diaeresis
07F4 ߴ NKO HIGH TONE APOSTROPHE
    → 02BC ʼ modifier letter apostrophe
07F5 ߵ NKO LOW TONE APOSTROPHE
    → 02BB ʻ modifier letter turned comma

Symbol
07F6 ߶ NKO SYMBOL OO DENNEN

Punctuation
07F7 ߷ NKO SYMBOL GBAKURUNEN
07F8 ߸ NKO COMMA
07F9 ߹ NKO EXCLAMATION MARK

Letter extender
07FA ߺ NKO LAJANYALAN
    → 005F _ low line
    → 0640 arabic tatweel

Abbreviation sign
07FD ߽ NKO DANTAYALAN
    • used to abbreviate units of measure

Currency signs
07FE ߾ NKO DOROME SIGN
    → 07D8 ߘ nko letter da
07FF ߿ NKO TAMAN SIGN
    → 07D5 ߕ nko letter ta

08A0 Arabic Extended-A 08FF

[Code chart: Arabic Extended-A, columns 08A-08F. Code points shown: 08A0-08B4, 08B6-08BD, 08D3-08FF.]

Arabic letters for African languages
08A0 ARABIC LETTER BEH WITH SMALL V BELOW
08A1 ࢡ ARABIC LETTER BEH WITH HAMZA ABOVE
    • Adamawa Fulfulde (Cameroon)
    • used for the implosive bilabial stop
    → 0253 ɓ latin small letter b with hook
08A2 ARABIC LETTER JEEM WITH TWO DOTS ABOVE
08A3 ARABIC LETTER TAH WITH TWO DOTS ABOVE
08A4 ARABIC LETTER FEH WITH DOT BELOW AND THREE DOTS ABOVE
08A5 ARABIC LETTER QAF WITH DOT BELOW
08A6 ARABIC LETTER LAM WITH DOUBLE BAR
08A7 ARABIC LETTER MEEM WITH THREE DOTS ABOVE
08A8 ARABIC LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE
    • Adamawa Fulfulde
    • used for the implosive palatal approximant, realized as pharyngealization of the approximant
    → 01B4 ƴ latin small letter y with hook
08A9 ARABIC LETTER YEH WITH TWO DOTS BELOW AND DOT ABOVE
    • Adamawa Fulfulde
    • used for the voiced palatal nasal
    → 0272 ɲ latin small letter n with left hook

Dependent consonants for Rohingya
08AA ARABIC LETTER REH WITH LOOP
    = bottya-reh
08AB ARABIC LETTER WAW WITH DOT WITHIN
    = nota-wa
08AC ARABIC LETTER ROHINGYA YEH
    = bottya-yeh

Arabic letters for European and Central Asian languages
08AD ࢭ ARABIC LETTER LOW ALEF
    • Bashkir, Tatar
08AE ࢮ ARABIC LETTER DAL WITH THREE DOTS BELOW
    • Belarusian
08AF ࢯ ARABIC LETTER SAD WITH THREE DOTS BELOW
    • Belarusian
08B0 ࢰ ARABIC LETTER GAF WITH INVERTED STROKE
    • Crimean Tatar, Chechen, Lak
08B1 ࢱ ARABIC LETTER STRAIGHT WAW
    • Tatar

Arabic letter for Berber
08B2 ࢲ ARABIC LETTER ZAIN WITH INVERTED V ABOVE

Arabic letters for Arwi
08B3 ARABIC LETTER AIN WITH THREE DOTS BELOW
08B4 ARABIC LETTER KAF WITH DOT BELOW

Arabic letters for Bravanese
08B6 ARABIC LETTER BEH WITH SMALL MEEM ABOVE
08B7 ARABIC LETTER PEH WITH SMALL MEEM ABOVE
08B8 ARABIC LETTER TEH WITH SMALL TEH ABOVE
08B9 ARABIC LETTER REH WITH SMALL NOON ABOVE

Arabic letters for Warsh orthography
The Warsh orthography is the most widespread tradition for the Arabic script in North and West Africa.
08BB ARABIC LETTER AFRICAN FEH
    • initial and medial forms have one dot below
    → 06A1 arabic letter dotless feh
    → 06A2 arabic letter feh with dot moved below
08BC ARABIC LETTER AFRICAN QAF
    • initial and medial forms have one dot above
    → 066F arabic letter dotless qaf
    → 06A7 arabic letter qaf with dot above
08BD ARABIC LETTER AFRICAN NOON
    • initial and medial forms have one dot above
    → 06BA arabic letter noon ghunna
    → 0646 arabic letter noon

Quranic annotation sign
08D3 ARABIC SMALL LOW WAW

Pakistani Quranic marks
08D4 ARABIC SMALL HIGH WORD AR-RUB
08D5 ARABIC SMALL HIGH SAD
08D6 ARABIC SMALL HIGH AIN
08D7 ARABIC SMALL HIGH QAF
08D8 ARABIC SMALL HIGH NOON WITH KASRA
08D9 ARABIC SMALL LOW NOON WITH KASRA
08DA ARABIC SMALL HIGH WORD ATH-THALATHA
08DB ARABIC SMALL HIGH WORD AS-SAJDA
08DC ARABIC SMALL HIGH WORD AN-NISF
08DD ARABIC SMALL HIGH WORD SAKTA
08DE ARABIC SMALL HIGH WORD QIF
08DF ARABIC SMALL HIGH WORD WAQFA
08E0 ARABIC SMALL HIGH FOOTNOTE MARKER
08E1 ARABIC SMALL HIGH SIGN SAFHA
08E2 ARABIC DISPUTED END OF AYAH

Extended vowel sign for Arwi
08E3 ARABIC TURNED DAMMA BELOW

Extended vowel signs for Rohingya
08E4 ARABIC CURLY FATHA
08E5 ARABIC CURLY DAMMA
08E6 ARABIC CURLY KASRA
08E7 ARABIC CURLY FATHATAN
08E8 ARABIC CURLY DAMMATAN
08E9 ARABIC CURLY KASRATAN

Tone marks for Rohingya
08EA ARABIC TONE ONE DOT ABOVE
08EB ARABIC TONE TWO DOTS ABOVE
08EC ARABIC TONE LOOP ABOVE
08ED ARABIC TONE ONE DOT BELOW
08EE ARABIC TONE TWO DOTS BELOW
08EF ARABIC TONE LOOP BELOW

Quranic annotation signs
08F0 ARABIC OPEN FATHATAN
    = successive fathatan
08F1 ARABIC OPEN DAMMATAN
    = successive dammatan
08F2 ARABIC OPEN KASRATAN
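
The names list pairs each code position with a character name. As a minimal sketch that is not part of the original document, such pairs can be cross-checked against the Unicode Character Database bundled with Python's standard `unicodedata` module. The `SAMPLE` entries and the `check_names` helper are illustrative names chosen here. The sample is limited to long-encoded NKo characters because characters newly proposed in this amendment may be absent from an older bundled database; those would be reported with a database name of `None` rather than raising an error.

```python
import unicodedata

# Sample entries copied from the NKo names list: (code position, listed name).
SAMPLE = [
    (0x07C0, "NKO DIGIT ZERO"),
    (0x07C1, "NKO DIGIT ONE"),
    (0x07C9, "NKO DIGIT NINE"),
    (0x07CA, "NKO LETTER A"),
    (0x07F8, "NKO COMMA"),
]


def check_names(entries):
    """Return (code point, listed name, database name) for every mismatch."""
    mismatches = []
    for cp, listed in entries:
        # name() returns the supplied default (None) when the bundled
        # database has no entry, e.g. for characters newer than it.
        found = unicodedata.name(chr(cp), None)
        if found != listed:
            mismatches.append((cp, listed, found))
    return mismatches


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp, listed, found in check_names(SAMPLE):
        print(f"U+{cp:04X}: listed {listed!r}, database has {found!r}")
```

Running the script prints the bundled database version followed by one line per mismatch, so an empty report means every sampled name agrees with the database in use.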