L2/19-388 (New Unihan Database Property: kUnihanCore2020)


L2/19-388

Title: New Unihan Database property: kUnihanCore2020
Author: Ken Lunde
Date: 2019-12-02

Per L2/18-066R2, I previously proposed what I considered to be modest changes to the existing kIICore property, mainly to address some shortcomings that were identified in a series of five CJK Type Blog articles (appended to this document). Given the reluctance on the part of some national bodies to accept such modest changes, I decided to instead propose via L2/18-279R a completely new Unihan Database property that releases the set from being hampered by memory constraints that may have been applicable 15 years ago, but which arguably no longer apply to modern environments.

The new Unihan Database property name is kUnihanCore2020, which includes as part of its name the year in which the first version of Unicode that would include this new property is released, specifically Version 13.0.

The attached unihancore2020-data.txt data file provides all of the property data, which covers 20,652 CJK Unified Ideographs and 68 CJK Compatibility Ideographs. Compared to the existing kIICore property, the proposed kUnihanCore2020 property includes 10,910 additional ideographs.

22 ideographs that have a kIICore property value failed to meet the criteria for the kUnihanCore2020 property, but have been grandfathered.
The following table lists these 22 grandfathered ideographs, their kIICore property value, and the source reference that corresponds to their source tag:

Code Point   kIICore Property Value   Corresponding Source Reference
U+3960 㥠    CK                       K3-2554
U+4137 䄷    CK                       K3-2D4F
U+48B5 䢵    CG                       G5-6F4F
U+48C5 䣅    CG                       G3-6F29
U+48D3 䣓    CG                       G3-7B67
U+49D1 䧑    CG                       GKX-1352.16
U+4A12 䨒    CK                       K3-3455
U+4CB3 䲳    CT                       T3-5028
U+4D08 䴈    CT                       T4-6C52
U+593D 夽    CK                       K2-2B54
U+5D44 嵄    CK                       K2-2F33
U+5F34 弴    CJ                       J13-7436
U+5F45 彅    CJ                       J13-743A
U+66A3 暣    CK                       K1-5B6F
U+713F 焿    CT                       T3-6552
U+7807 砇    CK                       none
U+7A66 穦    CK                       none
U+974D 靍    CJ                       J3-7D68
U+974F 靏    CJ                       J13-7D6A
U+9964 饤    CG                       G8-2D43
U+997E 饾    CG                       G8-2D48
U+9AD9 髙    CJ                       none

Also see the attached grandfathered-22.txt data file.

The seven sections that follow describe the scope of each of the seven supported source tags, which are the same as those used by the existing kIICore property.

G—PRC

The scope of the “G” source tag is the union of the GB 2312 (6,763), TGH-2013/通用规范汉字表/Tōngyòng Guīfàn Hànzìbiǎo (8,105; see the kTGH property), and 现代汉语通用字表/Xiàndài Hànyǔ Tōngyòngzìbiǎo (7,000) standards, which results in 8,241 unique ideographs, all of which are CJK Unified Ideographs. This figure is only 136 ideographs more than TGH-2013 itself.

The following six ideographs were grandfathered from the kIICore property and use the “G” source tag: U+48B5 䢵 (G5), U+48C5 䣅 (G3), U+48D3 䣓 (G3), U+49D1 䧑 (GKX), U+9964 饤 (G8) & U+997E 饾 (G8). The total number of ideographs with the “G” source tag is therefore 8,247.

SPECIAL NOTES: 22 existing kIICore ideographs with the “G” source tag are excluded, because they are outside the scope of the three specified standards, but are included via other source tags. See the attached excluded-g-22.txt data file.

H—Hong Kong SAR

The scope of the “H” source tag is the union of the Big Five (13,060; see the kBigFive property) and HKSCS (4,603) standards, which results in 17,663 unique ideographs, 11 of which are CJK Compatibility Ideographs.
There is no overlap between these two standards.

J—Japan

The scope of the “J” source tag is the union of the JIS X 0208 (6,356), 常用漢字/Jōyō Kanji (2,136; see the kJoyoKanji property), 人名用漢字/Jinmei-yō Kanji (863; see the kJinmeiyoKanji property), and 表外漢字/Hyōgai Kanji (1,022) standards, which results in 6,485 unique ideographs, 58 of which are CJK Compatibility Ideographs. This figure is only 129 more ideographs than JIS X 0208 itself.

The following five ideographs were grandfathered from the kIICore property and use the “J” source tag: U+5F34 弴 (J13), U+5F45 彅 (J13), U+974D 靍 (J3), U+974F 靏 (J13) & U+9AD9 髙 (no kIRG_JSource). The total number of ideographs with the “J” source tag is therefore 6,490.

SPECIAL NOTES: One existing kIICore ideograph with the “J” source tag is excluded, because it is outside the scope of the four specified standards, but is included via other source tags. See the attached excluded-j-1.txt data file.

K—ROK

The scope of the “K” source tag is the union of the KS X 1001 (4,620) and 한문 교육용 기초 한자/漢文敎育用基礎漢字/Hanmun Gyoyug-yong Gicho Hanja (1,800; see the kKoreanEducationHanja property) standards, which results in 4,632 unique ideographs, all of which are CJK Unified Ideographs. This figure is only 12 more ideographs than KS X 1001 itself.

The following eight ideographs were grandfathered from the kIICore property and use the “K” source tag: U+3960 㥠 (K3), U+4137 䄷 (K3), U+4A12 䨒 (K3), U+593D 夽 (K2), U+5D44 嵄 (K2), U+66A3 暣 (K1), U+7807 砇 (no kIRG_KSource) & U+7A66 穦 (no kIRG_KSource). The total number of ideographs with the “K” source tag is therefore 4,640.

SPECIAL NOTES: 126 existing kIICore ideographs with the “K” source tag are excluded, because they are outside the scope of the two specified standards, but are included via other source tags. See the attached excluded-k-126.txt data file.
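The H and K scopes are each a union of exactly two standards, so the size of their overlap follows from inclusion-exclusion: |A ∪ B| = |A| + |B| − |A ∩ B|. The sketch below reuses only the counts quoted in those two sections. The function name is illustrative, and the K overlap it prints is derived here rather than stated in the proposal.

```python
def implied_overlap(size_a: int, size_b: int, union_size: int) -> int:
    """Return |A ∩ B| given |A|, |B| and |A ∪ B| (inclusion-exclusion)."""
    return size_a + size_b - union_size


# H: Big Five (13,060) and HKSCS (4,603), union of 17,663.
h_overlap = implied_overlap(13_060, 4_603, 17_663)

# K: KS X 1001 (4,620) and the education hanja set (1,800), union of 4,632.
k_overlap = implied_overlap(4_620, 1_800, 4_632)

print(h_overlap)          # overlap implied for the H scope
print(k_overlap)          # overlap implied for the K scope
print(1_800 - k_overlap)  # education hanja that are not in KS X 1001
```

A zero result for H agrees with the statement that the two standards do not overlap, and the last line for K agrees with the 12 ideographs that the union adds to KS X 1001.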
M—Macao SAR

The scope of the “M” source tag is the union of the Big Five standard (13,060; see the kBigFive property) and the existing kIICore ideographs that have the “M” source tag (4,954), which results in 13,119 unique ideographs, all of which are CJK Unified Ideographs. This figure is only 59 more ideographs than Big Five itself.

SPECIAL NOTES: Only one existing kIICore ideograph with the “M” source tag, U+5F66 彦, is excluded for reasons explained in the 2018-02-15 CJK Type Blog article, but is covered by four of the other six source tags (G, J, K & P):

  Only one ideograph, U+5F66 彦, stands out as odd in that its source references do not suggest Macao SAR use. Its related ideograph, U+5F65 彥, is also tagged “M” in kIICore (ATHM), and its source references, particularly T1-507D, more strongly suggest Macao SAR use.

See the attached excluded-m-1.txt data file.

P—DPRK

The scope of the “P” source tag is the KPS 9566 (4,653) standard, which means that this is unchanged from kIICore.

T—ROC

The scope of the “T” source tag is the union of the CNS 11643 Levels 1 & 2 (13,064) and Big Five (13,060; see the kBigFive property) standards, which results in 13,065 unique ideographs, all of which are CJK Unified Ideographs.

The following three ideographs were grandfathered from the kIICore property and use the “T” source tag: U+4CB3 䲳 (T3), U+4D08 䴈 (T4) & U+713F 焿 (T3). The total number of ideographs with the “T” source tag is therefore 13,068.

SPECIAL NOTES: 90 existing kIICore ideographs with the “T” source tag are excluded, because they are outside the scope of the two specified standards, but are included via other source tags. See the attached excluded-t-90.txt data file.

No Priority Tags

Because the notion of priority is largely source-specific, the kUnihanCore2020 property does not have a provision to specify priority tags. The author of the proposal felt that they are not necessary, and that the source tags are sufficient.
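The per-tag totals in the G, J, K and T sections can be cross-checked against the table of 22 grandfathered ideographs. The following is a minimal sketch: the code points, kIICore values and union sizes are copied from this document, while the variable names are illustrative. It relies on every grandfathered value consisting of the priority letter C followed by a single source letter.

```python
from collections import Counter

# (code point, kIICore value) pairs copied from the grandfathered table.
GRANDFATHERED = [
    (0x3960, "CK"), (0x4137, "CK"), (0x48B5, "CG"), (0x48C5, "CG"),
    (0x48D3, "CG"), (0x49D1, "CG"), (0x4A12, "CK"), (0x4CB3, "CT"),
    (0x4D08, "CT"), (0x593D, "CK"), (0x5D44, "CK"), (0x5F34, "CJ"),
    (0x5F45, "CJ"), (0x66A3, "CK"), (0x713F, "CT"), (0x7807, "CK"),
    (0x7A66, "CK"), (0x974D, "CJ"), (0x974F, "CJ"), (0x9964, "CG"),
    (0x997E, "CG"), (0x9AD9, "CJ"),
]

# Union sizes before grandfathering, as stated in the G, J, K and T sections.
UNION_SIZE = {"G": 8_241, "J": 6_485, "K": 4_632, "T": 13_065}

# Drop the leading priority letter; what remains is the single source tag.
grandfathered_per_tag = Counter(value[1:] for _, value in GRANDFATHERED)

# Total per tag = union of the specified standards + grandfathered ideographs.
totals = {tag: UNION_SIZE[tag] + grandfathered_per_tag[tag] for tag in UNION_SIZE}

print(dict(grandfathered_per_tag))
print(totals)
```

The per-tag counts sum to 22, and the resulting totals can be compared with the figures quoted at the end of each section (8,247 for G, 6,490 for J, 4,640 for K and 13,068 for T).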
CJK Compatibility Ideographs

Although the kUnihanCore2020 property specifies source tags for 68 CJK Compatibility Ideographs (11 with the “H” source tag, and 57 with the “J” source tag), it is expected that their corresponding SVSes (Standardized Variation Sequences) be used in actual implementations. In addition, the CJK Compatibility Ideographs that correspond to the Big Five (2) and KS X 1001 (268) standards have been intentionally excluded, because they represent genuine duplicate ideographs.

See the attached svs-68.txt data file that provides a correspondence between these 68 CJK Compatibility Ideographs and their SVSes.

That is all.

CJK Type Blog (adobe.com)

Exploring IICore—Part 1

By Dr. Ken Lunde
Created February 5, 2018

Today’s article is the very first one that references IICore (International Ideographs Core), which is best described as a region-agnostic subset that includes the most commonly used CJK Unified Ideographs in Unicode, and is intended for use in memory-challenged devices and environments. Included are 9,810 ideographs, the bulk of which are in the URO (9,706), with the remaining ones in Extensions A (42) and B (62).

IICore is instantiated as the kIICore property of the Unihan Database, and documented in UAX #38. The kIICore property values consist of an initial letter (A, B, or C) that indicates priority, followed by one or more letters that specify a source that more or less corresponds to a region: G, H, J, K, M, P (short for KP), and T.

In Part 1 of what may eventually become a multiple-part series about IICore, I will briefly explore the ideographs that are tagged “K” for Korean use, along with pointing out some that should have been tagged “K” after examining the mappings to the KS X 1001 standard.
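The kIICore value syntax described above (one priority letter followed by one or more source letters) is simple enough to validate with a regular expression. The following is a minimal sketch under that description only: the names `IICoreValue` and `parse_iicore` are hypothetical, and the pattern does not check for repeated source letters or for any ordering that UAX #38 may require.

```python
import re
from typing import NamedTuple, Tuple


class IICoreValue(NamedTuple):
    priority: str            # "A", "B" or "C"
    sources: Tuple[str, ...]  # one or more of G, H, J, K, M, P, T


# One priority letter, then one or more source letters.
_IICORE_RE = re.compile(r"([ABC])([GHJKMPT]+)")


def parse_iicore(value: str) -> IICoreValue:
    """Split a kIICore value into its priority letter and source letters."""
    match = _IICORE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a kIICore value: {value!r}")
    return IICoreValue(match.group(1), tuple(match.group(2)))


print(parse_iicore("CK"))    # value of U+3960 in the grandfathered table
print(parse_iicore("ATHM"))  # value quoted for U+5F65 in the M section
```

Because kUnihanCore2020 has no priority tags, its values would correspond to the `sources` part alone.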