Genuine Han Unification

Dr. Ken Lunde | Senior Computer Scientist | Adobe Systems Incorporated
© 2011 Adobe Systems Incorporated. All Rights Reserved.
IUC35, Santa Clara, CA, USA, Earth

The All-Important Disclaimer…

This presentation contains events that may happen…
…not those that necessarily will happen ☺

Pop Quiz: Chinese or Japanese?

  • 一 Chinese & Japanese → U+4E00
  • 実 Chinese, 実 Japanese → U+5B9F
  • 骨 Chinese (Traditional) & Japanese, 骨 Chinese (Simplified) → U+9AA8

Background & Goals

  • East Asian writing systems have changed over time, through reform
  • Reform is sometimes revolutionary: the hànzì simplification of the 1950s (頭 → 头)
  • Character form (glyph) preferences change over time
  • What was once deemed unacceptable is now the norm
  • Today’s East Asian fonts fall into two categories
  • Single-region fonts with a single glyph per CJK Unified Ideograph
  • Multiple-region (aka, Pan-CJK) fonts with multiple glyphs per CJK Unified Ideograph
  • Requires a lot of time and effort, and can lead to extrapolation
  • Some are useful only for a single region, but may need to be rendered for other regions
  • East Asian writing systems will change in the future
  • Can Unicode act as a catalyst for a single-glyph model that serves all regions?
  • Will greater cross-cultural communication play a role?

What Is “Han Unification”? What Is Being Unified?

  • “Unification” of the most important East Asian (“Han”) character set standards
  • Hànzì (Chinese), kanji (Japanese), and hanja (Korean)
  • The end result is a single repertoire
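
The code points quoted in the pop quiz can be checked programmatically. What follows is a minimal Python sketch and not part of the original presentation; the helper name `describe` is hypothetical. Using only the standard-library `unicodedata` module, it illustrates that each quiz ideograph occupies a single code point whatever the region, while the simplification pair 頭 → 头 from the reform example is encoded as two separate characters.

```python
import unicodedata

# Ideographs from the pop quiz, with the code points given on the slides.
QUIZ = {"一": 0x4E00, "実": 0x5B9F, "骨": 0x9AA8}


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch, code_point in QUIZ.items():
    # One character, one code point: the regional glyph is not encoded here.
    assert ord(ch) == code_point
    print(ch, describe(ch))

# The traditional and simplified forms of "head" are NOT unified:
# they are distinct characters with distinct code points.
traditional, simplified = "頭", "头"
print(describe(traditional))
print(describe(simplified))
print("same code point:", ord(traditional) == ord(simplified))
```

Which regional glyph a reader sees for a code point such as U+9AA8 is decided by the font and by language tagging, not by the character code, which appears to be the distinction the quiz is drawing.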