IDUG EU 2006 Roland Schock: Code Sets, NLS and Character Conversion vs. DB2

D16
Code sets, NLS and character conversion vs. DB2
Roland Schock
ARS Computer und Consulting GmbH
05.10.2006 • 11:45 a.m. – 12:45 p.m.
Platform: DB2 for Linux, Unix, Windows

Code sets and character conversion are usually neglected during database design and usage. Everybody expects them to work correctly without any effort. But practice shows that their details and impact are often misunderstood, and that a few details can help administrators and database developers do the right thing. After some necessary definitions, this presentation describes how you can specify the code page used. You will see what character conversion is and how to avoid common problems. At the end we will briefly discuss performance impacts.

Overview
• What are character sets, encoding schemes and code pages?
• Where can I define the code page used?
• What is code page conversion and where does it happen?
• What problems can arise and how can I avoid them?
• Performance considerations

On the next few slides we will define basic terms frequently used for this topic. The terms are widely used, but often only partially understood. When problems occur, it is essential to understand the concepts in order to deduce the origin of the problem.

Character Sets
• Basically, a character set is just a collection of entities or graphical symbols with a meaning.
• Examples of character sets are the Latin alphabet, digits, naval flag signs or other symbols: A, B, C, ... α γ π ξ ᇹ ぁゆ㌹㌺ 亹怔떟떥

Here we use the word 'set' in the mathematical sense: an unordered collection of elements. One of the most widely used character sets in Europe is the Latin alphabet. But this is just a very small subset of the character sets needed for the most common languages. Other, less obvious character sets are naval flag signs, symbols for the sign language of the deaf, Japanese, Chinese or other Asian characters, etc.

Character Encoding
• A character encoding or code page is a mapping of the symbols of a character set to bit patterns, which are also referred to as code points. A → 17, B → 23, C → 42, …
• Typical examples of encodings are ASCII, EBCDIC or Unicode.
• Part of the encoding scheme is also the definition of a serialisation scheme to convert the code point into a sequence of bytes.

The symbols of a character set are now put in a sequence and numbered. The ordinal number is then used as the code point for this symbol. If we have more than 256 symbols, a single byte isn't enough to encode a character, and we have to think about an encoding scheme.

ASCII
• Sample of an encoding scheme:
• First version 1963, standardized 1968
• Ordered mapping to 7-bit numbers

Single Byte Char Sets (SBCS)
• Extensions from 7-bit ASCII to 8-bit code pages
• ISO-8859-x: ASCII + special characters for some languages
• Platform-specific charsets: Windows ANSI or MacRoman

ISO-8859-1 (Latin 1): ASCII + special characters for Western European languages
ISO-8859-2 (Latin 2): ASCII + special characters for Eastern European languages
ISO-8859-3, -4, ..., -14: ASCII + special characters for Arabic, Greek, Turkish, Hebrew, Thai or Baltic languages
ISO-8859-15: modified ISO-8859-1 including the Euro symbol (€)

Double Byte Char Sets (DBCS)
• Expansion of the SBCS concept from one byte to two bytes per character
• Mainly used for Asian languages with more than 256 characters to encode
• Latin text is expanded to twice the size of SBCS

EUC (Extended Unix Code)
• Multi Byte Char Set (MBCS): 2 or 4 bytes/char
• Only used for Japanese, Korean, Traditional and Simplified Chinese on Unix platforms
• Uses single shift characters to switch to another code group to build a multi-byte character

Unicode
• Intended to simplify and unify the different definitions of code pages and hence conversion.
• The first definition contained 65536 characters (16-bit, 1991, UCS-2).
• Version 2.0 extended the charset with 16 planes for up to 1,114,112 characters (32-bit, 1996, UCS-4).
• Today in Unicode Version 4.0 we have approx. 100,000 characters assigned to code points.

See also: http://www.unicode.org

Unicode char sets and encodings
• UCS-2: two bytes per character
• UCS-4: four bytes per character
• UTF-16: Encoding of UCS-4 into one or two words: the first 64k code points use two bytes per character, all others four bytes
• UTF-8: dynamic or variable-length encoding of characters with one to four bytes
• Possible problems with UCS-2, UCS-4, UTF-16: Byte order differences (big-endian vs. little-endian) between different processor architectures.

Besides a mapping of characters to numbers, an encoding scheme is essential to store the data in a sequence of bytes. The simplest encoding is to store a 16-bit or 32-bit wide code point in 2 or 4 bytes. This is used in UCS-2 or UCS-4. But this encoding scheme is not very efficient for Latin texts, which mainly consist of ASCII characters: a text string would consist largely of 00 bytes. This would also cause problems for the string functions of the C programming language, as it uses a null byte as the termination character. UTF-8 is an encoding scheme that distributes the bits of a code point across one or more bytes. This requires a more sophisticated routine to read and write strings, but it allows continued use of the C string functions. Details of the UTF-8 encoding are on the next slide.

UTF-8
• Encoding in a variable-length sequence of bytes
• Simple recognition of multibyte chars
• Compact storage of text in Latin chars
• Only the shortest encoding allowed

Overview
• What are character sets, encoding schemes and code pages?
• Where can I define the code page used?
• What is code page conversion and where does it happen?
• What problems can arise and how can I avoid them?
• Performance considerations

Usage of a code page
Code pages can be specified at different levels:
• At the operating system where the application runs
• At the operating system where the server runs
• At the operating system where the application is prepared/bound
• At the database level

In a client/server environment, the code page used on a client need not be the same as the code page used on the server. Local applications tend to use the locally defined code page of the operating system as a default. A special situation can occur in a multiplatform environment, where the clients, the server and the application developers generating code with static SQL all use different code pages on their machines. During compilation of programs with embedded static SQL a precompile pass is used, which needs a database connection. By default the local code page is used, which can be different from the other users' code pages. If a user later accesses the static SQL, a code page conversion can happen to convert the data first to the code page used for the static SQL. During creation of a database the administrator can specify the code page of the database. This can't be changed afterwards.

Default code page
• By default, DB2 server and clients use the local settings of the operating system or user:
  • Windows: The server process uses the default region settings of the operating system.
  • Linux/Unix: The code page is derived from the locale setting of the instance user (i.e. the user running the database processes).
  • Client (LUW): The current locale settings of the user determine the code page used during CONNECT.
  • Programming language: Java always uses Unicode when connecting to a database via JDBC.

Specifying a code page: OS level
• Windows: Control Panel → Regional and Language settings, chcp command
• Linux/Unix: locale command

At prepare/bind time
• Special case during development of database software with static, embedded SQL.
• Embedded SQL needs a prepare phase before compilation of the source code.
• Later the prepared package needs to be bound to the database with the bind command.
• Both commands need a database connection, and at connect time the current setting of the locale is used.

Defining a database w/ code page
• Explicitly set the code page at creation time:
    CREATE DB test USING CODESET codeset TERRITORY territory COLLATE USING collatingseq
• Otherwise the current locale is used to determine the database codeset.
• The chosen code page cannot be changed later.
• In DB2 for iSeries and for z/OS you can also define single columns of a table in a different code set (not detailed here).

Overview
• What are character sets, encoding schemes and code pages?
• Where can I define the code page used?
• What is code page conversion and where does it happen?
• What problems can arise and how can I avoid them?
• Performance considerations

Code page conversion
• If application and server use different code pages, code page conversion happens.
• Code page conversion is always done on the receiver's side:
  • on the server's side for data sent from client to server
  • on the client's side for data sent from server to client
• Exception: Importing IXF files generated on a different system with another code page
• If conversion tables are missing: SQLCODE -332

In some rare cases a code page conversion is done more than once. If you import IXF files on a client machine, a local code page conversion is performed if the IXF files were generated on another machine with a different code page (e.g. export data on a Windows machine to IXF and import the data on a Linux machine).
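
The receiver-side conversion described above can be illustrated with a minimal Python sketch. It is not how DB2 is implemented: Python's built-in codecs merely stand in for DB2's conversion tables, the helper name convert_code_page is hypothetical, and the pairing of code page 1252 (a Windows client) with ISO-8859-15 (a Linux server) is an assumption chosen because the two disagree on the byte for the Euro symbol.

```python
# Illustration only: Python codecs stand in for DB2's internal conversion tables.
# convert_code_page is a hypothetical helper, not a DB2 or driver API.

def convert_code_page(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the sender's code page and re-encode for the receiver."""
    return data.decode(source).encode(target)

# The text "10 €" as a client using Windows code page 1252 would send it.
sent = "10 €".encode("cp1252")

# The receiver (assumed here to use ISO-8859-15) converts on arrival.
received = convert_code_page(sent, "cp1252", "iso8859_15")

print(sent.hex(" "))      # 31 30 20 80
print(received.hex(" "))  # 31 30 20 a4

# Without conversion the receiver would misread byte 0x80,
# which is a control character in ISO-8859-15, not the Euro symbol.
misread = sent.decode("iso8859_15")
print(misread == "10 €")  # False
```

The ASCII part of the string is byte-identical in both code pages; only the Euro symbol changes, which is why such problems often go unnoticed until special characters appear in the data.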