A Look at Unicode and COBOL by Sasirekha Cota

Total Page:16

File Type:pdf, Size:1020Kb

A Look at Unicode and COBOL by Sasirekha Cota A Look at Unicode and COBOL By Sasirekha Cota Some IT experts have predicted that COBOL would die out, but some new applications are being built in COBOL.Although Unicode is the default in Java, it is now entering the COBOL world only.This article defines Unicode, explains what COBOL offers in terms of Unicode support, factors to consider before using Unicode, and the benefits of using it. OVERVIEW acters to numbers are done using different With documents getting transferred elec- encoding systems. tronically rather than the printed media across Industry pundits have been predicting the The 7-bit ASCII was not sufficient to rep- countries that use different alphabets, the end of COBOL for the past two decades, resent all the characters used by many of the problems like accented Latin characters get- whereas a recent survey shows that there are world’s languages. Multiple code pages are ting replaced by Greek characters arise. between 180 billion and 200 billion lines of created to meet this challenge. DBCS are Similar problems arise when documents are COBOL code in use worldwide. Before you used for handling languages like Chinese, moved across operating systems. conclude that COBOL is not the language of Japanese and Korean with thousands of Unicode that assigns a unique number to choice for new development, note that an esti- ideographs. DBCSs are encoded in such a each character in each of the major languages mated 15% of all new applications are being way as to allow double-byte character of the world is the solution to this problem built in COBOL. encodings to be mixed with single-byte created by the assortment of 8-bit fonts limit- COBOL has been undergoing a silent evolu- character encodings. ed to representing 256 unique characters. tion, and the latest IBM Enterprise COBOL for The basic issue in this approach is that two z/OS brings new features that help integrate encodings can use the same number for two WHAT IS UNICODE? COBOL business processes and Web-oriented different characters, or use different numbers business processes. IBM extensions1 range for the same character. The computer has to Unicode provides a unique number for from minor relaxation of rules to major support multiple encoding schemes, and the every character, no matter what the platform, capabilities, such as XML support, Unicode data always ran the risk of getting corrupted no matter what the program, no matter what support, object-oriented COBOL for Java when passed between computers. the language. interoperability, and Double-Byte Character Unicode is the Universal-encoding stan- Sets (DBCS) character handling. NEED FOR UNICODE dard for (eventually) all the characters and Though Unicode is the default in Java, it is symbols used in all the languages of the entering the COBOL world now only. This Of the maximum of 256 characters in each world, past and present. It has merged with article describes what COBOL offers in terms code page set, the first 128 characters include and is compatible with international stan- of Unicode support. For the benefit of hard- English alphabets, punctuation marks and dard ISO/IEC 10646 (also known as the core COBOL programmers, let us first numbers. In the English-speaking world, the Universal Character Set, or UCS, for short). establish what Unicode is and what problems second set of 128 characters comprises more Unicode is a registered trademark of it is expected to solve. punctuation marks, some currency symbols Unicode, Inc., and is in development by the (such as £ and ¥), and a lot of accented letters Unicode Consortium. BEFORE UNICODE (such as á, ç, è, ñ, ô and ü). In countries like Unicode is required by standards like Egypt or Greece that use a different alphabet, XML, Java, LDAP, CORBA, and WML. Computers deal with numbers ONLY, and the second set of 128 is used for characters Unicode enables a single software product all the alphabets and other characters are from the Arabic, Greek, Hebrew, Cyrillic or or a single Web site to be used across multi- stored as numbers. The mapping of the char- Thai alphabets. ple platforms, languages and countries. It ©2004 Technical Enterprises, Inc. Reproduction of this document without permission is prohibited. Technical Support | March 2004 allows data to be transported through many different systems with- FIGURE 1: THE FIGURATIVE CONSTANT VALUES IF THE CONTEXT IS A out risking corruption. NATIONAL CHARACTER DESIGN OF UNICODE Figurative Constant National Character Value ZERO/ZEROS/ZEROES NX’0030’ The requirement for a character set that is universal, efficient, uni- SPACE/SPACES NX’0020’ form and unambiguous forms the basis for the design of the Unicode QUOTE/QUOTES NX’0027’ HIGH-VALUE / LOW-VALUE <Doesn’t Apply> Standard that assigns each character a unique numeric value and name. Each character is assigned a single number called a code point and in FIGURE 2: EXAMPLE OF NATIONAL-OF FUNCTION text form prefixed by “U.” For example, character “A” is represented by code point U+0041 (hex number 0041 and decimal equivalent 65) and is assigned the character name “LATIN CAPITAL LETTER A.” 01 Japanese-MBCS-DATA PIC X(20). 01 National-data PIC N(10) Usage National. A single code is used for representing characters or ideographs com- mon to multiple languages avoiding duplicate encoding of the characters. Move Function NATIONAL-OF(Japanese-MBCS-data, 1399) To The ASCII encoding forms the first half of ISO-8859-1 (Latin1) and National-data forms the first page Unicode, making it easier to apply Unicode to existing software and data. FIGURE 3: EXAMPLE OF DISPLAY-OF FUNCTION The Unicode Standard has three encoding forms with 8, 16 or 32 bits per code unit. The data can be transformed between these encoding Move Function DISPLAY-OF(National-data,1399) To Japanese- forms efficiently without any loss of information. MBCS-data CHARACTERS IN COBOL FIGURE 4: EXAMPLE TO CONVERT FROM EBCDIC TO ASCII The most basic and indivisible unit of the COBOL language is the 77 EBCDIC-CCSID PIC 9(4) BINARY VALUE 1140. character. The basic character set is extended with the IBM Double- 77 ASCII-CCSID PIC 9(4) BINARY VALUE 819. Byte Character Set (DBCS). DBCS characters occupy two adjacent 77 Input-EBCDIC PIC X(80). 77 ASCII-Output PIC X(80). bytes to represent one character. A character-string containing only DBCS characters is called a DBCS character-string. Move Function In IBM Enterprise COBOL, support for Unicode is provided by Display-of ( Function National-of (Input-EBCDIC EBCDIC- CCSID),ASCII-CCSID) NATIONAL data type. This allows the run-time character set that is to ASCII-output processed to include alphanumeric characters, DBCS characters and national characters. of a national literal is 80 character positions, excluding the opening and UNICODE IN COBOL closing delimiters. For example, define as: When we say Unicode support by Enterprise COBOL, the following 01 Data-in-Unicode PIC N(10) USAGE NATIONAL. are the areas to be understood: National literals in hexadecimal format have NX” or NX’ as the 1. Unicode data represented as National Data items and literals opening delimiter. The literal can include hexadecimal digits in the 2. Intrinsic functions DISPLAY-OF and NATIONAL-OF for character range ‘0’ to ‘9’; ‘a’ - f’; and ‘A’ to ‘F’. Each group of four hexadeci- conversions to and from Unicode mal digits represents a single national character and must represent a 2. Additional compiler options NSYMBOL and CODEPAGE in the valid code point in UTF-16. Hence, the length must be a multiple of Unicode context. four and can be a maximum of 320 digits. In case of national data items, the LENGTH function returns the Unicode capabilities are used implicitly by COBOL programs number of character positions, rather than the actual bytes occupied. that use object-oriented syntax for Java interoperability. Installing For example, LENGTH(Data-in-Unicode) will return a value of 10 and configuring Unicode conversion services is the prerequisite for though it occupies 20 bytes. compiling COBOL programs with National Characters or object- oriented syntax. For specific details on setting up and activating Compiler Options—NSYMBOL and CODEPAGE Unicode conversion services, refer to the Enterprise COBOL When the NSYMBOL(NATIONAL) compiler option is in effect, Customization Guide. USAGE NATIONAL is assumed and just the opening delimiter N” or N’ identifies a national literal. For example, N’AAABBBCCC’ is a National Data Items and Literals national literal in basic format. COBOL uses UTF-16 to represent national characters. This default The CODEPAGE compiler option lets you specify the code page encoding for Unicode occupies 2 bytes of storage and offers the best used for both data items and literals in the program. To support compromise between processing and storage. Unicode, IBM has included new Coded Character Set IDs (CCSID) in To define national data items holding Unicode strings, use PICTURE the range of 1200 to 1208. UTF-16 is supported by CCSID 1200, in clause with symbol “N” and USAGE NATIONAL. The maximum length big-endian format, and the CCSID for UTF-8 is 1208. Technical Support | March 2004 ©2004 Technical Enterprises, Inc. Reproduction of this document without permission is prohibited. Usage of National Literals Internationalization is the essential area where the industry is converg- National literals in the source program are converted to UTF-16 for ing on Unicode. Most significant application systems and third-party use at run time. products with international versions already support Unicode or When an alphanumeric item is moved to a national item, it is implicit- they are moving towards it. The important advantage they offer is ly converted to the UTF-16 Unicode. You can use this implicit conversion the option of having a single source (where multiple versions are achieved while moving to convert alphanumeric, DBCS and integer data earlier maintained for US, Chinese, Japanese).
Recommended publications
  • Unicode and Code Page Support
    Natural for Mainframes Unicode and Code Page Support Version 4.2.6 for Mainframes October 2009 This document applies to Natural Version 4.2.6 for Mainframes and to all subsequent releases. Specifications contained herein are subject to change and these changes will be reported in subsequent release notes or new editions. Copyright © Software AG 1979-2009. All rights reserved. The name Software AG, webMethods and all Software AG product names are either trademarks or registered trademarks of Software AG and/or Software AG USA, Inc. Other company and product names mentioned herein may be trademarks of their respective owners. Table of Contents 1 Unicode and Code Page Support .................................................................................... 1 2 Introduction ..................................................................................................................... 3 About Code Pages and Unicode ................................................................................ 4 About Unicode and Code Page Support in Natural .................................................. 5 ICU on Mainframe Platforms ..................................................................................... 6 3 Unicode and Code Page Support in the Natural Programming Language .................... 7 Natural Data Format U for Unicode-Based Data ....................................................... 8 Statements .................................................................................................................. 9 Logical
    [Show full text]
  • Legacy Character Sets & Encodings
    Legacy & Not-So-Legacy Character Sets & Encodings Ken Lunde CJKV Type Development Adobe Systems Incorporated bc ftp://ftp.oreilly.com/pub/examples/nutshell/cjkv/unicode/iuc15-tb1-slides.pdf Tutorial Overview dc • What is a character set? What is an encoding? • How are character sets and encodings different? • Legacy character sets. • Non-legacy character sets. • Legacy encodings. • How does Unicode fit it? • Code conversion issues. • Disclaimer: The focus of this tutorial is primarily on Asian (CJKV) issues, which tend to be complex from a character set and encoding standpoint. 15th International Unicode Conference Copyright © 1999 Adobe Systems Incorporated Terminology & Abbreviations dc • GB (China) — Stands for “Guo Biao” (国标 guóbiâo ). — Short for “Guojia Biaozhun” (国家标准 guójiâ biâozhün). — Means “National Standard.” • GB/T (China) — “T” stands for “Tui” (推 tuî ). — Short for “Tuijian” (推荐 tuîjiàn ). — “T” means “Recommended.” • CNS (Taiwan) — 中國國家標準 ( zhôngguó guójiâ biâozhün) in Chinese. — Abbreviation for “Chinese National Standard.” 15th International Unicode Conference Copyright © 1999 Adobe Systems Incorporated Terminology & Abbreviations (Cont’d) dc • GCCS (Hong Kong) — Abbreviation for “Government Chinese Character Set.” • JIS (Japan) — 日本工業規格 ( nihon kôgyô kikaku) in Japanese. — Abbreviation for “Japanese Industrial Standard.” — 〄 • KS (Korea) — 한국 공업 규격 (韓國工業規格 hangug gongeob gyugyeog) in Korean. — Abbreviation for “Korean Standard.” — ㉿ — Designation change from “C” to “X” on August 20, 1997. 15th International Unicode Conference Copyright © 1999 Adobe Systems Incorporated Terminology & Abbreviations (Cont’d) dc • TCVN (Vietnam) — Tiu Chun Vit Nam in Vietnamese. — Means “Vietnamese Standard.” • CJKV — Chinese, Japanese, Korean, and Vietnamese. 15th International Unicode Conference Copyright © 1999 Adobe Systems Incorporated What Is A Character Set? dc • A collection of characters that are intended to be used together to create meaningful text.
    [Show full text]
  • Data Encoding: All Characters for All Countries
    PhUSE 2015 Paper DH03 Data Encoding: All Characters for All Countries Donna Dutton, SAS Institute Inc., Cary, NC, USA ABSTRACT The internal representation of data, formats, indexes, programs and catalogs collected for clinical studies is typically defined by the country(ies) in which the study is conducted. When data from multiple languages must be combined, the internal encoding must support all characters present, to prevent truncation and misinterpretation of characters. Migration to different operating systems with or without an encoding change make it impossible to use catalogs and existing data features such as indexes and constraints, unless the new operating system’s internal representation is adopted. UTF-8 encoding on the Linux OS is used by SAS® Drug Development and SAS® Clinical Trial Data Transparency Solutions to support these requirements. UTF-8 encoding correctly represents all characters found in all languages by using multiple bytes to represent some characters. This paper explains transcoding data into UTF-8 without data loss, writing programs which work correctly with MBCS data, and handling format catalogs and programs. INTRODUCTION The internal representation of data, formats, indexes, programs and catalogs collected for clinical studies is typically defined by the country or countries in which the study is conducted. The operating system on which the data was entered defines the internal data representation, such as Windows_64, HP_UX_64, Solaris_64, and Linux_X86_64.1 Although SAS Software can read SAS data sets copied from one operating system to another, it impossible to use existing data features such as indexes and constraints, or SAS catalogs without converting them to the new operating system’s data representation.
    [Show full text]
  • JFP Reference Manual 5 : Standards, Environments, and Macros
    JFP Reference Manual 5 : Standards, Environments, and Macros Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. Part No: 817–0648–10 December 2002 Copyright 2002 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved. This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party software, including font technology, is copyrighted and licensed from Sun suppliers. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, docs.sun.com, AnswerBook, AnswerBook2, and Solaris are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.
    [Show full text]
  • Addendum 135-2008K-1
    ANSI/ASHRAE Addendum k to ANSI/ASHRAE Standard 135-2008 ASHRAEASHRAE STANDARDSTANDARD BACnet®—A Data Communication Protocol for Building Automation and Control Networks Approved by the ASHRAE Standards Committee on January 23, 2010; by the ASHRAE Board of Directors on January 27, 2010; and by the American National Standards Institute on January 28, 2010. This standard is under continuous maintenance by a Standing Standard Project Committee (SSPC) for which the Standards Committee has established a documented program for regular publication of addenda or revi- sions, including procedures for timely, documented, consensus action on requests for change to any part of the standard. The change submittal form, instructions, and deadlines may be obtained in electronic form from the ASHRAE Web site, http://www.ashrae.org, or in paper form from the Manager of Standards. The latest edi- tion of an ASHRAE Standard may be purchased from ASHRAE Customer Service, 1791 Tullie Circle, NE, Atlanta, GA 30329-2305. E-mail: [email protected]. Fax: 404-321-5478. Telephone: 404-636-8400 (world- wide), or toll free 1-800-527-4723 (for orders in US and Canada). © Copyright 2010 American Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc. ISSN 1041-2336 American Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc. 1791 Tullie Circle NE, Atlanta, GA 30329 www.ashrae.org ASHRAE Standing Standard Project Committee 135 Cognizant TC: TC 1.4, Control Theory and Application SPLS Liaison: Douglas T. Reindl David Robin, Chair* Coleman L. Brumley, Jr.* John J. Lynch Carl Neilson, Vice-Chair Bernhard Isler* Ted Sunderland Sharon E. Dinges, Secretary* Stephen Karg* David B.
    [Show full text]
  • Embrace Multilingual Data Chao Wang, MSD R&D (China) Co.,Ltd
    PharmaSUG China The Power of SAS National Language Support - Embrace Multilingual Data Chao Wang, MSD R&D (China) Co.,Ltd. Beijing ABSTRACT As clinical trials take place around the globe, data in different languages may need to be handled. The usual solution for this issue is to translate the data to English, after some manipulations, results then will be translated back to the original language. Since version 9.1.2, SAS software can provide a specific function called National Language Support, this issue would be easy to handle. In this paper, an introduction will be made on: 1) how to build the ready SAS session to import and display the mixed data with NLS system option 2) how to use the edge tool – SAS K-Function (including K-macro Function) to do data manipulations and 3) how to display any character in reports with the Unicode support. Here would use a multilingual (Chinese and English) AE dataset as an example on SAS 9.3 PC version. The related applications on the UNIX server and SAS Enterprise Guide would also be discussed. INTRODUCTION When conduct a multi-national study, data often comes from various countries using different languages and scripts, especially in Asia area. The different sources data will make the clinical operation more complex and cost additional budget and time. It always needs to translate the raw data to English and save the new information in database. And analysis and reporting are based on English words. After review, then translate the English reports back to the local language. Besides the CSR, the data monitoring may require to do some analysis based on the unclean raw data.
    [Show full text]
  • Progress Datadirect for ODBC Drivers Reference
    Progress® DataDirect® for ODBC Drivers Reference September 2020 Copyright © 2020 Progress Software Corporation and/or its subsidiaries or affiliates. All rights reserved. These materials and all Progress® software products are copyrighted and all rights are reserved by Progress Software Corporation. The information in these materials is subject to change without notice, and Progress Software Corporation assumes no responsibility for any errors that may appear therein. The references in these materials to specific platforms supported are subject to change. Corticon, DataDirect (and design), DataDirect Cloud, DataDirect Connect, DataDirect Connect64, DataDirect XML Converters, DataDirect XQuery, DataRPM, Defrag This, Deliver More Than Expected, Icenium, Ipswitch, iMacros, Kendo UI, Kinvey, MessageWay, MOVEit, NativeChat, NativeScript, OpenEdge, Powered by Progress, Progress, Progress Software Developers Network, SequeLink, Sitefinity (and Design), Sitefinity, SpeedScript, Stylus Studio, TeamPulse, Telerik, Telerik (and Design), Test Studio, WebSpeed, WhatsConfigured, WhatsConnected, WhatsUp, and WS_FTP are registered trademarks of Progress Software Corporation or one of its affiliates or subsidiaries in the U.S. and/or other countries. Analytics360, AppServer, BusinessEdge, DataDirect Autonomous REST Connector, DataDirect Spy, SupportLink, DevCraft, Fiddler, iMail, JustAssembly, JustDecompile, JustMock, NativeScript Sidekick, OpenAccess, ProDataSet, Progress Results, Progress Software, ProVision, PSE Pro, SmartBrowser, SmartComponent, SmartDataBrowser,
    [Show full text]
  • The Unicode Standard, Version 4.0--Online Edition
    This PDF file is an excerpt from The Unicode Standard, Version 4.0, issued by the Unicode Consor- tium and published by Addison-Wesley. The material has been modified slightly for this online edi- tion, however the PDF files have not been modified to reflect the corrections found on the Updates and Errata page (http://www.unicode.org/errata/). For information on more recent versions of the standard, see http://www.unicode.org/standard/versions/enumeratedversions.html. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial capital letters. However, not all words in initial capital letters are trademark designations. The Unicode® Consortium is a registered trademark, and Unicode™ is a trademark of Unicode, Inc. The Unicode logo is a trademark of Unicode, Inc., and may be registered in some jurisdictions. The authors and publisher have taken care in preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode®, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. Dai Kan-Wa Jiten used as the source of reference Kanji codes was written by Tetsuji Morohashi and published by Taishukan Shoten.
    [Show full text]
  • Peopletools 8.57: Global Technology
    PeopleTools 8.57: Global Technology March 2020 PeopleTools 8.57: Global Technology Copyright © 1988, 2020, Oracle and/or its affiliates. All rights reserved. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government. This software or hardware is developed for general use in a variety of information management applications.
    [Show full text]
  • View Character Data Translation Guide
    WWW.ASNA.COM Character Data Translation in DataGate/400 (Applies to Release 7.2 and above) One of the most important and overlooked features of ASNA DataGate is its ability to render character data (database fields and text) in various character encodings, as dictated by platforms and globalization requirements. Using the facilities of the iSeries and Windows, DataGate not only supports transparent iSeries-based EBCDIC character translation, but also the unique international character sets supported by particular platforms. This document discusses iSeries, Windows, and DataGate character translation functionality; past and present. By illuminating DataGate’s translation techniques and corresponding facilities, users are better equipped to configure and maintain robust character data applications. This document assumes some familiarity of applications programming with Windows, the iSeries, and DB2/400 database architecture. International Character Sets In the traditional green screen environment, developers were not as concerned with character translation issues, because in this hardware-based scheme, data entered into a green screen terminal was only rendered on a green screen terminal (or its report-printing twin). This “panacea” was disrupted by emerging globalization, the Internet, personal computers, and client/server-based access schemes. From very early on however, IBM and other computer vendors had introduced “national language” character sets reflecting local language and culture in international markets. In the case of IBM, these character sets were based on the original eight- bit EBCDIC character code. Later, to penetrate Asian and other markets with larger character sets, IBM introduced the idea of 16-bit character codes, known as Double- Byte Character Sets (DBCS). IBM identifies its various iSeries character encodings with Coded Character Set Identifiers (CCSIDs).
    [Show full text]
  • Building Cmap Files for CID-Keyed Fonts
    ® Building CMap Files ®®for CID-Keyed Fonts Adobe Developer Support Technical Note #5099 14 October 1998 Adobe Systems Incorporated Corporate Headquarters Adobe Systems Eastern Region 345 Park Avenue 24 New England San Jose, CA 95110 Executive Park (408) 536-6000 Main Number Burlington, MA 01803 (408) 536-9000 Developer Support (617) 273-2120 Fax: (408) 536-6883 Fax: (617) 273-2336 European Engineering Support Group Adobe Systems Co., Ltd. Adobe Systems Benelux B.V. Yebisu Garden Place Tower P.O. Box 22750 4-20-3 Ebisu, Shibuya-ku 1100 DG Amsterdam Tokyo 150 The Netherlands Japan +31-20-6511 355 +81-3-5423-8169 Fax: +31-20-6511 313 Fax: +81-3-5423-8204 PN LPS5099 Copyright © 1996 – 1998 Adobe Systems Incorporated. All rights reserved. NOTICE: All information contained herein is the property of Adobe Systems Incorporated. No part of this publication (whether in hardcopy or electronic form) may be reproduced or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of the publisher. PostScript is a registered trademark of Adobe Systems Incorporated. All instances of the name PostScript in the text are references to the PostScript language as defined by Adobe Systems Incorporated unless otherwise stated. The name PostScript also is used as a product trademark for Adobe Systems' implementation of the PostScript language interpreter. Except as otherwise stated, any reference to a “PostScript printing device,” “PostScript display device,” or similar item refers to a printing device, display device or item (respectively) which contains PostScript technology created or licensed by Adobe Systems Incorporated and not to devices or items which purport to be merely compatible.
    [Show full text]
  • UTR#17: Unicode Character Encoding Model
    UTR#17: Unicode Character Encoding Model http://www.unicode.org/reports/tr17/tr17-6.html Technical Reports Proposed Update Unicode Technical Report #17 UNICODE CHARACTER ENCODING MODEL Authors Ken Whistler ([email protected]), Mark Davis ([email protected]), Asmus Freytag ([email protected]) Date 2008-08-25 This Version http://www.unicode.org/reports/tr17/tr17-6.html Previous Version http://www.unicode.org/reports/tr17/tr17-5.html Latest Version http://www.unicode.org/reports/tr17/ Revision 6 Summary This document clarifies a number of the terms used to describe character encodings, and where the different forms of Unicode fit in. It elaborates the Internet Architecture Board (IAB) three-layer “text stream” definitions into a four-layer structure. Status This document is a proposed update of a previously approved Unicode Technical Report. This document may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Technical Report (UTR) contains informative material. Conformance to the Unicode Standard does not imply conformance to any UTR. Other specifications, however, are free to make normative references to a UTR. Please submit corrigenda and other comments with the online reporting form [Feedback]. Related information that is useful in understanding this document is found in the References. For the latest version of the Unicode Standard see [Unicode]. For a list of current Unicode Technical Reports see [Reports]. For more information about versions of the Unicode Standard, see [Versions].
    [Show full text]