The NLS SETUP Application and TRABASE.SAS: Easy ways to customise character conversions using the SAS System

Manfred Kiefer, SAS Institute European Headquarters

Abstract

All software used in a multi-dimensional environment must account for the differences in character sets and encoding schemes. It must also accommodate differences in conventional usage between languages, such as the differing usage of upper and lower cases. The SAS System provides several features to ensure that applications can be written to use local conventions and provide national language support (NLS). Areas of concern include the following:

• moving data and applications between hosts
• managing text strings
• displaying and printing national characters other than the standard upper and lower case A-Z, as they are encoded by various ASCII and EBCDIC formats.

Historically, the SAS System has provided internal translation tables, called TRANTAB entries, that convert one character encoding standard to another. Currently shipping as a sample application with the Orlando Release, NLSsetup fully automates the creation of TRANTABs, key maps, and device maps, and provides an easy point-and-click interface for users to transparently specify language features.

The Problem for SAS System Users

As shown in Table 1, each host or platform on which the SAS System runs uses different standards for encoding characters. As a result, you must convert or map characters when you move data across platforms.

Table 1: Operating Systems (Hosts) Grouped by Character-Encoding Standard

EBCDIC hosts:
• CMS
• MVS
• VSE

ASCII-ISO hosts (those that use character set(s) that are defined by the ISO 8859 standard):
• AIX-RS/6000
• Convex
• DG/UX
• HP-UX
• Intel ABI
• MIPS ABI
• OpenVMS-VAX
• OpenVMS-AXP
• Digital UNIX
• Solaris 2
• SunOS 4.1
• ULTRIX

ASCII-ANSI hosts (those that use the MS-Windows ANSI character set; this is essentially ISO 8859, but it is called ASCII-ANSI because it was originally based on an ANSI draft standard):
• Windows 3.1
• Windows 32s
• Windows NT
• Windows 95

ASCII-MAC hosts (those that use character set(s) that are defined by the vendor-specific Apple Macintosh character set):
• Macintosh System 7.5 for Motorola 68020-, 68030-, and 68040-based systems
• PowerPC-based Macintosh systems

ASCII-OEM host (vendor-specific IBM PC-ASCII character set):
• OS/2

When you transfer data with the "standard" A-Z characters, character conversion from one encoding standard to another is not a problem. You simply rely on the default conversion mechanisms. However, when you have data with national characters such as the æ, ø, and å in Danish; the ä, ö, ü, and ß in German; and the accented characters, such as á, é, ú, and ñ in Spanish, different conversion mechanisms are involved, and unique character encoding standards are used on each platform. The NLSsetup application enables you to adapt conversion (translation) tables for each language. Although this is typically a system administrator's task, you should understand the process so that you can add your own customised tables or modify existing ones.

The problem for a SAS system administrator is to enable users to transfer data and applications from one host to another, or to transparently access data on one host from another host, without concern about character conversion from one coded character set to another.

The SAS System provides a number of ways of transporting data and applications across hosts. However, the processes and trantabs that are involved differ, depending on the mechanisms you use. The REMOTE engine feature of SAS/CONNECT and SAS/SHARE software uses host-to-host trantabs. PROCs UPLOAD, DOWNLOAD, CPORT, and CIMPORT use transport-format trantabs. Each of these mechanisms is explained in the following sections.
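The difference between the "standard" A-Z characters and national characters can be made concrete with a small sketch. It is written in Python rather than SAS, and the codec names are assumptions chosen only as stand-ins for the five character architectures (cp500 is just one of many EBCDIC code pages, and cp850 one of several IBM PC code pages); the SAS System itself uses its own trantabs, not these codecs.

```python
# Illustration only: Python's built-in codecs stand in for the five
# character architectures named in Table 1. The codec choices below are
# assumptions, not the tables that the SAS System ships.
CODECS = {
    "EBCDIC": "cp500",        # assumption: one of many EBCDIC code pages
    "ASCII-ISO": "latin-1",   # ISO 8859-1
    "ASCII-ANSI": "cp1252",   # MS-Windows ANSI
    "ASCII-OEM": "cp850",     # assumption: one IBM PC-ASCII code page
    "ASCII-MAC": "mac_roman", # Apple Macintosh character set
}


def byte_values(char):
    """Return the hex code of a single character under each stand-in codec."""
    return {
        name: char.encode(codec).hex().upper()
        for name, codec in CODECS.items()
    }


# "A" has the same code on every ASCII-based host; national characters
# such as "ä" do not, which is why per-language tables are needed.
for ch in "Aæøåäöüßñ":
    print(ch, byte_values(ch))
```

In this sketch, "A" differs only between EBCDIC and the ASCII-based sets, whereas a character such as "ä" has a different code in the OEM and Macintosh sets than in ISO 8859-1.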
Host-to-Host Trantabs: Transporting Data via the REMOTE Engine

The REMOTE engine is a feature of SAS/CONNECT and SAS/SHARE software that allows you to access remote data. When you move data across platforms, the REMOTE engine translates character sets directly from the source platform's encoding standard to the target platform's encoding standard, as shown in the following diagram.

  source                                          target
  platform  <------>  translation  <-------->  platform
               (host-to-host trantabs)

For example, if you are using the REMOTE engine to access data on an MVS host, which uses EBCDIC encoding, from a PC client, which uses IBM PC-ASCII encoding, characters are translated directly from EBCDIC to IBM PC-ASCII and vice-versa. Table 2 shows the trantabs that the SAS System provides for direct host-to-host character-set translation.

Table 2: SAS Host-to-Host Trantabs

Trantab Name  Entry and Function                        Specific Hosts
-------------------------------------------------------------------------------------------
On EBCDIC hosts (IBM mainframes)
_0000030      (0) import from ASCII-ISO to EBCDIC       connecting MVS, CMS, or VSE to
              (1) export from EBCDIC to ASCII-ISO       OpenVMS or UNIX systems
_0000060      (0) import from ASCII-ANSI to EBCDIC      connecting MVS, CMS, or VSE to
              (1) export from EBCDIC to ASCII-ANSI      Windows
_00000A0      (0) import from ASCII-OEM to EBCDIC       connecting MVS, CMS, or VSE to
              (1) export from EBCDIC to ASCII-OEM       OS/2
_0000120      (0) import from ASCII-MAC to EBCDIC       connecting MVS, CMS, or VSE to
              (1) export from EBCDIC to ASCII-MAC       MAC
-------------------------------------------------------------------------------------------
On Windows hosts
- Windows in ANSI mode:
_0000050      (0) import from ASCII-ISO to ASCII-ANSI   connecting Windows to
              (1) export from ASCII-ANSI to ASCII-ISO   OpenVMS and UNIX systems
_0000060      (0) import from EBCDIC to ASCII-ANSI      connecting Windows to MVS,
              (1) export from ASCII-ANSI to EBCDIC      CMS, or VSE
_00000C0      (0) import from ASCII-OEM to ASCII-ANSI   connecting Windows to OS/2
              (1) export from ASCII-ANSI to ASCII-OEM   or to Windows in ASCII-OEM mode
_0000140      (0) import from ASCII-MAC to ASCII-ANSI   connecting Windows to MAC
              (1) export from ASCII-ANSI to ASCII-MAC
- Windows in OEM mode:
_0000090      (0) import from ASCII-ISO to ASCII-OEM    connecting Windows to
              (1) export from ASCII-OEM to ASCII-ISO    OpenVMS or UNIX systems
_00000A0      (0) import from EBCDIC to ASCII-OEM       connecting Windows to
              (1) export from ASCII-OEM to EBCDIC       MVS, CMS, or VSE
_00000C0      (0) import from ASCII-ANSI to ASCII-OEM   connecting Windows to
              (1) export from ASCII-OEM to ASCII-ANSI   Windows in ASCII-ANSI mode
_0000180      (0) import from ASCII-MAC to ASCII-OEM    connecting Windows to MAC
              (1) export from ASCII-OEM to ASCII-MAC
-------------------------------------------------------------------------------------------
On OS/2
_0000090      (0) import from ASCII-ISO to ASCII-OEM    connecting OS/2 to OpenVMS
              (1) export from ASCII-OEM to ASCII-ISO    or UNIX systems
_00000A0      (0) import from EBCDIC to ASCII-OEM       connecting OS/2 to MVS,
              (1) export from ASCII-OEM to EBCDIC       CMS, or VSE
_00000C0      (0) import from ASCII-ANSI to ASCII-OEM   connecting OS/2 to Windows
              (1) export from ASCII-OEM to ASCII-ANSI
_0000180      (0) import from ASCII-MAC to ASCII-OEM    connecting OS/2 to MAC
              (1) export from ASCII-OEM to ASCII-MAC
-------------------------------------------------------------------------------------------
On MAC
_0000110      (0) import from ASCII-ISO to ASCII-MAC    connecting MAC to OpenVMS
              (1) export from ASCII-MAC to ASCII-ISO    or UNIX systems
_0000120      (0) import from EBCDIC to ASCII-MAC       connecting MAC to MVS,
              (1) export from ASCII-MAC to EBCDIC       CMS, or VSE
_0000140      (0) import from ASCII-ANSI to ASCII-MAC   connecting MAC to Windows
              (1) export from ASCII-MAC to ASCII-ANSI
_0000180      (0) import from ASCII-OEM to ASCII-MAC    connecting MAC to OS/2
              (1) export from ASCII-MAC to ASCII-OEM
-------------------------------------------------------------------------------------------
On OpenVMS and UNIX hosts
_0000030      (0) import from EBCDIC to ASCII-ISO       connecting OpenVMS or UNIX to
              (1) export from ASCII-ISO to EBCDIC       MVS, CMS, or VSE
_0000050      (0) import from ASCII-ANSI to ASCII-ISO   connecting OpenVMS or UNIX
              (1) export from ASCII-ISO to ASCII-ANSI   to Windows
_0000090      (0) import from ASCII-OEM to ASCII-ISO    connecting OpenVMS or UNIX
              (1) export from ASCII-ISO to ASCII-OEM    to OS/2
_0000110      (0) import from ASCII-MAC to ASCII-ISO    connecting OpenVMS or UNIX
              (1) export from ASCII-ISO to ASCII-MAC    to MAC
-------------------------------------------------------------------------------------------

The same trantabs are used for all connectivity mechanisms, including APPC, TCP/IP, NETBIOS, and DECnet.

As you can see from Tables 1 and 2, the conversion subsystem distinguishes the following character architectures:

• EBCDIC
• ASCII-ISO (ISO 8859)
• ASCII-OEM (which, for our purposes, includes only the IBM PC-ASCII standard)
• Microsoft's ASCII-ANSI
• Apple's ASCII-MAC.

Each host-to-host trantab actually consists of two halves, or "entries":

• entry 0 (for importing)
• entry 1 (for exporting).

For example, on UNIX hosts, the _0000030 (EBCDIC to ASCII-ISO) trantab is shown in Table 3 as it appears when both halves are listed by the TRANTAB procedure.

Table 3: The _0000030 Trantab

Table name is _0000030.

       0 1 2 3 4 5 6 7 8 9 A B C D E F
  00  '000102039C09867F978D8E0B0C0D0E0F'x
  10  '101112139D8508871819928F1C1D1E1F'x
  20  '80818283840A171B88898A8B8C050607'x
  30  '909116939495960498999A9B14159E1A'x
  40  '20A0A1A2A3A4A5A6A7A8D52E3C282B7C'x ->
  50  '26A9AAABACADAEAFB0B121242A293B5E'x
  60  '2D2FB2B3B4B5B6B7B8B9E52C255F3E3F'x
  70  'BABBBCBDBEBFC0C1C2603A2340273D22'x
  80  'C3616263646566676869C4C5C6C7C8C9'x
  90  'CA6A6B6C6D6E6F707172CBCCCDCECFD0'x
  A0  'D17E737475767778797AD2D3D45BD6D7'x
  B0  'D8D9DADBDCDDDEDFE0E1E2E3E45DE6E7'x
  C0  '7B414243444546474849E8E9EAEBECED'x
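
How a listing like Table 3 is read can be sketched as a simple lookup: the row label plus the column digit gives the incoming byte, and the hex pair at that position gives the translated byte. The following Python sketch (not SAS code) assumes that the rows shown are the importing half on a UNIX host, that is, EBCDIC in and ASCII-ISO out. The names ROWS, TABLE, and import_from_ebcdic are hypothetical, and only the rows 00 to C0 that appear in the listing are used, so bytes from D0 upward are deliberately rejected rather than guessed.

```python
# Rows 00-C0 of _0000030, copied from Table 3. The listing stops after
# row C0, so this partial table covers input bytes 0x00-0xCF only.
ROWS = [
    "000102039C09867F978D8E0B0C0D0E0F",  # 00
    "101112139D8508871819928F1C1D1E1F",  # 10
    "80818283840A171B88898A8B8C050607",  # 20
    "909116939495960498999A9B14159E1A",  # 30
    "20A0A1A2A3A4A5A6A7A8D52E3C282B7C",  # 40
    "26A9AAABACADAEAFB0B121242A293B5E",  # 50
    "2D2FB2B3B4B5B6B7B8B9E52C255F3E3F",  # 60
    "BABBBCBDBEBFC0C1C2603A2340273D22",  # 70
    "C3616263646566676869C4C5C6C7C8C9",  # 80
    "CA6A6B6C6D6E6F707172CBCCCDCECFD0",  # 90
    "D17E737475767778797AD2D3D45BD6D7",  # A0
    "D8D9DADBDCDDDEDFE0E1E2E3E45DE6E7",  # B0
    "7B414243444546474849E8E9EAEBECED",  # C0
]

# Index = incoming (EBCDIC) byte, value = translated (ASCII-ISO) byte.
TABLE = bytes.fromhex("".join(ROWS))


def import_from_ebcdic(data: bytes) -> bytes:
    """Translate bytes through the partial table; reject uncovered bytes."""
    out = bytearray()
    for b in data:
        if b >= len(TABLE):
            raise ValueError(f"byte {b:02X} is beyond the rows listed in Table 3")
        out.append(TABLE[b])
    return bytes(out)


# X'C8' X'85' X'93' X'93' X'96' is "Hello" in EBCDIC.
ebcdic = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])
print(import_from_ebcdic(ebcdic).decode("latin-1"))  # prints "Hello"
```

For instance, row 80, column 1 holds 61, so EBCDIC X'81' becomes ASCII X'61', the letter "a".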