Character Set & Globalization

DOAG Konferenz + Ausstellung, 18 November 2014, Nuremberg
Martin Hoermann
www.ordix.de

"The Alpha and the Omega, the First and the Last, the Beginning and the End"
Revelation of John (22:13)

Character Set und Globalization, Martin Hoermann, DOAG 2014

Good to know…
- http://www.interessante.verweise/
- FAQ: 340512.1 Timestamp Details …
- Information for reference
- Literature: http://www.buch.de
- Where is Larry?

Character Sets
U+2318 place of interest
http://www.fileformat.info/

Character Set Encoding (ISO/IEC 8859-1)
Hermes Baby: http://www.typewriters.ch/images/hermes_baby_jubilee_gr.jpg

Important Character Sets / Encodings

Encoding       Oracle equivalent
ISO 8859-1     WE8ISO8859P1
ISO 8859-15    WE8ISO8859P15
CP 1252        WE8MSWIN1252
CP 850
UTF-8          UTF8 / AL32UTF8
UTF-16         AL16UTF16

Character Set & Oracle
[Diagram. Labels: operating system, programs, editors, fonts, NLS_LANG; transfer; database character set; operating system, programs, editors, NLS_LANG]

NLS_LANG
NLS_LANG = GERMAN_GERMANY.WE8ISO8859P1
Short form: NLS_LANG = .WE8ISO8859P1

Character set (client): WE8ISO8859P1

Territory (client + database): GERMANY
- Date format
- Decimal character and group separator
- Local currency symbol
- ISO currency symbol
- Dual currency symbol
- First day of the week
- Credit and debit symbols
- ISO week flag
- List separator

Language (client + database): GERMAN
- Language for server messages
- Language for day and month names and their abbreviations
- Symbols for equivalents of AM, PM, AD, and BC
- Default sorting sequence for character data
- Writing direction
- Affirmative and negative response strings (for example, YES and NO)

Character Set & Oracle: NLS_LANG

    SELECT sid, serial#, client_charset
      FROM V$SESSION_CONNECT_INFO;

    SELECT *
      FROM V$NLS_PARAMETERS
     WHERE parameter LIKE '%CHARACTERSET%'
     ORDER BY parameter;

Database Character Set
Changing Or Choosing the Database Character Set (NLS_CHARACTERSET) (Doc ID 225912.1)

Invalid Data
ISO-8859-1, ISO-8859-7

Invalid Data: Pass-Through Configuration
Encoding: EL8ISO8859P7, WE8ISO8859P1
NLS_LANG: .WE8ISO8859P1

Conversion Errors & Replacement Character
U+FFFD
Unknown and invalid characters in the source character set are converted to the replacement character.

Conversion Errors
Encoding: EL8ISO8859P7, WE8ISO8859P1
NLS_LANG: .EL8ISO8859P7

"Rehearsing" the Conversion

    SELECT convert( zeichen, 'WE8ISO8859P1', 'EL8ISO8859P7' )
      FROM charset_iso_8859_p1;

Encoding of a Character Set
U+2A6A5: dragon, dragon, dragon, dragon

Character Set Encoding
"In the beginning (ἀρχή) was the Word (λόγος), and the Word was with God, and the Word was God."
Gospel of John, prologue

Character Set Encoding (ISO/IEC 8859-1)

Unicode vs. Encoding
AL32UTF8 / UTF8 (Unicode) Database Character Set Implications (Doc ID 788156.1)

From UTF-8 to Unicode

    SELECT zeichen, dump( zeichen )
      FROM T06_8859P1
     WHERE zeichen IN( 'Ä', 'ß' );

1: Bytes (decimal): 195 132
2: Bytes (binary): 11000011 10000100
3: Payload bits: 00011000100
4: 196 (decimal) = 0xC4
5: U+00C4

Exercise
U+004C U+0061 U+0072 U+0072 U+0079
U+0045 U+006C U+006C U+0069 U+0073 U+006F U+006E

Where is Larry?
Lawrence "Larry" Joseph Ellison
Larry = U+004C U+0061 U+0072 U+0072 U+0079
Ellison = U+0045 U+006C U+006C U+0069 U+0073 U+006F U+006E

    DECLARE
      -- Convert ASCII characters to Unicode code points
      vc         VARCHAR2(200) := 'Ellison';
      vc_zeichen VARCHAR2(1);
      vc_out     VARCHAR2(1000);
    BEGIN
      FOR i IN 1..length( vc ) LOOP
        vc_zeichen := substr( vc, i, 1 );
        vc_out := vc_out || ' U+' || trim( to_char( ascii( vc_zeichen ), '000X' ));
      END LOOP;
      dbms_output.put_line( vc || ' = ' || vc_out );
    END;
    /

Dump

    SELECT zeichen, dump( zeichen, 1016 )
      FROM T06_8859P1
     WHERE zeichen IN( 'Ä', 'ß' );

How Is the Character Set Determined?
U+212E Estimated Symbol

Operating System
- Windows: depends on the regional settings (e.g. CP1252)
- DOS: determined by chcp (e.g. code page 850)
- Unix/Linux: determined by LANG

SQL*Developer

TOAD

Ultra Compare
Ultra Compare determines the encoding of a file automatically. If, for example, the character shown in red in the screenshot is found, the tool assumes a particular encoding (here MAC-UTF-8). The encoding can also be set manually.
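
The five steps shown for 'Ä' (bytes 195 and 132 leading to U+00C4) and the ISO-8859-7 versus ISO-8859-1 mix-up can be replayed outside the database. The following is a minimal sketch in plain Python, chosen only because it runs without an Oracle instance. The function name decode_two_byte_utf8 and the sample letter alpha are assumptions made for illustration, not part of the slides, and Python's "replace" error handler substitutes "?" where the slides speak of a replacement character.

```python
# Illustration only: plain Python, no Oracle involved.

def decode_two_byte_utf8(b1: int, b2: int) -> int:
    """Decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx) to a code point."""
    if b1 & 0b11100000 != 0b11000000 or b2 & 0b11000000 != 0b10000000:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Keep the 5 payload bits of the lead byte and the 6 of the continuation byte.
    return ((b1 & 0b00011111) << 6) | (b2 & 0b00111111)

# Steps 1 to 5 for the letter A with diaeresis (U+00C4).
raw = "\u00c4".encode("utf-8")              # the two bytes 195 and 132
code_point = decode_two_byte_utf8(raw[0], raw[1])
print(list(raw), f"{raw[0]:08b} {raw[1]:08b}", f"U+{code_point:04X}")

# Pass-through effect: a byte written as ISO-8859-7 but read as ISO-8859-1.
greek_byte = "\u03b1".encode("iso8859_7")   # Greek small alpha, single byte 0xE1
misread = greek_byte.decode("iso8859_1")    # same byte, now a different letter
print(greek_byte.hex(), f"U+{ord(misread):04X}")

# Conversion error: alpha has no ISO-8859-1 code, so a substitute is written.
converted = "\u03b1".encode("iso8859_1", errors="replace")
print(converted)
```

The byte value never changes in the pass-through case; only its interpretation does, which is why such data looks correct as long as every client uses the same wrong setting.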

Unicode in the Database

Unicode Character Sets

Character Set   RDBMS Version   Unicode Version
AL24UTFFSS      7.2-8.1         1.1
UTF8            8.0-12c         2.1 (8.0-8.1.6), 3.0 (8.1.7-12.1)
UTFE            8.0-12c         2.1 (8.0-8.1.6), 3.0 (8.1.7-12.1)
AL32UTF8        9.0-12c         3.0 (9.0), 3.1 (9.2), 3.2 (10.1), 4.01 (10.2), 5.0 (11.1 and 11.2), 6.1 (12.1)
AL16UTF16       9.0-12c         3.0 (9.0), 3.1 (9.2), 3.2 (10.1), 4.01 (10.2), 5.0 (11.1 and 11.2), 6.1 (12.1)

Unicode Character Sets In The Oracle Database (Doc ID 260893.1)

Character Set Settings
- V$NLS_PROPERTIES
- NLS_CHARACTERSET
  - CHAR
  - VARCHAR2
  - LONG
  - CLOB (in multibyte character sets always AL16UTF16, big endian)
  - Object names and PL/SQL code
- NLS_NCHAR_CHARACTERSET
  - NCHAR
  - NVARCHAR2
  - NCLOB

NLS_LENGTH_SEMANTICS
- VARCHAR2( 1500 CHAR ): at most 1,500 characters
- At most 4,000 bytes
- At most 32,767 bytes (12c) with MAX_STRING_SIZE = EXTENDED: VARCHAR2, NVARCHAR2, RAW
The National Character Set (NLS_NCHAR_CHARACTERSET) in Oracle 9i, 10g, 11g and 12c (Doc ID 276914.1)

UTF8 in SQL*Plus

Changing the Character Set
U+1F4BE

Update Character Set
"There are still "dba's" out there who try to change the NLS_CHARACTERSET or NLS_NCHAR_CHARACTERSET by updating props$ . This is NOT supported and WILL corrupt your database. This is one of the best way's to destroy a complete dataset. Oracle Support will TRY to help you out of this but Oracle will NOT warrant that the data can be recovered or recovered data is correct. You WILL be asked to do a FULL export and a complete rebuild of the database. Please, do NOT update props$."
Changing Or Choosing the Database Character Set (NLS_CHARACTERSET) (Doc ID 225912.1)

Changing the Character Set
- Export / Import
- CSSCAN / CSALTER
- Database Migration Assistant for Unicode (DMU)

DMU Start Page

DMU Features
- GUI-guided migration
- Advantages:
  - In place
  - Data validation / cleansing editor
  - Repair of pass-through configurations (per column)
  - Exclusion from the migration (per column)
- Points to note:
  - Manual work on the data dictionary
  - Unicode is the only possible target

DMU Pass-Through Configuration

DMU Errors in the Data Dictionary

    SELECT table_name
         , convert( table_name, 'AL32UTF8' ) konvertiert
      FROM user_tables
     WHERE convert( table_name, 'AL32UTF8' ) <> table_name;

DMU Migration Exceptions

DMU Finish
http://blogs.loopback.org/wp-content/uploads/2013/11/DOAG_2013_Unicode_V_1.0.pdf

Unicode (U+FDFA)
U+FDFA SALLALLAHOU ALAYHE WASALLAM
May Allah's blessings and peace be upon him

Unicode Topic Selection
- Glyphs vs. characters
- Diacritics
- Ligatures & digraphs
- Carriage return vs. line feed
- Composition
- Writing direction
- BOM

Glyphs vs. Characters

Characters
- U+0041 Latin Capital Letter A
- U+FF21 Fullwidth Latin Capital Letter A
- U+0391 Greek Capital Letter Alpha
- U+0410 Cyrillic Capital Letter A

Diacritics
www.wikipedia.de

Composition
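
The look-alike letters, the composition topic and the difference between character and byte limits can be checked with a few lines of code. This is a hedged sketch in plain Python using the standard unicodedata module, not an Oracle feature. The variable names and the choice of the euro sign as sample data are assumptions made for illustration.

```python
import unicodedata

# The four code points listed on the "Characters" slide: similar glyphs,
# four different characters.
lookalikes = ["\u0041", "\uff21", "\u0391", "\u0410"]
for ch in lookalikes:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Composition: 'A' plus COMBINING DIAERESIS (U+0308) versus the
# precomposed letter U+00C4. NFC composes, NFD decomposes.
decomposed = "A\u0308"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed), decomposed == composed)

# Character count versus byte count: 1,500 euro signs encoded as UTF-8.
text = "\u20ac" * 1500
char_len = len(text)
byte_len = len(text.encode("utf-8"))
print(char_len, byte_len)
```

With three bytes per euro sign, 1,500 characters take more than the 4,000 bytes named on the NLS_LENGTH_SEMANTICS slide, so a character-based column declaration does not lift the byte limit.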