Data Loss Prevention R80.40 Administration Guide

Total Page:16

File Type:pdf, Size:1020Kb

Data Loss Prevention R80.40 Administration Guide 22 January 2020 DATA LOSS PREVENTION R80.40 Administration Guide ] Protected [Classification: Check Point Copyright Notice © 2020 Check Point Software Technologies Ltd. All rights reserved. This product and related documentation are protected by copyright and distributed under licensing restricting their use, copying, distribution, and decompilation. No part of this product or related documentation may be reproduced in any form or by any means without prior written authorization of Check Point. While every precaution has been taken in the preparation of this book, Check Point assumes no responsibility for errors or omissions. This publication and features described herein are subject to change without notice. RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c) (1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 and FAR 52.227-19. TRADEMARKS: Refer to the Copyright page for a list of our trademarks. Refer to the Third Party copyright notices for a list of relevant copyrights and third-party licenses. Data Loss Prevention R80.40 Administration Guide Important Information Latest Software We recommend that you install the most recent software release to stay up-to-date with the latest functional improvements, stability fixes, security enhancements and protection against new and evolving attacks. Certifications For third party independent certification of Check Point products, see the Check Point Certifications page. Check Point R80.40 For more about this release, see the R80.40 home page. Latest Version of this Document Open the latest version of this document in a Web browser. Download the latest version of this document in PDF format. Feedback Check Point is engaged in a continuous effort to improve its documentation. Please help us by sending your comments. Revision History Date Description 22 January 2020 First release of this document Data Loss Prevention R80.40 Administration Guide | 3 Table of Contents Table of Contents Glossary 10 Introduction to Data Loss Prevention 20 The Need for Data Loss Prevention 20 Data Loss Prevention and Privacy 20 The Check Point Solution for DLP 21 Content Awareness Software Blade 22 How DLP Works 22 Integrated DLP Security Gateway Deployment 23 Dedicated DLP Gateway Deployment 24 Alternative Gateway Deployments 26 What Happens on Rule Match 28 Role of DLP Administrator 28 DLP Permissions for Administrator Accounts 29 Configuring Full DLP Permissions 30 Configuring a Subset of Permissions 30 Installation and Configuration 32 Installing the DLP Gateway 32 DLP Software Blade Trial License 32 Configuring a DLP Gateway or Security Cluster 32 Data Loss Prevention Wizard 34 Configuring a DLP Gateway in Bridge Mode 35 Configuring Active Directory and LDAP for DLP 36 Rerunning the Data Loss Prevention Wizard 37 Configuring a DLP Gateway for a Web Proxy 37 Configuring DLP for an Internal Web Proxy 39 Configuring Proxy Settings after Management Upgrade 39 Mail Server Required Configuration 40 Action Settings for DLP Rules 40 Configuring Mail Relay 41 Configuring Settings for the Mail Relay 41 Data Loss Prevention R80.40 Administration Guide | 4 Table of Contents Configuring a Dedicated DLP Gateway and Relay on DMZ 42 Recommended Deployment - DLP Gateway with Mail Relay 43 Workarounds for a Non-Recommended Mail Relay Deployment 44 Untrusted Mail Relays and Microsoft Outlook 46 TLS-Encrypted SMTP Connections 46 Configuring Incident Log Handling 46 Configuring the Exchange Security Agent 47 Configuring SmartConsole for the Exchange Security Agent 48 Exchange Server Configuration 49 Configuring SMTP Mirror Port Mode 53 Configuring HTTPS Inspection 54 Inspecting HTTPS Packets 55 Outbound Connections 55 Inbound Connections 55 Configuring Gateways to Inspect Outbound and Inbound HTTPS 56 UserCheck Interaction Objects 66 Configuring UserCheck 66 Kerberos Single Sign On 67 UserCheck Page 71 Creating UserCheck Interaction Objects 72 Plain Text Email Notifications 74 More UserCheck Interaction Options 75 Localizing and Customizing the UserCheck Portal 75 UserCheck Client 77 Enabling UserCheck Client 77 Client and Gateway Communication 78 Renaming the MSI 79 Troubleshooting DNS Based Configuration 82 Getting the MSI File 83 Distributing and Connecting Clients 83 UserCheck and Check Point Password Authentication 85 Helping Users 85 Out of the Box 87 Data Loss Prevention R80.40 Administration Guide | 5 Table of Contents Default Deployment 87 Data Loss Prevention in SmartDashboard 87 Defining My Organization 89 Adding Email Addresses and Domains to My Organization 89 Managing Users 90 Managing Networks 91 Managing VPNs 92 Data Loss Prevention Policies 94 Overview of DLP Rules 94 DLP and Identity Awareness 95 DLP Rule Matching 98 DLP Rule Actions 99 Managing Rules in Detect 100 Setting DLP Rule Tracking 100 Store Incident 101 Setting a Time Restriction 103 DLP Selective Deployment 104 Auditing and Analysis of Incidents 105 DLP Actions 106 Data Owner and User Notifications 110 Defining Data Owners 110 Preparing Corporate Guidelines 110 Communicating with Data Owners 111 Communicating with Users 112 Notifying Data Owners 113 Notifying Users 113 Customizing Notifications 114 Setting and Managing Rules to Ask User 116 Setting Rules to Ask User 116 Managing Rules in Ask User 116 DLP Self Incident-Handling Portal 117 What Users See and Do 117 Unhandled UserCheck Incidents 117 Data Loss Prevention R80.40 Administration Guide | 6 Table of Contents Managing Incidents by Replying to Emails 118 UserCheck Notifications 118 Learning Mode 118 Data Loss Prevention by Scenario 120 Analytical Deployment 120 Creating New Rules 120 Internal DLP Policy Rules 121 More Options for Rules 123 Rule Exceptions 124 Fine Tuning 127 Customized Deployment 127 Setting Rules to Prevent 128 Multi-Realm Authentication Support 128 Troubleshooting DLP-Related Authentication Issues 129 Defining Data Types 130 Protecting Data by Keyword 130 Protecting Data by Pattern 131 Protecting Documents by Template 131 Protecting Data by Fingerprint 133 Repository Scanning 134 Filtering the Repository for Efficiency 134 Granularity 134 Scan Times 135 Logging 135 Log Details 135 NFS Repository scanning in NATed Environments 139 Protecting Files by Attributes 139 Defining Compound Data Types 140 Advanced Data Types 140 Enhancing Accuracy through Statistical Analysis 145 Adding Data Types to Rules 146 Repositories 156 Whitelist Policy 158 Data Loss Prevention R80.40 Administration Guide | 7 Table of Contents Defining Email Addresses 159 Configuring the DLP Watermark 160 Watermarking documents 160 Creating a New Watermark Profile 161 Adding a Shadow Behind Watermark Text in Word and PowerPoint 162 Configuring Watermark Settings on the General Page 163 Configuring Watermark Settings on the Hidden Text Page 163 Completing the Watermark Profile 164 Previewing Watermarks 164 Viewing Watermarks in MS Office Documents 164 Resolving Watermark Conflicts 165 Turning Watermarking On and Off 168 Using the DLP Watermark Viewing Tool 168 Fine Tuning Source and Destination 169 Creating Different Rules for Different Departments 169 Isolating the DMZ 171 Defining Strictest Security 171 Defining Protocols of DLP Rules 172 Fine Tuning for Protocol 173 Configuring More HTTP Ports 173 Advanced Configuration 175 Configuring User Access to an Integrated DLP Gateway 175 Internal Firewall Policy for a Dedicated DLP Gateway 176 Advanced Expiration Handling 177 Advanced SMTP Quotas 177 Advanced FTP and HTTP Quotas 178 Advanced User Notifications 179 Gateway Cleanup of Data 179 Gateway Cleanup of Expired Data 180 Gateway Cleanup of All Captured Data 180 Customizing DLP User-Related Notifications 182 Supporting LDAP Servers with UTF-8 Records 184 Configuring the Corporate Guidelines Link 185 Data Loss Prevention R80.40 Administration Guide | 8 Table of Contents Editing Extreme Condition Values 185 Editing Exchange Security Agent Values 187 Configuring HTTP Inspection on All Ports 189 Defining New File Types 189 Supported File Types 190 Server Certificates 207 Obtaining, Installing, and Viewing a Trusted Server Certificate 207 Troubleshooting 210 Incidents Do Not Expire 210 Mail Server Full 210 Advanced Options for Data Types 212 Regular Expressions and Character Sets 215 Non-Printable Characters 216 Character Types 216 Supported Character Sets 216 Character Set Aliases 218 Command Line Reference 220 Syntax Legend 220 dlpcmd 223 Working with Kernel Parameters on Security Gateway 226 Kernel Debug on Security Gateway 227 Data Loss Prevention R80.40 Administration Guide | 9 Glossary Glossary A Administrator A user with permissions to manage Check Point security products and the network environment. API In computer programming, an application programming interface (API) is a set of subroutine definitions, protocols, and tools for building application software. In general terms, it is a set of clearly defined methods of communication between various software components. Appliance A physical computer manufactured and distributed by Check Point. B Bond A virtual interface that contains (enslaves) two or more physical interfaces for redundancy and load sharing. The physical interfaces share one IP address and one MAC address. See "Link Aggregation". Bonding See "Link Aggregation". Bridge Mode A Security Gateway or Virtual System that works as a Layer 2 bridge device for easy deployment in an existing topology. Data Loss Prevention R80.40 Administration Guide | 10 Glossary C CA Certificate
Recommended publications
  • INTERSKILL MAINFRAME QUARTERLY December 2011
    INTERSKILL MAINFRAME QUARTERLY December 2011 Retaining Data Center Skills Inside This Issue and Knowledge Retaining Data Center Skills and Knowledge 1 Interskill Releases - December 2011 2 By Greg Hamlyn Vendor Briefs 3 This the final chapter of this four part series that briefly Taking Care of Storage 4 explains the data center skills crisis and the pros and cons of Learning Spotlight – Managing Projects 5 implementing a coaching or mentoring program. In this installment we will look at some of the steps to Tech-Head Knowledge Test – Utilizing ISPF 5 implementing a program such as this into your data center. OPINION: The Case for a Fresh Technical If you missed these earlier installments, click the links Opinion 6 below. TECHNICAL: Lost in Translation Part 1 - EBCDIC Code Pages 7 Part 1 – The Data Center Skills Crisis MAINFRAME – Weird and Unusual! 10 Part 2 – How Can I Prevent Skills Loss in My Data Center? Part 3 – Barriers to Implementing a Coaching or Mentoring Program should consider is the GROW model - Determine whether an external consultant should be Part Four – Implementing a Successful Coaching used (include pros and cons) - Create a basic timeline of the project or Mentoring Program - Identify how you will measure the effectiveness of the project The success of any project comes down to its planning. If - Provide some basic steps describing the coaching you already believe that your data center can benefit from and mentoring activities skills and knowledge transfer and that coaching and - Next phase if the pilot program is deemed successful mentoring will assist with this, then outlining a solid (i.e.
    [Show full text]
  • JFP Reference Manual 5 : Standards, Environments, and Macros
    JFP Reference Manual 5 : Standards, Environments, and Macros Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. Part No: 817–0648–10 December 2002 Copyright 2002 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved. This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party software, including font technology, is copyrighted and licensed from Sun suppliers. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, docs.sun.com, AnswerBook, AnswerBook2, and Solaris are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.
    [Show full text]
  • Windows NLS Considerations Version 2.1
    Windows NLS Considerations version 2.1 Radoslav Rusinov [email protected] Windows NLS Considerations Contents 1. Introduction ............................................................................................................................................... 3 1.1. Windows and Code Pages .................................................................................................................... 3 1.2. CharacterSet ........................................................................................................................................ 3 1.3. Encoding Scheme ................................................................................................................................ 3 1.4. Fonts ................................................................................................................................................... 4 1.5. So Why Are There Different Charactersets? ........................................................................................ 4 1.6. What are the Difference Between 7 bit, 8 bit and Unicode Charactersets? ........................................... 4 2. NLS_LANG .............................................................................................................................................. 4 2.1. Setting the Character Set in NLS_LANG ............................................................................................ 4 2.2. Where is the Character Conversion Done? .........................................................................................
    [Show full text]
  • Report Syntax Report Syntax
    Report Syntax Report Syntax Report Syntax This chapter describes the syntax and parameters of the ADAREP utility. Optional Parameters Examples Optional Parameters ADAREP can be specified alone to retrieve a database status report. You can optionally customize the report by added parameter values. ACODE: Identify Code Page for XML UTF-8 Output Use the ACODE parameter only if the UTF8 option is specified on the OUTPUT parameter. This parameter identifies the code page to which the database name, file names, and checkpoint names should be converted. This parameter is specified in three parts, in a specific order, and with each part separated by a comma from the others: ACODE=’database-name-codepage,file-name-codepage,checkpoint-name-codepage’ The database-name-codepage can either be the literal "GCB" or a valid code page number; the file-name-codepage can either be the literal "GCB", the literal "FCB", or a valid code page number; the checkpoint-name-codepage can either be the literal "GCB" or a valid code page number: A setting of "GCB" for any part indicates that the code page for that part should be acquired from the code page in the GCB (the GCB ACODE setting). 1 Report Syntax Optional Parameters A setting of "FCB" for the file-name-codepage indicates that the code page for converting file names in the XML document to UTF-8 should be acquired from the code page setting in the FCB (the FCB ACODE setting). If code pages for all three parts of the ACODE parameter are being specified, they must be specified in the order shown above.
    [Show full text]
  • DB2 UDB for Z/OS V8 and UNICODE: It’S No Tower of Babel by Jim Schesvold
    DB2 UDB for z/OS V8 and UNICODE: It’s No Tower of Babel By Jim Schesvold INTRODUCTION DB2 UDB for z/OS Version 8 is the largest, most profound, highest impact version in the product’s history. Consider: There are more new lines of code in DB2 V8 than there were total lines of code in DB2 Version 1.1. Some individuals say migrating to DB2 V8 is a migration equivalent to DB2 V6 and V7 combined. The Quality Partnership Program—a beta program—lasted 14 months, the longest ever for DB2. There are more changes to the catalog than any other release or version of DB2. DB2 V8 requires 64-bit addressing. A DB2 V8 address space can be up to 16 exobytes, 8 billion times the storage available with V7 and prior. DB2 V8 entities have been increased by magnitudes. For example: Maximum table size has been increased from 16 terabytes to 128 terabytes. Table names can now be up to 128 bytes instead of 18 bytes. The number of partitions for a table has been increased from 254 to 4096. SQL statement length can now be up to 2 MB instead of 32KB. Archive logs have been increased to 10,000 from 1,000. And the list goes on and on. The installation methodology of DB2 has—optionally—been re-worked through the implemen- tation of msys for Setup as an installation technique. Yet for all these product changes and enhancements, one of the largest modifications in DB2 V8 is the implementation of Unicode. While earlier versions of DB2 had Unicode capability, DB2 V8 has extensively integrated character conversion into nearly every aspect of the product.
    [Show full text]
  • Pdflib Text Extraction Toolkit (TET) Manual
    ABC Text Extraction Toolkit (TET) Version 4.3 Toolkit for extracting Text, Images, and Metadata from PDF Documents Copyright © 2002–2014 PDFlib GmbH. All rights reserved. Protected by European and U.S. patents. PDFlib GmbH Franziska-Bilek-Weg 9, 80339 München, Germany www.pdflib.com phone +49 • 89 • 452 33 84-0 fax +49 • 89 • 452 33 84-99 If you have questions check the PDFlib mailing list and archive at tech.groups.yahoo.com/group/pdflib Licensing contact: [email protected] Support for commercial PDFlib licensees: [email protected] (please include your license number) This publication and the information herein is furnished as is, is subject to change without notice, and should not be construed as a commitment by PDFlib GmbH. PDFlib GmbH assumes no responsibility or lia- bility for any errors or inaccuracies, makes no warranty of any kind (express, implied or statutory) with re- spect to this publication, and expressly disclaims any and all warranties of merchantability, fitness for par- ticular purposes and noninfringement of third party rights. PDFlib and the PDFlib logo are registered trademarks of PDFlib GmbH. PDFlib licensees are granted the right to use the PDFlib name and logo in their product documentation. However, this is not required. Adobe, Acrobat, PostScript, and XMP are trademarks of Adobe Systems Inc. AIX, IBM, OS/390, WebSphere, iSeries, and zSeries are trademarks of International Business Machines Corporation. ActiveX, Microsoft, OpenType, and Windows are trademarks of Microsoft Corporation. Apple, Macintosh and TrueType are trademarks of Apple Computer, Inc. Unicode and the Unicode logo are trademarks of Unicode, Inc.
    [Show full text]
  • LANSA Version 11.3
    http://www.lansa.com Newsletter June 2006 LANSA version 11.3 Cumulative Update 3 (CU3) Information CU3 is a cumulative set of changes for a range of LANSA features. It includes software to be installed on an iSeries machine and on a Windows machine. CU3 provides a single install point for all features. So, although the software changes have been grouped for a product feature under an EPC number you do NOT need to directly invoke an EPC install. See the epc77n.htm files for details of the changes delivered. CU3 comprises these individual EPCs: • epc770 - LANSA Communications • epc771 - Visual LANSA • epc772 - LANSA Integrator (only on CU3 install CD) • epc773 - LANSA Web Utilities • epc774 - LANSA Web Server • epc775 - LANSA Open • epc776 - Visual LANSA Frameworks CU3 is distributed by CD-ROM or a Zip File that can be downloaded. LANSA version 11.3 page 1 Compile error RDMLX objecten page 10 In LANSA and V5R4 page 2 PrintScreen functionality in VL page 11 CGIConvMode Setting and L4Web page 3 Crash WAM on HP Notebook page 15 This Advanced Enter key in VL page 4 Cannot start VL on Server PC page 17 VLF Web with IE7 beta page 6 Using the PDFSpoolFileService page 18 Issue VLF demo before CU3 page 7 VLF: What’s new in CU3 page 19 LANSA Newsletter June 2006 page 1 © LANSA 2006 LANSA and V5R4 Description IBM’s latest release of i5/OS, V5R4 was made generally available on February 14th. This version of OS/400 is listed in the V11.0 Supported Platforms document as Confirmed.
    [Show full text]
  • AXIS AFP Printer Controller User's Manual
    Preface Preface Thank you for purchasing the AXIS AFP Printer Controller. Our goal in developing this product is to enable you to connect any PostScript printer to your IBM IPDS environment, allowing you to take full advantage of both the IPDS functions and your PostScript printer capabilites. About Axis Axis Communications, founded in 1984, is one of the world’s fastest grow- ing companies in the CD-ROM server, network print server, network cam- era and IBM printer interface market. The head quarters are located in Lund, Sweden, with subsidiaries in Beijing, Boston, Paris, Shanghai, Sin- gapore, Tokyo, and Hong Kong. Axis Communications has a distributor network operating in more than 60 countries world-wide, marketing four product lines: Network Print Servers: These intelligent Ethernet and Token Ring print servers support a wide range of LAN protocols. The Axis NPS 530, 550, and AXIS 150, 540, 560, 570, 570 MIO are Ethernet print servers, while the Axis NPS 630, 650, and AXIS 640, 660, 670, 670 MIO are Token Ring print servers. IBM Mainframe and S/3x – AS/400 Printer Interfaces: These products include a wide range of plug-in interfaces and stand-alone products such as the AXIS Cobra+, AXIS 330/370 Cobra, AXIS HP MIO, the AXIS AFP IPDS-to-PostScript converter, and the AXIS AFP MIO/IOP IPDS- to-PCL converters. Network CD-ROM Servers: AXIS StorPoint CD multiprotocol CD- ROM servers provide a flexible and cost-efficient solution for sharing CD- ROMs across the network. They are available in Ethernet and Token Ring versions. Network Camera Server: The AXIS NetEye 200 Network Camera attach- es directly to an Ethernet network.
    [Show full text]
  • Unicode to the Rescue
    Session: B02 DB2 and the Tower of Babel: Unicode to the Rescue Jim Dee BMC Software Inc. 13 October 2008 • 13:30 – 14:30 Platform: DB2 for z/OS UTF-8, UTF-16, code pages, and so on - Unicode can be a daunting subject! There is much interest in supporting it - many of our businesses are expanding globally, and DBA’s and application developers will need to support Unicode data stored in DB2 tables. The topic is easier than it looks, and this presentation will help you prepare to migrate your mission critical data to Unicode. 1 Key Points • Understand Unicode concepts - what it is, and how different languages and symbols are represented. • Explain common Unicode terms like code point, plane, and surrogate. • Understand how Unicode is implemented in DB2 and what the code pages mean. • Understand the implications of Unicode for application programs and utilities. • Discuss data conversion considerations. 2 Introduction. What is Unicode? EBCDIC and ASCII code pages Unicode encoding schemes Unicode in DB2 UTF-8 and UTF-16 Code pages Catalog Migration to Unicode Application programs Utilities Data conversion 2 Tower of Babel "The Tower of Babel," oil painting by Pieter Bruegel the Elder, 1563 (credit: Courtesy of the Kunsthistorisches Museum, Vienna) 3 The Tower of Babel is a story from the book of Genesis. It is a classic tale of how an ambitious development project can be disrupted by lack of communication. Sometimes, DB2 data management can seem like the tower. We can store data associated with a particular language in DB2 tables by defining the correct code page for a DB2 subsystem, but how do we combine data representing different languages? There are so many languages! Unicode is part of the solution, but it too seems confusing.
    [Show full text]
  • Fonts Reference Version 12.1
    Start Oracle® Documaker Fonts Reference version 12.1 Part number: E22582-01 June 2012 Notice Copyright © 2009, 2012, Oracle and/or its affiliates. All rights reserved. The Programs (which include both the software and documentation) contain proprietary information; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent, and other intellectual and industrial property laws. Reverse engineering, disassembly, or decompilation of the Programs, except to the extent required to obtain interoperability with other independently created software or as specified by law, is prohibited. The information contained in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. This document is not warranted to be error-free. Except as may be expressly permitted in your license agreement for these Programs, no part of these Programs may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose. If the Programs are delivered to the United States Government or anyone licensing or using the Programs on behalf of the United States Government, the following notice is applicable: U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the Programs, including documentation and technical data, shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement, and, to the extent applicable, the additional rights set forth in FAR 52.227-19, Commercial Computer Software--Restricted Rights (June 1987).
    [Show full text]
  • IRG Principle and Procedures
    INTERNATIONAL ORGANIZATION FOR STANDARDIZATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC 1/SC 2/WG 2/IRG Universal Coded Character Set (UCS) ISO/IEC JTC 1/SC 2/WG 2/IRGN2424 WG2N5161 SC2N4752 (Revision of IRG N1503/N1772/N1823/N1920/N1942/N1975/N2016/N2092/N2153/ N2222/N2275/N2310/N2345/N2408) Confirmed 2021-04-05 Title: IRG Principles and Procedures (IRG PnP) Version 13 Source: IRG Convenor Action: For review by IRG and WG2 Distribution: IRG Reviewers and Ideographic Experts Editor in chief: Lu Qin, IRG Convenor References: Recommendations from IRG #53(IRGN2410&IRG2408,IRGN2412), IRG #52 (IRGN2360), IRGN2345 drafts, and feedback from Ken Lunde and HKSAR; IRG #51 (IRGN2329); IRG #49 ( IRGN2275); IRG #48 (IRGN2220); IRG #47 (IRGN2180); IRG #45 (IRGN2150); IRG # 44 (IRGN2080), IRGN2016 and IRGN1975; IRG #42 (IRGN1952 and feedback from HKSAR, Japan, ROK and TCA, IRG1920 Draft (2012-11-15), Draft 2 (2013-05-04) and Draft 3 (2013-05-22), feedback from Japan (2013-04-23) and ROK (2013-05-16 and 2013-05-21); IRG #40 discussions, IRG1823 Draft 3 and feedback from HKSAR, Korea; IRG #39 discussions IRGN1823 Draft2 feedback from HKSAR and Japan, from KIM Kyongsok, IRGN1781 and N1782 Feedback from KIM Kyongsok, IRGN1772 (P&P Version 5), IRGN1646 (P&P Version 4 draft), IRGN1602 (P&P Draft 4) and IRGN1633 (P&P Editorial Report), IRGN1601 (P&P Draft 3 Feedback from HKSAR), IRGN1590 and IRGN1601(P&P V2 and V3 draft and all feedback), IRGN1562 (P&P V3 Draft 1 and Feedback from HKSAR), IRGN1561 (P&P V2 and all feedback), IRGN1559 (P&P V2 Draft and all feedback), IRGN1516 (P&P V1 Feedback from HKSAR), IRGN1489 (P&P V1 Feedback from Taichi Kawabata) IRGN1487 (P&P V1 Feedback from HKSAR), IRGN1465, IRGN1498 and IRGN1503 (P&P V1 drafts) Table of Contents 1.
    [Show full text]
  • Crash Course on Character Encodings
    Crash Course on Character Encodings Yusuke Shinyama NYCNLP Oct. 27, 2006 Introduction 2 Are they the same? • Unicode • UTF 3 Two Mappings Character Byte Character Code Sequence A 64 64 182 216 1590 ﺽ 美 32654 231 190 142 4 Two Mappings Character Byte Character Code Sequence Unicode UTF-8 A 64 64 182 216 1590 ﺽ 美 32654 231 190 142 “Character Set” “Encoding Scheme” 5 Terminology • Character Set - Mapping from abstract characters to numbers. • Encoding Scheme - Way to represent (encode) a number in a byte sequence in a decodable way. - Only necessary for character sets that have more than 256 characters. 6 In ASCII... Character Byte Character Code Sequence ASCII 5 53 53 A 65 65 m 109 109 7 Character Sets 8 Character Sets • ≤ 256 characters: - ASCII (English) - ISO 8859-1 (English & Western European languages) - KOI8 (Cyrillic) - ISO-8859-6 (Arabic) • 256 < characters: - Unicode - GB 2312 (Simplified Chinese) - Big5 (Traditional Chinese) - JISX 0208 (Japanese) - KPS 9566 (North Korean) 9 Character Sets ISO ISO ASCII GB 2312 Unicode 8859-1 8859-6 A 65 65 65 65 65 ë - 235 - - 235 1590 - 214 - - ﺽ 美 - - - 50112 32654 ♥ - - - - 9829 10 Unicode Standard • History - ISO Universal Character Set (1989) - Unicode 1.0 (1991) ■ 16-bit fixed length codes. - Unicode 2.0 (1996) ■ Oops, we’ve got many more. ■ Extended to 32 bits. - Unicode 5.0 (2006) ■ Keep growing... 11 Unicode Standard • Hexadecimal notation (U+XXXX). • ISO 8859-1 is preserved as the first 256 characters. ISO Unicode 8859-1 U+0041 A 65 (65) U+00EB ë 235 (235) 12 Problems in Unicode • Politics (Microsoft, Apple, Sun, ...) • Lots of application specific characters.
    [Show full text]