A Comparative Analysis of Information Hiding Techniques for Copyright Protection of Text Documents

Hindawi
Security and Communication Networks
Volume 2018, Article ID 5325040, 22 pages
https://doi.org/10.1155/2018/5325040

Review Article

Milad Taleby Ahvanooey,1 Qianmu Li,1 Hiuk Jae Shim,2 and Yanyan Huang3

1 School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
2 School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China
3 School of Automation, Nanjing University of Science and Technology, Nanjing, China

Correspondence should be addressed to Milad Taleby Ahvanooey; [email protected] and Qianmu Li; [email protected]

Received 6 October 2017; Revised 28 December 2017; Accepted 16 January 2018; Published 17 April 2018

Academic Editor: Pino Caballero-Gil

Copyright © 2018 Milad Taleby Ahvanooey et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

With the ceaseless usage of web and other online services, it has turned out that copying, sharing, and transmitting digital media over the Internet are amazingly simple. Since the text is one of the main available data sources and most widely used digital media on the Internet, the significant part of websites, books, articles, daily papers, and so on is just the plain text. Therefore, copyright protection of plain texts is still a remaining issue that must be improved in order to provide proof of ownership and obtain the desired accuracy. During the last decade, digital watermarking and steganography techniques have been used as alternatives to prevent tampering, distortion, and media forgery and also to protect both copyright and authentication.
This paper presents a comparative analysis of information hiding techniques, especially on those ones which are focused on modifying the structure and content of digital texts. Herein, various text watermarking and text steganography techniques characteristics are highlighted along with their applications. In addition, various types of attacks are described and their effects are analyzed in order to highlight the advantages and weaknesses of current techniques. Finally, some guidelines and directions are suggested for future works.

1. Introduction

Following the progressive growth of Internet and advancement of online services, digital publishing has become an essential topic and in the next-generation organizations, offices (e.g., institutions and publishers) seem to be paperless. Nowadays, various studies are in process to execute and organize some ideas such as e-commerce, e-government, and online libraries. Digital publishing has many privileges, but it has some fundamental threats such as illegal use of copyrighted documents, manipulating the data, and redistributing such information [1–3]. In this case, some protective solutions consisting of copyright protection, integrality, authenticity, and confidentiality are essential to prevent forgery and plagiarism problems. Downloading and manipulating a copyrighted text and thus reusing it without any control are easy these days; hence, copyright management is very necessary to protect such information against modifying and reproducing processes [4–6].

Digital text watermarking is a data hiding technique which conceals a signature or copyright information called watermark inside a cover text in an imperceptible way. Separating the hidden watermark from the cover text is very difficult due to the fact that watermarked text is invisible for everyone except the original owner. Recently, text watermarking has not drawn much attention from cyber security experts and researchers. There are some reasons for that: the much lower capacity of text to retain data might be one of the main reasons (i.e., compared to other digital media such as image, audio, and videos). However, there are a number of reasons why we should pay more attention to it. Firstly, the text is still a major form of universally applicable digital media. In other words, text is an important part of communication between people compared to other media. Secondly, text watermarking has no clear evaluation criteria to analyze its efficiency [6–10].

The different categories of information security systems are depicted in Figure 1. The cryptography and information hiding are security systems that are used to protect data from deceivers, crackers, hackers, and spies. Commonly, most of the malicious users want to leave traces from cuts, manipulations, and infections [7]. The cryptography scrambles a plain text into ciphertext which is reversible without data loss. The goal of cryptography is to prevent unauthorized access to the secret information by scrambling the content of information. On the other hand, information hiding is a powerful security technique which hides a secret data in a cover media (e.g., text, image, audio, or video) so that the trace of embedding hidden data is completely unnoticeable. The cryptography and information hiding are similar in a way that both are utilized to protect sensitive information. However, the imperceptibility is the difference between both techniques; that is, information hiding concerns how to hide information unnoticeably. Generally, the information hiding can be further categorized into steganography and watermarking. The aim of steganography is to hide a secret message in a cover media in order to transmit the secret information; therefore, the main concern is how to conceal the secret information without raising suspicion; that is, steganography needs to conceal the fact that the message is hidden. Watermarking is concerned with hiding a small data in digital files such that the hidden data is robust to alterations and adjustments. In other words, watermarking aims to protect intellectual property of digital media against unauthorized copy or access by embedding a watermark (visible or invisible) in the cover media which can remain beside the data, and it can be used whenever there is any query about the originality of media (e.g., the hidden watermark refers to the original owner) [2–10].

Figure 1: Different categories of information security systems.

Over the last two decades, many information hiding techniques have been proposed in terms of text watermarking and text steganography for copyright protection [11–14], proof of ownership [15–23], and copy control and authentication [24–31]. Although the aim of steganography is different, it also can be used for the copyright protection of digital texts like watermarking.

The main contributions of this paper are summarized as follows:

(i) We present a brief overview of existing literature on text watermarking categories, architecture, applications, attacks, and evaluation criteria.
(ii) We summarize some information hiding techniques which are focused on altering the structure and content of the cover text in order to hide secret information.
(iii) We provide a comparative analysis of the summarized techniques and evaluate their efficiency based on the specified criteria.

The rest of the paper is organized as follows. In Section 2, we review text watermarking literature and related studies. In Section 3, we introduce individual text watermarking methods and analyze them based on evaluation criteria, and we give some suggestions for the future works in Section 4. Finally, Section 5 draws the conclusions.

2. Literature Review

In what follows, we describe the existing literature on text watermarking including architecture, the Unicode standard, text watermarking categories, applications, evaluation criteria, and attacks.

2.1. Text Watermarking Architecture. As shown in Figure 2, digital text watermarking includes two main phases, namely, watermark embedding and watermark extraction.

Figure 2: Digital text watermarking (embedding and extraction) architecture.

(i) Watermark Embedding. The embedding phase of text watermarking algorithm includes three stages. The first stage is generating a watermark string which includes the owner's name or other pieces of information (e.g., author and publisher). In the second stage, the watermark string is converted to a binary string, which is modified by a hash function according to an optional key, and then an invisible watermark string is generated for embedding it into special locations in the cover text. Finally, it is inserted into special locations where the watermark string will not be affected by attacks [...]

[...] and classical letters in many languages. This standard is compatible with the latest version of ISO/IEC 10646-1:2017 and has the same characters and codes of ISO/IEC 10646. As of [...]
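The three embedding stages described in Section 2.1 (generating a watermark string, converting it to a binary string that is modified by a hash function under an optional key, and inserting an invisible string into the cover text) can be illustrated with a minimal Python sketch. Several details here are assumptions made only for illustration and are not taken from the surveyed methods: the two zero-width Unicode characters used as carriers, the SHA-256 based keystream that is XORed with the watermark bytes, the choice of "after the first word" as the insertion location, and the function names `embed` and `extract`.

```python
import hashlib

# Assumed invisible carriers: zero-width space encodes 0, zero-width non-joiner encodes 1.
ZERO, ONE = "\u200b", "\u200c"


def _keystream(key: str, n: int) -> bytes:
    """Derive n bytes from the key by hashing key:counter with SHA-256."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(f"{key}:{counter}".encode("utf-8")).digest()
        counter += 1
    return out[:n]


def _mask(data: bytes, key: str) -> bytes:
    """XOR data with the key-derived stream; applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, _keystream(key, len(data))))


def embed(cover: str, watermark: str, key: str = "") -> str:
    # Stages 1 and 2: watermark string -> bytes, masked with the keyed hash stream.
    payload = _mask(watermark.encode("utf-8"), key)
    bits = "".join(f"{byte:08b}" for byte in payload)
    # Stage 3: map bits to invisible characters and insert them after the first word.
    invisible = "".join(ONE if bit == "1" else ZERO for bit in bits)
    head, sep, tail = cover.partition(" ")
    return head + invisible + sep + tail


def extract(marked: str, key: str = "") -> str:
    # Collect the carrier characters, rebuild the bytes, and undo the keyed mask.
    bits = "".join("1" if ch == ONE else "0" for ch in marked if ch in (ZERO, ONE))
    payload = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return _mask(payload, key).decode("utf-8", errors="replace")
```

Calling `extract(embed(cover, "Owner: A. Author", key="secret"), key="secret")` returns the original watermark string, provided the cover text does not already contain the two carrier characters. The sketch makes no attempt at robustness: any attack that strips or reorders zero-width characters destroys the mark, which is the kind of weakness the evaluation criteria in this survey are meant to expose.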