Unicode Notes
Total Page:16
File Type:pdf, Size:1020Kb
Unicode Notes NOTE: This is a “living document” and is regularly updated. I recommend book- marking it at one or more of the URLs on page 5 and updating frequently. Table of Contents Document Conventions and Styles.......................................................................................................4 Language..........................................................................................................................................4 Emacs and Emacs Lisp Terminology...............................................................................................4 File Format.......................................................................................................................................4 Document Availability.....................................................................................................................5 Organised Adversary Regional Download..................................................................................5 Dropbox......................................................................................................................................5 Unicode Version...................................................................................................................................6 Unicode Quick Stats........................................................................................................................6 Frequently checked references.........................................................................................................6 Code Point Insertion Methods..............................................................................................................7 Hotkeys............................................................................................................................................7 Unicode and File or Data Formats........................................................................................................9 Unicode and XML...........................................................................................................................9 Unicode and the Darwin Information Typing Architecture (DITA)............................................9 Unicode and DITA for Publishers (D4P)....................................................................................9 Unicode, HTML, XHTML and XML.........................................................................................9 Unicode and HTML 3.2 and earlier..........................................................................................10 Unicode and HTML 4.x............................................................................................................10 Unicode and XHTML...............................................................................................................10 Unicode and HTML 5...............................................................................................................10 Unicode and HTML 5 with XML.............................................................................................10 Unicode, HTML, XML and Ebooks.........................................................................................10 Unicode, HTML 3.2 and Open E-Book Publication Structure (OEBPS).................................11 OEBPS and Mobipocket...........................................................................................................11 Unicode, XHTML and EPUB 2.0.1..........................................................................................12 Unicode, HTML 5, XML and EPUB 3.0.1...............................................................................12 Unicode, HTML 5 and EPUB 3.1.............................................................................................12 Unicode, HTML 5 and above EPUB 3.1..................................................................................12 Embedding Fonts in EPUB Files...................................................................................................14 Font Obfuscation.......................................................................................................................14 Embedding Fonts for Apple iBooks..........................................................................................14 Recommended Fonts for Embedding........................................................................................15 Unicode and LaTeX.......................................................................................................................16 PDF and PostScript........................................................................................................................19 PostScript..................................................................................................................................19 PDF/A-1....................................................................................................................................19 PDF/X-3:2002...........................................................................................................................19 Last edit: 04/06/2020 Copyright © Benjamin D. McGinnes, 2010-2020 Page 1 of 78 File Names and Locations..............................................................................................................20 The Domain Name System and Unicode..................................................................................20 Unicode and Programming Languages..............................................................................................22 Unicode and Python.......................................................................................................................22 Python and Punycode................................................................................................................24 Python Handling of URIs, URLs and URNs............................................................................24 Lisp and Emacs Lisp......................................................................................................................25 Scheme and Guile..........................................................................................................................26 Unicode and Operating Systems........................................................................................................27 Unicode and Mac OS X.................................................................................................................27 Unicode and Linux.........................................................................................................................27 Unicode and BSD..........................................................................................................................27 Unicode and Windows...................................................................................................................27 Unicode and Software........................................................................................................................29 Unicode and Emacs.......................................................................................................................29 Unicode and Atom.........................................................................................................................30 Unicode and Vi and/or Vim...........................................................................................................30 Unicode and LibreOffice...............................................................................................................30 Unicode and oXygenXML Editor..................................................................................................31 Unicode and IRC...........................................................................................................................31 Unicode and HexChat or XChat...............................................................................................32 Unicode Utilities............................................................................................................................32 Unum.........................................................................................................................................32 Unicode and Related Subject Matter Experts................................................................................33 Joel Spolsky on Developers and Unicode.................................................................................33 Dan Sugalski on C, Unicode, strings and character encoding..................................................33 The USA's National Security Agency (NSA) on PDFs and Metadata......................................34 Conversions and Transformations......................................................................................................36 Emacs Modes.................................................................................................................................36 Htmlize......................................................................................................................................36 Org Mode..................................................................................................................................36 Pandoc............................................................................................................................................37 Docutils and reStructuredText.......................................................................................................38