The Unicode Standard, Version 10.0
Recommended publications
-
RFC 3629 UTF-8 November 2003
Network Working Group F. Yergeau Request for Comments: 3629 Alis Technologies STD: 63 November 2003 Obsoletes: 2279 Category: Standards Track UTF-8, a transformation format of ISO 10646 Status of this Memo This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The Internet Society (2003). All Rights Reserved. Abstract ISO/IEC 10646-1 defines a large character set called the Universal Character Set (UCS) which encompasses most of the world's writing systems. The originally proposed encodings of the UCS, however, were not compatible with many current applications and protocols, and this has led to the development of UTF-8, the object of this memo. UTF-8 has the characteristic of preserving the full US-ASCII range, providing compatibility with file systems, parsers and other software that rely on US-ASCII values but are transparent to other values. This memo obsoletes and replaces RFC 2279. Table of Contents 1. Introduction 2. Notational conventions 3. UTF-8 definition 4. Syntax of UTF-8 Byte Sequences 5. Versions of the standards 6. Byte order mark (BOM) 7. Examples 8. MIME registration 9. IANA Considerations 10. Security Considerations 11. Acknowledgements 12. Changes from RFC 2279 13. Normative References 14. -
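The RFC 3629 abstract above says that UTF-8 preserves the full US-ASCII range. A minimal sketch of what that means in practice, using Python's built-in `str.encode`; the helper name `ascii_is_preserved` and the sample path are illustrative assumptions, not taken from the RFC:

```python
def ascii_is_preserved(text: str) -> bool:
    """Check that bytes below 0x80 in the UTF-8 form match the ASCII characters.

    Hypothetical helper for illustration: it compares the ASCII characters of
    the input, in order, with the single-byte values in the encoded output.
    """
    encoded = text.encode("utf-8")
    ascii_chars = [ord(ch) for ch in text if ord(ch) < 0x80]
    ascii_bytes = [b for b in encoded if b < 0x80]
    return ascii_chars == ascii_bytes


# Pure ASCII text is byte-for-byte identical after encoding.
plain = "ASCII"
plain_encoded = plain.encode("utf-8")

# A made-up file path containing U+00E9 (e with acute accent).
sample = "path/to/r\u00e9sum\u00e9.txt"
sample_encoded = sample.encode("utf-8")

# Software that only looks for the ASCII byte "/" still splits the path
# correctly, because non-ASCII characters never produce bytes below 0x80.
parts = sample_encoded.split(b"/")
```

Here each U+00E9 becomes the two bytes `C3 A9`, both above `0x7F`, so an ASCII-oriented parser can pass them through untouched.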
Supplementary Guide to UEB Reference Materials V.8.31.16
Supplementary Guide to UEB Reference Materials v.8.31.16 Unless otherwise indicated, page numbers refer to The Rules of Unified English Braille, 2013 For referenced BANA Guidances visit: www.brailleauthority.org * indicates definition of entry word A @ sign, 25 Caret, 24, 42 Abbreviations, 106, 152 Cent Sign ¢, 26 Accented letters, 42, 190 Chemistry, 89, 178, see BANA Guidance capitals, 80 Code switching, 199-210 in fully capped words, 89 how to use, 202-203 Acronyms, 106, 152 indicators Addition foreign language, 191-192, 195 non-technical materials, 31 IPA, 199, 207-208 technical materials, 169 music, 199, 208-209 Alphabetic wordsign, *7, 9, 15, 103-106, Nemeth code, 199, 209-210 164 non-UEB, 199, 203-208 Ampersand &, 21 Coinage, 26, 64 Anglicized words, 45, 158, 186, 189 Colored type, 11, 97 Apostrophe, 18, 69, 105, 107 Comma, 69 Arrows, 21, 174 numeric mode, 59 line mode, 219 Comparison, signs of, 169,31 Asterisk, 21 Compound words, bridging, 146 At sign @, 25 Computer material contractions in, 155 B email addresses, 155 Blank to be filled in, 73, 160 grade 1 indicators, 52 Boldface indicators, 91 Computer notation, 178 Brackets, opening and closing, 69, 78 Contracted (grade 2) braille, *7, 14 Braille grouping indicators, 23, 45, 172 usage cross-referenced, 14 Braille order, list of symbols, 275 Contractions summary, 9 Bullet, 24, 34, 37 Contractions, *7, 9, 103-168 abbreviations, 152 C acronyms, 152 Capitalization, 79-90 alphabetic wordsigns, *7, 9, 15, 103-106, grade 1, 55 164 indicators bridging, 146-152 choice of, 87 aspirated -
Quarkxpress 9.1 Keyboard Command Guide: Mac OS
QuarkXPress 9.1 Keyboard Command Guide: Mac OS Contents: Menu commands (Mac OS®); Dialog box commands (Mac OS); Palette commands (Mac OS); Project and layout commands (Mac OS); Item commands (Mac OS); Text commands (Mac OS); Picture commands (Mac OS) Menu commands (Mac OS®) QuarkXPress menu QuarkXPress® Environment dialog box Option+About QuarkXPress or Control+Option+E Preferences ⌘+Option+Shift+Y Quit ⌘+Q File menu New Project ⌘+N New Library ⌘+Option+N Open ⌘+O Close ⌘+W Save ⌘+S Save As ⌘+Shift+S Revert to last Auto Save Option+Revert to Saved Import ⌘+E Save Text ⌘+Option+E Append ⌘+Option+A Export Layout as PDF ⌘+Option+P Export Page as EPS ⌘+Option+Shift+S Print ⌘+P Output Job ⌘+Option+Shift+O Edit menu Undo ⌘+Z Redo ⌘+Y, ⌘+Z, or ⌘+Shift+Z (configurable) Cut ⌘+X Copy ⌘+C Paste ⌘+V Paste without Formatting ⌘+Option+V Paste In Place ⌘+Option+Shift+V Select All ⌘+A -
Double Hyphen" Source: Karl Pentzlin Status: Individual Contribution Action: for Consideration by JTC1/SC2/WG2 and UTC Date: 2010-09-28
JTC1/SC2/WG2 N3917 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation Internationale de Normalisation Международная организация по стандартизации Doc Type: Working Group Document Title: Revised Proposal to encode a punctuation mark "Double Hyphen" Source: Karl Pentzlin Status: Individual Contribution Action: For consideration by JTC1/SC2/WG2 and UTC Date: 2010-09-28 Dashes and Hyphens A U+2E4E DOUBLE HYPHEN → 2010 hyphen → 2E17 double oblique hyphen → 003D equals sign → A78A modifier letter short equals sign · used in transcription of old German prints and handwritings · used in some non-standard punctuation · not intended for standard hyphens where the duplication is only a font variant Properties: 2E4E;DOUBLE HYPHEN;Pd;0;ON;;;;;N;;;;; Entry in LineBreak.TXT: 2E4E;BA # DOUBLE HYPHEN 1. Introduction The "ordinary" hyphen, which is representable by U+002D HYPHEN-MINUS or U+2010 HYPHEN, usually is displayed by a single short horizontal dash, but has a considerable glyph variation: it can be slanted to oblique or doubled (stacked) according to the used font. For instance, in Fraktur (Blackletter) fonts, it commonly is represented by two stacked short oblique dashes. However, in certain applications, double hyphens (consisting of two stacked short dashes) are used as characters with semantics deviating from the "ordinary" hyphen, e.g. to represent a definite unit in transliteration. For such a special application, in this case for transliteration of Coptic, U+2E17 DOUBLE OBLIQUE HYPHEN was encoded ([1], example on p. 9). However, there are other applications where the double hyphen is usually not oblique. For such applications, here a "DOUBLE HYPHEN" is proposed, which consists of two stacked short dashes which usually are horizontal. -
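The "Properties" line in the proposal above is a semicolon-separated record. The sketch below reads it under the assumption that it follows the 15-field layout of UnicodeData.txt in the Unicode Character Database; the class name `ProposedCharacter` and the function `parse_properties` are hypothetical, and the record is copied from the proposal text as given, not checked against any published version of the standard.

```python
from typing import NamedTuple


class ProposedCharacter(NamedTuple):
    """A subset of the fields in a UnicodeData.txt-style record (illustrative)."""
    code_point: int
    name: str
    general_category: str
    combining_class: int
    bidi_class: str
    mirrored: bool


def parse_properties(line: str) -> ProposedCharacter:
    """Parse one semicolon-separated record, assuming the 15-field layout."""
    fields = line.strip().split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    return ProposedCharacter(
        code_point=int(fields[0], 16),   # hexadecimal code point
        name=fields[1],
        general_category=fields[2],      # "Pd" is dash punctuation
        combining_class=int(fields[3]),
        bidi_class=fields[4],            # "ON" is other neutral
        mirrored=(fields[9] == "Y"),
    )


# The record exactly as it appears in the proposal.
record = parse_properties("2E4E;DOUBLE HYPHEN;Pd;0;ON;;;;;N;;;;;")
```

The empty fields between `ON` and `N` are the decomposition and numeric-value slots, which the proposal leaves blank for this character.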
The Unicode Standard 5.2 Code Charts
C0 Controls and Basic Latin Range: 0000–007F This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 5.2. This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See http://www.unicode.org/errata/ for an up-to-date list of errata. See http://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See http://www.unicode.org/charts/PDF/Unicode-5.2/ for charts showing only the characters added in Unicode 5.2. See http://www.unicode.org/Public/5.2.0/charts/ for a complete archived file of character code charts for Unicode 5.2. Disclaimer These charts are provided as the online reference to the character contents of the Unicode Standard, Version 5.2 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 5.2, online at http://www.unicode.org/versions/Unicode5.2.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, and #44, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See http://www.unicode.org/ucd/ and http://www.unicode.org/reports/ A thorough understanding of the information contained in these additional sources is required for a successful implementation. -
CJK Symbols and Punctuation Range: 3000–303F
CJK Symbols and Punctuation Range: 3000–303F This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 14.0 This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See https://www.unicode.org/errata/ for an up-to-date list of errata. See https://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See https://www.unicode.org/charts/PDF/Unicode-14.0/ for charts showing only the characters added in Unicode 14.0. See https://www.unicode.org/Public/14.0.0/charts/ for a complete archived file of character code charts for Unicode 14.0. Disclaimer These charts are provided as the online reference to the character contents of the Unicode Standard, Version 14.0 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 14.0, online at https://www.unicode.org/versions/Unicode14.0.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, #44, #45, and #50, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See https://www.unicode.org/ucd/ and https://www.unicode.org/reports/ A thorough understanding of the information contained in these additional sources is required for a successful implementation. -
The Ultimate Guide to Style, Grammar, Punctuation, Usage
The AMA Handbook of Business Writing The Ultimate Guide to Style, Grammar, Usage, Punctuation, Construction, and Formatting KEVIN WILSON and JENNIFER WAUSON AMERICAN MANAGEMENT ASSOCIATION New York • Atlanta • Brussels • Chicago • Mexico City • San Francisco Shanghai • Tokyo • Toronto • Washington, D. C. Bulk discounts available. For details visit: www.amacombooks.org/go/specialsales Or contact special sales: Phone: 800-250-5308 Email: [email protected] View all the AMACOM titles at: www.amacombooks.org This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional service. If legal advice or other expert assistance is required, the services of a competent professional person should be sought. Library of Congress Cataloging-in-Publication Data AMA handbook of business writing : the ultimate guide to style, grammar, usage, punctuation, construction, and formatting / Kevin Wilson and Jennifer Wauson. p. cm. Includes bibliographical references and index. ISBN-13: 978-0-8144-1589-4 ISBN-10: 0-8144-1589-X 1. Commercial correspondence--Handbooks, manuals, etc. 2. Business writing— Handbooks, manuals, etc. 3. English language—Business English—Handbooks, manuals, etc. I. Wilson, K. (Kevin), 1958– II. Wauson, Jennifer. III. American Management Association. HF5726.A485 1996 808'.06665—dc22 2009050050 © 2010 Kevin Wilson and Jennifer Wauson. All rights reserved. Printed in the United States of America. This publication may not be reproduced, stored in a retrieval system, or transmitted in whole or in part, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of AMACOM, a division of American Management Association, 1601 Broadway, New York, NY 10019. -
Unified English Braille (UEB) General Symbols and Indicators
Unified English Braille (UEB) General Symbols and Indicators UEB Rulebook Section 3 Published by International Council on English Braille (ICEB) space (see 3.23) ⠣ opening braille grouping indicator (see 3.4) ⠹ first transcriber‐defined print symbol (see 3.26) ⠫ shape indicator (see 3.22) ⠳ arrow indicator (see 3.2) ⠳⠕ → simple right pointing arrow (east) (see 3.2) ⠳⠩ ↓ simple down pointing arrow (south) (see 3.2) ⠳⠪ ← simple left pointing arrow (west) (see 3.2) ⠳⠬ ↑ simple up pointing arrow (north) (see 3.2) ⠒ ∶ ratio (see 3.17) ⠒⠒ ∷ proportion (see 3.17) ⠢ subscript indicator (see 3.24) ⠶ ′ prime (see 3.11 and 3.15) ⠶⠶ ″ double prime (see 3.11 and 3.15) ⠔ superscript indicator (see 3.24) ⠼⠡ ♮ natural (see 3.18) ⠼⠣ ♭ flat (see 3.18) ⠼⠩ ♯ sharp (see 3.18) ⠼⠹ second transcriber‐defined print symbol (see 3.26) ⠜ closing braille grouping indicator (see 3.4) ⠈⠁ @ commercial at sign (see 3.7) ⠈⠉ ¢ cent sign (see 3.10) ⠈⠑ € euro sign (see 3.10) ⠈⠋ ₣ French franc sign (see 3.10) ⠈⠇ £ pound sign (pound sterling) (see 3.10) ⠈⠝ ₦ naira sign (see 3.10) ⠈⠎ $ dollar sign (see 3.10) ⠈⠽ ¥ yen sign (Yuan sign) (see 3.10) ⠈⠯ & ampersand (see 3.1) ⠈⠣ < less‐than sign (see 3.17) ⠈⠢ ^ caret (3.6) ⠈⠔ ~ tilde (swung dash) (see 3.25) ⠈⠼⠹ third transcriber‐defined print symbol (see 3.26) ⠈⠜ > greater‐than sign (see 3.17) ⠈⠨⠣ opening transcriber’s note indicator (see 3.27) ⠈⠨⠜ closing transcriber’s note indicator (see 3.27) ⠈⠠⠹ † dagger (see 3.3) ⠈⠠⠻ ‡ double dagger (see 3.3) ⠘⠉ © copyright sign (see 3.8) ⠘⠚ ° degree sign (see 3.11) ⠘⠏ ¶ paragraph sign (see 3.20) -
Control Characters in ASCII and Unicode
Control characters in ASCII and Unicode Tens of odd control characters appear in ASCII charts. The same characters have found their way to Unicode as well. CR, LF, ESC, CAN... what are all these codes for? Should I care about them? This is an in-depth look into control characters in ASCII and its descendants, including Unicode, ANSI and ISO standards. When ASCII first appeared in the 1960s, control characters were an essential part of the new character set. Since then, many new character sets and standards have been published. Computing is not the same either. What happened to the control characters? Are they still used and if yes, for what? This article looks back at the history of character sets while keeping an eye on modern use. The information is based on a number of standards released by ANSI, ISO, ECMA and The Unicode Consortium, as well as industry practice. In many cases, the standards define one use for a character, but common practice is different. Some characters are used contrary to the standards. In addition, certain characters were originally defined in an ambiguous or loose way, which has resulted in confusion in their use. Contents Groups of control characters Control characters in standards o ASCII control characters o C1 control characters o ISO 8859 special characters NBSP and SHY o Control characters in Unicode Control characters in modern applications Character list o ASCII o C1 o ISO 8859 Categories Translations Character index Sources This article starts by looking at the history of control characters in standards. We then move to modern times. -
Character Properties 4
The Unicode® Standard Version 14.0 – Core Specification To learn about the latest version of the Unicode Standard, see https://www.unicode.org/versions/latest/. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries. The authors and publisher have taken care in the preparation of this specification, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. © 2021 Unicode, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction. For information regarding permissions, inquire at https://www.unicode.org/reporting.html. For information about the Unicode terms of use, please see https://www.unicode.org/copyright.html. The Unicode Standard / the Unicode Consortium; edited by the Unicode Consortium. — Version 14.0. Includes index. ISBN 978-1-936213-29-0 (https://www.unicode.org/versions/Unicode14.0.0/) 1. -
Chapter 6, Writing Systems and Punctuation
The Unicode® Standard Version 13.0 – Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries. The authors and publisher have taken care in the preparation of this specification, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. © 2020 Unicode, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction. For information regarding permissions, inquire at http://www.unicode.org/reporting.html. For information about the Unicode terms of use, please see http://www.unicode.org/copyright.html. The Unicode Standard / the Unicode Consortium; edited by the Unicode Consortium. — Version 13.0. Includes index. ISBN 978-1-936213-26-9 (http://www.unicode.org/versions/Unicode13.0.0/) 1. -
Iso/Iec 10646:2011 Fdis
Proposed Draft Amendment (PDAM) 2 ISO/IEC 10646:2012/Amd.2: 2012 (E) Information technology — Universal Coded Character Set (UCS) — AMENDMENT 2: Caucasian Albanian, Psalter Pahlavi, Old Hungarian, Mahajani, Grantha, Modi, Pahawh Hmong, Mende, and other characters Page 22, Sub-clause 16.3 Format characters Insert the following entry in the list of format characters: 061C ARABIC LETTER MARK 1107F BRAHMI NUMBER JOINER Page 23, Sub-clause 16.5 Variation selectors and variation sequences Remove the first sentence of the third paragraph (starting with ‘No variation sequences using characters’). Insert the following text at the end of the sub-clause. The following list provides a list of variation sequences corresponding to the use of appropriate variation selectors with allowed pictographic symbols. The range of presentations may include a traditional black and white text style, using FE0E VARIATION SELECTOR-15, or an ‘emoji’ style, using FE0F VARIATION SELECTOR-16, whose presentation often involves color/grayscale and/or animation. Sequence (UID notation) Description of sequence <0023, FE0E, 20E3> NUMBER SIGN inside a COMBINING ENCLOSING KEYCAP <0023, FE0F, 20E3> <0030, FE0E, 20E3> DIGIT ZERO inside a COMBINING ENCLOSING KEYCAP <0030, FE0F, 20E3> <0031, FE0E, 20E3> DIGIT ONE inside a COMBINING ENCLOSING KEYCAP <0031, FE0F, 20E3> <0032, FE0E, 20E3> DIGIT TWO inside a COMBINING ENCLOSING KEYCAP <0032, FE0F, 20E3> <0033, FE0E, 20E3> DIGIT THREE inside a COMBINING ENCLOSING KEYCAP <0033, FE0F, 20E3> <0034, FE0E, 20E3> DIGIT FOUR inside a COMBINING