The Unicode Standard, Version 4.1 This File Contains an Excerpt from the Character Code Tables and List of Character Names for the Unicode Standard, Version 4.1
Recommended publications
-
Unicode Alphabets for LaTeX
Unicode Alphabets for LaTeX: Specimen
Mikkel Eide Eriksen, March 11, 2020

Contents: MUFI, SIL, TITUS, UNZ

MUFI
Using the font PalemonasMUFI(0) from http://mufi.info/.

Code Point  Glyph  MUFI Entity Name  Unicode Name
E262        �      OEligogon         LATIN CAPITAL LIGATURE OE WITH OGONEK
E268        �      Pdblac            LATIN CAPITAL LETTER P WITH DOUBLE ACUTE
E34E        �      Vvertline         LATIN CAPITAL LETTER V WITH VERTICAL LINE ABOVE
E662        �      oeligogon         LATIN SMALL LIGATURE OE WITH OGONEK
E668        �      pdblac            LATIN SMALL LETTER P WITH DOUBLE ACUTE
E74F        �      vvertline         LATIN SMALL LETTER V WITH VERTICAL LINE ABOVE
E8A1        �      idblstrok         LATIN SMALL LETTER I WITH TWO STROKES
E8A2        �      jdblstrok         LATIN SMALL LETTER J WITH TWO STROKES
E8A3        �      autem             LATIN ABBREVIATION SIGN AUTEM
E8BB        �      vslashura         LATIN SMALL LETTER V WITH SHORT SLASH ABOVE RIGHT
E8BC        �      vslashuradbl      LATIN SMALL LETTER V WITH TWO SHORT SLASHES ABOVE RIGHT
E8C1        �      thornrarmlig      LATIN SMALL LETTER THORN LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C2        �      Hrarmlig          LATIN CAPITAL LETTER H LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C3        �      hrarmlig          LATIN SMALL LETTER H LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C5        �      krarmlig          LATIN SMALL LETTER K LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C6        UU     UUlig             LATIN CAPITAL LIGATURE UU
E8C7        uu     uulig             LATIN SMALL LIGATURE UU
E8C8        UE     UElig             LATIN CAPITAL LIGATURE UE
E8C9        ue     uelig             LATIN SMALL LIGATURE UE
E8CE        �      xslashlradbl      LATIN SMALL LETTER X WITH TWO SHORT SLASHES BELOW RIGHT
E8D1        æ̊      aeligring         LATIN SMALL LETTER AE WITH RING ABOVE
E8D3        ǽ̨      aeligogonacute    LATIN SMALL LETTER AE WITH OGONEK AND ACUTE
-
1 Symbols (2286)
1 Symbols (2286)

USV   Symbol  Macro(s)               Description
0009          \textHT                <control>
000A          \textLF                <control>
000D          \textCR                <control>
0022  ”       \textquotedbl          QUOTATION MARK
0023  #       \texthash              NUMBER SIGN
              \textnumbersign
0024  $       \textdollar            DOLLAR SIGN
0025  %       \textpercent           PERCENT SIGN
0026  &       \textampersand         AMPERSAND
0027  ’       \textquotesingle       APOSTROPHE
0028  (       \textparenleft         LEFT PARENTHESIS
0029  )       \textparenright        RIGHT PARENTHESIS
002A  *       \textasteriskcentered  ASTERISK
002B  +       \textMVPlus            PLUS SIGN
002C  ,       \textMVComma           COMMA
002D  -       \textMVMinus           HYPHEN-MINUS
002E  .       \textMVPeriod          FULL STOP
002F  /       \textMVDivision        SOLIDUS
0030  0       \textMVZero            DIGIT ZERO
0031  1       \textMVOne             DIGIT ONE
0032  2       \textMVTwo             DIGIT TWO
0033  3       \textMVThree           DIGIT THREE
0034  4       \textMVFour            DIGIT FOUR
0035  5       \textMVFive            DIGIT FIVE
0036  6       \textMVSix             DIGIT SIX
0037  7       \textMVSeven           DIGIT SEVEN
0038  8       \textMVEight           DIGIT EIGHT
0039  9       \textMVNine            DIGIT NINE
003C  <       \textless              LESS-THAN SIGN
003D  =       \textequals            EQUALS SIGN
003E  >       \textgreater           GREATER-THAN SIGN
0040  @       \textMVAt              COMMERCIAL AT
005C  \       \textbackslash         REVERSE SOLIDUS
005E  ^       \textasciicircum       CIRCUMFLEX ACCENT
005F  _       \textunderscore        LOW LINE
0060  ‘       \textasciigrave        GRAVE ACCENT
0067  g       \textg                 LATIN SMALL LETTER G
007B  {       \textbraceleft         LEFT CURLY BRACKET
007C  |       \textbar               VERTICAL LINE
007D  }       \textbraceright        RIGHT CURLY BRACKET
007E  ~       \textasciitilde        TILDE
00A0          \nobreakspace          NO-BREAK SPACE
00A1  ¡       \textexclamdown        INVERTED EXCLAMATION MARK
00A2  ¢       \textcent              CENT SIGN
00A3  £       \textsterling          POUND SIGN
00A4  ¤       \textcurrency          CURRENCY SIGN
00A5  ¥       \textyen               YEN SIGN
00A6
-
The Brill Typeface User Guide & Complete List of Characters
The Brill Typeface User Guide & Complete List of Characters Version 2.06, October 31, 2014 Pim Rietbroek Preamble Few typefaces – if any – allow the user to access every Latin character, every IPA character, every diacritic, and to have these combine in a typographically satisfactory manner, in a range of styles (roman, italic, and more); even fewer add full support for Greek, both modern and ancient, with specialised characters that papyrologists and epigraphers need; not to mention coverage of the Slavic languages in the Cyrillic range. The Brill typeface aims to do just that, and to be a tool for all scholars in the humanities; for Brill’s authors and editors; for Brill’s staff and service providers; and finally, for anyone in need of this tool, as long as it is not used for any commercial gain.* There are several fonts in different styles, each of which has the same set of characters as all the others. The Unicode Standard is rigorously adhered to: there is no dependence on the Private Use Area (PUA), as it happens frequently in other fonts with regard to characters carrying rare diacritics or combinations of diacritics. Instead, all alphabetic characters can carry any diacritic or combination of diacritics, even stacked, with automatic correct positioning. This is made possible by the inclusion of all of Unicode’s combining characters and by the application of extensive OpenType Glyph Positioning programming. Credits The Brill fonts are an original design by John Hudson of Tiro Typeworks. Alice Savoie contributed to Brill bold and bold italic. The black-letter (‘Fraktur’) range of characters was made by Karsten Lücke. -
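The contrast drawn in the preamble above, between Private Use Area code points and sequences of standard combining characters, can be sketched in Python with the standard `unicodedata` module. This is an illustration only, not anything taken from the Brill guide: the helper names `is_private_use` and `stack_marks` are hypothetical, and U+E262 is used merely as an arbitrary Private Use Area value.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the character has general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def stack_marks(base: str, *marks: str) -> str:
    """Attach combining marks to a base letter and normalize the result to NFC.

    Hypothetical helper: it only checks that each mark has a Mark (M*)
    general category; correct visual positioning is left to the font.
    """
    for mark in marks:
        if not unicodedata.category(mark).startswith("M"):
            raise ValueError(f"U+{ord(mark):04X} is not a combining mark")
    return unicodedata.normalize("NFC", base + "".join(marks))


# A Private Use Area code point carries no standard meaning ...
pua_char = "\ue262"

# ... whereas "ae with ogonek and acute" can be spelled with standard characters:
# U+00E6 LATIN SMALL LETTER AE, U+0328 COMBINING OGONEK, U+0301 COMBINING ACUTE ACCENT.
ae_ogonek_acute = stack_marks("\u00e6", "\u0328", "\u0301")

print(is_private_use(pua_char))
print(any(is_private_use(c) for c in ae_ogonek_acute))
print([unicodedata.name(c) for c in ae_ogonek_acute])
```

Text built this way stays interchangeable between fonts, because every code point in it has a standard identity; whether the marks stack legibly depends on the font's glyph positioning.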
The Unicode Standard, Version 10.0
Phonetic Extensions Supplement Range: 1D80–1DBF This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 10.0 This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See http://www.unicode.org/errata/ for an up-to-date list of errata. See http://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See http://www.unicode.org/charts/PDF/Unicode-10.0/ for charts showing only the characters added in Unicode 10.0. See http://www.unicode.org/Public/10.0.0/charts/ for a complete archived file of character code charts for Unicode 10.0. Disclaimer These charts are provided as the online reference to the character contents of the Unicode Standard, Version 10.0 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 10.0, online at http://www.unicode.org/versions/Unicode10.0.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, #44, and #45, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See http://www.unicode.org/ucd/ and http://www.unicode.org/reports/ A thorough understanding of the information contained in these additional sources is required for a successful implementation. -
The BU Phonetic Keyboarding System
The BU Phonetic Keyboarding System
Albert Bickford, March 18, 2021
Updated August 03, 2020

1 Introduction

The BU¹ Phonetic Keyboard provides access to a wide range of characters for Latin-based scripts in Unicode 4.1.0 (www.unicode.org), including:
• English, Spanish, French, German, and other major European languages²
• a nearly complete set of IPA and Americanist phonetic symbols, including obscure and obsolete symbols
• special characters commonly used in typesetting
• arrows
• common mathematical, numeric, and currency symbols

(It also works with non-Unicode applications, providing access to the standard Windows ANSI Latin-1 character set, also known as codepage 1252, using virtually the same keyboarding conventions as for the corresponding Unicode characters.)

The BU Phonetic keyboard is one of the more extensive Unicode keyboards for Latin scripts available, although it still does not cover all of the hundreds of Latin characters in Unicode. I have tried to include those that are more likely to be used by linguists and others working with multiple languages.³

To ease the memory load, the keyboarding conventions use a relatively small set of conventions that are applied very broadly and generally. Once you learn the conventions, you should be able to guess the keyboarding sequence for many characters without looking them up.⁴ I have also tried to avoid using keystroke combinations that may be needed for other purposes, e.g. for shortcut commands in common application programs.

¹ “BU” stands for “Bickford Unicode”. I named it after myself not for vainglory but simply as an easy way to distinguish it from other Unicode keyboards.
-
Latin Extended-B Range: 0180–024F
Latin Extended-B Range: 0180–024F This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 14.0 This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See https://www.unicode.org/errata/ for an up-to-date list of errata. See https://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See https://www.unicode.org/charts/PDF/Unicode-14.0/ for charts showing only the characters added in Unicode 14.0. See https://www.unicode.org/Public/14.0.0/charts/ for a complete archived file of character code charts for Unicode 14.0. Disclaimer These charts are provided as the online reference to the character contents of the Unicode Standard, Version 14.0 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 14.0, online at https://www.unicode.org/versions/Unicode14.0.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, #44, #45, and #50, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See https://www.unicode.org/ucd/ and https://www.unicode.org/reports/ A thorough understanding of the information contained in these additional sources is required for a successful implementation. -
Proposed Mapping of Extipa and Modifier Phonetic Characters Kirk Miller, [email protected] 2020 April 14
Proposed mapping of extIPA and modifier phonetic characters
Kirk Miller, [email protected]
2020 April 14

Per Script Ad Hoc Committee, 10780–107BF could be allocated to Latin. Mike Everson would like to draw characters. We will need to send a font to Unicode. Latin Extended-D U+A7C0, C1 still available?

[Code chart, columns ...0 to ...F. Rows: Latin D block (U+A7Cx, U+A7Dx, U+A7Ex, U+A7Fx); Latin E block (U+AB6x); Combining Diacritical Marks Extended (U+1ACx, U+1ADx, U+1AEx, U+1AFx); Supplementary plane (U+1078x, U+1079x, U+107Ax, U+107Bx).]

U+A7C0? LATIN LETTER SMALL CAPITAL TURNED L
U+A7C1? or sup plane so turned U could join it? LATIN LETTER SMALL CAPITAL TURNED K
LATIN SMALL LETTER O WITH RETROFLEX HOOK. Figure 15.
LATIN SMALL LETTER I WITH STROKE AND RETROFLEX HOOK. Figure 16.
LATIN SMALL LETTER TESH DIGRAPH WITH RETROFLEX HOOK. Figure 14.
LATIN SMALL LETTER L WITH BELT AND PALATAL HOOK. Figure 23.
LATIN SMALL LETTER ENG WITH PALATAL HOOK. Figure 24.
LATIN SMALL LETTER TURNED R WITH PALATAL HOOK. Figures 17–19.
...
LATIN SMALL LETTER R WITH FISHHOOK AND PALATAL HOOK. Figure 19.
LATIN SMALL LETTER EZH WITH PALATAL HOOK. Figures 20–21.
LATIN SMALL LETTER DEZH DIGRAPH WITH PALATAL HOOK. Figure 22.
LATIN SMALL LETTER TESH DIGRAPH WITH PALATAL HOOK. Figure 22.
LEFT SQUARE BRACKET WITH STROKE. Figures 11, 13.
RIGHT SQUARE BRACKET WITH STROKE. Figures 11, 13.
LEFT SQUARE BRACKET WITH DOUBLE STROKE. Figures 11, 14.
RIGHT SQUARE BRACKET WITH DOUBLE STROKE.
-
Considerations in the Identification and Management of Variant Elements in Latin Script Tables for IDN Registration
Considerations in the identification and management of variant elements in Latin script tables for IDN registration Cary Karp Swedish Museum of Natural History This Support Brief was contributed to the ICANN VIP initiative by the host of its Latin script study, the Internet Infrastructure Foundation, with the support of the Swedish Museum of Natural History. The code points available for use in IDNs are all taken from the Unicode Character Code Charts. The Latin script is divided there into nine blocks. The one headed “Basic Latin” restates the ASCII repertoire and therefore includes the familiar letter-digit-hyphen (“LDH”) array to which TLD registries previously restricted all second-level domain names. The TLD labels, themselves, were further restricted to the letters in that repertoire. Latin letters other than the ‘a–z’ encoded in ASCII, as well as diacritically marked and otherwise decorated forms are presented in supplemental and extended Latin blocks, with further Latin letters in blocks under the heading “Phonetic Symbols”. Many of the marked letters can be represented with differing series of code points. Other letters that are intrinsically different and have different code points may share the same glyph. Protocol constraint renders the first of these situations tractable. Contextual restriction on the use of certain code points is necessary for the second. Basic concepts and considerations in the protocol and contextual management of these conditions are discussed below, with specific regard to the local collation of permissible IDN character repertoires and the identification of variant relationships among the listed characters. The rubric “Support Brief” indicates the intention of this text serving as a source document for the study group’s deliberations and report, without being a structured work in itself. -
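The two situations described in the brief above can be sketched in Python with the standard `unicodedata` module. This is a minimal illustration under stated assumptions: it takes the "protocol constraint" to be normalization to NFC, the helper name `same_after_nfc` is hypothetical, and the sample letters were chosen for this sketch rather than taken from the brief.

```python
import unicodedata


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Situation 1: one marked letter, two different series of code points.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

print(precomposed == decomposed)                # raw code points differ
print(same_after_nfc(precomposed, decomposed))  # normalization unifies them

# Situation 2: intrinsically different letters that usually share a glyph.
turned_e = "\u01dd"      # LATIN SMALL LETTER TURNED E
schwa = "\u0259"         # LATIN SMALL LETTER SCHWA

print(unicodedata.name(turned_e), "/", unicodedata.name(schwa))
print(same_after_nfc(turned_e, schwa))          # normalization does not help here
```

Normalization settles the first case mechanically. The second case is untouched by it, which is why such pairs have to be handled by contextual rules or by variant relationships in a registry's own table.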
Proposal to Encode Phonetic Symbols with Palatal Hook in the UCS
Proposal to Encode Phonetic Symbols with Palatal Hook in the UCS
Date: 2003-5-30
Author: Peter Constable, SIL International
Address: 7500 W. Camp Wisdom Rd., Dallas, TX 75236 USA
Tel: +1 972 708 7485
Email: [email protected]

A. Administrative
1. Title: Proposal to Encode Phonetic Symbols with Palatal Hook in the UCS
2. Requester’s name: SIL International (contact: Peter Constable)
3. Requester type: Expert contribution
4. Submission date: 2003-05-30
5. Requester’s reference:
6a. Completion: This is a complete proposal
6b. More information to be provided? Only as required for clarification.

B. Technical - General
1a. New Script? Name? No
1b. Addition of characters to existing block? Name? Yes: Phonetic Extensions
2. Number of characters in proposal: 17
3. Proposed category: A
4. Proposed level of implementation and rationale: 1 (no combining marks or jamo)
5a. Character names included in proposal? Yes
5b. Character names in accordance with guidelines? Yes
5c. Character shapes reviewable? Yes
6a. Who will provide computerized font? SIL International
6b. Font currently available? Yes
6c. Font format? TrueType

Proposal to Encode Phonetic Symbols with Palatal Hook in the UCS, Page 1 of 12. Peter G. Constable, May 30, 2003, Rev: 11

7a. Are references (to other character sets, dictionaries, descriptive texts, etc.) provided? Yes
7b. Are published examples (such as samples from newspapers, magazines, or other sources) of use of proposed characters attached? Yes
8. Does the proposal address other aspects of character data processing? Yes, suggested character properties are included (see section E).

C. Technical - Justification
1. Has this proposal for addition of character(s) been submitted before? No
2a.
-
Phonetic Extensions Supplement Range: 1D80–1DBF
Phonetic Extensions Supplement Range: 1D80–1DBF This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 14.0 This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See https://www.unicode.org/errata/ for an up-to-date list of errata. See https://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See https://www.unicode.org/charts/PDF/Unicode-14.0/ for charts showing only the characters added in Unicode 14.0. See https://www.unicode.org/Public/14.0.0/charts/ for a complete archived file of character code charts for Unicode 14.0. Disclaimer These charts are provided as the online reference to the character contents of the Unicode Standard, Version 14.0 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 14.0, online at https://www.unicode.org/versions/Unicode14.0.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, #44, #45, and #50, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See https://www.unicode.org/ucd/ and https://www.unicode.org/reports/ A thorough understanding of the information contained in these additional sources is required for a successful implementation. -
Considerations in the Use of the Latin Script in Variant Internationalized Top-Level Domains
Considerations in the use of the Latin script in variant internationalized top-level domains Final report of the ICANN VIP Study Group for the Latin script Executive summary The study group examined all the characters in the Unicode Character Code Chart version 6.1.0 that are associated with the Latin script and valid under the IDNA2008 protocol. It identified several forms of “confusability” that might require careful consideration in the collation of a subset of the broader repertoire for local use. The resolution of such issues is, however, highly dependent on local orthographic conventions. These frequently treat the same characters in different manners. Strings that are confusingly similar in the context of one language may have no such connotations in another. Noting that the Latin script is used by a larger number of separate language communities than is any other single script, attempting to provide a comprehensive overview of the needs of all of them is an unrealistic endeavor. A summary attempt at doing so nonetheless would be culturally insensitive to communities that have yet to join the IDN discussion. The study group therefore finds no basis for the categorical treatment of any code point assigned to an element of the Latin script as being equivalent to any other such code point. Nor does it believe that any such basis exists beyond what is already incorporated in the IDNA2008 protocol. The ICANN TLD application process should not permit requests for multiple Latin strings under the premise that they are variants of each other. Careful scrutiny is required when evaluating proposed TLD labels for confusability but that does not make them variants in the focused sense of the VIP study. -
Those Obscure Accents
Those obscure accents ...
Karel Horák
Institute of Mathematics, Academy of Sciences, Praha
horakk (at) math dot cas dot cz

Abstract
»A special shape of a háček, similar to an apostrophe, is used in Czech and Slovak with ď, ľ, Ľ and ť characters. It could be derived from the apostrophe or comma, but it should be more humble, smaller, and, importantly, narrower. Generally, the symbol should draw less attention than the comma. This special form could also take a straight shape similar to acute; this usually occupies less space than an apostrophe-like form and it does not cause as many problems in kerning. Vertically, the symbol is most often placed towards the ascender line, but its position does not necessarily have to be constant (with ť, it is often necessary to place the accent higher than with the other characters). With capital Ľ, it is desirable that the accent exceeds the height of the character. This is mostly equivalent with justifying the upper edge of the accent to the ascender line.« [DIACRITICS, a project by typo.cz and designiq.cz]

An excursion into history with many examples of good, bad and ugly solutions.

Briefly from the history
The motto quoted in the abstract, which states in the condensed form the final lesson I learned during the long (never finished) way to understand typographic quality, would be sufficient.

It should be noticed that black letters (fraktur) were widely used in those times for typesetting. And for many years, types were often not created in the country but brought from abroad. Black letters were used in printing until the end of 18th cen-