Unicode Security Considerations (TR#36)
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Unicode Request for Cyrillic Modifier Letters Superscript Modifiers
Unicode request for Cyrillic modifier letters L2/21-107 Kirk Miller, [email protected] 2021 June 07 This is a request for spacing superscript and subscript Cyrillic characters. It has been favorably reviewed by Sebastian Kempgen (University of Bamberg) and others at the Commission for Computer Supported Processing of Medieval Slavonic Manuscripts and Early Printed Books. Cyrillic-based phonetic transcription uses superscript modifier letters in a manner analogous to the IPA. This convention is widespread, found in both academic publication and standard dictionaries. Transcription of pronunciations into Cyrillic is the norm for monolingual dictionaries, and Cyrillic rather than IPA is often found in linguistic descriptions as well, as seen in the illustrations below for Slavic dialectology, Yugur (Yellow Uyghur) and Evenki. The Great Russian Encyclopedia states that Cyrillic notation is more common in Russian studies than is IPA (‘Transkripcija’, Bol’šaja rossijskaja ènciplopedija, Russian Ministry of Culture, 2005–2019). Unicode currently encodes only three modifier Cyrillic letters: U+A69C ⟨ꚜ⟩ and U+A69D ⟨ꚝ⟩, intended for descriptions of Baltic languages in Latin script but ubiquitous for Slavic languages in Cyrillic script, and U+1D78 ⟨ᵸ⟩, used for nasalized vowels, for example in descriptions of Chechen. The requested spacing modifier letters cannot be substituted by the encoded combining diacritics because (a) some authors contrast them, and (b) they themselves need to be able to take combining diacritics, including diacritics that go under the modifier letter, as in ⟨ᶟ̭̈⟩BA . (See next section and e.g. Figure 18. ) In addition, some linguists make a distinction between spacing superscript letters, used for phonetic detail as in the IPA tradition, and spacing subscript letters, used to denote phonological concepts such as archiphonemes. -
+1. Introduction 2. Cyrillic Letter Rumanian Yn
MAIN.HTM 10/13/2006 06:42 PM +1. INTRODUCTION These are comments to "Additional Cyrillic Characters In Unicode: A Preliminary Proposal". I'm examining each section of that document, as well as adding some extra notes (marked "+" in titles). Below I use standard Russian Cyrillic characters; please be sure that you have appropriate fonts installed. If everything is OK, the following two lines must look similarly (encoding CP-1251): (sample Cyrillic letters) АабВЕеЗКкМНОопРрСсТуХхЧЬ (Latin letters and digits) Aa6BEe3KkMHOonPpCcTyXx4b 2. CYRILLIC LETTER RUMANIAN YN In the late Cyrillic semi-uncial Rumanian/Moldavian editions, the shape of YN was very similar to inverted PSI, see the following sample from the Ноул Тестамент (New Testament) of 1818, Neamt/Нямец, folio 542 v.: file:///Users/everson/Documents/Eudora%20Folder/Attachments%20Folder/Addons/MAIN.HTM Page 1 of 28 MAIN.HTM 10/13/2006 06:42 PM Here you can see YN and PSI in both upper- and lowercase forms. Note that the upper part of YN is not a sharp arrowhead, but something horizontally cut even with kind of serif (in the uppercase form). Thus, the shape of the letter in modern-style fonts (like Times or Arial) may look somewhat similar to Cyrillic "Л"/"л" with the central vertical stem looking like in lowercase "ф" drawn from the middle of upper horizontal line downwards, with regular serif at the bottom (horizontal, not slanted): Compare also with the proposed shape of PSI (Section 36). 3. CYRILLIC LETTER IOTIFIED A file:///Users/everson/Documents/Eudora%20Folder/Attachments%20Folder/Addons/MAIN.HTM Page 2 of 28 MAIN.HTM 10/13/2006 06:42 PM I support the idea that "IA" must be separated from "Я". -
Old Cyrillic in Unicode*
Old Cyrillic in Unicode* Ivan A Derzhanski Institute for Mathematics and Computer Science, Bulgarian Academy of Sciences [email protected] The current version of the Unicode Standard acknowledges the existence of a pre- modern version of the Cyrillic script, but its support thereof is limited to assigning code points to several obsolete letters. Meanwhile mediæval Cyrillic manuscripts and some early printed books feature a plethora of letter shapes, ligatures, diacritic and punctuation marks that want proper representation. (In addition, contemporary editions of mediæval texts employ a variety of annotation signs.) As generally with scripts that predate printing, an obvious problem is the abundance of functional, chronological, regional and decorative variant shapes, the precise details of whose distribution are often unknown. The present contents of the block will need to be interpreted with Old Cyrillic in mind, and decisions to be made as to which remaining characters should be implemented via Unicode’s mechanism of variation selection, as ligatures in the typeface, or as code points in the Private space or the standard Cyrillic block. I discuss the initial stage of this work. The Unicode Standard (Unicode 4.0.1) makes a controversial statement: The historical form of the Cyrillic alphabet is treated as a font style variation of modern Cyrillic because the historical forms are relatively close to the modern appearance, and because some of them are still in modern use in languages other than Russian (for example, U+0406 “I” CYRILLIC CAPITAL LETTER I is used in modern Ukrainian and Byelorussian). Some of the letters in this range were used in modern typefaces in Russian and Bulgarian. -
Learning Cyrillic
LEARNING CYRILLIC Question: If there is no equivalent letter in the Cyrillic alphabet for the Roman "J" or "H" how do you transcribe good German names like Johannes, Heinrich, Wilhelm, etc. I heard one suggestion that Johann was written as Ivan and that the "h" was replaced with a "g". Can you give me a little insight into what you have found? In researching would I be looking for the name Ivan rather than Johann? One must always think phonetic, that is, think how a name is pronounced in German, and how does the Russian Cyrillic script produce that sound? JOHANNES. The Cyrillic spelling begins with the letter “I – eye”, but pronounced “eee”, so we have phonetically “eee-o-hann” which sounds like “Yo-hann”. You can see it better in typeface – Иоганн , which letter for letter reads as “I-o-h-a-n-n”. The modern Typeface script is radically different than the old hand-written Cyrillic script. Use the guide which I sent to you. Ivan is the Russian equivalent of Johann, and it pops up occasionally in Church records. JOSEPH / JOSEF. Listen to the way the name is pronounced in German – “yo-sef”, also “yo-sif”. That “yo” sound is produced by the Cyrillic script letters “I” and “o”. Again you can see it in the typeface. Иосеф and also Иосиф. And sometimes Joseph appears as , transliterated as O-s-i-p. Similar to all languages and scripts, Cyrillic spellings are not consistent. The “a” ending indicates a male name. JAKOB. There is no “Jay” sound in the German language. -
5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721
Internet Engineering Task Force (IETF) P. Faltstrom, Ed. Request for Comments: 5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721 The Unicode Code Points and Internationalized Domain Names for Applications (IDNA) Abstract This document specifies rules for deciding whether a code point, considered in isolation or in context, is a candidate for inclusion in an Internationalized Domain Name (IDN). It is part of the specification of Internationalizing Domain Names in Applications 2008 (IDNA2008). Status of This Memo This is an Internet Standards Track document. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5892. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. -
Kyrillische Schrift Für Den Computer
Hanna-Chris Gast Kyrillische Schrift für den Computer Benennung der Buchstaben, Vergleich der Transkriptionen in Bibliotheken und Standesämtern, Auflistung der Unicodes sowie Tastaturbelegung für Windows XP Inhalt Seite Vorwort ................................................................................................................................................ 2 1 Kyrillische Schriftzeichen mit Benennung................................................................................... 3 1.1 Die Buchstaben im Russischen mit Schreibschrift und Aussprache.................................. 3 1.2 Kyrillische Schriftzeichen anderer slawischer Sprachen.................................................... 9 1.3 Veraltete kyrillische Schriftzeichen .................................................................................... 10 1.4 Die gebräuchlichen Sonderzeichen ..................................................................................... 11 2 Transliterationen und Transkriptionen (Umschriften) .......................................................... 13 2.1 Begriffe zum Thema Transkription/Transliteration/Umschrift ...................................... 13 2.2 Normen und Vorschriften für Bibliotheken und Standesämter....................................... 15 2.3 Tabellarische Übersicht der Umschriften aus dem Russischen ....................................... 21 2.4 Transliterationen veralteter kyrillischer Buchstaben ....................................................... 25 2.5 Transliterationen bei anderen slawischen -
Seize the Ъ: Linguistic and Social Change in Russian Orthographic Reform Eugenia Sokolskaya
Seize the Ъ: Linguistic and Social Change in Russian Orthographic Reform Eugenia Sokolskaya It would be convenient for linguists and language students if the written form of any language were a neutral and direct representation of the sounds emitted during speech. Instead, writing systems tend to lag behind linguistic change, retaining old spellings or morphological features instead of faithfully representing a language's phonetics. For better or worse, sometimes the writing can even cause a feedback loop, causing speakers to hypercorrect in imitation of an archaic spelling. The written and spoken forms of a language exist in a complex and fluid relationship, influenced to a large extent by history, accident, misconception, and even politics.1 While in many language communities these two forms – if writing exists – are allowed to develop and influence each other in relative freedom, with some informal commentary, in some cases a willful political leader or group may step in to attempt to intentionally reform the writing system. Such reforms are a risky venture, liable to anger proponents of historic accuracy and adherence to tradition, as well as to render most if not all of the population temporarily illiterate. With such high stakes, it is no wonder that orthographic reforms are not often attempted in the course of a language's history. The Russian language has undergone two sharply defined orthographic reforms, formulated as official government policy. The first, which we will from here on call the Petrine reform, was initiated in 1710 by Peter the Great. It defined a print alphabet for secular use, distancing the writing from the Church with its South Slavic lithurgical language. -
RUSSIAN CYRILLIC ALPHABET Sunday, 08 April 2012 20:15
RUSSIAN CYRILLIC ALPHABET Sunday, 08 April 2012 20:15 {joomplu:1215 left} The CYRILLIC ALPHABET is the official alphabet of the Russian Federation. When coming to Russia, one might be very intimidated by the strange letters and words that are written everywhere. There is no need to worry anymore. We would like to educate everyone on the Cyrillic alphabet, and how to read it, understand it, and love it. So that by the end of this article you will be able to read and understand the tricky Russian alphabet hands down. =) The Cyrillic alphabet was first developed in the 10th century AD in the area known as Bulgaria today. The alphabet has gone through a lot of changes over its history, and looks somewhat similar in some aspects today, as it did so many years ago. Cyrillic is the official alphabet of several Slavic countries such as: Russia, Ukraine, Belarus, and Serbia. But it is also used in several other post-Soviet countries. The alphabet is derived from the ancient Greek script. The usage of Cyrillic nowadays is that it serves as one of the three official alphabets in the European Union. In the Russian alphabet, there are 33 letters and 43 sounds (6 vowels and 37 consonants) as opposed to the 26 letters of the English alphabet (Latin alphabet) with the total of 52 sounds. Both alphabets contain similar letters such as A, E, K, M, O, and T, which are pronounced not exactly the same but quite similarly: The Russian T, K, and M are the equivalents of the respective sounds in English. -
1 Symbols (2286)
1 Symbols (2286) USV Symbol Macro(s) Description 0009 \textHT <control> 000A \textLF <control> 000D \textCR <control> 0022 ” \textquotedbl QUOTATION MARK 0023 # \texthash NUMBER SIGN \textnumbersign 0024 $ \textdollar DOLLAR SIGN 0025 % \textpercent PERCENT SIGN 0026 & \textampersand AMPERSAND 0027 ’ \textquotesingle APOSTROPHE 0028 ( \textparenleft LEFT PARENTHESIS 0029 ) \textparenright RIGHT PARENTHESIS 002A * \textasteriskcentered ASTERISK 002B + \textMVPlus PLUS SIGN 002C , \textMVComma COMMA 002D - \textMVMinus HYPHEN-MINUS 002E . \textMVPeriod FULL STOP 002F / \textMVDivision SOLIDUS 0030 0 \textMVZero DIGIT ZERO 0031 1 \textMVOne DIGIT ONE 0032 2 \textMVTwo DIGIT TWO 0033 3 \textMVThree DIGIT THREE 0034 4 \textMVFour DIGIT FOUR 0035 5 \textMVFive DIGIT FIVE 0036 6 \textMVSix DIGIT SIX 0037 7 \textMVSeven DIGIT SEVEN 0038 8 \textMVEight DIGIT EIGHT 0039 9 \textMVNine DIGIT NINE 003C < \textless LESS-THAN SIGN 003D = \textequals EQUALS SIGN 003E > \textgreater GREATER-THAN SIGN 0040 @ \textMVAt COMMERCIAL AT 005C \ \textbackslash REVERSE SOLIDUS 005E ^ \textasciicircum CIRCUMFLEX ACCENT 005F _ \textunderscore LOW LINE 0060 ‘ \textasciigrave GRAVE ACCENT 0067 g \textg LATIN SMALL LETTER G 007B { \textbraceleft LEFT CURLY BRACKET 007C | \textbar VERTICAL LINE 007D } \textbraceright RIGHT CURLY BRACKET 007E ~ \textasciitilde TILDE 00A0 \nobreakspace NO-BREAK SPACE 00A1 ¡ \textexclamdown INVERTED EXCLAMATION MARK 00A2 ¢ \textcent CENT SIGN 00A3 £ \textsterling POUND SIGN 00A4 ¤ \textcurrency CURRENCY SIGN 00A5 ¥ \textyen YEN SIGN 00A6 -
Psftx-Debian-10.5/Fullcyrasia-Terminusboldvga14.Psf Linux Console Font Codechart
psftx-debian-10.5/FullCyrAsia-TerminusBoldVGA14.psf Linux console font codechart Glyphs 0x000 to 0x0FF 0 1 2 3 4 5 6 7 8 9 A B C D E F 0x00_ 0x01_ 0x02_ 0x03_ 0x04_ 0x05_ 0x06_ 0x07_ 0x08_ 0x09_ 0x0A_ 0x0B_ 0x0C_ 0x0D_ 0x0E_ 0x0F_ Page 1 Glyphs 0x100 to 0x1FF 0 1 2 3 4 5 6 7 8 9 A B C D E F 0x10_ 0x11_ 0x12_ 0x13_ 0x14_ 0x15_ 0x16_ 0x17_ 0x18_ 0x19_ 0x1A_ 0x1B_ 0x1C_ 0x1D_ 0x1E_ 0x1F_ Page 2 Font information 0x015 U+00A7 SECTION SIGN Filename: psftx-debian-10.5/FullCyrAsia-Termi 0x016 U+0449 CYRILLIC SMALL LETTER nusBoldVGA14.psf SHCHA PSF version: 1 0x017 U+044B CYRILLIC SMALL LETTER Glyph size: 8 × 14 pixels YERU Glyph count: 512 0x018 U+2191 UPWARDS ARROW Unicode font: Yes (mapping table present) 0x019 U+2193 DOWNWARDS ARROW Unicode mappings 0x01A U+044E CYRILLIC SMALL LETTER 0x000 U+25C0 BLACK LEFT-POINTING YU TRIANGLE 0x01B U+0492 CYRILLIC CAPITAL LETTER 0x001 U+25B6 BLACK RIGHT-POINTING GHE WITH STROKE TRIANGLE 0x01C U+0493 CYRILLIC SMALL LETTER 0x002 U+00A9 COPYRIGHT SIGN GHE WITH STROKE 0x003 U+00AE REGISTERED SIGN 0x01D U+0496 CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER 0x004 U+2666 BLACK DIAMOND SUIT 0x01E U+0497 CYRILLIC SMALL LETTER 0x005 U+0414 CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER DE 0x01F U+049A CYRILLIC CAPITAL LETTER 0x006 U+0416 CYRILLIC CAPITAL LETTER KA WITH DESCENDER ZHE 0x020 U+0020 SPACE 0x007 U+041C CYRILLIC CAPITAL LETTER EM 0x021 U+0021 EXCLAMATION MARK 0x008 U+0422 CYRILLIC CAPITAL LETTER 0x022 U+0022 QUOTATION MARK TE 0x023 U+0023 NUMBER SIGN 0x009 U+0424 CYRILLIC CAPITAL LETTER EF 0x024 U+0024 DOLLAR SIGN -
Teaching and Training, Including Techniques and Dr Gary Massey Zurich University of Applied Sciences, Switzerland Technology, Testing and Assessment
ISSN 2520-2073 (print) ISSN 2521-442X (online) TRAINING, LANGUAGE AND CULTURE ‘Clear language engenders clear thought, and clear thought is the most important benefit of education’ ‒ Richard Mitchell Vol. 5 Issue 1 2021 Issue doi: 10.22363/2521-442X-2021-5-1 The quarterly journal published by Peoples’ Friendship University of Russia (RUDN University) ISSN 2520-2073 (print) AIMS AND SCOPE TRAINING, LANGUAGE AND CULTURE ISSN 2521-442X (online) Training, Language and Culture (TLC) is a peer-reviewed journal that aims to promote and disseminate research spanning the spectrum of language and linguistics, education and culture studies with a special focus on professional communication and professional discourse. Editorial Board of A quarterly journal published by RUDN University Training, Language and Culture invites research-based articles, reviews and editorials covering issues of relevance for the scientific and professional communities. EDITORIAL BOARD Dr Elena N. Malyuga Peoples’ Friendship University of Russia (RUDN University), Moscow, Russian Federation FOCUS AREAS Barry Tomalin Glasgow Caledonian University London, London, UK Training, Language and Culture covers the following areas of scholarly interest: theoretical and practical perspectives in language and linguistics; Dr Michael McCarthy University of Nottingham, Nottingham, UK culture studies; interpersonal and intercultural professional communication; language and culture teaching and training, including techniques and Dr Gary Massey Zurich University of Applied Sciences, Switzerland technology, testing and assessment. Dr Robert O’Dowd University of León, León, Spain Dr Elsa Huertas Barros University of Westminster, London, UK LICENSING Dr Olga V. Aleksandrova Lomonosov Moscow State University, Moscow, Russian Federation All articles and book reviews published in Training, Language and Culture are licensed under a Creative Commons Attribution 4.0 International Li- Dr Lilia K. -
ISO/IEC International Standard 10646-1
JTC1/SC2/WG2 N3381 ISO/IEC 10646:2003/Amd.4:2008 (E) Information technology — Universal Multiple-Octet Coded Character Set (UCS) — AMENDMENT 4: Cham, Game Tiles, and other characters such as ISO/IEC 8824 and ISO/IEC 8825, the concept of Page 1, Clause 1 Scope implementation level may still be referenced as „Implementa- tion level 3‟. See annex N. In the note, update the Unicode Standard version from 5.0 to 5.1. Page 12, Sub-clause 16.1 Purpose and con- text of identification Page 1, Sub-clause 2.2 Conformance of in- formation interchange In first paragraph, remove „, the implementation level,‟. In second paragraph, remove „, and to an identified In second paragraph, remove „with an implementation implementation level chosen from clause 14‟. level‟. In fifth paragraph, remove „, the adopted implementa- Page 12, Sub-clause 16.2 Identification of tion level‟. UCS coded representation form with imple- mentation level Page 1, Sub-clause 2.3 Conformance of de- vices Rename sub-clause „Identification of UCS coded repre- sentation form‟. In second paragraph (after the note), remove „the adopted implementation level,‟. In first paragraph, remove „and an implementation level (see clause 14)‟. In fourth and fifth paragraph (b and c statements), re- move „and implementation level‟. Replace the 6-item list by the following 2-item list and note: Page 2, Clause 3 Normative references ESC 02/05 02/15 04/05 Update the reference to the Unicode Bidirectional Algo- UCS-2 rithm and the Unicode Normalization Forms as follows: ESC 02/05 02/15 04/06 Unicode Standard Annex, UAX#9, The Unicode Bidi- rectional Algorithm, Version 5.1.0, March 2008.