Unicode Request for IPA Modifier-Letters (A), Pulmonic Background
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
The Origin of the Peculiarities of the Vietnamese Alphabet André-Georges Haudricourt
The origin of the peculiarities of the Vietnamese alphabet. André-Georges Haudricourt. To cite this version: André-Georges Haudricourt. The origin of the peculiarities of the Vietnamese alphabet. Mon-Khmer Studies, 2010, 39, pp.89-104. halshs-00918824v2. HAL Id: halshs-00918824, https://halshs.archives-ouvertes.fr/halshs-00918824v2. Submitted on 17 Dec 2013. HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. Published in Mon-Khmer Studies 39. 89–104 (2010). The origin of the peculiarities of the Vietnamese alphabet, by André-Georges Haudricourt. Translated by Alexis Michaud, LACITO-CNRS, France. Originally published as: L’origine des particularités de l’alphabet vietnamien, Dân Việt Nam 3:61-68, 1949. Translator’s foreword: André-Georges Haudricourt’s contribution to Southeast Asian studies is internationally acknowledged, witness the Haudricourt Festschrift (Suriya, Thomas and Suwilai 1985). However, many of Haudricourt’s works are not yet available to the English-reading public. A volume of the most important papers by André-Georges Haudricourt, translated by an international team of specialists, is currently in preparation. Its aim is to share with the English-speaking academic community Haudricourt’s seminal publications, many of which address issues in Southeast Asian languages, linguistics and social anthropology. -
Unicode Request for Cyrillic Modifier Letters Superscript Modifiers
Unicode request for Cyrillic modifier letters L2/21-107 Kirk Miller, [email protected] 2021 June 07 This is a request for spacing superscript and subscript Cyrillic characters. It has been favorably reviewed by Sebastian Kempgen (University of Bamberg) and others at the Commission for Computer Supported Processing of Medieval Slavonic Manuscripts and Early Printed Books. Cyrillic-based phonetic transcription uses superscript modifier letters in a manner analogous to the IPA. This convention is widespread, found in both academic publication and standard dictionaries. Transcription of pronunciations into Cyrillic is the norm for monolingual dictionaries, and Cyrillic rather than IPA is often found in linguistic descriptions as well, as seen in the illustrations below for Slavic dialectology, Yugur (Yellow Uyghur) and Evenki. The Great Russian Encyclopedia states that Cyrillic notation is more common in Russian studies than is IPA (‘Transkripcija’, Bol’šaja rossijskaja ènciplopedija, Russian Ministry of Culture, 2005–2019). Unicode currently encodes only three modifier Cyrillic letters: U+A69C ⟨ꚜ⟩ and U+A69D ⟨ꚝ⟩, intended for descriptions of Baltic languages in Latin script but ubiquitous for Slavic languages in Cyrillic script, and U+1D78 ⟨ᵸ⟩, used for nasalized vowels, for example in descriptions of Chechen. The requested spacing modifier letters cannot be substituted by the encoded combining diacritics because (a) some authors contrast them, and (b) they themselves need to be able to take combining diacritics, including diacritics that go under the modifier letter, as in ⟨ᶟ̭̈⟩BA . (See next section and e.g. Figure 18. ) In addition, some linguists make a distinction between spacing superscript letters, used for phonetic detail as in the IPA tradition, and spacing subscript letters, used to denote phonological concepts such as archiphonemes. -
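The three modifier letters named in the excerpt above can be inspected with Python's standard `unicodedata` module. This is only a minimal sketch: the helper name `is_modifier_letter` is hypothetical, and U+0308 (combining diaeresis) is chosen here purely as an illustrative combining diacritic, not one taken from the request.

```python
import unicodedata

# The three Cyrillic modifier letters the request says are already encoded.
modifier_letters = ["\uA69C", "\uA69D", "\u1D78"]

def is_modifier_letter(ch: str) -> bool:
    """True if the character has general category Lm (Letter, modifier)."""
    return unicodedata.category(ch) == "Lm"

for ch in modifier_letters:
    # Print code point, character name, and general category.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# A combining diacritic has a different category (Mn, nonspacing mark),
# which is one reason the two kinds of character are not interchangeable.
combining_diaeresis = "\u0308"  # illustrative choice, not from the request
print(unicodedata.category(combining_diaeresis))

# A spacing modifier letter can itself carry a combining mark:
# the sequence stays two separate code points.
stacked = "\u1D78" + combining_diaeresis
print(len(stacked))
```

Spacing modifier letters and combining marks fall into different general categories, so software that treats them by category (line breaking, segmentation, search) will handle them differently.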
The Unicode Cookbook for Linguists: Managing Writing Systems Using Orthography Profiles
Zurich Open Repository and Archive University of Zurich Main Library Strickhofstrasse 39 CH-8057 Zurich www.zora.uzh.ch Year: 2017 The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles Moran, Steven ; Cysouw, Michael DOI: https://doi.org/10.5281/zenodo.290662 Posted at the Zurich Open Repository and Archive, University of Zurich ZORA URL: https://doi.org/10.5167/uzh-135400 Monograph The following work is licensed under a Creative Commons: Attribution 4.0 International (CC BY 4.0) License. Originally published at: Moran, Steven; Cysouw, Michael (2017). The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles. CERN Data Centre: Zenodo. DOI: https://doi.org/10.5281/zenodo.290662 The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles. Steven Moran & Michael Cysouw. Preface: This text is meant as a practical guide for linguists, and programmers, who work with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together. The intersection of the Unicode Standard and the International Phonetic Alphabet is often not met without frustration by users. Nevertheless, the two standards have provided language researchers with a consistent computational architecture needed to process, publish and analyze data from many different languages. We bring to light common, but not always transparent, pitfalls that researchers face when working with Unicode and IPA. Our research uses quantitative methods to compare languages and uncover and clarify their phylogenetic relations. However, the majority of lexical data available from the world’s languages is in author- or document-specific orthographies. -
Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress
Assessment of Options for Handling Full Unicode Character Encodings in MARC21 A Study for the Library of Congress Part 1: New Scripts Jack Cain Senior Consultant Trylus Computing, Toronto 1 Purpose This assessment intends to study the issues and make recommendations on the possible expansion of the character set repertoire for bibliographic records in MARC21 format. 1.1 “Encoding Scheme” vs. “Repertoire” An encoding scheme contains codes by which characters are represented in computer memory. These codes are organized according to a certain methodology called an encoding scheme. The list of all characters so encoded is referred to as the “repertoire” of characters in the given encoding scheme. For example, ASCII is one encoding scheme, perhaps the one best known to the average non-technical person in North America. “A”, “B”, & “C” are three characters in the repertoire of this encoding scheme. These three characters are assigned encodings 41, 42 & 43 in ASCII (expressed here in hexadecimal). 1.2 MARC8 "MARC8" is the term commonly used to refer both to the encoding scheme and its repertoire as used in MARC records up to 1998. The ‘8’ refers to the fact that, unlike Unicode which is a multi-byte per character code set, the MARC8 encoding scheme is principally made up of multiple one-byte tables in which each character is encoded using a single 8-bit byte. (It also includes the EACC set which actually uses fixed length 3 bytes per character.) (For details on MARC8 and its specifications see: http://www.loc.gov/marc/.) MARC8 was introduced around 1968 and was initially limited to essentially Latin script only. -
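The distinction between repertoire and encoding scheme in the excerpt above can be made concrete with a short Python sketch. The variable and function names are illustrative only; the sketch uses ASCII, as the excerpt does, and does not attempt MARC8, for which the Python standard library has no codec.

```python
# The repertoire is the set of characters; the encoding scheme
# assigns each character in the repertoire a code.
repertoire = ["A", "B", "C"]

# Encode each character with ASCII and show its code in hexadecimal.
ascii_codes = {ch: ch.encode("ascii").hex() for ch in repertoire}
print(ascii_codes)

def in_ascii_repertoire(ch: str) -> bool:
    """True if the character can be encoded in ASCII at all."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

# A character outside the repertoire has no code in that scheme.
print(in_ascii_repertoire("A"), in_ascii_repertoire("é"))
```

The same character can receive a different code, or a different number of bytes, under another scheme, which is why the study treats the two notions separately.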
Rank of Weighted Digraphs with Blocks
Rank of weighted digraphs with blocks. Ranveer Singh, Swarup Kumar Panda, Naomi Shaked-Monderer, Abraham Berman. October 10, 2018. Abstract: Let G be a digraph and r(G) be its rank. Many interesting results on the rank of an undirected graph appear in the literature, but not much information about the rank of a digraph is available. In this article, we study the rank of a digraph using the ranks of its blocks. In particular, we define classes of digraphs, namely r2-digraphs and r0-digraphs, for which the rank can be exactly determined in terms of the ranks of subdigraphs of the blocks. Furthermore, the rank of directed trees, simple biblock graphs, and some simple block graphs are studied. Keywords: r2-digraphs, r0-digraphs, tree digraphs, block graphs, biblock graphs. AMS Subject Classifications: 15A15, 15A18, 05C50. 1 Introduction: A well-known problem, proposed in 1957 by Collatz and Sinogowitz, is to characterize graphs with positive nullity [19]. Nullity of graphs is applicable in various branches of science, in particular, quantum chemistry, Hückel molecular orbital theory [10], [13] and social network theory [14]. For more detail on the applications see [10]. Many significant results on the nullity of undirected graphs are available in the literature, see [4, 7, 11, 12, 15, 12, 2, 3, 20]. Recently in [9, 8, 6] nullity of undirected graphs was studied using cut-vertices and connected components. The results are used to calculate the nullity of line graphs of undirected graphs. Apart from the connected components, it turns out that the blocks are also interesting subgraphs of a graph, which can be utilized to know its determinant [17, 16]. -
31 Vowel Digraph Oo
Sort 31: Vowel Digraph oo
Objectives
• To identify spelling patterns of vowel digraph oo
• To read, sort, and write words with vowel digraph oo
Words
o˘o: brook, crook, foot, hood, hook, nook, soot, stood, wood, wool
oˉo = uˉ: fool, groom, hoop, noon, root, spool, spoon, stool, tool, troop
Oddball: could, should, would
Materials for Within Word Pattern
Big Book of Rhymes, “The Puppet Show,” page 55
Whiteboard Activities DVD-ROM, Sort 31
Teacher Resource CD-ROM, Sort 31 and Follow the Dragon Game
Student Book, pages 121–124
Words Their Way Library, The House That Stood on Booker Hill
Introduce/Model (Small Groups)
• Read a Rhyme: Read “The Puppet Show,” and emphasize words with the vowel digraph oo. (wood, book, took; soon, noon) Ask students to locate these words, and help them write the words in two columns, according to vowel sound. Help students hear the different pronunciations of the vowel digraph oo.
• Model: Use the whiteboard DVD or the CD word cards. Define in context any that may be unfamiliar to students. Demonstrate how to sort the words according to the sound of the digraph oo. Point out that would, could, and should do not contain the digraph oo, so they belong in the oddball category.
Extend the Sort
Alternative Sort: Brainstorming. Ask students to think of other words that contain oo. Write their responses on index cards. When students have completed brainstorming, ask them to identify and sort all the words they named according to the vowel sound of oo.
ELL English Language Learners: Explain that a nook is “a hidden place,” and that groom can have several meanings, including “a
-
RFC 5892: The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)
Internet Engineering Task Force (IETF) P. Faltstrom, Ed. Request for Comments: 5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721 The Unicode Code Points and Internationalized Domain Names for Applications (IDNA) Abstract This document specifies rules for deciding whether a code point, considered in isolation or in context, is a candidate for inclusion in an Internationalized Domain Name (IDN). It is part of the specification of Internationalizing Domain Names in Applications 2008 (IDNA2008). Status of This Memo This is an Internet Standards Track document. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5892. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. -
Unicode Alphabets for LaTeX
Unicode Alphabets for LaTeX: Specimen. Mikkel Eide Eriksen. March 11, 2020.
Contents: MUFI, SIL, TITUS, UNZ
MUFI
Using the font PalemonasMUFI(0) from http://mufi.info/.
Code Point | Glyph | MUFI Entity Name | Unicode Name
E262 | � | OEligogon | LATIN CAPITAL LIGATURE OE WITH OGONEK
E268 | � | Pdblac | LATIN CAPITAL LETTER P WITH DOUBLE ACUTE
E34E | � | Vvertline | LATIN CAPITAL LETTER V WITH VERTICAL LINE ABOVE
E662 | � | oeligogon | LATIN SMALL LIGATURE OE WITH OGONEK
E668 | � | pdblac | LATIN SMALL LETTER P WITH DOUBLE ACUTE
E74F | � | vvertline | LATIN SMALL LETTER V WITH VERTICAL LINE ABOVE
E8A1 | � | idblstrok | LATIN SMALL LETTER I WITH TWO STROKES
E8A2 | � | jdblstrok | LATIN SMALL LETTER J WITH TWO STROKES
E8A3 | � | autem | LATIN ABBREVIATION SIGN AUTEM
E8BB | � | vslashura | LATIN SMALL LETTER V WITH SHORT SLASH ABOVE RIGHT
E8BC | � | vslashuradbl | LATIN SMALL LETTER V WITH TWO SHORT SLASHES ABOVE RIGHT
E8C1 | � | thornrarmlig | LATIN SMALL LETTER THORN LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C2 | � | Hrarmlig | LATIN CAPITAL LETTER H LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C3 | � | hrarmlig | LATIN SMALL LETTER H LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C5 | � | krarmlig | LATIN SMALL LETTER K LIGATED WITH ARM OF LATIN SMALL LETTER R
E8C6 | UU | UUlig | LATIN CAPITAL LIGATURE UU
E8C7 | uu | uulig | LATIN SMALL LIGATURE UU
E8C8 | UE | UElig | LATIN CAPITAL LIGATURE UE
E8C9 | ue | uelig | LATIN SMALL LIGATURE UE
E8CE | � | xslashlradbl | LATIN SMALL LETTER X WITH TWO SHORT SLASHES BELOW RIGHT
E8D1 | æ̊ | aeligring | LATIN SMALL LETTER AE WITH RING ABOVE
E8D3 | ǽ̨ | aeligogonacute | LATIN SMALL LETTER AE WITH OGONEK AND ACUTE
-
Proposal for Generation Panel for Latin Script Label Generation Ruleset for the Root Zone
Generation Panel for Latin Script Label Generation Ruleset for the Root Zone. Proposal for Generation Panel for Latin Script Label Generation Ruleset for the Root Zone. Table of Contents: 1. General Information; 1.1 Use of Latin Script characters in domain names; 1.2 Target Script for the Proposed Generation Panel; 1.2.1 Diacritics; 1.3 Countries with significant user communities using Latin script; 2. Proposed Initial Composition of the Panel and Relationship with Past Work or Working Groups; 3. Work Plan; 3.1 Suggested Timeline with Significant Milestones; 3.2 Sources for funding travel and logistics; 3.3 Need for ICANN provided advisors; 4. References. 1. General Information: The Latin script or Roman script is a major writing system of the world today, and the most widely used in terms of number of languages and number of speakers, with circa 70% of the world’s readers and writers making use of this script (Wikipedia). Historically, it is derived from the Greek alphabet, as is the Cyrillic script. The Greek alphabet is in turn derived from the Phoenician alphabet which dates to the mid-11th century BC and is itself based on older scripts. This explains why Latin, Cyrillic and Greek share some letters, which may become relevant to the ruleset in the form of cross-script variants. The Latin alphabet itself originated in Italy in the 7th century BC. The original alphabet contained 21 upper case only letters: A, B, C, D, E, F, Z, H, I, K, L, M, N, O, P, Q, R, S, T, V and X. -
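The remark above about letters shared by Latin, Cyrillic and Greek can be illustrated with a small Python sketch. The choice of the capital letter A is an assumption made for illustration; the proposal excerpt does not list specific variants. Python's standard library has no script property, so the character name is used here as a rough stand-in for the script.

```python
import unicodedata

# Three capital letters that typically render alike but are distinct
# code points in different scripts (illustrative choice, not from the proposal).
lookalikes = {
    "Latin": "A",          # U+0041
    "Greek": "\u0391",     # U+0391
    "Cyrillic": "\u0410",  # U+0410
}

for script, ch in lookalikes.items():
    # The first word of the character name identifies the script.
    print(script, f"U+{ord(ch):04X}", unicodedata.name(ch))

# The characters compare unequal even though they look the same,
# which is why a label ruleset may need to treat them as variants.
all_distinct = len(set(lookalikes.values())) == len(lookalikes)
print(all_distinct)
```

A ruleset that ignored such pairs would allow two visually identical labels to coexist as different domain names.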
Locutour Guide to Letters, Sounds, and Symbols
LOCUTOUR® Guide to Letters, Sounds, and Symbols
LocuTour Spelling | Label and Classification | Place of Articulation | IPA | Spelled As | Examples
Consonants
p | bilabial plosive voiceless | lips | /p/ | p, pp | pit, puppy
b | bilabial plosive voiced | lips | /b/ | b, bb | bat, ebb
t | lingua-alveolar plosive voiceless | tongue tip + upper gum ridge | /t/ | t, ed, gth, th, tt, tw | tan, tipped, tight, thyme, attic, two
d | lingua-alveolar plosive voiced | tongue tip + upper gum ridge | /d/ | d, dd, ed | dad, ladder, bagged
k | lingua-velar plosive voiceless | back of tongue and soft palate | /k/ | c, cc, ch, ck, que, k | cab, occur, school, duck, mosque, kit
g | lingua-velar plosive voiced | back of tongue and soft palate | /g/ | g, gg, gh, gu, gue | hug, bagged, Ghana, guy, morgue
f | labiodental fricative voiceless | lower lip + upper teeth | /f/ | f, ff, gh, ph | feet, fluff, enough, phone
v | labiodental fricative voiced | lower lip + upper teeth | /v/ | v, vv, f, ph | vet, savvy, of, Stephen
th | linguadental fricative voiceless | tongue + teeth | /θ/ | th | thin
th | linguadental fricative voiced | tongue + teeth | /ð/ | th, the | them, bathe
s | lingua-alveolar fricative voiceless | tongue tip + upper gum ridge or tongue tip + lower gum ridge | /s/ | s, ss, sc, ce, ci, cy, ps, z | sit, hiss, scenic, ace, city, cycle, psychology, pizza
z | lingua-alveolar fricative voiced | tongue tip + upper gum ridge or tongue tip + lower gum ridge | /z/ | z, zz, s, ss, x, cz | Zen, buzz, is, scissors, Xerox, czar
sh | linguapalatal fricative voiceless | tongue blade and hard palate | /ʃ/ | sh, ce, ch, ci, sch, si, ss, su, | sheep, ocean, chef, glacier, kirsch
-
Proposal for Superscript Diacritics for Prenasalization, Preglottalization, and Preaspiration
Proposal for superscript diacritics for prenasalization, preglottalization, and preaspiration Patricia Keating Department of Linguistics, UCLA [email protected] Daniel Wymark Department of Linguistics, UCLA [email protected] Ryan Sharif Department of Linguistics, UCLA [email protected] ABSTRACT The IPA currently does not specify how to represent prenasalization, preglottalization, or preaspiration. We first review some current transcription practices, and phonetic and phonological literature bearing on the unitary status of prenasalized, preglottalized and preaspirated segments. We then propose that the IPA adopt superscript diacritics placed before a base symbol for these three phenomena. We also suggest how the current IPA Diacritics chart can be modified to allow these diacritics to be fit within the chart. 1 Introduction The IPA provides a variety of diacritics which can be added to base symbols in various positions: above ([a͂ ]), below ([n̥ ]), through ([ɫ]), superscript after ([tʰ]), or centered after ([a˞]). Currently, IPA diacritics which modify base symbols are never shown preceding them; the only diacritics which precede are the stress marks, i.e. primary ([ˈ]) and secondary ([ˌ]) stress. Yet, in practice, superscript diacritics are often used preceding base symbols; specifically, they are often used to notate prenasalization, preglottalization and preaspiration. These terms are very common in phonetics and phonology, each having thousands of Google hits. However, none of these phonetic phenomena is included on the IPA chart or mentioned in Part I of the Handbook of the International Phonetic Association (IPA 1999), and thus there is currently no guidance given to users about transcribing them. In this note we review these phenomena, and propose that the Association’s alphabet include superscript diacritics preceding the base symbol for prenasalization, preglottalization and preaspiration, in accord with one common way of transcribing them. -
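In plain text, a superscript placed before or after a base symbol is simply a matter of code point order, as the following Python sketch shows. The specific characters used (U+02B0, U+207F, U+02C0) and the example base letters are assumptions chosen for illustration; the excerpt above does not say which code points the proposal has in mind.

```python
import unicodedata

# Spacing superscript characters (assumed choices for illustration).
SUP_H = "\u02B0"        # modifier letter small h
SUP_N = "\u207F"        # superscript latin small letter n
SUP_GLOTTAL = "\u02C0"  # modifier letter glottal stop

aspirated = "t" + SUP_H             # superscript after the base, as in [tʰ]
preaspirated = SUP_H + "t"          # same characters, superscript first
prenasalized = SUP_N + "d"
preglottalized = SUP_GLOTTAL + "t"

for label, s in [("aspirated", aspirated), ("preaspirated", preaspirated),
                 ("prenasalized", prenasalized), ("preglottalized", preglottalized)]:
    # Show each transcription with its code point sequence.
    print(label, s, [f"U+{ord(c):04X}" for c in s])

# These are spacing letters (category Lm), not combining marks, so
# canonical normalization leaves their order untouched.
unchanged = unicodedata.normalize("NFC", preaspirated) == preaspirated
print(unchanged)
```

Because the superscripts are independent spacing characters, nothing in the text itself records whether one belongs to the preceding or the following base symbol; that reading comes from the transcription convention.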
How Do Speakers of a Language with a Transparent Orthographic System
Languages, Article. How Do Speakers of a Language with a Transparent Orthographic System Perceive the L2 Vowels of a Language with an Opaque Orthographic System? An Analysis through a Battery of Behavioral Tests. Georgios P. Georgiou, Department of Languages and Literature, University of Nicosia, Nicosia 2417, Cyprus; [email protected] Citation: Georgiou, Georgios P. 2021. How Do Speakers of a Language with a Transparent Orthographic System Perceive the L2 Vowels of a Language with an Opaque Orthographic System? An Analysis through a Battery of Behavioral Tests. Abstract: Background: The present study aims to investigate the effect of the first language (L1) orthography on the perception of the second language (L2) vowel contrasts and whether orthographic effects occur at the sublexical level. Methods: Fourteen adult Greek learners of English participated in two AXB discrimination tests: one auditory and one orthography test. In the auditory test, participants listened to triads of auditory stimuli that targeted specific English vowel contrasts embedded in nonsense words and were asked to decide if the middle vowel was the same as the first or the third vowel by clicking on the corresponding labels. The orthography test followed the same procedure as the auditory test, but instead, the two labels contained grapheme representations of the target vowel contrasts. Results: All but one vowel contrast could be more accurately discriminated in the auditory than in the orthography test. The use of nonsense words in the elicitation task eradicated the possibility of a lexical effect of orthography on auditory processing, leaving space for the interpretation of this effect on a sublexical basis, primarily prelexical and secondarily postlexical. Conclusions: L2 auditory processing is subject to L1 orthography influence. Speakers of languages with transparent orthographies such as Greek may rely on the grapheme–phoneme correspondence to decode orthographic representations of sounds coming from languages with an opaque orthographic