Automatic Transliteration of Proper Names from Somali to English

Research Article

Ahmed Muktar Omar*, Jian Qu and Sumeth Yuenyong
School of Information Technology, Shinawatra University, 99 Moo 10, Bangtoey, Samkhok, Pathum Thani 12160, Thailand

*Correspondence: [email protected]
DOI 10.14456/tijsat.2016.26
Thammasat International Journal of Science and Technology, Vol.21, No.4, October-December 2016

Abstract

Transliteration of proper names is the process of converting words from a source natural language (such as Somali) to a target natural language (such as English) while preserving pronunciation. Proper names and technical words are challenging in bilingual translation systems and in Cross-Language Information Retrieval (CLIR) applications, due to their absence from most dictionaries. In this paper, we study automatic transliteration from Somali to English, which is an under-studied problem. Our Somali-English transliteration system uses transliteration rules based on the orthographic mapping of the source language characters to the characters of the target language. We also propose an alignment method that maps the Somali characters when there is no direct matching character, to get an accurate transliteration in English. Our novel approach particularly enhances Somali-English transliteration.

Keywords: Somali-English; Somali Transliteration Table; Grapheme-Based

1. Introduction

Somali is a Cushitic language which belongs to the family of Afro-Asiatic (or Hamito-Semitic) languages. The Somali language is similar to Semitic languages such as Arabic and Hebrew. It is the mother tongue of ethnic Somalis in Greater Somalia and is by far the most well-documented of all Cushitic languages [1].

Somali is the official language of Somalia and Djibouti and a working language in the Somali regions of Ethiopia and Kenya. Somali uses different writing systems, and the Latin alphabet has been the official writing system in the Federal Republic of Somalia and Djibouti since 1972 [2]. It mostly uses the Roman alphabet except for “p”, “v” and “z”, without diacritic signs or special characters, although the glottal stop “ ‘ ” stands for the Arabic hamza. Somali also has three digraph consonants (kh (خ), sh (ش), dh (ط or ظ)) which are based on similar Arabic sounds. Somali orthography corresponds mostly to the Roman alphabet, except where some characters are modified for use in Somali: the letters c and x are designed to represent the voiced and voiceless pharyngeal fricatives, comparable to (ħ = ح) and (ʕ = ع) [1]. Somali long vowels are usually written by doubling the vowel itself.

The purpose of general transliteration is not to introduce new sounds that the target language does not provide. Rather, its purpose is to substitute the original letter with the nearest letter in the target language. The concept of long and short vowels exists in Somali, and long vowels are usually indicated by repeating the vowel itself, as in “aa”, “ee”, “ii”, “oo” and “uu”.

For example, the Somali word Soomaaliya is usually transliterated to English as Somalia by shortening the long vowels. Figure 1 shows a basic word transliteration.

Somali:  S OO M AA L IY A
English: S O M A L I A

Figure 1. Basic Transliteration.

In this paper we propose a Somali-English transliteration system that uses transliteration rules based on the orthographic mapping of the source language characters to the characters of the target language. We also propose an alignment method that maps the Somali characters when there is no direct matching character, to get an accurate transliteration in English.

The structure of the paper is as follows. In Section 2, we describe previous studies of machine transliteration. In Section 3, we describe our character mapping method for Somali-English transliteration and discuss our transliteration rules from Somali proper names to English. In Section 4, we detail our experiments, evaluation metrics and the results we obtained, and Section 5 concludes the paper.

2. Related Work

Transliteration refers to an orthographical transformation or phonetic change across two languages with different scripts. Many different generative transliteration approaches have been studied in the literature, each of which brings out various processes in different languages. These methods vary by the direction of transliteration, the writing systems of the languages involved, or the intended applications. Classification of these works is not straightforward.

Earlier work on machine transliteration has followed either grapheme-based or phoneme-based approaches. Lee and Choi proposed a source channel model (SCM), a grapheme-based approach, for English-Korean transliteration [3]. They used a direct orthographical mapping from source graphemes to target graphemes. Knight and Graehl proposed Japanese to English back-transliteration using the similarity of SCM [4]. Wan and Verspoor modeled a technique to transliterate proper names from English to Chinese using a phonetic procedure [5]. They proposed an algorithm for mapping from English characters to Chinese characters based on heuristic relationships between English spelling and pronunciation, and stable relationships between English phonemes and Chinese characters.

Kang and Kim explored forward transliteration and back-transliteration for English-Korean using a direct and a pivot method, and then used chunks of phonemes to perform the transliteration and back-transliteration [6]. Kang and Choi also studied English-Korean back-transliteration using decision-tree learning [7]. The English-Korean word alignment procedure they used is similar to that of Lee and Choi [3].

Oh and Choi also studied a model for English-Korean transliteration using pronunciation and contextual rules [8]. Their method was composed of two phases: alignment and transliteration. In the first phase, they took an English pronunciation unit (EPU) from a pronunciation phrasebook and aligned it to Korean phonemes to find the possible correspondence between the EPU and the phonemes. Virga and Khudanpur presented English-Chinese transliteration using a phonetic representation of English names into Chinese to support Cross-Lingual Speech and Text Processing Applications [9]. AbdulJaleel and Larkey proposed a generative statistical transliteration model for English-Arabic transliteration using n-gram methods [10]. The n-gram model generates strings of Arabic characters from a string of English characters. Malik proposed a rule-based Punjabi machine transliteration system that transliterates a word between two scripts of Punjabi [11].

Grapheme-based transliteration treats transliteration as an orthographic process rather than a phonetic process and maps groups of graphemes/characters in the source language word directly to groups of graphemes/characters in the target language word [12]. This approach is also known as the spelling-based or direct method, as it directly transforms the source language graphemes into the target language graphemes without any phonetic knowledge of the source and target languages. By contrast, phoneme-based methods require several steps in the transliteration process, whereas most grapheme-based methods depend directly on the information that is attainable from the characters of the words.

Forward transliteration is transliterating a word as it is written in the source language, such as Somali, to a foreign language, such as English. For example, forward transliteration of the Somali name “Ceelmacaan” to English is “Elma’an”. Backward transliteration, or back-transliteration, is transliterating a word from its transliterated version back to the language of origin. For example, back-transliteration of “Elma’an” from English to Somali is “Ceelmacaan”. This example is shown in Figure 2.

Somali:  C EE L - M A C AA N
English: E L - M A ‘ A N

Figure 2. Forward Transliteration.

Most transliteration methods have been proposed between English and other common languages such as Arabic, Chinese, or Japanese. Somali, not being a common language, is under-studied for both transliteration systems and cross-language information retrieval applications.

3. Mapping and Transliteration rules

To align Somali/English characters, we use a direct orthographic mapping between the Somali and English characters; each Somali character is aligned with its orthographic equivalent in English to find the most probable letters.

We start with the alignment of the identical letters; in most cases, Somali words are longer than their corresponding English transliterated words. The mapping type is either one-to-one letter or many-to-one letter, to avoid null mapping.

For example, as shown in Figure 3, the Somali word Ceel-cadde is usually transliterated into English as El-Adde.

Somali:  C E E L - C A DD E
English: E L - A DD E

Figure 3. Missing equivalent letters.

The drawback of the above direct mapping is the absence of some Somali letters and long vowels in English. This is the problem addressed by our method.

3.1 Somali Transliteration Table and its Problems

As can be seen from Table 1, Somali and English both use Roman alphabets, though Somali has 24 letters, 19 consonant

Table 1. Somali Transliteration Table.

Consonant mapping        Vowel mapping
Somali   English         Somali   English
‘
b        [b]             a        a
t        [t]             e        e
j        [j]             i        i
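The direct grapheme mapping described in Section 3 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the rule set is inferred from the three worked examples in the text (Soomaaliya, Ceelmacaan, Ceel-cadde) and is not the paper's full Table 1. The names `MANY_TO_ONE`, `transliterate_part` and `transliterate` are hypothetical. The handling of the letter c (dropped at the start of a name part, written as an apostrophe elsewhere) is an assumption read off those examples, not a rule stated by the authors, and the ASCII apostrophe stands in for the typographic one used in “Elma’an”.

```python
# Many-to-one grapheme rules, inferred from the examples in the text
# (assumption: this is NOT the paper's complete transliteration table).
MANY_TO_ONE = {
    "aa": "a", "ee": "e", "ii": "i", "oo": "o", "uu": "u",  # long vowels
    "iy": "i",  # as in Soomaaliya -> Somalia
}


def transliterate_part(part: str) -> str:
    """Transliterate one hyphen-free piece of a Somali name."""
    part = part.lower()
    out = []
    i = 0
    while i < len(part):
        pair = part[i:i + 2]
        if pair in MANY_TO_ONE:
            # Many-to-one mapping: two Somali letters, one English letter.
            out.append(MANY_TO_ONE[pair])
            i += 2
        elif part[i] == "c":
            # Somali c has no direct English equivalent. Assumed rule:
            # drop it word-initially, write an apostrophe elsewhere.
            if i > 0:
                out.append("'")
            i += 1
        else:
            # One-to-one mapping for identical letters.
            out.append(part[i])
            i += 1
    return "".join(out).capitalize()


def transliterate(name: str) -> str:
    """Transliterate a (possibly hyphenated) Somali proper name."""
    return "-".join(transliterate_part(p) for p in name.split("-"))


print(transliterate("Soomaaliya"))  # Somalia
print(transliterate("Ceelmacaan"))  # Elma'an
print(transliterate("Ceel-cadde"))  # El-Adde
```

Checking the two-letter rules before the single-letter ones keeps every Somali letter mapped either one-to-one or many-to-one, which is how the text avoids null mappings for long vowels; only the letter c is treated specially here.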