A First Experience on Multilingual Acoustic Modeling of the Languages Spoken in Morocco

INTERSPEECH 2004 -- ICSLP, 8th International Conference on Spoken Language Processing
ICC Jeju, Jeju Island, Korea, October 4-8, 2004
ISCA Archive: http://www.isca-speech.org/archive
DOI: 10.21437/Interspeech.2004-308

José B. Mariño, A. Moreno, A. Nogueiras
TALP Research Center on “Technologies and Applications of Language and Speech”
Technical University of Catalonia, Spain
{canton,asuncion,albino}@talp.upc.es

Abstract

The goal of this paper is to explore and describe the potential of multilingual acoustic models for automatic speech recognition of the languages spoken in Morocco. The basic experimental framework comes from the OrienTel project, mainly the sound inventory of the Arabic languages and the speech databases. Monolingual and multilingual automatic speech recognition systems for Modern Colloquial and Standard Arabic (MCA and MSA, respectively) and French are developed and evaluated in order to examine the phonetic exchange and similarity among the three languages. As a main result, it can be stated that a combined modeling of MSA and MCA, or even a trilingual design, does not harm the performance of the recognition system.

1. Introduction

The aim of the IST project “Multilingual Access to Interactive Communication Services for the Mediterranean and the Middle East” (OrienTel) is to enable the project's participants to design and develop multilingual interactive communication services for the Mediterranean and the Middle East, ranging from Morocco in the West to the Gulf States in the East, including Turkey, Israel and Cyprus. To achieve this aim, the consortium has been compiling a set of 23 linguistic databases and conducting research into ASR-related problems of the OrienTel region.

This paper aims to explore and describe the potential of multilingual acoustic models and lexica of the languages spoken in Morocco. Morocco belongs to the Maghreb area. Three languages are spoken in Morocco: Modern Colloquial Arabic (MCA), Modern Standard Arabic (MSA) and French. Since MSA and MCA are both spoken across the country, Morocco is a fully bilingual country. French is mainly used for commercial transactions.

MSA and MCA have important similarities while maintaining specific phonetic traits and lexica. For instance, even though they share the same phonetic inventory, pronunciation differs slightly between the two languages.

On the other hand, French has a completely different phonetic inventory and comes from a very different language root, that is, Latin. In this work we shall try to take advantage of the fact that, for Moroccan people, French is a very commonly used third language, and their pronunciation of it is strongly influenced by Arabic phonemes.

Thus, an alternative to using a separate phonetic description for each of MSA, MCA and French can be devised. In this paper, and following previous work (for instance, see [1]), a common inventory of allophones for both languages is designed and evaluated against the monolingual counterparts.

The paper is organized as follows. The next three sections describe the experimental framework, including the speech databases used to train and test the system, the inventory of sounds and the main features of the recognition system used for this experimental work. Section five describes the experimental work carried out to validate and evaluate the multilingual system. The paper ends with a discussion section.

2. Speech databases

In the OrienTel project three databases [2] have been produced in Morocco: for MCA, MSA and French. Calls were recorded from fixed and mobile phones. The utterances were recorded through an ISDN access to the fixed public telephone network, sampled at 8 kHz and quantized with A-law at 8 bits per sample. These databases have been used for training and testing the ASR system.

2.1. MCA database

The Modern Colloquial Arabic (MCA) database contains utterances collected from 772 speakers: 600 of them supply the training material and the remaining 172 speakers make up the testing set.

As training material, the total number of utterances is 44 x 600 = 26400 utterances (spellings and yes/no questions are not used), including more than 605850 phones. For testing we chose three different tasks, extensively described in [2]:
• Digit strings: prompt sheet number, telephone number, spontaneous telephone number, credit card (14-16 digits), PIN.
• Application words.
• Dates: relative and general expressions.

2.2. MSA database

The Modern Standard Arabic (MSA) database contains utterances collected from 530 speakers: 400 of them supply the training material and the remaining 130 speakers make up the testing set.

As training material, the total number of utterances is 46 x 400 = 18400 utterances (spellings and yes/no questions are not used), including more than 548395 phones. For testing we chose three different tasks:
• Digit strings: prompt sheet number, strings of 4 digits.
• Application words.
• Dates: relative and general expressions.

2.3. French database

The French database is formed by utterances collected from 530 speakers: 400 of them supply the training material and the remaining 130 speakers make up the testing set.

As training material, the total number of utterances is 43 x 400 = 17200 utterances (spellings and yes/no questions are not used), including more than 344245 phones. For testing we chose three different tasks:
• Digit strings: prompt sheet number, telephone number, spontaneous telephone number, credit card (14-16 digits), PIN.
• Application words.
• Dates: prompted date phrases, relative and general expressions.

After discarding the utterances with mispronounced or incomplete words, the final number of utterances for every language is given in Table 1. Furthermore, to be more specific, Table 2 shows, for each test set, the size of the vocabulary, number of sentences and number of words for the final set.

            Training      Test
Language    Utterances    Digits    A. words    Dates
MCA         26328         356       816         156
MSA         18322         911       698         114
French      17039         267       727         202

Table 1: Training and test material. Number of utterances.

            Digits           A. words         Dates
Language    Size    Words    Size    Words    Size    Words
MCA         10      2195     23      848      38      351
MSA         10      3999     25      786      32      247
French      10      1728     33      727      151     894

Table 2: Vocabulary size and number of words in the test set.

3. Inventory of sounds

The standard SAMPA [3] phoneme sets for French and Arabic were used for French and for MCA and MSA as spoken in Morocco. Table 3 summarizes the MSA and MCA inventories of allophones as they were defined to design the OrienTel databases. The same table includes, for every allophone, the attributes considered further in the clustering algorithms. It can be observed that MSA and MCA share the same inventory, in which rare sounds coming from foreign languages are included.

SAMPA     Definition

Vowels
a         open front unrounded vowel
i         close front unrounded vowel
u         close back rounded vowel
a:        long open front unrounded vowel
i:        long close front unrounded vowel
u:        long close back rounded vowel

Semivowels
j         voiced palatal approximant
w         voiced labial-velar approximant

Fricatives
?`(?\)    voiced pharyngeal fricative
D         voiced dental fricative
D`        voiced dental emphatic fricative
f         voiceless labiodental fricative
G         voiced velar fricative
h         voiceless glottal fricative
s         voiceless alveolar fricative
S         voiceless postalveolar fricative
s`        voiceless alveolar emphatic fricative
T         voiceless dental fricative
v         voiced labiodental fricative (MCA, MSA rare)
x         voiceless velar fricative
X\        voiceless pharyngeal fricative
z         voiced alveolar fricative
Z         voiced postalveolar fricative

Lateral
l         voiced dental/alveolar lateral approximant
l`        voiced dental/alveolar lateral approximant emphatic (MCA, MSA rare)

Trill
r         voiced dental or alveolar trill

Nasals
m         voiced bilabial nasal
n         voiced dental or alveolar nasal

Plosives
?         stød glottal stop
b         voiced bilabial plosive
d         voiced dental/alveolar plosive
d`        voiced dento-alveolar emphatic plosive
g         voiced velar plosive
k         voiceless velar plosive
p         voiceless bilabial plosive (MCA, MSA rare)
q         voiceless uvular plosive
t         voiceless dental/alveolar plosive
t`        voiceless dento-alveolar emphatic plosive

Table 3: Inventory of sounds for MCA and MSA.

French shares only a small part of these phonemes, which are marked in bold in Table 3. French shows a greater variability in the vowel set, while MCA and MSA show a higher variability in the fricative set. The specific French sounds used in our experimentation can be found in Table 4, where “indeterminacy symbol” means that the symbol replaces the corresponding symbols in the list in case of indeterminacy between them.

SAMPA     Definition

Vowels
2         close-mid front rounded vowel
9         open-mid front rounded vowel
@         mid central unrounded vowel
A         open back unrounded vowel
e         close-mid front unrounded vowel
E         open-mid front unrounded vowel
o         close-mid back rounded vowel
O         open-mid back rounded vowel
Y         close front rounded vowel
&/        2, 9 (indeterminacy symbol)
A/        a, A (indeterminacy symbol)
E/        e, E (indeterminacy symbol)
O/        o, O (indeterminacy symbol)
U~/       e~, 9~ (indeterminacy symbol)
9~        open-mid front rounded nasal
a~        open front unrounded nasal
e~        close-mid front unrounded nasal

Some standard pronunciation issues in Maghreb dialects have not been taken into account when generating phonetic transcriptions because of their dependence on the speaker and their non-systematic nature:
a) Substitution of /T/ by /t/, and of /q/ by /g/ or /?/.
b) Assimilation of voiced dental fricatives and plosives (/D/, /D`/, /d/, /d`/), which usually merge into just /d/ and /d`/.
c) Relaxation of shedda (gemination) and emphasis.
d) Deletion of hamza (/?/).
Furthermore, the distribution of these peculiarities is dialect dependent, being remarkably more important in MCA.

The recognition search is sped up by using beam-search and phonetic look-ahead.

5. Evaluation

The following recognition systems were trained and evaluated:
a) Three monolingual systems, one for each language, with 750 models each.
b) Two bilingual systems for modeling MCA and MSA. Both systems use 900 models. They differ in whether or not they include language-dedicated models.
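As a practical aside on the speech databases described in Section 2: the recordings are stored as 8-bit A-law samples at 8 kHz, so they normally have to be expanded to linear PCM before feature extraction. The following is a minimal sketch of that expansion, assuming the standard ITU-T G.711 A-law coding; the function names `alaw_to_linear` and `decode_alaw` are hypothetical and are not taken from the OrienTel tools or from the paper.

```python
# Sketch only: G.711 A-law expansion, assuming the OrienTel files hold
# raw 8-bit A-law codes. Function names are illustrative, not from the paper.

def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit A-law code to a signed linear sample (16-bit scale)."""
    code ^= 0x55                      # undo the alternate-bit inversion used on the line
    magnitude = (code & 0x0F) << 4    # 4-bit mantissa, moved to its base position
    segment = (code & 0x70) >> 4      # 3-bit segment number (acts as an exponent)
    if segment == 0:
        magnitude += 8                # first segment: half-step offset only
    else:
        magnitude += 0x108            # implicit leading bit plus half-step offset
        if segment > 1:
            magnitude <<= segment - 1 # higher segments double the step size each time
    # In A-law (after the XOR), a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(raw: bytes) -> list[int]:
    """Expand a buffer of A-law bytes into a list of linear samples."""
    return [alaw_to_linear(b) for b in raw]


if __name__ == "__main__":
    samples = decode_alaw(bytes([0xD5, 0x55, 0xAA, 0x2A]))
    duration_s = len(samples) / 8000  # 8 kHz sampling rate, as stated in Section 2
    print(samples, duration_s)
```

A table-driven variant (precomputing all 256 outputs once) would be the usual choice for whole corpora; the per-sample function is kept here because it makes the segment and mantissa structure of the code visible.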