Building a Universal Phonetic Model for Zero-Resource Languages


Paul Moore

MInf Project (Part 2) Interim Report
Master of Informatics
School of Informatics
University of Edinburgh
2020

Abstract

Being able to predict phones from speech is a challenge in and of itself, but what about unseen phones from different languages? In this project, work was done towards building precisely this kind of universal phonetic model. Using the GlobalPhone language corpus, phones’ articulatory features, a recurrent neural network, open-source libraries, and an innovative prediction system, a model was created to predict phones based on their features alone. The results show promise, especially for using these models on languages within the same family.

Acknowledgements

Once again, a huge thank you to Steve Renals, my supervisor, for all his assistance. I greatly appreciated his practical advice and reasoning when I got stuck, or things seemed overwhelming, and I’m very thankful that he endorsed this project.

I’m immensely grateful for the support my family and friends have provided in the good times and bad throughout my studies at university.

A big shout-out to my flatmates Hamish, Mark, Stephen and Iain for the fun and laughter they contributed this year. I’m especially grateful to Hamish for being around during the isolation from Coronavirus and for helping me out in so many practical ways when I needed time to work on this project.

Lastly, I wish to thank Jesus Christ, my Saviour and my Lord, who keeps all these things in their proper perspective, and gives me strength each day.

Table of Contents

1 Introduction
  1.1 Motivation
  1.2 Project outline
  1.3 Previous project work
2 Modelling phones
  2.1 Phones vs. phonemes
  2.2 Standard phone modelling
    2.2.1 Feature extraction
    2.2.2 Monophone models
    2.2.3 Basic triphone models
    2.2.4 Advanced triphone models
    2.2.5 Limitations of standard models
  2.3 Deep learning
    2.3.1 Recurrent Neural Networks (RNNs)
    2.3.2 Long Short-Term Memory (LSTM)
    2.3.3 RMSProp optimisation
    2.3.4 Connectionist Temporal Classification (CTC) loss
    2.3.5 Miscellaneous techniques
  2.4 Universal phone models
    2.4.1 General concepts
    2.4.2 Modelling unseen phones
    2.4.3 Universal phone modelling with attributes
3 General setup
  3.1 The GlobalPhone Dataset
    3.1.1 Suitability analysis
  3.2 File preparation
    3.2.1 Kaldi
    3.2.2 Conversion and preliminary cleaning
    3.2.3 Splitting the data
    3.2.4 Standardising phones
    3.2.5 Generating transcriptions
    3.2.6 Generating input features
  3.3 Organising experiments
    3.3.1 Additional filtering
    3.3.2 Converting phones to attributes
    3.3.3 Dealing with diphthongs
  3.4 Using PyTorch-Kaldi
    3.4.1 Adapting input alignments
    3.4.2 Cost function
    3.4.3 Model saving
    3.4.4 Chunk sizes
    3.4.5 Network structure
    3.4.6 Gradient issues
  3.5 Predicting phones from attributes
    3.5.1 Distance metrics
    3.5.2 Initial split
    3.5.3 Decision trees
    3.5.4 Universal scoring
  3.6 Evaluation
    3.6.1 Issues with decoding
    3.6.2 Alternative evaluation metrics
4 Experiments
  4.1 Experiment 1: Shallow models
    4.1.1 Research questions
    4.1.2 Setup
    4.1.3 Results
  4.2 Experiment 2: Baseline network
    4.2.1 Research questions
    4.2.2 Setup
    4.2.3 Results
  4.3 Experiment 3: Attribute network
    4.3.1 Research questions
    4.3.2 Setup
    4.3.3 Results
  4.4 Experiment 4: Cross-lingual investigations
    4.4.1 Research questions
    4.4.2 Setup
    4.4.3 Results
5 Conclusions
  5.1 Future work
    5.1.1 Fixing GlobalPhone
    5.1.2 Phonetic attribute improvements
    5.1.3 Replacing PyTorch-Kaldi
    5.1.4 Network structure improvements
  5.2 Results summary
Bibliography
A Universal Phone Set
  A.1 Base Phones
  A.2 Extensions
  A.3 Phone maps
B Dataset splits
  B.1 Speaker lists
  B.2 Dataset statistics
C Phone errors in baseline network
D Confusion matrices for attribute networks
E Phone distributions for attribute network

Chapter 1

Introduction

“‘Come, let us go down and confuse their language so they will not understand each other’. That is why [the city] was called Babel - because there the Lord confused the language of the whole world.” ∗

1.1 Motivation

The above-quoted tale of the Tower of Babel, where humanity’s single language was split into different ones, has had a profound cultural impact that continues to this day. In Douglas Adams’s The Hitchhiker’s Guide to the Galaxy, the so-called Babel fish is capable of translating any spoken language. While any organism or computer system with the ability to instantly reverse the “Babel effect” remains firmly in the area of science fiction for the present, there are related problems which may be more solvable.

Worldwide there are nearly 3,000 unwritten languages [Eberhard et al., 2020]. Most of these are likely to have little to no audio data available either. According to Austin and Sallabank [2011], linguists believe that around 50-90% of the 7,000 languages worldwide will go extinct within this century, which doubtless includes the vast majority of unwritten ones. Some linguists have argued that this is a natural process, and we should do little to interfere with it ([Mufwene, 2004], [Ladefoged, 1992]). However, numerous other linguists believe that it is important to preserve them if possible, since these languages are an integral part of the society and culture they are in, and are a key component of human identity ([Austin and Sallabank, 2011], [Romaine, 2007]).

When trying to save any endangered language, a key factor is to have a writing system for it.
This empowers members of these people groups to read and write their own language, not just speak/hear it. Consequently, cultural stories or traditions can be written down in their original languages, and people will be able to communicate in written fashion in their native tongue, along with a whole host of other benefits.

∗ Genesis 11:7,9 (NIV)

In fact, such communication may be an important motivator for speakers of these languages to preserve their language. Otherwise, a more common written language may be very attractive, particularly to younger members as they interact with the modern world. After all, they may reason, why continue using a language which is less convenient for common activities such as text or email? Books or other reading materials are also a powerful impetus for perpetuating the use of such a language.

However, in order to develop a writing system, an alphabet is required. Linguists need to work out the phonetic structure of a language and use this to decide on how to represent the sounds in writing. The task of discovering these phones is challenging, and often requires a great deal of time and effort. The International Phonetic Alphabet (IPA) [Smith, 1999] is frequently used to standardise the transcription of phones.

Building a universal phonetic model, thus providing a way to model all the phones in the IPA, would make this undertaking considerably easier, with a phonetic transcription based on nothing other than the audio. Even if an accurate transcription proved difficult, recurring phonetic features could be highlighted, which would be beneficial.

1.2 Project outline

The existing methods for modelling phones, particularly in a universal model, will be discussed first. Then, the general experimental setup used across most of the experiments will be given.

The experiments themselves aim to answer the following questions:

• What is a reasonable baseline, using non-universal phones?
• Which feature types are better for training?
• Does training on languages within the same family improve performance for unseen languages within the same family, or is it better to have as many different languages as possible?

Finally, directions for potential future work will be outlined, and overall findings summarised.

1.3 Previous project work

Certain aspects of last year’s project [Moore, 2019] were reused. While previously the focus was on language identification, the goal for this year, as stated in this introduction, was quite different. Some of the scripts for working with Kaldi [Povey et al., 2011] and the GlobalPhone dataset [Schultz, 2002] were reused and/or improved. Furthermore, a common focus in both projects has been working on models which could be applicable in areas of the world with little to no transcribed language resources.

Chapter 2

Modelling phones

In this chapter, the basic principles of building models for representing phones will be covered. Based on these simpler models, ways to apply these principles in a multilingual or universal sense will be explored. There will also be a brief section on relevant deep learning techniques which were used in the course of this project.

2.1 Phones vs. phonemes

To begin, one important distinction to make is the difference between phones and phonemes, as these will be referred to throughout the rest of this report.
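
As a rough illustration of this distinction, the sketch below treats phonemes as abstract units and phones as their realisations in context, using the textbook example of English /p/, which is aspirated in "pan" but not in "span". The single rule applied here and the function name realise are simplifying assumptions made for illustration only; they are not taken from this project.

```python
# Toy illustration (an assumption, not this project's code): one phoneme
# can surface as different phones depending on its context.

# Hypothetical rule table: English voiceless stops and their aspirated phones.
ASPIRATED = {"p": "pʰ", "t": "tʰ", "k": "kʰ"}


def realise(phonemes):
    """Map a word's phoneme sequence to a phone sequence.

    Simplified rule: a voiceless stop is aspirated only when it is the
    first segment of the word. Real English aspiration also depends on
    stress and syllable structure, which this sketch ignores.
    """
    phones = []
    for position, phoneme in enumerate(phonemes):
        if position == 0 and phoneme in ASPIRATED:
            phones.append(ASPIRATED[phoneme])
        else:
            phones.append(phoneme)
    return phones


if __name__ == "__main__":
    print(realise(["p", "æ", "n"]))       # "pan"
    print(realise(["s", "p", "æ", "n"]))  # "span"
```

Both words contain the same phoneme /p/, yet under this toy rule the function returns a different phone for it in each word. A universal model has to work with variation of this kind, since it is the phones rather than the phonemes that are present in the audio.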