The Effects of Speech Rate on VOT for Initial Plosives and Click Accompaniments in Zulu
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Unicode Cookbook for Linguists: Managing Writing Systems Using Orthography Profiles
Zurich Open Repository and Archive University of Zurich Main Library Strickhofstrasse 39 CH-8057 Zurich www.zora.uzh.ch Year: 2017 The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles Moran, Steven ; Cysouw, Michael DOI: https://doi.org/10.5281/zenodo.290662 Posted at the Zurich Open Repository and Archive, University of Zurich ZORA URL: https://doi.org/10.5167/uzh-135400 Monograph The following work is licensed under a Creative Commons: Attribution 4.0 International (CC BY 4.0) License. Originally published at: Moran, Steven; Cysouw, Michael (2017). The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles. CERN Data Centre: Zenodo. DOI: https://doi.org/10.5281/zenodo.290662 The Unicode Cookbook for Linguists Managing writing systems using orthography profiles Steven Moran & Michael Cysouw Change dedication in localmetadata.tex Preface This text is meant as a practical guide for linguists, and programmers, whowork with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together. The intersection of the Unicode Standard and the International Phonetic Al- phabet is often not met without frustration by users. Nevertheless, thetwo standards have provided language researchers with a consistent computational architecture needed to process, publish and analyze data from many different languages. We bring to light common, but not always transparent, pitfalls that researchers face when working with Unicode and IPA. Our research uses quantitative methods to compare languages and uncover and clarify their phylogenetic relations. However, the majority of lexical data available from the world’s languages is in author- or document-specific orthogra- phies. -
Philological Sciences. Linguistics” / Journal of Language Relationship Issue 3 (2010)
Российский государственный гуманитарный университет Russian State University for the Humanities RGGU BULLETIN № 5/10 Scientific Journal Series “Philological Sciences. Linguistics” / Journal of Language Relationship Issue 3 (2010) Moscow 2010 ВЕСТНИК РГГУ № 5/10 Научный журнал Серия «Филологические науки. Языкознание» / «Вопросы языкового родства» Выпуск 3 (2010) Москва 2010 УДК 81(05) ББК 81я5 Главный редактор Е.И. Пивовар Заместитель главного редактора Д.П. Бак Ответственный секретарь Б.Г. Власов Главный художник В.В. Сурков Редакционный совет серии «Филологические науки. Языкознание» / «Вопросы языкового родства» Председатель Вяч. Вс. Иванов (Москва – Лос-Анджелес) М. Е. Алексеев (Москва) В. Блажек (Брно) У. Бэкстер (Анн Арбор) В. Ф. Выдрин (Санкт-Петербург) М. Гелл-Манн (Санта Фе) А. Б. Долгопольский (Хайфа) Ф. Кортландт (Лейден) А. Лубоцкий (Лейден) Редакционная коллегия серии: В. А. Дыбо (главный редактор) Г. С. Старостин (заместитель главного редактора) Т. А. Михайлова (ответственный секретарь) К. В. Бабаев С. Г. Болотов А. В. Дыбо О. А. Мудрак В. Е. Чернов ISSN 1998-6769 © Российский государственный гуманитарный университет, 2010 УДК 81(05) ББК 81я5 Вопросы языкового родства: Международный научный журнал / Рос. гос. гуманитар. ун-т; Рос. Акад. наук. Ин-т языкознания; под ред. В. А. Дыбо. ― М., 2010. ― № 3. ― X + 176 с. ― (Вестник РГГУ: Научный журнал; Серия «Филологические науки. Языко- знание»; № 05/10). Journal of Language Relationship: International Scientific Periodical / Russian State Uni- versity for the Humanities; Russian Academy of Sciences. Institute of Linguistics; Ed. by V. A. Dybo. ― Moscow, 2010. ― Nº 3. ― X + 176 p. ― (RSUH Bulletin: Scientific Periodical; Linguistics Series; Nº 05/10). ISSN 1998-6769 http ://journal.nostratic.ru [email protected] Дополнительные знаки: С. -
The South African Directory Enquiries (SADE) Name Corpus
(C) Springer Nature B.V. 2019. In Language Resources & Evaluation (2019). The final publication is available at https://link.springer.com/article/10.1007/s10579-019-09448-6 The South African Directory Enquiries (SADE) Name Corpus Jan W.F. Thirion · Charl van Heerden · Oluwapelumi Giwa · Marelie H. Davel Accepted: 25 January 2019 Abstract We present the design and development of a South African directory enquiries (DE) corpus. It contains audio and orthographic transcriptions of a wide range of South African names produced by first- language speakers of four languages, namely Afrikaans, English, isiZulu and Sesotho. Useful as a resource to understand the effect of name language and speaker language on pronunciation, this is the first corpus to also aim to identify the “intended language”: an implicit assumption with regard to word origin made by the speaker of the name. We describe the design, collection, annotation, and verification of the corpus. This includes an analysis of the algorithms used to tag the corpus with meta information that may be beneficial to pronunciation modelling tasks. Keywords Speech corpus collection · Pronunciation modelling · Speech recognition · Proper names 1 Introduction Multilingual environments, such as in South Africa, present unique and interesting challenges to systems dealing with pronunciation variability. Spoken dialogue systems need to adequately deal with various factors that affect speech production, such as a speaker’s socio-economic background, mother tongue language, age, and gender [37, 2]. Differences among these factors result in speakers producing words with varying pronunciation, leading to deteriorated recognition performance [1]. Hand-crafting rules to deal with pronunciation variation is both time-consuming and impractical. -
Building a Universal Phonetic Model for Zero-Resource Languages
Building a Universal Phonetic Model for Zero-Resource Languages Paul Moore MInf Project (Part 2) Interim Report Master of Informatics School of Informatics University of Edinburgh 2020 3 Abstract Being able to predict phones from speech is a challenge in and of itself, but what about unseen phones from different languages? In this project, work was done towards building precisely this kind of universal phonetic model. Using the GlobalPhone language corpus, phones’ articulatory features, a recurrent neu- ral network, open-source libraries, and an innovative prediction system, a model was created to predict phones based on their features alone. The results show promise, especially for using these models on languages within the same family. 4 Acknowledgements Once again, a huge thank you to Steve Renals, my supervisor, for all his assistance. I greatly appreciated his practical advice and reasoning when I got stuck, or things seemed overwhelming, and I’m very thankful that he endorsed this project. I’m immensely grateful for the support my family and friends have provided in the good times and bad throughout my studies at university. A big shout-out to my flatmates Hamish, Mark, Stephen and Iain for the fun and laugh- ter they contributed this year. I’m especially grateful to Hamish for being around dur- ing the isolation from Coronavirus and for helping me out in so many practical ways when I needed time to work on this project. Lastly, I wish to thank Jesus Christ, my Saviour and my Lord, who keeps all these things in their proper perspective, and gives me strength each day. -
Coproduction and Coarticulation in Isizulu Clicks
Coproduction and Coarticulation in IsiZulu Clicks Coproduction and Coarticulation in IsiZulu Clicks by Kimberly Diane Thomas-Vilakati University of California Press Berkeley Los Angeles London UNIVERSITY OF CALIFORNIA PRES S, one of the most distinguished university presses in the United States, enriches lives around the world by advancing scholarship in the humanities, social sciences, and natural sciences. Its activities are supported by the UC Press Foundation and philanthropic contributions from individuals and institutions. For more information, visit www.ucpress.edu University of California Press Berkeley and Los Angeles, California University of California Press, Ltd. London, England UNIVERSITY OF CALIFORNIA PUBLICATIONS IN LINGUISTICS Editorial Board: Judith Aissen, Andrew Garrett, Larry M. Hyman, Marianne Mithun, Pamela Munro, Maria Polinsky Volume 144 Coproduction and Coarticulation in IsiZulu Clicks by Kimberly Diane Thomas-Vilakati © 2010 by The Regents of the University of California All rights reserved. Published 2010 20 19 18 17 16 15 14 13 12 11 10 1 2 3 4 5 ISBN 978-0-520-09876-3 (pbk. : alk. paper) Library of Congress Control Number: 2010922226 The paper used in this publication meets the minimum requirements of ANSI/NISO Z39.48-1992 (R 1997) (Permanence of Paper). Dedication This study is dedicated to the following individuals: To my loving father, who sacrificed his life to work hard in order to educate me and who, through his loyalty and devotion, made this all possible. To my loving mother, who gave selflessly to seven children and many grandchildren, whose confidence in me never waivered and who gave me the fortitude to compete in the international arena. -
American Indian Languages (Abbreviated A), and the Alphabet of the Dialect Atlas of New England (Abbreviated D)
CONCORDANCE OF PHONETIC ALPHABETS Robert C. Hollow, Jr. University of North Carolina, Chapel Hill A concordance of 3 major phonetic alphabets used in North America is presented and discussed. Those alphabets consid- ered are one used by the International Phonetic Association, one used for American dialectology and one used for American Indian languages. Comparisons are made in terms of vowel symbols, -consonant symbols, secondary segmental- symbols, and diacritic marks. Typewriter equivalents of standard symbols are also given. [phonetics, linguistics, North A.merica, American Indians, phonetic symbols] This paper is a brief concordance of the major phonetic alphabets currently in use by linguists and anthropologists in North America. The alphabets included are the International Phonetic Alphabet (abbreviated I in this paper), the Americanist alphabet used in the transcription of American Indian Languages (abbreviated A), and the alphabet of the Dialect Atlas of New England (abbreviated D). For convenience I have divided the concordance into five sections: 1) Primary Vowel Symbols, 2) Primary Consonant Symbols, 3) Secondary Segmental Symbols, 4) Diacritic Marks, and 5) Typewriter Symbols. The form of I used in this paper is the 1951 revision as fully presented in The Principles of the International Phonetic Association (International Phonetic Association 1957). D is presented and discussed in the Handbook of the Linguistic Geography of New England (Kurath, Bloch and Hansen 1939). This alphabet is based onI, but includes certain modifications made to facilitate the transcription of American English dialect material. There is no single phonetic alphabet currently in use by students of American Indian Languages, for this reason I have consulted several alternate formu- lations of phonetic alphabets given by scholars in the field, most notably 42 Bloch and Trager (1942), Pike (1947), Trager (1958), and Shipley (1965). -
Uu Clicks Correlate with Phonotactic Patterns Amanda L. Miller
Tongue body and tongue root shape differences in N|uu clicks correlate with phonotactic patterns Amanda L. Miller 1. Introduction There is a phonological constraint, known as the Back Vowel Constraint (BVC), found in most Khoesan languages, which provides information as to the phonological patterning of clicks. BVC patterns found in N|uu, the last remaining member of the !Ui branch of the Tuu family spoken in South Africa, have never been described, as the language only had very preliminary documentation undertaken by Doke (1936) and Westphal (1953–1957). In this paper, I provide a description of the BVC in N|uu, based on lexico-statistical patterns found in a database that I developed. I also provide results of an ultrasound study designed to investigate posterior place of articulation differences among clicks. Click consonants have two constrictions, one anterior, and one posterior. Thus, they have two places of articulation. Phoneticians since Doke (1923) and Beach (1938) have described the posterior place of articulation of plain clicks as velar, and the airstream involved in their production as velaric. Thus, the anterior place of articulation was thought to be the only phonetic property that differed among the various clicks. The ultrasound results reported here and in Miller, Brugman et al. (2009) show that there are differences in the posterior constrictions as well. Namely, tongue body and tongue root shape differences are found among clicks. I propose that differences in tongue body and tongue root shape may be the phonetic bases of the BVC. The airstream involved in click production is described as velaric airstream by earlier researchers. -
LIN 3201 Sounds of Human Language Manual by Ratree
LIN 3201 Sounds of Human Language Manual By Ratree Wayland Program in Linguistics University of Florida Gainesville, FL 2 Introduction There are approximately 7,000 languages in the world, and the sounds employed by these languages show both similarities and differences. Thus, an interesting question that one might ask is, “What factors affect the sounds a language can or cannot use?” First of all, we are constrained by what we can do with our tongue, our lips and other organs involved in the production of speech sounds. This factor may be referred to as the ‘articulatory ease’ factor. Secondly, we are constrained by what we can hear or what we can perceptually distinguish. This is the ‘auditory distinctiveness’ factor. Thus, no language in the world has sounds that are too difficult for native speakers to produce or to perceptually differentiate. To nonnative speakers, however, certain sounds may prove challenging to both produce and perceive. One of the goals of this course is to familiarize students with the various sounds employed in the world’s languages. Students will learn how to describe, produce and perceptually distinguish these sounds. Describing Speech Sounds Phonetics is concerned with describing speech sounds that occur in the world’s languages. Speech sounds can be described in at least two different ways. First we can describe them in terms of how they are made in the vocal tract (articulatory phonetics). As speech sounds leave the mouth, they cause disturbances in the surrounding air (sound waves). Thus, another way that we can describe speech sound is to analyze its acoustic sound wave (acoustic 3 phonetics). -
Click Variation and Reacquisition in Two South African Ndebele Varieties
CLICK VARIATION AND REACQUISITION IN TWO SOUTH AFRICAN NDEBELE VARIETIES Stephan Schulz, Antti Olavi Laine, Lotta Aunio & Nailya Philippova This article deals with click consonants in two Nguni varieties of South Africa, namely isiNdebele, or Southern Ndebele, as it is better known outside of South Africa; and Sindebele, or Northern Transvaal Ndebele. We review previous research on the topic, in which isiNdebele been described as having a some- what reduced click inventory compared to better described Nguni languages, and Sindebele has been claimed to have lost clicks completely. We also review previous research on click loss, variation, and acquisition. We then describe the current situation of both language varieties regarding clicks. For Sindebele, we observe that while clicks indeed seem to have been almost completely replaced by other consonants, some speakers still do produce clicks in isolated words, possibly as a result of recent contact with isiZulu. In isiNdebele, we find that the lateral click has been lost almost completely, while the distinction between dental and postalveolar has been lost for some speakers (with most of them preferring the dental click), whereas some speakers still maintain the distinc- tion. We propose a tentative correlation between increasing formal education in the isiNdebele language and the tendency to maintain the two clicks as distinct, but generally find the functional load of the distinction to be very low, at least on a lexical level. 1. INTRODUCTION The language varieties known as Ndebele belong to the Nguni branch of the Southern Bantu languages. All Nguni languages are spoken in South Africa, with the exception of Northern Ndebele (or Zimbabwean Ndebele), which is spoken in Zimbabwe. -
Data-Driven Concatenative Sound Synthesis
Acad´emie de Paris Universit´e Paris 6 { Pierre et Marie Curie Ecole´ Doctorale d'Informatique THESE` DE DOCTORAT sp´ecialit´e ACOUSTIQUE, TRAITEMENT DE SIGNAL ET INFORMATIQUE APPLIQUES´ A` LA MUSIQUE pr´esent´ee par Diemo Schwarz pour obtenir le grade de DOCTEUR de l'UNIVERSITE´ PARIS 6 { PIERRE ET MARIE CURIE DATA-DRIVEN CONCATENATIVE SOUND SYNTHESIS soutenue le 23. 1. 2004 devant le jury compos´e de XAVIER RODET Directeur de Th`ese UDO ZOLZER¨ Rapporteur FRANC¸ OIS PACHET Rapporteur THIERRY DUTOIT Examinateur GERARD´ CHOLLET Examinateur JEAN-FRANC¸ OIS PERROT Examinateur ii PhD Thesis in Acoustics, Computer Science, Signal Processing Applied to Music Data-Driven Concatenative Sound Synthesis Diemo Schwarz [email protected] http://www.ircam.fr/anasyn/schwarz Version 1.01 19th January 2004 Ircam { Centre Pompidou 1, place Igor{Stravinsky 75004 Paris, France http://www.ircam.fr University of Paris 6 { Pierre et Marie Curie iv v . a` Marie-Eve` Chapter Overview Abstract ix R´esum´e xi Acknowledgments xiii Contents xv 1 Introduction 1 I Previous and Related Work 7 2 Overview 9 3 Speech Synthesis 15 4 Musical Sound Synthesis 23 II Automatic Alignment 31 5 Introduction 33 6 Dynamic Time Warping 49 7 Hidden Markov Models 67 8 Discussion 79 III The Data-Driven Sound Synthesis System Caterpillar 81 9 System Overview 83 10 Sound Descriptors 85 11 Characteristic Values 101 12 Database 109 vi CHAPTER OVERVIEW vii 13 Database Interface 115 14 The Caterpillar Database Schema 121 15 Corpus Examples and Statistics 133 16 Synthesis 149 17 Applications -
Drytok: Tlok1
tr'.z*w. Dritok An Introduction to the Language of the Drushek Donald Boozer The Languages of Kryslan Volume 1 W5&^Y2-lt'.q'.=x:.=D5^Q5=D2; D2=!D3-z*.z*.n.=s'.s'.=hr:.zp.th.=D5^Q5=B5; There are many paths ahead of you. You choose and travel one of them. Literally, "MANY paths exist YONDER WITH-RESPECT-TO YOU. YOU choose and walk WITH-RESPECT-TO ONE." ~ A Drushek proverb. Verbalizations: lt'.q'. > thed.gi ("path, way, route to a goal") x:. > khaa (A generic word for "existence", can be used for one sense of "to be") z*.z*.n. > tik.tik.ni ("be confronted with several alternatives") s'.s'. > chi.chi (an all-purpose conjunction, in this case joining two verbs) hr:.zp.th. > herr.zhep.ta ("walk, travel, move along a path (by foot)") "Many paths exist for you. You choose one and walk it." What we cannot speak about we must pass over in silence ~ Ludwig Wittegenstein, Tractatus Logico-Philosophicus © December 2008, Donald Boozer Revised, April 2009 Revised, February 2010 Table of Contents Chapter 1 - Historical Background Chapter 2 - The "Sounds" of Dritok II.A. Audible Phonology II.B. Inaudible Phonology/Morphology Chapter 3 - Linguistic Characteristics III.A. Dritok: Ambiguous Syntax III.B. "Parts of Speech" III.C. Word Order Cultural Intermission: Shekstan Chapter 4 - Nominals and Nominal-Phrase Operations IV.A. Compounding IV.B. Number IV.C. Case IV.D. Articles and Demonstratives IV.E. Possession Chapter 5 - Predicate Nominals and Related Constructions V.A. Proper Inclusion and Equation V.B. -
Unicode Request for Additional Phonetic Click Letters Characters Suggested Annotations
Unicode request for additional phonetic click letters Kirk Miller, [email protected] Bonny Sands, [email protected] 2020 July 10 This request is for phonetic symbols used for click consonants, primarily for Khoisan and Bantu languages. For the two symbols ⟨⟩ and ⟨ ⟩, which fit the pre- and post-Kiel IPA and see relatively frequent use, code points in the Basic Plane are requested. Per the recommendation of the Script Ad Hoc Committee, two other symbols, ⟨‼⟩ and ⟨⦀⟩, may be best handled by adding annotations to existing punctuation and mathematical characters. The remainder of the symbols being proposed are less common and may be best placed in the supplemental plane. Three precomposed characters, old IPA ʇ ʖ ʗ with a swash (⟨⟩, ⟨⟩, ⟨⟩), are covered in a separate proposal. Characters U+A7F0 LATIN SMALL LETTER ESH WITH DOUBLE BAR. Figures 1–9. U+A7F1 LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK. Figures 16–19. U+10780 LATIN SMALL LETTER TURNED T WITH CURL. Figures 8–13. U+10781 LATIN LETTER INVERTED GLOTTAL STOP WITH CURL. Figures 8, 9, 11, 12, 15. U+10782 LATIN SMALL LETTER ESH WITH DOUBLE BAR AND CURL. Figures 8-10. U+10783 LATIN LETTER STRETCHED C WITH CURL. Figures 8, 9, 12, 14. U+10784 LATIN LETTER SMALL CAPITAL TURNED K. Figure 20. Suggested annotations U+A7F0 LATIN SMALL LETTER ESH WITH DOUBLE BAR ⟨⟩ resembles U+2A0E INTEGRAL WITH DOUBLE STROKE ⟨⨎⟩, though the glyph design is different. (The Script Ad Hoc Committee recommends not unifying them.) U+203C DOUBLE EXCLAMATION MARK ⟨‼⟩ may be used as a phonetic symbol, equivalent to a doubled U+1C3 LATIN LETTER RETROFLEX CLICK ⟨ǃ⟩.