Problems and Solutions in African Tone Language Text–To–Speech

Total Page:16

File Type:pdf, Size:1020Kb

Problems and Solutions in African Tone Language Text–To–Speech ITRW on Multilingual Speech and ISCA Archive Language Processing http://www.isca-speech.org/archive Stellenbosch, South Africa April 9-11, 2006 Problems and solutions in African tone language Text–To–Speech Dafydd Gibbon, Eno–Abasi Urua, Moses Ekpenyong Universitat¨ Bielefeld, Germany; University of Uyo [email protected]; [email protected]; ekpenyong [email protected] Abstract The present contribution outlines the initial developmeng One of the most useful HLT systems for use with non-literate issues involved in preparing an experimental TTS prototype for communities is Text–to–Speech, with typical applications in Ibibio, and presents detailed computational solutions for three health, market and education information dissemination. How- areas: a tone model; a model for non-local morphological de- ever, current TTS development environments are far from be- pendencies; corpus preparation and processing for a tone lan- ing language independent: adaptation of an existing voice in guage with a small and inconsistent text base. one language to a voice in another is a common procedure. During development of a TTS prototype for Ibibio, a Nigerian 2. Overview of issues Niger–Congo, Lower Cross language, in the Local Language Speech Technology Initiative, several levels of infrastructural The development of an experimental TTS prototype for a new and linguistic problems were identified. This contribution out- language, from project design through resource acquisition to lines these problems, concentrating mainly on requirements for system design, implementation and evaluation is a long haul, African tone language TTS, and formulates solution strategies even with older technologies using diphone TTS shells such as for both rule–driven and data–driven TTS development. MBROLA [3] or more modern technologies with unit selection shells such as Festival [4] or BOSS [5]. 1. Introduction 2.1. Linguistic issues: language typology One of the most useful HLT systems for use with non- On the one hand, much depends on the typology of the language literate communities is Text–to–Speech, with typical applica- concerned, both from the perspective of the creation of a unit tions in health, market and educational information dissemi- database resource for the TTS system, and from the perspective nation. However, current TTS development environments are of the processing of text input to the TTS system. Key issues far from being language independent: adaptation of an exist- include: ing voice in one language to a voice in another is a common procedure. 1. Syllable phonotactics. The complexity of syllable In the LLSTI project [1], the adaptation procedure was ap- phonotactics is a major determinant of the size of the plied to a Nigerian tone language, Ibibio (ISO 693-2: nic; Eth- unit database resource: whether a language is fundamen- nologue: IBB), the official language of Akwa Ibom State in the tally CCCVVCC, like many Indo-European languages, Nigerian Federation. Adaptation is plausible when languages or CV, CCV, CCVC, like many West African languages, are prosodically and phonemically similar but severe problems leads to a significant difference in combinatoric op- arise when languages are very dissimilar (e.g. ‘intonation lan- tions, and therefor in resource size. Syllable phonotac- guages’, for which TTS systems are typically developed, vs. tics includes prosody: syllables may be long or short, ‘tone languages’, e.g. Ibibio). open or closed, strong or weak, stressed/accented or un- These problems can be generalised to other African tone stressed/unaccented, tonal or non–tonal, depending on languages, which have lexical (phonemic) tone but, unlike the language. Asian tone languages, also morphemic tone: e.g. in Ibibio near/far tense is marked by LH/HL tones on tense morphemes. 2. Inflectional morphotactics. Inflectional morphotactics Tone-morpheme combinations could be used in unit selection, determines, for example, whether table lookup or rule– but the problem is compounded by the lack of orthographic tone based techniques are most appropriate for handling the marking, making morphological tone assignment effectively an vocabulary and its adaptation to grammatical context: ‘AI complete’ problem. English tends towards the isolating type, with little in- Additionally, ’terraced tone’ patterning generated by auto- flection; other Indo-European languages are fusional, matic and non-automatic downstep requires a clean interface for tendentially with single inflectional suffixes, each with signal processing of pitch: pitch has a ’hard’ semantic function complex grammatical meanings; other languages, such rather than a ’soft’ pragmatic function in intonation languages as Finno-Ugric languages and many African languages, [2]. are agglutinating, with chains of suffixes, each of which On the language side, agglutinative inflectional morphol- has a simple grammatical meaning. Inflectional morpho- ogy generally makes it infeasible to list all inflected word forms tactics also includes prosody: inflection may affect stress except for small vocabularies and complex subject-verb-object patterns, as in the German singular noun DOKtor, plural person concord makes morphological tone assignment difficult. DokTORen, or tonal patterns, as in Ibibio verbal mor- Finally, specific problems of sparseness and text inconsis- phology where, simplifying slightlym, rising and falling tency in written but non-standardised languages arise. tones on a future or past prefix particle represent distal future or past, and proximal future or past, respectively • For historical reasons there are competing or- (ya`a,´ ya´a;` ma`a,´ ma´a).` thographies: they differ in details. 3. Morphotactics of word formation. Derivation with • Existing documents created by untrained person- one root and affixes, compounding with more than nel tend to be highly inconsistent in orthography, one root, determine the composition of the vocabu- punctuation and format (though this is hardly dif- lary. Prosody also plays a central role: in languages ferent from documents in many other languages such as English, stress patterns (IMport/imPORT; TELE- the world over). phone/teLEphony/telePHONic; BLACKboard) are in- • The body of available texts is small, for IT pur- volved. In West African language, low or high tonal poses the available data are therefore very sparse. interfixes may be involved, as in Ibibio: en` o` (gift) +´ (tonal interfix) + ab` as` `ı (God) = en` o`ab´ as` `ı (personal proper • Available lexicographic material is also very re- name). stricted, and in a legacy paper format. • 4. Sentence structure. The combinatorics of words into Available information on grammar is neither pre- larger units takes place in conjunction with inflectional cise enough nor complete enough to be of much morphology: different orders of subject, object and verb use for automatic processing (e.g. parser develop- in simple sentences; different concord conditions in sim- ment), though the morphology is more complete, ple sentences (e.g. subject–verb, object–verb, ergative); and the phonology is in fact complete. serial verb constructions; different conventions for em- 3. Infrastructural resources. The ubiquity of electricity and bedded sentences. of internet connectivity which academic researchers ex- pect in, for example, European contexts, is either not 2.2. Resource issues given, or very expensive. Power cuts can be frequent, and are sometimes predictable due to known times when Many aspects of TTS development depend on the development demand is likely to outstrip supply. For most hands-on environment: from data resources (corpora, lexica, grammars), computational work either a desktop with backup gen- through hardware and software, to the instutional environment, erator or a laptop with extra batteries is required. The with its personnel structures and economic basis, and in this re- usual solution is a laptop. Unsealed magnetic storage spect conditions vary from region to region, and from continent media (floppy disks) have a very short life, because of to continent. dust, heat and humidity. In relation to Europe, resource conditions differ signifi- cantly in all of these respects in West Africa, and, of course, The last of these points is crucial for local language IT de- the issue of speech synthesis for local African languages is it- velopment: creaming off development work into well–equipped self very heterogeneous. labs in more northerly (or southerly) latitudes is likely to fan the The experimental prototype development project for Ibibio flames of reputation for the labs concerned, but is not likely to was conducted at the University of Uyo, located in Uyo, the cap- lead to significant value for or acceptance by users of the local ital of the Akwa Ibom State in the Nigerian Federation, in the language unless local developers are responsible for develop- South East of Nigeria, between the Cross and Calabar rivers. ment on an equal footing; an optimal solution would be part- Universitat¨ Bielefeld was coordinating cooperation partner in nerships between well–equipped labs and regional, less well- the project. The official language of Akwa Ibom State is Ibibio. equipped labs. The 2 million speakers of Ibibio constitute a rather large ap- plication pool for speech synthesis applications in education, 3. A note on tonal typology health–care, agricultural and geographical information systems. The Department of Linguistics and Nigerian Languages at Uyo In this and the following sections, attention is focussed on the University
Recommended publications
  • Prosody and Intonation in Non-Bantu Niger-Congo Languages: an Annotated Bibliography
    Electronic Journal of Africana Bibliography Volume 11 Prosody and Intonation in Non- Bantu Niger-Congo Languages: An Annotated Article 1 Bibliography 2009 Prosody and Intonation in Non-Bantu Niger-Congo Languages: An Annotated Bibliography Christopher R. Green Indiana University Follow this and additional works at: https://ir.uiowa.edu/ejab Part of the African History Commons, and the African Languages and Societies Commons Recommended Citation Green, Christopher R. (2009) "Prosody and Intonation in Non-Bantu Niger-Congo Languages: An Annotated Bibliography," Electronic Journal of Africana Bibliography: Vol. 11 , Article 1. https://doi.org/10.17077/1092-9576.1010 This Article is brought to you for free and open access by Iowa Research Online. It has been accepted for inclusion in Electronic Journal of Africana Bibliography by an authorized administrator of Iowa Research Online. For more information, please contact [email protected]. Volume 11 (2009) Prosody and Intonation in Non-Bantu Niger-Congo Languages: An Annotated Bibliography Christopher R. Green, Indiana University Table of Contents Table of Contents 1 Introduction 2 Atlantic – Ijoid 4 Volta – Congo North 6 Kwa 15 Kru 19 Dogon 20 Benue – Congo Cross River 21 Defoid 23 Edoid 25 Igboid 27 Jukunoid 28 Mande 28 Reference Materials 33 Author Index 40 Prosody and Intonation in Non-Bantu Niger-Congo Languages Introduction Most linguists are well aware of the fact that data pertaining to languages spoken in Africa are often less readily available than information on languages spoken in Europe and some parts of Asia. This simple fact is one of the first and largest challenges facing Africanist linguists in their pursuit of preliminary data and references on which to base their research.
    [Show full text]
  • Segmental and Non-Segmental Phonology of Kūnámá ̄
    Segmental and Non-Segmental Phonology of ūáá̄ Anteneh Getachew Damtew A Dissertation Submitted to the Department of Linguistics and Philology Presented in Fulfillment of the Requirements for the Degree Doctor of Philosophy in Documentary Linguistics and Culture Addis Ababa University Addis Ababa, Ethiopia June 2018 ADDIS ABABA UNIVERSITY SCHOOL OF GRADUATE STUDIES This is to certify that the dissertation prepared by Anteneh Getachew Damtew, entitled: Segmental and Non-Segmental Phonology of ūáá ̄ and submitted in fulfillment of the requirements for the Degree of Doctor of Philosophy (in Documentary Linguistics and Culture) complies with the regulations of the University and meets the accepted standards with respect to originality and quality. Signed by the Examining Committeeː Examinerː____________________Signature____________Date_________ (External) Examinerː____________________Signature____________Date_________ (Internal) Advisor:_____________________Signature____________Date_________ Chair of Department or Graduate Program Coordinator ABSTRACT This dissertation presents the descriptions of the segmental and non-segmental phonology of Kunama, a Nilo-Saharan dialect cluster spoken in Western Eritrea and Northern Ethiopia. It also provides an annotated multimedia corpus of the names and description of the Kunama cultural artifacts. The study uses primary data recorded from speakers of the Kunama Shukre dialect, spoken by an isolated minority group living in Tahtay Addi Yabo Woreda of Northwestern Zone of the Tigray Regional State, Ethiopia.
    [Show full text]
  • Cover Title: Dictionary of Phonetics and Phonology Author
    cover file:///D:/Documents%20and%20Settings/superstar/Mes%20document... Cover title: Dictionary of Phonetics and Phonology author: Trask, R. L. publisher: Taylor & Francis Routledge isbn10 | asin: print isbn13: 9780203696026 ebook isbn13: 9780203695111 language: English subject Phonetics--Dictionaries, Grammar, Comparative and general--Dictionaries.--Phonology , Gramática comparada y general--Fonología--Diccionarios, Phonetics, Phonology. publication date: 1996 lcc: P216.T73 1996eb ddc: 414/.03 subject: Phonetics--Dictionaries, Grammar, Comparative and general--Dictionaries.--Phonology , Gramática comparada y general--Fonología--Diccionarios, Phonetics, Phonology. cover Page i A Dictionary of Phonetics and Phonology Written for students of linguistics, applied linguistics and speech therapy, this dictionary covers over 2,000 terms in phonetics and phonology. In addition to providing a comprehensive yet concise guide to an enormous number of individual terms, it also includes an explanation of the most important theoretical approaches to phonology. Its usefulness as a reference tool is further enhanced by the inclusion of pronunciations, notational devices and symbols, earliest sources of terms, suggestions for further reading, and advice with regard to usage. R.L.Trask is Lecturer in Linguistics in the School of Cognitive and Computing Sciences at the University of Sussex. His previous publications include A Dictionary of Grammatical Terms in Linguistics (1993), Language Change (1994) and Language: The Basics (1995). page_i Page ii This page intentionally left blank. page_ii Page iii A Dictionary of Phonetics and Phonology R.L.Trask London and New York 1 of 246 10/04/2010 11:56 cover file:///D:/Documents%20and%20Settings/superstar/Mes%20document... page_iii Page iv First published 1996 by Routledge 11 New Fetter Lane, London EC4P 4EE This edition published in the Taylor & Francis e-Library, 2005.
    [Show full text]
  • A Grammar of Agolle Kusaal Revised Version
    A Grammar of Agolle Kusaal Revised Version David Eddyshaw i Contents Preface...................................................................................................................... ix Preface to the Revised Version..................................................................................xi Introduction to the Grammar...................................................................................xii Other Studies of Kusaal...........................................................................................xiv Abbreviations.......................................................................................................... xvi Interlinear Glossing................................................................................................xvii Transcription Conventions......................................................................................xix Sources..................................................................................................................... xx References/Bibliography.........................................................................................xxi 1 Introduction to Kusaal and the Kusaasi.....................................................................1 1.1 The Kusaasi People.............................................................................................2 1.2 The Kusaal Language..........................................................................................4 1.2.1 Language Status..........................................................................................4
    [Show full text]
  • A Grammar of Agolle Kusaal Revised Version
    A Grammar of Agolle Kusaal Revised Version David Eddyshaw i Contents Preface....................................................................................................................... x Preface to the Revised Version.................................................................................xii Introduction to the Grammar..................................................................................xiii Other Studies of Kusaal............................................................................................xv Abbreviations......................................................................................................... xvii Interlinear Glossing...............................................................................................xviii Transcription Conventions.......................................................................................xx Sources................................................................................................................... xxii References/Bibliography.......................................................................................xxiii 1 Introduction to Kusaal and the Kusaasi.....................................................................1 1.1 The Kusaasi People.............................................................................................2 1.2 The Kusaal Language..........................................................................................4 1.2.1 Language Status..........................................................................................4
    [Show full text]
  • Open JAS Final.Pdf
    THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF SPANISH, ITALIAN, AND PORTUGUESE AND INTERCOLLEGE PROGRAM THE PHONETIC IMPLEMENTATION OF TONE IN COATZOSPAN MIXTEC JANALYN A. SHEETZ Spring 2010 A thesis submitted in partial fulfillment of the requirements for baccalaureate degrees in Bachelor of Philosophy, Spanish, and International Studies with interdisciplinary honors in Endangered Languages and Spanish. Reviewed and approved* by the following: Chip Gerfen Associate Professor and Head of Spanish, Italian and Portuguese Department Thesis Supervisor Eric White Executive Director for the Division of Undergraduate Studies Honors Adviser John M. Lipski Edwin Erle Sparks Professor of Spanish and Linguistics Honors Adviser * Signatures are on file in the Schreyer Honors College. i ABSTRACT With fewer than 5000 native speakers remaining, the Mexican indigenous language of Coatzospan Mixtec (CM) faces potential extinction in coming generations. Spoken only in the remote northern mountains of the Mexican state of Oaxaca in San Juan Coatzospan, the language has received relatively little linguistic documentation, though some fundamental work has been conducted by Pike and Small (1974) and Gerfen (1999). The primary goal of this project is to continue the process of describing this endangered language in the hope of gaining a more complete knowledge of the linguistic systems of tone used around the world. Specifically, this thesis is concerned with the description of how lexical tone is implemented phonetically in CM both in words produced in isolation as well as in combination with other words. Using voice recordings of a selection of single words and two word phrases, the fundamental frequency of the voice was measured and averaged across repetitions to show how the phonological process of terracing downstep influences the fundamental frequencies of an isolated word.
    [Show full text]
  • Finite State Processing of Tone Systems
    FINITE STATE PROCESSING OF TONE SYSTEMS Dafydd Gibbon (U Bielefeld) ABSTRACT as floating tones and compound tones. Floating tones are not associated with syllables, but are po- It is suggested in this paper that two-level stulated to explain appparent irregularities in pho- morphology theory (Kay, Koskenniemi) can be ex- netic patterning in terms of regular tone sandhi tended to include morphological tone. This exten- properties. sion treats phonological features as I/O tapes for Finite State Transducers in a parallel sequential The tone sandhi rules define how tones affect incrementation (PSI) architecture; phonological their neighbours. The example to be treated here is processes (e.g. assimilation) are seen as variants of a kind of tonal assimilation known as tonal sprea- an elementary unification operation over feature ding in which low tones are phonetically raised tapes (linear unification phonology, LUP). The following a high tone or, more frequently, high phenomena analysed are tone terracing with tones are lowered after a low tone, either to the tone-spreading (horizontal assimilation), down- level of the low tone (total downstep) or to a mid step, upstep, downdrift, upsweep in two West Afri- level (partial downstep). The newly defined tone is can languages, Tem (Togo) and Baule (C6te then the reference point for following tones. d'Ivoire). It is shown that an FST acccount leads to more insightful definitions of the basic pheno- The latter kind of assimilation produces a cha- mena than other approaches (e.g. phonological racteristic perceptual, and experimentally measu- rules or metrical systems). rable, effect known as tone terracing. Tone se- quences are realized at a fairly high level at the beginning of a sequence, and at certain well- 1.
    [Show full text]
  • Appeared in the First Glot International State-Of-The-Article Book: the Latest in Linguistics, Edited by Lisa Lai-Shen Cheng and Rint Sybesma
    1 Appeared in The First Glot International State-of-the-Article Book: The Latest in Linguistics, edited by Lisa Lai-Shen Cheng and Rint Sybesma. Studies in Generative Grammar 48. Berlin: Mouton de Gruyter, pp. 251-286, 2000. Tone: An Overview San Duanmu University of Michigan May 1998 1. Introduction This article surveys developments and achievements in research on tone in the past thirty years, primarily in the generative framework, and outstanding issues that pertain to current research. I begin with a discussion of what tone languages are and point out some similarities between tone and intonation. 1.1. Tone languages and non-tone languages It is not hard to identify some languages that are clearly tone languages. A good example is Standard Chinese (also called Mandarin), which has four tones on full syllables, shown in (1). (1) Contrastive tones in Standard Chinese (Mandarin) [ma1] [ma2] [ma3] [ma4] 'mother' 'hemp' 'horse' 'to scold' The numbers 1-4 refer to the four tones: tone 1 is a high, tone 2 is a rise, tone 3 is a low (or low-rise in final position), and tone 4 is a fall. The four words in (1) are phonologically identical in all respects except tone (I forgo the issue of duration for the moment). In other words, tone (or pitch contour) is lexically contrastive in Standard Chinese (for other functions of tone, see section 3.14). Most tone languages are found in Asia and Africa. In addition, tone had been reported in some European and Native American languages. It is not hard, either, to identify some languages that are non-tone languages.
    [Show full text]
  • On the Representation of Tone in Peñoles Mixtec
    UC Berkeley Phonology Lab Annual Report (2005) Submitted to IJAL On the Representation of Tone in Peñoles Mixtec John P. Daly Summer Institute of Linguistics and Larry M. Hyman University of California, Berkeley ABSTRACT This paper presents a systematic account of the tone system of Peñoles Mixtec (PM). While /H/ and /L/ tones are unambiguously needed in underlying representations, it is argued that the third tone is not /M/, but must rather be underspecified as /Ø/. Perhaps the most interesting of the several arguments presented is that strings of /Ø/ tone-bearing units are invisible to a process which deletes the second /L/ of a /L-Ø*-L/ sequence. We propose that all /L/ tones are underlying floating and that /L/ rather than /H/ is the marked tone in this three-value system. The surface mid and low-falling pitches in outputs are shown to derive by a small number of realizational rules, which also are responsible for producing successively upstepped H tones. The PM tone system is unusually interesting both from a general tonological perspective as well as for its relation to Dürr’s (1987) Proto-Mixtec tones which have the inverted values in PM. KEYWORDS: underspecification, markedness, obligatory contour principle, upstepping, floating tones 1. Introduction The complexity of Mixtec tone systems has been recognized for some time. Following Kenneth Pike’s (1944, 1948) pioneering work on San Miguel El Grande dialect, there has been a steady succession of descriptive studies on other Mixtec languages and dialects. These studies have revealed a wide range of tonal variation which has great significance not only for the understanding of Mixtec, but of tone systems in general.
    [Show full text]
  • Toward an Ecological Phonology
    Poznań Studies in Contemporary Linguistics 45(1), 2009, pp. 73–102 © School of English, Adam Mickiewicz University, Poznań, Poland doi:10.2478/v10010-009-0012-8 FORMAL IS NATURAL: TOWARD AN ECOLOGICAL PHONOLOGY DAFYDD GIBBON Universität Bielefeld [email protected] ABSTRACT Naturalism Phonology (NP) has a history of opposition to abstractness, to generative linguistics, to formalist approaches, and differs from these in its strong focus on external rather than distribu- tional, structural evidential domains. But evidence domains are orthogonal to empirical and for- mal methods, and, like formalist theories such as Optimality Theory (OT), the pedigree of NP in- cludes structuralist and generative phonology. In an analysis which is sympathetic to both NP and OT, this contribution examines the relation between NP and OT, analyses a classic OT case study of syllabification in Tashlhiyt Berber, and presents computational linguistic analyses of this case, as well as of English syllable phonotactics and of tone language tonotactics. The contribution ad- vocates an opening towards these methods, and the adoption of explicit, consistent, precise, complete and sound formal criteria for theories, which enable an exact interpretation in terms of operational models and computational implementations, and practical applications. The general frame of reference is a the Ecological Cycle in theory formation, from clarification of the domain through theory construction, interpretation with a model, evaluation and application in the origi- nal evidential domain, with payback to the language community from which the evidence was gained. KEYWORDS : Natural Phonology; Optimality Theory; Finite State Phonology; syllable; tone. 1. Background and objectives: the natural and the formal 1 1.1.
    [Show full text]
  • A Grammar of Agolle Kusaal Revised Version
    A Grammar of Agolle Kusaal Revised Version David Eddyshaw i Contents Preface...................................................................................................................... ix Preface to the Revised Version..................................................................................xi Abbreviations........................................................................................................... xii Interlinear glossing.................................................................................................xiii Transcription conventions........................................................................................xv Sources.................................................................................................................... xvi Other studies of Kusaal.........................................................................................xviii References/Bibliography.........................................................................................xix 1 Introduction to Kusaal and the Kusaasi.....................................................................1 1.1 The Kusaasi people.............................................................................................2 1.2 The Kusaal language...........................................................................................4 1.2.1 Language status...........................................................................................4 1.2.2 Dialects........................................................................................................4
    [Show full text]
  • Formal Models of Oscillation in Rhythm, Melody and Harmony
    Formal models of oscillation in rhythm, melody and harmony Formalne modele oscylacji w rytmie, melodii i harmonii Dafydd Gibbon Universität Bielefeld, Germany [email protected] ABSTRACT The aims of this contribution are to present (a) an intuitive explicandum of rhythm as iterated alternation, going beyond measurements of regularity; (b) a novel formal explication of rhythm in terms of Jassem’s rhythm model seen from a computational point of view as a Rhythm Oscillator Model, (c) a computational model for two-tone Niger-Congo languages as a Tone Oscillator Model and (d) a pointer to the similarities of structure between these two and to a formal syllable model, suggesting new avenues of research and application. STRESZCZENIE Ce lem ni niej szej pra cy jest przed sta wie nie (a) in tu icyj nej eks pli ka cji ryt mu ja ko po- wta rza ją cych się zmian, wy cho dząc po za po miar re gu lar noś ci; (b) no we go for mal - ne go wy ja śnie nia ryt mu w ka te go riach mo de lu ryt mu Jas se ma, po strze ga ne go z ob - li cze nio we go punk tu wi dze nia ja ko Oscy la cyj ny Mo del Ryt mu, (c) mo de lu ob li cze - nio we go dla dwu to nal nych ję zy ków ni ge ro -kon gij skich (Oscy la cyj ny Mo del To nu), oraz (d) wska za nie po do bieństw struk tu ral nych po mię dzy ty mi dwo ma mo de la mi a for mal nym mo de lem sy la by.
    [Show full text]