Constructing a Multilingual Phoneme List for Polyglot Speech Synthesiser


Nur-Hana Samsudin, Mark Lee
School of Computer Science, University of Birmingham, United Kingdom
{n.h.samsudin, m.g.lee}@cs.bham.ac.uk

Abstract

We describe our approach to constructing a phoneme set for polyglot speech synthesis. In polyglot speech synthesis, resources are shared across languages. The goal of this research is to develop a global phoneme set using existing resources, and we selected MBROLA for this purpose. MBROLA provides 72 diphone databases for different languages, each with its own set of phonemes. We selected 31 language databases out of these 72. By reusing existing resources, we can assemble a global phoneme set faster and with wider language coverage, so that it can be used for languages with limited linguistic expertise or limited linguistic resources. Our approach consists of extracting the phonemes of these languages, clustering them, eliminating and substituting inaccurate phonemes, and finally evaluating the resulting list. The outcome of this study is one complete list forming a global phoneme set. This list can be used as a substitute for unavailable phonemes in future polyglot TTS systems. It is also suitable as a default phoneme set for new languages whose phoneme set is not yet defined.

Keywords: polyglot TTS, multilingual, speech synthesis, global phoneme set, resource poor languages

1. Introduction

The International Phonetic Alphabet (IPA) is a standard representation of phonetics for all languages. It provides symbols to represent the sounds of phonemes that are already listed in a language or that can be produced by the human articulatory system. The question therefore needs to be addressed: why is there a need to construct a multilingual phoneme list for polyglot speech synthesis?

IPA provides a generic concept and instances of speech sounds, but not all sounds listed in IPA are used regularly in speech. On this account, it is possible to build a speech synthesis system for resource poor languages in which the phoneme set is obtained from other languages. And since polyglot TTS facilitates sharable data, including phonetic resources, the concept of a multilingual phoneme set complements the implementation of polyglot TTS.

In one previous study on multilingual phonemes by Altosaar et al. (1996), an intermediary representation was introduced. The representation is able to convert from IPA, SAMPA, X-SAMPA, TIMIT and other notations into their notation, called Worldbet. The Worldbet list not only covers IPA transcription (Hieronymus, 1993), but also suggests symbols based on a five-year study conducted on a speech database. Knowledge of the sounds of different languages therefore needs to be obtained before an algorithm for mapping symbols from one notation to another can be constructed. This differs from our approach, where the global phoneme set is constructed with limited resources.

The aim of constructing a multilingual phoneme list is not to replace IPA, SAMPA or X-SAMPA. What we propose is a default or initial phoneme set which can be used in a polyglot TTS architecture or in TTS for resource poor languages. The global phoneme set obtained at the end can be used independently.

For the scope of our research, we use SAMPA as the phonetic representation, given that this is what MBROLA uses (MBROLA, 2005). MBROLA is used only as the resource. SAMPA is one of the complete phonetic representations which follows IPA transcription closely. We chose MBROLA because it already has a rich collection of phoneme sets for 31 languages (MBROLA, 2005), with a few variations for some languages. Making use of available resources not only makes the work possible without linguistic expertise but also makes the standardising and identifying process faster.

Standard phonetic notation distinguishes the different phonemes used in pronunciation. In polyglot speech synthesis, the resources must be put together (Romsdorfer, 2007) and selection is done based on phoneme labelling. With a standard notation, one phoneme will not be mistaken for another. With a standard representation, it is also possible to reuse the phonemes of selected languages in other speech applications or phonetics-related work.

This paper is organised as follows. In Section 2, we discuss the process of creating a multilingual phoneme list. We describe the extraction and analysis process and explain some of the issues we encountered during the analysis phase. We then present the outcome of our research in the multilingual phoneme list subsection. In Section 3, we compare the outcome of our research with the phoneme sets used by linguists for the following languages: English (RP), Latin, Italian, German and French. This is followed by the discussion and conclusion.

2. Constructing a Multilingual Phoneme List

There are three processes in phoneme list construction: Extraction, Analysis and Evaluation. In Extraction, all phonemes from the MBROLA database are retrieved. For the scope of this research, we collected all phonemes of each language available in the MBROLA database. Analysis has two parts: the clustering process and the elimination/substitution process. In Evaluation, the result of this study is compared with the validated phoneme sets of the stated languages.

There are 31 languages listed in MBROLA: Afrikaans, Dutch, English, German, Icelandic, Swedish, French, Italian, Latin, Romanian, Spanish, Croatian, Czech, Lithuanian, Polish, Breton, Farsi, Greek, Hindi, Estonian, Hungarian, Arabic, Hebrew, Japanese, Korean, Turkish, Indonesian, Malay, Maori and Telugu. All the phonemes use the SAMPA notation.

2.1. Extraction

The phonemes extracted from the 31 languages initially yield 357 unique SAMPA symbols. In this list, the phonemes can be classified into two kinds: basic phonemes, which map directly to the consonants and vowels of the IPA; and derived phonemes, which are entities constructed from the combination of a basic phoneme with diacritic and/or suprasegmental symbols.

Before we go into greater detail, it is necessary to describe the different types of classification in IPA. Symbols in IPA are classified into consonants, vowels, diacritics, suprasegmentals, and tones and word accents. However, in SAMPA, as described by Wells (2003), tones and word accents need to be labelled in a different tier (isolated from the phoneme tier). This issue is beyond the scope of this paper.

According to the IPA chart, therefore, a phoneme belongs to one of the following categories: vowels, pulmonic consonants, non-pulmonic consonants and other symbols. The phoneme may also carry diacritic and/or suprasegmental symbols. Diphthongs are not listed in the chart. This is understandable because a diphthong consists of two consecutive vowels that glide or assimilate with one another in production to become a phoneme.

Contrary to the IPA chart, we classify our phonemes quite differently in our analysis. We treat every vowel and consonant (both pulmonic and non-pulmonic) as an entity of our phonemes. However, we also treat diphthongs as entities. We also have instances of derived phonemes, which combine a consonant or a vowel with diacritic and/or suprasegmental values; we treat the combined attributes as a phoneme entity as well. It is important to retain the derived phonemes because the phone produced has a unique sound as compared to the [...]

[...] is vowels, diphthongs and consonants. For each cluster, there are also derived phonemes. Some variations which occur most frequently are the lengthening of vowels, gemination of consonants and aspiration of consonants, as well as palatalised phonemes. We also have clusters of similar sounds. For example, [r], [R] and [4] (in IPA: [r], [ʁ] and [ɾ] respectively) are clustered together. This second level of clustering is based on our own judgement that these phonemes come from a similar sound group, not on the manner or the place of articulation. With the second-level cluster we can determine a possible substitute phoneme from a similar cluster during the synthesis process.

2.3. Elimination and Substitution

From the clustered phonemes, we can determine which SAMPA phonemes correspond to the standard IPA notation. At this phase, there are two reasons a phoneme may need to be substituted or eliminated: either the phoneme is not a standard SAMPA symbol, or the SAMPA symbol given does not correspond to the sound produced by the phoneme in MBROLA. When the phoneme is not written in standard SAMPA notation, there are two possible causes: the symbol was devised to fit all phonemes of the target language and the created symbol clashes with another SAMPA symbol, or the symbol simply does not exist in SAMPA. When the SAMPA symbol does not correspond to the sound played by the MBROLA synthesiser, either an error occurred during the matching between the orthographic and phonetic forms, or the developer used a different version of the SAMPA standard.

Based on these criteria, the elimination or substitution process is carried out. Elimination is required when one of the following conditions occurs:
• the phoneme does not match any IPA transcription
• the symbol does not exist in SAMPA notation
• the sound labelled in the MBROLA database cannot be matched with any other unused symbol for that particular language.

Substitution, on the other hand, is the process of changing the symbol declared in MBROLA into one that matches IPA and SAMPA. We will provide examples in a later section when we discuss language-specific issues.

It is also important to highlight that we remove semi-diphthongs (also known as mixed diphthongs), in which the vowels /a/, /e/, /i/ and /u/ are followed by /r/, /l/, /ļ/ or /m/.
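
The extraction, elimination/substitution and cluster-based fallback steps described in Section 2 can be illustrated with a minimal Python sketch. This is not the authors' implementation. The per-language inventories, the list of valid symbols, the substitution table and all function names are hypothetical stand-ins chosen for illustration; only the [r], [R], [4] cluster is taken from the text.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy inventories; real MBROLA databases list far more symbols.
INVENTORIES: Dict[str, Set[str]] = {
    "lang_a": {"a", "a:", "r", "t"},
    "lang_b": {"a", "R", "t", "rr"},     # "rr": assumed non-standard label
    "lang_c": {"e", "4", "t_h", "%x"},   # "%x": assumed label with no match
}

# Toy stand-in for the set of symbols accepted as standard SAMPA.
VALID_SAMPA: Set[str] = {"a", "a:", "e", "r", "R", "4", "t", "t_h"}

# Hypothetical substitution table: database label -> standard symbol.
SUBSTITUTIONS: Dict[str, str] = {"rr": "r"}

# Second-level cluster of similar sounds mentioned in the text.
CLUSTERS: List[Set[str]] = [{"r", "R", "4"}]


def extract_unique(inventories: Dict[str, Set[str]]) -> Set[str]:
    """Extraction: union of all symbols found across the languages."""
    return set().union(*inventories.values())


def clean(symbols: Set[str], valid: Set[str],
          substitutions: Dict[str, str]) -> Set[str]:
    """Substitute known non-standard labels, then eliminate any symbol
    that is still not in the valid set."""
    kept = set()
    for symbol in symbols:
        symbol = substitutions.get(symbol, symbol)
        if symbol in valid:
            kept.add(symbol)
    return kept


def fallback(symbol: str, available: Set[str],
             clusters: List[Set[str]]) -> Optional[str]:
    """Return the symbol itself if available, otherwise a member of the
    same cluster that is available, otherwise None."""
    if symbol in available:
        return symbol
    for cluster in clusters:
        if symbol in cluster:
            candidates = sorted(cluster & available)
            if candidates:
                return candidates[0]
    return None


if __name__ == "__main__":
    unique = extract_unique(INVENTORIES)
    global_set = clean(unique, VALID_SAMPA, SUBSTITUTIONS)
    print(len(unique), len(global_set))
    print(fallback("R", {"a", "r", "t"}, CLUSTERS))
```

In this sketch the cluster lookup plays the role of the second-level clustering: when a target language lacks a phoneme, a member of the same cluster is offered as the substitute. Picking the first candidate in sorted order is an arbitrary choice made only to keep the example deterministic.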