
Constructing a Multilingual Phoneme List for Polyglot Speech Synthesiser

Nur-Hana Samsudin, Mark Lee

School of Computer Science University of Birmingham United Kingdom {..samsudin, ..lee}@.bham.ac.uk

Abstract

We describe our approach to constructing a phoneme set for polyglot speech synthesis. In polyglot speech synthesis, resources are shared across languages. The goal of this research is to develop a global phoneme set from existing resources, and MBROLA was selected for this purpose. MBROLA provides 72 diphone databases covering different languages, and each database defines the set of phonemes it uses. We selected 31 language databases out of the 72. Reusing existing resources lets us assemble a global phoneme set faster and with wider language coverage, so the set can be used for languages with limited linguistic expertise or limited linguistic resources. Our approach consists of extracting the phonemes of these languages, clustering them, eliminating and substituting inaccurate phonemes, and finally evaluating the resulting list. At the end of this study, we arrive at one complete list forming a global phoneme set. This list can be used as a substitute for unavailable phonemes in future polyglot TTS systems. It is also suitable as a default phoneme set for new languages whose phoneme set is not yet defined.

Keywords: polyglot TTS, multilingual, speech synthesis, global phoneme set, resource poor languages

1. Introduction

The International Phonetic Alphabet (IPA) is a standard representation of phonemes for all languages. It provides a complete set of phonetic symbols for sounds that are either already attested as phonemes in a language or that can be produced by the human articulatory system. This raises a question: why is there a need to construct a multilingual phoneme list for polyglot speech synthesis?

IPA provides a generic concept and instances of speech sounds. In actual speech, however, not all sounds listed in IPA are used regularly. On this account, it is possible to build a speech synthesis system for a resource poor language whose phoneme set is derived from other languages. And since polyglot TTS facilitates sharable data, including phonetic resources, the concept of a multilingual phoneme set complements the implementation of polyglot TTS.

In one previous study on multilingual phonemes, Altosaar et al. (1996) introduced an intermediary representation able to convert from IPA, SAMPA, X-SAMPA, TIMIT and other notations into their own notation, called Worldbet. The Worldbet list not only covers IPA transcription (Hieronymus, 1993), but also suggests symbols based on a five-year study conducted on a speech database. Knowledge of the sounds of different languages therefore has to be obtained before an algorithm for mapping symbols from one notation to another can be constructed. This differs from our approach, in which the global phoneme set is constructed with limited resources.

The aim of constructing a multilingual phoneme list is not to replace IPA, SAMPA or X-SAMPA. What we propose is a default or initial phoneme set that can be used in a polyglot TTS architecture or in TTS for resource poor languages. The global phoneme set obtained at the end can be used independently.

For the scope of our research, we use SAMPA as the phonetic representation, given that this is what MBROLA uses (MBROLA, 2005). MBROLA is used only as the resource. SAMPA is one of the complete phonetic representations that follows IPA transcription closely. We chose MBROLA because it already has a rich collection of phoneme sets for 31 languages (MBROLA, 2005), with a few variations for some languages. Making use of available resources not only makes the work possible without linguistic expertise but also makes the standardising and identifying process faster.

Standard phonetic notation distinguishes the different phonemes used in pronunciation. In polyglot speech synthesis, the resources must be put together (Romsdorfer, 2007) and selection is done based on phoneme labelling. With a standard notation, one phoneme will not be mistaken for another. With a standard representation, it is also possible to reuse the phonemes of selected languages in other speech applications or phonetics-related work.

This paper is organised as follows. In Section 2, we discuss the process of creating a multilingual phoneme list. We describe the extraction and analysis process and explain some of the issues we encountered during the analysis phase. We then present the outcome of our research in the multilingual phoneme list subsection. In Section 3, we compare the outcome of our research with the phoneme sets used by linguists for the following languages: English (RP), Latin, Italian, German and French. This is followed by discussion and conclusion.

2. Constructing a Multilingual Phoneme List

There are three processes in phoneme list construction: Extraction, Analysis and Evaluation. In Extraction, all phonemes are retrieved from the MBROLA database; for the scope of this research, we collected all phonemes of each language available in the MBROLA database. Analysis has two parts: the clustering process and the elimination/substitution process. In Evaluation, the result of this study is compared with validated phoneme sets of the stated languages.

There are 31 languages listed in MBROLA: Afrikaans, Dutch, English, German, Icelandic, Swedish, French, Italian, Latin, Romanian, Spanish, Croatian, Czech, Lithuanian, Polish, Breton, Farsi, Greek, Hindi, Estonian, Hungarian, , Hebrew, Japanese, Korean, Turkish, Indonesian, Malay, Maori and Telugu. All the phonemes use the SAMPA notation.

2.1. Extraction

From the phonemes extracted for the 31 languages, there are initially 357 unique SAMPA symbols. In this list, the phonemes fall into two classes: basic phonemes, where the phoneme maps directly to a consonant or vowel of the IPA; and derived phonemes, where the phoneme is an entity constructed from a basic phoneme combined with diacritic and/or suprasegmental symbols.

Before going into greater detail, it is necessary to describe the different types of classification in IPA. Symbols in IPA are classified into consonants, vowels, diacritics, suprasegmentals, and tones and word accents. In SAMPA, however, as described by Wells (2003), tones and word accents need to be labelled on a different tier, isolated from the phoneme tier. This issue is beyond the scope of this paper.

According to the IPA chart, therefore, a phoneme belongs to one of the following categories: vowels, pulmonic consonants, non-pulmonic consonants and other symbols. A phoneme can also carry diacritic and/or suprasegmental symbols. Diphthongs are not listed in the chart. This is understandable because a diphthong consists of two consecutive vowels that glide or assimilate into one another in production to become one phoneme.

Contrary to the IPA chart, we classify our phonemes quite differently in our analysis. We treat every vowel and consonant (both pulmonic and non-pulmonic) as a phoneme entity, and we also treat diphthongs as entities. We further have derived phonemes, each a combination of a consonant or a vowel with diacritic and/or suprasegmental values, and we treat the combined attributes as a phoneme entity as well. It is important to retain the derived phonemes because the phoneme produced has a unique sound compared to the basic phoneme.

During extraction, phonemes are retrieved and organised into a table of languages and phoneme lists. We then sorted the phonemes to see whether there was a pattern of distribution. From preliminary observation, we find that phonemes are not influenced by their language family. We also found that the phoneme instances of standard German and Bavarian German do not mirror each other.

2.2. Clustering

After obtaining the raw phoneme lists for the corresponding languages, we clustered the phonemes into groups of phoneme classes. The first level of clusters is vowels, diphthongs and consonants. Each cluster also contains derived phonemes. The variations that occur most frequently are lengthening of vowels, gemination of consonants and aspiration of consonants, as well as palatalised phonemes. We also have clusters of similar sounds. For example, [r], [R] and [4], which in IPA are [r], [ʁ] and [ɾ] respectively, are clustered together. This second level of clustering is based on our own judgement that these phonemes come from a similar sound group, not on the manner or place of articulation. With the second-level clusters, we can determine possible substitute phonemes from the same cluster during synthesis.

2.3. Elimination and Substitution

From the clustered phonemes, we can determine which SAMPA phonemes correspond to standard IPA notation. At this phase, there are two reasons a phoneme may need to be substituted or eliminated: either the phoneme is not a standard SAMPA symbol, or the SAMPA symbol given does not correspond to the sound produced for that phoneme in MBROLA. When a phoneme is not written in standard SAMPA notation, there are again two possible reasons: the symbol was devised to fit all phonemes of the target language and the created symbol clashes with another SAMPA symbol, or the symbol simply does not exist in SAMPA. When the SAMPA symbol does not correspond to the sound played by the MBROLA synthesiser, either an error occurred during the matching between orthographic and phonetic forms, or the developer used a different version of the SAMPA standard.

Based on these criteria, the elimination or substitution process is carried out. Elimination is required when one of the following conditions occurs:
• the phoneme does not match any IPA transcription
• the symbol does not exist in SAMPA notation
• the sound labelled in the MBROLA database cannot be matched with any other unused symbol for that particular language.

Substitution, on the other hand, is the process of changing the symbol declared in MBROLA into one that matches IPA and SAMPA. We provide examples in a later section when we discuss language-specific issues.

It is also important to highlight that we remove semi-diphthongs (also known as mixed diphthongs), in which the vowels /a/, //, /i/ and /u/ are followed by /r/, //, /ļ/ or /m/. We also remove any SAMPA instance that consists of two consecutive phonemes from the original SAMPA list extracted at the initial stage. Both decisions are made because the sounds do not glide into each other and the sound can be produced by putting the two phonemes in sequence.

This step-by-step process lets us standardise the SAMPA notation, which not only makes the mapping of IPA to a computer-readable alphabet easier but also makes grapheme-to-phoneme conversion more accurate.
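
The elimination and substitution rules of Section 2.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the names STANDARD_SAMPA, SUBSTITUTIONS, is_two_phoneme_sequence and clean_phoneme_set are hypothetical, the inventory is a tiny hand-picked subset of SAMPA, and the substitution table simply rewrites a doubled symbol as a lengthened one.

```python
# Hypothetical, tiny subset of standard SAMPA symbols (assumption for illustration).
STANDARD_SAMPA = {"a", "a:", "e", "i", "u", "k", "k:", "l", "m", "n", "r"}

# Hypothetical substitution table: database symbol -> standard SAMPA symbol.
SUBSTITUTIONS = {"kk": "k:", "aa": "a:"}


def is_two_phoneme_sequence(symbol):
    """Return True if the symbol splits into two symbols of the inventory."""
    return any(
        symbol[:i] in STANDARD_SAMPA and symbol[i:] in STANDARD_SAMPA
        for i in range(1, len(symbol))
    )


def clean_phoneme_set(raw_symbols):
    """Substitute where a mapping is known, otherwise keep or eliminate.

    Returns the set of kept symbols and a dict giving, for every
    eliminated raw symbol, the reason it was dropped.
    """
    kept = set()
    eliminated = {}
    for raw in raw_symbols:
        symbol = SUBSTITUTIONS.get(raw, raw)
        if symbol in STANDARD_SAMPA:
            kept.add(symbol)
        elif is_two_phoneme_sequence(symbol):
            # The sound can be produced by putting two phonemes in sequence.
            eliminated[raw] = "sequence of two phonemes"
        else:
            eliminated[raw] = "not a standard SAMPA symbol"
    return kept, eliminated


if __name__ == "__main__":
    kept, eliminated = clean_phoneme_set(["a", "kk", "ar", "aA", "k:"])
    print(sorted(kept))
    print(eliminated)
```

In this sketch substitution is tried before elimination, so a doubled symbol with a known replacement is rewritten rather than dropped, while a semi-diphthong such as "ar" is eliminated because its two parts are already in the inventory.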

2.4. Issues during Extraction, Clustering, Elimination and Substitution

In the process of obtaining the phoneme set, we encountered some issues regarding MBROLA phonemes. We discuss these issues using a few individual languages.

Each database in MBROLA is required to follow SAMPA notation. However, due to the MBROLA synthesiser architecture and the different nature of the languages, some developers find the need to modify or simplify their phonetic representation. The MBROLA speech engine is based on concatenation of diphones. MBROLA has a prosody modification mechanism that allows the pitch to vary throughout the time frame of each phoneme.

For example, in the tested sample for Malay:

    ;ujian (translated as: test)
    U 120 0 208 90 192
    dZ 120 0 226
    I 80 90 218
     80 0 198
    n 120 0 165 90 167

Each phoneme line gives the duration, followed by pairs of values: a position expressed as a percentage of the duration and the pitch at that position. For the phoneme /U/, the values are 120 0 208 90 192. The duration of the /U/ sound is 120 ms and, over that duration, the fundamental frequency varies as follows: at 0% of the total time (the initial state) the pitch is 208 Hz, and it is interpolated to 192 Hz at 90% of the 120 ms. This example is presented to show the reason behind some of the issues raised in the following case studies.

We highlight four languages for analysis: English, Italian, Lithuanian and Estonian. Each presents different issues for the processes described above.

2.4.1. Case Study on English

Since MBROLA allows manipulation of duration, English uses a very limited set of phonemes. There are 41 phonemes listed in the English phoneme set in MBROLA. For English vowels, each phoneme is listed either with or without lengthening; the list never contains both a basic phoneme and its derived form. The same holds for diphthongs. In a standard English dictionary, lengthening is written in the transcription of each word to make clear how the word is to be pronounced and accented. In MBROLA, however, these instances are ignored because the MBROLA synthesiser can stretch or shorten the duration.

During analysis, none of the English phonemes was removed, which shows that the phoneme set follows the standard SAMPA notation. As for lengthening variation, we had to add the unlisted phonemes to our set of multilingual phonemes.

2.4.2. Case Study on Italian

There are 40 phonemes listed in the Italian database. Contrary to English, Italian has many lateral releases as derived phonemes. But as with English, no phoneme was removed during our analysis.

2.4.3. Case Study on Lithuanian

Lithuanian originally has 90 phonemes in the list. However, the Lithuanian phonemes in MBROLA do not follow standard SAMPA notation. For vowels with different stress levels, the Lithuanian database uses capital letters for stressed and lower-case letters for unstressed phonemes, whereas in the SAMPA standard a capital letter refers to a different IPA phoneme from its lower-case counterpart. Lithuanian also represents tone contour in phonemes. For instance, for a long vowel /a/, the Lithuanian database has the phoneme /aA/ for rising tone and /Aa/ for falling tone. In SAMPA, however, only the lengthening of the vowel needs to be represented; changes of tone are labelled on a different tier. We therefore changed the short and long vowels to the corresponding SAMPA symbols. However, because we are not sure which phonemes are actually used in Lithuanian, some phonemes are left in their upper-case and lower-case forms. We also needed to check for correct phonemes, since the labelling of phonemes is already inconsistent with the SAMPA standard. After analysis, Lithuanian has only 57 phonemes left.

2.4.4. Case Study on Estonian

Originally there are 77 phonemes for Estonian. However, as with Lithuanian, the developers of the Estonian database found it necessary to use a notation independent of standard SAMPA. This is because Estonian has three types of long phoneme duration, which they call phonologically distinctive foot patterns (MBROLA, 2005). Estonian also uses doubled phonemes for long vowels and geminate consonants, which is not standard SAMPA notation. In addition, for those long vowels and repeated consonants there are variations in the degree of lengthening, which makes it even harder to provide phoneme substitutions where applicable. For example, /:f/, /ff/, /h:/, /hh/, /jj/, /kk/ and /:k/ all denote geminated consonants, but because of the variation in Estonian speech patterns, the developers found a need to describe them separately in the database. After standardising the symbols according to SAMPA, we replaced repeated phonemes with the length symbol /:/; in other cases, we eliminated them without substitution, because the phoneme can be constructed by putting two phonemes in sequence. In the end, only 31 phonemes are listed for the Estonian database. As described in the Discussion & Conclusion section, we changed this notation by adding a colon [:] to individual basic phonemes.

Based on the analysis of these four databases, we have shown that different databases require different analysis.

2.5. Multilingual Phoneme List for Polyglot Speech Synthesis

This list is written in SAMPA. We also provide the corresponding IPA symbol for each phoneme. However, some phonemes cannot be mapped to IPA: they are declared as phonemes in particular languages, yet cannot be removed under the rules listed in the analysis section. The mapping is mainly based on Wells (2003). SAMPA notation is written in square brackets ([]) and IPA in slashes (//).
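
The pairing convention described here, SAMPA in square brackets followed by IPA in slashes, with an empty IPA field when no mapping is available, can be illustrated with a short Python sketch. The names BASIC_SAMPA_TO_IPA, sampa_to_ipa, format_entry and format_list are hypothetical, the table holds only a handful of well-known SAMPA symbols, and treating a trailing colon as the IPA length mark is an assumption made for the illustration rather than the authors' documented procedure.

```python
# A few well-known basic SAMPA symbols and their IPA equivalents
# (hand-picked subset for illustration only).
BASIC_SAMPA_TO_IPA = {
    "@": "ə", "{": "æ", "?": "ʔ", "4": "ɾ",
    "D": "ð", "N": "ŋ", "S": "ʃ", "T": "θ",
    "a": "a", "p": "p",
}


def sampa_to_ipa(symbol):
    """Map a SAMPA symbol to IPA, or return '' when no mapping is known."""
    if symbol in BASIC_SAMPA_TO_IPA:
        return BASIC_SAMPA_TO_IPA[symbol]
    # Derived phoneme: a basic symbol followed by the SAMPA length mark.
    if symbol.endswith(":") and symbol[:-1] in BASIC_SAMPA_TO_IPA:
        return BASIC_SAMPA_TO_IPA[symbol[:-1]] + "ː"
    return ""


def format_entry(symbol):
    """Render one list entry as '[SAMPA] /IPA/'."""
    return f"[{symbol}] /{sampa_to_ipa(symbol)}/"


def format_list(symbols):
    """Render several entries separated by commas."""
    return ", ".join(format_entry(s) for s in symbols)


if __name__ == "__main__":
    print(format_list(["@", "a:", "a.", "S", "T"]))
```

A symbol with no known equivalent, such as "a." in this sketch, is rendered with an empty IPA field instead of a guessed value.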

The representation does not distinguish slashes (//) as marking phonemes from square brackets ([]) as marking phones; the two delimiters are used only to improve readability. Where no corresponding IPA symbol is available, the IPA field is left blank.

2.5.1. Vowels

[@] /ə/, [@'] /ə/, [{] /æ/, [{:] /æː/, [}:] /ʉː/, [1] /ɨ/, [2] /ø/, [2:] /øː/, [2~] /ø̃/, [2~:] /ø̃ː/, [3:] /ɜː/, [6] /ɐ/, [6~] /ɐ̃/, [7] /ɤ/, [7:] /ɤː/, [9] /œ/, [9:] /œː/, [9~] /œ̃/, [9~:] /œ̃ː/, [a] /a/, [A] /ɑ/, [a.] //, [a:] /aː/, [A:] /ɑː/, [A^] /ɑ/, [a_1] /a/, [a~] /ã/, [A~] /ɑ̃/, [A~:] /ɑ̃ː/, [e] /e/, [E] /ɛ/, [e:] /eː/, [E:] /ɛː/, [E:~] /ɛ̃ː/, [e_1] /e/, [E_1] /ɛ/, [e~] /ẽ/, [E~] /ɛ̃/, [e~:] /ẽː/, [i] /i/, [I] /ɪ/, [i.] //, [i:] /iː/, [I:] /ɪː/, [i_1] /i/, [i~] /ĩ/, [i~:] /ĩː/, [o] /o/, [O] /ɔ/, [o:] /oː/, [O:] /ɔː/, [O_1] /ɔ/, [o~] /õ/, [o~:] /õː/, [Q] /ɒ/, [u] /u/, [U] /ʊ/, [u.] //, [u:] /uː/, [u_0] /u̥/, [u_1] /u/, [u~] /ũ/, [V] /ʌ/, [y] /y/, [Y] /ʏ/, [y:] /yː/, [Y:] /ʏː/, [y~] /ỹ/ and [y~:] /ỹː/.

2.5.2. Diphthongs

[@I] /əɪ/, [@U] /əʊ/, [2~] /ø̃/, [2j] /ø/, [6~j~] /ɐ̃/, [9~j] /œ̃/, [9j] /œ/, [9u] /œu/, [9y] /œy/, [A~w] /ɑ̃w/, [aE] /aɛ/, [ai] /ai/, [aI] /aɪ/, [Ai] /ɑi/, [ai:] /aiː/, [Aj] /ɑj/, [au] /au/, [aU] /aʊ/, [Au] /ɑu/, [au:] /auː/, [Aw] /ɑw/, [ay] /ay/, [e@] /eə/, [e~w] /ẽw/, [ea] /ea/, [eA] /eɑ/, [Ea] /ɛa/, [ei] /ei/, [eI] /eɪ/, [Ei] /ɛi/, [Ej] /ɛj/, [el] /e/, [ew] /ew/, [Ew] /ɛw/, [ey] /ey/, [I@] /ɪə/, [ie] /ie/, [iE] /iɛ/, [Ie] /ɪe/, [iw] /iw/, [o_X] /ŏ/, [OE] /ɔɛ/, [oi] /oi/, [Oi] /ɔi/, [OI] /ɔɪ/, [oj] /oj/, [ou] /ou/, [ou:] /ouː/, [ow] /ow/, [oy] /oy/, [OY] /ɔʏ/, [U@] /ʊə/, [ui] /ui/, [uo] /uo/, [uO] /uɔ/, [Uo] /ʊo/ and [Uy] /ʊy/.

2.5.3. Consonants

[?] /ʔ/, [4] /ɾ/, [b] /b/, [B] /β/, [b'] /b/, [b:] /bː/, [b_h] /bʰ/, [c] /c/, [C] /ç/, [c:] /cː/, [c_h] /cʰ/, [d] /d/, [D] /ð/, [d'] /ð/, [d.] //, [d:] /dː/, [d_h] /dʰ/, [D_h] /ðʰ/, [Dz] /ðz/, [dZ] /ʤ/, [dz'] /dz/, [dZ'] /ʤ/, [dz\] /dʑ/, [f] /f/, [F] /ɱ/, [f'] /f/, [f:] /fː/, [g] /g/, [G] /ɣ/, [g'] /g/, [g:] /gː/, [h] /h/, [H] /ɥ/, [h'] /h/, [h:] /hː/, [h\] /ɧ/, [j] /j/, [J] /ɲ/, [j'] /j/, [J-] /ɲ/, [k_>] /k’/, [k_h] /kʰ/, [k_h:] /kʰː/, [l] /l/, [L] /ʎ/, [l'] /l/, [L'] /ʎ/, [l:] /lː/, [l_}] /l̚/, [l_0] /l̥/, [m] /m/, [M] /ɯ/, [m'] /m/, [M'] /ɯ/, [m:] /mː/, [M\] /ɰ/, [m_}] /m̚/, [m_0] /m̥/, [n] /n/, [N] /ŋ/, [n'] /n/, [N'] /ŋ/, [n:] /nː/, [n':] /nː/, [n_}] /n̚/, [n_0] /n̥/, [N_0] /ŋ̥/, [N_k] /ŋH/, [n~] /ñ/, [N~] /ŋ̃/, [ny] /ny/, [nY] /ɲ/, [p] /p/, [P] /ʋ/, [p'] /p/, [p:] /pː/, [p_}] /p̚/, [p_>] /p’/, [p_h] /pʰ/, [p_h:] /pʰː/, [q] /q/, [r] /r/, [R] /ʁ/, [r'] /r/, [R'] /ʁ/, [r:] /rː/, [r_0] /r̥/, [s] /s/, [S] /ʃ/, [s'] /s/, [S'] /ʃ/, [s.] //, [s:] /sː/, [S:] /ʃː/, [s\] /ɕ/, [s_>] /s’/, [t] /t/, [T] /θ/, [t'] /t/, [t.] //, [t:] /tː/, [t_}] /t̚/, [t_>] /t’/, [t_h] /tʰ/, [t_h:] /tʰː/, [T_h] /θʰ/, [ts] /ts/, [tS] /ʧ/, [Ts] /θs/, [ts'] /ts/, [tS'] /tʃ/, [ts\] /tɕ/, [ts\_>] /ts’/, [ts\_h] /ts’/, [v] /v/, [v'] /v/, [v:] /vː/, [w] /w/, [W] /ʍ/, [x] /x/, [X] /χ/, [x'] /x/, [z] /z/, [Z] /ʒ/, [z'] /z/, [Z'] /ʒ/, [z.] // and [z:] /zː/.

2.6. Evaluation

We obtained phoneme lists for five languages from a few linguistics resources on the net, which we refer to as the control list. We then compared the control list with our global phoneme list and looked for phonemes that may not be included in our list. The phoneme sets used as the control set are Latin, German, French, Italian and English (RP).

There are 45 phonemes for English (RP), 49 for German, 33 for French, 30 for Latin and 36 for Italian. The IPA lists for these languages are as follows:

English RP
/p/, //, /m/, /f/, /v/, /ʍ/, /w/, /θ/, /ð/, /t/, //, /n/, /s/, /z/, /Z/, /r/, /l/, /ʃ/, /ʒ/, /j/, /k/, /g/, /ŋ/, /h/, /æ/, /ɑ/, //, /e/, /ə/, //, /i/, //, /u:/, /ʊ/, /ʌ/, /e/, /a/, //, /əʊ/, /aʊ/, /ə/, /eə/ and /ʊə/

German
/a/, //, /a/, /ae/, /ao/, /b/, /ç/, //, /ʏ/, /d/, /e/, //, /ʃ/, /ə/, /e/, //, /e/, //, /f/, /ɡ/, /h/, /i/, //, /i/, /j/, /k/, /l/, /m/, /n/, /ŋ/, /o/, /ø/, /o/, /ø/, /œ/, /p/, /r/, /ɾ/, /s/, /t/, /ts/, /u/, /ʊ/, /u/, /x/, /ʏ/, /y/, /z/ and /ʒ/

French
/i/, /e/, //, /a/, /ɑ/, //, /o/, /u/, /ø/, /œ/, //, /ɑ̃/, /õ/, /œ̃/, /m/, /n/, /j/, /w/, /ɥ/, /p/, /t/, /k/, /b/, /d/, /g/, /f/, /s/, /ʃ/, /v/, /z/, /ʒ/, /l/ and /ɾ/

Latin
/a/, /ɑi/, /ɑu/, /b/, //, /d/, /dʒ/, //, /ʃ/, /u/, /f/, /ɡ/, /i/, /j/, /k/, /l/, /m/, /n/, /ɲ/, /ŋ/, /p/, /r/, /ɾ/, /s/, /t/, /tʃ/, /ts/, /u/, /w/

Italian
/a/, /ai/, /au/, /b/, //, /d/, /dz/, /dʒ/, /e/, //, /ʃ/, /i/, /u/, /f/, /ɡ/, /i/, /j/, /k/, /l/, /m/, /n/, /ɲ/, /ŋ/, /o/, /p/, /r/, /ɾ/, /s/, /t/, /tʃ/, /ts/, /u/, /v/, /w/, /ʎ/ and /z/

For each control phoneme, the instance is compared with the whole list of phonemes that we constructed, so the comparison is not restricted to the phonemes of that language alone. This procedure ensures that the multilingual phoneme list can represent the most frequently used sounds in the language sample.

3. Discussion & Conclusion

Five languages have been compared with our list of multilingual phonemes: English (RP), Latin, Italian, German and French, which we refer to as the control phoneme lists. The phoneme sets are obtained from The International Phonetic Alphabet for the RP Accent (Tovey, 2009) and IPA Source (Suverkrop, 2009). During evaluation, we compared the control phonemes of each language with our multilingual phoneme list. The result is not a 100% match. For English (RP), only two phonemes do not match our list: /ɹ/ and /ɜ/. For the other languages, some of the unmatched phoneme instances are ones we had decided to class as 'noise' phonemes during our phoneme set construction. We justify this as follows. In the MBROLA list, we found phonemes that are a combination of two consonants presented as a single phoneme unit. We decided to remove them since the transcription can be constructed from two separate phonemes that are already in the list. Also, in MBROLA, the lengthening of vowels and gemination of consonants are represented using the symbol [:]. For languages that use repetition to represent gemination, we substituted the repetition with a colon [:], and we applied the same concept during phoneme set construction. We also ignore repeated phonemes that are declared in MBROLA with a break between them, for the same reason as in the first justification.

If we take these justifications into account and adapt the phonemes in the control list to the ones we have constructed, we find six non-matching phonemes for French, another six for Italian, three for Latin and ten for German. The high number of non-matching phonemes for German arises because German uses many palatalised and nasalised sounds, with considerable variation within its diphthongs, for which the MBROLA database simply needs to modify the value during synthesis. We find that the consonants and vowels of German match well with our list.

We have described a method of extracting phoneme sets from an existing speech resource. The evaluation shows that we have constructed a reasonable list of multilingual phonemes, but it will still require some refinement if the resources are available. This is possible for most Indo-European languages, where much research has previously been conducted on the corresponding phonetics. But for resource poor languages, having more than half of the phonemes available is better than borrowing the whole phoneme set from another language.

References

Altosaar, T., Karjalainen, M. and Vainio, M. (1996). A Multilingual Phonetic Representation and Analysis System For Different Speech Databases. In: Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP1996). Philadelphia, USA.

Hieronymus, J. (1993). ASCII Phonetic Symbols for the World's Languages: Worldbet. In: Journal of AT&T Bell Laboratories. Technical Memo, Vol. 23.

MBROLA Group (2005). The MBROLA PROJECTS – Towards a Freely Available Multilingual Speech Synthesizer. Théorie des Circuits et Traitement du Signal (TCTS). Mons Belgium. Retrieved from: http://tcts.fpms.ac.be/synthesis/mbrola.html. Accessed date: August 17, 2009.

Romsdorfer, H. and Pfister, B. (2007). Text Analysis and Language Identification for Polyglot Text-to-Speech Synthesis. In Speech Communication, 49(2), pp. 697-724.

Suverkrop, B. (2009). IPA Source, IPA Transcription and Literal Translation of Songs and Arias. Retrieved from: http://www.ipasource.com/diction_help. Accessed date: August 17, 2009.

Tovey, B. (2009). The IPA for RP Accent, English Language and Literature. Retrieved from: http://www.bethtovey.com/language/ipa.html. Accessed date: August 17, 2009.

Wells, J.C. (2003). Computer-coding the IPA: a proposed extension of SAMPA. University College London. Retrieved from: http://www.phon.ucl.ac.uk/home/sampa/ipasam-x.pdf. Accessed date: August 17, 2009.