
Development of Natural Language Processing Tools for Cook Islands Māori

Rolando Coto-Solano¹, Sally Akevai Nicholas², and Samantha Wray³

¹ School of Linguistics and Applied Language Studies, Victoria University of Wellington, Te Whare Wānanga o te Ūpoko o te Ika a Māui, [email protected]
² School of Language and Culture, Auckland University of Technology, [email protected]
³ Neuroscience of Language Lab, New York University Abu Dhabi, United Arab Emirates, [email protected]

Abstract

This paper presents three ongoing projects for NLP in Cook Islands Māori: Untrained Forced Alignment (approx. 9% error when detecting the center of words), automatic speech recognition (37% WER in the best trained models) and automatic part-of-speech tagging (92% accuracy for the best performing model). These new resources fill existing gaps in NLP for the language, including gold standard POS-tagged written corpora, transcribed speech corpora, and time-aligned corpora down to the phoneme level. These are part of efforts to accelerate the documentation of Cook Islands Māori and to increase its vitality amongst its users.

Rolando Coto Solano, Sally Akevai Nicholas and Samantha Wray. 2018. Development of Natural Language Processing Tools for Cook Islands Māori. In Proceedings of Australasian Language Technology Association Workshop, pages 26−33.

1 Introduction

Cook Islands Māori has been moderately well documented with two dictionaries (Buse et al., 1996; Savage, 1962), a comprehensive description (Nicholas, 2017), and a corpus of audiovisual materials (Nicholas, 2012). However, these materials are not yet sufficiently organized or annotated to be machine readable and thus maximally useful for both scholarship and revitalization projects. The human resources needed to achieve the desired level of annotation are not available, which has encouraged us to take advantage of NLP methods to accelerate documentation and research.

1.1 Minority Languages and NLP

Lack of resources makes it difficult to train data-driven NLP tools for smaller languages. This is compounded by the difficulty in generating input for Indigenous and endangered languages, where dwindling numbers of speakers, non-standardized writing systems and lack of resources to train specialist transcribers and analysts create a vicious cycle that makes it even more difficult to take advantage of NLP solutions. Amongst the hundreds of languages of the Americas, for example, very few have large spoken and written corpora (e.g. Zapotec from Mexico, Guaraní from Paraguay and Quechua from Bolivia and Perú), some have spoken and written corpora, and only a handful possess more sophisticated tools like spell-checkers and machine translation (Mager et al., 2018). Overcoming these limitations is an important part of accelerating language documentation. Creating NLP resources also enhances the profile of endangered languages, creating a symbolic impact that generates positive associations towards the language and attracts new learners, particularly young members of the community who might not otherwise see their language in a digital environment (Aguilar Gil, 2014; Kornai, 2013).

As for Polynesian languages, te reo Māori, the Indigenous language spoken in Aotearoa New Zealand, is the one that has received the most attention from the NLP community. It has multiple corpora and Google provides machine translations for it as part of Google Translate, and tools such as speech-to-text, text-to-speech and parsing are under development (Bagnall et al., 2017). Other languages in the family have also received some attention. For example, Johnson et al. (2018) have worked on forced alignment for Tongan.

1.2 Cook Islands Māori

Southern Cook Islands Māori¹ (ISO 639-3 rar, or Glottolog raro1241) is an endangered East Polynesian language indigenous to the realm of New Zealand. It originates from the southern Cook Islands (see figure 1). Today, however, most of its speakers reside in diaspora populations in Aotearoa New Zealand and Australia (Nicholas, 2018). The languages most closely related to Cook Islands Māori are the languages of Rakahanga/Manihiki (rkh, raka1237) and Penrhyn (pnh, penr1237), which originate from the northern Cook Islands. Te reo Māori (mri, maor1246) and Tahitian (tah, tahi1242), both East Polynesian languages, are also closely related. There is some degree of mutual intelligibility between these languages but they are generally considered to be separate languages by linguists and community members alike.

Figure 1: Cook Islands (CartoGIS Services et al., 2017)

Cook Islands Māori is severely endangered (Nicholas, 2018, 46). Overall its vitality is between a 7 (shifting) and an 8a (moribund) on the Expanded Graded Intergenerational Disruption Scale (Lewis and Simons, 2010, 2). Among the diaspora population and that on the island of Rarotonga there has been a shift to English and there is very little intergenerational transmission of Cook Islands Māori. The only contexts where vitality is strong are the small populations of the remaining islands of the southern Cook Islands, where Cook Islands Māori still serves as the lingua franca (see Nicholas (2018) for a full discussion).

¹ Southern Cook Islands Māori (henceforth Cook Islands Māori) has historically been called Rarotongan by non-Indigenous scholars. However, this name is disliked by the speech community and should not be used to refer to Southern Cook Islands Māori but rather to the specific variety originating from the island of Rarotonga (Nicholas, 2018:36).

1.3 Grammatical Structure

The following is a selection of grammatical features of Cook Islands Māori as described in Nicholas (2017). Cook Islands Māori is of the isolating type with very few productive morphological processes. There is widespread polysemy, particularly within the grammatical particles. For example, in the following sentence, glossed using the Leipzig Glossing Rules (Bickel et al., 2008), there are four homophones of the particle i: the past tense marker, the cause preposition, the locative preposition and the temporal locative preposition.

(1) I    mate     a    Mere  i      te   mangō
    PST  be-dead  DET  Mere  CAUSE  the  shark
    i    roto    i    te   moana  i         te   Tapati.
    LOC  inside  LOC  the  ocean  LOC.TIME  the  Sunday
    ‘Mere was killed by the shark in the ocean on Sunday.’

Furthermore, nearly every lexical item that can occur in a verb phrase can also occur in a noun phrase without any overt derivation, making tasks like POS tagging more difficult (see section 2.3). The unmarked constituent order is predicate initial. There are verbal and non-verbal predicate types. In sentences with verbal predicates the unmarked order is VSO. The phoneme inventory is small, as is typical for Polynesian languages, with 9 consonants and 5 vowels which have a phonemic length distinction.
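The polysemy of the particle i in example (1) can be checked mechanically by pairing each word with its gloss. The following is a minimal sketch in Python, not part of the tooling described in this paper. The function name functions_by_form is hypothetical, and lowercasing the words (so that sentence-initial I is grouped with i) is an assumption made only for this illustration.

```python
from collections import defaultdict

# Example (1) from section 1.3: one gloss per orthographic word.
words = "I mate a Mere i te mangō i roto i te moana i te Tapati".split()
glosses = ("PST be-dead DET Mere CAUSE the shark "
           "LOC inside LOC the ocean LOC.TIME the Sunday").split()


def functions_by_form(words, glosses):
    """Map each lowercased word form to the set of glosses it carries."""
    if len(words) != len(glosses):
        raise ValueError("each word needs exactly one gloss")
    table = defaultdict(set)
    for word, gloss in zip(words, glosses):
        # Assumption: case is ignored so that 'I' and 'i' count as one form.
        table[word.lower()].add(gloss)
    return dict(table)


forms = functions_by_form(words, glosses)
# Distinct functions of the form i in this one sentence.
print(sorted(forms["i"]))
```

The form i occurs five times in the sentence but carries four distinct glosses, because the locative preposition appears twice. A tagger that sees only the word form therefore has to rely on context to choose among them.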

2 CIM NLP Projects Under Development

There is a need to accelerate the documentation of the endangered Cook Islands Māori language, so that it can be revitalized in Rarotonga and Aotearoa New Zealand and its usage domains can be expanded in the islands where it is still the lingua franca. We have begun working on this through three projects: (i) We have used untrained forced speech alignment to generate correspondences between transcriptions and their audio files, with the purpose of improving phonetic and phonological documentation. (ii) We are training speech-to-text models to automate the transcription of both legacy material and recordings made during linguistic fieldwork. (iii) We are developing an interface-accompanied part-of-speech tagger as a first step towards full automatic parsing of the language. The following subsections provide details about the current state of each project.

2.1 Untrained Forced Alignment

Forced alignment is a technique that matches the sound wave with its transcription, down to the word and phoneme levels (Wightman and Talkin, 1997). This makes the work of generating alignment grids up to 30 times faster than manual processing (Labov et al., 2013). In theory, forced alignment needs a specific language model to function (e.g. an English model aligning English text). However, untrained forced alignment, where for example Cook Islands Māori audio recordings and transcriptions are processed using an English language model, has been shown to be useful for the alignment of text and audio in Indigenous and under-resourced languages (DiCanio et al., 2013).

In order to use this technique, a dictionary of Cook Islands Māori to English Arpabet was built, so that the Cook Islands Māori words could be introduced as new English words into the alignment tool. Some examples are shown in table 1. The phones of Cook Islands Māori words were matched with the closest English phone in the Arpabet system (e.g. the T in kite ‘to know’). Some Cook Islands Māori phones were not available in English; long Cook Islands Māori vowels were replaced by the equivalent accented vowel in English, and the glottal stop was replaced by the Arpabet phone T, as in ngā‘i ‘place’.

CIM      Arpabet
kite     K IY1 T EH1
ngā‘i    NG AE1 T IY1

Table 1: Arpabet conversion of CIM data

The English language acoustic model from FAVE-align (Rosenfelder et al., 2014) was used to align previously transcribed recordings of Cook Islands Māori speech. Figure 2 shows the output of this process, a time-aligned transcription of the audio in the Praat (Boersma et al., 2002) TextGrid format.

Figure 2: Praat TextGrid for CIM forced aligned text

After the automatic TextGrids were generated, 1045 phonemic tokens (628 vowels, 298 consonants, 119 glottal stops) and the words containing them were hand-corrected to verify the accuracy of the automatic system. Table 2 shows a summary of the results. The alignment showed error rates of 9% for the center of words and 20% for the center of vowels². This error is higher than that observed for other instances of untrained forced alignment (2%, 7% and 3% for the Central American languages Bribri, Cabécar and Malecu respectively (Coto-Solano and Solórzano, 2016; Coto-Solano and Solórzano, 2017)), but it provides significant improvements in speed over hand alignment.

Type of interval    Error (relative to the duration of the interval)
Words               9% ± 12%
Vowels              20% ± 25%

Table 2: Errors for Untrained Forced Alignment

² We are currently documenting phonetic variation in vowels using data that was first force aligned and then manually corrected. Because of this, we don’t know at this point whether the 20% error is due to phonetic differences between English and CIM vowels, or if it’s due to the data itself.

Utilization of forced alignment as a method for documentation and phonetic research has already been demonstrated for Cook Islands Māori. In Nicholas and Coto-Solano (2018), Cook Islands Māori phonemic tokens including vowels, consonants and glottal stops were forced-aligned and manually corrected to study the glottal stop phoneme. This work was able to show that in islands such as ‘Atiu, whose dialect is reported as having lost the glottal stop, the phoneme survives as a shorter stop or as creaky voice. The corrected Praat TextGrids are publicly available at Paradisec (Thieberger and Barwick, 2012), in the collection of Nicholas (2012).

2.2 Automatic Speech Recognition

We have begun the training of an Automatic Speech Recognition (ASR) system using the Kaldi system (Povey et al., 2011), both independently and through the ELPIS pipeline developed by CoEDL (Foley et al., 2018). While this work is still in progress, our preliminary results point to the need for custom models for each speaker. As can be seen in table 3, our recordings produced models with very different per-speaker word-error rates. (The “all speakers” model has a cross-speaker test set.) This, in addition to the paucity of data (approx. 80 minutes of speech for all speakers), makes the task extremely difficult.

Speaker                                          WER
Female, middle-aged, controlled environment      37%
Female, middle-aged, open environment            55%
Female, elderly, open environment                62%
Male, elderly, open environment                  68%
All speakers                                     64%

Table 3: Word error rates for automatic speech recognition (per-speaker)

The recording with the best performance was made in a very controlled environment (a silent room with a TASCAM DR-1 recorder). The worst recordings were those of elderly speakers who were speaking in their living rooms with open windows. The main issue here is that it is precisely these kinds of recordings (open environments with elderly practitioners of traditional knowledge or tellers of traditional stories) that are of most interest to linguists and practitioners of language revitalization, and it is in those environments where we can see our worst performance. More work on this area is needed (e.g. crowdsourcing the recording of fixed phrases from numerous speakers for more reliable training).

2.3 Part-of-Speech Tagging

We have begun developing automatic part-of-speech (POS) tagging to aid in linguistic research of the syntax of Cook Islands Māori, and to build towards a full parser of the language. To begin our work we hand-tagged 418 sentences (2916 words) using the part of speech tags from Nicholas (2017). This is the only POS annotated corpus of Cook Islands Māori existing to date. The corpus is currently being prepared for public release.

The corpus is currently annotated using two levels of tagging: a more shallow/broad level with 23 tags, and a second, narrower level with 70 tags. For example, the shallow level contains tags like v for verbs, n for nouns and tam for tense-aspect-mood particles. The narrower level provides further detail for each tag. For example, it separates the verbs into v:vt for transitive verbs, v:vi for intransitives, v:vstat for stative verbs, and v:vpass for passives.

Classification experiments were carried out in the WEKA environment for machine learning (Hall et al., 2009). We tested algorithms that we believed would cope with the sparseness of the data given the size of the corpus. These algorithms were: D. Tree (J48): an open-source Java-based extension of the C4.5 decision tree algorithm (Quinlan, 2014) and R. Forest: merger of random decision trees (with 100 iterations). Included as a reference baseline is a zeroR classification algorithm predicting the majority POS class for all words. All algorithms were evaluated by splitting the entire corpus of 2916 words into a 90% set of sentences for training and a 10% set for testing.

The model used position-dependent word context features for classification of each word’s POS. These included:

• the word (w)
• the previous word (w-1)
• the word before the previous word (w-2)
• the word two before the previous word (w-3)
• the following word (w+1)
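The five context features listed above can be illustrated with a short sketch. This is Python written for illustration only; the experiments in this paper were run in WEKA, and the function name context_features and the padding token <PAD> for positions outside the sentence are assumptions, since the text does not say how sentence edges were handled.

```python
def context_features(tokens, index, pad="<PAD>"):
    """Return the five position-dependent features for tokens[index].

    Positions that fall outside the sentence are filled with `pad`
    (an assumption of this sketch, not a detail given in the paper).
    """
    def at(offset):
        j = index + offset
        return tokens[j] if 0 <= j < len(tokens) else pad

    return {
        "w": at(0),      # the word itself
        "w-1": at(-1),   # the previous word
        "w-2": at(-2),   # the word before the previous word
        "w-3": at(-3),   # the word two before the previous word
        "w+1": at(1),    # the following word
    }


# First half of example (1) from section 1.3.
sentence = "I mate a Mere i te mangō".split()
rows = [context_features(sentence, i) for i in range(len(sentence))]
print(rows[4])
```

Each row would then be paired with the hand-assigned tag of its word to form one training instance. Because three of the five features look backwards, a particle such as i is represented mostly by what precedes it.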

Model                     Accuracy
Shallow/broad POS tags
D Tree (j48)              87.59%
R Forest (I = 100)        92.41%
zeroR (baseline)          21.03%
Narrow POS tags
D Tree (j48)              80.00%
R Forest (I = 100)        82.41%
zeroR (baseline)          15.52%

Table 4: Accuracy of POS tagging models. Performance is reported in accuracy (per-token)

A comparison of the classification algorithms using the features above is shown in Table 4. As seen here, the current top performing classifier that we have identified is a Random Forest classifier. This algorithm performs best when the POS tags are less informative; that is, it performs best on the shallow/broad tags with an accuracy of 92.41%. Despite the fact that the narrow tags do not collapse across types and therefore are more difficult to classify, the best performer for the narrow tags is also the Random Forest classifier. This performance is comparable to other POS tagging tasks for under-resourced languages for which a new minimal dataset was manually tagged as the sole input for training, such as 71-78% for Kinyarwanda and Malagasy using a Hidden Markov Model (Garrette and Baldridge, 2013). It is also comparable to POS tagging for related languages with relatively larger corpora, such as Indonesian (94% accuracy, with 355,000 annotated tokens) (Fu et al., 2018).

To obtain an assessment of directions for future work aimed at improving the model, we also determined the most commonly confused tags by consulting a confusion matrix. The most common errors for the top performer (Shallow/broad tags as classified by the Random Forest) are seen in table 5. Recall from section 1.3 that grammatically, lexical items which occur in a noun phrase can also occur in a verb phrase with no overt derivational marking. This explains the fact that the majority of errors occurred as the result of confusion between v and n.

Error type (assigned tag ⇒ correct tag)    % of errors
n (NOUN) ⇒ v (VERB)                        23%
tam (tense aspect mood) ⇒ prep             9%
prep ⇒ tam (tense aspect mood)             5%
v (VERB) ⇒ n (NOUN)                        5%

Table 5: Most common POS tagger errors for shallow/broad tags for top-performing tagger (Random Forest)

After training the model, we built a JavaServer Pages (JSP) interface to demo the model and obtain POS tagged versions of new, raw untagged sentences. This is illustrated in figure 3 below. The interface is in the process of being prepared for public launch.

Figure 3: Interface for the POS tagger

The assignation of parts of speech is a very difficult task in Cook Islands Māori not only because of its data-driven nature, but because the orthography of Cook Islands Māori was not fixed until recently. As detailed in Nicholas (2013), historically and even today there are numerous variations in spelling. The glottal stop is frequently omitted in spelling, as are the macrons for vocalic length. As a result, words like ‘e ‘nominal predicate marker’ can be written as ‘ē or e, increasing the homography of the language. Figure 4 shows the JSP interface attempting to tag two variants of the greeting ‘Aere mai ‘Welcome’. The first is spelled according to the current orthographic guidelines, with a glottal stop character at the beginning, and the first word gets tagged correctly as an intransitive verb. The second variant, however, is spelled without the glottal stop, which leads the model to misidentify it as a noun. Future work includes experiments on how to best tackle this problem by utilizing naturally occurring variety written as spontaneous orthographies, looking at solutions already found for languages with non-standardized writing systems such as colloquial Arabic (Wray et al., 2015).
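The effect of omitted glottal stops and macrons on homography can be made concrete with a small sketch. This is an illustration in Python under stated assumptions, not a method used or proposed in this paper: the function names are hypothetical, the set of characters treated as glottal stop marks is a guess at what appears in practice, and only omission is modelled (not the insertion of extra macrons that the text also mentions).

```python
import unicodedata

# Assumption: characters that may be used to write the glottal stop.
GLOTTAL_STOPS = {"\u2018", "\u2019", "\u02bb", "'"}
COMBINING_MACRON = "\u0304"


def strip_macrons(word):
    """Remove macrons by decomposing and dropping the combining mark."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch != COMBINING_MACRON)
    return unicodedata.normalize("NFC", kept)


def strip_glottal_stops(word):
    """Remove every character assumed to mark a glottal stop."""
    return "".join(ch for ch in word if ch not in GLOTTAL_STOPS)


def spelling_variants(word):
    """Spellings produced by optionally dropping macrons and glottal stops."""
    return {
        word,
        strip_macrons(word),
        strip_glottal_stops(word),
        strip_glottal_stops(strip_macrons(word)),
    }


def fold(word):
    """Collapse a word to a key that ignores both kinds of diacritic."""
    return strip_glottal_stops(strip_macrons(word))


print(spelling_variants("ngā\u2018i"))
print(fold("\u2018Aere") == fold("Aere"))
```

Folding makes the two spellings of ‘Aere from figure 4 comparable, but it also merges words that the standard orthography keeps apart, so a key of this kind could at most supplement the word features rather than replace them.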

Figure 4: Error when tagging words without a glottal stop

2.4 Language Revitalization Applications

In addition to the scholarly advances these projects will support, they also have utility for the revitalization of Cook Islands Māori. The forced alignment work will support the teaching of the phonetics, phonology, and ‘pronunciation’, as well as improve community understanding of variation. The consumption of audio visual material with closed captions in the target language is known to be beneficial for language learning (Vanderplank, 2016), and automatic speech recognition will greatly increase the quantity of such material available to learners. The increased size of the searchable corpus of Cook Islands Māori will facilitate the production of corpus-based and multimedia pedagogical materials. Similarly, the POS tagged data will provide further benefits for the accurate design of pedagogical materials and the linguistic training of teachers who speak Cook Islands Māori. Additional tools that can be developed using this well annotated corpus include text-to-speech technology and chatbot applications.

3 Future work

There is much work to be done to move forward in these three projects. The next step for the POS tagging is to add resilience to cope with the non-standardized writing it will most commonly find. As for the speech recognition, crowdsourcing similar to that carried out by Bagnall et al. (2017) might help increase the amount of controlled, pre-transcribed audio available for speech recognition training. Furthermore, we are investigating the incorporation of algorithms for treatment of audio data with low signal-to-noise ratio to improve audio quality, which theoretically should lower the high word error rate for recordings conducted in an open, noisy environment.

4 Conclusions

This paper summarizes ongoing work in the application of NLP techniques to Cook Islands Māori, including untrained forced alignment for producing time-aligned transcriptions down to the phoneme, automatic speech recognition for the production of transcriptions from recordings of speech, and part-of-speech tagging for producing tagged text. We have given an overview of the state of these projects, and presented ideas for future work in this area. We believe that this interdisciplinary work can not only accelerate and enhance the documentation of the language, but can ultimately bring more of the Cook Islands community in contact with its language and help in its revitalization.

Acknowledgments

We wish to thank Dr. Ben Foley and Dr. Miriam Meyerhoff of CoEDL for their support in numerous aspects of this project, as well as three anonymous reviewers for their helpful comments. We also wish to thank Jean Mason, Director of the Rarotonga Library, Teata Ateriano, principal of the School of Ma’uke, and Dr. Tyler Peterson from Arizona State University for their continued support in the documentation of Cook Islands Māori.

References

Yásnaya Elena Aguilar Gil. 2014. ¿Para qué publicar libros en lenguas indígenas si nadie los lee? e’px. http://archivo.estepais.com/site/2014/para-quepublicar-libros-en-lenguas-indigenas-si-nadie-los-lee/.

Douglas Bagnall, Keoni Mahelona, and Peter-Lucas Jones. 2017. Kōrero Māori: a serious attempt at speech recognition for te reo Māori. Paper presented at the Conference 2017 NZ LingSoc, Auckland.

Balthasar Bickel, Bernard Comrie, and Martin Haspelmath. 2008. The Leipzig glossing rules. Conventions for interlinear morpheme by morpheme glosses. Revised version of February.

Paul Boersma et al. 2002. Praat, a system for doing phonetics by computer. Glot international 5.

Jasper E Buse, Raututi Taringa, Bruce Biggs, and Rangi Moeka‘a. 1996. Cook Islands Māori dictionary with English-Cook Island Māori finderlist. Pacific Linguistics, Research School of Pacific and Asian Studies, Australian National University, Canberra, ACT.

CartoGIS Services, College of Asia and the Pacific, and The Australian National University. 2017. Cook Islands. http://asiapacific.anu.edu.au/mapsonline/base-maps/cook-islands [Accessed 2017-10-06].

Rolando Coto-Solano and Sofía Flores Solórzano. 2017. Comparison of two forced alignment systems for aligning Bribri speech. CLEI Electron. J. 20(1):2–1.

Rolando Coto-Solano and Sofía Flores Solórzano. 2016. Alineación forzada sin entrenamiento para la anotación automática de corpus orales de las lenguas indígenas de Costa Rica. Káñina 40(4):175–199.

Christian DiCanio, Hosung Nam, Douglas H Whalen, H Timothy Bunnell, Jonathan D Amith, and Rey Castillo García. 2013. Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment. The Journal of the Acoustical Society of America 134(3):2235–2246.

Ben Foley, Josh Arnold, Rolando Coto-Solano, Gautier Durantin, T. Mark Ellison, Daan van Esch, Scott Heath, Frantisek Kratochvil, Zara Maxwell-Smith, David Nash, Ola Olsson, Mark Richards, Nay San, Hywel Stoakes, Nick Thieberger, and Janet Wiles. 2018. Building speech recognition systems for language documentation: The CoEDL endangered language pipeline and inference system (ELPIS). In 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages.

Sihui Fu, Nankai Lin, Gangqin Zhu, and Shengyi Jiang. 2018. Towards Indonesian part-of-speech tagging: Corpus and models. In LREC.

Dan Garrette and Jason Baldridge. 2013. Learning a part-of-speech tagger from two hours of annotation. In Proceedings of NAACL 2013. Atlanta, USA, pages 138–147.

Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H Witten. 2009. The WEKA data mining software: an update. ACM SIGKDD explorations newsletter 11(1):10–18.

Lisa M Johnson, Marianna Di Paolo, and Adrian Bell. 2018. Forced alignment for understudied language varieties: Testing Prosodylab-Aligner with Tongan data. Language Documentation & Conservation.

András Kornai. 2013. Digital language death. PloS one 8(10):e77056.

William Labov, Ingrid Rosenfelder, and Josef Fruehwald. 2013. One hundred years of sound change in Philadelphia: Linear incrementation, reversal, and reanalysis. Language 89(1):30–65.

M. Paul Lewis and Gary F. Simons. 2010. Making EGIDS assessment for the Ethnologue. www.icc.org.kh/download/Making EGIDS-Assessments English.pdf [Accessed 2017-07-20].

Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, and Ivan Meza. 2018. Challenges of language technologies for the indigenous languages of the Americas. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018).

Sally Nicholas. 2017. Ko te Karāma o te Reo Māori o te Pae Tonga o Te Kuki Airani: A Grammar of Southern Cook Islands Māori. Ph.D. thesis, University of Auckland.

Sally Akevai Nicholas. 2012. Te Vairanga Tuatua o te Te Reo Māori o te Pae Tonga: Cook Islands Māori (Southern dialects) (sn1). Digital collection managed by PARADISEC. [Open Access] DOI: 10.4225/72/56E9793466307 http://catalog.paradisec.org.au/collections/SN1.

Sally Akevai Nicholas. 2018. Language contexts: Te Reo Māori o te Pae Tonga o te Kuki Airani also known as Southern Cook Islands Māori. Language Documentation and Description 15:36–64.

Sally Akevai Nicholas and Rolando Coto-Solano. 2018. Using untrained forced alignment to study variation of glottalization in Cook Islands Māori. In NWAV-AP5. University of Brisbane.

Sally Akevai Te Namu Nicholas. 2013. Orthographic reform in Cook Islands Māori: Human considerations and language revitalisation implications. 3rd International Conference on Language Documentation and Conservation (ICLDC).

Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, et al. 2011. The Kaldi speech recognition toolkit. In IEEE 2011 workshop on automatic speech recognition and understanding. IEEE Signal Processing Society, EPFL-CONF-192584.

J Ross Quinlan. 2014. C4.5: programs for machine learning. Elsevier.

Ingrid Rosenfelder, Josef Fruehwald, Keelan Evanini, Scott Seyfarth, Kyle Gorman, Hilary Prichard, and Jiahong Yuan. 2014. FAVE (Forced Alignment and Vowel Extraction) Program Suite.

Stephen Savage. 1962. A dictionary of the Maori language of Rarotonga / manuscript by Stephen Savage. New Zealand Department of Island Territories, Wellington.

Nicholas Thieberger and Linda Barwick. 2012. Keeping records of language diversity in Melanesia, the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC). Melanesian languages on the edge of Asia: Challenges for the 21st Century pages 239–53.

Robert Vanderplank. 2016. Effects of and effects with captions: How exactly does watching a TV programme with same-language subtitles make a difference to language learners? Language Teaching 49(2):235.

Colin W Wightman and David T Talkin. 1997. The aligner: Text-to-speech alignment using Markov models. In Progress in speech synthesis, Springer, pages 313–323.

Samantha Wray, Hamdy Mubarak, and Ahmed Ali. 2015. Best Practices for Crowdsourcing Dialectal Arabic Speech Transcription. In Proceedings of Workshop on Arabic Natural Language Processing.