THE DEVELOPMENT OF AN AUTOMATIC PRONUNCIATION ASSISTANT

by

TSHEPHISHO JOSEPH SEFARA

DISSERTATION

Submitted in fulfilment of the requirements for the degree of

MASTER OF SCIENCE

in

COMPUTER SCIENCE

in the

FACULTY OF SCIENCE AND AGRICULTURE

(School of Mathematical and Computer Sciences)

at the

UNIVERSITY OF LIMPOPO

SUPERVISOR: Mr MJD Manamela

CO-SUPERVISOR: Dr TI Modipa

2019

DEDICATION

This dissertation is dedicated to my family who sacrificed the time we should have spent together.

DECLARATION OF AUTHORSHIP

I, Tshephisho Joseph Sefara, declare that the dissertation entitled “THE DEVELOPMENT OF AN AUTOMATIC PRONUNCIATION ASSISTANT”, is my own work and has been generated by me as the result of my own original research proposal. I confirm that where collaboration with other people has taken place, or material from other researchers is included, the parties or material are appropriately indicated in the acknowledgements or references. I further confirm that this work has not been submitted to any other university for any other degree or examination.

______

Sefara, T.J.

05/04/2019

Date

ACKNOWLEDGEMENTS

As I finish this academic work, I would like to thank almighty God for giving me the strength, wisdom and guidance throughout the course of the research study. It is a great pleasure to express my thanks to the following people for contributing towards the success of this research study:

• Mr MJD Manamela and Dr TI Modipa, my supervisors, for their supervision and guidance without which this work would not have been a reality;
• Credit has to be given to ARMSCOR and CSIR for the financial support they provided for this research study;
• Special thanks must go to all the hard-working people that met with me every day and gave their one hundred and ten percent effort during evaluation of the system;
• Special thanks must go to my friends and colleagues, for their moral support.
• Finally, I would like to thank my amazing family for their love and support.

ABSTRACT

The pronunciation of words and phrases in any language involves careful manipulation of linguistic features. Factors such as age, motivation, accent, phonetics, stress and intonation sometimes cause a problem of inappropriate or incorrect pronunciation of words from non-native languages. Pronunciation of words using different phonological rules has a tendency of changing the meaning of those words. This study presents the development of an automatic pronunciation assistant system for under-resourced languages of Limpopo Province, namely, Sepedi, Xitsonga, Tshivenda and isiNdebele.

The aim of the proposed system is to help non-native speakers learn the appropriate and correct pronunciation of words/phrases in these under-resourced languages. The system is composed of a language identification module on the front-end side and a speech synthesis module on the back-end side. A support vector machine was compared to a baseline multinomial naive Bayes classifier to build the language identification module. The language identification phase performs supervised multiclass text classification to predict a person’s first language based on input text, before the speech synthesis phase continues with pronunciation in the identified language. The back-end speech synthesis phase is composed of four baseline text-to-speech synthesis systems in the selected target languages. These text-to-speech synthesis systems were developed using the hidden Markov model method. Subjective listening tests were conducted to evaluate the quality of the synthesised speech using a mean opinion score test. The mean opinion score test obtained good performance results in all targeted languages for naturalness, pronunciation, pleasantness, understandability, intelligibility, overall quality of the system and user acceptance. The developed system has been implemented on a “real-life” production web server for performance evaluation and stability testing using live data.

TABLE OF CONTENTS

DEDICATION ...... ii

DECLARATION OF AUTHORSHIP ...... iii

ACKNOWLEDGEMENTS...... iv

ABSTRACT ...... v

TABLE OF CONTENTS ...... vi

LIST OF TABLES ...... xiv

LIST OF FIGURES ...... xv

LIST OF LISTINGS ...... xix

LIST OF ABBREVIATIONS ...... xx

1 CHAPTER 1: INTRODUCTION ...... 1

1.1 Preamble ...... 1

1.2 Motivation ...... 3

1.3 Problem Statement...... 4

1.3.1 Aim ...... 5

1.3.2 Objectives ...... 5

1.3.3 Research Questions ...... 6

1.4 Research Methods ...... 6

1.5 Scientific Contribution ...... 7

1.6 Ethical Considerations ...... 9

1.6.1 Informed Consent ...... 9

1.6.2 Voluntary participation ...... 9

1.6.3 Privacy and Confidentiality ...... 10

1.6.4 Physical or Psychological Harm ...... 10

1.7 Structure of Dissertation ...... 10

2 CHAPTER 2: BACKGROUND ...... 11

2.1 Introduction ...... 11

2.2 Proper Names ...... 12

2.3 Pronunciation ...... 13

2.4 Supervised Learning Techniques ...... 14

2.4.1 Multinomial Naive Bayes Classifier ...... 15

2.4.2 Support Vector Machines ...... 17

2.5 Language Identification ...... 18

2.5.1 Language Identification for South African Languages ...... 20

2.6 Toolkits for Classifier Implementation...... 22

2.6.1 WEKA ...... 22

2.6.2 Scikit-learn ...... 22

2.6.3 NLTK ...... 22

2.7 Components of a Typical TTS Synthesis System ...... 23

2.7.1 Natural Language Processing ...... 23

2.7.2 Digital Signal Processing ...... 25

2.8 Evaluation of TTS Synthesis System ...... 33

2.9 TTS Synthesis Application Areas ...... 34

2.10 Speech Synthesis Systems Toolkits...... 36

2.10.1 Festival TTS ...... 36

2.10.2 Speect TTS ...... 36

2.10.3 IBM Watson TTS ...... 36

2.10.4 Merlin ...... 37

2.10.5 MARY TTS ...... 37

2.11 Summary ...... 38

3 CHAPTER 3: DESIGN AND IMPLEMENTATION ...... 40

3.1 Introduction ...... 40

3.2 Front-end Phase: Language Identification Module ...... 42

3.2.1 Data Acquisition Pre-processing ...... 42

3.2.2 N-gram Feature set ...... 45

3.2.3 Machine-learning Algorithms ...... 46

3.3 Back-end Phase: Speech Synthesis Module ...... 53

3.3.1 Datasets ...... 53

3.3.2 Compiling MARY TTS Builder Tools ...... 55

3.3.3 Natural Language Processing Modules ...... 57

3.3.4 TTS Synthesis Modules ...... 67

3.4 Integration of the LID and TTS Synthesis ...... 78

3.5 Live Demonstration of the System ...... 82

3.5.1 Server ...... 82

3.5.2 Client – Internet Browser ...... 85

3.5.3 Client – Android Application ...... 85

3.6 Summary ...... 87

4 CHAPTER 4: EVALUATION RESULTS ...... 89

4.1 Introduction ...... 89

4.2 Performance Measures of the Proposed System ...... 90

4.2.1 Evaluation Metrics for LID ...... 90

4.2.2 Subjective Evaluation Metrics for TTS ...... 93

4.3 Evaluation Results and Analysis of the Developed Front-end LID ...... 97

4.3.1 Kernel Parameter Selection ...... 97

4.3.2 Multinomial Naive Bayes ...... 99

4.3.3 SVM with Linear Kernel ...... 101

4.3.4 SVM with RBF Kernel ...... 102

4.3.5 SVM with Sigmoid Kernel ...... 104

4.3.6 SVM with Polynomial Kernel ...... 106

4.3.7 Final Model ...... 107

4.4 Evaluation Results and Analysis of the Developed TTS...... 109

4.4.1 Test for Intelligibility ...... 110

4.4.2 Test for Naturalness ...... 112

4.4.3 Test for Correct Pronunciation ...... 113

4.4.4 Test for Pleasantness ...... 114

4.4.5 Test for Understandability or Listening Effort ...... 115

4.4.6 Test for the Overall Quality ...... 115

4.4.7 Comparison with other Studies ...... 116

4.5 Evaluation Results and Analysis of the Complete System Usability .. 117

4.6 Summary ...... 120

5 CHAPTER 5: DISCUSSIONS ...... 121

5.1 Classifier Model Comparison ...... 121

5.2 Text-to-Speech Synthesiser ...... 123

5.3 System Usability ...... 123

5.4 Summary of the Findings ...... 123

6 CHAPTER 6: CONCLUSIONS ...... 126

6.1 Introduction ...... 126

6.2 Limitations and Challenges ...... 126

6.3 Contributions of the Study ...... 127

6.3.1 Language-specific Applications ...... 127

6.3.2 Pre-processing Files ...... 127

6.3.3 Importance of the Developed System ...... 127

6.3.4 Speech Synthesis Results ...... 128

6.3.5 Common Dataset ...... 128

6.4 Future Work and Recommendations ...... 128

6.4.1 Machine-learning Phase ...... 128

6.4.2 Speech Synthesis Phase ...... 129

6.5 Final Remarks ...... 130

LIST OF PUBLICATIONS...... 131

APPENDIX A: INSTALLATION GUIDE AND PATH VARIABLES – install.sh 133

APPENDIX B: PHONE SET FILES ...... 135

B1: Sepedi phone set – allophone.nso.xml ...... 135

B2: Tshivenda phone set – allophone.ven.xml ...... 136

B3: IsiNdebele phone set – allophone.nbl.xml...... 137

B4: Xitsonga phone set – allophone.tso.xml ...... 138

APPENDIX C: CREATING DICTIONARY – gen_dictionary.py ...... 140

APPENDIX D: CLASSIFICATION FUNCTION – NamesPredictor.java ...... 141

APPENDIX E: ANDROID SOURCE CODE ...... 143

E.1: MainActivity.java ...... 143

E.2: LanguageIdentification.java ...... 146

E.3: Methods.java ...... 146

E.4: PlayAudioManager.java ...... 147

E.5: Activity_main.xml ...... 147

E.6: Menu.xml ...... 149

APPENDIX F: ANDROID APPLICATION UML DIAGRAM ...... 150

APPENDIX G: CONSENT FORM ...... 151

APPENDIX H: QUESTIONNAIRE ...... 152

APPENDIX I: SPEECH SYNTHESISER – TEST CORPUS SAMPLES ...... 156

I.1: Test corpus samples of intelligibility test...... 156

I.2: Test corpus samples of MOS test ...... 157

APPENDIX J: NATURAL SPEECH – TEST CORPUS SAMPLES ...... 158

APPENDIX K: SENTENCE AND WORD ERROR RATE – error_rates.py ..... 159

REFERENCES ...... 160


LIST OF TABLES

Table 3.1: Language Locales ...... 45

Table 3.2: Phone Set ...... 59

Table 3.3: Phone Set Features ...... 61

Table 3.4: General Configuration Settings ...... 68

Table 3.5: Some HMM Voice Configuration ...... 74

Table 3.6: Developed TTS Voice Sizes ...... 76

Table 4.1: Confusion Matrix...... 91

LIST OF FIGURES

Figure 1.1: General TTS system showing text as input and speech waveform as output (Baloyi, 2012)...... 2

Figure 2.1: Example of a binary classification. Adapted from Chang and Lin (2011)...... 18

Figure 2.2: Typical flow of TTS synthesis system, adapted from Huang et al. (2001)...... 24

Figure 2.3: Diagram of a rule-based formant synthesiser system adapted from Huang et al. (2001)...... 25

Figure 2.4: An overview of a general unit-selection scheme. Adapted from Zen et al. (2009)...... 27

Figure 2.5: HTS training phase adapted from Zen et al. (2009)...... 30

Figure 2.6: HTS speech generation adapted from Zen et al. (2009) ...... 30

Figure 3.1: The diagram of the overall system interaction ...... 41

Figure 3.2: Text corpora for training LID ...... 43

Figure 3.3 Ten-fold cross validation...... 47

Figure 3.4: Proposed supervised learning workflow ...... 49

Figure 3.5: Number of sentences ...... 54

Figure 3.6: Speech corpora duration comparison ...... 54

Figure 3.7: Corpora size ...... 55

Figure 3.8: Workflow for multilingual voice creation in MARY TTS builder. Adapted from Schröder et al. (2011)...... 58

Figure 3.9: Pronunciation dictionary setup ...... 62

Figure 3.10: The output of current language locales installed in MARY TTS synthesis system ...... 66

Figure 3.11: HMM-based voice training in MARY TTS. Adapted from Würgler (2011) ...... 69

Figure 3.12: HMM-based speech synthesis steps ...... 72

Figure 3.13: HMM-based voice creation in process ...... 75

Figure 3.14: MARY TTS Component installer ...... 77

Figure 3.15: MARY TTS GUI client ...... 78

Figure 3.16: Design of the application client and server connection via wireless connection ...... 84

Figure 3.17: Android application demo ...... 87

Figure 4.1: Accuracy of MNB on a 10-fold cross validation ...... 100

Figure 4.2: The MNB accuracy using combination of features on a 10-fold cross validation ...... 101

Figure 4.3: Accuracy of linear SVM on a 10-fold cross validation...... 102

Figure 4.4: The linear SVM accuracy using combination of features on a 10-fold cross validation ...... 102

Figure 4.5: Accuracy of RBF SVM on a 10-fold cross validation ...... 103

Figure 4.6: The RBF SVM accuracy using combination of features on a 10-fold cross validation ...... 104

Figure 4.7: Accuracy of sigmoid SVM on a 10-fold cross validation ...... 105

Figure 4.8: The sigmoid SVM accuracy using combination of features on a 10-fold cross validation ...... 105

Figure 4.9: Accuracy of polynomial SVM on a 10-fold cross validation ...... 106

Figure 4.10: The polynomial SVM accuracy using combination of features on a 10-fold cross validation ...... 107

Figure 4.13: Precision, recall and F-score for the 10-fold cross validation ..... 108

Figure 4.14: Precision, recall and F-score results for the final model ...... 108

Figure 4.15: Accuracy and RMSE using polynomial SVM for final model and 10- fold cross validation ...... 109

Figure 4.16: The SUS accuracy at sentence and word level for intelligibility of the developed system...... 110

Figure 4.17: The SER and WER for intelligibility of the developed system. ... 111

Figure 4.18: Results of test for naturalness of speech samples...... 112

Figure 4.19: Results of test for pronunciation of speech samples...... 113

Figure 4.20: Results of test for pleasantness of speech samples...... 114

Figure 4.21: Results of test for understandability of speech samples...... 115

Figure 4.22: Results of test for overall quality of speech samples...... 116

Figure 4.23: Subjective 5-scale MOS of Xitsonga TTS ...... 117

Figure 4.24: The fifteen percent of the evaluators disagreed that the application is difficult...... 118

Figure 4.25: The fifteen percent of the evaluators agreed to use the system on their own ...... 119

Figure 5.1: Accuracy comparison on a 10-fold cross validation...... 122

Figure 5.2: Accuracy comparison using combination of features on a 10-fold cross validation ...... 122

LIST OF LISTINGS

Listing 3.1: Extract of the WEKA ARFF used for creating the LID module ...... 44

Listing 3.2: Language modules included in marytts-languages project ...... 64

Listing 3.3: Language modules included in assembly-builder module ...... 65

Listing 3.4: The WEKA Maven repository included in marytts-runtime module 79

Listing 3.5: Upgraded MaryHttpServer java file by including handler for pattern /classify ...... 80

Listing 3.6: Upgraded InfoRequestHandler Java file by including conditional statement for pattern /classify ...... 81

Listing 3.7: Configuration file added to nginx ...... 83

Listing 3.8: Mary.sh file to restart the server ...... 83

LIST OF ABBREVIATIONS

API Application Programming Interface

ARFF Attribute-Relation File Format

CSIR Council for Scientific and Industrial Research

CSV Comma-Separated Values

DNN Deep Neural Network

DRT Diagnostic Rhyme Test

DSP Digital Signal Processing

EMOTIONML Emotion Markup Language

F0 Fundamental Frequency

FST Finite State Transducer

G2P Grapheme-to-Phoneme

GUI Graphical User Interface

HLTs Human Language Technologies

HMM Hidden Markov Model

HTS HMM-based Speech Synthesis System

HTML Hypertext Markup Language

HTTP Hypertext Transfer Protocol

ICT Information and Communication Technology

JSM Joint Sequence Model

LIBSVM Library for Support Vector Machine

LID Language Identification

LTS Letter-To-Sound

MAG Fourier Magnitude

MARY Modular Architecture for Research on speech sYnthesis

MFCCs Mel-frequency Cepstral coefficients

MGC Mel-Generalised Cepstral

MNB Multinomial Naive Bayes

MOS Mean Opinion Score

MRT Modified Rhyme Test

NCHLT National Centre for Human Language Technology

NLP Natural Language Processing

NLTK Natural Language Toolkit

PHP Hypertext Pre-Processor

POS Part-of-Speech

RBF Radial Basis Function

RMA Resource Management Agency

SADE South African Directory Enquiry

SAMPA Speech Assessment Methods Phonetic Alphabet

SER Sentence Error Rate

SOX Sound Exchange

SPEECT Speech Synthesis with Extensible Architecture

SPSS Statistical Parametric Speech Synthesis

SPTK Speech Signal Processing Toolkit

SSML Speech Synthesis Markup Language

SUS Semantically Unpredictable Sentence

SVC Support Vector Classification

SVM Support Vector Machine

TTS Text-to-Speech

UML Unified Modelling Language

URI Uniform Resource Identifier

URL Uniform Resource Locator

VPS Virtual Private Server

WEKA Waikato Environment for Knowledge Analysis

WER Word Error Rate

XML Extensible Markup Language

1 CHAPTER 1: INTRODUCTION

1.1 Preamble

Within the realm of human-computer interaction systems, there are human language technologies (HLTs) that simplify communication between humans and computational systems. These technologies are available for languages such as English, Spanish, French and other well-known languages. HLT comprises text and speech processing technologies. The speech technologies make it possible for people to use computational devices such as mobile phones to access information, use systems or even do voice dialling in their first language. The text technologies make it possible for people to analyse historical textual data to make future predictions (e.g. weather forecasts, financial markets, and others). Furthermore, text technologies are used to build spell checkers and to perform text normalisation, text analytics, and text classification tasks.

Text classification is a challenging task in computational, library and information sciences. The main task involves the ability to classify or assign a text unit or document to a pre-defined class (Botha et al., 2007). Text classification is used for language identification (LID) in language-specific systems. Automatic LID is a growing application of speech and language processing technology that has many practical uses. It can be used as a front-end system by a telecommunication company, routing a caller to an appropriate human emergency operator depending on the correct identification of the caller’s language. It can also be used in multilingual speech recognition (Rao & Nandi, 2015), speech translation (Heck et al., 2012), and document processing systems (Shukla et al., 2016).

LID is the problem of determining the identity of the natural language of given spoken or textual content (Lamabam & Chakma, 2016). In text and speech processing systems, LID of text units or documents is an important prerequisite for systems such as automatic machine translation, information extraction, email spam filtering, topic identification, document summarisation, pronunciation prediction and text-to-speech (TTS) synthesis systems (Giwa & Davel, 2015).

A TTS synthesis system is a computational system that generates synthetic speech from a given input text in a specific language (Zen et al., 2009). Figure 1.1 shows an example of a typical TTS synthesis system generating synthetic speech from text. There are different kinds of speech synthesis methods that are used when building synthetic voices. These methods include rule-driven synthesis and data-driven synthesis. Data-driven synthesis includes hidden Markov model (HMM)-based synthesis and concatenation synthesis. The HMM speech synthesis system (or HTS – ‘H Triple S’) uses a statistical parametric model that extracts speech parameters from a speech corpus to produce the sound corresponding to an input text (Zen et al., 2009). TTS synthesis systems for well-resourced languages are becoming more readily commercially available as the quality of these systems continues to improve rapidly. Commercial systems mainly apply the unit-selection-based concatenation method to generate a high-quality synthetic speech waveform. However, the unit-selection approach demands a very large database to store the pre-recorded training speech data. To avoid this problem, the HTS approach has evolved with appreciable improvements over the concatenative method in terms of processing speed and resource utilisation.

Figure 1.1: General TTS system showing text as input and speech waveform as output (Baloyi, 2012).

1.2 Motivation

The area of speech processing research has gradually advanced over recent decades, with many systems developed with the ability to produce natural-sounding synthetic speech (Zen et al., 2009). Research in the subfield of speech synthesis has been driven by the growing number of new and robust software applications that have been developed (Pammi et al., 2010). These include, amongst others, the reading out of manuscripts for collation, announcements in public places and information retrieval services over the telephone such as customer care services or banking services (Violano & van Collie, 1992). In today’s electronic digital age, the use of TTS synthesis technologies has increased exponentially in mobile smartphones, computers, internet-based services, and dialogue systems.

A TTS synthesis system can be embedded in special equipment as a voice-enabling tool for vocally challenged people1. FingerReader is one such application of TTS synthesis that assists visually impaired people to read documents (Shilkrot et al., 2014). Most developed software applications, such as word processors, have the capability to read words, phrases and sentences aloud. These systems provide a way for visually impaired people to read text that would otherwise not be available to them. Modern TTS synthesis systems such as the Watson2 TTS and the Microsoft Speak3 function can synthesise documents such as Portable Document Format (Adobe Acrobat), Microsoft Word files and text files. For most South African indigenous official languages, these enabling linguistic developments have not been attained by the speech and language processing research community. The research community in speech and language processing is rapidly improving its utility systems that can read images using optical character recognition systems (Li et al., 2012). In the 1970s, Texas Instruments’ “Speak and Spell”4 was the first educational toy that applied speech synthesis in the field of computer-assisted education. TTS synthesis systems are used in educational institutions to improve the learning of pronunciation of new languages (Badenhorst et al., 2006).

1 Available at: http://www.hawking.org.uk/the-computer.html
2 Available at: http://www.ibm.com/watson/developercloud/text-to-speech/api/v1/
3 Available at: https://support.office.com/en-us/article/use-the-speak-text-to-speech-feature-to-read-text-aloud-459e7704-a76d-4fe2-ab48-189d6b83333c
4 Available at: http://www.ti.com/corp/docs/company/history/timeline/eps/1970/docs/78-speak-spell_introduced.htm

1.3 Problem Statement

The pronunciation of words and phrases in any language involves careful manipulation of stress, intonation, and articulation, frequently with reference to some standard of correctness (Gilakjani & Ahmadi, 2011). Most of the time, the pronunciation of words (particularly proper names from new or unfamiliar languages) is difficult for non-native speakers of those languages. In our approach, it is important and preferable to attempt to identify the first language of a person from that person’s proper name details in order to facilitate selection of appropriate tools to generate its correct pronunciation (Llitjos & Black, 2001). Knowledge of, or the ability to predict, the first language associated with a word or a proper name may reduce the pronunciation difficulties experienced by non-native speakers by appropriately applying phonetic rules from that first language.

One of the language processing capabilities required in a multilingual environment is mostly established using LID. In a multilingual environment like South Africa, the problem of incorrect pronunciation of words or phrases from other languages is sometimes caused by the following (Kenworthy, 1987):

• Phonetics – pronunciation of a word using different phonological rules.
• Age – production of speech is affected at different age groups.
• Motivation – lack of motivation to learn how to speak other languages.
• Accent – naturally, people use their first language accent to speak other languages; the use of a strong accent changes or distorts the meaning of the words.
• Stress and intonation – randomly applying strong stress or intonation on some words changes the meaning of those words.

The use of an LID system on the front-end in speech and language processing systems helps to determine the language of an input audio or text before a TTS synthesis system produces the corresponding synthetic speech. This has the effect of reducing computational complexity and searching time (Rao & Nandi, 2015). Current estimates indicate that there are more than 7000 natural languages in the world, but TTS synthesis systems are not readily available for thousands of under-resourced languages (Lewis et al., 2016), including South African official languages. The unavailability of TTS synthesis systems worldwide often leads to difficulties in the correct pronunciation of words and phrases from most of the under-resourced languages. One of the possible solutions to improve the quality of pronunciation in resource-scarce languages is to build TTS synthesis systems for the benefit of assisting users in the learning of correct pronunciation of other languages. To this end, the integration of a text-based LID system and a TTS synthesis system may increase accuracy with automatic pronunciation of proper names for under-resourced languages of Limpopo Province.

1.3.1 Aim

The aim of the study is to build a prototype software system that uses a trained classifier to enhance pronunciation of words and phrases, particularly proper names for Sepedi, Tshivenda, Xitsonga and isiNdebele.

1.3.2 Objectives

The objectives of this study are to:

a) Collect training text data consisting of proper names including surnames or maiden names for the front-end LID system.
b) Acquire pre-recorded speech data to create synthetic voices in Sepedi, Xitsonga, Tshivenda and isiNdebele.
c) Use appropriate machine-learning algorithms to train a text-based LID front-end system for classification of surnames into respective first languages.
d) Use the trained LID system to predict the first language of a person given that person’s proper name.
e) Activate the TTS synthesis system of that first language to continue with pronunciation guidance and training, using the predicted first language automatically.
f) Evaluate the performance of the integrated LID-TTS system.

1.3.3 Research Questions

The main research questions of the study are as follows:

• Can a computational system use a person’s surname to predict the identity of the first language of that person?
• Can a computational system produce an appropriate pronunciation of indigenous proper names?

1.4 Research Methods

The speech training data were acquired from the Lwazi (a word which means “knowledge” in isiZulu) project in collaboration with the Council for Scientific and Industrial Research (CSIR), South African Department of Arts and Culture, and Department of Science and Technology (Language Resource Management Agency, 2016). An appropriate text dataset with surnames was acquired from the student database of the University of Limpopo for the targeted under-resourced languages. Appropriate classification methods were tried and tested before the best method was selected to implement the proposed system. The acquired text data were used to train machine-learning algorithms such as multinomial naive Bayes (MNB) and support vector machines (SVMs) (Chang & Lin, 2011).

The Eclipse software and Waikato Environment for Knowledge Analysis (WEKA) Java application programming interface (API) (Hall et al., 2009) were used to build the LID model. A machine-learning classifier was trained and used as the

core LID module for classifying any given input text. The accuracy of the trained machine-learning classifier was evaluated on a dataset which was not used during the training of the classifier. The Modular Architecture for Research on speech sYnthesis (MARY) TTS synthesis system (Pammi et al., 2010), which supports multiple languages, was used to build new TTS synthesis voices in Sepedi, Tshivenda, Xitsonga and isiNdebele. The system integration for a user-friendly human-computer interface was built using the Java programming language.
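The study implemented the LID module with the WEKA Java API inside Eclipse; purely for illustration, the sketch below shows an analogous pipeline in Python with scikit-learn. The surnames, their first-language labels and the n-gram settings are hypothetical placeholders, not the actual training data or configuration used in this study.

```python
# Minimal sketch of a surname-based LID classifier (hypothetical data).
# The study used the WEKA Java API; scikit-learn is used here only for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical training examples: (surname, assumed first-language code)
surnames = ["Sefara", "Manamela", "Baloyi", "Mathebula", "Netshitenzhe", "Mahlangu"]
labels   = ["nso", "nso", "tso", "tso", "ven", "nbl"]

# Character n-grams approximate the n-gram feature set described in Chapter 3.
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    MultinomialNB(),
)
model.fit(surnames, labels)

print(model.predict(["Tshivhase"]))   # predicted first-language code
```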

The evaluation of the performance of the developed system was conducted in the form of a questionnaire survey, using first-language speakers from different professions. The data gathered from the survey were analysed using the mean opinion score (MOS) method for naturalness, pronunciation, pleasantness, understandability, intelligibility, overall quality of the system and user acceptance. The MOS provides a numerical measure of the quality of human speech using a Likert scale. The system was deployed to a production server as a website for performance evaluation on a “real-world” platform, including usability and acceptability testing. An Android application was also developed. This application acts as a client to the deployed system on the internet. The prototype TTS synthesis system can be accessed on the website http://www.speechtech.co.za.
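As a small illustration of how a MOS is derived from 5-point Likert responses, the sketch below averages invented ratings per evaluation criterion; the numbers are not results from the study.

```python
# Sketch: computing a mean opinion score (MOS) from 5-point Likert ratings.
# The ratings below are hypothetical, not results from the study.
ratings = {
    "naturalness":   [4, 5, 4, 3, 5],
    "pronunciation": [5, 4, 4, 4, 5],
    "pleasantness":  [4, 4, 3, 4, 4],
}

for criterion, scores in ratings.items():
    mos = sum(scores) / len(scores)   # MOS is the arithmetic mean of the ratings
    print(f"{criterion}: MOS = {mos:.2f}")
```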

1.5 Scientific Contribution

This research project intends to deliver an intelligible and natural-sounding TTS synthesis prototype system that embeds a first-language identification predictor on the front-end, based on the surname of the user. The integration of LID as a front-end module has a greater chance of increasing the pronunciation quality of the TTS synthesis system.

This study contributes to the provision of new attribute-driven LID systems to facilitate selection of an appropriate first language speech synthesiser that will assist with pronunciation of words/phrases/sentences in a specific under-resourced language to non-native speakers of that language. This system can be

used as a learning tool in educational institutions to help and facilitate learners at different levels to learn additional South African official languages. For a multilingual region such as Limpopo Province, with more than five official languages and wherein the majority of people speak three of those languages, the outputs of this study may be helpful to people learning a second language.

This study is the first research study to create and deploy an automatic pronunciation assistance system for the targeted under-resourced South African languages. The availability of diverse TTS synthesis systems for under-resourced languages of South Africa is found to be low. Only four official languages of South Africa (isiZulu, Sesotho, Afrikaans and isiXhosa) are included in Google Translate1, excluding the well-known global lingua franca, English. Although research in TTS synthesis systems in South Africa is relatively young, many TTS synthesis efforts occurring in South Africa have acquired international awareness and exposure in terms of the quality and impact of the research work (Louw, 2008). This shows that more research work is still required on TTS synthesis systems for all official South African languages, including their varied dialects (Langa et al., 2012). The development of pronunciation assistance systems for these languages tries to bridge the digital divide by striving to create user-friendly interfaces. Speech synthesis systems have an important role in lessening the impact of the historical linguistic discrimination and domination imposed onto marginalised and under-resourced indigenous South African languages by colonial powers. This study tries to enhance and elevate the recognition and use of indigenous South African official languages in the broader information and communication technology (ICT) sector.

This system can help people wishing to learn the pronunciation of surnames and praise names in Sepedi, Xitsonga, Tshivenda, and isiNdebele; an important aspect within a typical greeting episode amongst people from differing speaker population groups. A variety of TTS synthesis applications are available in the following fields:

1 Available at: https://translate.google.com

a) Education – interactive language learning software may optimise educational opportunities, especially for disabled learners (Singh & Kaur, 2015).
b) Economics – speech-based systems may improve customer services and provide information about services and products.
c) Financial services – speech-driven automatic teller machines (Violano & van Collie, 1992).
d) Telecommunications – in medical consultations (Kourkouta & Papathanasiou, 2014).
e) Information management – improved access to documents through search engines.

1.6 Ethical Considerations

The term ethics is defined as a set of moral principles and rules aimed at protecting the interests of participants when conducting research (Julnes & Bustelo, 2014). The following ethical issues were considered during the course of the study:

1.6.1 Informed Consent

Informed consent was gained from the subjects by means of a written and verbal agreement. The researcher informed subjects about the study, its goals, the rights of the subjects, and information confidentiality.

1.6.2 Voluntary participation

The participants were informed that their participation in the study was voluntary and they could withdraw at any time. Subjects were not forced to take part in the study.

1.6.3 Privacy and Confidentiality

The subjects were assured that all the information would be treated in strict confidence, that their answers would be kept confidential, and that no one else would have access to them. Data coding was used to link data to the study participants, and this code was kept private, so that no names are disclosed in the analysis and report writing.

1.6.4 Physical or Psychological Harm

The researcher neither subjected the participants to any physical harm nor forced the participants to provide answers to questions they did not want to answer.

1.7 Structure of Dissertation

The rest of the dissertation is organised as follows:

• Chapter 2 provides previous studies on LID and speech synthesis.
• Chapter 3 presents a detailed description of the design and implementation of the integrated system.
• Chapter 4 presents the research findings of the study. The evaluation metrics for the LID front-end system and the subjective listening tests are discussed. The optimum SVM parameters are outlined and the evaluation procedure is presented.
• Chapter 5 presents the analysis of the performance results of the LID system, speech synthesis and system usability, and a summary of the findings.
• Chapter 6 presents the research limitations, the contributions of this work, recommendations for future work, and the conclusion of the research study.

2 CHAPTER 2: BACKGROUND

In this chapter, we review the literature on text classification technologies used in machine learning before giving an overview of speech synthesis methods and the components composing state-of-the-art TTS synthesis systems. We also briefly discuss the application areas and current development tools of the technology.

2.1 Introduction

Human language technologies (HLTs) make it simple for humans to communicate with machines. They can furthermore assist corporate industries and government departments to make e-services and information accessible to society at large in different languages. Most smart computational systems, such as Siri1, can use HLTs to facilitate man-machine communication. HLTs have an important task in redressing the historical linguistic discrimination imposed onto under-resourced indigenous South African languages by levelling the language playing field. Speech and language technologies are the vital core of HLTs. Their main purpose is to make machines “speak”, “listen” and “understand” natural or human languages. This chapter details the broader aspects of the background to text and speech technology.

This chapter is organised as follows:

• Section 2.2 explains the importance of proper names in communication episodes.
• Section 2.3 explains the importance of correct and appropriate pronunciation.
• Section 2.4 overviews supervised machine-learning algorithms.
• Section 2.5 discusses applied LID studies.
• Section 2.6 discusses toolkits used for training and implementation of classifiers.
• Section 2.7 briefly outlines basic components of TTS synthesis systems, including different types of speech synthesis methods such as formant synthesis, articulatory synthesis, concatenative synthesis, and statistical parametric speech synthesis (SPSS) using HMMs and deep neural networks (DNNs).
• Section 2.8 details evaluation methods and factors or units in TTS synthesis systems.
• Section 2.9 reviews application areas of TTS synthesis systems.
• Section 2.10 discusses toolkits used for development of TTS synthesis systems.

1 Available at: http://www.apple.com/ios/siri/

2.2 Proper Names

In many cultures and traditions worldwide, people are given names by family members and relatives; some may be named after a saint, a family member, the weather, or a positive personality characteristic. These names may recall an event, describe the position of the stars at birth, or state a future ambition (Guma, 2001). Since people are unique, their names are also unique, because these names are attached to their cultural identity, and they would not wish to have their names mispronounced in conversations. Names, specifically surnames in the culture of the Basotho of Southern Africa, carry important information about one’s history, such as original geographic location, identity, clan name (totem), first language, culture and heritage, and ethnic group (Guma, 2001).

As the world becomes increasingly connected, cross-cultural communication increases, and mispronouncing a person’s name is often deemed to be, or associated with, a misinterpretation of that person’s identity. For instance, continued mispronunciation of a student’s name may contribute to lessening or belittling the identity of that student, and this can lead to unexpected anxiety and resentment which, in turn, can retard the student’s academic progress. My Name My Identity1 is an American campaign that promotes the appreciation of one’s name and identity in schools. One of the main goals of this campaign is to create a humble and caring culture in academic institutions that values diversity, as measured by name stories posted on social media.

2.3 Pronunciation

Most people agree that for someone to fluently pronounce a second language like a native speaker, they most likely must have learned it in childhood. Conversely, if a learner does not begin to learn a second language until adulthood, they are unlikely to acquire a native-like accent. These beliefs or observations seem to be supported by cases of adults who learn to speak second languages fluently but still maintain their first-language accent (Gilakjani & Ahmadi, 2011). There are factors that result in systematic pronunciation differences between speakers, including age, first language, accent (a manner of pronunciation peculiar to a particular individual), the speaker’s geographical location, cultural group and socio-economic status (Kenworthy, 1987).

In language education, Gilakjani (2012) defines pronunciation as a vital part of foreign language learning since it directly affects the performance and communicative competence of the learners. The key requirements for language proficiency are exposure to the target language, attitude, motivation, and the instruction used in learning or teaching. There are pronunciation features that apply to any spoken language. Gilakjani (2012) identified the most important features in English pronunciation to include segmental features (phonemes) and prosodic features (linking, stress and intonation). A segmental feature (phoneme) relates to the small units of sound within a word; such a sound can be a combination of consonants and vowels (Jurafsky & Martin, 2014).

1 Available at: https://www.mynamemyidentity.org/campaign/about

It is generally agreed that incorrect pronunciation of phonemes changes the meaning of the word (Suortti & Lipponen, 2014). A typical vowel sound has the following distinct and characteristic features: length, height, roundness and frontness, while a consonant sound has the following distinct and characteristic features: type, place of articulation, and voicing. These features constitute the initial requirement during the development process of a new TTS voice (Stavropoulou et al., 2014).

Knowledge of the probable first language associated with a proper name can generally improve its pronunciation (Llitjos & Black, 2001). Machine-learning technologies can be used to ascertain and predict the first language given a person’s proper name. These technologies may use linguistic rules for a specific language (rule-driven machine learning) or learn from a labelled or unlabelled database of proper names (data-driven machine learning).

2.4 Supervised Learning Techniques

The automatic organisation of documents has been an important data mining research topic since the advent of online text information. There are two types of machine-learning approaches for the classification of text documents, namely supervised and unsupervised learning (Rana et al., 2016). In unsupervised learning, the dataset is not labelled; instead, the data are clustered into different classes. In supervised learning, a classifier is developed with predefined class labels that are assigned to documents or text based on the probabilities learned from a training dataset of labelled documents or text.

Machine-learning algorithms can be used to automatically build LID classifiers (Rana et al., 2016). The procedure of building a classifier can be seen as a problem of supervised learning whereby an algorithm obtains a labelled dataset to build the classifier. Several machine-learning techniques in text classification have been applied, including n-gram rank ordering (Cavnar & Trenkle, 1994), decision trees (Farid et al., 2014), logistic regression (Yuan et al., 2012), the naive

Bayes classifiers (Fourie et al., 2014), neural networks (Nicolaou et al., 2016), k-nearest neighbour classifiers (Al-Badarenah et al., 2016), and support vector machines (SVMs) (Fourie et al., 2014; Mabokela & Manamela, 2013).

2.4.1 Multinomial Naive Bayes Classifier

The naive Bayes classifier is a probabilistic model that uses the joint probabilities of terms and classes to estimate the probabilities of classes given a test document (Mitchell, 1997). Two event models are commonly used: the multivariate Bernoulli event model and the multinomial event model, commonly called MNB (Kibriya et al., 2004). MNB classifies text given a set of classes $C = \{c_1, c_2, \ldots, c_k\}$ and a set of unique words $W = \{w_1, w_2, \ldots, w_n\}$ indexed by $\{1, 2, \ldots, n\}$, with $N = n$ the size of the vocabulary, where $k > 1$ and $n \geq i \geq 1$. MNB then assigns a test document $d_i$ to the class that has the highest probability $p(c \mid d_i)$, as given by Bayes' rule in Equation (2.1):

$$p(c \mid d_i) = \frac{p(c)\, p(d_i \mid c)}{p(d_i)}, \quad c \in C \tag{2.1}$$

In Equation (2.1), $p(c \mid d_i)$ is the posterior probability of a class; $p(c)$, the prior probability of a class, can be estimated by dividing the number of documents in class $c$ by the total number of documents; and $p(d_i \mid c)$ is the probability of a document $d_i$ given its class $c$, calculated as:

$$p(d_i \mid c) = \Bigl(\sum_n f_{ni}\Bigr)! \, \prod_n \frac{p(w_n \mid c)^{f_{ni}}}{f_{ni}!}, \tag{2.2}$$

where $f_{ni}$ denotes the count of word $n$ in test document $d_i$, and $p(w_n \mid c)$ denotes the probability of word $n$ given class $c$, estimated as:

$$\hat{p}(w_n \mid c) = \frac{1 + F_{nc}}{N + \sum_{x=1}^{N} F_{xc}}, \tag{2.3}$$

where $F_{xc}$ is the count of word $x$ in all training documents belonging to class $c$, and the Laplace estimator primes each word's count with one to avoid the zero-frequency problem. The normalisation factor $p(d_i)$ in Equation (2.1) can be determined using:

$$p(d_i) = \sum_{k=1}^{|C|} p(c_k)\, p(d_i \mid c_k) \tag{2.4}$$

In Equation (2.2), the computationally expensive terms $\bigl(\sum_n f_{ni}\bigr)!$ and $\prod_n f_{ni}!$ do not depend on the class $c$ and can be removed. Therefore, Equation (2.2) can be rewritten as:

$$p(d_i \mid c) = \beta \prod_n p(w_n \mid c)^{f_{ni}}, \tag{2.5}$$

where $\beta$ is a constant that can also be removed. Equation (2.5) can be expanded by substituting Equation (2.3) to form Equation (2.6):

$$p(d_i \mid c) = \prod_n \left( \frac{1 + F_{nc}}{N + \sum_{x=1}^{N} F_{xc}} \right)^{f_{ni}}. \tag{2.6}$$

Given the estimates of these parameters calculated from the training documents, classification can be performed on test documents by calculating the posterior probability in Equation (2.1), whose right-hand side can be expanded by substituting Equations (2.4) and (2.6). Naive Bayes learning is frequently used to solve text classification problems and is known to perform well on textual data, which has made it a common baseline classifier in text classification (Fourie et al., 2014).
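To make Equations (2.1) to (2.6) concrete, the following is a minimal sketch of MNB classification with Laplace smoothing, written directly from the formulas above. The toy documents and class labels are invented for illustration and log-probabilities are used for numerical stability.

```python
# Sketch: multinomial naive Bayes classification following Equations (2.1)-(2.6).
# Toy data for illustration only.
from collections import Counter, defaultdict
import math

train = [("ke a leboga", "nso"), ("ndza khensa", "tso"), ("ke leboga thuso", "nso")]

class_docs = defaultdict(int)       # document counts per class, used for the prior p(c)
word_counts = defaultdict(Counter)  # F_nc: count of word n in all documents of class c
vocab = set()

for text, c in train:
    class_docs[c] += 1
    for w in text.split():
        word_counts[c][w] += 1
        vocab.add(w)

N = len(vocab)                      # vocabulary size
total_docs = sum(class_docs.values())

def log_posterior(doc, c):
    # log p(c) + sum_n f_ni * log p_hat(w_n | c), Laplace-smoothed as in Equation (2.3)
    score = math.log(class_docs[c] / total_docs)
    denom = N + sum(word_counts[c].values())
    for w, f in Counter(doc.split()).items():
        score += f * math.log((1 + word_counts[c][w]) / denom)
    return score

test = "ke leboga"
print(max(class_docs, key=lambda c: log_posterior(test, c)))   # most probable class
```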

2.4.2 Support Vector Machines

An SVM is a technique based on the structural risk minimisation principle (Burges, 1998), an inductive principle for model selection used for learning from finite training data. The SVM is a popular machine-learning method for regression, classification and other learning tasks (Suthaharan, 2016). SVMs were initially designed to solve two-class problems, but can now be used for more than two classes (also known as multiclass problems). Multiclass classification is the problem of classifying instances into one of more than two classes, while binary classification is the task of classifying instances into one of two classes. In binary classification, the task is to find the decision surface that separates the positive and negative training samples of a class with a maximum margin (see Figure 2.1). Samples closest to the decision surface are called support vectors. The margin is the distance from the decision hyperplane to the support vectors. Figure 2.1 shows an example of linearly separable data; non-separable data requires more sophisticated classification algorithms. SVMs solve multiclass classification using the kernel trick and slack variables (van Heerden, 2012). Chang and Lin (2011) developed a library for support vector machines (LibSVM) that supports various SVM formulations for regression, classification and distribution estimation. Moreover, Chang and Lin (2011) discussed issues such as solving SVM optimisation problems, probability estimates, multiclass classification, theoretical convergence, and parameter selection. SVM kernels contain parameters that need to be optimised for better performance. The kernels include, among others, the linear, polynomial, radial basis function (RBF) and sigmoid kernels. The formulations of these kernels are further discussed by Chang and Lin (2011). Even though SVMs do not support string attributes, there are several data processing tools that can be utilised to convert text into a feature representation (Hall et al., 2009). This enables the use of SVMs in text or document classification.


Figure 2.1: Example of a binary classification. Adapted from Chang and Lin (2011).
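As an illustrative sketch of the kernels discussed above, the snippet below trains scikit-learn's SVC class (a wrapper around LibSVM) with each kernel on an arbitrary synthetic dataset; the data, regularisation parameter and kernel settings are examples only, not the configuration used in this study.

```python
# Sketch: comparing SVM kernels using scikit-learn's SVC (a LibSVM wrapper).
# Synthetic data and parameter values are for illustration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, n_classes=3,
                           n_informative=5, random_state=0)

for kernel in ["linear", "rbf", "sigmoid", "poly"]:
    clf = SVC(kernel=kernel, C=1.0, gamma="scale")   # kernel parameters left at defaults
    acc = cross_val_score(clf, X, y, cv=10).mean()   # 10-fold cross validation
    print(f"{kernel:8s} mean accuracy: {acc:.3f}")
```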

2.5 Language Identification

The goal of LID in speech and language processing is to classify a document or text based on its language. Automatic LID is an essential requirement for any language-based processing. For instance, LID has many practical uses, including pre-selecting a speech synthesiser depending on the language of the input text in automatic machine translation systems. In text and speech processing systems, identifying the language of an input text is important for applications such as automatic machine translation, information extraction, pronunciation prediction and speech synthesis systems. The task of identifying the language of a text or document is solved by applying text classification methods. Text classification (also known as text categorisation) is the task of automatically associating a given text with one or more predefined classes (Agarwal & Mittal, 2012). The classes can be nominal and hold types of weather, gender, a set of languages, or any other type of data.

Rule-based algorithms are used to solve the LID problem with prior linguistic knowledge of the target language. Indhuja et al. (2014) used n-grams to investigate the performance of statistical measures for text-based LID on the Devanagari script. They extracted features such as unigrams, bigrams and trigrams at word and character level. Bigrams are pairs of consecutive words, syllables, or letters; for example, the word “champion” has the character bigrams ch, ha, am, mp, etc. The character-level trigrams reached an accuracy of 82%, while word-level unigrams reached the highest accuracy of 88% when testing with all five languages. These results show that at word level the accuracy is high, meaning that words from these five languages are written differently and do not overlap. Moreover, the lower accuracy of trigrams can mean that more features were identified as belonging to more than one language. Similarly, Hannan and Sarma (2015) applied rule-based analysis to design and implement a text-based LID system for Indian languages (Assamese and Bodo) following the Assamese-Bengali script. For the best LID, the Unicode range, suffix and frequent-word comparison were the main features. The Unicode range check inspects the Unicode of each character in a word; in their case, Unicodes between 0980 and 09FF are from the Assamese-Bengali script, and Unicodes between 0900 and 097F are from the Devanagari script. The suffix module accepts a word and verifies whether the suffix of that word is available in the generated suffix list, while the frequent-word module compares the word with a list of frequent words. The algorithm reached 100% accuracy for a small text file of approximately ten words; however, as the text size increases, the accuracy for the Bodo language decreases to 77.35% while the accuracy for the Assamese language remains above 97.75%. Devanagari is a script used in India for writing languages such as Nepali, Marathi, Hindi, Sanskrit, Bhojpuri and other dialects. However, Indhuja et al. (2014) and Hannan and Sarma (2015) did not use recent well-performing statistical machine-learning algorithms such as SVMs and decision trees, which could still be experimented with on these languages using the same feature set.
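To illustrate the character n-gram features mentioned above (for example the bigrams of “champion”), a minimal helper is sketched below; it is a generic illustration, not the feature extractor used in any of the cited studies.

```python
# Sketch: extracting character n-grams from a word, as used in n-gram-based LID.
def char_ngrams(word, n):
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(char_ngrams("champion", 2))  # ['ch', 'ha', 'am', 'mp', 'pi', 'io', 'on']
print(char_ngrams("champion", 3))  # ['cha', 'ham', 'amp', 'mpi', 'pio', 'ion']
```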

As textual data is becoming more and more available online, the LID of web pages is an initial requirement when processing multilingual web documents, performing web page translation, and when retrieving data using web crawlers.

Kordestanchi and Naderi (2013) noted that language identification tools can perform differently on the same dataset. They compared tools used in the LID of web pages in the Farsi (Persian), Arabic, and Urdu languages using n-grams as features. The Java tools used were the Java Text Categorization Library (JTCL)1, Language Detection2, Tika3, and JLangDetect4. The web pages were classified into a “clean web” data set, containing web pages that are similar to the text found in non-webpage, noise-free documents, and an “ordinary web” data set, containing web pages that are likely to have spelling or grammatical errors and possibly limited content. Kordestanchi and Naderi (2013) found that JTCL and Language Detection outperformed the other tools with respect to accuracy measures including recall, precision, negative recall, and F-measure. This study could be extended to examine and minimise the root mean squared error, which can help in reducing errors to achieve higher overall accuracy.

Persian texts are difficult to classify because there are no explicit whitespaces included between words, meaning that proper word segmentation is required before further processing. Farhoodi et al. (2011) examined text classification using word-level n-grams of different units from a newspaper corpus of Persian text. They observed that trigram language models perform better, with or without linguistic pre-processing. They also examined the influence of smoothing methods on the trigram language model, and a back-off smoothing method obtained better accuracy, outperforming the add-one and absolute discounting smoothing methods.

2.5.1 Language Identification for South African Languages

Although research in LID systems for South African official languages is relatively young, a few LID efforts occurring in South Africa have acquired international awareness and exposure in terms of the quality and impact of the research work (Giwa & Davel, 2015; 2013; Botha & Barnard, 2012; Botha et al., 2007). Several

1 Available at: http://textcat.sourceforge.net/ 2 Available at: https://github.com/shuyo/language-detection 3 Available at: https://tika.apache.org/1.1/detection.html 4 Available at: https://github.com/melix/jlangdetect

research projects in text processing technology have been conducted on LID of South African indigenous languages. Botha et al. (2007) conducted LID research to address the problem of under-resourced languages of South Africa, using SVMs, naive Bayes and difference-in-frequency classifiers to investigate the accuracy achievable for all eleven official languages of South Africa, with data obtained from sources such as newspapers, the Bible, books, government documents and periodicals. The SVMs performed better for an n-gram size of three units, but when the n-gram size was increased to six units, the likelihood-based classifier outperformed the other classifiers. Giwa and Davel (2015) adapted a text-based LID from their previous study (Giwa & Davel, 2014) to perform multilingual word classification from two corpora: the South African Directory Enquiry (SADE) corpus composed of Afrikaans, English, isiZulu and Sesotho, and the National Centre for Human Language Technologies (NCHLT) 40k models, using a joint sequence model (JSM). The NCHLT 40k models showed better results than SADE, with F-measures of 81.79% and 79.99% respectively. Hence, the adapted version of the JSM provided good classification accuracy on a challenging task.

Fourie et al. (2014) compared the SVM and MNB classifiers for named entity classification of English and Afrikaans. They used the WEKA toolkit to conduct the experiments using the MNB and SVM algorithms. The baseline SVM classifier was implemented through Platt’s Sequential Minimal Optimization algorithm. The data were converted with a string-to-word vector filter, whereby words in the data were defined as classes and strings were converted to decimal arrays. The goal of named entity recognition and classification is to recognise and classify textual units (commonly referred to as named entities). The classification of proper names may be difficult in situations where these names have idiosyncratic spelling, and some proper names are considered multilingual (in other words, belonging to two or more languages). Their experiment showed that, using 10-fold cross-validation, SVMs performed better than MNB models across all granularity levels and both languages. Botha and Barnard (2012) also discussed various factors that affect text-based LID accuracy. These factors included the n-gram size, text input size, training data, and machine-learning algorithm employed, as well

as language similarities. Giwa and Davel (2013) discussed factors that influence the LID accuracy of individual words for the official languages of South Africa. Their experiments resulted in an RBF SVM outperforming naive Bayes classifiers using the Witten-Bell smoothing technique.
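The comparisons reported by Fourie et al. (2014), and in Chapter 4 of this study, rest on 10-fold cross-validation of SVM and MNB classifiers over n-gram features. The sketch below shows the general shape of such a comparison in scikit-learn; the example texts and labels are invented placeholders, not the corpora used in those experiments.

```python
# Sketch: 10-fold cross-validation comparison of MNB and a linear SVM on text.
# Example data is invented; real experiments used much larger corpora.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts  = ["thobela", "dumela", "avuxeni", "ndi matsheloni", "lotjhani", "salibonani"] * 10
labels = ["nso", "nso", "tso", "ven", "nbl", "nbl"] * 10

for name, clf in [("MNB", MultinomialNB()), ("Linear SVM", LinearSVC())]:
    model = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(1, 3)), clf)
    scores = cross_val_score(model, texts, labels, cv=10)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```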

2.6 Toolkits for Classifier Implementation

2.6.1 WEKA

Most studies in machine learning use the WEKA toolkit (Hall et al., 2009). The WEKA workbench is a collection of state-of-the-art data processing tools and machine-learning algorithms implemented in Java. WEKA provides extensive support for the whole process of experimental data mining. It has the option of statistically evaluating various learning methods, and of visualising input data and learning results. In this study, we use machine-learning algorithms implemented in WEKA.

2.6.2 Scikit-learn

Scikit-learn is a popular machine-learning library written in the Python programming language (Pedregosa et al., 2011). It offers methods for data mining and data analysis, including classification, regression, data preparation, and many others.

2.6.3 NLTK

The Natural Language Toolkit (NLTK) is a Python-based library that offers methods to work with human language data (Bird, 2006). NLTK offers access to textual corpora and lexical resources such as WordNet, along with text analysis libraries for tokenisation, classification, tagging, stemming, and many others.
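As a small usage sketch of the toolkit, the snippet below tokenises a sentence and counts word frequencies with NLTK; the example sentence is invented for illustration.

```python
# Sketch: basic text handling with NLTK (tokenisation and frequency counts).
from nltk.tokenize import wordpunct_tokenize
from nltk import FreqDist

text = "Thobela! Ke a leboga. Ke a leboga kudu."
tokens = [t.lower() for t in wordpunct_tokenize(text) if t.isalpha()]
print(tokens)                           # ['thobela', 'ke', 'a', 'leboga', ...]
print(FreqDist(tokens).most_common(3))  # e.g. [('ke', 2), ('a', 2), ('leboga', 2)]
```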

2.7 Components of a Typical TTS Synthesis System

This section details the basic architecture of a TTS synthesis system. The main components of a TTS synthesis system are natural language processing (NLP) and digital signal processing (DSP). The modules of the NLP component are reviewed, and the types of speech synthesis are also reviewed.

The NLP component produces a phonetic transcription of the input text, together with the desired intonation and rhythm (Huang et al., 2001), while the DSP component modifies and analyses speech signals, based on the linguistic features from NLP, using appropriate speech synthesis methods for proper speech generation.

2.7.1 Natural Language Processing

The NLP component shown in Figure 2.2 consists of text, phonetic and prosodic analysis, which comprise the following sub-modules:

Document structure detection module in text analysis provides a context for all other modules. Some elements of document structure including paragraph, sentence, and word segmentation may have direct consequences for prosody.

Text normalisation module in text analysis converts acronyms, numbers, emails, websites, dates, times, currencies, mathematical expressions, percentages, measures, years, abbreviations, and other non-standard orthographic entities of text into word format when needed.

Linguistic analysis module in text analysis recovers the syntactic constituency and semantic features of words, phrases, clauses, and sentences.

Homograph disambiguation in phonetic analysis removes ambiguity from homographs. Homographs are words that share the same spelling but have different meanings or pronunciations, for example, desert as a verb (/dI"z@:t/) or as a noun (/"dEz@t/) in Speech Assessment Methods Phonetic Alphabet (SAMPA) notation.

Morphological analysis module in phonetic analysis analyses component morphemes to provide an indication or hint for the pronunciation of similar words.

Letter-to-sound (LTS) conversion module is the last step of the phonetic analysis. Once all non-standard words are expanded and looked up in a pronunciation dictionary, unknown words need to be pronounced by converting a series of letters into a series of phones. This process is called grapheme-to-phoneme (G2P) conversion (Vasek et al., 2016).
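The dictionary-lookup-then-LTS flow described above can be sketched as follows; the lexicon entry and the fallback rule are hypothetical and illustrate only the control flow, not any particular G2P model used in this study:

# Hypothetical sketch of phonetic analysis: look a word up in a pronunciation
# dictionary and fall back to a toy letter-to-sound rule for unknown words.
TOY_LEXICON = {
    "desert": ['"', "d", "E", "z", "@", "t"],   # noun reading, SAMPA-like symbols
}

def toy_letter_to_sound(word):
    # Naive one-letter-one-phone fallback; a real LTS module is trained from data.
    return list(word.lower())

def phonetise(word):
    return TOY_LEXICON.get(word.lower(), toy_letter_to_sound(word))

print(phonetise("desert"))    # found in the dictionary
print(phonetise("Limpopo"))   # unknown word, handled by the LTS fallback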

Figure 2.2: Typical flow of TTS synthesis system, adapted from Huang et al. (2001).

Prosodic analysis is the study of the intonational (including prominence and phrasing) and rhythmic aspects of language in contextual analysis. Prosody can be affected by emotion, mental state and speaker attitude (Taylor, 2009). From the listener's point of view, prosody involves the recovery of a speaker's intentions and systematic perception based on pauses, pitch, duration and loudness.

2.7.2 Digital Signal Processing

Speech synthesis takes place in the DSP component. Several speech synthesis methods exist, including rule-based and data-driven methods. The rule-based method is sometimes called synthesis-by-rule and refers to a collection of rules defining how to adjust the formant frequencies, duration, pitch, and other parameters from one sound to another, while preserving the continuity present in physical systems like the human speech production system (Huang et al., 2001). Formant and articulatory synthesis are good examples of rule-based synthesis. The data-driven method includes the concatenation synthesis method that takes advantage of a rich and large speech corpus. The term concatenative synthesis refers to the use of segments of pre-recorded speech data to assemble the resulting speech waveform (Tiomkin et al., 2011).

2.7.2.1 Formant speech synthesis

Figure 2.3 describes a formant synthesiser that receives a phonetic representation and generates a waveform from a set of parameters. Pitch and formants are shown as the parameters of the synthesiser, but in practice there are more than 40 parameters. Huang et al. (2001) detailed the architecture of formant speech synthesis. Formant synthesisers can produce intelligible speech, although the produced speech is far from natural.

Figure 2.3: Diagram of a rule-based formant synthesiser system adapted from Huang et al. (2001).

2.7.2.2 Articulatory speech synthesis

One way to synthesise speech is to attempt a direct simulation of human speech production; this procedure is called articulatory synthesis. Articulatory synthesis is another rule-based synthesis method that uses parameters modelling the mechanical motions of the articulators and the resulting distributions of volume velocity and sound pressure in the lungs, larynx, and vocal and nasal tracts (Flanagan et al., 1975). There are two challenges in articulatory synthesis. The first challenge is data acquisition for the articulatory model; this data is normally obtained from X-ray images, which do not describe the degrees of freedom of the articulators. The second challenge is to find a balance between an accurate model and a model that is simple to control and design (Klatt, 1987). In principle, this is the most complete approach to synthesising speech, although the state of the art in articulatory synthesis does not produce speech of a quality comparable to that of formant or concatenative systems.

2.7.2.3 Concatenative speech synthesis

While rule-based synthesis is quite intelligible, it nonetheless sounds unnatural, because it is very difficult to capture all the sounds of natural speech in a small set of manually derived rules. In concatenative synthesis, a speech segment is synthesised by concatenating several speech fragments from the speech corpus (see Figure 2.4). Concatenative synthesis is a good example of corpus-based speech synthesis, sometimes called an example-based approach. The advantage of this method is that each segment is completely natural, so high-quality speech output is expected. When designing a concatenative speech synthesiser, we can use diphones, syllables, phonemes, words, and phrases. Diphone synthesis is one form of concatenative speech synthesis that produces flexible synthesised speech, although it lacks naturalness, pleasantness and understandability (Lemmetty, 1999). A diphone is an adjacent pair of phones. Diphones can be extracted from a set of nonsense words or natural words. The nonsense word technique has the advantage of being simpler to ensure diphone coverage of all possible sounds in the given language.

Figure 2.4: An overview of a general unit-selection scheme. Adapted from Zen et al. (2009).

Rousseau and Mashao (2005) developed a hybrid TTS synthesis system for Afrikaans using a combination of diphone unit selection synthesis and diphone concatenative synthesis. Their system was evaluated subjectively and obtained good results for pleasantness, understandability, naturalness, and overall impression. Tiomkin et al. (2011) developed a hybrid TTS synthesis system that is optimal, natural, and has smooth transitions between adjacent segments by combining statistical and concatenative synthesis units. Kiflu and Beshah (2012) developed a concatenative unit selection synthesis system for Tigrinya, an Ethiopian language; this was the first unit selection speech synthesis system developed for Tigrinya. Uddin et al. (2015) developed a phoneme-based TTS synthesis system for Bangla using a small corpus, although distortion was an issue for longer words. The authors concluded that their system is more natural, even though hearing tests were not conducted; hearing tests are so far the most accurate measure of synthetic speech quality. Sharma et al. (2015) developed a bilingual TTS synthesis system for the Assamese language using both unit selection and HMM-based synthesis. The system was evaluated using subjective and objective methods. Sharma et al. (2015) used manual segmentation to improve the quality of the database, which decreased the distortion of the synthesised speech. Anil and Shirbahadurkar (2014) developed an expressive TTS synthesis system based on pitch modification and prosodic word detection. The authors reported that neutral speech can be converted into emotional speech by modifying the pitch frequency. Emotional speech contains emotional features such as excitement, sadness and happiness, while neutral speech does not contain such features. Aoga et al. (2016) developed a unit selection speech synthesis system, implemented using the MARY TTS system. The subjective evaluation tests resulted in a MOS of 2.9 out of 5, which shows the system was acceptable, although objective tests were not conducted. Louw et al. (2005) developed an isiZulu speech synthesiser using 'Multisyn' unit selection synthesis.

The authors discussed the challenges encountered when developing the synthesiser and their solutions. The problems included the selection of appropriate phone units, the generation of reliable pronunciations, and the development of a cost function that selects and joins appropriate phone units. Mhlana (2011) developed an isiXhosa TTS synthesis system to support e-services in marginalised rural areas of the Eastern Cape Province of South Africa. The system was tested and obtained an acceptable level of usability. Mohasi (2006) used the hybrid TTS synthesis system developed by Rousseau and Mashao (2005) as a baseline to create a more advanced Sesotho hybrid TTS synthesis system by applying intonation modelling, duration and fundamental frequency techniques to the unit selection hybrid system. Mohasi (2006) conducted listening tests to assess speech quality, and these showed improved naturalness and fluency.

2.7.2.4 Statistical parametric speech synthesis using HMMs

In concatenative synthesis, synthetic speech is generated by concatenating pre-recorded speech segments from the speech corpus. Its disadvantages are the requirement for large amounts of speech training data and the effort needed to calculate the concatenation cost. An alternative is to apply a statistical parametric synthesis method. The SPSS is a model-based approach and conforms to corpus-based synthesis. SPSS might be described as generating the average of some sets of similarly sounding speech segments (Black et al., 2007; Zen et al., 2009). In this respect it differs directly from unit selection synthesis, which preserves natural speech segments. However, SPSS has dominated speech synthesis research over the last decade because it offers more benefits, including lower memory requirements for storing model parameters, flexibility to change voice characteristics (Yamagishi & Kobayashi, 2007), robustness (Yamagishi et al., 2009), a small footprint (Zen et al., 2009), and multilingual support (Gibson et al., 2010).

In a typical SPSS system, parametric representations of speech, including spectral and excitation parameters, are extracted from the speech corpus and then modelled using a set of generative models (e.g. HMMs). A maximum likelihood criterion is normally used to estimate the model parameters as

$\hat{\lambda} = \arg\max_{\lambda} \{ p(y \mid x, \lambda) \}$, (2.7)

where $\lambda$ is a set of acoustic models, $y$ is a set of acoustic features, and $x$ is a set of linguistic features corresponding to $y$. The pictorial representation of Equation (2.7) is shown in Figure 2.5.


Figure 2.5: HTS training phase adapted from Zen et al. (2009).

The synthesis phase is shown in Figure 2.6. This phase extracts linguistic features $x$ from the text to be synthesised and generates the most probable acoustic features $\hat{y}$ from the set of estimated acoustic models $\hat{\lambda}$ by maximising their output probability as

$\hat{y} = \arg\max_{y} \{ p(y \mid x, \hat{\lambda}) \}$. (2.8)

Figure 2.6: HTS speech generation adapted from Zen et al. (2009)

Finally, the speech output is built from the parametric representation of speech. SPSS that uses HMMs is referred to as the HTS method (Yoshimura et al., 1999). The HTS engine is a toolkit commonly used to develop HMM-based voices (Zen et al., 2007).

Baloyi (2012) developed a TTS synthesis system using HMMs for Xitsonga. This is the first TTS synthesis system ever developed for Xitsonga at the University of Limpopo. The system was evaluated using listening tests and good results were

observed. Adiga and Prasanna (2014) developed a hybrid TTS synthesis system by combining HTS and unit selection speech synthesis. The speech sound units were classified into vowel-like regions (e.g. vowels, semivowels, diphthongs, nasal sounds) and non-vowel regions (e.g. stop consonants, fricatives, affricates). The vowel-like regions were modelled in the HMM framework and their waveform units were taken from HTS, while the non-vowel-like regions, which were not properly modelled by HMMs, were taken from unit selection speech synthesis. The vowel and non-vowel-like regions were verified by manual and automatic segmentation of the speech signal. The hybrid system was compared to HTS and unit selection speech synthesis using both objective and subjective evaluations. The hybrid TTS synthesis system with manual segmentation outperformed the HTS system, but unit selection speech synthesis performed better than the other systems. Mullah et al. (2015) developed a TTS synthesis system for Indian English using HTS. The authors used MOS to test the naturalness and intelligibility of the system, which achieved a MOS above 3 for both. A TTS synthesis system for isiXhosa was incorporated in the development of a mobile platform for e-learning (Roux et al., 2010). Stan et al. (2011) developed an HMM-based speech synthesis system for the Romanian language using different sampling rates (16, 32 and 48 kHz) to test the effect of the sampling rate on similarity, naturalness and intelligibility. Stan et al. (2011) observed that downsampling speech data to 16 kHz degrades speaker similarity; however, a higher sampling rate did not improve either the naturalness or the intelligibility of the synthetic speech. More advanced NLP modules can be used to try to increase the performance of a TTS synthesis system.

The HMM-based synthesis systems suffer from degraded speech quality caused by the inadequacy of acoustic modelling (e.g. trajectory HMM), limitations of the vocoder (e.g. speech transformation and representation using adaptive interpolation of weighted spectrum), and over-smoothing of parameter generation (e.g. global variance). HMMs have been widely used in the development of speech synthesis systems; an alternative generative model can be applied to overcome these problems.

2.7.2.5 Statistical parametric speech synthesis using deep neural networks (DNN)

Recent advances in deep machine-learning algorithms can be used to address the limitations of HMMs (Bengio, 2009). The decision trees in HTS, which map linguistic contexts extracted from text to probability densities of speech parameters, are replaced by a DNN. Deep neural networks with multiple hidden layers can perform better than networks with only one hidden layer, although training such networks requires high computational cost and can become impractical. However, recent developments in hardware (e.g. graphics processing units) and software have made it possible to train a DNN on large training data. Deep neural networks have achieved good results in machine-learning fields including pattern recognition and automatic speech recognition (Hinton et al., 2012). A DNN-based SPSS can be used to address the limitations of the conventional approach (Zen et al., 2013). A DNN is used to model the relationship between input text and its acoustic realisation. The DNN-based approach has shown potential to address the limitations of the conventional decision-tree-clustered context-dependent HMM-based approach. Zen et al. (2013) applied subjective and objective evaluations to compare the performance of DNN-based synthesis with HMM-based synthesis. Objective evaluations showed the DNN-based approach obtaining better prediction of spectral and excitation parameters than the HMM-based approach. Moreover, the DNN-based approach was preferred over the HMM-based synthesis in the subjective listening test. In a more recent study, Van den Oord et al. (2016) applied a DNN approach to generate raw audio from text. This approach outperformed an HMM-based unit selection concatenative speech synthesiser (Gonzalvo et al., 2016) and a long short-term memory recurrent neural network-based statistical parametric speech synthesiser (Zen et al., 2016) in naturalness. However, the HMM-based approach has the advantage of reduced computational cost over the DNN-based approach.

2.8 Evaluation of TTS Synthesis System

The quality of synthesised speech is evaluated using objective measures and subjective listening tests. Objective speech quality measures include, among others, linear predictive coding-based measures, time-domain and frequency-weighted signal-to-noise ratio measures, and composite measures (Hu & Loizou, 2008). Composite measures are obtained by combining other objective measures to create a new measure; more detailed objective measures are discussed by Hu and Loizou (2008). Zen et al. (2013) used 5th Mel-cepstral coefficients, while Sharma et al. (2015) used the first 13 cepstral coefficients of the cepstrum to conduct objective evaluation. Beněk (2014) developed a TTS synthesis system for the Czech language, performed only subjective evaluation, and stated that there are no objective tests (Beněk, 2014, p. 30). This claim is inaccurate, since objective tests were already in use before 2014 (Zen et al., 2013; Hu & Loizou, 2008). In principle, the quality of speech is best measured via a listening test in which qualified evaluators listen to the synthesised speech and give quality opinions on a Likert scale. This method is also known as an MOS test. The quality of synthesised speech can be measured according to the following properties: naturalness, intelligibility, listening effort1, flexibility, pleasantness, similarity, pronunciation, and other recent sophisticated measures. Naturalness is the degree to which the synthesised speech sounds close to natural speech. Intelligibility focuses on the ability of people to understand the synthesised speech; listening effort is sometimes used as a factor of intelligibility. Flexibility focuses on how well the system handles out-of-vocabulary words and other non-standard words. Pleasantness focuses on the pleasure that one associates with listening to the synthesised voice. Similarity deals with how close the synthesised speech is to that of the original speaker. Pronunciation focuses on how well the synthesised speech pronounces words; prosody is the main factor of pronunciation.

There are several methods used to test the quality of synthetic speech such as:

1 Popularly known as understandability

 Preference test (e.g. AB test) is used to compare two or more synthesised speech samples. It is common practice to compare natural speech with synthesised speech.
 Analysis of variance can be applied to test factors of certain features between synthesised speech samples.
 MOS is used to rate synthesised speech on a Likert scale. MOS is commonly used to measure naturalness, listening effort, pleasantness, similarity, pronunciation, and flexibility (Viswanathan & Viswanathan, 2005). A minimal scoring sketch is given after this list.
 Diagnostic rhyme test (DRT) tests the intelligibility of initial consonants based on 96 pairs of confusable rhyming words (e.g. pond/bond or tense/dense) (Greenspan et al., 1998). Evaluators listen to one word and choose the correct one from the pair. The percentage of correct identifications is used as an intelligibility measure.
 Modified rhyme test (MRT) is commonly used to test the intelligibility of synthesised speech (House et al., 1965). This method focuses on either initial or final consonants (e.g. went, dent, rent, bent, sent, tent). A list of 300 words contains 50 sets of 6 words. Evaluators must identify a single word from a closed list of six words. The percentage of correct identifications is used as an intelligibility measure.
 Semantically unpredictable sentences (SUS) can be used to test speech quality at sentence level (Benoît et al., 1996).
 Word error rate (WER) and sentence error rate (SER) are commonly used in speech recognition; however, these methods are currently also employed to measure the intelligibility of synthesised speech.
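As a minimal illustration of how the MOS and rhyme-test scores above are computed from listener responses (the ratings and answers below are invented, and this sketch is not the evaluation tooling used in this study):

# Toy computation of a mean opinion score and a rhyme-test intelligibility score.
mos_ratings = [4, 5, 3, 4, 4, 5, 3]                  # invented Likert ratings
mos = sum(mos_ratings) / len(mos_ratings)            # mean opinion score

presented = ["pond", "tense", "went", "bent"]        # words played to listeners
responses = ["pond", "dense", "went", "bent"]        # words the listeners chose
correct = sum(p == r for p, r in zip(presented, responses))
intelligibility = 100.0 * correct / len(presented)   # percentage correct

print(f"MOS = {mos:.2f}, intelligibility = {intelligibility:.1f}%")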

2.9 TTS Synthesis Application Areas

Speech synthesis systems make it possible for people to use computational devices such as smartphones to access information, use email systems or even do voice dialling in their first language. In today's electronic digital age, the use

of TTS synthesis technologies has increased exponentially in mobile smartphones, computers, internet-based services, banking, and dialogue systems1. Currently, access to computer-based information is an important requirement for social and technological evolution, and the use of human language technology has increasingly become essential for linguistic resources. In order to ensure that people have adequate access to information or linguistic resources in their first language, these resources should be digitised. For example, dictionaries are available in digitised audio format for different languages. Google voice search, Siri2 and other voice-enabled speech applications are good examples of TTS synthesis systems on mobile devices. Modern navigation systems use back-end TTS synthesis speech navigation for faster guidance (Jeon et al., 2015). These applications are embedded in vehicles, aeroplanes, and mobile devices (Ramani et al., 2013).

These technologies enable humans to interact and communicate with machines, and deliver valuable and useful e-services in domains ranging from science and health to economics and education. Speech technology applications play an important role in teaching and language learning. An increase in the development of such systems enhances learning through computer-assisted language learning and computer-assisted pronunciation training (CAPT). Developing CAPT systems for pronunciation learning and teaching requires extensive linguistic resources and experts. Chen and Li (2016) reviewed approaches and challenges in CAPT development. Eskenazi (1999) provides a good analysis of using ASR for training students to learn new languages. Yu and Wang (2016) proposed a pronunciation visualisation instruction system based on an articulatory mesh model. Their system was tested on students learning Chinese as a second language and improved accuracy from 68.4% (before learning) to 97.6% (after learning). Speech synthesis applications simplify language and pronunciation

1 Available at: http://www.acapela-group.com/voices/demo/
2 Available at: http://www.apple.com/ios/siri/

learning; such applications can be applied not only in language learning but can be extended to subject-specific domains of learning.

2.10 Speech Synthesis Systems Toolkits

2.10.1 Festival TTS

The Festival TTS synthesis system is a well-known, popular multilingual TTS synthesis system used for creating new synthetic voices in a limited domain or an open vocabulary domain (Taylor et al., 1998). The Festival TTS system is open-source software allowing personal and commercial usage. The system uses FestVox for building synthetic voices (Black & Lenzo, 2014). The Festival TTS system supports waveform generation using diphone-based unit selection and the HTS approach. The system has a large memory footprint and is relatively slow; hence, a small, fast runtime TTS engine called Flite was developed, aiming to provide improvements with regard to speed, code size, data size, thread safety, and portability of maintenance (Black & Lenzo, 2001).

2.10.2 Speect TTS

Speech synthesis with extensible architecture (Speect)1 is a multilingual TTS synthesis system that offers various APIs (Louw, 2008). Speect has the capability of creating new TTS synthesis voices. It offers Python bindings for customisation and implementation of advanced ideas. The program is under active development at the CSIR2.

2.10.3 IBM Watson TTS

The IBM Watson3 TTS service is a cloud-based service that provides an API to synthesise text into speech in a variety of languages, accents, and voices. IBM Watson TTS supports the speech synthesis markup language (SSML) for the translation of text into the International Phonetic Alphabet or IBM Symbolic Phonetic

1 Available at: http://speect.sourceforge.net
2 Available at: http://www.csir.co.za/meraka/
3 Available at: http://www.ibm.com/watson/developercloud/text-to-speech/api/v1/

Representation. The IBM Watson TTS API consists of (a) synthetic voices, (b) methods to synthesise text or SSML over the internet, (c) pronunciation (a method that shows the pronunciation of a specified word), (d) custom models (methods for the creation of custom voice models), and (e) custom words (methods that allow clients to manage word or translation pairs in a custom voice model). Furthermore, IBM Watson provides the following features:

 Interact – allows the creation of dialogue systems.
 Learn – uses machine learning to grow the subject matter expertise in applications.
 Reason – provides customised recommendations by understanding a client's emotion, , and personality.
 Understand – interprets data including unstructured text, videos, images and audio.

2.10.4 Merlin

Merlin is a recent open-source toolkit for DNN-based speech synthesis (Wu et al., 2016). The Merlin toolkit provides acoustic modelling functions, including acoustic and linguistic feature normalisation, linguistic feature vectorisation, neural network acoustic model training, and generation. Merlin is written in Python and is not a complete TTS system; hence, it requires an external front-end such as Festival TTS. Merlin has been used in recent research work (Valentini-Botinhao et al., 2015; Wu et al., 2015; Watts et al., 2016).

2.10.5 MARY TTS

The MARY TTS synthesis system is a tool for research, development and teaching in the domain of TTS synthesis (Schröder & Trouvain, 2003). The MARY TTS system is open-source voice-building software designed to run on Java (Schröder et al., 2011). New languages and synthetic voices can be created and added to the MARY TTS system (Pammi et al., 2010). Synthetic voices can be created using unit selection synthesis or the HMM-based speech synthesis approach. The program uses the SAMPA format for the transcription of a new language

module (Wells, 2005). MARY TTS supports the generation of phonemes from text, the generation of TEXTGRID files, the generation of emotional speech via the emotion markup language (EmotionML), and prosody generation from the MARY XML format (Schröder & Breuer, 2004). EmotionML is commonly used for (a) manual annotation of data, (b) generation of emotion-related system behaviour, and (c) automatic recognition of emotion-related states from user behaviour. The program documentation can be accessed from the MARY TTS GitHub web page1. For this research work, MARY TTS is used to develop the synthetic voices.

2.11 Summary

The study of text-based LID is a well-known topic and many approaches have been presented. Text-based LID can be approached from a pattern recognition viewpoint by examining statistical attributes of text as feature measures. MNB is a simple and effective method that uses n-gram statistics as features. Statistical measures can depend on keywords, frequent words, or special letters found in a document. Accuracy can be improved by increasing the feature dimensionality. Various LID techniques have been discussed, including the popular baseline MNB and the robust SVMs. MNB can perform classification on string data; however, SVMs require data transformation before classification. We have discussed LID studies for South African languages and noted that research for these languages is still open. Some of the toolkits used for text classification have also been discussed.

The NLP and DSP components of the TTS synthesis system have been discussed. Various speech synthesis methods have been reviewed, and we noted that the HTS approach is better than the DNN approach in terms of computational cost; the DNN approach requires large amounts of memory when training with many hidden layers. The disadvantage of unit selection concatenative synthesis is the requirement of a large database, which disadvantages under-resourced languages. The advantage of the HTS approach is the ability to support the

1 Available at: https://github.com/marytts/marytts/wiki

development of new synthetic voices in a limited domain at a normal computational cost. We discussed evaluation methods for a TTS synthesis system. The advantage of the SUS evaluation method is that there are no semantic contextual cues to the intelligibility of the individual words. Several measures of speech quality, including, among others, naturalness and intelligibility, were discussed. The application areas of TTS systems were detailed. Several programs that are currently used to develop new advanced TTS voices were also discussed. The next chapter gives the design and implementation of the proposed system.

3 CHAPTER 3: DESIGN AND IMPLEMENTATION

In this chapter, we describe the datasets and the procedure for the implementation of the pronunciation assistant, which is a combination of a front-end LID and a back-end TTS synthesis system. Two supervised machine-learning methods are implemented to perform a three-way multiclass classification and to explore the LID accuracy that can be achieved for the following under-resourced official languages of South Africa, namely Sepedi, Xitsonga, isiNdebele and Tshivenda, using n-gram statistics as features. The development toolkits deployed to implement the back-end and front-end are both Java based: the WEKA toolkit is used to build the LID model, and the MARY TTS synthesis system is used to build new TTS voices using the HMM method.

3.1 Introduction

South Africa is a multilingual country with 11 official languages. The pronunciation of words, mostly proper names, is difficult for non-native speakers, since most proper names are written in native languages. However, knowing the native language of a proper name may reduce such pronunciation difficulties. In our experiment, we aim to predict the first language associated with an input surname for the South African under-resourced official languages, namely, Sepedi, Xitsonga, Tshivenda and isiNdebele. To achieve this aim, we acquired training textual data comprising surnames usually associated with these languages. We compared machine-learning algorithms to select the best one for building a text-based LID predictor for surname classification. The LID front-end predictor is used to classify an input surname by predicting the first language associated with that surname, as shown in Figure 3.1. Once the language is predicted, the TTS phase continues with the pronunciation rendition of that surname using the predicted language. The LID systems discussed in Chapter 2 (Literature review) are not available for testing on our training text data; hence, no comparisons could be made.


Figure 3.1: The diagram of the overall system interaction

The layout of this chapter is as follows:

 Section 3.2 discusses the LID dataset, features, deployed machine-learning algorithms, and implementation of the classifier.
 Section 3.3 discusses the speech dataset, toolkits, and the procedure followed for the development and implementation of new languages and HMM voices.
 Section 3.4 explains the integration of both TTS and LID interfaces.
 Section 3.5 details the deployment of the system to the “real-life” production server.

3.2 Front-end Phase: Language Identification Module

This section details the process, tools and methods used in developing the LID for under-resourced languages. Rule-based and statistical methods can be used for text-based LID implementation. The rule-based approach requires linguistic experts and a large amount of time; hence, we chose the statistical machine-learning method for its flexibility, accuracy, and robustness.

3.2.1 Data Acquisition and Pre-processing

The training textual data for the LID front-end was acquired from the Department of ICT at the University of Limpopo. The training data consists of surnames and their corresponding first languages in Sepedi, Tshivenda, isiNdebele and Xitsonga. The Sepedi text data is the largest, with 1100 surnames, followed by Tshivenda with 1035 and Xitsonga with 908. The isiNdebele text data, at only 213 surnames, is not enough, as it would cause biased results and poor performance of the LID predictor; hence, it is excluded from the LID front-end development (see Figure 3.2). The complete dataset contains 3043 surnames, excluding isiNdebele.


Figure 3.2: Text corpora for training LID

The acquired training data was in Microsoft Excel worksheet format and needed to be converted to a WEKA-supported format. WEKA supports the attribute-relation file format (ARFF) and comma-separated values (CSV). The data was converted into CSV using Microsoft Excel, and then converted into ARFF using the WEKA toolkit. The easiest way to introduce datasets into WEKA is by using ARFF files. An example of an ARFF file is shown in Listing 3.1. The first section of the ARFF file is the header information, which contains the relation declaration and attribute declarations, including the name of the relation, a list of the attributes (the columns in the data section), and their datatypes. The datatype can be one of the following:

 numeric (integer or real numbers),
 nominal,
 string,
 date,
 and relational (for multi-instance data).

@relation LanguageIdentification

@attribute surname string
@attribute class {nso, tso, ven}

@data
"MTHEBULE",tso
"MKHONTO",tso
"MAKARINGE",tso
"MASINA",tso
"SHIVAMBU",tso
"NEDZAMBA",ven
"NEVHUFUMBA",ven
"NELUVHALANI",ven
"TSHIVHANGANI",ven
"PHANDAVHUDZI",ven
"MOGANO",nso
"THABA",nso
"RASEEMELA",nso
"KGANYAGO",nso
"LETEBELE",nso
...

Listing 3.1: Extract of the WEKA ARFF used for creating the LID module

The string attribute allows for the creation of datasets containing arbitrary textual values. This is important in data mining, as string-attribute datasets can be created and then manipulated with WEKA filters to convert them into numeric or nominal sets, or any desired target sets. Nominal attributes are defined by specifying a list of possible values they can take. The second section contains the data declaration line and the actual instance lines. The @data declaration is a single line denoting the start of the data segment in the file. Each instance is represented on a single line, with carriage returns denoting the end of the instance, and attribute values are separated by commas. Attribute values must appear in the order in which they are declared in the header section. An unknown or missing value is represented by a question mark; this feature is used for the prediction of a surname's class in the online demo (see the project website (Sefara, 2017)).
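A minimal sketch of the CSV-to-ARFF conversion described above, assuming a two-column CSV with a surname and a language code per row (the file names are hypothetical; the actual conversion in this study was done with Microsoft Excel and the WEKA toolkit):

# Convert a two-column CSV (surname, language code) into a WEKA ARFF file.
# File names and the exact CSV layout are assumptions made for illustration.
import csv

def csv_to_arff(csv_path, arff_path, relation="LanguageIdentification"):
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [(r[0], r[1]) for r in csv.reader(f) if r]
    classes = sorted({lang for _, lang in rows})
    with open(arff_path, "w", encoding="utf-8") as out:
        out.write(f"@relation {relation}\n\n")
        out.write("@attribute surname string\n")
        out.write("@attribute class {" + ", ".join(classes) + "}\n\n")
        out.write("@data\n")
        for surname, lang in rows:
            out.write(f'"{surname}",{lang}\n')

csv_to_arff("surnames.csv", "surnames.arff")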

Listing 3.1 shows an extract of the dataset file used in training the LID module. This file consists of the relation (dataset), two attributes surname and class, and

instance data values. Surname is a string attribute that stores the actual instance values (surnames), while class is a nominal attribute that stores the list of languages. All the surnames are labelled or tagged with their first languages in the data section. This approach is called supervised learning, wherein labelled data is used to train a classifier model. The languages are represented by their language codes (or locales). Table 3.1 shows the language codes representing Sepedi (nso), Tshivenda (ven), Xitsonga (tso) and isiNdebele (nbl) under the International Organization for Standardization standard ISO 639-2:20081.

Table 3.1: Language Locales

Code   Name
nso    Sepedi
nbl    isiNdebele
ven    Tshivenda
tso    Xitsonga

3.2.2 N-gram Feature set

The text-based LID problem can be tackled from a linguistic or a statistical approach. The linguistic approach would be the favourable choice where high classification accuracies are expected, although a large amount of linguistic expertise and resources is required to code a language. However, for under-resourced languages, such resources are not readily available; therefore a statistical approach becomes a viable alternative. Statistical language models can be built from sequences of letters, words or n-grams. Character n-gram based models are suitable for the identification of individual and unique words. This is also the most popular choice in the literature reviewed, and we have restricted our feature sets

1 Available at: https://www.loc.gov/standards/iso639-2/php/code_changes.

to character n-gram based features (Cavnar & Trenkle, 1994). The size of n can increase the accuracy of the classifier; however, the accuracy decreases beyond a certain value of n. Furthermore, higher values of n require more computation and memory. In most experiments, trigrams yield satisfactory results (Fourie et al., 2014). Hence, we have restricted our attention to n-grams of size one (unigrams) up to five.

Most classifiers, including SVMs, cannot handle string attributes; hence the acquired training data needed to be processed using appropriate filters. The WEKA toolkit contains supervised and unsupervised filters. The string-to-word-vector filter is an unsupervised filter that converts string attributes into a set of attributes representing word occurrence information from the text contained in the strings. This filter supports word, word n-gram and character n-gram tokenisation. Since our training data consists of single words, character n-gram tokenisation is used to tokenise all the surnames by generating character n-grams of size one to five and then converting them to feature vectors.
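WEKA's string-to-word-vector filter with a character n-gram tokeniser has a close analogue in scikit-learn; the following minimal sketch (not the WEKA pipeline actually used) turns a few surnames from Listing 3.1 into character n-gram count vectors of size one to five:

# Illustrative analogue of WEKA's StringToWordVector + CharacterNGramTokenizer:
# character n-gram (n = 1..5) count vectors for a few surnames.
from sklearn.feature_extraction.text import CountVectorizer

surnames = ["MTHEBULE", "NEDZAMBA", "MOGANO"]
vectoriser = CountVectorizer(analyzer="char", ngram_range=(1, 5), lowercase=False)
X = vectoriser.fit_transform(surnames)

print(X.shape)                                    # (3, number of distinct n-grams)
print(vectoriser.get_feature_names_out()[:10])    # first few n-gram features (scikit-learn >= 1.0)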

3.2.3 Machine-learning Algorithms

Machine-learning algorithms can deliver optimal performance for the prediction of a class or category when applied on a high-performing computer. The project is set up on a desktop computer with 2 GB of RAM and a 2.94 GHz Intel® Core™ 2 Duo CPU. This research project uses a text classification method as LID based on n-gram features. There are many machine-learning algorithms applied in pattern recognition, big data analytics, statistical analysis and text mining (Botha & Barnard, 2008). However, SVM and multinomial naive Bayes (MNB) have shown better performance on text identification (Fourie et al., 2014). Supervised machine-learning models use associated learning algorithms that recognise patterns and analyse data for regression analysis and classification. Our experiments employed MNB as the baseline classifier and the SVM libraries for multiclass classification under different SVM kernels, evaluated with K-fold cross-validation in WEKA.

3.2.3.1 K-fold cross-validation

Cross-validation is a technique for evaluating predictive models by partitioning the original data into training and testing sets. Cross-validation methods include K-fold cross-validation, random subsampling, and leave-one-out validation. K-fold cross-validation is used here because it is simple and uses all the data for both training and testing. Moreover, K-fold cross-validation assists in selecting the best model and its parameters to create the final model. We performed K-fold cross-validation on our training set with K set to 10, to (a) balance reliable estimates against computational cost and (b) avoid biased results.

As illustrated in Figure 3.3, the 10-fold cross-validation approach partitions the corpus into 10 equal parts and performs training on 9 parts, leaving one part for testing. The procedure is repeated 10 times so that each partition is used for testing exactly once.

Figure 3.3: Ten-fold cross-validation.
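A minimal sketch of 10-fold cross-validation, using scikit-learn and a built-in dataset as a stand-in for the WEKA evaluation actually performed in this study:

# Illustrative 10-fold cross-validation: each fold is held out for testing once
# while the remaining nine folds are used for training.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

X, y = load_digits(return_X_y=True)              # stand-in dataset
scores = cross_val_score(MultinomialNB(), X, y, cv=10)
print(scores.mean(), scores.std())               # average accuracy and its spread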

3.2.3.2 Multinomial naive Bayes

The naive Bayes classifier is a simple probabilistic classifier based on Bayes' theorem (see Equation (3.1)) with independence assumptions. There are several variations of naive Bayes, viz. MNB, Bernoulli naive Bayes and binarised MNB. MNB is often selected as a baseline classifier in language detection tasks since it is less computationally intensive in both memory and processor consumption (Giwa & Davel, 2013). Moreover, it performs well under small or limited data conditions and takes a shorter training time compared to SVMs. During training, the MNB parameters are determined by frequency estimation, that is, by computing the appropriate frequencies from the data. A classifier based on MNB is implemented using the open-source WEKA toolkit, which contains many classification algorithms. More details of the experimental setup are given in the next sections.

3.2.3.3 Support vector machines

Several classifiers based on SVMs are implemented using LibSVM, a freely downloadable library for SVMs, in the WEKA toolkit. LibSVM is a function in WEKA that allows the user to create SVM-based classifiers with different kernels. The SVM uses kernels to handle non-separable data in a higher dimensional feature space. The goal of an SVM classifier is to find an optimal separating hyperplane that maximises the margin of the training data. There are various methods that can be used to solve SVM classification problems with more than two classes (Hsu & Lin, 2002):

 Directed acyclic graph SVM is an algorithm for multiclass classification.
 One-against-one or pairwise classification, where one binary SVM is created for each pair of classes to separate vectors of one class from vectors of the other class.
 One-against-all classification, where there is one binary SVM for each class to separate vectors of that class from vectors of all other classes.

LibSVM uses the one-against-one classification method, where a total of three classifiers are constructed (for our three classes) and classification is performed using a voting strategy. The WEKA toolkit contains filtered classifiers that can run an arbitrary classifier on data that has been passed through a filter. In this experimentation, filtered classifiers are used to run a classifier on filtered data. This approach is shown in Figure 3.4.

Figure 3.4: Proposed supervised learning workflow
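WEKA's filtered classifier (a filter and a classifier trained together) corresponds roughly to a scikit-learn pipeline. The sketch below, an illustrative analogue rather than the WEKA setup used in the experiments, chains a character n-gram vectoriser with an RBF-kernel SVM; scikit-learn's SVC, like LibSVM, decomposes multiclass problems with a one-against-one voting scheme. The surnames and labels are taken from Listing 3.1.

# Illustrative analogue of a WEKA FilteredClassifier: a vectorising "filter"
# chained with an SVM classifier. SVC, like LibSVM, uses one-against-one
# decomposition for multiclass problems.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

surnames = ["MKHONTO", "SHIVAMBU", "NELUVHALANI", "TSHIVHANGANI", "KGANYAGO", "THABA"]
labels   = ["tso",     "tso",      "ven",         "ven",          "nso",      "nso"]

clf = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(4, 5), lowercase=False),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
clf.fit(surnames, labels)
print(clf.predict(["MASINA"]))   # another surname from Listing 3.1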

3.2.3.4 Experiment setup

The experiments focus on the SVM and MNB machine-learning methods. LibSVM contains various SVM kernels; hence the same dataset is applied to the various kernels to examine their performance. In total, five experimental setups are conducted.

a) Experiment 1: This experiment employed the baseline MNB classifier on the original dataset. MNB classifies text given a set of classes $C = \{c_1, c_2, \dots, c_k\}$ and a set of unique words $W = \{w_1, w_2, \dots, w_n\}$, where $N = \{1, 2, \dots, n\}$ defines the size of the vocabulary, $n \ge i \ge 1$ and $k > 1$. MNB then assigns a test document $d_i$ to the class with the highest probability $P(c \mid d_i)$, which is given by Bayes' rule:

$P(c \mid d_i) = \dfrac{P(c)\,P(d_i \mid c)}{P(d_i)}$ (3.1)

The following WEKA command is used to train the model:

weka.classifiers.bayes.NaiveBayesMultinomialText -P 0 -M 2.0 -norm 1.0 -lnorm 2.0 -stopwords-handler weka.core.stopwords.Null -tokenizer "weka.core.tokenizers.CharacterNGramTokenizer -max 3 -min 1" -stemmer weka.core.stemmers.NullStemmer

where min and max are the n-gram ranges.

b) Experiment 2: This experiment employed a LibSVM classifier with the RBF kernel on the original dataset. The RBF kernel is given by the formula,

$K(x_i, x_j) = e^{-\gamma \lVert x_i - x_j \rVert^2}, \quad \gamma > 0,$ (3.2)

where $\gamma$ is a kernel parameter.

The following WEKA command is used to train the model:

weka.classifiers.meta.FilteredClassifier -F "weka.filters.unsupervised.attribute.StringToWordVector -R 1 -W 1000 -prune-rate -1.0 -N 0 -stemmer weka.core.stemmers.NullStemmer -stopwords-handler weka.core.stopwords.Null -M 1 -tokenizer \"weka.core.tokenizers.CharacterNGramTokenizer -max 5 -min 4\"" -W weka.classifiers.functions.LibSVM -- -S 1 -K 2 -D 3 -G 0.0 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -Z -W "1.0 1.0 1.0"

where min and max are the n-gram ranges and K selects the kernel.

c) Experiment 3: This experiment employed a LibSVM classifier with the sigmoid kernel on the original dataset. The sigmoid kernel is given by the formula,

$K(x_i, x_j) = \tanh(\gamma x_i^{T} x_j + r),$ (3.3)

where $\gamma$ and $r$ are kernel parameters.

The following WEKA command is used to train the model:

weka.classifiers.meta.FilteredClassifier -F "weka.filters.unsupervised.attribute.StringToWordVector -R 1 -W 1000 -prune-rate -1.0 -N 0 -stemmer weka.core.stemmers.NullStemmer -stopwords-handler weka.core.stopwords.Null -M 1 -tokenizer \"weka.core.tokenizers.CharacterNGramTokenizer -max 5 -min 4\"" -W weka.classifiers.functions.LibSVM -- -S 1 -K 3 -D 3 -G 0.0 -R -0.95 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -Z -W "1.0 1.0 1.0"

where min and max are the n-gram ranges and K selects the kernel.

d) Experiment 4: This experiment employed a LibSVM classifier with the polynomial kernel on the original dataset. The polynomial kernel is given by the formula,

$K(x_i, x_j) = (\gamma x_i^{T} x_j + r)^{d}, \quad \gamma > 0,$ (3.4)

where $\gamma$, $r$, and $d$ are kernel parameters.

The following WEKA command is used to train the model:

weka.classifiers.meta.FilteredClassifier -F "weka.filters.unsupervised.attribute.StringToWordVector -R first-last -W 1000 -prune-rate -1.0 -N 0 -L -stemmer weka.core.stemmers.NullStemmer -stopwords-handler weka.core.stopwords.Null -M 1 -tokenizer \"weka.core.tokenizers.CharacterNGramTokenizer -max 5 -min 4\"" -W weka.classifiers.functions.LibSVM -- -S 1 -K 1 -D 1 -G 0.0 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1

where min and max are the n-gram ranges and K selects the kernel.

e) Experiment 5: This experiment employed a LibSVM classifier with the linear kernel on the original dataset. The linear kernel is given by the formula:

$K(x_i, x_j) = x_i^{T} x_j$. (3.5)

The following WEKA command is used to train the model:

weka.classifiers.meta.FilteredClassifier -F "weka.filters.unsupervised.attribute.StringToWordVector -R 1 -W 1000 -prune-rate -1.0 -N 0 -L -stemmer weka.core.stemmers.NullStemmer -stopwords-handler weka.core.stopwords.Null -M 1 -tokenizer \"weka.core.tokenizers.CharacterNGramTokenizer -max 5 -min 5\"" -W weka.classifiers.functions.LibSVM -- -S 1 -K 0 -D 3 -G 0.0 -R 0.0 -N 0.6 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -Z -W "1.0 1.0 1.0"

where min and max are the n-gram ranges and K selects the kernel.

All experiments are conducted using the MNB and SVM machine-learning algorithms from the WEKA toolkit. Five models are built and saved to a local file for future predictions. Each model is evaluated using 10-fold cross-validation. The presentation and analysis of the output from these experiments are discussed in the next chapter.
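The save-and-reuse workflow mentioned above (train once, load later for prediction) can be sketched as follows; the actual models in this study are serialised from WEKA, so the joblib calls and the toy data here are only an illustrative analogue:

# Illustrative analogue of saving a trained model to a local file and loading
# it again later for prediction (the study itself serialises WEKA models).
import joblib
from sklearn.naive_bayes import MultinomialNB

model = MultinomialNB().fit([[1, 0], [0, 1]], ["nso", "tso"])   # toy data
joblib.dump(model, "lid_model.joblib")                          # persist to disk

restored = joblib.load("lid_model.joblib")                      # reload later
print(restored.predict([[1, 0]]))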

3.3 Back-end Phase: Speech Synthesis Module

There are different methods for producing synthetic speech from any given input text, including concatenative synthesis, articulatory synthesis, formant synthesis, and HMM-based synthesis. This study focuses on synthetic voices created with the HMM-based synthesis method because of the following advantages (Zen et al., 2009):

 Language-independent architecture
 Small footprint
 Rapid prototyping of new voices
 Flexible synthesis parameters
 Fast and portable

This section covers the process undertaken to develop TTS synthesis systems for the targeted Sepedi, isiNdebele, Tshivenda and Xitsonga languages using HTS. It details the acquired datasets, explains the files and data used for training the systems, and describes the required software development and implementation phase.

3.3.1 Datasets

Two important requirements for building a new voice are waveform files and the corresponding transcription text files. The training speech data shown in Figure 3.5 was acquired from the Lwazi project. Four recruited, volunteering mother-tongue (first language) speakers were engaged for the recording of the training data. About 1318 sentences in Sepedi are used to train the Sepedi TTS voice, 1000 sentences in Tshivenda are used to develop the Tshivenda TTS voice, 994 sentences are used to train the isiNdebele TTS voice, and 910 sentences are used to train the Xitsonga TTS voice (see Figure 3.5). The Sepedi and Tshivenda audio waveform files are in a female voice, while the isiNdebele and Xitsonga audio waveform files are in a male voice. The recording durations are highlighted in Figure 3.6, with Xitsonga having the shortest duration of 1.28 hours and Sepedi the longest duration of 2.23 hours.


Figure 3.5: Number of sentences


Figure 3.6: Speech corpora duration comparison

The corpus size for Sepedi is the largest, at 274.6 MB. A larger corpus helps with the naturalness of the synthetic speech; however, the training takes much longer (Zen et al., 2009). The second largest corpus is isiNdebele with 224.4 MB, followed by Xitsonga with 169.2 MB; Tshivenda has the smallest corpus size of 160.5 MB. The isiNdebele corpus has fewer sentences than the Tshivenda corpus, but its duration is longer, because the sentences in the isiNdebele corpus are longer (see Figure 3.7).


Figure 3.7: Corpora size

3.3.2 Compiling MARY TTS Builder Tools

Additionally, some software packages had to be installed to create a working environment before the experiments were conducted. The following packages are required to execute the MARY TTS synthesis system on 32-bit Ubuntu 14.04 Long-Term Support: build-essential, git, mc, libc6-dev, libx11-dev, libncurses5-dev, SoX, tcl-snack, g++, and python3-dev.

The Git tool is installed to clone MARY TTS SNAPSHOT 5.2 from GitHub1 to the present research project environment. The MARY TTS software requires installation of additional speech software packages to the current working

1 Available at: https://github.com/marytts/marytts

environment. The following software packages are installed onto the Ubuntu

Linux platform before the MARY TTS builder is compiled:

 Open Java development kit 8: is used to run MARY TTS.
 Apache-maven-3.3.9: is used to compile MARY TTS.
 HTK-3.4.1.tar.gz and HDecode-3.4.1.tar.gz (HTK website, 2009): a toolkit for research in automatic speech recognition (ASR). Speech generation depends on ASR, hence ASR tools are required.
 HTS-2.2_for_HTK-3.4.1.patch (Tokuda et al., 2016): used to modify HTK-3.4.1 to form the HMM-based Speech Synthesis System (HTS) that is used to train HMM voices.
 Hts_engine_API-1.05 (Tokuda et al., 2011): used to synthesise speech waveforms from HMMs trained by the HTS.
 Edinburgh_Speech_tools-2.4-release (King et al., 2003): a library written in the C++ programming language that provides a range of tools for common tasks found in speech processing. This library is used to extract Mel-frequency cepstral coefficients (MFCCs).
 Festvox-2.7.0-release (Black & Lenzo, 2014): helps with building synthetic voices for limited domains.
 Festival-2.4-release (The Festival Speech Synthesis System, 2014): a multilingual TTS synthesis system that provides a platform for creating new TTS synthesis systems.
 SPTK-3.6 (SPTK Working Group, 2012): contains speech signal processing tools.
 Praat (Boersma & Weenink, 2013): used for pitch marking and for extracting speech signal features.
 EHMM_labeller (Black & Lenzo, 2014): used for automatic labelling and is included with the latest version of the MARY TTS synthesis system.

A script in Appendix A is used to install necessary packages and to set global path variables. Additional installation instructions are mostly found in the file named INSTALL inside each software package. These software packages are freely downloadable and one should agree to and obey their licences.

The MARY TTS synthesis system was developed as a collaborative project of the Language Technology Lab at the German Research Centre for Artificial Intelligence and the Institute of Phonetics at Saarland University (Pammi et al., 2010). The system supports the creation of new HMM-based and unit selection synthetic voices for well-resourced and under-resourced languages. The adapted workflow of the MARY TTS synthesis system for voice creation is illustrated in Figure 3.8. The system is developed in Java and provides an easy-to-use graphical user interface (GUI). The steps followed under the transcription GUI and the basic NLP components are detailed in Section 3.3.3. The speech synthesis system components are described in Section 3.3.4. The MARY TTS builder normally uses a database from Wikipedia1 to create new language support; some steps are removed from the original workflow of Schröder et al. (2011) because this research study focuses on selected under-resourced languages not available in the Wikipedia database.

3.3.3 Natural Language Processing Modules

The transcription GUI component is a part of the MARY TTS synthesis system used to create new under-resourced language modules before speech synthesis voices are created. The finite state transducer (FST) training procedure requires a phone set file (allophone.xy.xml) and a pronunciation dictionary (xy.txt) in the target language. From this point, the letters xy are used as a locale to refer to all the under-resourced language locales.

1 Available at: https://en.wikipedia.org/wiki/Wikipedia:Database_download


Figure 3.8: Workflow for multilingual voice creation in MARY TTS builder. Adapted from Schröder et al. (2011).

The Lwazi1 phone set is adapted and used to create the new phone set files that MARY TTS uses to recognise all the phones of a target language. Phone set files are created for the four languages, Sepedi, Tshivenda, isiNdebele and Xitsonga, as shown in Appendix B. The phone set file follows the SAMPA

1 Available at: http://hlt.mirror.ac.za/Phoneset/Lwazi.Phoneset.1.2.pdf

representation format. Table 3.2 shows all consonants and vowels used in this research project. The transcription GUI tool uses the WEKA toolkit to create the NLP components in the form of an FST trained by the J48 tree classifier. We encountered two difficulties when creating the HMM voices (for Xitsonga and isiNdebele) and language modules (for Xitsonga):

Challenge 1: The MARY transcription aligner uses the pipe character to align phones. Since our phone sets contain the pipe character, the transcription tool produced an error while compiling a new voice.

Solution 1: We used an alternative notation for phones that contain a pipe character (see Appendix B for the isiNdebele and Xitsonga phone sets).

Table 3.2: Phone Set

isiNdebele: consonants b, bh, c, ch, d, dl, dz, f, g, gc, gh, gq, h, hl, j, k, kgh, kh, l, m, n, ng, nk, ny, p, ph, q, r, rh, s, t, th, tj, tjh, tl, tlh, ts, tsh, v, w, y, z; vowels a, e, i, o, u.

Sepedi: consonants b, bj, d, f, fs, fš, g, h, hl, j, k, kg, kh, l, m, my, n, ng, ny, p, ph, ps, psh, pšh, r, s, t, th, tl, tlh, ts, tsh, tš, tšh, w, y, š; vowels a, e, i, o, u.

Tshivenda: consonants b, d, dy, dz, dzh, f, fh, g, h, h, k, kh, l, m, n, n', ny, p, ph, r, s, sh, sw, t, ts, tsh, v, w, x, y, yh, z, zh, zw; vowels a, e, i, o, u.

Xitsonga: consonants b, by, c, ch, d, dh, dl, dlh, dy, dz, dzh, f, g, gh, h, hl, j, k, kh, l, m, mh, n, n', ng, nh, njh, ny, p, ph, phy, py, q, r, rh, s, sw, t, th, thy, tl, tlh, ty, v, vh, w; vowels a, e, i, o, u.

Challenge 2: Furthermore, Xitsonga has many related phones (e.g. /d/, /dh/, /dl/, /dlh/, /dy/, /dz/, /dzh/, which the transcription tool tries to classify or map to the single letter "d") compared to the other languages. This caused the transcription tool to produce an error (saying it cannot handle a multi-valued nominal class). This error is mostly encountered where there are many similar phones that are not related and none of them explicitly defines certain alphabets (from the English alphabet). The transcription tool is unable to define SAMPA phones for the letters "c" and "j" automatically; hence, this causes WEKA to produce an error while creating the NLP components shown in Figure 3.8.

Solution 2: We solved the issue by mapping the SAMPA phones to an alternative notation (see Appendix B for the Xitsonga phone set).

Table 3.3 shows the features and their possible values. These features capture important aspects of a language, including intonation, prosody, stress, and others; in addition, they can be used for generating expressive or emotional speech. A feature can carry only one value at a time. The feature values are described as follows:

 ph is assigned to a SAMPA phonetic symbol of a vowel or consonant.
 vlng is assigned to 0 = n/a, s = short, l = long, d = diphthong, and a = schwa.
 vheight is assigned to 0 = n/a, 1 = high, 2 = mid-high, 3 = mid-low, and 4 = low.
 vfront is assigned to 0 = n/a, 1 = front, 2 = mid, and 3 = back.
 vrnd is assigned to 0 = n/a, + = on, and – = off.
 ctype is assigned to s = stop, f = fricative, a = affricate, n = nasal, l = liquid, and r = .
 cplace is assigned to l = bilabial, a = alveolar, p = palatal, b = labio-dental, d = dental, and v = velar.
 cvox is assigned to 0 = n/a, + = on, and – = off.
 casp is assigned to 0 = n/a, + = on, and – = off.
 cpal is assigned to 0 = n/a, + = on, and – = off.

Table 3.3: Phone Set Features

Feature description Feature name Possible values

Phone ph Vowel or consonant

Vowel length vlng 0, s, l, d, and a

Vowel height vheight 0, 1, 2, 3, and 4

Vowel frontness vfront 0, 1, 2, and 3

Vowel lip rounding vrnd 0, +, and -

Consonant type ctype s, f, a, n, l, and r

Consonant place cplace l, a, p, b, d, and v

Consonant voicing cvox 0, +, and -

Aspirated consonant casp 0, +, and -

Palatal consonant cpal 0, +, and -

In addition, the transcription GUI tool assigns a feature the value zero when that feature is not set or declared for a particular phone. By default, all phones must be included in the phone set for a particular target language. The phone set (allophone.xy.xml) file contains a header line declaring the name, the language and, most importantly, the features used in the file.
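For illustration only (this is not the actual allophone.xy.xml syntax), the feature scheme of Table 3.3 can be pictured as one record per phone, with unused features defaulting to 0; the specific values below are hypothetical examples for a low vowel and a voiced bilabial stop:

# Illustrative Python view of the phone features in Table 3.3 (values are examples only).
phones = {
    "a": {"ph": "a", "vlng": "s", "vheight": "4", "vfront": "2", "vrnd": "-",
          "ctype": "0", "cplace": "0", "cvox": "0", "casp": "0", "cpal": "0"},
    "b": {"ph": "b", "vlng": "0", "vheight": "0", "vfront": "0", "vrnd": "0",
          "ctype": "s", "cplace": "l", "cvox": "+", "casp": "0", "cpal": "0"},
}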

3.3.3.1 Pronunciation Dictionary

The phoneme-based pronunciation dictionaries are obtained from the Lwazi project. They are available at the language Resource Management Agency (RMA) (Language Resource Management Agency, 2016) for Sepedi, Tshivenda, isiNdebele and Xitsonga. The dictionaries used are the original versions of the

Lwazi pronunciation dictionaries by Davel and Martirosian (2009). All the words in the pronunciation dictionaries are manually checked and adapted to be compatible with the transcription GUI tool. The details of the pronunciation dictionaries are outlined in Figure 3.9.

[Bar chart; x-axis: Languages (Sepedi, Tshivenda, isiNdebele, Xitsonga); y-axis: word count, roughly 4 700 to 5 700 words per dictionary.]

Figure 3.9: Pronunciation dictionary setup

The Python script generating the dictionary is given in Appendix C. This script outputs a phoneme-based pronunciation dictionary that contains words together with their SAMPA phonetic descriptions. The script also marks all the words as functional. The transformation format is as follows:

Original transcription → Transformed transcription
tshepišo ts_h E p_> I S O 1 0 → tshepišo ts_hEp_>iSO functional
magetla m a G E tl_> a 1 0 → magetla maGEtl_>a functional

The procedure used to adapt or transform the pronunciation dictionary to MARY TTS is as follows (a minimal Python sketch of this transformation is given after the list):

i. Open the input pronunciation dictionary file,
ii. Read the next line from the input file,
iii. Remove all single blank spaces,
iv. Split the line according to tabs,
v. Create a new line of string by joining the item at index 1 and index 2 and appending the string "functional"; all items are separated by a blank space,
vi. Write the line to an output file,
vii. Repeat steps (ii) to (vi) until all lines are read from the input file,
viii. Close the output file.
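The sketch below is a minimal illustration of the procedure above, not the Appendix C script itself. It assumes the word is the first tab-separated field and the space-separated SAMPA phones are the second field; the file names are placeholders.

# Minimal sketch of the dictionary transformation (assumed field layout).
def transform_dictionary(in_path="lwazi.dict", out_path="mary.dict"):
    with open(in_path, encoding="utf-8") as fin, open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:                              # step (ii)
            fields = line.rstrip("\n").split("\t")    # step (iv)
            if len(fields) < 2:
                continue                              # skip malformed lines
            word = fields[0]
            phones = fields[1].replace(" ", "")       # step (iii): collapse the phone string
            fout.write(f"{word} {phones} functional\n")  # steps (v) and (vi)

if __name__ == "__main__":
    transform_dictionary()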

3.3.3.2 Lexicon, Letter-to-Sound Rule and Part-of-Speech Tagger

The transcription tool is a graphical user interface that supports a semi-automatic procedure for transcribing a new language database and the automatic training of a G2P rule file and an LTS rule file. The transcription tool is launched from the terminal with the following command:

$ ./voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin/transcription.sh

On the transcription GUI tool, function words can be selected or checked. Furthermore, the LTS predictor is trained and is used to predict the phonetic description of words that are not transcribed. All functional words are saved in order to build a primitive part-of-speech (POS) tagger that works on simple context-free string matching. The transcription GUI tool generated the following important files:

 xy.lts – LTS rules for transcribing unknown words,
 xy_lexicon.dict – phoneme-based pronunciation dictionary file,
 xy_lexicon.fst – grapheme-to-phoneme file,
 xy_pos.fst – POS tagger that classifies and tags parts of a sentence according to their classes or categories (classes can be prepositions, articles, verbs, adjectives, and others). Functional words and content words are the only two categories used in the POS tagger creation,

where xy represents a locale of a target language as previously explained in Table 3.1.

3.3.3.3 Implementation of New Language Modules

A minimal NLP module is built using the files generated by the transcription GUI tool. The simplest way to build the NLP module for a new under-resourced language is to modify a pre-existing language project that has simple NLP components. One of the projects in the MARY TTS synthesis system that contains simple NLP components is adapted for the four selected target languages. The four under-resourced language projects are renamed to marytts-lang-xy and saved to the directory at path /voice/sources/marytts/marytts-languages/.

The under-resourced language projects are created by adapting one of the pre-existing language projects. The procedure to implement the minimal NLP module is adopted from the MARY TTS GitHub page1.

The language projects are added to the master pom (marytts/marytts-languages/pom.xml) as new subprojects under the modules tag (see Listing 3.2). These language projects are also added as dependencies in the assembly-runtime plugin (marytts/marytts-assembly/assembly-runtime/pom.xml) and assembly-builder plugin (marytts/marytts-assembly/assembly-builder/pom.xml) files using the code given in Listing 3.3.

...
<module>marytts-lang-nso</module>
<module>marytts-lang-ven</module>
<module>marytts-lang-nbl</module>
<module>marytts-lang-tso</module>
...

Listing 3.2: Language modules included in the marytts-languages project

1 Available at: https://github.com/marytts/marytts/wiki/New-Language-Support

...
<dependency>
  <groupId>${project.groupId}</groupId>
  <artifactId>marytts-lang-nso</artifactId>
  <version>${project.version}</version>
</dependency>
<dependency>
  <groupId>${project.groupId}</groupId>
  <artifactId>marytts-lang-ven</artifactId>
  <version>${project.version}</version>
</dependency>
<dependency>
  <groupId>${project.groupId}</groupId>
  <artifactId>marytts-lang-nbl</artifactId>
  <version>${project.version}</version>
</dependency>
<dependency>
  <groupId>${project.groupId}</groupId>
  <artifactId>marytts-lang-tso</artifactId>
  <version>${project.version}</version>
</dependency>
...

Listing 3.3: Language modules included in the assembly-builder module

The final step is to build the full project. A new Maven directory called target is created inside each language project; it contains the following executable Java archive (.jar) files of the newly built language projects:

 marytts/marytts-languages/marytts-lang-nso/target/marytts-lang-nso-5.2-SNAPSHOT.jar
 marytts/marytts-languages/marytts-lang-nbl/target/marytts-lang-nbl-5.2-SNAPSHOT.jar
 marytts/marytts-languages/marytts-lang-ven/target/marytts-lang-ven-5.2-SNAPSHOT.jar
 marytts/marytts-languages/marytts-lang-tso/target/marytts-lang-tso-5.2-SNAPSHOT.jar

These Java files are saved to the MARY TTS builder and runtime at the following directories respectively:
marytts/target/marytts-builder-5.2-SNAPSHOT/lib
marytts/target/marytts-5.2-SNAPSHOT/lib

The following commands are used to start the MARY TTS server on the Ubuntu terminal or the Windows command prompt, respectively:

$ ./voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin/marytts-server

> voice\sources\marytts\target\marytts-5.2-SNAPSHOT\bin\marytts-server.bat

These new languages can be tested by accessing the link http://localhost:59125/locales in an internet browser. The link retrieves the installed locales or languages. The output contains all the locales installed in the MARY TTS synthesis system, as shown in Figure 3.10. Therefore, we have successfully added support for the four new under-resourced languages (nbl, nso, ven and tso).

Figure 3.10: The output of current language locales installed in MARY TTS synthesis system

3.3.4 TTS Synthesis Modules

This section details the steps followed to create new synthetic voices. There are different kinds of speech synthesis methods that can be used when creating synthetic voices. The HTS method is adopted to create the synthetic voices. This method is selected since it requires a small training dataset, and is fast and efficient.

3.3.4.1 Preparation

The acquired speech corpus consists of waveform audio with a sampling frequency of 16 kHz. The audio format is single channel (mono) with 16 bits per sample.

The acquired speech audio files are used to train the HMM models for speech synthesis. The corpus consists of speech utterances together with their corresponding text annotations, which help to transcribe the speech samples and to properly train the HMM models. The sentences are carefully numbered and saved to a FestVox file called txt.done.data, which is used when creating the MARY prompt or transcription files in which each sentence is saved to its own text file. The root directory for voice building is set to the directory /voice/xy for each language. The speech corpus is saved to the directory /voice/xy/wav/ and the transcriptions are saved to the file /voice/xy/txt.done.data.
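A minimal sketch of how the numbered transcriptions could be written in the FestVox txt.done.data format is shown below; the utterance identifiers and sentences are placeholders, not the actual corpus entries.

# Write numbered sentences to a FestVox-style txt.done.data file (illustrative only).
sentences = ["Sentence one.", "Sentence two."]

with open("txt.done.data", "w", encoding="utf-8") as f:
    for i, text in enumerate(sentences, start=1):
        # FestVox line format: ( utterance_id "sentence text" )
        f.write(f'( nso_{i:04d} "{text}" )\n')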

The working directories need to be created before launching the voice import tool; one working directory is created for each of the four languages. The settings in Table 3.4 are configured according to the type of target voice by launching the voice import tool with the following command at root level on the terminal:

$ ./voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin/voiceimport.sh

In the new pop-up window, the working directory is selected and used during the entire process of training a new voice. The db.estDir property is the path to the speech tools, while db.gender is the gender of the voice. The db.locale property is the locale of the language, db.marybase is the path to MARY TTS, and db.rootDir is the working root directory. The db.samplingrate property is the sampling rate of the recordings, db.voicename is the name of the new voice, and db.wavDir is the path to the audio wave files. Other settings are filled in automatically depending on the voice-building path. After saving the general configuration settings, the main window appears, showing the components in Figure 3.11.

Table 3.4: General Configuration Settings

Property Value

db.estDir /voice/sources/speech_tools/

db.gender female (or male)

db.locale xy

db.marybase /voice/sources/marytts/

db.rootDir /voice/xy

db.samplingrate 16000

db.voicename new_language_voice_name

db.wavDir /voice/xy/wav/


Figure 3.11: HMM-based voice training in MARY TTS. Adapted from Würgler (2011)

3.3.4.2 HMM-based Voice Training

The voice training process takes several hours to complete for each TTS voice, depending on the size of the training corpus and the performance of the workstation or computer used for training. This section details the procedure taken to develop the HMM-based voices, assuming that the installation instructions and declaration of path variables given in Appendix A have been completed.

Once the language support is implemented as discussed in Section 3.3.3, a new voice is ready to be created. We used the voice import tool for creating new voices (Pammi et al., 2010). The tool is periodically updated on the MARY TTS page on GitHub1. The training of a new voice uses the main processing components given in Figure 3.11. A component is executed by selecting the appropriate checkbox and then clicking the "run" button. Parameters and configurations of each module are accessed by clicking the settings editor button next to the checkbox. The main modules shown in Figure 3.11 are explained below, according to their execution order (Pammi et al., 2010):

The PraatPitchmarker verifies the frequency range for male and female voices. Settings for male voice have maximum pitch of 300 and minimum pitch of 75 while female voices have maximum pitch of 500 and minimum pitch of 100 (Pammi et al., 2010). The raw acoustics are saved to the directory at /voice/xy/pm/*.

The MCEPMaker uses Edinburgh speech tools to extract MFCCs from all audio wave files. The raw acoustics are saved to the directory at /voice/sources/mcep/*.mcep.

The Festvox2MaryTranscriptions use the FestVox transcription file to create MARY transcription files (separate text files). The results are saved to the directory at /voice/xy/text/*.txt.

1 Available at: https://github.com/marytts/marytts/wiki/VoiceImportToolsTutorial

The Extractor processes the transcription file (txt.done.data) of the speech audio files to generate MARY extensible markup language (XML) allophone files (initial allophones). This component requires a running MARY TTS server. The results are saved to the directory at /voice/xy/prompt_allophones/*.xml.

The EHMM Labeller is an external component called by MARY TTS to label the audio wave files using the corresponding transcriptions. This step may take several hours to complete, depending on the amount of data used. The results are saved to the directory at /voice/xy/ehmm/*. The following variable is set in the component settings editor:

EHMMLabeler.ehmmDir = /voice/sources/marytts/lib/external/ehmm

The Label Pause Deleter removes pauses from label files. The results are saved to the directory at /voice/xy/lab/*.lab. The following variable is set in the component settings editor:

LabelPauseDeleter.pauseDurationThreshold = 10

The Phone Unit Label Computer converts the label format from EHMM to MARY. This component inputs the lab directory and outputs the phone lab directory at /voice/xy/phonelab/*.lab.

The Transcription Aligner verifies that labels and allophone files are correctly aligned. The results are saved to the allophone directory (final allophones) at /voice/xy/allophones/*.xml.

The Feature Selection confirms a list of all the features to be considered in the next steps and saves the features to a file at /voice/xy/mary/features.txt. This component extracts contextual and linguistic features such as vowel height, vowel frontness, consonant place, consonant type, consonant roundness, accent, stress, and others.

The Phone Unit Feature Computer extracts context feature vectors from the text data. The component creates a phone features directory at

/voice/xy/phonefeatures/*.pfeats. This component requires the MARY TTS server to be running.

The Phone Label Feature Aligner verifies that there is proper alignment between phone features and phone labels. The results are displayed on the console, saying, “0 problem”.

The next steps illustrate the HMM voice creation shown in Figure 3.12.

Figure 3.12: HMM-based speech synthesis steps

The HMM Voice Data Preparation sets up the environment to create a HMM voice and uses the following configuration file to check availability of the required external programs, text and audio wave files and their paths: /voice/sources/marytts/lib/external/externalBinaries.config. It converts all audio wave files to raw files by using the Sound eXchange (SoX) tool.

The HMM Voice Configure component configures the voice properties. Both male and female voices are available in the Lwazi corpus at 16 kHz and 16 bits per sample. Table 3.5 shows the different setting values used for configuring voices, including the fast Fourier transform length, frame length, frame shift, frequency warping, lower fundamental frequency (F0) and upper F0; the other settings remain at their defaults. These values depend on the sampling rate and gender of the voice.

The HMM Voice Feature Selection reads features created by the Feature Selection component and generates a new list of features used to train the HMM models. The results are saved to a file at /voice/xy/mary/hmmFeatures.txt.

The HMM Voice Make Data component uses the HTK tool patched with the code provided by HTS to train HMMs. This step uses speech signal processing toolkit (SPTK) and Snack to extract speech signal parameters including voicing strengths for mixed excitations (STR), log F0 (LF0), Mel-generalised Cepstral (MGC) coefficients and Fourier magnitudes (MAG) to form an acoustic parameter vector (MGC+LF0+STR+MAG). This component executes in the /voice/xy/hts/ directory and generates question sets saved to the directory at /voice/xy/hts/data/questions/*.hed.

Table 3.5: Some HMM Voice Configuration

Property Value

HMMVoiceConfigure.fftLen 512

HMMVoiceConfigure.frameLen 400

HMMVoiceConfigure.frameShift 80

HMMVoiceConfigure.freqWarp 0.42

HMMVoiceConfigure.lowerF0 Male=40, Female=80

HMMVoiceConfigure.mgcBandWidth 24 (for cepstral form)

HMMVoiceConfigure.mgcOrder 24 (for cepstral form)

HMMVoiceConfigure.sampfreq 16000

HMMVoiceConfigure.speaker Speaker_name

HMMVoiceConfigure.upperF0 Male=280, Female=350

The HMM Voice Make Voice component uses a version of the speaker-dependent training scripts provided by HTS that was adapted to the MARY TTS platform by adding sections for the acoustic parameter vector. As shown in Figure 3.13, this step takes several hours to finish executing the following script:

perl scripts/Training.pl scripts/Config.pm > logfile &

The HMM Voice Compiler compiles the voice project using Maven. The path to Maven is set to /voice/soft/maven/bin. The new HMM-based voice components are then ready to be installed. The next section explains the procedure taken to install the new HMM-based voices onto the MARY TTS synthesis system.


Figure 3.13: HMM-based voice creation in process

3.3.4.3 Implementation of New TTS Voice Components

The previous section explained the training procedures of new voices. This section highlights the procedure of integrating new voices to the MARY TTS synthesis system.

Table 3.6 shows the size of each developed voice. The HMM-based voices are small compared to unit-selection voices (Zen et al., 2009). The training process also took less time per voice than training unit-selection TTS voices, which may take days to train. HMM-based TTS synthesis systems are robust, fast and efficient, and have low computational costs. This enables the use of such systems on devices with small storage, such as mobile handsets and navigation devices.

Table 3.6: Developed TTS Voice Sizes

Voice Size

Sepedi 1.9 MB

Tshivenda 1.4 MB

isiNdebele 1.8 MB

Xitsonga 1.5 MB

After training the new under-resourced HMM-based voices, the results are four installable zip files:

/voice/nso/mary/voice-sepedi-lwazi2-hsmm/target/voice-lwazi2-sepedi-hsmm-5.2-SNAPSHOT.zip
/voice/nbl/mary/voice-lwazi2-ndebele-hsmm/target/voice-lwazi2-ndebele-hsmm-5.2-SNAPSHOT.zip
/voice/ven/mary/voice-lwazi2-tshivenda-hsmm/target/voice-lwazi2-tshivenda-hsmm-5.2-SNAPSHOT.zip
/voice/tso/mary/voice-lwazi2-xitsonga-hsmm/target/voice-lwazi2-xitsonga-hsmm-5.2-SNAPSHOT.zip

These four installable zip files contain the HMM-based voices that are compatible with the MARY TTS synthesis system. The files are saved to the MARY TTS runtime download directory at:

/voice/sources/marytts/target/marytts-5.2-SNAPSHOT/download/

The MARY TTS component installer in Figure 3.14 is a GUI that is used to install new languages and voices into the MARY TTS synthesis system. The following command is used to launch the GUI component installer:

/voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin/marytts-component-installer

The new voices are installed under their respective languages (locales). As shown in Figure 3.14, the voice (e.g. lwazi2sepedi) is found under its language (e.g. nso).

Figure 3.14: MARY TTS Component installer

The MARY TTS GUI client can be accessed through the web address http://localhost:59125, through the GUI component on the Ubuntu platform at /voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin/marytts-client, or on the Windows platform at /voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin/marytts-client.bat (see Figure 3.15).


Figure 3.15: MARY TTS GUI client

3.4 Integration of the LID and TTS Synthesis

This section details the implementation of the LID and TTS synthesis modules as one system (that is, patching the MARY TTS synthesis system to support LID). The LID is integrated into the MARY TTS synthesis system with modifications and additions of certain Java files. Maven is used to install the MARY TTS synthesis system from source. The following sub-projects (or modules) are included in the MARY TTS synthesis system during installation:

 marytts-languages
 marytts-builder
 marytts-runtime
 marytts-assembly
 marytts-client
 marytts-common
 marytts-redstart
 marytts-signalproc
 marytts-transcription
 voice-cmu-slt-hsmm

The MARY TTS synthesis system mainly uses the runtime module to generate new synthetic speech. As a result, this module is upgraded in order to support the machine-learning functions that implement the classification functionality. The WEKA software is added as a dependency of the runtime module in the Maven project file at path /voice/sources/marytts/marytts-runtime/pom.xml (see Listing 3.4).

...
<dependency>
  <groupId>nz.ac.waikato.cms.weka</groupId>
  <artifactId>weka-dev</artifactId>
  ...
</dependency>
...

Listing 3.4: The WEKA Maven dependency included in the marytts-runtime module

The MARY TTS synthesis system server is upgraded to support the classification of incoming text. The "classify" extension or pattern (in Listing 3.5) is added to the HttpRequestHandlerRegistry. The Java file used for the handlers is found at path /voice/sources/marytts/marytts-runtime/src/main/java/marytts/server/http/MaryHttpServer.java.

The HttpRequestHandlerRegistry maintains a map of hypertext transfer protocol (HTTP) request handlers keyed by a request uniform resource identifier (URI) pattern, and it looks up the handler matching a given request. The classes InfoRequestHandler and SynthesisRequestHandler are called when the patterns /classify and /process are matched, respectively.

...
// Set up request handlers
HttpRequestHandlerRegistry registry = new HttpRequestHandlerRegistry();
registry.register("/process", new SynthesisRequestHandler());
InfoRequestHandler infoRH = new InfoRequestHandler();
registry.register("/classify", infoRH);
...

Listing 3.5: Upgraded MaryHttpServer java file by including handler for pattern /classify

MARY TTS system supports both sockets and HTTP protocols. It listens on HTTP port 59125 with the following patterns:

 /classify – used for classification. It receives text and model from a hypertext markup language (HTML) form. Text represents a surname and model represents the name of the model or classifier (there may be multiple classifiers). Finally, it returns the locale.
 /process – used for voice generation. It receives text and locale from an HTML form. Text represents the input surname and locale represents the classified or chosen language. Finally, it returns the raw audio data.

The Java file InfoRequestHandler is upgraded to recognise the pattern /classify (see the extract in Listing 3.6). The LID usage uniform resource locator (URL) pattern is as follows:

http://localhost:59125/classify?text=surname&model=namesModel

As shown in the extract in Listing 3.6, the classifyLang string variable stores the value of text (the surname) and the model string variable stores the LID model name. If both variables are set, they are passed as arguments to the constructor of the NamesPredictor class. The contents of the NamesPredictor.java file are given in Appendix D. This Java file belongs to the package weka.classifiers. The NamesPredictor class receives two parameters, where the first parameter sets the text and the second parameter sets the target classifier model. The method makeInstance() is called from the constructor to prepare an instance to be classified. This instance follows the WEKA ARFF file format and must match the data that was used to train the target classifier model. The loadModel() method is called by the constructor to use the WEKA SerializationHelper to read or load the model from a local file. The class InfoRequestHandler calls the classify() method of the NamesPredictor class to perform the classification and returns the predicted language locale.
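For illustration, a client can exercise the two endpoints over plain HTTP. The sketch below uses only the Python standard library and assumes a locally running, patched server on port 59125; the surname value is an example, and the /process parameter names (INPUT_TEXT, INPUT_TYPE, OUTPUT_TYPE, AUDIO, LOCALE) follow the standard MARY TTS HTTP interface rather than anything specific to this study.

# Minimal sketch of a client calling the LID and synthesis endpoints.
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost:59125"

# 1) Ask the LID module to classify a surname into a locale.
query = urlencode({"text": "Sefara", "model": "namesModel"})
locale = urlopen(f"{BASE}/classify?{query}").read().decode().strip()

# 2) Ask the synthesiser to generate audio for the surname in that locale.
query = urlencode({
    "INPUT_TEXT": "Sefara",
    "INPUT_TYPE": "TEXT",
    "OUTPUT_TYPE": "AUDIO",
    "AUDIO": "WAVE_FILE",
    "LOCALE": locale,
})
with open("surname.wav", "wb") as f:
    f.write(urlopen(f"{BASE}/process?{query}").read())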

The MARY TTS system is reinstalled using the following command from the terminal:

voice/sources/marytts$ mvn install

The WEKA toolkit (weka.jar) is saved to the lib directory with other Java files of MARY TTS synthesis system at

/voice/sources/marytts/target/marytts-5.2-SNAPSHOT/lib/*

The LID model (classifier) is saved to the same bin directory containing the marytts-server and marytts-client scripts at

/voice/sources/marytts/target/marytts-5.2-SNAPSHOT/bin/*

...
if (request.equals("classify")) {
    // Text classification starts here
    ...
    if (queryItems != null) {
        String classifyLang = queryItems.get("text");
        String model = queryItems.get("model");
        if (classifyLang != null) {
            if (model.equals("namesModel")) {
                NamesPredictor pred = new NamesPredictor(classifyLang, model);
                return pred.classify();
            }
        }
    }
    MaryHttpServerUtils.errorMissingQueryParameter(response, "'effect'");
    return null;
}
...

Listing 3.6: Upgraded InfoRequestHandler Java file by including conditional statement for pattern /classify

3.5 Live Demonstration of the System

TTS synthesis systems for well-resourced languages are becoming more readily available commercially on the internet. Some of the available multi-language commercial TTS synthesis systems are the CereProc1, NeoSpeech2, Cepstral3, IBM Watson4 and Acapela5 TTS demos. These TTS synthesis systems do not include the indigenous South African official languages. Hence, developing and deploying a prototype TTS synthesis system that covers some of these under-resourced indigenous languages expands the technological footprint and awareness of the languages, and also helps encourage initiatives towards the retention and endorsement of their cultural identity.

This section details the implementation of the complete system. It explains the commands used to set up the cloud-based server. It details the procedures used to deploy the developed system. It also details the interaction of the developed system with a client-based application. The development of the Android application is explained.

3.5.1 Server

The MARY TTS synthesis system is deployed to a virtual private server (VPS) running the Ubuntu Server 16.04 long-term support operating system with Java installed. The VPS has the following specifications, which are sufficient to host the proposed system:

 1 core processor
 1 gigabyte of random access memory
 50 gigabytes of solid-state drive space
 64-bit operating system architecture

1 Available at: https://www.cereproc.com/
2 Available at: http://www.neospeech.com/
3 Available at: http://www.cepstral.com/en/demos/
4 Available at: http://text-to-speech-demo.mybluemix.net/
5 Available at: http://www.acapela-group.com/voices/demo/

The server is installed with an nginx reverse proxy on port 80 to serve the MARY TTS synthesis system. A special service account named mary is created to run the service and to manage the installation files. The newly created account is given ownership of the installation files, which are saved to the path /local/mary/.

A configuration file is added to /etc/nginx/sites-available/default with contents in Listing 3.7. Then the nginx service is restarted.

...
server {
    location / {
        proxy_pass http://127.0.0.1:59125;
    }
...

Listing 3.7: Configuration file added to nginx

The MARY TTS synthesis system server is started by running the marytts-server script from the bin directory using the new mary account, and the script is allowed to run in the background. A cron job is set up to restart the MARY TTS server in case of a system crash, memory leak or system reboot. A new bash file is created with the contents given in Listing 3.8, and this bash file is made executable.

#!/bin/bash

# Version to adapt to your system
VERSION=5.2

# Check if our service is currently running
ps auxw | grep marytts-server | grep -v grep

# If the service is not running, grep returns a non-zero exit status;
# in that case we start the service, otherwise we do nothing.
if [ $? != 0 ]
then
    bash /local/mary/marytts/bin/marytts-server -Xmx150m
fi

Listing 3.8: Mary.sh file to restart the server

This file is saved to the location /usr/local/bin/mary.sh. The cron job is set to a 5-minute interval by running the following command and adding the entry shown below it:

sudo crontab -e

*/5 * * * * /usr/local/bin/mary.sh

Every 5 minutes, the cron job launches the mary.sh script, which checks whether the MARY TTS server is still running and restarts it if it is not. The firewall is set up to allow incoming traffic on port 80 (HTTP), and the traffic received on port 80 is proxied to the MARY TTS synthesis system server. The system is thus successfully deployed on the VPS as the server, and application clients can access the system via an internet connection (see Figure 3.16).

Figure 3.16: Design of the application client and server connection via wireless connection

3.5.2 Client – Internet Browser

The client is an internet browser accessing the website on a web server. The website is deployed in front of the VPS, hiding it from end users. It contains HTML and hypertext pre-processor (PHP) scripts that point to the VPS for both voice generation and LID prediction. The system has an API, documented on the API tab of the website, which explains how to fetch audio using an HTML form. The live demonstration of the system is available on the project website and can be accessed over an internet connection at https://www.speechtech.co.za.

3.5.3 Client – Android Application

The application is developed for Android and tested on API version 23. As shown in Figure 3.16, the application serves as a client that connects to the system on the server-side. The surname and locale inputs are sent to the server and the server acknowledges the connection and returns the audio data back to the client.

The application has the following four classes which are given in Appendix E:

MainActivity – this class sets the main layout to front and the layout contains text areas and buttons.

LanguageIdentification – this class handles the LID of surnames by creating an HTTP connection to the server. It receives a surname and returns a locale.

Methods – this class contains methods needed by the main activity, including network connection, verification of text, and text encoding.

PlayAudioManager – this class plays audio using media player instance. It receives URL containing audio location.

The application GUI in Figure 3.17 contains (a) a menu option with a list of languages, (b) a Text Area and (c) three buttons, namely Clear, Download and Speak. These buttons are clickable and call methods registered with setOnClickListener. The Clear button resets the Text Area when clicked, the Download button creates HTTP requests to download the audio, and the Speak button, when clicked, applies the following procedure:

Step 1: Calls urlBuilder(model) method by passing model name as a parameter.

Step 2: If there is no surname parameter, a message is displayed to the user and the method returns false; otherwise, the surname is encoded by calling the method encode_text(string) from the class Methods.

Step 3: Checks whether the language is set by calling the method isempty(string) from the class Methods; if it is not set, a message is displayed asking the user to select a language.

Step 4: If the language is set to "detect", the application checks whether there is an internet connection by calling the method isInternet from the class Methods. If the internet is available, the URL is built and passed to the LID class.

Step 4.1: The LID class creates HTTP connection to the server in the background using an asynchronous task to classify the input surname parameter. If the surname parameter is classified, then a message is displayed to the user stating a classified language and the language is used to build the final URL (LID URL), and returns true.

Step 5: If the user sets the language then a final URL is built (user URL) and returns true.

Step 6: If step 4 or 5 returns true, the URL has been built and the method player(URL) is called with the URL as a parameter. This method calls the class PlayAudioManager to create a MediaPlayer instance that plays the audio from the given URL. The Android application unified modelling language (UML) diagram of all the classes is given in Appendix F; it details the relationships between the classes and all their member functions (methods) and variables.


Figure 3.17: Android application demo

3.6 Summary

In this chapter, we have discussed the most important steps of the implementation phase in detail. The text and speech data used for training and evaluation in all experiments were discussed. The speech technology and data mining resources and toolkits used to set up the LID classifier and TTS experiments were outlined, as was the feature set for creating the LID. We have discussed the supervised machine-learning algorithms and the pronunciation dictionaries used. We have also discussed the steps followed when creating the LID classifier and the TTS voices in Sepedi, Xitsonga, Tshivenda and isiNdebele. The implementation phase of the proposed system was outlined. The developed system is deployed to the production cloud server for stability testing on real data. The Android application and project website were also explained.

As captured in the first chapter, the aim of the research study was to develop a TTS synthesis system that uses a trained classifier to enhance the correct pronunciation of words and phrases, particularly proper names, for four under-resourced languages of the Limpopo province, South Africa, namely Sepedi, Tshivenda, Xitsonga and isiNdebele. To achieve this goal, the following objectives have been accomplished:

a) Textual data in the form of a corpus of surnames was acquired from the University of Limpopo student database.
b) The speech training data was acquired from the Lwazi TTS corpus for the four languages.
c) We used machine-learning classifiers to train an LID model for the classification of surnames into their respective first languages.
d) The LID model was used as a language predictor to identify a first language from a given surname. This model forms the front-end phase of the system. We built the baseline TTS synthesis modules in Sepedi, Xitsonga, Tshivenda and isiNdebele; these modules serve as the back-end of the system.
e) Once the identity of a language has been predicted from a given surname, the appropriate TTS synthesis system is activated to continue with pronunciation guidance using the predicted first language.

The last objective is discussed in the next chapter, including evaluation metrics and the results obtained. The results of the evaluation try to answer research questions outlined in the first chapter.

CHAPTER 4: EVALUATION RESULTS

This chapter illustrates the methods used for evaluating the developed LID and TTS synthesis system. It details the analysis of the results in an attempt to answer the research questions stated:

 Can a computational system use a person’s surname to predict the identity of the first-language of that person?
 Can a computational system produce an appropriate pronunciation of indigenous proper names?

4.1 Introduction

The automatic pronunciation assistant is designed and implemented as a complete HMM-based TTS synthesis system implementing machine-learning technologies. These learning technologies were applied during LID training, and the classifier models were evaluated on the same data using an n-fold cross-validation. The prototype system is currently available on the internet for further testing on real-world data.

In this chapter, the perceptual evaluations are described, illustrating the state of our work. The following evaluation metrics were selected for distinct purposes:

 The test for intelligibility, the ability of a human listener to understand and interpret the words and meaning of the synthesised utterances with ease. Subjects are asked to listen and write down the synthesised utterances.
 The test for quality, an abstract measure of pleasantness of voice, naturalness of voice, and correct pronunciation of the utterance. Subjects are asked to rate the speech quality on a 5-point Likert scale.

The layout is as follows:

 Section 4.2 discusses performance metrics used to measure the performance of the LID component and speech synthesis phases.

 Section 4.3 discusses evaluation results of the LID component.
 Section 4.4 discusses MOS test results of the TTS synthesis phase.
 Section 4.5 discusses evaluation results of the entire system usability.

4.2 Performance Measures of the Proposed System

The user acceptance of the state-of-the-art speech technology systems depends on the appropriate appraisal of the system performance. We evaluated the performance of the proposed system by measuring the front-end LID system, back-end TTS system and overall system usability separately.

4.2.1 Evaluation Metrics for LID

This section details the testing results obtained during the LID evaluation using a 10-fold cross-validation. The experimental setup was defined in the previous chapter (see Section 3.2 of Chapter 3). We evaluated the performance of each classifier model based on certain criteria to assess its prediction accuracy. The performance of a trained model is determined by how well the predictions reflect the actual classes. The evaluation measurements are given in terms of the following classification terms:

 FP = False positive means observations where the actual class is negative and the predicted class is positive.
 TN = True negative means observations where the actual and predicted class is negative.
 TP = True positive means observations where the actual and predicted class is positive.
 FN = False negative means observations where the actual class is positive and the predicted class is negative.

These classification terms are used when evaluating a supervised model, wherein a labelled observation is compared to a predicted observation so that a confusion matrix can be generated for each experiment. Table 4.1 shows the design of a confusion matrix for a binary classification.

Table 4.1: Confusion Matrix

                          Actual class
                          Positive           Negative
Predicted class  Positive True Positive      False Positive
                 Negative False Negative     True Negative

The true positives and true negatives (the diagonal of the matrix) correspond to the surnames that were correctly predicted. The following evaluation metrics were used to evaluate the performance of the LID classifier models:

Accuracy is measured as the proportion of correctly identified surnames in the test set, compared to all the surnames in the same test set. It is defined by the equation:

\[ \text{Accuracy} = \frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i} \times 100\% \tag{4.1} \]

where $i$ represents the number of occurrences.

Precision is defined as the percentage of applicable surnames identified out of all identified surnames. The equation for precision is given by:

\[ \text{Precision} = \frac{TP_i}{TP_i + FP_i} \times 100\% \tag{4.2} \]

where $i$ represents the number of occurrences.

Recall (or sensitivity) is defined as the percentage of applicable surnames correctly predicted out of all applicable surnames in the collection and it is defined by the equation:

\[ \text{Recall} = \frac{TP_i}{TP_i + FN_i} \times 100\% \tag{4.3} \]

where $i$ represents the number of occurrences.

The F1 score is the harmonic mean of precision and recall. It takes both false positives and false negatives into account. Intuitively, the F1 score is more useful where there is an uneven class distribution. The F1 score formula is given by

\[ F_1 = \frac{2 \times (\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}} \tag{4.4} \]

Root mean-squared error (RMSE) is arguably the most essential criterion used to evaluate the performance of a predictor. Its equation is represented by the square root of the average of the squares of the differences between actual and predicted values.

\[ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \tag{4.5} \]

where $n$ is the total number of observations, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value, for all $i \in \mathbb{Z}^+$.
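As a hedged illustration of Equations 4.1 to 4.5 (not the evaluation code used in the study, which relied on WEKA and Microsoft Excel), the metrics can be computed from per-class counts as follows; the counts in the example call are placeholders:

import math

def classification_metrics(tp, tn, fp, fn):
    # Equations 4.1 to 4.4, expressed as percentages.
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    precision = tp / (tp + fp) * 100
    recall = tp / (tp + fn) * 100
    f1 = 2 * (recall * precision) / (recall + precision)
    return accuracy, precision, recall, f1

def rmse(actual, predicted):
    # Equation 4.5: root mean-squared error between actual and predicted values.
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n)

print(classification_metrics(tp=80, tn=75, fp=20, fn=25))  # illustrative counts only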

4.2.2 Subjective Evaluation Metrics for TTS

Subjective listening tests were used to measure the intelligibility and quality of the developed TTS voices. Thirty-two native speakers were recruited via a “word of mouth” campaign to participate in the evaluation tests. No remuneration or gifts were provided to participants in exchange for information. The participants were University of Limpopo under-graduate and post-graduate students. They were divided into four groups of mixed gender according to their native language. Each group contained 8 participants to evaluate their first-language TTS voice. The participants’ age ranged from 18 to 35 and the subjects had no prior experience working with TTS synthesis systems. The participants agreed to sign a consent form and answered the questionnaire given in Appendices G and H respectively. The questionnaire is used to gather all the data required for evaluation of the synthesised speech generated by the developed system against the natural speech. Evaluations were conducted in a reasonably sound-controlled room with no noise disturbances.

The intelligibility and quality of the synthesised speech are measured using the following example sentence; more sentences are given in Appendix I.

The book walked through the attractive floor.
Buka e sepetše godimo ga lefase le le botse.

The sentences given in Appendix J were used to measure the quality of natural speech using the MOS test. The sentences used to measure natural speech were in 16-bit linear pulse-code modulation (PCM) and were extracted from the speech corpus. TTS intelligibility tests can be done using different methods, including DRT, MRT, SUS, and WER. The SUS method has the advantage of providing no semantic contextual cues to the intelligibility of the individual words; hence it is chosen for testing TTS intelligibility at sentence level and word level. The WER is also used to measure intelligibility at word level. The quality of the generated TTS voices is measured using the MOS test against the following factors: understandability, pleasantness, pronunciation, and naturalness.

93 4.2.2.1 Semantically Unpredictable Sentences

The SUS test measures speech intelligibility in synthesised sentences that are syntactically correct but semantically meaningless (Benoît et al., 1996). Five sentences of different syntactic structure were constructed as recommended by Benoît et al. (1996) and synthesised audio is played to the subject in a form of mixed part-of-speech template like DET ADJ NOUN VERB ADJ (see Appendix I.1). A good example of a SUS is “The model modal successful and the guide”. To avoid learning effects, subjects listened to the sentences only once. An additional three sentences were constructed and used in training session so that participants were well aware of acoustic content of the audio. The participants listened to the sentences synthesised by the developed synthesiser during the testing session where they transcribed the synthetic utterances of the audio. The participants wrote the sentences on a given questionnaire and those who have unclear handwriting were asked to type on a computer. The percentage of correct identifications is used as an intelligibility metric. The metric at word level is given by:

\[ SUS_w = \frac{1}{S}\sum_{i=1}^{S}\frac{\hat{w}}{W} \times 100\% \tag{4.6} \]

The metric at sentence level is given by:

\[ SUS_s = \frac{1}{S}\sum_{i=1}^{S}\hat{s} \times 100\% \tag{4.7} \]

where $\hat{w}$, $W$, $\hat{s}$ and $S$ have the following meanings:

 $\hat{w}$ is the number of correctly identified words.
 $W$ is the total number of words in a sentence.
 $\hat{s}$ is the number of correctly identified sentences.
 $S$ is the total number of sentences.
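A minimal sketch of Equations 4.6 and 4.7 is shown below; it assumes each pair holds a reference sentence and a listener's transcription, and it uses a simple per-word membership check to count correctly identified words (the study's own scoring was done manually from the questionnaires):

def sus_scores(pairs):
    # pairs: list of (reference_sentence, transcribed_sentence) strings.
    word_acc, correct_sentences = 0.0, 0
    for reference, transcription in pairs:
        ref_words, hyp_words = reference.split(), transcription.split()
        hits = sum(1 for w in ref_words if w in hyp_words)  # correctly identified words
        word_acc += hits / len(ref_words)
        correct_sentences += int(ref_words == hyp_words)    # fully correct sentence
    s = len(pairs)
    return word_acc / s * 100, correct_sentences / s * 100  # (word level, sentence level)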

4.2.2.2 Word Error Rate

The WER is a common metric used to measure the performance of an ASR or machine translation system at word level (Jurafsky & Martin, 2014). The WER is also used in the subjective evaluation of the intelligibility of TTS synthesis systems. The WER is based on the minimum number of insertions, deletions and substitutions that have to be performed to convert the generated text (or hypothesis) into the reference text. The first step in calculating the word error is to find the minimum edit distance in words between the hypothesised and reference strings (Jurafsky & Martin, 2014). The result of this calculation is the minimum number of word substitutions, word insertions and word deletions necessary to match the correct and hypothesised strings. We applied the WER to the five sentences constructed for the SUS test in the previous section. The intelligibility measure is typically captured by the WER metric, formulated as:

\[ WER = \frac{S + I + D}{N} \times 100\% \tag{4.8} \]

Similarly, the sentence error rate (SER) is formulated as:

\[ SER = \frac{S_s}{N_s} \times 100\% \tag{4.9} \]

where $S$, $I$, $D$ and $N$ have the following meanings:

 $S$ is the number of word substitution errors.
 $I$ is the number of word insertion errors.
 $D$ is the number of word deletion errors.
 $N$ is the total number of words.

 $S_s$ is the number of sentences with at least one word error.

 $N_s$ is the total number of sentences.

The implementation of SER and WER is given as a Python script in Appendix K. The script receives a text file (as a parameter) that contains the reference and hypothesis sentences, separated by a carriage return or line feed. The SER and WER results are printed to the terminal as percentages.
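A minimal sketch of Equations 4.8 and 4.9, using a word-level edit distance, is given below; it is an illustration under the stated definitions and is not the Appendix K script itself:

def word_error_rate(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref) * 100   # Equation 4.8

def sentence_error_rate(pairs):
    # pairs: list of (reference_sentence, hypothesis_sentence) strings.
    errors = sum(1 for ref, hyp in pairs if word_error_rate(ref, hyp) > 0)
    return errors / len(pairs) * 100                # Equation 4.9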

4.2.2.3 Mean Opinion Score

The quality of speech is measured based on the following categories: naturalness, pleasantness, pronunciation, intelligibility, listening effort, flexibility, and similarity. Evaluators rated the produced speech signal on a five-point Likert scale, where 1 means "horrible" and 5 means "best". A five-point Likert scale was chosen to allow evaluators to give neutral answers. The MOS scale is described below.

1. Bad means no meaning understood
2. Poor means effort required
3. Fair means moderate effort required
4. Good means no appreciable effort required
5. Excellent means no effort required

The mean of the responses is calculated to compute the MOS results. The MOS is a performance metric applied to measure the quality of speech from subjective evaluations and the metric is given by:

\[ MOS = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{4.10} \]

where $x$ and $n$ have the following meanings:

 $x$ is the score of an evaluator.
 $n$ is the total number of evaluators.

The value of 푛 is formulated as follows:

Eight subjects evaluated each synthetic voice and the highest possible score is five. The value of $n$ equals the number of evaluators, hence $n = 8$.
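As a worked example of Equation 4.10 with illustrative (not actual) Likert ratings from eight evaluators:

scores = [4, 5, 4, 3, 4, 5, 4, 4]   # placeholder ratings from n = 8 evaluators
mos = sum(scores) / len(scores)      # Equation 4.10
print(round(mos, 2))                 # 4.12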

4.3 Evaluation Results and Analysis of the Developed Front-end LID

The data collected during the testing of the developed front-end LID was analysed using descriptive statistics on a Microsoft Excel spreadsheet. The LID accuracy can be affected by several factors, including the features, the type of algorithm employed, the target language, and the sizes of the evaluation and training data (Botha & Barnard, 2012). The classifier models were tested on different n-gram features to find the best features that could later be used for the implementation of the system. The MNB and SVM classifiers were used to find the best accuracy for those features. The classifier models were built using a multiclass approach where each class represents a language. The classifier models were trained and tested using a 10-fold cross-validation on the same dataset of 3043 surnames of different lengths. The main aim of the LID is to answer the research question: “Can a computational system use a person’s surname to predict the identity of the first language of that person?”
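For illustration, the character n-gram features used to represent a surname can be extracted as shown below; this is a plain-Python sketch rather than the WEKA tokenisation actually used, and the surname is an example:

def char_ngrams(word, n_min, n_max):
    # Extract all character n-grams of sizes n_min..n_max from a word.
    word = word.lower()
    grams = []
    for n in range(n_min, n_max + 1):
        grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams

print(char_ngrams("sefara", 2, 3))
# ['se', 'ef', 'fa', 'ar', 'ra', 'sef', 'efa', 'far', 'ara']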

4.3.1 Kernel Parameter Selection

In order to calculate the optimum parameter selections, a parameter search process must be used to select the optimal kernel parameter values, so that the classifier can accurately discriminate unseen data (Chang & Lin, 2011). The Cross Validation (CV) Parameter Selection is one of the classifiers in the WEKA toolkit that searches the best model parameters on a given search interval for a given cross validation (Kohavi, 1995). The SVMs can perform better with correct

values for the kernel parameters. The MNB and the linear SVM do not require special kernel parameters. The sigmoid SVM has a gamma (γ) and a coefficient (r) parameter that need to be set. Hence, CV Parameter Selection was performed on positive gamma values in [0, 20], and γ = 0 was selected as the best value for gamma. The value of the coefficient parameter was selected as r = -0.95 from the search interval [-100, 100]. We used the following recipe for the sigmoid SVM:

weka.classifiers.meta.CVParameterSelection -P "R -100.0 100.0 1.0" -P "G 0.0 20.0 1.0" -X 6 -S 1 -W weka.classifiers.functions.LibSVM -- -S 1 -K 3 -D 3 -G 0.0 -R -0.95 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -model C:\Users\User\Documents\weka-3-8-0 -seed 1

The RBF SVM has the gamma kernel parameter. CV Parameter Selection for the RBF SVM was performed to find the value of gamma on the interval [0, 20], and γ = 0 was selected as the best value. We used the following recipe for the RBF SVM:

weka.classifiers.meta.CVParameterSelection -P "G 0.0 20.0 1.0" -X 6 -S 1 -W weka.classifiers.functions.LibSVM -- -S 1 -K 2 -D 3 -G 0.0 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -model C:\Users\User\Documents\weka-3-8-0 -seed 1

The polynomial SVM has three kernel parameters, namely gamma, coefficient and degree (d). Gamma was searched on the interval [0, 20] and γ = 0 was selected as the best value. The coefficient was searched on the interval [-100, 100] and r = 0 was selected as the best value. The value of the degree parameter was searched on the interval [0, 20] and d = 3 was selected as the best value. We used the following recipe for the polynomial SVM:

weka.classifiers.meta.CVParameterSelection -P "R -100.0 100.0 1.0" -P "G 0.0 20.0 1.0" -D "G 0.0 20.0 1.0" -X 6 -S 1 -W weka.classifiers.functions.LibSVM -- -S 1 -K 1 -D 3 -G 0.0 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -model C:\Users\User\Documents\weka-3-8-0 -seed 1

The parameter selection was performed using a 10-fold cross-validation. All the support vector classification (SVC) experiments were performed using nu-SVC because the results were better than those of C-SVC. Parameter selection over larger search ranges, and for the nu parameter, was not performed because it required high computational costs in WEKA, and an out-of-memory error was encountered on the low-powered implementation platform used.

4.3.2 Multinomial Naive Bayes

The MNB model was built and evaluated on different character n-gram features. Figure 4.1 shows the classification accuracy obtained using character n-grams of size one up to five. The character unigrams (1-grams) achieved the lowest accuracy of 56.23%, which is very low because the unigram feature set contains only the 26 letters of the English alphabet. The bigrams (2-grams) achieved the second highest accuracy of 68.62%, while the trigrams (3-grams) outperformed the other n-gram features with 69.34%. The accuracy starts to decrease when the n-gram size is increased further; this is observed in most studies in the literature, where higher-order n-grams give lower accuracy.

[Bar chart: MNB accuracy (%) per character n-gram size: unigram 56.23, bigram 68.62, trigram 69.34, 4-gram 63.52, 5-gram 56.36.]

Figure 4.1: Accuracy of MNB on a 10-fold cross validation

Further tests were conducted by combining n-gram features. The results of the n-gram combinations for the MNB classifier are shown in Figure 4.2, with the 4-gram and 5-gram features resulting in the lowest accuracy of 63.42%. This is because the 4-gram and 5-gram features consist of longer character sequences and cannot discriminate the whole dataset. The 1-gram to 3-gram features resulted in an accuracy of 69.64%; when adding 4-gram features the accuracy decreases to 69.54%, and it further decreases to 69.24% when adding 5-grams. The MNB achieved its highest accuracy of 69.80% on the 2-gram to 4-gram features.

[Bar chart: MNB accuracy (%) for combined n-gram ranges (1-2 through 4-5); values range from 63.42 (4-5) to 69.80 (2-4).]

Figure 4.2: The MNB accuracy using combination of features on a 10-fold cross validation

4.3.3 SVM with Linear Kernel

The SVM performs best on a large amount of data. The robust SVM wrapper was used to experiment with the feature sets on a linear kernel. The linear SVM produced its lowest accuracy on unigrams; the accuracy increased on bigrams, reaching the highest accuracy of 68.32%, then decreased by 0.62% on trigrams and decreased further on 4-grams and 5-grams, reaching 55.73% (see Figure 4.3). The decrease in accuracy was encountered when the features were smaller in size and did not cover the whole feature space needed to make predictions on new or unseen data. The unigrams and 5-grams were nearly the same, which means the set of single characters and the 5-grams (which are close to word level) performed very poorly on the dataset. As illustrated in Figure 4.4, when the classification features were combined, the accuracy reached above 70.00% on the 2-3, 2-4, and 2-5 n-gram sets, with the 2-5 n-gram set having the highest accuracy.

[Bar chart: linear SVM accuracy (%) per character n-gram size: unigram 55.77, bigram 68.32, trigram 67.70, 4-gram 62.64, 5-gram 55.73.]

Figure 4.3: Accuracy of linear SVM on a 10-fold cross validation

[Bar chart: linear SVM accuracy (%) for combined n-gram ranges (1-2 through 4-5); values range from 62.24 to 70.36.]

Figure 4.4: The linear SVM accuracy using combination of features on a 10-fold cross validation

4.3.4 SVM with RBF Kernel

RBF kernel is most commonly used to classify non-linearly separable data at a higher dimensional feature space. The RBF SVM produced accuracy of 61.12% on unigrams and reached a higher peak on bigrams with accuracy of 68.81%

(see Figure 4.5). When the n-gram size increases further, the accuracy decreases slightly. In most studies, trigrams perform well compared to other n-grams (Fourie et al., 2014). In this research work, larger character n-gram sizes give results similar to word unigrams, since our data consists of single words; thus, they result in low accuracy.

[Bar chart: RBF SVM accuracy (%) per character n-gram size: unigram 61.12, bigram 68.81, trigram 67.56, 4-gram 61.19, 5-gram 53.96.]

Figure 4.5: Accuracy of RBF SVM on a 10-fold cross validation

The RBF SVM achieved an accuracy of 70.52% on the 2-5 n-gram set (see Figure 4.6). This shows that the kernel trick does increase the accuracy at a certain feature level. The accuracy is above 69.00% on the 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, and 2-5 n-gram sets; this could be further improved by using the correct kernel parameter γ (given in Equation 3.2).

[Bar chart: RBF SVM accuracy (%) for combined n-gram ranges (1-2 through 4-5); values range from 60.89 to 70.52, with the 2-5 set highest.]

Figure 4.6: The RBF SVM accuracy using combination of features on a 10-fold cross validation

4.3.5 SVM with Sigmoid Kernel

The sigmoid SVM was trained according to Experiment 3 and achieved the lowest accuracy of 45.15% on 5-grams (see Figure 4.7). This shows that 5-grams were unable to correctly discriminate the classes or languages. The bigrams outperformed other n-grams with an accuracy of 68.75%. The feature set was increased to evaluate the sigmoid SVM on n-gram sets. As shown in Figure 4.8, the n-gram sets 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, and 2-5 performed well by reaching above 69.00%, but the n-gram set 2-4 was the highest in accuracy.

[Bar chart: sigmoid SVM accuracy (%) per character n-gram size; bigrams highest at 68.75, 5-grams lowest at 45.15 (other values: 67.33, 60.27, 59.65).]

Figure 4.7: Accuracy of sigmoid SVM on a 10-fold cross validation

[Bar chart: sigmoid SVM accuracy (%) for combined n-gram ranges (1-2 through 4-5); values range from 59.68 to 70.42, with the 2-4 set highest.]

Figure 4.8: The sigmoid SVM accuracy using combination of features on a 10-fold cross validation

4.3.6 SVM with Polynomial Kernel

The polynomial SVM was evaluated and achieved an accuracy of 51.36% on unigrams and 68.22% on bigrams. The bigrams outperformed the other n-grams, with a difference of 0.39% for trigrams, 7.65% for 4-grams, and 20.01% for 5-grams, as shown in Figure 4.9. The polynomial SVM was further evaluated on combinations of n-grams. Most sets reached an accuracy above 68%, with the 2-5 n-gram set achieving the highest accuracy so far of 70.72% (see Figure 4.10).

[Bar chart: polynomial SVM accuracy (%) per character n-gram size: unigram 51.36, bigram 68.22, trigram 67.83, 4-gram 60.57, 5-gram 48.21.]

Figure 4.9: Accuracy of polynomial SVM on a 10-fold cross validation

[Bar chart: polynomial SVM accuracy (%) for combined n-gram ranges (1-2 through 4-5); values range from 60.86 to 70.72, with the 2-5 set highest.]

Figure 4.10: The polynomial SVM accuracy using combination of features on a 10-fold cross validation

4.3.7 Final Model

The cross-validation resampling method assists in selecting the best model for unseen data. The polynomial SVM outperformed the other algorithms on unseen data. Thus, we built the final model using the polynomial SVM on the complete dataset. Figure 4.11 shows the results of the polynomial SVM on a 10-fold cross-validation in terms of precision, recall and F-score. Figure 4.12 shows the results of the polynomial SVM in terms of precision, recall and F-score for the final model. The results of the final model were higher than those obtained with cross-validation because the model is built on known data. On average, the precision of the final model increased from 71.30% to 81.20%, recall increased from 70.70% to 80.80%, and the F-score increased from 70.70% to 80.80%.


Figure 4.11: Precision, recall and F-score for the 10-fold cross validation


Figure 4.12: Precision, recall and F-score results for the final model

The final model resulted in an increase in accuracy from 70.72% to 80.77%, and its error (RMSE) decreased by 8.38% (see Figure 4.13). This accuracy was found to be adequate for making predictions on unseen data. As a result, this model was implemented in the prototype system for the automatic LID feature.
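As a quick arithmetic check of these figures, assuming the values plotted in Figure 4.13 (an RMSE of 44.18 for the 10-fold cross-validated model and 35.80 for the final model):

# Worked check of the quoted changes, using the values read off Figure 4.13.
rmse_cv, rmse_final = 44.18, 35.80   # RMSE: 10-fold cross-validation vs final model
acc_cv, acc_final = 70.72, 80.77     # accuracy: 10-fold cross-validation vs final model
print(round(rmse_cv - rmse_final, 2))   # 8.38 (the reported decrease in error)
print(round(acc_final - acc_cv, 2))     # 10.05 (the gain in accuracy)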


Figure 4.13: Accuracy and RMSE using polynomial SVM for final model and 10-fold cross validation

4.4 Evaluation Results and Analysis of the Developed TTS

The data collected during the evaluation of the system was analysed using descriptive statistics on a Microsoft Excel spreadsheet. The main aim of the TTS synthesis system is to answer the research question: Can a computational system produce an appropriate pronunciation of indigenous proper names?

4.4.1 Test for Intelligibility

Five sentences were constructed and eight evaluators assessed the intelligibility of each item of synthetic speech. A total of 5 × 8 = 40 sentences per language were used during the evaluation. The total number of words differed per language: Sepedi, Xitsonga, Tshivenda and isiNdebele contained 312, 240, 248 and 176 words respectively. Figure 4.14 shows the results of the SUS method at sentence level and word level. The SUS accuracy at word level is higher than that at sentence level and is above 90% for all the languages, with isiNdebele outperforming the other languages at an accuracy of 95%. These results are quite adequate for under-resourced languages. Hence, from Figure 4.14 we see that the developed system is quite intelligible.


Figure 4.14: The SUS accuracy at sentence and word level for intelligibility of the developed system.

The script in Appendix K was used to calculate both the sentence error rate (SER) and the word error rate (WER). The SER is computed by counting the sentences with a WER above zero. These error-rate tests were conducted on the SUS synthesised by the developed system, and the results are given in Figure 4.15. We see that error rates decrease significantly from sentence level to word level, since the WER reflects errors found on each tested word. Sepedi contained a total of 312 words, while Xitsonga, Tshivenda and isiNdebele contained 240, 248 and 176 words respectively. Sepedi obtained the highest WER of 14.82%. This may be caused by the speaker's accent, since the speaker's geographical area is largely populated by Setswana-speaking people. IsiNdebele obtained good results on both SER and WER. From these results, we see that all the built synthetic voices are intelligible.
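To make the procedure concrete, the listing below is a minimal sketch of the WER and SER computation described above (not the exact script given in Appendix K); the reference/transcription pairs are generic placeholders.

# Minimal sketch of the WER/SER idea (not the exact script in Appendix K).
# WER is the word-level edit distance between the listener's transcription and
# the reference sentence; SER is the fraction of sentences with a WER above zero.
def word_error_rate(reference, hypothesis):
    """Levenshtein distance over words, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

def sentence_error_rate(pairs):
    """Fraction of (reference, transcription) pairs containing at least one word error."""
    return sum(word_error_rate(r, h) > 0 for r, h in pairs) / len(pairs)

pairs = [("word1 word2 word3 word4", "word1 word2 word3 word4"),   # transcribed correctly
         ("word1 word2 word3 word4", "word1 word2 wordX word4")]   # one word wrong
print(word_error_rate(*pairs[1]), sentence_error_rate(pairs))      # 0.25 0.5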


Figure 4.15: The SER and WER for intelligibility of the developed system.

4.4.2 Test for Naturalness

The system was evaluated for naturalness using the MOS test discussed in Section 4.2.2.3. Figure 4.16 compares the synthesised speech with the natural speech. The isiNdebele synthesised speech achieved a higher MOS than the other synthesised voices; its difference from the isiNdebele natural speech is only 0.1, which means the isiNdebele synthesised speech is close to very natural sounding. The Sepedi synthesised speech obtained a MOS 1.0 below its natural voice, meaning it was still found to be natural sounding, although with the largest gap of the four voices. The Tshivenda synthesised speech achieved a gap of 0.5 from the natural speech and was found to be natural. The Xitsonga synthesised speech obtained a MOS of 4.6, which is 0.4 away from the natural speech, and was also found to be natural. These results are acceptable for these training datasets and test sentences.
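For clarity, each MOS reported here is simply the mean of the listeners' ratings on the 1-5 opinion scale; the short sketch below uses hypothetical ratings, not the study's raw data.

# Minimal sketch of the MOS computation: average the listeners' 1-5 ratings
# per voice. The ratings below are hypothetical, not the study's raw data.
ratings = {
    "isiNdebele": [5, 5, 5, 5, 5, 5, 4, 5],   # eight hypothetical listener scores
    "Sepedi":     [4, 4, 4, 4, 4, 4, 3, 4],
}
for language, scores in ratings.items():
    mos = sum(scores) / len(scores)
    print(f"{language}: MOS = {mos:.1f}")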


Figure 4.16: Results of test for naturalness of speech samples.

4.4.3 Test for Correct Pronunciation

The system was evaluated for correct pronunciation using the MOS test. Figure 4.17 shows the evaluation results for the synthesised and natural speech. The isiNdebele synthesised speech obtained a MOS of 4.8, which is 0.2 away from the natural voice; this means the isiNdebele synthesised speech pronounced words correctly. The Sepedi synthesised speech obtained a MOS of 3.6, which is lower than the others, meaning that some of the words were not pronounced the way evaluators expected. The Xitsonga and Tshivenda synthesised speech obtained MOS scores of at least 4.3, which means that their pronunciation was found to be excellent.


Figure 4.17: Results of test for pronunciation of speech samples.

4.4.4 Test for Pleasantness

The MOS test was used to evaluate the pleasantness of the system. Figure 4.18 compares the synthesised speech with the natural speech. The isiNdebele synthesised speech obtained a MOS of 4.9, which is 0.1 away from the natural speech; this means the isiNdebele synthesised speech was very pleasant. The Sepedi synthesised speech obtained a MOS of 3.9, which means the speech was found to be pleasant to listen to. The Tshivenda and Xitsonga synthesised speech obtained MOS scores of at least 4.3, which means the synthesised speech was found to be pleasant.


Figure 4.18: Results of test for pleasantness of speech samples.

4.4.5 Test for Understandability or Listening Effort

The system was evaluated for the listening effort required to understand the synthesised speech (see Figure 4.19). The MOS test was used to test understandability, with the isiNdebele language obtaining a MOS of 4.9, which means that no effort was required to understand the synthesised speech. Sepedi, Tshivenda and Xitsonga obtained MOS scores of 4.1, 4.3 and 4.4 respectively, which means that no appreciable effort was required to understand the synthesised speech.


Figure 4.19: Results of test for understandability of speech samples.

4.4.6 Test for the Overall Quality

The overall quality of the system was evaluated using the MOS test. Figure 4.20 shows the results for the overall quality of the synthesised and natural speech. The Sepedi and Xitsonga synthesised speech achieved MOS scores of 4.1 and 4.3 respectively, which means the speech quality was found to be acceptable, while the Tshivenda and isiNdebele synthesised speech obtained MOS scores of 4.6 and 4.8 respectively, which means the speech quality was found to be excellent.


Figure 4.20: Results of test for overall quality of speech samples.

4.4.7 Comparison with other Studies

The Xitsonga TTS synthesis system developed by Baloyi (2012) was the first TTS system developed for Xitsonga in South Africa. Figure 4.21 shows the evaluation results of the Xitsonga TTS developed by Baloyi (2012) alongside our Xitsonga TTS results. We compared these systems because both use HMMs as the speech synthesiser for the same language. The gap in MOS between the synthetic and natural speech decreased from 1.1 to 0.4 (64%) for naturalness, from 1.4 to 0.4 (71%) for pleasantness, and from 1.1 to 0.6 (46%) for understandability. The developed system therefore reduced the gap between natural and synthesised speech by roughly 46% or more on all three measures.
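The relative reductions quoted above follow directly from these MOS gaps; a short worked check (the exact figures are 63.6%, 71.4% and 45.5%, which the text rounds to 64%, 71% and 46%):

# Worked check: relative reduction of the MOS gap between synthesised and
# natural speech, from Baloyi (2012) to the current Xitsonga voice.
gaps = {"naturalness": (1.1, 0.4),
        "pleasantness": (1.4, 0.4),
        "understandability": (1.1, 0.6)}
for metric, (old_gap, new_gap) in gaps.items():
    reduction = (old_gap - new_gap) / old_gap * 100
    print(f"{metric}: gap reduced by {reduction:.1f}%")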

[Bar chart comparing the HMM-based Xitsonga TTS from Baloyi (2012), the current HMM-based Xitsonga TTS, and natural speech (16-bit PCM) on naturalness, pleasantness and understandability]

Figure 4.21: Subjective 5-scale MOS of Xitsonga TTS

4.5 Evaluation Results and Analysis of the Complete System Usability

The evaluators tested the functionality and usability of the developed system. The last section of the questionnaire in Appendix H contains the evaluation sheet in terms of the MOS test. After the evaluators had familiarised themselves with the system on the website, they rated the following review statements:

Statement 1: The buttons are visible and easy to find.

All the evaluators strongly agreed with this statement.

Statement 2: Languages can be switched easily.

All the evaluators strongly agreed with this statement.

Statement 3: Text is visible and clear.

All the evaluators strongly agreed with this statement.

Statement 4: Layout and colours are displayed perfectly.

All the evaluators strongly agreed with this statement.

Statement 5: The application is difficult to understand.

Most of the evaluators strongly disagreed with this statement, but 15% of them only disagreed; of this 15%, 33% each were Xitsonga and Tshivenda speakers and 17% each were isiNdebele and Sepedi speakers (see Figure 4.22).

Figure 4.22: The 15% of evaluators who disagreed that the application is difficult, by language.

Statement 6: I can use the application on my own.

Most of the evaluators strongly agreed that they could use the application on their own, while 15% of them only agreed. Figure 4.23 shows the language breakdown of this 15%; most of those who only agreed were Tshivenda speakers.

Figure 4.23: The 15% of evaluators who agreed that they could use the system on their own, by language.

Statement 7: I felt very confident using the application.

All the evaluators strongly agreed with this statement.

Statement 8: I would recommend this application to someone else.

All the evaluators strongly agreed with this statement.

Statement 9: I would frequently use this application.

Most of the evaluators strongly agreed that they would frequently use this application; only 9% of them gave neutral answers.

Statement 10: This application can help me learn pronunciation of new languages.

All the evaluators strongly agreed with this statement.

Statement 11: Would you recommend these voices to be integrated in future devices?

All the evaluators strongly agreed with this statement.

4.6 Summary

We have seen from the LID results that classification accuracy tends to decrease as the number of languages (classes) increases. We have explained the LID results for the various n-gram features and observed that the 2-5 n-gram set obtained the best results with the polynomial SVM. The final LID model was built on the entire dataset. We have also explained the evaluation metrics and the procedure followed to evaluate the speech generation phase. Subjective listening tests were conducted with 32 students, and good results were obtained from the MOS test. The usability of the system on the website was evaluated and good results were observed. The next chapter discusses these results and summarises the findings.

5 CHAPTER 5: DISCUSSIONS

This chapter discusses the results obtained in Chapter 4. The first section compares the classifiers on the various n-gram features. The second section discusses the speech synthesiser, the third section discusses system usability, and the last section summarises the findings.

5.1 Classifier Model Comparison

We combine the results of the classifiers on 1-gram to 5-gram features from Chapter 4 in Figure 5.1. We observe that larger n-gram sizes decrease classification accuracy for both MNB and the SVM kernels. The unigrams obtained their best accuracy of 61.12% with the RBF SVM, compared to the other algorithms. The bigrams obtained an accuracy of about 68.00% on average across all the classification algorithms, which shows that the bigrams discriminate between the languages (classes) better than the unigrams. The trigrams on MNB performed better than on the SVMs, with an accuracy of 69.34%; this shows that MNB classifies well on trigrams compared to other n-gram sizes. The accuracy decreased on 4-grams and 5-grams for both MNB and the SVMs, which shows that the trigrams were the turning point on our dataset.

We combine the results of the classifiers on the n-gram ranges from Chapter 4 in Figure 5.2. The combination of unigrams and bigrams (1-2) resulted in an accuracy of 69.37% with the sigmoid SVM. When trigrams were added (1-3), the polynomial SVM achieved an accuracy of 69.96%. When 4-grams were added (1-4), the accuracy went up to 70.09% with the RBF SVM, and when 5-grams were added (1-5) the RBF SVM still obtained the highest accuracy of 70.19%, while MNB performed lowest at 69.24%. When the unigrams were removed, leaving the 2-5 set, MNB did not perform well; however, the polynomial SVM produced the best result of 70.72% accuracy, the highest of all the results.

[Line chart: accuracy (%) of the polynomial, sigmoid, RBF and linear SVMs and MNB for each n-gram size (unigram to 5-gram)]

Figure 5.1: Accuracy comparison on a 10-fold cross validation

[Line chart: accuracy (%) of the polynomial, sigmoid, RBF and linear SVMs and MNB for each n-gram range (1-2 to 4-5)]

Figure 5.2: Accuracy comparison using combination of features on a 10-fold cross validation

5.2 Text-to-Speech Synthesiser

We can see from the results that the developed system obtained high accuracy in the SUS test at word level (Figure 4.14). This shows that the developed system is intelligible enough to synthesise new words without the context of a sentence. The WER was below 10% for Xitsonga, Tshivenda and isiNdebele, but Sepedi was an outlier with 14.82% errors (Figure 4.15). This outlier is caused by the accent of the Sepedi speaker, which is influenced by closely related languages such as Setswana. We nonetheless find that the results show that the system is intelligible enough to synthesise new words.

5.3 System Usability

We see from the results in Section 4.5 that most evaluators rated the user interface highly and found it easy to move from one language to another. All of the evaluators were confident after using the application and agreed that the application is not difficult; they were able to play synthesised speech on their own without any help. Some of the evaluators gave neutral answers regarding whether they would frequently use the application. The evaluators agreed that this application is helpful for learning the pronunciation of new languages. Therefore, this satisfies the last objective.

5.4 Summary of the Findings

As discussed in Chapters 3 and 4, all the research objectives were satisfied. We now answer the following research questions:

The first research question was formulated as follows: Can a computational system use a person's surname to identify the first language of that person?

To answer this research question, we compared the performance of MNB to SVM kernels in the previous chapter. We showed that the features or n-grams are important factors that can affect the classification accuracy of text-based LID. Four indigenous official languages of South Africa were employed for the study. These languages belong to a number of closely related groups of languages, which complicated the classification between them. Two machine-learning classifiers were contrasted in this study:

 The MNB, which obtained good results in previous studies of text classification.
 The SVMs, which are more robust and complex classifiers and perform better when fitted with correct parameters.

Character n-grams were used as the classification feature set, since the LID training dataset contains single words; word n-grams were therefore found unsuitable for this task. The string-to-word-vector filter in WEKA was used to convert the dataset into feature vectors, since SVMs do not support string attributes. The classifiers were evaluated on unigrams, bigrams, trigrams, 4-grams and 5-grams, and the trigrams yielded good results for both MNB and the SVMs. We further expanded the feature set to contain mixtures of n-grams of different sizes and tested them on both classifiers. The MNB, which acts as our baseline classifier, achieved its best accuracy of 69.34% (on trigrams). The polynomial SVM obtained a higher classification accuracy of 70.72% on the 2-5 n-gram set. The final model was built on this feature set using the polynomial SVM; its classification accuracy was above 80%, and the classification error decreased by 8.38%. This accuracy of above 80% is adequate for text-based LID of these under-resourced South African languages. Since the polynomial SVM outperformed the baseline MNB (70.72% > 69.34%), we conclude, on the basis of 70.72% cross-validated accuracy, that a computational system can reasonably use a person's surname to identify the first language of that person.

The second research question was formulated as follows: Can a computational system produce an appropriate pronunciation of indigenous proper names?

To answer this research question, we developed four baseline TTS synthesis systems, which were evaluated on correct pronunciation, naturalness, pleasantness, understandability, intelligibility and overall quality of the synthesised speech. Subjective listening tests were conducted, and the participants gave good ratings based on their opinions. The MOS test was used to evaluate the system and good results were observed, showing that the pronunciation was generally found to be good to excellent. Given these results, with a minimum MOS of 3.6, we conclude that a computational system can produce an appropriate pronunciation of indigenous proper names.

Therefore, the developed pronunciation assistant can help reduce the surname pronunciation problem for the indigenous official languages of South Africa as experienced by non-native speakers.

6 CHAPTER 6: CONCLUSIONS

6.1 Introduction

This chapter summarises the conducted experimental research including the limitations, scientific contribution, future work, and research questions. The layout is as follows:

 Section 6.2 discusses research limitations and challenges.
 Section 6.3 discusses some of the contributions of this research work to the scientific world.
 Section 6.4 provides recommendations and future work directions.

6.2 Limitations and Challenges

The study was conducted on a 32-bit Ubuntu desktop with two gigabytes of RAM. Deep neural network techniques were not used because they require substantial computational resources; two gigabytes of RAM and a 32-bit processor were not enough to implement them. The search for SVM parameters was limited to a maximum of 10 because larger searches caused the computer to stop responding.

We encountered challenges when creating the isiNdebele and Xitsonga synthetic voices. The MARY TTS transcription aligner uses the pipe "|" character to align phones. Since our phone sets contained the pipe character, the transcription tool produced an error while compiling a new voice. The solution was to use an alternative notation for phones that contain a pipe character (see Appendix B for the isiNdebele and Xitsonga phone sets).

The isiNdebele LID feature was not implemented due to inadequate data, which could have biased the results. However, the isiNdebele TTS synthesis was implemented successfully.

We also encountered challenges when recruiting people to participate, and we used a maximum of 32 people who agreed to participate. Hence, we evaluated each system with a group of 8 people (8 × 4 systems). Similarly, 10 people were used in the studies by Gahlawat et al. (2014) and Dagba and Boco (2014); this number of participants is therefore sufficient to obtain good results when evaluating a speech synthesiser.

6.3 Contributions of the Study

6.3.1 Language-specific Applications

 The developed LID can be used in any language-specific system to classify surnames.
 The developed TTS synthesis system can be embedded in any language-specific application to provide synthesised speech.

6.3.2 Pre-processing Files

In the initial phase, when pre-processing was performed on the collected data, a few scripts were developed that can help other researchers. Examples are:

 A bash script to prepare and install the required programs for the MARY TTS synthesis system (see Appendix A).
 A Python script to generate a pronunciation dictionary compatible with MARY TTS (see Appendix C).
 A Java file for patching MARY TTS to support LID (see Appendix D).
 The source code of the Android application (see Appendix E).
 A Python script to calculate WER and SER (see Appendix K).

6.3.3 Importance of the Developed System

 The system was deployed on the internet for further evaluation on "real-world" data. Moreover, this system may be used by anyone, ranging from visually impaired people to students in schools.
 The system may help people learn the pronunciation of surnames in Sepedi, Xitsonga, Tshivenda and isiNdebele.

 The system may provide a platform for supporting new research in the field of ICT for language learning and teaching of South African official languages.
 A detailed API of the developed system is given in the Methodology chapter. The API may be utilised by any researcher to use the functions of the developed system.
 A website of the system was deployed and may be used in any area of speech technology or educational technologies.

6.3.4 Speech Synthesis Results

The Xitsonga synthetic voice was compared to the HMM-based Xitsonga TTS synthesis system developed by Baloyi (2012). We observed that our system reduced the gap to natural speech for naturalness, pleasantness and understandability by 64%, 71% and 46% respectively. These results are found to be excellent for these under-resourced languages and their training data.

6.3.5 Common Dataset

Currently there are no common datasets for comparing text-based LID performance on under-resourced languages. The surname dataset compiled for this study can therefore serve as a starting point for such comparisons.

6.4 Future Work and Recommendations

6.4.1 Machine-learning Phase

The observed LID accuracies can be increased by deploying more sophisticated machine-learning algorithms such as DNNs (van den Oord et al., 2016). In addition, the training data for text-based LID can be increased to cover more surnames.

This study can be further expanded by:

 covering classification of hyphenated (or so-called double-barrelled) surnames (e.g. Sefara-Dzambukeri);
 including classification of multilingual surnames (e.g. Baloyi);
 including both first name and last name (e.g. Mathapelo Alice Sefara);
 including classification of words in general sentences.

The isiNdebele LID feature can be implemented by further obtaining enough textual corpus.

6.4.2 Speech Synthesis Phase

 The text analysis phase can be further enhanced to normalise other classes of input data elements, such as numbers, currency amounts and other non-standard words, to solve normalisation problems. This phase was not fully implemented since the study focused on surnames.
 The EHMM labeller was used to automatically label phones against their corresponding utterances. Hand-labelling can be used to verify the labels manually, even though it requires much linguistic expertise.
 Incorporating prosody into the system can be optimised to increase the quality of the synthesised speech.
 SPSS systems suffer from poor quality caused by the inadequacy of acoustic modelling (e.g. trajectory HMMs), limitations of the vocoder (e.g. STRAIGHT) and over-smoothing of parameter generation (e.g. global variance). These can be addressed by applying recent advanced deep learning algorithms to replace HMMs.
 Additional under-resourced languages can be included to expand the coverage.
 The Sepedi synthetic voice performed lower than the other languages; this can be improved by using a professional speaker who is fluent in Sepedi.
 The speech synthesiser can be further evaluated with more participants and more sentences.

6.5 Final Remarks

This study presented an automatic pronunciation assistant system that combines machine-learning algorithms with speech synthesis in Sepedi, Tshivenda, Xitsonga and isiNdebele. The automatic pronunciation assistant synthesises speech of good quality. An HMM-based method was used for the implementation of the developed system. Although it is commonly known that this method does not produce synthesised speech of as high a quality as unit selection-based systems, it is very flexible and efficient and requires less training data. Furthermore, this method offers room for adaptation and for the development of TTS voices for under-resourced languages.

LIST OF PUBLICATIONS

Sefara, T.J., Manamela, M.J. & Modipa, T.I., 2017. Web-based automatic pronunciation assistant. In Southern Africa Telecommunication Networks and Applications Conference (SATNAC). Barcelona, pp. 112-117.

Mokgonyane, T.B., Sefara, T.J., Manamela, P. J., Manamela, M.J. & Modipa, T.I., 2017. Development of a speech-enabled basic arithmetic m-learning application for foundation phase learners, In 2017 IEEE AFRICON, Cape Town, pp. 794-799.

Malatji, P.T., Manamela, M.J. & Sefara, T.J., 2017. Second language learning through accented synthetic voices. In South Africa International Conference on Educational Technologies (SAICET). Pretoria, AARF, pp. 106-116.

Sefara, T.J., Manamela, M.J. & Malatji, P.T., 2016. Text-based language identification for some of the under-resourced languages of South Africa. In 3rd International Conference on Advances in Computing and Communication Engineering (ICACCE-2016). Durban, IEEE, pp. 303-307.

Sefara, T.J. & Manamela, J.M., 2016. The development of local synthetic voices for an automatic pronunciation assistant. In Southern Africa Telecommunication Networks and Applications Conference (SATNAC). George, pp. 142-146.

Mokgonyane, T.B., Sefara, T.J., & Manamela, M.J., 2016. Speech-enabled applications for foundation phase. In Southern Africa Telecommunication Networks and Applications Conference (SATNAC). Work-in-progress, George, pp. 94-95.

Sefara, T.J., Manamela, M.J. & Malatji, P.T., 2016. Applying Speech Synthesis to Basic Mathematics as a Language. In South Africa International Conference on Educational Technologies (SAICET). Pretoria, AARF, pp. 243-253.

Malatji, P.T., Manamela, M.J. & Sefara, T.J., 2016. Creating Accented Text-To-Speech English Voices to Facilitate Second Language Learning. In South Africa International Conference on Educational Technologies (SAICET). Pretoria, AARF, pp. 234-242.

Sefara, T.J. & Manamela, M.J., 2015. Towards Development of an Automatic Pronunciation Assistant. In Southern Africa Telecommunication Networks and Applications Conference (SATNAC). Work-in-progress, Hermanus, pp. 15-16.

APPENDIX A: INSTALLATION GUIDE AND PATH VARIABLES – install.sh

The following script was used to install all the required packages and libraries from the terminal. More installation details are provided by the software vendors and can also be found on the world wide web. Ubuntu 14.04 LTS 32-bit was used for this research project. All the software packages were downloaded and saved to the directory /voice/sources/. The installation steps below mainly follow the MARY TTS voice import tutorial on GitHub, available at https://github.com/marytts/marytts/wiki/VoiceImportToolsTutorial.

#!/bin/bash
sudo apt-get update
# create the working directories for downloaded sources and installed tools
sudo mkdir -p /voice/sources /voice/soft
sudo chown -R $USER /voice/
sudo apt-get install build-essential git mc libc6-dev libx11-dev libncurses5-dev sox tcl-snack g++ python3-dev openjdk-8-jdk
#Apache-maven
cd /voice/sources
tar xf apache-maven-3.3.9-bin.tar.gz -C /voice/soft/
mv /voice/soft/apache-maven-3.3.9 /voice/soft/maven
#HTK, HDecode and HTS
tar xf HTK-3.4.1.tar.gz
tar xf HDecode-3.4.1.tar.gz
mkdir hts
tar xf HTS-2.2_for_HTK-3.4.1.tar.bz2 -C hts
cd htk
patch -p1 -d . < ../hts/HTS-2.2_for_HTK-3.4.1.patch
./configure --prefix=/voice/soft/hts
make all hdecode
make install install-hdecode
cd ../
#HTS engine
tar xf hts_engine_API-1.05.tar.gz
cd hts_engine_API-1.05
./configure --prefix=/voice/soft/hts_engine
make
make install
cd ../
#Edinburgh Speech tools
tar xf speech_tools-2.4-release.tar.gz
cd speech_tools/
./configure
make
make test
cd ../
#Festvox
tar xf festvox-2.7.0-release.tar.gz
cd festvox/
./configure
make
cd ../
#Festival
tar xf festival-2.4-release.tar.gz
cd festival/
./configure
make
cd ../
#SPTK
tar xf SPTK-3.6.tar.gz
cd SPTK-3.6/
./configure --prefix=/voice/soft/SPTK
make
make install
cd ../
#Praat
mkdir /voice/soft/praat
tar xf praat6014_linux32.tar.gz -C /voice/soft/praat/
# Path environment variables: alternatively these variables can be appended to
# the .bashrc file located at /home/$USER/.bashrc
export HTSDIR=/voice/soft/hts
export FESTVOXDIR=/voice/sources/festvox
export FESTIVALDIR=/voice/sources/festival
export ESTDIR=/voice/sources/speech_tools
export SPTKDIR=/voice/soft/SPTK
export HTSEngine=/voice/soft/hts_engine
export MAVEN=/voice/soft/maven
export PATH=$PATH:$ESTDIR/bin:$ESTDIR/include:$ESTDIR/lib:$FESTVOXDIR:$HTSDIR/bin:$HTSEngine/bin:$ESTDIR/main:$FESTIVALDIR/bin:$SPTKDIR/bin:$MAVEN/bin

#Mary TTS and EHMM labeller
git config --global http.postBuffer 2M
git clone https://github.com/marytts/marytts.git
#Alternatively, if the above command clones a different version of MARY TTS,
#then the source code for MARY TTS version 5.2 can be manually
#downloaded and extracted from GitHub at web address:
#https://github.com/marytts/marytts/archive/v5.2.tar.gz
cd marytts/lib/external/ehmm
make
export EHMM=/voice/sources/marytts/lib/external/ehmm/bin
cd ../
./check_install_external_programs.sh check
#Results: should display OK.
cd ../../
mvn install
#Results: a new folder called target is created; it contains the MARY TTS client and voice builder located at /voice/sources/marytts/target/.
export MARYTTS=/voice/sources/marytts/target/marytts-builder-5.2-SNAPSHOT/bin

APPENDIX B: PHONE SET FILES

B1: Sepedi phone set – allophone.nso.xml

B2: Tshivenda phone set – allophone.ven.xml

B3: IsiNdebele phone set – allophone.nbl.xml

B4: Xitsonga phone set – allophone.tso.xml

APPENDIX C: CREATING DICTIONARY – gen_dictionary.py

#!/usr/bin/env python
# This script generates a dictionary (xy.txt) from a given pronunciation
# dictionary. The new dictionary contains the actual word followed by its
# SAMPA pronunciation and the tag "functional".
import sys

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print("Usage: " + sys.argv[0] + " inputFile")
        sys.exit()
    inFile = open(sys.argv[1], 'r')
    outFile = open('xy.txt', 'w')
    for line in inFile:
        # strip the trailing newline and any spaces before splitting on tabs
        line = line.rstrip('\n').replace(' ', '')
        line = line.split('\t')
        line = line[0] + " " + line[1] + " functional\n"
        outFile.write(line)
    outFile.close()
    inFile.close()

140 APPENDIX D: CLASSIFICATION FUNCTION – NamesPredictor.java package weka.classifiers; import weka.classifiers.Classifier; import weka.core.Attribute; import weka.core.DenseInstance; import weka.core.Instances; import java.util.ArrayList; import java.util.List; /** * Class for Language Identification of Names * * @author Tshephisho Joseph Sefara * @institution University of Limpopo * @citation Master’s Thesis */ public class NamesPredictor { /** * String that stores the text to guess its language. */ static String text; /** * Object that stores the instance. */ static Instances instances; /** *String that stores locale of the predicted language */ static String locale; /** * Object that stores the classifier. */ static Classifier cls; /** * Creates the constructor. * @throws Exception */ public NamesPredictor(String strText, String model) throws Exception { // sets text to be classified text = strText; // this function prepares the new instance makeInstance(); // this function loads the classifier model loadModel(model); } /** * This method reads the classifier object or model. * @throws Exception */ public static void loadModel(String modelName) throws Exception { cls = (Classifier) weka.core.SerializationHelper.read(modelName); }

/** * This method creates an instance to be predicted. */ public static void makeInstance() { // Create the header List attributeList = new ArrayList(2); // Create first attribute, the class List values = new ArrayList(3); values.add("nso");

141 values.add("tso"); values.add("ven"); Attribute attribute1 = new Attribute("class", values); attributeList.add(attribute1); // Create second attribute, the text Attribute attribute2 = new Attribute("surname",(List) null); attributeList.add(attribute2); // Build instance set with just one instance instances = new Instances("Test relation", (java.util.ArrayList) attributeList, 1); // Set class index instances.setClassIndex(0); // Create and add the instance DenseInstance instance = new DenseInstance(2); instance.setDataset(instances); instance.setValue(attribute2, text); instances.add(instance); } /** * This method performs the classification of the instance * and returns a locale (string). * @throws Exception */ public String classify() throws Exception { // Predicts a language given an instance double pred = cls.classifyInstance(instances.instance(0)); locale = instances.classAttribute().value((int) pred); return locale; } }

APPENDIX E: ANDROID SOURCE CODE

E.1: MainActivity.java package za.co.speechtech.ttsdemo; import android.app.DownloadManager; import android.net.Uri; import android.os.Bundle; import android.os.Environment; import android.support.v7.app.AppCompatActivity; import android.view.Menu; import android.view.MenuItem; import android.view.View; import android.widget.Button; import android.widget.EditText; import android.widget.Toast; import java.io.IOException; import java.util.concurrent.ExecutionException; import static android.widget.Toast.makeText; public class MainActivity extends AppCompatActivity { //private TextView mTextMessage; String locale=" "; //declare and initialise locale EditText area; //declare text area Button clear, speak, download; //declare buttons String text=" "; //declare and initialise text/surname String model = "namesModel"; //declare default predictor model String lidUrl; //address from language identification String userUrl; //address from user selection final static String [] locales = {"nso","tso","ven","nbl","en_US","detect"}; @Override protected void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_main); area = (EditText) findViewById(R.id.editText2); clear = (Button) findViewById(R.id.button1); //clear text area clear.setOnClickListener(new View.OnClickListener() { public void onClick (View view) { area.setText(""); } }); speak = (Button) findViewById(R.id.button2); //Plays audio file from remote site speak.setOnClickListener(new View.OnClickListener() { public void onClick (View view) { try { if(urlBuilder(model)) { player(locale.equals(locales[5]) ? lidUrl : userUrl); } } catch (IOException | InterruptedException | ExecutionException e) { e.printStackTrace(); } } }); download = (Button) findViewById(R.id.button); //Download audio file from remote site download.setOnClickListener(new View.OnClickListener() { public void onClick (View view) { try { if (urlBuilder(model)) { String uri;

143 if (!Methods.isempty(text)) { uri = locale.equals(locales[5]) ? lidUrl : userUrl; DownloadManager.Request req = new DownloadManager.Request(Uri.parse(uri)); req.setDestinationInExternalPublicDir((Environment.DIRECTORY_DOWNLOADS),"audio.wav"); req.setNotificationVisibility(DownloadManager.Request.VISIBILITY_VISIBLE_NOTIFY_COMPL ETED); req.allowScanningByMediaScanner(); DownloadManager downloadManager = (DownloadManager) getSystemService(DOWNLOAD_SERVICE); downloadManager.enqueue(req); } } }catch(IOException | InterruptedException | ExecutionException e){ e.printStackTrace(); } } }); } public boolean onCreateOptionsMenu(Menu menu) { getMenuInflater().inflate(R.menu.main, menu); return true; } //sets languages based on user selection public boolean onOptionsItemSelected(MenuItem item) { int id = item.getItemId(); switch (id) { case R.id.item1: locale = locales[0]; return true; case R.id.item2: locale = locales[2]; return true; case R.id.item3: locale = locales[1]; return true; case R.id.item4: locale = locales[3]; return true; case R.id.item5: locale = locales[4]; return true; case R.id.item6: locale = locales[5]; return true; } return super.onOptionsItemSelected(item); } //generate url based on user selection or automatic language identification public boolean urlBuilder(String model) throws IOException, ExecutionException, InterruptedException { text = area.getText().toString().trim(); if (Methods.isempty(text)){ makeText(getBaseContext(), "Enter surname", Toast.LENGTH_SHORT).show(); return false; } else { text = Methods.encode_text(text); } if (Methods.isempty(locale.trim())){

144 makeText(getBaseContext(), "Select the language", Toast.LENGTH_SHORT).show(); return false; } if(locale.equals(locales[5])) { text = area.getText().toString().trim(); text = Methods.encode_text(text); if (Methods.isInternet(getBaseContext())) { String urlAddress = "http://www.speechtech.co.za/alltts/classify.php?text=" + text + "&model=" + model; LanguageIdentification LID = new LanguageIdentification(); LID.execute(urlAddress); String language = LID.get(); switch (language) { case "nso": makeText(getBaseContext(), "Language set to Sepedi", Toast.LENGTH_LONG).show(); break; case "tso": makeText(getBaseContext(), "Language set to Xitsonga", Toast.LENGTH_LONG).show(); break; case "ven": makeText(getBaseContext(), "Language set to Tshivenda", Toast.LENGTH_LONG).show(); break; case "nbl": makeText(getBaseContext(), "Language set to IsiNdebele", Toast.LENGTH_LONG).show(); break; default: makeText(getBaseContext(), "Language not detected", Toast.LENGTH_LONG).show(); return false; } //create URL based on identified language lidUrl = "http://www.speechtech.co.za/alltts/download.php?INPUT_TEXT=" + text + "&LOCALE=" + language + "&act=download"; return true; } makeText(getBaseContext(), "No Internet Availability", Toast.LENGTH_LONG).show(); return false; } //create URL based on user selection userUrl = "http://www.speechtech.co.za/alltts/download.php?INPUT_TEXT="+text+"&LOCALE="+locale+ "&act=download"; return true; } //plays the audio given a URL public void player(String url) { //check internet status if (Methods.isInternet(getBaseContext())) try { PlayAudioManager.playWave(getApplicationContext(), url); } catch (Exception e) { e.printStackTrace(); } else { makeText(getBaseContext(), "No Internet Availability", Toast.LENGTH_SHORT).show();

145 } } }

E.2: LanguageIdentification.java package za.co.speechtech.ttsdemo; import android.os.AsyncTask; import java.io.BufferedInputStream; import java.io.BufferedReader; import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; import java.net.HttpURLConnection; import java.net.URL; /** * Created by Tshephisho Joseph Sefara on 2017/03/12. * */ class LanguageIdentification extends AsyncTask { HttpURLConnection urlConnection = null; @Override protected String doInBackground(String... params) { StringBuilder response = new StringBuilder(); try { URL url = new URL(params[0]); urlConnection = (HttpURLConnection) url.openConnection(); InputStream in = new BufferedInputStream(urlConnection.getInputStream()); BufferedReader read = new BufferedReader(new InputStreamReader(in)); String line; while ((line = read.readLine()) != null) { response.append(line); } } catch (IOException e){ e.printStackTrace(); } return response.toString(); } public void onPostExecute (String results) { urlConnection.disconnect(); } }

E.3: Methods.java package za.co.speechtech.ttsdemo; import java.io.UnsupportedEncodingException; import java.net.URLEncoder; import android.content.Context; import android.net.NetworkInfo; import android.net.ConnectivityManager; /** * Created by Tshephisho Joseph Sefara on 2017/03/12. * This class contains static methods */ class Methods { //Checks internet connectivity static boolean isInternet(Context c) { ConnectivityManager connect = (ConnectivityManager) c.getSystemService(Context.CONNECTIVITY_SERVICE);

146 NetworkInfo activeNet = connect.getActiveNetworkInfo(); return activeNet != null && activeNet.getState() == NetworkInfo.State.CONNECTED; } //Check if the argument is empty static boolean isempty(String tempString) { return tempString.isEmpty() || tempString.length() == 0 || tempString.equals(""); } //Use URLEncoder to encode the argument static String encode_text(String text) throws UnsupportedEncodingException { text = URLEncoder.encode(text, "UTF-8"); return text; } }

E.4: PlayAudioManager.java package za.co.speechtech.ttsdemo; import android.content.Context; import android.media.MediaPlayer; import android.net.Uri; /** * Created by Tshephisho Joseph Sefara on 2017/03/12. * */ class PlayAudioManager { private static MediaPlayer mediaplayer; static void playWave (final Context context, final String urlPath) throws Exception { if (mediaplayer == null) { mediaplayer = MediaPlayer.create(context, Uri.parse(urlPath)); } mediaplayer.setOnCompletionListener(new MediaPlayer.OnCompletionListener() { @Override public void onCompletion(MediaPlayer arg0) { try { if (mediaplayer != null) { mediaplayer.reset(); mediaplayer.release(); mediaplayer = null; } }catch (Exception e ){ e.printStackTrace(); } } }); mediaplayer.start(); } }

E.5: Activity_main.xml

147 android:orientation="vertical" android:paddingBottom="@dimen/activity_vertical_margin" android:paddingLeft="@dimen/activity_horizontal_margin" android:paddingRight="@dimen/activity_horizontal_margin" android:background="@color/colorAccent" tools:context="za.co.speechtech.ttsdemo.MainActivity" android:paddingTop="@dimen/activity_vertical_margin">