Machine-Assisted Phonemic Analysis
Timothy Kempton
Department of Computer Science
February 2012
Dissertation submitted to the University of Sheffield for the degree of Doctor of Philosophy Supervisor: Professor Roger K. Moore
Acknowledgements
Firstly I would like to thank my supervisor Roger Moore. From the start, Roger was flexible and enthusiastic about my original proposal. I have really appreciated his patience as I slowly developed in the role of a researcher. I've particularly benefitted from Roger's wealth of knowledge of previous speech recognition work, and his depth of insight on the subject. I am also very grateful for being funded by the UK Engineering and Physical Sciences Research Council (EPSRC grant number EP/P502748/1) through the University of Sheffield.

I have had a lot of assistance from fieldworkers in Southeast Asia. Andy Castro has been particularly helpful in providing the Kua-nsi data, and talking through the phonology of the language. Brian Crook was also on hand later to provide transcriptions and recordings on request. I have also benefited from conversations with Cathryn Yang about fieldwork and from receiving her survey data on the Nisu language. Mary Pearce has been tremendously helpful in conversations about field linguistics and phonemic analysis, and I have appreciated her dedication in reading through an earlier draft of the thesis. Cathy Bartram gave me a helpful steer at the start on phonemic analysis heuristics. Juha Yliniemi put me in touch with the above linguists including David Morgan who helped me access the resources at the SIL UK library. I have hugely benefited from the expertise of those on the SIL linguistics discussion list. I have appreciated extended discussions with Stephen Marlett, Robert Hedinger, and Steve Parker and I am grateful for the data they sent me. I'm thankful for the Ethnologue editorial team who ran some queries on their database for me. Brian Migliazza helped with initial queries and I value the further discussions with Paul Lewis regarding language vitality factors.

It has been a privilege to talk through areas of research with linguists in academia. Near the beginning of the PhD, I met John Goldsmith at a conference at UCL who encouraged me to pursue the original idea. Sharon Peperkamp has been very helpful in answering all my questions on her previous research. Robert Kirchner's sabbatical at Sheffield was perfect timing for me to learn phonology, and I'm grateful for the discussions we had in our shared office. Sandra Whiteside and Frank Herrmann taught me practical phonetics at Sheffield, and I appreciate their openness for further questions and discussions. I also benefited from a conversation with Ranjan Sen on phonological rules. I am also grateful to Tony Simons for sharing his past knowledge,
particularly the spectrogram reading material. Assistance for the allophones example in Chapter 1 came from Sesotho speakers Lehlohonolo Mohasi and Peter Lebiletsa. Ben Murphy recorded the northeast English at the Stadium of Light. John-Paul Hosom, Andreas Stolcke, Petr Schwarz and Larry Hayashi have all kindly given me assistance in using their software. Advice on evaluation metrics came from Paul Clough and Colin Champion. Colin originally introduced me to both the ROC-AUC measure and the value of statistical significance.

The SpandH Research Group has been great fun to be part of (thanks especially to James Carmichael) and a good forum for sharing ideas. I have enjoyed working with Emina Kurtic & Ahmet Aker on forced alignment; Emina has been helpful for linguistic advice and proofreading, and Ahmet provided support for text processing. Matt Gibson has provided a lot of help with anything mathematical, particularly acoustic modelling. This is also true for Ning Ma, who also provided insightful comments on this when proofreading. Vincent Wan gave me a head start with language modelling, both in understanding it and using the correct tools. Sarah Creer was able to point me in the direction of some particularly relevant phonological research and lend me the appropriate books. I have benefited from discussions with Odette Scharenborg, particularly about the importance of comparing phoneme inventories. I have also appreciated Jan Gorisch's proofreading as well as providing an extra ear for checking my transcriptions. I am grateful to other members of this research group for productive discussions including Thomas Hain, Sue Harding, Guy Aimetti, Herman Kamper, Phil Green, Guy Brown, Robin Hofe and Jon Barker.

I am completely indebted to my family, who show me much love and patience. My brother Matthew was able to use his PhD experience to coach me, and also taught me more about statistical significance. And I'm very grateful to my mother, who did lots of typing. It's been great having regular encouragement from my father, Jessica and Joe. It's been so good to have a supportive bunch of housemates. Tim Brown has been checking my progress and doing some of the proofreading. Philip Wilson was able to check my use of statistics and helped in finding the equation for the Hockett heuristic. I've also appreciated the moral support of Richard Wilson and Fabian Avila. It's been great to have a house linked with the church, and I'm so thankful for everyone in Broomhall gospel community past and present. I'm aware that in doing a PhD I run the risk of becoming more individualistic and self-indulgent; I am very grateful for my brothers and sisters in Christ who remind me of the gospel, which keeps me sane; the Richardsons, for example, have been helpful in that way over the full time period. In the wider church I've particularly appreciated the support and friendship of Lucy Mitchell, Piers & Shirley Miller, and all the Elders. Fred Hughes has been a great support long term. As I reflect on the many people that have helped me, and the friendship of many of them, I am full of thanks to God for providing these people and giving them their gifts. This thesis is dedicated to everyone at The Crowded House, Sheffield.
Image component credits: Zscout370 (Lesotho flag, p.13), Hugh Guiney (Brain, p.72) CC BY-SA

Abstract
There is a consensus among many linguists that half of all languages risk disappearing by the end of the century. Documentation is agreed to be a priority. This includes the process of phonemic analysis to discover the contrastive sounds of a language, with the resulting benefits of further linguistic analysis, literacy, and access to speech technology. A machine-assisted approach to phonemic analysis has the potential to greatly speed up the process and make the analysis more objective. Good computer tools are already available to help in a phonemic analysis, but these primarily provide search and sort database functionality, rather than automated analysis. In computational phonology there have been very few studies on the automated discovery of phonological patterns from surface level data such as narrow phonetic transcriptions or acoustics. This thesis addresses the lack of research in this area. The key scientific question underpinning the work in this thesis is "To what extent can a machine algorithm contribute to the procedures needed for a phonemic analysis?". A secondary question is "What insights does such a quantitative evaluation give about the contribution of each of these procedures to a phonemic analysis?" It is demonstrated that a machine-assisted approach can make a measurable contribution to a phonemic analysis for all the procedures investigated: phonetic similarity, phone recognition & alignment, complementary distribution, and minimal pairs. The evaluation measures introduced in this thesis allow a comprehensive quantitative comparison between these phonemic analysis procedures. Given the best available data and the machine-assisted procedures described, there is a strong indication that phonetic similarity is the most important piece of evidence in a phonemic analysis. The tools and techniques developed in this thesis have resulted in tangible benefits to the analysis of two under-resourced languages and it is expected that many more languages will follow.
Contents
1 Introduction 11
1.1 Motivation 11
1.1.1 Importance of phonemic analysis 11
1.1.2 Importance of machine-assisted phonemic analysis 15
1.2 What is involved in a phonemic analysis? 16
1.3 Can a machine-assisted approach help? 19
1.4 Scope of thesis 20
1.5 Definitions of key phonological terms 21
1.6 Chapter summary 22
2 Related work 23
2.1 Literacy and endangered languages 23
2.2 Phonemic analysis 26
2.3 Directly relevant work 27
2.3.1 Software to help with phonemic analysis 27
2.3.2 Computational phonological analysis 28
2.4 Speech recognition technology 29
2.4.1 Multilingual acoustic modelling for ASR 29
2.4.2 Language identification 33
2.4.3 Which speech technology to use? 37
2.5 Selected areas in speech development 38
2.5.1 Perception of speech categories 38
2.5.2 Learning sound categories 39
2.6 Chapter summary 41
3 Phonetic similarity 43
3.1 Phonetic similarity detection 43
3.1.1 Relative minimal difference (Experiment 3A) 43
3.1.2 Active articulator (Experiment 3B) 49
3.2 Phonetic distance measure (Experiment 3C) 50
3.2.1 Evaluation measure and results 52
3.2.2 Comparison with phonetic similarity detection algorithms 53
3.2.3 Theoretical shortcomings in using binary features 54
3.3 Dealing with sequences of sounds 55
3.3.1 The relative minimal difference and sequences of sounds 58
3.3.2 The active articulator and sequences of sounds 59
3.4 Suitable corpora for experiments 59
3.4.1 Well-resourced languages 59
3.4.2 Under-resourced languages 61
3.4.3 The algorithms applied to Kua-nsi data (Experiment 3D) 62
3.4.4 The algorithms applied to a French phone set (Experiment 3E) 63
3.5 Conclusions 64
3.6 Chapter summary 65
4 Phone recognition and alignment 67
4.1 The challenge: minimum knowledge of the language 67
4.1.1 An illustration from an unwritten language 68
4.1.2 Evaluation measures: PER and BFEPP 69
4.2 Cross-language phone recognition (Experiment 4A) 70
4.2.1 Experimental set-up 70
4.2.2 Direct cross-language phone recognition 71
4.2.3 Phone-based ROVER 72
4.3 Cross-language forced alignment (Experiment 4B) 75
4.3.1 Phone set similarity 78
4.3.2 Cross-language forced alignment on Bosnian (Experiment 4C) 79
4.4 Conclusions 83
4.5 Chapter summary 84
5 Complementary distribution 85
5.1 A visual representation for complementary distribution 85
5.2 A visual representation for interpretation 88
5.3 Measuring complementary distribution 90
5.4 Results on TIMIT (Experiment 5A) 91
5.4.1 Predicting allophonic relationships 91
5.4.2 Detecting the default phone 93
5.5 Comparison with previous studies 93
5.6 Experiments on Kua-nsi (Experiment 5B) 94
5.7 Feature-based algorithms 95
5.7.1 Assimilation (Experiment 5C) 95
5.7.2 Towards a probabilistic feature based framework 97
5.8 Conclusions 98
5.9 Chapter summary 99
6 Minimal pairs 101
6.1 Existence of putative minimal pairs (Experiment 6A) 102
6.1.1 Putative minimal pairs in Kua-nsi 102
6.1.2 Evaluating the detection of phonemically distinct phones 103
6.2 Counts of putative minimal pairs (Experiment 6B) 104
6.3 Using independent counts (Experiment 6C) 105
6.4 Experiments on TIMIT (Experiment 6D) 105
6.5 Future work: semantic analysis and rare phones 109
6.6 Conclusions 111
6.7 Chapter summary 112
7 Discussion and Conclusions 113
7.1 Reviewing the scope of the thesis 113
7.1.1 Further procedures in phonemic analysis 113
7.1.2 Reasons for separate evaluations 115
7.2 Summary of results 116
7.2.1 Standard measures 116
7.2.2 Algorithm complexity 117
7.2.3 Explicit efficiency savings 120
7.3 Answers to scientific questions 122
7.4 Contributions 122
7.5 Implications 124
7.6 Practical outcomes for the field linguist 125
7.7 Chapter summary 126
References 127
A Additional precision scores 141
B Phonology sketch of Kua-nsi 143
C IPA mappings for the Brno phone recognisers 145
Chapter 1
Introduction
1.1 Motivation
1.1.1 Importance of phonemic analysis
Proportion of endangered languages
Throughout human history, languages have come and gone, but there is a general consensus that in this century we now face an unprecedented scale of language extinction. According to an assessment by the UN, half of all the estimated 6000 living languages risk disappearing by the turn of the century (Moseley, 2009). On average this is equivalent to one language dying out every fortnight (Crystal, 2000, p.19). Does this matter? There has been much discussion on this subject (Crystal, 2000; Ostler et al., 2008; Grenoble and Whaley, 2006); most of the arguments can be summarised in a few points. First, language endangerment matters to the language community which is being affected. Since language forms the main cultural identity of the community, it can be very difficult for that community when it disappears. Encoded in the language is the inherited knowledge of the community to help them survive in their local environment, e.g. oral traditions and vocabulary for local flora and fauna. Second, it matters to humanity as a whole. It is argued that cultural and linguistic diversity supports survival in diverse environments and therefore helps to safeguard our future as a species. Third, a loss of an undocumented language is a loss of data for theories about languages; e.g. it might be difficult to construct the history of a language family when there are too many missing descendants. A recent attempt to infer a geographical region for the universal origin of language (Atkinson, 2011) depended on accurate data about phoneme inventories from hundreds of diverse languages (Dryer and Haspelmath, 2011). With fewer languages it would be more difficult to come to any conclusion. One of the immediate priorities when faced with an endangered language is to document it
(Grenoble and Whaley, 2006, p.68; Crystal, 2000, p.149). The more endangered the language, the more important this is. Any further revitalisation efforts can then make use of this data. Traditionally this is in the form of descriptions such as dictionaries and grammars. In recent years, there has also been an emphasis on comprehensive documentation of language use, such as storytelling recorded on video (Himmelmann et al., 2002).
Phonemic analysis for language documentation and description
A phonemic analysis is a fundamental part of the description and documentation of a language. It sits within the broader framework of a phonological analysis, which is an investigation into the whole sound system of a language. A phonemic analysis is narrower, in that it is primarily concerned with identifying the contrastive sounds. Two sounds contrast if substituting one for the other in a word can change the meaning of that word. For example, in English the word lip [lɪp] has its meaning completely changed if [d] is substituted for [l]. Therefore [l] and [d] contrast; each sound is the realisation of a different phoneme, /l/ and /d/ respectively. Some sounds are articulated differently but do not contrast. For example, in English the ejective [pʼ], i.e. produced with glottalic initiation, is occasionally used at the end of an utterance, e.g. stop [stɒpʼ] (Wells, 1982, p.261), but this does not contrast with [pʰ]; there is no change in meaning if either sound is substituted for the other. They are allophones, and are generally judged to be the same sound by English speakers. They are both realisations of the same phoneme, /p/. Sesotho, a language spoken in Lesotho, has similar sounds but they contrast differently. In Sesotho [l] and [d] are allophones, but there is a contrast between the sounds [pʰ] and [pʼ] (Demuth, 2007). This is shown in Figure 1.1 with example words in Table 1.1. No previous illustrations could be found in the literature showing cross-language phonemic effects both ways with real words, so this example1 was compiled with the assistance of indigenous speakers from Lesotho and Northeast England (Sunderland). The process of a phonemic analysis is described more fully in Section 1.2. A phonemic analysis leads to at least three important follow-on benefits for a language: further linguistic analysis, literacy, and speech technology (Figure 1.2).
1 Northeast English is used in this example because the accent shows a very similar vowel to Sesotho: /o/ rather than /əʊ/ as in RP English (Wells, 1982; Watt and Allen, 2003). The Sesotho name Polo is short for the full name Polomakhoashe (written in the Lesotho orthography rather than the South African variant). The utterance [bolo], which is a nickname in English, is included to confirm that there is a three-way contrast for bilabial plosives in Sesotho but only a two-way contrast in English. Bolo is not a common English name, but at the time of writing it is the nickname given to Boudewijn Zenden, a Dutch football player at Sunderland AFC.
[Figure 1.1 appears here: a diagram showing that [l] and [d] are allophones of /l/ in Sesotho but realisations of the distinct phonemes /l/ and /d/ in English, while [pʰ] and [pʼ] are realisations of the distinct phonemes /pʰ/ and /pʼ/ in Sesotho but allophones of /p/ in English.]

Figure 1.1: Sesotho and English allophones
Utterance   Sesotho interpretation      English (NE) interpretation
[li]        /li/ them (obj. concord)    /li/ lea (meadow)
[di]        /li/ them (obj. concord)    /di/ Dee (UK river)
[pʰolo]     /pʰolo/ ox                  /polo/ polo (sport/mint)
[pʼolo]     /pʼolo/ Polo (name)         /polo/ polo (sport/mint)
[bolo]      /bolo/ ball                 /bolo/ Bolo (name)

Table 1.1: Sesotho and English perceptions of the same utterance
Follow-on benefits: further linguistic analysis
A phonemic analysis forms an initial understanding of the phonology of a language and lays the groundwork for further language description (Hayes, 2009, Ch.2; Gleason, 1961, Ch.1,2,17). This could include more phonology, such as detailed analysis of the suprasegmentals; sound patterns that span longer time sequences than phones (such as stress and intonation). A good understanding of phonology can also help with the analysis of word components, i.e. morphology, and the practical task of making dictionaries. In turn, this can lead on to syntactic (sentence structure) and semantic (sentence meaning) analysis. Knowledge of the phonology is also necessary for a detailed phonetic analysis of the language (Ladefoged, 2003, p.1). Historical linguistics depends on a good understanding of sound change, so a knowledge of the contrastive sound system of each language is invaluable for this type of research (Arlotto, 1981).
Follow-on benefits: literacy
Since phonemic analysis can uncover the set of contrastive sounds in a language, i.e. the phoneme inventory, this process can be used to construct an alphabet. In the past this was the principal use of a phonemic analysis (Pike, 1947). Even now, a phonemic analysis is the most flexible and efficient method to establish a writing system (Werner, 2000, p.62; Hayes, 2009, p.47). Of course many other factors are brought into play in developing an orthography, such as morphology, sociolinguistics and government policy, but a phonemic analysis lays the theoretical groundwork. There is still a great need for writing systems since only 42% of all languages are known to have them (Lewis, 2009)2. It can be argued that some languages are better represented with a non-alphabetic writing system, e.g. a syllabary or logosyllabary (Daniels and Bright, 1996, p.4), but even in this situation, conducting a phonemic analysis will help to inform this decision.

[Figure 1.2 appears here: a diagram showing an unwritten language leading, via phonemic analysis, to an alphabet, further linguistic analysis, literacy, and speech technology.]

Figure 1.2: Phonemic analysis enables other important developments
Follow-on benefits: speech technology
Without a writing system, most speech technologies are of little use: speech-to-text and text-to-speech are meaningless if there is no text. And, as argued above, if text is needed, a phonemic analysis is needed. Even without a need for text, most speech recognition tasks will require some underlying symbolic representation which, like text, will presuppose a phonemic analysis. A phonemic analysis also has the potential to improve speech recognition performance on languages that already have writing systems. For example, some accents of English have slightly different phoneme inventories when compared to the inventory of a so-called standard accent commonly used in a speech recogniser. If important contrasts are not reflected in the underlying phoneme inventory then traditional modelling and adaptation techniques (e.g. alternative dictionary pronunciations, speaker adaptation) will always be suboptimal (Huckvale, 2004). For example, a speech recogniser such as CMU Sphinx based on US English with a 39 phoneme inventory cannot fully model the larger inventory of RP English. The solution is to use the phoneme inventory of the target accent. For many accents, this may not be well documented, and a phonemic analysis is needed. This is also true for speech synthesis; knowledge of the phoneme inventory and associated allophonic rules is vital for modelling or adapting the lexicon, although documentation is often lacking (Fitt and Isard, 1999).

Even well documented accents need to be re-analysed at some stage because of sound change. One of the differences between most US accents and RP English is due to a number of changes in the RP accent during the 1700s which culminated in R-dropping (Wells, 1982, p.218): /ɹ/ was lost before consonants and word boundaries. This in turn ended up creating some new vowels in the RP accent. For example, the pronunciation of the word beard changed: /biɹd/ → /bɪəd/ and the diphthong /ɪə/ became a new phoneme. Wells (1982, p.259) states that a similar development in London English with L-vocalisation has the potential to change the vowel system again. For example, the pronunciation of the word milk appears to be changing: /mɪlk/ → /mɪʊk/ and the diphthong /ɪʊ/ could become a new phoneme. A phonemic analysis could be used to detect and characterise such developments.

2 Personal communication (2010) with the Ethnologue editorial team who conducted a database search to confirm this figure
1.1.2 Importance of machine-assisted phonemic analysis
Speeding up routine and tedious tasks
The process of a phonemic analysis involves looking for evidence of contrast between every possible pair of sounds. Although there are short cuts, the full analysis is a lengthy and tedious process (Hayes, 2009, p.40) which would benefit from some automation. The length of time a phonemic analysis takes is difficult to quantify because it depends on a number of factors. Hockett (1955) estimated that it takes an experienced linguist about 10 days of hard work to complete 90% of an analysis, an additional 100 days to complete 99% of the analysis, and sometimes years to achieve 100%. 10 days is also a figure referred to by Pike, who describes it as the length of time for trainee linguists to develop a basic albeit incomplete analysis (Pike, 1947, p.ix). Hayes writes that a full analysis can take years (Hayes, 2009, p.34), often because the linguist fails to notice a rare or difficult-to-hear contrast. Contemporary field linguists3 confirm that such failures can lead to large scale revisions of the phonology, making time estimations difficult. However, there does seem to be some consensus about the 10 day figure for a 90% analysis, not including data collection and interaction with native speakers (which could take up to an additional 10 days). The same field linguists report that languages with particularly complex phonologies can take much longer. There are tools to help speed up the process, such as Phonology Assistant (SIL, 2008) which provides search and sort database functionality specifically for the task of phonemic analysis. It is acknowledged as a useful tool (Dingemanse, 2008). However, it doesn't perform any automated analysis which could further speed up the routine and tedious tasks.
Greater consistency on acoustics and analysis
Cross-language bias is another issue that can affect a linguist's phonemic analysis. It is possible this could be improved with a machine-assisted approach. Each linguist will have a bias towards their mother tongue, or other languages they have experience in, when interpreting the acoustic data. This is particularly the case with difficult-to-hear contrasts. For example, Hayes (2009, p.48) has described the near impossibility for himself as an English speaker of distinguishing dental stops and alveolar stops, which are contrastive in a dialect of Bengali but not in English. Every phonetic transcription will be affected by the bias of the linguist who wrote it. There can also be a bias in the other parts of the analysis. For example, related languages can be very useful because the phonologies are often similar, but there is a danger that the linguist overestimates this effect and takes short cuts in the analysis that are not warranted, giving incorrect results. It is hard to predict the effect of cross-language bias, and there are rarely the resources to perform multiple independent analyses. Of course, a machine based approach may also have certain biases, but these are more likely to be consistent and repeatable. There is also the scope to combine multiple machine based approaches to reduce bias.

3 This section was informed by correspondence with field linguists from SIL International
1.2 What is involved in a phonemic analysis?
In looking to automate phonemic analysis, it is helpful to understand the process in more detail. The process is summarised in Figure 1.3.
The phonetic stage
One of the first stages in a phonemic analysis is to take an impressionistic phonetic transcription of the language. Initially this is elicited from a text or wordlist in a language that is common to both the linguist and language consultant (i.e. the indigenous speaker). The finished wordlist would contain the word in the trade language and a phonetic representation of the target language. The wordlist is carefully chosen to reduce the chance of including loan words; e.g. words such as dog, louse, tree (Swadesh, 1955) are used rather than modern words such as computer. Loan words could introduce phonological patterns that are not fundamental to the language. It is important to capture as much detail of the sounds as possible, since it is not known beforehand which sounds are contrastive (Gleason, 1961). For example, if there was no prior information about English (or Sesotho) phonology, all the sounds such as [l,d,pʰ,pʼ] would need to be carefully transcribed. This is usually done by an experienced phonetician, who tries to be objective in minimising phonological bias from their knowledge of other languages. As stated earlier this can be a challenge, especially when attempting to detect possible contrasts not in the phonetician's language (Pike, 1947, p.67). When identifying a sound sequence, the appropriate number of phone segments will also be identified. This segmentation can be ambiguous. For example, the same utterance of the word year in RP English might be transcribed as [jɪə] or [jə]. This ambiguity can often be clarified in the subsequent stage of analysis. It is not always clear where a phone starts and ends, but once this has been decided, the phonetic transcription can be aligned with the audio data. This procedure of alignment is optional but it can be helpful for acoustic analysis such as vowel formant plots (Ladefoged, 2003, p.192).
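Such a wordlist lends itself to a very simple machine-readable representation. The Python sketch below is purely illustrative (the glosses, transcriptions and timings are invented, and this is not the data format used by any particular tool); it assumes each entry stores the trade-language gloss, the narrow phone sequence, and an optional phone-level alignment:

```python
# Illustrative only: a minimal representation of an elicited wordlist.
# Glosses, transcriptions and timings below are invented examples.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class WordEntry:
    gloss: str                 # word in the trade language
    phones: List[str]          # narrow transcription, one symbol per segment
    alignment: Optional[List[Tuple[float, float]]] = None  # (start, end) seconds per phone

wordlist = [
    WordEntry("dog",   ["d", "ɒ", "ɡ"], [(0.00, 0.08), (0.08, 0.21), (0.21, 0.30)]),
    WordEntry("louse", ["l", "aʊ", "s"]),  # the alignment is optional
]
```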
[Figure 1.3 appears here: a flow diagram of the stages in a phonemic analysis, running from language consultants through a phonetic stage (phone recognition, alignment, phonetic transcript) to a phonology stage (interpretation of CV & sequences, phonetic similarity, minimal pairs, complementary distribution, near-minimal pairs, free variation & inventory symmetry), yielding the phoneme inventory (and allophone rules), with dashed feedback loops for eliciting new data, perception experiments, and correcting or reinterpreting the acoustics, and with suprasegmentals, morphophonemics and knowledge from related languages (including a proto language) as additional inputs.]

Figure 1.3: The stages in a phonemic analysis. Procedures written in blue (or grey if in monochrome) are those investigated in the thesis.
The phonology stage
Once a detailed phonetic transcript has been attempted, the analysis is primarily phonological. It is helpful to first identify ambiguous sounds (Burquest, 2006, p.164; Bartram et al., 2008). A sound may be ambiguous because it is unclear if the sound is behaving as a vowel or consonant. Or a sound may be ambiguous because it is unclear if the sound is behaving as a single phone or a sequence of phones. As in the example above with the word year, there may also be ambiguity as to whether a phone exists or is merely the manifestation of a transition from one phone to another. There are a number of lines of evidence that can help to indicate the best interpretation. A common approach is to investigate syllable structure (Burquest, 2006, p.155). Some conclusions about the syllable structure can be reached by analysing unambiguous phones, which can in turn help with ambiguous phones. For example in Kua-nsi, a Tibeto-Burman language (Castro et al., 2010), the analysis of unambiguous phones such as open vowels and stop consonants indicates that syllables are constrained to a simple structure without consonant clusters. However, there is an apical voiced fricative [z̩ʲ] following some consonants such as [s] that appears to violate this constraint. But there is no violation if this ambiguous fricative is treated as a vowel rather than a consonant. In fact this is so common in Sino-Tibetan languages that sinologists use a dedicated (non-IPA) symbol for this vowel [ɿ]. In Kua-nsi a vowel interpretation is confirmed by acoustic evidence, i.e. the sound behaves as a syllable nucleus and is consistently tone-bearing. The word blood [sɿ²¹] is an example (where the superscript numbers indicate a low and falling pitch (Chao, 1930)).

The syllable structure of Kua-nsi also helps to clarify ambiguities regarding possible sequences of phones. The simple syllable structure is apparently violated by the pair [pf]. However there is no violation if it is treated as a single consonant affricate [p͡f]. Sometimes the interpretation is a choice between equals. When this is the case, whatever choice is made, it is important to be consistent; further stages of analysis can clarify whether the decision was correct4.

After deciding on an initial interpretation of ambiguous sounds, a comparison of every sound can be made. Strictly, every phone needs to be compared against every other phone to determine whether they are phonemically distinct or not. However, in practice sounds that are phonetically very distant from each other are assumed to be phonemically distinct, e.g. [t] and [m]. Relying on some notion of phonetic similarity is sometimes implicit in a phonemic analysis, but it is always important (Pike, 1947, p.69; Burquest, 2006, Ch.2; Hayes, 2009, p.54).

The principal method of determining a contrast between sounds is to find minimal pairs. These are pairs of different words that only differ by a single phone. Finding such words establishes that the phonetic difference between the two phones is contrastive. For example, consider the two English words:
[sɪp] sip
[ʃɪp] ship
These two words establish that the phones [s] and [ʃ] contrast with each other. However, it is important to look for more than one minimal pair. Phonetically close sounds that cannot be shown to contrast using the minimal pair method could be allophones. For example, in Sesotho it is not possible to find minimal pairs that show a contrast between [d] and [l]. Their status as allophones can be confirmed if they can be shown to be in complementary distribution, meaning they appear in mutually exclusive phonetic environments. Testing for this involves listing the environments for each phone, i.e. the preceding and succeeding sounds. When this is done on Sesotho it becomes clear that [d] only occurs before high vowels, and [l] occurs everywhere else. This complementary distribution confirms that the two sounds do not contrast, and instead there is an allophonic relationship between them; they are both realisations of the /l/ phoneme. A sketch of both of these distributional checks in code is given after this section.

At this stage, if there is still uncertainty, other less definitive analysis procedures can be used. This includes near-minimal pairs, checking for free variation, and looking for inventory symmetry. More information on these procedures can be found in Burquest (2006) and Hayes (2009).

An initial investigation of suprasegmentals is needed because they can play a part in the identification of phonemes. For example, vowel harmony in Chadic is a prosodic process, but it is a factor in deciding how many vowels there are. Likewise tone and voicing interaction can
affect the number of phonemes.5 When investigating suprasegmentals, the linguist should also be aware of autosegmental phonology. This is a phenomenon where certain features such as tone or nasality can be viewed as acting independently from particular phones.

If there is some existing knowledge of the morphology of the language then the use of morphophonemics can also help. Morphophonemics covers the important interactions between morphology and phonology. An understanding of this interaction can help in the process of phonemic analysis, particularly in the area of phonological alternation and for suggesting diagnostic wordforms to elicit new data from a speaker. This diagnostic approach allows a fairly precise rearrangement of phoneme sequences which can help determine how particular phonemes behave in specific environments (Hayes, 2009, p.123).

Related languages often have similar phonologies (see for example Castro and Chaowen, 2010), so they can be helpful for suggesting hypotheses about the phonology that can be tested, rather than starting from scratch. Sometimes a historical analysis will have been conducted in the language family, giving a hypothesised proto form. Although care should be taken with such a hypothesis, a proto language can be very useful because the phonology under study could be related to the proto language via some simple transformations.

This phonology stage of a phonemic analysis is an iterative one. For example, it's possible that mistakes will be made in the interpretation stage that will only become clear later in the analysis. When this happens the linguist will go back and try an alternative interpretation. There can also be iteration in the wider process and this is shown in Figure 1.3 as dashed lines. Sometimes there needs to be a correction to a transcription or a reinterpretation of the original acoustic recording (or video). Sometimes further work with the language consultants is needed, e.g. to conduct a perception experiment, or to elicit new data. This interactive process could also include informal conversation with the speakers.

4 Pearce, M. (2012), personal communication
5 Pearce, M. (2012), personal communication
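The two distributional checks described above, searching for putative minimal pairs and listing environments to inspect complementary distribution, are mechanical enough to sketch in code. The following Python fragment is a minimal illustration only (it is not the software developed in this thesis), with hypothetical function names and toy transcriptions:

```python
from collections import defaultdict
from itertools import combinations

def minimal_pairs(words):
    """Return (gloss1, gloss2, position) triples for transcriptions that
    differ in exactly one phone, i.e. putative minimal pairs."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(words.items(), 2):
        if len(p1) != len(p2):
            continue
        diffs = [i for i, (a, b) in enumerate(zip(p1, p2)) if a != b]
        if len(diffs) == 1:
            pairs.append((g1, g2, diffs[0]))
    return pairs

def environments(words, phone):
    """Count the (preceding, following) contexts of `phone`; '#' marks a
    word boundary. If two phones have mutually exclusive context sets,
    they are candidates for complementary distribution."""
    envs = defaultdict(int)
    for phones in words.values():
        padded = ["#"] + list(phones) + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                envs[(padded[i - 1], padded[i + 1])] += 1
    return dict(envs)

# Toy data (illustrative transcriptions only)
words = {"sip": ["s", "ɪ", "p"], "ship": ["ʃ", "ɪ", "p"], "lip": ["l", "ɪ", "p"]}
print(minimal_pairs(words))      # includes ('sip', 'ship', 0): the [s]-[ʃ] contrast
print(environments(words, "ɪ"))  # {('s', 'p'): 1, ('ʃ', 'p'): 1, ('l', 'p'): 1}
```

On the toy data, sip/ship surfaces as a minimal pair at position 0, establishing the [s] and [ʃ] contrast described above; a real analysis would of course run these checks over a full transcribed wordlist.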
1.3 Can a machine-assisted approach help?
The above background information on phonemic analysis leads to the following scientific question:
“To what extent can a machine algorithm contribute to the procedures needed for a phone- mic analysis?”
The analytical procedures investigated and evaluated in this study are:
• Phonetic similarity
• Phone recognition and alignment
• Complementary distribution
• Minimal pairs
To answer this question, a suitable evaluation metric for accuracy is needed. This is introduced in Chapter 3. Time savings also need to be considered and this is discussed in Chapter 7. In evaluating these individual procedures a secondary question emerges:
“What insights does such a quantitative evaluation give about the contribution of each of these procedures to a phonemic analysis?”
1.4 Scope of thesis
The structure of the thesis is driven by the scientific question above. The subsequent chapter is an investigation into previous related work, and further chapters are devoted to each of the above four procedures. A final series of discussions and conclusions forms the last chapter. The title of this thesis is "Machine-Assisted Phonemic Analysis"; the overall aim is not to fully automate the analysis, but to provide a useful tool to the linguist. There are other procedures in a phonemic analysis that are not investigated in the thesis, such as those written in Figure 1.3 that are not highlighted. The four procedures stated above were chosen because they are the most mechanical and tedious procedures to perform manually, and would benefit the most from becoming partly automated. Even restricting the scope to these four procedures, there is much uncharted territory, and the emphasis in this thesis is to cover a lot of ground, sometimes at the expense of depth, because it was decided that this was the best way to contribute to a field that has received little attention in the past.

The techniques developed in this thesis are relevant to all speech sounds, but parts of the evaluation focus on consonants (especially Chapters 3, 5 and 6). This is because, for the vowel data, there is some variability or uncertainty in the vowel ground truth labels in the best corpora currently available. For example, the TIMIT corpus of US English (Garofolo et al., 1993) covers a range of dialects and idiolects, some exhibiting the caught/cot merger and some not (Labov et al., 2006). Also, in the Kua-nsi corpora (Castro et al., 2010; Lehonkoski et al., 2010) there is currently some uncertainty regarding the phonology of the high back unrounded vowels in the dialect used for the phonetic transcription. There is much more certainty about the phonology of the consonants. As more structured data becomes available in the future, vowels can be similarly evaluated. In the meantime it should be noted that the mean ratio of consonants to vowels over all languages is estimated at 4.25 (Dryer and Haspelmath, 2011, Ch.3), thus the lack of experiments on vowels in parts of the evaluation should not be regarded as a major problem.
A phonemic analysis is needed in at least two different practical scenarios. In finalising a writing system for a language, it is important that the phonemic analysis is as accurate as possible. In this scenario, interaction with mother-tongue language speakers will be extensive and last over a prolonged period of time. A different scenario exists when conducting a survey of a number of dialects or languages, where it can still be helpful to provide a rough sketch of the phonology. This must be done with less interaction from the speakers, because there is often only time for a single short visit to each village. The approaches developed in this thesis can be used in both scenarios but they are most relevant to the latter one. In this survey scenario a single pre-defined word list is provided, and the task is to conclude as much about the phonology as possible before further interaction with mother-tongue speakers.

The phonological framework used in this study is very much affected by the practicalities of developing a writing system for a language. The phonology is literacy and alphabet-orientated, and therefore segmental. The focus of the machine-assisted analysis is on processes that are phonologically very close to surface forms. The output is expressed as allophonic relationships between sounds rather than multiple levels of rules. This allows the linguist to finish the analysis in whatever phonological framework is appropriate, whether that is rule-based (e.g. classic generative phonology) or constraint-based (e.g. optimality theory).

The emphasis on endangered languages should not detract from the much wider area of application of this work. Endangered and under-documented languages present an interesting problem where there is very little linguistic information available; often only the speech itself. Tackling such a problem requires engagement with the fundamentals of spoken language without making language-specific assumptions common to most speech recognition research and some computational phonology research. This means a phonemic analysis is relevant to all languages and a machine-assisted approach to phonemic analysis could have a wide impact. So although the main application of this work is in language conservation, there is also a fresh perspective on areas of application for common languages such as English and its variety of accents.
1.5 Definitions of key phonological terms
For the purpose of clarity, some of the key phonological terms used in this thesis are defined below. There can be some variation in the literature so the principle here is to follow the defi- nitions of Hayes (2009) where they are available, and to use other sources where it is believed they can form part of a compatible framework.
• Phoneme – a basic speech sound, a minimal unit serving to distinguish words from each other; an abstract phonological category (Hayes, 2009, p.20, p.23).
• Allophone – variant of a particular phoneme; a concrete observable sound (Hayes, 2009, p.23).
• Phone – a speech sound; the smallest discrete segment of sound in a stream of speech (Oxford, 2010).
• Free variant (as in free variation) – allophone unconditioned by its phonetic environment, i.e. freely fluctuating (Clark et al., 2007, p.116; cf. Hayes, 2009, p.59).
• Contrastive – used to differentiate between different morphemes (Gussenhoven and Jacobs, 2005, p.49).
• Phonemically distinct – belonging to different phonemes (e.g. in English [h] and [ŋ] do not contrast but they do belong to different phonemes (Hayes, 2009, p.54)).
The last definition has been made more specific for the purpose of this thesis. Other authors may possibly view the last term as being equivalent to the term contrastive (Hayes, 2009, p.20).
1.6 Chapter summary
Half of all languages risk disappearing by the turn of the century. The process of a phonemic analysis can help in the documentation of these languages by describing the contrastive sounds. Benefits to the language (whether endangered or non-endangered) include further linguistic analysis, literacy, and speech technology. A machine-assisted approach to phonemic analysis has the potential to greatly speed up the process and make the acoustic analysis more objective. The scientific question of this thesis is "To what extent can a machine algorithm contribute to the procedures needed for a phonemic analysis?" and the procedures investigated in this study are highlighted in Figure 1.3. A secondary question is "What insights does such a quantitative evaluation give about the contribution of each of these procedures to a phonemic analysis?". The scope of this study is focused on assisting a linguist in completing a phonemic analysis rather than automating it. The primary scenario for using this tool is expected to be in a survey to obtain a rough sketch of the phonology. The practicalities of the scenario mean that the phonological framework employed is primarily pragmatic in developing a writing system but is flexible enough for the linguist to complete the analysis in whatever theoretical framework is appropriate. Although the emphasis in this thesis is on endangered languages, the principles and practical methods for deriving a phonemic analysis apply to all languages.

Chapter 2
Related work
2.1 Literacy and endangered languages
Languages can be endangered, under-documented, or unwritten. It is helpful to know if and how these factors are related, especially since it has been proposed that a phonemic analysis has a bearing on the latter two issues.

A UNESCO report (Brenzinger et al., 2003) identifies nine key factors in determining language vitality. These are shown in Table 2.1. In the report, the authors are careful to state that no single factor should be taken on its own in assessing language vitality. However they acknowledge that the first factor, intergenerational language transmission (i.e. parents transmitting the language to their children), is the most common one for making an assessment, and the first six factors are especially useful for assessing language vitality. The other factors are less so; factors seven and eight are for assessing language attitudes, and factor nine is for assessing the urgency for documentation. This grading of factors is shown in the second column of the table. Grenoble and Whaley (2006) also make an assessment, which is shown in the last column. Grenoble and Whaley agree with the UNESCO assessment that intergenerational transmission is the strongest factor. There is agreement that documentation per se is not a strong factor, and both studies take the view that the availability of literacy materials is strongly related to language vitality. There has also been a study evaluating the UNESCO framework by investigating its use on 100 languages (Lewis, 2006). It was found that, although definitions could be clarified, generally the vitality factors were suitable for characterising endangered languages. The publication by Lewis (2006) includes the raw data for the 100 languages. Although not an intended goal of the original publication, it gives an opportunity to investigate the correlation between UNESCO factors. Given the sampling criteria that Lewis uses (no large international languages, 20 languages per continent, countries with high language diversity and several languages which have been subject to revitalization efforts), an attempt can be made to reach some conclusions about the relationships between the factors.
Factor                                                            UNESCO     G & W
1  Intergenerational language transmission                        strongest  strongest
2  Absolute number of speakers                                    strong     weak
3  Proportion of speakers within the total population             strong     strong
4  Shifts in domains of language use                              strong     strong
5  Response to new domains and media                              strong     strong
6  Availability of materials for language education and literacy  strong     strong
7  Governmental and institutional language attitudes and
   policies including official status                             weak       strong
8  Community members' attitudes toward their own language         weak       strong
9  Amount and quality of documentation                            weak       weak

Table 2.1: UNESCO language vitality factors including assessment by Grenoble and Whaley
The analysis for the work of this thesis began by addressing the sparse nature of the data. For each of the 100 languages in the original study, there is not always data for every factor. Little's MCAR test (Little, 1988) was used as the appropriate test for assessing the impact of the missing data. The test showed that the null hypothesis, i.e. that the data was missing completely at random, could not be rejected. This meant a further analysis of correlation could proceed. Since the factors used ranked values, an analysis of correlation between factors was conducted with Spearman's rho, using pairwise deletion for missing values. This gave a number of apparent correlations but, because of the issue of multiple comparisons, some of these may occur by chance. To allow the reader to test a hypothesis, a single pair of factors should be chosen to check for correlation before viewing the results in Figure 2.1 (only statistically significant correlations are shown, i.e. p<0.05). The chart is similar to the triangular mileage charts in street atlases which have the cities listed on the diagonal. Here the UNESCO factors are listed in place of cities, and the correlations in place of the mileage. Given that intergenerational transmission was already judged to be the strongest factor of language vitality (Table 2.1), and that the factor of particular interest is literacy, it is legitimate to test for a single correlation between factors 1 and 6. Figure 2.1 shows there is a small but significant correlation between these two factors (ρ=0.36, p<0.05). Note that factor 6 is not just about materials but does include literacy per se: "[a score of 4 means] ... children are developing literacy in the language" (Brenzinger et al., 2003). For a more exploratory data analysis, Figure 2.2 highlights the correlations that remain after the Bonferroni correction for multiple comparisons. Two clusters are shown, with literacy related to documentation only.
Figure 2.1: Correlations of the UNESCO factors based on Lewis's data, indicating a link between literacy materials and intergenerational transmission of language (only single previously decided comparisons can be made). See Table 2.1 for a fuller description of the factors.
Without any prior decision to test for a specific correlation, only the factors within clusters can be said to be correlated. If there was a need to give a similar assessment of factor strength as in Table 2.1, given the assumption that intergenerational transmission (factor 1) is the strongest, the correlations suggest that only the proportion of speakers, domain loss, and community attitudes (factors 3, 4, 8) are also strong factors.
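In outline, the correlation analysis described above can be reproduced as follows. This is a sketch under assumed variable names (the real factor scores come from Lewis (2006) and are not reproduced here), using pairwise deletion for missing values and a Bonferroni-corrected significance threshold:

```python
import numpy as np
from scipy.stats import spearmanr

def pairwise_spearman(scores):
    """scores: (n_languages, n_factors) array with np.nan for missing
    values. Returns (rho, p) matrices computed with pairwise deletion,
    i.e. each factor pair uses only the languages scored on both."""
    n = scores.shape[1]
    rho = np.full((n, n), np.nan)
    pval = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(i + 1, n):
            both = ~np.isnan(scores[:, i]) & ~np.isnan(scores[:, j])
            if both.sum() > 2:
                rho[i, j], pval[i, j] = spearmanr(scores[both, i], scores[both, j])
    return rho, pval

# With 9 factors there are 9 * 8 / 2 = 36 pairwise tests, so the
# Bonferroni-corrected threshold is alpha / 36.
n_factors = 9
threshold = 0.05 / (n_factors * (n_factors - 1) // 2)
```

A single pre-registered test (factors 1 and 6 above) is judged against the uncorrected alpha of 0.05, whereas the exploratory all-pairs scan in Figure 2.2 uses the corrected threshold.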
This statistical analysis of correlation is an important contribution in assessing the relative importance of the UNESCO language vitality factors. The results have a bearing on the link between literacy and vitality which is relevant to this thesis, but it is also expected that the findings reported here will be of significance in the broader area of language conservation.
A recent proposal for a new evaluative framework for language endangerment (Lewis and Simons, 2010) suggests that literacy is an important factor for vitality, but only when there is a sufficient level of intergenerational transmission already. Modelling such a dependency may provide evidence of stronger correlation, but there was not enough data in the 100 language sample to come to a conclusion either way. More surveys are needed. However, for the moment there does seem to be a consensus (and some statistical evidence) that literacy is a factor in language vitality.
It is difficult to come to a reliable figure of how many languages are unwritten. As noted in Chapter 1, a search of the Ethnologue database (Lewis, 2009)1 reveals that 42% of languages are recorded to have at least one writing system. This leaves 58% that are either unwritten or lacking information on their literacy status. Therefore there is arguably a great need for any tool that will help develop writing systems.
1 Personal communication (2010) with the Ethnologue editorial team who conducted a database search to confirm this figure
[Figure 2.2 appears here: a graph of the correlated factors, showing two clusters — 1. Children, 3. Proportion, 4. Domain loss and 8. Attitude in one; 6. Education & literacy and 9. Documentation in the other.]

Figure 2.2: Correlated UNESCO factors based on Lewis's data after corrections for multiple comparisons. See Table 2.1 for a fuller description of the factors.
2.2 Phonemic analysis
The practice of phonemic analysis has its roots in the American structuralist tradition (Sapir, 1925). Pike (1947) outlines the process in detail in a book aimed at field linguists developing writing systems. It contains many pragmatic heuristics and includes a number of exercises and drills, although most are based on a hypothetical language. A briefer and more formal account from the structuralist school is given by Gleason (1961) (cited in Hayes (2009)).

The concept of the phoneme has been controversial in phonology, and therefore the process of phonemic analysis has also been questioned. In the early years, phonemic analysis was developed in the framework of taxonomic phonemics, which stressed a taxonomy of three levels that should be kept separate: phonetics, phonemics, and morphophonemics (Odden, 2005). Transformations between levels had different mathematical rules, such as the principle of biuniqueness which was enforced between the phonemic and phonetic levels. This meant that an allophone could only belong to one phoneme. Biuniqueness was not enforced at the morphophonemics-phonemics boundary, which allowed neutralization at this higher level (e.g. word-final devoicing in German). There was a growing optimism about what could be achieved with this formalised method. Phonemic analysis was seen as a simple mechanical process which potentially could be completely automated, for example without any reference to minimal pairs (Bloch, 1948), and with an almost exclusive use of complementary distribution (Harris, 1951).

Halle (1959) showed that keeping a strict separation between morphophonemics and phonemics didn't work; it produced overly complicated and unintuitive rules for voicing assimilation in Russian. In giving a comprehensive account of these historical developments, Odden (2005) points to a similar problem with US English flapping. A strict separation between levels is not tenable because it predicts the nonsensical interpretation that there is a separate flap phoneme in US English. Without a strict separation between levels of analysis, biuniqueness could no longer be held on a single level as a rule for allophones. Alongside other ambiguities this resulted in a less deterministic procedure for phonemic analysis. With the abandonment of taxonomic phonemics there was now less optimism about automation. Despite the change of emphasis, the contemporary procedure of a phonemic analysis (Burquest, 2006; Hayes, 2009) is still very similar to earlier work. There is still a structured procedure for the linguist to follow, but rather than steps of self-contained analysis, there is an emphasis on iterative cycles spanning the whole process and bringing in many knowledge sources as described in Chapter 1.
2.3 Directly relevant work
2.3.1 Software to help with phonemic analysis
There has long been a recognised need for computer tools to assist with phonemic analysis. From conversations with a number of field linguists, and a search through the literature, it appears that the first tool available to help in a phonemic analysis was Findphone, developed at SIL in the UK during 1984 (Hunt, 2008). At the heart of this tool was a powerful search feature to find any combination of transcribed phones in any environment. After the final MS-DOS version in the mid-90s (Bevan, 1995), a number of attempts were made to fill the void on other platforms. However, linguists valued the functionality of Findphone to such an extent that some were still using it a decade later. It was at this point that a suitable successor for the Windows platform was developed which matched that functionality (Hunt, 2008). This is called Phonology Assistant and is now Unicode compliant (SIL, 2008). Given a phonetically transcribed word list, Phonology Assistant provides a range of database tools to help with the analysis. A phone inventory is automatically derived and can be displayed as consonant and vowel charts with histograms. Regular-expression-like searches that include articulatory features allow environment charts to be quickly explored. There is also functionality to help with identifying minimal pairs. All the analysis is on transcripts, but if there are audio recordings associated with the word list, these can be played. The interface allows all this functionality to be linked together in an intuitive way, which has been particularly appreciated by linguists (Dingemanse, 2008). Currently there are two other similar tools: Dekereke (Casali, 2009), which has a particular strength in investigating phonotactic generalisations, and PTEST (Phonology Template Editor and Search Tool) (SIL, 2010), which has the ability to search for a number of predefined phonological rules and tabulate the results in a report. Both these tools have functionality that is targeted at African languages. There are a number of tools to assist the linguist in phonetics, especially in the area of detailed acoustic analysis. A notable example is Praat (Boersma and Weenink, 2011), a powerful tool with a long history of usage by linguists.
All this software certainly helps speed up the work of a phonemic analysis. The functionality is particularly tailored to phonetic data, but it is essentially database functionality such as search and sort that is being offered. The tools stop short of doing the analysis themselves. It is possible that tools making use of computational phonology could help the linguist further by performing part of the analysis automatically.
2.3.2 Computational phonological analysis
Much current work on computational phonology has its focus close to the phonology-morphology boundary, e.g. as evidenced by the majority of publications from the Association for Computational Linguistics (ACL) special interest group on computational morphology and phonology (SIGMORPHON). The few experiments closer to the phonology-phonetics boundary have been carefully supervised, i.e. the learning algorithm has knowledge of both the underlying and surface forms, whether the learning algorithm is a traditional transducer (Gildea and Jurafsky, 1996) or is in the optimality theory framework (Boersma et al., 2003). Peperkamp et al. (2006) investigated the problem of discovering allophones and their associated rules without knowledge of underlying forms: a much more unsupervised process of learning. The study was conducted in the context of modelling infant language development. It is also particularly relevant to a phonemic analysis where the linguist does not know a priori what the underlying forms are. One limitation of this particular study was that the phonetic data was synthetically derived from a phonemic transcription in the first place. As in the other two learning studies (Gildea and Jurafsky, 1996; Boersma et al., 2003), it was found that the general learning algorithm benefited from linguistically motivated biases in the learning process. There has been a small amount of work looking at automatic speech recognition assisted computational phonology. For example, Tajchman et al. (1995) calculated the probability of the occurrence of pre-defined phonological rules on a speech corpus, and Lin (2005) investigated the link between speech recognition and infant speech perception using acoustically derived sub-word units (ASWUs). Interestingly, at the end of his PhD thesis Lin states that "allophones, [and other phonological phenomena]... may eventually find their place in the model". Related to this, exemplar theory (Bybee, 2001) has a greater emphasis on surface forms. Kirchner et al. (2010) describes experiments on pattern entrenchment, e.g. where a cluster of exemplars for a particular word, already sharing a bias for a phonological pattern, becomes more biased over time. These exemplars were whole words of actual acoustic data from a single speaker. It is hoped that in future their algorithm will work across all input data without the artificial supervision of keeping clusters separate for each word. The emergent structure described by Kirchner et al. (2010) can be viewed as ASWUs (see Section 2.4.1). Within this field of computational phonology, the work of Peperkamp et al. (2006) is most relevant to the problem of phonemic analysis. The above studies incorporating acoustics are also helpful because they indicate that it is possible to integrate such data into the model. However, if a system is going to be built that can robustly handle large amounts of diverse acoustic data (rather than investigating highly supervised one-off experiments), it is helpful to look at the associated field of automatic speech recognition.
2.4 Speech recognition technology
2.4.1 Multilingual acoustic modelling for ASR
Automatic speech recognition (ASR), especially large vocabulary continuous speech recognition (LVCSR), most commonly uses phoneme-orientated acoustic models (Ostendorf, 1999). These models are typically trained with acoustic data alongside sequences of phoneme labels derived from the pronunciation dictionary. A model trained for each phoneme is called a context independent or monophone model. If multiple models are built for each phoneme depending on the neighbouring phonemes, then the models are called context dependent. Context dependent acoustic models such as tied state triphone models are generally more accurate than monophones because they model the contextually conditioned variation (Jurafsky and Martin, 2008). This basic approach to acoustic modelling has been shown to work well across different languages by a number of research groups (Schultz and Kirchhoff, 2006, p.76). The acoustic features are usually MFCCs (mel frequency cepstral coefficients), although for tone languages pitch information is often also added (Lei et al., 2009). The success of sharing a common architecture across languages has prompted some to investigate whether models can be shared across languages as well (Schultz and Kirchhoff, 2006, p.77). There is an inherent difficulty with direct sharing of phoneme-based models across languages, because a phoneme is specific to a language. This is why phoneme-based models do not transfer well between languages (Williams et al., 1998). When Schultz and Waibel (1998) combined phoneme-based models in an unsupervised manner into a decision tree, they found that many of the initial questions in the tree were about the language, confirming the language specific nature of a phoneme.
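The difference between the two model types can be made concrete by how the labels are generated. The sketch below expands a phoneme string into context dependent labels in the common HTK-style left-centre+right notation; the phone names and the functions are toy illustrations rather than the systems cited above.

```python
def monophone_labels(phones):
    """Context independent labels: one model per phoneme."""
    return list(phones)

def triphone_labels(phones):
    """Context dependent labels in left-centre+right notation,
    padding the word edges with a silence model."""
    labels = []
    for i, p in enumerate(phones):
        left = phones[i - 1] if i > 0 else "sil"
        right = phones[i + 1] if i < len(phones) - 1 else "sil"
        labels.append(f"{left}-{p}+{right}")
    return labels

phones = ["k", "ae", "t"]            # "cat" in a toy phone set
print(monophone_labels(phones))      # ['k', 'ae', 't']
print(triphone_labels(phones))       # ['sil-k+ae', 'k-ae+t', 'ae-t+sil']
```

In a real system the resulting triphone labels are then clustered into tied states, so that rare contexts can share training data with acoustically similar ones.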
Phone-based modelling
There have also been studies on building phone-based models for multiple languages, and some phoneme-based attempts probably fit in this category. This approach uses similar sounds, or phones, that are shared between languages. These range from specific multilingual models which are aimed at model sharing between a closed set of known languages (Burget et al., 2010), to cross-language transfer where there is no labelled training data in the target language. Cross-language transfer has been attempted with a single source language (Lööf et al., 2009), where
other experiments have made use of multiple source languages (Vu et al., 2010).

Figure 2.3: Multilingual Acoustic Models (*ASWU = acoustically derived sub-word units); feature-based, phone-based and ASWU-based models ordered along a single axis, from more portable to more accurate in the specific domain.

Since experiments have been conducted on different corpora, it is not clear which approach is superior; but both techniques work well, with performance roughly at 20% word error rate on read speech. These studies, and most other work on cross-language transfer, assume a certain amount of knowledge about the target language, such as pronunciation dictionaries, or at the very least a phoneme inventory (Schultz and Waibel, 1998; Kohler, 1998). As far as the author is aware, only one study has looked at acoustic modelling without knowledge of the phoneme inventory: a study by Walker et al. (2003) included an experiment that made no assumptions at all about the target language. Using ten languages, a Universal Phone Recognizer was built with a set of 107 base phones, and trained on conversational telephone speech. This was then tested on speech in the Farsi language and resulted in an 87% PER (phone error rate). When knowledge of the inventory was included, this improved to 73% PER. Clearly, cross-language phone recognition is already a challenging task, but even more so when there is minimal knowledge of the target language. This is because there are more sounds to distinguish when the phoneme inventory is not known. Ideally every possible phonetic contrast that might occur in a language needs to be detected.
Acoustically derived sub-word unit (ASWU) based modelling
There have been other attempts at language independent models that are not explicitly phone based. One approach that makes the fewest assumptions about the nature of speech is the use of acoustically derived sub-word units (ASWUs). These units can either be concrete exemplars of speech (Moore et al., 1983; De Wachter et al., 2007) or generalizing models (Paliwal, 1990). The units are derived directly from the speech signal, either in a supervised fashion with words labelled, or unsupervised where there is no transcription at all. ASWUs have worked well for specific tasks, but as a consequence of this data-driven approach ASWUs originally tended to be speaker dependent (Ostendorf, 1999); performance dropped when moving to other speakers.
One attempt at creating a supervised speaker-independent system integrated the dictionary building stage into the data-driven process (Bacchiani and Ostendorf, 1999). This has led to performance that is similar to phone based systems on simple tasks such as the DARPA resource management task. One attempt at a larger vocabulary problem involved a hybrid system combining both supervised speaker independent ASWUs and phone-based models (Bacchiani and Ostendorf, 1998). This was only marginally better than the baseline system and fell far short of the performance of contemporary systems that were purely phone-based (Zavaliagkos et al., 1998). Attempts at using fully unsupervised ASWUs with speaker adaptation have shown some promise, but only on a small dataset (Varadarajan and Khudanpur, 2008). The difficulty of handling multiple speakers is reflected in the similar field of acoustic pattern discovery (Park and Glass, 2008), where acoustically derived units that can be longer than words are derived from repeated patterns. Most work in this area of unsupervised pattern discovery has been speaker dependent (Kempton, 2007; Park and Glass, 2008). Preliminary experiments indicate that handling multiple speakers may require a more supervised approach where utterances are given a semantic label (Aimetti et al., 2009, 2010). Moore (2007) suggests speaker normalisation could be achieved using recognition-by-synthesis, where a vocal tract model attempts to mimic the input received. Units could then be derived from these motor sequences. When performing the analysis to create ASWUs, it can be informative to inspect exactly what units have been chosen. When the algorithm is allowed to choose a small number of units, e.g. equivalent to the number of phonemes in the language, often there is a rough correlation with broad phone classes, e.g. approximants, fricatives, nasals (Lin, 2005, p.25,65-67), (El Hannani, 2007, p.48). When the algorithm is allowed to choose many more units, there is some evidence of a correlation with individual phone-model states and certain allophonic details (Bacchiani and Ostendorf, 1999; Varadarajan et al., 2008). With highly supervised conditions using minimal pairs on a single speaker it may be possible to identify phoneme-like units (Moore et al., 1983); but there does not appear to be any evidence of deriving the phonemes of a language with unsupervised ASWUs. When faced with an underdocumented language, there is a clear appeal in unsupervised ASWUs for building a recognition system, because there is no need for transcription. However, if the units selected risk being speaker dependent and of minimal linguistic relevance, the approach is unlikely to help identify the contrastive sounds of a language. Despite the difficulties in using ASWUs for multiple speakers, there has been a suggestion that the same ASWUs could be used across multiple languages (Chollet et al., 1999). The only area where this appears to have been attempted is language identification of speech (see Section 2.4.2). Petrovska-Delacrétaz et al. (2003) used a common set of 64 ASWUs that were shared between Swiss French and Swiss German. However, the results were not very successful, partly because over 30% of the test data did not register as containing any of the relevant ASWUs. In a more comprehensive experiment, Li et al. (2007) used a set of ASWUs shared between the languages of the 2003 NIST language recognition evaluation.
This resulted in an equal error rate of 4%, which was not quite as good as the 1.8% equal error rate for a phone-based system (Matějka et al., 2006). These results seem to confirm that ASWUs can give good accuracy for the domain they were trained in, but outside that specific domain they appear to lack robustness and perform worse than phone-based models.
Feature-based modelling
Another approach to language independent acoustic modelling is to use linguistically-based features. The features most commonly used for this purpose are articulatory features (AFs). These describe the approximate state of the speaker's articulators and can be viewed as the components of a phone. For example, the English phones [s] and [z] might be characterised as follows:
[s]                   [z]
manner = fricative    manner = fricative
place  = alveolar     place  = alveolar
voice  = no           voice  = yes
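Expressed in code, such a characterisation is naturally a mapping from feature names to values. The sketch below restates the [s]/[z] example; the feature set is deliberately simplified for illustration.

```python
# Articulatory feature characterisation of two English phones
# (a simplified feature set, for illustration only).
PHONE_FEATURES = {
    "s": {"manner": "fricative", "place": "alveolar", "voice": "no"},
    "z": {"manner": "fricative", "place": "alveolar", "voice": "yes"},
}

def differing_features(a, b):
    """Return the set of features on which two phones disagree."""
    fa, fb = PHONE_FEATURES[a], PHONE_FEATURES[b]
    return {f for f in fa if fa[f] != fb[f]}

print(differing_features("s", "z"))  # {'voice'}
```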
The advantage of AFs over phones is that a smaller number of them is needed to characterise the different sounds in all the world's languages. They also have the potential to better model asynchronous feature spreading, since a purely phoneme-based approach is slightly naive in assuming speech is just a sequence of symbols, i.e. the "absolute slicing hypothesis" (Goldsmith, 1976) or the "beads-on-a-string model" (Ostendorf, 1999). AFs have been used alongside traditional phone based models to improve pronunciation variation modelling and improve performance in noise for English ASR (Deng and Sun, 1994; Kirchhoff, 1998). However, Schultz and Kirchhoff (2006, p.99) point out that "no results have yet been published showing that a recognition system solely based on alternative sound units outperforms the standard phoneme-based approach". These articulatory features have also been used in cross-language experiments. Wester et al. (2001) trained AFs on English telephone speech, and then tested them on Dutch telephone speech. Many of the AFs transferred well between languages, but it was found that the AF for place suffered a reduction in accuracy. Stüker et al. (2003) used GlobalPhone, a broadcast quality corpus, to extend this idea further for five different languages. First, AFs were trained on one language. When tested on the same language this typically resulted in an average AF accuracy of 94%, and when tested on the remaining four unseen languages this dropped to 87%. When the AF detectors were trained on multiple (i.e. four) languages and tested against the remaining unseen language, the average AF accuracy was only slightly better at 88% (Schultz and Kirchhoff, 2006). These cross-language feature recogniser results are difficult to compare with cross-language phone recogniser results because there is a lack of experiments where word and phone error rates can be given. However, a study by Siniscalchi et al. (2008) used binary feature detectors (AFs with only two values) as the only front end to a language specific phone recogniser. The detectors themselves give probability estimates of a binary feature being positive, and experiments showed that these binary features transferred well across languages. Training the feature detectors on five languages and testing on a single unseen language resulted in a 48% phone error rate (PER). However, a subsequent paper indicates the average error rate tested across all the target languages gives a more modest, but still competitive, 63% PER (Lyu et al., 2008). It is difficult to make exact comparisons between all the results given for the different modelling techniques. The experiments are on different corpora, and some results are reported as phone error rates while others are reported as word error rates. Figure 2.3 is an attempt to summarise the findings, comparing feature-based, phone-based and ASWU-based models. Note that if models are language dependent and/or context dependent then they are generally further down the list in the figure than the independent version of the model. For example, a context dependent phone model trained on English will be more accurate on English than a context independent phone model trained on English. The latter model is likely to be more portable, with the potential to be used for cross-language recognition.
2.4.2 Language identification
Language identification of speech (LID) is a field that makes minimal assumptions about the target language, often just relying on the acoustics to build the model. External knowledge sources can be used in characterising the language, but these are not usually specific to the target language. Often the phoneme inventory is not assumed to be known. This blind characterisation of a language at the level of sound patterns is highly relevant to a phonological analysis of an unknown language. Performance of LID systems is regularly assessed by NIST (US National Institute of Standards and Technology) through the Language Recognition Evaluations, which currently run about once every two years. This means that the history of improvements in language ID can be reliably recorded. The most successful language identification systems roughly split into two types: acoustic and phonotactic. These are described in detail below.
Acoustic language identification
The units used in the acoustic approach are spectral frames, e.g. standard 10ms MFCC vectors. Deltas, which are the differences between MFCCs for successive frames, are also included. The only labelling is the name of the language; no other training material such as transcripts is needed. In the early-90s, Riek et al. (1991), Nakagawa et al. (1992) and Zissman (1993) investigated an HMM (hidden Markov model) based approach for language identification. However, it was found that a GMM (Gaussian mixture model), which can be viewed as a single state HMM, was just as successful. At Lincoln Labs, Zissman (1996) improved the GMM using channel normalisation but showed that it still lacked accuracy when compared with the phonotactic approach. One disadvantage of the standard GMM is that it models almost no temporal information apart from the delta cepstra. Torres-Carrasquillo et al. (2002) experimented with the previously developed SDC (shifted delta cepstra). This works by stacking delta-cepstra across multiple speech frames to model longer temporal features. This technique brought performance much closer to phonotactic results. Looking for further ways to take advantage of temporal modelling, the team also produced a GMM tokeniser, where a symbol was produced for each frame corresponding to the highest scoring Gaussian component. This can be viewed as very short ASWUs. These streams of tokens were then evaluated by n-gram language models. However, the gain was marginal and only gave a small improvement when fused with the existing GMM acoustic scores. Developing the Lincoln Labs entry to the NIST 2003 LID evaluation, Singer et al. (2003) used the SDC GMM with improved channel and gender normalisation, and achieved a result of 4.8% equal error rate (EER) in the main competition. This was the first time a GMM system had surpassed the performance of a phonotactic model. A different acoustic method was also attempted on the data. Following Lincoln's successful use of the SVM (support vector machine) in speaker recognition, the same approach was used on the NIST language recognition data. The same feature vectors were used, resulting in 6.1% EER (although on an older dataset the SVM had been slightly superior). Campbell et al. (2006) reports that a fused system of the GMM and SVM on the primary NIST test condition resulted in 3.2% EER, indicating that the two modelling systems provide complementary information. The GMM is traditionally trained using maximum likelihood training. In LVCSR (large vocabulary continuous speech recognition) systems there has been a lot of success in replacing maximum likelihood training with discriminative training. At Brno, Burget et al. (2006) experimented with discriminative training on the GMM, trying MCE (minimum classification error) and MMI (maximum mutual information). The latter was the most successful, reducing the number of mixture components from the 2048 in the traditional GMM to just 128 in the MMI-trained GMM. This simpler model gave an improved result of 2.0% EER. A few other smaller improvements were made, including the use of HMMs and data decorrelation, but these were less significant. Acoustic-based language identification has historically borrowed many techniques from the larger field of speaker identification. This is particularly the case with Joint Factor Analysis (Kenny et al., 2007).
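The SDC computation itself is compact. The sketch below, in Python with NumPy, follows the usual N-d-P-k parameterisation (a common configuration is 7-1-3-7); the input sizes are toy values, and this is an illustrative sketch rather than the exact pipeline of the cited systems.

```python
import numpy as np

def sdc(cepstra, d=1, P=3, k=7):
    """Shifted delta cepstra: stack k delta-cepstra taken P frames apart.

    cepstra: array of shape (num_frames, n) of MFCC vectors.
    The delta at frame t is c[t+d] - c[t-d]; k such deltas, spaced P
    frames apart, are concatenated into one long feature vector.
    """
    deltas = cepstra[2 * d:] - cepstra[:-2 * d]      # delta aligned to frame t
    num_out = deltas.shape[0] - (k - 1) * P
    return np.stack([
        np.concatenate([deltas[t + i * P] for i in range(k)])
        for t in range(num_out)
    ])

frames = np.random.randn(100, 7)   # 100 frames of 7 MFCCs (toy data)
print(sdc(frames).shape)           # (80, 49): 49-dimensional SDC features
```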
The technique attempts to separate the channel/session variation (unwanted) from the speaker variation (wanted). When applied to language identification this showed competitive results on the NIST 2009 evaluation (Jančík et al., 2010). This result, alongside other LID results, is shown in Table 2.2.
System                     Reference                    Test        30s Error
Acoustic GMM               Singer et al. (2003)         NIST-2003   4.8% EER
Acoustic SVM               Singer et al. (2003)         NIST-2003   6.1% EER
Phonotactic P-PR-LM        Singer et al. (2003)         NIST-2003   6.6% EER
Phonotactic PR-LM          Matějka et al. (2006)        NIST-2003   1.8% EER
Acoustic GMM-MMI           Burget et al. (2006)         NIST-2003   2.0% EER
Acoustic ASWU-VSM          Li et al. (2007)             NIST-2003   4.0% EER
Phonotactic UPR-VSM        Tong et al. (2009)           NIST-2003   1.42% EER
Feature/Phonotactic VSM    Siniscalchi et al. (2010)    NIST-2003   8.5% EER
Phonotactic PR-SVM         Jančík et al. (2010)         NIST-2009   1.78 Cavg
Acoustic GMM-JFA           Jančík et al. (2010)         NIST-2009   2.02 Cavg

Table 2.2: Phone-based language identification has generally shown superior performance
Figure 2.4: An example of phone-based language identification: Phone Recognition followed by Language modelling (PR-LM)
Phonotactic language identification
One type of system that has consistently performed well through all the NIST evaluations is the phonotactic approach to LID. This technique splits the problem into two stages: phone recognition, and then language modelling (PR-LM) of the tokenized phones (see Figure 2.4). In one of the earliest studies on LID, House and Neuburg (1977) laid the groundwork for this approach. They investigated language modelling on phone transcripts because at that time phone recognisers for audio were not accurate enough. Phones were clustered into four broad classes and HMMs were used for the language modelling. Hazen and Zue (1993) took this work further using real phone recognisers on audio data. These phone recognisers had to be trained from annotated transcripts. Phones were clustered into broad classes and n-grams were used for the language modelling. Hazen and Zue experimented with different phone classes, looking both at unsupervised and manual selection of phone classes. It was found that 23 manually selected phone classes, based on the clustering of English phones, worked best. Overall they found that the PR-LM model performed better than acoustic or prosodic models. In similar work, Zissman and Singer (1994) also found that the acoustic approach was surpassed by PR-LM performance, and they continued to experiment with different variants of PR-LM.
They started with 48 English phones for the recogniser, which they found worked better than coarser groupings. This meant that other languages were being transcribed into English phones, but separate language models could still be built on the output of the English phone recogniser. In fact this was the main advantage of keeping the PR and LM stages separate: it was possible to build language models for languages where no phone annotations existed for training phone recognisers. If phone annotations are available then there isn't the same requirement to keep the PR and LM stages separate, and there can be advantages in having a more tightly coupled system, i.e. the phone recogniser can benefit from the phonotactic constraints of the language model to make decisions. Each language can then have its own integrated model. This approach was originally proposed by Lamel et al. (1993) and Muthusamy (1993); Zissman and Singer called it PPR (parallel phone recognition) and showed that it worked well. When phone annotations are available in more than one language, they can be used to build multiple PR-LM systems, e.g. as well as an English phone recogniser producing a number of language models, a German recogniser might be doing the same. Zissman and Singer call this system Parallel-PR-LM, and the performance is similar to PPR with the added flexibility of using it for languages where annotations are not available. There have been studies looking at whether the phone recognisers for new languages can be built without phone transcripts. Tucker et al. (1994), Lamel et al. (1994) and Lloyd-Thomas et al. (1998) have investigated adapting a phone recogniser from one language to another using a bootstrapping approach, but this has had limited success. Many phonotactic approaches use language dependent phones for language ID; however, there are advantages in using language independent phones, such as the ability to use discriminant training between multiple phone transcripts. These multilingual phone sets include broad classes as used by Muthusamy (1993), or more finely grained phones picked to maximise the discrimination between particular languages (Berkling, 1996). Li et al. (2007) used multilingual ASWUs as tokens in a mix of acoustic and phonotactic methods. This worked well, but the language dependent phone-based approach still performs the best. With the addition of duration modelling and better silence and closure modelling, Singer et al. (2003) evaluated the Lincoln six-level Parallel-PR-LM system on the NIST 2003 dataset, resulting in a 6.6% EER. At Brno, Matějka et al. (2006) took the PR-LM model and worked on building a more accurate phone recogniser; this was done through large volumes of additional labelled training data and a different design based on neural networks rather than HMMs. A single-level PR-LM with a Hungarian phone recogniser (62 phones) achieved 3.1% EER. Further improvements were made by using a phone lattice, developed earlier by Gauvain et al. (2004). Matějka also added anti-models which are trained on misrecognised segments. This improved performance to 1.8% EER. Tong et al. (2009) produced a slightly better result using a universal phone recogniser. This was based on 300 phones lumped together from different languages, which was then pruned to a subset that was particularly suitable for making discriminations for the target language. In recent years language identification has been dominated by the phonotactic approach.
For example, at Odyssey 2010 (the speaker and language recognition workshop) and Interspeech 2010, the clear majority of papers described systems that were phone-based. Jančík et al. (2010) showed that state-of-the-art phonotactic approaches outperform state-of-the-art acoustic approaches. There has been research into other units to use in LID. Syllables have been studied a number of times, but work by Martin et al. (2006) shows that they are not as competitive as a phone based approach. Prosody has been found to be good at distinguishing pairs of languages but has not been found to work well over a range of languages (Zissman and Berkling, 2001). On articulatory features (AFs), Parandekar and Kirchhoff (2003) created a system that used a feature recogniser followed by a feature-based language model. This worked better than a phone-based baseline but, since the baseline did not perform as well as state-of-the-art phone-based LID systems and the test conditions were not quite the same as NIST, the results are inconclusive. Siniscalchi et al. (2010) used two AFs, manner and place, which with a discriminative classifier gave a result of 8.5% EER. When compared to systems trained on the same small amount of data, the result is more competitive than may first appear. LVCSR-based language ID performs well, but this requires comprehensive characterisation of the language in the first place (Schultz et al., 1996). Since the challenge of low resource languages is often reflected in the NIST evaluations, this technique is not a serious contender. A summary of language identification results is shown in Table 2.2. The surprising element in language identification is how well the phonotactic approach works. Given that a fully acoustic approach has the potential to globally optimise the problem, it might be expected to lead the way. However, splitting the stream of sound into phone-like units appears to be particularly good at characterising and discriminating languages. Kempton and Moore (2008) showed that, by simulating very accurate phone recognizers, the phonotactic approach worked very well even when using a small number of phone categories.
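The two-stage structure of PR-LM can be illustrated in a few lines of Python. Here the phone recogniser is abstracted away as given token strings, the bigram models use add-one smoothing, and all the data is invented; real systems use smoothed higher-order models, often scored over phone lattices.

```python
import math
from collections import defaultdict

def train_bigram(phone_strings):
    """Bigram phone model with add-one smoothing; returns a log-prob scorer."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in phone_strings:
        for a, b in zip(["<s>"] + s, s + ["</s>"]):
            counts[a][b] += 1
    vocab = {p for s in phone_strings for p in s} | {"</s>"}
    def logprob(s):
        lp = 0.0
        for a, b in zip(["<s>"] + s, s + ["</s>"]):
            lp += math.log((counts[a][b] + 1) /
                           (sum(counts[a].values()) + len(vocab)))
        return lp
    return logprob

# Toy transcripts produced by one phone recogniser for two languages.
lm = {
    "lang_A": train_bigram([list("tatoka"), list("katota")]),
    "lang_B": train_bigram([list("ʃtraʃt"), list("ʃpraxt")]),
}

test = list("totata")  # tokenized phones from an unknown utterance
print(max(lm, key=lambda lang: lm[lang](test)))  # -> 'lang_A'
```

Keeping the recogniser and the language models separate, as here, is what allows a new language model to be trained from phone token streams alone, without annotated audio in that language.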
2.4.3 Which speech technology to use?
To help with a phonemic analysis, a phone recognizer is needed that, like an expert phonetician, can ideally detect all possible contrasts. Most speech recognition research assumes that a phoneme inventory is already known; only the study by Walker et al. (2003) attempts cross-language phone recognition without this knowledge. The high error rates indicate how challenging this task is. Figure 2.3 summarises the performance of the three main types of multilingual acoustic modelling. ASWUs, when compared to phone-based modelling, do not perform as well when there is a difference between test and training conditions. This indicates that ASWUs are not yet mature enough to be used in a cross-language context. Feature-based acoustic modelling has not outperformed phone-based modelling, but there are strong indications that it can work well in cross-language situations. Language identification does not have exactly the same aims in characterising a language as a phonemic analysis, but it does share the lack of assumptions about the target language's phonology. In the field of language identification, there are similar findings to the rest of speech recognition. Phone-based modelling outperforms acoustic-based modelling, and feature-based modelling, although not outperforming phone-based modelling, shows some promise of improvement. The robust performance of phone-based models, and the wealth of resources available such as multiple language recognisers (Schwarz et al., 2009), make phone-based models the preferred choice for use in machine-assisted phonemic analysis.
2.5 Selected areas in speech development
2.5.1 Perception of speech categories
In Chapter 1, it was shown that contrasts in one language, such as Sesotho, are not recognised in other languages such as English, and vice-versa. There can also be a difference between languages where the point of contrast is in a slightly different position on a phonetic continuum. A simple example is in the perception of stops such as /b/ and /p/. The chief difference between these sounds is a difference in voice onset time (VOT): the time taken for the vocal folds to start vibrating after the stop has been released. For English word-initial bilabial stops, if the voice onset time is less than 25ms then most English listeners perceive the test word as /ba/; if the voicing comes later it is perceived as /pa/ (Lisker and Abramson, 1970). These phenomena come under the more general psychological area of categorical perception (see Harnad (2003) for a contemporary definition). The precise voice onset time for discriminating sound categories such as bilabial stops can vary across languages, and some comparisons are shown in Table 2.3. Note that all occurrences of Spanish refer to Latin American Spanish. Thai has two boundaries because there is a distinction between voiced, voiceless, and aspirated stops (/b/, /p/, /pʰ/). Williams (1977) improved on earlier Latin American Spanish measurements (Lisker and Abramson, 1970) by ensuring speakers were monolingual. Interestingly, as well as the -10ms boundary among the Spanish listeners, Williams also found a lower but significant discriminatory peak located at the same position as the English peak. Pearce (2007) compared the languages Thai, English, Kera and French, showing that the VOT points are not fixed but depend on pitch. If the pitch is higher, the perception and production timings tend towards a longer VOT.
Subjects            Reference                     VOT (-)        VOT (+)
English adults      Lisker and Abramson (1970)                   25ms
Spanish adults      Williams (1977)               -10ms
Thai adults         Lisker and Abramson (1970)    -30ms          35ms ±5ms
Infants (English)   Eimas et al. (1971)                          30ms ±10ms
Infants (Spanish)   Lasky et al. (1975)           -45ms ±15ms    45ms ±15ms
Chinchillas         Kuhl and Miller (1978)                       25ms
Table 2.3: Voice onset time perceptual boundaries of word initial bilabial stops; evidence of innate boundaries at approximately -30ms and +30ms
These measurements show just how much difference there can be in perception between adults in different language groups. It is also interesting to compare these findings with infants. One of the surprising findings in infant speech perception is that infants can discriminate subtle sound categories from a very early age. Results from two studies on infant speech perception are also shown in Table 2.3. The main difference between the experiments is that Lasky et al. (1975) included pre-voiced samples; so actually the results are very similar. Summarising the findings in categorical perception for infants, Eimas et al. (1987) states that "The overall data indicates that infants divide the voice-onset-time continuum into three categories, with boundaries situated approximately at values that correspond to the voiced-voiceless boundary in English and other languages and to the prevoiced-voiced boundary of Thai, among other languages". Categorical perception of these speech sounds is not confined to our species. Kuhl and Miller (1978) and Kuhl (1981) found chinchillas had discrimination peaks in the same location as English adult subjects for post-voicing in bilabial, alveolar and velar stops. It is difficult to judge the effect of the chinchillas' previous exposure to English. This does not appear to be raised as a problem in the literature, and it was also stated in the original paper that the chinchillas had originally been kept at the Institute of the Deaf, which may have minimised this exposure.
2.5.2 Learning sound categories
Strict categorical perception is not observed for all speech sounds. For example, there is a weaker discrimination within the vowel space, which Kuhl (1991) refers to as the perceptual magnet effect. By six months, infants showed a better discrimination at the edge of the vowel categories than within the categories for the vowels in their language. However, the same experiment with monkeys did not show such an effect. This indicated that the phenomenon was unique to humans and was learnt from exposure to spoken language (Kuhl, 2004). Computational simulations such as Bayesian modelling have reproduced the effect, suggesting this behaviour is a consequence of optimally solving the problem of category membership and subphonemic cues (Feldman and
Griffiths, 2007). Kuhl hypothesises that the infants are learning from the distribution of speech sounds in the ambient language, i.e. exposure to a frequently occurring vowel in the native language leads to a perceptual magnet effect for that vowel. Kuhl cites another study by Maye et al. (2002) for evidence of learning statistical distributions. In this study, six and eight month old infants were exposed to eight different sounds varying from a [d] with prevoicing to a voiceless unaspirated [t]. English adults tend to interpret all of these sounds as /d/, but infants maintain some discrimination until 10-12 months (Pegg and Werker, 1997). One group of infants was exposed to a bimodal distribution where there were more examples of sounds at the two extremes. The other group of infants was exposed to a unimodal distribution with more examples in the mid-range. The group trained on the bimodal sounds showed an ability that was superior to the unimodal group in discriminating bimodal test sounds. Infants also show a sensitivity to the sequential patterns of sound. Jusczyk et al. (1994) exposed infants to monosyllabic nonsense words, and found that 9-month old (but not 6-month old) infants showed a preference for listening to phonetic patterns that matched their native language. Saffran (2003) has shown in a number of experiments that infants are able to segment word-like units just by tracking the transition probabilities between syllables. Prosody is also a helpful cue for word segmentation (Johnson and Jusczyk, 2001), playing a more important role in 8-month old infants. Studies comparing sound discrimination and word learning introduce some puzzles. Stager and Werker (1997) found that 14-month old infants could quickly learn words that were phonetically different, but they had difficulty learning and distinguishing words that were minimal pairs, even though they could already discriminate the particular sound in a speech perception experiment, and could use well-known words with the same phonemic difference (Fennell and Werker, 2003). Werker believes that the word learning task is so computationally intensive that there are not the attentional resources for learning words that are so phonetically similar. In conclusion, during the first year of their life, infants show an ability to perceive different sound categories from many different languages, but as they grow up in their language environment their speech perception becomes more language specific. Babies move from being citizens of the world to culture-bound listeners (Gopnik et al., 1999; Kuhl, 2004). Phonetic boundaries shift and become more defined so that, for the developed speaker, they can be in slightly different positions across different languages. It is difficult to know how much awareness there is of phonological elements in infants. Some have argued that infants perceive larger units and it is only when they are older that there is some type of perceptual reorganization in the infant's lexicon (Lin, 2005). It could be speculated that this perceptual reorganisation might cover similar levels as in Figure 2.3. More research is needed to understand how humans develop to the point where they can identify the contrastive sounds in a language. However, even a small understanding of the learning process can identify some of the principles involved, such as learning from sound category frequency and sequential patterning.
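The statistical intuition behind the bimodal/unimodal result can be illustrated with a toy simulation: a learner that fits Gaussian mixture models to raw VOT values and compares them with the Bayesian information criterion posits two categories after bimodal exposure but one after unimodal exposure. This sketch assumes scikit-learn is available; it is an illustration of distributional learning in general, not a model of the original study.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy VOT values (ms): the bimodal exposure peaks near -30ms and +30ms,
# while the unimodal exposure is centred in the mid-range.
bimodal = np.concatenate([rng.normal(-30, 8, 100), rng.normal(30, 8, 100)])
unimodal = rng.normal(0, 12, 200)

def best_num_categories(samples, max_k=3):
    """Choose the number of mixture components with the lowest BIC."""
    X = samples.reshape(-1, 1)
    bics = [GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
            for k in range(1, max_k + 1)]
    return int(np.argmin(bics)) + 1

print(best_num_categories(bimodal))   # 2: two sound categories inferred
print(best_num_categories(unimodal))  # 1: a single category inferred
```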
2.6 Chapter summary
A statistical analysis of data from a previous study, but completed as part of the work for this thesis, confirms that literacy is related to language vitality. Survey figures indicate that many (maybe half of all) languages remain to be written. There is therefore a clear need for phonemic analysis. With the death of taxonomic phonemics there was less optimism in automating the process, and yet most of the process is still fairly mechanical and there is a demand among linguists for tools that can speed it up. There are some good tools available to help, but these just provide database functionality, and there is an opportunity for machine assisted analysis. Some progress has been made in computational phonology, but not in the specific area of linking acoustics with the unsupervised discovery of allophonic rules. The least robust technology for cross-language speech recognition is acoustically derived sub-word unit (ASWU) based modelling. Feature-based modelling shows some promise, but phone-based modelling is the most robust. Splitting the problem into phone recognition followed by analysis of the sequence of phones strikes a balance between tractable modelling and adequate characterisation of the sound patterns of a language. Studies in human language perception make it clear that phonetic boundaries can be in slightly different positions across different languages. However, there is also some evidence that infants start off sharing the same perception of phonetic categories, and that these adapt to the language they are learning. The learning process includes the influence of sound category frequency and sequential patterning.
Chapter 3
Phonetic similarity
It was stated in Section 1.2 that, in a phonemic analysis, relying on some notion of phonetic distance is sometimes implicit but always important. In practice, when performing an analysis many linguists will make the assumption that some sounds are too phonetically dissimilar to be allophones, e.g. [m] and [k] (Gleason, 1961, p.275). However, many authors are deliberately cautious in defining any universal threshold of phonetic similarity (Hayes, 2009, p.54; Clark et al., 2007, p.97). Pike, instead of defining phonetic similarity by a rule, illustrates the principle through examples of possible allophone pairs covering over 100 different sounds based on his experience of phonemic analysis (Pike, 1947, p.70). In this chapter, different phonetic distance heuristics are evaluated quantitatively for their effectiveness in detecting allophones. Up to now this has received little attention in the literature. The structure of the chapter is as follows. Section 3.1 covers the evaluation of phonetic similarity detection algorithms (where an algorithm classifies a pair of phones as either similar or dissimilar). Section 3.2 covers the more generalised case (where an algorithm gives a distance), and also introduces one of the principal evaluation metrics used in this thesis. Section 3.3 deals with the comparison of sound sequences. In these three sections the experiments were performed on the English language in such a way as to simulate an under-resourced language. Section 3.4 introduces a number of additional corpora suitable for testing machine-assisted phonemic analysis. One of these is the Kua-nsi language, which is then used for an additional evaluation of the algorithms introduced in this chapter. Section 3.5 gives the conclusions.
3.1 Phonetic similarity detection
3.1.1 Relative minimal difference (Experiment 3A)
Peperkamp et al. (2006) makes use of phonetic similarity in an algorithm to model the acquisition of allophonic rules by infants. The main algorithm attempts to detect allophones via complementary distribution by measuring discrepancies in context probabilities for each pair of phones. This is investigated further in Section 5.3. Peperkamp also introduces phonetic filters, acting as a post process after the main algorithm, to remove spurious allophones, i.e. pairs of phones that are not actually allophones but are phonemically distinct. One of these filters makes use of phonetic similarity to reject spurious allophones. A minimal distance criterion is formalised, where a pair of phones are judged to be spurious allophones if there are any other phones between them in phonetic space; "for each of the [phonetic features], the third [phone] lies within the closed interval defined by the other two" (Peperkamp et al., 2006). In this thesis Peperkamp's minimal distance is referred to as the relative minimal difference to avoid confusion with similar terms; the word relative is used to indicate that any prediction of an allophonic relationship is affected by the presence of other phones in the phone set. For example, if the only glottal fricatives to appear in a transcription are [h] and [ɦ] then these are judged as possible allophones, because there are no other sounds in the transcription phonetically between them.
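The criterion itself is easy to state over binary feature vectors (+1, -1, with 0 for underspecification). The following sketch is a minimal illustration using an invented three-phone inventory with only two features; the experiments in this chapter use the full Hayes feature set.

```python
def lies_between(third, a, b):
    """True if, for every feature, `third` lies in the closed interval [a, b]."""
    return all(min(a[f], b[f]) <= third[f] <= max(a[f], b[f]) for f in a)

def possible_allophones(a, b, inventory):
    """Relative minimal difference: a pair survives only if no other phone
    in the inventory lies between them in feature space."""
    others = (p for p in inventory.values() if p not in (a, b))
    return not any(lies_between(p, a, b) for p in others)

# Toy inventory: voice and nasal features only (+1 / -1).
inventory = {
    "t": {"voice": -1, "nasal": -1},
    "d": {"voice": +1, "nasal": -1},
    "n": {"voice": +1, "nasal": +1},
}
# [d] lies between [t] and [n], so [t,n] is rejected as an allophone pair.
print(possible_allophones(inventory["t"], inventory["n"], inventory))  # False
print(possible_allophones(inventory["t"], inventory["d"], inventory))  # True
```

This is the same [t]/[d]/[n] configuration used in Figure 3.3 below.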
It was decided that this implementation of phonetic similarity could be more fully evaluated and compared with other measures, which is the subject of this chapter. In the original study (Peperkamp et al., 2006), this relative minimal difference algorithm helped to detect allophones when combined with other algorithms, but it was not tested by itself. In this chapter, Peperkamp's phonetic similarity is evaluated for its effectiveness as a standalone process.
Phonetic representation
Peperkamp et al. (2006) used five multi-valued articulatory features to represent French speech sounds. However, that particular articulatory feature framework is not expressive enough for many other languages. For the current work it was decided that an all-binary feature system would be more suitable. The main appeal of binary features is their simplicity for algorithmic implementation and their flexibility in representing speech sounds with multiple articulations. For example, a labial-velar approximant [w], a velarized lateral [lˠ] and an r-coloured vowel [ɚ] cannot be fully defined with the multi-valued features used in Peperkamp et al. (2006), but they can with binary features. There are also many practical resources available for using binary features, e.g. Hayes (2009) specifies a universal set giving definitions for 141 phones that can be easily extended to other sounds; 28 binary features are defined and most of these features are included in Figure 3.1. These resources are available online and are used in the experiments for this thesis. It is also possible to add further features such as tone. There are many practical advantages to using binary features, but some of their theoretical shortcomings are described in Section 3.2.3.
Corpora for evaluation
In both Peperkamp et al. (2006) and a follow-up experiment (Le Calvez et al., 2007), the algorithms were tested on a corpus of child directed speech. Originally this corpus was transcribed as text, but for their experiments it was automatically converted to a phonemic transcription and allophones were added with predefined rules. In the initial experiments in this thesis, the algorithms of Peperkamp et al. were evaluated on the TIMIT corpus: a dataset that contains allophones that have been labelled manually, directly from the acoustic signal. This means the transcript used here is more faithful to the acoustics than in the previously published experiments. The TIMIT corpus (Garofolo et al., 1993) of US English was chosen because it is one of the largest corpora available that contain manually annotated allophones. The TIMIT transcripts of 1386 utterances were used as evaluation data in subsequent chapters of this thesis. For this chapter on phonetic similarity, it is only the phone set (derived from the phonetic transcript) that is needed. A number of different sources (Garofolo et al., 1993; Esling et al., 1999; Hieronymus, 1993) were used to confirm the conversion of the TIMIT symbols to IPA. The sound /r/ is known to have a number of realisations in US English, e.g. [ɹ, ɹ̠, ɻ]; in this experiment the retroflex approximant [ɻ] is used because it shares a number of binary features with the other realisations. The analysis was restricted to consonants for the purpose of starting with a simple problem that did not involve phone sequences such as diphthongs. The problem of phone sequences is investigated later in Section 3.3. All the US English consonant sounds had feature definitions in Hayes (2009) except for the nasalized alveolar tap [ɾ̃]. The features defined for this phone were the same as for the alveolar tap [ɾ] but included the feature [+nasal]. The full list of consonants is shown in Figure 3.1.
Evaluation measure and results
The outcome of Peperkamp’s relative minimal difference criterion applied to the TIMIT conso- nants is shown in Figure 3.2. This phone relationship chart, has the same layout as triangular mileage charts on street atlases which have the cities listed on a diagonal. Here phones are listed in place of cities, and relationship between the phones listed in place of the mileage. The phone relationship chart is introduced in this thesis to give a visual representation of phone relationships e.g. showing whether two phones are phonemically distinct or are allophones of the same phoneme, and to show other relationships that might predict this. To make the best use of space, two different triangular charts have been combined into a square. In this figure the bottom left triangle corresponds to the relative minimal difference criterion, and the shaded cells containing a one indicate that the phone pair may be in an allophonic relationship. The top right triangle corresponds to a different algorithm described in Section 3.1.2. Cells with an outline show the ground truth where a phone pair has an allophonic relationship according to 46 CHAPTER 3. PHONETIC SIMILARITY
Figure 3.1: TIMIT consonant features (includes redundancy)
Figure 3.2: Phone relationship chart showing the relative minimal difference based detector (bottom, Experiment 3A) and the active articulator based detector (top, Experiment 3B). Each cell represents a phone pair that, if marked as '1' and shaded, is judged as a possible allophone. Outlines mark actual allophones.

Figure 3.3 shows the example where [t] and [n] are not judged to be allophones. This is because [d] lies phonetically between the two; i.e. it is both [+voice] and [-nasal]. The relative minimal difference criterion correctly rejects this phone pair. The results of Figure 3.2 are shown in an equivalent form in Table 3.1. The standard information retrieval measures given are derived from the following contingency table:
                Present   Not present
Detected        hit       false alarm
Not detected    miss      correctly rejected
False alarm rate = false alarms / (false alarms + correctly rejected)
Recall = hits / (hits + misses)
Figure 3.3: Phone relationship chart (detail) showing that phones [t] and [n] are not predicted to be allophones (Experiment 3A).
                      Rel. min. difference   Articulator   Combined
Hits                  4                      5             4
Misses                1                      0             1
False alarms          69                     233           62
Correctly rejected    304                    140           311
False alarm rate      0.185                  0.625         0.166
Recall                0.80                   1.00          0.80
Precision             0.0548                 0.0210        0.0606
Table 3.1: Results of both the allophone detectors (Experiments 3A and 3B)
Precision = hits / (hits + false alarms)

A recall of 80% (0.8) reflects one erroneous miss of the apparent allophone pair [ɾ,t]. This is because [d] lies phonetically between the two. Although this is a slightly complex case involving both allophony and neutralisation, it could also be an issue whenever there are multiple allophones (e.g. in Maasai, [k, ɡ, ɣ] are all realizations of /k/ (Hayes, 2009, p.39)). It is possible that this filter could be run more than once after the phone inventory is updated, but the overall process would need to have a high accuracy of detection with a suitable stopping condition. Le Calvez et al. (2007) also recognise this problem of detecting multiple allophones and suggest a modification to the relative minimal difference algorithm. When searching for sounds that could be between a pair, the set of sounds searched is restricted to just those sounds that appear in the context of (i.e. next to) the hypothesised allophone. This solves the problem with the above example, because [d] never occurs in the context of the [ɾ] allophone. This means that [ɾ,t] are now directly detected as an allophone pair. However, this modification could cause problems with other languages that have very simple syllable structures. If a language has only CV syllables, then when comparing consonants the immediate context would be a vowel. This would lead to the majority of consonant pairs being mistakenly labelled as allophones.
3.1.2 Active articulator (Experiment 3B)
A new phonetic similarity detection algorithm is introduced that draws its inspiration from linguists. This is based on the articulators that are used. Linguists involved in phonemic analysis use a number of guidelines to narrow down the number of comparisons that need to be made between phones. In a similar way to Pike (1947, p.70), Burquest (2006, p.51) shows graphically which sounds can be considered similar, and these are generally orientated around different active articulators. The heuristics used by Burquest are from the perspective of marking possible allophones. Here, some of these heuristics are reinterpreted from the opposite perspective of predicting whether or not two phones are phonemically distinct. The generalised heuristic is that if two phones use distinctly different active articulators, then it is predicted that the phones are phonemically distinct. This can be described more formally as follows. A set of active articulators is defined which includes the lips, tongue and velum, i.e. the binary features: {labial, coronal, dorsal, nasal}. A dorsal coronal overlap element is also included because there can be overlap in the postalveolar and palatal region (e.g. in some languages [t͡ʃ] is an allophone of /k/ (Burquest, 2006, p.54)):
+dorsal → dorsal_coronal_overlap
+coronal, –anterior → dorsal_coronal_overlap
So the active articulator universal set is:
푈AA = {labial, coronal, dorsal_coronal_overlap, dorsal, nasal}
The active articulator set of each phone can include any number of these possibilities.
푎, 푏 ⊆ 푈AA
Here a, b represent the active articulator sets used by the two phones being compared. Phonemic distinctiveness is predicted if both phones are using distinctly different active articulators, i.e. the following three conditions are all met.
푎 ≠ ∅
푏 ≠ ∅
푎 ∩ 푏 = ∅
Example 1, comparing [p] and [t]:
[p]AA = {labial}
[t]AA = {coronal}
[p]AA ∩ [t]AA = ∅
All the conditions are met, therefore [p] and [t] are predicted to be phonemically distinct.
Example 2, comparing [k] and [ʔ]:
[k]AA = {dorsal, dorsal_coronal_overlap}
[ʔ]AA = ∅

The second condition is violated, therefore [k] and [ʔ] are predicted to not necessarily be phonemically distinct.
Example 3, comparing [n] and [ŋ]:
[n]AA = {coronal, nasal}
[ŋ]AA = {dorsal, dorsal_coronal_overlap, nasal}
[n]AA ∩ [ŋ]AA = {nasal}

The third condition is violated, therefore [n] and [ŋ] are predicted to not necessarily be phonemically distinct.
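Since the heuristic reduces to set operations, it can be implemented directly. The sketch below reproduces the three examples above; the feature dictionaries are abbreviated to just what the rule inspects.

```python
def active_articulators(features):
    """Map a phone's binary features to its active articulator set."""
    aa = set()
    if features.get("labial"):
        aa.add("labial")
    if features.get("coronal"):
        aa.add("coronal")
        if not features.get("anterior", True):   # +coronal, -anterior
            aa.add("dorsal_coronal_overlap")
    if features.get("dorsal"):
        aa.update({"dorsal", "dorsal_coronal_overlap"})
    if features.get("nasal"):
        aa.add("nasal")
    return aa

def predicted_distinct(a, b):
    """Phonemically distinct only if both sets are non-empty and disjoint."""
    return bool(a) and bool(b) and not (a & b)

p = active_articulators({"labial": True})
t = active_articulators({"coronal": True, "anterior": True})
k = active_articulators({"dorsal": True})
glottal_stop = active_articulators({})   # [ʔ] uses none of these articulators
n = active_articulators({"coronal": True, "anterior": True, "nasal": True})
ng = active_articulators({"dorsal": True, "nasal": True})

print(predicted_distinct(p, t))             # True  (example 1)
print(predicted_distinct(k, glottal_stop))  # False (example 2)
print(predicted_distinct(n, ng))            # False (example 3: share 'nasal')
```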
Overall this heuristic is relatively conservative in predicting phonemic distinctiveness, and more liberal rules could be stated, although the rules may have to be expressed slightly differently for different feature systems. This particular phonetic similarity criterion is not a relative measure like Peperkamp's, because it doesn't need to take into account other sounds observed in the language. The results of this active articulator filter applied to the TIMIT consonants are shown on the top right side of Figure 3.2, with summary results in Table 3.1. For the active articulator based detector it can be seen that there are no misses but many false alarms, leading to 100% recall and low precision. This characteristic of high recall is valuable when it is important not to miss any allophones. An investigation of the French and Japanese phonetic data in Peperkamp et al. (2006) and Le Calvez et al. (2007) reveals that this active articulator algorithm would not miss any allophones in these languages either. Figure 3.2 indicates that there is some correlation between the relative minimal difference and active articulator algorithms, but that they also complement each other. Table 3.1 includes a combined algorithm result, using the same combination method as Peperkamp et al. (2006). Peperkamp combines the allophone detection algorithms by viewing them as stacked filters which only allow allophone pairs through. This can be viewed as a logical AND combination or, if the values are represented as numerical scores, as a simple multiplication. The result shows that the combined algorithm has a slightly higher precision.
3.2 Phonetic distance measure (Experiment 3C)
Another heuristic that linguists use for phonetic similarity is the number of features that differ, although it is acknowledged that not every single feature will carry the same weight of salience (Burquest, 2006, p.52).
Figure 3.4: Phone relationship chart showing how many binary features the phones differ by (Experiment 3C). Darker shading indicates that the two phones are similar. Outlines mark actual allophones.
In the absence of a universal, comprehensive account of the different features and their salience, it is helpful to start with a simple algorithm: the binary feature distance.
Given the pair of sounds [k] and [ŋ], a number of differing features can be counted. It can be seen from Figure 3.1 that there are three differing binary features (nasal, sonorant, voice) and that one feature (delayed release) is underspecified for the sound [ŋ]. Underspecified features are assigned a value halfway between the corresponding +/- values. Therefore, in this case the distance is 3.5 binary features. Figure 3.4 shows this measure for all the non-syllabic consonants in TIMIT, with distances rounded to the nearest whole number. The shading used in this chart is consistent with all the two dimensional charts in this thesis, where a darker shade represents similarity and a lighter shade represents more of a difference.
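As a sketch, the distance is a per-feature sum in which an underspecified feature contributes half a point when compared against a + or - value. The feature vectors below are cut down to the four features relevant to the [k]/[ŋ] example rather than the full 28-feature set.

```python
def feature_distance(a, b):
    """Count of differing binary features; 0 marks underspecification
    and contributes half a feature when compared against +1 or -1."""
    dist = 0.0
    for f in a:
        va, vb = a[f], b[f]
        if va == vb:
            continue
        dist += 0.5 if 0 in (va, vb) else 1.0
    return dist

# [k] and [ŋ] differ in nasal, sonorant and voice, and [ŋ] is
# underspecified (0) for delayed release.
k  = {"nasal": -1, "sonorant": -1, "voice": -1, "delayed_release": -1}
ng = {"nasal": +1, "sonorant": +1, "voice": +1, "delayed_release": 0}
print(feature_distance(k, ng))  # 3.5
```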
3.2.1 Evaluation measure and results
A threshold can be chosen for the binary feature distance; for example, it might be decided that any value less than six could indicate a possible allophone. Rather than choosing a particular threshold, however, the scores were kept in a ranked list, allowing the threshold to be chosen by the linguist. The performance of the ranked list was measured using two information retrieval summary statistics.

The first is ROC-AUC (receiver operating characteristic - area under curve). This can be derived by plotting a graph of recall against false alarm rate and measuring the area under the curve; an example can be seen in Figure 3.5. The measure can also be interpreted as the probability that a randomly chosen target (allophone pair) will have a higher score than a randomly chosen non-target (non-allophone pair) (Bamber, 1975). For example, a randomly ranked list will have a ROC-AUC of 50% and a perfectly ranked list will have a ROC-AUC of 100%. The figures for the binary feature distance measure in Table 3.2 show that it has a beneficial effect.

The second information retrieval statistic is PR-AUC (precision recall - area under curve). This can be derived by plotting a graph of precision against recall and measuring the area under the curve; an example can be seen in Figure 3.6. It is very similar to average precision, which is widely used in the information retrieval literature and at TREC information retrieval evaluations. Aslam and Yilmaz (2005) show that PR-AUC (which they call actual average precision) is strongly correlated with average precision, and suggest it may be better for evaluating the quality of the underlying retrieval function. For completeness, average precision is given alongside PR-AUC in Appendix A. PR-AUC gives a different view on performance from ROC-AUC, one orientated towards the perspective of the linguist: it represents an expectation of precision, where precision can be viewed as the probability that an item in the ranked list is a true target. It is affected by the proportion of targets in the original data set, which means it is not suitable for comparing results across datasets. For example, a randomly ranked list of all the possible phone pairs in TIMIT would have a PR-AUC value of 1.3%, whereas a randomly ranked list of the Kua-nsi dataset introduced in Section 3.4.3 would have a PR-AUC value of 0.7%.

The ROC-AUC measure of performance is from the perspective of targets present in the original data set. It is not affected by the original proportion of targets and is therefore suitable for comparing results across datasets; this is why a randomly ranked list has a ROC-AUC value of 50% whatever the dataset. The ROC-AUC statistic should therefore be regarded as the primary evaluation measure. ROC-AUC and PR-AUC were calculated¹ with AUCCalculator (Davis and Goadrich, 2006), and average precision was calculated with the Trec_eval tool (Buckley et al., 2006).
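The probabilistic interpretation gives a direct, if brute-force, way to compute ROC-AUC. A minimal sketch, assuming each candidate pair has a numerical score where a higher score means more allophone-like, and counting ties as one half:

    # Minimal sketch of ROC-AUC via its probabilistic interpretation
    # (Bamber, 1975): the probability that a randomly chosen target
    # outranks a randomly chosen non-target, with ties counting 0.5.
    def roc_auc(target_scores, nontarget_scores):
        wins = sum(1.0 if t > n else 0.5 if t == n else 0.0
                   for t in target_scores
                   for n in nontarget_scores)
        return wins / (len(target_scores) * len(nontarget_scores))

    # A perfectly ranked list scores 100%; an all-tied list scores 50%.
    print(roc_auc([0.9, 0.8], [0.3, 0.2, 0.1]))  # 1.0
    print(roc_auc([0.5, 0.5], [0.5, 0.5]))       # 0.5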
¹When a ranked list is reversed, the ROC-AUC should be 100% minus the ROC-AUC of the original list. There is currently a bug in AUCCalculator 0.2, available at http://mark.goadrich.com/programs/AUC/, which can give incorrect results in this case; the bug was discovered during preliminary experiments for this thesis. A corrected version, generously provided by the authors, was used in the experiments here, but at the time of writing it has not been released on their webpage.
[Figures 3.5 and 3.6 appear here. Figure 3.5 plots recall against false alarm rate: the binary feature distance curve has an AUC of 82.1% against 50.0% for chance. Figure 3.6 plots precision against recall: the binary feature distance curve has an AUC of 4.7% against 1.3% for chance.]
Figure 3.5: Receiver operating characteristic graph showing area under the curve (ROC-AUC) for binary feature distance and chance.

Figure 3.6: Precision-recall graph showing area under the curve (PR-AUC) for binary feature distance and chance.
Algorithm applied to TIMIT     ROC-AUC   PR-AUC
Binary feature distance         82.1%     4.7%
Relative minimal difference     80.8%     5.1%
Active articulator              68.8%     2.1%
Table 3.2: Area under the ROC and PR curves for the different algorithms on TIMIT, for Experiments 3C, 3A, and 3B.
One issue with Trec_eval is that any items with the same score are given a deterministic but arbitrary ranking (based on the label text) before average precision is calculated. This means that results can change slightly depending on the Unicode allophone label. PR-AUC is not affected by this issue, which is another reason why it is preferred to average precision; ROC-AUC is also unaffected and remains the primary evaluation measure.
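The tie-breaking issue is easy to reproduce. In the following sketch (illustrative, not the Trec_eval implementation), a target tied with a non-target receives a different average precision depending on which side of the tie it lands, while the tie-aware ROC-AUC sketched earlier is unchanged:

    # Minimal sketch of why tie-breaking matters for average precision.
    # Average precision is the mean, over the relevant items, of the
    # precision at each relevant item's rank.
    def average_precision(ranked_relevance):
        hits, total = 0, 0.0
        for rank, rel in enumerate(ranked_relevance, start=1):
            if rel:
                hits += 1
                total += hits / rank
        return total / max(hits, 1)

    # Two items share a score; the arbitrary order within the tie
    # changes the result.
    print(average_precision([True, False]))  # 1.0 (target ranked first)
    print(average_precision([False, True]))  # 0.5 (target ranked second)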
3.2.2 Comparison with phonetic similarity detection algorithms
The probabilistic interpretation of ROC-AUC given earlier allows an equivalent figure to be calculated for the phonetic similarity detectors, where there are only two scores (0 and 1). ROC-AUC as a single performance measure thus allows the different algorithms to be compared; results for the distance and detection algorithms investigated are shown in Table 3.2. The method for calculating ROC-AUC when there is a detection threshold, i.e. only two scores, is as follows:
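Under the probabilistic interpretation used above, with ties between a target and a non-target counting one half, the area for a two-score detector reduces to the average of recall and the true negative rate. A minimal sketch of this closed form (variable names are illustrative):

    # ROC-AUC for a hard detector that outputs only the scores 0 and 1.
    # With a single operating point, P(target outranks non-target) plus
    # half the tie probability simplifies to (recall + (1 - FPR)) / 2.
    def roc_auc_binary(recall, false_alarm_rate):
        return (recall + (1.0 - false_alarm_rate)) / 2.0

    # The active articulator detector has 100% recall, so its ROC-AUC of
    # 68.8% in Table 3.2 corresponds, under this formula, to a false
    # alarm rate of about 62%.
    print(roc_auc_binary(1.0, 0.624))  # approximately 0.688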