The Induction of Phonotactics for Speech Segmentation
Converging evidence from computational and human learners

Published by LOT
Trans 10
3512 JK Utrecht
The Netherlands

phone: +31 30 253 6006
fax: +31 30 253 6406
e-mail: [email protected]
http://www.lotschool.nl

Cover illustration: Elbertus Majoor, Landschap-fantasie II (detail), 1958, gouache.

ISBN: 978-94-6093-049-2 NUR 616

Copyright © 2011: Frans Adriaans. All rights reserved.

The Induction of Phonotactics for Speech Segmentation
Converging evidence from computational and human learners

Het Induceren van Fonotactiek voor Spraaksegmentatie Convergerende evidentie uit computationele en menselijke leerders (met een samenvatting in het Nederlands)

Proefschrift

ter verkrijging van de graad van doctor aan de Universiteit Utrecht op gezag van de rector magnificus, prof.dr. J.C. Stoof, ingevolge het besluit van het college voor promoties in het openbaar te verdedigen op vrijdag 25 februari 2011 des middags te 2.30 uur

door

Frans Willem Adriaans
geboren op 14 maart 1981 te Ooststellingwerf

Promotor: Prof. dr. R.W.J. Kager

To my parents, Ank & Piet

CONTENTS

acknowledgements xi

1 introduction 1
1.1 The speech segmentation problem 1
1.2 Segmentation cues and their acquisition 3
1.2.1 The role of segmentation cues in spoken word recognition 4
1.2.2 The acquisition of segmentation cues by infants 7
1.3 The induction of phonotactics for speech segmentation 11
1.3.1 Bottom-up versus top-down 11
1.3.2 Computational modeling of phonotactic learning 12
1.3.3 Two hypotheses regarding the acquisition of phonotactics by infants 16
1.4 Learning mechanisms in early language acquisition 17
1.4.1 Statistical learning 18
1.4.2 Generalization 19
1.4.3 Towards a unified account of infant learning mechanisms 22
1.5 Research questions and scientific contribution 23
1.6 Overview of the dissertation 26

2 a computational model of phonotactic learning and segmentation 31
2.1 Introduction 31
2.1.1 Models of speech segmentation using phonotactics 32
2.1.2 Models of constraint induction 33
2.2 The OT segmentation model 35
2.3 The learning model: StaGe 38
2.3.1 Statistical learning 39
2.3.2 Frequency-Driven Constraint Induction 40
2.3.3 Single-Feature Abstraction 42
2.4 An example: The segmentation of plosive-liquid sequences 45
2.5 General discussion 49

3 simulations of segmentation using phonotactics 53
3.1 Introduction 53
3.2 Experiment 1: Biphones in isolation 54
3.2.1 Method 55
3.2.2 Results and discussion 58
3.3 Experiment 2: Thresholds 61
3.3.1 Method 61
3.3.2 Results and discussion 62
3.4 Experiment 3: Biphones in context 65


3.4.1 Method 65
3.4.2 Results and discussion 66
3.5 Experiment 4: Input quantity 69
3.5.1 Modeling the development of phonotactic learning 69
3.5.2 Method 70
3.5.3 Results and discussion 70
3.6 General discussion 72

4 modeling ocp-place and its effect on segmentation 75
4.1 Introduction 76
4.1.1 OCP effects in speech segmentation 79
4.1.2 Modeling OCP-Place with StaGe 81
4.2 Methodology 86
4.2.1 Materials 86
4.2.2 Procedure 87
4.2.3 Evaluation 88
4.3 The induction of OCP-Place 90
4.3.1 Predictions of the constraint set with respect to boundaries 94
4.4 Experiment 1: Model comparisons 99
4.4.1 Segmentation models 99
4.4.2 Linear regression analyses 100
4.4.3 Discussion 102
4.5 Experiment 2: Input comparisons 103
4.5.1 Segmentation models 103
4.5.2 Linear regression analyses 105
4.5.3 Discussion 112
4.6 General discussion 114

5 the induction of novel phonotactics by human learners 121
5.1 Introduction 121
5.2 Experiment 1 128
5.2.1 Method 130
5.2.2 Results and discussion 133
5.3 Experiment 2 137
5.3.1 Method 138
5.3.2 Results and discussion 141
5.4 Experiment 3 144
5.4.1 Method 144
5.4.2 Results and discussion 146
5.5 General discussion 148


6 summary, discussion, and conclusions 153
6.1 Summary of the dissertation 154
6.1.1 A combined perspective 154
6.1.2 The learning path 156
6.1.3 The proposal and its evaluation 157
6.2 Discussion and future work 159
6.2.1 Comparison to other constraint induction models 159
6.2.2 What is the input for phonotactic learning? 162
6.2.3 Which mechanisms are involved in phonotactic learning and segmentation? 164
6.2.4 The formation of a proto-lexicon 168
6.2.5 Language-specific phonotactic learning 170
6.2.6 A methodological note on the computational modeling of language acquisition 171
6.3 Conclusions 172

a the stage software package 173
b feature specifications 181
c examples of segmentation output 183
d experimental items 187

References 191
samenvatting in het nederlands 205
curriculum vitae 209


ACKNOWLEDGEMENTS

And so this journey ends... Many people have supported me in doing my research over the past few years, and in writing this book. Some gave critical feedback, or helped solve various problems. Others were simply a pleasant source of distraction. All of these people have provided essential contributions to this dissertation, and ensured that I had a very enjoyable time while doing my PhD research in Utrecht.

This book could not have materialized without the inspiration and feedback from my supervisor, René Kager. When I started my project (which was the 'computational' sub-project within René's NWO Vici project), I did not know that much about linguistics, let alone psycholinguistics. René provided me with an environment in which I could freely explore new ideas (whether they were computational, linguistic, psychological, or any mixture of the three), while providing me with the essential guidance for things to make sense in the end. I want to thank René for his enthusiasm, and for always challenging me with the right questions. In addition to many hours of scientific discussion, we also got along on a more personal level. Not many PhD students have the opportunity to hang out with their supervisors in record stores in Toulouse, while discussing jazz legends such as Mingus and Dolphy. I consider myself lucky.

I was also fortunate to be part of a research group of very smart people, who became close friends. I am especially grateful to my fellow 'groupies' Natalie Boll-Avetisyan and Tom Lentz. The research presented in this book has clearly benefited from the many discussions I had with them throughout the years. These discussions sometimes took place in exotic locations: Natalie and I discussed some of the basic ideas of our projects by the seaside in Croatia; Tom and I came up with new research plans at the San Francisco Brewing Company. Aside from obvious fun, these trips and discussions helped me to form my ideas, and to look at them in a critical way. I would like to thank Diana Apoussidou, who joined the group later, for helpful advice on various issues, and for discussions about computational modeling. Thanks also to the other members of the group meetings (and eatings): Chen Ao, Frits van Brenk, Marieke Kolkman, Liquan Liu, Bart Penning de Vries, Johannes Schliesser, Tim Schoof, Keren Shatzman, Marko Simonović, Jorinde Timmer. I also want to thank two students that I have supervised, Joost Bastings and Andréa Davis, for taking an interest in my work, and for providing me with a fresh perspective on my research.

For this book to take on its current shape, I am very much indebted to Tikitu de Jager for proofreading, and for helping me with various LaTeX layout issues. Also thanks to Gianluca Giorgolo for help with technical issues, but mainly for the excellent suggestion of watching Star Trek while writing a dissertation. ('These people get things done!') Thanks to Donald Roos for editing the cover illustration. I also want to thank Ingrid, Angelo, and Arifin at Brandmeester's for a continuous supply of delicious caffeine throughout the summer.

Several people at the Utrecht Institute of Linguistics OTS have helped me with various aspects of my research. Elise de Bree and Annemarie Kerkhoff gave me very useful advice on designing experiments. Jorinde Timmer assisted me in running the experiments. I want to thank Jacqueline van Kampen for being my tutor. I am also grateful to Maaike Schoorlemmer, Martin Everaert and the UiL OTS secretaries for providing me with a pleasant work environment.

Fortunately, there was more to life in Utrecht than just work. I would like to thank Hannah De Mulder, Loes Koring (a.k.a. 'Lois'), Arjen Zondervan, Sander van der Harst, Arnout Koornneef, Daria Bahtina, Marko Simonović, and, again, Tom, Natalie, and Diana, for many VrijMiBo's, lunches, dinners, concerts, etc. Thanks also to Gianluca Giorgolo and Gaetano Fiorin for a rather unique jam session at dB's. I want to thank the other PhD students and postdocs at UiL OTS for occasional meetings, drinks, fussball games, and many enjoyable conversations at Janskerkhof and Achter de Dom: Ana Aguilar, Ivana Brasileiro, Desiree Capel, Anna Chernilovskaya, Alexis Dimitriadis, Xiaoli Dong, Jakub Dotlacil, Lizet van Ewijk, Nicole Grégoire, Nadya Goldberg, Bettina Gruber, Kate Huddlestone, Naomi Kamoen, Cem Keskin, Anna Kijak, Huib Kranendonk, Kiki Kushartanti, Bert LeBruyn, Emilienne Ngangoum, Rick Nouwen, Andreas Pankau, Liv Persson, Min Que, Dagmar Schadler, Rianne Schippers, Marieke Schouwstra, Giorgos Spathas, Roberta Tedeschi, Christina Unger, Sharon Unsworth, Rosie van Veen, Anna Volkova, Eline Westerhout, Marie-Elise van der Ziel, and Rob Zwitserlood. Thanks to Luca Ducceschi for providing me with bass talk, Italian coffee and Tony Allen during the last months at Achter de Dom.

Then there is a group of people that deserve mentioning because they have supported me throughout the years, and continue to make my life enjoyable in many different ways. I would like to thank Eeke Jagers for being a great 'gozor', and for making sure that I got a regular dose of excellent food, wine, and The Wire; Marijn Koolen for almost a decade of inspiring discussions that somehow always switch unnoticeably between scientific topics and P-Funk; Berend Jan Ike for many soothing drinks, conversations and jam sessions; Niels Schoenmaker and Freek Stijntjes for keeping me on the beat (both musically and intellectually); Julien Oomen and Sanne van Santvoort for many years of enduring friendship; Nanda Jansen for support, especially when I most needed it; Tom, Belén, Marijn, Anna, Tikitu, and Olga for a seemingly neverending series of Dialogue Dinners; Snoritas (Chuntug, Daan, Walter, Maarten, and others) for fun times in Amsterdam; Cylia Engels for a room with a stunning view and many cups of tea in Amsterdam-Noord.

A special thanks to Ilke for adding an unexpected sense of holiday to my summer in Utrecht, and for being incredibly supportive in these stressful times.

Last but not least, I would like to thank my family. I want to thank my brothers Nanne and Rik, who always keep me sharp, and who never cease to make me laugh. Both of them are just incredibly smart and incredibly funny. Thanks to Talita for being part of the family, and to little Naäma for providing me with some interesting data on infant speech ('Nee-nee-nee-nee!'). In the end, I owe everything to my parents, who have always inspired me and encouraged me to follow my path in life. None of this could have been accomplished without your support. I love you deeply, and I dedicate this book to you.


1

INTRODUCTION

1.1 the speech segmentation problem

Listening to speech is automatic and seemingly effortless. It is an essential part of our daily communication. Whenever a speaker produces a sentence, an acoustic signal is created which contains a sequence of words. As listeners, we generally have no problem hearing and understanding these spoken words. That is, of course, provided we are familiar with the language of the speaker, and provided that there are no exceptionally noisy conditions (such as during a conversation at a loud rock concert). The ease with which we listen to speech is quite remarkable when one considers the processes that are involved in decoding the speech stream. The acoustic signal that is produced by a speaker does not consist of words which are separated by silences, but rather of a continuous concatenation of words. As a consequence, word boundaries are usually not clearly marked in the speech signal (Cole & Jakimik, 1980; Klatt, 1979). Spoken language is thus quite different from written language. While the written words that you are reading in this dissertation are visually separated by blank spaces, words in spoken language typically have no clearly delineated beginnings or endings. Onecouldthusthinkofthespeechsignalasatextwithoutspaces.

In order to understand speech, the listener has to break down the continuous speech signal into a discrete sequence of words. As adult listeners, we are greatly helped by our knowledge of the native language vocabulary. We have the ability to retrieve word forms that we have stored in our mental lexicon, allowing us to recognize such word forms in the speech signal. For example, when hearing the continuous sentence listed above, we recognize English words such as could and signal. At the same time, there are chunks of speech in the signal which do not correspond to words. For example, nalas is not an English word, nor is speechsi. Indeed, models of spoken word recognition have traditionally relied on finding matches between stretches of sound in the speech signal and word forms in the lexicon (e.g., Marslen-Wilson & Welsh, 1978; McClelland & Elman, 1986; Norris, 1994). In these models, parts of the speech signal are matched with words that we have stored in our mental lexicon, until every part of the speech signal corresponds to a meaningful word.


The problem of speech segmentation is particularly challenging for language learners. From the moment they are born, infants are confronted with speech input that is continuous. Even in speech specifically directed toward the infant, most words are embedded in a continuous sentence, rather than being presented in isolation (van de Weijer, 1998; Woodward & Aslin, 1990). Infants develop the ability to segment words from continuous speech already in the first year of life (e.g., Jusczyk & Aslin, 1995). A fundamental difference, however, in speech segmentation by adults and infants is that infants do not yet possess the complete vocabulary of the native language (e.g., Fenson et al., 1994). In fact, one of the major tasks that infants face is to learn the words of their native language. The development of the mental lexicon crucially relies on infants' ability to break up the speech stream into word-like units.

While infants do not yet have access to a large lexicon of word forms, they are helped by various sublexical cues for segmentation. Sublexical segmentation cues convey information about the structure of the speech stream, but are not linked to specific words in the lexicon. Such cues may provide infants with an initial means to break up the speech stream, and could allow them to bootstrap into lexical acquisition (e.g., Mehler, Dupoux, & Segui, 1990; Christophe, Dupoux, Bertoncini, & Mehler, 1994).

The sublexical segmentation cue that will be the focus of this dissertation is phonotactics. Phonotactic constraints state which sound sequences are allowed to occur within the words of a language. This information is useful for segmentation, since it highlights possible locations of word boundaries in the speech stream. In general, if a sequence occurs in the speech stream that is not allowed to occur within the words of a language, then such a sequence is likely to contain a word boundary. For example, the sequence /pf/ does not occur in Dutch words. Therefore, when /pf/ occurs in the speech stream, listeners know that /p/ and /f/ belong to separate words. To illustrate this, consider the following continuous Dutch utterance:

(1.1) /d@lAmpfil/ (de lamp viel, ‘the lamp fell’)

The phonotactic knowledge that /pf/ cannot occur within words can be used by listeners to segment this utterance. A word boundary (denoted by '.') can be hypothesized within the illegal sequence:

(1.2) /d@lAmp.fil/

With this simple phonotactic constraint the listener has separated /d@lAmp/ from /fil/. Note that knowledge of phonotactics is language-specific. A German listener not familiar with the Dutch vocabulary may be inclined to think that /pfil/ is a word, since the phonotactics of German allows words to start with /pf/ (e.g., pfeffer, 'pepper'). Languages thus differ in the sound sequences that are allowed within words. The consequence is that phonotactic knowledge has to be acquired from experience with a specific language.

Phonotactics is used in the segmentation of continuous speech by both adults and infants (McQueen, 1998; Mattys & Jusczyk, 2001b). For the case of segmentation by adults, knowledge of phonotactics supports the retrieval of words in the mental lexicon (Norris, McQueen, Cutler, & Butterfield, 1997). Spoken word recognition thus involves a combination of sublexical segmentation cues and lexical lookup (Cutler, 1996; Mattys, White, & Melhorn, 2005). For the case of segmentation by infant language learners, phonotactics may provide a basis for the development of the mental lexicon (Jusczyk, Friederici, Wessels, Svenkerud, & Jusczyk, 1993). In the current example, the infant could add /d@lAmp/ and /fil/ as initial entries in a 'proto-lexicon'. During later stages of lexical acquisition, the infant further decomposes such lexical entries, and attributes meaning to the words.

The relevance of phonotactics for language development raises the question of how phonotactics is acquired. The central question in this dissertation is the following: How do learners acquire the knowledge of phonotactics that helps them to break up the speech stream? The induction of phonotactic constraints for speech segmentation will be addressed through a combination of computational modeling of infant learning mechanisms, computer simulations of speech segmentation, and language learning experiments with human participants. The goal is to provide insight into some essential aspects of early language acquisition. The proposed model is explicit, since it is implemented as a computer program, and is supported by psychological evidence of human learning capacities.

In what follows, the role of segmentation cues in spoken word recognition, and the acquisition of these cues by infants, will be discussed (Section 1.2). Two hypotheses are then outlined regarding the induction of phonotactics for speech segmentation (Section 1.3), followed by a discussion of mechanisms that are available to infants for phonotactic learning (Section 1.4). Finally, the research questions and scientific contribution are stated (Section 1.5), and an overview of the different chapters in the dissertation is given (Section 1.6).

1.2 segmentation cues and their acquisition

Spoken word recognition is aided by several types of linguistic cues. Each of these cues helps the listener to break down the continuous speech stream into discrete, word-sized units, thereby facilitating the activation of words in the mental lexicon. Cues that have been demonstrated to affect speech segmentation broadly fall into three categories: metrical cues, fine-grained acoustic cues, and phonotactics.


1.2.1 The role of segmentation cues in spoken word recognition

A large body of research has focused on the role of metrical cues in segmentation. Metrical cues concern the ordering of strong and weak syllables in a language. For example, most English words have a trochaic pattern, where word-initial strong syllables (which contain full vowels) are optionally followed by one or more weak syllables (which contain reduced vowels). The Metrical Segmentation Strategy (MSS, Cutler, 1990) postulates that strong syllables occurring in the speech stream are taken as word onsets by the listener, thereby providing a starting point for lexical access. The MSS is supported by corpus analyses showing that the majority of English lexical words (approximately 85%) starts with a strong syllable (Cutler & Carter, 1987). If English listeners insert word boundaries before strong syllables, they would thus make considerable progress in segmenting the speech stream (see e.g., Harrington, Watson, & Cooper, 1989).

Evidence for the MSS comes from perception studies. Cutler and Norris (1988) show that listeners are faster to spot English words in speech when they are embedded in a sequence of a strong and a weak syllable (e.g., mint in mintesh) than when embedded in two strong syllables (e.g., mintayve). They argue that this is due to the insertion of a word boundary before strong syllables. That is, the occurrence of a strong syllable initiates a new attempt at lexical access, whereas the occurrence of a weak syllable does not. Listeners are slower at recognizing mint in mintayve, since recognition in this case requires the assembly of a word form across a hypothesized word boundary (min.tayve). Cutler and Norris suggest that the slowing-down effect may be the result of competing lexical hypotheses (competing for the shared /t/). Indeed, recognition only slowed down when the target word crossed a syllable boundary. That is, no difference in response was found for words that did not cross the hypothesized word boundary (e.g., thin in thintayve vs. thintef).

In a follow-up study, Cutler and Butterfield (1992) found that English listeners insert boundaries before strong syllables, and, in addition, delete boundaries before weak syllables. Analyses were conducted on data sets of erroneous segmentations ('slips of the ear') made by English listeners. The analyses showed that many misperceptions could be explained by the MSS. That is, incorrect boundaries were generally placed before strong syllables, and many missed boundaries occurred before weak syllables. In addition, boundaries inserted before strong syllables generally occurred at the beginning of lexical (content) words, whereas boundaries inserted before weak syllables occurred at the beginning of grammatical (function) words. These perception studies support the view that stress patterns affect the segmentation of continuous speech, as predicted by the metrical segmentation hypothesis.
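As a minimal illustration of the strategy itself, the sketch below (in Python) inserts a hypothesized word boundary before every strong syllable; the (syllable, stress) input representation and the function name are simplifying assumptions made only for this illustration, not part of the MSS proposal or of the experiments just described.

# Illustrative sketch of the Metrical Segmentation Strategy: hypothesize a word
# boundary before every strong syllable. The (syllable, stress) representation,
# with 'S' = strong and 'W' = weak, is an assumption made only for this example.

def mss_segment(syllables):
    """Group syllables into hypothesized words, starting a new word at each strong syllable."""
    words, current = [], []
    for syllable, stress in syllables:
        if stress == "S" and current:  # a strong syllable triggers a boundary
            words.append(current)
            current = []
        current.append(syllable)
    if current:
        words.append(current)
    return words

# Cutler and Norris (1988): 'mintesh' (strong-weak) yields a single chunk in which
# mint can be spotted, whereas 'mintayve' (strong-strong) places a boundary that
# separates the /t/ from 'min'.
print(mss_segment([("min", "S"), ("tesh", "W")]))   # [['min', 'tesh']]
print(mss_segment([("min", "S"), ("tayve", "S")]))  # [['min'], ['tayve']]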


A different type of segmentation cue is fine-grained acoustic information in the speech signal. For example, word onsets in English are marked by aspiration for voiceless stops (Christie, 1974; Lehiste, 1960), and by glottal stops and laryngealization for stressed vowels (Nakatani & Dukes, 1977). Another cue for word onsets is duration. The duration of a segment is generally longer for word-initial segments than for segments in word-medial or word-final position (Oller, 1973; Umeda, 1977). Hearing such acoustic events in the speech stream may lead listeners to hypothesize word beginnings, thereby facilitating lexical access (e.g., Gow & Gordon, 1995).

Segment duration indeed has an effect on the segmentation of continuous speech (e.g., Quené, 1992; Shatzman & McQueen, 2006). In two eye-tracking experiments, Shatzman and McQueen (2006) found that a shorter duration of /s/ facilitates the detection of a following stop-initial target (e.g., pot in eenspot). The short duration indicates that /s/ is likely to occur in a word-final position. Conversely, a longer duration indicates a word-initial position, which leads listeners to hypothesize a word starting with an '/s/ + consonant' cluster (e.g., spot). The consequence is that the detection of the target word pot slows down. Listeners thus rely on segment duration to resolve ambiguities that arise in the segmentation of continuous speech.

Finally, there are phonotactic cues that affect spoken word recognition. Studies on the role of phonotactics in speech segmentation have typically assumed a view based on sequential constraints. Such constraints encode an absolute or gradient avoidance of certain sequences within the words of a language. For example, the sequence /mr/ is not allowed in languages like English and Dutch. In a word spotting study, McQueen (1998) shows that Dutch listeners are sensitive to phonotactics and use phonotactic information to segment the speech stream. Words embedded in nonsense speech were either aligned or misaligned with a phonotactic boundary. Listeners were faster to spot words when a consonant cluster formed an illegal onset cluster, indicating a phonotactic boundary corresponding to the word onset (e.g., rok 'skirt' in fiemrok), than when the cluster formed a legal onset cluster (e.g., fiedrok). In the latter case the phonotactic boundary (fie.drok) did not match with the onset of the target word, resulting in a slower response in detecting the word. In a follow-up study, Lentz and Kager (in preparation) show that these effects are due to sublexical phonotactics providing an initial chunking of the speech stream before lexical look-up.

Several studies have shown that the phonotactic knowledge that is used for segmentation is more fine-grained than a classification into legal and illegal sequences. Listeners also use phonotactic probabilities for segmentation. That is, legal sequences can be scaled according to how likely they are to occur in a language. During segmentation, high-probability sequences provide a stronger cue that a sequence is word-internal, while low-probability sequences are less likely to be word-internal. A word spotting study by Van der Lugt (2001) provides evidence that the likelihood of the occurrence of CV onset sequences affects segmentation by Dutch listeners. Words were easier to spot when they were followed by a high-probability CV onset (e.g., boom 'tree' in boomdif) than when followed by a low-probability onset (e.g., boomdouf). Importantly, phonotactic probabilities have been shown to affect the processing of nonwords independently of lexical effects such as lexical neighborhood density (Vitevitch & Luce, 1998, 1999, 2005). This supports a sublexical view of phonotactics in which probabilistic phonotactic knowledge is represented independently of the mental lexicon.

While these studies show that constraints on specific sequences of phonetic segments affect speech segmentation, there is evidence that segmentation is also affected by more complex phonological processes. For example, Finnish listeners use vowel harmony as a cue for segmentation (Suomi, McQueen, & Cutler, 1997; Vroomen, Tuomainen, & Gelder, 1998). Furthermore, segmentation in Korean is affected by an interaction of multiple phonological processes underlying the surface structures (Warner, Kim, Davis, & Cutler, 2005). These studies show that phonotactic constraints do not merely operate on the sequences that surface in the speech stream, but also target the phonological structure of segments, taking into account the features that define them (see e.g., Chomsky & Halle, 1968).

As a consequence of the language-specific nature of phonotactics, native (L1) language phonotactic constraints potentially interfere with spoken word recognition in a second (L2) language (e.g., Dupoux, Kakehi, Hirose, Pallier, & Mehler, 1999; Weber & Cutler, 2006). Weber and Cutler (2006) show that the detection of English words by German listeners who were advanced learners of English was influenced by German phonotactics. Thus, even advanced learners of a second language are affected by their native language phonotactics in L2 speech segmentation. The German learners of English, however, were also able to exploit English phonotactic cues for boundaries. These findings suggest that, while L1 phonotactics may interfere with L2 listening, advanced learners of a second language can acquire the phonotactic structure of the target language (Weber & Cutler, 2006; Trapman & Kager, 2009).

A more general phonotactic constraint, which seems to hold in most languages, is the Possible-Word Constraint (Norris et al., 1997; McQueen, Otake, & Cutler, 2001). This constraint states that word recognition should only produce chunks of speech which conform to a minimal word structure. Specifically, words minimally contain a vowel. A single consonant can therefore not by itself be a word. The segmentation mechanism should thus avoid producing occurrences of isolated consonants. Norris et al. (1997) found that listeners were faster at detecting apple in vuffapple than in vapple. They argue that this is due to the fact that the residue of vuffapple is a possible word (vuff), whereas the residue of vapple is a single consonant (v), and therefore violates the PWC. While the PWC was originally proposed as a universal constraint on spoken word recognition, the constraint can be overruled in languages that allow single-consonant words, such as Slovak (Hanulíková, McQueen, & Mitterer, 2010).

Taken together, these studies show that sublexical cues facilitate the recognition of words in continuous speech. In order to obtain a more complete picture of spoken word recognition in adult listeners, it is interesting to look at some of the relative contributions of the different types of cues to word recognition. For example, Vitevitch and Luce (1998, 1999) show that effects of probabilistic phonotactics are mainly found in the processing of nonwords, while lexical effects, such as neighborhood density (i.e., the number of words in the lexicon that differ from the target word by a single phoneme) mainly arise in the processing of actual lexical items. For the case of speech segmentation, Mattys et al. (2005) found evidence for a hierarchy in which lexical cues (the actual words) dominate segmental cues (e.g., phonotactics, allophony). In turn, segmental cues are more important than prosodic cues (word stress). Whenever one type of cue fails (for instance, due to noise in the speech stream, or due to lack of lexical information), participants would rely on the next cue in the hierarchy.

The studies discussed so far concern the role of segmentation cues in spoken word recognition by adult listeners. However, the relative importance of lexical and sublexical cues for segmentation may be quite different for infant language learners. During early language acquisition, infants are starting to build up a vocabulary of words. It thus seems that infants, at least initially, do not rely on lexical strategies to discover words in continuous speech. Rather, infants will mainly rely on sublexical cues for speech segmentation (e.g., Cutler, 1996). Below, evidence is discussed showing that infants in their first year of life acquire segmentation cues. Such cues help them in constructing a lexicon from continuous speech input.

1.2.2 The acquisition of segmentation cues by infants

Infants acquire knowledge about the sublexical properties of their native language during the first year of life. The primary purpose of the sublexical knowledge that infants possess (metrical patterns, fine-grained phonetic cues, and phonotactics) is to segment and recognize words in fluent speech (e.g., Jusczyk, 1997). Segmentation cues thus guide infants' search for words in continuous speech, and facilitate the development of the mental lexicon.

With respect to metrical cues, infants have a sensitivity to the basic rhythmic properties of their native language from birth (Mehler et al., 1988; Nazzi, Bertoncini, & Mehler, 1998). More fine-grained knowledge of native language patterns of stressed (strong) and unstressed (weak) syllables develops between the age of 6 and 9 months (Jusczyk, Cutler, & Redanz, 1993; Morgan & Saffran, 1995). Jusczyk, Cutler, and Redanz (1993) show that 9-month-old infants listen longer to words that follow the predominant stress pattern of English (strong-weak) than to words that follow the opposite pattern (weak-strong). Younger infants (6-month-olds) did not show any significant listening preferences, suggesting that infants learn about language-specific stress patterns as a result of exposure to the native language during the first year of life. The ability to use metrical patterns as a cue for segmenting word-like units from speech is present at the age of 7.5 months (Jusczyk, Houston, & Newsome, 1999). In accordance with the Metrical Segmentation Strategy, infants segmented bisyllabic words conforming to the English strong-weak pattern from continuous speech. In addition to behavioral studies, there is electrophysiological (ERP) evidence that Dutch 10-month-old infants rely on stress patterns for segmentation (Kooijman, Hagoort, & Cutler, 2009).

Fine-grained acoustic cues that play a role in infant language development include co-articulation, allophony, and duration. Specifically, infants have been shown to be sensitive to co-articulation between segments at the age of 5 months (Fowler, Best, & McRoberts, 1990), and to context-sensitive allophones at the age of 10.5 months (Jusczyk, Hohne, & Bauman, 1999). These phonetic factors also have an impact on the segmentation of continuous speech by infants (Johnson & Jusczyk, 2001; Jusczyk, Hohne, & Bauman, 1999; Mattys & Jusczyk, 2001a). Johnson and Jusczyk (2001) show that 8-month-old infants use co-articulation as a cue in extracting trisyllabic artificial words from continuous speech. In addition, Jusczyk, Hohne, and Bauman (1999) found evidence suggesting that, by the age of 10.5 months, infants use information about allophonic variation to extract words from speech. Such knowledge allows infants to distinguish between different context-sensitive realizations of a phoneme, and thus allows them to segment phrases that are phonemically equivalent, but made up of different allophones (e.g., different realizations of t and r in night rates versus nitrates). Finally, durational cues corresponding to phonological phrase boundaries are used for segmentation in infants of 10 and 13 months of age (Gout, Christophe, & Morgan, 2004).

In addition to metrical and acoustic cues, infants learn about the phonotactic patterns of their native language (Friederici & Wessels, 1993; Jusczyk, Friederici, et al., 1993; Jusczyk, Luce, & Charles-Luce, 1994), and use phonotactic cues for segmentation (Johnson, Jusczyk, Cutler, & Norris, 2003; Mattys, Jusczyk, Luce, & Morgan, 1999; Mattys & Jusczyk, 2001b; Myers et al., 1996). Jusczyk, Friederici, et al. (1993) show that 9-month-old infants listen longer to words from their native language than to nonnative words (which violate the phonotactic constraints of the language). Similar results were obtained in studies using nonwords: Infants listen longer to nonwords that respect the native language phonotactics than to nonwords that violate the phonotactic constraints of the native language (Friederici & Wessels, 1993). These studies show that infants prefer to listen to words that conform to the phonotactic patterns of the native language.

Infants' sensitivity to phonotactics is more fine-grained than the ability to distinguish between legal and illegal sound sequences. Infants have been shown to be sensitive to phonotactic probabilities. That is, infants are able to distinguish between legal sequences that differ in how likely they are to occur in the native language. Jusczyk et al. (1994) found that infants have a preference for nonwords that were made up of high-probability sound sequences (e.g., riss) over nonwords that consisted of low-probability sequences (e.g., yowdge). While infants at 9 months of age listened longer to lists of high-probability nonwords than to lists of low-probability nonwords, 6-month-old infants did not show this sensitivity. This finding indicates that, in the period between the first 6 and 9 months of life, infants start to learn about the phonotactic patterns of their native language from experience with speech input.

Knowledge of phonotactics is useful for infants, since it indicates which sequences in the continuous speech stream are likely to contain boundaries, and which sequences are not. Sequences that violate the native language phonotactics are plausible locations for word boundaries in the speech stream. From a probabilistic perspective, low-probability sequences are more likely to contain word boundaries than high-probability sequences. Mattys and Jusczyk (2001b) show that 9-month-old infants use probabilistic phonotactic cues to segment words from continuous speech. Items were embedded in either between-word consonant clusters (i.e., consonant clusters which occur more frequently between words than within words), or embedded in within-word consonant clusters (i.e., consonant clusters which occur more frequently within words than between words). For example, during familiarization the item gaffe would occur in the context of a phrase containing bean gaffe hold (where /ng/ and /fh/ are between-word clusters, and thus provide a phonotactic cue for the segmentation of gaffe), or in the context of fang gaffe tine (where /Ng/ and /ft/ are within-word clusters, and thus provide no indication of word boundaries at the edges of gaffe). During the test phase infants listened longer to words which had been embedded in between-word consonant clusters than to words which had been embedded in within-word consonant clusters. Infants thus appear to have knowledge of the likelihood of boundaries in segment sequences in continuous speech.

In addition to language-specific phonotactic cues, there is evidence that infants use the more general Possible-Word Constraint in segmentation. Johnson et al. (2003) found that 12-month-old infants listened longer to target words embedded in 'possible' contexts (leaving a syllable as a residue) than to words embedded in 'impossible' contexts (leaving a single consonant as a residue). The PWC, militating against single-consonant residues in segmentation, thus appears to play a role in segmentation by both adults and infants.

All of these studies provide evidence that infants use a variety of cues for the segmentation of continuous speech. There is also evidence that infants integrate these cues, and that infants attribute more importance to some cues than to other cues, when these are brought into conflict. For example, Johnson and Jusczyk (2001) show that both metrical cues and coarticulation cues are relied upon more strongly by 8-month-old infants than a segmentation strategy based on syllable probabilities. However, a study by Thiessen and Saffran (2003) shows that younger infants (7-month-olds) rely more strongly on distributional cues than on stress, suggesting that infants might initially rely on statistical cues in order to learn stress patterns. In addition, when cues are put in conflict, stress patterns have been found to override phonotactic patterns (Mattys et al., 1999). At the same time, phonotactics is required to refine metrical segmentation by determining the exact locations of word boundaries (e.g., Myers et al., 1996). It thus appears that segmentation cues join forces to optimize the detection of word boundaries in the continuous speech stream.

In sum, infants rely on various sublexical cues to extract word-like units from continuous speech. Most of these cues are specific to the language to which the infant has been exposed during the first months of life. This raises an important issue. The task of segmenting a continuous speech stream into words is one that all infants face, regardless of their native language background. Nevertheless, most segmentation cues are language-specific. Infants thus face the challenge of acquiring segmentation cues from experience with the target language. Little is known about how infants tackle this problem. In order to obtain a thorough understanding of infants' early language development (specifically, the contribution of sublexical cues to learning a lexicon), it is essential to account for the acquisition of segmentation cues. Crucially, there has to be a learning mechanism that allows infants to acquire cues for segmentation. This mechanism needs to be able to learn segmentation cues before the lexicon is in place. The work described in this dissertation addresses this problem for the case of phonotactics.


1.3 the induction of phonotactics for speech segmentation

The main interest of this dissertation is in the learning mechanisms that are used by infants to acquire phonotactics. In addition, while a complete model of infant speech segmentation would require accounting for the integration of multiple segmentation cues (i.e., metrical cues, acoustic cues, phonotactic cues), this dissertation focuses on the contribution of phonotactics to solving the speech segmentation problem. By studying phonotactic cues in isolation, this dissertation assesses the contributions of different mechanisms to learning phonotactics, and to solving the speech segmentation problem. As will become clear later on, the resulting model of phonotactic learning is fundamentally different from earlier accounts of phonotactic learning, since those models assume a lexicon of word forms as the basis for phonotactic learning (e.g., Hayes & Wilson, 2008; Pierrehumbert, 2003).

1.3.1 Bottom-up versus top-down

Although the segmentation studies clearly indicate a role for phonotactics in the segmentation of continuous speech by both adults and infants, they leave open the question of how the phonotactic patterns of the native language are acquired. Analyses of parents' estimation of their infant's receptive vocabulary indicate that infants have a vocabulary smaller than 30 words by the age of 8 months, which increases to about 75 words by the end of the first year (Fenson et al., 1994). Since infants by the age of 9 months already possess fine-grained knowledge of the phonotactic probabilities of their native language (Jusczyk et al., 1994), it seems unlikely that they derive this knowledge from their miniature lexicons. Rather, it seems that infants induce phonotactic patterns by monitoring sequences that they hear in their continuous speech input. The bottom-up learning of phonotactics from continuous speech is in line with the view that infants use phonotactics to bootstrap into word learning. The general idea is that infants rely on phonotactics in order to learn words, and therefore do not rely on words to learn phonotactics. In order to avoid this potential chicken-and-egg problem, several computational studies have adopted a bottom-up view in which segmentation cues are derived from unsegmented input (e.g., Brent & Cartwright, 1996; Cairns, Shillcock, Chater, & Levy, 1997; Perruchet & Vinter, 1998; Swingley, 2005). This view is supported by evidence from infant studies suggesting that sublexical knowledge of phonotactics emanates from patterns in fluent speech, rather than from isolated lexical items (e.g., Mattys et al., 1999).

A bottom-up approach to phonotactic learning may seem counterintuitive at first. Phonological constraints (phonotactics, stress) typically refer to words and word edges: 'Words should start with a strong syllable', 'words should not start with /pf/', etc. It would thus seem logical to assume that such knowledge is derived in a top-down fashion from stored word forms in the lexicon. Indeed, several models of phonotactic learning have been based on a view in which phonotactic constraints are learned from the lexicon (e.g., Hayes & Wilson, 2008; Pierrehumbert, 2001, 2003; Tjong Kim Sang, 1998). The initial lexicons that infants have, however, provide them with a rather limited source of data for phonotactic learning. While infants may know some words (e.g., mommy, daddy, and their own names) which may serve as top-down anchors in segmentation, the limited size of the infant vocabulary makes it plausible that infants initially rely on bottom-up information in speech segmentation (e.g., Bortfeld, Morgan, Golinkoff, & Rathbun, 2005; Fenson et al., 1994). This makes sense when one considers the amount of input data that can be derived from an infant's lexicon versus continuous speech input. In contrast to their small lexicons, infants are exposed to a vast amount of continuous speech input (see e.g., van de Weijer, 1998). It will be argued that this information is rich enough for the induction of phonotactics in a bottom-up fashion using mechanisms that have been shown to be available to infant language learners. Moreover, the induction of phonotactics from continuous speech allows the infant to make considerable progress in cracking the speech code.

A bottom-up learning procedure thus has two desirable properties. First, it is compatible with the demonstrated sensitivity and segmentation skills of infant language learners, who have not yet mastered the native language vocabulary. Second, a bottom-up approach allows for a beneficial role of phonotactics in segmentation and lexical acquisition. That is, the bottom-up learning of segmentation cues from continuous speech helps the learner to acquire words from speech. It has been argued that during language acquisition learners maximally exploit the phonology of their native language in order to optimize their segmentation skills (Cutler, Mehler, Norris, & Segui, 1986). In line with this view, the bottom-up approach assumes that infants maximally exploit the phonotactic structures that they encounter in continuous speech, since these phonotactic constraints allow them to segment the speech stream more efficiently.

1.3.2 Computational modeling of phonotactic learning

The main challenge for a bottom-up approach to phonotactic learning is the following: How can infants learn about the phonotactic structure of words without knowing the words themselves? This issue will be addressed by means of computational modeling. The algorithmic nature of computational models enables the researcher to formulate explicit theories of language acquisition.


Moreover, the predictions that follow from computational models allow for straightforward empirical testing of the theoretical proposal. Computational models of language acquisition that are based on psycholinguistic findings, and are tested empirically, therefore provide a valuable tool in the study of human learning mechanisms. Importantly, computational models can be evaluated on natural language data, such as speech corpora. In order to allow full control of factors which might affect participants' performance, psycholinguistic experiments typically involve simplified and/or unnatural data (such as artificial sequences of CV syllables). When using accurately transcribed speech corpora, theoretical proposals can be tested on how they perform on more realistic data, including settings that involve variations that occur in the pronunciation of natural speech (such as acoustic reductions and assimilations). Computational models may therefore provide valuable insights which complement psycholinguistic findings that are obtained in laboratory settings.

Earlier work on computational modeling of phonotactic cues for segmentation (see Chapter 2) has revealed two different bottom-up approaches. Phonotactic segmentation models are either based on employing utterance boundaries (Brent & Cartwright, 1996; Daland, 2009), or on exploiting sequential probabilities in continuous speech (e.g., Cairns et al., 1997; Swingley, 2005). Utterance boundary models are based on the observation that utterance-initial segments are by definition also possible word-initial segments, simply because utterance beginnings are also word beginnings. The same line of reasoning applies to utterance-final segments. If the probability of a segment at an utterance boundary is interpreted as the probability at a word boundary, an estimation can be made with respect to the likelihood of word boundaries in continuous speech (Daland, 2009). That is, Prob(x.y) = Prob(x.) · Prob(.y), where Prob(x.) and Prob(.y) are based on utterance boundaries. Other models have taken a different approach by exploiting the occurrence of sequences within continuous speech utterances (rather than restricting the model to sequences that occur at utterance edges). That is, the probability of a word boundary can be estimated using the co-occurrence probability of two segments (i.e., 'biphone' probability) in continuous speech. A low biphone probability indicates a likely word boundary between the two segments in the biphone (e.g., Cairns et al., 1997).

The difference between the two approaches can be illustrated by the following example. Consider again the Dutch utterance /d@lAmpfil/. A model based on utterance boundaries would extract only the initial and final segments of the utterance, along with their alignment (/.d/ and /l./). In contrast, a sequential model would extract all segment pairs within the utterance (/d@/, /@l/, /lA/, /Am/, /mp/, /pf/, /fi/, /il/).
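As a minimal sketch of how the two estimates can be computed, the following code (in Python) derives both from a toy corpus of transcribed utterances; the single-utterance corpus and the function names are simplifying assumptions made only for this illustration, and neither estimate is the model proposed in this dissertation (see Chapter 2).

# Minimal sketch of the two bottom-up estimates described above, computed from
# a toy corpus of transcribed utterances. The single-utterance 'corpus' and the
# function names are illustrative assumptions, not part of the dissertation's model.
from collections import Counter

utterances = ["d@lAmpfil"]  # de lamp viel; a realistic corpus contains many utterances

# Utterance-boundary estimate (cf. Daland, 2009): Prob(x.y) = Prob(x.) * Prob(.y),
# with both factors estimated from utterance-final and utterance-initial segments.
final_counts = Counter(u[-1] for u in utterances)
initial_counts = Counter(u[0] for u in utterances)
n = len(utterances)

def utterance_boundary_prob(x, y):
    return (final_counts[x] / n) * (initial_counts[y] / n)

# Sequential (biphone) estimate (cf. Cairns et al., 1997): a low biphone probability
# within utterances signals a likely word boundary between the two segments.
biphone_counts = Counter()
for u in utterances:
    for x, y in zip(u, u[1:]):
        biphone_counts[x + y] += 1
total = sum(biphone_counts.values())

def biphone_prob(x, y):
    return biphone_counts[x + y] / total

print(utterance_boundary_prob("p", "f"))  # 0.0: /p/ and /f/ never occur at utterance edges here
print(biphone_prob("p", "f"))             # 0.125: /pf/ is one of the 8 biphone tokens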

While both types of information may be useful for segmentation, the example shows that sequential phonotactics provides the learner with more data points. In this short utterance there are 8 biphone tokens. In contrast, the utterance has (by definition) only 2 edges. Utterance-internal sequences thus potentially constitute a rich source of information for phonotactic learning. At the very least, it seems unlikely that the learner would ignore all the information that is contained within the utterance itself. In fact, there is abundant evidence that infants indeed have the ability to extract purely sequential information from continuous speech. This ability has been demonstrated by a large body of psycholinguistic research on statistical learning from continuous speech by infants (discussed in Section 1.4).

There is a substantial gap, however, between phonotactic models that have been proposed for the purpose of speech segmentation, and phonotactic models that have been proposed in the field of linguistics. Phonologists have traditionally defined phonotactics in terms of positional constraints which operate on a hierarchical structure (e.g., Clements & Keyser, 1983; Selkirk, 1982). Specifically, the hierarchical structure states that words consist of syllables, and that syllables have constituents such as onsets, nuclei, and codas. For example, syllables in English can have the phoneme /N/ as a coda, but not as an onset. Phonotactic constraints at the level of such positional constituents have been used in earlier studies, for example, as an account of wellformedness judgments of nonwords by adult participants (e.g., onset and rhyme probabilities, Coleman & Pierrehumbert, 1997). In addition, phonological theories assume that constraints are represented in terms of phonological features (e.g., Chomsky & Halle, 1968; Prince & Smolensky, 1993). Phonological theory thus assumes a much richer, abstract representation of phonotactics than the simple segment-based probabilities that are used in models of segmentation. Recent developments in phonological research have led to models which account for the induction of abstract feature-based phonotactic constraints from input data (e.g., Albright, 2009; Hayes & Wilson, 2008; see Chapter 2). For example, Hayes and Wilson (2008) propose a model in which phonotactic constraints are selected from a space of possible constraints. Constraints are selected and incorporated into the grammar according to their accuracy (with respect to the lexicon of the language) and their generality. This dissertation aims to connect constraint induction models with phonotactic models of segmentation. Specifically, the sequential approach that has been used in earlier segmentation models will be adopted (e.g., Cairns et al., 1997; Swingley, 2005), and will be combined with feature-based generalization (e.g., Albright, 2009; Albright & Hayes, 2003). The approach results in abstract sequential constraints which indicate word boundaries in the speech stream.

An attractive property of sequential constraints is that they are learnable using mechanisms that have received much attention in infant studies (e.g., statistical learning, Saffran, Aslin, & Newport, 1996). Building on these findings, knowledge of phonotactics will take the form of constraints stating which sequences should be avoided within words, and which sequences should be preserved (see Chapter 2 for a more elaborate description of the structure of constraints). The relevance of such phonotactic constraints for segmentation was first noted by Trubetzkoy, who referred to these constraints as positive phonematische Gruppensignale and negative phonematische Gruppensignale, respectively (Trubetzkoy, 1936).

In line with earlier models of constraint induction, an encoding of phonotactic constraints in terms of phonological features will be assumed (e.g., Albright, 2009; Hayes & Wilson, 2008). Albright (2009) shows that a model with feature-based generalizations is able to account for human judgments of nonword wellformedness. In his model, the learner abstracts over sequences of segments and extracts the shared phonological feature values to construct feature-based generalizations. Importantly, while segment-based models do not generalize to unattested clusters, the feature-based model has proven to be a good predictor for both attested and unattested clusters. As will be discussed in more detail in Section 1.4, several infant studies provide evidence that is consistent with a feature-based view of phonotactic constraints (Cristià & Seidl, 2008; Maye, Weiss, & Aslin, 2008; Saffran & Thiessen, 2003). The approach taken in this dissertation is to maximally exploit the benefits of sequential constraints and feature-based generalization for the induction of phonotactic constraints,¹ and for the detection of word boundaries in continuous speech.

In sum, this dissertation will investigate whether the statistical learning of biphone constraints from continuous speech can be combined with the construction of more abstract feature-based phonotactic constraints. It will be shown that generalizations over sequential constraints allow the learner to make considerable progress with respect to phonotactic learning and the formation of a mental lexicon. As will become clear in the next section, the approach is supported by findings on infant learning mechanisms. In order to investigate the linguistic relevance of the model, Chapter 4 addresses the issue of whether the model can learn a sequential phonotactic constraint (originally proposed in research on theoretical phonology) restricting the co-occurrence of segments with the same place of articulation (OCP-Place, Frisch, Pierrehumbert, & Broe, 2004; McCarthy, 1988).

1 An interesting characteristic of the model is that the sequential constraints that are induced by the model can give rise to positional (alignment) effects without explicitly encoding word edges in the structure of constraints. See Chapter 4.
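As a schematic illustration of what such feature-based generalization involves, the sketch below (in Python) abstracts over two segments by retaining only the feature values they share; the toy feature specifications and the function name are assumptions made only for this example, and the actual induction procedure (StaGe) and feature set are those defined in Chapter 2 and Appendix B.

# Schematic illustration of feature-based generalization over biphone constraints.
# The toy feature specifications below are assumptions for this example only; the
# dissertation's feature set (Appendix B) and induction procedure (Chapter 2) differ.

FEATURES = {
    "p": {"place": "labial", "manner": "plosive",   "voice": "-"},
    "b": {"place": "labial", "manner": "plosive",   "voice": "+"},
    "f": {"place": "labial", "manner": "fricative", "voice": "-"},
    "v": {"place": "labial", "manner": "fricative", "voice": "+"},
}

def shared_features(seg1, seg2):
    """Keep only the feature values that two segments have in common."""
    return {f: v for f, v in FEATURES[seg1].items() if FEATURES[seg2].get(f) == v}

# Abstracting over the second members of two segment-specific constraints (e.g.,
# against /pf/ and /pv/) yields a natural class: /p/ followed by any labial fricative.
# A constraint stated over this class also covers clusters never observed in the input.
print(shared_features("f", "v"))  # {'place': 'labial', 'manner': 'fricative'}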


1.3.3 Two hypotheses regarding the acquisition of phonotactics by infants

The bottom-up view sketched above can be summarized as follows: Infants induce phonotactic constraints from continuous speech, and use their acquired knowledge of phonotactics for the detection of word boundaries in the speech stream. In this view, emerging phonotactic knowledge informs the learner's search for words in speech. The view implies that at least some phonotactic learning precedes segmentation and word learning. The result is a proto-lexicon based on bottom-up segmentation cues. This view will be referred to as the Speech-Based Learning (SBL) hypothesis:

(1.3) Speech-Based Learning hypothesis: continuous speech → phonotactic learning → segmentation → proto-lexicon

An alternative hypothesis can be formulated in which the dependency between phonotactic learning and word learning is reversed. That is, one could argue that the initial contents of the infant's lexicon are used to bootstrap into phonotactic learning (rather than vice versa). This top-down view resembles the approach taken in earlier models of phonotactic learning. Models of phonotactic learning have typically assumed that phonotactic constraints are the result of abstractions over statistical patterns in the lexicon (Frisch et al., 2004; Hayes & Wilson, 2008). While these studies have not been concerned with the topic of early language development, the developmental view that would follow from such models is that phonotactics is acquired from a proto-lexicon, rather than from the continuous speech stream. In this case the proto-lexicon could, for example, have been formed using alternative segmentation mechanisms which do not involve phonotactics. This alternative hypothesis will be called the Lexicon-Based Learning (LBL) hypothesis:

(1.4) Lexicon-Based Learning hypothesis: continuous speech → segmentation → proto-lexicon → phonotactic learning

This dissertation aims to promote the Speech-Based Learning hypothesis. In order to provide evidence for the SBL hypothesis, the dissertation proposes a learning model which induces phonotactics from continuous speech (Chapter 2). The model is implemented as a computer program, and is based on learning mechanisms that have been shown to be available to infant language learners. The model produces a set of phonotactic constraints which can be used to discover word boundaries in continuous speech. While this model shows the potential of learning phonotactics from continuous speech, it leaves open the question of how plausible this approach is as an account of infant phonotactic learning. The plausibility will be addressed via various empirical studies which aim to provide support for the model. These studies test the viability of the SBL hypothesis by measuring the extent to which it helps the learner in discovering word boundaries (Chapter 3). In other words, does continuous speech contain enough information to learn phonotactics which may serve to segment speech? Computer simulations are conducted which specifically aim to demonstrate the added value of phonological generalizations in segmentation. In addition, a series of simulations examines the extent to which the bottom-up account can learn phonotactic constraints which have been studied in the traditional phonological literature (Chapter 4). Finally, the plausibility of the approach is tested against human data. The bottom-up (speech-based) versus top-down (lexicon-based) learning approaches are compared with respect to their ability to account for human segmentation data (Chapter 4). In addition, a series of experiments with adult participants provides a final test for the induction of novel phonotactics from continuous speech by human learners (Chapter 5). These experiments have been set up in such a way as to maximally reduce possible top-down influences, and thus provide strong support for the claim that human learners can induce phonotactics in a bottom-up fashion.

In order to make claims about infant language learning by means of computational modeling, it is essential to look at learning mechanisms which have been shown to be available to infants. Ignoring such evidence could easily lead to a very efficient learning model, but would not necessarily tell us anything about how infants acquire phonotactic constraints for speech segmentation. Specifically, the implementation of a bottom-up approach to phonotactic learning requires a careful consideration of mechanisms which can operate on continuous speech input. The dissertation thus takes a computational angle by investigating how learning mechanisms which have been shown to be available to infants might interact in the learning of phonotactic constraints from continuous speech. A large amount of psycholinguistic research points towards important roles for two learning mechanisms: statistical learning and generalization. Below, the relevance of these mechanisms for infant language acquisition will be discussed. The mechanisms will form the basis of the computational model that will be presented in Chapter 2.

1.4 learning mechanisms in early language acquisition

Given the input that is presented to the infant, how is the input processed, and under which conditions does the infant derive phonotactic knowledge from the input? From a computational point of view, what is the algorithm underlying infants’ learning behavior? Understanding the mechanisms involved in acquiring linguistic knowledge from input data is perhaps the most challenging part in the study of infant language acquisition. While two main mechanisms have been identified, many questions remain with respect to how these mechanisms operate. Bringing these mechanisms together in a computational model takes us one step closer to understanding infant language acquisition.

1.4.1 Statistical learning

A well-studied learning mechanism is statistical learning. Infants have the ability to compute the probability with which certain linguistic units co-occur. For example, infants compute transitional probabilities of adjacent syllables when listening to speech from an artificial language (Aslin, Saffran, & Newport, 1998; Goodsitt, Morgan, & Kuhl, 1993; Saffran, Aslin, & Newport, 1996). Such probabilistic information may assist the segmentation of continuous speech. Transitional probabilities are generally higher for sequences within words than for sequences between words. For example, in the phrase pretty baby, the transitional probability from pre to ty is higher than the probability from ty to ba. Low probabilities are thus indicative of word boundaries. Saffran, Aslin, and Newport (1996) exposed 8-month-old infants to a 2-minute nonsense stream of continuous speech (e.g., bidakupadotigolabubidaku...) in which the transitional probabilities between syllable pairs had been manipulated. Infants were able to distinguish ‘words’ (containing high probability sequences) from ‘part-words’ (containing a low probability sequence, and thus straddling a statistical boundary), indicating that statistical learning is used to decompose the speech stream into word-like units.

Many subsequent studies have shown that statistical learning is a domain-general learning mechanism, allowing for the learning of dependencies between various representational units. These units can be linguistic in nature (e.g., consonants and vowels, Newport & Aslin, 2004), or can be non-linguistic units, such as musical tones (Saffran, Johnson, Aslin, & Newport, 1999) and visual stimuli (Fiser & Aslin, 2002; Kirkham, Slemmer, & Johnson, 2002). Recent evidence shows that infants can learn statistical dependencies in two directions. Although transitional probabilities are typically calculated as forward conditional probabilities (i.e., the probability of y given x, e.g., Pelucchi, Hay, & Saffran, 2009b), there is evidence that both adult and infant learners can learn backward probabilities (i.e., the probability of x given y, Pelucchi, Hay, & Saffran, 2009a; Perruchet & Desaulty, 2008; Saffran, 2002). These studies indicate that statistical learning is a versatile and powerful mechanism which may play an important role in language acquisition.

Several sources of evidence indicate that statistical learning could also be involved in infants’ learning of phonotactics. One source of evidence concerns a prerequisite for phonotactic learning, namely that infants should be able to perceive segments. A study by Maye, Werker, and Gerken (2002) shows that 6- and 8-month-olds are able to discriminate between phonetic categories after exposure to bimodal distributions of phonetic variation. While infants may not yet have acquired the full segment inventory, the ability to perceive at least some phonetic categories at a young age allows for the possibility that infants learn the co-occurrence probabilities of such categories. More direct evidence for the role of statistical learning in acquiring phonotactics comes from studies showing that 9-month-old infants are sensitive to native language probabilistic phonotactics (Jusczyk et al., 1994; Mattys & Jusczyk, 2001b). These studies suggest that infants are capable of learning segment co-occurrence probabilities. The ability to learn statistical dependencies between segments has also been demonstrated by White, Peperkamp, Kirk, and Morgan (2008), who found that 8.5-month-old infants’ responses in learning phonological alternations were driven by segment transitional probabilities.

It should be mentioned that statistical learning by itself is not a full theory of language acquisition (e.g., Soderstrom, Conwell, Feldman, & Morgan, 2009; Johnson & Tyler, 2010). While the calculation of co-occurrence probabilities can provide a useful means for detecting linguistic structures in the input, it does not lead to the more abstract linguistic representations that are commonly assumed in linguistic theories of language acquisition (e.g., Chomsky, 1981; Prince & Tesar, 2004; Tesar & Smolensky, 2000; see also Jusczyk, Smolensky, & Allocco, 2002). Such linguistic frameworks assume that there is abstract linguistic knowledge which traces back to the learner’s innate endowment (Universal Grammar). Studies of infant learning mechanisms, however, indicate that there may be a second learning mechanism which provides a means to learn abstract knowledge without invoking the innateness hypothesis. It will be argued that, at least for the case of phonotactic learning, a generalization mechanism can join forces with the statistical learning mechanism to allow for the induction of more abstract constraints. Such a mechanism thus may provide a crucial addition to statistical approaches to language acquisition, allowing for an emergentist view that maintains the assumption of linguistic abstractness (e.g., Albright & Hayes, 2003; Hayes & Wilson, 2008).
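To make the statistics discussed in this section concrete, the sketch below computes forward and backward transitional probabilities over a toy syllable stream built from the invented words bidaku, padoti, and golabu. It is a minimal illustration, not part of the dissertation’s software; the function name and the toy stream are assumptions made for the example.

```python
# A minimal sketch (not the dissertation's software): forward transitional
# probability P(y | x) and backward transitional probability P(x | y),
# estimated from counts over adjacent syllable pairs.
from collections import Counter

def transitional_probabilities(syllables):
    """Map each attested adjacent pair (x, y) to its forward and backward TP."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    left_counts = Counter(syllables[:-1])    # how often x occurs as left member
    right_counts = Counter(syllables[1:])    # how often y occurs as right member
    forward = {(x, y): c / left_counts[x] for (x, y), c in pair_counts.items()}
    backward = {(x, y): c / right_counts[y] for (x, y), c in pair_counts.items()}
    return forward, backward

# Toy stream built from the invented words 'bidaku', 'padoti', 'golabu':
# within-word pairs recur, between-word pairs vary.
stream = ['bi', 'da', 'ku', 'pa', 'do', 'ti', 'go', 'la', 'bu',
          'bi', 'da', 'ku', 'go', 'la', 'bu', 'pa', 'do', 'ti']
fwd, bwd = transitional_probabilities(stream)
print(fwd[('bi', 'da')])  # within-word pair: 1.0
print(fwd[('ku', 'pa')])  # between-word pair: 0.5 (ku is also followed by go)
```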

1.4.2 Generalization

In addition to statistical learning, there appears to be a role for a different form of computation in phonotactic acquisition. This learning mechanism allows the infant to generalize beyond observed (co-)occurrence probabilities to new, unobserved instances.² Generalizations learned by infants can take on the form of phonetic categories (e.g., Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992; Maye et al., 2002, 2008; Werker & Tees, 1984), phonotactic patterns (e.g., Chambers, Onishi, & Fisher, 2003; Saffran & Thiessen, 2003), lexical categories (e.g., Gómez, 2002; Gómez & Lakusta, 2004; Gómez & Maye, 2005), and artificial grammars (e.g., Gómez & Gerken, 1999; Marcus, Vijayan, Rao, & Vishton, 1999).

Gómez and Gerken (1999) show that infants are able to generalize over observed training utterances to new, unobserved word strings. In their study, 12-month-old infants were exposed to word strings generated by a finite-state grammar. After a familiarization of only 2 minutes, infants were able to distinguish between grammatical and ungrammatical items. Importantly, one of their experiments used test items that had not occurred during familiarization. As a consequence, transitional probabilities between word pairs were zero for both grammatical and ungrammatical sequences. Infants generalized from the word strings heard during familiarization (e.g., fim sog fim fim tup) to word strings from a novel vocabulary (e.g., pel tam pel pel jic), indicating that they had learned not just the specific word pairs, but also more abstract structures (e.g., Berent, Marcus, Shimron, & Gafos, 2002; Marcus et al., 1999; cf. Altmann, 2002). These results suggest that generalization may play a major role in infant language acquisition, in addition to statistical learning.

Saffran and Thiessen (2003) showed that 9-month-old infants can induce phonotactic patterns that are more general than the occurrence patterns of the specific phonological segments to which they were exposed. During a pattern induction phase, the infant was familiarized with the phonotactic regularity. Familiarization was followed by a segmentation phase, in which infants could segment novel words from a continuous speech stream by employing the phonotactic pattern to which they had been familiarized. Finally, the test phase served to determine whether the infant indeed was able to distinguish novel test items which conformed to the phonotactic pattern from test items which did not conform to the pattern. Infants acquired a phonotactic generalization about the positional restrictions on voiced and voiceless stops after a brief training period. In contrast, they could not learn patterns of segments which were not phonetically similar. These results indicate that there is more to phonotactic acquisition than the learning of constraints on the co-occurrences of specific segments. It seems that infants are able to abstract over the similarity between segments to construct phonotactic generalizations.

2 The term ‘generalization’ will be used throughout the dissertation as a broad term to indicate the learner’s ability to process items based on similarity to familiar items. The question of whether infants store generalizations as abstract representations, or, for example, recompute generalizations repeatedly from stored exemplars during online speech processing is considered an open issue.


Evidence for infants’ generalization on the basis of phonetic similarity (voicing, manner of articulation, etc.) has been reported in various studies (e.g., Jusczyk, Goodman, & Baumann, 1999; White et al., 2008; Maye et al., 2008). Little is known, however, about how phonotactic generalizations are represented by infants. Traditionally, generative phonologists have assumed that generalizations can be stated in terms of abstract features (Chomsky & Halle, 1968). Several recent studies have indeed argued that infants form generalizations that are abstract at the level of the feature. For example, Maye et al. (2008) show that exposing 8-month-old infants to a bimodal distribution of Voice Onset Time (VOT) for one place of articulation (e.g., da/ta contrast for dentals) facilitates the discrimination of a voicing contrast for a different place of articulation (e.g., ga/ka contrast for velars). This finding is consistent with the view that infants construct a generalization at the level of an abstract feature voice as a result of exposure to a specific contrast along the VOT dimension.

In a study on phonotactic learning, Cristià and Seidl (2008) exposed 7-month-old infants to CVC words from an artificial language that either had onsets with segments that formed a natural class (plosives + nasals, which share the specification ‘−continuant’) or onsets with segments that did not form a natural class (fricatives + nasals, which do not form a coherent class). During the test phase, infants listened to words that contained novel plosives and fricatives, not heard during training. When trained on the natural class, infants distinguished between words that did or did not conform to the phonotactic structure of the training language. In contrast, infants trained on the incoherent class were not able to distinguish legal from illegal test items. In a control experiment, they showed that the difference in learning difficulty was not due to an inherent difference between plosives and fricatives. These results provide evidence that infants learn phonotactic constraints which are more general than the specific segments that were used during training, and support the view that infants represent phonotactic generalizations at the level of phonological features.

These studies raise the question of how such features could become available to infants. Specifically, features could either be innate, or learned from exposure to the native language. It should be noted that features are not necessarily innate (e.g., Mielke, 2008). Abstract phonological features have acoustic and perceptual correlates in the speech signal (e.g., Stevens, 2002), which is an important prerequisite if one wants to argue that features can be learned from input data. In addition, while the traditional function of features in phonological theory has been to distinguish the meanings of words in the lexicon (e.g., the words pet and bet are distinguished by the feature voice), attempts have been made to induce abstract features from raw acoustic input data (Lin & Mielke, 2008). This indicates the possibility of acquiring features before the lexicon is in place.
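As an illustration of the natural-class reasoning behind these findings, the sketch below checks whether a set of segments is picked out exactly by its shared feature values. The feature inventory is invented and heavily simplified for the example; it is not the feature system used in the simulations reported later in this dissertation.

```python
# A minimal sketch of natural-class membership, using an invented and heavily
# simplified feature inventory (values are for illustration only).

FEATURES = {
    'p': {'continuant': '-', 'nasal': '-'},
    't': {'continuant': '-', 'nasal': '-'},
    'm': {'continuant': '-', 'nasal': '+'},
    'n': {'continuant': '-', 'nasal': '+'},
    'f': {'continuant': '+', 'nasal': '-'},
    's': {'continuant': '+', 'nasal': '-'},
}

def shared_spec(segments):
    """Feature values shared by all segments in the set."""
    feats = [FEATURES[s] for s in segments]
    return {f: v for f, v in feats[0].items()
            if all(other[f] == v for other in feats)}

def is_natural_class(segments):
    """True if the shared specification picks out exactly this set of segments."""
    spec = shared_spec(segments)
    if not spec:
        return False
    matching = {s for s, feats in FEATURES.items()
                if all(feats[f] == v for f, v in spec.items())}
    return matching == set(segments)

print(is_natural_class({'p', 't', 'm', 'n'}))  # True: all and only these are [-continuant]
print(is_natural_class({'f', 's', 'm', 'n'}))  # False: no shared specification selects just these
```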

1.4.3 Towards a unified account of infant learning mechanisms

Statistical learning allows the learner to accumulate frequency data (conditional probabilities) over observed input. A second mechanism, generalization, allows the learner to abstract away from the observed input, leading to the formation of categories, patterns, and grammars. While the importance of both learning mechanisms for language acquisition has been widely acknowledged (Endress & Mehler, 2009; Gómez & Gerken, 1999; Marcus et al., 1999; Peperkamp, Le Calvez, Nadal, & Dupoux, 2006; Toro, Nespor, Mehler, & Bonatti, 2008; White et al., 2008), surprisingly little is known about how these two mechanisms, statistical learning and generalization, interact. For example, how can statistical learning provide a basis for the construction of generalizations? How do generalizations affect the probabilistic knowledge of the learner? Do the generalizations in any way reflect abstract constraints that have been studied in the field of linguistics? Explicit descriptions and analyses of interactions between statistical learning and generalization would greatly enhance our understanding of language acquisition.

In addition, while infants’ capacity to use probabilistic phonotactics in speech segmentation has been demonstrated (Mattys & Jusczyk, 2001b), it is not clear whether infants also use phonotactic generalizations to discover words in continuous speech. Although the study by Saffran and Thiessen explores this possibility by presenting infants with a segmentation task after familiarization, the authors themselves mention that the infants may have applied the patterns that were induced during familiarization to the test items directly, i.e. regardless of the segmentation task (Saffran & Thiessen, 2003, p. 487). Infants’ use of phonotactic generalizations in speech segmentation thus remains to be demonstrated. This dissertation addresses this issue indirectly by determining the potential use of such generalizations. That is, the benefits of both segment-specific and more abstract, feature-based constraints in speech segmentation will be assessed. The dissertation thus aims at providing an account of phonotactic learning which includes a role for both phonotactic probabilities and phonological similarity. The research presented in this dissertation thereby offers insight into the learning mechanisms and segmentation strategies that play a role in language acquisition.

A final note concerns the input on which the two mechanisms operate. It should be stressed that, while the learning of co-occurrence probabilities from continuous speech has been widely acknowledged, the learning of linguistic generalizations from continuous speech has not previously been proposed. In fact, there have been studies arguing that generalizations cannot be learned from continuous speech, and that word boundaries are a prerequisite for the learning of generalizations (Peña, Bonatti, Nespor, & Mehler, 2002; Toro et al., 2008). One of the goals of the dissertation is to explore whether phonotactic generalizations can be based on sublexical regularities in continuous speech input, rather than on word forms in the lexicon.

1.5 research questions and scientific contribution

The main issue in the dissertation is:

• How do language learners induce phonotactic constraints for speech segmentation?

This issue will be addressed by proposing a computational model (based on the Speech-Based Learning hypothesis), and subsequently providing evidence for the model using simulations involving computational learners, and psycholinguistic experiments involving human learners. The research questions that form the core of the dissertation are the following:

1. Computational learners

• Can phonotactic constraints be induced from continuous speech using the mechanisms of statistical learning and generalization?

– What is the role of statistical learning in the induction of phonotactic constraints?
– What is the role of generalization in the induction of phonotactic constraints?
– To what extent do induced constraints resemble the sequential constraints that have appeared in the traditional phonological literature?

• How do the different learning mechanisms affect the segmentation of continuous speech?

– Do feature-based generalizations improve the learner’s ability to detect word boundaries in continuous speech?

2. Human learners

• Do human learners induce phonotactics from continuous speech?

– Do adult participants learn specific (segment-based) constraints from continuous speech?


– Do adult participants learn abstract (feature-based) constraints from continuous speech?

• What kind of phonotactic knowledge is used by human learners for the segmentation of continuous speech?

– Do adult learners use specific constraints, abstract constraints, or both?
– Do adult learners use constraints induced from the lexicon, or constraints induced from continuous speech?

Scientific contribution

The dissertation contributes to the general field of cognitive science by taking an approach to language acquisition which incorporates ideas from computational linguistics, theoretical linguistics, and psycholinguistics. That is, a computational model of phonotactic learning is presented which is supported by psycholinguistic findings, and which produces phonotactic constraints which have a linguistic interpretation. The dissertation thus combines and compares the outcomes of computer simulations, linguistic analyses, and psycholinguistic experiments. The contribution of the dissertation to each of these different perspectives on language acquisition is briefly specified below.

(i) Contribution to computational linguistics

A computational model of speech segmentation is presented, which can be applied to transcribed speech corpora. Several segmentation models have been proposed which make use of probabilistic phonotactics (e.g., Cairns et al., 1997; Brent, 1999a). The current study complements such studies by adding a generalization component to the statistical learning of phonotactic constraints (Chapter 2), and by quantifying the benefits of abstract phonotactic constraints for speech segmentation (Chapter 3). In addition, the simulations reported in this dissertation are valuable since they involve fairly accurate representations of spoken language. While earlier studies typically used orthographic transcriptions that were transformed into canonical transcriptions using a phonemic dictionary, the simulations reported in this dissertation involve speech which has been transcribed to a level which includes variations that are typically found in the pronunciation of natural speech, such as acoustic reductions and assimilations.

In order to allow for replication of the findings, and to encourage further research on phonotactic learning and speech segmentation, a software package has been created, which has been made available online: http://www.hum.uu.nl/medewerkers/f.w.adriaans/resources/


The software can be used to run phonotactic learning and speech segmentation simulations, as described throughout this dissertation. In addition, it allows the user to train models on new data sets, with the possibility to use other statistical measures, thresholds, and a different inventory of phonological segments and features. The package implements several learning models (StaGe, transitional probability, observed/expected ratio) and segmentation models (OT segmentation, threshold-based segmentation, trough-based segmentation). The details of these models are explained in Chapters 2 and 3. The software package is accompanied by a manual which explains the user-defined parameters, and explains how the model can be applied to new data sets (see Appendix A).

(ii) Contribution to theoretical linguistics

The dissertation provides a learnability account of phonological constraints, rather than assuming innate constraints. While most previous accounts of phonotactic learning defined the learning problem as finding the appropriate ranking for a universal set of constraints (e.g., Tesar & Smolensky, 2000; Prince & Tesar, 2004), the learning model presented here learns the constraints themselves, as well as their rankings. The research thereby adds to a growing body of research which aims at minimizing the role of Universal Grammar in phonological acquisition, while still acknowledging the existence of abstract representations and constraints (‘constraint induction’; e.g., Hayes, 1999; Albright & Hayes, 2003; Hayes & Wilson, 2008). The induction approach is demonstrated in a case study on the induction of OCP-Place, a phonological constraint restricting the co-occurrence of consonants sharing place of articulation across intervening vowels (Chapter 4).

An important difference with earlier models of constraint induction is that the proposed model is unsupervised. Previous models have been based on inducing the phonotactic structure of words from actual forms (lemmas, onsets) in the lexicon (e.g., Albright, 2009; Hayes & Wilson, 2008), which is a case of supervised learning. Since it is assumed that the learner has not yet acquired a lexicon (or that the learner’s lexicon is too small to support the learning of phonotactic constraints), the learner has no way of determining how good, or useful, the resulting generalizations will be. In contrast, supervised models induce phonotactic constraints through evaluation of the accuracy of generalizations with respect to the lexicon. The unsupervised approach results in a constraint induction model for the initial stages of language acquisition.


(iii) Contribution to psycholinguistics

Focusing on human speech-based learning capacities, the dissertation investigates whether human learners can induce novel phonotactic constraints from continuous speech (Chapter 5). This issue is addressed in a series of artificial language learning (ALL) experiments with adult participants. The learning conditions of these experiments present a greater challenge to participants than in earlier studies. The ALL experiments contribute to earlier phonotactic learning experiments (e.g., Onishi, Chambers, & Fisher, 2002) by presenting participants with a continuous speech stream, rather than with isolated words. The experiments contribute to earlier speech-based learning experiments (e.g., Bonatti, Peña, Nespor, & Mehler, 2005; Newport & Aslin, 2004) by presenting participants with artificial languages that display relatively small differences in ‘within-word’ and ‘between-word’ consonant transitional probabilities. In addition, the languages contain randomly inserted vowels, and thus do not contain reoccurring word forms. Learners’ ability to learn the phonotactic structure of these languages is evaluated by testing for generalization to novel words (which had not occurred in the familiarization stream). A side issue of interest is that the experiments add to a growing body of experimental research demonstrating a large influence of the native language phonotactics on the segmentation of a novel speech stream (Boll-Avetisyan & Kager, 2008; Finn & Hudson Kam, 2008; Onnis, Monaghan, Richmond, & Chater, 2005).

1.6 overview of the dissertation

Chapter 2: A computational model of phonotactic learning and segmentation

This chapter introduces the OT segmentation model, which is a modified version of Optimality Theory, and which is used for regulating interactions between phonotactic constraints in speech segmentation. The constraint induction model, StaGe, provides an account of how language learners could induce phonotactic constraints from continuous speech. The model assumes that the learner induces biphone constraints through the statistical analysis of segment co-occurrences in continuous speech (Frequency-Driven Constraint Induction, FDCI), and generalizes over phonologically similar biphone constraints to create more general, natural class-based constraints (Single-Feature Abstraction, SFA). FDCI employs two statistical thresholds (on the observed/expected ratio) to distinguish between high and low probability phonotactics. High probability biphones trigger the induction of a contiguity constraint (Contig-IO(xy)) which states that the biphone should be kept intact during speech processing. Conversely, low probability biphones result in the induction of a markedness constraint (*xy), which states that the biphone should be broken up through the insertion of a word boundary. No constraints are induced for biphones that have neutral probability (more formally, observed ≈ expected).

SFA inspects the similarity between biphone constraints, which is quantified as the number of shared phonological feature values. In case of a single feature difference between constraints, the learner constructs a more general constraint which disregards this feature. The result is that the generalizations cover sequences of natural classes rather than sequences of specific segments. The main benefit for the learner is that the abstract constraints make a segmentation statement about biphones of neutral probability. Phonological similarity thus complements statistical learning in speech segmentation. As a result of (over-)generalization, conflicts arise between markedness and contiguity constraints. Such conflicting segmentation predictions are resolved by numerically ranking the constraints and evaluating constraint violations according to the principle of strict domination. A worked-out example illustrates how StaGe and the OT segmentation model together can provide a mapping from continuous speech to segmented speech through the induction of phonotactic constraints. The chapter shows that phonotactic constraints can be learned from continuous speech in a psychologically motivated and formally explicit way. The examples show that the model has a straightforward linguistic interpretation.
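To give a concrete flavor of Single-Feature Abstraction as summarized above, the sketch below generalizes over two hypothetical biphone constraints that differ in a single feature value. The feature bundles and the example constraints *pt and *kt are invented for illustration and are not taken from the simulations themselves.

```python
# A minimal sketch of Single-Feature Abstraction over two hypothetical
# biphone constraints; feature bundles are invented for the illustration.

def differing_keys(spec_a, spec_b):
    """(position, feature) keys on which two constraint descriptions disagree."""
    keys = set(spec_a) | set(spec_b)
    return [k for k in keys if spec_a.get(k) != spec_b.get(k)]

def single_feature_abstraction(constraint_a, constraint_b):
    """If two constraints of the same category differ in exactly one feature
    value, return a more general constraint that leaves that feature
    unspecified; otherwise return None."""
    diff = differing_keys(constraint_a, constraint_b)
    if len(diff) != 1:
        return None
    generalized = dict(constraint_a)
    del generalized[diff[0]]                 # abstract over the single difference
    return generalized

# Hypothetical markedness constraints *pt and *kt as (position, feature) bundles:
pt = {(1, 'place'): 'labial', (1, 'continuant'): '-',
      (2, 'place'): 'coronal', (2, 'continuant'): '-'}
kt = {(1, 'place'): 'dorsal', (1, 'continuant'): '-',
      (2, 'place'): 'coronal', (2, 'continuant'): '-'}
print(single_feature_abstraction(pt, kt))
# First-position place is dropped: the resulting constraint bans any
# [-continuant] segment followed by a coronal [-continuant] segment,
# i.e. a natural-class-based generalization over the two specific constraints.
```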

Chapter 3: Simulations of segmentation using phonotactics

This chapter addresses the ability of StaGe to detect word boundaries in continuous speech. Specifically, the question whether feature-based generalizations improve the segmentation performance of the learner is addressed in a series of computer simulations. The simulations address the induction of native language (henceforth, L1) Dutch constraints on biphone occurrences. Different learning models are evaluated on transcriptions of Dutch continuous speech. The crucial comparison is between StaGe, which acknowledges a role for both statistical learning and generalization, and segmentation models which rely solely on statistical learning. Experiment 1 compares the segmentation performance of StaGe to statistical learning models that use threshold-based segmentation. Experiment 2 tests a wide range of statistical thresholds for both StaGe and the statistical learning models. Experiment 3 compares the segmentation performance of StaGe to statistical learning models that use trough-based segmentation. All three experiments zoom in on the complementary roles of statistical learning and generalization, addressing both the need for statistical thresholds and the need for generalization. Finally, Experiment 4 shows how the model’s segmentation performance develops as a function of input quantity.

The main finding of Experiments 1-3 is that StaGe outperforms purely statistical models, indicating a potential role for phonotactic generalizations in speech segmentation. The experiments also show that the induction thresholds are necessary to create a good basis for the construction of phonotactic generalizations. That is, generalization only improves segmentation performance when a reliable distinction between high and low probability phonotactics is made. Experiment 4 shows that the learning of contiguity constraints precedes the learning of markedness constraints. That is, the model first learns where not to put boundaries, and only later learns where to put them. This finding is in line with developmental studies showing that infants under-segment: They learn larger (proto-)words first, which are later broken down into smaller word chunks. Taken together, the experiments provide support for the Speech-Based Learning hypothesis, in which phonotactic learning facilitates the development of the mental lexicon, rather than vice versa. The simulations show that segmentation benefits from both abstract and specific constraints.

Chapter 4: Modeling OCP-Place and its effect on segmentation

This chapter addresses two issues: (i) To what extent can the model induce constraints which have been proposed in theoretical phonology? (ii) To what extent can the model account for human segmentation behavior? The chapter provides an important contribution to the dissertation by looking at the linguistic relevance of the constraints that are learned through generalization, and by looking at the psychological plausibility of those constraints. Specifically, it is investigated whether StaGe can provide a learnability account of an abstract phonotactic constraint which has been shown to affect speech segmentation: OCP-Place. OCP-Place is a phonological constraint which states that sequences of consonants sharing place of articulation should be avoided. The chapter connects to the work of Boll-Avetisyan and Kager (2008) who show that OCP-Place affects the segmentation of continuous artificial languages by Dutch listeners.

Due to the underrepresentation of several specific labial-labial pairs (across vowels), the model induces feature-based generalizations which cause a general avoidance of labial-labial pairs. However, the constraint set does not exactly mimic the predictions of OCP-Place. Two studies address how well the learned constraint set reflects Dutch listeners’ phonotactic knowledge of non-adjacent consonant dependencies. It was found that generalization improves the fit to the human segmentation data of Boll-Avetisyan and Kager (2008), as compared to models that rely exclusively on consonant probabilities (Experiment 1). Since the approach also outperforms a single, pre-defined OCP-Place constraint, the results suggest that the success of the approach can be attributed to the mix of specific and abstract constraints. In addition, it was found that a continuous-speech-based learner fits the human data comparably well to a learner based on word types (Experiment 2). Again, the success of both models can be attributed to the mixture of specific and general constraints. Interestingly, a token-based learner fails, regardless of the specific threshold configuration that is used. The chapter provides the second piece of evidence (in addition to the simulations in Chapter 3) that feature-based generalization plays a role in segmentation. More specifically, the simulations of human segmentation data provide additional evidence that segmentation involves both abstract and specific constraints. It is again demonstrated that continuous utterances could be used by learners as a valuable source of input for phonotactic learning. The SBL hypothesis is thus supported by an account of human segmentation behavior.

Chapter 5: The induction of novel phonotactics by human learners

Building on the assumptions and findings from Chapter 4, the SBL hypothesis is tested with human participants. Specifically, the chapter investigates whether human learners can learn novel phonotactic structures from a continuous speech stream. Using the artificial language learning (ALL) approach, the chapter provides a more direct test for the psychological plausibility of the phonotactic learning approach. In addition, the chapter provides a test for the assumption that learners can induce phonotactic constraints on consonants across intervening vowels from continuous speech. Continuous artificial languages are constructed in such a way that they exhibit both statistical and phonological structure. Experiment 1 is an attempt to extend earlier studies to a case involving more word frames, more words, and smaller differences in TP, but with greater phonological similarity. The test phase focuses on the statistical learning component. Experiment 2 drops the notion of ‘word’ altogether (by using randomly selected vowels), thereby focusing on purely phonotactic learning, rather than word learning. The experiment tests for generalization to novel ‘words’ with the same phonotactic (consonant) structure. Finally, Experiment 3 tests whether human participants perform feature-based generalization on top of the statistical learning of consonant dependencies. The experiment tests for generalization to novel segments from the same natural class.


Experiment 1 shows a large influence of L1 phonotactics on artificial language segmentation. While this finding is in line with the findings from Chapter 4, this L1 influence hinders the induction of novel phonotactic constraints: no learning effect is found. Experiment 2 shows that, when carefully controlling for L1 factors, human learners can indeed learn novel phonotactics from continuous speech. The significant learning effect gives direct support for the SBL hypothesis: It shows phonotactic learning from continuous speech (i.e., without referring to a lexicon). In addition, it provides additional evidence for the learning of phonotactic constraints across intervening vowels. Furthermore, the experiment demonstrates learners’ capacity to generalize phonotactics, learned from continuous speech, to novel words. This indicates that the acquired phonotactic knowledge is represented outside of the mental lexicon, and is therefore not merely a by-product of word forms in the lexicon. Whereas the results of Chapter 4 could also be explained by a type-based learner, the results in Experiment 2 cannot be attributed to lexicon-based learning. Experiment 3, however, failed to provide evidence for feature-based generalization to novel segments. Possible explanations for this null result are discussed.

Chapter 6: Summary, discussion, and conclusions

The final chapter summarizes the main points of the dissertation, critically assesses the findings, and provides suggestions for future work in this area. The main accomplishment of the dissertation is the demonstration that phonotactics can be learned from continuous speech by combining mechanisms that have been shown to be available to infant and adult language learners. Together, the mechanisms of statistical learning and generalization allow for the induction of a set of phonotactic constraints with varying levels of abstraction, which can subsequently be used to successfully predict the locations of word boundaries in the speech stream. In doing so, the model provides a better account of speech segmentation than models that rely solely on statistical learning. This result was found consistently throughout the dissertation, both in computer simulations and in simulations of human segmentation data. With respect to human learning capacities, the dissertation shows that participants can learn novel segment-based phonotactic constraints from the continuous speech stream of an artificial language. By combining computational modeling with psycholinguistic experiments, the dissertation contributes to our understanding of the mechanisms involved in language acquisition.

2 A COMPUTATIONAL MODEL OF PHONOTACTIC LEARNING¹ AND SEGMENTATION

In this chapter a computational model is proposed for the induction of phonotactics, and for the application of phonotactic constraints to the segmentation of continuous speech. The model uses a combination of segment-based statistical learning and feature-based generalization in order to learn phonotactic constraints from transcribed utterances of continuous speech. The constraints are interpreted in a segmentation model based on the linguistic framework of Optimality Theory (OT). In Section 2.1, an overview is given of previous work on computational modeling of speech segmentation and phonotactic learning. Section 2.2 presents the OT segmentation model, which formalizes how phonotactic constraints can be used to detect word boundaries in continuous speech. The learning model, StaGe, is presented in Section 2.3. A worked-out example in Section 2.4 illustrates the types of constraints that are learned by the model. Finally, some implications of the model are discussed in Section 2.5.²

2.1 introduction

Computational models of speech segmentation are typically trained and tested on transcribed utterances of continuous speech. The task of the model is either to learn a lexicon directly, through the extraction of word-like units, or to learn to predict when a word boundary should be inserted in the speech stream. Here the latter task will be discussed, focusing on models that learn phonotactics in order to detect word boundaries in continuous speech. (For more general overviews of computational models of segmentation, see Batchelder, 2002; Brent, 1999b.) Various segmentation models have been proposed that make use of either phonotactics based on utterance boundaries (Brent & Cartwright, 1996), or phonotactics based on sequence probabilities (e.g., Cairns et al., 1997).

1 Sections of this chapter were published in: Adriaans, F., & Kager, R. (2010). Adding generalization to statistical learning: The induction of phonotactics from continuous speech. Journal of Memory and Language, 62, 311-331.
2 The model can be downloaded from: http://www.hum.uu.nl/medewerkers/f.w.adriaans/resources/


2.1.1 Models of speech segmentation using phonotactics

Brent and Cartwright (1996) propose that phonotactic constraints can be learned through inspection of consonant clusters that appear at utterance boundaries. The model is based on the observation that clusters at the edges of utterances are necessarily also allowed at the edges of words. Their segmentation model evaluates candidate utterance parsings using a function based on Minimum Representation Length (MRL). The phonotactic constraints act as a filter to eliminate utterance parsings which would produce phonotactically ill-formed words. A disadvantage of this approach is that it is based on categorical phonotactics. That is, a cluster is either allowed or not, based on whether it occurs at least once at an utterance boundary. As a consequence, the approach is rather vulnerable to occurrences of illegal clusters at utterance edges (which may occur as a result of acoustic reductions). In general, categorical phonotactics fails to make a prediction in cases of ambiguous clusters in the speech stream, since such clusters have multiple phonotactically legal interpretations. Moreover, infants have been demonstrated to use sequential phonotactic probabilities to segment speech (Mattys et al., 1999; Mattys & Jusczyk, 2001b). An approach based on categorical phonotactics derived from utterance boundaries therefore at best only provides a partial explanation of infants’ knowledge of phonotactics. (See Daland, 2009, for a more promising, probabilistic approach to using utterance boundaries for segmentation.)

Models that rely on sequential phonotactic cues either explicitly implement segment co-occurrence probabilities (e.g., Brent, 1999a; Cairns et al., 1997), or use neural networks to learn statistical dependencies (Cairns et al., 1997; Christiansen, Allen, & Seidenberg, 1998; Elman, 1990). Two different interpretations exist with respect to how co-occurrence probabilities affect speech segmentation (Rytting, 2004). Saffran, Newport, and Aslin (1996) suggest that word boundaries are hypothesized at troughs in transitional probability. That is, a word boundary is inserted when the probability of a bigram is lower than those of its neighboring bigrams. This trough-based segmentation strategy thus interprets bigram probabilities using the context in which the bigram occurs. Computational models have shown this interpretation to be effective in segmentation (e.g., Brent, 1999a). A trough-based approach, however, is not capable of extracting unigram words, since such words would require two adjacent local minima (Rytting, 2004; Yang, 2004). The implication is that the learner is unable to discover monosyllabic words (Yang, 2004, for syllable-based statistical learning) or single-phoneme words (Rytting, 2004, for segment-based statistical learning).

Studies addressing the role of probabilistic phonotactics in infant speech segmentation (e.g., Mattys & Jusczyk, 2001b) indicate that the probability of a bigram can also affect speech segmentation directly, i.e. regardless of neighboring bigrams. The interpretation of probabilities in isolation has been modeled either by inserting boundaries at points of low probability (Cairns et al., 1997; Rytting, 2004), or by clustering at points of high probability (Swingley, 2005). In both cases the learner relies on a threshold on the probabilities in order to determine when a boundary should be inserted, or when a cluster should be formed. This threshold-based segmentation strategy gives rise to a new problem for the learner: How can the learner determine what this threshold should be? Although the exact value of such a statistical threshold remains an open issue, it will be argued in Section 2.3 that a classification of bigrams into functionally distinct categories can be derived from the statistical distribution.

While the relevance of segment co-occurrence (‘biphone’) probabilities for segmentation is well-established, the potential relevance of more abstract, feature-based constraints in segmentation has not been explored in previous segmentation models.³ The model presented in this chapter complements previous modeling efforts by adding a generalization component to the statistical learning of phonotactic constraints. The existence of such general (‘abstract’) constraints is widely accepted in the field of phonology. However, the link to psycholinguistically motivated learning mechanisms, such as statistical learning and generalization, has not been made, and the learning of phonotactic constraints from continuous speech has not been explored. In fact, constraints in linguistic frameworks such as Optimality Theory (Prince & Smolensky, 1993) are typically assumed to be innate (as a part of Universal Grammar). An increasing number of phonological studies, however, have assumed that constraints are not innate, but rather that constraints have to be learned (Albright, 2009; Boersma, 1998; Hayes, 1999; Hayes & Wilson, 2008). In this view, learning constraints from experience is an essential part of language acquisition.
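The sketch below contrasts the two interpretations just discussed, trough-based and threshold-based segmentation, given some biphone probability function. The prob(x, y) interface and the fixed threshold are assumptions made for the illustration; this is not the implementation evaluated in Chapter 3.

```python
# A minimal sketch (assumed interfaces, not the Chapter 3 implementation) of
# trough-based versus threshold-based boundary insertion over biphone
# probabilities; prob(x, y) stands for any biphone probability estimate.

def trough_based_boundaries(segments, prob):
    """Insert a boundary inside biphone (x, y) when its probability is a local
    minimum, i.e. lower than both neighbouring biphone probabilities."""
    boundaries = set()
    for i in range(1, len(segments) - 2):
        left = prob(segments[i - 1], segments[i])
        here = prob(segments[i], segments[i + 1])
        right = prob(segments[i + 1], segments[i + 2])
        if here < left and here < right:
            boundaries.add(i + 1)            # boundary falls before segments[i + 1]
    return boundaries

def threshold_based_boundaries(segments, prob, threshold):
    """Insert a boundary inside any biphone whose probability falls below a
    fixed threshold, regardless of the neighbouring biphones."""
    return {i + 1 for i in range(len(segments) - 1)
            if prob(segments[i], segments[i + 1]) < threshold}
```

Note that the trough-based strategy can never posit boundaries in two adjacent biphones, which is why it cannot isolate single-segment words; the threshold-based strategy avoids that limitation but shifts the burden to choosing the threshold itself.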

2.1.2 Models of constraint induction

Recent work in phonology has focused on the induction of phonotactic constraints, either from articulatory experience (Hayes, 1999), or from statistical regularities in the lexicon (e.g., Hayes & Wilson, 2008). These models reduce Universal Grammar to a given set of phonological features, and a fixed format of phonotactic constraints. The constraints themselves are induced by training the model on input data. Hayes (1999) proposes that learners can construct phonological constraints on the basis of experienced phonetic difficulty in perceiving and producing sounds. That is, the child gets feedback from her own production/perception apparatus. The experience is reflected in a map of phonetic difficulty which is used by the learner to assess possible phonotactic constraints. That is, the learner constructs a space of logically possible constraints with features as primitive elements. These candidate constraints are then assessed for their degree of phonetic grounding. The result is a set of phonetically grounded markedness constraints which can be incorporated into the formal constraint interaction mechanism of Optimality Theory (Prince & Smolensky, 1993).

Other models have been based on the idea that phonotactic patterns are abstractions over statistical patterns in the lexicon (Albright, 2009; Frisch et al., 2004; Hayes & Wilson, 2008; Pierrehumbert, 2003). Hayes and Wilson (2008) propose a model in which phonotactic constraints are selected from a constraint space (provided by Universal Grammar), and are assigned weights according to the principle of maximum entropy. Specifically, constraints are selected according to the accuracy of the constraint with respect to the lexicon of the native language, and according to the generality of the constraint. It should be noted that the generalizations in the models by Hayes (1999) and Hayes and Wilson (2008) are constructed before the learner is presented with any input. That is, all logically possible generalizations are a priori presented as candidate constraints.

A different approach to the construction of generalizations is taken in models by Albright and Hayes (2002, 2003) and Albright (2009). These models gradually build up more abstract constraints through Minimal Generalization (MG) over processed input data, rather than basing generalizations on a pre-given set of constraints. The learner abstracts over sequences of segments and extracts the shared phonological features to construct a new generalization. The procedure is minimal in the sense that only shared feature values are used for the construction of generalizations, thus avoiding overgeneralizations which are not supported by the data. From a cognitive point of view, a learning mechanism that constructs generalizations as a response to input data (such as in Albright & Hayes, 2002, 2003) seems more plausible as a model of infant learning than a mechanism that operates on a space of logically possible constraints (such as in Hayes & Wilson, 2008).

The model that is proposed in Section 2.3 therefore takes an approach to the construction of phonotactic generalizations which is similar to MG. That is, generalizations are constructed on the basis of similarities in the input. The model needs no a priori abstractions, since it generalizes over statistically learned biphone constraints. The model is also similar to MG in the sense that similarity is quantified in terms of shared phonological features. As a result, generalizations affect natural classes, rather than individual segments. In line with previous induction models, the model presented in this chapter works with a set of phonological features that is given to the learner prior to phonotactic learning. It is currently an open issue whether these features are innate or learned from the acoustic speech stream (see e.g., Lin & Mielke, 2008). Regardless of the origin of phonological features, it will be assumed here that features are available for the construction of phonotactic generalizations.

An important difference with earlier models of constraint induction is that the model presented in this chapter is unsupervised. While earlier induction models learn by evaluating the accuracy of generalizations with respect to the lexicon (which is a case of supervised learning), it is assumed here that the learner has not yet acquired a lexicon, or that the learner’s lexicon is too small to support the learning of phonotactic constraints. In fact, the proposal is that the learner uses phonotactic generalizations, learned from continuous speech, as a source of knowledge to extract words from the speech stream. Therefore, the learner has no way of determining how good, or useful, the resulting generalizations will be. The unsupervised approach to learning generalizations is another step in the direction of a cognitively plausible model of phonotactic learning, especially for the case of phonotactic learning by infants.

Modeling phonotactic constraints for speech segmentation requires the formalization of both a learning model, accounting for the constraints, and a segmentation model, explaining how the constraints are used to predict the locations of word boundaries in the speech stream. Before describing the learning model, StaGe (Statistical learning and Generalization), a formal model will be presented for the interpretation of constraints in speech segmentation. Note that no earlier formal models of speech segmentation have been proposed which use abstract constraints. The segmentation model thus opens up the possibility of modeling abstract phonotactic constraints for speech segmentation.

3 A study by Cairns et al. (1997) used a Simple Recurrent Network with feature-based input representations. However, their feature-based network did not perform better than a model based on simple biphone probabilities. The study thus provides no evidence that feature-based generalizations improve segmentation performance, as compared to segment-based probabilities.

2.2 the ot segmentation model

The proposal is to use a modified version of Optimality Theory (OT; Prince & Smolensky, 1993) for regulating interactions between phonotactic constraints in speech segmentation. Optimality Theory is based on the idea that linguistic well-formedness is a relative notion, as no form can possibly meet all demands made by conflicting constraints. The optimal form is one which best satisfies a constraint set, taking into account the relative strengths of the constraints, which is defined by a strict ranking. In order to select the optimal form, a set of candidate forms is first generated. This candidate set contains all logically possible outputs for a given input. All candidates are evaluated by the highest-ranked constraint. Candidates that violate the constraint are eliminated; remaining candidates are passed on to the next highest-ranked constraint. This assessment process goes on until only one candidate remains. This is the optimal form. The optimal form thus incurs minimal violations of the highest-ranked constraints, while taking any number of violations of lower-ranked constraints for granted. This principle of constraint interaction is known as strict domination.

The version of OT that is adopted here retains the assumptions of constraint violability and strict domination, but is otherwise quite different. Whereas OT learners have the task of learning the appropriate ranking for an a priori given set of constraints, the task of our learner is (i) to learn the constraints themselves, as well as (ii) to rank these constraints. Crucially, the OT segmentation model does not employ a universal constraint set (‘CON’) that is given to the learner. Rather, constraints are induced by employing the mechanisms of statistical learning and generalization (cf. Hayes, 1999; Hayes & Wilson, 2008). The constraint ranking mechanism in our model is also fundamentally different from mechanisms employed in earlier OT learners (in particular, the Constraint Demotion Algorithm, Tesar & Smolensky, 2000; and the Gradual Learning Algorithm, Boersma, 1998; Boersma & Hayes, 2001). Rather than providing the learner with feedback about optimal forms, i.e. segmented utterances, our model assumes unsupervised constraint ranking, since the input to the learner consists exclusively of unsegmented utterances. Each constraint is accompanied by a numerical ranking value, which is inferred by the learner from the statistical distribution, and which expresses the strength of the constraint (see also Boersma & Hayes, 2001; Boersma, Escudero, & Hayes, 2003). Finally, note that, although the OT segmentation model is based on strict constraint domination, one could conceive of other constraint interaction mechanisms (such as constraint weighting in Harmonic Grammar, Legendre, Miyata, & Smolensky, 1990) to segment the speech stream. That is, the numerical values of the induced constraints are not committed to the specific interpretation of strict domination. The issue of how the choice of constraint interaction mechanism would affect performance of the model remains open. For the current study, strict domination at least offers a useful mechanism to regulate the interaction between constraints in speech segmentation.

The OT segmentation model is illustrated in Figure 2.1. The learner processes utterances of continuous speech through a biphone window. That is, for each biphone in the utterance, the learner needs to decide whether a boundary should be inserted or not.


[Figure 2.1: flowchart of the OT segmentation model. Input → Generation of candidates → Candidate set, evaluated against the Constraint set → Optimal candidate.]

Figure 2.1: The OT segmentation model. The model is presented with biphones in isolation (xy-sequences), or embedded in context (wxyz-sequences). Segmentation candidates indicate possible boundary locations. The candidate set is {xy, x.y} for biphones in isolation, and {wxyz, w.xyz, wx.yz, wxy.z} for biphones in context. Candidates are evaluated using the phonotactic constraint set. A boundary is inserted into the speech stream if the constraint set favors segmentation of a biphone (i.e., if x.y or wx.yz is the optimal candidate).

consists of biphones, either presented to the model in isolation (xy-sequences; see Chapter 3 – Experiments 1, 2, and 4), or embedded in context (wxyz-sequences; see Chapter 3 – Experiment 3). In the latter case, the learner inspects not just the current biphone under consideration (xy), but also its immediate neighbors (wx and yz) when making a segmentation decision for the current biphone. Segmentation candidates are generated which are possible interpretations of the input sequence. This is implemented using a very simple version of OT’s candidate generator (GEN): Each candidate contains either a word boundary at one of the possible boundary locations, or no boundary at all. For biphones in isolation, two candidates are generated: xy and x.y. For biphones

in context, the candidates are: wxyz, w.xyz, wx.yz, and wxy.z. Candidates are evaluated using the constraint set, which contains induced, numerically ranked constraints. A boundary is inserted into the speech stream whenever the constraint set favors segmentation of the current biphone under inspection. For the case of biphones in isolation, this means that a boundary is inserted whenever x.y is the optimal candidate (i.e., it is preferred over xy). For the case of biphones embedded in context, a boundary is inserted whenever wx.yz is the optimal candidate (i.e., it is preferred over wxyz, w.xyz, and wxy.z). If multiple candidates remain active after inspection of the constraint set, a winner is chosen at random.
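To make the evaluation procedure concrete, the following Python sketch (an illustration, not the dissertation's implementation; the representation of constraints and the toy constraint set are hypothetical) generates the candidate set for an input sequence and selects the optimal candidate under strict domination, breaking ties at random as described above.

```python
import random

def gen_candidates(seq):
    """GEN: the unsegmented sequence plus one boundary at each internal position."""
    return [seq] + [seq[:i] + "." + seq[i:] for i in range(1, len(seq))]

def violations(candidate, constraint):
    """Number of violations of one constraint by one candidate."""
    kind, biphone = constraint["kind"], constraint["biphone"]
    if kind == "markedness":                        # *xy: xy must not occur contiguously
        return candidate.count(biphone)
    return candidate.count(biphone[0] + "." + biphone[1])   # Contig-IO(xy): xy must not be split

def evaluate(seq, constraints):
    """Strict domination: filter the candidate set constraint by constraint."""
    candidates = gen_candidates(seq)
    for c in sorted(constraints, key=lambda c: c["rank"], reverse=True):
        fewest = min(violations(cand, c) for cand in candidates)
        candidates = [cand for cand in candidates if violations(cand, c) == fewest]
        if len(candidates) == 1:
            break
    return random.choice(candidates)                # random winner if several candidates remain

# Hypothetical constraints, loosely modeled on the plosive-liquid example of Section 2.4
constraints = [
    {"kind": "markedness", "biphone": "tl", "rank": 2690.98},
    {"kind": "contiguity", "biphone": "br", "rank": 344.50},
]
print(evaluate("tl", constraints))   # 't.l': the markedness constraint forces a boundary
print(evaluate("br", constraints))   # 'br':  the contiguity constraint keeps the biphone intact
```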

2.3 the learning model: stage

StaGe (Statistical learning and Generalization) learns specific and abstract phonotactic constraints, as well as ranking values for those constraints, from continuous speech. The model keeps track of biphone probabilities in the input (statistical learning). Biphone probabilities trigger the induction of segment-specific constraints whenever the probabilities reach a specified threshold. These thresholds capture the distinction between high- and low-probability phonotactics. The learner interprets low-probability biphones as likely positions for word boundaries. Conversely, the learner interprets high-probability biphones as unlikely positions for word boundaries. The learner thus infers the likelihood of word boundaries from segment co-occurrence probabilities in continuous speech; a process which will be referred to as Frequency-Driven Constraint Induction.

The learner constructs generalizations whenever phonologically similar biphone constraints (of the same phonotactic category, i.e., ‘high’ or ‘low’ probability) appear in the constraint set. Similarities are quantified as the number of shared values for phonological features. In case of a single-feature difference between constraints, the learner abstracts over this feature, and adds the generalization to the constraint set; this will be referred to as Single-Feature Abstraction. The abstract constraints affect sequences of natural classes, rather than sequences of specific segments. In addition to learning constraints, the learner infers ranking values from the statistical distribution. These ranking values determine the strength of the constraint with respect to other constraints in the constraint set. The general architecture of the model is illustrated in Figure 2.2. Below, the main components of StaGe (i.e., statistical learning, Frequency-Driven Constraint Induction, and Single-Feature Abstraction) are discussed in more detail.


[Figure 2.2 diagram: Input → Statistical learning → Statistical distribution → Frequency-Driven Constraint Induction → Constraint set → Single-Feature Abstraction]

Figure 2.2: The architecture of StaGe. The learner builds up a statistical distribution (Statistical learning) from which biphone constraints are induced which either favor or restrict the occurrence of a biphone (Frequency-Driven Constraint Induction). Generalizations are constructed whenever phonologically similar constraints appear in the constraint set (Single-Feature Abstraction).

2.3.1 Statistical learning

Following many psycholinguistic studies, StaGe implements a statistical learning mechanism. In this case, statistical learning expresses how likely it is that two segments co-occur in continuous speech. The most well-known formula implementing such statistical dependencies is transitional probability (e.g., Saffran, Newport, & Aslin, 1996; Newport & Aslin, 2004):

\[ \mathrm{TP}(xy) = \frac{\mathrm{Prob}(xy)}{\sum_{Y} \mathrm{Prob}(xY)} \qquad (2.1) \]

where Y can be any segment following x. However, as several authors have noted (e.g., Aslin et al., 1998; Perruchet & Peereman, 2004), there exists a variety of formulas that can be used to model statistical learning. StaGe implements a slightly different measure of co-occurrence probability, the observed/expected (O/E) ratio (Pierrehumbert, 1993):

\[ \frac{O(xy)}{E(xy)} = \frac{\mathrm{Prob}(xy)}{\sum_{Y} \mathrm{Prob}(xY) \cdot \sum_{X} \mathrm{Prob}(Xy)} \qquad (2.2) \]

where Y can be any segment following x, and X can be any segment preceding y. A third, closely related measure is mutual information (MI), which corresponds to the log2 value of the observed/expected ratio, and which has been used in several computational segmentation studies (Brent, 1999a; Rytting, 2004; Swingley, 2005). The difference between transitional probability and observed/expected ratio (or MI) is the direction of the statistical dependency (Brent, 1999a). Transitional probability expresses a distribution over all elements (y ∈ Y) that follow a certain element x. In contrast, observed/expected ratio is a bidirectional dependency, expressing the likelihood of two elements co-occurring. Recent studies have shown that infants and adults can also compute backward transitional probabilities (Pelucchi et al., 2009a; Perruchet & Desaulty, 2008). Learners thus possibly compute both forward and backward probabilities, or perhaps compute a single bidirectional measure. In earlier computational studies, the choice of formula had little impact on the outcome (Brent, 1999a; Swingley, 1999), although Brent (1999a) found somewhat better performance for mutual information than for transitional probability. StaGe uses the O/E ratio, because this measure has been used in earlier studies on phonotactics (Pierrehumbert, 1993; Frisch et al., 2004), and it has a straightforward interpretation in terms of phonotactic constraints (see below).
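As a minimal sketch (not the dissertation's implementation) of how these two statistics could be computed from unsegmented transcriptions, the following Python fragment treats each character as a segment; the toy utterances are invented for illustration.

```python
from collections import Counter

def biphone_stats(utterances):
    """Forward transitional probability (2.1) and O/E ratio (2.2) for each biphone."""
    biphones, firsts, seconds, total = Counter(), Counter(), Counter(), 0
    for utt in utterances:
        for x, y in zip(utt, utt[1:]):
            biphones[x + y] += 1
            firsts[x] += 1          # x observed in first position of a biphone
            seconds[y] += 1         # y observed in second position of a biphone
            total += 1
    stats = {}
    for xy, observed in biphones.items():
        x, y = xy
        expected = firsts[x] * seconds[y] / total   # E(xy), an expected count
        stats[xy] = {
            "TP": observed / firsts[x],             # Prob(xy) / sum_Y Prob(xY)
            "O/E": observed / expected,             # observed/expected ratio
            "E": expected,                          # reused below as a ranking value
        }
    return stats

# Toy unsegmented transcriptions (invented; the actual input is the Spoken Dutch Corpus)
stats = biphone_stats(["dAtlExIkzotrYxbrurtj@", "Ikzobrurtj@"])
print(stats["br"])
```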

2.3.2 Frequency-Driven Constraint Induction

StaGe classifies probabilities obtained through statistical learning into three distinct categories: ‘low probability’, ‘high probability’, and ‘neutral probability’. Such a categorization follows from the statistical distribution. The O/E ratio, for example, expresses whether a biphone is underrepresented or overrepresented in continuous speech. That is, biphones occur either more or less often than would be expected on the basis of the occurrence frequencies of the individual segments. Biphones that occur more often than expected are considered ‘high probability’ biphones (overrepresentations) by the learner. In speech segmentation, the learner tries to keep high-probability biphones intact. This is done through the induction of a phonotactic constraint, which states that no boundary

should be inserted (a ‘contiguity’ constraint, see McCarthy & Prince, 1995; Trubetzkoy, 1936):

(2.3) Contig-IO(xy) ‘Sequence xy should be preserved.’

In contrast, ‘low probability’ biphones (underrepresentations) are likely to contain word boundaries. For this phonotactic category, the learner induces a constraint that favors the insertion of a word boundary (a ‘markedness’ constraint, stating which sequences are not allowed in the language):

(2.4) *xy ‘Sequence xy should not occur.’

Markedness and contiguity constraints thus represent two opposing forces in speech segmentation, both of which are derived directly from the statistical distribution. Contiguity constraints exert pressure towards preservation of high-probability biphones, whereas markedness constraints exert pressure towards the segmentation of low-probability biphones. When interpreted within the framework of OT, a contiguity (or markedness) constraint says that the insertion of a word boundary should be avoided (or enforced), whenever possible (that is, unless some other, higher-ranked constraint would require otherwise). The strength of a constraint is expressed by its expected frequency (E(xy)). Since the expected frequency of a biphone is defined as the product of individual segment probabilities, the strength of a constraint is in fact determined by the frequencies of its segment constituents. If two phonemes, in spite of their high frequencies in isolation, seldom occur in conjunction, then there is a strong constraint restricting their co-occurrence (that is, a highly ranked markedness constraint). Similarly, if two frequent segments occur much more often than expected, then there is a strong constraint favoring the co-occurrence of these phonemes (that is, a highly ranked contiguity constraint). The ranking value (r) of a markedness or contiguity constraint is thus based on the same statistical measure:

\[ r = E(xy) = \sum_{Y} \mathrm{Prob}(xY) \cdot \sum_{X} \mathrm{Prob}(Xy) \qquad (2.5) \]

The third category concerns biphones of ‘neutral’ probability, whose observed frequency is equal to their expected frequency. Such biphones provide the learner with no phonotactic information (which is reflected in the corresponding mutual information value, which is zero in these cases). Therefore, on the basis of the statistical distribution the learner has no reason to induce either type of constraint for such biphones. In a real-life setting, however, the observed frequency will never exactly match the expected frequency. The classification of biphones on the basis of their statistical values is illustrated in Table 2.1. StaGe induces contiguity constraints for biphones whose observed


Table 2.1: Classification of biphones according to their statistical values.

Category | O/E ratio     | MI          | Interpretation                 | Constraint
low      | O(xy) ≪ E(xy) | MI(xy) ≪ 0  | Pressure towards segmentation  | *xy
high     | O(xy) ≫ E(xy) | MI(xy) ≫ 0  | Pressure towards contiguity    | Contig-IO(xy)
neutral  | O(xy) ≈ E(xy) | MI(xy) ≈ 0  | No pressure                    | –

frequency is substantially higher than their expected frequency. In contrast, markedness constraints are induced for biphones with a much lower observed frequency than expected. Finally, no constraints are induced for biphones which carry little probabilistic information. The decision of when exactly to induce a constraint can be modeled by setting thresholds on the probabilities. Two parameters are introduced in the model: a threshold for the induction of markedness constraints (tM), and a threshold for the induction of contiguity constraints (tC). Introducing these parameters raises an important issue: How do these thresholds become available to the learner? Are they fixed values, possibly due to biological factors? Or can they be induced from the statistical distribution? Since there is currently no way of resolving this issue, the thresholds are set manually. As a first estimation, the notion ‘substantially’ is implemented as a factor-two deviation from the expected value. That is, a markedness constraint is induced whenever the observed frequency is less than half of the expected frequency (tM = 0.5). A contiguity constraint is induced whenever observed is more than twice the expected frequency (tC = 2.0). A wide range of possible threshold values is tested in Chapter 3 (Experiment 2).
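Assuming biphone statistics of the kind sketched in the previous section, Frequency-Driven Constraint Induction with the thresholds tM = 0.5 and tC = 2.0 could be expressed as in the following sketch; the statistics dictionary is invented for illustration, and constraints are represented simply as (name, ranking value) pairs.

```python
T_M, T_C = 0.5, 2.0     # induction thresholds for markedness and contiguity constraints

def induce_constraints(stats):
    """Classify biphones by their O/E ratio and emit numerically ranked constraints."""
    constraints = []
    for xy, s in stats.items():
        if s["O/E"] < T_M:                               # underrepresented biphone
            constraints.append(("*" + xy, s["E"]))       # markedness constraint *xy
        elif s["O/E"] > T_C:                             # overrepresented biphone
            constraints.append(("Contig-IO(" + xy + ")", s["E"]))
        # biphones with T_M <= O/E <= T_C are 'neutral': no constraint is induced
    return constraints

# Invented statistics for three biphones (O/E ratio and expected frequency E)
toy_stats = {"tl": {"O/E": 0.2, "E": 2690.98},
             "br": {"O/E": 3.1, "E": 344.50},
             "tr": {"O/E": 1.1, "E": 500.00}}
print(induce_constraints(toy_stats))    # [('*tl', 2690.98), ('Contig-IO(br)', 344.5)]
```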

2.3.3 Single-Feature Abstraction

The categorization of biphones into markedness and contiguity constraints provides the learner with a basis for the construction of phonotactic generalizations. Such generalizations state which classes of sound sequences should in general be segmented or be kept intact, respectively. The learner constructs a generalization whenever phonologically similar constraints of the same category arise in the constraint set. Generalizations are added to the constraint set, while keeping existing constraints intact.


In modeling the unsupervised learning of phonotactic generalizations, a very basic measure of similarity is implemented, adopting the notion of ‘constraint neighbors’ (Hayes, 1999). Two constraints are said to be neighbors when they have different values for one single feature only. The following set of features is adopted: syllabic, consonantal, approximant, sonorant, continuant, nasal, voice, place, anterior, lateral for consonants, and high, low, back, round, long, tense, nasalized for vowels (see Appendix B). For example, the constraints Contig-IO(pl) and Contig-IO(bl) are neighbors, since they have a single-feature difference (voice in the first segment). Generalization consists of abstracting over these differences: A new constraint is created where this feature has been neutralized. That is, only shared feature values remain. The resulting constraint (e.g., Contig-IO(x ∈ {p,b}; y ∈ {l})) affects a sequence of natural classes, rather than a sequence of individual segments. This more general constraint is added to the constraint set. The algorithm is recursive: The existence of another phonologically similar biphone constraint that is a neighbor of this abstract constraint would trigger a new generalization. In this case, the feature difference between the abstract constraint (Contig-IO(x ∈ {p,b}; y ∈ {l})) and the biphone constraint (e.g., Contig-IO(br)) is assessed as the total number of different feature values for features that have not been neutralized in the generalization (e.g., voice has been neutralized in position x, therefore only the single-feature difference between /l/ and /r/ is taken into account in computing similarity between the two constraints). Abstraction over the single-feature difference creates an even more general constraint (Contig-IO(x ∈ {p,b}; y ∈ {l,r})), which is again added to the constraint set. This constraint states that any sequence of /p/ or /b/, followed by /l/ or /r/ (i.e., /pl/, /pr/, /bl/, /br/) should not be broken up by a word boundary. The model thus creates constraints that have a wider scope than the specific constraints that cause the construction of the generalization. In the current example, /pr/ is included in the generalization, while there is no specific constraint affecting this biphone. No more new generalizations are created if there are no more biphone constraints from the same constraint class (markedness or contiguity) within a single-feature difference from this constraint.

Generalizations are ranked according to the expected frequencies (as defined in formula 2.5) of the biphone constraints that support the generalization, averaged over the total number of biphones that are affected by the generalization. For example, the contiguity constraints Contig-IO(pl), Contig-IO(bl), and Contig-IO(br) support the generalization Contig-IO(x ∈ {p,b}; y ∈ {l,r}) (in addition to the less general Contig-IO(x ∈ {p,b}; y ∈ {l}) and Contig-IO(x ∈ {b}; y ∈ {l,r})). While this abstract constraint is based on three statistically induced constraints, it affects a total of four biphones: /pl/, /bl/,


/br/, /pr/. In this hypothetical example, the fourth biphone, /pr/, does not support the generalization. This is because the learner did not assign this biphone to the contiguity category. More formally, it did not pass the statistical threshold for contiguity constraints (tC), meaning that it was either assigned to the low-probability category (i.e., *pr), or to the neutral probability category (in which case no specific constraint was induced for /pr/). Therefore, the ranking value of the generalization Contig-IO(x ∈ {p,b}; y ∈ {l,r}) is the summed ranking values (i.e., expected frequencies) of Contig-IO(pl), Contig-IO(bl), and Contig-IO(br), divided by 4. Note that the generalization would have been given a higher ranking value by the learner if there had been a contiguity constraint Contig-IO(pr). In that case, the ranking value would be calculated as the summed values of Contig-IO(pl), Contig-IO(bl), Contig-IO(br), and Contig-IO(pr), divided by 4. Generalizations with stronger statistical support in the constraint category are thus ranked higher than generalizations with weaker support.

The ranking of constraints based on statistical support within constraint categories is crucial, since it ensures that statistically induced constraints will generally be ranked higher than abstracted constraints. This has two important consequences. The first concerns resolving conflicts between constraints. A conflict arises whenever a biphone is affected by both a markedness constraint and a contiguity constraint. In such a case, the markedness constraint favors segmentation of the biphone, whereas the contiguity constraint favors keeping the biphone intact. If two constraints are in conflict, the highest-ranked constraint determines the outcome (assuming OT’s strict domination). StaGe’s constraint ranking allows the learner to represent exceptions to phonotactic regularities, since such exceptions (i.e., specific constraints) are likely to be ranked higher than the regularity (i.e., the abstract constraint). The second, related consequence concerns constraining the generalization mechanism. StaGe’s single-feature abstraction is unsupervised. Since all phonological similarities (with recursive single-feature differences) within a constraint category result in new constraints (which are simply added to the constraint set without any further consideration), the model is likely to overgeneralize. However, overly general constraints will have little statistical support in the data (i.e., relatively few biphone-specific constraints will support them). Since the numerical ranking of the constraints is based on exactly this statistical support, overly general constraints will end up at the bottom of the constraint hierarchy. Thus, general constraints like *CC, Contig-IO(CC), *CV, etc., are likely to be added to the constraint set, but their impact on segmentation will be minimal due to their low ranking values. Note that specific constraints are not by definition ranked higher than more general ones. Specifically, biphone constraints that are made up of low-frequency segments

are likely to be outranked by a generalization, since such biphones have low expected frequencies. The numerical ranking values, which are inferred by the learner in an unsupervised fashion, thus resolve conflicts between markedness and contiguity constraints, while constraining generalization at the same time. As a consequence of generalization, the ‘neutral’ probability biphones are now likely to be pulled into the class of either markedness or contiguity constraints. In fact, this may be the main advantage of generalization for the learner: The middle part of the statistical distribution, where the observed frequency approximates the expected frequency, consists of values that are neither high nor low (the ‘neutral’ category in Table 2.1). Biphones in such a ‘grey’ area carry little probabilistic information. Hence, on the basis of statistical learning alone, the learner would have to make a guess whether or not to segment such a biphone, or would have to inspect the values of neighboring biphones in order to estimate the likelihood of a word boundary. Alternatively, when our model encounters a statistically neutral biphone during segmentation, it can use a more general constraint to determine whether the biphone should be segmented or not. The advantage for the learner is thus that biphones for which no reliable statistical information is available can still be reliably segmented (or be kept intact) due to similarity to other biphones.
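The neighbor test, one abstraction step, and the ranking of the resulting generalization can be sketched as follows. This is an illustration only: natural classes are listed as segment sets rather than as neutralized feature bundles, the feature values are invented (the model uses the full feature set of Appendix B), the procedure is not applied recursively here, and the supporting ranking values are made up.

```python
# Toy feature specifications (invented values)
FEATURES = {
    "p": {"voice": 0, "place": "labial",  "lateral": 0},
    "b": {"voice": 1, "place": "labial",  "lateral": 0},
    "l": {"voice": 1, "place": "coronal", "lateral": 1},
    "r": {"voice": 1, "place": "coronal", "lateral": 0},
}

def is_neighbor(s1, s2):
    """True if the two segments differ in exactly one feature value."""
    return sum(FEATURES[s1][f] != FEATURES[s2][f] for f in FEATURES[s1]) == 1

def abstract(c1, c2):
    """One step of Single-Feature Abstraction over two same-category biphone
    constraints, each given as a (first segment, second segment) pair."""
    (x1, y1), (x2, y2) = c1, c2
    if y1 == y2 and is_neighbor(x1, x2):
        return ({x1, x2}, {y1})        # abstract over the differing feature in position x
    if x1 == x2 and is_neighbor(y1, y2):
        return ({x1}, {y1, y2})        # abstract over the differing feature in position y
    return None                        # not neighbors: no generalization is constructed

def generalization_rank(first_class, second_class, supporting_ranks):
    """Summed ranking values of the supporting constraints, divided by the
    total number of biphones the generalization affects."""
    return sum(supporting_ranks.values()) / (len(first_class) * len(second_class))

# Contig-IO(pl) and Contig-IO(bl) differ only in voice -> Contig-IO(x in {p,b}; y in {l})
print(abstract(("p", "l"), ("b", "l")))

# Contig-IO(x in {p,b}; y in {l,r}): supported by pl, bl, br but covering pl, pr, bl, br
supporting = {"pl": 400.0, "bl": 380.0, "br": 344.5}       # invented ranking values
print(generalization_rank({"p", "b"}, {"l", "r"}, supporting))   # (400 + 380 + 344.5) / 4
```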

2.4 an example: the segmentation of plosive-liquid sequences

In this section, an illustration is given of how StaGe builds up a constraint set, and of how this constraint set is used in speech segmentation, using plosive-liquid sequences in Dutch as an example. The example is based on a simulation in which StaGe is applied to consonant clusters (CC biphones) in transcribed utterances of unsegmented speech in the Spoken Dutch Corpus (Goddijn & Binnenpoorte, 2003). As a test case, the model is presented with the problem of predicting word boundaries in the following hypothetical Dutch utterance:

(2.6) dAt lEx Ik zo trYx brurtj@ (‘I’ll put that right back, little brother’)

To the learning infant this sounds like:

(2.7) dAtlExIkzotrYxbrurtj@

The model processes utterances through a biphone window, and is faced with the task of deciding whether or not a boundary should be inserted for each biphone in the utterance. At the onset of learning, the learner knows nothing, and he/she would have to insert boundaries at random. The focus here is on


Input: tl, tr, br | Contig-IO(br) (r = 344.50)
?  tl             |
?  t.l            |
?  tr             |
?  t.r            |
☞  br             |
   b.r            | *

Figure 2.3: An OT tableau showing segmentation using a single, specific constraint. The upper left cell contains the input. Segmentation candidates for the input are listed in the first column. The upper row shows the induced constraint set, with corresponding ranking values (r) in parentheses. Ranking is irrelevant at this point, since there is only one constraint. The star ‘*’ indicates a violation of a constraint by a segmentation candidate. The index finger ‘☞’ indicates the optimal candidate.

the plosive-liquid sequences in the utterance, representing undecided segmentations as ‘?’:

(2.8) dA t?l ExIkzo t?r Yx b?r urtj@

Through statistical learning the learner builds up a distribution from which constraints are derived. The learner induces a phonotactic constraint whenever a biphone passes the threshold for markedness constraints (tM = 0.5), or the threshold for contiguity constraints (tC = 2.0). For example, the learner discovers that /br/ is overrepresented in the input (i.e., O(br)/E(br) > 2.0), and induces Contig-IO(br) with ranking value r = E(br) = 344.50. (For simplicity, static ranking values are assumed here. However, since ranking values are derived from expected frequencies, ranking values may change due to changes in the statistical distribution.) Figure 2.3 shows how this specific constraint affects segmentation of the plosive-liquid sequences in the utterance, using the OT segmentation model. The learner decides that no boundary should be inserted into /br/. With respect to the other sequences, the learner remains ignorant:

(2.9) dA t?l ExIkzo t?r Yx br urtj@

As a result of multiple specific plosive-liquid overrepresentations in the statistical distribution, the learner induces multiple similar contiguity constraints:


Input: tl, tr, br | Contig-IO(x ∈ {p,b,t,d}; y ∈ {l,r}) (r = 360.11) | Contig-IO(br) (r = 344.50)
☞  tl             |                                                   |
   t.l            | *                                                 |
☞  tr             |                                                   |
   t.r            | *                                                 |
☞  br             |                                                   |
   b.r            | *                                                 | *

Figure 2.4: An OT tableau showing segmentation using an abstract constraint. The upper row shows the induced constraint set, with corresponding ranking values (r) in parentheses. Constraints are ranked in strict domination, shown from left to right. Violation of a higher-ranked constraint eliminates a candidate. Ranking is irrelevant here, since the constraints are not in conflict. The index finger ‘☞’ indicates the optimal candidate.

Contig-IO(pl), Contig-IO(pr), Contig-IO(bl), Contig-IO(dr). Through Single-Feature Abstraction the learner infers a general constraint, Contig-IO(x ∈ {p,b,t,d}; y ∈ {l,r}) (in addition to less generalized versions of the constraint, which are not shown in this example). On the basis of statistical support (the specific contiguity constraints for /br/, /pl/, /pr/, /bl/, and /dr/), the learner calculates a ranking value for this abstract constraint. Since the constraint affects a total number of 8 biphones, the ranking value is equal to the summed ranking values of the 5 contiguity constraints, divided by 8. In this case, the strength of the constraint is r = 360.11. The effect of the generalization is illustrated in Figure 2.4. The generalization has substantial strength: It is ranked slightly higher than the specific constraint for /br/. However, since the constraints do not make conflicting predictions, their respective ranking has no effect on segmentation. Generalization is both helpful and potentially harmful in this case. As a consequence of generalization, the learner is able to make a decision for all plosive-liquid sequences in the utterance. That is, no more undecided segmentations (‘?’) remain:

(2.10) dA tl ExIkzo tr Yx br urtj@

The generalization helps the learner, since it correctly predicts that /tr/, which is statistically neutral in continuous speech and has no specific constraint affecting it, should be kept intact. However, the constraint Contig-IO(x ∈ {p,b,


Input: tl, tr, br | *tl (r = 2690.98) | Contig-IO(x ∈ {p,b,t,d}; y ∈ {l,r}) (r = 360.11) | Contig-IO(br) (r = 344.50)
   tl             | *                 |                                                   |
☞  t.l            |                   | *                                                 |
☞  tr             |                   |                                                   |
   t.r            |                   | *                                                 |
☞  br             |                   |                                                   |
   b.r            |                   | *                                                 | *

Figure 2.5: An OT tableau showing interaction between specific and abstract constraints. The upper row shows the induced constraint set, with corresponding ranking values (r) in parentheses. Constraints are ranked in strict domination, shown from left to right. Violation of a higher-ranked constraint eliminates a candidate. Ranking is relevant here, since the constraints are in conflict with respect to /tl/. The index finger ‘☞’ indicates the optimal candidate.

t,d}; y ∈ {l,r}) at the same time overgeneralizes with respect to /tl/ and /dl/. While plosive-liquid sequences are in general well-formed, these specific sequences typically do not occur within Dutch words. (Rare exceptions include atlas and atleet.) Since the learner is keeping track of both high and low probabilities in the statistical distribution, the learner also induces markedness constraints, stating which sequences should not occur in the language. For example, the learner induces the constraint *tl, since /tl/ passes the threshold for markedness constraints (O(tl)/E(tl) < 0.5). The ranking value of *tl is high, due to its high expected frequency (i.e., /t/ and /l/ are frequent phonemes). Therefore, *tl ends up at the top of the constraint hierarchy (see Figure 2.5). Note that the learner has learned an exception to a generalization: The learner will not insert boundaries into plosive-liquid sequences, unless it concerns the specific sequence /tl/.

In sum, the model has correctly inferred that /br/ should not contain a boundary (due to the induction of a specific constraint, and confirmed by a generalization). In addition, the model has learned that /tr/ should not contain a boundary (due to the abstract constraint). And finally, the model has correctly inferred that /tl/ is an exception to the generalization, and that /tl/ should therefore be broken up by a word boundary. The learner predicts the correct segmentation for all the plosive-liquid sequences it was presented with:

(2.11) dA t.l ExIkzo tr Yx br urtj@


2.5 general discussion

In this chapter, a computational model has been proposed for the induction of phonotactic knowledge from continuous speech. The model, StaGe, implements two learning mechanisms that have been shown to be accessible to infant language learners. The first mechanism, statistical learning, allows the learner to accumulate data and draw inferences about the probability of occurrence of such data. The second mechanism, generalization, allows the learner to abstract away from the observed input and construct knowledge that generalizes to unobserved data, and to probabilistically neutral data. We integrate these mechanisms into a single computational model, thereby providing an explicit, and testable, proposal of how these two mechanisms might interact in infants’ learning of phonotactics.

Several properties of StaGe’s frequency-driven constraint induction are worth noting. First, it should be stressed that the learner does not process or count any word boundaries for the induction of phonotactic constraints. It has been argued that phoneme pairs typically occur either only within words or only across word boundaries, and that keeping track of such statistics provides the learner with a highly accurate segmentation cue (Christiansen, Onnis, & Hockema, 2009; Hockema, 2006). However, such a counting strategy requires that the learner is equipped with the ability to detect word boundaries a priori. This is a form of supervised learning that is not representative of the learning problem of the infant, who is confronted with unsegmented speech input. In contrast, StaGe draws inferences about the likelihood of boundaries based on probabilistic information about the occurrences of segment pairs in unsegmented speech. The model categorizes biphones without explicitly taking any boundary information into account, and thus represents a case of unsupervised learning.

Second, while earlier segmentation models employed either boundary detection strategies (Cairns et al., 1997; Rytting, 2004) or clustering strategies (Swingley, 2005), StaGe integrates these approaches by acknowledging a role for both edges of the statistical distribution. A low-probability biphone is likely to contain a word boundary, which is reflected in the induction of a markedness constraint. Conversely, a high-probability biphone is interpreted as a contiguous cluster, reflecting the low probability of a boundary breaking up such a biphone.

Finally, the assumption of functionally distinct phonotactic categories is also what sets the approach apart from earlier constraint induction models (Hayes, 1999; Hayes & Wilson, 2008). Whereas these models induce only markedness constraints, penalizing sequences which are ill-formed in the language, StaGe uses both ends of a statistical distribution, acknowledging

that other sequences are well-formed. While ill-formedness exerts pressure towards segmentation, well-formedness provides counter pressure against segmentation. This distinction provides a basis for the construction of phonotactic generalizations, while at the same time providing counter pressure against overgeneralization.

By adding generalization to the statistical learning of phonotactic constraints, the model achieves two things. First, through generalization over observed data, the learner has acquired abstract knowledge. While most theories of abstract linguistic knowledge assume abstract representations to be innate, the model shows that it is possible to derive abstract phonotactic constraints from observed data. This finding is in line with recent studies (e.g., Hayes & Wilson, 2008) that aim at minimizing the role of Universal Grammar in language acquisition, while still acknowledging the existence of abstract representations and constraints. Second, the interaction of markedness and contiguity constraints, with varying degrees of generality, and the use of strict constraint domination in speech segmentation, allows the learner to capture generalizations, as well as exceptions to these generalizations. Similar to earlier work by Albright and Hayes (2003), the model thus makes no principled distinction between ‘exceptions’ and ‘regularities’ (cf. Pinker & Prince, 1988). That is, regularities and exceptions are modeled in a single formal framework. The linguistic relevance of the constraints induced by StaGe will be addressed in more detail in Chapter 4.

An important property of StaGe is that the model is unsupervised. That is, unlike previous models of phonotactic learning (e.g., Pierrehumbert, 2003; Hayes & Wilson, 2008), the model does not receive any feedback from segmented utterances or word forms in the lexicon. The learner induces phonotactic constraints from its immediate language environment, which consists of unsegmented speech. The model thereby provides a computational account of phonotactic learning during the very first stages of lexical acquisition. In fact, through the induction of phonotactics from unsegmented speech, the learner is able to bootstrap into word learning. Alternatively, one might argue that infants rely on segmentation cues other than phonotactics to learn their first words, and then derive phonotactics from a proto-lexicon. Such a view raises the new question of how such other segmentation cues would be learned. In general, knowledge of words cannot be a prerequisite for the learning of segmentation cues, since words are the result of segmentation (e.g., Brent & Cartwright, 1996; Swingley, 2005). If these cues are to be used to bootstrap into word learning, then such cues need to be learned from unsegmented speech input. It was therefore argued that at least some knowledge of phonotactics comes before the infant starts to build up a vocabulary of words. This knowledge is learned from continuous speech using a combination of statistical

learning and generalization. An interesting open issue is what the effect of the lexicon will be on the child’s acquired phonotactic knowledge when her vocabulary reaches a substantial size. Specifically, one might wonder what types of phonotactic constraints would become available to the learner if the learner were to induce phonotactics from the lexicon. This issue will be further explored in Chapter 4.

While the model presented does not rule out the possibility that human learners acquire phonotactics from isolated word forms (as predicted by the Lexicon-Based Learning hypothesis, see Chapter 1), the model provides support for the Speech-Based Learning hypothesis by showing that, at least in theory, it is possible to learn phonotactic constraints from continuous speech. Importantly, this possibility has not been demonstrated previously. Specifically, the learning of feature-based generalizations from an unsegmented speech stream has not previously been explored. StaGe shows that, when combined with a statistical learning mechanism, feature-based generalizations can be induced from continuous speech. The next chapter proceeds to provide empirical evidence in favor of the Speech-Based Learning hypothesis. A series of computer simulations will show that the learning of phonotactic constraints from continuous speech is not just a theoretical possibility. Rather, the induced constraints turn out to be very useful for segmentation. That is, feature-based generalizations, learned from continuous speech, improve the segmentation performance of the learner, and thus potentially help the learner to form a mental lexicon.


3

SIMULATIONS OF SEGMENTATION USING PHONOTACTICS 1

In this chapter, a series of computer simulations is presented which tests the performance of StaGe in detecting word boundaries in transcriptions of continuous speech. It is hypothesized that StaGe, which induces phonotactic constraints with various degrees of generality, outperforms purely statistical approaches to speech segmentation, such as transitional probability. While it is an open issue whether infant language learners use phonotactic generalizations in speech segmentation, better performance by the model in these simulations would demonstrate a potential role for such generalizations. Specifically, if feature-based generalizations are useful for segmentation, then such generalizations potentially contribute to the development of the lexicon.

3.1 introduction

The goal of this chapter is to simulate the segmentation of continuous speech using phonotactics. These computer simulations allow for a comparison between models that make different assumptions about the learning mechanisms that are used for phonotactic learning. The crucial comparison is between models with and without feature-based generalization. While probabilistic cues for segmentation have been used in many models (e.g., Brent, 1999a; Cairns et al., 1997; Daland, 2009; Rytting, 2004), none of these models have explored the potential relevance of more abstract phonotactic constraints for speech segmentation. In fact, most studies assume that such constraints are learned from the lexicon, and thus can only develop after the learner has acquired the skills necessary to segment speech (e.g., Hayes & Wilson, 2008). The model presented in the previous chapter (StaGe) shows that it is possible to induce phonotactic constraints from continuous speech using the mechanisms of statistical learning and generalization. Here, it is examined how the phonotactic generalizations, created by the model, affect the segmentation of continuous speech. If the generalizations improve the segmentation performance of the learner, then this is an indication that phonotactic generalizations may also

1 Sections of this chapter were published in: Adriaans, F., & Kager, R. (2010). Adding generalization to statistical learning: The induction of phonotactics from continuous speech. Journal of Memory and Language, 62, 311-331.

have a role in segmentation by human learners, assuming that learners will optimize their segmentation skills (e.g., Cutler et al., 1986). Several psycholinguistic studies have shown that adult listeners use abstract phonological constraints for segmentation (e.g., Suomi et al., 1997; Warner et al., 2005). However, these studies have not addressed the issue of how such constraints are acquired. The simulations in this chapter demonstrate a first unified account of the learning of abstract constraints, and the segmentation of continuous speech using these constraints. While the simulations are not a test for the psychological plausibility of the model, the implication of successful simulations would be that the approach can be considered as a potential account of human segmentation behavior. (A direct link between the computational model and human segmentation data will be made in Chapter 4.) It should be noted that the simulations do not aim at obtaining perfect segmentations, but rather address the effect of the learning of abstract, natural class constraints compared to learning without generalization. A more complete model of speech segmentation would involve the integration of multiple cues (see e.g., Christiansen et al., 1998, for a segmentation model that combines phonotactics and stress). For the current purposes, we focus on the contribution of phonotactics to speech segmentation.

The experiments complement previous computational studies in that the segmentation simulations involve fairly accurate representations of spoken language. That is, while segmentation studies typically use orthographic transcriptions of child-directed speech that are transformed into canonical transcriptions using a phonemic dictionary, the spoken utterances used in this chapter have been transcribed to a level which includes variations that are typically found in the pronunciation of natural speech, such as acoustic reductions and assimilations. Experiment 1 compares StaGe to statistical learning using segmentation models that process biphones in isolation (i.e., without inspecting the context in which they occur; Section 3.2). The effect of varying thresholds for these models is investigated in Experiment 2 (Section 3.3). In Experiment 3, the learning models are compared using a setting in which segmentation decisions for biphones are affected by context (Section 3.4). Finally, Experiment 4 addresses the development of StaGe as a function of input quantity (Section 3.5). The chapter ends with a discussion of the findings (Section 3.6).

3.2 experiment 1: biphones in isolation

The goal of the first experiment is to assess whether infants would benefit from phonotactic generalizations in speech segmentation. This question is addressed in a computer simulation of speech segmentation, allowing for

a comparison between models that vary in their assumptions about infant learning capacities. The crucial comparison is between models that are solely based on biphone probabilities, and StaGe, which relies on both statistical learning and generalization.

3.2.1 Method

Materials

The models are tested on their ability to detect word boundaries in broad phonetic transcriptions of the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN). To create representations of continuous speech, all word boundaries were removed from the transcribed utterances. The ‘core’ corpus, about 10% of the total corpus, contains a fairly large sample (78,080 utterances, 660,424 words) of high quality transcriptions of spoken Dutch. These broad phonetic transcriptions are the result of automatic transcription procedures, which were subsequently checked and corrected manually (Goddijn & Binnenpoorte, 2003). Due to this procedure, variations in the pronunciation of natural speech are preserved in the transcriptions to a large extent. For example, the word natuurlijk (‘naturally’) occurs in various phonemic realizations. The Spoken Dutch Corpus contains the following realizations for natuurlijk. (Frequency of occurrence is given in parentheses. Only realizations that occur at least 10 times in the corpus are displayed.)

(3.1) n@tyl@k (86), nAtyl@k (80), n@tyrl@k (70), n@tyk (68), ntyk (57), natyrl@k (56), nAtyrl@k (55), nAtyk (54), tyk (43), natyl@k (40), n@tyl@g (29), n@tyg (28), natyk (23), nAtyl@g (20), tyg (19), ntyg (18), n@tyrl@g (17), nAtyg (16), natyg (13), nAtyrl@g (12), ntyl@k (12), n@tyrk (10), natyrl@g (10).

This example illustrates some of the variability that is found in pronunciations of natural speech. In fact, the canonical transcription /natyrl@k/ is not the most frequent realization of natuurlijk in natural speech. The combination of its size and level of transcription accuracy makes the Spoken Dutch Corpus fairly representative of spoken Dutch. In contrast, previous computational segmentation studies typically used orthographic transcriptions of child-directed speech that were transformed into canonical transcriptions using a phonemic dictionary. Different realizations of words are lost in such a transcription procedure. The current study complements previous modeling efforts by investigating the performance of segmentation models in a setting that includes natural variability in the pronunciation of connected speech.


Procedure

The models are tested on novel data (i.e., data that was not used to train the model). To further increase the generalizability of the results, simulations are based on 10-fold cross-validation (see e.g., Mitchell, 1997). Each utterance in the corpus is randomly assigned to one out of ten disjunct sets, such that each set contains approximately 10% of the data points (i.e., biphones) in the corpus. With this random partition, ten simulations are done for each model. In each simulation one of the ten sets (i.e., 10% of the corpus) is used as test set and the remaining nine sets (90% of the corpus) are used as training set. This procedure gives a more reliable estimate of a model’s performance than a single randomly chosen test set. The models are trained on unsegmented utterances in the training set. The models are then given the task of predicting word boundaries in the test set. Output of the models thus consists of a hypothesized segmentation of the test set. The models are evaluated on their ability to detect word boundaries in the test utterances. The following metrics are used to evaluate segmentation performance:

Hit rate (H):

\[ H = \frac{\mathrm{TruePositives}}{\mathrm{TruePositives} + \mathrm{FalseNegatives}} \qquad (3.2) \]

False alarm rate (F):

\[ F = \frac{\mathrm{FalsePositives}}{\mathrm{FalsePositives} + \mathrm{TrueNegatives}} \qquad (3.3) \]

d-prime (d′):

\[ d' = z(H) - z(F) \qquad (3.4) \]

The hit rate measures the number of word boundaries that the model actually detects. The false alarm rate measures the number of boundaries that are incorrectly placed. The learner should maximize the number of hits, while minimizing the number of false alarms. The d′ score (see e.g., MacMillan & Creelman, 2005) reflects how well the model distinguishes hits from false alarms. The following ‘dumb’ segmentation strategies would therefore each result in a d′ score of zero: a) inserting a boundary into every biphone, b) not inserting any boundaries, and c) randomly inserting boundaries. These metrics have some advantages over evaluation metrics that are commonly used in the field of Information Retrieval (IR), i.e., recall, precision, and F-score, since such metrics do not necessarily assign low scores to random models.


Specifically, random models (in which no learning takes place) will obtain high precision scores whenever there are many potential boundaries to be found. (See Fawcett, 2006, for a related discussion.)

Note that the corpus transcriptions do not specify the exact location of a word boundary in cases of cross-word boundary phonological processes, such as assimilations, deletions, degeminations and glide insertions. For example, kan nog (‘can still’) is often transcribed as /kAnOx/. In such cases, it is unclear whether a segment (in this case, /n/) should go with the word to the left or to the right. These phonemes are interpreted as belonging to the onset of the following word, rather than to the coda of the preceding word.
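For concreteness, the three metrics in (3.2)–(3.4) can be computed as in the sketch below (not the original evaluation code); the confusion counts are invented, and z is implemented as the inverse of the standard normal cumulative distribution function.

```python
from statistics import NormalDist

def segmentation_scores(true_pos, false_neg, false_pos, true_neg):
    """Hit rate (3.2), false alarm rate (3.3), and d' (3.4) for boundary detection."""
    hit_rate = true_pos / (true_pos + false_neg)
    false_alarm_rate = false_pos / (false_pos + true_neg)
    z = NormalDist().inv_cdf                  # z-transform: inverse standard normal CDF
    d_prime = z(hit_rate) - z(false_alarm_rate)
    return hit_rate, false_alarm_rate, d_prime

# Invented confusion counts for illustration
print(segmentation_scores(true_pos=372, false_neg=628, false_pos=137, true_neg=863))
```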

Segmentation models

Five models are compared with respect to their abilities to successfully detect word boundaries: a random baseline; a statistical learning model based on transitional probabilities (TP); a statistical learning model based on observed/expected ratios (O/E); StaGe’s Frequency-Driven Constraint Induction (FDCI; i.e., excluding Single-Feature Abstraction); and the complete StaGe model (i.e., including Single-Feature Abstraction). No learning takes place in the random baseline, and, hence, boundaries are inserted at random. More specifically, for each biphone in the test utterance a random decision is made whether to keep the biphone intact or to break up the biphone through the insertion of a word boundary. The two statistical models (TP, O/E) serve to illustrate the segmentation performance of a learner that does not induce constraints, nor construct generalizations. These models use segment-based probabilities directly to predict word boundaries in the speech stream. In addition to evaluating the complete model, the performance of StaGe’s Frequency-Driven Constraint Induction (FDCI) is evaluated separately. This is done to provide a clearer picture of the added value of generalization in the model. That is, we are interested in the contribution of both the statistically induced constraints, and the abstract, natural class-based constraints to the segmentation performance of the model.

In the current experiment, phonotactic knowledge is applied to the segmentation of biphones in isolation (i.e., not considering the context of the biphone). For the statistical learning models, a threshold-based segmentation strategy is used (Cairns et al., 1997; Rytting, 2004; Swingley, 2005). A segmentation threshold can be derived from the properties of the statistical distribution. Since a biphone occurs either more or less often than expected, the threshold for observed/expected ratios is set at O/E = 1.0. That is, if observed is less than expected, a boundary is inserted. The TP segmentation threshold is less straightforward. For every segment x, transitional probability defines a distribution over possible successors y for this segment. Transitional probability


thus defines a collection of multiple statistical distributions, each requiring their own segmentation thresholds. TP thresholds are defined in the same way as O/E ratio: The threshold is the value that would be expected if all segments were equally likely to co-occur. For example, if a segment has three possible successors, then each successor is expected to have a probability of 1/3. If the TP of a biphone is lower than this expected value, a boundary is inserted between the two segments.2

The statistical models insert boundaries whenever observed is smaller than expected, regardless of the size of this deviation. StaGe makes different assumptions about thresholds: A markedness constraint is induced when observed is substantially smaller than expected; a contiguity constraint is induced when observed is substantially larger than expected. In Chapter 2, it was argued that generalization over such statistically induced phonotactic constraints would be valuable to the learner, since the generalizations are likely to affect the segmentation of statistically neutral biphones in a positive way. It thus makes sense to compare StaGe to the threshold-based statistical models. If phonotactic generalizations improve the segmentation performance of the learner, then StaGe should have a better segmentation performance than a statistical model that simply considers all biphone values, and inserts boundaries based on a single segmentation threshold. As a first test for StaGe we set tM = 0.5 (‘observed is less than half of expected’) and tC = 2.0 (‘observed is more than twice expected’). The induced constraints are interpreted in the OT segmentation model (Figure 2.1), which takes xy-sequences (i.e., biphones) as input, and returns either xy or x.y, depending on which candidate is optimal.
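The threshold-based strategy of the statistical models amounts to the following sketch (illustrative only; the O/E ratios below are invented, whereas in the simulations they are estimated from the training set).

```python
def segment_with_threshold(utterance, oe_ratios, threshold=1.0):
    """Insert a boundary into every biphone whose O/E ratio falls below the threshold."""
    output = [utterance[0]]
    for x, y in zip(utterance, utterance[1:]):
        if oe_ratios.get(x + y, 1.0) < threshold:   # unseen biphones are treated as neutral
            output.append(".")
        output.append(y)
    return "".join(output)

oe = {"tl": 0.2, "br": 3.1, "xb": 0.4}              # invented O/E ratios
print(segment_with_threshold("trYxbrurtj@", oe))    # 'trYx.brurtj@'
```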

3.2.2 Results and discussion

Table 3.1 shows the hit rate, false alarm rate, and d′ scores for each model. The table contains the estimated means (obtained through 10-fold cross-validation), as well as the 95% confidence intervals for those means. The random baseline does not distinguish hits from false alarms at all,

which results in a d′ score of approximately zero. While the random baseline inserts the largest number of correct boundaries, it also makes the largest

number of errors. The d′ score thus reflects that this model has not learned anything. The two statistical models (TP, O/E) detect a fair number of word boundaries (about one third of all boundaries in the test set), while keeping the false alarm rate relatively low. The learning performance is illustrated by

the d′ values, which show a large increase compared to the random baseline.

2 Since all segment combinations tend to occur in continuous speech, the TP segmentation thresholds can be approximated by a single threshold: TP = 1/|X|, where |X| is the size of the segment inventory.


Table 3.1: Simulation results for Experiment 1 (biphones in isolation).

Hit rate

Learning   Segmentation   Mean     95% CI lower   95% CI upper
-          random         0.4994   0.4979         0.5009
TP         thresholds     0.3126   0.3102         0.3149
O/E        thresholds     0.3724   0.3699         0.3750
FDCI       OT             0.4774   0.4763         0.4786
StaGe      OT             0.4454   0.4375         0.4533

False alarm rate

Learning   Segmentation   Mean     95% CI lower   95% CI upper
-          random         0.5004   0.4992         0.5016
TP         thresholds     0.1069   0.1060         0.1077
O/E        thresholds     0.1372   0.1354         0.1390
FDCI       OT             0.2701   0.2691         0.2710
StaGe      OT             0.1324   0.1267         0.1382

d′

Learning   Segmentation   Mean      95% CI lower   95% CI upper
-          random         -0.0024   -0.0062        0.0015
TP         thresholds     0.7547    0.7489         0.7606
O/E        thresholds     0.7678    0.7592         0.7765
FDCI       OT             0.5560    0.5532         0.5588
StaGe      OT             0.9785    0.9695         0.9874

Note. The displayed scores are the means obtained through 10-fold cross-validation, along with the 95% confidence interval (CI). TP = transitional probability, O/E = observed/expected ratio, FDCI = Frequency-Driven Constraint Induction, StaGe = Statistical learning and Generalization.


These scores also show that the formula used to implement statistical learning (either TP or O/E) does not have a great impact on segmentation performance. O/E has a higher hit rate than TP, but also has a higher false alarm rate. If FDCI is applied to the O/E ratios, the performance of the learner worsens. This is not surprising: FDCI reduces the scope of the phonotactic learner to the edges of the statistical distribution. That is, boundaries are inserted if O/E < 0.5, and boundaries are not inserted if O/E > 2.0. For the remaining biphones (with 0.5 ≤ O/E ≤ 2.0; the ‘grey’ area), the learner has no other option but to insert boundaries at random. Therefore, the scores for FDCI are closer to the random baseline. A look at the performance of the complete model reveals that StaGe outperforms both the random baseline and the two statistical models in distinguishing hits from false alarms. Through generalization over the statistically induced constraints, the learner has widened the scope of its phonotactic knowledge. The result is that statistically neutral biphones are not segmented at random, nor are they segmented on the basis of their unreliable O/E ratios (which are by definition either higher or lower than 1.0). In contrast, those biphones are affected by phonotactic generalizations, which say that they should either be segmented or not due to their phonological similarity to biphones in either the markedness or contiguity category. This strategy results in the best segmentation performance. Compared to its purely statistical counterpart (O/E), StaGe has both a higher hit rate and a lower false alarm rate (although the difference between false alarm rates is marginal).

This results in a d′ score that is substantially and significantly higher than those of the statistical models (which is reflected in the large difference in means, and non-overlapping confidence intervals). These results show that StaGe, which employs both statistical learning and generalization, is better at detecting word boundaries in continuous speech. These findings provide evidence for a potential role for phonotactic generalizations in speech segmentation. That is, if infants were to construct generalizations on the basis of statistically learned biphone constraints, they would benefit from such generalizations in the segmentation of continuous speech.

While the segmentation thresholds for the statistical learners make a mathematically sensible distinction between high and low probability biphones, and there is evidence that biphone probabilities directly affect speech segmentation by infants (Mattys & Jusczyk, 2001b), there is currently no psycholinguistic evidence supporting any exact value for these thresholds. Moreover, the exact values of the constraint induction thresholds employed by StaGe (tM = 0.5; tC = 2.0) are rather arbitrary. It is therefore important to consider a wider range of possible threshold values. Experiment 2 looks at


the effects of varying thresholds for both the statistical learning models and StaGe.

3.3 experiment 2: thresholds

Experiment 2 asks to what extent the results of Experiment 1 can be attributed to the specific threshold configurations that were used. Specifically, the current experiment aims at determining whether StaGe’s superior performance was due to a single successful threshold configuration, or whether StaGe in general outperforms statistical learners, regardless of the specific thresholds that are used in the model. To this end a Receiver Operating Characteristic (ROC) analysis is done (e.g., MacMillan & Creelman, 2005; Fawcett, 2006; Cairns et al., 1997). Such an analysis is useful for visualizing the performance of a classifier (such as a threshold-based segmentation model), since it portrays the complete performance in a single curve. An ROC curve is obtained by plotting hit rates as a function of false alarm rates over the complete range of possible threshold values. Such a curve thus neutralizes the effect of using a single, specific threshold in a model and gives a more general picture of the model’s performance as the threshold is varied.
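A sketch of how the points of such an ROC curve might be computed for a threshold-based segmenter is given below. This is an illustration only: the toy data are invented, and the simulations derive the candidate thresholds from the training distribution rather than from the test scores.

```python
def roc_points(scored_biphones):
    """scored_biphones: (score, is_boundary) pairs, one per biphone in the test set.
    A boundary is predicted whenever the score falls below the threshold; one
    (false alarm rate, hit rate) point is returned per candidate threshold."""
    points = []
    for threshold in sorted({score for score, _ in scored_biphones}):
        tp = sum(1 for s, b in scored_biphones if s < threshold and b)
        fp = sum(1 for s, b in scored_biphones if s < threshold and not b)
        fn = sum(1 for s, b in scored_biphones if s >= threshold and b)
        tn = sum(1 for s, b in scored_biphones if s >= threshold and not b)
        hit = tp / (tp + fn) if (tp + fn) else 0.0
        false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
        points.append((false_alarm, hit))
    return points

# Toy scored biphones: (O/E ratio, whether a word boundary actually occurs in the biphone)
toy = [(0.2, True), (0.4, True), (0.9, False), (1.5, True), (3.1, False)]
print(roc_points(toy))
```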

3.3.1 Method

Materials and procedure

The materials are identical to those of Experiment 1. The procedure is identical to Experiment 1, with the exception that only the first of the ten cross-validation sets is used.

Segmentation models

The crucial comparison will again be between purely statistical learning models (TP, O/E) and StaGe. Rather than defining a single threshold for the statistical learners, all relevant threshold values are considered. Thresholds are derived from the statistical distribution after the models have processed the training set.3 A simulation on the test set is conducted for each threshold value using the same segmentation principle as in the previous experiment: If the probability of a biphone is lower than the current threshold, a boundary is inserted. For StaGe the situation is slightly more complex, since StaGe uses two induction thresholds (tM, tC). Testing all combinations of values for the two

3 Since there are 1,541 different biphones in the training set, each with a different probability, there are 1,541 relevant threshold values.

A range of thresholds for StaGe is tested, based on the assumption that tM should be smaller than 1.0, and tC should be larger than 1.0. A baseline configuration can thus be formulated: tM = 1.0; tC = 1.0. In this case, all biphones with O/E < 1.0 result in the induction of a markedness constraint, and all biphones with O/E > 1.0 result in the induction of a contiguity constraint. As a consequence, the baseline configuration has no 'neutral probability' category. Such a category is introduced by pushing the thresholds away from 1.0 towards the low- and high-probability edges of the statistical distribution. Varying the induction thresholds causes changes in the number of specific constraints that are induced by the learner, and affects the generalizations that are based on those constraints. For the induction of markedness constraints the values {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0} are used for tM. Similarly, the values {1.0, 1.11, 1.25, 1.43, 1.67, 2.0, 2.5, 3.33, 5.0, 10.0} are used for tC (using logarithmic steps). This results in a total of 10 × 10 = 100 different configurations. Note that the configuration from Experiment 1 (tM = 0.5; tC = 2.0) is exactly in the middle. Thresholds are considered that are both less and more conservative than this configuration.
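The 100 configurations can be enumerated directly from the two lists of threshold values given above; note that the listed contiguity thresholds are the reciprocals of the listed markedness thresholds, so the two ranges mirror each other around 1.0 on a logarithmic scale. The small script below is an assumed helper for illustration, not part of the original simulations.

    # Enumerate the 10 x 10 grid of (tM, tC) configurations tested in Experiment 2.
    t_markedness = [round(0.1 * i, 2) for i in range(1, 11)]             # 0.1 ... 1.0
    t_contiguity = [round(1.0 / t, 2) for t in reversed(t_markedness)]   # 1.0 ... 10.0

    configurations = [(t_m, t_c) for t_m in t_markedness for t_c in t_contiguity]
    assert len(configurations) == 100
    assert (0.5, 2.0) in configurations   # the configuration used in Experiment 1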

3.3.2 Results and discussion

The resulting ROC graph is shown in Figure 3.1. Random performance in an ROC graph is illustrated by the diagonal line. For each point on this line, the hit rate is equal to the false alarm rate, and the corresponding d′ value is zero. d′ increases as the hit rate increases and/or the false alarm rate decreases. Perfect performance would be found in the upper left corner of the ROC space (i.e., where the hit rate is 1, and the false alarm rate is 0). In general, the closer a model's scores are to this point, the better its performance is (due to high hit rate, low false alarm rate, or both). Since StaGe uses O/E ratios for the induction of constraints, it is particularly interesting to look at the difference between the O/E line and the different configurations of StaGe (represented as circles in the graph). The TP performance is included for completeness. It should be noted, however, that TP slightly outperforms O/E for a substantial part of the ROC graph. The graph shows that most of the 100 configurations of StaGe lie above the performance line of the statistical learning models (most notably O/E). This should be interpreted as follows: If we make a comparison between StaGe and statistical learning, based on model configurations with identical false alarm rates, StaGe has a higher hit rate (and therefore a higher d′). Conversely, for configurations with identical hit rates, StaGe has a lower false alarm rate (and therefore a higher d′). This confirms the superior performance of StaGe that was found in Experiment 1. StaGe tends to outperform statistical models, regardless of the specific thresholds that are used (with some exceptions, which are discussed below).


[Figure 3.1 appears here: an ROC graph plotting hit rate (y-axis) against false alarm rate (x-axis) for the random baseline, TP, O/E, and the 100 StaGe configurations.]

Figure 3.1: An ROC (Receiver Operating Characteristic) graph showing the performance of segmentation models over various thresholds. TP = transitional probability, O/E = observed/expected ratio, StaGe = Statistical learning and Generalization. The StaGe baseline configuration (tM, tC = 1.0) is the solid circle, marked with '1'. Extreme contiguity thresholds (tC = 5.0, or tC = 10.0) are marked with '2', whereas configurations with both extreme markedness and extreme contiguity thresholds (tM = 0.1, or tM = 0.2; tC = 5.0, or tC = 10.0) are indicated with '3'.

While the configuration from Experiment 1 (tM = 0.5; tC = 2.0) retrieved about 44% of the word boundaries at a false alarm rate of 13% (resulting in a d′ of 0.98), the results in the current experiment show that this performance can be changed without great loss of d′. For example, the model can be made less conservative, boosting the hit rate to 67%, when using the configuration tM = 0.4, tC = 2.5. In this case, the false alarm rate is 30%. While both the hit rate and false alarm rate are higher, the configuration yields a d′ that is comparable to the original configuration: d′ = 0.97. Similarly, the model can be made more conservative (e.g., tM = 0.3, tC = 1.43, hit rate: 0.32, false alarm rate: 0.07, d′: 1.00). Interestingly, the baseline configuration (the solid circle marked with "1"; tM = 1.0; tC = 1.0) is positioned exactly on the curve of the O/E statistical learning model. In fact, the baseline configuration has a performance that is nearly identical to the statistical O/E model from Experiment 1 (with segmentation threshold O/E = 1.0). In this case there appears to be neither a positive nor a negative influence of constructing generalizations. This is not surprising: Due to the high number of specific constraints in the baseline model, and since specific constraints are typically ranked high in the constraint set, the statistical values tend to overrule the complementary role of phonological similarity. The result is that the model behaves similarly to the purely statistical (O/E) model. Consider, for example, the sequence /tx/ with O(tx)/E(tx) = 1.0451. In the baseline configuration, this sequence is kept intact, since the learner induces a highly-ranked constraint Contig-IO(tx). The statistical O/E model makes the same prediction: No boundary is inserted, due to an O/E ratio that is slightly higher than 1.0. In contrast, a model with tM = 0.5; tC = 2.0 ignores such biphones because of their neutral probability. As a consequence, the sequence /tx/ is affected by a phonotactic generalization, *x ∈ {t,d}; y ∈ {k,x}, which favors segmentation of the sequence. By pushing the induction thresholds to the edges of the statistical distribution, a smaller role is attributed to probability-based segmentation, and a larger role is attributed to phonological similarity to biphones at the edges of the statistical distribution. This is indeed helpful: Inspection of the segmented version of the corpus reveals 4,418 occurrences (85.6%) of /t.x/, against only 746 occurrences (14.4%) of /tx/. Figure 3.1 also shows that there are cases in which StaGe performs worse than the statistical learning models. In these cases, the model uses thresholds which are too extreme. Specifically, the model fails for configurations in which the threshold for contiguity constraints is high (tC = 5.0 or tC = 10.0, marked with "2" in the graph). In such cases there are too few contiguity constraints to provide counter pressure against markedness constraints.

The model therefore inserts too many boundaries, resulting in a high number of errors. Finally, the worst scores are obtained for configurations that, in addition to an extreme contiguity threshold, employ an extreme markedness threshold (tM = 0.1 or tM = 0.2; tC = 5.0 or tC = 10.0, marked with "3" in the graph). There are not enough constraints for successful segmentation in this case, and the model's performance gets closer to random performance. The current analysis shows that the superior performance of StaGe, compared to purely statistical models, is stable for a wide range of thresholds. This is an important finding, since there is currently no human data supporting any specific value for these thresholds. The findings here indicate that the learner would benefit from generalization for any combination of thresholds, except when both thresholds are set to 1.0 (in which case generalization has no effect), or when extreme thresholds are used (in which case there are too few constraints). A possible criticism of these experiments is that statistical learning was not implemented in accordance with the original proposal by Saffran, Newport, and Aslin (1996). In Experiments 1-2, statistical thresholds were used for the direct application of probabilistic phonotactic cues to the speech segmentation problem. In contrast, the work on statistical learning by Saffran, Newport, and Aslin (1996) proposes that word boundaries are inserted at troughs in transitional probability. In Experiment 3, a series of simulations is conducted which is closer to the original proposal. Models are used which benefit from context in the segmentation of continuous speech.

3.4 experiment 3: biphones in context

In Experiment 3, the same learning models are tested as in Experiment 1. However, the models use a different segmentation strategy. Rather than considering biphones in isolation, the learner uses the immediate context of the biphone. That is, for each biphone xy, the learner also inspects its neighboring biphones wx and yz.

3.4.1 Method

Materials and procedure
The materials and procedure are identical to those of Experiment 1. The same training and test sets are used.


Segmentation models
For the statistical models (TP and O/E), the trough-based segmentation strategy is implemented as described in Brent (1999a) (and which is a formalization of the original proposal by Saffran, Newport, and Aslin (1996)). Whenever the statistical value (either TP or O/E) of the biphone under consideration (xy) is lower than the statistical values of its adjacent neighbors, i.e. one biphone to the left (wx) and one to the right (yz), a boundary is inserted into the biphone xy. Note that trough-based segmentation is 'threshold-free', since it only considers relative values of biphones. Again, a simulation is included using Frequency-Driven Constraint Induction (FDCI) (i.e., without applying generalization) to show how much of StaGe's performance is due to statistical constraint induction, and how much is due to feature-based abstraction. For FDCI and the complete version of StaGe the original threshold configuration with thresholds tM = 0.5 and tC = 2.0 is used. In this experiment we ask whether this configuration is better at detecting word boundaries in continuous speech than the trough-based segmentation models. The input to the OT segmentation model is the same as for the trough-based model, namely wxyz sequences. Segmentation candidates for these sequences are wxyz, w.xyz, wx.yz, and wxy.z (see Figure 2.1). The model inserts a word boundary into a biphone whenever wx.yz is optimal.4
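The trough-based decision rule for the statistical models can be stated compactly; the following sketch (with an assumed list-of-values input format) is an illustration of that rule rather than the code used in the simulations.

    # Trough-based segmentation (Brent, 1999a; after Saffran, Newport, & Aslin,
    # 1996): a boundary is inserted into biphone xy whenever its statistical
    # value is lower than that of both neighbors, wx and yz. Utterance-initial
    # and utterance-final biphones have no neighbor on one side and are skipped.

    def trough_boundaries(values):
        """values[i] is the TP or O/E value of the i-th biphone of one utterance;
        returns the indices of the biphones that receive a boundary."""
        return [i for i in range(1, len(values) - 1)
                if values[i] < values[i - 1] and values[i] < values[i + 1]]

    print(trough_boundaries([1.4, 0.9, 0.3, 1.1, 0.8]))   # -> [2]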

3.4.2 Results and discussion

Table 3.2 shows the estimated means, and 95% confidence intervals, of the hit rates, false alarm rates, and d′ scores for each model. In this experiment the random baseline performs slightly above chance, due to the bias of not inserting boundaries at utterance-initial and utterance-final biphones. In general, using context has a positive impact on segmentation: The performance of all models has increased compared to the performance found in Experiment 1, where biphones were considered in isolation. As in Experiment 1, FDCI by itself is not able to account for the superior performance. By adding Single-Feature Abstraction to the frequency-driven induction of constraints, the model achieves a performance that is better than that of both statistical models (as measured by d′). While the difference in d′ is smaller than in Experiment 1, the difference is significant (due to non-overlapping confidence intervals).

4 The segmentation models tested here do not consider the initial and final biphones of an utterance as potential boundary positions, since the current segmentation setting requires neighboring biphones on both sides. No boundaries are therefore inserted in these biphones. The size of this bias is reflected in the random baseline, which uses the same wxyz-window, but makes random decisions with respect to the segmentation of xy.


Table 3.2: Simulation results for Experiment 3 (biphones embedded in context).

Hit rate
  Learning   Segmentation   Mean     95% CI lower   95% CI upper
  -          random         0.4900   0.4885         0.4915
  TP         troughs        0.6109   0.6093         0.6125
  O/E        troughs        0.5943   0.5930         0.5955
  FDCI       OT             0.3700   0.3684         0.3716
  StaGe      OT             0.4135   0.4062         0.4207

False alarm rate
  Learning   Segmentation   Mean     95% CI lower   95% CI upper
  -          random         0.4580   0.4575         0.4586
  TP         troughs        0.2242   0.2235         0.2249
  O/E        troughs        0.2143   0.2138         0.2149
  FDCI       OT             0.1478   0.1471         0.1484
  StaGe      OT             0.0913   0.0882         0.0945

d′
  Learning   Segmentation   Mean     95% CI lower   95% CI upper
  -          random         0.0803   0.0765         0.0840
  TP         troughs        1.0399   1.0346         1.0452
  O/E        troughs        1.0301   1.0258         1.0344
  FDCI       OT             0.7142   0.7096         0.7188
  StaGe      OT             1.1142   1.1081         1.1203

Note. The displayed scores are the means obtained through 10-fold cross-validation, along with the 95% confidence interval (CI). TP = transitional probability, O/E = observed/expected ratio, FDCI = Frequency-Driven Constraint Induction, StaGe = Statistical learning and Generalization.
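The d′ values in Table 3.2 are consistent with the standard signal-detection definition, d′ = z(hit rate) - z(false alarm rate), where z is the inverse of the standard normal cumulative distribution function; the short check below (illustrative helper code only) reproduces the TP row.

    from statistics import NormalDist

    def d_prime(hit_rate, false_alarm_rate):
        # d' = z(H) - z(F), with z the inverse standard-normal CDF.
        z = NormalDist().inv_cdf
        return z(hit_rate) - z(false_alarm_rate)

    print(round(d_prime(0.6109, 0.2242), 2))   # ~1.04, matching the TP row above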

As in Experiment 1, the formula used to implement statistical learning (TP vs. O/E) does not seem to have a substantial impact on the segmentation results. Note that in this experiment the statistical models have a higher hit rate than StaGe. However, this coincides with a much higher false alarm rate. In contrast, StaGe is more conservative: It places fewer, but more reliable, word boundaries than the models based on statistical learning.

The net result, as measured by d′, is that StaGe is better at distinguishing sequences with and without word boundaries. Although the infant should eventually learn to detect all word boundaries, the relatively small number of hits detected by the model does not necessarily pose a large problem for the learning infant for two reasons. First, phonotactics is merely one out of several segmentation cues. Some of the boundaries that are not detected by the model might therefore still be detected by other segmentation cues. Second, the high d′ score of the model is mainly the result of a low false alarm rate. The model thus makes relatively few errors. Such an undersegmentation strategy may in fact result in accurate proto-words, i.e. chunks of speech which are larger than words, but to which meaning can easily be attributed (e.g. 'thisis.thedoggy'). In contrast, oversegmentation (e.g. 'this.is.the.do.ggy') results in inaccurate lexical entries to which no meaning can be attributed. The tendency towards undersegmentation, rather than oversegmentation, is supported by developmental studies (e.g., Peters, 1983). See Appendix C for a selection of marked-up utterances from the Spoken Dutch Corpus which exemplify this undersegmentation behavior. To get an impression of how robust the results in the current experiment are, a range of threshold values was tested for StaGe. Because of the computational cost involved in running this type of simulation, a smaller range of thresholds is considered than in Experiment 2, using only the first of our ten cross-validation sets. A total of 9 different configurations were tested (3 thresholds for each constraint category: tM = 0.4, 0.5, 0.6; tC = 1.67, 2.0, 2.5). Out of these 9 configurations, one configuration (tM = 0.6, tC = 1.67) performed worse than the statistical learning models (hit rate: 0.3712; false alarm rate: 0.1013; d′: 0.9455). The best performance was obtained using tM = 0.4 and tC = 2.5 (hit rate: 0.5849; false alarm rate: 0.1506; d′: 1.2483). It thus appears that, in the current setting, better performance can be obtained by pushing the thresholds further towards the low- and high-probability edges. The results of Experiment 3 are similar to the results of Experiments 1 and 2, and therefore provide additional support for the hypothesis that learners benefit from generalizations in the segmentation of continuous speech. Regardless of whether the learner employs a segmentation strategy that considers biphones in isolation, or whether the learner exploits the context of neighboring biphones, in both cases StaGe outperforms models that rely solely on biphone probabilities. These findings show that the combined strengths of statistical learning and generalization provide the learner with more reliable cues for detecting word boundaries in continuous speech than statistical learning alone (i.e. without generalization).

In Experiments 1–3 StaGe was tested at the end point of learning, i.e. after the model had processed the complete training set. Experiment 4 looks at how the segmentation performance of the model develops as a function of the amount of input that has been processed by the model.

3.5 experiment 4: input quantity

The current experiment serves to illustrate developmental properties of the model. That is, given the mechanisms of statistical learning and generalization, how does the model's segmentation behavior change as more input is given to the learner? It should be stressed that the model's trajectory should not be taken literally as a time course of infant phonotactic learning. Several unresolved issues (discussed below) complicate such a comparison. Nevertheless, the experiment allows us to better understand StaGe's learning behavior. In particular, it is a valid question to ask whether the model has reached stable segmentation performance after processing the training set. In addition, the order in which different constraints are learned by the model will be described.

3.5.1 Modeling the development of phonotactic learning

The current model works on the assumption that the segment inventory and feature specifications have been established prior to phonotactic learning. It is, however, likely that the processes of segmental acquisition and phonotactic acquisition will, at least partially, overlap during development. Since StaGe makes no prediction regarding the development of the speech sounds themselves (segments, features), and since there exists no corpus documenting such a development, a static, adult-like segment inventory will be assumed here. StaGe models infant phonotactic learning as a combined effort of statistical learning and generalization. Both mechanisms have been shown to be available to 9-month-old infants (biphone probabilities: Mattys & Jusczyk, 2001b; similarity-based generalization: Saffran & Thiessen, 2003; Cristià & Seidl, 2008). StaGe can therefore be thought of as modeling phonotactic learning in infants around this age. Unfortunately, developmental data regarding the exact ages (or input quantities) at which each of these mechanisms become active in phonotactic learning are currently lacking. In the current experiment both mechanisms will be assumed to be used from the start. StaGe could in principle, however, start with statistical learning and only start making generalizations after the model has accumulated a certain critical amount of input data.

Another issue with respect to modeling development concerns the model's use of memory (Brent, 1999b). The current implementation of the model assumes a perfect memory: All biphones that are encountered in the input are stored, and are used for constraint induction. Hence, phonotactic constraints are derived from accumulated statistical information about biphone occurrences. While the model's processing of the input is incremental, the perfect memory assumption obviously is a simplification of the learning problem. Finally, the current set of simulations relies on static, manually-set thresholds for the induction of markedness and contiguity constraints. StaGe is used in the same form as in Experiment 1 (segmentation of biphones in isolation; tM = 0.5, tC = 2.0). The difference with the previous experiments is that the performance of the model is tested at various intermediate steps, rather than only testing the model at the end point of learning.

3.5.2 Method

Materials and procedure
The materials are identical to those of Experiment 1. Only the first of the ten cross-validation sets is used. The procedure is different: Rather than presenting the complete training set to the model at once, the model is presented with input in a stepwise fashion. The total training set (= 100%) contains about 2 million data points (biphones). Starting with an empty training set, the set is filled with training utterances which are added to the set in random order. The model is trained repeatedly after having processed specified percentages of the training set, using logarithmic steps. For each intermediate training set, the model's performance (hit rate, false alarm rate) on the test set is measured. As a consequence, the model's performance progresses from random segmentation (at 0%) to the performance reported in Experiment 1 (100%).
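The stepwise procedure can be pictured as in the sketch below; train_on and evaluate are hypothetical stand-ins for StaGe's incremental constraint induction and for the hit-rate/false-alarm-rate measurement on the test set, and the checkpoint spacing merely illustrates the logarithmic steps.

    import random

    def developmental_run(training_utterances, model, test_set, n_checkpoints=12):
        random.shuffle(training_utterances)          # utterances are added in random order
        total = len(training_utterances)
        # Logarithmically spaced checkpoints between a single utterance and 100%.
        checkpoints = sorted({max(1, int(total ** (i / n_checkpoints)))
                              for i in range(1, n_checkpoints + 1)})
        results, processed = [], 0
        for target in checkpoints:
            model.train_on(training_utterances[processed:target])  # keep earlier counts
            processed = target
            results.append((target, model.evaluate(test_set)))     # (quantity, (hit, FA))
        return results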

3.5.3 Results and discussion

The developmental trajectory of StaGe is shown in Figure 3.2, where the hit rate and false alarm rate are plotted as a function of the input quantity (measured as the number of biphones on a log10 scale). The model's segmentation performance appears to become stable after processing ± 15,000 biphone tokens (log10 ≈ 4.2), although the false alarm rate still decreases slightly after this point.


[Figure 3.2 appears here: hit rate and false alarm rate plotted against the number of biphones processed (log10 scale).]

Figure 3.2: The development of StaGe, measuring segmentation performance on the test set as a function of input quantity in the training set. Initially, the learner segments at random (hit rate and false alarm rate are 0.5). The learner starts by inducing contiguity constraints (reducing the number of boundaries that are posited) and induces markedness constraints only after a substantial amount of input has been processed.

Starting from random segmentation in the initial state, the model shows effects of undersegmentation after processing only a minimal amount of input: Both the hit rate and the false alarm rate drop substantially. Interestingly, the false alarm rate stays low throughout the whole trajectory. The hit rate starts increasing after a substantial amount of input (± 650 biphones; log10 ≈ 2.8) has been processed, and becomes relatively stable at ± 15,000 biphones.


Learning where not to put boundaries thus precedes the insertion of word boundaries. To explain the segmentation behavior of the model we consider the constraints that are induced at the various developmental stages. The first constraints to emerge in the model are contiguity constraints. This is caused by overrepresentations in the statistical distribution at this point: Since only a small amount of the total number of biphones has been processed, all biphones that do occur in this smaller set are likely to have high O/E ratios. The model tends to induce contiguity generalizations that affect CV and VC biphones (e.g., Contig-IO(x ∈ {t,d,s,z}; y ∈ {A,@})), and thereby prevents insertion of boundaries into such biphones. The segmentation of CC and VV sequences is left to random segmentation (and a relatively small number of specific contiguity constraints). Among the high-ranked specific constraints are several Dutch function words (e.g., Contig-IO(In), 'in'; Contig-IO(d@), 'the'), as well as transitions between such words (e.g., Contig-IO(nd)). As a consequence, function words tend to be glued together (e.g., /Ind@/). This type of undersegmentation continues to exist throughout the learning trajectory, and appears to be a general property of the model. As the distribution becomes more refined, statistical underrepresentations start appearing, and more biphones fall into the markedness category. The growing number of markedness constraints causes the hit rate to increase. Some of the first specific markedness constraints are *@@, *tn, *nn, *e@, and *mt. Generalizations start appearing after the model has processed about 1,500 biphones (log10 ≈ 3.2), such as *x ∈ {@}; y ∈ {I,E,i,@}, and *x ∈ {n}; y ∈ {l,r}. The early-acquired markedness constraints seem to affect other types of biphones (CC, VV) than the early-acquired contiguity constraints (CV, VC). While the model ultimately learns a mixture of markedness and contiguity constraints, affecting all types of biphones, this distinction is a general property of the model. Taking the modeling simplifications for granted, some of the findings here are supported by developmental studies (e.g., undersegmentation: Peters, 1983). Conversely, new findings that follow from the computational model could provide a basis for future experimental testing in developmental research.

3.6 general discussion

This chapter investigated the potential role of phonotactic generalizations in speech segmentation. It was hypothesized that learners would benefit from constructing phonotactic generalizations in the segmentation of continuous speech.

This hypothesis was confirmed in a series of computer simulations, which demonstrated that StaGe, which acknowledges a role both for statistical learning and for generalization, and which uses a modified version of OT to regulate interactions between constraints, was better at detecting word boundaries in continuous speech data than models that rely solely on biphone probabilities. Specifically, the generalizations seem to positively affect the segmentation of biphones whose phonotactic probability cannot reliably be classified as being either high or low. The superior performance of StaGe was found regardless of the segmentation window that was used (i.e., biphones in isolation or in context). An analysis of the developmental trajectory of the model indicates that the model learns contiguity constraints before markedness constraints, leading to strong initial undersegmentation. StaGe relies on the edges of the statistical distribution, rather than on the whole distribution. The model employs statistical thresholds to filter out biphones that cannot be reliably classified as being of either high or low probability. Successful generalization crucially depends on this categorization: Experiment 2 showed that generalization has no effect if all biphones are taken into account in the generalization process. In addition, the experiment shows that generalization fails when the thresholds are set to extreme values. StaGe thus provides an explicit description of how generalization relies on statistical learning: statistical learning provides a basis for generalization. This basis is constructed by the model through the use of thresholds on the values that are obtained through statistical learning. Although it is not known whether infants actually do learn phonotactic generalizations from continuous speech, and use such generalizations in speech segmentation, the current chapter provides indirect support for such a strategy. The simulations show that infants would benefit from such an approach in the segmentation of continuous speech. Of course, it remains to be determined by experimental testing whether infants actually exploit the benefits of both statistical learning and generalization in a way that is predicted by the model. Nevertheless, the plausibility of StaGe as a model of infant phonotactic learning is based on psycholinguistic evidence for the learning mechanisms that it combines. Infants' sensitivity to the co-occurrence probabilities of segment pairs has been demonstrated (Jusczyk et al., 1994; Mattys & Jusczyk, 2001b; White et al., 2008). In addition, infants' capacity to abstract over linguistic input in order to construct phonotactic generalizations has been demonstrated (Saffran & Thiessen, 2003; Chambers et al., 2003). A recent series of artificial grammar learning experiments by Finley and Badecker (2009) provides further evidence for the role of feature-based generalizations in phonological learning. There is thus evidence for both statistical learning and feature-based generalization in phonotactic learning.


In addition, StaGe is compatible with available evidence about infants' representational units. The generalization algorithm implements phonological features to express the similarity between segments. Although evidence for the psychological reality of such abstract phonological features is limited, infants have been shown to be sensitive to dimensions of acoustic similarity (Jusczyk, Goodman, & Baumann, 1999; Saffran & Thiessen, 2003; White et al., 2008). Furthermore, several studies suggest that abstract phonological features may constrain infant phonotactic learning (Cristià & Seidl, 2008; Seidl & Buckley, 2005). Given these findings, StaGe seems to make reasonable assumptions about the mechanisms and representations that are involved in phonotactic learning by infants. However, the model is not committed to these representations per se. The statistical learning component of the model, Frequency-Driven Constraint Induction, could be applied to other units, such as syllables or allophones. The generalization mechanism, Single-Feature Abstraction, could, in principle, work with different types of features. It remains to be seen how differences in assumptions about the representational units that are processed by the model would affect the speech segmentation performance of the model. In sum, the mechanisms used by StaGe have received much attention in the psycholinguistic literature, and appear to be available to infant language learners. StaGe provides a computational account of how statistical learning and generalization might interact in the induction of phonotactics from continuous speech. The combined strengths of statistical learning and generalization provide the learner with a more reliable cue for detecting word boundaries in continuous speech than statistical learning alone. The computational study in this chapter thus demonstrates a potential role for phonotactic generalizations in speech segmentation. In the next chapter, two open issues are addressed. First, the issue of to what extent the generalizations learned by StaGe reflect known phonological constraints is addressed. Second, it is investigated whether the model can account for human segmentation behavior, thereby providing the first piece of evidence for the psychological plausibility of the model.

4

MODELING OCP-PLACE AND ITS EFFECT ON SEGMENTATION

The previous chapters have shown that the computational mechanisms of statistical learning and generalization can account for the learning of phonotactic constraints from continuous speech. Furthermore, phonotactic generalizations improve the performance of the learner in computer simulations of speech segmentation. The success of the model in these simulations raises two issues for further investigation. While the model improves segmentation (as compared to models that rely solely on segment co-occurrence probabilities), it is not clear whether the model should also be taken as a possible account for the induction of linguistic constraints. In order to be a valuable addition to earlier linguistic models of constraint induction (e.g., Albright, 2009; Hayes & Wilson, 2008), the model should be able to account for phonotactic constraints that have been proposed in theoretical linguistics. The first issue is thus the following: To what extent can the model induce constraints which have been proposed in theoretical phonology? This is an important issue for the following reason. In the previous chapters, it was argued that generalization mechanisms such as the one implemented in StaGe may provide a crucial link between statistical approaches to language acquisition, and traditional linguistic approaches which assume more abstract representations. That is, adding a generalization mechanism to the statistical learning of phonotactic constraints leads to emerging (rather than innate) abstract phonological constraints (see also, Albright, 2009; Albright & Hayes, 2003). If the model is to make any claim with respect to the acquisition of linguistic constraints, it is important to see whether StaGe has the potential to model well-studied phonological phenomena. The second issue concerns the psychological plausibility of the model. While the model is successful in corpus simulations, it is an open issue to what extent the model can account for human segmentation behavior. Two predictions of the model require empirical validation with human segmentation data. First, StaGe induces a mixture of specific (segment-based) and abstract (feature-based) constraints. Do human listeners segment speech in a similar fashion? That is, do they use specific constraints, abstract constraints, or both? Second, StaGe is built on the assumption of bottom-up learning from the speech stream, rather than top-down learning from the lexicon. Do human listeners rely on phonotactic knowledge that is derived from continuous speech? Or do phonotactic constraints for speech segmentation originate from the lexicon?

In this chapter, these two issues are addressed in conjunction by investigating to what extent StaGe can provide a learnability account of an abstract phonotactic constraint which has been shown to affect speech segmentation. The chapter connects to the work of Boll-Avetisyan and Kager (2008) who show that OCP-Place (a phonological constraint which states that sequences of consonants sharing place of articulation should be avoided) affects the segmentation of continuous artificial languages by Dutch listeners. This leads to two specific questions: (i) Can StaGe induce constraints which resemble OCP-Place? (ii) Can StaGe account for the effect that OCP-Place has on the segmentation of continuous speech by Dutch listeners? The setup of the chapter is as follows. Section 4.1 gives an overview of OCP-Place, and its effect on segmentation. In Section 4.2, the setup for simulations of OCP-Place will be discussed, including the approach to predicting human segmentation data. The chapter will proceed in Section 4.3 with examining the output of the model, and the predictions that follow from the model with respect to word boundaries. The output of the model will be used as a predictor for human segmentation data in Sections 4.4 and 4.5. The implications of the findings will be discussed in Section 4.6, emphasizing the contribution of the chapter to supporting the Speech-Based Learning hypothesis.

4.1 introduction

Studies of phonological systems support the view that phonotactic constraints are abstract in the sense of referring to natural classes, defined by phonological features. The Obligatory Contour Principle (OCP) is a general principle which restricts the co-occurrence of elements that share phonological properties (e.g., Goldsmith, 1976; Leben, 1973; McCarthy, 1986, 1988). For example, the constraint OCP-Place (McCarthy, 1988) states that sequences of consonants that share place of articulation (‘homorganic’ consonants) should be avoided. In many languages, this constraint has the particular gradient effect that pairs of labials (across vowels) tend to be underattested in the lexicon (Arabic: Frisch et al., 2004; English: Berkley, 2000; Muna: Coetzee & Pater, 2008; Japanese: Kawahara, Ono, & Sudo, 2006; Dutch: Kager & Shatzman, 2007). Different views exist regarding the status of OCP-Place in phonological theory. McCarthy (1988) proposes that the OCP is one of three universal phonological primitives (along with feature spreading and delinking). The analysis assumes that the OCP holds on all feature-based tiers. Evidence comes from word roots in Arabic, which avoid not only identical consonants


(e.g., *smm), but also non-identical consonants sharing place of articulation (e.g., *kbm). The occurrence of consecutive labials is restricted, since the OCP forbids adjacent elements on the labial tier. A different view is taken by Boersma (1998) who argues that the OCP is not an autosegmental primitive, but rather is the result of functional principles. In the functional interpretation, OCP effects emerge from an interaction of constraints against perceptual confusion (i.e., adjacent identical elements are hard to perceive), and constraints against repetition of articulatory gestures. The consequence of this approach is that the OCP is regarded not as an innate phonological device. Rather, OCP is reduced to more fundamental articulatory and perceptual constraints. A third view is that OCP-Place is a constraint that is the result of abstraction over word forms (or roots) in the lexicon. In an analysis of OCP-Place in Arabic, Frisch et al. (2004) define a gradient constraint, whose degree of violation (as measured by the observed/expected ratio) is a function of the similarity between consonant pairs in terms of shared natural classes. Highly similar homorganic consonants are strongly underrepresented in the lexicon of Arabic roots, whereas relatively dissimilar pairs are underrepresented to a lesser degree. Frisch et al. distinguish a functional, diachronic perspective from a formal, synchronic perspective. In their view, similarity avoidance shapes the structure of the lexicon of the language. During language acquisition, individual speakers learn an abstract phonotactic constraint, OCP-Place, from the set of root forms in this lexicon. In the view of Frisch et al. (2004), abstract phonotactic constraints are the result of generalization over statistical patterns in the lexicon, and are thus not directly functionally grounded (contrary to the view of Boersma, 1998; Hayes, 1999). Frisch et al. (2004) propose a similarity metric as a predictor for gradient OCP-Place effects in the lexicon. In their account, the degree of underattestation in the lexicon is determined by the number of natural classes that are shared by two consonants (similarity = # shared natural classes / (# shared natural classes + # non-shared natural classes)). While the metric of Frisch et al. is successful in predicting gradient OCP effects in the lexicon of Arabic roots, they leave open the question of how individual speakers acquire the abstract OCP-Place constraint from the lexicon.
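The similarity metric just quoted is easy to state procedurally; in the sketch below, the natural classes are represented simply as labelled sets, which is a hypothetical placeholder for the feature-system-derived classes that Frisch et al. actually use.

    # Frisch et al. (2004): similarity(c1, c2) =
    #   shared natural classes / (shared + non-shared natural classes).
    # The class inventories below are toy values for illustration only.

    def frisch_similarity(classes_c1, classes_c2):
        shared = classes_c1 & classes_c2
        non_shared = classes_c1 ^ classes_c2
        return len(shared) / (len(shared) + len(non_shared))

    b = {"consonant", "labial", "obstruent", "voiced"}
    m = {"consonant", "labial", "sonorant", "nasal", "voiced"}
    n = {"consonant", "coronal", "sonorant", "nasal", "voiced"}
    print(frisch_similarity(b, m))   # 0.5   -> stronger predicted avoidance of /b/-/m/
    print(frisch_similarity(b, n))   # ~0.29 -> weaker predicted avoidance of /b/-/n/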

OCP-Place does not reflect a universal distribution, regulating consonant occurrences in all languages. Languages differ in their manifestation of OCP effects, in particular with respect to the strength of OCP-Place for sequences agreeing on other feature dimensions. Coetzee and Pater (2008) compare the avoidance of homorganic consonants in Muna and Arabic. In both languages the degree of attestedness of homorganic consonants is affected by similarity on other featural dimensions. The languages differ, however, in the relative strengths of OCP-Place effects for different natural classes. While in Arabic similarity avoidance is strongest for homorganic consonants that also agree in sonorancy, Muna shows a more balanced OCP-Place effect for agreement in voicing, sonorancy, and stricture. This observation is problematic for the similarity-based account of Frisch et al. (2004). As Coetzee and Pater point out, the similarity metric predicts only a limited amount of cross-linguistic variation in similarity avoidance. Specifically, only differences in the size of a natural class would lead to a cross-linguistic difference. Frisch et al. (2004) point out that OCP-Place is weakest for coronals, due to the large number of coronals in the segment inventory of Arabic. In Muna, however, the number of labials, coronals, and dorsals varies only slightly. Nevertheless, Coetzee and Pater report a substantial difference in the degree of underattestation between these classes, dorsals being the most underattested, followed by labials, and then coronals. The similarity metric thus is not able to account for this variation, indicating that the learner needs to learn about the language-specific co-occurrence patterns through exposure to such patterns during learning. This limitation of the similarity metric for predicting cross-linguistic variation is supported by analyses of similarity avoidance in Dutch. In Dutch, there appears to be no significant correlation at all between the similarity metric and the degree of underattestedness in the lexicon (Kager, Boll-Avetisyan, & Chen, 2009). The consequence is that language learners cannot rely on a general mechanism for similarity avoidance. Instead, language-specific OCP effects will have to be acquired from experience with language data (Boersma, 1998; Coetzee & Pater, 2008; Frisch et al., 2004; Frisch & Zawaydeh, 2001). Coetzee and Pater propose a universal set of multiple OCP-Place constraints (pharyngeal, dorsal, coronal, labial) relativized for agreement on subsidiary features (sonorancy, stricture, voice, emphatic, prenasalization). Their account is based on weighted constraints in Harmonic Grammar (Legendre et al., 1990; Pater, 2009). The model obtains language-specific gradient wellformedness scores for consonant sequences through language-specific weighting of the universal constraints. The learner is presented with sequences of non-identical homorganic consonants, which are distributed according to their frequency of occurrence in the lexicon. Constraint weights are adjusted whenever the learner's own output deviates from the observed learning datum. Weights are adjusted by decreasing the weights of constraints that are violated by the learning datum, and increasing the weights of constraints that are violated by the learner's own output. After training the model is used to create acceptability scores for consonant sequences by assessing the Harmony score of a form relative to its most harmonic competitor. The acceptability scores are then used to predict the O/E ratios of sequences in the lexicon.

While this approach is feature-based, and able to capture cross-linguistic gradient wellformedness, it is based on the assumption of a universal constraint set that is given to the learner. What is going to be presented in this chapter is an account of OCP-Place effects using induced constraints. In Chapter 2, a computational model was proposed for the induction of phonotactic constraints. The model works on the assumption that the learner constructs feature-based generalizations on the basis of statistically learned co-occurrence patterns. This view is similar to the theoretical proposal of Frisch et al. (2004) that individual learners acquire OCP-Place through abstraction over statistical patterns in the lexicon. It is thus worthwhile to investigate whether StaGe can provide a formal account of the induction of OCP-Place. Another reason for pursuing the induction of OCP-Place using StaGe is that OCP-Place has been found to affect speech processing in general, and speech segmentation in particular. Since StaGe was developed as a model to account for the induction of phonotactic cues for segmentation, the model potentially provides a unified account of the learnability of OCP-Place, and its effect on segmentation. Note that StaGe has been proposed as a bottom-up model ('learning from continuous speech'), rather than a top-down ('learning from the lexicon') model. The model can, however, be trained on co-occurrence patterns from either source, and this chapter will therefore explore both speech-based and lexicon-based learning. Before outlining the proposal for the induction of OCP-Place using StaGe, evidence for the psychological reality of OCP-Place will be briefly discussed, focusing on the role of OCP-Place in speech segmentation.
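As an aside, the error-driven weight adjustment described above for Coetzee and Pater's Harmonic Grammar learner amounts to a perceptron-style update; the sketch below is only an illustration of that rule, with an assumed dictionary representation of violation profiles and an arbitrary learning rate.

    def hg_update(weights, datum_violations, learner_violations, rate=0.1):
        """One update, applied when the learner's output deviates from the datum:
        constraints violated by the observed datum have their weights decreased,
        constraints violated by the learner's own output have their weights increased."""
        for constraint, count in datum_violations.items():
            weights[constraint] = weights.get(constraint, 0.0) - rate * count
        for constraint, count in learner_violations.items():
            weights[constraint] = weights.get(constraint, 0.0) + rate * count
        return weights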

4.1.1 OCP effects in speech segmentation

In addition to underattestation of OCP-violating words (such as /smaf/) in the lexicons of various languages, OCP has been shown to affect listeners' behavior in a variety of tasks. For example, Berent and Shimron (1997) found that the avoidance of root-initial geminates (i.e., repeated consonants at the beginning of a root) in Hebrew root morphemes affects the rating of nonwords by native Hebrew speakers. In accordance with the structure of Hebrew roots, root-initial gemination was judged to be unacceptable. Similar results were obtained in a study of wellformedness judgments by native speakers of Arabic. Novel roots containing a violation of OCP-Place were judged to be less word-like than novel roots which did not violate OCP-Place (Frisch & Zawaydeh, 2001). The effect of OCP-Place was found after controlling for possible effects of analogy to existing words, supporting the psychological reality of an abstract phonotactic constraint. OCP-Place has also been found to affect speech perception by English listeners, particularly in phoneme identification tasks (Coetzee, 2005).

There is also evidence that OCP-Place has an effect on speech segmentation. The cue provided by OCP-Place to segmentation relies on the following observation. If consonants sharing place of articulation are avoided within words in a language, then the occurrence of such consonant pairs in the speech stream would indicate the presence of a word boundary to the listener. Interpreting such consonants as belonging to a single word would constitute a violation of OCP-Place. In contrast, breaking up a sequence of homorganic consonants through the insertion of a word boundary would avoid violation of OCP-Place, resulting in two distinct words with a higher degree of phonotactic wellformedness. In a word spotting task, Kager and Shatzman (submitted) found that Dutch listeners are faster at detecting labial-initial words when preceded by a labial-initial syllable (e.g., bak 'bin' in foebak) than when preceded by a coronal-initial syllable (e.g., soebak). Supposedly, the phonotactic wellformedness of soebak slows down the detection of the embedded target bak. In contrast, the violation of OCP-Place in foebak results in a phonotactically enforced word boundary foe.bak which aligns with the onset of the target word bak. Such a boundary would trigger initiating a new attempt at lexical access, which would speed up the detection of the embedded target word (see e.g., Cutler & Norris, 1988). Complementing earlier studies which show that listeners attend to feature discontinuity in segmentation (e.g., a violation of vowel harmony in Finnish; Suomi et al., 1997), Kager and Shatzman provide evidence that listeners also use feature continuity (as reflected in a violation of OCP-Place) as a cue for finding word boundaries in the speech stream. Additional evidence for OCP effects in speech segmentation comes from artificial language learning experiments. Boll-Avetisyan and Kager (2008) tested whether human learners have indeed internalized an abstract OCP-Place constraint, and use it as a cue for segmentation. The artificial languages to which Dutch participants were exposed consisted of a continuous stream of CV syllables where the consonant was either a labial (P) or a coronal (T). The stream contained sequences of two labials followed by one coronal (e.g., ...TPPTPPT...). The syllables in the language were concatenated in such a way that the resulting stream contained no statistical cues to word boundaries. The language could thus be segmented in three logically possible ways: TPP, PPT, or PTP. In the test phase, participants were presented with a two-alternative forced-choice task, in which they had to determine which words were part of the language they had just heard. If participants use OCP-Place, they should have a significant preference for PTP segmentations (which respect the constraint) over PPT and TPP segmentations (which violate it).

Indeed, participants showed a preference for PTP words over PPT words and TPP words, indicating that participants used OCP-Place to segment the speech stream. The study by Boll-Avetisyan and Kager (2008) shows that OCP-Place can be used to extract novel words from the speech stream, indicating a potential role for OCP-Place in word learning. In sum, there is evidence that OCP-Place is a feature-based phonotactic constraint which affects speech processing in a variety of languages. Due to the language-specific nature of OCP effects, learners need to induce OCP-Place from experience with input data. The observation that learning OCP-Place requires abstracting over statistical patterns in the input, combined with the evidence that OCP-Place affects speech segmentation, makes the induction of OCP-Place a well-suited test case for the bottom-up speech-based learning approach. In this chapter, we will look at whether StaGe is able to account for the induction of OCP-Place. Importantly, StaGe uses no explicit mechanism for similarity avoidance and assumes that constraints are induced without referring to the lexicon as a basis for phonotactic learning. The theoretical part of this chapter looks at whether the constraints induced by StaGe reflect OCP-Place. The empirical part of this chapter will be concerned with investigating whether StaGe is able to account for OCP effects in speech segmentation. Human segmentation data will be used, taken from the study by Boll-Avetisyan and Kager (2008). The ability of StaGe to account for these data will be investigated. The main topics of interest are comparing models that vary in the assumptions they make about the abstractness of OCP-Place (consonant probabilities, StaGe, a single abstract OCP-Place; Experiment 1), and comparing models that vary in the assumptions they make about the input (continuous speech, word types, word tokens; Experiment 2).
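The segmentation prediction for the artificial language can be made explicit with a toy check: under this interpretation of OCP-Place, a candidate word is ill-formed whenever two consecutive consonants share place of articulation. The helper below is purely illustrative (P and T stand for labial and coronal, as in the experiment).

    # Toy OCP-Place check for the three possible parses of the artificial language.
    def violates_ocp_place(consonant_pattern):
        return any(c1 == c2 for c1, c2 in zip(consonant_pattern, consonant_pattern[1:]))

    for word in ("PTP", "PPT", "TPP"):
        print(word, violates_ocp_place(word))
    # PTP respects the constraint; PPT and TPP each contain a labial-labial sequence,
    # which is why listeners using OCP-Place should prefer PTP parses.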

4.1.2 Modeling OCP-Place with StaGe

The interpretation of OCP-Place in this chapter is somewhat different from the view taken in earlier work. While OCP-Place has traditionally been thought of as a constraint holding at the level of the morpheme (such as root morphemes in Arabic), the view adopted here will be one of a sublexical constraint which is not restricted to apply within morpheme boundaries, but rather operates on consonant sequences in fluent speech. This view traces back to the Speech-Based Learning (SBL) hypothesis (proposed in Chapter 1), which states that infants induce phonotactic constraints from continuous speech, and subsequently use these constraints for the detection of word boundaries in continuous speech. In this view, phonotactic constraints operate on continuous speech input in order to facilitate the development of the mental lexicon. The current chapter asks whether the same approach, which was based on phonotactic learning by infants, can provide a learnability account of abstract phonotactic constraints that have been proposed in theoretical phonology.


The question is thus whether OCP-Place is learnable in a bottom-up fashion from a corpus of transcribed continuous speech. If OCP-Place can be induced from continuous speech, it may act as a valuable sublexical cue during word learning, since the occurrence of consonants sharing place of articulation in the speech stream would signal the presence of a word boundary to the learner. The empirical part of the chapter is focused on the effects of OCP on segmentation. In contrast, earlier studies have been concerned with either predicting OCP effects in the lexicon (Coetzee & Pater, 2008; Frisch et al., 2004), or effects of OCP constraints on human wellformedness judgments (e.g., Frisch & Zawaydeh, 2001). A complete account of OCP-Place has to be able to explain OCP effects for wellformedness judgments, as well as OCP effects for segmentation. The analysis presented in this chapter therefore provides only a partial account of OCP-Place, complementing earlier empirical studies by examining OCP effects for segmentation.1 One critical aspect that has become clear from earlier studies is that a model of OCP-Place needs to be able to account for effects of gradience, explaining different degrees of similarity avoidance. As explained in Chapter 2, the constraints induced by StaGe are evaluated through strict domination. As a consequence, the model will always produce the same output for a given input sequence (in contrast to models that use stochastic evaluation, e.g., Boersma & Hayes, 2001), and in that sense the model does not produce gradient effects. As will become clear later on in this chapter, however, the model does produce gradient segmentation effects. This is not caused by the model's constraint evaluation mechanism, but rather by the model's use of context. In short, the model favors segmentation of labial-labial sequences in most contexts, but not all. The result is a gradient effect on segmentation, avoiding the occurrence of labial-labial sequences in most cases, but allowing a violation of OCP-Place in cases where this would avoid the violation of a higher ranked constraint. In Section 4.2, it will be shown how these predictions can be used to model gradient preferences of participants for experimental items in segmentation experiments. Some general limitations for modeling OCP-Place effects with constraint induction models were observed by Hayes and Wilson (2008). Hayes and Wilson argue that since the notion of OCP relies heavily on similarity avoidance, modeling the learning of a constraint such as OCP-Place would require the implementation of a metric assessing the similarity between consonants in a sequence. In addition, it would require taking the distance between consonants into account, since OCP effects are stronger for consonant pairs at close distance than at greater distance.

1 Some suggestions for deriving wellformedness predictions from the model are given in Chapter 6.

StaGe faces the same problems as those mentioned by Hayes and Wilson (2008). StaGe uses no explicit mechanism or bias for similarity avoidance to learn phonotactic constraints. That is, the model does not assess the feature difference between c1 and c2 in a sequence (e.g., c1Vc2). Also, the model has no way of representing gradient distance effects. Taking the latter limitation for granted, the focus will be on modeling local OCP effects (i.e., using consonant pairs with only intervening vowels). I will follow the view of Frisch et al. (2004) that similarity avoidance need not be encoded in the synchronic grammar. Instead, OCP-Place may be learnable through abstraction over statistical regularities in the input. The observation that learning OCP-Place requires abstracting over a language-specific distribution makes the induction of OCP-Place an interesting test case for StaGe. StaGe was designed to perform exactly such abstractions. StaGe's Frequency-Driven Constraint Induction is used by the learner to detect statistical patterns in the input (i.e., sequences which are highly under- or overrepresented). Single-Feature Abstraction is subsequently used by the learner as a mechanism to build abstract, feature-based constraints on the basis of the statistically-induced constraints on specific segment sequences. It is thus interesting to ask whether the formal learning model that was proposed in the previous chapters, when applied to sequences of non-adjacent consonants, could result in the induction of OCP-Place. The application of StaGe to the induction of OCP-Place raises several issues worth investigating. The first issue concerns the level of abstractness of OCP-Place. StaGe makes an interesting prediction with respect to this issue. StaGe implements two learning mechanisms, segment-based statistical learning and feature-based generalization (see Chapter 2). Since the generalization mechanism operates on segment-specific constraints, and all feature-based generalizations are added to the constraint set, this approach results in a set of constraints which vary in terms of their generality. StaGe thus predicts both effects of specific consonant probabilities, as well as of general feature-based regularities in the data (see also, Albright, 2009). In Experiment 1 (Section 4.4), it is examined to what extent models that vary in their level of representation can account for OCP effects on speech segmentation. That is, the mixed constraint set produced by StaGe is compared to a statistical model based purely on consonant probabilities, and to a model that implements OCP-Place as a single, categorical feature-based constraint. The experiment thus complements the simulations in Chapter 3, where it was shown that adding generalization to statistical learning improves the segmentation performance of the learner. In this chapter, a different angle is taken by examining the effects of having specific constraints in addition to a constraint that is commonly assumed to be an abstract feature-based constraint.

The examination of the roles of both specific and abstract constraints in phonotactic learning and speech segmentation is thereby continued.

The second issue deals with the input that is used by the learner for the induction of OCP-Place. StaGe was built on the idea of speech-based learning. StaGe assumes that constraints are learned in a bottom-up fashion from the speech stream, rather than from the lexicon. This raises the question of whether OCP-Place can be learned from a language-specific distribution of consonants in continuous speech, or whether the induction of OCP-Place requires a lexicon. Since StaGe can in principle be applied to any statistical distribution of biphones, regardless of whether the distribution is based on sequences in continuous speech or sequences in the lexicon, both alternatives will be explored in this chapter. Specifically, in Experiment 2 (Section 4.5) the continuous speech-based learner will be compared to a lexicon-based version of StaGe, employing either type or token probability distributions.

A final issue that deserves attention is the way in which the input is presented to the learner. In Chapters 2 and 3, StaGe was presented with adjacent segments for the induction of phonotactics. However, since OCP-Place is a constraint restricting the occurrence of consonant pairs across vowels, the constraint appears to be beyond the scope of the biphone-based model. Similar problems were faced by Hayes and Wilson (2008) in the modeling of Shona vowel harmony. Adopting the view that phonological processes often target either exclusively vowels or exclusively consonants, Hayes and Wilson chose to apply their induction model to a vowel projection. That is, the model was presented with sequences of vowels in the data, ignoring intervening consonants. This allowed the model to express the non-local process in local terms. The same line of reasoning can be applied to the induction of OCP-Place. StaGe can be presented with sequences of consonants, ignoring intervening vowels, thereby reducing non-adjacent consonants to adjacent segments on the consonantal tier.

A potential problem of this approach, however, is that such a projection does not distinguish between sequences of consonants that are truly adjacent (i.e., consonant clusters) and sequences of consonants that are separated by vocalic material. The consonant projection may reduce the data too much, since adjacent consonants and non-adjacent consonants are often affected by different phonological processes. For example, while non-adjacent consonants in Dutch show effects of place dissimilation (OCP-Place; Kager & Shatzman, 2007), nasal-obstruent clusters in Dutch show effects of place assimilation (Booij, 1995). Nasal place assimilation in Dutch transforms /pInpAs/ (‘ATM card’) into /pImpAs/, containing two adjacent labials (and thus violating OCP-Place).


In a consonant projection, the distinction between adjacent and non-adjacent processes is lost, which potentially hinders their learnability. The approach taken here is to present the learner with non-adjacent consonants exclusively. Specifically, the model is trained on C(V)C sequences (ignoring intervening vowels). By presenting the model with consonant sequences that occur across a vowel, the non-local dependency is expressed in local (bigram) terms. The model can be straightforwardly applied to these representations.

A modification to the way in which our model processes the input naturally leads to the question of whether such a modification is justified. Extensive modifications to computational models can easily invite the criticism that any phenomenon can be modeled with the right set of modifications. As described above, the choice is supported by considerations of phonological processes. In addition, two pieces of evidence from psycholinguistic studies seem to justify the processing assumption. The first piece of evidence comes from research on statistical learning. Several studies have shown that human learners are able to track transitional probabilities of consonants across intervening vowels in a stream of CV sequences in artificial continuous speech (Bonatti et al., 2005; Newport & Aslin, 2004). The statistical learning component of StaGe is thus compatible with the proposed C(V)C processing window. These studies show that learners are, at least, capable of learning dependencies across vowels.² The question is whether adding StaGe’s generalization component to the statistical learning of non-adjacent consonant dependencies would result in a constraint that resembles OCP-Place.

The second piece of evidence comes from research on feature-based generalization. Finley and Badecker (2009) show that human learners are able to pick up patterns of vowel harmony from a set of words from an artificial language. Since vowel harmony is a process requiring feature continuity in vowels across consonants, the study by Finley and Badecker shows that learners can perform feature-based generalization on non-adjacent segments. Interestingly, vowel harmony has also been shown to affect speech segmentation (Suomi et al., 1997; Vroomen et al., 1998). It thus appears that abstract constraints on non-adjacent segments are learnable, and have an effect on speech processing. The basic operations performed by StaGe, segment-based statistical learning and feature-based generalization, are supported by studies on the learning of non-adjacent patterns by human learners. What remains unknown is whether feature-based generalizations about non-adjacent consonants can be learned from continuous speech input, and what the effect of such generalizations is on speech segmentation.

2 The findings of Newport and Aslin (2004) and Bonatti et al. (2005) can also be explained by learning on a consonantal tier. However, the languages in these experiments only contained non-adjacent consonants. Showing that learning takes place on a consonantal tier would require showing that learners treat adjacent and non-adjacent consonants equally (since their distinction is lost on a consonantal tier). A more parsimonious interpretation of the existing evidence is that learners keep track of non-adjacent consonants in the speech stream.


4.2 methodology

The simulations of the induction of OCP-Place will be highly similar to the induction of biphone constraints in the previous chapters. However, the evaluation of the resulting constraint set will be radically different. While in Chapter 3 learning models were evaluated with respect to their ability to accurately predict the locations of word boundaries in a test set consisting of novel unsegmented corpus utterances, the current set of simulations is concerned with the ability of computational learning models to accurately predict human segmentation behavior in an artificial language. That is, the models are trained on native language (L1) input, and tested on a novel, artificial language (AL). This section describes methodological issues involved in setting up such a simulation, and discusses in detail how the computational models are used to predict human data.

4.2.1 Materials

The simulations involve the induction of Dutch phonotactics affecting non-adjacent consonants. The data consist of broad phonetic transcriptions of the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN; Goddijn & Binnenpoorte, 2003). Representations of continuous speech were made by removing all word boundaries within utterances. The complete corpus is used as a training set (78,080 utterances, 660,424 words). The test set consists of the artificial language created by Boll-Avetisyan and Kager (2008). In this language, three positional slots are repeatedly filled with different CV syllables to yield a continuous speech stream. Each positional slot is filled with a syllable from a fixed set for that position. The first and second positions contain syllables starting with a labial (Position 1: /pa/, /bi/, /mo/; Position 2: /po/, /be/, /ma/). The third position is filled with syllables starting with a coronal (Position 3: /tu/, /do/, /ne/). Due to the CV structure of the syllables, consonants are always separated from each other by an intervening vowel. For example:

(4.1) ...papotumomanemopotupabetubiponemobenebimadomoponebi...

The language was carefully constructed in order to ensure that participants in the study by Boll-Avetisyan and Kager relied on L1 phonotactic constraints on non-adjacent consonants during segmentation of the artificial speech stream.

In particular, the language was controlled for experimentally induced statistical biases and for L1 phonotactics not of interest to their study. Importantly, the language was controlled for statistical cues to word boundaries. Since each position is filled with one out of three syllables, transitional probabilities between syllables were 0.33 throughout the speech stream, and participants could thus not rely on syllable probabilities for segmentation (contrary to earlier studies by Saffran, Newport, & Aslin, 1996, and others). In addition, the language was controlled for possible L1 cues for segmentation, such as biphone probabilities, positional syllable frequencies, segmental duration, and intonation.
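
To make the structure of the language concrete, the following sketch (Python) generates an unsegmented stream from the three positional syllable sets. It is an illustration only: the stream length, the random sampling, and the seed are assumptions, and the actual language was additionally balanced for the experimentally induced biases and L1 cues mentioned above.

    import random

    # Positional syllable sets of the artificial language (Positions 1 and 2:
    # labial-initial syllables; Position 3: coronal-initial syllables).
    POSITIONS = [["pa", "bi", "mo"],   # Position 1
                 ["po", "be", "ma"],   # Position 2
                 ["tu", "do", "ne"]]   # Position 3

    def generate_stream(n_words=300, seed=1):
        """Concatenate n_words three-syllable sequences into an unsegmented stream."""
        rng = random.Random(seed)
        syllables = []
        for _ in range(n_words):
            for slot in POSITIONS:
                syllables.append(rng.choice(slot))
        return "".join(syllables)

    print(generate_stream(n_words=5))   # e.g. a stream like the one in (4.1)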

4.2.2 Procedure

Simulations were conducted using the StaGe Software Package (see Appendix A). The learning model is trained on non-adjacent consonants (i.e., on consonant pairs that occur with intervening vowels). Input to the learner consists of transcribed utterances of continuous speech. StaGe builds up a statistical distribution of O/E ratios for non-adjacent consonants. From this distribution the learner induces segment-specific constraints (Frequency-Driven Constraint Induction) and feature-based generalizations on the basis of the induced specific constraints (Single-Feature Abstraction). Thresholds are set at tM = 0.5 for the induction of markedness constraints, and tC = 2.0 for the induction of contiguity constraints. Constraints are ranked using the statistical measure of Expected frequency. (See Chapter 2 for details on the types of constraints and the learning procedure used by StaGe.) The constraint induction procedure thus produces a ranked constraint set, containing both specific and abstract constraints on non-adjacent consonants.
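
As an illustration of this statistical step, the sketch below (Python, not the StaGe Software Package itself) builds an O/E distribution over consonant pairs separated by a single vowel and applies the two induction thresholds. The vowel inventory is an assumption for the toy transcriptions, the feature-based generalization step is omitted, and ranking by expected count is a simplification of the ranking procedure defined in Chapter 2.

    from collections import Counter

    VOWELS = set("aeiou@")   # assumed vowel inventory for the toy transcriptions

    def cvc_pairs(utterance):
        """Yield consonant pairs that are separated by exactly one vowel (C V C windows)."""
        for i in range(len(utterance) - 2):
            c1, v, c2 = utterance[i], utterance[i + 1], utterance[i + 2]
            if c1 not in VOWELS and v in VOWELS and c2 not in VOWELS:
                yield (c1, c2)

    def induce_constraints(utterances, t_m=0.5, t_c=2.0):
        """Frequency-Driven Constraint Induction over non-adjacent consonants (sketch)."""
        observed = Counter(p for u in utterances for p in cvc_pairs(u))
        total = sum(observed.values())
        left, right = Counter(), Counter()
        for (c1, c2), n in observed.items():
            left[c1] += n
            right[c2] += n
        constraints = []
        for (c1, c2), o in observed.items():
            e = left[c1] * right[c2] / total             # expected count under independence
            if o / e < t_m:
                constraints.append(("*[%s]V[%s]" % (c1, c2), e))            # markedness
            elif o / e > t_c:
                constraints.append(("Contig-IO([%s]V[%s])" % (c1, c2), e))  # contiguity
        # With a realistic corpus, most pairs fall between the two thresholds and
        # induce nothing; rank the induced constraints by expected frequency.
        return sorted(constraints, key=lambda c: -c[1])

    print(induce_constraints(["mapodomo", "bipotubi", "ponebima"]))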

The induced constraint set is used to predict word boundaries in the transcribed continuous artificial language. Vowels are expected not to affect the segmentation of the language (L1 biphone probabilities had been controlled for in the construction of the language), and are removed from the transcription. The remaining sequences of consonants in the artificial language are submitted to the OT segmentation model (see Chapter 2), which uses the constraint set to predict optimal segmentations of chunks of input. The learner inspects consonant sequences in context (wxyz-sequences), using the segmentation procedure that was used in Experiment 3 in Chapter 3. That is, neighboring consonant sequences (wx, yz) play a role in determining whether the current sequence under inspection (xy) should be segmented. All sequences of four consonants in the language are thus evaluated and considered for segmentation. Given the fact that the artificial language contains sequences of two labials (P) followed by a coronal (T), the segmentation model is confronted with three different types of sequences: PP(.)TP, PT(.)PP, and TP(.)PT, where ‘(.)’ indicates a potential word boundary. The candidate set for a sequence wxyz consists of the following possible segmentations: {w.xyz, wx.yz, wxy.z, wxyz}. A boundary is inserted into a sequence xy only if wx.yz is returned as the optimal segmentation. For example, an input TPPT has the possible segmentations {T.PPT, TP.PT, TPP.T, TPPT}, and will be segmented only if candidate TP.PT is assessed as incurring the least serious constraint violations (as compared to the other candidates) in the induced constraint set.

In sum, the output of the segmentation procedure consists of a hypothesized segmentation of the continuous artificial language. The hypothesized word boundaries are based on violations of induced constraints by consonant sequences in the speech stream. It should be noted that the model can in principle perfectly mimic a segmentation based on a categorical OCP-Place. In order to do so, the model should always insert boundaries in TPPT sequences, since the insertion of a boundary into PP avoids the violation of OCP-Place (i.e., avoids the occurrence of a sequence of two labials).
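
The decision procedure for a single wxyz window can be summarized in a few lines. The sketch below is illustrative rather than the actual OT segmentation model: it evaluates only sequential markedness constraints, stated as sets of banned consonant pairs on the consonantal projection, under strict domination, and the two toy constraints at the bottom are assumptions.

    def violations(candidate_words, constraints):
        """Violation counts per ranked constraint; a constraint is violated by every
        word-internal pair of consonants (on the consonantal projection) it bans."""
        profile = []
        for banned, _rank in constraints:            # ordered from high to low rank
            n = 0
            for word in candidate_words:
                for c1, c2 in zip(word, word[1:]):
                    if (c1, c2) in banned:
                        n += 1
            profile.append(n)
        return profile

    def segment_xy(w, x, y, z, constraints):
        """True if a boundary is inserted between x and y (candidate wx.yz wins)."""
        candidates = {
            "w.xyz": [[w], [x, y, z]],
            "wx.yz": [[w, x], [y, z]],
            "wxy.z": [[w, x, y], [z]],
            "wxyz":  [[w, x, y, z]],
        }
        # Strict domination: compare violation profiles lexicographically.
        best = min(candidates, key=lambda name: violations(candidates[name], constraints))
        return best == "wx.yz"

    # Toy constraint set: a high-ranked ban on labial-labial sequences (OCP-Place-like)
    # and a lower-ranked ban on /p/ followed by /t/; both are assumptions.
    LABIALS = "pbfvm"
    toy_constraints = [
        ({(a, b) for a in LABIALS for b in LABIALS}, 1000.0),
        ({("p", "t")}, 300.0),
    ]
    print(segment_xy("t", "p", "p", "t", toy_constraints))   # True: TP.PT is optimal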

4.2.3 Evaluation

In the evaluation of the hypothesized segmentation of the artificial language, the output of the computer simulation will be matched to human segmentation data. Specifically, the simulation output is used as a predictor for the human judgments obtained by Boll-Avetisyan and Kager (2008). In their study, participants listened to the artificial language (described above) for 10 minutes. After this familiarization phase, participants were given a test phase intended to assess how they had segmented the artificial speech stream. The specific interest was in determining whether participants had used OCP-Place as a cue to finding word boundaries. This was done by setting up comparisons between logically possible segmentations of the speech stream. That is, participants could have heard PPT, PTP, or TPP words, depending on where they inserted boundaries in the speech stream. Participants were given a two-alternative forced-choice task in which they had to indicate on each trial which of two words belonged to the language they had just heard. In the experiment, 14 participants were assigned to each of three different conditions: PTP-PPT, PTP-TPP, PPT-TPP. If participants relied on OCP-Place, then they should have a preference for PTP words over PPT and TPP words. In order to explore whether StaGe can accurately predict participants’ preferences for experimental items, predictions of the model will be matched with data from the PTP-PPT condition (which was the condition that yielded the biggest learning effect in Boll-Avetisyan and Kager’s study).


Table 4.1: Human judgments on experimental items.

    PTP items              PPT items
    Item       Score       Item       Score
    madomo     0.8095      mobedo     0.5476
    ponebi     0.7381      pabene     0.5476
    ponemo     0.7381      papone     0.5000
    podomo     0.6905      mobetu     0.4524
    madobi     0.5714      papodo     0.4524
    madopa     0.5714      pabedo     0.4048
    ponepa     0.5714      pamado     0.4048
    podobi     0.5476      pamatu     0.4048
    potumo     0.5476      papotu     0.3810
    podopa     0.4762      pabetu     0.3571
    potubi     0.4524      pamane     0.3333
    potupa     0.2381      mobene     0.2619
    Note. Data from Boll-Avetisyan and Kager (2008).

Wellformedness scores are calculated for each experimental item by averaging judgments across participants. For example, if an item is always selected in the forced-choice trials, the item would get a score of 1. If an item is never chosen, the score is 0. Intermediate values are simply proportions of how often an item has been selected in the forced-choice task. For example, if an item is selected in 70% of the trials, it gets a score of 0.7. A score of 0.5 indicates chance-level performance, in which case there is neither a preference nor dispreference for an item. Values above 0.5 indicate that an item is generally preferred, whereas values below 0.5 indicate that an item is generally dispreferred. Table 4.1 shows the items taken from Boll-Avetisyan and Kager (2008) and their corresponding wellformedness scores.

The crucial part in the evaluation is to match the hypothesized segmentation of the artificial language with the item scores in Table 4.1. Recall that the output produced by the simulations consists of a hypothesized segmentation of the artificial language. Item score predictors are obtained by simply counting how often each item occurs in the model’s segmentation output. That is, whenever an item matches exactly with a stretch of speech between two hypothesized word boundaries (with no intervening word boundaries), the frequency count of the item is updated by 1. We thus use item frequencies in the model’s segmentation output as a predictor for the human judgments on those items.

For example, the segmentation output of a model could be the following:

(4.2) pa.potu.momanemo.potupa.betu.bipo.nemobe.nebi.madomo.ponebi

In this case, the item frequencies of /potupa/, /madomo/, and /ponebi/ are increased. Note that the strict criterion of an exact word match implicitly penalizes models that tend to over- or undersegment, since hypothesized words that are larger or smaller than the actual item do not contribute to the frequency counts of the item. A measure of how well a model explains human segmentation behavior is obtained by using the item frequencies as a predictor of the wellformedness scores in statistical analyses based on linear regression. Before we look at how well different models are able to account for the data, we simulate the induction of OCP-Place using StaGe. The next section describes the contents of the induced constraint set and the predictions it makes with respect to word boundaries.
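
The matching step is simple string bookkeeping. A minimal sketch, assuming that hypothesized word boundaries are marked with '.' in the segmentation output:

    def item_frequencies(segmented_stream, items):
        """Count how often each test item occurs as an exact hypothesized word,
        i.e. as a full stretch of speech between two hypothesized boundaries."""
        counts = {item: 0 for item in items}
        for word in segmented_stream.split("."):
            if word in counts:
                counts[word] += 1
        return counts

    output = "pa.potu.momanemo.potupa.betu.bipo.nemobe.nebi.madomo.ponebi"
    print(item_frequencies(output, ["potupa", "madomo", "ponebi", "papotu"]))
    # {'potupa': 1, 'madomo': 1, 'ponebi': 1, 'papotu': 0}

Oversegmented or undersegmented stretches simply fail to match any item exactly, which is how the strict matching criterion penalizes such models.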

4.3 the induction of ocp-place

Before examining how the model compares to other models and other input data in explaining human segmentation behavior, we look at the output of the learning procedure when StaGe is trained as described above. We first look at the constraints that are induced by StaGe, and then proceed to the predictions the model makes with respect to the locations of word boundaries in the artificial language.

The output of the induction procedure

StaGe’s Frequency-Driven Constraint Induction creates specific constraints from the statistical distribution. Specifically, underattested sequences (O/E < 0.5) trigger the induction of markedness constraints, whereas overattested sequences (O/E > 2.0) cause the induction of contiguity constraints. If StaGe is able to induce a constraint set that encodes an abstract OCP-Place constraint, then the specific constraints should provide the basis for such a generalization. That is, specific underrepresentations of labial-labial sequences in the data should be picked up by the learner and be generalized (through Single-Feature Abstraction) into a more abstract constraint affecting labials as a natural class. Table 4.2 shows all specific constraints induced by the model that affect labial-labial sequences across a vowel (PVP). The first thing to note is that all specific PVP constraints are markedness constraints. That is, there are no specific contiguity constraints that aim to preserve PVP sequences.


Table 4.2: Specific *PVP constraints (sequences with O/E < 0.5) with corresponding ranking values.

    CONSTRAINT    RANKING
    *[m]V[f]      1495.3318
    *[b]V[m]      1480.8816
    *[m]V[p]      1225.0285
    *[m]V[b]      1159.8271
    *[m]V[v]      996.4387
    *[f]V[p]      812.7798
    *[f]V[v]      661.1154
    *[v]V[f]      579.2754
    *[v]V[p]      474.5628
    *[p]V[v]      288.5850

The second thing to note is that not all PVP sequences are underattested to such a degree that they trigger the induction of a specific markedness constraint. Since the natural class of labials includes 5 segments (p, b, f, v, m),³ there are 5 × 5 = 25 different PVP sequences. Out of these 25 potential constraints, 10 (= 40%) specific constraints arise as a result of Frequency-Driven Constraint Induction. The remaining 15 PVP sequences have O/E values that lie between the values of the two induction thresholds (0.5 ≤ O/E ≤ 2.0). Interestingly, the set of specific constraints only contains markedness constraints against non-identical homorganic consonants. This is in line with earlier studies that suggest that identical homorganic consonants (e.g., mVm) may display different OCP effects than non-identical homorganic consonants (e.g., Frisch & Zawaydeh, 2001).

The next question of interest is whether the 10 specific markedness constraints provide a sufficient basis for the construction of an abstract OCP-Place constraint. Generalizations that are based on the specific PVP constraints (using Single-Feature Abstraction) are shown in Table 4.3. The constraint set illustrates several properties of StaGe’s generalization mechanism. As discussed in Chapter 2, the model induces a multitude of constraints with varying degrees of generality. The ranking of the constraints, which is based on averaged expected frequencies, results in a hierarchy in which specific constraints (and generalizations with strong statistical support from specific constraints) are ranked higher than very broad generalizations, which are supported by only a limited number of specific constraints.

3 /w/ is labio-dorsal, and is therefore not included in the set of labials proper.


Table 4.3: Specific and abstract constraints affecting labial-labial sequences exclusively.

    CONSTRAINT              RANKING
    *[m]V[f]                1495.3318
    *[b]V[m]                1480.8816
    *[m]V[p,f]              1360.1802
    *[m]V[f,v]              1245.8852
    *[m]V[p]                1225.0285
    *[m]V[p,b,f,v]          1219.1565
    *[m]V[p,b]              1192.4278
    *[m]V[b]                1159.8271
    *[m]V[b,v]              1078.1329
    *[m]V[v]                996.4387
    *[f]V[p]                812.7798
    *[f]V[v]                661.1154
    *[f,v]V[p]              643.6713
    *[v]V[f]                579.2754
    *[v]V[p,f]              526.9191
    *[p,f]V[v]              474.8502
    *[v]V[p]                474.5628
    *[f,v]V[p,f]            466.6545
    *[f,v]V[p,b,f,v]        315.9667
    *[p]V[v]                288.5850
    *[p,b,f,v]V[p,b,f,v]    176.0199

As a result of the absence of specific constraints targeting identical consonants, the constraint set does not encode a strong tendency to avoid identical labials. Such sequences are only affected by relatively low-ranked generalizations. It should be noted that, due to the feature set that is used, /m/ does not group with the other labials. This is due to the inclusion of the feature +/- sonorant in the feature set (see Appendix B). The feature is, however, not required to provide each segment with a unique feature bundle. As a consequence, obstruents and sonorants differ by at least two features (sonorant + some other feature). Since StaGe abstracts over single-feature differences, generalizations cannot group obstruents and sonorants together. This can be seen in the most general constraint, *[p,b,f,v]V[p,b,f,v], which matches closely with OCP-Place (which can be written as *[p,b,f,v,m]V[p,b,f,v,m]), except for the absence of /m/ in the constraint.


Note, however, that the relatively large number of specific constraints affecting /m/ causes a tendency to avoid sequences of /m/ followed by a labial. That is, while not all labials are grouped together in a single OCP-Place constraint, the net effect of the constraint set is that PVP sequences are avoided. There are, however, four PVP sequences that are not affected by the constraints in Table 4.3: pVm, fVm, vVm, and mVm. Inspection of the complete constraint set reveals that the learner remains neutral with respect to these sequences.⁴ In sum, while not all PVP sequences are underrepresented in the data, the constraint set displays general tendencies to avoid PVP sequences due to Single-Feature Abstraction over specific *PVP constraints.

An interesting property of Single-Feature Abstraction is that OCP effects sometimes arise from constraints that do not specifically target homorganic consonants. OCP effects can be obtained indirectly, as a result of generalization over natural classes that include (but are not limited to) labials. The following hypothetical example illustrates how such effects can occur. When Frequency-Driven Constraint Induction produces three specific constraints *[p]V[d], *[t]V[d], *[t]V[b], a generalization will be constructed stating that voiceless anterior plosives may not be followed by voiced anterior plosives: *[p,t]V[b,d]. This generalization disfavors the labial-labial sequence pVb. In this case, pVb is avoided even though there is no constraint specifically targeting the avoidance of labial-labial sequences. The OCP effect in this case is due to abstract constraints affecting other natural classes which include labials. In fact, there is quite a large number of constraints which affect PVP sequences, but not exclusively. Examples of such constraints can be found in Table 4.4. These constraints arise due to the fact that the learner picks up various segment-based constraints (both OCP and non-OCP) which all contribute to the construction of feature-based generalizations.

This observation raises the question of whether OCP effects that have been demonstrated in previous studies (both lexicon-based analyses and experimental judgments) should even be attributed to OCP-Place, or whether they should be attributed to alternative constraints affecting sequences of natural classes that include labial consonants, but not classes of labials exclusively. Below, the question of to what extent OCP-like behavior should be attributed to OCP-Place is addressed for the case of speech segmentation.

In sum, even though there is no explicit similarity avoidance mechanism, and not all PVP sequences are underrepresented in the data, StaGe arrives at a general, feature-based tendency to avoid PVP sequences.

4 Inspection of O/E values for these sequences shows that a slightly less conservative induction threshold would have resulted in inclusion of specific constraints for the sequences in the constraint set; pVm, mVm: O/E < 0.6; fVm, vVm: O/E < 0.7. This shows that the sequences are underrepresented in the data, albeit not as strongly as the other sequences. See Chapter 3 (Experiment 2) for a discussion of the consequences of changing threshold values.


Table 4.4: Abstract constraints affecting a variety of natural classes that include labials.

    CONSTRAINT                      RANKING
    *[f]V[p,t]                      2256.3592
    *[f,v]V[p,t]                    1786.8968
    *[v]V[p,t]                      1317.4345
    *[f,v]V[p,t,f,s]                1126.9261
    *[v]V[p,t,f,s]                  1125.6727
    *[f,v,s,z]V[p,t]                965.3505
    *[v]V[f,s]                      933.9110
    *[f,v]V[p,t,k,f,s,S,x]          833.5898
    *[v,z]V[p,t]                    802.5214
    *[v]V[p,t,k,f,s,S,x]            787.2657
    *[f]V[v,z]                      752.6590
    *[v]V[f,s,S,x]                  718.9978
    *[f,v]V[p,b,t,d,f,v,s,z]        657.5455
    *[v,z]V[p,t,f,s]                634.7384
    *[f,v,s,z]V[p,t,f,s]            599.4141
    *[f,v]V[f,s,S,x]                565.3337
    *[f]V[C]                        560.1140
    *[v,z]V[p]                      524.8896
    *[f]V[v,z,Z,G,h]                507.3987
    *[f,v,s,z]V[p]                  465.6398
    *[f]V[f,v,s,z,S,Z,x,G,h]        464.8525
    *[f,v]V[C]                      452.2714
    etc.
    Note. V = vowel, C = obstruents = [p,b,t,d,k,g,f,v,s,z,S,Z,x,G,h,Ã].

It achieves this through generalization over underrepresented sequences. These generalizations have different levels of generality and need not affect labials exclusively. Similarity avoidance can thus be encoded in the constraint set through generalization over statistical underrepresentations.
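
To make the generalization step concrete, here is a minimal sketch of a single round of pairwise Single-Feature Abstraction. The reduced feature table, the pairwise merging, and the averaging of ranking values are simplifications and assumptions; the procedure defined in Chapter 2 is more general (it can, for instance, derive two-sided generalizations such as *[p,t]V[b,d] from the three constraints in the hypothetical example above).

    # Illustrative (partial) feature table; the real model uses the feature set of Appendix B.
    FEATURES = {
        "p": {"voice": "-", "cont": "-", "son": "-", "place": "lab"},
        "b": {"voice": "+", "cont": "-", "son": "-", "place": "lab"},
        "f": {"voice": "-", "cont": "+", "son": "-", "place": "lab"},
        "v": {"voice": "+", "cont": "+", "son": "-", "place": "lab"},
        "t": {"voice": "-", "cont": "-", "son": "-", "place": "cor"},
        "d": {"voice": "+", "cont": "-", "son": "-", "place": "cor"},
    }

    def differ_by_one(a, b):
        """True if segments a and b differ in exactly one feature value."""
        return sum(FEATURES[a][f] != FEATURES[b][f] for f in FEATURES[a]) == 1

    def natural_class(a, b):
        """All segments sharing the feature values on which a and b agree."""
        shared = {f: v for f, v in FEATURES[a].items() if FEATURES[b][f] == v}
        return tuple(sorted(s for s, fs in FEATURES.items()
                            if all(fs[f] == v for f, v in shared.items())))

    def single_feature_abstraction(specific):
        """specific: {(left, right): ranking value}. Merge pairs of constraints whose
        segments differ by one feature on one side; rank each generalization by the
        average of the supporting ranking values."""
        general = {}
        pairs = list(specific)
        for i, (l1, r1) in enumerate(pairs):
            for (l2, r2) in pairs[i + 1:]:
                if r1 == r2 and differ_by_one(l1, l2):
                    key = (natural_class(l1, l2), (r1,))
                elif l1 == l2 and differ_by_one(r1, r2):
                    key = ((l1,), natural_class(r1, r2))
                else:
                    continue
                general[key] = (specific[(l1, r1)] + specific[(l2, r2)]) / 2.0
        return general

    # *[p]V[d] and *[t]V[d] yield *[p,t]V[d]; *[t]V[d] and *[t]V[b] yield *[t]V[b,d].
    print(single_feature_abstraction({("p", "d"): 800.0, ("t", "d"): 700.0, ("t", "b"): 600.0}))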

4.3.1 Predictions of the constraint set with respect to word boundaries

The ranking of the constraints determines which constraints are actually used by the learner during segmentation. Constraints relevant for segmentation of the artificial language were extracted by considering OT tableaus for all PPTP, PTPP, and TPPT sequences that were presented to the model.


Table 4.5: Constraints used in segmentation of the artificial language.

    CONSTRAINT                         RANKING      FREQUENCY
                                                    Count    Percentage
    *[b]V[m]                           1480.8816    27       (11.1%)
    *[m]V[p,f]                         1360.1801    27       (11.1%)
    *[m]V[p,b,f,v]                     1219.1565    27       (11.1%)
    *[C]V[p,t]                         376.2584     98       (40.3%)
    *[p,b,f,v]V[p,b,t,d,f,v,s,z]       337.7910     54       (22.2%)
    *[p,f]V[C]                         295.7494     12       (4.9%)
    *[C]V[t,s,S]                       288.4389     8        (3.3%)
    *[p,b,f,v]V[t,d,s,z,S,Z,Ã]         287.5739     6        (2.5%)
    *[C]V[p,b,t,d]                     229.1519     4        (1.6%)

Note. The frequency of use (FREQUENCY) indicates how many wxyz-sequences in the language (out of a total of 243) are affected by the constraint during evaluation. V = vowel, C = obstruents = [p,b,t,d,k,g,f,v,s,z,S,Z,x,G,h,Ã].

The term ‘relevant’ is used here to indicate that the constraint caused a reduction in the size of the candidate set. Table 4.5 shows that, while StaGe creates a large number of generalizations, only 9 constraints were relevant for segmentation of the artificial language. Due to OT’s strict domination, the vast majority of lower-ranked constraints do not play a role in speech segmentation. While this is a general property of the model, the effect is rather extreme here due to the simple structure of the artificial language, which only contains 6 different consonants. It is therefore not surprising that only a limited number of constraints are relevant in this case. The table also indicates the frequency of use for each constraint, which is a count of how many sequences were affected by the constraint during segmentation. This gives insight into the relative importance of the different constraints.

The relevant constraints appear to be of two types. At the top of the hierarchy in Table 4.5 are three high-ranked constraints affecting labials exclusively. These constraints represent more specific versions of OCP-Place, affecting labials, but not all labials. The high ranking values of these constraints ensure that the model often behaves like OCP-Place, and favors segmentation of PVP sequences (see Figure 4.1 for an example). In addition, there are six low-ranked constraints affecting natural classes that include, but are not limited to, labials.


    Input: mapodomo      *[m]V[p,f] (r = 1360.18)
    ma.podomo
    mapo.domo            *
    mapodo.mo            *
    mapodomo             *

Figure 4.1: An OT tableau showing how the model creates OCP-like segmentation behavior. The optimal location of a word boundary is between labials /m/ and /p/ due to a highly ranked constraint disfavoring mVp and mVf sequences.

    Input: bipotubi      *[C]V[p,t]       *[p,f]V[C]
                         (r = 376.26)     (r = 295.75)
    bi.potubi            *                *
    bipo.tubi            *
    bipotu.bi            **               *
    bipotubi             **               *

Figure 4.2: An OT tableau showing how the model creates segmentation behavior that is different from OCP-based segmentation. The optimal location of a word boundary is between labial /p/ and coronal /t/ due to a highly ranked constraint favoring /p/ (and /f/) in final position (i.e., before a word boundary).

For example, *[C]V[p,t] militates against obstruents followed by voiceless anterior plosives. While such constraints can produce OCP-like behavior (since they affect sequences of labials), they can also produce segmentations that deviate from the predictions of OCP-Place (see Figure 4.2). Since these constraints affect a large number of wxyz-sequences in the artificial language, the constraints provide a source for segmentation behavior that is different from OCP-based segmentation.

The constraints in Figure 4.2 are particularly interesting for two reasons. The first issue concerns alignment. The two constraints in Figure 4.2 affect a maximally large natural class (given the limitations imposed by the feature set in use) on one side of the vowel and only a small natural class on the other side. Such ‘single-sided generalizations’ produce the effect of an alignment constraint. That is, the constraints are markedness constraints that disfavor the occurrence of certain specific consonants when followed or preceded by any other consonant (i.e., regardless of the quality of that neighboring consonant).

These constraints therefore always favor the insertion of a word boundary right after or before such a consonant. In other words, the constraint says that such a consonant should be word-final or word-initial, respectively. For example, the constraint *[p,f]V[C] states that a word boundary should be inserted after /p/ or /f/, regardless of which obstruent follows. That is, whenever the learner encounters /p/ or /f/ in the speech stream, he/she should treat the consonant as word-final. Conversely, the constraint *[C]V[p,t] states that /p/ and /t/ should be word-initial. It thus appears that purely sequential markedness constraints are able to mimic the effect of alignment constraints. The model is thereby able to capture positional effects without explicitly encoding word edges in the structure of the constraints. The alignment effect simply emerges as a result of feature-based generalizations over sequential constraints.

The second issue concerns gradience. The constraints in the tableau in Figure 4.2 have a preference for /p/-initial words, unless /p/ is followed by /t/ in the speech stream (in which case it is preferred to have /t/ in word-initial position, and /p/ in word-final position). The tableau illustrates that, while the constraint set encodes an overall preference for PTP words, there are exceptions to this tendency, in which case the model decides on a different segmentation. This decision is based on the context in which a sequence occurs. That is, the model decides not to insert a boundary into bVp in bipotubi because bVp occurs here in the context of a following /t/. According to the constraint set, the occurrence of pVt is worse than the occurrence of bVp, and the model therefore has difficulty segmenting potubi. The model will only segment potubi from the speech stream if it is preceded by a consonant which would be ruled out by a higher-ranked constraint than the constraints shown in the tableau. Context thus has the effect that different PTP words are extracted from the speech stream with different frequencies. As a consequence, the model produces gradient effects in segmentation. In the next section, it will be assessed whether the model also produces the right gradient effects.

In order to gain insight into the model’s gradient segmentation behavior, a quantitative analysis was conducted which classifies the boundaries inserted by the model in terms of natural class sequences. Table 4.6 shows how often the model inserted boundaries into PPTP, PTPP, and TPPT sequences. Overall, about 30% of the sequences were broken up by a boundary. This shows that the model on average inserts a boundary after about three syllables. This roughly corresponds to the structure of the artificial language, which consists of a three-syllable pattern. The tableau in Figure 4.2, however, suggests that the model does not exclusively segment TP.PT sequences. Inspection of the number of boundaries inserted in each type of sequence indeed reveals that boundaries are inserted into all types of sequences.


Table 4.6: Absolute count and percentage of word boundaries inserted into the different natural class sequences in the artificial language by StaGe.

    Sequence    Number of occurrences    Number of boundaries    Percentage
    PP(.)TP     899                      183.50                  20.41%
    PT(.)PP     899                      118.25                  13.15%
    TP(.)PT     899                      510.00                  56.73%
    Total:      2697                     811.75                  30.10%

Note. ‘(.)’ indicates a boundary location. Calculations are based on tableau predictions. Fractions can occur when a tableau is uncertain (i.e., when the tableau produces multiple optimal candidates), in which case the number of remaining candidates determines the probability of each remaining candidate being selected as winner.

The model does, however, insert boundaries into TPPT sequences far more often (56.73%) than into PPTP (20.41%) or PTPP (13.15%) sequences. These data show that the model has a strong (but not absolute) preference for PTP words, which satisfy OCP-Place.
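
The fractional boundary counts in Table 4.6 follow from the tie-handling described in the table note. A small sketch of how such expected counts can be computed, with toy tableau outcomes standing in for the real ones:

    from collections import defaultdict

    def expected_boundary_counts(tableaus):
        """tableaus: (sequence_type, optimal_candidates) pairs, where the second element
        is the set of winners remaining after constraint evaluation. When several
        candidates tie, each is assumed to win with equal probability, so a tied
        'wx.yz' candidate contributes a fractional boundary count."""
        counts, totals = defaultdict(float), defaultdict(int)
        for seq_type, winners in tableaus:
            totals[seq_type] += 1
            if "wx.yz" in winners:
                counts[seq_type] += 1.0 / len(winners)
        return {t: (counts[t], 100.0 * counts[t] / totals[t]) for t in totals}

    # Toy example: three TP(.)PT tableaus, one of which is a two-way tie.
    toy = [("TP(.)PT", {"wx.yz"}),
           ("TP(.)PT", {"wx.yz", "wxyz"}),   # tie: contributes 0.5 of a boundary
           ("TP(.)PT", {"wxyz"})]
    print(expected_boundary_counts(toy))     # {'TP(.)PT': (1.5, 50.0)}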

In sum, the model is inclined to segment words that respect OCP-Place, but occasionally favors a different segmentation. This raises the question of which model best reflects human segmentation behavior: a model that always favors PTP words, or a model that generally favors PTP words, but also produces some PPT or TPP words. Experiment 1 addresses this issue by comparing models that make different assumptions about the nature of OCP-Place. Several models will be tested on their ability to predict human judgments on the experimental items. Specifically, StaGe is compared both to a categorical interpretation of OCP-Place and to a gradient interpretation of OCP-Place. It should be noted that the empirical studies in this chapter are only of modest size. That is, they are based on an experiment involving human judgments on 24 different items (averaged over 14 participants). Nevertheless, the analyses highlight several interesting differences between models of OCP-Place, and their effects on segmentation.


4.4 experiment 1: model comparisons

The first study looks at how well StaGe is able to account for human segmentation data. StaGe induces feature-based regularities resembling OCP-Place, but does not show a categorical ban on PVP sequences. In order to see which segmentation strategy provides a better account of the human data, StaGe is compared to a categorical feature-based interpretation of OCP-Place. That is, an interpretation of OCP-Place which categorically rules out all sequences of consonants which have the same place of articulation. There is, however, a second interpretation of OCP-Place that is worth investigating. Several studies have argued that OCP-Place is a gradient constraint, affecting different sequences of homorganic consonants to different degrees. The strength of this gradient constraint is typically measured as the degree of under-attestation (as measured by O/E ratio) in the lexicon (Pierrehumbert, 1993; Frisch et al., 2004; Coetzee & Pater, 2008). The second comparison is thus between StaGe and a segmentation model based on O/E ratios. Specifically, StaGe will be compared to O/E ratios calculated over C(V)C sequences in continuous speech. This is done because the resulting gradient model is basically identical to the statistical learning model used in Chapter 3. As a result, the experiment sets up the same comparison as in Chapter 3, namely between a model based solely on segment probabilities, and a model based on segment probabilities plus feature-based generalization. StaGe uses O/E ratios in continuous speech as a basis for the induction of phonotactic constraints. A comparison between O/E ratios and StaGe thus assesses whether feature-based generalization improves the segmentation performance of the learner, as compared to the purely segment-based baseline model. A crucial difference with the simulations in Chapter 3 is that ‘segmentation performance’ here means the model’s fit to human data, rather than the model’s ability to predict word boundaries in corpus transcriptions.

In sum, StaGe is compared to two opposite extremes: a categorical, feature-based constraint on the one hand, and a gradient, purely statistical consonant distribution on the other. StaGe lies between these two extremes: it is based on consonant co-occurrence probabilities, and uses these probabilities to induce abstract feature-based constraints. Since the model retains consonant-specific constraints, it is able to capture both segment-specific exceptions and feature-based regularities.

4.4.1 Segmentation models

A comparison is set up between three different models: StaGe, OCP-Placecat, and O/E ratio. The learning mechanisms and segmentation behavior of StaGe have been discussed in the previous section, and will therefore not be repeated here.


OCP-Placecat is the categorical interpretation of OCP-Place. For the current set of simulations this constraint is simply given to the learner. The constraint categorically rules out all PVP sequences. Applying OCP-Placecat to the segmentation of the artificial language therefore results in a consistent PTP.PTP.PT... segmentation. Consequently, the model has an absolute preference for PTP words. Since we are interested in the added value of feature-based generalization in explaining human data, the gradient OCP-Place model is based on O/E ratios calculated over continuous utterances in the Spoken Dutch Corpus. StaGe is based on these O/E values, but adds Frequency-Driven Constraint Induction and Single-Feature Abstraction in order to induce phonotactic constraints. Both StaGe and the O/E model make use of context (neighboring sequences) when making decisions about segmentation. The O/E model is thus implemented in the same way as the trough-based statistical learner from Chapter 3: the model inserts a boundary into a sequence wxyz whenever the O/E ratio of xy is lower than the O/E ratios of wx and yz. The difference between the O/E model and StaGe is thus that StaGe uses Frequency-Driven Constraint Induction and Single-Feature Abstraction for the induction of phonotactic constraints, whereas the O/E model uses consonant probabilities directly for trough-based segmentation.
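
For concreteness, the trough-based decision rule can be sketched as follows; the O/E values and the consonant stream are toy assumptions, not values estimated from the corpus.

    def trough_segment(consonants, oe):
        """Trough-based segmentation over the consonantal projection (sketch).
        A boundary is hypothesized between x and y in a window w x y z whenever
        O/E(xy) is lower than both O/E(wx) and O/E(yz)."""
        boundaries = set()
        for i in range(len(consonants) - 3):
            w, x, y, z = consonants[i:i + 4]
            if oe[(x, y)] < oe[(w, x)] and oe[(x, y)] < oe[(y, z)]:
                boundaries.add(i + 2)            # boundary falls before position i + 2
        return boundaries

    # Toy O/E table for a stream of labials (p) and coronals (t) on the consonant tier.
    toy_oe = {("t", "p"): 0.4, ("p", "p"): 0.1, ("p", "t"): 1.2}
    print(trough_segment(list("tpptp"), toy_oe))   # {2}: a boundary inside the PP trough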

4.4.2 Linear regression analyses

All three models create a different segmentation of the artificial language. As explained in the Methodology section, the frequencies of test items in the model’s output are counted and used as a predictor for the human judgments on those items. Table 4.7 shows the test items, along with their wellformedness scores (taken from Table 4.1) and their frequencies of occurrence, counted separately for each model. The upper half of the table contains scores and frequencies for PTP words, the lower half the values for PPT words. The counts clearly show that all models dislike PPT words: these items almost never occur in the models’ segmentation outputs. While OCP-Placecat has an absolute ban against PPT words, the other models occasionally (but rarely) pick up such a word. The models seem to differ more radically in how they judge PTP words. While OCP-Placecat assigns high frequency scores to all PTP words, the other models give high frequencies to some PTP words, but low frequencies to other PTP words.⁵ In fact, StaGe gives high frequencies to only 6 of the 12 PTP words, 4 of which belong to the most highly rated test items by the human participants.

5 The frequencies of PTP words in the categorical model vary slightly due to the randomized construction of the language.


Table 4.7: Human judgments and model output frequencies for experimental items (Experiment 1).

    Item      Score     StaGe    OCP-Placecat    O/E ratio
    madomo    0.8095    16       39              39
    ponebi    0.7381    18       34              21
    ponemo    0.7381    26       36              20
    podomo    0.6905    26       38              17
    madobi    0.5714    4        32              30
    madopa    0.5714    3        25              3
    ponepa    0.5714    16       35              19
    podobi    0.5476    24       38              17
    potumo    0.5476    4        33              23
    podopa    0.4762    8        40              4
    potubi    0.4524    3        37              20
    potupa    0.2381    2        33              14
    mobedo    0.5476    0        0               0
    pabene    0.5476    2        0               0
    papone    0.5000    0        0               0
    mobetu    0.4524    0        0               0
    papodo    0.4524    0        0               0
    pabedo    0.4048    0        0               0
    pamado    0.4048    1        0               0
    pamatu    0.4048    1        0               0
    papotu    0.3810    0        0               0
    pabetu    0.3571    0        0               2
    pamane    0.3333    1        0               0
    mobene    0.2619    0        0               0

The overall frequencies of StaGe are thus in general lower than for the other models.⁶ The results of the linear regression analyses are shown in Table 4.8. Stepwise regression analyses, in which predictors are added one at a time, show that StaGe explains additional variance over each of the other models (and not vice versa). That is, after taking out the variance explained by OCP-Placecat or O/E ratio, StaGe is still a significant predictor of the remaining variance. StaGe is thus the best predictor of the human data.
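
The regression step can be reproduced with standard least-squares machinery. The sketch below computes adjusted R² for one- and two-predictor models on a small subset of the items in Table 4.7; it is for illustration only, since the reported analyses use all 24 items, and a full stepwise analysis additionally tests (with an incremental F-test) whether the second predictor significantly reduces the residual variance.

    import numpy as np

    def adj_r2(y, predictors):
        """Ordinary least squares with an intercept; returns the adjusted R-squared."""
        X = np.column_stack([np.ones(len(y))] + list(predictors))
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        n, k = X.shape
        return 1 - (ss_res / (n - k)) * ((n - 1) / ss_tot)

    # A subset of the items in Table 4.7 (scores, StaGe frequencies, O/E frequencies).
    scores = np.array([0.8095, 0.7381, 0.7381, 0.6905, 0.2381, 0.5476, 0.3810, 0.2619])
    stage  = np.array([16, 18, 26, 26, 2, 0, 0, 0], dtype=float)
    oe     = np.array([39, 21, 20, 17, 14, 0, 0, 0], dtype=float)

    print("StaGe alone:  adj. R2 =", round(adj_r2(scores, [stage]), 3))
    print("O/E alone:    adj. R2 =", round(adj_r2(scores, [oe]), 3))
    print("O/E + StaGe:  adj. R2 =", round(adj_r2(scores, [oe, stage]), 3))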

6 A discussion of StaGe’s general tendency to produce undersegmentations is given in Chapter 3.


Table 4.8: Results of linear regression analyses (Experiment 1).

    Model            adj. R2    Significance level
    StaGe            0.5111     p < 0.001
    OCP-Placecat     0.2917     p < 0.01
    O/E ratio        0.3969     p < 0.001

Note. Stepwise analyses show that StaGe explains additional variance over other models:
    StaGe*** + OCP-Placecat (n.s.)    OCP-Placecat** + StaGe**
    StaGe*** + O/E ratio (n.s.)       O/E ratio*** + StaGe**

4.4.3 Discussion

The results show that StaGe is able to explain a substantial amount of the variance in the human data (about 50%). Importantly, StaGe outperforms both the categorical and the gradient interpretation of OCP-Place. These findings indicate that, at least for the current experiment, StaGe is a better model of human speech segmentation than a model based on a categorical OCP-Place, or a model based on consonant probabilities alone. In particular, the results support the view that speech segmentation by human learners is affected by both specific and abstract phonotactic constraints. StaGe produces both categorical OCP effects (PTP > PPT) and gradient OCP effects (e.g., ponemo > potubi). The mix of specific and abstract constraints that was induced by StaGe provides the best fit to the human data.

The success of StaGe in accounting for the segmentation of an artificial language by human participants was obtained by a speech-based learner: the model was trained on utterances of unsegmented speech. This approach assumes that the learner does not rely on the lexicon for the induction of phonotactic constraints for speech segmentation. While the approach is a plausible model of infant phonotactic learning, since infants are only at the beginning stages of developing a mental lexicon, it should be noted that the participants in the study by Boll-Avetisyan and Kager (2008) were adults. It may thus very well be the case that, in segmenting the artificial speech stream, the participants relied on phonotactics derived from the lexicon rather than from continuous speech. Experiment 2 therefore addresses the input issue: which type of input best explains the human segmentation data? This is done by training three different versions of StaGe: a learner based on continuous speech input, a learner based on word types in the lexicon, and a learner based on word tokens in the lexicon.

In addition to investigating the most plausible input source for artificial language segmentation, the experiment allows us to investigate the effect that the various sorts of input have on the type of constraints that are induced by StaGe.

4.5 experiment 2: input comparisons

The second study addresses the input issue in phonotactic learning. Previous proposals for the learning of phonotactic constraints argue that constraints are the result of abstractions over statistical patterns in the lexicon (Frisch et al., 2004; Hayes & Wilson, 2008). In contrast, StaGe abstracts over statistical patterns in continuous speech. This lexicon-free approach is able to produce a variety of phonotactic constraints which have proven to be useful in speech segmentation. This raises the question of which of the two sources of input, the lexicon or continuous speech, is most valuable for the induction of phonotactic constraints for speech segmentation. Furthermore, the simulations in this chapter have shown that StaGe, when trained on continuous speech, induces constraints that resemble, but do not exactly match, OCP-Place. It is worth investigating whether a different type of input would result in a constraint set that matches OCP-Place more closely. The input issue is addressed by training StaGe on different input representations. By keeping all other properties of the model fixed, a clear picture of the effect of changing the input can be obtained.

4.5.1 Segmentation models

Transcribed utterances in the Spoken Dutch Corpus (Goddijn & Binnenpoorte, 2003) were used to train three different versions of StaGe. The first version is the continuous speech-based learner that was also used in Experiment 1, which will be referred to as StaGecont. The second and third versions of StaGe are trained on word types (StaGetype) and word tokens (StaGetoken), respectively. In order to allow for a pure analysis of the structure of the input, the same underlying corpus was used. That is, a lexicon of word forms was constructed from the segmented corpus. The word forms were not reduced to canonical transcriptions, thereby preserving variations due to assimilations and reductions. Any differences between the models’ outputs are therefore due to the structure of the input, and not due to differences in transcriptions. The differences are found along two dimensions. First, the learner is either presented with between-word sequences, or not. Second, the learner either weighs sequences according to a word’s frequency of occurrence, or not.


Table 4.9: Models with different input structures.

    Model          Input                    BWS    Frequency weighting
    StaGecont      Continuous utterances    yes    yes
    StaGetype      Word types               no     no
    StaGetoken     Word tokens              no     yes
    Note. BWS = between-word sequences.

The different forms are best illustrated with a toy example. Consider an utterance abc abc in the corpus. This is a repetition of a single word abc. The utterance would be presented to StaGecont as an unsegmented utterance abcabc. There are three different sequences in this utterance: ab, bc, and ca. The word-internal sequences each occur twice. In addition, there is one occurrence of the between-word sequence ca. The learner thus includes between-word sequences in building up a statistical distribution. Furthermore, the structure of the unsegmented speech stream is such that sequence frequencies are weighted according to the frequency of occurrence of words in the stream. Note that the learner is blind with respect to whether a sequence is a within-word or a between-word sequence. The learner simply counts occurrences of sequences in the input.

The other learners, StaGetoken and StaGetype, are presented with a lexicon of isolated words which is derived from the same underlying corpus. The mini-corpus abcabc has the following lexicon: abc (2). That is, it has the word abc, which occurs twice. Crucially, between-word sequences are not represented in the lexicon. Such sequences do not affect the statistical distribution that is built up by the lexicon-based learners. These models only process the word-internal sequences ab and bc. The difference between type-based and token-based learning is whether or not to include the frequency of occurrence of the word. The token-based learner assigns a frequency of 2 to the sequences, whereas the type-based learner counts the sequences as a single occurrence (i.e., within a single word). Note that, since the token-based learner takes word frequencies into account, the only difference with the continuous speech-based learner is the absence of the between-word sequence ca. The differences are summarized in Table 4.9.

The effect of the input manipulation is that each model is based on a different statistical distribution of C(V)C sequences. This results in a different constraint set for each model, and a different hypothesized segmentation of the artificial language.
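
The three counting regimes can be made explicit in a few lines of code. The sketch below reproduces the toy example abc abc; it is not the corpus-processing code used for the actual simulations.

    from collections import Counter

    def bigram_counts(corpus_utterances, mode):
        """Biphone distributions under the three input regimes of Table 4.9.
        'cont'  : unsegmented utterances (between-word sequences included, token-weighted)
        'token' : word tokens (word-internal sequences only, token-weighted)
        'type'  : word types (word-internal sequences only, each type counted once)"""
        counts = Counter()
        if mode == "cont":
            for utt in corpus_utterances:
                stream = "".join(utt)                 # remove word boundaries
                counts.update(zip(stream, stream[1:]))
        else:
            words = [w for utt in corpus_utterances for w in utt]
            if mode == "type":
                words = set(words)
            for w in words:
                counts.update(zip(w, w[1:]))
        return counts

    corpus = [["abc", "abc"]]                         # the toy corpus from the text
    print(bigram_counts(corpus, "cont"))              # ab: 2, bc: 2, ca: 1
    print(bigram_counts(corpus, "token"))             # ab: 2, bc: 2
    print(bigram_counts(corpus, "type"))              # ab: 1, bc: 1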

The question is which of the input representations provides the best account of the human item preferences, and which constraint set most closely resembles OCP-Place. Note that a different distribution may require a shift of the induction thresholds that are used by StaGe. That is, the thresholds used so far (tM = 0.5, tC = 2.0) worked well for the speech-based learner, but might be less suitable for the others. In order to establish that a difference in performance between the models is due to the input structure, and not to a specific threshold configuration, two different types of analyses were performed. The first analysis evaluates the models based on an identical threshold configuration (tM = 0.5, tC = 2.0). In the second analysis, to rule out the potential advantage or disadvantage of a particular threshold configuration, the models are evaluated on a best-fit basis. Multiple simulations were run for each model. Each simulation was based on a different set of thresholds. As discussed in Experiment 2 of Chapter 3, a different set of induction thresholds causes changes in the number of specific markedness and contiguity constraints that are induced by the learner, which subsequently affects the generalizations that are based on those constraints. The same thresholds are considered as in Experiment 2 of Chapter 3: {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0} as possible values for tM (markedness constraints), and {1.0, 1.11, 1.25, 1.43, 1.67, 2.0, 2.5, 3.33, 5.0, 10.0} as values for tC (contiguity constraints). Combining these thresholds exhaustively results in a total of 10 × 10 = 100 different configurations. Note that the ‘standard’ configuration (tM = 0.5; tC = 2.0) is exactly in the middle. The simulations will thus produce models that are both less and more conservative than this configuration. For the comparison of StaGecont, StaGetype, and StaGetoken, the configuration of each model which produces the best fit to the human data will be used. The result is that the models are compared on both a fixed set of thresholds (tM = 0.5; tC = 2.0; Analysis 1) and a varying set of thresholds (best fit; Analysis 2).
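
Schematically, the best-fit analysis is an exhaustive grid search. In the sketch below, train_and_segment and fit_to_human_data are hypothetical placeholders for the procedures of Sections 4.2.2 and 4.2.3, not functions of the StaGe Software Package.

    # The two threshold grids from Experiment 2 of Chapter 3.
    T_M = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    T_C = [1.0, 1.11, 1.25, 1.43, 1.67, 2.0, 2.5, 3.33, 5.0, 10.0]

    def best_fit(corpus, items, scores, train_and_segment, fit_to_human_data):
        """Return the (t_M, t_C) configuration whose segmentation output best fits
        the human wellformedness scores (e.g., by adjusted R-squared)."""
        best = (None, float("-inf"))
        for t_m in T_M:
            for t_c in T_C:                       # 10 x 10 = 100 configurations
                segmentation = train_and_segment(corpus, t_m, t_c)
                fit = fit_to_human_data(segmentation, items, scores)
                if fit > best[1]:
                    best = ((t_m, t_c), fit)
        return best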

4.5.2 Linear regression analyses

As in Experiment 1, the frequencies of the test items in the segmentation outputs of the different models are used as a predictor of the human judgments on those items. The output frequencies are given in Table 4.10. The frequency counts in Analysis 1 (fixed thresholds) show that StaGetype and StaGetoken yield very different results. StaGetype has a clear preference for PTP words, assigning high frequencies to all PTP words, and near-to-zero frequencies to PPT words. It thereby displays segmentation behavior very similar to OCP-Placecat (see Table 4.7). In contrast, StaGetoken has higher frequencies for PPT words than for PTP words, and does not seem to produce any OCP-like behavior.


Table 4.10: Human judgments and model output frequencies for experimental items (Experiment 2).

                        StaGe - Analysis 1         StaGe - Analysis 2
                        (fixed thresholds)         (best-fit thresholds)
    Item      Score     cont    type    token      cont    type    token
    madomo    0.8095    15      39      0          17      39      15
    ponebi    0.7381    18      29      4          18      18      4
    ponemo    0.7381    23      36      5          23      36      8
    podomo    0.6905    29      38      0          27      38      13
    madobi    0.5714    3       29      5          7       13      5
    madopa    0.5714    2       16      0          2       13      0
    ponepa    0.5714    15      30      2          15      12      0
    podobi    0.5476    24      32      2          24      19      4
    potumo    0.5476    4       33      1          4       4       4
    podopa    0.4762    8       27      0          8       24      0
    potubi    0.4524    3       35      3          3       2       0
    potupa    0.2381    2       22      4          2       5       0
    mobedo    0.5476    0       0       20         0       0       0
    pabene    0.5476    1       0       7          0       0       6
    papone    0.5000    0       0       6          0       0       4
    mobetu    0.4524    0       0       17         0       0       0
    papodo    0.4524    0       0       16         0       0       0
    pabedo    0.4048    0       1       30         0       0       0
    pamado    0.4048    2       0       17         2       0       0
    pamatu    0.4048    0       0       6          2       0       0
    papotu    0.3810    0       0       14         0       0       0
    pabetu    0.3571    0       3       21         0       0       0
    pamane    0.3333    0       0       3          2       0       6
    mobene    0.2619    0       0       3          0       0       0

Note. Thresholds used in Analysis 1: StaGecont, type, token: tM = 0.5, tC = 2.0. Best-fit thresholds used in Analysis 2: StaGecont: tM = 0.5, tC = 3.33; StaGetype: tM = 0.4, tC = 1.43; StaGetoken: tM = 0.3, tC = 1.11.

Regression analyses for StaGecont, StaGetype, and StaGetoken are given in Table 4.11. The results for Analysis 1 show that StaGecont and StaGetype are significant predictors of the human wellformedness scores, but StaGetoken is not. Stepwise regression analyses indicate that both StaGecont and StaGetype explain additional variance over StaGetoken. In addition, stepwise analyses indicate that StaGecont explains additional variance over StaGetype: when StaGetype is entered first into the regression model, StaGecont is still a significant predictor (p < 0.05).


Table 4.11: Results of linear regression analyses (Experiment 2).

             StaGe - Analysis 1 (fixed thresholds)    StaGe - Analysis 2 (best-fit thresholds)
    Input    adj. R2    Significance level            adj. R2    Significance level
    cont     0.4790     p < 0.001                     0.5026     p < 0.001
    type     0.3826     p < 0.001                     0.5721     p < 0.001
    token    0.0798     n.s.                          0.4596     p < 0.001

Note. Stepwise analyses:
    Analysis 1: cont* + type (n.s.)         type (n.s.) + cont*
                cont*** + token (n.s.)      token (n.s.) + cont***
                type** + token (n.s.)       token (n.s.) + type**
    Analysis 2: cont (n.s.) + type (n.s.)   type (n.s.) + cont (n.s.)
                cont* + token*              token* + cont*
                type** + token (n.s.)       token (n.s.) + type**

The reverse is not true. In sum, Analysis 1 shows that StaGecont again is the best predictor of the human data from the artificial language learning experiment by Boll-Avetisyan and Kager (2008).

Inspection of the constraint sets used by StaGetype and StaGetoken during segmentation reveals the source of the difference in performance between the two models. The set of constraints used by StaGetype closely matches OCP-Place: the majority of the constraints restrict the occurrence of PVP sequences (see Table 4.12). In addition, there are some relatively low-ranked constraints disfavoring dVp and tVp, resulting in somewhat lower occurrence frequencies for items violating these constraints (such as madopa, potupa). The constraint set induced by StaGetoken shows a completely different picture (see Table 4.13). Here the highest-ranked constraints include constraints against TVP sequences (most notably against dVP). This explains why the model has a preference for PPT segmentations: it often inserts a boundary between coronals and labials, resulting in PT.PPT.PPT... segmentations. Interestingly, the model also induces a contiguity constraint Contig-IO([n]V[m]), which aims to preserve nVm sequences. This constraint has the effect that, while in general PPT words are preferred, words that end with an /n/-initial syllable (e.g., pabene) are dispreferred. The contiguity constraint militates against the insertion of a boundary after such syllables. Since few boundaries are posited after /n/, there are few words ending with an /n/-initial syllable in the model’s segmentation output.


Table 4.12: Constraints used in segmentation of the artificial language by StaGetype (tM = 0.5, tC = 2.0).

                                                       FREQUENCY
CONSTRAINT                      RANKING           Count    Percentage
*[m]V[m]                       308.8524              27       (11.1%)
*[b]V[m]                       289.7448              27       (11.1%)
*[p,b]V[m]                     249.5673              27       (11.1%)
*[m]V[p,f]                     180.4863              27       (11.1%)
*[b]V[p]                       161.8902              27       (11.1%)
*[m]V[p,b,f,v]                 155.1712              27       (11.1%)
*[p,b,f,v]V[p]                 117.6980              27       (11.1%)
*[b,d,v,z]V[p]                  92.2151               8        (3.3%)
*[p,b,t,d,f,v,s,z]V[p]          69.5577               8        (3.3%)
*[p,b,f,v]V[p,t,k,f,s,S,x]      64.1567              10        (4.1%)
*[b,d,v,z]V[p,b]                53.4554              19        (7.8%)
*[p,b,f,v]V[p,b,f,v]            43.3952              15        (6.2%)

Note. The frequency of use (FREQUENCY) indicates how many wxyz-sequences in the language (out of a total of 243) are affected by the constraint.

after /n/, there are few words ending with an /n/-initial syllable in the model's segmentation output.

In sum, it seems that, while StaGetype most closely resembles the predictions of a categorical OCP-Place constraint, it does not match the gradient preferences of participants in the experiment as well as StaGecont does. The latter obtains a better match to the data by distinguishing between the wellformedness of different PTP words. In addition, the token-based learner fails in accounting for the human data.

One question that comes up is why the token-based learner performs so much worse than the type-based learner. As mentioned earlier, the difference between the models may be caused by the fact that the statistical distributions are different, while at the same time the induction thresholds are the same for all models. It could be the case that OCP-Place is learnable by a token-based model if the statistical distribution is simply cut off at different points. In addition, we may get better StaGecont and StaGetype models simply by using different induction thresholds. Analysis 2 is therefore concerned with models that have been optimized for their threshold configuration. The best fit to the human data was found for the following configurations: StaGecont: tM = 0.5, tC = 3.33;


Table 4.13: Constraints used in segmentation of the artificial language by StaGetoken (tM = 0.5, tC = 2.0).

                                                          FREQUENCY
CONSTRAINT                          RANKING          Count    Percentage
*[d]V[m]                          2406.2062             27       (11.1%)
*[b,d]V[m]                        1635.3616             26       (10.7%)
*[m]V[m]                          1559.1889             20        (8.2%)
Contig-IO([n]V[m])                1224.5688             19        (7.8%)
*[d]V[p,f]                        1218.6504             25       (10.3%)
*[d]V[p,b]                        1049.2083             19        (7.8%)
*[p,b,t,d]V[m]                     925.8681             36       (14.8%)
*[t,d]V[b]                         803.8574             18        (7.4%)
*[m]V[p,f]                         789.6689             10        (4.1%)
*[t,d]V[p,b]                       695.9537             17        (7.0%)
*[b,d,v,z]V[p,t,k]                 684.5911             12        (4.9%)
*[m]V[p,b]                         679.8727              8        (3.3%)
*[p,b]V[m]                         648.6330              1        (0.4%)
*[m,n]V[b]                         533.5175              8        (3.3%)
*[p,b,t,d,f,v,s,z]V[p,t,k]         524.2290              8        (3.3%)
*[p,b,f,v]V[t,d,s,z]               496.0740              2        (0.8%)
*[m,n]V[p,b]                       457.2829              6        (2.5%)
*[b,d,v,z]V[p,b]                   362.2443              1        (0.4%)
*[p,b,t,d,f,v,s,z]V[p,b,t,d]       338.8520              1        (0.4%)

Note. The frequency of use (FREQUENCY) indicates how many wxyz-sequences in the language (out of a total of 243) are affected by the constraint.

StaGetype: tM = 0.4, tC = 1.43; StaGetoken: tM = 0.3, tC = 1.11. The output frequencies of the models are shown in Table 4.10 (Analysis 2). Interestingly, the configuration of StaGecont was already optimal: The output frequencies are nearly identical to those in the earlier runs of the model. It should be noted that the small deviations found for the different runs of StaGecont are not due to a different constraint set, but rather to a small number of sequences for which there is no single optimal segmentation candidate. In such cases a winner is chosen at random. For this reason the exact frequency of a test item may vary slightly between different runs. Also note that the value for tC is irrelevant here, since only markedness constraints are used for segmentation of the artificial language. The constraint sets used


Table 4.14: Constraints used in segmentation of the artificial language by StaGetype (tM = 0.4, tC = 1.43; best fit).

                                                        FREQUENCY
CONSTRAINT                          RANKING        Count    Percentage
Contig-IO([m]V[n])                 938.8947           27       (11.1%)
*[m]V[m]                           308.8524           27       (11.1%)
Contig-IO([n]V[m])                 299.5556           20        (8.2%)
*[b]V[m]                           289.7448           27       (11.1%)
Contig-IO([m]V[d])                 285.7460           13        (5.3%)
*[p,b]V[m]                         249.5673           27       (11.1%)
*[m]V[p,f]                         180.4863           27       (11.1%)
*[m]V[p,b,f,v]                     155.1712           27       (11.1%)
*[p,b,f,v]V[p,t,k,f,s,S,x]          36.3611           70       (28.8%)
Contig-IO([t,d,s,z]V[C])            32.2084           20        (8.2%)
Contig-IO([C]V[p,b,f,v])            24.5104           54       (22.2%)
*[b,v]V[C]                          23.5896           14        (5.8%)
Contig-IO([C]V[p,b,t,d])            23.1961            2        (0.8%)

Note. The frequency of use (FREQUENCY) indicates how many wxyz-sequences in the language (out of a total of 243) are affected by the constraint. V = vowel, C = obstruents = [p,b,t,d,k,g,f,v,s,z,S,Z,x,G,h,Ã].

by StaGecont in Experiments 1 and 2 (both Analysis 1 and 2) are thus the same. The type-based learner (StaGetype) also resembles its counterpart from Analysis 1, but assigns lower frequencies to several PTP words. The model therefore makes some clear distinctions between the predicted wellformedness of different PTP words. The token-based learner's optimal performance was obtained by making the model extremely conservative. It assigns low frequencies (and many zero frequencies) to most words, with somewhat higher frequencies for madomo and podomo.

The results of the linear regression analyses for these models are shown in Table 4.11 (Analysis 2). The performances of the models in terms of their optimized R2 are relatively similar. Stepwise analyses indicate that the only conclusion that can be drawn is that StaGetype is a better predictor than StaGetoken. This result, together with the low frequencies in the model's output, leads to the conclusion that the token-based learner again fails to accurately predict the human wellformedness scores. The token-based learner does not match the human data as well as the other models do. Furthermore, the model's constraint set does


Table 4.15: Constraints used in segmentation of the artificial language by StaGetoken (tM = 0.3, tC = 1.11; best fit).

                                                                 FREQUENCY
CONSTRAINT                                    RANKING       Count    Percentage
Contig-IO([m]V[n])                          6511.2649          27       (11.1%)
Contig-IO([b,d]V[t])                        5347.2605          27       (11.1%)
Contig-IO([m]V[t])                          5098.1932          27       (11.1%)
Contig-IO([b,d]V[t,d])                      3653.2264          27       (11.1%)
Contig-IO([m]V[t,d])                        3483.0647          27       (11.1%)
Contig-IO([p,b,t,d]V[t,s])                  2100.9127          27       (11.1%)
Contig-IO([p,t,k,f,s,S,x]V[n])              1833.5053          27       (11.1%)
Contig-IO([p,b,t,d]V[t,d,s,z])              1389.6902          27       (11.1%)
Contig-IO([b,d]V[p,b,t,d,f,v,s,z])          1363.1418          92       (37.9%)
Contig-IO([n]V[m])                          1224.5688          27       (11.1%)
Contig-IO([C]V[n])                          1206.8609          27       (11.1%)
Contig-IO([p,b,t,d]V[p,t,f,s])              1063.6762          47       (19.3%)
Contig-IO([p,t,k,f,s,S,x]V[m,n])            1048.7304          53       (21.8%)
*[b]V[m]                                     864.5170          27       (11.1%)
*[m]V[p,f]                                   789.6689          27       (11.1%)
Contig-IO([p,b,t,d]V[p,b,t,d,f,v,s,z])       710.5020          45       (18.5%)
*[m]V[p,b]                                   679.8727          27       (11.1%)
Contig-IO([C]V[m,n])                         661.1707          12        (4.9%)
*[m,n]V[b]                                   533.5175          18        (7.4%)
*[m,n]V[p,b]                                 457.2829          24        (9.9%)

Note. The frequency of use (FREQUENCY) indicates how many wxyz-sequences in the language (out of a total of 243) are affected by the constraint. V = vowel, C = obstruents = [p,b,t,d,k,g,f,v,s,z,S,Z,x,G,h,Ã].

not resemble OCP-Place. The conservativeness of the model is illustrated by the constraint set it used: The model relied on a large number of contiguity constraints (see Table 4.15). Since contiguity constraints aim to preserve sequences, the result is a hypothesized segmentation that contains very few word boundaries.

There is no significant difference between the performances of StaGecont and StaGetype. Inspection of their constraint sets does reveal some differences, however. Whereas StaGecont exclusively uses markedness constraints (Table 4.5), StaGetype has both markedness constraints against PVP, and contiguity constraints preserving various sequences (Table 4.14). The net effect is that

this model favors PTP words, but with some exceptions, such as words starting with pVt. StaGecont also has low frequencies for words starting with pVt, and in addition for words starting with mVd. These models, like the human participants, thus show a general OCP-Place effect, resulting in a general preference for PTP words. At the same time, however, both humans and the models show effects of gradience: Not all PTP words are judged equally wellformed. It appears that such finer-grained distinctions make these models better predictors of the human segmentation data.

4.5.3 Discussion

The results of Experiment 2 show that a continuous-speech-based learner fits the human preferences in the artificial language segmentation experiment by Boll-Avetisyan and Kager (2008) better than a type-based learner (Analysis 1), or comparably to a type-based learner (Analysis 2). The success of both the type-based and the continuous speech-based model can be attributed to the mixture of abstract and specific constraints. This allows the models to capture both general tendencies and fine-grained preferences in the human data. Interestingly, a token-based learner fails, regardless of the specific threshold configuration that is used.

In order to explain the failure of the token-based learner, a more detailed examination of the statistical distribution is required. The difference between the type-based and the token-based learner is whether or not the learner is sensitive to word frequencies. A closer look at the lexicon (based on the Corpus Gesproken Nederlands) that was used by the token-based learner shows that the failure can be attributed to the phonotactic structure of the most highly frequent words in the lexicon. The most highly frequent words containing CVC sequences are function words that consist of TVT sequences (e.g., /dAt/ 'that', /nit/ 'not', /dAn/ 'then'), or PVT sequences (e.g., /fAn/ 'of', /mEt/ 'with', /mar/ 'but'). The use of token frequencies results in an inflation of the difference in the degree of attestedness between sequences that occur in function words and sequences that do not occur in function words. Sequences in TVT function words (such as /dAt/ and /dAn/) are overrepresented, which in effect makes sequences that do not form function words (e.g., TVP sequences such as /dAp/ and /dAm/) relatively underrepresented. Depending on the thresholds that are used, this could lead to the induction of markedness constraints against TP sequences and/or the induction of contiguity constraints favoring TT sequences. Similarly, inflation of the frequencies of PVT function words (/fAn/, /mEt/) causes underattestation of PP sequences due to their relatively low occurrence frequencies (/fAm/, /mEp/). In this case, the frequency boost leads to markedness constraints against PP

sequences, and/or contiguity constraints favoring PT sequences. Examples of these types of constraints are indeed found in the constraint sets that are used by the token-based learner (e.g., *TP and *PP constraints in Table 4.13; Contig-IO(PT) and *PP constraints in Table 4.15).

The high frequencies of CVC function words also have an effect on the ranking of the constraints. Specifically, the high-frequency function words involve coronals, which leads to an increase in the individual segment frequencies of coronals, and thus has the effect that the expected frequencies of sequences involving coronals increase. The consequence is that constraints affecting coronals will in general be ranked higher than constraints affecting labials exclusively. This again can be seen in the constraint sets that are used by the token-based learner during segmentation. Constraints affecting coronals tend to be ranked higher than constraints affecting labials (*TP ≫ *PP, in Table 4.13; Contig-IO(PT) ≫ Contig-IO(PP), *PP, in Table 4.15).

While the phonotactic structure of CVC function words can account for the difference between a type-based and a token-based learner, it does not explain why the continuous speech-based learner performs so well. Since the speech-based learner counts all occurrences of sequences in the speech stream, it is in fact also a token-based learner. The crucial difference, however, is that the speech-based learner includes between-word sequences in the statistical distribution. Occurrences of between-word sequences apparently neutralize the inflation of the distribution that was found in the token-based learner. Specifically, they have the effect of boosting the frequency counts of sequences that do not occur in function words. That is, while sequences such as dVp or tVp do not occur in highly frequent function words, they do occur frequently between words. In fact, the frequency neutralization effect is possibly, at least in part, due to function words that consist of 2 segments, occurring in sequence with labial-initial words. For example, dVp benefits in terms of frequency from the fact that /d@/ is a highly frequent function word, and that there are many Dutch words starting with /p/. Since 2-segment function words typically do not start with labials, the attestedness of PVP sequences is affected to a lesser degree. As a consequence, there is still evidence for OCP-Place in the speech stream, and OCP-Place is thus learnable from continuous speech. Further research on different languages would in the end have to show to what extent these results are specific to Dutch, and to what extent the learning approach presented here provides a general account for the induction of phonotactic constraints, such as OCP-Place.
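The contrast between type-based and token-based counting can be illustrated with a small sketch. The toy lexicon and frequencies below are invented, and the real models operate over the Corpus Gesproken Nederlands with feature-based generalization, so this is an illustration of the counting step only: weighting biphone counts by token frequency lets a few function words dominate the distribution, whereas type counting does not.

```python
# Toy illustration (invented lexicon and frequencies, not the actual CGN data):
# count consonant biphones once per word type vs. weighted by token frequency.
from collections import Counter

# (consonant skeleton of a CVC word, token frequency); T = coronal, P = labial
lexicon = [("dVt", 50000), ("dVn", 40000),  # high-frequency TVT function words
           ("fVn", 30000),                  # high-frequency PVT function word
           ("dVp", 200), ("dVm", 150),      # low-frequency TVP content words
           ("mVp", 100)]                    # low-frequency PVP content word

type_counts, token_counts = Counter(), Counter()
for skeleton, freq in lexicon:
    type_counts[skeleton] += 1      # each word type contributes once
    token_counts[skeleton] += freq  # each word token contributes

print(type_counts["dVt"], type_counts["dVp"])    # 1 1  -> equally attested by type
print(token_counts["dVt"], token_counts["dVp"])  # 50000 200 -> TVP looks
                                                 # underattested relative to TVT
```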


4.6 general discussion

The current chapter aimed to provide a constraint induction account of the learning of OCP-Place. The speech-based learning approach that was developed in the previous chapters was extended to the induction of abstract, feature-based constraints on non-adjacent consonants (i.e., consonants with intervening vowels). The model, StaGe, uses no explicit mechanism for similarity avoidance and assumes that constraints are induced without referring to the lexicon as a basis for phonotactic learning. The main theoretical issue was whether StaGe, when applied to non-adjacent consonants in transcriptions of continuous speech utterances, could induce constraints that resemble OCP-Place. The empirical part of this chapter was concerned with investigating whether StaGe is able to account for the OCP effects on speech segmentation that were found in a study by Boll-Avetisyan and Kager (2008) using artificial language learning experiments.

Experiment 1 compared StaGe to both categorical and gradient interpretations of OCP-Place. It was found that the output produced by StaGe better resembles human preferences for experimental items than a categorical feature-based OCP-Place constraint banning all homorganic consonants. In addition, it was found that StaGe also performs better than a gradient model based on consonant probabilities. Since StaGe induces both general, feature-based constraints and finer-grained segment-based constraints, the results support the view that segmentation involves both specific and abstract phonotactic constraints. Experiment 2 addressed the question of which type of input structure best accounts for the induction of OCP-Place, and for its effect on segmentation. The speech-based version of StaGe performed better than (Analysis 1) or comparably to (Analysis 2) a version of StaGe that was trained on word types. Both models outperformed a version of the model that was trained on word tokens.

The results in this chapter provide the first support for the psychological plausibility of StaGe. While the simulations in Chapter 3 were evaluated on the ability to accurately predict the locations of word boundaries in a corpus of unsegmented speech, the simulations in this chapter focused on the ability of StaGe to accurately predict human segmentation behavior. The empirical finding that neither categorical nor gradient interpretations of OCP-Place, nor type-based or token-based versions of the model, were better predictors of the human data than StaGe trained on continuous speech, combined with the fact that the mechanisms used by StaGe are based on learning mechanisms that have been shown to be available to human language learners, indicates that StaGe has some potential as a model of human phonotactic learning and speech segmentation.


In Chapter 3, it was found that adding feature-based generalizations to the statistical learning of phonotactic constraints improves the segmentation performance of the learner. Importantly, Experiment 1 of the current chapter demonstrates the added value of feature-based generalization in accounting for human segmentation data. It was found that generalization improves the fit to human data, as compared to models that rely on pure consonant distributions. Since the lexicon-free approach also outperforms a single, predefined OCP-Place, the results suggest that the success of the approach can be attributed to the mix of specific and abstract constraints. Also in Experiment 2, the success of both the continuous speech-based model and the type-based model can be attributed to the mixture of specific and general constraints. These results complement the findings of the previous chapter: Feature-based generalizations improve the segmentation performance of the learner in both corpus simulations and simulations of human data. The findings of Chapter 3 and Chapter 4 thus provide converging evidence that the segmentation of continuous speech using phonotactics involves both specific and abstract phonotactic constraints. The success of StaGe as a model of the induction of phonotactic constraints for speech segmentation can thus be attributed to the learning mechanisms it employs: segment-based statistical learning and feature-based generalization. The resulting mix of specific and more general constraints has proven to be successful in both types of empirical studies that have been conducted, corpus simulations and simulations of human data.

The second topic of interest concerns the comparison of models that vary in the assumptions they make about the input (continuous speech, word types, word tokens; Experiment 2). This issue is closely related to the investigation of the speech-based learning (SBL) hypothesis, as outlined in Chapter 1. The SBL hypothesis states that phonotactics is induced from continuous speech input, and subsequently facilitates the formation of the mental lexicon through the detection of word boundaries in continuous speech. In contrast, the lexicon-based learning (LBL) approach assumes that phonotactics is learned from the lexicon, and thus is the result of, rather than a prerequisite for, word learning. In line with the SBL hypothesis, it was attempted to induce OCP-Place from continuous speech input. StaGe, when applied to C(V)C sequences in unsegmented utterances, induces a constraint set which contains constraints with varying levels of generality. The set contains constraints that resemble but do not exactly match OCP-Place. Some of the constraints are more specific, some are more general and affect a different natural class not exclusively consisting of labial sequences. The fact that the constraint set to a large degree mimics the behavior of OCP-Place, and even outperforms categorical and gradient interpretations of OCP-Place in simulating human data, provides support for the SBL approach. Specifically, it shows that neither a similarity

avoidance mechanism nor a lexicon is an absolute requirement for the induction of OCP-Place. Our account of the learning of OCP-Place therefore partly corresponds to the proposal by Frisch et al. (2004). Our account shares with theirs that similarity avoidance need not be encoded in the synchronic grammar, but rather may be learned indirectly through abstractions over statistical patterns in the input. However, our account adds to theirs the possibility that OCP-Place need not be the result of abstractions over patterns in the lexicon. Our simulations show that it is also conceivable that OCP-Place is learned through abstractions over statistical patterns found in the learner's direct speech input, without referring to internal representations of word forms in the mental lexicon.

A strong test for the SBL approach is the comparison between different input structures in Experiment 2. This directly contrasts speech-based learners with lexicon-based learners. The comparison was intended to highlight qualitative differences between input consisting of continuous speech, word types, and word tokens. While the statistical distribution produced by word tokens is qualitatively different from distributions that are derived from both continuous speech and word types, the distributions of continuous speech and word types are relatively similar. Both models produced constraint sets that resulted in similar segmentation behavior. On the basis of the current set of results, no conclusion can be drawn with respect to which of these models provides a better account of the induction of OCP-Place, and of its effect on human speech segmentation. The type-based learner and the continuous speech-based learner, however, make radically different assumptions with respect to the learner's learning capacities. The type-based (or any lexicon-based) learner assumes that the learner has already built up a lexicon, and derives phonotactic constraints from patterns in the lexicon. This requires the a priori ability to segment speech before phonotactic learning can take place. In contrast, the continuous speech-based learner induces constraints from unsegmented input, and uses phonotactics in the construction of the lexicon. While it is still conceivable that a type-based learner was responsible for the phonotactic knowledge used by adult participants in the study by Boll-Avetisyan and Kager (2008), such a model is not a very plausible account of phonotactic learning by infants, who are only at the beginning stages of developing a mental lexicon. The speech-based learner is compatible with findings from studies on infant phonotactic learning, and at the same time is able to account for adult segmentation data. This view thus provides a unified, lexicon-free approach to phonotactic learning in which adults and infants rely on the same phonotactic learning and segmentation mechanisms. More research is needed to find cases where the two models make clearly different predictions, which can be tested

empirically. For now, the conclusion is that speech-based learning is able to account for the induction of phonotactic constraints, and for their usage in speech segmentation.

A possible direction for future research on the input issue would be a comparison of StaGe to other lexicon-based learning models. In the current study, all models were based on the same learning mechanisms. This allowed for a comparison that zooms in on the role of differently structured input in phonotactic learning and speech segmentation. While StaGe as a general model is based on the idea of abstractions over statistical patterns in the input, and can thus be applied to both segmented and unsegmented input, it is not known how StaGe would perform when compared to other lexicon-based learning models. An interesting case would, for example, be to train the Hayes and Wilson (2008) learner on the Dutch lexicon and see to what extent the model induces constraints that resemble OCP-Place. While the model by Hayes and Wilson was not designed to make predictions about speech segmentation, it could still be trained on the same data as StaGe was. Since their model produces constraints that are accompanied by numerical weights, one could even imagine feeding the constraints and weights to the OT segmentation model and inspecting its predictions with respect to the locations of word boundaries. Conversely, StaGe could be adjusted to predict the wellformedness judgments on which the Hayes and Wilson model was evaluated.

The finding that a model based on type frequencies better accounts for human phonotactic learning than a model based on token frequencies is in line with a substantial body of research that advocates word types as the main source for phonotactic learning (Pierrehumbert, 2003; Hay, Pierrehumbert, & Beckman, 2004; Albright, 2009; Richtsmeier, Gerken, & Ohala, 2009; Hamann & Ernestus, submitted). While effects of token frequency have been found (e.g., Bailey & Hahn, 2001), type frequencies have been shown to be essential for phonotactic learning (Richtsmeier et al., 2009; Hamann & Ernestus, submitted). The current study provides additional evidence in favor of type frequencies as compared to token frequencies in phonotactic learning. While type and token frequencies are correlated, and the advantage of type frequencies over token frequencies has typically been reported as being of modest size (e.g., Albright & Hayes, 2003; Hayes & Wilson, 2008; Albright, 2009), the simulations described in this chapter indicate that the use of type or token frequencies can lead to qualitatively different results. At least for the learning of constraints on non-adjacent consonants, different statistical distributions can lead to different generalizations built upon those distributions (such as the *TP constraints induced by the token-based learner).


In addition, our study investigates a third potential source of phonotactic learning, which has not been considered in earlier studies. While a learner based on token frequencies fails, our learner based on consonant sequence frequencies in continuous speech has a comparable performance to a type-based learner in simulating the induction of OCP-Place, and in accounting for its effect on speech segmentation by human participants. It would be interesting to push the speech-based learning hypothesis further and see to what extent the approach is able to account for the induction of phonotactics in general. This would crucially involve setting up a comparison where a type-based learner and a continuous speech-based learner would make different predictions with respect to phonotactic learning.

A final note on the different input structures concerns a more principled distinction between type and token frequencies. In the current study, we assumed that the distinction between type and token frequencies was merely quantitative. Token frequencies result in a boost of the frequencies of sequences that occur in highly frequent words. In this view, since StaGe is simply counting occurrences across the lexicon, there is no principled distinction between a sequence occurring 2 times in each of 80 different words and the same sequence occurring 4 times in each of 40 different words. Both result in a total of 160 occurrences across the lexicon and thus would lead to equivalent results in our model. A recent study by Hamann and Ernestus (submitted), however, showed that the former improved adults' learning of phonotactic restrictions, whereas the latter did not. This suggests a more principled distinction between type and token frequencies than is commonly assumed. Hamann, Apoussidou, and Boersma (to appear) propose that this difference can be accounted for by incorporating a semantic level into the model, along the lines of Apoussidou (2007).

The constraint set induced by StaGe revealed an interesting property of the abstraction mechanism of the model. Part of the success of the continuous speech-based learner was due to the induction of constraints that behave as alignment constraints during segmentation (e.g., *[C]V[p,t], see Table 4.5). These constraints are in fact markedness constraints where one consonant position is free and the other restricted. For example, a constraint *Xp states that whenever /p/ occurs, it should be preceded by a word boundary, regardless of the preceding consonant. In other words, /p/ should be word-initial. Not inserting a boundary before /p/ (and thus not treating /p/ as a word-initial consonant) would constitute a violation of the markedness constraint. Such single-sided generalizations are an automatic consequence of abstraction over statistically induced segment-specific constraints. The model thus constructs constraints that resemble alignment constraints without any specific bias or new constraint type. The effect of a new type of constraint (alignment) is

simply due to the statistical patterns in the input, combined with the feature-based generalization mechanism.

To conclude, the current chapter provides evidence that StaGe is able to induce constraints that resemble OCP-Place. Positive results were found for versions of the model that were trained either on continuous speech, or on type frequencies. Both models induced general OCP-Place constraints, augmented with constraints favoring or restricting specific consonant co-occurrences. This allowed the models to capture both general tendencies and fine-grained preferences in the human segmentation data. The chapter thus provides additional evidence for the role of feature-based generalizations in speech segmentation, and further highlights the potential usefulness of continuous speech as a source for phonotactic learning.


5

THE INDUCTION OF NOVEL PHONOTACTICS BY HUMAN LEARNERS

The remainder of this dissertation is concerned with extending the psycholinguistic evidence for the Speech-Based Learning hypothesis. Computer simulations in the previous chapters have shown that phonotactics can be learned from continuous speech through a combination of statistical learning and generalization. This approach was shown to be effective in simulations of speech segmentation using unsegmented corpora (Chapter 3), and could account for human data concerning the segmentation of continuous artificial languages (Chapter 4). The current chapter aims at providing direct evidence that human learners can learn phonotactics from continuous speech, by setting up a series of phonotactic learning experiments in which adult participants are tested on their ability to induce novel phonotactic constraints from a continuous stream of speech from an artificial language. The speech streams are constructed in such a way that it is unlikely that participants would derive phonotactics from statistically learned word forms. In addition, this chapter aims at providing evidence for an assumption that was made in Chapter 4, namely that learners have the ability to ignore intervening vowels for the induction of constraints on non-adjacent consonants. This processing assumption was made to allow for the induction of OCP-Place by StaGe. The current chapter examines to what extent human learners can induce novel phonotactics of this kind as a result of exposure to continuous speech input.

In Section 5.1, an overview is given of earlier psycholinguistic work on phonotactic learning, statistical learning, and feature-based generalization. Sections 5.2 and 5.3 describe two artificial language learning experiments which focus on providing new evidence for the statistical learning of constraints on non-adjacent consonants. Section 5.4 describes the third experiment, which examines whether learners induce feature-based generalizations from continuous speech. Finally, the implications of the findings as well as suggestions for future work are given in Section 5.5.

5.1 introduction

Experimental studies addressing the induction of phonotactics by human learners show that a variety of constraints can be induced, and are learned by both adults and infants from exposure to language data conforming to

these constraints. These studies provide insight into the types of phonotactic constraints that can be learned by human learners, and into the level of abstraction at which these constraints are represented.

Onishi et al. (2002) show that adult learners can learn phonotactic regularities from a set of training words after only several minutes of exposure. The study examined the learning of first-order phonotactic constraints, in which specific consonants were confined to either word-initial or word-final position (e.g., /bæp/, not */pæb/), and second-order constraints, in which consonant-vowel sequences were linked (e.g., /bæp/ or /pIb/, but not */pæb/ or */bIp/). During familiarization participants heard CVC words that showed the phonotactic restriction. In the test phase, participants were faster at repeating novel test words which obeyed the experiment-induced constraints than test words which violated those constraints. These effects were found both for the learning of first-order constraints, and for the learning of second-order constraints.

In a follow-up study by Chambers, Onishi, and Fisher (2010), adult participants were trained on CVC words with phonotactic constraints on initial and final consonants, and were subsequently tested on words containing vowels not heard during familiarization. Learners responded more rapidly to legal words with a novel vowel than to illegal words with that same vowel. Their results show that learners are able to abstract across vowel contexts, and are able to learn proper first-order constraints. The study thereby provides evidence that consonants and vowels are processed independently in phonotactic learning (see also Newport & Aslin, 2004; Bonatti et al., 2005).

Phonotactic generalizations in which consonants are restricted to onset or coda position have been shown to be learnable by 16.5-month-old infants. Infants listen longer to unstudied illegal syllables than to unstudied legal syllables (Chambers et al., 2003). It thus appears that both adults and infants can induce constraints that link consonants to specific positions.

A study by Saffran and Thiessen (2003) indicates that infants more easily learn phonotactic patterns when positions are assigned to segments from the same natural class than when assigned to unrelated segments. During familiarization, 9-month-old infants heard AVBAVB words, where A and B were natural classes of voiceless stops (p, t, k) and voiced stops (b, d, g), separated by vowels (V). Infants were able to distinguish novel test items that followed the voicing pattern to which they had been familiarized from test items that followed the opposite voicing pattern. In contrast, when the positions contained featurally mixed classes (p, d, k in one position; b, t, g in the other position) infants did not learn the pattern to which they had been exposed. These findings suggest that phonotactic constraints on natural classes may be easier to learn than constraints on arbitrary segment classes.


One aspect that all of the above-mentioned studies have in common is that participants are tested on their ability to generalize the experimentally induced phonotactic constraint to novel words (i.e., to items that did not occur in the familiarization phase). That is, in order to assess whether participants have abstracted the phonotactic structure of the words, as opposed to memorizing training items as a whole, participants are tested on novel items conforming to the same phonotactic regularities as the training words. Such a test phase rules out the possibility that participants were simply learning words, rather than learning phonotactics. In order to test whether participants represent phonotactic constraints at the level of the phonological feature, an even stronger test for generalization is needed. Since feature-based phonotactic constraints affect natural classes of segments, a proper test for feature-based abstraction would be to familiarize participants with items that contain segments from a subset of the natural class, and test them on items that contain a different subset of the natural class. In other words, test items should be novel words that contain novel segments, which were not heard during familiarization.

Seidl and Buckley (2005) exposed 9-month-old infants to training words that followed a pattern at the level of natural classes. In their first experiment, two different artificial languages were constructed. One followed a natural (i.e., phonetically grounded) pattern in which oral stops were restricted to word-initial position, and fricatives and affricates were restricted to intervocalic position (e.g., /pasat/). The other language consisted of the reverse positional structure, resulting in an arbitrary (i.e., not grounded) pattern of word-initial fricatives and affricates, followed by intervocalic stops (e.g., /sapat/). They found that both the natural and the arbitrary patterns were learned by infants after only a few minutes of exposure. Similar results were found in their second experiment for the learning of consonant-vowel sequences sharing place features (natural pattern) and sequences not sharing place of articulation (arbitrary pattern). The results are consistent with the view that infants can learn a phonotactic pattern, either natural or arbitrary, when the pattern affects a natural class of segments. However, while the test words contained some novel consonants from the same natural class, in addition to familiar consonants heard during training, the effect of these novel segments was not analyzed independently from words with familiar consonants. The results could therefore have been driven solely by the occurrence of familiar consonants in the test items.

In a follow-up study, Cristià and Seidl (2008) used test words in which the restricted position was made up exclusively of consonants that had not occurred during familiarization. In their study, 7-month-old infants were exposed to CVC words from an artificial language that either had onsets with segments that formed a natural class (plosives + nasals, which share

the specification '−continuant') or onsets with segments that did not form a natural class (fricatives + nasals, which do not form a coherent class). During the test phase, infants listened to words that contained novel plosives and fricatives. When trained on the natural class, infants distinguished between words that did or did not conform to the phonotactic structure of the training language. In contrast, infants trained on the incoherent class were not able to distinguish legal from illegal test items. In a control experiment, they showed that the difference in learning difficulty was not due to an inherent difference between plosives and fricatives. These results provide evidence that infants represent phonotactic generalizations at the level of phonological features, and support the view that infants learn phonotactic constraints which are more general than the specific segments that were used during training.

Similar results were obtained in other studies of phonological learning by infants. White et al. (2008) show that infants can learn stop and fricative voicing alternations. Infants were presented with distributional information that could be used to infer that two segments were related by a phonological process (and hence refer to a single underlying phoneme). While the responses of 8.5-month-old infants were found to be driven by transitional probabilities (and not by the phonological alternation), 12-month-old infants showed the capacity to generalize two segments in complementary distribution into a single phonemic category. In addition, Maye et al. (2008) show that exposing 8-month-old infants to a bimodal distribution of Voice Onset Time (VOT) for one place of articulation (e.g., da/ta contrast for dentals) facilitates the discrimination of a voicing contrast for a different place of articulation (e.g., ga/ka contrast for velars). These findings are consistent with the view that infants construct generalizations at the level of an abstract feature voice as a result of exposure to specific training items.

Further evidence for feature-based generalization comes from a study by Finley and Badecker (2009), who show that adult learners are able to learn patterns of vowel harmony which generalize to novel vowels in test items. Participants were trained on alternations that displayed front/back vowel harmony. The subset used for training consisted of either low or mid vowels. The test items contained members of the vowel subset (low or mid) not used during training. In control conditions, participants heard stems only, and thus received no evidence of vowel harmony in alternations. They found that participants trained on vowel harmony patterns were able to generalize to both novel stem vowels and novel suffix vowels.

In sum, these studies provide evidence that phonotactic constraints do not only affect specific segments, but also natural classes of segments. Patterns of segments that form natural classes are easier to learn than patterns that form arbitrary classes. In addition, based on exposure to patterns of specific

segments from a natural class, learners abstract over the feature-based similarity between these segments and induce constraints that go beyond the training segments, affecting the natural class as a whole. Similar findings have been reported in studies on phonotactic learning through speech production. There is evidence that adult learners can induce experiment-wide phonotactics on specific segments as a result of pronouncing experimental items (e.g., Dell, Reed, Adams, & Meyer, 2000; Goldrick & Larson, 2008). In addition to constraints on specific segments, there is evidence that phonological features affect phonotactic learning in speech production (Goldrick, 2004).

While the learning of novel phonotactics from isolated word forms has been studied extensively, the learning of phonotactics from continuous speech has been addressed only indirectly. Specifically, such studies focus on the role of segment probabilities in word segmentation (Newport & Aslin, 2004; Bonatti et al., 2005; Toro et al., 2008). Newport and Aslin (2004) show that adult learners are able to track consonant-to-consonant transitional probabilities as well as vowel-to-vowel transitional probabilities, in a continuous stream of artificial speech consisting of CV syllables. Since low probabilities are typically associated with word boundaries in segmentation (Saffran, Newport, & Aslin, 1996), the ability to track phonotactic probabilities in the speech stream provides learners with a cue for word segmentation. During the test phase, participants had to choose between 'words', consisting of high-probability sequences, and 'part-words', containing a low-probability sequence, thereby straddling a word boundary as defined by the statistical structure of the speech stream. Languages in which consonant probabilities were manipulated and languages in which vowel probabilities were manipulated were learnable, as shown by a significant preference for words over part-words. These results indicate that segment co-occurrence probabilities assist speech segmentation. Moreover, the study provides evidence that co-occurrence probabilities can be learned when segments are not immediately adjacent, but are separated by intervening vowels or consonants in the speech stream.

In a similar study, Bonatti et al. (2005) found a different pattern of results. When using artificial languages that were slightly more complex than those used by Newport and Aslin (in terms of the number of word frames, and immediate frame repetitions), learners were able to exploit consonant probabilities but not vowel probabilities for segmentation. They conclude that the statistical learning of vowel dependencies is more difficult than the statistical learning of consonant dependencies. They argue that this is due to a functional distinction between consonants and vowels (namely that consonants are used for word identification and vowels carry information about grammar, see Nespor, Peña, & Mehler, 2003). In addition to demonstrating the relevance of consonant probabilities for word segmentation, Bonatti et al. showed that

learners learn the phonotactic structure of the words. When trained on a continuous artificial language containing 9 words that are structured in such a way that they exhibit statistical regularities on both the consonantal and the vocalic tier, learners pick up the consonant structure of these words and generalize these structures to test items containing a novel vowel structure. The preference for consonant phonotactics over vowel phonotactics is reflected in the significant preference for words respecting the consonant structure of the training language (with new vowel structures) over words respecting the vowel structure of the training language (with new consonant structures).

There are two ways in which the results of the study by Bonatti et al. (2005) can be interpreted. The first interpretation is that learners induce phonotactics directly from the speech stream. That is, it is possible that participants in the experiment did not use consonant probabilities for word segmentation, but instead learned only consonant dependencies. The consequence would be that word roots were learned, rather than actual words (Bonatti et al., 2005). This could explain the generalization to novel vowels, since simply no vowel information was extracted from the speech stream. The second interpretation is that learners induce phonotactics from a statistically learned lexicon of word forms. That is, it may be the case that participants used consonant probabilities to align their segmentation of the speech stream, and subsequently learned the artificial lexicon (containing the actual multisyllabic words) from the speech stream. This statistically learned lexicon could then have been used to derive the phonotactic structure of these words, allowing participants to generalize to novel words that have the same consonant structure. Indeed, it has been argued that statistical learning provides a basis for an initial lexicon, from which further linguistic generalizations can be derived (e.g., Swingley, 2005; Thiessen & Saffran, 2003). Note that these two interpretations sharply contrast the Speech-Based Learning hypothesis and the Lexicon-Based Learning hypothesis (as formulated in Chapter 1). The latter predicts that the phonotactic structure of words is derived from the (proto-)lexicon, while the former predicts that the structure of words is learned from continuous speech directly.

The current chapter attempts to disentangle these two possible explanations, and aims at showing that phonotactics can be learned from continuous speech without mediation of a lexicon of word forms. In order to provide evidence in favor of the Speech-Based Learning hypothesis, artificial languages are constructed in such a way that it is unlikely that learners construct a lexicon from sequences in the speech stream. This is done by either substantially increasing the number of words that would have to be learned, or by inserting randomly selected vowels into the speech stream, resulting in the absence of recurring words in the language. In addition, this chapter looks at whether

participants learn phonotactic constraints on specific consonants, or whether they induce feature-based generalizations from the continuous speech stream.

The current experiments

Since most previous studies have used artificial languages that consisted of only a few words (typically about 9 or 12) with many occurrences in the speech stream, it is possible that learners build up a lexicon of CVCVCV word forms while listening, and project any generalizations (e.g., CXCXCX word roots, Bonatti et al., 2005; XAXBXA structural generalizations based on vowels, Toro et al., 2008; or AXC structural generalizations based on syllables, Peña et al., 2002) from this lexicon. Indeed, it has been argued that cues facilitating word segmentation allow for the extraction of structural generalizations from the speech stream (Peña et al., 2002; Toro et al., 2008). Here, we ask whether CXCXCX word roots can be learned under conditions in which the segmentation of whole words is particularly difficult. This is done by constructing languages that have many different CVCVCV word forms with only a few occurrences (Experiment 1), or languages in which vowels occur at random, thereby eliminating the notion of recurring word forms altogether (Experiment 2). If learners can still pick up word roots in these conditions, distinguishing between low-frequency (Experiment 1) or novel (Experiment 2) word forms in the test phase, then this would constitute strong evidence for the SBL hypothesis. It would indicate that, in accordance with the SBL hypothesis, the phonotactic structure of words is learnable without reference to actual word forms. That is, it would show that phonotactic constraints are learnable directly from continuous speech input.

The learning of feature-based generalizations from continuous speech has not been addressed in earlier studies. Therefore, in addition to testing for lexicon-based versus speech-based learning, the chapter will look into whether learners induce feature-based phonotactic constraints from the speech stream (Experiment 3). Crucially, such constraints should generalize to test items containing novel segments taken from the same natural class as the segments used in the training language. This would constitute even stronger evidence for the SBL hypothesis, since it would show that not only word roots, but also generalizations in terms of abstract levels of representation (natural classes) would be learnable from continuous speech. This would be in direct contradiction with studies that argue that learning mechanisms with different forms of computation (statistical learning, generalization) operate on different sorts of input (Peña et al., 2002; Toro et al., 2008). Specifically, it would contradict the view spelled out in these studies that statistical learning operates on continuous input, and that generalization operates on segmented input.


The goal of the first experiment is to assess whether learners can exploit phonotactic probabilities for segmentation in an artificial language that consists of a multitude of word forms, each with only a few occurrences. The structure of the language has two important properties. First, as a result of a relatively large number of consonant frames, combined with a large number of intervening vowels, the language entails a vocabulary that is far larger than those used in earlier studies (e.g., Newport & Aslin, 2004; Bonatti et al., 2005). The large number of words makes it unlikely that learners acquire the actual vocabulary of word forms during familiarization, and use those word forms for the induction of phonotactics (as would be predicted by the Lexicon-Based Learning hypothesis). While this possibility cannot be ruled out, it seems more likely that phonotactic knowledge arises from abstraction over the speech stream, by tracking the probabilities of consonants across vowels. Second, in order to allow for the possibility of feature-based abstractions from specific training segments to natural classes, the consonant sequences in the language display substantial phonological similarities. As a consequence of this setup, the difference between 'within-word' and 'between-word' consonant probabilities is smaller than in any earlier study on consonant probabilities¹ (TPwithin = 0.5, TPbetween = 0.33). Before testing the types of generalizations that may be extracted from such a language, Experiment 1 is concerned with the question of whether the statistical structure of the language is learnable under these relatively difficult learning conditions.
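The TP values just mentioned are conditional probabilities over consonant pairs; the sketch below shows how such a TP, and the O/E ratio used in earlier chapters (see the footnote below), can be computed from raw counts. The counts are invented and the code uses the standard textbook definitions, purely for illustration.

```python
# Minimal illustration (invented counts): transitional probability vs. O/E ratio
# for a consonant pair (x, y) tracked across intervening vowels.

def transitional_probability(count_xy, count_x):
    """P(y | x): how often x is followed by y, out of all occurrences of x."""
    return count_xy / count_x

def oe_ratio(count_xy, count_x, count_y, n_pairs):
    """Observed count of xy divided by its expected count under independence."""
    expected = (count_x * count_y) / n_pairs
    return count_xy / expected

# Hypothetical counts: /s/ occurs 200 times, /p/ 150 times, /s/...(V).../p/ 90 times,
# out of 1000 consonant pairs in the stream.
print(transitional_probability(90, 200))  # 0.45
print(oe_ratio(90, 200, 150, 1000))       # 3.0  (overrepresented pair)
```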

5.2 experiment 1

The first experiment focuses on two natural classes of consonants that co-occur across intervening vowels in a continuous speech stream. In the speech stream, fricatives (natural class A) are always followed by plosives (natural class B). A third class of unrelated segments (arbitrary class X) was included to create a 3-syllable pattern. The AB sequence was presented to participants as either a within-word or a between-word sequence. This was done by manipulating the statistical structure of the continuous speech stream. The stream was a continuous concatenation of either ABX or BXA words, resulting in higher probabilities for sequences within words than for sequences between words. The languages will be referred to as the 'ABX language' (with AB as a within-word sequence), and the 'BXA language' (with AB as a between-word

1 In contrast to earlier chapters in this dissertation, statistical dependencies in this chapter are expressed in terms of transitional probability (TP) rather than in terms of observed/expected (O/E) ratio. This is done because TP has been the standard measure used in studies on artificial language learning. The results, however, are compatible with either measure (see e.g., Aslin et al., 1998). In addition, computational studies have shown that the two formulas produce roughly equivalent results (e.g., Swingley, 1999; see also Chapter 3 of this dissertation).

sequence). If participants learn the statistical structure of the language they are trained on, then they should show a preference for words that conform to the words from their training language (either ABX or BXA) as compared to words from the other language.

A critical aspect of artificial language learning experiments is whether the preference for a particular set of items can be genuinely attributed to exposure to the artificial language (see Reber & Perruchet, 2003; Redington & Chater, 1996, for a discussion). Two factors are potentially problematic for such a conclusion. First, preferences may be due to the participant's native language, rather than to knowledge that was induced from the artificial language. Second, preferences may be due to the structure of test items, rather than to the structure of the training language. In order to rule out these potential confounds, the current experiment employs two different training languages and one single set of test items. That is, participants are exposed to one of the two training languages, and are subsequently tested on the same set of items. If participants base their decisions solely on knowledge from their native language, or on the structure of test items, then they would display the same preferences in the test phase, regardless of their training condition. Conversely, if there is a significant difference between the preferences of participants in the two groups, then this has to be due to the structure of the different training languages (since test items are identical in both conditions). In the current experiment, the two training languages contain the same continuous sequences, but the statistical structures of the two languages were manipulated such that they would indicate different segmentations. A significant difference between preferences in the two training conditions thus would indicate that participants are sensitive to the statistical structure of the training languages.

It should be noted that the current experiment does not provide a test for generalization. Specifically, it is not a test for the construction of feature-based generalizations, since all segments had occurred in the familiarization stream. Also, it is not a test for the abstraction of word roots, since test items had occurred in the familiarization language (albeit with rather low occurrence frequencies). Rather, the experiment is a first exploration intended to see whether participants can learn the statistical structure of continuous artificial languages under conditions that are more difficult than those in earlier studies (e.g., Bonatti et al., 2005; Newport & Aslin, 2004). Specifically, the current experiment employs languages that are more complex in terms of the total number of words in the language, and the relatively small differences in transitional probability in the continuous speech stream.


Table 5.1: Artificial languages for Experiment 1

           ABX language                          BXA language
Consonant frames    Vowel fillers        Consonant frames    Vowel fillers
(C1 C2 C3)          (V1 V2 V3)           (C2 C3 C1)           (V2 V3 V1)
f b x               [a] [e] [o]          b x f                [e] [o] [a]
f d l               [i] [u] [y]          d l f                [u] [y] [i]
s p n               [Ei] [œy] [Au]       p n s                [œy] [Au] [Ei]
s b l                                    b l s
z p x                                    p x z
z d n                                    d n z

Note. A = fricatives, B = plosives, X = 'arbitrary' class.

5.2.1 Method

Participants Forty native speakers of Dutch (33 female, 7 male) were recruited from the Utrecht Institute of Linguistics OTS subject pool (mean age: 21.6, range: 18-34). Participants received 5 euros for participation. Participants were assigned randomly to either the ABX or the BXA condition.

Materials Words are defined by 6 consonant frames (C1 C2 C3), which are combined exhaustively with 9 different Dutch vowels (three for each vowel position). As a result, the language contains a total of 162 different CVCVCV words (6 consonant frames times 3 × 3 × 3 vowel frames). Two different languages (ABX and BXA) were created, which contain different orderings of three classes of consonants. Class A consists of the fricatives /f/, /s/, and /z/. Class B consists of the plosives /p/, /b/, and /d/. The arbitrary class X consists of /x/, /n/, and /l/. The consonant frames and vowel fillers for each language are given in Table 5.1. The two languages are different segmentations derived from the same underlying consonant sequencing:

(5.1) . . . ABXABXABXABXABXABX. . .

2 Only tense vowels were used, as lax vowels are restricted to occur only in word-medial position in Dutch. Diphthongs were included to expand the vowel set.
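As an informal illustration of the word inventory described above (a minimal sketch, assuming the reading of Table 5.1 in which each vowel slot draws on three vowels; the segment labels are plain ASCII stand-ins for the phonetic transcriptions):

    from itertools import product

    # Consonant frames (C1 C2 C3) of the ABX language and the vowel sets for the
    # three vowel slots (V1 V2 V3), following Table 5.1.
    frames = [("f", "b", "x"), ("f", "d", "l"), ("s", "p", "n"),
              ("s", "b", "l"), ("z", "p", "x"), ("z", "d", "n")]
    v1 = ["a", "i", "Ei"]    # long vowel, short vowel, diphthong for slot 1
    v2 = ["e", "u", "oey"]   # "oey" stands in for the diphthong /œy/
    v3 = ["o", "y", "Au"]

    # Each frame combines exhaustively with 3 x 3 x 3 vowel fillers: 6 x 27 = 162 words.
    words = [c1 + x + c2 + y + c3 + z
             for (c1, c2, c3) in frames
             for x, y, z in product(v1, v2, v3)]
    print(len(words))        # 162

The BXA inventory is built analogously from the reordered frames and fillers.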


The ABX language is characterized by having AB as a word-internal sequence. In this language, each fricative can be followed by 2 different plosives. Similarly, each plosive can be followed by 2 different consonants from the arbitrary class. This results in a ‘within-word’ consonant transitional probability (TP) of 0.5. Every final consonant (from the X class) can be followed by each of the 3 initial consonants (from the A class). The ‘between-word’ consonant TP is thus 0.33. Each of 3 different vowels can occupy a vowel slot, resulting in vowel transitional probabilities of 0.33 (both within words and between words). The statistical structure of the language, with lower probabilities for XA sequences than for AB and BX sequences, predicts the following segmentation of the continuous speech stream:

(5.2) . . . ABX.ABX.ABX.ABX.ABX.ABX. . . (ABX language)

The BXA language has the same basic structure, but predicts a segmentation that is shifted one element to the right. In this case, each plosive can be followed by 2 different consonants from the arbitrary class, and each arbitrary consonant can be followed by 2 different fricatives. Crucially, since fricatives occur in the word-final syllable, they are followed by 3 different word-initial plosives. In this language, AB sequences are thus predicted to be interpreted as between-word sequences, due to lower transitional probabilities of between-word sequences:

(5.3) . . . A.BXA.BXA.BXA.BXA.BXA.BX. . . (BXA language)

Note that the vowels are still linked to the same consonants (see Table 5.1). The only difference between the ABX and BXA languages is the location of the statistical word boundaries in the speech stream (due to lower probabilities between words than within words).

For each language, a continuous sequence of CV syllables was generated by concatenating 4 different pseudo-random orderings of the 162 different words in the language. Each of the 162 words thus occurred 4 times in the speech stream, and, since the stream was presented twice to participants, each word occurred a total of 8 times during familiarization. In order to control for syllable TPs, restrictions were defined on which syllables could follow each other. Specifically, since syllables could have 6 possible successors within words (since each syllable has 2 successor consonants combined with 3 different vowels), the allowed syllable sequences between words in the language were restricted to match this number. This ensured that participants could not rely on between-word versus within-word syllable TPs during segmentation.

The exact statistical structures of the languages are as follows: Consonant TPs are 0.5 within words, and 0.33 between words (range: 0.28-0.4, ABX language; 0.3-0.36, BXA language). Vowel TPs are 0.33 within words, and also

between words (range: 0.25-0.39, ABX language; 0.30-0.38, BXA language). Finally, syllable TPs are 0.17 within words and between words (range: 0.08-0.24, ABX language; 0.11-0.24, BXA language). The resulting continuous language contains 1944 CV syllables. An audio stream was generated using the MBROLA text-to-speech system (Dutoit, Pagel, Pierret, Bataille, & Vrecken, 1996), using the Dutch ‘nl2’ voice. The stream was synthesized with flat intonation and had a total duration of 7.5 minutes (average syllable duration: 232 ms).

For the test phase, 36 items were selected from each language. The selection was such that each test item consisted of a consonant frame with a mixed set of vowels (i.e., one long vowel: /a/, /e/, /o/, one short vowel: /i/, /u/, /y/, and one diphthong: /Ei/, /œy/, /Au/), occurring at different positions. A list of test trials was created for use in a two-alternative forced-choice task. Every trial consisted of one ABX word and one BXA word (e.g., /fibexAu/ – /bœylosi/). The complete list of test items is given in Appendix D. The test items were synthesized with the same settings as the familiarization stream.
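How such transitional probabilities can be estimated from a concatenated stream is illustrated by the following sketch (illustrative only: the stream is built here from the consonant frames of Table 5.1 with vowels ignored, and a simple random concatenation stands in for the four pseudo-random orderings of whole words that were actually used):

    import random
    from collections import Counter

    def transitional_probabilities(stream):
        """Estimate TP(x -> y) = freq(xy) / freq(x) over adjacent symbols."""
        bigrams = Counter(zip(stream, stream[1:]))
        unigrams = Counter(stream[:-1])
        return {(x, y): n / unigrams[x] for (x, y), n in bigrams.items()}

    # Consonant frames of the ABX language (Table 5.1).
    frames = [("f", "b", "x"), ("f", "d", "l"), ("s", "p", "n"),
              ("s", "b", "l"), ("z", "p", "x"), ("z", "d", "n")]

    random.seed(1)
    stream = [c for _ in range(2000) for c in random.choice(frames)]

    tps = transitional_probabilities(stream)
    print(tps[("f", "b")])   # within-word (A -> B) pair: close to 0.5
    print(tps[("x", "f")])   # between-word (X -> A) pair: close to 0.33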

Procedure Participants were tested in a sound-attenuated booth. Participants were given written instructions, which were explained to them by the experimenter. Audio was presented over a pair of headphones. Participants’ responses were given by selecting one out of two response options (indicated visually with ‘1’ and ‘2’) by clicking with a mouse on the screen. The instructions given to participants were that they would hear a novel (‘Martian’) language, and that their task was to discover the words of this language. Before starting the actual experiment, participants were given a short pre-test in which they had to indicate whether a particular syllable had occurred in first or in second position in a trial. This was done to familiarize participants with the setup of the experiment.

The 7.5-minute familiarization stream was presented twice, with a 2-minute silence between presentations of the stream. As a consequence, total familiarization time was 15 minutes. The stream started with a 5-second fade-in, and ended with a 5-second fade-out. There were thus no indications of word endings or word beginnings in the speech stream. After familiarization, participants were given a two-alternative forced-choice (2AFC) task in which they had to indicate for each trial which out of two words sounded more like the Martian language they had just heard. After the experiment had finished, participants were asked to fill in a debriefing questionnaire, which was intended to check whether there had been any specific items or potential L1 interferences (such as embedded words) systematically affecting participants’ preferences.


The questionnaire asked participants to indicate the following: (i) which words from the Martian language they could remember (making a guess if they could not remember the exact word), (ii) whether they had noticed any existing Dutch words while listening, and, if so, which ones, (iii) whether they had employed specific strategies in making their decisions in the test phase, or whether they had simply been guessing, (iv) what their general impression of the experiment was, and whether they had been clearly instructed, and (v) whether they had participated in similar experiments before.

5.2.2 Results and discussion

The mean preference of each individual participant is plotted in Figure 5.1. A mixed logit model (see e.g., Jaeger, 2008) with subjects and items as random factors showed that there was no significant effect of training condition (p = 0.4484). These results thus provide no evidence that participants learned the statistical structure of the continuous language to which they had been exposed. An additional analysis tested whether the conditions, when analyzed separately, were significantly different from chance-level performance. This analysis indicates that participants in both conditions had a significant preference for BXA words over ABX words (ABX condition: p < 0.01, BXA condition: p < 0.001). There thus appears to be a bias in favor of BXA over ABX words, which was driven by factors independent of training. There is no evidence that participants learned the statistical structure of the language and used it to extract words from the speech stream.

The experiment, which failed to show an effect of statistical learning, illustrates the necessity of control groups in artificial language learning (see also Reber & Perruchet, 2003; Redington & Chater, 1996). On the basis of the results for the BXA language alone, one could be inclined to conclude that participants had indeed learned the statistical structure of the continuous speech stream and segmented it accordingly into word-like units. Presenting participants with a language with the opposite statistical structure (the ABX pattern), however, did not result in a different segmentation of the speech stream. Regardless of the statistical structure of the familiarization stream, participants had a general preference for BXA over ABX words. Interestingly, many participants gave quite consistent responses, as shown by their near-categorical preferences (see Figure 5.1). That is, only a few

3 Many studies use t-tests (on participants’ aggregated preferences) for the analysis of 2AFC data. See Jaeger (2008) for an extensive discussion of why t-tests should not be used for the analysis of categorical outcomes.


[Figure 5.1 here: individual participants’ preference (%) for BXA words (y-axis, 0-100), plotted by training condition (x-axis: ABX, BXA); condition means: ABX 71.0%, BXA 78.9%.]

Figure 5.1: Results for the ABX and BXA training languages. The circles indicate mean preferences for individual participants. Triangles indicate the mean score for each condition.

participants had preferences that were close to random (i.e., approaching 50%). This could be interpreted to mean that participants picked up a repeating pattern, and consolidated this pattern through listening (rather than not being affected by the training language at all). This is also illustrated by two participants in the ABX condition who displayed a systematic preference for ABX words, choosing ABX over BXA in all trials. It is thus possible that the bias in favor of one of the two word types was active during segmentation in the training phase, rather than during the judgment of experimental items in the test phase.


The question then arises what caused the large majority of participants to segment the speech stream according to the BXA pattern. This was examined in two different ways. First, the debriefing questionnaires were checked for any recurring items, patterns, or strategies that participants had been aware of and that could have influenced their decisions. Second, the ABX and BXA items were analyzed in terms of various phonotactic probabilities in Dutch, the native language of the participants. Several studies have shown that L1 phonotactics can interfere with artificial language segmentation (Boll-Avetisyan & Kager, 2008; Finn & Hudson Kam, 2008; Onnis et al., 2005). Corpus analyses were conducted to see whether there were L1 segmentation cues that could have interfered with participants’ segmentation of the artificial language. These post-hoc analyses of the results were not meant to provide conclusive evidence about what strategy participants had used in the experiment, but rather to highlight potentially interfering factors, which could then be controlled in a follow-up experiment.

The questionnaires showed no indication of specific items or embedded words systematically affecting the results. Participants in both conditions had mainly remembered words that conformed to the BXA pattern. Out of a total of 124 words that were written down as remembered items, about 74% conformed to the BXA pattern and about 22% to the ABX pattern. This roughly reflects the results shown in Figure 5.1. Several existing Dutch words had been heard in the familiarization phase. These were mainly monosyllabic words, such as zij (‘they’), doe (‘do’), nu (‘now’). Interestingly, whenever participants had heard a sequence of multiple Dutch syllables, these were usually ABX sequences (zij doen nu, ‘they do now’; zeg dan niet, ‘say then not’). Apparently, these sequences had not interfered with the segmentation of the speech stream. Few participants reported Dutch BXA sequences. When this did occur, these were low-frequency words, such as delusie (‘delusion’) or doelloosheid (‘purposelessness’). Thus, the questionnaires showed no indication that participants’ preference for BXA words was due to embedded Dutch words.

Several participants (22 out of 40) had written down a strategy that they had used during the test phase. Although participants’ learning of the artificial words may have been implicit, and participants thus may not have been aware of the strategies they had actually employed, inspection of these strategies could reveal properties of the artificial languages that may have been particularly salient for the participants. The reported strategies varied from very general remarks, such as ‘I paid attention to word endings’, to specific reports on the patterns of the words, such as ‘many words started with /p/, had medial /x/, and ended with /s/’. Some participants reported that they had based their decisions on the final vowel of the word. Specifically, they


reported that they had chosen words ending with /i/. Inspection of the words that participants had listed as remembered items confirms this: 44% of these words ended with /i/. Inspection of the BXA test items also indicates that BXA items ending with /i/ overall (i.e., averaged over different participants) received somewhat higher scores than BXA words ending with /a/ or /Ei/ (ABX condition: /i/ - 73.8%, /a/ - 70.4%, /Ei/ - 68.8%; BXA condition: /i/ - 84.2%, /a/ - 75.4%, /Ei/ - 77.1%), suggesting that participants may have used final-/i/ as a segmentation cue.

In order to assess whether word-final vowels indeed could have provided a segmentation cue, the vowels used in the experiment were compared on the basis of their frequency in word-final position in Dutch words. If there is a segmentation strategy based on word-final vowels, then this should be reflected in higher word-final frequencies for the vowels that occur in final position in BXA sequences than for the vowels at the ends of ABX or XAB sequences. The distribution of word-final vowels across Dutch words was calculated using the CELEX lexical database (Baayen, Piepenbrock, & Gulikers, 1995). The counts indicate that final vowels in BXA words indeed have higher frequencies than final vowels in ABX or XAB words (BXA: /i/ – 3050, /a/ – 796, /Ei/ – 616; ABX: /o/ – 598, /y/ – 45, /Au/ – 16; XAB: /e/ – 331, /u/ – 63, /œy/ – 34). These counts make it seem plausible that participants may have been sensitive to the position of /i/ in the speech stream, since there are many more Dutch words ending with /i/ than with any of the other vowels. Note that the claim that word-final vowels affect segmentation has not been proven here. In order to do so, an experiment should be designed specifically aimed at such a hypothesis. Rather, the data reported here indicate that it may be wise to be cautious when using /i/ in artificial segmentation experiments with Dutch participants.

In general, artificial languages should be controlled for interferences from the native language as much as possible. The current analyses indicate that possible effects of word-final vowels should be added to the list of factors that need to be controlled for. One way to dispose of any potential interferences from vowels is to make sure that vowels do not appear at fixed positions in the consonant pattern. That is, vowels can be disconnected from consonant frames by simply inserting them at random into vowel slots in the speech stream. If vowels are inserted at random, then they cannot be used by listeners to align their segmentations into a fixed, repeating pattern. Thus, random vowel insertions render vowels completely useless as a segmentation cue. The insertion of vowels at random positions has the additional advantage that it provides an even stronger test

4 The ‘final-/i/’ strategy does not necessarily result in higher scores for /i/-final words. That is, if /i/ is used by participants to align their segmentation into the BXA pattern, then this does not imply that words ending with /i/ are better words than other BXA words, but rather means that /i/-alignment is used to prefer BXA segmentations over ABX segmentations.

for the Speech-Based Learning hypothesis. If vowels are randomly inserted into the speech stream, then there are simply no systematically recurring words (i.e., CVCVCV sequences) in the language. This makes it highly unlikely that learners induce the phonotactic structure of words from a statistically learned lexicon of CVCVCV word forms. Rather, it seems likely that learners would induce phonotactics directly from the speech stream.

An alternative explanation for the current findings is that acoustic properties of the speech stream, rather than L1 knowledge, may have pushed participants into systematic BXA segmentation. For example, the articulation of plosives requires closing the vocal tract in order to build up pressure before the release. This causes a short silence in the acoustic signal prior to the burst of air. Such fine-grained acoustic properties of the speech signal may potentially provide a cue for segmentation. However, no evidence for a systematic ‘plosive-initial’ strategy was found in the debriefing questionnaires. One way to rule out this possibility would be to create a control condition in which the familiarization stream consists of randomly ordered syllables. Such a control condition would take away any systematic phonetic bias in the speech stream, while still allowing for L1 transfer to items in the test phase. Instead of running such a control condition, it was decided to construct new languages in Experiment 2, distributing plosives and fricatives equally across consonant positions in the speech stream. Phonetic differences between plosives and fricatives could therefore not lead to a systematic segmentation bias. It should nevertheless be noted that natural classes are defined by articulatory features and thus typically result in specific acoustic regularities (such as pre-burst silences) in the speech stream. The challenge is to construct artificial languages in such a way that they contain only a minimal amount of acoustic cues to boundaries. In Experiment 2, two new languages were constructed in which consonant frames were interleaved with random vowels.

5.3 experiment 2

Experiment 2 is again concerned with the induction of CXCXCX word roots from continuous speech. In order to control for possible interference of vowel structures, vowel fillers are selected at random. Consonant frames consist of segments from three natural classes: voiceless obstruents (class A), voiced obstruents (class B), and dorsal obstruents (both voiceless and voiced, class C). These classes are used to create two languages which are defined by different statistical structures: an ABC language and a BCA language. Test items in this experiment are novel combinations of consonant frames and vowel fillers, which had not occurred in the speech stream. The experiment thus tests whether the knowledge that is acquired by participants generalizes

to novel words that have the same consonantal structure as the sequences in the familiarization language.

5.3.1 Method

Participants Forty native speakers of Dutch (33 female, 7 male) were recruited from the Utrecht Institute of Linguistics OTS subject pool (mean age: 21.4, range: 18-39). Participants received 5 euros for participation. Participants were assigned randomly to either the ABC or the BCA condition.

Materials Six new C1 C2 C3 consonant frames are used to create the languages. Consonants were taken from natural classes of obstruents. Class A consists of the voiceless obstruents /p/, /t/, /s/. Class B consists of the voiced counterparts of those obstruents: /b/, /d/, /z/. The third class, class C, has three dorsal obstruents: /k/, /g/, and /x/. Note that two segments were intentionally withheld from the A and B classes. Specifically, the voiceless obstruent /f/ was not presented as a member of the A class, and the voiced obstruent /v/ was not presented as a member of the B class. These segments were kept for future testing of feature-based generalizations from the specific consonants in the class to novel segments in Experiment 3 (Section 5.4). The consonants are again sequenced in two different ways. One language is made up of ABC consonant frames, having higher probabilities for AB and BC (‘within-frame’) sequences than for CA (‘between-frame’) sequences. The second language consists of BCA consonant frames, resulting in higher probabilities in the speech stream for BC and CA (‘within-frame’) sequences than for AB (‘between-frame’) sequences. The materials are given in Table 5.2. Continuous sequences of consonants were generated by concatenating 600 randomly selected frames for each language. The resulting speech stream had 600 × 3 = 1800 consonants. The two languages differed with respect to their consonant co-occurrence probabilities (in the same way as in Experiment 1): within-frame probabilities were 0.5 (consonants had 2 possible successors within frames), while between-frame probabilities were 0.33 (consonants had 3 possible successors between frames). The hypothesized segmentation of the two languages is thus as follows:

(5.4) . . . ABC.ABC.ABC.ABC.ABC.ABC. . . (ABC language)

(5.5) . . . A.BCA.BCA.BCA.BCA.BCA.BC. . . (BCA language)


Table 5.2: Artificial languages for Experiment 2

                 ABC language                                BCA language
   Consonant frames    Vowel fillers            Consonant frames    Vowel fillers
   (C1 C2 C3)          (random)                 (C2 C3 C1)          (random)
   p d g               [a, e, o, i, u, y]       d g p               [a, e, o, i, u, y]
   p z k                                        z k p
   t b x                                        b x t
   t z g                                        z g t
   s b k                                        b k s
   s d x                                        d x s

Note. A = voiceless obstruents, B = voiced obstruents, C = dorsal obstruents. Vowel fillers are inserted at random into the speech stream.

where ‘.’ indicates a boundary as predicted by a low transitional probability between consonants. A set of six vowels (/a/, /e/, /o/, /i/, /u/, /y/) was used to fill the vowel slots in the continuous sequence of consonant frames. For each vowel slot, a filler was selected from the set of vowels at random. This procedure resulted in the following statistical structure for the languages: Consonant TPs were 0.5 within words (range: 0.48-0.52, ABC language; 0.48-0.52, BCA language), and 0.33 between words (range: 0.31-0.35, ABC language; 0.32-0.35, BCA language). Vowel TPs were 0.17 within words, and also between words (range: 0.14-0.2, ABC language; 0.14-0.2, BCA language).

Note that there is a small difference between within-word and between-word syllable transitional probabilities in the setup used in this experiment. While any syllable can have 12 possible successors (2 consonants times 6 vowels) within words, there are 18 possible successors (3 consonants times 6 vowels) between words. Theoretically, syllable TPs are thus 0.08 within words and 0.06 between words. However, this theoretical difference is not reflected in the actual speech stream. Due to the random insertion of vowels, there are many different syllable bigrams, each with very low occurrence frequencies. Each language contains about 675 different syllable bigrams. Since the total speech stream contains 1799 bigram tokens, each syllable bigram on average occurs 2.7 times. In contrast, the set of consonant bigrams is much smaller, and such bigrams have higher occurrence frequencies. There are 21 different consonant bigrams, and consonant bigrams on average occur 85.7 times each. The consequence is that consonant TPs

can be estimated much more reliably from the continuous speech stream than syllable TPs. The estimated values for consonant bigrams approach the theoretical values (as can be seen from the TP ranges listed above), whereas syllable TPs take on a wide range of values: 0.02-0.29 within words, and 0.02-0.21 between words (in both languages). Importantly, the fact that within-word and between-word TP ranges largely overlap indicates that syllable TPs are an unreliable segmentation cue in this experiment. That is, within-word syllable probabilities are not systematically higher than between-word syllable probabilities.

Due to the design of the language, there are no systematically recurring word forms in the speech stream, and phonotactic learning is hypothesized to operate on the speech stream, rather than on a statistically learned lexicon. However, CVCVCV word forms might occasionally recur in the speech stream due to chance. In order to see how often this had occurred, the continuous ABC and BCA languages were analyzed for recurring CVCVCV sequences. Frequency counts indicate that there were 485 different ABC word forms in the speech stream (with each ‘word’ being a CVCVCV combination of one of the consonant frames and three randomly selected vowels). The average frequency of occurrence was 1.2. Similarly, the BCA language contained 482 different BCA forms, with an average frequency of 1.2. Most forms occurred only once, and a small number of forms occurred up to a maximum of 4 times in the speech stream. In contrast, the original work on word segmentation by Saffran, Newport, and Aslin (1996) used 6 different word forms which each occurred 300 times in the speech stream. It thus seems unlikely that participants in the current experiment would learn this large set of different word forms. Rather, it seems plausible to assume that any phonotactic knowledge was due to the learning of consonant structures directly from the speech stream.

Audio streams were generated for the two languages using MBROLA (Dutoit et al., 1996), using the Dutch ‘nl2’ voice. The streams were synthesized with flat intonation and had total durations of 7 minutes (with average syllable durations of 232 ms). For the test phase, 36 novel three-syllable sequences were created for each language. The test items were made by combining consonant frames with vowel structures in such a way that there were no vowel repetitions in the item (i.e., each word had 3 different vowels), and that the exact combination of consonants and vowels had not occurred in either of the two familiarization languages. Care was taken to ensure that no factor other than the consonant probabilities in the artificial language could affect decisions in the test trials. Importantly, any remaining possible effect of vowel structure in the test items was neutralized by employing pairs of test trials in which the consonant frame for each item was the same in both trials, but


the vowel frame was switched between items. That is, for each trial (e.g., /tibaxo/ – /dugopa/) a counterpart was created in which vowel frames had been switched (e.g., /tuboxa/ – /digapo/). If participants based their decisions on the vowel frames of the test items, then chance-level performance should occur, since vowel frames were distributed evenly between ABC and BCA items. Any significant learning effect is thus due to the consonant structure of the words. Vowel frames for the test items were chosen in such a way that there was a minimal difference in cohort density (i.e., the number of Dutch words starting with the initial CV syllable) between the two items in a trial. For example, /ti-/ and /du-/ have comparable cohort densities (as do /tu-/ and /di-/). Test items were synthesized with the same settings as those for the familiarization stream. The complete set of test trials is given in Appendix D.
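The stream construction and the recurrence analysis described above can be sketched roughly as follows (illustrative code only; the segment labels, the random seed, and the frame-aligned word count are simplifications of the actual generation procedure):

    import random
    from collections import Counter

    # Consonant frames of the BCA language (Table 5.2); the ABC stream is built the same way.
    frames = [("d", "g", "p"), ("z", "k", "p"), ("b", "x", "t"),
              ("z", "g", "t"), ("b", "k", "s"), ("d", "x", "s")]
    vowels = ["a", "e", "o", "i", "u", "y"]

    random.seed(1)
    # 600 randomly selected frames -> 1800 consonants, each followed by a random vowel.
    consonants = [c for _ in range(600) for c in random.choice(frames)]
    syllables = [c + random.choice(vowels) for c in consonants]

    # Count how often complete CVCVCV 'words' (frame-aligned syllable triples) recur.
    words = ["".join(syllables[i:i + 3]) for i in range(0, len(syllables), 3)]
    counts = Counter(words)
    print(len(counts), sum(counts.values()) / len(counts))
    # Roughly 480 distinct forms occurring about 1.2 times each on average, so no
    # word form recurs systematically.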

Procedure The procedure was the same as in Experiment 1. Each stream lasted 7 minutes, and was presented twice with a 2-min pause in between. Total familiarization time was 14 minutes.

5.3.2 Results and discussion

The mean preference of each individual participant is plotted in Figure 5.2. A mixed logit model with subjects and items as random factors revealed a significant effect of training condition (p = 0.0168). The significant difference between groups indicates that participants’ preferences were affected by the statistical structure of the continuous language to which they had been exposed. Importantly, these results show that learners have the ability to induce phonotactics directly from the speech stream. Since participants in both conditions were tested on the same items, the different results cannot be reduced to effects from the native language, nor to learning during the test phase. An additional analysis tested whether the separate conditions were different from chance-level performance. Participants in the ABC condition showed no significant preference for either ABC or BCA items (p = 0.114). In contrast, participants in the BCA condition showed a significant preference for BCA words over ABC words (p < 0.001).

The results of the current experiment can be interpreted in different ways. It is possible that there was a general bias in favor of BCA words. If this is true, then participants in the ABC condition neutralized this bias as a result of exposure to the opposite statistical pattern (leading to chance-level performance during the test phase). An alternative explanation is that there

5 I thank Tom Lentz for this suggestion.


[Figure 5.2 here: individual participants’ preference (%) for BCA words (y-axis, 0-100), plotted by training condition (x-axis: ABC, BCA); condition means: ABC 54.9%, BCA 62.9%.]

Figure 5.2: Results for the ABC and BCA training languages. The circles indicate mean preferences for individual participants. Triangles indicate the mean score for each condition.

was no bias for either type of structure. In that case, participants in the ABC condition simply did not pick up the pattern from their training language, whereas participants in the BCA condition did. Either way, the significant difference between training conditions shows that participants were sensitive to the statistical structure of the familiarization language. The results show that participants were able to induce CXCXCX consonant roots from the continuous artificial speech stream, generalizing to novel words conforming to those roots in the test phase.


The results indicate that learners’ ability to learn consonant structures from continuous speech is stronger than previously shown. Compared to earlier studies (Bonatti et al., 2005; Newport & Aslin, 2004), the languages used in this chapter displayed smaller differences in consonant TPs (the current study: TPwithin = 0.5, TPbetween = 0.33; earlier studies: TPwithin = 1.0, TPbetween = 0.5). In addition, the languages in the current experiment contained randomly inserted vowels. The lack of systematically recurring CVCVCV word forms in the speech stream makes it unlikely that participants relied on a statistically learned lexicon to learn the word roots. Rather, it seems plausible that learners picked up phonotactics (consonant structures) directly from the continuous speech stream. This finding supports the Speech-Based Learning hypothesis, which states that phonotactics is learned from continuous speech, rather than from the lexicon. Furthermore, the test words did not occur in the familiarization stream. Despite the difficult learning conditions, participants were able to learn the phonotactic structure of the training language from continuous speech, and generalize these structures to previously unheard test words. While the learning effect is not particularly large, it should be noted that the design allows for the conclusion that the learning effect is truly due to the statistical dependency of consonants in the artificial languages, and not to effects from the native language. The learning effect can thus be attributed to the induction of novel phonotactics from continuous speech.

The languages were made up of segments from three different natural classes: voiceless obstruents, voiced obstruents, and dorsal obstruents (both voiced and voiceless). However, not all members of these natural classes were presented to the learners. Specifically, the labial fricatives /f/ and /v/ did not occur in the familiarization languages. The model of phonotactic learning that has been proposed in this dissertation, StaGe (see Chapter 2), assumes that the learner constructs feature-based generalizations on the basis of statistically learned phonotactic constraints. In this view, constraints on specific consonants that form a subset of a natural class result in the construction of a generalization which affects the natural class as a whole. That is, constraints affecting classes A (voiceless obstruents) and B (voiced obstruents) should generalize to novel segments from the same natural class (/f/ and /v/, respectively) which were not heard during familiarization. Experiment 3 tests whether participants perform feature-based generalization to novel segments from the same natural class.


5.4 experiment 3

Experiment 3 asks whether participants induce feature-based generalizations (phonotactic constraints targeting natural classes as a whole) from continuous speech. The continuous stream of the BCA language from Experiment 2 was used as familiarization language. A random control language serves to check that any effect found would not be due to L1 preferences. The control language consists of random combinations of consonants and vowels. Instead of testing whether participants pick up specific CXCXCX word roots, participants are tested on whether they learn the natural class structure of word roots. This is done by presenting them with test items which contain novel word roots. Specifically, each test item contains one novel consonant that was not heard during familiarization. The task of participants is to choose between novel items that conform to the natural class pattern of the training language and novel items that violate it. If participants construct feature-based generalizations on the basis of occurrences of specific consonants in the training language, then they should prefer test items in which a consonant has been replaced by a novel segment from the same natural class over test items in which a consonant has been replaced by a novel segment from a different natural class. In contrast, no such preference should be found for participants trained on the random control language. A significant difference between the training conditions would indicate that participants learn feature-based generalizations from the continuous speech stream. Importantly, since neither the legal nor the illegal novel segment occurred in the familiarization languages, a difference between conditions would indicate that participants relied on a phonotactic constraint at the level of natural classes, and not on a constraint affecting specific consonants in the familiarization language.

5.4.1 Method

Participants Forty native speakers of Dutch (34 female, 6 male) were recruited from the Utrecht Institute of Linguistics OTS subject pool (mean age: 22.3, range: 18-28). Participants received 5 euros for participation. Participants were assigned randomly to either the BCA or the Random condition.

6 The ABC language from Experiment 2 was not used in the current experiment, since the language produced no significant preference for either type of word structure. It was thus not expected that training participants on this language would lead to accurate feature-based generalizations.


Materials The familiarization stream was identical to that of the BCA condition in Experiment 2. The stream consists of a continuous sequence of consonant frames in which voiced obstruents (class B) are followed by dorsal obstruents (class C) and voiceless obstruents (class A). Vowel slots were filled with randomly chosen vowels. (See Table 5.2 for the specific materials used in the language.) A control condition (Random) was created in which all consonants and all vowels occurred at random. The continuous stream of the Random condition was of the same length as the BCA stream (1800 CV syllables). The same consonants and vowels were used as in the BCA language, but consonant slots and vowel slots were filled randomly with segments from the inventory.

All test items consisted of BCA frames in which either the initial consonant (from class B) or the final consonant (from class A) was replaced with a novel consonant. The novel consonant was either from the same natural class as the replaced segment (i.e., a legal substitution) or from a different natural class (i.e., an illegal substitution). The novel consonants were /f/ (a voiceless obstruent, belonging to class A) and /v/ (a voiced obstruent, belonging to class B). In substitutions of initial consonants, the legal item would be vCA (e.g., /vygate/, conforming to the BCA pattern), and the illegal item would be fCA (e.g., */fykape/, violating the BCA pattern). Conversely, in word-final position the legal item would be BCf (e.g., /dagefo/, conforming to the BCA pattern), and the illegal item would be BCv (e.g., */zakevo/, violating the BCA pattern). A preference for legal over illegal substitutions would indicate that participants had learned the BCA pattern. Trials were constructed such that both the legal and the illegal item had the same vowel structure, but different consonant structures (e.g., /vygate/ – /fykape/, /dagefo/ – /zakevo/). Any effect found would thus be due to the consonant structure of the test items, and not to vowel structure. Trials were also constructed in such a way that the items had comparable cohort densities. Participants received trials with substitutions in initial position and trials with substitutions in final position. The order of test trials was randomized. The order of presentation of legal and illegal items within trials was balanced within and across participants. The complete set of test trials is given in Appendix D. All materials were synthesized using MBROLA (Dutoit et al., 1996), using the Dutch ‘nl2’ voice.
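The construction of such test trials can be sketched as follows (a minimal illustration; the function name, the particular frames, and the vowel choices are hypothetical, and the actual items were additionally matched for cohort density):

    def substitution_trial(frame_legal, frame_illegal, position, vowel_triple):
        """Build one trial: a legal item (novel segment from the correct class) and
        an illegal item (novel segment from the wrong class), sharing the vowels.
        Frames are BCA consonant triples taken from Table 5.2."""
        def item(frame, novel_c):
            b, c, a = frame
            cons = (novel_c, c, a) if position == "initial" else (b, c, novel_c)
            return "".join(x + v for x, v in zip(cons, vowel_triple))
        # Initial position: the legal novel segment is /v/ (class B), the illegal one /f/;
        # final position: the other way around.
        legal_c, illegal_c = ("v", "f") if position == "initial" else ("f", "v")
        return item(frame_legal, legal_c), item(frame_illegal, illegal_c)

    print(substitution_trial(("z", "g", "t"), ("z", "k", "p"), "initial", ("y", "a", "e")))
    # -> ('vygate', 'fykape'), analogous to the example trial given above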

Procedure The procedure was the same as in Experiments 1 and 2. Total familiarization time was 14 minutes for both conditions (BCA and Random).


[Figure 5.3 here: individual participants’ percentage of correct responses (y-axis, 0-100), plotted by training condition (x-axis: BCA, Random); condition means: 50.8% and 49.2%, both near chance.]

Figure 5.3: Results for the BCA and Random training languages. ‘% correct’ indicates how often participants chose legal items over illegal items. The circles indicate mean preferences for individual participants. Triangles indicate the mean score for each condition.

5.4.2 Results and discussion

The percentages of correct responses (i.e., choosing legal over illegal items) for each participant are shown in Figure 5.3. A mixed logit model with subjects and items as random factors indicates that there was no significant effect of training condition, no significant effect of substitution position (initial or final), and no significant interaction between training condition and position. When analyzed separately, neither condition was significantly different from

chance. The experiment thus provides no evidence for the learning of feature-based generalization. That is, participants did not generalize constraints on consonants to which they had been exposed during familiarization to novel consonants from the same natural class. In addition, there appears not to have been any L1 preference for either items containing /f/ or items containing /v/, as indicated by chance-level performance in both the BCA and the control condition. It should be noted that this null result does not show that participants cannot induce feature-based generalizations, but rather that there is no evidence supporting the view that they can.

Chance-level performance in both conditions could be taken to indicate that participants had difficulty perceiving the difference between /f/ and /v/. If participants do not hear the difference between these two sounds, then they cannot distinguish between legal and illegal items in the test phase. There are two sources that could have caused perceptual confusion. First, the contrast between /f/ and /v/ does not surface reliably in spoken Dutch. Specifically, words with underlying /v/ are often pronounced as /f/. The strength of this effect varies between different regions in the Netherlands. Second, perceptual confusion could have been caused by the fact that participants heard synthesized speech, rather than natural speech. The use of synthesized speech inevitably comes with a loss of phonetic detail and thus may contribute to the difficulty in distinguishing between voiced and voiceless fricatives.

A second possible explanation of the null result is the complexity of the artificial languages used here, as compared to earlier studies (e.g., Bonatti et al., 2005; Newport & Aslin, 2004). As described in Experiment 2 (Section 5.3), the BCA language contained relatively small differences between within-word and between-word consonant transitional probabilities, and contained randomly inserted vowels. These factors may have made it difficult for participants to induce a robust feature-based generalization after 14 minutes of exposure.

Finally, one would have to consider the possibility that the hypothesis is wrong. It may be the case that phonotactic constraints on natural classes of consonants are not learnable from continuous speech input. It has been argued that different forms of computation operate on different sorts of input (Peña et al., 2002). Specifically, Peña et al. (2002) argue that statistical learning operates on continuous speech, while generalization mechanisms operate on word forms. In their study, it was found that generalizations were learnable only after the insertion of acoustic segmentation cues (brief silences) into the speech stream. It should be noted, however, that the study by Peña et al. (2002) focused on structural generalizations (e.g., AXC syllable patterns, where A predicts C across intervening syllables), and not on feature-based generalizations (where specific elements from class A would generalize to other, unheard elements from that same class). Further research is thus

needed to show whether or not feature-based generalizations are learnable from continuous speech. Possible follow-up experiments could focus on reducing the difficulty of learning the specific consonant structures, either through the use of less complex languages, or through increasing the exposure time. In addition, it would be worth investigating whether a different set of consonant classes could be used for familiarization and testing. Specifically, the current design, which focuses on distinguishing between consonants that are either legal or illegal at a certain position, requires that the novel segments can be accurately perceived. The challenge is thus to find a pattern of natural classes which is not likely to be perceptually confused, and at the same time is not heavily biased in terms of L1 phonotactics.

In addition, in order to support the claim by Peña et al. (2002) that generalization operates on segmented word forms, rather than on continuous speech, one could consider conducting an experiment in which the BCA frames from the current experiment would be separated by brief silences in the speech stream. However, the conclusions that could be drawn from such an experiment are still limited. If BCA patterns are learnable from segmented speech, then this could also be taken to show that a reduction in the complexity of the learning task (i.e., facilitating segmentation through the insertion of acoustic cues into the speech stream) enables the learning of feature-based generalizations. A stronger case for learning generalizations from word forms could be made if learning from continuous speech still fails after substantially increased exposure time.

5.5 general discussion

Human learners, both adults and infants, have been shown to be capable of learning novel phonotactic regularities from a set of isolated word forms (e.g., Onishi et al., 2002; Saffran & Thiessen, 2003). The current chapter focused on whether adult learners can induce novel phonotactic constraints when presented with a stream of continuous speech from an artificial language. In line with the Speech-Based Learning hypothesis, it was predicted that phonotactic constraints can be learned directly from the continuous speech stream, without reference to a statistically learned lexicon of word forms. This prediction was tested by creating artificial languages in which the formation of a lexicon would be rather difficult. Compared to earlier studies (e.g., Bonatti et al., 2005; Newport & Aslin, 2004), these languages showed increased complexity both in terms of their consonantal structure and in terms of their vocalic structure. Importantly, the combination of a relatively large number of consonant frames and a large set of intervening vowels resulted in languages

which either had a large vocabulary of word forms (162 words, Experiment 1) or had no recurring word forms at all (due to the random insertion of vowels, Experiment 2). This setup makes it likely that participants would induce phonotactics directly from the speech stream, rather than from a statistically learned lexicon. No evidence for phonotactic learning was found in Experiment 1. This was attributed to a general bias stemming from the participants’ native language. Experiment 2 showed that, when carefully controlling for such L1 factors, participants were able to learn phonotactics from continuous speech. The experimentally induced phonotactic knowledge of consonant frames generalized to novel word forms in the test phase.

The findings of Experiment 2 shed light on the types of input that are used by human learners as a source for the induction of phonotactics. As outlined in Chapter 1, the Speech-Based Learning (SBL) hypothesis states that continuous speech is the input for phonotactic learning. Using the mechanisms of statistical learning and generalization, learners detect phonotactic regularities in the input. Phonotactics is subsequently used for the detection of word boundaries in the speech stream, and thus contributes to the development of the mental lexicon. In contrast, the Lexicon-Based Learning (LBL) hypothesis assumes that phonotactic constraints are the result of abstractions over statistical patterns in the lexicon. The LBL hypothesis thus states that phonotactics is not learned directly from the continuous speech stream. In Experiment 2 it was found that adult learners can pick up phonotactic patterns directly from the speech stream. The results show that learners need not be presented with word forms to learn phonotactics. The experiment thus provides support for the SBL hypothesis. The findings are also in line with Bonatti et al. (2005), who suggest that learners learn the consonant structure of words from continuous speech, and do not necessarily extract a sequence of syllables to form a lexicon.

It should be noted that the experiment does not contradict earlier studies that have demonstrated phonotactic learning from isolated word forms. Rather, it is a demonstration that word forms are not required for the learning of phonotactics. This is an important finding, since infant language learners are confronted with continuous speech input and are faced with the task of developing a mental lexicon. The capacity to induce phonotactics from continuous speech could serve as a way of bootstrapping into lexical acquisition, since phonotactics can be used to hypothesize the locations of word boundaries in the speech stream. The finding thus provides support for the psychological plausibility of the computational learning approach that was proposed earlier in this dissertation. In Chapters 2 and 3 it was shown that phonotactic constraints can be learned from a corpus of transcribed continuous speech, and that such constraints provide a useful cue

for the segmentation of continuous speech. Here, it was shown that human learners can indeed induce phonotactics from continuous speech input.

In Experiment 3 the possible role of feature-based generalizations in phonotactic learning was investigated. By training participants on a subset of the consonants from a natural class, and testing them on novel consonants which were not presented during familiarization, the experiment examined whether the induced phonotactic knowledge generalized to novel segments from the same natural class. The experiment produced a null result, and thus provides no evidence that learners induce feature-based generalizations from continuous speech input. Possible explanations for the failure of the experiment include perceptual confusion of the novel consonants, and the structure of the artificial language, which was possibly too complex to result in a feature-based generalization during familiarization. Further research on this issue is needed to determine which types of phonotactic constraints can be learned from different sorts of input. With respect to the computational model presented in Chapter 2, it remains to be demonstrated that participants perform feature-based abstractions on the basis of statistically learned bigram constraints.

Several previous studies have argued that consonants and vowels are processed differently in phonotactic learning (Bonatti et al., 2005; Chambers et al., 2010). The results of Experiment 2 are compatible with this view. Consonant probabilities were learned across randomly occurring intervening vowels. That is, the occurrence of vowels between consonants did not hinder the learning of the dependencies between consonants. This provides support for an assumption that was made in Chapter 4. In Chapter 4, the computational model, StaGe, was applied to consonant bigrams that occurred with intervening vowels in the corpus. These intervening vowels were removed from CVC sequences before applying the model. This modification to the processing of input by the model was necessary to allow the model to capture restrictions on non-adjacent consonants. The findings of the current chapter provide psycholinguistic evidence that human learners can indeed process consonant sequences independently from intervening vowels. The modification that was made to the processing window of the model (allowing it to focus on non-adjacent consonants) thus seems to be justified by the current findings.

In Experiment 1, a bias was found for one type of segmentation over the other, independent of the training condition. Possible explanations of this bias were found in phonotactic properties of the native language, which may have affected the segmentation of the artificial language. Several previous studies have shown that properties of the native language can influence the segmentation of artificial languages (Boll-Avetisyan & Kager, 2008; Finn & Hudson Kam, 2008; Onnis et al., 2005). The fact that no artificial language is likely to be completely neutral with respect to the native language of the

participants calls for the use of proper control groups (Reber & Perruchet, 2003; Redington & Chater, 1996). That is, a learning effect cannot be reliably established by comparison of a single experimental condition to chance-level performance. Control groups should be employed before concluding that any significant result is due to the structure imposed by the artificial language. In the current chapter, effects due to interference from the native language were checked through the use of languages that have opposite statistical structures (such as in Experiments 1 and 2), or languages that have no structure at all (such as the random language in Experiment 3). As a consequence, it was possible to disentangle general biases from learning effects.

A final issue concerns possible formalizations of what exactly is learned by participants in artificial language learning studies such as those presented here. In Chapter 2 an explicit model for the induction of phonotactics from continuous speech was proposed. It should be noted that the data in this chapter support the learning of phonotactics from continuous speech, but the learning of feature-based generalizations from continuous speech remains to be demonstrated. While it is difficult to set up an experiment that provides conclusive evidence for a computational model, it is nevertheless interesting to speculate how learners could come up with general preferences for certain phonotactic patterns. What does StaGe predict with respect to the induction of phonotactics from the continuous artificial languages in the current chapter? Below, a sketch will be provided of how the model can be modified to induce phonotactic constraints from the artificial languages used in this chapter. The sketch should, however, not be taken as evidence for the model, nor as an account of the present findings. While some of the findings in this chapter support the ideas behind StaGe, further experimentation is needed to test detailed predictions that follow from the model.

StaGe categorizes bigrams into markedness constraints (*xy) and contiguity constraints (Contig-IO(xy)). This categorization relies on O/E thresholds that separate high-probability sequences from low-probability sequences. The markedness threshold (tM) and contiguity threshold (tC) can be set to categorize the consonant sequences in the artificial languages into sets of specific markedness and contiguity constraints. This requires calculating the O/E values for the consonant sequences in the language. In the current set of artificial languages, the ‘within-word’ transitional probability of 0.5 corresponds to O/E = 4.5, and the ‘between-word’ transitional probability of 0.33 corresponds to O/E = 3.0. It should be noted that, due to the relatively small number of consonant bigrams in the artificial language, both the between-word and within-word sequences are statistically overrepresented (that is, O/E > 1.0). Capturing the distinction between ‘high’ and ‘low’ probabilities thus requires that both thresholds be set relatively high (e.g., tM = 3.5, tC = 4.0).
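The categorization step can be sketched as follows (a simplified illustration, not the actual StaGe implementation; the expected-count formula and the threshold comparisons are only approximations of the definitions given in Chapter 2):

    from collections import Counter

    def induce_constraints(consonants, t_markedness=3.5, t_contiguity=4.0):
        """Sketch of StaGe-style constraint induction over a consonant stream:
        bigrams with low O/E become markedness constraints (*xVy), bigrams with
        high O/E become contiguity constraints (Contig-IO(xVy)); values in the
        band between the two thresholds induce no constraint."""
        bigrams = Counter(zip(consonants, consonants[1:]))
        unigrams = Counter(consonants)
        n = sum(bigrams.values())
        constraints = {}
        for (x, y), observed in bigrams.items():
            expected = unigrams[x] * unigrams[y] / n   # simplified expected count
            oe = observed / expected
            if oe <= t_markedness:
                constraints[(x, y)] = "*" + x + "V" + y                  # favors a boundary in xVy
            elif oe >= t_contiguity:
                constraints[(x, y)] = "Contig-IO(" + x + "V" + y + ")"   # resists a boundary in xVy
        return constraints

    # Applied to a consonant stream drawn from the BCA language, between-frame AB pairs
    # (O/E around 3.0) yield constraints such as *pVz, while within-frame pairs
    # (O/E around 4.5) yield constraints such as Contig-IO(zVk); in the ABC language
    # the same AB pairs are word-internal and come out as Contig-IO(pVz) instead.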


The result of this threshold configuration is that the model induces *xy constraints for sequences between frames, and Contig-IO(xy) constraints for sequences within frames. Importantly, with this configuration StaGe induces different constraints in different training conditions. For example, in the ABC language from Experiment 2, the model induces Contig-IO(AB) constraints which prevent the insertion of word boundaries into AB sequences (e.g., Contig-IO(pVz), Contig-IO(tVb)). In contrast, the statistical structure of the BCA speech stream is such that the model would induce *AB constraints (e.g., *pVz, *tVb), which favor the insertion of boundaries into these sequences. The phonological structure of the word frames used in the languages is such that StaGe’s generalization mechanism, Single-Feature Abstraction, would construct a more general constraint based on these specific constraints. For example, in the BCA condition in Experiment 2 the specific constraints against AB sequences (e.g., *pVz, *tVd, etc.) would trigger the generalization *x ∈ {p, t, f, s}; y ∈ {b, d, v, z}. Conversely, in the ABC condition the model predicts the induction of Contig-IO(x ∈ {p, t, f, s}; y ∈ {b, d, v, z}). The model thus predicts that the novel segments /f/ and /v/, which did not occur in the ABC and BCA training languages, are affected by this generalization.

An interesting prediction that follows from the model concerns the strength of feature-based generalizations. StaGe typically assigns lower ranking values to generalizations than to specific constraints. Indeed, when applying the model to the artificial languages in the way described above, the generalizations affecting the A and B classes as a whole end up at the bottom of the constraint hierarchy. While the model predicts that novel consonants are affected by natural class constraints (as a result of feature-based generalizations over specific consonants), it may be that these constraints are too weak (as reflected by the low ranking value) to produce any observable effects in experiments such as the ones reported in this chapter.

To conclude, the current chapter provides support for the Speech-Based Learning hypothesis, since it shows that human learners are able to induce novel phonotactics from continuous speech input. With respect to the precise learning mechanisms that are involved, and the types of input these mechanisms operate on, many issues require further research. Specifically, it remains an open issue whether feature-based generalizations are learnable from continuous speech input.

6

SUMMARY, DISCUSSION, AND CONCLUSIONS

When hearing speech, listeners are confronted with an acoustic signal which contains no clear indications of word boundaries. In order to understand spoken language, listeners thus face the challenge of breaking up the speech stream into words. The process of speech segmentation is aided by several types of linguistic cues which indicate word boundaries in the speech stream. Metrical cues, fine-grained acoustic cues, and phonotactic cues all contribute to the recognition of spoken words. These cues may be particularly important for language learners. Infants, who have not yet acquired the vocabulary of the native language, hear continuous speech in their environment, and need to develop strategies to extract word-like units from speech. Infants exploit metrical, acoustic, and phonotactic cues for the development of a lexicon from continuous speech input. The question arises how infants acquire these cues (which are language-specific) in the absence of a lexicon.

The focus of this dissertation has been on learning phonotactic cues for segmentation. Phonotactic constraints affect segmentation, since they define which sound sequences cannot occur within the words of a language (e.g., *pf in Dutch). The occurrence of such sequences in the speech stream therefore indicates the presence of a word boundary within the sequence. It was predicted that phonotactic constraints can be learned from continuous speech, using the mechanisms of statistical learning and generalization. Moreover, it was predicted that these constraints would subsequently be used by the learner for the detection of word boundaries in continuous speech. This hypothesis, which was referred to as the Speech-Based Learning hypothesis, was investigated using a combination of (i) computational modeling, providing a formal account of how phonotactics can be learned, (ii) computer simulations, assessing the relevance of phonotactic constraints for speech segmentation, (iii) simulations of human data, assessing the ability of the computer model to account for segmentation behavior, and (iv) artificial language learning experiments, testing the capacity of human learners (adults) to induce novel phonotactics from a continuous speech stream. Through the use of computational modeling and psycholinguistic experiments, this dissertation has aimed at providing a formal account of the induction of phonotactics for speech segmentation, which is supported by experimental data from both computational and human learners.


A particular focus of this dissertation has been on bridging the gap between models of speech segmentation and models of phonotactic learning. While both types of models address the learning of co-occurrence patterns, they make radically different assumptions about the input of phonotactic learning, and the level of abstractness of phonotactic constraints. Specifically, segmentation models have assumed that phonotactics is learned from continuous speech, while phonotactic learning models have assumed that phonotactic constraints are learned from the lexicon. In addition, segmentation models have assumed that phonotactic constraints refer to specific segments, whereas phonotactic learning models have assumed that phonotactic constraints target patterns of natural classes of segments, as defined by abstract phonological features. Evidence from infant studies indicates that phonotactic constraints on natural classes are learned before the infant’s lexicon has reached a substantial size. This led to the specific hypothesis that abstract phonotactic constraints are learned from continuous speech during early stages of phonological development. The approach taken in the dissertation thus combines the input assumption of segmentation models with the representational assumption from phonotactic learning models. Below, a summary is given of the dissertation (Section 6.1). The chapter then proceeds to critically assess the findings, and to provide suggestions for future work in this area (Section 6.2). Finally, some concluding remarks are made (Section 6.3).

6.1 summary of the dissertation

6.1.1 A combined perspective

In light of the above, this dissertation has addressed the issue of how language learners induce phonotactic constraints that will help them to break up the continuous speech stream into words. Phonotactic constraints define which sound sequences are allowed, and which sound sequences are disallowed within the words of a language. Occurrences of sequences in the continuous speech stream that are not allowed within words have the potential to indicate the location of a word boundary to the learner. While the relevance of phonotactics has been demonstrated for speech segmentation by adults (e.g., McQueen, 1998) and infants (e.g., Mattys & Jusczyk, 2001b), little is known about the acquisition of these constraints. The acquisition of phonotactics by infants poses a particularly interesting problem, since infants are only beginning to develop a mental lexicon, and thus seem to be able to infer the sound structure of words, without knowing the words themselves.

Through an investigation of the induction of phonotactic constraints for speech segmentation, the dissertation has aimed at providing insight into several related problems, which are typically studied in separate research areas.

First and foremost, the dissertation takes a computational perspective. This entails that the research has been mainly concerned with specifying the exact computations that are performed by learners when confronted with input data. This algorithmic view results in an explicit proposal for the case of phonotactic learning, and for the application of phonotactic constraints to the speech segmentation problem.

Second, the dissertation takes a linguistic perspective. That is, it was attempted to provide a linguistic characterization of the knowledge that is learned by infants, and could be used by infants to break up the speech stream. The linguistic characterization critically involves phonological features, which are used by the learner to construct more abstract phonotactic constraints. The dissertation addresses the roles of both segment-based and feature-based constraints in segmentation. The output of the computational learning model is represented as a set of phonological constraints, which are interpreted using the linguistic framework of Optimality Theory (Prince & Smolensky, 1993). The linguistic interpretation of the output allows for an exact description of the knowledge that is constructed as a result of learning, and it allows for a specification of the types of phonotactic knowledge that are relevant for segmentation. Importantly, the approach is data-driven: It focuses on knowledge that is learned on the basis of input data, and, in contrast to the traditional view of Optimality Theory, it does not make the assumption of an innate constraint set.

Third, the dissertation takes a psychological perspective. The model that has been proposed is based on learning mechanisms that have been shown to be available to human language learners. Moreover, the model is presented with transcribed utterances of continuous speech, and thus does not rely on a previously acquired lexicon of word forms for the induction of phonotactics. The lexicon-free approach to phonotactic learning is compatible with psycholinguistic studies showing that infants possess knowledge of the phonotactics of their native language at the age when they start acquiring a lexicon. In addition to psycholinguistic motivations behind the learning model, the dissertation provides new data on human phonotactic learning mechanisms. In a series of artificial language learning experiments with human participants, it was examined which types of phonotactic knowledge can be learned from a stream of continuous speech input.

In sum, this dissertation has aimed to contribute to our understanding of human phonotactic learning and speech segmentation. An explicit proposal was formulated which subsequently was evaluated in different types of empirical studies, involving both computer simulations and experiments with human participants.


The dissertation's theoretical focus is on the input that is used for phonotactic learning, and the mechanisms that are used in phonotactic learning. The input issue concerns the question of whether the learner is presented with a lexicon of word forms (either types or tokens) for phonotactic learning, or whether the learner is presented with utterances that contain a continuous stream of segments (with no apparent word boundaries). The issue of learning mechanisms deals with how the learner derives phonotactic knowledge from the input. Various sources of evidence point towards roles for two mechanisms: statistical learning and generalization. The former specifies how learners accumulate data about occurrences of specific structures in the input. The latter is concerned with learners' ability to derive more abstract linguistic knowledge, such as rules and constraints, from the data. This dissertation aims at specifying the nature of each mechanism individually, as well as explaining how the mechanisms interact. Specifically, the question is addressed how statistical learning can provide a basis upon which the learner can build phonotactic generalizations.

6.1.2 The learning path

Phonotactic learning, as discussed in this dissertation, can follow two different paths. The traditional assumption in studies on phonotactic learning has been that phonotactic constraints are induced from the lexicon (e.g., Frisch et al., 2004; Hayes & Wilson, 2008; Pierrehumbert, 2003). I have called this the Lexicon-Based Learning (LBL) hypothesis. From an acquisition point of view, the LBL hypothesis may be problematic. Infants have notably small lexicons (consisting of less than 30 words by the age of 8 months, Fenson et al., 1994). Nevertheless, there is evidence that infants have fine-grained probabilistic knowledge of the phonotactics of their native language (e.g., Jusczyk, Friederici, et al., 1993; Jusczyk et al., 1994) by the age of 9 months. It seems unlikely that infants induce their initial knowledge of phonotactics from such a small lexicon. Also, there is the question how infants learn their first words from the continuous speech stream. Several studies have shown that infants use phonotactics for speech segmentation (Mattys et al., 1999; Mattys & Jusczyk, 2001b). Knowledge of phonotactics may thus be used by infants to bootstrap into word learning (e.g., Cairns et al., 1997). If this is the case, then infants use phonotactics for the development of the mental lexicon, rather than relying on the lexicon for the learning of phonotactics. This dissertation has investigated the consequences of these considerations, and addressed what I have called the Speech-Based Learning (SBL) hypothesis. In this view, phonotactic constraints are learned from continuous speech input, and are subsequently used for the detection of word boundaries in the speech

stream. Phonotactics thus contributes to the development of the mental lexicon. The dissertation presents a computational model which implements the SBL hypothesis. Three different types of empirical studies in the dissertation are concerned with providing both computational and psycholinguistic support for the SBL hypothesis.

6.1.3 The proposal and its evaluation

Chapter 2 presents the computational model, StaGe (Statistical learning and Generalization), which is an implementation of the SBL hypothesis. Specifically, the chapter shows that a model based on learning mechanisms that have been demonstrated to be available to human language learners allows for the induction of phonotactic constraints from continuous speech. The model induces biphone constraints through the statistical analysis of segment co-occurrences in continuous speech (Frequency-Driven Constraint Induction), and generalizes over phonologically similar biphone constraints to create more general, natural-class-based constraints (Single-Feature Abstraction). The model learns phonotactic constraints of two different types: markedness constraints (*xy) and contiguity constraints (Contig-IO(xy)). The former type exerts pressure towards the insertion of boundaries into the speech stream, while the latter type militates against the insertion of boundaries. Conflicts between constraints are resolved using the OT segmentation model.

In Chapter 3, a series of computer simulations is presented which demonstrate the potential usefulness of the SBL hypothesis, as implemented by StaGe. The chapter focuses on the contributions of different learning mechanisms to segmentation by asking whether a crucial property of the model, feature-based generalization, improves the segmentation performance of the learner. Experiments 1–3 zoom in on the complementary roles of statistical learning and generalization in phonotactic learning and speech segmentation. Experiment 4 shows the development of the model's segmentation performance as a function of input quantity. The main finding of the chapter is that StaGe outperforms purely statistical models, indicating a potential role for feature-based phonotactic generalizations in speech segmentation. The chapter thus provides support for the SBL hypothesis, since it demonstrates that phonotactic learning contributes to better segmentation, and thus facilitates the development of the mental lexicon.

Chapter 4 examines whether StaGe can provide a learnability account of a phonological constraint, OCP-Place (restricting the co-occurrence of consonants sharing place of articulation), and of its effect on segmentation.

The chapter examines the output of the constraint induction procedure, applied to non-adjacent consonants in the speech stream, to see to what extent the induced constraint set reflects OCP-Place. In addition, the model is used to predict word boundaries in an artificial language from a study by Boll-Avetisyan and Kager (2008). The segmentation output is evaluated on its goodness of fit to human item preferences in the artificial language learning study. The chapter provides a detailed examination of the contents of the constraint set that is induced by the model, and of the ability of this constraint set to account for human segmentation behavior. Experiment 1 shows that StaGe (which uses feature-based generalization) has a better fit to human data than models that rely on pure consonant distributions. In addition, the model outperforms a categorical interpretation of OCP-Place. The success of the approach was found to be due to the mix of specific and abstract constraints. Experiment 2 addresses the input issue. It was found that a continuous speech-based learner has a fit to the human data comparable to a type-based learner, and a better fit than a token-based learner. The chapter shows that StaGe successfully learns constraints that resemble OCP-Place through abstractions over statistical patterns found in continuous speech. The model shows tendencies towards similarity avoidance, without employing a metric to assess the similarity between consonants in a sequence, and without relying on a lexicon of word forms for constraint induction. In addition, the simulations show that both specific and abstract phonotactic constraints are needed to account for human segmentation behavior.

Chapter 5 investigates whether human learners can learn novel phonotactic structures from a continuous speech stream. In a series of artificial language learning experiments, the chapter examines the psychological plausibility of the phonotactic learning approach. In addition, the chapter provides a test for the assumption that learners can induce phonotactic constraints on non-adjacent consonants from continuous speech (while ignoring intervening vowels). Experiment 1 shows that a bias (probably due to interference from L1 phonotactics) influences the segmentation of the artificial language by adult participants, and hinders the learning of the statistical structure of the artificial language. Experiment 2 shows that, when carefully controlling for potential biases, human learners can induce novel phonotactics from continuous speech. The experiment demonstrates that learners generalize phonotactics, learned from continuous speech input during familiarization, to novel items in the test phase. The results were obtained with languages that had only subtle differences in 'within-word' and 'between-word' probabilities, and the experiment thus provides new evidence for the strong capacity of human learners to induce consonant dependencies using a statistical learning mechanism.

However, in Experiment 3, no evidence was found for feature-based generalization to novel consonants (i.e., consonants not presented during training). This null result was possibly due to perceptual confusion, or due to the complexity of the training language.

6.2 discussion and future work

This dissertation has proposed a theoretical model, implemented as a computer program, and has presented a total of 9 empirical studies that assess the proposal. These studies use different methodologies: computer simulations, computer simulations linked to human data (taken from earlier studies), and phonotactic learning experiments with human participants (i.e., new human data). Four experiments assess the ability of the model to detect word boundaries in a corpus of unsegmented speech in a series of computer simulations (Chapter 3). Two experiments are concerned with linking the output of phonotactic learning simulations to results from artificial language segmentation by human learners (Chapter 4). Finally, three experiments present new data on the learning of artificial language phonotactics by human participants (Chapter 5). Before summing up the evidence for the SBL hypothesis, I will compare the proposed model to other models of constraint induction. While such models have provided learnability accounts of adult phonotactic knowledge, rather than infant phonotactic knowledge, and have been concerned with wellformedness judgments, rather than with segmentation, the models have some interesting similarities and differences. Below, some of these issues will be discussed. Suggestions will be provided for future work which might lead to a more direct comparison between these models and StaGe.

6.2.1 Comparison to other constraint induction models

Here, I will discuss StaGe in comparison to two earlier models of phonotactic constraint induction: the maximum-entropy learner of Hayes and Wilson (2008), and the feature-based model by Albright (2009). While StaGe was primarily designed as a model for the induction of phonotactic constraints for speech segmentation, and the models of Hayes and Wilson (2008) and Albright (2009) were meant to account for human wellformedness data, the models are similar in the sense that they induce abstract, feature-based phonotactic constraints. The models thus do not rely on a universal constraint set, but rather construct phonotactic constraints as a response to input data. It is therefore worthwhile to consider the differences and similarities between these models of constraint induction. One basic assumption that is common to all three models is that the learner has access to the feature specifications of the segments in the language's inventory, and that these specifications are used to learn phonotactic generalizations.


In addition, the three models assume similar constraint formats. Hayes and Wilson (2008) define markedness constraints of the form *[a][b][c]..., where [a], [b], and [c] are feature bundles. The scope of the constraints (i.e., the number of bundles in a sequence) can be varied, but large values (say, larger than 3) should be avoided, since those would result in an unmanageable search space. Both the model of Albright (2009) and StaGe have a biphone scope (although they are not in principle limited to this scope). All three models assume constraints that are sequential in nature. No word edges are encoded in the format of the constraints. However, the models can be made to produce positional constraints, simply by training them on positional data (e.g., onset sequences).

There are differences in how phonotactic generalizations come into existence. Hayes and Wilson (2008) assume a space of possible phonotactic constraints, from which constraints are selected and incorporated into the language-specific phonotactic grammar. The constraint space contains all logically possible constraints, given the feature specifications and the defined constraint scope. Using maximum-entropy learning, constraints are selected from the space that are maximally accurate with respect to the training data, and maximally general. In contrast, Albright's model is based on feature-based generalization over observed sequences in the training data (rather than from a pre-defined constraint space). The model decomposes words into biphone units, and assigns a wellformedness score to the word based on probabilities over combinations of natural classes (rather than over combinations of specific segments). StaGe constructs feature-based generalizations on the basis of statistical patterns in the input. The model relies on a statistical learning component which induces specific markedness and contiguity constraints from the data. More general (feature-based) constraints are derived from the specific constraints by abstracting over single-feature differences between specific constraints. The mechanism used by StaGe is similar to the model of Albright (2009) in the sense that generalizations are made only after the learner is presented with input data. In contrast, the model by Hayes and Wilson (2008) creates an a priori set of possible constraints, and selects constraints from this set in response to input data.

Albright (2009) shows that, when tested against English onset cluster data (taken from Scholes, 1966), his model outperforms the model of Hayes and Wilson in accounting for gradient judgments of sequences that are well-attested in the language. The two models had comparable performance in accounting for unattested sequences. Importantly for the current dissertation, Albright (2009) reports independent effects of segment-based and feature-based probabilities. He suggests that these effects may be due to two different levels of evaluation.

While the simulations in this dissertation address a different problem, namely the segmentation of continuous speech, similar results were found. That is, feature-based generalizations were found to improve the segmentation performance of the learner as compared to segment-based biphone probabilities (Chapter 3). Moreover, when the statistical induction thresholds of StaGe were set such that all biphones participated in generalization, the added value of generalization was lost. The findings in this dissertation are thus compatible with the view of Albright (2009) that segments and features may have independent contributions in a model of phonotactics. This independence is a crucial aspect of the architecture of StaGe: StaGe uses segment-based statistical learning, and adds feature-based generalization to statistically induced biphone constraints. In contrast to Albright's model, the effects that follow from specific biphone probabilities and natural class-based generalizations arise from a single set of constraints with varying levels of generality. Decisions on whether or not to insert a word boundary into the speech stream are based on evaluation of this single constraint set, and thus no distinct levels of evaluation are assumed.

A point for future research concerns the evaluation of the induced phonotactic constraint set. Throughout this dissertation I have tested the induced phonotactics for its contribution to solving the speech segmentation problem. This approach is motivated by the fact that phonotactics is used in speech processing and segmentation. Traditionally, however, phonologists have tested theories of phonotactics against wellformedness data (e.g., Scholes, 1966; Bailey & Hahn, 2001; Albright, 2009; Hayes & Wilson, 2008). A comparison of StaGe against models like those of Albright (2009) and Hayes and Wilson (2008) would require generating predictions about the wellformedness of nonwords from the model. A first step towards modeling wellformedness was taken in Chapter 4, where StaGe was used to predict human judgments on experimental items in an artificial language learning task. I speculate that two properties of the model would be useful for generating wellformedness scores for isolated word forms. The first property concerns the three distinct phonotactic categories that are predicted by StaGe. StaGe categorizes phonotactic sequences into categories of low, high, and neutral probability. A gradient distinction between nonwords could perhaps be derived from the model using this categorization. For example, when trained on Dutch onset clusters in the Spoken Dutch Corpus, the model induces markedness constraints for sequences that are illformed (e.g., *sr), contiguity constraints for sequences that are wellformed (e.g., Contig-IO(st)), and no constraints for sequences that have a questionable status (e.g., /sn/, which is legal, but relatively infrequent). A second property of the model that could be further exploited concerns the ranking values of the constraints.


Gradient predictions within the phonotactic categories could perhaps be derived by comparing respective rankings of constraints (see Coetzee, 2004). For example, when StaGe is trained on consonant clusters in Dutch continuous speech, the model predicts *nr ≫ *mr. Alternatively, the ranking values could be used as weights, rather than as strict rankings (see e.g., Hayes & Wilson, 2008; Legendre et al., 1990; Pater, 2009). The gradient prediction that would follow from these interpretations is that Dutch participants judge mrok to be a better nonword than nrok.1 These proposals for deriving gradient predictions from StaGe are preliminary, and would need to be worked out (and empirically validated). Conversely, it would be interesting to see whether earlier models of phonotactic learning can be modified to provide a plausible account for human segmentation data. Since phonotactics affects both the rating of nonwords, and the segmentation of continuous speech, a complete model of phonotactic learning should be able to account for both effects.
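One speculative way of combining these two properties is sketched below in Python. This is an expository sketch only, not part of StaGe or the SSP package; the constraint set, the ranking values, and the scoring scheme are invented for illustration. The idea is simply that biphones covered by contiguity constraints raise a nonword's score, biphones covered by markedness constraints lower it, and the size of each contribution is given by the constraint's ranking value.

# Hypothetical constraints with hypothetical ranking values.
CONSTRAINTS = {
    ("n", "r"): ("markedness", 120.0),   # *nr
    ("m", "r"): ("markedness",  80.0),   # *mr
    ("s", "t"): ("contiguity",  95.0),   # Contig-IO(st)
}

def wellformedness(word):
    """Sum signed ranking values over the biphones of a word: contiguity
    constraints add their ranking value, markedness constraints subtract
    theirs, and unconstrained biphones contribute nothing."""
    score = 0.0
    for x, y in zip(word, word[1:]):
        kind, rank = CONSTRAINTS.get((x, y), (None, 0.0))
        if kind == "markedness":
            score -= rank
        elif kind == "contiguity":
            score += rank
    return score

print(wellformedness("mrok") > wellformedness("nrok"))   # True

Under such a scheme the ranking *nr ≫ *mr indeed yields the prediction that mrok is a better nonword than nrok, but, as noted above, both the scheme and its parameters would need to be worked out and validated empirically.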

6.2.2 What is the input for phonotactic learning?

A primary concern of the dissertation has been the input issue: Is phonotactics learned from continuous speech or from the lexicon? All empirical studies in the dissertation bear on this issue. It was argued in Chapter 1 that infants would need to learn phonotactics from continuous speech in order to rely on phonotactics to detect word boundaries in continuous speech. In order to evaluate this proposal a theoretical model was outlined, from which testable predictions were derived. Importantly, in order for the learning approach to be successful, there should be a mechanism that allows learners to pick up information from the continuous speech stream. Moreover, the continuous speech stream should be rich enough for information to be derived from it using these mechanisms. I would like to stress the importance of showing an exact mechanism that can be used by learners to actually pick up the information that is hidden in the input. That is to say, linguistic information, present in the data, is rather useless if human learners do not possess the mechanisms that will allow them to acquire this information from the data. The computational model proposed in Chapter 2 shows that an approach to phonotactic learning from continuous speech can be defined and implemented. The model is based on learning mechanisms that are available to infants, and allow them to pick up information from the speech stream. The example provided in Chapter 2 indicates that, using these mechanisms, the learner can learn phonotactic regularities, as well as specific exceptions to these

1 A similar gradient prediction can be made with respect to segmentation. The model predicts that Dutch listeners would be faster at detecting the embedded Dutch word rok in finrok than in fimrok. (See McQueen, 1998, for a study of categorical effects of these sequences on segmentation.)

regularities, from the continuous speech stream (see Figure 2.5). The structure of continuous speech input thus seems to be rich enough to allow for the induction of phonotactics using the mechanisms of statistical learning and generalization. The empirical studies in Chapters 3–5 further explore this issue. The computer simulations in Chapter 3 show that the constraints that are learned from continuous speech are useful for segmentation. In Chapter 4, it was shown that the structure of consonant sequences across intervening vowels in the speech stream also appears to be rich enough to allow for the induction of OCP-Place, a constraint against the co-occurrence of consonants sharing place of articulation. That is, when applying the learning model to such consonant sequences (while ignoring intervening vowels), the resulting constraint set contains phonotactic generalizations that resemble OCP-Place. In addition, the continuous speech-based learner has comparable performance to a learner based on word forms (types) in accounting for human segmentation data. For the induction of OCP-Place (using the mechanisms of statistical learning and generalization), there thus appears to be no advantage of having access to a lexicon as an input source. Finally, Experiment 2 in Chapter 5 provides direct evidence that human learners can learn phonotactics from continuous speech input. While the results in Chapter 4 can be explained by both a speech-based and a lexicon-based learner, the results of the artificial language learning experiment in Chapter 5 can only be due to learning from continuous speech, since there were simply no recurring word forms in the speech stream.

The claim that phonotactics can be learned from continuous speech is thus supported by a learning model, as well as by three different types of empirical studies involving both computational and human learners. The implication of these findings is that studies on phonotactic learning should consider continuous speech as a possible source of phonotactic knowledge. Previous studies have accounted for phonotactic knowledge either by abstraction over patterns in the lexicon (Hayes & Wilson, 2008; Pierrehumbert, 2003) or by innate universal preferences (e.g., Berent, Steriade, Lennertz, & Vaknin, 2007). Specifically, some studies have concluded that the absence of a particular sequence in the lexicon implies that language learners never encounter such a sequence in their native language (e.g., Berent et al., 2007). However, sequences that are illegal within words are still likely to occur across word boundaries. As a consequence, virtually all sequences occur in the speech streams that learners hear. The findings in this dissertation show that occurrences of such sequences may provide a valuable source of data for the learning of phonotactic constraints.


With respect to the input issue, future studies could focus on pushing phonotactic learning from continuous speech to its limits. Specifically, studies could focus on the types of phonotactic generalizations that can and cannot be learned from continuous speech input. Importantly for the current proposal, it remains to be demonstrated that human learners can learn feature-based generalizations directly from the speech stream. A related issue is whether the phonological features that are used by models of phonotactic learning can be learned from continuous speech (e.g., Lin & Mielke, 2008). An open issue is when and how the lexicon comes into play in phonotactic learning. Assuming that phonotactic acquisition starts with learning from the continuous speech stream, it is possible that the lexicon at some point during development takes over, and becomes the most important source of phonotactic knowledge. That is, while phonotactics contributes to the development of the mental lexicon, the lexicon might provide an important source of phonotactic knowledge as soon as it has developed to a substantial size. The question is which types of constraints would become available as a result of learning from the lexicon. For example, constraints that explicitly link sequences to word edges might only become available after the learner starts learning from lexical items. Much work is needed to explain the development of phonotactic knowledge from infancy to adulthood. Explaining phonotactic development requires experiments documenting the types of phonotactic constraints that learners acquire at different ages, and computer models which can explain how different constraints can be learned at different stages of language acquisition.

6.2.3 Which mechanisms are involved in phonotactic learning and segmentation?

An important issue in this dissertation concerns the nature of the mechanisms by which language learners acquire phonotactics. Psycholinguistic studies show that at least two mechanisms are available to human learners: statistical learning and generalization. An important question is what the contribution of each of these mechanisms is to language acquisition in general, and to phonotactic learning in particular. While it has been widely acknowledged that both mechanisms are active during acquisition (e.g., Gómez & Gerken, 1999; Marcus et al., 1999; Toro et al., 2008; White et al., 2008), very little is known about how these mechanisms jointly contribute to the acquisition process, and how these mechanisms interact. The computational model presented in Chapter 2 provides a proposal for the interaction of the two mechanisms. The main claim is that statistical learning provides a basis for further linguistic generalizations. The learner collects data about co-occurrences, and generalizes to more abstract representations.

This view was explored throughout Chapters 3–5 by asking what the added value is of phonotactic generalizations that are based on statistical co-occurrences of specific segments in the speech stream. That is, the main question is whether feature-based generalization contributes to better phonotactic learning, and, as a consequence, to better speech segmentation. The computer simulations in Chapter 3 show that adding generalization to statistical learning improves segmentation performance. This indicates a previously unexplored potential role for the generalization mechanism in speech segmentation. The simulations in Chapter 4 show that adding generalization to statistical learning results in phonotactic knowledge which closely resembles abstract constraints that have been proposed in theoretical phonology (such as OCP-Place). The generalization mechanism captures phonological regularities that are missed (or incompletely captured) by pure statistical learners. In addition, adding generalization to statistical learning improves the model's fit to human segmentation data, thereby strengthening the claim that there is a role for the generalization mechanism in speech segmentation. The findings suggest that the speech segmentation problem is not solved using statistical mechanisms alone. In Chapter 5, evidence was provided for statistically learned phonotactics from continuous speech by human learners. Unfortunately, no evidence was found for the learning of feature-based generalizations from continuous speech by human learners. The null result of Experiment 3 in Chapter 5 leaves the demonstration of the induction of constraints on natural classes from continuous speech as a topic for further research.

This dissertation has aimed at connecting statistical approaches to language acquisition (e.g., Saffran, Newport, & Aslin, 1996) with more traditional linguistic theories of language acquisition (e.g., Chomsky, 1981; Prince & Smolensky, 1993). That is, a view has been proposed in which statistical learning provides a basis for the learning of abstract linguistic constraints. In contrast to earlier linguistic theories that have assumed abstract phonotactic constraints to be innate (e.g., Prince & Smolensky, 1993; Prince & Tesar, 2004), the model presented here derives abstract phonotactic constraints from input data. The gap between statistical patterns and abstract constraints is bridged by a generalization mechanism which constructs abstract constraints on the basis of statistical regularities in the input. The proposed interaction of statistical learning and generalization thus has the consequence that learners construct abstract linguistic knowledge through generalization over observed data. This view is in line with studies that aim to minimize the role of Universal Grammar in explaining language acquisition, while still acknowledging the existence of abstract representations and constraints (e.g., Hayes, 1999; Hayes & Wilson, 2008).
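To give a concrete feel for how this generalization step can build an abstract constraint out of specific ones, the toy fragment below (Python) illustrates the single-feature abstraction idea: two specific markedness constraints whose segments differ in exactly one feature value collapse into a constraint over the class spanning both. This is an expository sketch only; the segments, the reduced feature table, and the representation of the resulting class are invented for illustration, and in StaGe itself the generalized constraint targets the natural class defined by the shared features (see Chapter 2).

# A deliberately tiny, hypothetical feature table.
FEATURES = {
    "p": {"place": "labial",  "voice": "-"},
    "t": {"place": "coronal", "voice": "-"},
    "z": {"place": "coronal", "voice": "+"},
}

def differ_in_one_feature(a, b):
    """True if segments a and b differ in exactly one feature value."""
    return sum(FEATURES[a][f] != FEATURES[b][f] for f in FEATURES[a]) == 1

def generalize(c1, c2):
    """Merge two specific *xy constraints, e.g. ('p', 'z') and ('t', 'z'),
    into a class-based constraint when they differ in a single feature in a
    single position; return None otherwise."""
    (x1, y1), (x2, y2) = c1, c2
    if y1 == y2 and differ_in_one_feature(x1, x2):
        return ({x1, x2}, {y1})
    if x1 == x2 and differ_in_one_feature(y1, y2):
        return ({x1}, {y1, y2})
    return None

print(generalize(("p", "z"), ("t", "z")))   # ({'p', 't'}, {'z'})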

In addition, the learning model presented in this dissertation induces markedness and contiguity constraints with varying degrees of generality. The mixed constraint set, combined with the use of constraint interaction through strict domination, allows for the learning of linguistic generalizations, as well as exceptions to these generalizations. The model thus makes no principled distinction between 'exceptions' and 'regularities'. These notions have been incorporated into a single constraint set with varying levels of abstractness (see Albright & Hayes, 2003, for a similar proposal). The proposal is in line with a growing body of research showing that phonotactic constraints are encoded both at the level of segments and the level of features (e.g., Goldrick, 2004; Albright, 2009). StaGe makes this distinction as a consequence of combining segment-based statistical learning and feature-based generalizations based on those segments. The result is a single constraint set that contains constraints at both the segment and feature level. This allows the model to represent phonotactic regularities as well as exceptions to those regularities, in a single formal framework. A distinction between exceptions and regularities simply follows from the model through differences in the level of generality, and the respective ranking of constraints.

There are many issues in the current proposal that need further investigation. As mentioned above, the exact conditions under which feature-based generalizations are constructed by human learners remain unclear. In addition, several properties of the computational model remain to be tested with psycholinguistic experiments. A topic for further investigation could, for example, be the psychological plausibility of the statistical induction thresholds that are used by the model. It was shown in Chapters 2 and 3 that setting thresholds on statistical values allows for a categorization of bigrams into 'low', 'high', and 'neutral' probability categories. These categories have different effects on segmentation (pressure towards segmentation, pressure towards contiguity, and no pressure, respectively). While the exact values of these thresholds were found not to affect the claim that generalization over specific bigrams within these categories improves segmentation, the psychological status of induction thresholds is unknown. One way to address this issue would be to set up experiments in which the effect of sequences from different categories on segmentation is tested. For example, spotting a target word CmVCn in a sequence CkVClCmVCn is predicted to be facilitated by a markedness constraint *ClCm (since it would result in a phonotactic boundary which is aligned with the target word, and would thus result in faster responses). Conversely, spotting the target word would be hindered by a contiguity constraint Contig-IO(ClCm) (which would prevent the insertion of a word boundary before the onset of the target word, and, hence, would result in slower responses). It would be interesting to see at which thresholds (i.e., O/E ratios) listeners start to show such effects (provided that such effects are found at all).

Note that earlier models have also relied on statistical thresholds for segmentation (e.g., Cairns et al., 1997; Swingley, 2005). It is currently an open issue whether such thresholds have any cognitive status, or whether they are simply parameters that are required for the implementation of computational models of these sorts.

From a modeling perspective, it would be interesting to investigate how the model's learning of generalizations is affected by using different types of transcriptions and/or different sets of phonological features. For example, segment and feature transcriptions could be used that are closer to the phonetic properties of the speech signal, thereby reducing the a priori assumptions in the current implementation that the learner has already acquired the full segment and feature inventory of the native language. Another possible area of future exploration would be to investigate whether the combination of statistical learning and generalization, as proposed in the constraint induction model, could also be applied to the induction of other linguistic segmentation cues. Swingley (2005) suggests that statistical clustering of syllable n-grams could serve as a basis to bootstrap into the Metrical Segmentation Strategy (Cutler & Norris, 1988). While it is unclear what the exact generalization mechanism would look like in this case, the general view that statistical learning serves as a basis for generalization is in accordance with the predictions of StaGe. A similar view has been proposed for a different type of phonological learning by infants: White et al. (2008) propose that learning phonological alternations requires two different forms of computation. Infants first learn about dependencies between specific sounds in the input, and then group similar sounds that occur in complementary distribution into a single phonemic category.

A final issue with respect to the induction mechanism concerns the model's use of memory. The model derives phonotactic constraints from accumulated statistical information about biphone occurrences. The model thus assumes a perfect memory, which is a simplification of the learning problem. A more psychologically motivated memory implementation could be obtained by adding memory decay to the model: Biphones that were encountered a long time ago should be "forgotten". A similar approach has been proposed by Perruchet and Vinter (1998), who argue for the implementation of laws of associative learning and memory, such as temporal proximity, in segmentation models.

Two lines of research could provide more insight into the segmentation procedure that is used by human learners. First, more work needs to be done on specifying the relative contributions of specific and abstract constraints to segmentation. While segmentation studies in psycholinguistics so far have focused on one or other type of constraint, the results from Chapters 3 and 4 indicate that both may play an important role in segmentation.

This calls for experiments in which both types of constraints are considered (e.g., Moreton, 2002). A particularly interesting case would be to set up conflicts between specific and general constraints, similar to the conflicts that are predicted by the computational model (see e.g., Figure 2.5). This could also shed light on the conflict resolution mechanism that is used by the learner. While the current implementation of the computational model is based on strict constraint domination, in which a higher-ranked constraint neutralizes the effects of any lower-ranked constraints, one could also conceive of a version of the segmentation model that is based on weighted constraints (e.g., Legendre et al., 1990; Pater, 2009). In this case, multiple lower-ranked constraints can join forces in order to defeat a higher-ranked constraint.

A second line of future work on segmentation could focus on the role of context in speech processing. That is, to what extent do neighboring sequences (e.g., wx and yz in a sequence wxyz) affect the detection of a word boundary in a sequence (e.g., xy in wxyz)? Computational and psycholinguistic studies have indicated that both the probability of the sequence itself (xy) and that of its immediate neighbors (wx and yz) can affect speech segmentation. Interestingly, the OT segmentation model has the potential to capture both types of effects. That is, if the model is presented with a sequence wxyz, then the model will only insert a word boundary if (a) there is a constraint *xy, and (b) *xy is ranked higher than *wx and *yz. This way, both the wellformedness of xy and the wellformedness of its neighbors play a role in segmentation. In contrast, earlier statistical segmentation models implemented either strategies based on the probability of xy ('threshold-based segmentation', e.g., Cairns et al., 1997), or strategies based on the probability of xy relative to its neighbors ('trough-based segmentation', e.g., Brent, 1999a). The OT segmentation model is thus the first model to integrate these two segmentation strategies into a single evaluation mechanism. Future experimental studies could investigate whether human learners segment speech in a similar integrated fashion.
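The integrated decision rule just described can be summarized in a few lines of Python. This is an expository sketch rather than the SSP implementation, and the markedness constraints and ranking values below are invented for illustration; the point is only that the decision about xy depends both on *xy itself (the threshold-like part) and on how *xy is ranked relative to *wx and *yz (the trough-like part).

# Hypothetical markedness constraints with hypothetical ranking values.
MARKEDNESS = {
    ("k", "t"): 140.0,   # *kt (strong)
    ("t", "r"): 20.0,    # *tr (weak)
}

def boundary_in_xy(w, x, y, z):
    """Insert a boundary inside xy only if a constraint *xy exists and
    outranks any markedness constraints on the neighboring biphones."""
    rank_xy = MARKEDNESS.get((x, y))
    if rank_xy is None:
        return False                       # no *xy: no pressure to segment
    rank_wx = MARKEDNESS.get((w, x), float("-inf"))
    rank_yz = MARKEDNESS.get((y, z), float("-inf"))
    return rank_xy > rank_wx and rank_xy > rank_yz

print(boundary_in_xy("a", "k", "t", "a"))   # True:  *kt outranks its neighbors
print(boundary_in_xy("k", "t", "r", "a"))   # False: *tr is dominated by *kt on wx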

6.2.4 The formation of a proto-lexicon

One property of StaGe that became apparent in the simulations in Chapters 3 and 4 is that the model has the tendency to undersegment. That is, while the model makes few errors, it also misses a substantial number of word boundaries. The consequence is that the stretches of speech that are found between consecutive boundaries in the speech stream tend to be units that are larger than words. (See Appendix C for some examples of the output produced by StaGe.) Indeed, developmental studies have argued that the initial contents of the child’s lexicon do not correspond to adult word forms, but rather contain larger units that need to be broken down further into

word-sized units (e.g., Peters, 1983). Frequency effects of larger-than-word units have also been found in adult speech processing (Arnon & Snider, 2010), indicating that learners may indeed store multiword phrases. The 'proto-words' that were the result of the model's segmentation of the artificial language in Chapter 4 were used as a predictor for human judgments on experimental items. The words produced by StaGe were a better predictor of human item preferences than words produced by statistical learning models, and words produced by applying a categorical segmentation based on OCP-Place. This provides a first indication that the proto-words that are generated by the model could be a plausible starting point for lexical acquisition. More work needs to be done to determine whether the output produced by StaGe can indeed be used to bootstrap into lexical acquisition.

Future work on StaGe's output could be done along two lines. First, it would be interesting to investigate how closely the proto-words that are created by StaGe resemble the initial contents of the infant's vocabulary. A close resemblance would provide strong support for the proposed view that phonotactics facilitates the development of the mental lexicon. Second, it would be interesting to see whether contents of the proto-lexicon can be decomposed into a more adult-like lexicon of word forms. This could be done through a combination of bottom-up segmentation based on phonotactics, and top-down segmentation based on recurring patterns in the proto-lexicon (Apoussidou, 2010). In this view, phonotactic learning provides the building blocks for lexical acquisition, which subsequently requires refining the lexical entries in the proto-lexicon.

Another line of research would be to implement multiple bottom-up segmentation cues (e.g., phonotactic cues and metrical cues) into a single model in order to create a more accurate proto-lexicon. That is, other segmentation cues potentially provide a means of further breaking up larger chunks into actual words. Indeed, the use of multiple segmentation cues in a single model has been found to improve segmentation performance (Christiansen et al., 1998), and studies with both adults and infants show that various cues affect segmentation (Mattys et al., 1999, 2005). An interesting open issue is whether such additional cues would be able to detect exactly those boundaries that are missed by StaGe, thereby increasing the total number of boundaries that are detected by the learner, or whether such cues would overlap with phonotactic cues, and would thus contribute simply by providing a more robust cue for the discovery of proto-word boundaries.


6.2.5 Language-specific phonotactic learning

All studies in this dissertation used Dutch language data, and the experiments were conducted with participants that had Dutch as a native language. It would be interesting to conduct simulations using speech corpora from different languages. (See Appendix A for software that could be used for this purpose.) Note that phonotactics may not provide a useful segmentation cue in all languages. In general, more work needs to be done on establishing the contribution of phonotactics to the speech segmentation problem for a variety of languages. Of specific interest would be to investigate what the different units are that are targeted by phonotactic constraints in different languages. For example, a continuous speech stream in Hawaiian does not contain consonant clusters, due to the absence of codas and complex onsets in the language (e.g., Pukui & Elbert, 1986; Adler, 2006). For the case of Hawaiian, phonotactic constraints for segmentation cannot be based on consonant clusters. Listeners of such a language potentially develop constraints for segmentation that target sequences of syllables, or sequences of non-adjacent consonants. It is an open issue to what extent StaGe can account for speech segmentation in languages that have a phonotactic structure that is fundamentally different from Dutch.

A related issue for future research would be to apply the model to the problem of second language (L2) acquisition. It has been shown that learners of a second language acquire the phonotactic properties of the target language (Weber & Cutler, 2006; Trapman & Kager, 2009; Lentz & Kager, 2009). It seems worthwhile to investigate whether StaGe can provide a learnability account of second language learners' knowledge of L2 phonotactics. Such an enterprise could start from the assumption that the learner's L1 phonological knowledge is transferred and used as the initial state for L2 acquisition (e.g., Escudero & Boersma, 2004). The simplest approach to simulating L2 acquisition with StaGe would be to present the model with L1 input to which different amounts of L2 input have been added (e.g., Flege, 1995). Note that this would require L1 and L2 corpora which are comparable in terms of level of transcription. The result of adding L2 input would be a statistical database with accumulated statistics over L1 and L2 input. This would lead to a phonotactic grammar that is different from the L1 grammar, but also different from the L2 grammar. The 'mixed' statistics lead to different biphone constraints, which trigger different generalizations. Segmentation predictions can be derived from this mixed L1+L2 model, and these predictions can be tested experimentally. For example, a comparison can be made between the model trained on L1-only, L2-only, or L1+L2 data. The latter should provide the best fit to experimental data obtained from L2 learners.


An alternative modeling approach could be to train two independent models (for L1 and L2), and to compare harmony scores for segmentation candidates in both languages. Such a view would be more compatible with different language modes for L1 and L2 (e.g., Grosjean, 2001).
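As a very rough sketch of the first, mixed-input approach, the fragment below (Python; placeholder corpora and invented names, not part of the SSP package) simply pools biphone counts from an L1 and an L2 corpus, with the share of L2 input as a free parameter, before O/E values and constraints would be computed from the pooled counts.

from collections import Counter

def biphone_counts(utterances):
    """Count biphones in a list of transcribed utterances (strings)."""
    counts = Counter()
    for u in utterances:
        counts.update(zip(u, u[1:]))
    return counts

def mixed_counts(l1_utterances, l2_utterances, l2_share=0.2):
    """Pool L1 and L2 counts, scaling the L2 corpus so that it makes up
    roughly the requested share of the input."""
    l1 = biphone_counts(l1_utterances)
    l2 = biphone_counts(l2_utterances)
    scale = l2_share * sum(l1.values()) / max(1, sum(l2.values()))
    return l1 + Counter({pair: n * scale for pair, n in l2.items()})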

6.2.6 A methodological note on the computational modeling of language acquisition

The central enterprise of this dissertation has been the computational modeling of language acquisition. While the dissertation has been mainly concerned with the problem of learning phonotactics, and the segmentation of continuous speech, the approach taken here could prove useful for other studies focused on unraveling the mechanisms of human language acquisition. In particular, I would like to stress the importance of using both computational learners and human learners for the study of language acquisition. I believe that a psycholinguistic perspective is essential for any model that aims to explain human learning mechanisms. Therefore, the general approach taken here was to create a computational model based on psycholinguistic evidence of infant learning mechanisms. The predictions of the model can be tested using computer simulations. While computer simulations by themselves do not provide a strong test for the psychological plausibility of the model, they are very useful to provide insight into the learnability of linguistic knowledge, and into the contributions of different mechanisms to solving the learning problem. For example, the computer simulations in this dissertation have shown that phonotactic constraints are learnable from continuous speech, and that feature-based generalization improves the accuracy of the segmentation model.

Importantly, the model should also be tested for its psychological plausibility. There are two ways to address this issue. The first and most common approach is to test the model against human data, such as a set of wellformedness judgments. In Chapter 4, a similar approach was taken by testing different models for their fit against human segmentation data. Note, however, that a potential weakness of this approach is that computational models typically have a large number of parameters that can be modified in order to improve the fit of the model to the data. That is, a potential criticism of computational modeling is that, with the right parameter settings, any phenomenon or data set can be modeled. In addition to reporting such 'best-fit' results, it would be insightful if modelers would report a more complete picture of the model's performance, either through testing the complete parameter space (e.g., the ROC curves in Experiment 2, Chapter 3) or by including 'worst-fit' numbers.


I propose that a stronger test for the psychological plausibility of a computational model is to derive novel predictions from the model, and to set up new experiments that are specifically aimed at testing these predictions (Kager & Adriaans, 2008). The artificial language learning paradigm provides an important methodological tool for the study of human learning mechanisms. By training participants on artificial languages with particular constraints and testing their acquired knowledge of the language, artificial language learning makes it possible to investigate the language learning abilities of infants and adults in a highly controlled environment. This controlled environment may, however, come at the expense of the naturalness of the input (e.g., Johnson & Tyler, 2010). This is why the combination of computational modeling and artificial language learning may prove to be especially fruitful: Computational models provide explicit descriptions of learning mechanisms, and can be trained on realistic data from natural languages (such as accurately transcribed speech corpora). Artificial language learning can provide a test for the psychological plausibility of these mechanisms by presenting human language learners with highly controlled, but simplified, data. In sum, the computational modeling of language acquisition would benefit from a cycle that includes both computer modeling and psychological experiments.

6.3 conclusions

This dissertation has been concerned with the mechanisms that allow language learners to induce phonotactic constraints for speech segmentation. It was shown that phonotactics can be learned from continuous speech by combining mechanisms that have been shown to be available to infant and adult language learners. These mechanisms, statistical learning and generalization, allow for the induction of a set of phonotactic constraints with varying levels of abstraction, which can subsequently be used to predict the locations of word boundaries in the speech stream. The proposed combination of statistical learning and feature-based generalization provides a better account of speech segmentation than models that rely solely on statistical learning. This result was found both in computer simulations and in simulations of human segmentation data. In addition, it was shown that human learners can learn novel phonotactic constraints from the continuous speech stream of an artificial language. The dissertation advocates an approach to the study of language acquisition that focuses on both computational and human learners. By integrating evidence from computational and human learners, models can be developed which are both formally explicit and psychologically plausible. Such converging evidence brings us closer to a thorough understanding of the mechanisms that are involved in language acquisition.

A

THE STAGE SOFTWARE PACKAGE

This manual describes how to use the StaGe Software Package (SSP). The software can be used to run phonotactic learning and speech segmentation simulations, as described throughout this dissertation (see also Adriaans & Kager, 2010). In addition, it allows the user to train models on new data sets, with the option of using other statistical measures, thresholds, and a different inventory of phonological segments and features. The package implements several learning models (StaGe, transitional probability, observed/expected ratio) and segmentation models (Optimality Theory, threshold-based segmentation, trough-based segmentation). Please see Chapters 2 and 3 for details about these models. The remainder of this manual describes how to run simulations with SSP.

getting started

The software (SSP.zip) can be downloaded from:

http://www.hum.uu.nl/medewerkers/f.w.adriaans/resources/

The program uses configuration files, which specify how to run the software. In addition, the user will need to provide a training/test corpus and an inventory file [1] specifying phonological segments and features. The contents and required format of these files are described in the next sections.

Contents of the package

• ssp.pl - The main script, written in Perl.
• Input - This is the folder where the corpus and inventory files should be placed.
• Modules - This contains several Perl modules which are used by the main script.
• Output - This is the folder where the program writes its output to. A new folder will be created for each configuration file that is used to run the program.
• ssp manual.pdf - This manual.
• Several configuration files (.txt) have been included which can be used to replicate the simulations reported in the dissertation, and which can be modified to create new simulations.

[1] Examples of configuration files and an inventory file for Dutch are included in the package. The Spoken Dutch Corpus (Corpus Gesproken Nederlands) is distributed by the Dutch HLT Agency (TST-centrale; http://www.inl.nl/tst-centrale) and can be ordered there.


Running the program

To start the program, use Command Prompt (Windows) or Terminal (Unix, Mac OS X) to go to the SSP folder and type:

perl ssp.pl configuration.txt

where the configuration file ‘configuration.txt’ can be replaced by any other file name. Please note that running Single-Feature Abstraction may take several minutes, while the segmentation of wxyz sequences (‘segmentation using context’) may take up to several hours, depending on the speed of your computer.

input files

All input files (corpus files, feature definitions) should be placed in the Input folder.

Corpora

Input to the program consists of the following:

• Training file - A training file is a single text (.txt) file with phonemic transcriptions. Each segment should be represented by a single ASCII character (such as in the CELEX DISC alphabet). Each line contains a sequence of segments. This can either be a single word or multiple words glued together. The sequences are optionally preceded by a digit (utterance IDs) and optionally followed by a digit specifying a (token) frequency count. The entries should be tab-separated.

Some examples:

(a) A corpus of transcribed utterances of continuous speech (with utterance IDs):

5 dizKnv@rstElbardor@tsKnsxufpot@
6 d}s@hKstadnyOps@nlaxlaxst@stAnt
7 laG@rdAnsokAni
...

(b) A lexicon of isolated words (with token frequencies):

dQg 2061
k{t 1195
...

(c) Sequences of segments:

mademobetumopodemopotubipotumo
...

• Test file - The trained model can be used to predict the locations of word boundaries in a test set. Similar to the training file, a test file consists of sequences of segments, optionally preceded by a digit to identify the utterance.


• Correct file - A ‘correct’ file can be specified to evaluate the model’s segmentation of the test set. In order to link the correct segmentation of an utterance to the predicted segmentation of the test set, the user can specify utterance ID digits in both files, so that the program knows which correct utterance corresponds to which predicted utterance in the test set. For example:

6 d} s@ hKstad ny Op s@n lax lax st@stAnt    (Test file)
6 d}s @ hK stad ny Op s@n lax laxst@ stAnt    (Correct file)

If no utterance IDs have been specified, the program will assume identical orderings in both files (i.e., the first utterance in the test file corresponds to the first utterance in the correct file, etc.).

Segment/feature inventory

A tab-separated text file is used to define the segment inventory with corresponding feature values. Each line contains one such specification:

Segment1   Feature1=Value1   Feature2=Value2   ...
Segment2   Feature1=Value1   Feature2=Value2   ...
...

For example:

p   Cons=1   Voice=0   Place=Lab   ...
b   Cons=1   Voice=1   Place=Lab   ...
...

Note that such an inventory only needs to be specified when using StaGe. The statistical learning models (TP, O/E) do not require inventory files. The user is free to use different names for features and feature values (as long as they are consistent within the file) [2]. There is no restriction with respect to possible feature values. That is, any two different values for a single feature count as a feature difference (e.g., Voice=0 ≠ Voice=1; Voice=+ ≠ Voice=-; Voice=abc ≠ Voice=cba, etc.).

Some restrictions hold with respect to feature definitions. First, the order in which the features are specified for each segment should be fixed. Second, the file should not contain multiple segments with identical feature specifications. Segments with such zero-feature differences should be replaced by a single (segment) symbol representing that feature specification. Third, the model only processes constraint pairs that have feature vectors of the same length. Each segment should thus be fully specified. Note that the user may use different features for consonants and vowels. In this case, consonant and vowel vectors will have different lengths and are not compared to each other. The consequence is that the model only compares CV constraints to CV constraints, CC constraints to CC constraints, etc. An example of such a feature definition is included in the software package (inventory-dutch.txt). Finally, make sure that all segments that occur in the corpus are defined in the inventory.

[2] With the exception of the symbol ‘?’, which is used for program-internal purposes.
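As a purely illustrative sketch (the segments, feature names, and values below are arbitrary and are not the definitions used in the dissertation; the actual Dutch definitions shipped with the package are in inventory-dutch.txt), a minimal consonant inventory in which every segment is fully specified for the same features, in the same order, could look as follows:

p   Cons=1   Son=0   Cont=0   Voice=0   Place=Lab
b   Cons=1   Son=0   Cont=0   Voice=1   Place=Lab
t   Cons=1   Son=0   Cont=0   Voice=0   Place=Cor
s   Cons=1   Son=0   Cont=1   Voice=0   Place=Cor

In this hypothetical inventory, p and b differ only in Voice, the kind of single-feature difference that is relevant for StaGe's Single-Feature Abstraction (Chapter 2), whereas p and s differ in both Cont and Place.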

configuration files

A configuration file is a text file specifying how to run the software. The user may modify parameters in configuration files in order to apply models to new data sets, or may alter the way in which a model processes its input, when it induces constraints, etc. [3] (See also the example configuration files that are included in the package.)

Table A.1 shows some basic parameters for the program, such as a specification of which learning model is to be used (MODEL); which training and/or test data the model should be applied to (TRAININGSET, TESTSET); where to find the correct segmentations for the test set (CORRECT); and, if StaGe is used, which feature set should be used (INVENTORY).

Several input processing parameters are given in Table A.2. The user controls what type of sequences are extracted from the corpus (TRAININGPATTERN). The default sequence is a simple biphone pattern (TRAININGPATTERN = xy). The software also supports more sophisticated processing windows, such as non-adjacent dependencies. For example, the model can be trained on non-adjacent consonants by specifying a larger, feature-based pattern (TRAININGPATTERN = [syll=0][syll=1][syll=0]) [4]. When such a larger window is used for training, the user needs to specify the positions for which the statistical dependency is to be computed. In the current example, a dependency between the first and third positions needs to be specified (i.e., ignoring the vowel in the second position): DEPENDENCY = 1-3. The user may specify whether token frequencies (which are optionally included in the training set) are to be used in training the model (COUNT). It should be noted that the resulting model is always ‘biphone’-based. That is, only the two dependent elements are encoded in the model (e.g., *CC rather than *C(V)C). The constraints are applied directly to the test set (i.e., the test set is not pre-processed in any way). Therefore, when non-adjacent constraints are used, the user may want to filter the test set accordingly (e.g., removing all vowels from a test set containing CVCVCV... sequences, resulting in a test set containing only CCC... sequences). During testing the model either uses context or not (CONTEXT). That is, the test set is processed using either an xy (i.e., CONTEXT = no) or a wxyz (i.e., CONTEXT = yes) window.

Several additional (optional) parameters control more technical aspects of the learning and segmentation models (see Tables A.3 and A.4). The user can change the statistical learning formula that is used to create the statistical distribution (SLFORMULA) using any of the measures described in Table A.5; change the statistical measure that is used to compute constraint ranking values (RANKING); change the constraint induction thresholds (THRESHOLD M, THRESHOLD C) for StaGe; and change the segmentation thresholds (THRESHOLD OE, THRESHOLD TP) for the threshold-based statistical learning models. Finally, a default boundary probability (BOUNDARYPROB) can be specified for missing biphones, random baselines, etc.

[3] The flexibility provided by the various parameters in SSP is intended to encourage and facilitate research on phonotactic learning and speech segmentation. It is the responsibility of the user, of course, to decide whether a particular new configuration makes any theoretical/empirical sense.
[4] The features that are used in the pattern are defined by the user in the segment/feature inventory. A pattern may also involve multiple (comma-separated) features (e.g., [syll=0, son=0, cont=1]).
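To make the parameter descriptions concrete: the exact syntax of a configuration file is defined by the example files included in the package, but, assuming a simple ‘PARAMETER = value’ line format and hypothetical file names (corpus-train.txt, corpus-test.txt, corpus-correct.txt), a basic StaGe simulation on biphones might be specified along the following lines:

MODEL = StaGe
TRAININGSET = corpus-train.txt
TESTSET = corpus-test.txt
CORRECT = corpus-correct.txt
INVENTORY = inventory-dutch.txt
TRAININGPATTERN = xy
COUNT = type
CONTEXT = no
THRESHOLD M = 0.5
THRESHOLD C = 2.0

MODEL and TRAININGSET are always required, and INVENTORY is required whenever StaGe is used; the remaining parameters fall back on the defaults listed in Tables A.1 to A.4 when they are omitted.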

output files

For each simulation, a new folder will be created within the Output folder. This folder carries the name of the configuration file that was used in the simulation and, depending on the particular configuration that was used, will contain one or more of the following output files:

• Configuration-model.txt - contains either the constraint set with ranking values (StaGe), or biphones with associated transitional probabilities or observed/expected ratios (TP, O/E).
• Configuration-decisions.txt - contains all sequences that were processed in the test set (i.e., either xy or wxyz sequences). For each sequence, a classification is given, which is the probability of a boundary appearing in the middle of the sequence. That is, a classification of ‘0’ means that a boundary is never inserted, the classification ‘1’ means that a boundary is always inserted, and values between 0 and 1 indicate the probability of the insertion of a boundary into the sequence in the test set. The final column displays how often the sequence appears in the test set.
• Configuration-segmentation.txt - contains the segmentation of the complete test set, as predicted by the model.
• Configuration-results.data - contains evaluation metrics (hit rate, false alarm rate) reflecting the model’s ability to predict word boundaries in the test set.

More precisely:

Predicted segmentation    Correct segmentation    Label
‘x.y’                     ‘x.y’                   True Positive
‘x.y’                     ‘xy’                    False Positive
‘xy’                      ‘xy’                    True Negative
‘xy’                      ‘x.y’                   False Negative


Hit rate (H):

H = TruePositives / (TruePositives + FalseNegatives)    (A.1)

False alarm rate (F):

F = FalsePositives / (FalsePositives + TrueNegatives)    (A.2)

• Configuration-tableaus.txt - contains OT tableaus for each sequence in the test set. The optimal candidate is indicated by an arrow ‘->’ and decides the classification (boundary probability) of the sequence. (Only applies to StaGe.)
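The evaluation behind Configuration-results.data can be illustrated with a short sketch. The Python code below is not part of SSP; it is a minimal illustration, for a single made-up utterance pair, of how the labels in the table above and the rates in (A.1) and (A.2) follow from a predicted and a correct segmentation (spaces mark word boundaries; the function names and the example strings are invented for this illustration).

# Minimal illustration (not part of SSP) of the evaluation in (A.1)/(A.2).

def boundaries(segmented):
    """One True/False decision per transition between adjacent segments.
    Spaces mark word boundaries, e.g. 'd@ lAmp fil' has boundaries after
    the 2nd and the 6th segment."""
    decisions, seen_segment, pending_boundary = [], False, False
    for ch in segmented:
        if ch == " ":
            pending_boundary = True
        else:
            if seen_segment:
                decisions.append(pending_boundary)
            seen_segment, pending_boundary = True, False
    return decisions

def hit_and_false_alarm_rate(predicted, correct):
    """Hit rate H (A.1) and false alarm rate F (A.2) for one utterance."""
    tp = fp = tn = fn = 0
    for p, c in zip(boundaries(predicted), boundaries(correct)):
        if p and c:
            tp += 1          # 'x.y' predicted, 'x.y' correct
        elif p and not c:
            fp += 1          # 'x.y' predicted, 'xy' correct
        elif not p and not c:
            tn += 1          # 'xy' predicted, 'xy' correct
        else:
            fn += 1          # 'xy' predicted, 'x.y' correct
    h = tp / (tp + fn) if tp + fn else 0.0
    f = fp / (fp + tn) if fp + tn else 0.0
    return h, f

# Hypothetical example: the model found one of the two correct boundaries
# and posited no spurious ones.
print(hit_and_false_alarm_rate("d@ lAmpfil", "d@ lAmp fil"))  # (0.5, 0.0)

In SSP itself the rates are of course computed over the complete test set rather than over a single utterance.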

ssp configuration file parameters

The tables on the following pages (Table A.1 - Table A.5) define parameters which can be specified by the user in the configuration files.


Table A.1: Basic model and file parameters.

Parameter         Possible values           Description
MODEL (1)         StaGe, O/E, TP, random    The learning model
TRAININGSET (1)   File name                 Training file
TESTSET           File name                 Test file
CORRECT           File name                 Correct file
INVENTORY (2)     File name                 Segment and feature definitions

Note. (1) = required fields, (2) = required if StaGe is used.

Table A.2: Input processing parameters.

Parameter         Possible values            Description                                                                                    Default
TRAININGPATTERN   xy, [F1=v1][F2=v2][...]    Phonological pattern used for training.                                                        xy
DEPENDENCY        1-3, 1-2, 2-3, etc.        The positions of the elements for which the statistical dependency is to be computed.         -
COUNT             type, token                Specifies whether word frequency counts in the training file should be taken into account.    type
CONTEXT           yes, no                    Specifies whether context is to be used during segmentation.                                  no

Table A.3: Learning parameters for StaGe.

Parameter     Possible values     Description                                                            Default
SLFORMULA     (See Table A.5)     Formula that is used to implement statistical learning.               O/E
RANKING       (See Table A.5)     Statistical measure that is used to rank the constraints.             E
THRESHOLD M   Any number.         Statistical threshold for the induction of a markedness constraint.   0.5
THRESHOLD C   Any number.         Statistical threshold for the induction of a contiguity constraint.   2.0


Table A.4: Segmentation parameters for statistical learning models (TP, O/E) and random baselines (random).

Parameter      Possible values                    Description                                                                                                                Default
THRESHOLD OE   Any positive real number           Segmentation threshold for O/E threshold-based segmentation.                                                              1.0
THRESHOLD TP   Any probability between 0 and 1    Segmentation threshold for TP threshold-based segmentation.                                                               Threshold_xy = 1/|Yx| (see note)
DEFAULTPROB    Any probability between 0 and 1    Probability of inserting a word boundary in case the learning model cannot decide (e.g., unseen biphones, random baselines)   0.5

Note. If no fixed TP threshold is specified by the user, the program will calculate a threshold based on the default assumption that all successors of a segment are a priori equally probable. The threshold thus equals 1/|Yx|, where |Yx| is the number of possible successors of x.
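As a worked example of this default: if a segment x has been observed with 25 distinct successors in the training data, the default TP segmentation threshold for biphones beginning with x is 1/25 = 0.04; under threshold-based segmentation, a word boundary would then be posited inside biphones xy whose forward transitional probability falls below this value.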

Table A.5: Statistical measures.

Symbol    Description
O         observed frequency (= biphone frequency of occurrence)
Fx        observed frequency of x in first position (= frequency of the first-position uniphone)
Fy        observed frequency of y in second position (= frequency of the second-position uniphone)
E         expected frequency (= uniphone-based estimated biphone frequency)
O/E       observed/expected ratio
O-E       observed minus expected (= the difference between O and E)
logO/E    log observed/expected ratio, with base 10
MI        pointwise mutual information (= log O/E ratio, with base 2)
TP        forward transitional probability
logTP     log forward transitional probability, with base 10
TPb       backward transitional probability
logTPb    log backward transitional probability, with base 10
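For concreteness, the core measures in Table A.5 can also be illustrated outside SSP. The sketch below is written in Python for illustration only (it is not the SSP implementation) and assumes the common definitions in which the expected frequency of a biphone xy is estimated from the uniphone frequencies as E(xy) = Fx * Fy / N, with N the total number of biphone tokens, and in which the forward transitional probability is TP(xy) = O(xy) / Fx.

import math
from collections import Counter

def biphone_measures(corpus_lines):
    """Compute O, E, O/E, MI and forward TP for every attested biphone.
    Each corpus line is a string of one-character segments (as in the
    training files described above). Illustration only, not SSP code."""
    O, Fx, Fy = Counter(), Counter(), Counter()
    for line in corpus_lines:
        segments = line.strip()
        for x, y in zip(segments, segments[1:]):    # adjacent segment pairs
            O[(x, y)] += 1
            Fx[x] += 1                              # first-position uniphone
            Fy[y] += 1                              # second-position uniphone
    n = sum(O.values())                             # total biphone tokens
    table = {}
    for (x, y), o in O.items():
        e = Fx[x] * Fy[y] / n                       # expected frequency
        table[(x, y)] = {
            "O": o,
            "E": e,
            "O/E": o / e,
            "MI": math.log2(o / e),                 # pointwise mutual information
            "TP": o / Fx[x],                        # forward transitional probability
        }
    return table

# Hypothetical toy corpus of two 'utterances':
print(biphone_measures(["mademo", "demoma"])[("m", "a")])
# for this toy input: O = 2, E = 0.8, O/E = 2.5, MI is about 1.32, TP = 0.5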

B

FEATURE SPECIFICATIONS

consonants

DISC:    p    b    t    d    k     g     f    v    s    z    S    Z
IPA:     p    b    t    d    k     g     f    v    s    z    ʃ    ʒ
syl      −    −    −    −    −     −     −    −    −    −    −    −
cons     +    +    +    +    +     +     +    +    +    +    +    +
appr     −    −    −    −    −     −     −    −    −    −    −    −
son      −    −    −    −    −     −     −    −    −    −    −    −
cont     −    −    −    −    −     −     +    +    +    +    +    +
nas      −    −    −    −    −     −     −    −    −    −    −    −
voice    −    +    −    +    −     +     −    +    −    +    −    +
place    lab  lab  cor  cor  dors  dors  lab  lab  cor  cor  cor  cor
ant      +    +    +    +    −     −     +    +    +    +    −    −
lat      −    −    −    −    −     −     −    −    −    −    −    −

DISC:    xG mnNrlwjh
IPA:     xGÃmnNrl w jh
syl      − − − − − − − − − − −
cons     + + + + + + + + − − +
appr     − − − − − − + + + + −
son      − − − + + + + + + + −
cont     + + − − − − + + + + +
nas      − − − + + + − − − − −
voice    − + + + + + + + + + +
place    dors dors cor lab cor dors cor cor labdors cor glot
ant      − − − + + − + + + − −
lat      − − − − − − − + − − −

vowels

DISC:    i   !   I   y   (   u   e   E   )   |   }   *   o   O
IPA:     i   i:  I   y   y:  u   e:  E   E:  ø:  0   œ:  o:  O
high     +   +   +   +   +   +   −   −   −   −   −   −   −   −
low      −   −   −   −   −   −   −   −   −   −   −   −   −   −
back     −   −   −   −   −   +   −   −   −   −   −   −   +   +
round    −   −   −   +   +   +   −   −   −   +   +   +   +   +
long     −   +   −   −   +   −   +   −   +   +   −   +   +   −
tense    +   +   −   +   +   +   +   −   −   +   −   −   +   −
nasd     −   −   −   −   −   −   −   −   −   −   −   −   −   −

DISC:    < @aA K L M q0 ∼ ˆ 3 #
IPA:     O: @ a: A ei œy AuA: ˜ ˜E: ˜O:0: ˜ 3: A:
high     −−−−−+ −+ −+ −−−−−−
low      −−++ −−−+ −−−−+
back     ++++ −−++− + − ++
round    + −−− − ++−−++−−
long     + − + − + + + ++++++
tense    −−+ − +++−−−−−−
nasd     −−−− − − − ++++−−

C

EXAMPLES OF SEGMENTATION OUTPUT

Orthography: Ik zou natuurlijk idioot zijn als ik ja zou zeggen he? Translation: ‘I would of course be an idiot if I would say yes.’ Transcription: Ik zAu natyl@k idijot sEin Als Ik ja zAu zEG@ hE StaGe: Ik zAunatyl@k idijot sEin AlsIk jazAuzEG@ hE O/E: Ik zAu natyl @k idijo tsEin Al sIk ja zAu zE G@ hE TP: Ik zAunat yl@k idij ots Ein Als Ik ja zAuzE G@ hE

Orthography: Toch een heel bekend standaardwerk. Translation: ‘Just a well-known standard work.’ Transcription: tOx @n hel b@kEnt stAndardwEr@k StaGe: tOx@n helb@kEnt stAndard wEr@k O/E: tOx@ nhel b@k En tst And ard wEr @k TP: tOx@n hel b@k Ent st And ard wE r@k

Orthography: Liefde is de enige manier om je tegen de dood te verzetten. Translation: ‘Love is the only way to resist death.’ Transcription: livd@ Is @t en@G@ manir Om j@ teG@ d@ dot t@ v@rzEt@ StaGe: liv d@ Is@ten@G@manirOm j@teG@d@dot t@v@r zEt@ O/E: liv d@ Is@t en @G@ ma ni rOm j@t eG@ d@dot t@v@r zE t@ TP: liv d@ Is@t en@ G@ manir Om j@t eG@ d@d ot t@ v@r zE t@

Orthography: Vond ik heel erg boeiend. Translation: ‘I found (it) very interesting indeed.’ Transcription: fOnd Ik hel Er@x buj@nt StaGe: fOndIk helEr@x buj@nt O/E: fO nd Ik hel Er @x buj@nt TP: fOnd Ik hel Er@x buj@nt


Orthography: En ik heb ook in ieder geval uh ja een paar collega’s waarmee ik heel goed daarmee zou kunnen samenwerken. Translation: ‘And I also have in any case uh yes a couple of colleagues with whom I might very well collaborate.’ Transcription: En Ik hEp ok @n id@ x@fAl @ ja @m par kOlexas wame Ik hel xut dame zAu k0n@ sam@wEr@k@ StaGe: EnIk hEpok@nid@x@fAl@ja @m parkOlexas wame Ik hel xut damezAuk0n@s am@wEr@k@ O/E: En Ik hEp ok @ni d@ x@ fAl@ ja @mp ar kO le xa swa me Ik hel xut dame zAu k0n @s am @wEr @k@ TP: En Ik hEp ok@n id@ x@ fA l@ ja @m par kO le xas wa me Ik hel xut damez Auk 0n@s am@ wE r@ k@

Orthography: Ik weet nog niet precies hoe ik zal gaan. Translation: ‘I don’t know yet precisely how I will go.’ Transcription: k wet nOG nit pr@sis hu wIk sAl xan StaGe: kwet nOG nit pr@sis hu wIk sAlxan O/E: kwet nOG nit pr @sis hu wIks Al xan TP: kwet nOG nit pr@s is hu wIks Al xan

Orthography: Maar in ieder geval in die film heeft ie wat langer haar. Translation: ‘But in any case in this film his hair is somewhat longer.’ Transcription: ma In i fAl In di fIlm heft i wAt lAN@ har StaGe: ma InifAlIndifIlm hef ti wAt lAN@ har O/E: ma Ini fAl In difIl mhef tiwAt lAN @h ar TP: ma Ini fAl Indi fIlm he ft iwAt lA N@ har

Orthography: Een paar jaar geleden heeft ze haar restaurant verkocht. Translation: ‘A few years ago she sold her restaurant.’ Transcription: @m pa ja x@led@n heft s@ ha rEstur˜A: f@rkOxt StaGe: @m pajax@led@n heft s@ harEstur˜A:f@r kOxt O/E: @mp aja x@ le d@ nhef ts @h arE stu r˜A:f@rkOxt TP: @m paja x@ led@n he fts@ har Est urA:f@r ˜ kOxt


Orthography: Binnen in de vuurtoren zit een groot dier in een schommelstoel. Translation: ‘Inside the lighthouse a large animal is sitting in a rocking chair.’ Transcription: bIn@n In d@ vyrtor@n zIt @N xrod dir In @n sxOmOstul StaGe: bIn@n Ind@vyrtor@n zI t@N xrod dirIn @n sxOmO stul O/E: bIn @n Ind@v yrt or @nzI t@ Nx rod di rIn @n sx OmO st ul TP: bIn @n In d@v yrt or@n zI t@ Nxrod dir In @n sx Om Ost ul

Orthography: Die horizon kan toch ook naar ons komen. Translation: ‘That horizon may also come to us.’ Transcription: di horizOn kAn tOx ok nar Ons kom@ StaGe: dihorizOn kAn tOxok narOns kom@ O/E: di hor iz On kAn tOx ok nar Ons kom@ TP: di hor izOn kAnt Ox ok nar Ons kom@


D

EXPERIMENTAL ITEMS

Table D.1: Test trials for Experiment 1

Trial   ABX word   BXA word       Trial   ABX word   BXA word
1       fabuxAu    dœylyfa        19      sabœyly    punosEi
2       sapunAu    dœylofi        20      zapœyxy    punAusa
3       sapœyny    penysEi        21      zapuxAu    dulofEi
4       fabœyxy    penAusi        22      zadunAu    dulAufa
5       fEibexy    dunozEi        23      zadœyny    pœynysa
6       fEidely    dunAuza        24      fadœyly    pœynosi
7       fEidulo    bœyxyfa        25      fidelAu    belysEi
8       sEibulo    bœyxofi        26      zidenAu    belAusi
9       sEibely    denAuzi        27      zidœyno    pœyxozi
10      zEideny    denyzEi        28      sibœylo    pœyxyza
11      zEiduno    buxofEi        29      sibelAu    bulAusa
12      fEibuxo    buxAufa        30      sipenAu    bulosEi
13      fibexAu    bœylosi        31      sipœyno    puxAuza
14      zipexAu    bœylysa        32      fidœylo    puxozEi
15      zipœyxo    pexAuzi        33      sEipeny    dœynyza
16      fibœyxo    pexyzEi        34      zEipexy    dœynozi
17      fadulAu    delAufi        35      zEipuxo    bexyfEi
18      sabulAu    delyfEi        36      sEipuno    bexAufi

Note. The order of test trials was randomized. The order of presentation of ABX and BXA items within trials was balanced within and across participants.


Table D.2: Test trials for Experiment 2

Trial   ABC word   BCA word       Trial   ABC word   BCA word
1       padego     zekypa         19      tozagu     zikepy
2       pedyga     zakepo         20      tizegy     zokapu
3       pazeky     zegotu         21      tibaxo     dugopa
4       pezoku     zagety         22      tuboxa     digapo
5       pedugo     bixeta         23      tizoge     duxeso
6       pidega     bexuto         24      tuzego     dixose
7       peziku     bikasy         25      sabiku     dexyso
8       pizaky     bekisu         26      sebyko     daxisu
9       pydige     duxosy         27      sadyxe     degopu
10      pudogy     dyxise         28      sedoxu     dagype
11      pyzoke     dugypa         29      sybeka     zokape
12      puzyka     dygope         30      sobake     zykepa
13      tabixe     bokesa         31      sydaxu     zogity
14      tobexa     bakise         32      sodixy     zygatu
15      tazugo     boxeta         33      sabeky     bixute
16      tozega     baxuto         34      sibuke     baxety
17      tobaxe     ziguto         35      sadexu     bikosa
18      tibuxo     zogate         36      sidoxa     bakesu

Note. The order of test trials was randomized. The order of presentation of ABC and BCA items within trials was balanced within and across participants.


Table D.3: Test trials for Experiment 3

        Initial                                   Final
Trial   Legal (vCA)   Illegal (fCA)       Trial   Legal (BCf)   Illegal (BCv)
1       vygate        fykape              19      dagefo        zakevo
2       viguta        fikusa              20      bukyfa        dugyva
3       vikopy        figoty              21      zekofa        degova
4       vekaso        fegato              22      dygafu        bykavu
5       vigaty        fixasy              23      dugefy        buxevy
6       vagepo        fakeso              24      zikofy        bixovy
7       vyxasu        fygatu              25      byxufe        dyguve
8       vukesy        fugepy              26      byxifo        zykivo
9       vexuta        fekusa              27      zekafo        dexavo
10      vygupe        fyxute              28      zygafe        bykave
11      vukose        fuxote              29      daxofu        zakovu
12      vexota        fegopa              30      bikyfa        zigyva
13      vyxetu        fykepu              31      byxefu        zygevu
14      vugypa        fuxysa              32      zagife        daxive
15      vykipo        fyxiso              33      zigufa        bixuva
16      vaxise        fagipe              34      dexufa        zeguva
17      vixysa        fikypa              35      bikafy        dixavy
18      vakopu        faxotu              36      duxofe        bukove

Note. The order of test trials was randomized. The order of presentation of legal and illegal items within trials was balanced within and across participants.


REFERENCES

Adler, A. N. (2006). Faithfulness and perception in loanword adaptation: A case study from Hawaiian. Lingua, 116, 1024-1045. Adriaans, F., & Kager, R. (2010). Adding generalization to statistical learning: The induction of phonotactics from continuous speech. Journal of Memory and Language, 62, 311-331. Albright, A. (2009). Feature-based generalisation as a source of gradient acceptability. Phonology, 26, 9-41. Albright, A., & Hayes, B. (2002). Modeling English past tense intuitions with Minimal Generalization. In M. Maxwell (Ed.), Proceedings of the sixth meeting of the ACL special interest group in computational phonology (p. 58-69). Philadelphia: Association for Computational Linguistics. Albright,A.,&Hayes,B.(2003). Rules vs. analogy in English past tenses: a computational/experimental study. Cognition, 90(2), 119-161. Altmann, G. (2002). Learning and development in neural networks - the importance of prior experience. Cognition, 85,B43-B50. Apoussidou, D. (2007). The learnability of metrical phonology. Doctoral disserta- tion, University of Amsterdam. (LOT dissertation series 148). Apoussidou, D. (2010). UP on StaGe: A lexical segmentation strategy based on phonotactics. Paper presented at The Eighteenth Manchester Phonology Meeting. Arnon, I., & Snider, N. (2010). More than words: Frequency effects for multi-word phrases. Journal of Memory and Language, 62, 67-82. Aslin,R.N.,Saffran,J.R.,&Newport,E.L.(1998). Computation of conditional probability statistics by 8-month-old infants. Psychological Science, 9, 321- 324. Baayen, H. R., Piepenbrock, R., & Gulikers, L. (1995). The CELEX lexical database. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania. Bailey, T. M., & Hahn, U. (2001). Determinants of wordlikeness: Phonotactics or lexical neighborhoods? Journal of Memory and Language, 44, 568-591. Batchelder, E. O. (2002). Bootstrapping the lexicon: A computational model of infant speech segmentation. Cognition, 83, 167-206. Berent, I., Marcus, G. F., Shimron, J., & Gafos, A. I. (2002). The scope of linguistic generalizations: Evidence from Hebrew word formation. Cognition, 83, 113-139. Berent,I.,&Shimron,J.(1997). The representation of hebrew words: Evidence from the obligatory contour principle. Cognition, 64, 39-72. Berent, I., Steriade, D., Lennertz, T., & Vaknin, V. (2007). What we know about what we have never heard: Evidence from perceptual illusions. Cognition, 104, 591-630.


Berkley, D. (2000). Gradient Obligatory Contour Principle effects.Unpublished doctoral dissertation, Northwestern University. Boersma, P. (1998). Functional phonology. Formalizing the interactions between articulatory and perceptual drives. Doctoral dissertation, University of Amsterdam. (LOT dissertation series 11). Boersma, P., Escudero, P., & Hayes, R. (2003). Learning abstract phonological from auditory phonetic categories: An integrated model for the acqui- sition of language-specific sound categories. In Proceedings of the 15th international congress of phonetic sciences (p. 1013-1016). Barcelona. Boersma, P., & Hayes, B. (2001). Empirical tests of the Gradual Learning Algorithm. Linguistic Inquiry, 32(1), 45-86. Boll-Avetisyan, N., & Kager, R. (2008). Identity avoidance between non- adjacent consonants in artificial language segmentation. In Poster pre- sented at Laboratory Phonology 11. Victoria University of Wellington, New Zealand, June 30 - July 2, 2008. Bonatti, L. L., Pena,˜ M., Nespor, M., & Mehler, J. (2005). Linguistic constraints on statistical computations. Psychological Science, 16, 451-459. Booij, G. (1995). The phonology of Dutch. Oxford, UK: Oxford University Press. Bortfeld, H., Morgan, J. L., Golinkoff, R. M., & Rathbun, K. (2005). Mommy and me: Familiar names help launch babies into speech-stream segmentation. Psychological Science, 16, 298-304. Brent, M. R. (1999a). An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning, 34, 71-105. Brent, M. R. (1999b). Speech segmentation and word discovery: A computa- tional perspective. Trends in Cognitive Sciences, 3, 294-301. Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93- 125. Cairns, P., Shillcock, R., Chater, N., & Levy, J. (1997). Bootstrapping word boundaries: A bottom-up corpus-based approach to speech segmenta- tion. Cognitive Psychology, 33, 111-153. Chambers, K. E., Onishi, K. H., & Fisher, C. (2003). Infants learn phonotactic regularities from brief auditory experience. Cognition, 87,B69-B77. Chambers, K. E., Onishi, K. H., & Fisher, C. (2010). A vowel is a vowel: Generalizing newly learned phonotactic constraints to new contexts. Journal of Experimental Psychology: Learning, Memory and Cognition, 36, 821-828. Chomsky, N. (1981). Lectures on government and binding. Berlin: Mouton de Gruyter. Chomsky, N., & Halle, M. (1968). The sound pattern of English.NewYork,NY: Harper and Row.


Christiansen, M. H., Allen, J., & Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13, 221-268. Christiansen, M. H., Onnis, L., & Hockema, S. A. (2009). The secret is in the sound: from unsegmented speech to lexical categories. Developmental Science, 12, 388-395. Christie, W. M. (1974). Some cues for syllable juncture perception in English. Journal of the Acoustical Society of America, 55, 819-821. Christophe, A., Dupoux, E., Bertoncini, J., & Mehler, J. (1994). Do infants perceive word boundaries? an empirical study of the bootstrapping of lexical acquisition. Journal of the Acoustical Society of America, 95, 1570-1580. Clements, G. N., & Keyser, S. J. (1983). CV phonology: A generative theory of the syllable. Cambridge, MA: The MIT Press. Coetzee, A. W. (2004). What it means to be a loser: Non-optimal candidates in Optimality Theory. Doctoral dissertation, University of Massachusetts. Coetzee, A. W. (2005). The Obligatory Contour Principle in the perception of English. In S. Frota, M. Vigrio,´ & M. J. Freitas (Eds.), Prosodies (p. 223-245). New York, NY: Mouton de Gruyter. Coetzee, A. W., & Pater, J. (2008). Weighted constraints and gradient restric- tions on place co-occurrence in Muna and Arabic. Natural Language & Linguistic Theory, 26, 289-337. Cole, R. A., & Jakimik, J. (1980). A model of speech perception. In R. A. Cole (Ed.), Perception and production of fluent speech (p. 133-163). Hillsdale, NJ: Lawrence Erlbaum Associates. Coleman, J. S., & Pierrehumbert, J. B. (1997). Stochastic phonological gram- mars and acceptability. In Proceedings of the Third Meeting of the ACL Special Interest Group in Computational Phonology (p. 49-56). Somerset, NJ: Association for Computational Linguistics. Cristia,` A., & Seidl, A. (2008). Is infants’ learning of sound patterns constrained by phonological features? Language Learning and Development, 4, 203-227. Cutler, A. (1990). Exploiting prosodic probabilities in speech segmentation. In G. Altmann (Ed.), Cognitive models of speech processing (p. 105-121). Cambridge, MA: The MIT Press. Cutler, A. (1996). and the word boundary problem. In J. L. Morgan & K. Demuth (Eds.), Signal to syntax: Bootstrapping from speech to grammar in early acquisition (p. 87-99). Mahwah, NJ: Lawrence Erlbaum Associates. Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218-236.


Cutler, A., & Carter, D. M. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech and Language, 2, 133-142. Cutler, A., Mehler, J., Norris, D., & Segui, J. (1986). The syllable’s differing roleinthesegmentationofFrenchandEnglish.JournalofMemoryand Language, 25, 385-400. Cutler, A., & Norris, D. G. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113-121. Daland, R. (2009). Word segmentation, word recognition, and word learning: a computational model of first language acquisition. Doctoral dissertation, Northwestern University. Dell,G.S.,Reed,K.D.,Adams,D.R.,&Meyer,A.S.(2000). Speech errors, phonotactic constraints, and implicit learning: A study of the role of experience in language production. Journal of Experimental Psychology: Learning, Memory and Cognition, 26, 1355-1367. Dupoux, E., Kakehi, K., Hirose, Y., Pallier, C., & Mehler, J. (1999). Epenthetic vowels in Japanese: a perceptual illusion? Journal of Experimental Psychol- ogy: Human Perception and Performance, 25, 1568-1578. Dutoit, T., Pagel, V., Pierret, N., Bataille, F., & Vrecken, O. van der. (1996). The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes. In Proceedings of the icslp’96 (p. 1393-1396). Philadelphia, PA. Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211. Endress, A. D., & Mehler, J. (2009). Primitive computations in speech process- ing. The Quarterly Journal of Experimental Psychology, 62, 2187-2209. Escudero, P., & Boersma, P. (2004). Bridging the gap between L2 speech perception research and phonological theory. StudiesinSecondLanguage Acquisition, 26, 551-585. Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27, 861-874. Fenson, L., Dale, P. S., Bates, E., Reznick, J. S., Thal, D., & Pethick, S. J. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development, 59, 1-173. Finley, S., & Badecker, W. (2009). Artificial language learning and feature-based generalization. Journal of Memory and Language, 61, 423–437. Finn, A. S., & Hudson Kam, C. L. (2008). The curse of knowledge: First language knowledge impairs adult learners’ use of novel statistics for word segmentation. Cognition, 108, 477-499. Fiser,J.,&Aslin,R.N.(2002). Statistical learning of new visual feature combinations by infants. Proceedings of the National Academy of Sciences of the United States of America, 99, 15822-15826.


Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research. Timonium, MD: York Press. Fowler, C., Best, C., & McRoberts, G. (1990). Young infants’ perception of liquid coarticulatory influences on following stop consonants. Perception & Psychophysics, 48, 559-570. Friederici,A.D.,&Wessels,J.M.I.(1993). Phonotactic knowledge of word boundaries and its use in infant speech perception. Perception & Psychophysics, 54(3), 287-295. Frisch,S.A.,Pierrehumbert,J.B.,&Broe,M.B.(2004). Similarity avoidance and the OCP. Natural Language & Linguistic Theory, 22, 179-228. Frisch,S.A.,&Zawaydeh,B.A.(2001). The psychological reality of ocp-place in Arabic. Language, 77(1), 91-106. Goddijn, S., & Binnenpoorte, D. (2003). Assessing manually corrected broad phonetic transcriptions in the Spoken Dutch Corpus. In Proceedings of the 15th international congress of phonetic sciences (p. 1361-1364). Barcelona. Goldrick, M. (2004). Phonological features and phonotactic constraints in speech production. Journal of Memory and Language, 51, 586-603. Goldrick, M., & Larson, M. (2008). Phonotactic probability influences speech production. Cognition, 107, 1155-1164. Goldsmith, J. A. (1976). Autosegmental phonology. Cambridge, MA: Doctoral dissertation, MIT. Gomez,R.L.(´ 2002). Variability and detection of invariant structure. Psycho- logical Science, 13, 431-436. Gomez,´ R. L., & Gerken, L. (1999). Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge. Cognition, 70, 109-135. Gomez,R.L.,&Lakusta,L.(´ 2004). A first step in form-based category abstraction by 12-month-old infants. Developmental Science, 7, 567-580. Gomez,´ R. L., & Maye, J. (2005). The developmental trajectory of nonadjacent dependency learning. Infancy, 7, 183-206. Goodsitt, J. V., Morgan, J. L., & Kuhl, P. K. (1993). Perceptual strategies in prelingual speech segmentation. Journal of Child Language, 20, 229-252. Gout, A., Christophe, A., & Morgan, J. L. (2004). Phonological phrase boundaries constrain lexical access ii. infant data. JournalofMemoryand Language, 51, 548-567. Gow, D. W., & Gordon, P. C. (1995). Lexical and prelexical influences on word segmentation: Evidence from priming. Journal of Experimental Psychology: Human Perception and Performance, 21, 344-359. Grosjean, F. (2001). The bilingual’s language modes. In J. L. Nicol (Ed.), One mind, two languages: Bilingual language processing (p. 1-22). Oxford, UK: Blackwell.


Hamann, S., Apoussidou, D., & Boersma, P. (to appear). Modelling the forma- tion of phonotactic restrictions across the mental lexicon. In Proceedings of the 45th meeting of the chicago linguistic society. Hamann, S., & Ernestus, M. (submitted). Adult learning of phonotactic and inventory restrictions. Manuscript submitted for publication. Hanul´ıkova,´ A., McQueen, J. M., & Mitterer, H. (2010). Possible words and fixed stress in the segmentation of Slovak speech. The Quarterly Journal of Experimental Psychology, 63, 555-579. Harrington, J., Watson, G., & Cooper, M. (1989). Word boundary detection in broad class and phoneme strings. Computer Speech and Language, 3, 367-382. Hay, J., Pierrehumbert, J., & Beckman, M. (2004). Speech perception, well- formedness, and the statistics of the lexicon. In Papers in Laboratory Phonology vi (p. 58-74). Cambridge, UK: Cambridge University Press. Hayes, B. (1999). Phonetically driven phonology: The role of Optimality The- ory and inductive grounding. In M. Darnell, E. Moravscik, M. Noonan, F. J. Newmeyer, & K. M. Wheatley (Eds.), Functionalism and formalism in linguistics (p. 243-285). Amsterdam: John Benjamins. Hayes, B., & Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39, 379-440. Hockema, S. A. (2006). Finding words in speech: An investigation of American English. Language Learning and Development, 2, 119-146. Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transfor- mation or not) and towards logit mixed models. Journal of Memory and Language, 59, 434-446. Johnson, E. K., & Jusczyk, P. W. (2001). Word segmentation by 8-month-olds: When speech cues count more than statistics. Journal of Memory and Language, 44, 548-567. Johnson, E. K., Jusczyk, P. W., Cutler, A., & Norris, D. (2003). Lexical viability constraints on speech segmentation by infants. Cognitive Psychology, 46, 65-97. Johnson, E. K., & Tyler, M. D. (2010). Testing the limits of statistical learning for word segmentation. Developmental Science, 13, 339-345. Jusczyk, P. W. (1997). The discovery of spoken language. Cambridge, MA: The MIT Press. Jusczyk, P. W., & Aslin, R. N. (1995). Infants’ detection of the sound patterns of words in fluent speech. Cognitive Psychology, 29, 1-23. Jusczyk, P. W., Cutler, A., & Redanz, N. J. (1993). Infants’ preference for the predominant stress patterns of English words. Child Development, 64, 675-687.


Jusczyk, P. W., Friederici, A. D., Wessels, J. M. I., Svenkerud, V. Y., & Jusczyk, A. M. (1993). Infants’ sensitivity to the sound patterns of native language words. Journal of Memory and Language, 32, 402-420. Jusczyk, P. W., Goodman, M. B., & Baumann, A. (1999). Nine-month-olds’ at- tention to sound similarities in syllables. Journal of Memory and Language, 40, 62-82. Jusczyk, P. W., Hohne, E., & Bauman, A. (1999). Infants’ sensitivity to allophonic cues for word segmentation. Perception & Psychophysics, 61, 1465-1476. Jusczyk,P.W.,Houston,D.M.,&Newsome,M.(1999). The beginnings of word segmentation in English-learning infants. Cognitive Psychology, 39, 159-207. Jusczyk, P. W., Luce, P. A., & Charles-Luce, J. (1994). Infants’ sensitivity to phonotactic patterns in the native language. Journal of Memory and Language, 33, 630-645. Jusczyk, P. W., Smolensky, P., & Allocco, T. (2002). How English-learning infants respond to markedness and faithfulness constraints. Language Acquisition, 10, 31-73. Kager, R., & Adriaans, F. (2008). The acquisition of artificial languages: Models and human data. Contributed symposium at the 16th Annual Meeting of the European Society for Philosophy and Psychology. Kager, R., Boll-Avetisyan, N., & Chen, A. (2009). The language specificity of similarity avoidance constraints: Evidence from segmentation. Paper presented at NELS 40. Kager, R., & Shatzman, K. (2007). Phonological constraints in speech process- ing. In B. Los & M. van Koppen (Eds.), Linguistics in the netherlands 2007 (p. 100-111). John Benjamins. Kager, R., & Shatzman, K. (submitted). Similarity avoidance in speech processing. Manuscript submitted for publication. Kawahara, S., Ono, H., & Sudo, K. (2006). Consonant cooccurrence restrictions in yamato japanese. Japanese/Korean Linguistics, 14, 27-38. Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. (2002). Visual statistical learning in infancy: evidence for a domain general learning mechanism. Cognition, 83,B35-B42. Klatt, D. H. (1979). Speech perception: A model of acoustic-phonetic analysis and lexical access. Journal of Phonetics, 7, 279-312. Kooijman, V., Hagoort, P., & Cutler, A. (2009). Prosodic structure in early word segmentation: ERP evidence from Dutch ten-month-olds. Infancy, 14, 591-612. Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months


of age. Science, 255, 606-608. Leben, W. R. (1973). Suprasegmental phonology. Cambridge, MA: Doctoral dissertation, MIT. Legendre, G., Miyata, Y., & Smolensky, P. (1990). Harmonic Grammar: A formal multi-level connectionist theory of linguistic wellformedness: Theoretical foundations. In Proceedings of the twelfth annual conference of the cognitive science society (p. 388-395). Cambridge, MA: Lawrence Erlbaum. Lehiste, I. (1960). An acoustic-phonetic study of internal open juncture. Phonetica, 5, 1-54. Lentz, T., & Kager, R. (2009). L1 perceptive biases do not stop acquisition of L2 phonotactics. Poster presented at the 34th Boston University Conference on Language Development (BUCLD 34). Lentz, T., & Kager, R. (in preparation). Phonotactic cues initiate lexical look-up. Manuscript in preparation. Lin, Y., & Mielke, J. (2008). Discovering place and manner features: What can be learned from acoustic and articulatory data? In Proceedings of the 31st penn linguistics colloquium (p. 241-254). Philadelphia, PA: University of Pennsylvania. MacMillan, N. A., & Creelman, C. D. (2005). Detection theory: A user’s guide (second ed.). Mahwah, NJ: Lawrence Erlbaum Associates. Marcus, G., Vijayan, S., Rao, S. B., & Vishton, P. (1999). Rule learning by seven-month old infants. Science, 283, 77-80. Marslen-Wilson, W. D., & Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology, 10, 29-63. Mattys, S. L., & Jusczyk, P. W. (2001a). Do infants segment words or recurring contiguous patterns? Journal of Experimental Psychology: Human Perception and Performance, 27, 644-655. Mattys, S. L., & Jusczyk, P. W. (2001b). Phonotactic cues for segmentation of fluent speech by infants. Cognition, 78, 91-121. Mattys, S. L., Jusczyk, P. W., Luce, P. A., & Morgan, J. L. (1999). Phono- tactic and prosodic effects on word segmentation in infants. Cognitive Psychology, 38, 465-494. Mattys, S. L., White, L., & Melhorn, J. F. (2005). Integration of multiple speech segmentation cues: A hierarchical framework. Journal of Experimental Psychology: General, 134, 477-500. Maye, J., Weiss, D. J., & Aslin, R. N. (2008). Statistical phonetic learning in infants: facilitation and feature generalization. Developmental Science, 11, 122-134.


Maye, J., Werker, J. F., & Gerken, L. (2002). Infant sensitivity to distributional information can affect phonetic discrimination. Cognition, 82,B101-B111. McCarthy, J. J. (1986). OCP effects: Gemination and antigemination. Linguistic Inquiry, 17, 207-263. McCarthy, J. J. (1988). Feature geometry and dependency: A review. Phonetica, 43, 84-108. McCarthy, J. J., & Prince, A. S. (1995). Faithfulness and reduplicative iden- tity. In Papers in Optimality Theory: Unversity of Massachusetts occasional papers (Vol. 18,p.249-384). Amherst, MA: Graduate Linguistics Student Association. McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86. McQueen, J. M. (1998). Segmentation of continuous speech using phonotactics. Journal of Memory and Language, 39, 21-46. McQueen, J. M., Otake, T., & Cutler, A. (2001). Rhythmic cues and possible- word constraints in japanese speech segmentation. JournalofMemoryand Language, 45, 103-132. Mehler, J., Dupoux, E., & Segui, J. (1990). Constraining models of lexical access: The onset of word recognition. In G. Altmann (Ed.), Cognitive models of speech processing (p. 236-262). Cambridge, MA: The MIT Press. Mehler, J., Jusczyk, P., Lambertz, G., Halsted, N., Bertoncini, J., & Amiel- Tison, C. (1988). A precursor of language acquisition in young infants. Cognition, 29, 143-178. Mielke, J. (2008). The emergence of distinctive features. Oxford, UK: Oxford University Press. Mitchell, T. M. (1997). Machine learning. New York, NY: McGraw-Hill. Moreton, E. (2002). Structural constraints in the perception of English stop- sonorant clusters. Cognition, 84, 55-71. Morgan, J. L., & Saffran, J. R. (1995). Emerging integration of sequential and suprasegmental information in preverbal speech segmentation. Child Development, 66, 911-936. Myers, J., Jusczyk, P. W., Kemler Nelson, D. G., Charles-Luce, J., Woodward, A. L., & Hirsh-Pasek, K. (1996). Infants’ sensitivity to word boundaries in fluent speech. Journal of Child Language, 23, 1-30. Nakatani, L. H., & Dukes, K. D. (1977). Locus of segmental cues for word juncture. Journal of the Acoustical Society of America, 62, 714-719. Nazzi, T., Bertoncini, J., & Mehler, J. (1998). Language discrimination by newborns: Toward an understanding of the role of rhythm. Journal of Experimental Psychology: Human Perception and Performance, 24, 756-766. Nespor, M., Pena,˜ M., & Mehler, J. (2003). On the different roles of vowels and consonants in speech processing and language acquisition. Lingue e


Linguaggio, 2, 203-230. Newport, E. L., & Aslin, R. N. (2004). Learning at a distance: I. Statistical learning of non-adjacent dependencies. Cognitive Psychology, 48, 127-162. Norris, D. (1994). Shortlist: a connectionist model of continuous . Cognition, 52, 189-234. Norris, D., McQueen, J. M., Cutler, A., & Butterfield, S. (1997). The possible- word constraint in the segmentation of continuous speech. Cognitive Psychology, 34, 191-243. Oller, D. K. (1973). Effect of position in utterance on speech segment duration in English. Journal of the Acoustical Society of America, 54, 1235-1247. Onishi, K. H., Chambers, K. E., & Fisher, C. (2002). Learning phonotactic constraints from brief auditory experience. Cognition, 83,B13-B23. Onnis, L., Monaghan, P., Richmond, K., & Chater, N. (2005). Phonology impacts segmentation in online speech processing. Journal of Memory and Language, 53(2), 225-237. Pater, J. (2009). Weighted constraints in generative linguistics. Cognitive Science, 33, 999-1035. Pelucchi, B., Hay, J. F., & Saffran, J. R. (2009a). Learning in reverse: Eight- month-old infants track backward transitional probabilities. Cognition, 113, 244-247. Pelucchi, B., Hay, J. F., & Saffran, J. R. (2009b). Statistical learning in a natural language by 8-month-old infants. Child Development, 80, 674-685. Pena,˜ M., Bonatti, L. L., Nespor, M., & Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298, 604-607. Peperkamp, S., Le Calvez, R., Nadal, J.-P., & Dupoux, E. (2006). The acquisi- tion of allophonic rules: Statistical learning with linguistic constraints. Cognition, 101,B31-B41. Perruchet, P., & Desaulty, S. (2008). A role for backward transitional probabili- ties in word segmentation? Memory & Cognition, 36, 1299-1305. Perruchet, P., & Peereman, R. (2004). The exploitation of distributional information in syllable processing. Journal of Neurolinguistics, 17, 97-119. Perruchet, P., & Vinter, A. (1998). PARSER: A model for word segmentation. Journal of Memory and Language, 39, 246-263. Peters, A. M. (1983). The units of language acquisition. Cambridge, UK: Cambridge University Press. Pierrehumbert, J. B. (1993). Dissimilarity in the Arabic verbal roots. In A. Schafer (Ed.), Proceedings of the North East Linguistics Society (Vol. 23, p. 367-381). Amherst, MA: GLSA. Pierrehumbert, J. B. (2001). Why phonological constraints are so coarse- grained. Language and Cognitive Processes, 16, 691-698.


Pierrehumbert, J. B. (2003). Probabilistic phonology: Discrimination and robustness. In R. Bod, J. Hay, & S. Jannedy (Eds.), Probabilistic linguistics. Cambridge, MA: The MIT Press. Pinker, S., & Prince, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73-193. Prince, A., & Smolensky, P. (1993). Optimality Theory: Constraint interaction in generative grammar (Tech. Rep.). New Brunswick, NJ: Rutgers University Center for Cognitive Science, Rutgers University. Prince, A., & Tesar, B. (2004). Learning phonotactic distributions. In R. Kager, J. Pater, & W. Zonneveld (Eds.), Constraints in phonological acquisition (p. 245-291). Cambridge, UK: Cambridge University Press. Pukui, M. K., & Elbert, S. H. (1986). Hawaiian dictionary. Honolulu, HI: University of Hawaii Press. Quene,´ H. (1992). Durational cues for word segmentation in dutch. Journal of Phonetics, 20, 331-350. Reber, R., & Perruchet, P. (2003). The use of control groups in artificial grammar learning. The Quarterly Journal of Experimental Psychology, 56A, 97-115. Redington, M., & Chater, N. (1996). Transfer in artificial grammar learning: A reevaluation. Journal of Experimental Psychology: General, 125, 123-138. Richtsmeier, P. T., Gerken, L., & Ohala, D. (2009). Induction of phonotac- tics from word-types and word-tokens. In J. Chandlee, M. Franchini, S. Lord, & G.-M. Rheiner (Eds.), Proceedings of the 33rd annual Boston University Conference on Language Development (p. 432-443). Somerville, MA: Cascadilla Press. Rytting, C. A. (2004). Segment predictability as a cue in word segmenta- tion: Application to modern Greek. In Current themes in computational phonology and morphology: Seventh meeting of the acl special interest group on computational phonology (sigphon) (p. 78-85). Barcelona: Association for Computational Linguistics. Saffran, J. R. (2002). Constraints on statistical language learning. Journal of Memory and Language, 47, 172-196. Saffran,J.R.,Aslin,R.N.,&Newport,E.L.(1996). Statistical learning by 8-month-old infants. Science, 274(5294), 1926-1928. Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52. Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.


Saffran, J. R., & Thiessen, E. D. (2003). Pattern induction by infant language learners. Developmental Psychology, 39(3), 484-494. Scholes, R. J. (1966). Phonotactic grammaticality. The Hague: Mouton. Seidl, A., & Buckley, E. (2005). On the learning of arbitrary phonological rules. Language Learning and Development, 1, 289-316. Selkirk, E. O. (1982). The syllable. In H. van der Hulst & N. Smith (Eds.), The structure of phonological representations. Dordrecht, the Netherlands: Foris. Shatzman, K. B., & McQueen, J. M. (2006). Segment duration as a cue to word boundaries in spoken-word recognition. Perception & Psychophysics, 68, 1-16. Soderstrom, M., Conwell, E., Feldman, N., & Morgan, J. (2009). The learner as statistician: three principles of computational success in language acquisition. Developmental Science, 12, 409-411. Stevens, K. N. (2002). Toward a model for lexical access based on acoustic landmarks and distinctive features. Journal of the Acoustical Society of America, 111, 1872-1891. Suomi, K., McQueen, J. M., & Cutler, A. (1997). Vowel harmony and speech segmentation in Finnish. Journal of Memory and Language, 36, 422-444. Swingley, D. (1999). Conditional probability and word discovery: A corpus analysis of speech to infants. In M. Hahn & S. C. Stoness (Eds.), Proceed- ings of the 21st annual conference of the cognitive science society (p. 724-729). Mahwah, NJ: Lawrence Erlbaum Associates. Swingley, D. (2005). Statistical clustering and the contents of the infant vocabulary. Cognitive Psychology, 50, 86-132. Tesar, B., & Smolensky, P. (2000). Learnability in Optimality Theory. Cambridge, MA: The MIT Press. Thiessen, E. D., & Saffran, J. R. (2003). When cues collide: Use of stress and statistical cues to word boundaries by 7-to9-month-old infants. Developmental Psychology, 39, 706-716. Tjong Kim Sang, E. F. (1998). Machine learning of phonotactics. Doctoral dissertation, Rijksuniversiteit Groningen. (Groningen Dissertations in Linguistics 26). Toro,J.M.,Nespor,M.,Mehler,J.,&Bonatti,L.L.(2008). Finding words and rules in a speech stream: Functional differences between vowels and consonants. Psychological Science, 19, 137-144. Trapman, M., & Kager, R. (2009). The acquisition of subset and superset phonotactic knowledge in a second language. Language Acquisition, 16, 178-221. Trubetzkoy, N. (1936). Die phonologischen Grenzsignale. In D. Jones & D. B. Fry (Eds.), Proceedings of the second international congress of phonetic


sciences. Cambridge, UK: Cambridge University Press. Umeda, N. (1977). Consonant duration in American English. Journal of the Acoustical Society of America, 61, 846-858. van de Weijer, J. (1998). Language input for word discovery. Doctoral dissertation, Katholieke Universiteit Nijmegen. (MPI Series in Psycholinguistics 9). vanderLugt,A.H.(2001). The use of sequential probabilities in the segmen- tation of speech. Perception & Psychophysics, 63(5), 811-823. Vitevitch, M. S., & Luce, P. A. (1998). When words compete: Levels of processing in perception of spoken words. Psychological Science, 9, 325- 329. Vitevitch, M. S., & Luce, P. A. (1999). Probabilistic phonotactics and neigh- borhood activation in spoken word recognition. JournalofMemoryand Language, 40, 374-408. Vitevitch, M. S., & Luce, P. A. (2005). Increases in phonotactic probability facilitate spoken nonword repetition. Journal of Memory and Language, 52, 193-204. Vroomen, J., Tuomainen, J., & Gelder, B. de. (1998). The roles of word stress and vowel harmony in speech segmentation. JournalofMemoryand Language, 38, 133-149. Warner,N.,Kim,J.,Davis,C.,&Cutler,A.(2005). Use of complex phono- logical patterns in speech processing: evidence from Korean. Journal of Linguistics, 41, 353-387. Weber, A., & Cutler, A. (2006). First-language phonotactics in second-language listening. Journal of the Acoustical Society of America, 119, 597-607. Werker, J. F., & Tees, R. C. (1984). Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 7, 49-63. White, K. S., Peperkamp, S., Kirk, C., & Morgan, J. L. (2008). Rapid acquisition of phonological alternations by infants. Cognition, 107, 238-265. Woodward,J.Z.,&Aslin,R.N.(1990). Segmentation cues in maternal speech to infants. Paper presented at the 7th biennial meeting of the International Conference on Infant Studies. Yang,C.D.(2004). Universal Grammar, statistics or both? Trends in Cognitive Sciences, 8, 451-456.


SAMENVATTING IN HET NEDERLANDS

Bij het luisteren naar spraak worden luisteraars geconfronteerd met een akoestisch signaal dat geen duidelijke aanwijzingen bevat voor woordgrenzen. Om gesproken taal te kunnen verstaan, moeten luisteraars het doorlopende spraaksignaal opdelen in individuele woorden. Dit proces, spraaksegmentatie, wordt ondersteund door verschillende typen taalkundige kennis. Klemtoon- patronen, gedetailleerde akoestische informatie en fonotactische restricties dragen bij aan het herkennen van woorden in het spraaksignaal. Dit soort kennis is van cruciaal belang voor taalleerders. Een´ van de belangrijkste uitdagingen voor baby’s is om de woorden van de moedertaal te leren. Baby’s horen echter ook een continu spraaksignaal en moeten dus strategieen¨ on- twikkelen om woordeenheden uit spraak te segmenteren. Ook voor baby’s is het aangetoond dat ze voor het leren van woorden gebruik maken klem- toonpatronen, akoestische informatie en fonotactische kennis. De vraag is hoe baby’s dit soort kennis (die voor elke taal anders is) kunnen verwerven, nog voordat ze een woordenschat hebben opgebouwd. Dit proefschrift gaat over het leren van fonotactische kennis die gebruikt kan worden voor spraaksegmentatie. Fonotactische restricties definieren¨ welke klankreeksen zijn toegestaan binnen de woorden van een taal. Nederlandse woorden bevatten bijvoorbeeld niet de reeks /pf/ (in tegenstelling tot Duitse woorden, zoals pfeffer). Kennis van klankreeksen is nuttig voor spraakseg- mentatie, omdat het aangeeft waar zich mogelijke woordgrenzen bevinden in het spraaksignaal. Bij het horen van de continue zin /d@lAmpfil/ (de lamp viel) geeft de restrictie *pf aan dat er een woordgrens zit tussen /lAmp/ en /fil/. (Een Duitse luisteraar daarentegen zou wellicht geneigd zijn om /pfil/ als e´en´ woord waar te nemen.) De stelling die in dit proefschrift verdedigd wordt, is dat fonotactische restricties geleerd worden uit het continue spraaksignaal. Dit gebeurt met behulp van twee leermechanismen: statistisch leren en generalisatie. De voor- spelling is dat de fonotactische kennis die op deze manier geleerd wordt, gebruiktkanwordenvoorhetontdekkenvanwoordgrenzeninhetspraaksig- naal. Deze veronderstelde methode, Spraak-Gebaseerd Leren (SGL), wordt onderzocht middels een combinatie van (i) computermodellering, om formeel te beschrijven hoe fonotactische kennis geleerd wordt, (ii) computersimu- laties,omtetestenofdege¨ınduceerde fonotactische kennis nuttig is voor spraaksegmentatie, (iii) simulaties van menselijke data, om te testen of het computermodel menselijk segmentatiegedrag kan verklaren en (iv) exper- imenten met kunstmatige talen, om te toetsen of menselijke leerders (vol- wassenen) in staat zijn om nieuwe fonotactische restricties te leren uit een continu spraaksignaal. Door gebruik te maken van zowel computermodeller- ing, als psycholingu¨ıstische experimenten, tracht dit proefschrift een formele

The thesis defended in this dissertation is that phonotactic constraints are learned from the continuous speech signal. This is accomplished by means of two learning mechanisms: statistical learning and generalization. The prediction is that the phonotactic knowledge learned in this way can be used to detect word boundaries in the speech signal. This hypothesized approach, Speech-Based Learning (SBL), is investigated through a combination of (i) computational modeling, to describe formally how phonotactic knowledge is learned, (ii) computer simulations, to test whether the induced phonotactic knowledge is useful for speech segmentation, (iii) simulations of human data, to test whether the computational model can account for human segmentation behavior, and (iv) artificial language experiments, to test whether human (adult) learners are able to learn novel phonotactic constraints from a continuous speech signal. By using both computational modeling and psycholinguistic experiments, this dissertation aims to provide a formal account of the induction of phonotactics for speech segmentation, supported by converging findings from computational and human learners.

This dissertation specifically aims to connect earlier research on the modeling of speech segmentation with research on the modeling of phonotactic learning. Both types of models are concerned with the learning of sound sequences, but they make strongly different assumptions about the input to the phonotactic learning process and about the level of abstraction at which phonotactic constraints are represented. Segmentation models assume that sound sequences are learned from continuous speech. Phonotactic learning models, by contrast, assume that knowledge of sound sequences is acquired from the lexicon. Moreover, segmentation models assume that phonotactic constraints refer to specific phonological segments, whereas phonotactic learning models typically assume that these constraints refer to natural classes of segments (defined by phonological features). Psycholinguistic studies with infants suggest that phonotactic constraints at the level of natural classes are indeed learned, even before the lexicon has fully developed. These findings lead to the specific hypothesis that abstract constraints (that is, constraints referring to natural classes) are learned from continuous speech at an early stage of phonological development. This dissertation approaches the problem of phonotactic learning by combining the input assumption of segmentation models (namely, continuous speech) with the representational assumption of phonotactic learning models (namely, abstract constraints).

Chapter 2 presents a new computational model: StaGe. This model implements the SBL hypothesis. The chapter shows that phonotactic constraints can be learned from continuous speech by using learning mechanisms that infants have been shown to have at their disposal: statistical learning and generalization. The model learns two types of phonotactic constraints: (i) markedness constraints (*xy), which indicate which sequences may not occur within words (and are thus likely to contain word boundaries), and (ii) contiguity constraints (Contig-IO(xy)), which indicate which sequences may indeed occur within words (and are thus unlikely to contain word boundaries). Any conflicts between these constraints are resolved by the OT segmentation model (based on principles from Optimality Theory).
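To make the statistical-learning step concrete, the following is a minimal sketch of frequency-driven constraint induction over biphones, in the spirit of the description above: biphone counts are gathered from unsegmented utterances, an observed/expected (O/E) ratio is computed for each biphone, and biphones falling below or above two thresholds yield markedness (*xy) and contiguity (Contig-IO(xy)) constraints, respectively. The O/E statistic, the threshold values, and the function name are assumptions made for this illustration; StaGe's single-feature generalization over natural classes is not shown.

    # Illustrative sketch of frequency-driven constraint induction over biphones.
    # The O/E statistic and the thresholds below are assumptions for this example,
    # not StaGe's actual parameter settings.
    from collections import Counter

    LOW, HIGH = 0.5, 2.0  # hypothetical under-/over-representation thresholds

    def induce_constraints(utterances, low=LOW, high=HIGH):
        """Return (markedness, contiguity) constraint sets over biphones."""
        unigrams, biphones = Counter(), Counter()
        for utt in utterances:                   # utt: a list of phone symbols
            unigrams.update(utt)
            biphones.update(zip(utt, utt[1:]))
        n_uni, n_bi = sum(unigrams.values()), sum(biphones.values())
        markedness, contiguity = set(), set()
        for (x, y), observed in biphones.items():
            expected = (unigrams[x] / n_uni) * (unigrams[y] / n_uni) * n_bi
            ratio = observed / expected
            if ratio < low:       # under-represented: likely straddles a boundary
                markedness.add((x, y))      # induce *xy
            elif ratio > high:    # over-represented: likely word-internal
                contiguity.add((x, y))      # induce Contig-IO(xy)
        return markedness, contiguity

Constraints induced in this way can then be passed to a segmentation procedure such as the one sketched earlier, with conflicts between markedness and contiguity constraints resolved by constraint ranking, as in the OT segmentation model described above.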

Chapter 3 demonstrates the potential usefulness of the SBL hypothesis by means of computer simulations. The chapter focuses specifically on the added value of generalization. The main finding is that StaGe is better able to detect word boundaries in continuous speech than purely statistical models. This indicates that generalization, as used by StaGe, may also be used by human learners. The chapter provides computational evidence for the SBL hypothesis by showing that phonotactic learning from continuous speech contributes to better word recognition.

Chapter 4 examines whether StaGe can account for the learnability of a phonotactic constraint from theoretical phonology. OCP-Place states that consonants with the same place of articulation may not occur next to each other. The chapter examines whether the set of constraints induced by StaGe contains traces of OCP-Place. In addition, it examines whether the model can account for the effect that OCP-Place has on human segmentation behavior (as demonstrated by Boll-Avetisyan & Kager, 2008). The finding is that StaGe does indeed learn constraints that resemble OCP-Place. Moreover, the model accounts for human segmentation better than purely statistical models and models based on a categorical interpretation of OCP-Place. The chapter shows that the combination of learning mechanisms used by StaGe is able to account for human segmentation data.

Chapter 5 investigates whether human learners are able to learn novel phonotactic constraints from continuous speech. This is done by having adult participants listen to continuous speech from an artificial language, and then examining which words the participants believe they have heard. These experiments test whether the SBL approach taken by StaGe is psychologically plausible. The finding of this chapter is that human learners can at least learn constraints at the level of specific segments from continuous speech. No evidence was found for the learning of phonotactic generalizations. The learning of generalizations from continuous speech will have to be investigated further in future research.

The dissertation shows that phonotactic knowledge for speech segmentation can be learned with a combination of mechanisms that human language learners have been shown to have at their disposal. The proposed model learns phonotactic constraints at different levels of abstraction. These constraints can subsequently be used to detect word boundaries in the speech signal. The combination of statistical learning and generalization in the model provides a better account of speech segmentation than models that rely on statistical learning alone. This result was found for both computational and human learners. In addition, human learners were shown to be able to learn novel phonotactic constraints from a continuous speech signal. Comparing findings from computational and human learners allows us to develop learning models that are formally explicit and, moreover, psychologically plausible. The dissertation thereby contributes to a better understanding of the processes involved in language acquisition.


CURRICULUM VITAE

Frans Adriaans was born on March 14, 1981 in Ooststellingwerf, the Netherlands. He spent most of his childhood in Amersfoort, where he graduated from the Stedelijk Gymnasium Johan van Oldenbarnevelt in 1999. In that same year, he enrolled in the Artificial Intelligence program at the University of Amsterdam. After obtaining his propedeuse and kandidaats diplomas, he pursued the Language & Speech track within the Artificial Intelligence master's program. He conducted research for his master's thesis with the Information and Language Processing Systems group. The thesis focused on evaluating computational techniques for bridging the spelling gap between modern and 17th-century Dutch, with the purpose of improving the retrieval of historic documents. He obtained his M.Sc. degree in 2005.

Frans Adriaans was a research assistant at Utrecht University in 2005 and 2006. During that period, he worked on developing computational techniques for the analysis of transcribed speech corpora. He subsequently started his PhD research at the Utrecht Institute of Linguistics OTS (UiL OTS). This dissertation is the result of research conducted at UiL OTS between 2006 and 2010.
