Catching Words in a Stream of Speech

Catching Words in a Stream of Speech:
Computational Simulations of Segmenting Transcribed Child-Directed Speech

Çağrı Çöltekin

CLCG

The work presented here was carried out under the auspices of the School of Behavioural and Cognitive Neuroscience and the Center for Language and Cognition Groningen of the Faculty of Arts of the University of Groningen.

Groningen Dissertations in Linguistics 97
ISSN 0928-0030

© 2011, Çağrı Çöltekin

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.

Cover art, titled auto, by Franek Timur Çöltekin

Document prepared with LaTeX 2ε and typeset by pdfTeX

Printed by Wöhrmann Print Service, Zutphen

RIJKSUNIVERSITEIT GRONINGEN

Catching Words in a Stream of Speech
Computational Simulations of Segmenting Transcribed Child-Directed Speech

Dissertation to obtain the doctorate in Arts (Letteren) at the Rijksuniversiteit Groningen, by authority of the Rector Magnificus, Dr. E. Sterken, to be defended in public on Thursday, 8 December 2011, at 14:30

by

Çağrı Çöltekin

born on 28 February 1972 in Çıldır, Turkey

Supervisor (promotor): Prof. dr. ir. J. Nerbonne

Assessment committee (beoordelingscommissie):
Prof. dr. A. van den Bosch
Prof. dr. P. Hendriks
Prof. dr. P. Monaghan

ISBN (electronic version): 978-90-367-5259-6
ISBN (print version): 978-90-367-5232-9

Preface

I started my PhD project with a more ambitious goal than what has been achieved in this dissertation. I wanted to touch on most issues of language acquisition, developing computational models for a wide range of phenomena. In particular, I wanted to focus on models of learning linguistic 'structure', as it is typically observed in morphology and syntax. Segmentation was one of the annoying tasks that I could not easily step over, because I was also interested in morphology. So, I decided to write a chapter on segmentation. Although many people consider segmentation relatively easy (in comparison to learning syntax, for example), and it is relatively well studied, every step I took in modeling this task revealed another interesting problem that I could not just gloss over. In the end, the initial 'chapter' became the dissertation you have in front of you. I believe I have a far better understanding of the problem now, but I also have many more questions than I started with.

The structure of the project, and my wanderings in the landscape of language acquisition, did not allow me to work with many other people. As a result, this dissertation has been completed in a more independent setting than most other PhD dissertations. Nevertheless, this thesis benefited from my interactions with others. I will try to acknowledge the direct and indirect help I received during this work, but it is likely that I will fail to mention everyone. I apologize in advance to anyone whom I might have unintentionally left out.
First of all, my sincere thanks go to my supervisor, John Nerbonne. Here, I do not use the word sincere for stylistic reasons. All PhD students acknowledge their supervisor(s), but if you think they all mean it, you probably have not talked to many of them. Besides the valuable comments on the content of my work, I got the attention and encouragement I needed, when I needed it. He patiently read all my early drafts on short notice, even correcting my never-ending English mistakes.

I would also like to thank Antal van den Bosch, Petra Hendriks and Padraic Monaghan for agreeing to read and evaluate my thesis. Their comments and criticisms improved the final draft, and made me look at the issues discussed in the thesis from different perspectives. In earlier stages of my PhD project, I also received valuable comments and criticisms from Kees de Bot and Tamás Biró. Although the focus of the project changed substantially, the benefit of their comments remains. Later, regular discussions with fellow PhD students Barbara Plank, Dörte Hessler and Peter Nabende kept me on track, and I got particularly valuable comments from Barbara on some of the content presented here. Barbara and Dörte also get additional thanks for agreeing to be my paranymphs during the defense.

A substantial part of completing a PhD requires writing. Writing at a reasonable academic level is a difficult task, writing in a foreign language is even more difficult, and writing in a language in which you barely understand the basics is almost impossible. First, thanks to Daniël de Kok for helping me with the impossible, and translating the summary into Dutch. Second, I shamelessly used some people close to me for proofreading, on short notice, without much display of appreciation.
My many mistakes in earlier drafts of this dissertation were eliminated with the help of Joanna Krawczyk-Çöltekin, Arzu Çöltekin, Barbara Plank, and Asena and Giray Devlet. I am grateful for their help, as well as their friendship.

My interest in language and language acquisition that led to this rather late PhD goes back to my undergraduate years at Middle East Technical University. I am likely to omit many people here because of the many years that have passed since. However, the help I got and the things I learned from Cem Bozşahin and Deniz Zeyrek are difficult to forget. I am particularly grateful to Cem Bozşahin for his encouragement and his patience in supervising my many MSc thesis attempts.

This thesis also owes a lot to people I cannot all name here. I would like to thank those who share their data, their code and their wisdom. This thesis would not have been possible without the many freely available sources of information, tools and data that we take for granted nowadays; to name just a few: GNU utilities, R, LaTeX, CHILDES.

There is more to life than research, and you realize it more if you move to a new city. Many people made life in Groningen more pleasant during my PhD time here. First, I feel fortunate to have been in Alfa-informatica. Besides my colleagues in the department, the people of the foreign guest club and the international service desk of the university made my life more pleasant and less painful. I am reluctant to name people individually because of the certainty of omissions.
Nevertheless, here is an incomplete list of people I feel lucky to have met during this time, sorted randomly: Ellen and John Nerbonne, Kostadin Cholakov, Laura Fahnenbruck, Dörte Hessler, Ildikó Berzlánovich, Gisi Cannizzaro, Tim Van de Cruys, Daniël de Kok, Martijn Wieling, Jelena Prokić, Aysa Arylova, Barbara Plank, Martin Meraner, Gideon Kotzé, Zhenya Markovskaya, Radek Šimík, Jörg Tiedemann, Tal Caspi, Jelle Wouda.

My parents, Hoşnaz and Selçuk Çöltekin, have always been supportive, and encouraged my curiosity from the very start. My interest in linguistics likely goes back to a description of Turkish morphology among my father's notes. I still pursue the same fascination I felt when I realized there was a neat explanation for something I knew intuitively. My sister, Arzu, has always been there for me, not only as a supportive family member; I also benefited a lot from our discussions about my work, which sometimes made me feel that she should have been doing what I do. Lastly, many thanks to the two people who suffered most from my work on the thesis by being closest to me, Aska and Franek, for their help, support and patience.

Contents

List of Abbreviations

1 Introduction

2 The Problem of Language Acquisition
  2.1 The nature–nurture debate
    2.1.1 Difficulty of learning languages
    2.1.2 How quick is quick enough?
    2.1.3 Critical periods
    2.1.4 Summary
  2.2 Theories of language acquisition
    2.2.1 Parametric theories
    2.2.2 Connectionist models
    2.2.3 Usage-based theories
    2.2.4 Statistical models
  2.3 Summary and discussion

3 Formal Models
  3.1 Formal models in science
  3.2 Computational learning theory
    3.2.1 Chomsky hierarchy of languages
    3.2.2 Identification in the limit
    3.2.3 Probably approximately correct learning
  3.3 Learning theory and language acquisition
    3.3.1 The class of natural languages and learnability
    3.3.2 The input to the language learner
    3.3.3 Are natural languages provably (un)learnable?
  3.4 What do computational models explain?
  3.5 Computational simulations
  3.6 Summary

4 Lexical Segmentation: An Introduction
  4.1 Word recognition
  4.2 Lexical segmentation
    4.2.1 Predictability and distributional regularities
    4.2.2 Prosodic cues
    4.2.3 Phonotactics
    4.2.4 Pauses and utterance boundaries
    4.2.5 Words that occur in isolation and short utterances
    4.2.6 Other cues for segmentation
  4.3 Segmentation cues: combination and comparison

5 Computational Models of Segmentation
  5.1 The input
  5.2 Processing strategy
    5.2.1 Guessing boundaries
    5.2.2 Recognition strategy
  5.3 The search space
    5.3.1 Exact search using dynamic programming
    5.3.2 Approximate search procedures
    5.3.3 Search for boundaries
  5.4 Performance and evaluation
  5.5 Two reference models
    5.5.1 A random segmentation model
    5.5.2 A reference model using a language-modeling strategy
  5.6 Summary and discussion

6 Segmentation using Predictability Statistics
  6.1 Measures of predictability for segmentation
    6.1.1 Boundary probability
    6.1.2 Transitional probability
    6.1.3 Pointwise mutual information
    6.1.4 Successor variety
    6.1.5 Boundary entropy
    6.1.6 Effects of phoneme context
    6.1.7 Predicting the past
    6.1.8 Predictability measures: summary and discussion
  6.2 A predictability-based segmentation model
    6.2.1 Peaks in unpredictability
    6.2.2 Combining multiple measures and varying phoneme context
    6.2.3 Weighing the competence of the voters
    6.2.4 Two sides of a peak
    6.2.5 Reducing redundancy
  6.3 Summary and discussion

7 Learning from Utterance Boundaries
  7.1 Related work
  7.2 Do utterance boundaries provide cues for word boundaries?
  7.3 An unsupervised learner using utterance boundaries
  7.4 Combining measures and cues
  7.5 Summary

8 Learning from Past Decisions
  8.1 Using previously discovered words
    8.1.1 What makes a good word?
    8.1.2 Related work
  8.2 Using known words for segmentation: a computational model
    8.2.1 The computational model
    8.2.2 Performance
  8.3 Making use of lexical stress
    8.3.1 Related work
    8.3.2 An analysis of stress patterns in the BR corpus
    8.3.3 Segmentation using lexical stress
  8.4 Summary

9 An Overview of the Segmentation Strategies
  9.1 The measures and the models
  9.2 Performance
  9.3 Making use of prior information
  9.4 Variation in the input
  9.5 Qualitative evaluation
  9.6 The learning method
  9.7 Summary

10 Conclusions

Bibliography
Short Summary
Samenvatting in het Nederlands
A The Corpora
B Detailed Performance Results

List of Abbreviations

ANN: Artificial Neural Network
APS: Argument from poverty of stimulus
BF: Boundary F-score
BP: Boundary precision
BR: Boundary recall
BTP: Backwards (reverse) transitional probability
CDS: Child-Directed Speech
CHILDES: Child Language Data Exchange System
H: Boundary entropy
IIL: Identification in the limit (also used as 'identifiable in the limit')
LAD: Language Acquisition Device
LF: Lexical (or word-type) F-score
LP: Lexical (or word-type) precision
LR: Lexical (or word-type) recall
MDL: Minimum description length
MI: (Pointwise) mutual information
PAC: Probably approximately correct (learning)
PM: Predictability-based segmentation model
PUM: Segmentation model that uses predictability and utterance boundaries
PUWM: Segmentation model that uses predictability, utterance boundaries and known words
PUWSM: Segmentation model that uses all cues: predictability, utterance boundaries, known words and stress
PV: Predecessor value
RM: Random model
SM: Stress-based segmentation model
SRN: Simple recurrent network
SV: Successor variety
TP: Transitional probability
UG: Universal Grammar
UM: Utterance boundary-based segmentation model
VC dimension: Vapnik–Chervonenkis dimension
WF: Word F-score
WM: Lexicon-based segmentation model
WP: Word precision
WR: Word recall

1 Introduction

"A witty saying proves nothing." (Voltaire)

The aim of language acquisition research is to understand how children learn the languages spoken in their environment. This study contributes to this purpose by investigating one of the first steps of the language acquisition process, the discovery of words in the speech stream directed to children, by means of computational simulations.

We take words for granted: we identify them effortlessly when listening to others speaking a language we understand, and we use them to construct utterances possibly never uttered before. We learn the sound forms of words, associate them with meanings, and discover how to use them appropriately in the company of other words and in the presence of different people.
Despite the apparent ease with which we process and learn words, learning a proper set of words, a lexicon, to communicate effectively with our environment is a challenging task. The challenge starts with identifying these words in a continuous speech stream. Unlike written text, where we typically put white space between words, the speech signal does not contain analogously reliable markers for word boundaries.

A competent language user is aided, to some extent, by his or her knowledge of words when extracting them from a continuous stream: itisannoyingbutyouprobablycanfigureoutthewordsinthissequence. However, at the beginning of their journey to becoming competent speakers, children do not know the words of the language they are acquiring. As a result, they cannot make use of known words. If this is not convincing, try to locate the word boundaries in this sentence: eğertürkçebilmiyorsanızbudizidekisözcükleribulmanızçokzor. This is approximately what happens when you hear an unfamiliar language (in this case, Turkish). Without knowing the words of the input language, discovering words in a continuous speech stream does not seem possible, which leads to a chicken-and-egg problem. In spoken language, we are not as helpless as with this written stream of letters: there are several acoustic cues that indicate word boundaries. However, although these cues correlate with the boundaries, they are known to be insufficient, noisy and sometimes in conflict with each other. Furthermore, the cues are language dependent; that is, one needs to know the boundaries to learn when and how these cues correlate with the boundaries. We are back to the chicken-and-egg problem again.
Fortunately, there are also some simple and general segmentation strategies that seem to work universally for all languages. As a byproduct of the fact that the natural speech stream is formed by concatenating words, the flow of basic units (such as syllables or phonemes) in an utterance follows certain statistical regularities. In particular, the basic units within words predict one another in sequence, while units across boundaries do not. It is even more encouraging that children seem to be sensitive to these statistics at a very young age. Another source of information for word boundaries that does not require knowledge of words in advance comes from utterance boundaries. Utterance boundaries are also word boundaries, and words are formed according to certain regularities; for example, they share common beginnings and endings. This provides another means of discovering words before knowing them. Once we start discovering words using these general strategies, we can also learn to use the language-specific
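The predictability idea sketched above, that within-word transitions are more predictable than transitions across word boundaries, can be made concrete with a small sketch. The toy example below uses a hypothetical two-word artificial language (illustrative data only, not the author's model or corpus) and computes the transitional probability discussed in chapter 6, the conditional probability of the next phoneme given the current one:

```python
from collections import Counter

def transitional_probabilities(utterance, bigrams, unigrams):
    """TP for each adjacent symbol pair: P(next | current) = count(xy) / count(x)."""
    return [bigrams[(a, b)] / unigrams[a] for a, b in zip(utterance, utterance[1:])]

# Hypothetical toy language with two "words", 'bidu' and 'kopa'.
utterances = ["bidukopa", "kopabidu", "bidubidu", "kopakopa"]

# Collect symbol statistics from the unsegmented stream.
unigrams, bigrams = Counter(), Counter()
for u in utterances:
    unigrams.update(u[:-1])            # count only symbols that have a successor
    bigrams.update(zip(u, u[1:]))

tps = transitional_probabilities("bidukopa", bigrams, unigrams)
print(tps)  # [1.0, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0]
# The single dip (0.5) falls exactly at the word boundary, between 'u' and 'k';
# all word-internal transitions are fully predictable in this toy lexicon.
```

A simple segmentation strategy, then, is to posit boundaries at local minima of TP; chapter 6 examines this measure alongside alternatives such as pointwise mutual information, successor variety and boundary entropy.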

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    215 pages
