UNIVERSITY OF CALIFORNIA, SAN DIEGO

A Connectionist Model of the Effect of Pro-drop on SVO Languages

A Dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Linguistics and Cognitive Science

by

Ezra Laurens Van Everbroeck

Committee in charge:

Professor Maria Polinsky, Chair
Professor Garrison Cottrell, Co-Chair
Professor Jeffrey Elman
Professor Andrew Kehler
Professor Ronald Langacker

2007

Signature Page

The Dissertation of Ezra Laurens Van Everbroeck is approved, and it is acceptable in quality and form for publication on microfilm:

Co-Chair

Chair

University of California, San Diego
2007

Dedication

This dissertation is dedicated to Gary and Masha for simply refusing to let me quit.

Epigraph

A man said to the universe:
"Sir, I exist!"
"However," replied the universe,
"The fact has not created in me
A sense of obligation."

Stephen Crane

Table of Contents

Signature Page
Dedication
Epigraph
Table of Contents
List of Figures
List of Tables
Acknowledgments
Vita
Abstract
Chapter 1. Introduction
Chapter 2. Linguistic parameters
  2.1 Word order
  2.2 Morphological marking
    2.2.1 Nominal case markers
    2.2.2 Verb markers: agreement and T/A/M
  2.3 Pro-drop
  2.4 Summary
Chapter 3. Methods
  3.1 The artificial languages
  3.2 Network architecture
  3.3 Training and testing
  3.4 Summary
Chapter 4. Experiment 1
  4.1 Network results
  4.2 Linguistic discussion
    4.2.1 Creoles
    4.2.2 Mandarin Chinese
  4.3 Summary
Chapter 5. Experiment 2
  5.1 Network results
  5.2 Linguistic discussion
    5.2.1 Early grammar acquisition
    5.2.2 Acquiring Mandarin Chinese
  5.3 Summary
Chapter 6. Experiment 3
  6.1 Network results
    6.1.1 The ‘easy’ language types
    6.1.2 The ‘difficult’ language type
  6.2 Linguistic discussion
    6.2.1 Nouns first or verbs first in pro-drop languages
    6.2.2 Generalization in language development
  6.3 Summary
Chapter 7. Experiment 4
  7.1 Network results
  7.2 Linguistic discussion: homonymy in natural language
  7.3 Summary
Chapter 8. General Discussion
  8.1 Word order vs. morphological marking vs. lexical identity
  8.2 The Competition Model
  8.3 Probabilistic linguistics
  8.4 Optimality Theory
Chapter 9. Conclusion
  9.1 Summary of the experiments
    9.1.1 Experiment 1
    9.1.2 Experiment 2
    9.1.3 Experiment 3
    9.1.4 Experiment 4
  9.2 The big picture
  9.3 Future work
References

List of Figures

Figure 1. Architecture of the neural network.
Figure 2. Word-by-word analysis of the performance on the four sentence types of the networks learning the difficult language type after 10 training cycles.
Figure 3. Word-by-word analysis of the performance on the four sentence types of the networks learning the difficult language type after 200 training cycles.
Figure 4. Word-by-word analysis of how the networks that were trained on the difficult language type for 30 epochs parsed test sentences with novel verbs.
Figure 5. The interaction between amounts of training and homonymy.

List of Tables

Table 1. Estimated frequencies of basic word orders in the languages of the world.
Table 2. The space of possible language types defined by the three linguistic parameters in our simulations.
Table 3. The actual realization of subject noun phrases in the artificial languages.
Table 4. Possible sentence structures in simple SVO languages with and without pro-drop.
Table 5. Results of Experiment 1.
Table 6. Natural language counterparts to the language types defined by the three linguistic parameters in our simulations.
Table 7. Results of Experiment 2.
Table 8. Average percentage of test sentences analyzed correctly for an increasing number of training epochs.
Table 9. Percentages of reversible transitive sentences processed correctly by children learning four different languages.
Table 10. Results from Experiment 3: novel nouns (30 epochs).
Table 11. Results from Experiment 3: novel verbs (30 epochs).
Table 12. Results from Experiment 3: novel nouns + verbs (30 epochs).
Table 13. The performance of the networks learning the difficult language type on four different test corpora after 10 and 30 epochs of training.
Table 14. The impact of novel verbs on all the language types.
Table 15. The impact of novel nouns on all the language types.
Table 16. Results from Experiment 4.
Table 17. Frequency of noun/verb homonymy in English and Mandarin.
Table 18. Frequency of nouns, verbs, and noun/verb homonyms in child-directed and adult English.

Acknowledgments

There is something to be said for taking eleven years to complete a Ph.D. program.
It may not have been particularly efficient, but doing it the leisurely way has given me much more time to enjoy the company, teachings, and advice of a large number of people in San Diego. I have no doubt that the memories of them will stay with me far longer than the intricacies of many theories.

First and foremost I want to thank my co-chairs, Gary Cottrell and Masha Polinsky, for having supported me as a student and as a person for more than a decade. Masha exposed me to a world of typological data in my first quarter at UCSD, and the need to account for the diversity of the world’s languages became a cornerstone for this dissertation. Gary let me join his unbelievable research unit in the Summer of 1998 – I am positive no linguist knows as much about the leech’s central nervous system as I do – and proceeded to teach me more about machine learning than I was ever able to retain. In more recent years, they exhibited endless patience as they encouraged, coaxed, and occasionally threatened me into finishing this dissertation. I will never understand how they put up with my lack of progress for so long, but I do know that I am extremely lucky to have had them as advisers.

Next, I want to express my gratitude to the other members of the committee, and not only for being willing to read this dissertation critically. Ron Langacker’s work on cognitive linguistics is the main reason I came to UCSD, and to this day I believe it is the only linguistic framework that is fundamentally sound. Along similar lines, Jeff Elman’s articles from the early 90s describing how simple recurrent networks can learn linguistic structures are why I took up connectionist modeling. Finally, Andy Kehler’s academic influence can be found in the section on probabilistic linguistics in the general discussion. I also owe him for many years of excellent