Quickassist Extensive Reading for Learners of German Using CALL Technologies
Total Page:16
File Type:pdf, Size:1020Kb
QuickAssist Extensive Reading for Learners of German Using CALL Technologies by Peter Wood A thesis presented to the University of Waterloo in fulfilment of the thesis requirement for the degree of Doctor of Philosophy in German Waterloo, Ontario, Canada, 2010 c Peter Wood 2010 Author’s Declaration I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions , as accepted by the examiners. I understand that my thesis may be made electronically available to the public. ii Abstract The focus of this dissertation is the development and testing of a CALL tool which assists learners of German with the extensive reading of German texts of their choice. The appli- cation provides functionality that enables learners to acquire new vocabulary, analyse the meaning of complex word forms and to study a word’s semantic and syntactic features with the help of corpora and online resources. It is also designed to enable instructors to create meaningful exercises to be used in classroom activities focusing on vocabulary acquisition and word formation rules. The detailed description of the software development and implementation is preceded by a review of the relevant literature in the areas of German morphology and word forma- tion, second language acquisition and vocabulary acquisition in particular, studies on the benefits of extensive reading, the role of motivation in second language learning, CALL, and natural language processing technologies. The user study presented at the end of this dissertation shows how a first test group of learners was able to use the application for individual reading projects and presents the results of an evaluation of the sortware conducted by three German instructors assessing the affordances of the applications for students and potential applications for language instructors. iii Acknowledgements For all his help and support, I would like to thank my thesis supervisor Dr. Mathias Schulze. During my years at the University of Waterloo, and later from afar, he did not only help me with any problems with respect to my dissertation, he also taught me what it takes to be a diligent researcher and instructor. I am grateful that I was able to benefit from his deep knowledge and understanding of the subject matter. Thank you for always treating my like a colleague and friend. I would also like to thank everybody in the Germanic and Slavic Studies Department for the teaching, training, and administrative support they provided me and for giving my the chance of pursuing my graduate studies in Canada. For their support and understanding in the months it took to complete the work, I would also like to thank my colleagues here at the University of Saskatchewan. A special thank you goes out to our chair, Richard Julien, who not only took the time proofread the entire manuscript, but also supplied me with abundant amounts of coffee and took me on the odd motorbike tours in order to keep me sane over the last year. The software developers at the Technical Group at the Max Planck Institute for Psy- cholinguistics in Nijmegen, the Netherlands taught me what it takes to turn a n00b into a geek for which I will always be grateful. For putting up with my quirks through all these years and lending me her love and moral support, I would like to thank my wife Christine. There were times when I thought that I would never finish this dissertation. Without her, I would never have. iv Dedication To Christine v Table of Contents List of Figures xii List of Tables xiv Nomenclature xv 1 Introduction 1 2 Theoretical background 11 2.1 Overview . 11 2.2 Morphology . 12 2.2.1 What is morphology? . 12 2.2.2 The need of formal accounts of morphology . 14 2.2.3 Generative grammar - formal accounts of morphology . 15 2.2.4 Available accounts of German morphology . 20 2.3 German word formation . 25 2.3.1 German word formation rules . 25 2.3.2 Units of word formation . 28 vi 2.3.3 Compounding . 33 2.3.3.1 Endocentric Compounds . 35 2.3.3.2 Exocentric compounds . 39 2.3.3.3 Appositional compounds . 41 2.3.3.4 Contaminations . 42 2.3.3.5 Reduplications . 43 2.3.4 Explicit derivations . 44 2.3.5 Conversion . 64 2.3.6 Implicit derivation . 65 2.3.7 Reductions . 65 2.3.8 Remotivations and play-on-words . 66 2.3.9 Summary . 68 2.4 Vocabulary acquisition in a foreign language . 69 2.4.1 What is SLA? . 69 2.4.2 Vocabulary acquisition . 82 2.4.3 How many words does a particular language have? . 84 2.4.4 How many words does the average native speaker know? . 87 2.4.5 How many words are necessary to communicate? . 89 2.4.6 How many words are necessary to comprehend a text? . 90 2.4.7 What does it mean to know a word? . 100 2.4.8 Are all words equally hard or easy to learn? . 104 2.4.9 Effective ways to extend the vocabulary range . 106 vii 2.4.10 Extensive reading . 108 2.4.11 Motivation . 112 2.5 Theory and practice . 114 2.5.1 Vocabulary acquisition: theory and practice . 115 2.5.2 Extensive Reading . 118 2.5.3 Conclusion . 118 3 Computer Assisted Language Learning 122 3.1 Theory and practice in CALL . 123 3.2 The role of computers in CALL . 128 3.2.1 Learner Independence . 129 3.3 ICALL . 132 3.3.1 Tokenizers . 134 3.3.2 Lemmatizers . 135 3.3.3 Morphological analysers . 136 3.3.4 Part of speech (POS) taggers . 138 3.3.5 Parsers . 139 3.3.6 Natural language corpora . 141 3.3.6.1 What are corpora . 141 3.3.6.2 Corpora and CALL . 147 3.3.7 Lexical tools . 149 viii 4 Development 150 4.1 The Design of QuickAssist . 150 4.2 Design principles . 151 4.2.1 Open source software and reusable software components . 151 4.3 Similar software . 154 4.4 Finding a suitable programming language . 165 4.5 Finding suitable components . 168 4.5.1 Java Components . 168 4.5.1.1 The Standard Widget Toolkit (SWT) . 168 4.5.1.2 The Derby Database . 170 4.5.2 NLP Components . 171 4.5.2.1 Description of the Corpus . 171 4.5.2.2 Description of the Wordform list . 176 4.5.2.3 Other NLP components . 178 4.6 Architecture . 179 5 Implementation 182 6 User Study 191 6.1 Student study . 193 6.1.1 Student walkthrough . 193 6.1.1.1 User One . 196 6.1.1.2 User Two . 198 ix 6.1.1.3 User Three . 199 6.1.1.4 User Four . 200 6.1.1.5 Findings . 202 6.2 Instructor study . 204 6.3 Results . 206 7 Conclusions 209 7.1 Question 1 . 209 7.2 Question 2 . 211 7.3 Question 3 . 212 7.4 Reflections on the development . 212 7.5 Reflections on the study . 214 7.6 Future plans . 215 Appendices 218 Appendix 1: Letter to Instructors . 219 Appendix 2: Recruitment Script . 223 Appendix 3: Letter to Students . 225 Appendix 4: Instructor Questionnaire . 229 Appendix 5: Feedback Letter . 231 Appendix 6: Instructor Study . 233 Appendix 6.1: Answers Provided by Instructor One . 233 Appendix 6.2: Answers Provided by Instructor Two . 235 Appendix 6.3: Answers Provided by Instructor Three . 237 x References 239 xi List of Figures 1.1 Startup Screen . .3 2.1 Analysis of ’Apfelkuchenguss’ - hierarchical structure . 36 2.2 Analysis of ’Apfelkuchenguss’ - flat structure . 37 2.3 German word frequency coverage - using the 100 most frequent words . 93 2.4 German word frequency coverage - using the 500 most frequent words . 94 2.5 German word frequency coverage - using the 1000 most frequent words . 95 2.6 German word frequency coverage - using the 5000 most frequent words . 96 2.7 German word frequency coverage - using the 10000 most frequent words 97 2.8 German word frequency coverage - original text . 98 4.1 Glosser Start page . 154 4.2 Glosser User Interface . 155 4.3 Cyberbuch . 157 4.4 Alpheios . 157 4.5 Word Manager: homepage . 159 4.6 Word Manager: display of related words . 160 xii 4.7 Word Manager: morphological analysis of words that are not listed in the dictionary . 160 4.8 Wortschatz: homepage . 162 4.9 Wortschatz: information on a word . 163 4.10 Wortschatz: corpus look-up and information on co-occurrences of a word 163 4.11 Wortschatz also provides some information on words with unconven- tional morphology . 164 4.12 Architecture of QuickAssist . ..