Czech and Higher Order Grammar
CZECH CLITICS IN HIGHER ORDER GRAMMAR

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Jiri Hana, Mgr., RNDr.

The Ohio State University
2007

Dissertation Committee:
Professor Carl Pollard, Adviser
Professor Brian Joseph
Professor Detmar Meurers
Professor Michael White

Approved by: Adviser, Graduate Program in Linguistics

© Copyright by Jiri Hana 2007

ABSTRACT

This thesis has three interrelated goals.

The main goal is an analysis of Czech clitics, units of grammar on the borderline between morphology and syntax with rather peculiar ordering properties both relative to the whole clause and to each other. We examine the actual set of clitics, their rather rigid ordering properties, and finally the properties of so-called clitic climbing. The analysis evaluates previous research, but it also provides new insights, especially regarding the position of the clitic cluster and the constraints on clitic climbing. We show that many of the constraints regarding the position of the clitic cluster suggested in previous research do not hold. We also argue that cases when clitics do not follow the first constituent are in fact not exceptions in clitic placement but instead unusual frontings.

The second goal is the development of a framework within Higher Order Grammar (HOG) supporting a transparent and modular treatment of word order. Unlike previous versions of HOG, we work with signs (containing phonological, syntactic and potentially other information) as actual objects of the grammar. Apart from that, we build on the simplicity and elegance of the pre-formal part of the linearization framework within Head-driven Phrase Structure Grammar.

Finally, the third objective is to test the result of the second goal by applying it to the results of the first goal.
To my grandfather František Andrlík

ACKNOWLEDGMENTS

The thesis did not emerge in a vacuum and I would like to thank:

My advisor, Carl Pollard, for endless hours of discussions, for his willingness to explore my ideas even when he disagreed with the premises, for those simple questions which showed in an instant that I did not have a clue what I was talking about.

Brian Joseph, Detmar Meurers and Mike White for serving on my committee, for their insights, comments, flexibility and suggestions.

Daniel Collins, for serving as the Grad School representative, for reading the thesis in the very short time given to him by the Grad School and for his comments and suggestions.

Petr Sgall, for introducing me to linguistics, for feedback on early versions of the clitic analysis and for all the discussions about Czech syntax.

Alexander Rosen, for extensive feedback on the analysis of Czech clitics, for all his examples, whether they supported or undermined my conclusions.

Andrea Sims and Richard Janda for comments on previous versions of the analysis of clitics.

Many other people who helped me with various linguistic and technical problems, most notably Mary Beckman, Mike Daniels, Markus Dickinson, Jakub Dotlačil, Eva Hajičová, Pavel Kosek, Denisa Lenertová, Marcela Michálková, Jarmila Panevová, Petr Savický, Shari Speer, Jana Šindlerová, Ludmila Uhlířová, Radoslav Večerka, Lída Veselovská, Šárka Zikánová.

Peter Culicover, who showed me that it is really possible to work effectively. I should have recalled more often his advice that it is easier to get lost in a long sentence than in a short one.

Chris Brew, for his why questions.

Anna for feedback, for being a friend and a colleague.

I would not have survived sane if it weren't for our pub evenings with George, Detmar, Luiz and Martin. I will really miss the heated discussions about free trade, wars, airlines or just about anything else.
I would also like to thank Saša for similar support in Prague.

Finally, I would like to thank Hanka, Filip and Matýsek for too many things to list here, but especially for their patience.

VITA

1998         Mgr, Computer Science, Charles University, Prague
2001         RNDr, Computer Science, Charles University, Prague
1998–2001    Researcher, Charles University, Prague
2001–2006    Fulbright Fellow
2001–2002    University Fellow, The Ohio State University
2002–2006    Graduate Research and Teaching Associate, The Ohio State University
2006–2007    Presidential Fellow, The Ohio State University

PUBLICATIONS

Research Publications

Hana, J., A. Feldman, L. Amaral, and C. Brew (2006). Tagging Portuguese with a Spanish Tagger Using Cognates. In Proceedings of the Workshop on Cross-language Knowledge Induction, 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2006), Trento, Italy.

Feldman, A., J. Hana, and C. Brew (2006). A cross-language approach to rapid creation of new morpho-syntactically annotated resources. In Proceedings of the fifth international conference on Language Resources and Evaluation (LREC 2006). Genoa, Italy.

Feldman, A., J. Hana, and C. Brew (2006). Experiments in Morphological Annotation Transfer. In A. Gelbukh (Ed.), Proceedings of Computational Linguistics and Intelligent Text Processing (CICLing), Lecture Notes in Computer Science. Springer-Verlag.

Hana, J. (2004). Czech clitics in Higher Order Grammar. In A. D. Sims and M. Whiting (Eds.), Proceedings of the First Graduate Colloquium on Slavic Linguistics, November 8, 2003, at the Ohio State University; Working Papers in Slavic Studies, Volume 3.
Columbus, Ohio: Department of Slavic and East European Languages and Literatures. Significantly revised in 2005; backdated to 2004.

Hana, J. and A. Feldman (2004). Portable Language Technology: The case of Czech and Russian. In Proceedings of the Midwest Computational Linguistics Colloquium, June 25-26, 2004, Bloomington, Indiana.

Hana, J., A. Feldman, and C. Brew (2004). A Resource-light Approach to Russian Morphology: Tagging Russian using Czech resources. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2004, Barcelona, Spain.

Pollard, C. and J. Hana (2003). Ambiguity, neutrality, and coordination in Higher Order Grammar. In G. Jaeger, P. Monachesi, G. Penn, and S. Wintner (Eds.), Proceedings of Formal Grammar, Wien, pp. 125–136.

Kruijff, G.-J., E. Teich, J. Bateman, I. Kruijff-Korbayová, H. Skoumalová, S. Sharoff, L. Sokolova, T. Hartley, K. Staykova, and J. Hana (2000). A multilingual system for text generation in three Slavic languages. In Proceedings of the 18th Conference on Computational Linguistics (COLING), July 31 - August 4 2000, pp. 474–480. Universität des Saarlandes, Saarbrücken, Germany.

FIELDS OF STUDY

Major Field: Linguistics

TABLE OF CONTENTS

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

Chapters:

1. Introduction
   1.1 Word Order, Clitics
   1.2 Higher Order Grammar
   1.3 Roadmap

2. Higher Order Grammar
   2.1 Basic properties
   2.2 Formalism of HOG informally
   2.3 Signs
   2.4 Tectogrammar
   2.5 Phenogrammar
   2.6 Semantics
   2.7 Lexicon
   2.8 Combining signs
   2.9 A complete toy grammar of English
   2.10 Comparison with other approaches

3. Basics of Czech word order
   3.1 Free word order
   3.2 Elements with restricted word order
   3.3 Information structure and Information Packaging
   3.4 Fronting

4. Czech Special Clitics
   4.1 Clitics in General
   4.2 Basic Characteristics of Czech special clitics
   4.3 The set of Czech clitics
   4.4 Position of the main clitic cluster
   4.5 Morpholexical ordering
   4.6 Clitic Climbing

5. Czech in HOG
   5.1 Simple Czech Tectogrammar
   5.2 Combining signs II.
   5.3 Inspiration from HPSG
   5.4 Linearization in HOG
   5.5 Czech word order in HOG
   5.6 Clitics

6. Conclusion