Inquiries Into Words, Constraints and Contexts
Total Page:16
File Type:pdf, Size:1020Kb
Inquiries into Words, Constraints and Contexts Inquiries into Words, Constraints and Contexts Festschrift for Kimmo Koskenniemi on his 60th Birthday Edited by Antti Arppe Lauri Carlson Krister Lindén Jussi Piitulainen Mickael Suominen Martti Vainio Hanna Westerlund Anssi Yli-Jyrä Technical Editing by Juho Tupakka Inquiries into Words, Constraints and Contexts. Festschrift for Kimmo Koskenniemi on his 60th Birthday. Individual articles Copyright c 2005, by their authors. Photographs Copyright c Koskenniemi Family. Editorial Board: . Antti Arppe, Secretary-Treasurer . Lauri Carlson, Chairman . Krister Lindén . Jussi Piitulainen . Mickael Suominen, Editorial Secretary . Martti Vainio . Hanna Westerlund . Anssi Yli-Jyrä Technical Editor: Juho Tupakka Image scanning and enhancing: Markus Koljonen CSLI Studies in Computational Linguistics ONLINE Series Editor: Ann Copestake ISSN 1557-5772 Available on-line at: http://cslipublications.stanford.edu/site/SCLO.html CSLI Publications Ventura Hall Stanford University Stanford, CA 94305 Contents Acknowledgements and Notices viii Tabula Gratulatoria ix Kimmo Koskenniemi’s first 60 years xiv FRED KARLSSON Publications by Kimmo Koskenniemi xviii I Morphology, Syntax and Automata 1 1 The Very Long Way from Basic Linguistic Research to Commercially Successful Language Business: the Case of Two-Level Morphology 2 ANTTI ARPPE 2 Inducing a Morphological Transducer from Inflectional Paradigms 18 LAURI CARLSON 3 The DDI Approach to Morphology 25 KENNETH W. CHURCH 4 Finite-StateParsingofGerman 35 ERHARD W. HINRICHS v vi/ INQUIRIES INTO WORDS,CONSTRAINTS AND CONTEXTS 5 Solutions for Handling Non-concatenative Processes in Bantu Languages 45 ARVI HURSKAINEN 6 A Method for Tokenizing Text 55 RONALD M. KAPLAN 7 Phonotactic Complexity of Finnish Nouns 65 FRED KARLSSON 8 Twenty-FiveYearsofFinite-StateMorphology 71 LAURI KARTTUNEN AND KENNETH R. BEESLEY 9 LocalGrammarAlgorithms 84 MEHRYAR MOHRI 10 TwolatWork 94 SJUR MOSHAGEN, PEKKA SAMMALLAHTI AND TROND TROSTERUD 11 Two Notions of Parsing 106 JOAKIM NIVRE 12 Computational Morphologies for Small Uralic Languages 116 GÁBOR PRÓSZÉKY AND ATTILA NOVÁK 13 MorphologyandtheHarappanGods 126 RICHARD SPROAT AND STEVE FARMER 14 Consonant Gradation in Estonian and Sámi: Two-Level Solution 136 TROND TROSTERUD AND HELI UIBO 15 RUSTWOL: A Tool for Automatic Russian Word Form Recognition 151 LIISA VILKKI 16 Combining Regular Expressions with Near-Optimal Automata in the FIRE Station Environment 163 BRUCE W. WATSON,MICHIEL FRISHERT AND LOEK CLEOPHAS 17 Linguistic Grammars with Very Low Complexity 172 ANSSI YLI-JYRÄ CONTENTS / vii II Speech and Meaning 184 18 Speech is Special: What’s Special about It? 185 OLLI AALTONEN AND ESA UUSIPAIKKA 19 Data Mining Meets Collocations Discovery 194 HELENA AHONEN-MYKA AND ANTOINE DOUCET 20 Semantic Morphology 204 BJÖRN GAMBÄCK 21 Morphological Processing in Mono- and Cross-Lingual Information Retrieval 214 KALERVO JÄRVELIN AND ARI PIRKOLA 22 A Grammar for Finnish Discourse Patterns 227 KRISTIINA JOKINEN 23 Meaningful Models for Information Access Systems 241 JUSSI KARLGREN 24 WordSenses 249 KRISTER LINDÉN 25 Exploring Morphologically Analysed Text Material 259 MIKKO LOUNELA 26 Information Structure and Minimal Recursion Semantics 268 GRAHAM WILCOCK 27 Developing a Dialogue System that Interacts with a User in Estonian 278 HALDUR ÕIM AND MARE KOIT 28 The Story of Supposed Hebrew-Finnish Affinity - a Chapter in the History of Comparative Linguistics 289 TAPANI HARVIAINEN Acknowledgements and Notices A paper pre-print of this electronic publication was presented to Pro- fessor Kimmo Koskenniemi at a special Colloquium on Friday, Septem- ber 2nd, 2005, at the Auditorium of the Arppeanum building of the University of Helsinki, arranged in association with the Workshop on Finite State Methods for Natural Language Processoring (FSMNLP) <http://www.ling.helsinki.fi/events/FSMNLP2005/>. This final electronic version of the publication contains a few minor revisions, as compared to the pre-printed paper version. The chapters affected by these revisions are the following: 1, 17, 20, 22, 25, and 26. The Editorial Board wants to express its appreciation for CSC - Scientific Computing Finland, Lingsoft, Inc., Connexor Oy, and Oy International Busi- ness Machines Ab for their generous financial support which greatly con- tributed to the successful realisation of this project. Moreover, the Editorial Board expresses its gratitude to Ann Copestake, Dikran Karagueuzian and CSLI Publications for graciously agreeing to pub- lish this Festschrift in their online series. Furthermore, the Editorial Board wishes to thank Leena Barros and the Office of the Faculty of Arts at the University of Helsinki for administrative support in setting up and running this project. Finally, the Editorial Board would like to thank the Koskenniemi family for providing the photographs from Kimmo’s past and present that are included this Festschrift. Helsinki, November 11th, 2005, AnttiArppe LauriCarlson KristerLindén Jussi Piitulainen Mickael Suominen Juho Tupakka Martti Vainio Hanna Westerlund Anssi Yli-Jyrä viii Tabula Gratulatoria Prof. Olli Aaltonen, University of Turku, Finland Anders Ahlqvist Prof. Helena Ahonen-Myka, University of Helsinki, Finland Anu Airola and Kari Lehtonen Antti Arppe Lili Aunimo and Reeta Kuuskoski, University of Helsinki, Finland Academy Prof. Ralph-Johan Back, Academy of Finland / Department of Computer Science, Åbo Akademi University Lars Backström Dr. Kenneth R. Beesley, Xerox Research Centre Europe, France Dr. phil. Eckhard Bick, Institute of Language and Communication, University of Southern Denmark, Odense Prof. Lars Borin, Göteborg University, Sweden Lauri Carlson Andrew Chesterman, University of Helsinki Kenneth Ward Church, Microsoft Research, Redmond, USA Loek Cleophas, Technische Universiteit Eindhoven, The Netherlands Robin Cooper, Department of Linguistics, Göteborg University, Sweden Mathias Creutz, Helsinki University of Technology, Finland Antoine Doucet, University of Helsinki, Finland Ph.D Steve Farmer Michiel Frishert Björn Gambäck, Stockholm, Sweden Prof. Tapani Harviainen, University of Helsinki, Finland Orvokki Heinämäki Prof. Dr. Irmeli Helin, University of Helsinki, Finland ix x/INQUIRIES INTO WORDS,CONSTRAINTS AND CONTEXTS Prof. Dr. Erhard W. Hinrichs, Eberhard-Karls-Universität Tübingen, Germany Henrik Holmboe, NorFA/NordForsk, Denmark Prof. Dr. Timo Honkela, Helsinki University of Technology, Finland Prof. Dr. Arvi Hurskainen, Institute for Asian and African Studies, University of Helsinki, Finland Sari Hyvärinen, University of Helsinki, Finland Matti Ihamuotila Academy Prof. Kalervo Järvelin, University of Tampere, Finland Prof. Janne Bondi Johannessen, University of Oslo, Norway Prof. Kristiina Jokinen, University of Helsinki, Finland Dr. Ronald M. Kaplan, Palo Alto Research Center, USA Jussi Karlgren, Stockholm, Sweden Fred Karlsson and Sylvi Soramäki-Karlsson Prof. Dr. Lauri Karttunen, Stanford University, Palo Alto, USA Dr Matti Keijola, Helsinki University of Technology, Finland Prof. Pekka Kilpeläinen, University of Kuopio, Finland Erkki I. Kolehmainen, Helsinki, Finland Mare Koit, University of Tartu, Estonia Jouko and Eeva Koskenniemi Osmo and Liisa Koskenniemi Seppo Koskenniemi and Hely von Konov-Koskenniemi Pirkko Kukkonen PhD Timo Lahtinen Ritva Laury Greger Lindén, University of Helsinki, Finland Krister Lindén Prof. Jouko Lindstedt, University of Helsinki, Finland Mikko Lounela Urho Määttä Matti Miestamo Manne Miettinen, CSC – Scientific Computing Ltd., Finland Prof. Mehryar Mohri, Courant Institute of Mathematical Sciences, USA Cand.philol. Sjur Nørstebø Moshagen, Helsinki, Finland / The Sámi Parliament, Norway Jussi Niemi Prof. Joakim Nivre, Växjö, Sweden Attila Novák, MorphoLogic, Hungary TABULA GRATULATORIA / xi Adjunct Prof. Martti Nyman, Turku, Finland Prof. Haldur Õim, University of Tartu, Estonia Marjatta and Asko Parpola, Helsinki Simo and Sisko Parpola, Helsinki Jussi Piitulainen Dr. Ari Pirkola Helena Pirttisaari, University of Helsinki, Finland Kari K. Pitkänen PhD Gábor Prószéky, MorphoLogic, Hungary CEO Juhani Reiman, Lingsoft, Inc. Mikael Reuter, Helsingfors, Finland Prof. Eiríkur Rögnvaldsson, University of Iceland, Reykjavík Otto-Ville Ronkainen Klaas Ruppel, Research Institute for the Languages of Finland, Helsinki Pekka Sammallahti, Ohcejohka, Finland Kaius Sinnemäki Koenraad de Smedt and Victoria Rosén, Bergen, Norway Richard Sproat, University of Illinois at Urbana-Champaign, USA Dr. Doc. Pirkko Suihkonen, University of Helsinki, Finland Mickael Suominen Markku Syrjänen, Helsinki University of Technology, Finland PhD Pasi Tapanainen, Connexor Oy, Helsinki, Finland Lauri Tarkkonen Bertil Tikkanen, University of Helsinki, Finland Dr. Trond Trosterud, University of Tromsø, Norway Juho Tupakka Heli Uibo, University of Tartu, Estonia Kira and Esko Ukkonen Prof. Esa Uusipaikka, University of Turku, Finland Martti Vainio CDO, Phil. Lic. Simo Vihjanen, Lingsoft, Inc. Liisa Vilkki, University of Helsinki, Finland Prof. Dr. Martin Volk, Stockholm University, Sweden PhD Atro Voutilainen, Connexor Oy, Helsinki, Finland Prof. Dr. Bruce W. Watson, University of Pretoria, South Africa Jürgen Wedekind, University of Copenhagen, Denmark Stefan Werner, University of Joensuu, Finland Hanna Westerlund Dr. Graham Wilcock, University of Helsinki, Finland Anssi Yli-Jyrä xii / INQUIRIES INTO WORDS,CONSTRAINTS AND CONTEXTS Institutions Aspekti ry., Helsinki, Finland Chair of Language Technology, University of Tartu,