Proceedings of the Ninth International Workshop on Parsing Technologies
Total Page:16
File Type:pdf, Size:1020Kb
IWPT-05 Proceedings of the Ninth International Workshop on Parsing Technologies 9–10 October 2005 Vancouver, British Columbia, Canada Production and Manufacturing by Omnipress Inc. Post Office Box 7214 Madison, WI 53707-7214 c 2005 The Association for Computational Linguistics Order copies of this and other ACL proceedings from: Association for Computational Linguistics (ACL) 75 Paterson Street, Suite 9 New Brunswick, NJ 08901 USA Tel: +1-732-342-9100 Fax: +1-732-342-9339 [email protected] ii Preface The 9th International Workshop on Parsing Technologies continues the tradition that was started with the first workshop with that name, organized by Masaru Tomita in 1989 in Pittsburgh. IWPT’89 was followed by seven workshops in a biennial rhythm, only slightly disturbed by fear of millenium problems in 1999: • IWPT’91 in Cancun, Mexico • IWPT’93 in Tilburg, The Netherlands • IWPT’95 in Prague, Czech Republic • IWPT’97 in Cambridge (MA), USA • IWPT 2000 in Trento, Italy • IWPT 2001 in Beijing, China • IWPT 2003 in Nancy, France Over the years, the IWPT workshops have become the major forum for researchers in natural language parsing to meet and discuss advances in the field. Based on these workshops, four books on parsing technologies have been published by Kluwer Academic Publishers. The most recent one, based on IWPT 2000 and IWPT’01 was published last year (New Developments in Parsing Technology, Harry Bunt, John Carroll and Giorgio Satta, editors). In 1994 the Special Interest Group on Parsing (SIGPARSE) was set up within ACL with the primary aim to give continuity to the IWPT series, and ever since the IWPT workshops have taken place under the auspices of SIGPARSE. IWPT 2005 also marks the return of IWPT to North America; the combined conferences on Human Language Technology and Empirical Methods in Natural Language Processing providing an excellent opportunity for this. I would like to thank Alon Lavie and the ACL organisation, in particular Priscilla Rasmussen, for their wonderful and professional help in organizing IWPT 2005 as a satellite event of the joint HLT/EMNLP conference. Thanks are also due to the members of the Program Committee, and in particular to chairman Rob Malouf, for the careful work in reviewing submitted papers and designing the workshop program. Of a total of 42 submitted papers, 17 were accepted for presentation as a full paper, which gives an acceptance rate of 40%. Harry Bunt SIGPARSE Officer and IWPT 2005 General Chair iii General Chair: Harry Bunt (Tilburg University, Netherlands) Program Chair: Robert Malouf (San Diego State University, USA) Logistic Arrangements Chair: Alon Lavie (Carnegie-Mellon University, Pittsburgh, USA) Program Committee: Harry Bunt (Tilburg University, Netherlands) Bob Carpenter (Alias-i, Inc., Brooklyn, NY, USA) John Carroll (University of Sussex, Brighton, UK) Eugene Charniak (Brown University, Providence, RI, USA) Aravind Joshi (University of Pennsylvania, Philadelphia, USA) Ronald Kaplan (Xerox Palo Alto Research Center, USA) Martin Kay (Xerox Palo Alto Research Center, USA) Dan Klein (UC Berkeley, USA) Sadao Kurohashi (University of Tokyo, Japan) Alon Lavie (Carnegie-Mellon University, Pittsburgh, USA) Robert Malouf (San Diego State University, USA) Yuji Matsumoto (Nara Institute of Science and Technology, Japan) Paola Merlo (University of Geneva, Switzerland) Bob Moore (Microsoft, Redmond, USA) Anton Nijholt (University of Twente, Netherlands) Gertjan van Noord (University of Groningen, Netherlands) Stephan Oepen (University of Oslo, Norway) Stefan Riezler (Xerox Palo Alto Research Center, USA) Giorgio Satta (University of Padua, Italy) Khalil Sima’an (University of Amsterdam, Netherlands) Mark Steedman (University of Edinburgh, UK) Hozumi Tanaka (Chukyo University, Japan) Hans Uszkoreit (DFKI and Univeritat des Saarlandes, Germany) Eric Villemonte de la Clergerie (INRIA, Rocquencourt, France) K. Vijay-Shanker (University of Delaware, Newark, USA) Dekai Wu (Hong Kong University of Science and Technology, China) Additional Reviewers: Rieks op de Akker (Twente University, Netherlands) v Daisuke Kawahara (University of Tokyo, Japan) Manabu Okumura (Tokyo Institute of Technology, Japan) Kenji Sagae (Carnegie-Mellon University, USA) Libin Shen (University of Pennsylvania, USA) Kiyoaki Shirai (Hokuriku Advanced Institute of Science and Technology, Japan) Invited Speakers: Jill Burstein (Educational Testing Service, USA) Mari Ostendorf (University of Washington, USA) vi Table of Contents Efficient and Robust LFG Parsing: SxLFG Pierre Boullier and BenoˆıtSagot......................................................... 1 Parsing Linear Context-Free Rewriting Systems Hakan˚ Burden and Peter Ljunglof.......................................................¨ 11 Switch Graphs for Parsing Type Logical Grammars Bob Carpenter and Glyn Morrill . 18 Parsing with Soft and Hard Constraints on Dependency Length Jason Eisner and Noah A. Smith . 30 Corrective Modeling for Non-Projective Dependency Parsing Keith Hall and Vaclav´ Novak...........................................................´ 42 Better k-best Parsing Liang Huang and David Chiang . 53 Machine Translation as Lexicalized Parsing with Hooks Liang Huang, Hao Zhang and Daniel Gildea. .65 Treebank transfer Martin Jansche . 74 Lexical and Structural Biases for Function Parsing Gabriele Musillo and Paola Merlo . 83 Probabilistic Models for Disambiguation of an HPSG-Based Chart Generator Hiroko Nakanishi, Yusuke Miyao and Jun’ichi Tsujii . 93 Efficacy of Beam Thresholding, Unification Filtering and Hybrid Parsing in Probabilistic HPSG Parsing Takashi Ninomiya, Yoshimasa Tsuruoka, Yusuke Miyao and Jun’ichi Tsujii . 103 Head-Driven PCFGs with Latent-Head Statistics Detlef Prescher. .115 A Classifier-Based Parser with Linear Run-Time Complexity Kenji Sagae and Alon Lavie . 125 Chunk Parsing Revisited Yoshimasa Tsuruoka and Jun’ichi Tsujii . 133 Constituent Parsing by Classification Joseph Turian and I. Dan Melamed . 141 vii Strictly Lexical Dependency Parsing Qin Iris Wang, Dale Schuurmans and Dekang Lin . 152 Efficient Extraction of Grammatical Relations Rebecca Watson, John Carroll and Ted Briscoe . 160 Improving Parsing Accuracy by Combining Diverse Dependency Parsers Daniel Zeman and Zdenekˇ Zabokrtskˇ y..................................................´ 171 Exploring Features for Identifying Edited Regions in Disfluent Sentences Qi Zhang and Fuliang Weng . 179 Short papers Statistical Shallow Semantic Parsing despite Little Training Data Rahul Bhagat, Anton Leuski and Eduard Hovy . 186 The Quick Check Pre-unification Filter for Typed Grammars: Extensions Liviu Ciortuz . 188 From metagrammars to factorized TAG/TIG parsers Eric´ Villemonte de la Clergerie . 190 Parsing Generalized ID/LP Grammars Michael W. Daniels . 192 TFLEX: Speeding Up Deep Parsing with Strategic Pruning Myroslava O. Dzikovska and Carolyn P. Rose . 194 Generic Parsing for Multi-Domain Semantic Interpretation Myroslava Dzikovska, Mary Swift, James Allen and William de Beaumont. .196 Online Statistics for a Unification-Based Dialogue Parser Micha Elsner, Mary Swift, James Allen and Daniel Gildea. .198 SUPPLE: A Practical Parser for Natural Language Engineering Applications Robert Gaizauskas, Mark Hepple, Horacio Saggion, Mark A. Greenwood and Kevin Humphreys . 200 k-NN for Local Probability Estimation in Generative Parsing Models Deirdre Hogan . 202 Robust Extraction of Subcategorization Data from Spoken Language Jianguo Li, Chris Brew and Eric Fosler-Lussier . 204 viii Efficient and robust LFG parsing: SXLFG Pierre Boullier and Benoît Sagot INRIA-Rocquencourt, Projet Atoll, Domaine de Voluceau, Rocquencourt B.P. 105 78 153 Le Chesnay Cedex, France {pierre.boullier, benoit.sagot}@inria.fr Abstract Developing a new parser for LFG (Lexical- Functional Grammars, see, e.g., (Kaplan, 1989)) is In this paper, we introduce a new parser, not in itself very original. Several LFG parsers al- called SXLFG, based on the Lexical- ready exist, including those of (Andrews, 1990) or Functional Grammars formalism (LFG). (Briffault et al., 1997). However, the most famous We describe the underlying context-free LFG system is undoubtedly the Xerox Linguistics parser and how functional structures are Environment (XLE) project which is the successor efficiently computed on top of the CFG of the Grammars Writer’s Workbench (Kaplan and shared forest thanks to computation shar- Maxwell, 1994; Riezler et al., 2002; Kaplan et al., ing, lazy evaluation, and compact data 2004). XLE is a large project which concentrates a representation. We then present vari- lot of linguistic and computational technology, relies ous error recovery techniques we imple- on a similar point of view on the balance between mented in order to build a robust parser. shallow and deep parsing, and has been successfully Finally, we offer concrete results when used to parse large unrestricted corpora. SXLFG is used with an existing gram- Nevertheless, these parsers do not always use in mar for French. We show that our parser the most extensive way all existing algorithmic tech- is both efficient and robust, although the niques of computation sharing and compact infor- grammar is very ambiguous. mation representation that make it possible to write an efficient LFG parser, despite the fact that the LFG formalism, as many other formalisms relying on uni- 1 Introduction fication, is NP-hard. Of course our purpose is not to In order to tackle the algorithmic difficulties of make a new XLE system but to study how robust- parsers when applied to