Proceedings of NAACL-HLT 2016

NAACL HLT 2016

Workshop on Discontinuous Structures in Natural Language Processing (DiscoNLP)

Proceedings of the Workshop

June 17, 2016
San Diego, California, USA

© 2016 The Association for Computational Linguistics

Order copies of this and other ACL proceedings from:

Association for Computational Linguistics (ACL)
209 N. Eighth Street
Stroudsburg, PA 18360
USA
Tel: +1-570-476-8006
Fax: +1-570-476-0860
[email protected]

ISBN 978-1-941643-85-3

Introduction

This volume contains the papers presented at the Workshop on Discontinuous Structures in Natural Language Processing, held in San Diego, California on June 17, 2016 during the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

The modeling of certain structures in natural language requires a mechanism for discontinuity, in the sense that we must account for two or more parts of a structure that are not adjacent. This is true across many languages and at different levels of description. On the lexical level, it concerns discontinuous morphological phenomena such as transfixation (templatic morphology), as well as phrasal verbs and non-contiguous multiword expressions. On the syntactic level, discontinuity is caused by phenomena such as extraposition, topicalization, and argument scrambling. Morphologically rich languages (MRLs) are particularly likely to exhibit such phenomena. Other examples include disfluency and anaphora/coreference resolution with discontinuous antecedents; modeling in both of these areas requires an extended domain of locality. On a higher level, discontinuity is a relevant factor in machine translation, as well as in complex question answering and in topic structure modeling.

Discontinuity has been studied intensively in a range of areas, including but not limited to grammar development, syntactic and semantic parsing, morphological analysis, machine translation, anaphora resolution, discourse modeling, automatic summarization and complex question answering. Nevertheless, the treatment of discontinuous structures remains a challenge for two reasons. On the one hand, recovering non-local information is generally associated with a high computational cost. On the other hand, discontinuities are inherently a low-frequency phenomenon, which means that statistical approaches tend to analyze them incorrectly as more frequent local phenomena. Additionally, it is not always clear if and how NLP tasks can benefit from knowing about discontinuity, that is, why one should care, particularly given the computational cost.

The goal of this workshop is to bring together researchers from the different areas, to give them a forum to exchange ideas and solutions, to create synergies, and to enable more powerful solutions. This encompasses not only linguistic analyses and work on analyzing or recovering the corresponding structures, as in non-projective dependency parsing, but also studies on "use cases" that show how information about discontinuity can be used to enhance NLP tasks. We think that, given the broad program we have put together, this goal has been more than fulfilled.

Thanks to all authors who have contributed their work! Out of ten submissions, seven were selected for presentation. We would also like to extend our gratitude to the program committee, who have dedicated their time and effort to making this workshop a high-quality event. See you in San Diego!

Wolfgang Maier, Sandra Kübler, and Constantin Orăsan

Organizers:

Wolfgang Maier, University of Düsseldorf (Germany)
Sandra Kübler, Indiana University (USA)
Constantin Orăsan, University of Wolverhampton (UK)

Program Committee:

Anne Abeillé, University Paris 7 (France)
Krasimir Angelov, University of Gothenburg (Sweden)
Marianna Apidianaki, LIMSI (France)
Eric de la Clergerie, INRIA (France)
Andreas van Cranenburgh, Royal Netherlands Academy for Arts and Sciences (The Netherlands)
Joachim Daiber, University of Amsterdam (The Netherlands)
Carlos Gómez Rodríguez, University of A Coruña (Spain)
Eva Hasler, University of Cambridge (UK)
Mijail Kabadjov, University of Essex (UK)
Sylvain Kahane, University Paris 10 (France)
Laura Kallmeyer, University of Düsseldorf (Germany)
Philipp Koehn, University of Edinburgh (UK)
Johannes Leveling, Elsevier (The Netherlands)
Timm Lichte, University of Düsseldorf (Germany)
Peter Ljunglöf, University of Gothenburg (Sweden)
Georgiana Marsic, University of Wolverhampton (UK)
Detmar Meurers, University of Tübingen (Germany)
Jean-Luc Minel, Université Paris Ouest Nanterre La Défense (France)
Sara Moze, University of Wolverhampton (UK)
Philippe Muller, University of Toulouse/IRIT (France)
Preslav Nakov, Qatar Computing Research Institute (Qatar)
Mark-Jan Nederhof, University of St. Andrews (UK)
Yannick Parmentier, University of Orléans (France)
Ted Pedersen, University of Minnesota (USA)
Irene Renau, Pontificia Universidad Católica de Valparaíso (Chile)
Lonneke van der Plas, University of Malta (Malta)
Natalie Schluter, University of Copenhagen (Denmark)
Djamé Seddah, University Paris 4 (France)
Khalil Sima’an, University of Amsterdam (The Netherlands)
Yannick Versley, University of Heidelberg (Germany)
Suzan Verberne, University of Nijmegen (The Netherlands)
Andy Way, Dublin City University (Ireland)

Invited Speaker:

David Chiang, University of Notre Dame (USA)

Table of Contents

An LFG Account of Discontinuous Nominal Expressions
    Liselotte Snijders
Non-projectivity and valency
    Zdenka Uresova, Eva Fucikova and Jan Hajic
Machine Translation of Non-Contiguous Multiword Units
    Anabela Barreiro and Fernando Batista
Discontinuous VP in Bulgarian
    Elisaveta Balabanova
Discontinuous Genitives in Hindi/Urdu
    Sebastian Sulger
Discontinuous parsing with continuous trees
    Wolfgang Maier and Timm Lichte
Discontinuity Re^2-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing
    Yannick Versley

Workshop Program

Friday, June 17, 2016

9:30–10:00     An LFG Account of Discontinuous Nominal Expressions
               Liselotte Snijders
10:00–10:30    Non-projectivity and valency
               Zdenka Uresova, Eva Fucikova and Jan Hajic
10:30–11:00    Coffee break
11:00–12:15    Invited Talk: Finite automata for free word order languages
               David Chiang
12:15–12:45    Machine Translation of Non-Contiguous Multiword Units
               Anabela Barreiro and Fernando Batista
12:45–14:30    Lunch break
14:30–15:00    Discontinuous VP in Bulgarian
               Elisaveta Balabanova
15:00–15:30    Discontinuous Genitives in Hindi/Urdu
               Sebastian Sulger
16:00–16:30    Discontinuous parsing with continuous trees
               Wolfgang Maier and Timm Lichte
16:30–17:00    Discontinuity Re^2-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing
               Yannick Versley
17:00–17:45    Panel discussion

An LFG Account of Discontinuous Nominal Expressions

Liselotte Snijders
Waseda University
Tokyo, Japan
[email protected]

Abstract

This paper presents an overview of an LFG treatment of discontinuous nominal expressions involving modification, making the claim that cross-linguistically different types of discontinuity (i.e. in Warlpiri and English) should be captured by the same overall analysis, despite being licensed in different ways. LFG’s separation of grammatical functions from phrase structural positions intuitively accounts for discontinuous expressions, and its use of glue semantics ensures that discontinuous and contiguous expressions receive the same semantic analysis.

1 Introduction

Discontinuity of nominal expressions, a phenomenon in which two or more parts of a semantic nominal unit are non-adjacent in phrase structure, is prevalent in languages traditionally classified as “non-configurational” (Hale, 1983), e.g. the Australian languages Warlpiri, Wambaya, Jaminjung (Simpson, 1991; Nordlinger, 1998; Schultze-Berndt and Simard, 2012), Latin (Devine and Stephens, 2000; Spevak, 2010), Ancient Greek (Devine and Stephens, 2006), and is also attested in a number of Slavic languages, e.g. Russian (Sekerina, 1997; Sekerina, 1999) and Polish (Siewierska, 1984). An example of nominal discontinuity from Warlpiri is shown in (1) (Simpson, 1991, p. 282):¹

(1) Kurdu-ngku  ka    wajilipi-nyi   wita-ngku
    child-ERG   PRES  chase-NONPAST  small-ERG
    ‘The small child is chasing it.’

In (1) a head noun is separated from a modifier, but both parts map to the same grammatical function (subject). The two parts of the discontinuous expression share the same case-marking. A similar type of discontinuity involving modification is attested in English, in the cases of relative clause extraposition in (2a) and NP-PP split in (2b) (Kirkwood, 1977, p. 55):²

(2) a. The man entered who I met yesterday.
    b. A number of stories soon appeared about Watergate.

A similar type of discontinuity is in fact also attested in Warlpiri (Hale, 1976, p. 78):³

(3) Ngajulu-rlu  rna  yankirri  pantu-rnu   kuja-lpa  ngapa      nga-rnu.
    I-ERG        AUX  emu.ABS   spear-PAST  COMP-AUX  water.ABS  drink-PAST
    ‘I speared the emu that was drinking water.’

¹ This type of Warlpiri example has another interpretation, which can be translated as ‘The child_i is chasing it and it_i is small’

² Another type of discontinuity in English involving modification is partial fronting, e.g. About Japan, the woman wrote many books; additional examples are discussed in Section 6.

³ Hale (1976) refers to this type of example as ‘adjoined relative clause’: it can also precede the sentence as a whole (somewhat like a hanging topic). It can also have a temporal reading: