Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), Pages 1–7 Brussels, Belgium, November 1, 2018
Recommended publications
arXiv:1908.07448v1
Evaluating Contextualized Embeddings on 54 Languages in POS Tagging, Lemmatization and Dependency Parsing Milan Straka and Jana Straková and Jan Hajič Charles University Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics {strakova,straka,hajic}@ufal.mff.cuni.cz Abstract We present an extensive evaluation of three recently proposed methods for contextualized embeddings on 89 corpora in 54 languages of the Universal Dependencies 2.3 in three tasks: POS tagging, lemmatization, and dependency parsing. Employing the BERT, Flair and ELMo as pretrained embedding inputs in a strong baseline of UDPipe 2.0, one of the best-performing systems of the CoNLL 2018 Shared Task and an overall winner of the EPE 2018, we present a one-to-one comparison of the three contextualized word embedding methods, as well as a comparison with word2vec-like pretrained embeddings and with end-to-end character-level word embeddings. We report state-of-the-art results in all three tasks as compared to results ... Shared Task (Zeman et al., 2018). • We report our best results on UD 2.3. The addition of contextualized embeddings improvements range from 25% relative error reduction for English treebanks, through 20% relative error reduction for high resource languages, to 10% relative error reduction for all UD 2.3 languages which have a training set. 2 Related Work A new type of deep contextualized word representation was introduced by Peters et al. (2018). The proposed embeddings, called ELMo, were obtained from internal states of deep bidirectional language model, pretrained on a large text corpus. Akbik et al.
Extended and Enhanced Polish Dependency Bank in Universal Dependencies Format
Extended and Enhanced Polish Dependency Bank in Universal Dependencies Format Alina Wróblewska Institute of Computer Science Polish Academy of Sciences ul. Jana Kazimierza 5 01-248 Warsaw, Poland Abstract The paper presents the largest Polish Dependency Bank in Universal Dependencies format – PDBUD – with 22K trees and 352K tokens. PDBUD builds on its previous version, i.e. the Polish UD treebank (PL-SZ), and contains all 8K PL-SZ trees. The PL-SZ trees are checked and possibly corrected in the current edition of PDBUD. Further 14K trees are automatically converted from a new version of Polish Dependency Bank. The PDBUD trees are expanded with the enhanced edges encoding the shared dependents and the shared governors of the coordinated conjuncts and with the semantic roles of some dependents. The conducted evaluation experiments show that PDBUD is large enough for training a high-quality graph-based dependency parser for Polish. ... even for languages with rich morphology and relatively free word order, such as Polish. The supervised learning methods require gold-standard training data, whose creation is a time-consuming and expensive process. Nevertheless, dependency treebanks have been created for many languages, in particular within the Universal Dependencies initiative (UD, Nivre et al., 2016). The UD leaders aim at developing a cross-linguistically consistent tree annotation schema and at building a large multilingual collection of dependency treebanks annotated according to this schema. Polish is also represented in the Universal Dependencies collection. There are two Polish treebanks in UD: the Polish UD treebank (PL-SZ) converted from Składnica zależnościowa and
Universal Dependencies According to BERT: Both More Specific and More General
Universal Dependencies According to BERT: Both More Specific and More General Tomasz Limisiewicz and David Mareček and Rudolf Rosa Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics Charles University, Prague, Czech Republic {limisiewicz,rosa,marecek}@ufal.mff.cuni.cz Abstract This work focuses on analyzing the form and extent of syntactic abstraction captured by BERT by extracting labeled dependency trees from self-attentions. Previous work showed that individual BERT heads tend to encode particular dependency relation types. We extend these findings by explicitly comparing BERT relations to Universal Dependencies (UD) annotations, showing that they often do not match one-to-one. We suggest a method for relation identification and syntactic tree construction. Our approach produces significantly more consistent dependency trees than previous work, showing that it better explains the syntactic abstractions in BERT. At the same time, it can be successfully applied with only a minimal amount of supervision and generalizes well across languages. ... former based systems particular heads tend to capture specific dependency relation types (e.g. in one head the attention at the predicate is usually focused on the nominal subject). We extend understanding of syntax in BERT by examining the ways in which it systematically diverges from standard annotation (UD). We attempt to bridge the gap between them in three ways: • We modify the UD annotation of three linguistic phenomena to better match the BERT syntax (§3) • We introduce a head ensemble method, combining multiple heads which capture the same dependency relation label (§4) • We observe and analyze multipurpose heads, containing multiple syntactic functions (§7)
Making Logical Form Type-Logical Glue Semantics for Minimalist Syntax
Making Logical Form type-logical: Glue Semantics for Minimalist syntax Matthew Gotham University of Oslo UiO Forum for Theoretical Linguistics 12 October 2016 Slides available at <http://folk.uio.no/matthegg/research#talks> Matthew Gotham (Oslo) Making LF type-logical FTL, 12.10.2016 1 / 63 What this talk is about An implementation of Glue Semantics (an approach that treats the syntax-semantics interface as deduction in a type logic) for Minimalist syntax, i.e. syntactic theories in the ST→EST→REST→GB→... ‘Chomskyan’ tradition. Q: How Minimalist, as opposed to (say) GB-ish? A: Not particularly, but the factoring together of subcategorization and structure building (in the mechanism of feature-checking) is, if not crucial to this analysis, then certainly useful. And a comparison of this approach with more mainstream approaches to the syntax-semantics interface. Outline 1 The mainstream approach 2 A fast introduction to Glue Semantics 3 Implementation in Minimalism The form of syntactic theory assumed The connection to Glue 4 Comparison with the mainstream approach Interpreting (overt) movement Problems with the mainstream approach Glue analysis Nested DPs Scope islands The mainstream approach How semantics tends to be done for broadly GB/P&P/Minimalist syntax After Heim & Kratzer (1998) Syntax produces structures that are interpreted recursively
Proceedings of the LFG 00 Conference University of California, Berkeley Editors: Miriam Butt and Tracy Holloway King 2000 CSLI P
Proceedings of the LFG 00 Conference University of California, Berkeley Editors: Miriam Butt and Tracy Holloway King 2000 CSLI Publications ISSN 1098-6782 Editors' Note This year's conference was held as part of a larger Berkeley Formal Grammar Conference, which included a day of workshops and the annual HPSG conference. The program committee for LFG'00 were Rachel Nordlinger and Chris Manning. We would like to thank them for putting together the program that gave rise to this collection of papers. We would also like to thank all those who contributed to this conference, especially the local organizing committee, namely Andreas Kathol, and our reviewers, without whom the conference would not have been possible. Miriam Butt and Tracy Holloway King Functional Identity and Resource-Sensitivity in Control Ash Asudeh Stanford University and Xerox PARC Proceedings of the LFG00 Conference University of California, Berkeley Miriam Butt and Tracy Holloway King (Editors) 2000 CSLI Publications http://csli-publications.stanford.edu/ 1 Introduction Glue semantics provides a semantics for Lexical Functional Grammar (LFG) that is expressed using linear logic (Girard, 1987; Dalrymple, 1999) and provides an interpretation for the f(unctional)-structure level of syntactic representation, connecting it to the level of s(emantic)-structure in LFG’s parallel projection architecture (Kaplan, 1987, 1995). Due to its use of linear logic for meaning assembly, Glue is resource-sensitive: semantic resources contributed by lexical entries and resulting f-structures must each be used in a successful proof exactly once. In this paper, I will examine the tension between a resource-sensitive semantics which interprets f-structures and structure-sharing in f-structures as expressed by functional control resulting from lexical functional identity equations.
RELATIONAL NOUNS, PRONOUNS, and Resumption Relational Nouns, Such As Neighbour, Mother, and Rumour, Present a Challenge to Synt
Linguistics and Philosophy (2005) 28:375–446 © Springer 2005 DOI 10.1007/s10988-005-2656-7 ASH ASUDEH RELATIONAL NOUNS, PRONOUNS, AND RESUMPTION ABSTRACT. This paper presents a variable-free analysis of relational nouns in Glue Semantics, within a Lexical Functional Grammar (LFG) architecture. Relational nouns and resumptive pronouns are bound using the usual binding mechanisms of LFG. Special attention is paid to the bound readings of relational nouns, how these interact with genitives and obliques, and their behaviour with respect to scope, crossover and reconstruction. I consider a puzzle that arises regarding relational nouns and resumptive pronouns, given that relational nouns can have bound readings and resumptive pronouns are just a specific instance of bound pronouns. The puzzle is: why is it impossible for bound implicit arguments of relational nouns to be resumptive? The puzzle is highlighted by a well-known variety of variable-free semantics, where pronouns and relational noun phrases are identical both in category and (base) type. I show that the puzzle also arises for an established variable-based theory. I present an analysis of resumptive pronouns that crucially treats resumptives in terms of the resource logic linear logic that underlies Glue Semantics: a resumptive pronoun is a perfectly ordinary pronoun that constitutes a surplus resource; this surplus resource requires the presence of a resumptive-licensing resource consumer, a manager resource. Manager resources properly distinguish between resumptive pronouns and bound relational nouns based on differences between them at the level of semantic structure. The resumptive puzzle is thus solved. The paper closes by considering the solution in light of the hypothesis of direct compositionality.
Chapter 22 Semantics Jean-Pierre Koenig University at Buffalo Frank Richter Goethe Universität Frankfurt
Chapter 22 Semantics Jean-Pierre Koenig University at Buffalo Frank Richter Goethe Universität Frankfurt This chapter discusses the integration of theories of semantic representations into HPSG. It focuses on those aspects that are specific to HPSG and, in particular, recent approaches that make use of underspecified semantic representations, as they are quite unique to HPSG. 1 Introduction A semantic level of description is more integrated into the architecture of HPSG than in many frameworks (although, in the last couple of decades, the integration of syntax and semantics has become tighter overall; see Heim & Kratzer 1998 for Mainstream Generative Grammar¹, for example). Every node in a syntactic tree includes all appropriate levels of structure, phonology, syntax, semantics, and pragmatics so that local interaction between all these levels is in principle possible within the HPSG architecture. The architecture of HPSG thus follows the spirit of the rule-to-rule approach advocated in Bach (1976) and more specifically Klein & Sag (1985) to have every syntactic operation matched by a semantic operation (the latter, of course, follows the Categorial Grammar lead, broadly speaking; Ajdukiewicz 1935; Pollard 1984; Steedman 2000). But, as we shall see, only the spirit of the rule-to-rule approach is adhered to, as there can be more ... ¹ We follow Culicover & Jackendoff (2005: 3) in using the term Mainstream Generative Grammar (MGG) to refer to work in Government & Binding or Minimalism. Jean-Pierre Koenig & Frank Richter. 2021. Semantics. In Stefan Müller, Anne Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar: The handbook. Prepublished version.
Chapter 30 HPSG and Lexical Functional Grammar Stephen Wechsler the University of Texas Ash Asudeh University of Rochester & Carleton University
Chapter 30 HPSG and Lexical Functional Grammar Stephen Wechsler The University of Texas Ash Asudeh University of Rochester & Carleton University This chapter compares two closely related grammatical frameworks, Head-Driven Phrase Structure Grammar (HPSG) and Lexical Functional Grammar (LFG). Among the similarities: both frameworks draw a lexicalist distinction between morphology and syntax, both associate certain words with lexical argument structures, both employ semantic theories based on underspecification, and both are fully explicit and computationally implemented. The two frameworks make available many of the same representational resources. Typical differences between the analyses proffered under the two frameworks can often be traced to concomitant differences of emphasis in the design orientations of their founding formulations: while HPSG’s origins emphasized the formal representation of syntactic locality conditions, those of LFG emphasized the formal representation of functional equivalence classes across grammatical structures. Our comparison of the two theories includes a point by point syntactic comparison, after which we turn to an exposition of Glue Semantics, a theory of semantic composition closely associated with LFG. 1 Introduction Head-Driven Phrase Structure Grammar is similar in many respects to its sister framework, Lexical Functional Grammar or LFG (Bresnan et al. 2016; Dalrymple et al. 2019). Both HPSG and LFG are lexicalist frameworks in the sense that they distinguish between the morphological system that creates words and the syntax proper that combines those fully inflected words into phrases and sentences. Stephen Wechsler & Ash Asudeh. 2021. HPSG and Lexical Functional Grammar. In Stefan Müller, Anne Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar: The handbook.
Towards Glue Semantics for Minimalist Syntax∗
Cambridge Occasional Papers in Linguistics (COPiL) Volume 8, Article 4: 56–83, 2015 | ISSN 2050-5949 TOWARDS GLUE SEMANTICS FOR MINIMALIST SYNTAX∗ Matthew Gotham University College London Abstract Glue Semantics is a theory of the syntax/semantics interface according to which syntax generates premises in a fragment of linear logic, and semantic interpretation proceeds by deduction from those premises. Glue was originally developed within Lexical-Functional Grammar and is now the mainstream view of semantic composition within LFG, but it is in principle compatible with any syntactic framework. In this paper I present an implementation of Glue for Minimalism, and show how it can bring certain advantages in comparison with approaches to the syntax/semantics interface more conventionally adopted. 1 Introduction Take a simple example of quantifier scope ambiguity like (1). (1) A boy trains every dog. There are various theoretical options available when deciding where the ambiguity of (1) should be located. One option would be to say that the lexical entry of one (or more) of the words in the sentence is polymorphic (Hendriks 1987). Another would be to say that there is more than one syntactic analysis of (1): either in terms of constituent structure (May 1977) or derivational history (Montague 1973). There is a third option, however, which is to say that the ambiguity resides in the nature of the connection between syntax and semantic composition. This is the approach taken in Glue Semantics.¹ What the lexical polymorphism and syntactic ambiguity accounts have in common is the view that, given a syntactic analysis of a sentence and a choice ... ∗ This paper is based on material that was presented at the event Interactions between syntax and semantics across frameworks at Cambridge University on 8 January 2015, and to the Oxford Glue Group on 19 February and 5 March 2015.
Universal Dependencies for Japanese
Universal Dependencies for Japanese Takaaki Tanaka∗, Yusuke Miyao†, Masayuki Asahara♢, Sumire Uematsu†, Hiroshi Kanayama♠, Shinsuke Mori♣, Yuji Matsumoto‡ ∗NTT Communication Science Laboratories, †National Institute of Informatics, ♢National Institute for Japanese Language and Linguistics, ♠IBM Research - Tokyo, ♣Kyoto University, ‡Nara Institute of Science and Technology Abstract We present an attempt to port the international syntactic annotation scheme, Universal Dependencies, to the Japanese language in this paper. Since the Japanese syntactic structure is usually annotated on the basis of unique chunk-based dependencies, we first introduce word-based dependencies by using a word unit called the Short Unit Word, which usually corresponds to an entry in the lexicon UniDic. Porting is done by mapping the part-of-speech tagset in UniDic to the universal part-of-speech tagset, and converting a constituent-based treebank to a typed dependency tree. The conversion is not straightforward, and we discuss the problems that arose in the conversion and the current solutions. A treebank consisting of 10,000 sentences was built by converting the existent resources and currently released to the public. Keywords: typed dependencies, Short Unit Word, multiword expression, UniDic 1. Introduction The Universal Dependencies (UD) project has been developing cross-linguistically consistent treebank annota... 2. Word unit The definition of a word unit is indispensable in UD annotation, which is not a trivial question for Japanese, since a sentence is not segmented into words or morphemes by white space in its orthography.
MISSING RESOURCES in a RESOURCE-SENSITIVE SEMANTICS Gianluca Giorgolo and Ash Asudeh King's College, London & Carleton
MISSING RESOURCES IN A RESOURCE-SENSITIVE SEMANTICS Gianluca Giorgolo (King’s College, London & University of Oxford) and Ash Asudeh (Carleton University & University of Oxford) Proceedings of the LFG12 Conference Miriam Butt and Tracy Holloway King (Editors) 2012 CSLI Publications http://csli-publications.stanford.edu/ Abstract In this paper, we present an investigation of the argument/adjunct distinction in the context of LFG. We focus on those cases where certain grammatical functions that qualify as arguments according to all standard tests (Needham and Toivonen, 2011) are only optionally realized. We argue for an analysis first proposed by Blom et al. (2012), and we show how we can make it work within the machinery of LFG. Our second contribution regards how we propose to interpret a specific case of optional arguments, optional objects. In this case we propose to generalize the distinction between transitive and intransitive verbs to a continuum. Purely transitive and intransitive verbs represent the extremes of the continuum. Other verbs, while leaning towards one or the other end of this spectrum, show an alternating behavior between the two extremes. We show how our first contribution is capable of accounting for these cases in terms of exceptional behavior. The key insight we present is that the verbs that exhibit the alternating behavior can best be understood as being capable of dealing with an exceptional context. In other words they display some sort of control on the way they compose with their context. This will prompt us also to rethink the place of the notion of subcategorization in the LFG architecture. 1 Introduction The distinction between arguments and adjuncts is central for the LFG architecture as it influences the way in which representations of linguistic expressions are generated both at the functional and the semantic level.
Universal Dependencies for Persian
Universal Dependencies for Persian *Mojgan Seraji, **Filip Ginter, *Joakim Nivre *Uppsala University, Department of Linguistics and Philology, Sweden **University of Turku, Department of Information Technology, Finland *firstname.lastname@lingfil.uu.se **figint@utu.fi Abstract The Persian Universal Dependency Treebank (Persian UD) is a recent effort of treebanking Persian with Universal Dependencies (UD), an ongoing project that designs unified and cross-linguistically valid grammatical representations including part-of-speech tags, morphological features, and dependency relations. The Persian UD is the converted version of the Uppsala Persian Dependency Treebank (UPDT) to the universal dependencies framework and consists of nearly 6,000 sentences and 152,871 word tokens with an average sentence length of 25 words. In addition to the universal dependencies syntactic annotation guidelines, the two treebanks differ in tokenization. All words containing unsegmented clitics (pronominal and copula clitics) annotated with complex labels in the UPDT have been separated from the clitics and appear with distinct labels in the Persian UD. The treebank has its original syntactic annotation scheme based on Stanford Typed Dependencies. In this paper, we present the approaches taken in the development of the Persian UD. Keywords: Universal Dependencies, Persian, Treebank 1. Introduction In the past decade, the development of numerous dependency parsers for different languages has frequently been benefited by the use of syntactically annotated resources, or treebanks (Böhmová et al., 2003; Haverinen et al., 2010; Kromann, 2003; Foth et al., 2014; Seraji et al., 2015; Vincze et al., 2010). ... this paper, we present how we adapt the Universal Dependencies to Persian by converting the Uppsala Persian Dependency Treebank (UPDT) (Seraji, 2015) to the Persian Universal Dependencies (Persian UD). First, we briefly describe the Universal Dependencies and then we present the morphosyntactic annotations used in the extended version