Latent Semantic Network Induction in the Context of Linked Example Senses
Hunter Scott Heidenreich
Department of Computer Science
College of Computing and Informatics

Jake Ryland Williams
Department of Information Science
College of Computing and Informatics

Proceedings of the 2019 EMNLP Workshop W-NUT: The 5th Workshop on Noisy User-generated Text, pages 170–180, Hong Kong, Nov 4, 2019. ©2019 Association for Computational Linguistics

Abstract

The Princeton WordNet is a powerful tool for studying language and developing natural language processing algorithms. With significant work developing it further, one line considers its extension through aligning its expert-annotated structure with other lexical resources. In contrast, this work explores a completely data-driven approach to network construction, forming a wordnet using the entirety of the open-source, noisy, user-annotated dictionary, Wiktionary. Comparing baselines to WordNet, we find compelling evidence that our network induction process constructs a network with useful semantic structure. With thousands of semantically-linked examples that demonstrate sense usage from basic lemmas to multiword expressions (MWEs), we believe this work motivates future research.

1 Introduction

Wiktionary¹ is a free and open-source collaborative dictionary (Wikimedia). With the ability for anyone to add or edit lemmas, definitions, relations, and examples, Wiktionary has the potential to be larger and more diverse than any printable dictionary. Wiktionary features a rich set of examples of sense usage for many of its lemmas which, when converted to a usable format, supports language processing tasks such as sense disambiguation (Meyer and Gurevych, 2010a; Matuschek and Gurevych, 2013; Miller and Gurevych, 2014) and MWE identification (Muzny and Zettlemoyer, 2013; Salehi et al., 2014; Hosseini et al., 2016). With natural alignment to other languages, Wiktionary can likewise be used as a resource for machine translation tasks (Matuschek et al., 2013; Borin et al., 2014; Göhring, 2014). With these uses in mind, this work introduces the creation of a network, much like the Princeton WordNet (Miller, 1995; Fellbaum, 1998), that is constructed solely from the semi-structured data of Wiktionary. This relies on the noisy annotations of the editors of Wiktionary to naturally induce a network over the entirety of the English portion of Wiktionary. In doing so, the development of this work produces:

• an induced network over Wiktionary, enriched with semantically linked examples, forming a directed acyclic graph (DAG);

• an exploration of the task of relationship disambiguation as a means to induce network construction; and

• an outline for directions of expansion, including increasing precision in disambiguation, cross-linking example usages, and aligning English Wiktionary with other languages.

We make our code freely available², which includes code to download data, to disambiguate relationships between lemmas, to construct networks from disambiguation output, and to interact with networks produced through this work.

¹ https://www.wiktionary.org/
² Code will be available at https://github.com/hunter-heidenreich/lsni-paper

2 Related work

2.1 WordNet

The Princeton WordNet, or WordNet as it's more commonly referred to, is a lexical database originally created for the English language (Miller, 1995; Fellbaum, 1998). It consists of expert-annotated data, and has been more or less continually updated since its creation (Harabagiu et al., 1999; Miller and Hristea, 2006). WordNet is built up of synsets, collections of lexical items that all have the same meaning.
For each synset, a definition is provided, and for some synsets, usage examples are also presented. If extracted and attributed properly, the example usages present on Wiktionary could critically enhance WordNet by filling gaps. While significant other work has been done in utilizing Wiktionary to enhance WordNet for purposes like this (discussed in the next sections), this work takes a novel step by constructing a wordnet through entirely computational means, i.e., under the framing of a machine learning task based on Wiktionary's data.

2.2 Wiktionary

Wiktionary is an open-source, Wiki-based, open content dictionary organized by the Wikimedia Foundation (Wikimedia). It has a large and active volunteer editorial community, and from its noisy, crowd-sourced nature, includes many MWEs, colloquial terms, and their example usages, which could ultimately fill difficult-to-resolve gaps left in other linguistic resources, such as WordNet. Thus, Wiktionary has a significant history of exploration for the enhancement of WordNet, including efforts that extend WordNet for better domain coverage of word senses (Meyer and Gurevych, 2011; Gurevych et al., 2012; Miller and Gurevych, 2014), automatically derive new lemmas (Jurgens and Pilehvar, 2015; Rusert and Pedersen, 2016), and develop the creation of multilingual wordnets (de Melo and Weikum, 2009; Gurevych et al., 2012; Bond and Foster, 2013). While these works constitute important steps in the usage of extracted Wiktionary contents for the development of WordNet, none before this effort has attempted to utilize the entirety of Wiktionary alone for the construction of such a network.

Most similarly, Wiktionary has been used in a sense-disambiguated fashion (Meyer and Gurevych, 2012b) and to construct an ontology (Meyer and Gurevych, 2012a). Our work does not create an ontology, but instead attempts to create a semantic wordnet. In this context, our work can be viewed as building on notions of sense-disambiguating Wiktionary to construct a WordNet-like resource.

2.3 Relation Disambiguation

The task of taking definitions, a semantic relationship, and sub-selecting the definitions that belong to that relationship is one of critical importance to our work. Sometimes called sense linking or relationship anchoring, this task has been previously explored in the creation of machine-readable dictionaries (Krovetz, 1992), ontology learning (Pantel and Pennacchiotti, 2006, 2008), and German Wiktionary (Meyer and Gurevych, 2010b).

As mentioned above, Meyer and Gurevych explore relationship disambiguation in the context of Wiktionary, motivating a sense-disambiguated Wiktionary as a powerful resource (Meyer and Gurevych, 2012a,b). This task is frequently viewed as a binary classification: given two linked lemmas, does a given pair of their definitions belong to the relationship? While easier to model, this framing can suffer from a combinatorial explosion, as all pairs of definitions must be compared. This work attempts to model the task differently, disambiguating all definitions in the context of a relationship and its lemmas.

3 Model

3.1 Framework

This work starts by identifying a set of lemmas, $W$, and a set of senses, $S$. It then proceeds, assuming that $S$ forms the vertex set of a Directed Acyclic Graph (DAG) with edge set $E$, organizing $S$ by refinement of specificity. That is, if senses $s, t \in S$ have a link $(t, s) \in E$ (directed to $s$), then $s$ is one degree of refinement more specific than $t$.

Next, we suppose a lemma $u \in W$ has relation $\sim$ (e.g., synonymy) indicated to another lemma $v \in W$. Assuming $\sim$ is recorded from $u$ to $v$ (e.g., from $u$'s page), we call $u$ the source and $v$ the sink. Working along these lines, the model then assumes a given indicated relation $\sim$ is qualified by a sense $s$; this semantic equivalence is denoted $u \overset{s}{\sim} v$.

Like others (Landauer and Dumais, 1997; Blei et al., 2003; Bengio et al., 2003), this work assumes senses exist in a latent semantic space. Processing a dictionary, one can empirically discover relationships like $u \overset{s}{\sim} v$ and $v \overset{t}{\sim} w$. But for a larger network structure one must know whether $s = t$ (that is, whether $s$ and $t$ refer to the same relationship), and often neither $s$ nor $t$ is known explicitly. Hence, this work sets up approximations of $s$ and $t$ for comparison. Given a lemma, $u \in W$, suppose a set of definitions, $D_u$, exists and forms the basis for disambiguation of a lemma's senses. We then assume that for any $d \in D_u$ there exist one or more senses, $s \in S$, such that $d \implies s$, that is, the definition $d$ conveys the sense $s$.

Having assumed a DAG structure for $S$, this work denotes specificity of sense by using the formalism of a partial order, $\preceq$, which, for senses $s, t \in S$ having $s \preceq t$, indicates that the sense $s$ is comparable to $t$ and more specific. Note that, as with any partial order, senses can be, and often are, non-comparable.

Intuitively, a given definition $d$ might convey multiple senses $d \implies s, t$ of differing specificities, $s \preceq t$. So for a given definition $d$, the model's goal is to find the sense $t$ that is least specific in being conveyed. Satisfying this goal implies resolving the sense identification function, $f : D \to S$, for which any lemma $u \in W$ and definition $d \in D_u$ with $d \implies s \in S$, it is as-

'gloss' labels, which indicate the definitions that apply to relationships. So from the data there is some knowledge of exact matching, but due to their limited, noisy, and crowd-sourced nature, the labelings may not cover all definitions that belong.

We assume annotations exhibit relationships between lemmas. Finding one, $u \overset{s}{\sim} v$, if $u$ is the source, we assume there exists some definition $d \in D_u$ that implies the appropriate sense: $d \implies s$. This good practice assumption models editor behavior as a response to exposure to a particular definition on the source page. Provided this, an editor won't necessarily annotate the relationship on the sink page, even if the sink page has a definition that implies the sense $s$.
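
As an illustration of the framework in Section 3.1, the following minimal sketch treats senses as vertices of a DAG, reads the partial order off reachability, and picks out the least specific senses among those a definition conveys. It is not the released implementation: the function names, the toy senses ("animal", "dog", "puppy", "cat"), and the choice to return every maximal sense when several are non-comparable are assumptions made only for this example.

```python
from collections import defaultdict


def build_children(edges):
    """Map each sense t to the senses s with a link (t, s), i.e. s refines t."""
    children = defaultdict(set)
    for t, s in edges:
        children[t].add(s)
    return children


def is_more_specific_or_equal(s, t, children):
    """Return True when s equals t or s is reachable from t in the DAG.

    This is the reachability reading of the partial order: s is comparable
    to t and at least as specific.
    """
    stack, seen = [t], set()
    while stack:
        node = stack.pop()
        if node == s:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


def least_specific(conveyed, children):
    """Return the conveyed senses that no other conveyed sense generalizes.

    If all conveyed senses are comparable, the result has one element.
    """
    return {
        t
        for t in conveyed
        if not any(
            other != t and is_more_specific_or_equal(t, other, children)
            for other in conveyed
        )
    }


# Hypothetical toy senses: an edge (t, s) means s is one refinement below t.
EDGES = [("animal", "dog"), ("dog", "puppy"), ("animal", "cat")]
CHILDREN = build_children(EDGES)

if __name__ == "__main__":
    # A definition conveying both "dog" and "puppy" resolves to "dog".
    print(least_specific({"dog", "puppy"}, CHILDREN))
    # "dog" and "cat" are non-comparable in this toy graph.
    print(is_more_specific_or_equal("dog", "cat", CHILDREN))
```

Returning a set rather than a single sense keeps the sketch well defined when a definition conveys non-comparable senses, a case the partial order explicitly allows.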