Context-Aware Frame-Semantic Role Labeling
Michael Roth and Mirella Lapata
School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB
{mroth,mlap}@inf.ed.ac.uk

Transactions of the Association for Computational Linguistics, vol. 3, pp. 449–460, 2015. Action Editor: Diana McCarthy. Submission batch: 5/2015; Revision batch 7/2015; Published 8/2015. © 2015 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Abstract

Frame semantic representations have been useful in several applications ranging from text-to-scene generation, to question answering and social network analysis. Predicting such representations from raw text is, however, a challenging task and corresponding models are typically only trained on a small set of sentence-level annotations. In this paper, we present a semantic role labeling system that takes into account sentence and discourse context. We introduce several new features which we motivate based on linguistic insights and experimentally demonstrate that they lead to significant improvements over the current state-of-the-art in FrameNet-based semantic role labeling.

1 Introduction

The goal of semantic role labeling (SRL) is to identify and label the arguments of semantic predicates in a sentence according to a set of predefined relations (e.g., "who" did "what" to "whom"). In addition to providing definitions and examples of role labeled text, resources like FrameNet (Ruppenhofer et al., 2010) group semantic predicates into so-called frames, i.e., conceptual structures describing the background knowledge necessary to understand a situation, event or entity as a whole as well as the roles participating in it. Accordingly, semantic roles are defined on a per-frame basis and are shared among predicates.

In recent years, frame representations have been successfully applied in a range of downstream tasks, including question answering (Shen and Lapata, 2007), text-to-scene generation (Coyne et al., 2012), stock price prediction (Xie et al., 2013), and social network extraction (Agarwal et al., 2014). Whereas some tasks directly utilize information encoded in the FrameNet resource, others make use of FrameNet indirectly through the output of SRL systems that are trained on data annotated with frame-semantic representations. While advances in machine learning have recently given rise to increasingly powerful SRL systems following the FrameNet paradigm (Hermann et al., 2014; Täckström et al., 2015), little effort has been devoted to improving such models from a linguistic perspective.

In this paper, we explore insights from the linguistic literature suggesting a connection between discourse and role labeling decisions and show how to incorporate these in an SRL system. Although early theoretical work (Fillmore, 1976) has recognized the importance of discourse context for the assignment of semantic roles, most computational approaches have shied away from such considerations. To see how context can be useful, consider as an example the DELIVERY frame, which states that a THEME can be handed off to either a RECIPIENT or "more indirectly" to a GOAL. While the distinction between the latter two roles might be clear for some fillers (e.g., people vs. locations), there are others where both roles are equally plausible and additional information is required to resolve the ambiguity (e.g., countries). If we hear about a letter being delivered to Greece, for instance, reliable cues might be whether the sender is a person or a country and whether Greece refers to the geographic region or to the Greek government.

The example shows that context can generally influence the choice of the correct role label. Accordingly, we assume that modeling contextual information, such as the meaning of a word in a given situation, can improve semantic role labeling performance. To validate this assumption, we explore different ways of incorporating contextual cues in an SRL model and provide experimental support that demonstrates the usefulness of such additional information.

The remainder of this paper is structured as follows. In Section 2, we present related work on semantic role labeling and the various features applied in traditional SRL systems. In Section 3, we provide additional background on the FrameNet resource. Sections 4 and 5 describe our baseline system and contextual extensions, respectively, and Section 6 presents our experimental results. We conclude the paper by discussing in more detail the output of our system and highlighting avenues for future work.

2 Related Work

Early work in SRL dates back to Gildea and Jurafsky (2002), who were the first to model role assignment to verb arguments based on FrameNet. Their model makes use of lexical and syntactic features, including binary indicators for the words involved, syntactic categories, dependency paths as well as position and voice in a given sentence. Most subsequent work in SRL builds on Gildea and Jurafsky's feature set, often with the addition of features that describe relevant syntactic structures in more detail, e.g., the argument's leftmost/rightmost dependent (Johansson and Nugues, 2008).

More sophisticated features include the use of convolution kernels (Moschitti, 2004; Croce et al., 2011) in order to represent predicate-argument structures and their lexical similarities more accurately. Beyond lexical and syntactic information, a few approaches employ additional semantic features based on annotated word senses (Che et al., 2010) and selectional preferences (Zapirain et al., 2013). Deschacht and Moens (2009) and Huang and Yates (2010) use sentence-internal sequence information, in the form of latent states in a hidden Markov model. More recently, a few approaches (Roth and Woodsend, 2014; Lei et al., 2015; Foland and Martin, 2015) explore ways of using low-rank vector and tensor approximations to represent lexical and syntactic features as well as combinations thereof.

To the best of our knowledge, there exists no prior work where features based on discourse context are used to assign roles on the sentence level. Discourse-like features have been previously applied in models that deal with so-called implicit arguments, i.e., roles which are not locally realized but resolvable within the greater discourse context (Ruppenhofer et al., 2010; Gerber and Chai, 2012). Successful features for resolving implicit arguments include the distance between mentions and any discourse relations occurring between them (Gerber and Chai, 2012), roles assigned to mentions in the previous context, the discourse prominence of the denoted entity (Silberer and Frank, 2012), and its centering status (Laparra and Rigau, 2013). None of these features have been used in a standard SRL system to date (and trivially, not all of them will be helpful as, for example, the number of sentences between a predicate and an argument is always zero within a sentence). In this paper, we extend the contextual features used for resolving implicit arguments to the SRL task and show how a set of discourse-level enhancements can be added to a traditional sentence-level SRL model.

3 FrameNet

The Berkeley FrameNet project (Ruppenhofer et al., 2010) develops a semantic lexicon and an annotated example corpus based on Fillmore's (1976) theory of frame semantics. Annotations consist of frame-evoking elements (i.e., words in a sentence that are associated with a conceptual frame) and frame elements (i.e., instantiations of semantic roles, which are defined per frame and filled by words or word sequences in a given sentence). For example, the DELIVERY frame describes a scene or situation in which a DELIVERER hands off a THEME to a RECIPIENT or a GOAL.¹ In total, there are 1,019 frames and 8,886 frame elements defined in the latest publicly available version of FrameNet.² An average number of 11.6 different frame-evoking elements are provided for each frame (11,829 in total). Following previous work on FrameNet-based SRL, we use the full text annotation data set, which contains 23,087 frame instances.

¹ See https://framenet2.icsi.berkeley.edu/ for a comprehensive list of frames and their definitions.

Semantic annotations for frame instances and fillers of frame elements are generally provided at the level of word sequences, which can be single words, complete or incomplete phrases, and entire clauses (Ruppenhofer et al., 2010, Chapter 4). An instance of the DELIVERY frame, with annotations of the frame-evoking element (underlined) and instantiated frame elements (in brackets), is given in the example below:

(1) The Soviet Union agreed to speed up [oil]_THEME deliveries_DELIVERY [to Yugoslavia]_RECIPIENT.

Note that the oil deliveries here concern Yugoslavia as a geopolitical entity and hence the RECIPIENT

roles and implement I/O methods to read and generate FrameNet XML files. For direct comparison with the previous state-of-the-art for FrameNet-based SRL, we further implement additional features used in the SEMAFOR system (Das et al., 2014) and combine the role labeling components of mate-tools with SEMAFOR's preprocessing toolchain.³ All features used in our system are listed in Table 1.

The main differences between our adaptation of mate-tools and SEMAFOR are as follows: whereas the latter implements identification and labeling of role fillers in one step, mate-tools follow the insight that these two steps are conceptually different (Xue and Palmer, 2004) and should be modeled separately. Accordingly, mate-tools contain a global reranking component which takes into account identification and labeling decisions while SEMAFOR only uses reranking techniques to filter overlapping argument predictions and other constraints (see Das et al., 2014 for details).