Semlink+: Framenet, Verbnet and Event Ontologies

SemLink+: FrameNet, VerbNet and Event Ontologies Martha Palmer, Claire Bonial Diana McCarthy Department of Linguistics Department of Theoretical and University of Colorado Applied Linguistics (DTAL) Boulder, CO University of Cambridge mpalmer/[email protected] [email protected] places and events buried within text. These Abstract relations can be very complex, and are not always explicit, requiring subtle semantic This paper reviews the significant interpretation of the data. For instance, NLP contributions FrameNet has made to systems must be able to automatically our understanding of lexical resources, recognize that Stock prices sank and The stock semantic roles and event relations. market is falling can be describing the same event. Such an interpretation relies upon a 1 Introduction recognition of the similarity between sinking and falling, as well as noting the connection One of the great challenges of Natural between stock prices and the stock market, Language Processing (NLP) is the multitude of and, finally, acknowledgment that they are choices that language gives us for expressing playing the same role. A key element in event the same thing in different ways. This is extraction is the identification of the obviously true when taking other languages participants of an event, such as the initiator of into consideration - the same thought can be an action and any parties affected by it. expressed in English, French, Chinese or Basically who did what to whom, when, where, Russian, with widely varying results. However, why and how? Many systems today rely on it is also true when considering a single semantic role labeling to help identify language such as English. Light verb participants, and lexical resources that provide constructions, nominalizations, idioms, slang, an inventory of possible predicate argument paraphrases, and synonyms all give us myriads structures for individual lexical items are of alternatives for “coining a phrase.” This crucial to the success of semantic role labeling causes immense difficulty for NLP systems. (Palmer,et al., 2010). No one has made greater contributions to advancing the state of the art of lexical 3 SemLink+ and Semantic Roles semantics, and its applications to NLP, than Chuck Fillmore. In this paper we focus on the SemLink (Palmer, 2009) is an ongoing effort central role that FrameNet has played in our to map complementary lexical resources: development of SemLink+ and in our current PropBank (Palmer et al., 2005), VerbNet explorations into event ontologies that can play (Kipper et al., 2008), FrameNet (Fillmore et a practical role in accurate automatic event al., 2004), and the recently added OntoNotes extraction. (ON) sense groupings (Weischedel, et al., 2011). They all associate semantic information 2 Detecting events with the propositions in a sentence. Each was created independently with somewhat differing An elusive goal of current NLP systems is the goals, and they vary in the level and nature of accurate detection of events – recognizing the semantic detail represented. FrameNet is the meaningful relations among the topics, people, 13 Proceedings of Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929–2014), pages 13–17, Baltimore, Maryland USA, June 27, 2014. c 2014 Association for Computational Linguistics most fine-grained with the richest semantics, for core verb specific roles and for adjuncts it VerbNet focuses on syntactically-based has several generally applicable ArgModifiers generalizations that carry semantic that have function tag labels such as: MaNneR, implications, and the relatively coarse-grained TeMPoral, LOCation, DIRection, GOaL, etc. PropBank has been shown to provide the most VerbNet uses more traditional linguistic effective training data for supervised Machine thematic role labels, with about 30 in total, and Learning techniques. Nonetheless, they can be assumes adjuncts (ArgM’s) will be supplied by seen as complementary rather than conflicting, PropBank based semantic role labelers. and together comprise a whole that is greater FrameNet is even more fine-grained and has than the sum of its parts. SemLink serves as a frame-specific core and peripheral roles called platform to unify these resources. The recent Frame Elements for each frame, amounting to addition of ON sense groupings, which can be over 2000 individual Frame Element types. thought of as a more coarse-grained view of For example, He talked about politics would WordNet (Fellbaum, 1998), provides even receive the following semantic role labels from broader coverage for verbs, and a level of each framework.1 representation that is appropriate for linking between VerbNet class members and PropBank (talk.01) FrameNet lexical units, as described below. HeArg0 talkedRELATION about politicsArg1 SemLink unifies these lexical resources at several different levels. First by providing VerbNet (Talk-37.5): type-to-type mappings between the lexical HeAGENT talkedRELATION about politicsTOPIC units for each framework. For PropBank these are the very coarse-grained rolesets, for FrameNet (Statement frame): VerbNet they are verbs that are members of HeSPEAKER talkedRELATION about politicsTOPIC VerbNet classes, and for FrameNet they are the lexical units associated with each Frame. The Thanks to Chuck Fillmore’s careful same lemma can have multiple PropBank guidance, the rich, meticulously crafted rolesets and can be in several VerbNet classes Frames in FrameNet, with their detailed and FrameNet frames, but always with descriptions of all possible arguments and their different meanings. In general, the mappings relations to each other, offer the potential of from PropBank to VerbNet or FrameNet tend providing a foundation for inferencing about to be 1-many, while the mappings between events and their consequences. In addition VerbNet and FrameNet are more likely to be 1- FrameNet has from the beginning been 1. For example, the verb hear has just one inclusive in its addition of nominal and coarse-grained sense in PropBank, with the adjectival forms to the Frames, which greatly following roleset: increases our coverage of all predicating elements (Bonial, et al., 2014). There is also a Arg0: hearer comprehensive FrameNet Constructicon that Arg1: utterance, sound painstakingly lists many phrasal constructions, Arg2: speaker, source of sound such as “the Xer, the Yer” that cannot be found anywhere else (Fillmore, et al., 2012). Many of This roleset maps to both the Discover and See these frames, including the constructions, classes of VerbNet, and the Hear and apply equally well to other languages, as Perception_experience frames of FrameNet. evidenced by the various efforts to develop Then, for each lexical unit, SemLink also FrameNets in other languages2 promising a supplies a mapping between the semantic roles likely benefit to multilingual information of PropBank and VerbNet, as well as the roles of VerbNet and FrameNet. PropBank uses 1 very generic labels such as Arg0 and Arg1, Arg0 maps to Agent maps to Speaker. Arg1 maps to Topic maps to Topic. which correspond to Dowty’s Prototypical 2 Agent and Patient, respectively (Dowty, 1991). See FrameNet projects in other languages listed at https://framenet.icsi.berkeley.edu/fndrupal/framenets_in_ PropBank has up to six numbered arguments other_languages 14 processing as well. Given the close theoretical the kitchen event. It can sometimes be quite ties between PropBank, VerbNet and challenging to determine the relationship FrameNet, it should be possible to bootstrap between two events. For instance, earthquakes from the successful PropBank-based automatic are quite often associated with the collapse of semantic role labelers to equally accurate buildings, as in the following example, The FrameNet and VerbNet annotators, and to quake destroyed parts of Sausalito. All tall improve overall semantic role labeling buildings were demolished. performance (Bauer & Rambow, 2011; Many readers might agree that the Dipanjan, et al., 2010; Giuglea & Moschitti, earthquakes CAUSED the demolishment of the 2006; Merlo & der Plas, 2009; Yi, et al., 2007). buildings. However, are the building collapses That is one of the primary goals of SemLink. also SUBEVENTs of the earthquakes? The first release of SemLink (1.1) contained Sometimes they happen a few days later, or mappings between these three lexical resources immediately, simultaneously with the as well as a set of PropBank instances from the earthquake. Are they both subevents? In Wall Street Journal data with mappings to general, for accurate event detection, it would VerbNet classes and thematic roles (Palmer, be very useful to know which events must 2009). Our most recent release, SemLink 1.2,3 precede, must follow, or cannot be now includes mappings to FrameNet frames simultaneous with, which other events. As and Frame Elements wherever they are discussed in the 2013 NAACL Events available (FN version 1.5), as well as ON sense workshop and this year’s ACL Events groupings (Bonial, et al., 2013). The mapping workshop, clear, consistent annotation of files between PropBank and VerbNet (version events and their coreference and causal and 3.2), and FrameNet have also been checked for temporal relations is a much desired but very consistency and updated to more accurately challenging goal (Ikuta & Palmer, 2014). Any reflect the current relations between these assistance that can be provided by lexical resources. resources is welcome. This annotated corpus can now be used to Another

Semlink+: Framenet, Verbnet and Event Ontologies

An Arabic Wordnet with Ontologically Clean Content

Verbnet Based Citation Sentiment Class Assignment Using Machine Learning

Enriching an Explanatory Dictionary with Framenet and Propbank Corpus Examples

Towards a Cross-Linguistic Verbnet-Style Lexicon for Brazilian Portuguese

Developing a Large Scale Framenet for Italian: the Iframenet Experience

Integrating Wordnet and Framenet Using a Knowledge-Based Word Sense Disambiguation Algorithm

The Hebrew Framenet Project

INF5830 Introduction to Semantic Role Labeling

Using Frame Semantics in Natural Language Processing

Event-Based Knowledge Reconciliation Using Frame Embeddings And

Out-Of-Domain Framenet Semantic Role Labeling

Leveraging Verbnet to Build Corpus-Specific Verb Clusters