Connecting WordNet, FrameNet and VerbNet Together


Nadiya Yampolksa
SS2007, University of Saarland
Seminar: Resources for Computational Linguists (Magdalena Wolska and Michaela Regneri)

Outline
• WordNet
• VerbNet
• FrameNet (already covered)
• Combining Resources: "Putting Pieces Together: Combining FrameNet, VerbNet and WordNet for Robust Semantic Parsing", Lei Shi and Rada Mihalcea
  - Combining Resources: why and how?
  - Combining Resources: Algorithms
    . Connecting VerbNet to WordNet
    . Connecting FrameNet to VerbNet
    . Labeling VerbNet Syntactic Frame Arguments with Semantic Roles
• Applications which use rich lexical resources

WordNet (Princeton University)
http://wordnet.princeton.edu/
• Synsets = synonym sets
• WordNet 2.0 has a network of 152,046 words
• WordNet includes semantic relations across concepts (more than 250,000 relations in WN 2.0): hypernymy/hyponymy, meronymy/holonymy, antonymy, entailment, etc.
• Download: http://wordnet.princeton.edu/obtain
• Take a glance at the standalone application

How to use WordNet in a different program?
• http://wordnet.princeton.edu/links : scripts in different programming languages for playing around with WN.
• Example: a Perl extension module for accessing and manipulating WordNet (by Dan Brian). This module allows access to the WordNet lexicon from Perl applications, as well as manipulation and extension of the lexicon.
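The notions of synset and hypernymy/hyponymy introduced above can be pictured with a small sketch. It is an illustration only: the `Synset` class, the `TOY_NET` dictionary and the helper names are hypothetical, and the four entries are hand-written toy data, not copied from the WordNet database or from any WordNet API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of synonymous lemmas plus links to more general synsets."""
    name: str
    lemmas: list
    hypernyms: list = field(default_factory=list)  # names of more general synsets


# Hand-written toy data for illustration; not the real WordNet hierarchy.
TOY_NET = {
    "entity": Synset("entity", ["entity"]),
    "time_period": Synset("time_period", ["time period", "period"], ["entity"]),
    "term": Synset("term", ["term"], ["time_period"]),
    "prison_term": Synset("prison_term", ["prison term", "sentence"], ["term"]),
}


def hypernym_closure(net, name):
    """Collect every synset reachable from `name` by following hypernym links."""
    seen = []
    stack = list(net[name].hypernyms)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(net[current].hypernyms)
    return seen


def hyponyms(net, name):
    """Hyponymy is the inverse relation: synsets that name `name` as a hypernym."""
    return sorted(s.name for s in net.values() if name in s.hypernyms)


print(hypernym_closure(TOY_NET, "prison_term"))
print(hyponyms(TOY_NET, "term"))
```

Storing only the hypernym links and deriving hyponyms by inversion keeps the toy data small; a real lexicon of WordNet's size would index both directions.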
Texts semantically annotated with WordNet 1.6 senses (created at Princeton University), and automatically mapped to WordNet 1.7 and WordNet 1.7.1: http://lit.csci.unt.edu/index.php/Downloads#WordNet_mappings

Tagged fragment of Brown corpus
Brown Corpus semantically tagged fragment:
<wf cmd=done pos=NN lemma=september wnsn=1 lexsn=1:28:00::>September</wf>
<wf cmd=done pos=NN lemma=october wnsn=1 lexsn=1:28:00::>October</wf>
<wf cmd=done pos=NN lemma=term wnsn=2 lexsn=1:28:00::>term</wf>
<wf cmd=done pos=NN lemma=jury wnsn=1 lexsn=1:14:00::>jury</wf>

WordNet entries for term:
The noun term has 7 senses (first 5 from tagged texts)
1. (307) term -- (a word or expression used for some particular thing; "he learned many medical terms")
2. (216) term -- (a limited period of time; "a prison term"; "he left school before the end of term")
3. (113) condition, term -- ((usually plural) a statement of what is required as part of an agreement; "the contract set out the conditions of the lease"; "the terms of the treaty were generous")
…

The same entries can be displayed with more information: augmented with lexical file information (e.g. <noun.time>), then with the location in the database (e.g. {15025298}), and finally with the sense number (e.g. term#2):
1. (307) {06220694} <noun.communication> term#1 -- (a word or expression used for some particular thing; "he learned many medical terms")
2. (216) {15025298} <noun.time> term#2 -- (a limited period of time; "a prison term"; "he left school before the end of term")
3. (113) {06680100} <noun.communication> condition2#4, term2#3 -- ((usually plural) a statement of what is required as part of an agreement; "the contract set out the conditions of the lease"; "the terms of the treaty were generous")
…

VerbNet (University of Colorado at Boulder)
• Verb lexicon with explicitly stated syntactic and semantic information, based on Levin's verb classification
• http://verbs.colorado.edu/~mpalmer/projects/verbnet.html
• Download: http://verbs.colorado.edu/~mpalmer/projects/verbnet/downloads.html
• 239 verb structures in VerbNet 2.1, in XML format
• Glance at sample entry

VerbNet 2.1 sample
• accompany-51.7.xml
• dedicate-79.xml
• Inspector (Java app): easy retrieval of only the necessary information about a verb. http://verbs.colorado.edu/verb-index/inspector/
  w - WordNet sense tags (-Vm required)
  t - thematic roles
  u - selectional restrictions for thematic roles (-Vt required)
  r - frames
  e - examples
  x - syntax
  z - selectional restrictions for syntax (-Vx required)
  s - semantics

Combining resources
• Which features of the component resources make combining them so useful and doable
• Algorithms for combining resources (based on Lei Shi and Rada Mihalcea)

Underlying grounds
• FrameNet: each annotated sentence exemplifies a possible syntactic realization of the semantic roles associated with a frame for a given target word (here: verbs only)
• VerbNet: verbs in the same VerbNet class (Levin) share common syntactic frames and therefore have the same behavior
• WordNet: related meanings in the WN hierarchy

Pluses and Minuses
FrameNet
+ good generalization across predicates using frames and semantic roles
- does not define selectional restrictions for semantic roles; limited coverage
VerbNet
+ better coverage; defines syntactic-semantic relations
- thematic roles are too generic
WordNet
+ almost complete coverage of English verbs, augmented with relational information between verb senses
- does not encode syntactic or semantic behaviour (predicate-argument realization)

How to profit from all
Goal 1. Augment the frame semantics with VerbNet verb classes by labeling FrameNet frames and semantic roles with VerbNet verb entries and corresponding arguments;
Goal 2. Extend the coverage of FrameNet verbs by exploiting both VerbNet verb classes and WordNet verb synonym and hyponym relations;
Goal 3. Identify explicit connections between semantic roles and semantic classes, by encoding selectional restrictions for semantic roles using WordNet noun hierarchies.

VN/FN to WN: defining selectional restrictions
Semi-automatic:
• VN lexical entries are already linked to WN. Now we also want to link selectional restrictions to noun classes in WN.
• VerbNet selectional restrictions are specified as generic terms (person, concrete, instrument, etc.)
• Map the restrictions to ontological classes, as indicated in WN (semantic hierarchy of nouns)
• Example: instrument (VN) → instrumentality (WN)
Manual annotation:
• FN lexical units are attached to a list of sense IDs in WN, similar to the annotation of VN with WN senses; this makes the direct mapping between VN and FN visible.

Mapping: FN verb senses with WN 2.0
3,094 entries (http://mira.csci.unt.edu/~stone/home.html)
…
closure open open%2:35:00::
closure close close%2:35:00::
closure lace lace%2:35:01::
judgment boo boo%2:32:00::
judgment fault fault%2:32:00::
judgment disapprove disapprove%2:31:00:: disapprove%2:32:00::
judgment scorn scorn%2:37:00:: scorn%2:32:00::
judgment mock mock%2:32:00:: mock%2:32:01::
…

Connecting FN to VN
STEP 1. Mapping VN verb entries to appropriate semantic frames in FN;
STEP 2. Linking arguments of VN syntactic frames with corresponding FN semantic roles.
Pre-process: Divide all VerbNet verb entries
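
The FrameNet-to-WordNet mapping entries listed earlier (a frame name, a verb lemma, and one or more WordNet sense keys) can be read with a few lines of code. The following is a minimal sketch, not the tooling used by the authors: the function names are hypothetical, the whitespace-separated line layout is assumed from the entries shown on the slide, and `LEX_FILES` covers only the lexicographer file numbers that occur in this document's examples.

```python
# Word-class codes used as the first field after '%' in a WordNet sense key.
SS_TYPES = {1: "noun", 2: "verb", 3: "adjective", 4: "adverb",
            5: "adjective satellite"}

# Lexicographer file numbers occurring in this document's examples
# (a subset of WordNet's lexnames list, not the full table).
LEX_FILES = {
    14: "noun.group", 28: "noun.time", 31: "verb.cognition",
    32: "verb.communication", 35: "verb.contact", 37: "verb.emotion",
}


def parse_sense_key(key):
    """Split a key such as 'open%2:35:00::' into its labelled parts."""
    lemma, _, rest = key.partition("%")
    ss_type, lex_filenum, lex_id = rest.split(":")[:3]
    return {
        "lemma": lemma,
        "pos": SS_TYPES.get(int(ss_type), "unknown"),
        "lex_file": LEX_FILES.get(int(lex_filenum), "file " + lex_filenum),
        "lex_id": int(lex_id),
    }


def parse_mapping_line(line):
    """Assumed layout: FRAME LEMMA SENSE_KEY [SENSE_KEY ...]."""
    frame, lemma, *keys = line.split()
    return {"frame": frame, "lemma": lemma,
            "senses": [parse_sense_key(k) for k in keys]}


entry = parse_mapping_line(
    "judgment disapprove disapprove%2:31:00:: disapprove%2:32:00::")
for sense in entry["senses"]:
    print(entry["frame"], entry["lemma"], sense["pos"], sense["lex_file"])
```

The `lexsn` attribute in the tagged Brown fragment has the same shape, so joining the lemma and the `lexsn` value with `%` (for example `term%1:28:00::`) yields a key that `parse_sense_key` can read as well.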