Unifying Cross-Lingual Semantic Role Labeling with Heterogeneous Linguistic Resources


Simone Conia, Andrea Bacciu, Roberto Navigli
Sapienza NLP Group, Department of Computer Science, Sapienza University of Rome
{first.lastname}@uniroma1.it

Abstract

While cross-lingual techniques are finding increasing success in a wide range of Natural Language Processing tasks, their application to Semantic Role Labeling (SRL) has been strongly limited by the fact that each language adopts its own linguistic formalism, from PropBank for English to AnCora for Spanish and PDT-Vallex for Czech, inter alia. In this work, we address this issue and present a unified model to perform cross-lingual SRL over heterogeneous linguistic resources. Our model implicitly learns a high-quality mapping for different formalisms across diverse languages without resorting to word alignment and/or translation techniques. We find that, not only is our cross-lingual system competitive with the current state of the art, but that it is also robust to low-data scenarios. Most interestingly, our unified model is able to annotate a sentence in a single forward pass with all the inventories it was trained with, providing a tool for the analysis and comparison of linguistic theories across different languages. We release our code and model at https://github.com/SapienzaNLP/unify-srl.

1 Introduction

Semantic Role Labeling (SRL) – a long-standing open problem in Natural Language Processing (NLP) and a key building block of language understanding (Navigli, 2018) – is often defined as the task of automatically addressing the question "Who did what to whom, when, where, and how?" (Gildea and Jurafsky, 2000; Màrquez et al., 2008). While the need to manually engineer and fine-tune complex feature templates severely limited early work (Zhao et al., 2009), the great success of neural networks in NLP has resulted in impressive progress in SRL, thanks especially to the ability of recurrent networks to better capture relations over sequences (He et al., 2017; Marcheggiani et al., 2017). Owing to the recent wide availability of robust multilingual representations, such as multilingual word embeddings (Grave et al., 2018) and multilingual language models (Devlin et al., 2019; Conneau et al., 2020), researchers have been able to shift their focus to the development of models that work on multiple languages (Cai and Lapata, 2019b; He et al., 2019; Lyu et al., 2019).

A robust multilingual representation is nevertheless just one piece of the puzzle: a key challenge in multilingual SRL is that the task is tightly bound to linguistic formalisms (Màrquez et al., 2008) which may present significant structural differences from language to language (Hajic et al., 2009). In the recent literature, it is standard practice to sidestep this issue by training and evaluating a model on each language separately (Cai and Lapata, 2019b; Chen et al., 2019; Kasai et al., 2019; He et al., 2019; Lyu et al., 2019). Although this strategy allows a model to adapt itself to the characteristics of a given formalism, it is burdened by the non-negligible need for training and maintaining one model instance for each language, resulting in a set of monolingual systems.

Instead of dealing with heterogeneous linguistic theories, another line of research consists in actively studying the effect of using a single formalism across multiple languages through annotation projection or other transfer techniques (Akbik et al., 2015, 2016; Daza and Frank, 2019; Cai and Lapata, 2020; Daza and Frank, 2020). However, such approaches often rely on word aligners and/or automatic translation tools which may introduce a considerable amount of noise, especially in low-resource languages. More importantly, they rely on the strong assumption that the linguistic formalism of choice, which may have been developed with a specific language in mind, is also suitable for other languages.

In this work, we take the best of both worlds and propose a novel approach to cross-lingual SRL. Our contributions can be summarized as follows:

• We introduce a unified model to perform cross-lingual SRL with heterogeneous linguistic resources;
• We find that our model is competitive against state-of-the-art systems on all the 6 languages of the CoNLL-2009 benchmark;
• We show that our model is robust to low-resource scenarios, thanks to its ability to generalize across languages;
• We probe our model and demonstrate that it implicitly learns to align heterogeneous linguistic resources;
• We automatically build and release a cross-lingual mapping that aligns linguistic formalisms from diverse languages.

We hope that our unified model will further advance cross-lingual SRL and represent a tool for the analysis and comparison of linguistic theories across multiple languages.

2 Related Work

End-to-end SRL. The SRL pipeline is usually divided into four steps: predicate identification, predicate sense disambiguation, argument identification, and argument classification. While early research focused its efforts on addressing each step individually (Xue and Palmer, 2004; Björkelund et al., 2009; Zhao et al., 2009), recent work has successfully demonstrated that tackling some of these subtasks jointly with multitask learning (Caruana, 1997) is beneficial. In particular, He et al. (2018) and, subsequently, Cai et al. (2018), Li et al. (2019) and Conia et al. (2020), indicate that predicate sense signals aid the identification of predicate-argument relations. Therefore, we follow this line and propose an end-to-end system for cross-lingual SRL.
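To make the four subtasks concrete, the sketch below shows the kind of structured output an end-to-end dependency-based SRL system produces for a single sentence. It is a minimal illustration, not taken from the paper or from CoNLL-2009 data: the sentence, the PropBank-style sense and role labels, and the dictionary layout are all our own assumptions for exposition.

```python
# Minimal illustration (not from the paper): the output of the four SRL steps
# for one sentence, with PropBank-style labels chosen by us for exposition.
sentence = ["The", "company", "increased", "its", "revenue", "last", "year"]

srl_output = {
    # 1) predicate identification: token indices that are predicates
    "predicates": [2],                # "increased"
    # 2) predicate sense disambiguation: an inventory-specific sense per predicate
    "senses": {2: "increase.01"},     # English PropBank roleset (illustrative)
    # 3) argument identification and 4) argument classification:
    #    for each predicate, the argument head tokens and their role labels
    "arguments": {
        2: {
            1: "A0",       # "company" - the causer of the increase
            4: "A1",       # "revenue" - the thing increasing
            6: "AM-TMP",   # "year"    - temporal modifier
        }
    },
}
```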
Multilingual SRL. Current work in multilingual SRL revolves mainly around the development of novel neural architectures, which fall into two broad categories, syntax-aware and syntax-agnostic ones. On one hand, the quality and diversity of the information encoded by syntax is an enticing prospect that has resulted in a wide range of contributions: Marcheggiani and Titov (2017) made use of Graph Convolutional Networks (GCNs) to better capture relations between neighboring nodes in syntactic dependency trees; Strubell et al. (2018) demonstrated the effectiveness of linguistically-informed self-attention layers in SRL; Cai and Lapata (2019b) observed that syntactic dependencies often mirror semantic relations and proposed a model that jointly learns to perform syntactic dependency parsing and SRL; He et al. (2019) devised syntax-based pruning rules that work for multiple languages. On the other hand, the complexity of syntax and the noisy performance of automatic syntactic parsers have deterred other researchers who, instead, have found methods to improve SRL without syntax: Cai et al. (2018) took advantage of an attentive biaffine layer (Dozat and Manning, 2017) to better model predicate-argument relations; Chen et al. (2019) and Lyu et al. (2019) obtained remarkable results in multiple languages by capturing predicate-argument interactions via capsule networks and iteratively refining the sequence of output labels, respectively; Cai and Lapata (2019a) proposed a semi-supervised approach that scales across different languages.

While we follow the latter trend and develop a syntax-agnostic model, we underline that both the aforementioned syntax-aware and syntax-agnostic approaches suffer from a significant drawback: they require training one model instance for each language of interest. Their two main limitations are, therefore, that i) the number of trainable parameters increases linearly with the number of languages, and ii) the information available in one language cannot be exploited to make SRL more robust in other languages. In contrast, one of the main objectives of our work is to develop a unified cross-lingual model which can mitigate the paucity of training data in some languages by exploiting the information available in other, resource-richer languages.
Cross-lingual SRL. A key challenge in performing cross-lingual SRL with a single unified model is the dissimilarity of predicate sense and semantic role inventories between languages. For example, the multilingual dataset distributed as part of the CoNLL-2009 shared task (Hajic et al., 2009) adopts the English Proposition Bank (Palmer et al., 2005) and NomBank (Meyers et al., 2004) to annotate English sentences, the Chinese Proposition Bank (Xue and Palmer, 2009) for Chinese, the AnCora (Taulé et al., 2008) predicate-argument structure inventory for Catalan and Spanish, the German Proposition Bank which, differently from the other PropBanks, is derived from FrameNet (Hajic et al., 2009), and PDT-Vallex (Hajic et al., 2003) for Czech. Many of these inventories are not aligned with each other as they follow and implement different linguistic theories which, in turn, may pose different challenges.
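For quick reference, the list above can be read as a simple configuration mapping each CoNLL-2009 language to the inventory it is annotated with. The snippet below merely restates that list; the language codes and string labels are our own shorthand, not identifiers used by the paper or the shared task.

```python
# The CoNLL-2009 languages and their predicate-argument inventories, restating
# the list in the paragraph above; language codes and labels are our shorthand.
CONLL2009_INVENTORIES = {
    "en": "English PropBank + NomBank",
    "zh": "Chinese Proposition Bank",
    "ca": "AnCora",
    "es": "AnCora",
    "de": "German Proposition Bank (derived from FrameNet)",
    "cs": "PDT-Vallex",
}
```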
Padó and Lapata (2009), and Akbik et al. (2015, 2016) worked around these issues by making the English PropBank act as a universal predicate sense and semantic role inventory and projecting PropBank-style annotations from English onto non-English sentences by means of word alignment techniques applied to parallel corpora such as Europarl (Koehn, 2005). These efforts resulted in the creation of the Universal PropBank, a multilingual collection of semi-automatically annotated corpora for SRL, which is actively in use today to train and evaluate novel cross-lingual methods such as word alignment techniques (Aminian et al., 2019).

3 Model Description

In the wake of recent work in SRL, our model falls into the broad category of end-to-end systems as it learns to jointly tackle predicate identification, predicate sense disambiguation, argument identification and argument classification. The model architecture can be roughly divided into the following components:

• A universal sentence encoder whose parameters are shared across languages and which produces word encodings that capture predicate-related information (Section 3.2);
• A universal predicate-argument encoder whose parameters are also shared across languages
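The component list above is where this excerpt ends, but the core idea it describes — an encoder whose parameters are shared across languages, feeding one output head per linguistic inventory so that a single forward pass yields annotations under every formalism the model was trained on — can be sketched roughly as follows. This is a minimal sketch under our own assumptions (class name, a pretrained multilingual encoder, hidden size, and made-up label-space sizes), not the authors' implementation, and it omits the predicate identification and sense disambiguation components.

```python
# Minimal sketch (not the authors' code) of a unified cross-lingual SRL head:
# a sentence encoder shared across languages, plus one role classifier per
# inventory, so one forward pass scores roles under every formalism at once.
from typing import Dict

import torch
import torch.nn as nn
from transformers import AutoModel


class UnifiedCrossLingualSRL(nn.Module):
    def __init__(self, inventories: Dict[str, int], hidden_size: int = 768,
                 encoder_name: str = "bert-base-multilingual-cased"):
        super().__init__()
        # Universal sentence encoder: parameters shared across all languages.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        # One role classifier per inventory (e.g. PropBank, AnCora, PDT-Vallex),
        # each with its own label space; sizes here are illustrative.
        self.role_heads = nn.ModuleDict({
            name: nn.Linear(hidden_size, num_roles)
            for name, num_roles in inventories.items()
        })

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor):
        # Shared, language-independent word encodings.
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # A single forward pass yields role scores under every inventory.
        return {name: head(states) for name, head in self.role_heads.items()}


# Example: three inventories with made-up numbers of role labels.
model = UnifiedCrossLingualSRL({"propbank": 40, "ancora": 50, "pdt-vallex": 70})
```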