Generating Lexical Representations of Frames Using Lexical Substitution

Saba Anwar†, Artem Shelmanov‡, Alexander Panchenko‡, and Chris Biemann†


†Universität Hamburg, Germany
‡Skolkovo Institute of Science and Technology, Russia

Abstract

Semantic frames are formal linguistic structures describing situations/actions/events, e.g. Commercial transfer of goods. Each frame provides a set of roles corresponding to the situation participants, e.g. Buyer and Goods, and lexical units (LUs) – words and phrases that can evoke this particular frame in texts, e.g. Sell. The scarcity of annotated resources hinders wider adoption of frame semantics across languages and domains. We investigate a simple yet effective method, lexical substitution with word representation models, to automatically expand a small set of frame-annotated sentences with new words for their respective roles and LUs. We evaluate the expansion quality using FrameNet. Contextualized models demonstrate overall superior performance compared to the non-contextualized ones on roles. However, the latter show comparable performance on the task of LU expansion.

Seed sentence: I hope Patti [Helper] can help [Assistance] you [Benefited party] soon [Time].
Substitutes for Assistance: assist, aid
Substitutes for Helper: she, I, he, you, we, someone, they, it, lori, hannah, paul, sarah, melanie, pam, riley
Substitutes for Benefited party: me, him, folk, her, everyone, people
Substitutes for Time: tomorrow, now, shortly, sooner, tonight, today, later

Table 1: An example of the induced lexical representation (roles and LUs) of the Assistance FrameNet frame using lexical substitutes from a single seed sentence.

1 Introduction

The goal of lexical substitution (McCarthy and Navigli, 2009) is to replace a given target word in its context with meaning-preserving alternatives. In this paper, we show how lexical substitution can be used for semantic frame expansion. A semantic frame is a linguistic structure used to describe the formal meaning of a situation or event (Fillmore, 1982). Semantic frames have witnessed a wide range of applications, such as question answering (Shen and Lapata, 2007; Berant and Liang, 2014; Khashabi et al., 2018), machine translation (Gao and Vogel, 2011; Zhai et al., 2013), and semantic role labelling (Do et al., 2017; Swayamdipta et al., 2018). The impact, however, is limited by the scarce availability of annotated resources. Some publicly available resources are FrameNet (Baker et al., 1998) and PropBank (Palmer et al., 2005), yet for many languages and domains, specialized resources do not exist. Besides, due to the inherent vagueness of frame definitions, the annotation task is challenging and requires semanticists or very complex crowd-sourcing setups (Fossati et al., 2013).

We suggest a different perspective on the problem: expanding the FrameNet resource automatically by using lexical substitution. Given a small set of seed sentences with their frame annotations, we can expand it by substituting the targets (words corresponding to lexical units of the respective frame) and arguments (words corresponding to roles of the respective frame) of those sentences and aggregating possible substitutions into an induced frame-semantic resource. Table 1 shows one such induced example. For this purpose, we have experimented with state-of-the-art non-contextualized (static) word representation models, including neural word embeddings, i.e. fastText (Bojanowski et al., 2017), GloVe (Pennington et al., 2014), and word2vec (Mikolov et al., 2013), and a distributional thesaurus, i.e. JoBimText (Biemann and Riedl, 2013), and compared their results with contextualized word representations of the state-of-the-art BERT model (Devlin et al., 2019), which has set a new benchmark performance on many downstream NLP applications. To complete the comparison, we also include the lexical substitution model of Melamud et al. (2015), which uses dependency-based word and context embeddings and produces context-sensitive lexical substitutes.

To generate substitutes, we decompose the problem into two sub-tasks. Lexical unit expansion: given a sentence and its target word, the task is to generate frame-preserving substitutes for this word. Frame role expansion: given a sentence and an argument, the task is to generate meaning-preserving substitutes for this argument.

Contributions of our work are (i) a method for inducing frame-semantic resources based on a few frame-annotated sentences using lexical substitution, and (ii) an evaluation of various distributional semantic models and lexical substitution methods on the ground truth from FrameNet.

2 Related Work

Approaches to semantic frame parsing with respect to a pre-defined semantic frame resource, such as FrameNet, have received much attention in the literature (Das et al., 2010; Oepen et al., 2016; Yang and Mitchell, 2017; Peng et al., 2018), with SEMAFOR (Das et al., 2014) being the most widely known system to extract complete frame structures including target identification. Some works focus on identifying partial structures such as frame identification (Hartmann et al., 2017; Hermann et al., 2014), role labelling with frame identification (Swayamdipta et al., 2017; Yang and Mitchell, 2017), and simple role labelling (Kshirsagar et al., 2015; Roth and Lapata, 2015; Swayamdipta et al., 2018), which is considered very similar to standard PropBank (Palmer et al., 2005) style semantic role labelling, albeit more challenging because of the high granularity of frame roles. These supervised models rely on a dataset of frame-annotated sentences such as FrameNet. FrameNet-like resources are available only for very few languages and cover only a few domains. In this paper, we venture into the inverse problem, the case where the number of annotations is insufficient, similar to the idea of Pennacchiotti et al. (2008), who investigated the utility of semantic spaces and WordNet-based methods to automatically induce new LUs and reported their results on FrameNet.

Our method is inspired by the recent work of Amrami and Goldberg (2018). They suggest predicting substitute vectors for target words using pre-trained ELMo (Peters et al., 2018) and dynamic symmetric patterns, then inducing the word senses using clustering. Arefyev et al. (2019) take the idea of substitute vectors from Amrami and Goldberg (2018) for the SemEval 2019 frame induction task (QasemiZadeh et al., 2019) and replace ELMo with BERT (Devlin et al., 2019) for improved performance. Zhou et al. (2019) show the utility of BERT for the lexical substitution task. Lexical substitution has been used for a range of NLP tasks such as paraphrasing or text simplification, but here, we are employing it, as far as we are aware, for the first time to perform expansion of frame-semantic resources.

3 Inducing Lexical Representations of Frames via Lexical Substitution

We experimented with two groups of lexical substitution methods. The first group uses no context: non-contextualized neural word embedding models, i.e. fastText (Bojanowski et al., 2017), GloVe (Pennington et al., 2014), and word2vec (Mikolov et al., 2013), as well as distributional thesaurus based models in the form of JoBimText (Biemann and Riedl, 2013). The second group of methods does use the context: here, we tried the contextualized word embedding model BERT (Devlin et al., 2019) and the lexical substitution model of Melamud et al. (2015).

3.1 Static Word Representations

These word representation models are inherently non-contextualized as they learn one representation of a word regardless of its context.

Neural Word Embeddings: Neural word embeddings represent words as vectors of continuous numbers, where words with similar meanings are expected to have similar vectors. Thus, to produce substitutes, we extracted the k nearest neighbors using a cosine similarity measure. We use the pre-trained embeddings released by the models' authors: fastText trained on the Common Crawl corpus, GloVe trained on the Common Crawl corpus with 840 billion words, and word2vec trained on Google News. All these models produce 300-dimensional vectors.
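As a concrete illustration of this nearest-neighbour lookup, the sketch below retrieves cosine-similar candidates from a pre-trained static model with gensim; the vector file name and the use of gensim are our assumptions for the example, not the authors' exact pipeline.

```python
# Minimal sketch of substitute generation with static embeddings: rank the
# k nearest neighbours of a target word by cosine similarity.
# Assumes gensim and a pre-trained vector file in word2vec text format
# (e.g. the fastText file cc.en.300.vec); not the authors' exact setup.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("cc.en.300.vec", binary=False)

def static_substitutes(target: str, k: int = 10):
    """Return the k words most cosine-similar to the target."""
    if target not in vectors:
        return []
    return vectors.most_similar(target, topn=k)  # list of (word, cosine) pairs

print(static_substitutes("help", k=5))
```

The same call works unchanged for GloVe or word2vec vectors once they are converted to the word2vec text format.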
Distributional Thesaurus (DT): In this approach, word similarities are computed using complex linguistic features such as dependency relations (Lin, 1998). The representations provided by DTs are sparser, but similarity scores based on them can be better. JoBimText (Biemann and Riedl, 2013) is a framework that offers many DTs computed on a range of different corpora. Context features for each word are ranked using the lexicographer's mutual information (LMI) score and used to compute word similarity by feature overlap. We extract the k nearest neighbors for the target word. We use two JoBimText DTs: (i) a DT built on Wikipedia with n-grams as contexts and (ii) a DT built on a 59G corpus (Wikipedia, Gigaword, ukWaC, and LCC corpora combined) using dependency relations as context.

3.2 Contextualized Models

Static word representations fail to handle polysemic words. This paves the way for context-aware word representation models, which can generate diverse word-probability distributions for a target word based on its context.
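For example, a masked language model can be queried directly for in-context substitutes of a target word. The sketch below does this with the Hugging Face fill-mask pipeline and bert-base-uncased; it illustrates the general idea of contextualized substitution, not necessarily the exact substitution strategy evaluated in this work.

```python
# Minimal sketch: in-context substitutes from BERT's masked language model.
# Uses the Hugging Face transformers fill-mask pipeline as an illustration
# of contextualized substitution; not the exact method evaluated here.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Mask the target word ("help") of the seed sentence from Table 1.
sentence = "I hope Patti can [MASK] you soon."
for prediction in fill_mask(sentence, top_k=5):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Unlike with the static models above, the candidate list depends on the sentence: masking a role filler such as "soon" instead of the target verb yields substitutes fitting that slot.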
Melamud et al. (2015): This simple model uses dependency-based word and context embeddings to produce context-sensitive lexical substitutes.

Roles and LUs can consist of a single token or multiple tokens. For this work, we have only considered single-token substitution. The datasets for evaluation were derived automatically from FrameNet. To create a gold standard for the LU expansion task, for each sentence containing an annotated LU, we consider the other LUs of the corresponding semantic frame as ground truth substitutes. We keep only LUs marked as verbs in FrameNet. To make a gold standard for the role expansion task, for each of the sentences that contain an annotation of a given frame role, we consider all the single-word annotations from the rest of the corpus marked with the same role and related to the same frame as ground truth substitutes.
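The LU gold standard described above can be assembled from the FrameNet data distributed with NLTK; the sketch below is a simplified illustration of that procedure (restricted to the verb LUs of a single named frame), not the evaluation code used in this work.

```python
# Minimal sketch: LU gold-standard substitutes from FrameNet via NLTK.
# For a sentence annotated with the LU "help.v" of the Assistance frame,
# the other verb LUs of that frame serve as ground-truth substitutes.
# Requires: pip install nltk, then nltk.download("framenet_v17").
# Simplified illustration, not the authors' evaluation code.
from nltk.corpus import framenet as fn

def lu_gold_substitutes(frame_name: str, annotated_lu: str):
    """Return lemmas of the frame's verb LUs, excluding the annotated one."""
    frame = fn.frame(frame_name)
    verb_lus = [name for name in frame.lexUnit if name.endswith(".v")]
    return [name.rsplit(".", 1)[0] for name in verb_lus if name != annotated_lu]

print(lu_gold_substitutes("Assistance", "help.v"))
# e.g. ['assist', 'aid', ...] depending on the FrameNet release
```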