Detecting Novel Metaphor Using Selectional Preference Information
Total Page:16
File Type:pdf, Size:1020Kb
Detecting novel metaphor using selectional preference information Hessel Haagsma and Johannes Bjerva University of Groningen The Netherlands Abstract costs an arm and a leg.’ and the Dutch ‘Dit lesboek kost een rib uit het lijf.’ (lit. ‘This textbook costs a Recent work on metaphor processing often rib from the body.’) to be mapped to the same mean- employs selectional preference information. ing representation, which should not include refer- We present a comparison of different ap- ences to any of the body parts used in the idiomatic proaches to the modelling of selectional pref- erences, based on various ways of generaliz- expressions. ing over corpus frequencies. We evaluate on In this paper, we focus on one area of figurative the VU Amsterdam Metaphor corpus, a broad language processing, namely the detection of novel corpus of metaphor. We find that using only metaphor, and especially the utility of selectional selectional preference information is enough preference features for this task. By doing this, we to outperform an all-metaphor baseline clas- hope to answer a two-fold question: can selectional sification, but that generalization through pre- preference information be used to successfully de- diction or clustering is not beneficial. A pos- tect novel metaphor? And how do generalization sible explanation for this lies in the nature of the evaluation data, and lack of power of se- methods influence the effectiveness of selectional lectional preference information on its own preference information? for non-novel metaphor detection. To bet- ter investigate the role of metaphor type in 2 Background metaphor detection, we suggest a resource with annotation of novel metaphor should be 2.1 Types of metaphor in language created. The first part of any work on metaphor processing is is to arrive at a definition for ‘metaphor’. A good starting point for this is the MIP frame- 1 Introduction work (Pragglejaz Group, 2007), which has this as Within natural language processing (NLP), there has the most important criterion for metaphor: ‘a lexi- been an increasing interest in the processing of fig- cal unit is metaphorical if it has a more basic con- urative language in recent years. This increasing in- temporary meaning in other contexts than in the cur- terest is exemplified by the (NA)ACL workshops on rent context’ (paraphrased). Where ‘more basic’ is metaphor in NLP, and shared tasks like SemEval- defined as more concrete, related to bodily action, 2015, Task 11, about sentiment analysis of figurative more precise, and historically older. This is a very language in Twitter. broad definition, causing many meanings to be clas- The main benefits of improved treatment of figu- sified as metaphorical. rative language within NLP lie with high-level tasks Examples of this are found in the VU Amster- dependent on semantics, such as semantic parsing. dam Metaphor corpus, which is annotated using the For example, in multilingual semantic parsing, one MIPVU procedure (Steen et al., 2010), an exten- would like to see both the English ‘This textbook sion of MIP. In the corpus, we find a wide range of 10 Proceedings of The Fourth Workshop on Metaphor in NLP, pages 10–17, San Diego, CA, 17 June 2016. c 2016 Association for Computational Linguistics metaphor from highly conventionalized metaphor, ble pitfall of trying to solve two essentially different such as have in Example 1, to more clear metaphor- problems using the same method. ical cases, like rolling in Example 2. 2.2 Selectional preference violation as a feature (1) Do the Greeks have a word for it? for metaphor detection (2) [...] the hillsides ceased their upward Because, by definition, we cannot use a pre-existing rolling and curved together [...] sense or meaning definition for novel metaphor, novel metaphor processing provides a new kind of Three general categories of meaning can be dis- challenge. On the other hand, with the ultimate goal tinguished. The first type of contextual meaning is of integrating figurative language processing in a se- literal meaning, which is the most basic meaning of mantic parsing framework, novel metaphor is the a word. This kind of meaning is generally the least most problematic for deducing meaning, and is thus problematic, since the meaning can be deduced from the most relevant and interesting. the lexicon. The second type of meaning is conven- Another important aspect of novel metaphors is tional metaphorical meaning, i.e. a non-basic mean- that they tend to be less frequent than conventional ing of a word, which is also in the lexicon or sense metaphor (and literals). This has the advantage of inventory. making one feature commonly used for metaphor Novel metaphors form a third type of contextual detection, selectional preference violation, more ef- meaning, one which is non-basic, but is not ac- fective. counted for by the lexicon either. These are the most The idea of using selectional preference viola- problematic, since their meaning cannot be deduced tion as an indicator of metaphoricity goes far back by selecting the correct sense from a sense inventory. (Wilks, 1975; Wilks, 1978), and has been widely Aligned with the distinction between conven- used in previous work on metaphor detection (Fass, tional metaphor and novel metaphor is the dis- 1991; Mason, 2004; Jia and Yu, 2008; Shutova et tinction between word sense disambiguation and al., 2010; Neuman et al., 2013; Wilks et al., 2013). metaphor detection. That is, word sense dis- Using selectional preference violation as a heuris- ambiguation can be expected to cover literal and tic works well, but has the fundamental shortcoming conventionalized metaphorical meanings, whereas that selectional preferences are based on frequency; metaphor processing is necessary to deal with novel selectional preference data are obtained from a cor- metaphorical meanings. pus by counting how often words occur together in a We argue here that the most important applica- certain relation. Metaphor, on the other hand, is de- tion of metaphor processing approaches lies not with fined by basicness of meaning, and not frequency of conventional metaphors, but with novel metaphors. meaning. Although these two are correlated, there This type of metaphor is the most problematic for are also many cases where the figurative sense of a other NLP applications, since it requires a deduction word has become more frequent than its original, lit- of meaning from outside the sense lexicon. There- eral sense. fore, it requires a dedicated metaphor processing The result of this is that metaphor detection sys- system, whereas more conventional metaphor can, tems based on selectional preferences erroneously in principle, be handled by existing WSD systems. classify low-frequent literals as metaphorical and This is not to say that metaphor processing re- high-frequent metaphors as literal. However, if we search should exclusively focus on novel metaphor, intend only to capture novel metaphors, the selec- since insights about metaphor and metaphoricity can tional preference violation approach is less vulnera- definitely be useful for improving the handling of ble to these errors, since novel metaphors are less conventional metaphor within a word sense disam- likely to be mistakenly classified as literal due to biguation framework. Nevertheless, the approach their lower frequency. proposed in this paper aims only to deal with novel As such, we hypothesize that this feature, when metaphor. In addition, separating the treatment of used correctly, can be very useful for metaphor de- novel and conventional metaphor avoids the possi- tection. To this purpose, we examine new ways of 11 generalizing over selectional preferences obtained scientific or encyclopaedia-style text. In contrast, from a corpus. the data used for evaluation is taken from differ- ent parts of the BNC, which represent different gen- 3 Methods res. Nevertheless, we assume that the larger size and number of topics of the Wikipedia corpus compen- Selectional preference information can be automati- sate for this. cally gathered from a large, parsed corpus. This al- As a dependency parser, we use the spacy.io3 lows for counting the occurrences of specific pairs, Python library, because of its speed, high accuracy such as ‘evoke-excitement’. Based on this, we can and ease-of-use. From the parsed corpus, all verb- then calculate conditional probabilities or a measure noun pairs with the labels nsubj and dobj were like selectional association (Resnik, 1993). extracted and counted. This yielded 101 million However, even a very large corpus will not yield verb-subject triples of 14.0 million distinct types and full coverage of all possible pairs. This could cause 67 million verb-object triples of 7.8 million distinct the system to flag, for example ‘evoke-exhilaration’ types. The triples contained 175,630 verb lemma as a selectional preference violation and thus as types and 1,982,512 noun lemma types. metaphoric, simply because this combination occurs very infrequently in the corpus. 3.2 Selectional preference metrics The solution to this problem is to do some sort of Selectional preference information can be used in generalization over these verb-noun pairs. One way three ways: as a conditional probability, a selec- of doing this is to perform clustering the verbs and tional preference strength for the verb, and as selec- nouns in semantically coherent classes. We test two tional association between a verb and a noun. The clustering methods: Brown clustering (Brown et al., conditional probability is defined as follows: 1992), and a novel method, which relies on k-means clustering of word embeddings. P (n, v) c(n, v) P (n v) = (1) In addition to these clustering methods, we also | P (v) ≈ c(v) look at using word embeddings directly in a classi- Where P (n v) is the probability of seeing noun n | fier, in order to prevent the information loss inher- occurring with verb v.