Word Embedding and WordNet Based Metaphor Identification and Interpretation

Rui Mao, Chenghua Lin and Frank Guerin
Department of Computing Science, University of Aberdeen, Aberdeen, United Kingdom
{r03rm16, chenghua.lin, f.guerin}@abdn.ac.uk

Abstract

Metaphoric expressions are widespread in natural language, posing a significant challenge for various natural language processing tasks such as Machine Translation. Current word embedding based metaphor identification models cannot identify the exact metaphorical word within a sentence. In this paper, we propose an unsupervised learning method that identifies and interprets metaphors at word-level without any preprocessing, outperforming strong baselines in the metaphor identification task. Our model extends to interpret the identified metaphors, paraphrasing them into their literal counterparts, so that they can be better translated by machines. We evaluated this with two popular translation systems for English to Chinese, showing that our model improved the systems significantly.

1 Introduction

Metaphor enriches language, playing a significant role in communication, cognition, and decision making. Relevant statistics illustrate that about one third of sentences in typical corpora contain metaphor expressions (Cameron, 2003; Martin, 2006; Steen et al., 2010; Shutova, 2016). Linguistically, metaphor is defined as a language expression that uses one or several words to represent another concept, rather than taking the literal meanings of the given words in the context (Lagerwerf and Meijers, 2008). Computational metaphor processing refers to modelling non-literal expressions (e.g., metaphor, metonymy, and personification) and is useful for improving many NLP tasks such as Machine Translation (MT) and Sentiment Analysis (Rentoumi et al., 2012). For instance, Google Translate failed to translate devour within a sentence, "She devoured his novels." (Mohammad et al., 2016), into Chinese. The term was translated into 吞噬, which takes the literal sense of swallow and is not understandable in Chinese. Interpreting metaphors allows us to paraphrase them into literal expressions which maintain the intended meaning and are easier to translate.

Metaphor identification approaches based on word embeddings have become popular (Tsvetkov et al., 2014; Shutova et al., 2016; Rei et al., 2017) as they do not rely on hand-crafted knowledge for training. These models follow a similar paradigm in which input sentences are first parsed into phrases and then the metaphoricity of the phrases is identified; they do not tackle word-level metaphor. E.g., given the former sentence "She devoured his novels.", the aforementioned methods will first parse the sentence into a verb-direct object phrase, devour novel, and then detect the clash between devour and novel, flagging this phrase as a likely metaphor. However, which component word is metaphorical cannot be identified, as important contextual words in the sentence were excluded while processing these phrases. Discarding contextual information also leads to a failure to identify a metaphor when both words in the phrase are metaphorical but, taken out of context, appear literal. E.g., "This young man knows how to climb the social ladder." (Mohammad et al., 2016) is a metaphorical expression. However, when the sentence is parsed into a verb-direct object phrase, climb ladder, it appears literal.

In this paper, we propose an unsupervised metaphor processing model which can identify and interpret linguistic metaphors at the word-level. Specifically, our model is built upon word embedding methods (Mikolov et al., 2013) and uses WordNet (Fellbaum, 1998) for lexical relation acquisition.

Our model is distinguished from existing methods in two aspects. First, our model is generic: it does not constrain the source domain of metaphor. Second, the developed model does not rely on any labelled data for model training, but rather captures metaphor in an unsupervised, data-driven manner. Linguistic metaphors are identified by modelling the distance (in the vector space) between the target word's literal and metaphorical senses. The metaphorical sense within a sentence is identified by its surrounding context within the sentence, using word embedding representations and WordNet. This novel approach allows our model to operate at the sentence level without any preprocessing, e.g., dependency parsing. Taking contexts into account also addresses the issue that a two-word phrase appears literal, but is metaphoric within a sentence (e.g., the climb ladder example).

We evaluate our model against three strong baselines (Melamud et al., 2016; Shutova et al., 2016; Rei et al., 2017) on the task of metaphor identification. Extensive experimentation conducted on a publicly available dataset (Mohammad et al., 2016) shows that our model significantly outperforms the baselines (Melamud et al., 2016; Shutova et al., 2016) on both phrase and sentence evaluation, and achieves equivalent performance to the state-of-the-art baseline (Rei et al., 2017) on phrase-level evaluation. In addition, while most existing works on metaphor processing solely evaluate model performance in terms of metaphor classification accuracy, we further conducted another set of experiments to evaluate how metaphor processing can be used to support the task of MT. Human evaluation shows that our model improves metaphoric translation significantly, tested on two prominent translation systems, namely Google Translate (https://translate.google.co.uk) and Bing Translator (https://www.bing.com/translator). To our best knowledge, this is the first metaphor processing model that is evaluated on MT.

To summarise, the contributions of this paper are two-fold: (1) we propose a novel framework for metaphor identification which does not require any preprocessing or annotated corpora for training; (2) we conduct, to our knowledge, the first metaphor interpretation study applying metaphor processing to support MT. We describe related work in §2, preliminaries in §3, our method in §4, experimental design in §5, results in §6 and conclusions in §7.

2 Related Work

A wide range of methods have been applied to computational metaphor processing. Turney et al. (2011); Neuman et al. (2013); Assaf et al. (2013) and Tsvetkov et al. (2014) identified metaphors by modelling the abstractness and concreteness of metaphors and non-metaphors, using a machine usable dictionary called the MRC Psycholinguistic Database (Coltheart, 1981). They believed that metaphorical words would be more abstract than literal ones. Some researchers used topic models to identify metaphors. For instance, Heintz et al. (2013) used Latent Dirichlet Allocation (LDA) (Blei et al., 2003) to model source and target domains, and assumed that sentences containing words from both domains are metaphorical. Strzalkowski et al. (2013) assumed that metaphorical terms occur outside the topic chain, where a topic chain is constructed from topical words that reveal the core discussion of the text. Shutova et al. (2017) performed metaphorical concept mappings between source and target domains in multiple languages using both unsupervised and semi-supervised learning approaches. The source and target domains are represented by semantic clusters, derived from the distribution of word co-occurrences. They also assumed that when contextual vocabularies are from different domains then there is likely to be a metaphor.

There is another line of approaches based on word embeddings.
Generally, these works are not limited by conceptual domains and hand-crafted knowledge. Shutova et al. (2016) proposed a model that identified metaphors by employing word and image embeddings. The model first parses sentences into phrases which contain target words. In their word embedding based approach, the metaphoricity of a phrase was identified by measuring the cosine similarity of the two component words in the phrase, based on their input vectors from Skip-gram word embeddings. If the cosine similarity is higher than a threshold, the phrase is identified as literal; otherwise metaphorical. Rei et al. (2017) identified metaphors by introducing a deep learning architecture. Instead of using word input vectors directly, they filtered out noisy information in the vector of one word in a phrase, projecting the word vector into another space via a sigmoid activation function. The metaphoricity of the phrases was learnt by training a supervised deep neural network.

The above word embedding based models, while demonstrating some success in metaphor identification, only explored using input vectors, which might hinder their performance. In addition, metaphor identification is highly dependent on its context. Therefore, phrase-level models (e.g., Tsvetkov et al. (2014); Shutova et al. (2016); Rei et al. (2017)) are likely to fail in the metaphor identification task if important contexts are excluded. In contrast, our model can operate at the sentence level, taking into account rich context, and hence can improve the performance of metaphor identification.

3 Preliminary: CBOW and Skip-gram

Our metaphor identification framework is built upon word embeddings, trained with Continuous Bag of Words (CBOW) and Skip-gram (Mikolov et al., 2013).

[Figure 1: CBOW and Skip-gram framework.]

In CBOW (see Figure 1), the input and output layers are context (C) and centre word (T) one-hot encodings, respectively. The model is trained by maximizing the probability of predicting a centre word, given its context (Rong, 2014):

  arg max p(t | c_1, ..., c_n, ..., c_m)    (1)

where t is a centre word and c_n is the nth of m context words of t within a sentence. CBOW's hidden layer is defined as:

  H_{CBOW} = (1/m) \sum_{n=1}^{m} W^{i\top} C_n = (1/m) \sum_{n=1}^{m} v_{c,n}^{i\top}    (2)

where C_n is the one-hot encoding of the nth context word, and v_{c,n}^{i} is the nth context word row vector (input vector) in W^{i}, a weight matrix between the input and hidden layers. Thus, the hidden layer is the transpose of the average of the input vectors of the context words. The probability of predicting a centre word in its context is given by a softmax function:

  u_t = W_t^{o\top} H_{CBOW} = v_t^{o\top} H_{CBOW}    (3)

  p(t | c_1, ..., c_n, ..., c_m) = exp(u_t) / \sum_{j=1}^{V} exp(u_j)    (4)

where W_t^{o} is equivalent to the output vector v_t^{o}, which is essentially a column vector in a weight matrix W^{o} between the hidden and output layers, aligning with the centre word t. V is the size of the vocabulary of the corpus.

The output is a one-hot encoding of the centre word. W^{i} and W^{o} are updated via back propagation of errors. Therefore, only the value of the position that represents the centre word's probability, i.e., p(t | c_1, ..., c_n, ..., c_m), will get close to 1; the probabilities of the rest of the words in the vocabulary will be close to 0 in every centre word training step. W^{i} embeds context words: vectors within W^{i} can be viewed as context word embeddings. W^{o} embeds centre words: vectors in W^{o} can be viewed as centre word embeddings.

Skip-gram is the reverse of CBOW (see Figure 1). The input and output layers are centre word and context word one-hot encodings, respectively. The target is to maximize the probability of predicting each context word, given a centre word:

  arg max p(c_1, ..., c_n, ..., c_m | t)    (5)

Skip-gram's hidden layer is defined as:

  H_{SG} = W^{i\top} T = v_t^{i\top}    (6)

where T is the one-hot encoding of the centre word t. Skip-gram's hidden layer is equal to the transpose of the centre word's input vector v_t, as only the tth row is kept by the operation. The probability of a context word is:

  u_{c,n} = W_{c,n}^{o\top} H_{SG} = v_{c,n}^{o\top} H_{SG}    (7)

  p(c_n | t) = exp(u_{c,n}) / \sum_{j=1}^{V} exp(u_j)    (8)

where c_n is the nth context word, given a centre word. In Skip-gram, W^{i} aligns to centre words, while W^{o} aligns to context words. Because the roles of centre word and context word embeddings are reversed in CBOW and Skip-gram, we will uniformly call vectors in W^{i} input vectors v^{i}, and vectors in W^{o} output vectors v^{o}, in the remaining sections. "Word embeddings" covers both input and output vectors.

[Figure 2: Metaphor identification framework. NB: w* = best fit word, w_t = target word.]

[Figure 3: Given CBOW trained input and output vectors, a target word devoured, and a context She [] his novels: cos(v^o_devoured, v^i_context) = -0.01, cos(v^o_enjoyed, v^i_context) = 0.02.]

4 Methodology

In this section, we present the technical details of our metaphor processing framework, built upon two hypotheses. Our first hypothesis (H1) is that a metaphorical word can be identified if the sense the word takes within its context and its literal sense come from different domains. This hypothesis is based on the theory of Selectional Preference Violation (Wilks, 1975, 1978), that a metaphorical item can be found in a violation of selectional restrictions, where a word does not satisfy its semantic constraints within a context. Our second hypothesis (H2) is that the literal senses of words occur more commonly in corpora than their metaphoric senses (Cameron, 2003; Martin, 2006; Steen et al., 2010; Shutova, 2016).

4.1 Metaphor identification

Figure 2 depicts an overview of our metaphor identification framework. The workflow is as follows. Step (1) involves training word embeddings on a Wikipedia dump (https://dumps.wikimedia.org/enwiki/20170920/) to obtain input and output vectors of words. In Step (2), given an input sentence, the target word (i.e., the word in the original text whose metaphoricity is to be determined) and its context words (i.e., all other words in the sentence excluding the target word) are separated. We construct a candidate word set W which represents all the possible senses of the target word. This is achieved by first extracting the synonyms and direct hypernyms of the target word from WordNet, and then augmenting the set with the inflections of the extracted synonyms and hypernyms, as well as the target word and its inflections; a construction of this set is sketched in code below. Auxiliary verbs are excluded from this set, as these words frequently appear in most sentences with little lexical meaning. In Step (3), we identify the best fit word, which is defined as the word that represents the literal sense that the target word is most likely taking given its context. Finally, in Step (4), we compute the cosine similarity between the target word and the best fit word. If the similarity is above a threshold, the target word is identified as literal, otherwise metaphoric (i.e., based on H1). We discuss Step (3) and Step (4) in detail below.
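The candidate set construction of Step (2) can be sketched as follows with NLTK's WordNet interface. The paper does not name its inflection or auxiliary-verb tooling, so lemminflect and the small AUX stop set below are our illustrative substitutions.

```python
# Sketch of Step (2): candidate word set W = synonyms + direct hypernyms
# of the target word (plus the target word itself), expanded with verb
# inflections; auxiliary verbs are dropped. AUX and lemminflect are
# illustrative choices, not the paper's stated tooling.
from nltk.corpus import wordnet as wn
from lemminflect import getAllInflections

AUX = {"be", "do", "have", "will", "shall", "can", "may", "must"}

def candidate_set(target_lemma):
    lemmas = {target_lemma}
    for synset in wn.synsets(target_lemma, pos=wn.VERB):
        lemmas.update(l.name() for l in synset.lemmas())   # synonyms
        for hyper in synset.hypernyms():                   # direct hypernyms
            lemmas.update(l.name() for l in hyper.lemmas())
    lemmas = {l for l in lemmas if "_" not in l} - AUX     # keep single words
    candidates = set(lemmas)
    for lemma in lemmas:                                   # add inflections
        for forms in getAllInflections(lemma, upos="VERB").values():
            candidates.update(forms)
    return candidates

# candidate_set("devour") contains e.g. "devoured", "enjoy", "enjoyed"
```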

Step (3): One of the key steps of our metaphor identification framework is to identify the best fit word for a target word given its surrounding context. The intuition is that the best fit word will represent the literal sense that the target word is most likely taking. E.g., for the sentence "She devoured his novels." and the corresponding target word devoured, the best fit word is enjoyed, as shown in Figure 3. Also note that the best fit word could be the target word itself, if the target word is used literally.

Given a sentence s, let w_t be the target word of the sentence, w* ∈ W the best fit word for w_t, and w_context the surrounding context of w_t, i.e., all the words in s excluding w_t. We compute the context embedding v^i_context by averaging the input vectors of the context words in w_context, based on Eq. 2. Next, we rank each candidate word k ∈ W by measuring its similarity to the context input vector v^i_context in the vector space. The candidate word with the highest similarity to the context is then selected as the best fit word.

  w* = arg max_k SIM(v_k, v_context)    (9)

where v_k is the vector of a candidate word k ∈ W. In contrast to existing word embedding based methods for metaphor identification, which only make use of input vectors (Shutova et al., 2016; Rei et al., 2017), we explore using both input and output vectors of CBOW and Skip-gram embeddings when measuring the similarity between a candidate word and the context. We expect that a combination of input and output vectors might work better. Specifically, we have experimented with four model variants:

  SIM-CBOW_I   = cos(v^i_{k,cbow}, v^i_{context,cbow})    (10)
  SIM-CBOW_I+O = cos(v^o_{k,cbow}, v^i_{context,cbow})    (11)
  SIM-SG_I     = cos(v^i_{k,sg}, v^i_{context,sg})    (12)
  SIM-SG_I+O   = cos(v^o_{k,sg}, v^i_{context,sg})    (13)

Here, cos(·) is cosine similarity, cbow denotes CBOW word embeddings, and sg denotes Skip-gram word embeddings. We also tried other model variants using output vectors for v_context; however, we found that models using output vectors for v_context (for both CBOW and Skip-gram embeddings) do not improve our framework's performance. Due to the page limit we omit the results of those models in this paper.

Step (4): Given the best fit word w* identified in Step (3), we compute the cosine similarity between the lemmatizations of w* and the target word w_t, using their input vectors:

  SIM(w*, w_t) = cos(v^i_{w*}, v^i_{w_t})    (14)

We give a detailed discussion in §4.2 of our rationale for using input vectors in Eq. 14. If the similarity is higher than a threshold (τ), the target word is considered literal; otherwise, metaphorical (based on H1). One benefit of our approach is that it allows one to paraphrase the identified metaphorical target word into the best fit word, representing its literal sense in the context. Such a feature is useful for supporting other NLP tasks such as Machine Translation, which we explore in §6. The value of the threshold (τ) is empirically determined based on a development set; please refer to §5 for details.

To better explain the workflow of our framework, we now go through an example, as illustrated in Figure 3. The target word of the input sentence "She devoured his novels." is devoured, and its lemmatised form devour has four verbal senses in WordNet, i.e., destroy completely; enjoy avidly; eat up completely with great appetite; and eat greedily. Each of these senses has a set of corresponding synonyms and hypernyms. E.g., Sense 3 (eat up completely with great appetite) has synonyms demolish, down, and consume, and hypernyms go through, eat up, finish, and polish off. We then construct a candidate word set W by including the synonyms and direct hypernyms of the target word from WordNet, augmenting the set with the inflections of the extracted synonyms and hypernyms, as well as the target word devour and its inflections. We then identify the best fit word given the context She [] his novels, based on Eq. 9.

Based on H2, literal expressions are more common than metaphoric ones in corpora. Therefore, the best fit word is expected to appear frequently within the given context, and thus represents the most likely sense of the target word. For example, the similarity between enjoy (i.e., the best fit word) and the context is higher than that of devour (i.e., the target word), as shown in Figure 3.
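Steps (3) and (4) condense into a short sketch under the SIM-CBOW_I+O variant (Eqs. 9, 11 and 14), again assuming a Gensim CBOW model whose wv.vectors and syn1neg matrices play the roles of W^i and W^o; lemma() stands in for an unspecified lemmatiser.

```python
# Sketch of Steps (3)-(4) with SIM-CBOW_{I+O}: rank candidates by the
# cosine between their OUTPUT vector and the averaged context INPUT
# vector (Eqs. 9 and 11), then compare the input vectors of the
# lemmatised best fit and target words against tau (Eq. 14).
import numpy as np

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(model, target, context_words, candidates, lemma, tau=0.6):
    context = [w for w in context_words if w in model.wv]
    v_context = np.mean([model.wv[w] for w in context], axis=0)  # cf. Eq. 2
    best_fit = max((w for w in candidates if w in model.wv),
                   key=lambda w: cos(model.syn1neg[model.wv.key_to_index[w]],
                                     v_context))
    s = cos(model.wv[lemma(best_fit)], model.wv[lemma(target)])  # Eq. 14
    return best_fit, ("literal" if s > tau else "metaphoric")    # H1
```

On the running example, identify(model, "devoured", ["she", "his", "novels"], W, lemma) would be expected to return ("enjoyed", "metaphoric") if the embeddings behave as in Figure 3.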
The bluer, Baselines. We compare the performance of the more negative. The redder, the more positive. our framework for metaphor identification against three strong baselines, namely, an unsupervised optimal, as words with different POS normally word embedding based model by Shutova et al. have different semantics. They tend to be distant (2016), a supervised deep learning model by Rei from each other in the input vector space. Tak- et al.(2017), and the Context2Vec model 5 (Mela- ing Skip-gram for example, empirically, input vec- mud et al., 2016) which achieves the best perfor- tors of words with the same POS, occurring within mance on Microsoft Sentence Completion Chal- the same contexts tend to be close in the vector lenge (Zweig and Burges, 2011). Context2Vec space (Mikolov et al., 2013), as they are frequently was not designed for processing metaphors, in or- updated by back propagating the errors from the der to use it for this we plug it into a very similar same context words. In contrast, input vectors framework to that described in Figure2. We use of words with different POS, playing different se- Context2Vec to predict the best fit word from the mantic and syntactic roles tend to be distant from candidate set, as it similarly uses context to predict each other, as they seldom occur within the same the most likely centre word but with bidirectional contexts, resulting in their input vectors rarely be- LSTM based context embedding method. After ing updated equally. Our observation is also in line locating the best fit word with Context2Vec, we with Nalisnick et al.(2016), who examine IN-IN, identify the metaphoricity of a target word with OUT-OUT and IN-OUT vectors to measure simi- the same method (see Step (4) in §4), so that larity between two words. Nalisnick et al. discov- we can also apply it for metaphor interpretation. ered that two words which are similar by function Note that while Shutova et al. and Rei et al. de- or type have higher cosine similarity with IN-IN or tect metaphors at the phrase level by identifying OUT-OUT vectors, while using input and output metaphorical phrases, Melamud et al.’s model can vectors for two words (IN-OUT) that frequently perform metaphor identification and interpretation co-occur in the same context (e.g., a sentence) can on sentences. obtain a higher similarity score. Dataset. Evaluation was conducted based on a dataset developed by Mohammad et al.(2016). This dataset6, containing 1,230 literal and 409 For illustrative purpose, we visualize the metaphor sentences, has been widely used for CBOW and Skip-gram updates between 4- metaphor identification related research (Shutova dimensional input and output vectors by Wevi4 et al., 2016; Rei et al., 2017). There is a verbal tar- (Rong, 2014), using a two-sentence corpus, get word annotated by 10 annotators in each sen- “Drink apple juice.” and “Drink orange juice.”. tence. We use two subsets of the Mohammad et al. We feed these two sentences to CBOW and Skip- set, one for phrase evaluation and one for sentence gram with 500 iterations. As seen Figure4, the in- evaluation. The phrase evaluation dataset was put vectors of apple and orange are similar in both kindly provided by Shutova, which consists of 316 CBOW and Skip-gram, which are different from metaphorical and 331 literal phrases (subject-verb the input vectors of their context words (drink and and verb-direct object word pairs), parsed from juice). However, the output vectors of apple and Mohammad et al.’s dataset. 
Similar to Shutova orange are similar to the input vectors of drink and et al.(2016), we use 40 metaphoric and 40 literal juice. phrases as a development set and the rest as a test To summarise, using input vectors to compare similarity between the best fit word and the tar- 5http://u.cs.biu.ac.il/˜nlp/resources/ get word is more appropriate (cf. Eq.14), as they downloads/context2vec/ 6http://saifmohammad.com/WebPages/ 4https://ronxin.github.io/wevi/ metaphor.html

For sentence evaluation, we select 212 metaphorical sentences whose target words are annotated with at least 70% agreement. We also add 212 literal sentences with the highest agreement. Among the 424 sentences, we form our development set from 12 randomly selected metaphoric and 12 literal instances, used to identify the threshold for detecting metaphors. The remaining 400 sentences are our test set.

Word embedding training. We train CBOW and Skip-gram models on a Wikipedia dump with the same settings as Shutova et al. (2016) and Rei et al. (2017). That is, CBOW and Skip-gram models are trained iteratively 3 times on Wikipedia with a context window of 5 to learn 100-dimensional input and output vectors. We exclude words with total frequency less than 100. 10 negative samples are randomly selected for each centre word during training. The word down-sampling rate is 10^-5. We use Stanford CoreNLP (Manning et al., 2014) lemmatized Wikipedia to train word embeddings for phrase level evaluation, in line with Shutova et al. (2016). For sentence evaluation, we use the original Wikipedia for training word embeddings.
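These settings map directly onto Gensim's word2vec implementation; a configuration sketch under that assumption follows. The file name is a hypothetical placeholder for a pre-tokenized dump.

```python
# Sketch of the embedding training settings above (assuming Gensim):
# 3 epochs, window 5, 100 dimensions, minimum frequency 100, 10 negative
# samples, 1e-5 down-sampling, for both CBOW (sg=0) and Skip-gram (sg=1).
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# Hypothetical path: one tokenized (or lemmatized) sentence per line.
wiki_sentences = LineSentence("enwiki-20170920-tokenized.txt")

common = dict(vector_size=100, window=5, min_count=100,
              negative=10, sample=1e-5, epochs=3)
cbow_model = Word2Vec(corpus_iterable=wiki_sentences, sg=0, **common)
sg_model = Word2Vec(corpus_iterable=wiki_sentences, sg=1, **common)
```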
6 Experimental Results

6.1 Metaphor identification

Table 1 shows the performance of our model and the baselines on the task of metaphor identification. All the results for our models are based on a threshold of 0.6, empirically determined on the development set.

        Method                  P     R     F1
Phrase  Shutova et al. (2016)   0.67  0.76  0.71
        Rei et al. (2017)       0.74  0.76  0.74
        SIM-CBOW_I+O            0.66  0.78  0.72
        SIM-SG_I+O              0.68  0.82  0.74*
Sent.   Melamud et al. (2016)   0.60  0.80  0.69
        SIM-SG_I                0.56  0.95  0.70
        SIM-SG_I+O              0.62  0.89  0.73
        SIM-CBOW_I              0.59  0.91  0.72
        SIM-CBOW_I+O            0.66  0.88  0.75*

Table 1: Metaphor identification results. NB: * denotes that our model outperforms the baseline significantly, based on a two-tailed paired t-test with p < 0.001.

For sentence level metaphor identification, it can be observed that all our models outperform the baseline (Melamud et al., 2016), with SIM-CBOW_I+O giving the highest F1 score of 75%, a 6% gain over the baseline. We also see that models based on both input and output vectors (i.e., SIM-CBOW_I+O and SIM-SG_I+O) yield better performance than the models based on input vectors only (i.e., SIM-CBOW_I and SIM-SG_I). This observation supports our assumption that using input and output vectors can better model similarity between words with different types of POS than simply using input vectors. Comparing CBOW and Skip-gram based models, we see that CBOW based models generally achieve better precision, whereas Skip-gram based models achieve better recall.

For phrase level metaphor identification, we compare our best performing models (i.e., SIM-CBOW_I+O and SIM-SG_I+O) against the approaches of Shutova et al. (2016) and Rei et al. (2017). In contrast to the sentence level evaluation, in which SIM-CBOW_I+O gives the best performance, SIM-SG_I+O performs best at the phrase level. This is likely due to the fact that Skip-gram is trained by using a centre word to maximise the probability of each context word, whereas CBOW uses the average of the context word input vectors to maximise the probability of the centre word. Thus, Skip-gram performs better at modelling a one-word context, while CBOW performs better at modelling multi-word contexts. Compared to the baselines, our model SIM-SG_I+O significantly outperforms the word embedding based approach of Shutova et al. (2016), and matches the performance of the supervised deep learning method (Rei et al., 2017), which requires a large amount of labelled data and training time.

SIM-CBOW_I+O and SIM-SG_I+O are also evaluated with different thresholds for both phrase and sentence level metaphor identification. As can be seen from Table 2, the results are fairly stable in terms of F1 when the threshold is set between 0.5 and 0.9.
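The threshold analysis behind Table 2 amounts to sweeping τ over development-set scores; a sketch, assuming a list of Eq. 14 similarity scores paired with gold labels:

```python
# Sketch of the development-set threshold sweep behind Table 2: a target
# word is predicted metaphoric when its Eq. 14 score is <= tau.
def sweep(scores, gold, taus=(0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)):
    for tau in taus:
        pred = ["metaphoric" if s <= tau else "literal" for s in scores]
        tp = sum(p == g == "metaphoric" for p, g in zip(pred, gold))
        prec = tp / max(1, pred.count("metaphoric"))
        rec = tp / max(1, gold.count("metaphoric"))
        f1 = 2 * prec * rec / max(1e-9, prec + rec)
        print(f"tau={tau:.1f}  P={prec:.2f}  R={rec:.2f}  F1={f1:.2f}")
```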

      Sentence (SIM-CBOW_I+O)   Phrase F1
τ     P     R     F1            SIM-CBOW_I+O  SIM-SG_I+O
0.3   0.75  0.60  0.67          0.56          0.46
0.4   0.69  0.75  0.72          0.65          0.63
0.5   0.67  0.82  0.74          0.71          0.72
0.6   0.66  0.88  0.75          0.72          0.74
0.7   0.64  0.88  0.74          0.72          0.73
0.8   0.63  0.89  0.74          0.72          0.73
0.9   0.63  0.89  0.74          0.71          0.73
1.0   0.50  1.00  0.67          0.65          0.65

Table 2: Model performance vs. different threshold (τ) settings. NB: the sentence level results are based on SIM-CBOW_I+O.

6.2 Metaphor processing for MT

We believe that one of the key purposes of metaphor processing is to support other NLP tasks. Therefore, we conducted another set of experiments to evaluate how metaphor processing can be used to support English-Chinese machine translation.

The evaluation task was designed as follows. From the test set for sentence-level metaphor identification, which contains 200 metaphoric and 200 literal sentences, we randomly selected 50 metaphoric and 50 literal sentences to construct a set S_M for the Machine Translation (MT) evaluation task. For each sentence in S_M, if it is predicted as literal by our model, the sentence is kept unchanged; otherwise, the target word of the sentence is paraphrased with the best fit word (see §4.1). The metaphor identification step resulted in 42 True Positive (TP) instances, where the ground truth label is metaphoric, and 19 False Positive (FP) instances, where the ground truth label is literal, i.e., a total of 61 instances predicted as metaphorical by our model. We also ran one of our baseline models, Context2Vec, on the 61 sentences to predict best fit words for comparison. Our hypothesis is that by paraphrasing the metaphorically used target word with the best fit word, which expresses the target word's real meaning, the performance of translation engines can be improved.

We test our hypothesis on two popular English-Chinese MT systems, i.e., the Google and Bing Translators. We recruited from a UK university 5 Computing Science postgraduate students who are Chinese native speakers to participate in the English-Chinese MT evaluation task. During the evaluation, subjects were presented with a questionnaire containing English-Chinese translations of each of the 100 randomly selected sentences. For each sentence predicted as literal (39 out of 100 sentences), there are two corresponding translations, by Google and Bing respectively. For each sentence predicted as metaphoric (61 out of 100 sentences), there are 6 corresponding translations.

An example of the evaluation task is shown in Figure 6, in which "The ex-boxer's job is to bounce people who want to enter this private club." is the original sentence, followed by the WordNet explanation of the target word of the sentence (i.e., bounce: eject from the premises). There are 6 translations. Nos. 1-2 are translations of the original sentence by Google Translate (GT) and Bing Translator (BT); the target word, bounce, is translated taking the sense of (1) physically rebounding like a ball (反弹) and (2) jumping (弹跳). Nos. 3-4 are SIM-CBOW_I+O paraphrased sentences, translated by GT and BT respectively, taking the sense of refusing (拒绝). Nos. 5-6 are Context2Vec paraphrased sentences, translated by GT and BT respectively, taking the sense of hitting (5. 打; 6. 打击).

[Figure 6: MT-based metaphor interpretation questionnaire. A sample item shows the original English sentence, the WordNet gloss of the target word, and six candidate Chinese translations, each rated Good/Bad.]

Subjects were instructed to determine whether the translation of a target word correctly represents its sense within the translated sentence, matching its context (cohesion) in Chinese. Note that we evaluate the translation of the target word; therefore, errors in context word translations are ignored by the subjects. The final label for each translation is the one agreed by more than half of the annotators. Noticeably, based on our observation, there is always a Chinese word corresponding to an English target word in MT, as the annotated target word normally carries important information in the sentence in the applied dataset.
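Put together, the preprocessing applied before translation reduces to one conditional rewrite per sentence. A sketch, reusing the identify() function from §4.1's example; the resulting string would be handed to an MT system, which the paper accessed through the public Google/Bing web interfaces rather than any particular API.

```python
# Sketch of the MT evaluation preprocessing: metaphoric target words are
# replaced by their best fit word before the sentence is sent to MT.
# identify() is the Step (3)-(4) sketch above; candidates/lemma as there.
def prepare_for_mt(model, tokens, target_idx, candidates, lemma):
    target = tokens[target_idx]
    context = tokens[:target_idx] + tokens[target_idx + 1:]
    best_fit, label = identify(model, target, context, candidates, lemma)
    if label == "metaphoric":
        tokens = tokens[:target_idx] + [best_fit] + tokens[target_idx + 1:]
    return " ".join(tokens)  # hand this string to the MT system
```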

We use translation accuracy as the measure to evaluate the improvement of the MT systems after metaphor processing. The accuracy is calculated by dividing the number of correctly translated instances by the total number of instances.

        Method         Acc-met.  Acc-lit.  Acc-overall
Google  Orig. Sent.    0.34      0.68      0.51
        Context2Vec    0.50      0.66      0.58
        SIM-CBOW_I+O   0.60      0.64      0.62
Bing    Orig. Sent.    0.42      0.70      0.56
        Context2Vec    0.60      0.66      0.63
        SIM-CBOW_I+O   0.66      0.64      0.65

Table 3: Accuracy of metaphor interpretation, evaluated on Google and Bing Translation.

[Figure 5: Accuracy of metaphor interpretation, evaluated on Google and Bing Translation. Bars compare original sentences with sentences paraphrased by our model and by the baseline (Melamud et al., 2016) for the literal, metaphoric, and overall classes; our model gains +0.26/+0.24 on the metaphoric class and +0.11/+0.09 overall for Google/Bing.]

As can be seen in Figure 5 and Table 3, after paraphrasing the metaphorical sentences with the SIM-CBOW_I+O model, the translation improvement on the metaphorical class is dramatic for both MT systems, i.e., a 26% improvement for Google Translate and 24% for Bing Translator. For the literal class, there is a small drop (4-6%) in accuracy, because some literals were wrongly identified as metaphors and hence errors were introduced during paraphrasing. Nevertheless, with our model, the overall translation performance of Google and Bing Translate is significantly improved, by 11% and 9% respectively. Our baseline model Context2Vec also improves the translation accuracy, but is 2-4% lower than our model in overall accuracy. In summary, the experimental results show the effectiveness of applying metaphor processing to support Machine Translation.

7 Conclusion

We proposed a framework that identifies and interprets metaphors at word-level with an unsupervised learning approach. Our model outperforms the unsupervised baselines in both sentence and phrase evaluations. The interpretation of the identified metaphorical words given by our model also improves the Google and Bing translation systems, with 11% and 9% accuracy gains respectively. The experiments show that using words' hypernyms and synonyms from WordNet can paraphrase metaphors into their literal counterparts, so that the metaphors can be correctly identified and translated. To our knowledge, this is the first study that evaluates a metaphor processing method on Machine Translation. We believe that, compared with simply identifying metaphors, metaphor processing applied to practical tasks can be more valuable in the real world. Additionally, our experiments demonstrate that using a candidate word's output vector, instead of its input vector, to model the similarity between the candidate word and its context yields better results in identifying the best fit word (the literal counterpart of the metaphor). CBOW and Skip-gram do not consider the distance between a context word and a centre word in a sentence, i.e., each context word contributes equally to predicting the centre word. Future work will introduce weighted CBOW and Skip-gram to learn positional information within sentences.

Acknowledgments

This work is supported by an award from the UK Engineering and Physical Sciences Research Council (Grant number: EP/P005810/1).

References

Dan Assaf, Yair Neuman, Yohai Cohen, Shlomo Argamon, Newton Howard, Mark Last, Ophir Frieder, and Moshe Koppel. 2013. Why "dark thoughts" aren't really dark: A novel algorithm for metaphor identification. In 2013 IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain (CCMB), pages 60-65.

David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3(Jan):993-1022.

Lynne Cameron. 2003. Metaphor in Educational Discourse. A&C Black.

Max Coltheart. 1981. The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology 33(4):497-505.

Christiane Fellbaum. 1998. WordNet: An Electronic Lexical Database. Bradford Books.

Ilana Heintz, Ryan Gabbard, Mahesh Srinivasan, David Barner, Donald S. Black, Marjorie Freedman, and Ralph Weischedel. 2013. Automatic extraction of linguistic metaphor with LDA topic modeling. In Proceedings of the First Workshop on Metaphor in NLP (ACL 2013), pages 58-66.

Luuk Lagerwerf and Anoe Meijers. 2008. Openness in metaphorical and straightforward advertisements: Appreciation effects. Journal of Advertising 37(2):19-30.

Christopher Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 55-60.

James H. Martin. 2006. A corpus-based analysis of context effects on metaphor comprehension. Technical Report CU-CS-738-94, University of Colorado, Computer Science Department.

Oren Melamud, Jacob Goldberger, and Ido Dagan. 2016. context2vec: Learning generic context embedding with bidirectional LSTM. In Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning (CoNLL 2016), pages 51-61.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the International Conference on Learning Representations (ICLR 2013).

Saif M. Mohammad, Ekaterina Shutova, and Peter D. Turney. 2016. Metaphor as a medium for emotion: An empirical study. In Proceedings of the Joint Conference on Lexical and Computational Semantics (*SEM 2016), page 23.

Eric Nalisnick, Bhaskar Mitra, Nick Craswell, and Rich Caruana. 2016. Improving document ranking with dual word embeddings. In Proceedings of the 25th International Conference Companion on World Wide Web, pages 83-84.

Yair Neuman, Dan Assaf, Yohai Cohen, Mark Last, Shlomo Argamon, Newton Howard, and Ophir Frieder. 2013. Metaphor identification in large texts corpora. PLoS ONE 8(4):e62343.

Radim Řehůřek and Petr Sojka. 2010. Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 45-50. http://is.muni.cz/publication/884893/en.

Marek Rei, Luana Bulat, Douwe Kiela, and Ekaterina Shutova. 2017. Grasping the finer point: A supervised similarity network for metaphor detection. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), pages 1537-1546.

Vassiliki Rentoumi, George A. Vouros, Vangelis Karkaletsis, and Amalia Moser. 2012. Investigating metaphorical language in sentiment analysis: A sense-to-sentiment perspective. ACM Transactions on Speech and Language Processing (TSLP) 9(3):6.

Xin Rong. 2014. word2vec parameter learning explained. arXiv preprint arXiv:1411.2738.

Ekaterina Shutova. 2016. Design and evaluation of metaphor processing systems. Computational Linguistics.

Ekaterina Shutova, Douwe Kiela, and Jean Maillard. 2016. Black holes and white rabbits: Metaphor identification with visual features. In Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2016), pages 160-170.

Ekaterina Shutova, Lin Sun, Elkin Darío Gutiérrez, Patricia Lichtenstein, and Srini Narayanan. 2017. Multilingual metaphor processing: Experiments with semi-supervised and unsupervised learning. Computational Linguistics 43(1):71-123.

Gerard J. Steen, Aletta G. Dorst, J. Berenike Herrmann, Anna Kaal, Tina Krennmayr, and Trijntje Pasma. 2010. A Method for Linguistic Metaphor Identification: From MIP to MIPVU, volume 14. John Benjamins Publishing.

Tomek Strzalkowski, George Aaron Broadwell, Sarah Taylor, Laurie Feldman, Samira Shaikh, Ting Liu, Boris Yamrom, Kit Cho, Umit Boz, Ignacio Cases, et al. 2013. Robust extraction of metaphor from novel data. In Proceedings of the First Workshop on Metaphor in NLP (ACL 2013), pages 67-76.

Yulia Tsvetkov, Leonid Boytsov, Anatole Gershman, Eric Nyberg, and Chris Dyer. 2014. Metaphor detection with cross-lingual model transfer. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), pages 248-258.

Peter D. Turney, Yair Neuman, Dan Assaf, and Yohai Cohen. 2011. Literal and metaphorical sense identification through concrete and abstract context. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), pages 680-690.

Yorick Wilks. 1975. A preferential, pattern-seeking, semantics for natural language inference. Artificial Intelligence 6(1):53-74.

Yorick Wilks. 1978. Making preferences more active. Artificial Intelligence 11(3):197-223.

Geoffrey Zweig and Christopher J.C. Burges. 2011. The Microsoft Research sentence completion challenge. Technical Report MSR-TR-2011-129, Microsoft.
