Semantic Classifications for Detection of Verb Metaphors

Semantic classifications for detection of verb metaphors Beata Beigman Klebanov1 and Chee Wee Leong1 and Elkin Dario Gutierrez2 and Ekaterina Shutova3 and Michael Flor1 1 Educational Testing Service 2 University of California, San Diego 3 University of Cambridge bbeigmanklebanov,cleong,mflor @ets.org [email protected]{ , [email protected]} Abstract (Tsvetkov et al., 2014; Heintz et al., 2013; Tur- ney et al., 2011; Birke and Sarkar, 2007; Gedigan We investigate the effectiveness of se- et al., 2006), rather than using naturally occurring mantic generalizations/classifications for continuous text, as done here. Beigman Klebanov capturing the regularities of the behavior et al. (2014) and Beigman Klebanov et al. (2015) of verbs in terms of their metaphoric- are the exceptions, used as a baseline in the current ity. Starting from orthographic word paper. unigrams, we experiment with various Features that have been used so far in super- ways of defining semantic classes for vised metaphor classification address concreteness verbs (grammatical, resource-based, dis- and abstractness, topic models, orthographic uni- tributional) and measure the effectiveness grams, sensorial features, semantic classifications of these classes for classifying all verbs using WordNet, among others (Beigman Klebanov in a running text as metaphor or non et al., 2015; Tekiroglu et al., 2015; Tsvetkov et al., metaphor. 2014; Dunn, 2014; Heintz et al., 2013; Turney et al., 2011). Of the feature sets presented in this pa- 1 Introduction per, all but WordNet features are novel. According to the Conceptual Metaphor theory (Lakoff and Johnson, 1980), metaphoricity is a 3 Semantic Classifications property of concepts in a particular context of use, In the following subsections, we describe the dif- not of specific words. The notion of a concept is a ferent types of semantic classifications; Table 1 fluid one, however. While write and wrote would summarizes the feature sets. likely constitute instances of the same concept according to any definition, it is less clear whether Name Description #Features eat and gobble would. Furthermore, the Con- U orthographic unigram varies ceptual Metaphor theory typically operates with UL lemma unigram varies whole semantic domains that certainly generalize VN-Raw VN frames 270 VN-Pred VN predicate 145 beyond narrowly-conceived concepts; thus, save VN-Role VN thematic role 30 and waste share a very general semantic feature of VN-RoRe VN them. role filler 128 applying to finite resources – it is this meaning el- WordNet WN lexicographer files 15 Corpus distributional clustering 150 ement that accounts for the observation that they tend to be used metaphorically in similar contexts. Table 1: Summary of feature sets. All features are In this paper, we investigate which kinds of gen- binary features indicating class membership. eralizations are the most effective for capturing regularities of metaphor usage. 3.1 Grammar-based 2 Related Work The most minimal level of semantic generalization Most previous supervised approaches to verb is that of putting together verbs that share the same metaphor classification evaluated their systems on lemma (lemma unigrams, UL). We use NLTK selected examples or in small-scale experiments (Bird et al., 2009) for identifying verb lemmas. 101 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pages 101–106, Berlin, Germany, August 7-12, 2016. c 2016 Association for Computational Linguistics 3.2 Resource-based strictions that apply to fillers of various thematic roles. For example, verbs that have a thematic VerbNet: The VerbNet database (Kipper et al., role of instrument can have the filler restricted 2006) provides a classification of verbs accord- to being inanimate, body part, concrete, pointy, ing to their participation in frames – syntactic pat- solid, and others. Across the various VerbNet terns with semantic components, based on Levin’s classes, there are 128 restricted roles (such as in- classes (Levin, 1993). Each verb class is anno- strument pointy). We used those to generate VN- tated with its member verb lemmas, syntactic con- RoRe classes. structions in which these participate (such as tran- WordNet: We use lexicographer files to clas- sitive, intransitive, diathesis alternations), seman- sify verbs into 15 classes based on their general tic predicates expressed by the verbs in the class meaning, such as verbs of communication, con- (such as motion or contact), thematic roles (such sumption, weather, and so on. as agent, patient, instrument), and restrictions on the fillers of these semantic roles (such as pointed 3.3 Corpus-based instrument). We also experimented with automatically- VerbNet can thus be thought of as providing a generated verb clusters as semantic classes. We number of different classifications over the same clustered VerbNet verbs using a spectral cluster- set of nearly 4,000 English verb lemmas. The ing algorithm and lexico-syntactic features. We main classification is based on syntactic frames, as selected the verbs that occur more than 150 times enacted in VerbNet classes. We will refer to them in the British National Corpus, 1,610 in total, and as VN-Raw classes. An alternative classification clustered them into 150 clusters (Corpus). is based on the predicative meaning of the verbs; We used verb subcategorization frames (SCF) for example, the verbs assemble and introduce are and the verb’s nominal arguments as features for in different classes based on their syntactic beha- clustering, as they have proved successful in pre- vior, but both have the meaning component of to- vious verb classification experiments (Shutova et gether, marked in VerbNet as a possible value of al., 2010). We extracted our features from the Gi- the Predicate variable. Similarly, shiver and faint gaword corpus (Graff et al., 2003) using the SCF belong to different VerbNet classes in terms of classification system of Preiss et al. (2007) to iden- syntactic behavior, but both have the meaning el- tify verb SCFs and the RASP parser (Briscoe et al., ement of describing an involuntary action. Using 2006) to extract the verb’s nominal arguments. the different values of the Predicate variable, we Spectral clustering partitions the data relying created a set of classes. We note that the VN-Pred on a similarity matrix that records similarities be- same verb lemma can occur in multiple classes, tween all pairs of data points. We use Jensen- since different senses of the same lemma can have Shannon divergence (d ) to measure similarity different meanings, and even a single sense can JS between feature vectors for two verbs, v and v , express more than one predicate. For example, the i j and construct a similarity matrix S : verb stew participates in the following classes of ij various degrees of granularity: cause (shared with Sij = exp( dJS(vi, vj)) (1) 2,912 other verbs), use (with 700 other verbs), ap- − ply heat (with 49 other verbs), cooked (with 49 The matrix S encodes a similarity graph G over other verbs). our verbs. The clustering problem can then be de- Each VerbNet class is marked with the thematic fined as identifying the optimal partition, or cut, of roles its members take, such as agent or benefi- the graph into clusters. We use the multiway nor- ciary. Here again, verbs that differ in syntactic malized cut (MNCut) algorithm of Meila and Shi behavior and in the predicate they express could (2001) for this purpose. The algorithm transforms share thematic roles. For example, stew and prick S into a stochastic matrix P containing transition belong to different VerbNet classes and share only probabilities between the vertices in the graph as 1 the most general predicative meanings of cause P = D− S, where the degree matrix D is a dia- N and use, yet both share a thematic role of instru- gonal matrix with Dii = j=1 Sij. It then com- ment. We create a class for each thematic role putes the K leading eigenvectors of P , where K is P (VN-Role). the desired number of clusters. The graph is par- Finally, VerbNet provides annotations of the re- titioned by finding approximately equal elements 102 in the eigenvectors using a simpler clustering al- al., 2015), Random Forest (Tsvetkov et al., 2014), gorithm, such as k-means. Meila and Shi (2001) Linear Support Vector Classifier. We found that have shown that the partition I derived in this way Logistic Regression was better for unigram fea- minimizes the MNCut criterion: tures, Random Forest was better for features using K WordNet and VerbNet classifications, whereas the MNCut(I) = [1 P (I I I )], (2) − k → k| k corpus-based features yielded similar performance kX=1 across classifiers. We therefore ran all evaluations which is the sum of transition probabilities across with both Logistic Regression and Random For- different clusters. Since k-means starts from a ran- est classifiers. We use the skll and scikit-learn dom cluster assignment, we ran the algorithm mul- toolkits (Blanchard et al., 2013; Pedregosa et al., tiple times and used the partition that minimizes 2011). During training, each class is weighted in the cluster distortion, that is, distances to cluster inverse proportion to its frequency. The optimiza- centroid. tion function is F1 (metaphor). We tried expanding the coverage of VerbNet verbs and the number of clusters using grid search 5 Results on the training data, with coverage grid = 2,500; { We first consider the performance of each type of 3,000; 4,000 and #clusters grid = 200; 250; 300; } { semantic classification separately as well as var- 350; 400 , but obtained no improvement in perfor- } ious combinations using cross-validation on the mance over our original setting. training set. Table 3 shows the results with the 4 Experiment setup classifier that yields the best performance for the given feature set. 4.1 Data We use the VU Amsterdam Metaphor Corpus Name N F A C Av.

Load more