Supervised Learning of a Probabilistic Lexicon of Verb Semantic Classes
Yusuke Miyao
University of Tokyo
Hongo 7-3-1, Bunkyo-ku, Tokyo, Japan
[email protected]

Jun'ichi Tsujii
University of Tokyo / University of Manchester / National Center for Text Mining
Hongo 7-3-1, Bunkyo-ku, Tokyo, Japan
[email protected]

Abstract

The work presented in this paper explores a supervised method for learning a probabilistic model of a lexicon of VerbNet classes. We intend for the probabilistic model to provide a probability distribution of verb-class associations over known and unknown verbs, including polysemous words. In our approach, training instances are obtained from an existing lexicon and/or from an annotated corpus, while the features, which represent syntactic frames, semantic similarity, and selectional preferences, are extracted from unannotated corpora. Our model is evaluated in type-level verb classification tasks: we measure the prediction accuracy of VerbNet classes for unknown verbs, and we also measure the dissimilarity between the learned and observed probability distributions. We empirically compare several settings for model learning, varying the use of features, the source corpora for feature extraction, and the use of disambiguated corpora. In the task of verb classification into all VerbNet classes, our best model achieved a 10.69% error reduction in classification accuracy over the previously proposed model.

1 Introduction

Lexicons are invaluable resources for semantic processing. In many cases, lexicons are necessary to restrict the set of semantic classes that can be assigned to a word. In fact, a considerable number of works on semantic processing implicitly or explicitly presuppose the availability of a lexicon, as in word sense disambiguation (WSD) (McCarthy et al., 2004) and in token-level verb class disambiguation (Lapata and Brew, 2004; Girju et al., 2005; Li and Brew, 2007; Abend et al., 2008). In other words, those methods are heavily dependent on the availability of a semantic lexicon. Therefore, recent research efforts have been invested in developing semantic resources, such as WordNet (Fellbaum, 1998), FrameNet (Baker et al., 1998), and VerbNet (Kipper et al., 2000; Kipper-Schuler, 2005), which greatly advanced research in semantic processing. However, the construction of such resources is expensive, and it is unrealistic to presuppose the availability of full-coverage lexicons: unknown words always appear in real texts, and word-semantics associations may vary (Abend et al., 2008).

This paper explores a method for the supervised learning of a probabilistic model for the VerbNet lexicon. We target the automatic classification of arbitrary verbs, including polysemous verbs, into all VerbNet classes; further, we target the estimation of a probabilistic model that represents the saliences of verb-class associations for polysemous verbs. In our approach, an existing lexicon and/or an annotated corpus are used as the training data. Since VerbNet classes are designed to represent distinctions in the syntactic frames that verbs can take, features representing the statistics of syntactic frames are extracted from unannotated corpora. Additionally, as the classes represent semantic commonalities, semantically inspired features, such as distributionally similar words, are used. These features can be considered a generalized representation of verbs, and we expect the obtained probabilistic model to predict the VerbNet classes of unknown words.
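To make the learning target concrete, the following sketch (not from the paper) shows one way to represent such a probabilistic lexicon: a map from verbs to distributions over VerbNet class IDs. The class IDs 43.2 and 47.3 appear in the figures below; the probability values, the second class for "shake" (22.3, Levin's "shake" verbs of combining), and the uniform back-off for unknown verbs are illustrative assumptions.

```python
# Sketch of the learning target: a probabilistic lexicon mapping verbs to
# distributions P(class | verb) over VerbNet class IDs. Class IDs 43.2 and
# 47.3 come from the figures below; the probability values, the second
# class for "shake", and the uniform back-off are illustrative assumptions.

ALL_CLASSES = ["43.1", "43.2", "44", "47.1", "47.2", "47.3"]  # tiny subset of the 270

prob_lexicon = {
    "jangle": {"43.2": 1.0},               # monosemous: Sound Emission
    "shake":  {"47.3": 0.7, "22.3": 0.3},  # polysemous: one salience per class
}

def class_distribution(verb, lexicon=prob_lexicon):
    """Return P(class | verb), backing off to uniform for unknown verbs.

    A trained model would instead predict this distribution from corpus
    features: syntactic frames, distributional similarity, and
    selectional preferences.
    """
    if verb in lexicon:
        return lexicon[verb]
    return {c: 1.0 / len(ALL_CLASSES) for c in ALL_CLASSES}
```

A type-level evaluation then compares such predicted distributions against the distributions observed in an annotated corpus.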
Our model is evaluated in two tasks of type-level verb classification. One is the classification of monosemous verbs into a small subset of the classes, which was studied in previous work (Joanis and Stevenson, 2003; Joanis et al., 2008). The other is the classification of all verbs into the full set of VerbNet classes, which has not yet been attempted. In the experiments, training instances are obtained from VerbNet and/or SemLink (Loper et al., 2007), while features are extracted from the British National Corpus or from the Wall Street Journal. We empirically compare several settings for model learning by varying the set of features, the source domain and size of the corpus for feature extraction, and the use of token-level statistics obtained from a manually disambiguated corpus. We also provide an analysis of the remaining errors, which will guide further improvements in the supervised learning of a probabilistic semantic lexicon.

Supervised methods for automatic verb classification have been extensively investigated (Stevenson et al., 1999; Stevenson and Merlo, 1999; Merlo and Stevenson, 2001; Stevenson and Joanis, 2003; Joanis and Stevenson, 2003; Joanis et al., 2008). However, their focus has been limited to a small subset of verb classes and a limited number of monosemous verbs. The main contributions of the present work are: i) to provide empirical results for the automatic classification of all verbs, including polysemous ones, into all VerbNet classes, and ii) to empirically explore effective settings for the supervised learning of a probabilistic lexicon of verb semantic classes.

2 Background

2.1 Verb lexicon

Levin's (1993) work on verb classification has broadened the field of computational research into the relationships between the syntactic and semantic structures of verbs. The principal idea behind the work is that the meanings of verbs can be identified by observing the possible syntactic frames that the verbs can take; in other words, with knowledge of syntactic frames, verbs can be semantically classified. This idea provided the computational linguistics community with criteria for the definition and classification of verb semantics; it subsequently led to research on the induction of verb classes (Korhonen and Briscoe, 2004) and to the construction of verb lexicons based on Levin's criteria.

VerbNet (Kipper et al., 2000; Kipper-Schuler, 2005) is a lexicon of verbs organized into classes that share the same syntactic behaviors and semantics. The design of the classes originates from Levin (1993), though it has been considerably reorganized and extends beyond the original classification. The classes therefore cover more English verbs, and the classification should be more consistent (Korhonen and Briscoe, 2004; Kipper et al., 2006).

The current version of VerbNet includes 270 classes.¹ Figure 1 shows a part of the classes of VerbNet. The top-level categories, e.g., Emission and Destroy, represent a coarse classification of verb semantics. They are further divided into verb classes, each of which expresses a group of verbs sharing syntactic frames.

    43 Emission
        43.1 Light Emission: beam, glow, sparkle, ...
        43.2 Sound Emission: blare, chime, jangle, ...
        ...
    44 Destroy: annihilate, destroy, ravage, ...
    45 Change of State
        ...
    47 Existence
        47.1 Exist: exist, persist, remain, ...
        47.2 Entity-Specific Modes of Being: bloom, breathe, foam, ...
        47.3 Modes of Being with Motion: jiggle, sway, waft, ...
    ...

    Figure 1: VerbNet classes

¹ Throughout this paper, we refer to VerbNet 2.3. Subclasses are ignored in this work, following the setting of Abend et al. (2008).

Figure 2 shows an excerpt from VerbNet, which represents the possible syntactic frames for the Sound Emission class, including "chime" and "jangle," and for the Modes of Being with Motion class, including "jiggle" and "waft."

    43.2 Sound Emission
        Theme V
        Theme V P:loc Location
        P:loc Location V Theme
        there V Theme P:loc Location
        Agent V Theme
        Theme V Oblique
        Location V with Theme

    47.3 Modes of Being with Motion
        Theme V
        Theme V P:loc Location
        P:loc Location V Theme
        there V Theme
        Agent V Theme

    Figure 2: Syntactic frames for VerbNet classes

In this figure, each line represents a syntactic frame, where Agent, Theme, and Location indicate thematic roles, V denotes the verb, and P specifies a preposition. P:loc denotes locative prepositions such as "in" and "at." For example, the second syntactic frame of Sound Emission, i.e., Theme V P:loc Location, corresponds to the following sentence:

1. The coins jangled in my pocket.

Here, Theme corresponds to "the coins," V to "jangled," P:loc to "in," and Location to "my pocket."
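Read as data, Figure 2 is simply a set of role sequences per class, so an observed frame narrows down the candidate classes. A minimal sketch, with the frame sets copied from the excerpt above and a lookup function that is our own illustration rather than part of VerbNet:

```python
# Figure 2 as data: each class licenses a set of syntactic frames, encoded
# here as tuples of thematic roles and markers. The frame sets are copied
# from the excerpt above; the lookup function is an illustrative assumption.

FRAMES = {
    "43.2": {  # Sound Emission
        ("Theme", "V"),
        ("Theme", "V", "P:loc", "Location"),
        ("P:loc", "Location", "V", "Theme"),
        ("there", "V", "Theme", "P:loc", "Location"),
        ("Agent", "V", "Theme"),
        ("Theme", "V", "Oblique"),
        ("Location", "V", "with", "Theme"),
    },
    "47.3": {  # Modes of Being with Motion
        ("Theme", "V"),
        ("Theme", "V", "P:loc", "Location"),
        ("P:loc", "Location", "V", "Theme"),
        ("there", "V", "Theme"),
        ("Agent", "V", "Theme"),
    },
}

def classes_licensing(frame):
    """Return the class IDs whose frame inventory contains the given frame."""
    return sorted(c for c, frames in FRAMES.items() if frame in frames)

# "The coins jangled in my pocket" instantiates Theme V P:loc Location,
# which both classes license; frames such as Theme V Oblique separate them.
print(classes_licensing(("Theme", "V", "P:loc", "Location")))  # ['43.2', '47.3']
print(classes_licensing(("Theme", "V", "Oblique")))            # ['43.2']
```

This is why corpus statistics of the frames a verb occurs with are informative features for type-level classification: shared frames leave ambiguity, while class-specific frames are discriminative.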
While VerbNet provides associations between verbs and semantic classes, SemLink (Loper et al., 2007) additionally provides mappings among VerbNet, FrameNet (Baker et al., 1998), PropBank (Palmer et al., 2005), and WordNet (Fellbaum, 1998).

    ... the walls still shook;VN=47.3 and an evacuation alarm blared;VN=43.2 outside.
    Suddenly the woman begins;VN=55.1 swaying;VN=47.3 and then ...

    Figure 3: An excerpt from SemLink

... (2008) focused on 14 classes and 835 verbs. Although these works provided a theoretical framework for supervised verb classification, their results were not readily available for practical applications, because of the limited coverage of the targeted classes/verbs on real texts. On the contrary, we target the classification of arbitrary verbs, including polysemous verbs, into all VerbNet classes (270 in total). In this realistic situation, we empirically compare settings for model learning, in order to explore effective conditions for obtaining better models.

Another difference from the aforementioned works is that we aim at obtaining a probabilistic model, which represents the saliences of the classes of polysemous verbs. Lapata and Brew (2004) and Li and Brew (2007) focused on this issue, and described methods for inducing probabilities of verb-class associations. The obtained probabilistic model was intended to be incorporated into a token-level disambiguation model. Their methods were claimed to be unsupervised, meaning that the induction of a probabilistic lexicon did not re-
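Annotations in the style of Figure 3 can be turned directly into token-level statistics of verb-class associations. A minimal sketch, assuming the inline word;VN=class rendering used in the figure (the distributed SemLink instead annotates Penn Treebank tokens), with lemmatization of surface forms omitted:

```python
import re
from collections import Counter

# Harvest token-level (verb, VerbNet class) counts from text annotated in
# the inline "word;VN=class" style of Figure 3. This format and the helper
# below are illustrative assumptions, not SemLink's actual distribution
# format; lemmatization of the surface forms is omitted.

TAG = re.compile(r"(\w+);VN=(\d+(?:\.\d+)*)")

def count_verb_classes(text):
    """Count (verb surface form, VerbNet class) pairs in annotated text."""
    return Counter(TAG.findall(text))

excerpt = ("... the walls still shook;VN=47.3 and an evacuation "
           "alarm blared;VN=43.2 outside. Suddenly the woman "
           "begins;VN=55.1 swaying;VN=47.3 and then ...")

print(count_verb_classes(excerpt))
# Counter({('shook', '47.3'): 1, ('blared', '43.2'): 1, ...})
```

Aggregating such token-level counts per lemma is one way to obtain the type-level, possibly polysemous training instances and the observed class distributions that the paper's evaluation refers to.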