A Taxonomic Classification of Wordnet Polysemy Types
Total Page:16
File Type:pdf, Size:1020Kb
A Taxonomic Classification of WordNet Polysemy Types Abed Alhakim Freihat Fausto Giunchiglia Biswanath Dutta Qatar Computing Research Institute University of Trento Indian Statistical Institute (ISI) Doha, Qatar Trento, Italy Bangalore, India [email protected] [email protected] [email protected] Abstract been given, so far, to the principles or rules used to identify polysemy types. In fact, none WordNet represents polysemous terms by of these approaches can explain how to identify capturing the different meanings of these the polysemy types of the discovered polysemy terms at the lexical level, but without giv- structural patterns or how to differentiate for ing emphasis on the polysemy types such example, between homonymy and metaphoric terms belong to. The state of the art pol- structural patterns. Although Apersejan’s seman- ysemy approaches identify several poly- tic similarity criterion (Apresjan, 1974) can be semy types in WordNet but they do not ex- used to account for regularity in polysemy, it plain how to classify and organize them. can not predict the polysemy type of the regular In this paper, we present a novel approach polysemy types in WordNet. Our hypotheses in for classifying the polysemy types which this paper is that identifying and differentiating exploits taxonomic principles which in between the polysemy types of the regular pol- turn, allow us to discover a set of poly- ysemy structural patterns requires understanding semy structural patterns. the hierarchical structure of WordNet and, thus, the criteria related to the taxonomic principles that 1 Introduction the hierarchical structure of WordNet comply with Polysemy in WordNet (Miller, 1995) corresponds or violates. In this paper, we show how to use two to various kinds of linguistic phenomena and taxonomic principles as criteria for identifying can be grouped into various polysemy types the polysemy types in WordNet. Based on these (Falkum, 2011). Although WordNet was inspired principles, we introduce a semi automatic method by psycholinguistic and semantic principles for discovering and identifying three polysemy (Miller et al., 1990), its conceptual dictionary puts types in WordNet. greater emphasis on the lexical level rather than on the semantic one (Dolan, 1994). Lexicalizing The paper is organized as follows. In Sec- polysemous terms without any further information tion two, we discuss the problem. In Section about their polysemy type affects the usability of three, we introduce the formal definitions we WordNet as a knowledge resource for semantic use. In Section four, we discuss the taxonomic applications (Mandala et al., 1999). principles that we use to discover three of the In general, the state of the art approaches suggests polysemy types in WordNet. In Section five, we different solutions to the polysemy problem. The give an overview of our approach. In Section six, most prosperous among these approaches are the we show how to use the taxonomic principles to regular/systematic polysemy approaches such as identify metaphoric structural patterns. In Section (Buitelaar, 1998) (Barque and Chaumartin, 2009) seven, we demonstrate how to determine special- (Veale, 2004) (Peters, 2004). These approaches ization polysemy structural patterns. In Section propose the semantic regularity as a basis for eight, we describe how to discover homonymy classification of the polysemy classes and offer structural patterns. In Section nine, we explain different solutions that commensurate the nature how to handle false positives in the structural of the discovered polysemy types. patterns. In Section ten, we present the results of Despite the diversity and depth of the state of our approach. In Section eleven, we conclude the the art solutions, no or very little attention has paper and depict our future work. 2 Problem Statement approaches do not reveal the principles or the cri- teria used to classify polysemous terms into poly- WordNet is a machine readable online lexical semy types. database for the English language. Based on psy- In this paper, we explain how to use the exclusive- cholinguistic principles, WordNet has been devel- ness property and the collectively exhaustiveness oping since 1985, by linguists and psycholinguists property (Bailey, 1994) (Marradi, 1990) for iden- as a conceptual dictionary rather than an alpha- tifying the following polysemy types. betic one (Miller et al., 1990). Since that time, several versions of WordNet have been developed. 1 Metaphoric polysemy: Refers to the poly- In this paper, we are concerned with WordNet 2.1. semy instances in which a term has literal WordNet 2.1. contains 147,257 words, 117,597 and figurative meanings (Evans and Zinken, synsets and 207,019 word-sense pairs. The num- 2006). In the following example, the first ber of polysemous words in WordNet is 27,006, meaning of the term fox is the literal mean- where 15776 are nouns. ing and the second meaning is the figurative. In this paper, we deal with polysemous nouns at #1 fox: alert carnivorous mammal. the concept level only. We do not consider pol- #2 dodger, fox, slyboots: a shifty ysemy at the instance level. After removing the deceptive person. polysemous nouns that refer to proper names, the remaining polysemous nouns are 14530 nouns. 2 Specialization polysemy: A type of related WordNet does not differentiate between the types polysemy which denotes a hierarchical rela- of the polysemous terms and it does not contain tion between the meanings of a polysemous any information in terms of polysemy relations term. In the case of abstract meanings, we that can be conducted to determine the polysemy say that a meaning A is a more general type between the synsets of a polysemous term. meaning of a meaning B. We may also use The researchers who attached the polysemy prob- the taxonomic notations type and subtype lem in WordNet gave different descriptions for the instead of more general meaning and more polysemy types in WordNet. For example, poly- specific meaning respectively. For example, semy reduction approaches (Edmonds and Agirre, we say that the first meaning of turtledove 2008) (Mihalcea R., 2001) (Gonzalo J., 2000) is a subtype of the second meaning. differentiate between contrastive polysemy and #1 australian turtledove, turtledove: complementary polysemy. Regular polysemy ap- small Australian dove. proaches such as (Barque and Chaumartin, 2009) #2 turtledove: any of several Old (Veale, 2004) (Peters, 2004) (Freihat et al., 2013) World wild doves. (Lohk et al., 2014) give more refined classification of the polysemy types into metonymy, metaphoric, 3 Homonymy: Refers to the contrastive specialization polysemy, and homonymy. In one polysemy instances, where meanings are not of our recent papers, compound noun polysemy is related. Consider for example the following introduced as a new polysemy type beside the for- polysemy instance of the term bank. mer four polysemy types in WordNet (Freihat et #1 depository financial institution, al., 2015). bank: a financial institution. So far, no polysemy reduction approaches have #2 bank: sloping land (especially introduced a mechanism for classifying the pol- the slope beside a body of water). ysemy types into contrastive and complemen- tary. Instead, these approaches adopt seman- 3 Approach Notations tic and probabilistic rules to discover redundant We begin with the basic notations. Lemma is the and/or very fine grained senses. On the other basic lexical unit in WordNet that refers to the base hand, the regular polysemy approaches embrace a form of a word or a collocation. Based on this defi- clear definition for classifying polysemous terms nition, we define a natural language term or simply into regular and non regular polysemy (Apresjan, a term as a lemma that belongs to a grammatical 1974). Although, the definition of regular poly- category; i.e., noun, verb, adjective or adverb. semy in these approaches is useful to distinguish between regular and non regular polysemy, these Definition 1 (Term). A term T is a quadruple xLemma, Caty, where a) entity P S is the single root of the hierarchy; a) Lemma is the term lemma; b) E S ¢ S; b) Cat is the grammatical category of the term. c) ps1; s2q P E if s1 s2; d) For any synset s entity, there exists at least 1 1 Synset is the fundamental structure in Word- one synset s such that s s. Net that we define as follow. Definition 2 (WordNet synset). In this definition, point (a) defines the single root of the hierarchy and point (d) defines the x A synset S is defined as Cat, Terms, Gloss, connectivity property in the hierarchy. y Relations , where We move now to the semantics of WordNet. We a) Cat is the grammatical category of the synset; define the subset of the semantics of WordNet b) Terms is an ordered list of synonymous terms hierarchy that is relevant for our approach. A full that have the same grammatical category Cat; definition of the WordNet semantics is described c) Gloss is a text that describes the synset; in approaches such as (Alvarez, 2000) (Rudolph, d) Relations is a set of semantic relations that hold 2011) (Breaux et al., 2009). between synsets. We define the semantics of WordNet using an Interpretation I x∆I ; fy, where ∆I is an non Now, we move to the hierarchical structure empty set (the domain of interpretation) and f is of WordNet. WordNet uses the relation direct an interpretation function. hypernym to organize the hierarchical relations Definition 5 (Semantics of WordNet Hierarchy). between the synsets. This relation denotes the superordinate relationship between synsets. For Let WH xS; Ey be wordNet hierarchy. We I example, the relation direct hypernym holds define an Interpretation of WH, I x∆ ; fy as between vehicle and wheeled vehicle where follows: I I vehicle is hypernym of wheeled vehicle. The a) entity ∆ ; I direct hypernym relation is transitive. In the b) K H; I I following, we generalize the direct hypernym c) @s P S: s ∆ ; p [ qI I X I relation to reflect the transitivity property, where d) s1 s2 s1 s2; p \ qI I Y I we use the notion hypernym instead of a direct e) s1 s2 s1 s2; I I hypernym.