
A Statistical Approach to the Processing of Metonymy

Masao Utiyama, Masaki Murata, and Hitoshi Isahara
Communications Research Laboratory, MPT
588-2, Iwaoka, Nishi-ku, Kobe, Hyogo 651-2492
{mutiyama,murata,isahara}@crl.go.jp

Abstract

This paper describes a statistical approach to the interpretation of metonymy. A metonymy is received as an input, then its possible interpretations are ranked by applying a statistical measure. The method has been tested experimentally. It correctly interpreted 53 out of 75 metonymies in Japanese.

1 Introduction

Metonymy is a figure of speech in which the name of one thing is substituted for that of something to which it is related. The explicit term is 'the name of one thing' and the implicit term is 'the name of something to which it is related'. A typical example of metonymy is

    He read Shakespeare.    (1)

'Shakespeare' is substituted for 'the works of Shakespeare'. 'Shakespeare' is the explicit term and 'works' is the implicit term.

Metonymy is pervasive in natural language. The correct treatment of metonymy is vital for natural language processing applications, especially for machine translation (Kamei and Wakao, 1992; Fass, 1997). A metonymy may be acceptable in a source language but unacceptable in a target language. For example, a direct translation of 'he read Mao', which is acceptable in English and Japanese, is completely unacceptable in Chinese (Kamei and Wakao, 1992). In such cases, the machine translation system has to interpret metonymies to generate acceptable translations.

Previous approaches to processing metonymy have used hand-constructed ontologies or semantic networks (Fass, 1988; Iverson and Helmreich, 1992; Bouaud et al., 1996; Fass, 1997).¹ Such approaches are restricted by the knowledge bases they use, and may only be applicable to domain-specific tasks because the construction of large knowledge bases could be very difficult.

The method outlined in this paper, on the other hand, uses corpus statistics to interpret metonymy, so that a variety of metonymies can be handled without using hand-constructed knowledge bases. The method is quite promising, as shown by the experimental results given in section 5.

¹ As for metaphor processing, Ferrari (1996) used textual clues obtained through corpus analysis for detecting metaphors.

2 Recognition and Interpretation

Two main steps, recognition and interpretation, are involved in the processing of metonymy (Fass, 1997). In the recognition step, metonymic expressions are labeled. In the interpretation step, the meanings of those expressions are interpreted.

Sentence (1), for example, is first recognized as a metonymy and 'Shakespeare' is identified as the explicit term. The interpretation 'works' is selected as an implicit term and 'Shakespeare' is replaced by 'the works of Shakespeare'.

A comprehensive survey by Fass (1997) shows that the most common method of recognizing metonymies is by selection-restriction violations. Whether or not statistical approaches can recognize metonymy as well as the selection-restriction violation method is an interesting question. Our concern here, however, is the interpretation of metonymy, so we leave that question for future work.

In interpretation, an implicit term (or terms) that is (are) related to the explicit term is (are) selected. The method described in this paper uses corpus statistics for interpretation.
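To make the division of labor concrete, the following toy sketch (not from the paper) shows how the two steps might fit together, assuming a selection-restriction-based recognizer of the kind the survey describes. The hand-listed restriction table and all function names are invented for illustration; the paper itself addresses only the interpretation step.

    # Illustrative toy pipeline: recognition by a selection-restriction check,
    # then interpretation by ranking candidate implicit terms.
    READABLE = {"book", "novel", "sakuhin"}   # toy restriction: objects that 'yomu' (read) accepts

    def is_metonymic(noun_a, predicate_v):
        # 'Shakespeare wo yomu' violates the restriction that the object of
        # 'yomu' should be readable, so the phrase is flagged as metonymic.
        return predicate_v == "yomu" and noun_a not in READABLE

    def process(noun_a, case_marker_r, predicate_v, interpret):
        if is_metonymic(noun_a, predicate_v):
            # Interpretation step: return implicit terms B ranked by the
            # statistical measure developed in sections 3 and 4 (here a callback).
            return interpret(noun_a, case_marker_r, predicate_v)
        return [noun_a]   # literal reading; nothing to substitute

    print(process("Shakespeare", "wo", "yomu", lambda a, r, v: ["sakuhin (works)"]))
    # -> ['sakuhin (works)']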
This method, as applied to Japanese metonymies, receives a metonymy in a phrase of the form 'Noun A Case-Marker R Predicate V' and returns a list of nouns ranked in order of the system's estimate of their suitability as interpretations of the metonymy, assuming that noun A is the explicit term. For example, given Ford wo (accusative-case) kau (buy) (buy a Ford), zyôyôsya (car), best seller, kuruma (vehicle), etc. are returned, in that order.²

The method follows the procedure outlined below to interpret a metonymy.

1. Given a metonymy in the form 'Noun A Case-Marker R Predicate V', nouns that can be syntactically related to the explicit term A are extracted from a corpus.

2. The extracted nouns are ranked according to their appropriateness as interpretations of the metonymy by applying a statistical measure.

The first step is discussed in section 3 and the second in section 4.

² 'Ford' is spelled 'hôdo' in Japanese. We have used English when we spell Japanese loan-words from English for the sake of readability.

3 Information Source

We use a large corpus to extract nouns which can be syntactically related to the explicit term of a metonymy. A large corpus is valuable as a source of such nouns (Church and Hanks, 1990; Brown et al., 1992).

We used Japanese noun phrases of the form A no B to extract nouns that were syntactically related to A. Nouns in such a syntactic relation are usually close semantic relatives of each other (Murata et al., 1999), and occur relatively infrequently. We thus also used an A near B relation, i.e. identifying the other nouns within the target sentence, to extract nouns that may be more loosely related to A, but occur more frequently. These two types of syntactic relation are treated differently by the statistical measure which we will discuss in section 4.

The Japanese noun phrase A no B roughly corresponds to the English noun phrase B of A, but it has a much broader range of usage (Kurohashi and Sakai, 1999). In fact, A no B can express most of the possible types of semantic relation between two nouns, including metonymic concepts such as that the name of a container can represent its contents and the name of an artist can imply an artform (container for contents and artist for artform below).³ Examples of these and similar types of metonymic concepts (Lakoff and Johnson, 1980; Fass, 1997) are given below.

Container for contents
• glass no mizu (water)
• nabe (pot) no ryôri (food)

Artist for artform
• Beethoven no kyoku (music)
• Picasso no e (painting)

Object for user
• ham sandwich no kyaku (customer)
• sax no sôsya (performer)

Whole for part
• kuruma (car) no tire
• door no knob

These examples suggest that we can extract semantically related nouns by using the A no B relation.

³ Yamamoto et al. (1998) also used the A no B relation to interpret metonymy.
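The extraction step can be pictured with a short sketch. The code below is an illustrative assumption about the bookkeeping, not the paper's extraction code: it takes morphologically analysed sentences as lists of (surface form, part-of-speech) pairs and counts the two relations for a given explicit term A; the simple three-token 'A no B' pattern and the function name are invented for the example.

    from collections import Counter

    def extract_relations(sentences, a):
        """Count nouns B related to the explicit term A by the A no B and A near B relations."""
        f_a_no_b = Counter()    # f(A, no, B)
        f_a_near_b = Counter()  # f(A, near, B)
        for sent in sentences:  # sent = [(word, pos), ...], one analysed sentence
            nouns = [w for w, pos in sent if pos == "noun"]
            if a not in nouns:
                continue
            # A near B: every other noun occurring in a sentence that contains A.
            for b in nouns:
                if b != a:
                    f_a_near_b[b] += 1
            # A no B: the contiguous pattern noun A, particle 'no', noun B.
            for (w1, p1), (w2, p2), (w3, p3) in zip(sent, sent[1:], sent[2:]):
                if w1 == a and p1 == "noun" and w2 == "no" and p2 == "particle" and p3 == "noun":
                    f_a_no_b[w3] += 1
        return f_a_no_b, f_a_near_b

    # Tiny usage example with two hand-made sentences for A = 'bottle'.
    corpus = [
        [("bottle", "noun"), ("no", "particle"), ("cap", "noun")],
        [("bottle", "noun"), ("wo", "particle"), ("akeru", "verb"), ("reizoko", "noun")],
    ]
    print(extract_relations(corpus, "bottle"))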
4 Statistical Measure

A metonymy 'Noun A Case-Marker R Predicate V' can be regarded as a contraction of 'Noun A Syntactic-Relation Q Noun B Case-Marker R Predicate V', where A has relation Q to B (Yamamoto et al., 1998). For example, Shakespeare wo yomu (read) (read Shakespeare) is regarded as a contraction of Shakespeare no sakuhin (works) wo yomu (read the works of Shakespeare), where A = Shakespeare, Q = no, B = sakuhin, R = wo, and V = yomu.

Given a metonymy in the form A R V, the appropriateness of noun B as an interpretation of the metonymy under the syntactic relation Q is defined by

    L_Q(B|A, R, V) = Pr(B|A, Q, R, V),    (2)

where Pr(·) represents probability and Q is either an A no B relation or an A near B relation. Next, the appropriateness of noun B is defined by

    M(B|A, R, V) = max_Q L_Q(B|A, R, V).    (3)

We rank nouns by applying the measure M. Equation (2) can be decomposed as follows:

    L_Q(B|A, R, V) = Pr(B|A, Q, R, V)
                   = Pr(A, Q, B, R, V) / Pr(A, Q, R, V)
                   = [Pr(A, Q, B) / Pr(A, Q)] × [Pr(R, V|A, Q, B) / Pr(R, V|A, Q)]
                   ≈ Pr(B|A, Q) Pr(R, V|B) / Pr(R, V),    (4)

where ⟨A, Q⟩ and ⟨R, V⟩ are assumed to be independent of each other.

Let f(event) be the frequency of an event and Classes(B) be the set of semantic classes to which B belongs. The expressions in Equation (4) are then defined by⁴

    Pr(B|A, Q) = f(A, Q, B) / f(A, Q) = f(A, Q, B) / Σ_B f(A, Q, B),    (5)

    Pr(R, V|B) = f(B, R, V) / f(B)                                  if f(B, R, V) > 0,
                 [Σ_{C ∈ Classes(B)} Pr(B|C) f(C, R, V)] / f(B)     otherwise,    (6)

    Pr(B|C) = [f(B) / |Classes(B)|] / f(C).    (7)

We omitted Pr(R, V) from Equation (4) when we calculated Equation (3) in the experiment described in section 5 for the sake of simplicity. This treatment does not alter the order of the nouns ranked by the system because Pr(R, V) is a constant for a given metonymy of the form A R V.

Equations (5) and (6) differ in their treatment of zero-frequency nouns. In Equation (5), a noun B such that f(A, Q, B) = 0 will be ignored (assigned a zero probability) because it is unlikely that such a noun will have a close relationship with noun A. In Equation (6), on the other hand, a noun B such that f(B, R, V) = 0 is assigned a non-zero probability. These treatments reflect the asymmetrical property of metonymy, i.e. in a metonymy of the form A R V, an implicit term B will have a much tighter relationship with the explicit term A than with the predicate V. Consequently, a noun B such that f(A, Q, B) ≫ 0 ∧ f(B, R, V) = 0 may be appropriate as an interpretation of the metonymy. Therefore, a non-zero probability should be assigned to Pr(R, V|B) even if f(B, R, V) = 0.⁵

Equation (7) is the probability that noun B occurs as a member of class C. This is reduced to f(B)/f(C) if B is not ambiguous, i.e. |Classes(B)| = 1. If it is ambiguous, then f(B) is distributed equally to all classes in Classes(B). The frequency of class C is obtained similarly:

    f(C) = Σ_{B ∈ C} f(B) / |Classes(B)|,    (8)

where B is a noun which belongs to the class C. Finally we derive

    f(C, R, V) = Σ_{B ∈ C} f(B, R, V) / |Classes(B)|.    (9)

In summary, we use the measure M as defined in Equation (3), and calculated by applying Equations (4) to (9), to rank nouns according to their appropriateness as possible interpretations of a metonymy.

⁴ Strictly speaking, Equation (6) does not satisfy Σ_{R,V} Pr(R, V|B) = 1. We have adopted this definition for the sake of simplicity. This simplification has little effect on the final results because Σ_{C ∈ Classes(B)} Pr(B|C) f(C, R, V) ≪ 1 will usually hold. More sophisticated methods (Manning and Schütze, 1999) of smoothing probability distributions may be beneficial. However, applying such methods and comparing their effects on the interpretation of metonymy is beyond the scope of this paper.

⁵ The use of Equation (6) takes into account a noun B such that f(B, R, V) = 0. But such a noun is usually ignored if there is another noun B′ such that f(B′, R, V) > 0, because Σ_{C ∈ Classes(B)} Pr(B|C) f(C, R, V) ≤ f(B′, R, V) will usually hold. This means that the co-occurrence probabilities between implicit terms and verbs are also important in eliminating inappropriate nouns.
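As a concrete reading of Equations (3)-(9), the sketch below ranks candidate nouns from pre-collected counts. It is a minimal illustration under the assumptions stated in the comments; the class and method names and the data layout are ours, not the paper's, and Pr(R, V) is dropped exactly as described above.

    from collections import defaultdict

    class MetonymyRanker:
        """Rank candidate implicit terms B for a metonymy 'A R V' (Equations (3)-(9))."""

        def __init__(self, f_aqb, f_brv, f_b, classes_of):
            self.f_aqb = f_aqb    # f_aqb[(A, Q, B)]: counts of the A no B / A near B relations
            self.f_brv = f_brv    # f_brv[(B, R, V)]: counts of 'B R V' co-occurrences
            self.f_b = f_b        # f_b[B]: frequency of noun B
            # Every noun gets its thesaurus classes, or a class of its own if unlisted.
            self.classes_of = {b: classes_of.get(b, {b}) for b in f_b}

            self.f_aq = defaultdict(float)        # f(A, Q) = sum over B of f(A, Q, B)
            for (a, q, _), n in f_aqb.items():
                self.f_aq[(a, q)] += n
            self.f_c = defaultdict(float)         # Equation (8)
            for b, cs in self.classes_of.items():
                for c in cs:
                    self.f_c[c] += self.f_b[b] / len(cs)
            self.f_crv = defaultdict(float)       # Equation (9)
            for (b, r, v), n in f_brv.items():
                cs = self.classes_of.get(b, {b})
                for c in cs:
                    self.f_crv[(c, r, v)] += n / len(cs)

        def pr_b_given_aq(self, b, a, q):
            # Equation (5): nouns never seen in relation Q with A get probability 0.
            return self.f_aqb.get((a, q, b), 0) / self.f_aq[(a, q)] if self.f_aq[(a, q)] else 0.0

        def pr_rv_given_b(self, b, r, v):
            # Equation (6): back off to the semantic classes of B when f(B, R, V) = 0.
            fb = self.f_b.get(b, 0)
            if fb == 0:
                return 0.0
            if self.f_brv.get((b, r, v), 0) > 0:
                return self.f_brv[(b, r, v)] / fb
            cs = self.classes_of[b]
            smoothed = sum((fb / len(cs)) / self.f_c[c] * self.f_crv[(c, r, v)]   # Pr(B|C) f(C, R, V)
                           for c in cs if self.f_c[c] > 0)
            return smoothed / fb

        def rank(self, a, r, v, candidates):
            # M(B|A, R, V) = max_Q L_Q(B|A, R, V), with Pr(R, V) omitted (constant per input).
            def m(b):
                return max(self.pr_b_given_aq(b, a, q) * self.pr_rv_given_b(b, r, v)
                           for q in ("no", "near"))
            return sorted(candidates, key=m, reverse=True)

Note that rank() expects the full relation counts f(A, Q, B), so that the marginal f(A, Q) in Equation (5) is computed over all co-occurring nouns rather than over the candidate set only.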

Example  Given the statistics below, bottle wo akeru (open) (open a bottle) will be interpreted as described in the following paragraphs, assuming that cap and reizôko (refrigerator) are the candidate implicit terms.

Statistics:

    f(bottle, no, cap) = 1,
    f(bottle, no, reizôko) = 0,
    f(bottle, no) = 2,
    f(bottle, near, cap) = 1,
    f(bottle, near, reizôko) = 2,
    f(bottle, near) = 503,
    f(cap) = 478,
    f(reizôko) = 1521,
    f(cap, wo, akeru) = 8, and
    f(reizôko, wo, akeru) = 23.

f(bottle, no, reizôko) = 0 indicates that bottle and reizôko are not close semantic relatives of each other. This shows the effectiveness of using the A no B relation to filter out loosely related words.

Measure:

    L_no(cap) = f(bottle, no, cap)/f(bottle, no) × f(cap, wo, akeru)/f(cap)
              = 1/2 × 8/478 = 8.37 × 10⁻³,
    L_near(cap) = f(bottle, near, cap)/f(bottle, near) × f(cap, wo, akeru)/f(cap)
                = 1/503 × 8/478 = 3.33 × 10⁻⁵,
    L_no(reizôko) = f(bottle, no, reizôko)/f(bottle, no) × f(reizôko, wo, akeru)/f(reizôko)
                  = 0/2 × 23/1521 = 0,
    L_near(reizôko) = f(bottle, near, reizôko)/f(bottle, near) × f(reizôko, wo, akeru)/f(reizôko)
                    = 2/503 × 23/1521 = 6.01 × 10⁻⁵,
    M(cap) = max{L_no(cap), L_near(cap)} = 8.37 × 10⁻³, and
    M(reizôko) = max{L_no(reizôko), L_near(reizôko)} = 6.01 × 10⁻⁵,

where L_no(cap) = L_no(cap|bottle, wo, akeru), M(cap) = M(cap|bottle, wo, akeru), and so on. Since M(cap) > M(reizôko), we can conclude that cap is a more appropriate implicit term than reizôko. This conclusion agrees with our intuition.
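The arithmetic above can be checked mechanically. The snippet below is a verification sketch, not code from the paper: it recomputes L_no, L_near, and M for the two candidates from the listed counts, with Pr(R, V) omitted as in the example.

    # Verification of the worked example from the counts listed above.
    f_aqb = {("bottle", "no", "cap"): 1, ("bottle", "no", "reizoko"): 0,
             ("bottle", "near", "cap"): 1, ("bottle", "near", "reizoko"): 2}
    f_aq = {("bottle", "no"): 2, ("bottle", "near"): 503}
    f_b = {"cap": 478, "reizoko": 1521}
    f_brv = {("cap", "wo", "akeru"): 8, ("reizoko", "wo", "akeru"): 23}

    def L(q, b):
        # L_Q(B | bottle, wo, akeru) = Pr(B | bottle, Q) * Pr(wo, akeru | B)
        return (f_aqb[("bottle", q, b)] / f_aq[("bottle", q)]) * (f_brv[(b, "wo", "akeru")] / f_b[b])

    for b in ("cap", "reizoko"):
        print(b, f"L_no={L('no', b):.2e}", f"L_near={L('near', b):.2e}",
              f"M={max(L('no', b), L('near', b)):.2e}")
    # cap      L_no=8.37e-03  L_near=3.33e-05  M=8.37e-03
    # reizoko  L_no=0.00e+00  L_near=6.01e-05  M=6.01e-05  -> cap is ranked first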
5 Experiment

5.1 Material

Metonymies  Seventy-five metonymies were used in an experiment to test the proposed method. Sixty-two of them were collected from literature on cognitive linguistics (Yamanashi, 1988; Yamanashi, 1995) and psycholinguistics (Kusumi, 1995) in Japanese, paying attention to ensure that the types of metonymy were sufficiently diverse. The remaining 13 metonymies were direct translations of the English metonymies listed in (Kamei and Wakao, 1992). These 13 metonymies are shown in Table 2, along with the results of the experiment.

Corpus  A corpus which consists of seven years of issues of the Mainichi newspaper (from 1991 to 1997) was used in the experiment. The sentences in the corpus were morphologically analyzed by ChaSen version 2.0b6 (Matsumoto et al., 1999). The corpus consists of about 153 million words.

Semantic Class  A Japanese thesaurus, Bunrui Goi-Hyô (The National Language Research Institute, 1996), was used in the experiment. It has a six-layered hierarchy of abstractions and contains more than 55,000 nouns. A class was defined as a set of nouns which are classified in the same abstractions in the top three layers. The total number of classes thus obtained was 43. If a noun was not listed in the thesaurus, it was regarded as being in a class of its own.

5.2 Method

The method we have described was applied to the metonymies described in section 5.1. The procedure described below was followed in interpreting a metonymy.

1. Given a metonymy of the form 'Noun A Case-Marker R Predicate V', nouns related to A by the A no B relation and/or the A near B relation were extracted from the corpus described in section 5.1.

2. The extracted nouns (candidates) were ranked according to the measure M defined in Equation (3).

5.3 Results

The result of applying the proposed method to our set of metonymies is summarized in Table 1. A reasonably good result can be seen for 'both relations', i.e. the result obtained by using both the A no B and A near B relations when extracting nouns from the corpus. The accuracy of 'both relations', the ratio of the number of correctly interpreted top-ranked candidates⁶ to the total number of metonymies in our set, was 0.71 (= 53/(53+22)), and the 95% confidence interval estimate was between 0.61 and 0.81. We regard this result as quite promising. Since the metonymies we used were general, domain-independent ones, the degree of accuracy achieved in this experiment is likely to be repeated when our method is applied to other general sets of metonymies.

    Table 1: Experimental results.

    Relations used    Correct  Wrong
    Both relations      53       22
    Only A no B         50       25
    Only A near B       43       32

Table 1 also shows that 'both relations' is more accurate than either the result obtained by solely using the A no B relation or the A near B relation. The use of multiple relations in metonymy interpretation is thus seen to be beneficial.

⁶ The correctness was judged by the authors. A candidate was judged correct when it made sense in Japanese. For example, we regarded beer, cola, and mizu (water) as all correct interpretations for glass wo nomu (drink) (drink a glass) because they made sense in some context.

Table 2 shows the results of applying the method to the thirteen directly translated metonymies described in section 5.1. Asterisks (*) in the first column indicate that direct translations of the sentences result in unacceptable Japanese. The C's and W's in the second column respectively indicate that the top-ranked candidates were correct and wrong. The sentences in the third column are the original English metonymies adopted from (Kamei and Wakao, 1992). The Japanese metonymies in the form 'noun case-marker predicate⁷', in the fourth column, are the inputs to the method. In this column, wo and ga mainly represent the accusative-case and nominative-case, respectively. The nouns listed in the last column are the top three candidates, in order, according to the measure M that was defined in Equation (3).

⁷ Predicates are lemmatized.

These results demonstrate the effectiveness of the method. Ten out of the 13 metonymies were interpreted correctly. Moreover, if we restrict our attention to the ten metonymies that are acceptable in Japanese, all but one were interpreted correctly. The accuracy was 0.9 (= 9/10), higher than that for 'both relations' in Table 1. The reason for the higher degree of accuracy is that the metonymies in Table 2 are somewhat typical and relatively easy to interpret, while the metonymies collected from Japanese sources included a diversity of types and were more difficult to interpret.

Finally, the effectiveness of using semantic classes is discussed. The top candidates of six out of the 75 metonymies were assigned their appropriateness by using their semantic classes, i.e. the values of the measure M were calculated with f(B, R, V) = 0 in Equation (6). Of these, three were correct. On the other hand, if semantic classes are not used, then three of the six are still correct. Here there was no improvement. However, when we surveyed the results of the whole experiment, we found that nouns for which f(B, R, V) = 0 often had close relationships with explicit terms in metonymies and were appropriate as interpretations of the metonymies. We need more research before we can judge the effectiveness of utilizing semantic classes.

    Table 2: Results of applying the proposed method to direct translations of the metonymies in (Kamei and Wakao, 1992).

      |   | Sentence                                              | Noun Case-Marker Pred.  | Candidates
      | C | Dave drank the glasses.                               | glass wo nomu           | beer, cola, mizu (water)
      | C | The kettle is boiling.                                | yakan ga waku           | yu (hot water), oyu (hot water), nettô (boiling water)
      | C | He bought a Ford.                                     | Ford wo kau             | zyôyôsya (car), best seller, kuruma (vehicle)
      | C | He has got a Picasso in his room.                     | Picasso wo motu         | e (painting), image, aizin (lover)
      | C | Anne read Steinbeck.                                  | Steinbeck wo yomu       | gensaku (original work), meisaku (famous story), daihyôsaku (important work)
      | C | Ted played Bach.                                      | Bach wo hiku            | menuetto (minuet), kyoku (music), piano
      | C | He read Mao.                                          | Mao wo yomu             | si (poem), tyosyo (writings), tyosaku (writings)
    * | W | We need a couple of strong bodies for our team.       | karada ga hituyô        | care, kyûsoku (rest), kaigo (nursing)
    * | C | There are a lot of good heads in the university.      | atama ga iru            | hito (person), tomodati (friend), byônin (sick person)
      | W | Exxon has raised its price again.                     | Exxon ga ageru          | Nihon (Japan), ziko (accident), kigyô (company)
      | C | Washington is insensitive to the needs of the people. | Washington ga musinkei  | zikanho (assistant vice-minister), seikai (political world), gikai (Congress)
      | C | The T.V. said it was very crowded at the festival.    | T.V. ga iu              | commentator, announcer, caster
    * | W | The sign said fishing was prohibited here.            | hyôsiki ga iu           | mawari (surrounding), zugara (design), seibi (maintenance)
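For completeness, the reported accuracy figures can be reproduced as follows. The normal approximation for a binomial proportion is our assumption about how the 95% confidence interval was obtained; it lands within rounding of the figures quoted above.

    import math

    correct, wrong = 53, 22                    # 'both relations' row of Table 1
    n = correct + wrong
    p = correct / n
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)   # normal-approximation 95% CI
    print(f"accuracy = {p:.2f}, 95% CI = [{p - half_width:.2f}, {p + half_width:.2f}]")
    # -> accuracy = 0.71, 95% CI = [0.60, 0.81]  (the paper reports 0.61-0.81)

    print(f"accuracy on the acceptable translated metonymies = {9 / 10:.1f}")   # 0.9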

6 Discussion

Semantic Relation  The method proposed in this paper identifies implicit terms for the explicit term in a metonymy. However, it is not concerned with the semantic relation between an explicit term and implicit term, because such semantic relations are not directly expressed in corpora, i.e. noun phrases of the form A no B can be found in corpora but their semantic relations are not. If we need such semantic relations, we must semantically analyze the noun phrases (Kurohashi and Sakai, 1999).

Applicability to other languages  Japanese noun phrases of the form A no B are specific to Japanese. The proposed method, however, could easily be extended to other languages. For example, in English, noun phrases B of A could be used to extract semantically related nouns. Nouns related by is-a relations or part-of relations could also be extracted from corpora (Hearst, 1992; Berland and Charniak, 1999). If such semantically related nouns are extracted, then they can be ranked according to the measure M defined in Equation (3).

Lexically based approaches  Generative Lexicon theory (Pustejovsky, 1995) proposed the qualia structure, which encodes semantic relations among words explicitly. It is useful to infer an implicit term of the explicit term in a metonymy. The proposed approach, on the other hand, uses corpora to infer implicit terms and thus sidesteps the construction of qualia structure.⁸

⁸ Briscoe et al. (1990) discusses the use of machine-readable dictionaries and corpora for acquiring lexical semantic information.

7 Conclusion

This paper discussed a statistical approach to the interpretation of metonymy. The method follows the procedure described below to interpret a metonymy in Japanese:

1. Given a metonymy of the form 'Noun A Case-Marker R Predicate V', nouns that are syntactically related to the explicit term A are extracted from a corpus.

2. The extracted nouns are ranked according to their degree of appropriateness as interpretations of the metonymy by applying a statistical measure.

The method has been tested experimentally. Fifty-three out of seventy-five metonymies were correctly interpreted. This is quite a promising first step toward the statistical processing of metonymy.

References

Matthew Berland and Eugene Charniak. 1999. Finding parts in very large corpora. In ACL-99, pages 57-64.

Jacques Bouaud, Bruno Bachimont, and Pierre Zweigenbaum. 1996. Processing metonymy: a domain-model heuristic graph traversal approach. In COLING-96, pages 137-142.

Ted Briscoe, Ann Copestake, and Bran Boguraev. 1990. Enjoy the paper: Lexical semantics via lexicology. In COLING-90, pages 42-47.

Peter F. Brown, Vincent J. Della Pietra, Peter V. deSouza, Jenifer C. Lai, and Robert L. Mercer. 1992. Class-based n-gram models of natural language. Computational Linguistics, 18(4):467-479.

Kenneth Ward Church and Patrick Hanks. 1990. Word association norms, mutual information, and lexicography. Computational Linguistics, 16(1):22-29.

Dan Fass. 1988. Metonymy and metaphor: What's the difference? In COLING-88, pages 177-181.

Dan Fass. 1997. Processing Metonymy and Metaphor, volume 1 of Contemporary Studies in Cognitive Science and Technology. Ablex Publishing Corporation.

Stephane Ferrari. 1996. Using textual clues to improve metaphor processing. In ACL-96, pages 351-354.

Marti A. Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In COLING-92, pages 539-545.

Eric Iverson and Stephen Helmreich. 1992. Metallel: An integrated approach to non-literal phrase interpretation. Computational Intelligence, 8(3):477-493.

Shin-ichiro Kamei and Takahiro Wakao. 1992. Metonymy: Reassessment, survey of acceptability, and its treatment in a machine translation system. In ACL-92, pages 309-311.

Sadao Kurohashi and Yasuyuki Sakai. 1999. Semantic analysis of Japanese noun phrases: A new approach to dictionary-based understanding. In ACL-99, pages 481-488.

Takashi Kusumi. 1995. Hiyu-no Syori-Katei-to Imi-Kôzô (Processing and Semantic Structure of Tropes). Kazama Publisher. (in Japanese).

George Lakoff and Mark Johnson. 1980. Metaphors We Live By. Chicago University Press.

Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing, chapter 6. The MIT Press.

Yuji Matsumoto, Akira Kitauchi, Tatsuo Yamashita, and Yoshitaka Hirano. 1999. Japanese morphological analysis system ChaSen manual. Nara Institute of Science and Technology.

Masaki Murata, Hitoshi Isahara, and Makoto Nagao. 1999. Resolution of indirect anaphora in Japanese sentences using examples "X no Y (X of Y)". In ACL'99 Workshop on Coreference and Its Applications, pages 31-38.

James Pustejovsky. 1995. The Generative Lexicon. The MIT Press.

The National Language Research Institute. 1996. Bunrui Goi-hyô Zôho-ban (Taxonomy of Japanese, enlarged edition). (in Japanese).

Atsumu Yamamoto, Masaki Murata, and Makoto Nagao. 1998. Example-based metonymy interpretation. In Proc. of the 4th Annual Meeting of the Association for Natural Language Processing, pages 606-609. (in Japanese).

Masa-aki Yamanashi. 1988. Hiyu-to Rikai (Tropes and Understanding). University Publisher. (in Japanese).

Masa-aki Yamanashi. 1995. Ninti Bunpô-ron (Cognitive Linguistics). Hitsuji Publisher. (in Japanese).