
The Use of WordNet in Information Retrieval

Rila Mandala, Tokunaga Takenobu, and Tanaka Hozumi
Department of Computer Science, Tokyo Institute of Technology
{rila, take, tanaka}@cs.titech.ac.jp

Abstract

WordNet has been used in information retrieval research by many researchers, but has failed to improve the performance of their retrieval systems. In this paper we therefore investigate why the use of WordNet has not been successful. Based on this analysis we propose a method of making WordNet more useful in information retrieval applications. Experiments using several standard information retrieval test collections show that our method results in a significant improvement of information retrieval performance.

1 Introduction

Development of WordNet began in 1985 at Princeton University (Miller, 1990). A team led by Prof. George Miller aimed to create a source of lexical knowledge whose organization would reflect some of the recent findings of psycholinguistic research into the human lexicon. WordNet has been used in numerous natural language processing tasks, such as part-of-speech tagging (Segond et al., 1997), word sense disambiguation (Resnik, 1995), text categorization (Gomez-Hidalgo and Rodriguez, 1997), and information extraction (Chai and Biermann, 1997), with considerable success. However, the usefulness of WordNet in information retrieval applications has been debatable.

Information retrieval is concerned with locating documents relevant to a user's information needs from a collection of documents. The user describes his/her information needs with a query which consists of a number of words. The information retrieval system compares the query with the documents in the collection and returns the documents that are likely to satisfy the user's information requirements. A fundamental weakness of current information retrieval methods is that the vocabulary that searchers use is often not the same as the one by which the information has been indexed. Query expansion is one method to solve this problem. The query is expanded using terms which have a similar meaning or bear some relation to those in the query, increasing the chances of matching words in relevant documents. Expansion terms are generally taken from a thesaurus.

Obviously, given a query, the information retrieval system must present all useful articles to the user. This objective is measured by recall, i.e. the proportion of relevant articles retrieved by the system. Conversely, the information retrieval system must not present any useless article to the user. This criterion is measured by precision, i.e. the proportion of retrieved articles that are relevant.

Voorhees used WordNet as a tool for query expansion (Voorhees, 1994). She conducted experiments using the TREC collection (Voorhees and Harman, 1997) in which all terms in the queries were expanded using a combination of synonyms, hypernyms, and hyponyms. She set the weights of the words contained in the original query to 1, and used a combination of 0.1, 0.3, 0.5, 1, and 2 for the expansion terms. She then used the SMART information retrieval engine (Salton, 1971) to retrieve the documents. Through this method, Voorhees only succeeded in improving the performance on short queries, and only a little, with no significant improvement for long queries. She further tried to use WordNet as a tool for word sense disambiguation (Voorhees, 1993) and applied it to text retrieval, but the retrieval performance was degraded.

Stairmand (Stairmand, 1997) used WordNet to compute lexical cohesion according to the method suggested by Morris (Morris and Hirst, 1991), and applied this to information retrieval.
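To make the two measures concrete, recall and precision for a single query can be computed as follows; this is an illustrative sketch, and the document IDs are hypothetical:

```python
def recall_precision(retrieved, relevant):
    """Recall = fraction of relevant docs that were retrieved;
    precision = fraction of retrieved docs that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# A system returns 10 documents; 3 of the 4 relevant ones are among them.
r, p = recall_precision(retrieved=range(10), relevant=[2, 5, 7, 42])
print(r, p)  # 0.75 0.3
```

The tension the paper describes is visible here: adding expansion terms can only raise the hit count (recall), but every extra retrieved document also enlarges the denominator of precision.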

He concluded that his method could not be applied to a fully-functional information retrieval system.

Smeaton (Smeaton and Berrut, 1995) tried to expand the queries of the TREC-4 collection with various strategies for weighting expansion terms, along with manual and automatic word sense disambiguation techniques. Unfortunately, all strategies degraded the retrieval performance.

Instead of matching terms in queries and documents, Richardson (Richardson and Smeaton, 1995) used WordNet to compute the semantic distance between concepts or words, and then used this term distance to compute the similarity between a query and a document. Although he proposed two methods to compute semantic distances, neither of them increased the retrieval performance.

2 What's wrong with WordNet?

In this section we analyze why WordNet has failed to improve information retrieval performance. We run exact-match retrieval against 9 small standard test collections in order to observe this phenomenon. An information retrieval test collection consists of a collection of documents along with a set of test queries. The set of relevant documents for each test query is also given, so that the performance of the information retrieval system can be measured. We expand queries using a combination of synonyms, hypernyms, and hyponyms in WordNet. The results are shown in Table 1.

In Table 1 we show the name of the test collection (Collection), the total number of documents (#Doc) and queries (#Query), and the total number of relevant documents over all queries (#Rel) in that collection. For each document collection, we indicate the total number of relevant documents retrieved (Rel-ret), the recall (Rel-ret / #Rel), the total number of documents retrieved (Ret-docs), and the precision (Rel-ret / Ret-docs) for each of no expansion (Base), expansion with synonyms (Exp. I), expansion with synonyms and hypernyms (Exp. II), expansion with synonyms and hyponyms (Exp. III), and expansion with synonyms, hypernyms, and hyponyms (Exp. IV).

Table 1: Term Expansion Experiment Results using WordNet

Collection  #Doc   #Query  #Rel              Base      Exp. I    Exp. II   Exp. III  Exp. IV
ADI            82     35    170   Rel-ret      157       159       166       169       169
                                  Recall     0.9235    0.9353    0.9765    0.9941    0.9941
                                  Ret-docs    2,063     2,295     2,542     2,737     2,782
                                  Precision  0.0761    0.0693    0.0653    0.0617    0.0607
CACM         3204     64    796   Rel-ret      738       756       766       773       773
                                  Recall     0.9271    0.9497    0.9623    0.9711    0.9711
                                  Ret-docs   67,950    86,552   101,154   109,391   116,001
                                  Precision  0.0109    0.0087    0.0076    0.0070    0.0067
CISI         1460    112   3114   Rel-ret     2,952     3,015     3,076     3,104     3,106
                                  Recall     0.9479    0.9682    0.9878    0.9968    0.9974
                                  Ret-docs   87,895    98,844   106,275   108,970   109,674
                                  Precision  0.0336    0.0305    0.0289    0.0284    0.0283
CRAN         1398    225   1838   Rel-ret     1,769     1,801     1,823     1,815     1,827
                                  Recall     0.9625    0.9799    0.9918    0.9875    0.9940
                                  Ret-docs  199,469   247,212   284,026   287,028   301,314
                                  Precision  0.0089    0.0073    0.0064    0.0063    0.0060
INSPEC      12684     84   2543   Rel-ret     2,508     2,531     2,538     2,536     2,542
                                  Recall     0.9862    0.9953    0.9980    0.9972    0.9996
                                  Ret-docs  564,809   735,931   852,056   869,364   912,810
                                  Precision  0.0044    0.0034    0.0030    0.0029    0.0028
LISA         6004     35    339   Rel-ret      339       339       339       339       339
                                  Recall     1.0000    1.0000    1.0000    1.0000    1.0000
                                  Ret-docs  148,547   171,808   184,101   188,289   189,784
                                  Precision  0.0023    0.0020    0.0018    0.0018    0.0018
MED          1033     30    696   Rel-ret      639       662       670       671       673
                                  Recall     0.9181    0.9511    0.9626    0.9640    0.9670
                                  Ret-docs   12,021    16,758    22,316    22,866    25,250
                                  Precision  0.0532    0.0395    0.0300    0.0293    0.0267
NPL         11429    100   2083   Rel-ret     2,061     2,071     2,073     2,072     2,074
                                  Recall     0.9894    0.9942    0.9952    0.9942    0.9957
                                  Ret-docs  267,158   395,280   539,048   577,033   678,828
                                  Precision  0.0077    0.0052    0.0038    0.0036    0.0031
TIME          423     24    324   Rel-ret      324       324       324       324       324
                                  Recall     1.0000    1.0000    1.0000    1.0000    1.0000
                                  Ret-docs   23,014    29,912    33,650    32,696    34,443
                                  Precision  0.0141    0.0108    0.0096    0.0095    0.0094

From the results in Table 1, we can conclude that query expansion can increase recall performance but unfortunately degrades precision performance. We thus turned to investigating why all the relevant documents could not be retrieved with the query expansion method above. Some of the reasons are stated below:

• Two terms that seem to be interrelated have different parts of speech in WordNet. This is the case for stochastic (adjective) and statistic (noun). Since words in WordNet are grouped on the basis of part of speech, it is not possible to find a relationship between terms with different parts of speech.

• Most relationships between two terms are not found in WordNet. For example, how do we know that Sumitomo Bank is a Japanese company?

• Some terms are not included in WordNet (proper names, etc.).

To overcome all the above problems, we propose a method to enrich WordNet with an automatically constructed thesaurus. The idea underlying this method is that an automatically constructed thesaurus can complement the drawbacks of WordNet. For example, as we stated earlier, proper names and the interrelations among them are not found in WordNet, but if proper names and other terms have some strong relationship, they often cooccur in documents, so their relationship may be modelled by an automatically constructed thesaurus.

Polysemous words degrade the precision of information retrieval, since all senses of the original query term are considered for expansion. To overcome the problem of polysemous words, we apply a restriction: queries are expanded by adding those terms that are most similar to the entirety of the query terms, rather than by selecting terms that are similar to a single term in the query.

In the next section we describe the details of our method.

3 Method

3.1 Co-occurrence-based Thesaurus

The general idea underlying the use of term co-occurrence data for thesaurus construction is that words that tend to occur together in documents are likely to have similar, or related,

meanings. Co-occurrence data thus provides a statistical method for automatically identifying semantic relationships that are normally contained in a hand-made thesaurus. Suppose two words (A and B) occur fa and fb times, respectively, and cooccur fc times; then the similarity between A and B can be calculated using a similarity coefficient such as the Dice coefficient:

    sim(A, B) = 2 · fc / (fa + fb)

3.2 Predicate-Argument-based Thesaurus

In contrast with the previous section, this method attempts to construct a thesaurus according to predicate-argument structures. The use of this method for thesaurus construction is based on the idea that there are restrictions on what words can appear in certain environments, and in particular, on what words can be arguments of a certain predicate. For example, a cat may walk and bite, but cannot fly. Each noun may therefore be characterized according to the verbs or adjectives that it occurs with. Nouns may then be grouped according to the extent to which they appear in similar constructions.

First, all the documents are parsed using the Apple Pie Parser, a probabilistic chart parser developed by Satoshi Sekine (Sekine and Grishman, 1995). Then the following syntactic structures are extracted:

• Subject-Verb
• Verb-Object
• Adjective-Noun

Each noun has a set of verbs and adjectives that it occurs with, and for each such relationship a Dice coefficient value is calculated:

    Csub(vi, nj) = 2 · fsub(vi, nj) / (f(vi) + fsub(nj))

where fsub(vi, nj) is the frequency of noun nj occurring as the subject of verb vi, fsub(nj) is the frequency of the noun nj occurring as the subject of any verb, and f(vi) is the frequency of the verb vi;

    Cobj(vi, nj) = 2 · fobj(vi, nj) / (f(vi) + fobj(nj))

where fobj(vi, nj) is the frequency of noun nj occurring as the object of verb vi, fobj(nj) is the frequency of the noun nj occurring as the object of any verb, and f(vi) is the frequency of the verb vi;

    Cadj(ai, nj) = 2 · fadj(ai, nj) / (f(ai) + fadj(nj))

where fadj(ai, nj) is the frequency of noun nj occurring as an argument of adjective ai, fadj(nj) is the frequency of the noun nj occurring as an argument of any adjective, and f(ai) is the frequency of the adjective ai.

We define the similarity of two nouns with respect to one predicate as the minimum of the Dice coefficients with respect to that predicate, i.e.

    SIMsub(vi, nj, nk) = min{Csub(vi, nj), Csub(vi, nk)}
    SIMobj(vi, nj, nk) = min{Cobj(vi, nj), Cobj(vi, nk)}
    SIMadj(ai, nj, nk) = min{Cadj(ai, nj), Cadj(ai, nk)}

Finally, the overall similarity between two nouns is defined as the average of all the similarities between those two nouns over all predicate-argument structures.

3.3 Expansion Term Weighting Method

A query q is represented by a vector q = (q1, q2, ..., qn), where the qi are the weights of the search terms ti contained in query q. The similarity between a query q and a term tj can be defined as below:

    simqt(q, tj) = Σ_{ti ∈ q} qi · sim(ti, tj)

where the value of sim(ti, tj) is defined as the average of the similarity values in the three types of thesaurus. Since in WordNet there are no similarity weights, when there is a relation between two terms in WordNet their similarity is taken as the average of the similarity between those two terms in the co-occurrence-based and the predicate-argument-based thesauri.

With respect to the query q, all the terms in the collection can now be ranked according to their simqt. Expansion terms are terms tj with high simqt(q, tj). The weight(q, tj) of an expansion term tj is defined as a function of simqt(q, tj):

    weight(q, tj) = simqt(q, tj) / Σ_{ti ∈ q} qi

where 0 ≤ weight(q, tj) ≤ 1. An expansion term gets a weight of 1 if its similarity to all the terms in the query is 1, and expansion terms with similarity 0 to all the terms in the query get a weight of 0. The weight of an expansion term thus depends both on the entire retrieval query and on the similarity between the terms; it can be interpreted mathematically as the weighted mean of the similarities between the term tj and all the query terms, where the weights of the original query terms are the weighting factors of those similarities.

Therefore the query q is expanded by adding the following query:

    qe = (a1, a2, ..., ar)

where aj is equal to weight(q, tj) if tj belongs to the top r ranked terms; otherwise aj is equal to 0.
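The ranking and weighting scheme above can be sketched in Python. This is an illustrative sketch only: the similarity values below are hypothetical toy numbers standing in for the averaged thesaurus similarities, not figures from the paper's experiments.

```python
def expansion_weights(query, sim, candidates, r):
    """Rank candidates by simqt(q, tj) = sum_i qi * sim(ti, tj), then
    weight the top r terms by weight(q, tj) = simqt(q, tj) / sum_i qi."""
    total = sum(query.values())
    simqt = {tj: sum(qi * sim(ti, tj) for ti, qi in query.items())
             for tj in candidates}
    top = sorted(candidates, key=simqt.get, reverse=True)[:r]
    return {tj: simqt[tj] / total for tj in top}

# Hypothetical averaged similarities between query terms and candidates.
SIM = {("bank", "finance"): 0.75, ("money", "finance"): 0.5,
       ("bank", "river"): 0.25}
sim = lambda a, b: SIM.get((a, b), 0.0)

query = {"bank": 1.0, "money": 1.0}   # original query terms get weight 1
weights = expansion_weights(query, sim, ["finance", "river"], r=1)
print(weights)  # {'finance': 0.625}
```

Note how the normalization produces the behaviour claimed in the text: "river" is similar to bank alone, so its weighted mean over both query terms stays low, while "finance" is supported by the whole query.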

The resulting expanded query is then the concatenation of the original query and the expansion query:

    qexpanded = q ∘ qe

where ∘ is defined as the concatenation operator. The method above can accommodate the polysemous word problem, because an expansion term which is taken from a different sense to the original query term is given a very low weight.

4 Experimental Results

In order to evaluate the effectiveness of the method proposed in the previous section, we conducted experiments using the WSJ, CACM, INSPEC, CISI, Cranfield, NPL, and LISA test collections. The WSJ collection comprises part of the TREC collection (Voorhees and Harman, 1997). As a baseline we used SMART (Salton, 1971) without expansion. SMART is an information retrieval engine based on the vector space model, in which term weights are calculated from term frequency, inverse document frequency, and document length normalization. The results are shown in Table 2. This table shows the average 11-point uninterpolated recall-precision for each of the baseline, expansion using only WordNet, expansion using only the predicate-argument-based thesaurus, expansion using only the co-occurrence-based thesaurus, and expansion using all of them. For each method we give the percentage improvement over the baseline. The performance using the combined thesauri for query expansion is better than both SMART and using just one type of thesaurus.

Table 2: Experiment Results using Combined Thesauri

                       Expanded with
Coll     Base   WordNet only    Pred-arg only   Cooccur only    Combined
WSJ      0.245  0.281 (+2.0%)   0.258 (+5.2%)   0.294 (+10.8%)  0.384 (+58.7%)
CACM     0.269  0.281 (+4.5%)   0.291 (+8.3%)   0.297 (+10.8%)  0.533 (+98.2%)
INSPEC   0.273  0.283 (+3.7%)   0.284 (+4.3%)   0.328 (+20.4%)  0.472 (+73.1%)
CISI     0.215  0.231 (+7.2%)   0.238 (+9.4%)   0.262 (+21.8%)  0.301 (+81.3%)
CRAN     0.412  0.421 (+2.3%)   0.441 (+7.0%)   0.487 (+18.3%)  0.667 (+82.1%)
NPL      0.201  0.210 (+4.2%)   0.217 (+8.1%)   0.238 (+17.5%)  0.333 (+65.5%)
LISA     0.304  0.313 (+3.1%)   0.327 (+7.0%)   0.369 (+21.4%)  0.485 (+59.7%)

5 Discussion

In this section we discuss why our method of using WordNet is able to improve the performance of information retrieval. The important points of our method are:

• the coverage of WordNet is broadened

• the weighting method

The three types of thesaurus we used have different characteristics. Automatically constructed thesauri add not only new terms but also new relationships not found in WordNet. If two terms often cooccur in a document, then those two terms are likely to bear some relationship. Why not use only the automatically constructed thesauri? The answer is that some relationships may be missing from the automatically constructed thesauri. For example, consider the words tumor and tumour. These words certainly share the same context, but would never appear in the same document, at least not with a frequency recognized by a cooccurrence-based method. In general, different words used to describe similar concepts may never be used in the same document, and are thus missed by the cooccurrence methods. However, their relationship may be found in the WordNet thesaurus.

The second point is our weighting method. As already mentioned, most attempts at automatically expanding queries by means of WordNet have failed to improve retrieval effectiveness. The opposite has often been true: expanded queries were less effective than the original queries. Besides the "incomplete" nature of WordNet, we believe that a further problem, the weighting of expansion terms, has not been solved. All weighting methods described in past research on query expansion using WordNet have been based on "trial and error" or ad-hoc methods; that is, they have no underlying justification.

The advantages of our weighting method are:

• the weight of each expansion term considers the similarity of that term to all terms in the original query, rather than to just one or some of the query terms;

• the weight of the expansion term accommodates the polysemous word problem.
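The co-occurrence half of the enrichment can be sketched directly from the Dice coefficient of Section 3.1. The three "documents" below are hypothetical toy data used only to show the mechanics:

```python
from collections import Counter
from itertools import combinations

def dice(fa, fb, fc):
    """Dice coefficient: sim(A, B) = 2 * fc / (fa + fb)."""
    return 2 * fc / (fa + fb) if fa + fb else 0.0

def cooccurrence_thesaurus(docs):
    """Pairwise Dice similarities from document-level co-occurrence counts."""
    freq, cofreq = Counter(), Counter()
    for doc in docs:
        terms = set(doc.split())
        freq.update(terms)                             # fa, fb
        cofreq.update(combinations(sorted(terms), 2))  # fc
    return {pair: dice(freq[pair[0]], freq[pair[1]], fc)
            for pair, fc in cofreq.items()}

docs = ["bank finance loan", "bank river", "finance loan money"]
thesaurus = cooccurrence_thesaurus(docs)
print(thesaurus[("finance", "loan")])  # 2*2/(2+2) = 1.0
print(thesaurus[("bank", "finance")])  # 2*1/(2+2) = 0.5
```

As the tumor/tumour example illustrates, word pairs that never share a document get no entry at all in such a thesaurus, which is exactly the gap the WordNet relations fill in the combined resource.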

This method can accommodate the polysemous word problem, because an expansion term taken from a different sense to the original query term sense is given a very low weight. The reason for this is that the weighting method depends on all query terms and on all of the thesauri. For example, the word bank has many senses in WordNet. Two such senses are the financial institution and the river edge senses. In a document collection relating to financial banks, the river sense of bank will generally not be found in the cooccurrence-based thesaurus because of a lack of articles talking about rivers. Even though (with small probability) there may be some documents in the collection talking about rivers, if the query contained the finance sense of bank then the other terms in the query would also concern finance and not rivers. Thus river would only have a relationship with the bank term and would have no relationships with the other terms in the original query, resulting in a low weight. Since our weighting method depends on both the query in its entirety and the similarity in the three thesauri, wrong-sense expansion terms are given very low weight.

6 Related Research

Smeaton (Smeaton and Berrut, 1995) and Voorhees (Voorhees, 1994) have proposed expansion methods using WordNet. Our method differs from theirs in that we enrich the coverage of WordNet using two methods of automatic thesaurus construction, and we weight the expansion terms appropriately so that the method can accommodate the polysemous word problem.

Although Stairmand (Stairmand, 1997) and Richardson (Richardson and Smeaton, 1995) have proposed the use of WordNet in information retrieval, they did not use WordNet in the query expansion framework.

Our predicate-argument structure-based thesaurus is based on the method proposed by Hindle (Hindle, 1990), although Hindle did not apply it to information retrieval. Instead, he used mutual information statistics as a similarity coefficient, whereas we used the Dice coefficient for normalization purposes. Hindle only extracted the subject-verb and the object-verb predicate-arguments, while we also extract adjective-noun predicate-arguments.

Our weighting method follows the Qiu method (Qiu and Frei, 1993), except that Qiu used it to expand terms from only a single automatically constructed thesaurus and did not consider the use of more than one thesaurus.

7 Conclusions

This paper analyzed why the use of WordNet has failed to improve retrieval effectiveness in information retrieval applications. We found that the main reasons are that most relationships between terms are not found in WordNet, and that some terms, such as proper names, are not included in WordNet. To overcome this problem we proposed a method to enrich WordNet with automatically constructed thesauri.

Another problem in query expansion is that of polysemous words. Instead of using a word sense disambiguation method to select the appropriate sense of each word, we overcame this problem with a weighting method. Experiments showed that our method of using WordNet in query expansion can improve information retrieval effectiveness.

Future work will include experiments on larger test collections, and the use of WordNet in methods other than query expansion in information retrieval.

8 Acknowledgements

The authors would like to thank Mr. Timothy Baldwin (TIT, Japan) for his comments on an earlier version of this paper, Dr. Chris Buckley (Cornell University) for the SMART support, and Mr. Satoshi Sekine (New York University) for the Apple Pie Parser support.

References

J.Y. Chai and A. Biermann. 1997. The use of lexical semantics in information extraction. In Proceedings of the Workshop in Automatic Information Extraction and Building of Lexical Semantic Resources, pages 61-70.

J.M. Gomez-Hidalgo and M.B. Rodriguez. 1997. Integrating a lexical database and a training collection for text categorization. In Proceedings of the Workshop in Automatic Information Extraction and Building of Lexical Semantic Resources, pages 39-44.

D. Hindle. 1990. Noun classification from predicate-argument structures. In Proceedings of the 28th Annual Meeting of the ACL, pages 268-275.

G.A. Miller. 1990. Special issue: WordNet: An on-line lexical database. International Journal of Lexicography, 3(4).

J. Morris and G. Hirst. 1991. Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Computational Linguistics, 17(1):21-48.

Y. Qiu and H.P. Frei. 1993. Concept based query expansion. In Proceedings of the 16th ACM-SIGIR Conference, pages 160-169.

P. Resnik. 1995. Disambiguating noun groupings with respect to WordNet senses. In Proceedings of the 3rd Workshop on Very Large Corpora.

R. Richardson and A.F. Smeaton. 1995. Using WordNet in a knowledge-based approach to information retrieval. Technical Report CA-0395, School of Computer Applications, Dublin City University.

G. Salton. 1971. The SMART Retrieval System: Experiments in Automatic Document Processing. Prentice-Hall.

F. Segond, A. Schiller, G. Grefenstette, and J. Chanod. 1997. An experiment in semantic tagging using hidden Markov model tagging. In Proceedings of the Workshop in Automatic Information Extraction and Building of Lexical Semantic Resources, pages 78-81.

S. Sekine and R. Grishman. 1995. A corpus-based probabilistic grammar with only two non-terminals. In Proceedings of the International Workshop on Parsing Technologies.

A.F. Smeaton and C. Berrut. 1995. Running TREC-4 experiments: A chronological report of query expansion experiments carried out as part of TREC-4. Technical Report CA-2095, School of Computer Applications, Dublin City University.

M.A. Stairmand. 1997. Textual context analysis for information retrieval. In Proceedings of the 20th ACM-SIGIR Conference, pages 140-147.

E.M. Voorhees. 1993. Using WordNet to disambiguate word senses for text retrieval. In Proceedings of the 16th ACM-SIGIR Conference, pages 171-180.

E.M. Voorhees. 1994. Query expansion using lexical-semantic relations. In Proceedings of the 17th ACM-SIGIR Conference, pages 61-69.

E.M. Voorhees and D. Harman. 1997. Overview of the fifth Text REtrieval Conference (TREC-5). In Proceedings of the Fifth Text REtrieval Conference, pages 1-28. NIST Special Publication 500-238.