A Corpus-Based Approach for Building Semantic Lexicons

Ellen Riloff and Jessica Shepherd
Department of Computer Science
University of Utah
Salt Lake City, UT 84112
riloff@cs.utah.edu

Abstract

Semantic knowledge can be a great asset to natural language processing systems, but it is usually hand-coded for each application. Although some semantic information is available in general-purpose knowledge bases such as WordNet and Cyc, many applications require domain-specific lexicons that represent words and categories for a particular topic. In this paper, we present a corpus-based method that can be used to build semantic lexicons for specific categories. The input to the system is a small set of seed words for a category and a representative text corpus. The output is a ranked list of words that are associated with the category. A user then reviews the top-ranked words and decides which ones should be entered in the semantic lexicon. In experiments with five categories, users typically found about 60 words per category in 10-15 minutes to build a core semantic lexicon.

1 Introduction

Semantic information can be helpful in almost all aspects of natural language understanding, including word sense disambiguation, selectional restrictions, attachment decisions, and discourse processing. Semantic knowledge can add a great deal of power and accuracy to natural language processing systems. But semantic information is difficult to obtain. In most cases, semantic knowledge is encoded manually for each application.

There have been a few large-scale efforts to create broad semantic knowledge bases, such as WordNet (Miller, 1990) and Cyc (Lenat, Prakash, and Shepherd, 1986). While these efforts may be useful for some applications, we believe that they will never fully satisfy the need for semantic knowledge. Many domains are characterized by their own sublanguage containing terms and jargon specific to the field. Representing all sublanguages in a single knowledge base would be nearly impossible. Furthermore, domain-specific semantic lexicons are useful for minimizing ambiguity problems. Within the context of a restricted domain, many polysemous words have a strong preference for one word sense, so knowing the most probable word sense in a domain can strongly constrain the ambiguity.

We have been experimenting with a corpus-based method for building semantic lexicons semi-automatically. Our system uses a text corpus and a small set of seed words for a category to identify other words that also belong to the category. The algorithm uses simple statistics and a bootstrapping mechanism to generate a ranked list of potential category words. A human then reviews the top words and selects the best ones for the dictionary. Our approach is geared toward fast semantic lexicon construction: given a handful of seed words for a category and a representative text corpus, one can build a semantic lexicon for a category in just a few minutes.

In the first section, we describe the statistical bootstrapping algorithm for identifying candidate category words and ranking them. Next, we describe experimental results for five categories. Finally, we discuss our experiences with additional categories and seed word lists, and summarize our results.

2 Generating a Semantic Lexicon

Our work is based on the observation that category members are often surrounded by other category members in text, for example in conjunctions (lions and tigers and bears), lists (lions, tigers, bears...), appositives (the stallion, a white Arabian), and nominal compounds (Arabian stallion; tuna fish).

Given a few category members, we wondered whether it would be possible to collect surrounding contexts and use statistics to identify other words that also belong to the category. Our approach was motivated by Yarowsky's word sense disambiguation algorithm (Yarowsky, 1992) and the notion of statistical salience, although our system uses somewhat different statistical measures and techniques.

We begin with a small set of seed words for a category. We experimented with different numbers of seed words, but were surprised to find that only 5 seed words per category worked quite well. As an example, the seed word lists used in our experiments are shown below.

Energy: fuel gas gasoline oil power
Financial: bank banking currency dollar money
Military: army commander infantry soldier troop
Vehicle: airplane car jeep plane truck
Weapon: bomb dynamite explosives gun rifle

Figure 1: Initial Seed Word Lists

The input to our system is a text corpus and an initial set of seed words for each category. Ideally, the text corpus should contain many references to the category. Our approach is designed for domain-specific text processing, so the text corpus should be a representative sample of texts for the domain and the categories should be semantic classes associated with the domain. Given a text corpus and an initial seed word list for a category C, the algorithm for building a semantic lexicon is as follows:

1. We identify all sentences in the text corpus that contain one of the seed words. Each sentence is given to our parser, which segments the sentence into simple noun phrases, verb phrases, and prepositional phrases. For our purposes, we do not need any higher level parse structures.

2. We collect small context windows surrounding each occurrence of a seed word as a head noun in the corpus. Restricting the seed words to be head nouns ensures that the seed word is the main concept of the noun phrase. Also, this reduces the chance of finding different word senses of the seed word (though multiple noun word senses may still be a problem). We use a very narrow context window consisting of only two words, the first noun to the word's right and the first noun to its left. We collected only nouns under the assumption that most, if not all, true category members would be nouns.[1] The context windows do not cut across sentence boundaries. Note that our context window is much narrower than those used by other researchers (Yarowsky, 1992). We experimented with larger window sizes and found that the narrow windows more consistently included words related to the target category.

3. Given the context windows for a category, we compute a category score for each word, which is essentially the conditional probability that the word appears in a category context. The category score of a word W for category C is defined as:

   score(W, C) = freq. of W in C's context windows / freq. of W in corpus

Note that this is not exactly a conditional probability because a single word occurrence can belong to more than one context window. For example, consider the sentence: I bought an AK-47 gun and an M-16 rifle. The word M-16 would be in the context windows for both gun and rifle even though there was just one occurrence of it in the sentence. Consequently, the category score for a word can be greater than 1.

4. Next, we remove stopwords, numbers, and any words with a corpus frequency < 5. We used a stopword list containing about 30 general nouns, mostly pronouns (e.g., I, he, she, they) and determiners (e.g., this, that, those). The stopwords and numbers are not specific to any category and are common across many domains, so we felt it was safe to remove them. The remaining nouns are sorted by category score and ranked so that the nouns most strongly associated with the category appear at the top.

5. The top five nouns that are not already seed words are added to the seed word list dynamically. We then go back to Step 1 and repeat the process. This bootstrapping mechanism dynamically grows the seed word list so that each iteration produces a larger category context. In our experiments, the top five nouns were added automatically without any human intervention, but this sometimes allows non-category words to dilute the growing seed word list. A few inappropriate words are not likely to have much impact, but many inappropriate words or a few highly frequent words can weaken the feedback process. One could have a person verify that each word belongs to the target category before adding it to the seed word list, but this would require human interaction at each iteration of the feedback cycle. We decided to see how well the technique could work without this additional human interaction, but the potential benefits of human feedback still need to be investigated.

[1] Of course, this may depend on the target categories.
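To make steps 2 and 3 concrete, the Python fragment below sketches the context-window collection and category scoring. It is an illustration rather than the authors' implementation: the tokenized-sentence representation and the function name are assumptions, and a faithful version would also restrict matches to seed words that appear as head nouns, as described in step 2.

    from collections import defaultdict

    def category_scores(sentences, seed_words):
        # `sentences` is assumed to be a list of sentences, each a list of
        # (token, is_noun) pairs produced by a shallow parser.
        window_freq = defaultdict(int)   # freq. of W in the category's context windows
        corpus_freq = defaultdict(int)   # freq. of W anywhere in the corpus

        for sent in sentences:
            noun_positions = [i for i, (tok, is_noun) in enumerate(sent) if is_noun]
            for i in noun_positions:
                corpus_freq[sent[i][0]] += 1
            for idx, i in enumerate(noun_positions):
                if sent[i][0] not in seed_words:
                    continue
                # context window: first noun to the left and first noun to the
                # right of the seed word, never crossing the sentence boundary
                if idx > 0:
                    window_freq[sent[noun_positions[idx - 1]][0]] += 1
                if idx + 1 < len(noun_positions):
                    window_freq[sent[noun_positions[idx + 1]][0]] += 1

        # a word inside two overlapping windows is counted twice,
        # so a score can exceed 1, as noted in step 3
        return {w: window_freq[w] / corpus_freq[w]
                for w in window_freq if corpus_freq[w] > 0}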

After several iterations, the seed word list typically contains many relevant category words. But more importantly, the ranked list contains many additional category words, especially near the top.[2] The number of iterations can make a big difference in the quality of the ranked list. Since new seed words are generated dynamically without manual review, the quality of the ranked list can deteriorate rapidly when too many non-category words become seed words. In our experiments, we found that about eight iterations usually worked well.

The output of the system is the ranked list of nouns after the final iteration. The seed word list is thrown away. Note that the original seed words were already known to be category members, and the new seed words are already in the ranked list because that is how they were selected.

Finally, a user must review the ranked list and identify the words that are true category members. How one defines a "true" category member is subjective and may depend on the specific application, so we leave this exercise to a person. Typically, the words near the top of the ranked list are highly associated with the category but the density of category words decreases as one proceeds down the list. The user may scan down the list until a sufficient number of category words is found, or as long as time permits. The words selected by the user are added to a permanent semantic lexicon with the appropriate category label.

Our goal is to allow a user to build a semantic lexicon for one or more categories using only a small set of known category members as seed words and a text corpus. The output is a ranked list of potential category words that a user can review to create a semantic lexicon quickly. The success of this approach depends on the quality of the ranked list, especially the density of category members near the top. In the next section, we describe experiments to evaluate our system.

3 Experimental Results

We performed experiments with five categories to evaluate the effectiveness and generality of our approach: energy, financial, military, vehicles, and weapons. The MUC-4 development corpus (1700 texts) was used as the text corpus (MUC-4 Proceedings, 1992). We chose these five categories because they represented relatively different semantic classes, they were prevalent in the MUC-4 corpus, and they seemed to be useful categories.

For each category, we began with the seed word lists shown in Figure 1. We ran the bootstrapping algorithm for eight iterations, adding five new words to the seed word list after each cycle. After the final iteration, we had ranked lists of potential category words for each of the five categories. The top 45 words[3] from each ranked list are shown in Figure 2.

While the ranked lists are far from perfect, one can see that there are many category members near the top of each list. It is also apparent that a few additional heuristics could be used to remove many of the extraneous words. For example, our number processor failed to remove numbers with commas (e.g., 2,000). And the military category contains several ordinal numbers (e.g., 10th 3rd 1st) that could be easily identified and removed. But the key question is whether the ranked list contains many true category members. Since this is a subjective question, we set up an experiment involving human judges.

For each category, we selected the top 200 words from its ranked list and presented them to a user. We presented the words in random order so that the user had no idea how our system had ranked the words. This was done to minimize contextual effects (e.g., seeing five category members in a row might make someone more inclined to judge the next word as relevant). Each category was judged by two people independently.[4]

The judges were asked to rate each word on a scale from 1 to 5 indicating how strongly it was associated with the category. Since category judgements can be highly subjective, we gave them guidelines to help establish uniform criteria. The instructions that were given to the judges are shown in Figure 3.
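As a rough summary of this set-up, the sketch below (again an illustration, not the actual system) reuses the category_scores sketch from Section 2 and assumes a stopword set and precomputed corpus frequencies are supplied. It runs the bootstrapping loop for eight iterations, adds five new seed words per cycle, and returns the final ranked list from which the top 200 words would be drawn for review.

    def build_ranked_list(sentences, seeds, stopwords, corpus_freq,
                          iterations=8, new_seeds_per_cycle=5, min_freq=5):
        # Bootstrapping driver: grow the seed list, then return the ranked nouns.
        # The seed list itself is discarded at the end; only the ranking is kept.
        seeds = set(seeds)
        ranked = []
        for _ in range(iterations):
            scores = category_scores(sentences, seeds)   # sketch from Section 2
            ranked = [w for w in sorted(scores, key=scores.get, reverse=True)
                      if w not in stopwords
                      and not w.replace(",", "").isdigit()     # drop numbers
                      and corpus_freq.get(w, 0) >= min_freq]   # corpus frequency >= 5
            # add the top five nouns that are not already seed words, then repeat
            seeds.update([w for w in ranked if w not in seeds][:new_seeds_per_cycle])
        return ranked

    # e.g., hand the top of the list to a reviewer (hypothetical inputs):
    # candidates = build_ranked_list(parsed_corpus,
    #     ["bomb", "dynamite", "explosives", "gun", "rifle"], stoplist, freqs)
    # for word in candidates[:200]:
    #     print(word)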

[2] It is possible that a word may be near the top of the ranked list during one iteration (and subsequently become a seed word) but become buried at the bottom of the ranked list during later iterations. However, we have not observed this to be a problem so far.

[3] Note that some of these words are not nouns, such as boarded and U.S.-made. Our parser tags unknown words as nouns, so sometimes unknown words are mistakenly selected for context windows.

[4] The judges were members of our research group but not the authors.

Energy: Limon-Covenas[a] oligarchs spill staples poles Limon Barrancabermeja Covenas 200,000 barrels oil Bucaramanga pipeline prices electric pipelines towers Cano substation transmission rates pylons pole infrastructure transfer gas fuel sale lines companies power tower price gasoline industries insurance Arauca stretch inc industry forum nationalization supply electricity controls

Financial: monetary fund nationalization attractive circulation suit gold branches manager bank advice invested banks bomb_explosion investment invest announcements content managers insurance dollar savings product employee accounts goods currency reserves amounts money shops farmers maintenance Itagui economies companies foundation moderation promotion annually cooperatives empire loans industry possession

Military: infantry 10th 3rd 1st brigade technician 2d 3d moran 6th 4th Gaspar 5th 9th Amilcar regiment sound 13th Pineda brigades Anaya division Leonel contra anniversary ranks Uzcategui brilliant Aristides escort dispatched 8th Tablada employee skirmish puppet Rolando columns (FMLN) deserter troops Nicolas Aureliano Montes Fuentes

Vehicle: C-47 license A-37 crewmen plate plates crash push tank pickup Cessna aircraft cargo passenger boarded Boeing_727 luxury Avianca dynamite_sticks hostile passengers accident sons airplane light plane flight U.S.-made weaponry truck airplanes gunships fighter carrier apartment schedule flights observer tanks planes La_Aurora[b] fly helicopters helicopter pole

Weapon: fragmentation sticks cartridge AK-47 M-16 carbines AR-15 movie clips knapsacks calibers TNT rifles cartridges theater 9-mm 40,000 quantities grenades machineguns dynamite kg ammunition revolvers FAL rifle clothing boots materials submachineguns M-60 pistols pistol M-79 quantity assault powder fuse grenade caliber squad mortars explosives gun 2,000

[a] Limon-Covenas refers to an oil pipeline.
[b] La_Aurora refers to an airport.

Figure 2: The top-ranked words for each category

CRITERIA: On a scale of 0 to 5, rate each word's strength of association with the given category using the following criteria. We'll use the category ANIMAL as an example.

5: CORE MEMBER OF THE CATEGORY: If a word is clearly a member of the category, then it deserves a 5. For example, dogs and sparrows are members of the ANIMAL category.

4: SUBPART OF MEMBER OF THE CATEGORY: If a word refers to a part of something that is a member of the category, then it deserves a 4. For example, feathers and tails are parts of ANIMALS.

3: STRONGLY ASSOCIATED WITH THE CATEGORY: If a word refers to something that is strongly associated with members of the category, but is not actually a member of the category itself, then it deserves a 3. For example, zoos and nests are strongly associated with ANIMALS.

2: WEAKLY ASSOCIATED WITH THE CATEGORY: If a word refers to something that can be associated with members of the category, but is also associated with many other types of things, then it deserves a 2. For example, bowls and parks are weakly associated with ANIMALS.

1: NO ASSOCIATION WITH THE CATEGORY: If a word has virtually no association with the category, then it deserves a 1. For example, tables and moons have virtually no association with ANIMALS.

0: UNKNOWN WORD: If you do not know what a word means, then it should be labeled with a 0.

IMPORTANT! Many words have several distinct meanings. For example, the word "horse" can refer to an animal, a piece of gymnastics equipment, or it can mean to fool around (e.g., "Don't horse around!"). If a word has ANY meaning associated with the given category, then only consider that meaning when assigning numbers. For example, the word "horse" would be a 5 because one of its meanings refers to an ANIMAL.

Figure 3: Instructions to human judges

We asked the judges to rate the words on a scale from 1 to 5 because different degrees of category membership might be acceptable for different applications. Some applications might require strict category membership, for example only words like gun, rifle, and bomb should be labeled as weapons. But from a practical perspective, subparts of category members might also be acceptable. For example, if a cartridge or trigger is mentioned in the context of an event, then one can infer that a gun was used. And for some applications, any word that is strongly associated with a category might be useful to include in the semantic lexicon. For example, words like ammunition or bullets are highly suggestive of a weapon. In the UMass/MUC-4 information extraction system (Lehnert et al., 1992), the words ammunition and bullets were defined as weapons, mainly for the purpose of selectional restrictions.

The human judges estimated that it took them approximately 10-15 minutes, on average, to judge the 200 words for each category. Since the instructions allowed the users to assign a zero to a word if they did not know what it meant, we manually removed the zeros and assigned ratings that we thought were appropriate. We considered ignoring the zeros, but some of the categories would have been severely impacted. For example, many of the legitimate weapons (e.g., M-16 and AR-15) were not known to the judges. Fortunately, most of the unknown words were proper nouns with relatively unambiguous meanings, so we do not believe that this process compromised the integrity of the experiment.

Finally, we graphed the results from the human judges. We counted the number of words judged as 5's by either judge, the number of words judged as 5's or 4's by either judge, the number of words judged as 5's, 4's, or 3's by either judge, and the number of words judged as either 5's, 4's, 3's, or 2's. We plotted the results after each 20 words, stepping down the ranked list, to see whether the words near the top of the list were more highly associated with the category than words farther down. We also wanted to see whether the number of category words leveled off or whether it continued to grow. The results from this experiment are shown in Figures 4-8.

Figure 4: Energy Results

Figure 5: Financial Results

With the exception of the Energy category, we were able to find 25-45 words that were judged as 4's or 5's for each category. This was our strictest test because only true category members (or subparts of true category members) earned this rating. Although this might not seem like a lot of category words, 25-45 words is enough to produce a reasonable core semantic lexicon. For example, the words judged as 5's for each category are shown in Figure 9.

Figure 9 illustrates an important benefit of the corpus-based approach. By sifting through a large text corpus, the algorithm can find many relevant category words that a user would probably not enter in a semantic lexicon on their own. For example, suppose a user wanted to build a dictionary of Vehicle words. Most people would probably define words such as car, truck, plane, and automobile. But it is doubtful that most people would think of words like gunships, fighter, carrier, and ambulances. The corpus-based algorithm is especially good at identifying words that are common in the text corpus even though they might not be commonly used in general. As another example, specific types of weapons (e.g., M-16, AR-15, M-60, or M-79) might not even be known to most users, but they are abundant in the MUC-4 corpus.

If we consider all the words rated as 3's, 4's, or 5's, then we were able to find about 50-65 words for every category except Energy. Many of these words would be useful in a semantic dictionary for the category. For example, some of the words rated as 3's for the Vehicle category include: flight, flights, aviation, pilot, airport, and highways.
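The acquisition curves in Figures 4-8 amount to simple cumulative bookkeeping over the judged ratings. The sketch below shows one way to compute a single curve; the ratings dictionary and the "either judge" merge are assumptions about the data layout, while the 20-word step and the 200-word cutoff come from the description above.

    def acquisition_curve(ranked_words, ratings_by_judge, min_rating, step=20):
        # ratings_by_judge: {word: (rating_judge1, rating_judge2)} on the 1-5 scale.
        # A word counts toward the curve if EITHER judge rated it >= min_rating,
        # e.g. min_rating=4 gives the "4's or 5's" curve.
        counts = []
        found = 0
        for i, word in enumerate(ranked_words[:200], start=1):
            if max(ratings_by_judge.get(word, (0, 0))) >= min_rating:
                found += 1
            if i % step == 0:
                counts.append(found)   # cumulative count after each 20 words
        return counts

    # e.g. energy_4s_and_5s = acquisition_curve(energy_top200, energy_ratings, 4)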

Most of the words rated as 2's are not specific to the target category, but some of them might be useful for certain tasks. For example, some words judged as 2's for the Energy category are: spill, pole, tower, and fields. These words may appear in many different contexts, but in texts about Energy topics these words are likely to be relevant and probably should be defined in the dictionary. Therefore we expect that a user would likely keep some of these words in the semantic lexicon but would probably be very selective.

Figure 6: Military Results

Figure 7: Vehicle Results

Figure 8: Weapon Results

Finally, the graphs show that most of the acquisition curves displayed positive slopes even at the end of the 200 words. This implies that more category words would likely have been found if the users had reviewed more than 200 words. The one exception, again, was the Energy category, which we will discuss in the next section. The size of the ranked lists ranged from 442 for the financial category to 919 for the military category, so it would be interesting to know how many category members would have been found if we had given the entire lists to our judges.

4 Selecting Categories and Seed Words

When we first began this work, we were unsure about what types of categories would be amenable to this approach. So we experimented with a number of different categories. Fortunately, most of them worked fairly well, but some of them did not. We do not claim to understand exactly what types of categories will work well and which ones will not, but our early experiences did shed some light on the strengths and weaknesses of this approach.

In addition to the previous five categories, we also experimented with categories for Location, Commercial, and Person. The Location category performed very well using seed words such as city, town, and province. We didn't formally evaluate this category because most of the category words were proper nouns and we did not expect that our judges would know what they were. But it is worth noting that this category achieved good results, presumably because location names often cluster together in appositives, conjunctions, and nominal compounds.

For the Commercial category, we chose seed words such as store, shop, and market. Only a few new commercial words were identified, such as hotel and restaurant. In retrospect, we realized that there were probably few words in the MUC-4 corpus that referred to commercial establishments. (The MUC-4 corpus mainly contains reports of terrorist and military events.) The relatively poor performance of the Energy category was probably due to the same problem. If a category is not well-represented in the corpus then it is doomed because inappropriate words become seed words in the early iterations and quickly derail the feedback loop.

The Person category produced mixed results. Some good category words were found, such as rebel, advisers, criminal, and citizen. But many of the words referred to Organizations (e.g., FMLN), groups (e.g., forces), and actions (e.g., attacks). Some of these words seemed reasonable, but it was hard to draw a line between specific references to people and concepts like organizations and groups that may or may not consist entirely of people. The large proportion of action words also diluted the list. More experiments are needed to better understand whether this category is inherently difficult or whether a more carefully chosen set of seed words would improve performance.

More experiments are also needed to evaluate different seed word lists. The algorithm is clearly sensitive to the initial seed words, but the degree of sensitivity is unknown. For the five categories reported in this paper, we arbitrarily chose a few words that were central members of the category. Our initial seed words worked well enough that we did not experiment with them very much. But we did perform a few experiments varying the number of seed words. In general, we found that additional seed words tend to improve performance, but the results were not substantially different using five seed words or using ten. Of course, there is also a law of diminishing returns: using a seed word list containing 60 category words is almost like creating a semantic lexicon for the category by hand!

Energy: oil electric gas fuel power gasoline electricity petroleum energy CEL

Financial: monetary fund gold bank invested banks investment invest dollar currency money economies loans billion debts millions IMF commerce wealth inflation million market funds dollars debt

Military: infantry brigade regiment brigades division ranks deserter troops commander corporal GN Navy Bracamonte soldier units patrols cavalry detachment officer patrol garrisons army paratroopers Atonal garrison battalion unit militias lieutenant

Vehicle: C-47 A-37 tank pickup Cessna aircraft Boeing_727 airplane plane truck airplanes gunships fighter carrier tanks planes La_Aurora helicopters helicopter automobile jeep car boats trucks motorcycles ambulances train buses ships cars bus ship vehicle vehicles

Weapon: AK-47 M-16 carbines AR-15 TNT rifles 9-mm grenades machineguns dynamite revolvers rifle submachineguns M-60 pistols pistol M-79 grenade mortars gun mortar submachinegun cannon RPG-7 firearms guns bomb machinegun weapons car_bombs car_bomb artillery tanks arms

Figure 9: Words judged as 5's for each category

5 Conclusions

Building semantic lexicons will always be a subjective process, and the quality of a semantic lexicon is highly dependent on the task for which it will be used. But there is no question that semantic knowledge is essential for many problems in natural language processing. Most of the time semantic knowledge is defined manually for the target application, but several techniques have been developed for generating semantic knowledge automatically. Some systems learn the meanings of unknown words using expectations derived from other word definitions in the surrounding context (e.g., (Granger, 1977; Carbonell, 1979; Jacobs and Zernik, 1988; Hastings and Lytinen, 1994)). Other approaches use example or case-based methods to match unknown word contexts against previously seen word contexts (e.g., (Berwick, 1989; Cardie, 1993)). Our task orientation is a bit different because we are trying to construct a semantic lexicon for a target category, instead of classifying unknown or polysemous words in context.

To our knowledge, our system is the first one aimed at building semantic lexicons from raw text without using any additional semantic knowledge. The only lexical knowledge used by our parser is a part-of-speech dictionary for syntactic processing. Although we used a hand-crafted part-of-speech dictionary for these experiments, statistical and corpus-based taggers are readily available (e.g., (Brill, 1994; Church, 1989; Weischedel et al., 1993)).

Our corpus-based approach is designed to support fast semantic lexicon construction. A user only needs to supply a representative text corpus and a small set of seed words for each target category. Our experiments suggest that a core semantic lexicon can be built for each category with only 10-15 minutes of human interaction. While more work needs to be done to refine this procedure and characterize the types of categories it can handle, we believe that this is a promising approach for corpus-based semantic knowledge acquisition.

6 Acknowledgments

This research was funded by NSF grant IRI-9509820 and the University of Utah Research Committee. We would like to thank David Bean, Jeff Lorenzen, and Kiri Wagstaff for their help in judging our category lists.

References

Berwick, Robert C. 1989. Learning Word Meanings from Examples. In Semantic Structures: Advances in Natural Language Processing. Lawrence Erlbaum Associates, chapter 3, pages 89-124.

Brill, E. 1994. Some Advances in Rule-based Part of Speech Tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 722-727. AAAI Press/The MIT Press.

Carbonell, J. G. 1979. Towards a Self-Extending Parser. In Proceedings of the 17th Annual Meeting of the Association for Computational Linguistics, pages 3-7.

Cardie, C. 1993. A Case-Based Approach to Knowledge Acquisition for Domain-Specific Sentence Analysis. In Proceedings of the Eleventh National Conference on Artificial Intelligence, pages 798-803. AAAI Press/The MIT Press.

Church, K. 1989. A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text. In Proceedings of the Second Conference on Applied Natural Language Processing.

Granger, R. H. 1977. FOUL-UP: A Program that Figures Out Meanings of Words from Context. In Proceedings of the Fifth International Joint Conference on Artificial Intelligence, pages 172-178.

Hastings, P. and S. Lytinen. 1994. The Ups and Downs of Lexical Acquisition. In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 754-759. AAAI Press/The MIT Press.

Jacobs, P. and U. Zernik. 1988. Acquiring Lexical Knowledge from Text: A Case Study. In Proceedings of the Seventh National Conference on Artificial Intelligence, pages 739-744.

Lehnert, W., C. Cardie, D. Fisher, J. McCarthy, E. Riloff, and S. Soderland. 1992. University of Massachusetts: Description of the CIRCUS System as Used for MUC-4. In Proceedings of the Fourth Message Understanding Conference (MUC-4), pages 282-288, San Mateo, CA. Morgan Kaufmann.

Lenat, D. B., M. Prakash, and M. Shepherd. 1986. CYC: Using Common Sense Knowledge to Overcome Brittleness and Knowledge-Acquisition Bottlenecks. AI Magazine, 6:65-85.

Miller, G. 1990. WordNet: An On-line Lexical Database. International Journal of Lexicography, 3(4).

MUC-4 Proceedings. 1992. Proceedings of the Fourth Message Understanding Conference (MUC-4). Morgan Kaufmann, San Mateo, CA.

Weischedel, R., M. Meteer, R. Schwartz, L. Ramshaw, and J. Palmucci. 1993. Coping with Ambiguity and Unknown Words through Probabilistic Models. Computational Linguistics, 19(2):359-382.

Yarowsky, D. 1992. Word sense disambiguation using statistical models of Roget's categories trained on large corpora. In Proceedings of the Fourteenth International Conference on Computational Linguistics (COLING-92), pages 454-460.
