Learning Noun-Modifier Semantic Relations with Corpus-based and WordNet-based Features

Vivi Nastase, Jelber Sayyad-Shirabad
School of Information Technology and Engineering, University of Ottawa
800 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada
{vnastase,jsayyad}@site.uottawa.ca

Marina Sokolova∗
Département d'informatique et de recherche opérationnelle, Université de Montréal
2920, chemin de la Tour, Montréal, Québec H3T 1J4, Canada
[email protected]

Stan Szpakowicz
School of Information Technology and Engineering, University of Ottawa
800 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada
Institute of Computer Science, Polish Academy of Sciences
Ordona 21, 01-237 Warszawa, Poland
[email protected]

∗This work was done at the University of Ottawa.

Abstract

We study the performance of two representations of word meaning in learning noun-modifier semantic relations. One representation is based on lexical resources, in particular WordNet, the other on a corpus. We experimented with decision trees, instance-based learning and Support Vector Machines. All these methods work well in this learning task. We report high precision, recall and F-score, and small variation in performance across several 10-fold cross-validation runs. The corpus-based method has the advantage of working with data without word-sense annotations and performs well over the baseline. The WordNet-based method, requiring word-sense annotated data, has higher precision.

Introduction

In understanding a text, it is essential to recognize relations among occurrences¹, entities and their attributes, represented at the surface as verbs, nouns and their modifiers. Semantic relations describe interactions between a noun and its modifiers (noun-modifier relations), a verb and its arguments (case relations/semantic roles), and two clauses. In the past few years we have seen the Natural Language Processing (NLP) community's renewed interest in analyzing semantic relations, especially between verbs and their arguments (Baker, Fillmore, & Lowe 1998), (Kipper, Dang, & Palmer 2000), and between nouns and their modifiers (Rosario & Hearst 2001). Semantic role labelling competitions² also seem to increase the attractiveness of this topic.

In this paper we consider a specific supervised learning task: assign semantic relations to noun-modifier pairs in base noun phrases (base NPs), composed only of a noun and its modifier. To identify such noun-modifier relations we can rely only on semantic and morphological information about the words themselves. For example, in the base NPs iron gate, brick house, plastic container: iron, brick, plastic are substances, and gate, house, container are artifacts; this suggests a MATERIAL relation in these pairs. On the other hand, analyzing case relations or clause-level relations is assisted by prepositions, subordinators, coordinators and maybe more elaborate syntactic structures.

We experiment with two methods of representing the words in a base NP, to be used in Machine Learning (ML) experiments for learning semantic relations between nouns and their modifiers. One method is based on features extracted from WordNet, which was designed to capture, describe, and relate word senses. In brief, we use hypernyms to describe in more and more general terms the sense of a word in a pair. The other method is based on contextual information extracted from corpora. Contexts are useful for determining word senses, as reflected in research on word-sense disambiguation using a window of words surrounding the target word (starting with (Yarowsky 1995), and more recently (Purandare & Pedersen 2004)). We use grammatical collocations³ (as opposed to proximity-based co-occurrences) extracted from the British National Corpus to describe each word in a pair. We compare the learning results produced by using these two types of word-sense descriptions with the results obtained by (Turney & Littman 2003) and (Turney 2005), who used a paraphrase method to describe the pair as a whole.

The data we work with consist of noun-modifier pairs labelled with 30 fine-grained semantic relations, grouped into five relation classes. Experiments presented in this paper are based on the five-class coarse-grained grouping.

The corpus-based method gives precision, recall and F-scores well above the baseline (discussed in the "Experimental setup and results" section), and it works on data without word-sense annotations. The method based on WordNet gives results with higher precision, but requires word-sense annotated data.

The paper continues with related work on learning noun-modifier relations, a description of the data and the representations used, and a presentation of the experimental setup and the results, and concludes with a discussion.

Related Work

We focus on methods that analyze and learn semantic relations in noun phrases.

Levi (1978) analyzes the formation of nominal compounds. She identifies 9 recoverable deletable predicates (RDPs) – be, cause, have, make, use, about, for, from, in – which, when erased from a more complex expression, generate a noun phrase. Levi writes that relations expressed by the RDPs may be universal, because from a semantic point of view they appear to be quite primitive.

Copyright © 2006, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
¹Occurrence encompasses all types of events, actions, activities, processes, states and accomplishments (Allen 1984).
²CONLL 2004, 2005 – Semantic Role Labelling shared task, http://www.lsi.upc.edu/~srlconll/.
³Grammatical collocations are collocates that appear with the target word in a grammatical relation, such as subject, object, prepositional complement (Kilgarriff et al. 2004).

Different semantic relations may be associated with each RDP. For example: cause – causes/is caused by; have – possession/possessor; make – physically producing/material.

Berland & Charniak (1999) and Hearst (1992) work with specific relations, part of and type of respectively. The Automatic Content Extraction project is a research program in information extraction that focuses on detecting specific relations (such as employer-organization, agent-artifact) between seven types of entities (such as person, organization, facility) in texts (Zhao & Grishman 2005). Several types of information – lexical, grammatical and contextual – are combined using kernel methods.

Vanderwende (1994) uses hand-crafted rules and a dictionary built from texts to find clues about the semantic relations in which a word may be involved. This was tested with 97 pairs extracted from a corpus, with an accuracy of 52%.

Several systems use lexical resources (domain-specific, like MeSH, or general, like WordNet or Roget's Thesaurus) to find the appropriate level of generalization for the words in a pair, so that words linked by different relations are properly separated.

Rosario & Hearst (2001) learn noun-modifier semantic relations in a medical domain, using neural networks. The list of 13 relations is tailored to the application domain. Rosario, Hearst, & Fillmore (2002), continuing that research, look manually for rules which correctly classify noun compounds in the medical domain, based on the MeSH lexical hierarchy. The data are extracted automatically from biomedical journal articles, and sampled for manual analysis. MeSH is traversed top-down to find a level at which the noun compounds in different relations are properly separated.

Lauer (1995) maps words in noun compounds onto categories in Roget's Thesaurus, in order to find probabilities of occurrence of certain noun compounds and their paraphrases. There is no automatic process to find the best level of generalization. Nastase & Szpakowicz (2003) use the hypernym/hyponym structure of WordNet, and Roget's Thesaurus, to automatically find the generalization level in these resources that best describes each semantic relation. Several machine learning methods are used in analyzing 30 semantic relations. Girju et al. (2005) also use WordNet and the generalization/specialization of word senses, in the task of noun compound interpretation. Barker & Szpakowicz (1998) use a memory-based process to assign semantic relations to a new noun phrase, based on previously stored examples. The distance metric employs identity of one or both of the words, and the connective between them (usually a preposition).

Turney & Littman (2003) and Turney (2005) use paraphrases as features to analyze noun-modifier relations. Paraphrases express more overtly the semantic relation between a noun and its modifier. The hypothesis, corroborated by the reported experiments, is that pairs which share the same paraphrases belong to the same semantic relation.

Turney & Littman (2003) use a set of 64 joining terms which may appear between the two words in a noun phrase (in the, at the, because, such that, ...). For each head noun-modifier (H-M) pair in the dataset, and for each joining term J, a query to Alta Vista gave the frequency of the phrases HJM and MJH. The 128 frequency counts were grouped together with the associated semantic relation in a vector that described each noun-modifier pair, and then an ML experiment identified the joining terms that indicate a particular semantic relation, using a 2-nearest-neighbour algorithm. (A sketch of this feature construction appears after the relation list below.)

This has been generalized in Turney (2005) using what he calls Latent Relational Analysis (LRA). For each word in a dataset of pairs, Lin's thesaurus (Lin 1998) gives a set of possible synonyms. All original pairs and pairs generated from synonyms are used to mine a corpus for paraphrases. All paraphrases are gathered and a few thousand of the most frequent ones are selected. The selected paraphrases, the original word pairs and the synonym pairs are used to build an incidence matrix, whose dimensionality is reduced using singular value decomposition (Landauer & Dumais 1997). Similarity between pairs combines scores for similarity between the original word pair and pairs built using synonyms.

Because we use the same data as Turney & Littman (2003) and Turney (2005), we compare the results of learning noun-modifier relations using WordNet-based and corpus-based representations with the results obtained using paraphrase-based information.

Data and Representations

Lists of semantic relations in use range from general, as the lists in PropBank (Palmer, Gildea, & Kingsbury 2005) and NomBank (Myers et al. 2004), to more and more specific, as in VerbNet (Kipper, Dang, & Palmer 2000) and FrameNet (Baker, Fillmore, & Lowe 1998), to domain-specific (Rosario & Hearst 2001). The data we use consist of 600 noun-modifier pairs, tagged with 30 semantic relations, grouped into 5 classes of relations by general similarity (Barker 1998), (Nastase & Szpakowicz 2003), (Turney & Littman 2003):

1. CAUSAL groups relations enabling or opposing an occurrence. Examples (H denotes the head of a base NP, M denotes the modifier):
   cause – H causes M: flu virus;
   effect – H is the effect of (was caused by) M: exam anxiety;
   purpose – H is for M: concert hall;
2. PARTICIPANT groups relations between an occurrence and its participants or circumstances. Examples:
   agent – M performs H: student protest;
   object – M is acted upon by H: metal separator;
   beneficiary – M benefits from H: student discount;
3. SPATIAL groups relations that place an occurrence at an absolute or relative point in space. Examples:
   direction – H is directed towards M: outgoing mail;
   location – H is the location of M: home town;
   location at – H is located at M: desert storm;
4. TEMPORAL groups relations that place an occurrence at an absolute or relative point in time. Examples:
   frequency – H occurs every time M occurs: weekly game;
   time at – H occurs when M occurs: morning coffee;
   time through – H existed while M existed: 2-hour trip;
5. QUALITY groups the remaining relations between a verb or noun and its arguments. Examples:
   manner – H occurs as indicated by M: stylish writing;
   material – H is made of M: brick house;
   measure – M is a measure of H: heavy rock.
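The following is a minimal sketch of the joining-term feature construction summarized above. It is an illustration, not the original implementation: Turney & Littman obtained their counts from Alta Vista queries, while here a plain substring count over a corpus string stands in for the hit counts, and only a few of the 64 joining terms are listed.

```python
# Sketch of Turney & Littman's (2003) joining-term features (assumption:
# a corpus string stands in for the Alta Vista hit counts they used).
JOINING_TERMS = ["in the", "at the", "because", "such that"]  # 64 terms in all

def phrase_frequency(phrase, corpus_text):
    """Stand-in frequency source: count exact occurrences of the phrase."""
    return corpus_text.count(phrase)

def joining_term_vector(head, modifier, corpus_text):
    # Two counts per joining term J: freq("H J M") and freq("M J H"),
    # which yields 2 * 64 = 128 features with the full term list.
    vec = []
    for j in JOINING_TERMS:
        vec.append(phrase_frequency(f"{head} {j} {modifier}", corpus_text))
        vec.append(phrase_frequency(f"{modifier} {j} {head}", corpus_text))
    return vec
```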

The words in the pairs from the dataset are also annotated with part of speech and WordNet 1.6 word senses.

We describe two methods of representing these data. They are evaluated in learning experiments. One representation is based on word hypernym information extracted from WordNet. The second representation relies on grammatical collocation information extracted from a corpus.

WordNet-based representation

WordNet was designed to capture and describe word senses, and to inter-connect them through a variety of lexical and semantic relations. We make use of the hypernym/hyponym links to represent each head word and modifier in a pair through their hypernyms (ancestors) in WordNet.

WordNet's hypernym/hyponym structure is not uniform. Some domains are presented in greater detail, with a finer distinction in the hierarchy. Below a certain level, however, regardless of the domain represented, the synsets become quite specific and rather technical, and are not helpful in generalization. We must find a representation for the words in the pairs that strikes a balance between WordNet's generality and specificity extremes. The maximum depth in WordNet reached by words in our data is 14. In earlier research using this dataset (Nastase & Szpakowicz 2003) we observed that rule-based classifiers pick synsets at levels above 7. We therefore choose level 7 as the cut-off point. This serves as a form of feature selection, which provides more general features to the memory- and kernel-based systems, and enough generalization levels for the decision tree to find the ones that work best for the classes we learn. This choice is supported by high precision, recall and F-measure scores, reported in the "Experimental setup and results" section.

We use a binary feature representation. To represent a word, using the word-sense information in the data, we extract all ancestors located at the cut-off level and higher for the corresponding word sense. This produces 959 features to represent the head nouns, and 913 features for the modifiers, to which we add the part of speech. Each noun-modifier pair in the dataset is represented as a vector:

< sm1, ..., sm913, posm, sh1, ..., sh959, posh, relation >

shx and smx can be 1 or 0: the synset either does or does not appear as an ancestor of the head or the modifier, respectively. This representation naturally addresses the problem of multiple inheritance in WordNet, since we can represent any number of ancestors of a node, just by setting the corresponding elements to 1.

We attempt to connect adjective and adverb modifiers to the noun hierarchy using pertains to and derived from links. If this is possible, the representation of such a word will consist (mostly) of the representation of the noun synset to which it was linked. If such a connection cannot be made, the representation will be less informative, because the adjective and adverb synsets are not organized in a hierarchy as complex as the nouns'.

We also perform experiments using information from WordNet when word-sense information is ignored. In this case, a word's representation contains the ancestors at and above the cut-off level for all its possible senses. The purpose of these experiments is to measure the impact of knowing the senses of the words in a pair on determining the semantic relation between them. The representation is similar to the one described above. The length of the vector representing a pair increases: there are now 1918 (hypernym) features to represent the head nouns, and 1741 for the modifiers.

Corpus-based representation using grammatical collocations

Contexts provide strong and consistent clues to the sense of a word (Yarowsky 1993). If a corpus captures a large sample of language use, it allows us to describe the senses of a word through collocated words. Suppose that noun N, denoting entity E, is the subject of a sentence. Verbs that co-occur with N characterize occurrences in which E can participate; for example, a child can grow, eat, sleep, play, cry, laugh, .... Adjectives that modify N tell us about E's attributes; for example, a child can be good, happy, sad, small, tall, chubby, playful, and so on.

We test such a context-based word-sense description on the task of learning noun-modifier relations. The Word Sketch Engine (WSE) (Kilgarriff et al. 2004) gives us collocation information organized by grammatical relations. It runs on a corpus – in our case, the British National Corpus – and extracts, for a given word, information based on and organized by grammatical categories. Thus, for a noun the engine builds a list containing: verbs with which the noun appears as a subject, verbs with which it appears as an object, the prepositional phrases attached to it (grouped by prepositions), the head nouns it modifies, its adjectival and nominal modifiers, and so on. The advantage of having such a resource is that it eliminates most, if not all, of the noise that we would encounter had we used a simple proximity-based process to gather co-occurrences – n-grams that are not proper phrases, are not connected to the words we consider, or do not span the entire phrase. Figure 1 shows a partial word sketch for the noun cloud⁴.

cloud-n
and/or   | object of | subject of | a modifier | pp of-p | ...
mist-n   | watch-v   | scud-v     | dark-j     | smoke-n |
rain-n   | swirl-v   | drift-v    | scudding-j | dust-n  |
sky-n    | billow-v  | gather-v   | black-j    | steam-n |
cloud-n  | form-v    | hang-v     | grey-j     | ash-n   |
...      | ...       | ...        | ...        | ...     |

Figure 1: Sample (partial) word sketch for the noun cloud produced by the Word Sketch Engine

We produce a word sketch for each word in each noun-modifier pair in the data. From each word sketch we obtain a list of strings by concatenating each grammatical relation Gi with each word in this relation. For example, for the noun cloud, we will generate the list {and/or mist, and/or rain, ..., object of watch, object of swirl, ...}. From the strings generated for all words that appear in our data, we filter out the least frequent ones to obtain a binary feature set to represent each word. The corresponding value for a feature Gi wk will be 1 for word w if w appears in grammatical relation Gi with wk in the corpus.

This feature construction and selection process produces a vector of 4969 grammatical relation–word strings. The final set of features that represents the noun-modifier pair has 2 * 4969 features. We have chosen a binary representation rather than a representation which includes frequency information, for two reasons: (i) the fact that two words appear together, connected by a grammatical relation, indicates that they are related; (ii) the number of co-occurrences is corpus-specific, and frequency counts from different corpora can lead to different results.

⁴To simplify, we have omitted frequency and other statistical information that the WSE produces.
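To make the two representations concrete, here is a minimal sketch of both feature constructions. It is an illustration under stated assumptions, not the original code: it uses NLTK's WordNet interface (a later WordNet version than the 1.6 senses in the dataset), counts the root of a hypernym path as level 1 with the cut-off at level 7, and takes pre-extracted (relation, collocate) pairs in place of real Word Sketch Engine output.

```python
# Sketch of the two word representations described above (assumptions:
# NLTK's WordNet instead of WordNet 1.6; path root counted as level 1).
from nltk.corpus import wordnet as wn

CUTOFF = 7  # keep ancestors at levels 1..7 of each hypernym path

def hypernym_features(word, sense_index, pos=wn.NOUN):
    """WordNet-based features: all ancestors at or above the cut-off level
    for one annotated word sense."""
    synsets = wn.synsets(word, pos=pos)
    if sense_index >= len(synsets):
        return set()
    synset = synsets[sense_index]
    ancestors = set()
    for path in synset.hypernym_paths():      # each path runs root -> synset
        ancestors.update(s.name() for s in path[:CUTOFF])
    return ancestors

def sketch_features(collocations):
    """Corpus-based features: 'grammatical relation + collocate' strings,
    e.g. [('and/or', 'mist'), ('object_of', 'watch')] for cloud."""
    return {f"{rel} {collocate}" for rel, collocate in collocations}

def binary_vector(active_features, feature_list):
    """Map a word's feature set onto a fixed feature list as 0/1 values."""
    return [1 if f in active_features else 0 for f in feature_list]
```

A pair's vector then concatenates the modifier's and the head's binary vectors (plus the parts of speech and the relation label, in the WordNet-based case); for the all-senses variant, the union of ancestor sets over every synset of the word would be taken instead.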

Rel. class  | Examples    | P_TL | R_TL | F_TL  | P_LRA | R_LRA | F_LRA | P_WN  | R_WN  | F_WN  | P_WNas | R_WNas | F_WNas | P_WS  | R_WS  | F_WS
causal      | 86 (14.3%)  | 21.2 | 24.4 | 22.68 | 38.8  | 38.4  | 38.59 | 69.56 | 18.6  | 29.35 | 52.63  | 11.76  | 19.23  | 17.37 | 67.05 | 27.60
participant | 260 (43.3%) | 55.3 | 51.9 | 53.54 | 66    | 67.3  | 66.64 | 52.16 | 88.41 | 65.61 | 47.16  | 92.46  | 62.46  | 59.01 | 28.57 | 38.5
quality     | 146 (24.3%) | 45.4 | 47.3 | 46.33 | 54.2  | 57.5  | 55.80 | 54.94 | 34.48 | 42.37 | 50     | 20.42  | 29     | 46.42 | 27.46 | 34.51
spatial     | 56 (9.3%)   | 29.1 | 28.6 | 28.84 | 43.1  | 39.3  | 41.11 | 85.71 | 10.71 | 19.04 | 42.85  | 5.55   | 9.83   | 21.42 | 5.55  | 8.82
temporal    | 52 (8.7%)   | 66   | 63.5 | 64.72 | 77.3  | 65.4  | 70.85 | 89.47 | 65.38 | 75.55 | 80     | 8      | 14.54  | 88.57 | 62    | 72.94

Table 1: Precision, recall and F-score for memory-based learning experiments (2 nearest neighbours)
In these two cases, semantic relation as- based ones, Table 1 includes the results obtained using signment is simply based on known associations (present in WordNet with word sense information (PWN ,RWN ,FWN ) the training data) between a word (head noun or modifier) and without (PW Nas,RW Nas,FW Nas – “WN all senses”), and specific semantic relations. and Word Sketches ( PWS ,RWS,FWS ) in methodologi- The precision, recall and equally-weighted F-score results cally similar experiments – using an instance-based learner for each representation are averages of five individual 10- (TiMBL v. 5.1.0 (Daelemans et al. 2004)), with 2 nearest fold cross-validation results, plus-minus the standard devia- neighbour and leave-one-out testing. tion for each average. Because of class imbalance (averag- In terms of recall and F-score, the paraphrase-based rep- ing 1:5 for the five-class problem), accuracy is not as infor- resentation which uses latent relational analysis (LRA) per- mative as precision, recall and F-score, and is not reported. forms better than all the other representations. On the other The data representation is very high-dimensional, as it of- hand, it performs more poorly in terms of precision in 4 of 5 ten happensin NLP problems. Not all featureshave the same relation classes, with large differences in 3 of those 4 cases effect on the learning of noun-modifier relations. Using fea- (30.76% for CAUSAL, 42.61% for SPATIAL, and 12.17% for ture weighting schemes in TiMBL and C5.0’s built-in fea- TEMPORAL) compared to WordNet with word-sense infor- ture selection gives better learning results than SVMlight in mation. terms of F-score. The experiments that follow use a different methodology. A low standard deviation indicates that the performancein Several 10-fold cross-validation runs verify that the learn- the real world will be close to the estimated average. A com- ers have a consistent performance on different random data bination of high precision and recall values and low standard splits. The results of these experiments are not directly com- deviation shows classes of relations that are learned well in parable with the ones in Table 1, because they are produced a particular experimental configuration. This is the case for with different training-testing methods. the TEMPORAL class learned using C5.0 or TiMBL with We apply memory-based learning (TiMBL v. 5.1.0 both the WordNet-based and the Word Sketch-based repre- (Daelemans et al. 2004)), decision tree induction (C5.0 sentations. In situations where the standard deviation is v.1.16 (Quinlan)) and Support Vector Machine (SVMlight v. high, we cannot make confident predictionsabout future per-

WordNet-based representation, with word-sense information
Rel. class  | Baseline F | TiMBL P / R / F-score                  | C5.0 P / R / F-score                   | SVMlight P / R / F-score
CAUSAL      | 25.43      | 52.96±3.98 / 23.8±2.62 / 31.14±2.96    | 56.82±10 / 14.33±2.13 / 21.8±3.25      | 68.66±9.68 / 17.79±0.85 / 27.48±1.30
PARTICIPANT | 60.35      | 68.86±1.78 / 48.22±1.58 / 56.22±1.65   | 71.65±1.75 / 31.56±1.40 / 43.05±1.51   | 69.06±1.79 / 50.31±1.56 / 57.68±1.44
QUALITY     | 39.16      | 64.71±1.58 / 30.24±1.19 / 40±1.23      | 68.14±5.11 / 30.02±1.48 / 39.72±1.98   | 66.28±2.49 / 24.14±2.23 / 34.57±2.19
SPATIAL     | 16.95      | 74.73±6.18 / 37.86±2.39 / 47.86±2.44   | 67.43±17.03 / 28.73±5.28 / 37.72±6.77  | 66±8.13 / 25.52±1.51 / 35.73±1.71
TEMPORAL    | 15.78      | 92.63±1.33 / 76.4±3.44 / 82.47±2.55    | 94±0.45 / 79.2±1.6 / 84.73±0.92        | 77.46±8.49 / 52.15±1.31 / 60±1.44

WordNet-based representation, without word-sense information
CAUSAL      | 25.43      | 36.63±10.82 / 7.77±1.88 / 12.32±3.00   | 50.73±5.75 / 16.24±1.98 / 23.68±2.62   | 62.40±13.29 / 15.46±1.37 / 24.07±1.87
PARTICIPANT | 60.35      | 67.61±1.79 / 23.02±0.97 / 33.78±1.21   | 73.04±1.92 / 27.73±1.37 / 39.74±1.55   | 63.72±2.60 / 39.22±1.24 / 47.88±1.75
QUALITY     | 39.16      | 65.08±4.45 / 23.16±1.16 / 33.16±1.56   | 53.79±3.95 / 18.39±0.95 / 26.74±1.17   | 62.83±6.55 / 15±1.55 / 23.75±2.27
SPATIAL     | 16.95      | 46.67±13.82 / 16.4±3.06 / 23.18±4.75   | 54.93±5.09 / 18.73±1.58 / 26.59±2.50   | 34.66±10.5 / 11.37±1.82 / 16.91±3.30
TEMPORAL    | 15.78      | 97.10±0.55 / 57.6±4.07 / 70.38±3.72    | 83.03±2.81 / 34±2.52 / 45.57±2.29      | 61.66±4.11 / 27.15±2.77 / 37.14±3.00

Word Sketch-based representation
CAUSAL      | 25.43      | 32.07±3.48 / 19.79±3.12 / 23.58±3.32   | 27.96±3.74 / 18.5±1.34 / 21.12±1.30    | 32.06±12.56 / 7.83±3.41 / 12.43±5.32
PARTICIPANT | 60.35      | 59.48±1.25 / 51.16±1.47 / 54.43±0.98   | 54.02±1.19 / 53.01±1.95 / 53.18±1.30   | 71.12±1.56 / 42.23±0.64 / 52.47±0.65
QUALITY     | 39.16      | 53.52±3.69 / 39.28±1.28 / 44.67±1.67   | 43.43±1.05 / 44.33±2.54 / 43.08±1.31   | 56.53±4.05 / 19.81±0.97 / 28.67±1.42
SPATIAL     | 16.95      | 43±5.33 / 26.08±5.23 / 30.57±5.07      | 32.49±6.10 / 30.46±3.66 / 29.8±4.29    | 37±8.05 / 13.68±1.32 / 19.7±2.57
TEMPORAL    | 15.78      | 81.9±3.17 / 73.5±5.89 / 75.62±3.31     | 77.72±4.28 / 69.6±1.49 / 70.2±2.16     | 80.03±6.33 / 36.81±2.83 / 48.3±2.44

Table 2: Learning results for TiMBL, C5.0 and SVMlight with the WordNet-based and Word Sketch-based representations (mean ± standard deviation over the five cross-validation runs)
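The baseline values in Table 2 follow directly from the all-positive classification rule described above. A quick sanity check (our own illustration; small differences from the table likely stem from rounding or per-fold averaging):

```python
# Baseline F-score for the all-positive classifier: precision equals the
# positive-class proportion p, recall is 1, so F = 2p / (p + 1).
def baseline_f(p):
    return 2 * p / (p + 1)

# PARTICIPANT covers 260 of the 600 pairs:
print(round(100 * baseline_f(260 / 600), 2))  # 60.47, vs. 60.35 in Table 2
```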

[Figure 2: F-scores and standard deviation for learning with TiMBL. Bar chart per relation class (causal, participant, quality, spatial, temporal) with series Baseline, WordNet, WordNet all senses, Word Sketches; y-axis: F-score / standard deviation, 0-90.]

Figure 2 plots the F-scores and standard deviation for the word representations explored, when TiMBL was the learning method. We plot TiMBL's results because – according to the standard deviation – it was the most stable learner. We observe that the representation based on Word Sketches, which does not distinguish word senses, performs better than WordNet without word-sense information. It is also close to WordNet with sense information.

The advantage of using corpora is that no knowledge-intensive preprocessing is necessary, and the method does not rely on other lexical resources. The process may therefore be ported to other languages. In order to use paraphrases effectively, a larger corpus is needed so that sufficiently many paraphrases can be found. The same is true of building descriptions of word meaning based on grammatical collocations in a corpus: the larger the corpus, the higher the chances that we find the most informative collocations. Here are some collocation features picked by C5.0 during the learning phase: happen Modifier, occur during Modifier, wait until Modifier⁵ indicate a TEMPORAL relation; predict Head-noun, Head-noun and/or fear⁶ indicate a CAUSAL relation.

We observe the impact of having word-sense information when we compare the results of learning experiments with the WordNet-based representation with and without word-sense annotation. The difference in results is quite dramatic. The F-scores drop for all relation classes and all ML methods used. Moreover, the difference in results when using Word Sketches and when using non-annotated data – in favour of Word Sketches – indicates that when no word-sense information is available, corpus-based word descriptions are more informative and useful for the task of learning semantic relations. The interesting exceptions are the recall for the PARTICIPANT class in the 2-nearest-neighbour experiments – 92.46%, compared to the next best one of 88.41% – and the precision for the TEMPORAL class in cross-validation runs with TiMBL – 97.1%, compared to the 92.63% precision when word senses are used. The fact that an increase in precision is accompanied by a sharp drop in recall (from 76.4 to 57.6) means that the learner reduces the number of examples incorrectly assigned to the TEMPORAL class, but at the same time more TEMPORAL examples are assigned to an incorrect class. The effect of including all word hypernyms is that it introduces ambiguity between previously well separated words (when sense information was used), through shared hypernyms that do not pertain to the word sense in the pair. This causes more of the pairs to become ambiguous from the semantic relation point of view, and these will be misclassified. The pairs with stronger commonalities or non-ambiguous hypernyms will be fewer, but will be classified better. A reverse effect explains the increase in recall for PARTICIPANT, accompanied by a drop in precision (from 52.16% to 47.16%): more examples of the class are caught, but they are classified less correctly. PARTICIPANT contains the most instances, 43.22% of the dataset. Previously discriminating hypernyms will now cover a more heterogeneous mixture of instances.

Using WordNet with word-sense information gives very high results – 82.47% F-score – especially in terms of precision – 92.63%. This shows that there are indeed inherited characteristics of word senses which determine the semantic relations in which these words are involved. Here are some features chosen by the decision tree method: {clock time, time} and {measure, quantity, amount, quantum} for the modifier indicate a TEMPORAL relation; {ill health, unhealthiness, health problem} for the modifier indicates a CAUSAL relation; {causal agent, cause, causal agency} for the head indicates a PARTICIPANT relation. The fact that recall is lower may suggest that some word senses could not be connected, probably because what they share cannot be captured by the hypernym/hyponym relation. The word representation can be extended to make use of other relations in WordNet, such as meronym/holonym.

⁵object of happen-v, pp obj during-p occur-v, pp obj until-p wait-v
⁶object of predict-v, and/or fear-n

Conclusions and Future Work

We have compared different methods of representing data for learning to identify semantic relations between nouns and modifiers in base noun phrases.

Looking at the results obtained with the different representation methods, we can conclude that we can successfully detect the TEMPORAL relation between words by looking at any of the following: individual word senses as described by WordNet, word meaning as described by its contexts, or the prepositions or paraphrases that connect the words in the pair. For the other four relation classes, describing a word using sense-specific WordNet information allows for high precision in identifying the correct relation class, but in order to increase the number of relation instances recognized, corpus-based features help. When no word-sense information is available, corpus-based features will lead to better results than using all word senses in WordNet.

As we said previously, using the word-meaning representation methods described generates very high-dimensional data. While we do obtain results well above the baseline, it is quite likely that the ML tools are overwhelmed by the large number of attributes. We will experiment with different feature selection methods to find a small set of word-meaning descriptors that may produce even better results.

Because we use sets of features from different sources, which achieve high precision on different classes, we could use co-training to bootstrap the automatic tagging of a new set of pairs (Balcan, Blum, & Yang 2005). This would allow us to incrementally increase a starting (small) dataset with examples classified at high precision. Obtaining a larger dataset would help address the problem of data sparseness.

References

Allen, J. F. 1984. Towards a general theory of action and time. Artificial Intelligence 23(2):123–154.

Baker, C. F.; Fillmore, C. J.; and Lowe, J. B. 1998. The Berkeley FrameNet project. In Proc. of COLING-ACL 1998, 86–90.

Balcan, M.-F.; Blum, A.; and Yang, K. 2005. Co-training and expansion: Towards bridging theory and practice. In Saul, L. K.; Weiss, Y.; and Bottou, L., eds., Advances in Neural Information Processing Systems 17. Cambridge, MA: MIT Press. 89–96.

Barker, K., and Szpakowicz, S. 1998. Semi-automatic recognition of noun-modifier relationships. In Proc. of COLING-ACL '98, 96–102.

Barker, K. 1998. Semi-Automatic Recognition of Semantic Relationships in English Technical Texts. Ph.D. Dissertation, University of Ottawa, Department of Computer Science. http://www.cs.utexas.edu/users/kbarker/thesis.

Berland, M., and Charniak, E. 1999. Finding parts in very large corpora. In Proc. of ACL 1999, 57–67.

Daelemans, W.; Zavrel, J.; van der Sloot, K.; and van den Bosch, A. 2004. TiMBL: Tilburg Memory Based Learner, version 5.1. Reference Guide. ILK Technical Report 04-02. Available from http://ilk.uvt.nl/downloads/pub/papers/ilk0402.pdf.

Girju, R.; Moldovan, D.; Tatu, M.; and Antohe, D. 2005. On the semantics of noun compounds. Computer, Speech and Language 19(4):479–496.

Hearst, M. 1992. Automatic acquisition of hyponyms from large text corpora. In Proc. of CoLing 1992, 539–545.

Joachims, T. SVMlight. http://svmlight.joachims.org.

Kilgarriff, A.; Rychly, P.; Smrz, P.; and Tugwell, D. 2004. The Sketch Engine. In Proc. of EURALEX 2004, 105–116.

Kipper, K.; Dang, H. T.; and Palmer, M. 2000. Class-based construction of a verb lexicon. In Proc. of AAAI 2000.

Landauer, T. K., and Dumais, S. 1997. A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction and representation of knowledge. Psychological Review (104):211–240.

Lauer, M. 1995. Designing Statistical Language Learners: Experiments on Noun Compounds. Ph.D. Dissertation, Macquarie University, Australia.

Levi, J. 1978. The Syntax and Semantics of Complex Nominals. New York: Academic Press.

Lin, D. 1998. Automatic retrieval and clustering of similar words. In Proc. of COLING-ACL '98, 768–774.

Myers, A.; Reeves, R.; Macleod, C.; Szekely, R.; Zielinska, V.; Young, B.; and Grishman, R. 2004. The NomBank project: An interim report. In HLT-EACL Workshop: Frontiers in Corpus Annotation, 24–31.

Nastase, V., and Szpakowicz, S. 2003. Exploring noun-modifier semantic relations. In Proc. of IWCS 2003, 281–301.

Palmer, M.; Gildea, D.; and Kingsbury, P. 2005. The Proposition Bank: An annotated corpus of semantic roles. Computational Linguistics 31(1):71–106.

Purandare, A., and Pedersen, T. 2004. Word sense discrimination by clustering contexts in vector and similarity spaces. In Proc. of CONLL 2004, 41–48.

Quinlan, R. C5.0. http://www.rulequest.com.

Rosario, B., and Hearst, M. 2001. Classifying the semantic relations in noun-compounds via a domain specific hierarchy. In Proc. of EMNLP 2001, 82–90.

Rosario, B.; Hearst, M.; and Fillmore, C. 2002. The descent of hierarchy, and selection in relational semantics. In Proc. of ACL 2002, 417–424.

Turney, P., and Littman, M. 2003. Learning analogies and semantic relations. Technical Report ERB-1103 (NRC #46488), National Research Council, Institute for Information Technology.

Turney, P. 2005. Measuring semantic similarity by latent relational analysis. In Proc. of IJCAI 2005, 1136–1141.

Vanderwende, L. 1994. Algorithm for automatic interpretation of noun sequences. In Proc. of ACL 1994, 782–788.

Yarowsky, D. 1993. One sense per collocation. In ARPA Human Language Technology Workshop.

Yarowsky, D. 1995. Unsupervised word sense disambiguation rivaling supervised methods. In Proc. of ACL 1995, 189–196.

Zhao, S., and Grishman, R. 2005. Extracting relations with integrated information using kernel methods. In Proc. of ACL 2005, 419–426.
