
Computational Exploration of Noun Compounds in a Semantic Space Model

Akira Utsumi
Department of Systems Engineering, The University of Electro-Communications
1-5-1 Chofugaoka, Chofu-shi, Tokyo 182-8585, Japan
[email protected]

Abstract

This study examines the ability of a semantic space model to represent the meaning of noun compounds such as "information gathering" or "weather forecast". A new algorithm, comparison, is proposed for computing compound vectors from constituent word vectors, and is compared with other algorithms (i.e., predication and centroid) in terms of accuracy on a multiple-choice synonym test and a similarity judgment test. The result of both tests is that the comparison algorithm is, on the whole, superior to the other algorithms, and in particular achieves the best performance when noun compounds have emergent meanings. Furthermore, the comparison algorithm also works for novel noun compounds that do not occur in the corpus. These findings indicate that a semantic space model in general, and the comparison algorithm in particular, has sufficient ability to compute the meaning of noun compounds.

1 Introduction

Noun compounds are short phrases consisting of two or more nouns, such as "apple pie" and "information gathering". Research on noun compounds is important in any discipline concerned with language, because they are very common not only in everyday language but also in technical documents [Costello et al., 2006]. Recently, therefore, a number of computational studies have been made on the interpretation of noun compounds [Costello et al., 2006; Girju et al., 2005; Kim and Baldwin, 2005; 2007].

Computationally, computing the meaning of noun compounds involves the following two processes:

• Compound disambiguation: the process of determining which sense of the constituent words is used and identifying the semantic relation holding between (the senses of) the constituent words in a noun compound.

• Similarity computation: the process of computing the semantic similarity between a noun compound and other words (and compounds), which is used for identifying taxonomic relations (e.g., synonym, hyponym) and associative relations.

These two processes are equally important, but unfortunately, the process of similarity computation has never been studied computationally in the fields of NLP and AI; the existing studies on noun compounds, such as those mentioned above, have addressed only the compound disambiguation process. This problem becomes more serious when we consider a particularly intriguing aspect of noun compounds: they can yield emergent properties or meanings [Wilkenfeld and Ward, 2001]. Emergent properties are those that are made salient in the interpretation of a noun compound, but are salient in the representation of neither the head noun nor the modifier. For example, the sense INTELLIGENCE of "information gathering" is an emergent meaning, since INTELLIGENCE is not likely to be listed as a characteristic of either the head "gathering" or the modifier "information". Such non-compositional meanings cannot be yielded solely by the compositional process of compound disambiguation; they should be explained within the process of similarity computation.

This paper, therefore, aims at proposing and evaluating a method of similarity computation for noun compounds. For this purpose, the paper employs a semantic space model [Bullinaria and Levy, 2007; Landauer et al., 2007; Padó and Lapata, 2007], in which each word is represented by a high-dimensional vector and the degree of semantic similarity between any two words can be easily computed as, for example, the cosine of the angle formed by their vectors. Semantic space models are computationally efficient as a way of representing the meanings of words, because they take much less time and effort to construct meaning representations, and they can provide a more fine-grained similarity measure between words than other representation methods such as thesauri (e.g., WordNet). Semantic space models are also psychologically plausible; a number of studies have shown that vector-based representations achieve remarkably good performance in simulating human verbal behavior such as similarity judgment and semantic priming [Bullinaria and Levy, 2007; Landauer et al., 2007]. Hence they are advantageous for similarity computation of the emergent meanings of noun compounds.

The basic question to be answered, then, is how a proper vector representation of a noun compound should be computed in a semantic space model. One possible and simple way of doing this is to treat noun compounds as individual words; vectors for noun compounds are constructed directly

from the corpus, just as vectors for words are constructed. This method is expected to compute a proper semantic representation of noun compounds, but it suffers from one serious limitation: it cannot deal with novel compounds that do not occur in the corpus. This drawback is all the more serious given the empirical finding that people easily comprehend novel compounds [Gagné, 2000; Wisniewski, 1996].

An alternative way of computing compound vectors is to combine the word vectors of the constituent words (i.e., the head noun and the modifier) of a noun compound. Some algorithms (i.e., centroid and predication) have been devised for vector composition, but their semantic validity has never been examined in a systematic way. Hence this paper examines the applicability of these algorithms to noun compounds. Furthermore, this paper proposes a new algorithm for vector composition, i.e., comparison, and tests whether the proposed algorithm shows better performance on similarity computation for noun compounds.

2 Algorithm

2.1 Centroid

The standard method for vector composition in semantic space models is to compute the centroid of the constituent word vectors. For a noun compound C consisting of the head noun H and the modifier M, the centroid algorithm computes the compound vector v_cent(C) as (v(H) + v(M)) / 2. However, the centroid algorithm has a serious drawback: word order is completely ignored, so the algorithm wrongly computes the same vector for different compounds, e.g., "apartment dog" and "dog apartment".

2.2 Predication

The predication algorithm was proposed by Kintsch [2001] to compute an intuitively plausible and contextually dependent vector for a proposition with predicate-argument structure. Given a proposition P(A) consisting of a predicate P (i.e., the modifier M in the case of a noun compound) and an argument A (i.e., the head H), the predication algorithm first chooses the m nearest neighbors of the predicate P, i.e., the m words with the highest similarity to P. The algorithm then picks the k neighbors of P that are also related to A. Finally, the algorithm computes the centroid vector of P, A, and the k neighbors of P as a vector representation of P(A). When the predication algorithm is applied to noun compounds, the set of neighbors of P relevant to A can be seen as representing the intended sense of the modifier M that is appropriate for describing the intended sense of the head noun H.

Formally, the predication algorithm computes a compound vector v_pred(C) as follows:

1. Compute N_m(M), which denotes the set of m nearest neighbors of the modifier M.
2. Choose the k words in N_m(M) with the highest similarity to the head noun H.
3. Compute the vector v_pred(C) as the centroid of v(H), v(M), and the k vectors of the words chosen at Step 2.

2.3 Comparison

The predication algorithm does not fully take into account the relevance (or similarity) between the head noun and its intended sense in a noun compound. It is quite likely that a more plausible vector can be computed by using the set of common neighbors of the head and the modifier. For this purpose, I propose a comparison algorithm which chooses k common neighbors and then computes the centroid vector of these neighbors and the head noun. The comparison algorithm can be seen as implementing Gentner's [1983] comparison process consisting of alignment and projection [Utsumi, 2006], and Wisniewski [1996] empirically demonstrated that noun compound comprehension involves such a comparison process.

Formally, the comparison algorithm computes a compound vector v_comp(C) as follows:

1. Compute k common neighbors N_i(H) ∩ N_i(M) of the modifier and the head by finding the smallest i that satisfies |N_i(H) ∩ N_i(M)| ≥ k, where N_i(x) denotes the set of i nearest neighbors of x.
2. Compute the compound vector v_comp(C) as the centroid of v(H) and the k vectors of the words chosen at Step 1.

3 Evaluation Experiment

3.1 Materials

As the corpus from which noun compounds were collected and the semantic space was constructed, one year's worth of Japanese Mainichi newspaper articles was used. This corpus consists of 523,249 paragraphs and 62,712 different words. Furthermore, a Japanese thesaurus, "Nihongo Dai-Thesaurus" [Yamaguchi, 2006], was used for automatically identifying the meanings of words and compounds. This thesaurus consists of 1,044 basic categories, which are divided into nearly 14,000 semantic categories. In this study, these semantic categories were used for representing word meanings. The thesaurus contains nearly 200,000 words (including compounds), most of which are classified into multiple semantic categories.

Noun compounds used for evaluation were chosen such that they occurred at least 20 times in the corpus and were included in the thesaurus. For each of the chosen compounds, its meanings (i.e., the semantic categories that the compound belongs to) were automatically classified as emergent when the semantic categories included neither the head nor the modifier. As a result, 1,254 compounds were chosen for evaluation, and 606 of them were judged to have emergent meanings.

Semantic spaces are generally constructed from large bodies of text by computing a word-context matrix whose (i, j)-th entry represents the distributional characteristics of the i-th word in the j-th linguistic context. The number of columns, i.e., the dimensionality of the semantic space, is often reduced. Several methods have been proposed for computing a word-context matrix and for reducing its dimensionality [Landauer et al., 2007; Padó and Lapata, 2007]. Among them, Latent Semantic Analysis (LSA) [Landauer and Dumais, 1997; Landauer et al., 2007] is the most popular. LSA computes a word-context matrix based on the frequency of words in a paragraph (as the context), and then reduces the dimensionality of the semantic space by singular value decomposition.

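The LSA construction described above — a word-context count matrix reduced by singular value decomposition — can be sketched with a toy corpus. The paragraphs and the 2-dimensional reduction here are invented stand-ins for the Mainichi corpus and the paper's 300 dimensions.

```python
import numpy as np

# Toy paragraphs standing in for the newspaper corpus.
paragraphs = [
    "information gathering helps intelligence work",
    "the weather forecast predicts rain",
    "gathering data for a survey report",
    "rain and weather data in the forecast",
]
vocab = sorted({w for p in paragraphs for w in p.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Word-context matrix: entry (i, j) = frequency of word i in paragraph j.
A = np.zeros((len(vocab), len(paragraphs)))
for j, p in enumerate(paragraphs):
    for w in p.split():
        A[idx[w], j] += 1

# Reduce the dimensionality by (truncated) SVD, as in LSA.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
dim = 2                          # toy analogue of the paper's 300 dimensions
vectors = U[:, :dim] * s[:dim]   # row i is the reduced vector of word i

def sim(w1, w2):
    u, v = vectors[idx[w1]], vectors[idx[w2]]
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```

The resulting rows of `vectors` play the role of the word vectors v(H) and v(M) consumed by the composition algorithms.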
Table 1: Accuracy (%) of the three algorithms on four test sets for familiar noun compounds in the corpus

  Test                                  Centroid   Predication (m, k)   Comparison (k)   Compound Vector
  Multiple Choice Synonym (All)         65.92      66.47 (250, 3)       66.26 (7)        68.49
  Multiple Choice Synonym (Emergent)    66.08      66.27 (4, 1)         70.32 (3)        68.66
  Similarity Judgment (Emergent) *      28.85      49.59 (500, 20)      53.92 (20)       20.73
  Similarity Judgment (Suppressed) *    52.02      60.36 (7, 7)         38.47 (1)        78.79
  Similarity Judgment (Harmonic Mean)   37.11      49.23 (50, 20)       40.69 (8)        32.83
  Note. An asterisk (*) indicates that the difference between the algorithms is statistically significant (p < .05).
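Assuming the harmonic-mean row of Table 1 is the standard harmonic mean of the two similarity judgment accuracies, it can be recomputed for the parameter-free columns (for predication and comparison the row uses its own best (m, k), so it does not follow from the rows above):

```python
def harmonic_mean(a, b):
    """Harmonic mean of two accuracies, as in the last row of Table 1."""
    return 2 * a * b / (a + b)

# Centroid column: SJ-emergent 28.85% and SJ-suppressed 52.02%.
# The table reports 37.11; the small difference comes from the table's
# own rounding of the two inputs.
print(round(harmonic_mean(28.85, 52.02), 2))  # → 37.12
```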

In this study, LSA was used for constructing the semantic space. A word-context matrix was constructed for the 34,230 single words (excluding all compounds) that occurred ten times or more in the corpus. The dimensionality of the semantic space was reduced to 300, because a 300-dimensional space usually yields the best performance for simulating human behavior, e.g., [Landauer and Dumais, 1997]. This 300-dimensional space was used in the evaluation experiment.

3.2 Method

In order to evaluate the semantic validity of the computed vectors, this study employed the following two tests for noun compounds: a multiple-choice synonym (MCS) test and a similarity judgment (SJ) test on discriminative meaning.

Multiple Choice Synonym Test
This test is very similar to the synonym portion of TOEFL, which has been used as a performance measure by many studies on semantic spaces. Each item of a synonym test consists of a stem word (i.e., a noun compound) and five alternative words (excluding noun compounds), from which the algorithm must choose the one with the meaning most similar to the stem word.

In this study, two test sets were constructed. One test set was constructed to evaluate the overall performance of the algorithms in understanding noun compounds. The other test set was constructed for the specific purpose of testing the ability to represent emergent meanings, and thus it contains only test items for emergent meanings. For each noun compound in both test sets, two test items were automatically constructed in such a way that one correct alternative word was chosen randomly from the semantic categories of the target compound (or from the emergent categories in the case of the test set for emergent meanings), and the other four alternatives were chosen randomly from words belonging to none of the basic categories of the compound, the head noun, or the modifier. All alternative words were chosen such that they occurred more than 50 times in the corpus. As a result, the MCS test for all compounds had 2,374 test items. For each test item, the algorithm chose the word with the highest similarity. As a performance measure of the algorithms, the percentage of correct answers (i.e., accuracy) was computed for the two test sets.

Similarity Judgment Test on Discriminative Meaning
This test directly examines how properly emergent meanings are represented in the vector of a noun compound. It measures how often emergent meanings are more similar to the noun compound than to the head noun and the modifier. For each of the 606 emergent compounds, two words were chosen randomly from the emergent semantic categories such that they were included in none of the basic categories of the head noun or the modifier, and their frequency was 50 or more. As a result, 1,085 words were chosen as emergent meanings. The performance of an algorithm was calculated as the percentage of emergent meanings that were more similar to the noun compound than to the head noun and the modifier.

At the same time, another SJ test was designed to explore the degree to which the vector representation of a noun compound suppresses irrelevant meanings of the head noun. For example, the senses MEETING and ABSCESS of "gathering" are irrelevant to the noun compound "information gathering", and thus it is desirable that they are suppressed in understanding the compound. This test was constructed in such a way that, for each of the 1,254 noun compounds, at most two words were chosen from the semantic categories that contained the head noun but not the compound. As a result, 2,051 words were chosen for 1,034 compounds. The performance measure of this test was the percentage of suppressed words that were assessed as less similar to the compound than to the head noun.

3.3 Results

For each of the three algorithms, four tests (two MCS tests and two SJ tests) were conducted and their accuracy values were calculated. In computing the accuracy of the predication and comparison algorithms, the parameter m was varied between 1 and 20, between 25 and 50 in steps of 5, and between 100 and 500 in steps of 50, and the parameter k was varied between 1 and 20. (Of course, any combinations of m and k such that m < k were excluded.)

For the MCS test for all compounds, the predication algorithm yielded the best performance among the three algorithms, but its accuracy was only slightly higher than that of the comparison algorithm. Although these vector composition algorithms did not outperform the compound vector computed directly from the corpus, the difference in accuracy was not large and not significant. On the other hand, the comparison algorithm showed the highest accuracy on the MCS test for emergent compounds and outperformed the original compound vector. These results suggest that a vector space model, especially the comparison algorithm, is useful for computing the meaning of noun compounds, and that the resultant vector computed by the composition algorithms can yield better performance than the original vector constructed from the corpus. Note that, perhaps a bit surprisingly, the centroid algorithm worked better than we had expected.

Concerning similarity judgment, as shown in Table 1, the comparison algorithm also achieved the highest accuracy for emergent meanings, but showed the lowest accuracy for suppressed meanings. Taken together with the finding of the MCS tests, this result indicates that the comparison algorithm is best suited to highlighting emergent meanings (i.e., to increasing the similarity to emergent meanings), but at the same time fails to suppress irrelevant meanings of the head noun. On the contrary, the accuracy of the original compound vector was highest in similarity judgment for suppressed meanings, while it was lowest for emergent meanings. When noun compounds are vectorized directly as individual words, these compound vectors appear to suppress emergent meanings as well as irrelevant head meanings.

In addition, since the two SJ accuracies seem to trade off against one another, the harmonic mean of these two accuracies was calculated, which is shown in the last row of Table 1. The harmonic mean of accuracy shows that the predication algorithm achieved the most balanced performance between activation of emergent meanings and suppression of irrelevant head meanings.

Effect of Frequency of Noun Compounds
In order to examine to what degree the algorithms' performance was affected by the frequency of the compounds in the corpus, I calculated the accuracy when target compounds were limited to those that occur at least tf times. Figure 1 shows the accuracy of the two MCS tests for a threshold tf ranging from 20 (i.e., "unlimited") to 200 in steps of 10.

[Figure 1: Accuracy of MCS tests as a function of minimum frequency of compounds. Panels (a) all compounds and (b) emergent compounds plot accuracy (%) against minimum frequency tf for the predication, comparison, centroid, and word vector methods.]

As expected, accuracy depended proportionally on the frequency threshold tf; accuracy was higher as test items were limited to more frequent compounds. A more interesting finding was that the predication algorithm outperformed the original compound vector and gave the highest accuracy when the minimum frequency was 80 or above (although the accuracy of the predication algorithm was lower for emergent meanings when tf ≥ 140). This finding indicates that the predication algorithm can serve to interpret highly frequent (i.e., familiar) compounds. (Note that, although not shown in Figure 1 because of its triviality, the result for the SJ tests was that accuracy was also proportional to the frequency threshold, and the relative performance of the four algorithms did not change over the frequency threshold.)

Effect of Which Noun Contributes More to Compounds
I have implicitly assumed that the head noun contributes more to the compound meaning than the modifier. The predication and comparison algorithms are devised or employed mainly for such head-centered compounds. However, in some compounds, and probably in many compounds with a very abstract head, the modifier plays a more important role in determining the compound meaning than the head noun [Kim and Baldwin, 2005]. For example, according to the Japanese thesaurus, the compound "weather forecast" has many senses related to the modifier "weather", such as EARTH SCIENCE and METEOROLOGICAL PHENOMENON. These "modifier-centered" compounds may not be interpreted properly by the algorithms for head-centered compounds.

In order to test this possibility, I calculated the accuracy on the four test sets separately for three types of compounds. The type of a noun compound was determined by assessing which of the head or the modifier shares more semantic categories with the compound in the thesaurus. Noun compounds that shared more semantic categories with the head were judged head-centered, and those that shared more categories with the modifier were judged modifier-centered. Otherwise they were judged equally-weighted compounds.

Table 2 shows the accuracy for the three types of compounds. The result for the predication and comparison algorithms shows that modifier-centered compounds had lower accuracy than head-centered or equally-weighted compounds, indicating that these algorithms were indeed inadequate for interpreting modifier-centered compounds. (One exception is that the predication algorithm had higher accuracy for modifier-centered compounds in the SJ test for emergent meanings.)

Table 2: Accuracy (%) of the three algorithms for different types of compounds

  Test / Compound Type   Cent    Pred    Comp    Vctr
  MCS (All)              65.92   66.47   66.26   68.49
    Head-Centered        68.87   69.63   72.09   69.63
    Modifier-Centered    61.02   59.41   55.11   65.05
    Equally-Weighted     64.55   65.71   64.13   68.57
  MCS (Emergent)         66.08   66.27   70.32   68.66
    Head-Centered        60.21   61.78   64.92   62.30
    Modifier-Centered    57.96   59.24   61.15   59.24
    Equally-Weighted     69.34   68.93   73.68   72.32
  SJ (Emergent)*         28.85   49.59   53.92   20.74
    Head-Centered*       29.32   49.74   52.88   18.85
    Modifier-Centered*   25.48   50.32   52.23   15.29
    Equally-Weighted*    29.44   49.39   54.55   22.39
  SJ (Suppressed)*       52.02   60.36   38.47   78.79
    Head-Centered*       50.86   60.56   38.78   76.76
    Modifier-Centered*   49.87   58.40   37.33   79.47
    Equally-Weighted*    53.85   61.00   38.68   80.17
  Note. * denotes a significant difference (p < .05).

One notable finding is that, on the MCS test for all meanings, the comparison algorithm achieved the highest accuracy for head-centered compounds. When the comparison algorithm processed modifier-centered compounds by exchanging the head and the modifier, its accuracy increased to 59.95%, and thereby the total accuracy increased to 67.02%. On the other hand, the predication algorithm did not increase its accuracy for modifier-centered compounds even when their word order was reversed. This finding suggests the possibility that the comparison algorithm may be more appropriate for interpreting all noun compounds.

Performance for Novel Noun Compounds
The results presented so far concern familiar noun compounds that occur in the corpus from which the semantic space is constructed. In other words, the contextual information of these compounds is implicitly involved in the semantic space. Therefore, it is worth examining the performance of the algorithms for noun compounds that do not occur in the corpus, i.e., that are novel for the semantic space. The ability to interpret novel compounds is particularly important for the algorithms, not only as an NLP technique but also as a cognitive model, because people can easily interpret novel noun compounds [Wisniewski, 1996].

For this purpose, I collected 413 noun compounds that did not occur in the corpus but were included in the thesaurus. For these novel compounds, four test sets (two MCS tests and two SJ tests) were constructed in the same way as described in Section 3.2. As a result, 753 items were included in the MCS test for all 413 compounds, 539 items in the MCS and SJ tests for 287 emergent compounds, and 734 items in the SJ test for 371 compounds with suppressed meanings.

Table 3 lists the accuracy of the four test sets for novel compounds and the harmonic mean of accuracy for the two SJ tests. The MCS accuracy for novel compounds still remained high and was significantly above the chance level of 20%, although it was lower than that of the "familiar" compounds shown in Table 1, decreasing by 4% to 14%. Moreover, in the case of the SJ tests, all three algorithms achieved higher accuracy for novel compounds. These findings demonstrate the fine ability of the semantic space model to comprehend novel compounds. Moreover, the relative performance of the three algorithms did not change markedly, although the centroid algorithm unexpectedly yielded the highest accuracy on the MCS test for all meanings. The consistency of the results suggests that the obtained results are intrinsic to the algorithms for vector composition.

Table 3: Accuracy (%) of the three algorithms for novel compounds

  Test               Centroid   Predication   Comparison
  MCS (All)          54.45      54.32         53.52
  MCS (Emergent)*    56.03      61.60         66.05
  SJ (Emergent)*     37.29      61.41         64.94
  SJ (Suppressed)*   44.82      61.31         33.52
  SJ (Harmonic)      40.71      50.11         40.89
  Note. * denotes a significant difference (p < .05).

4 Discussion

This study has examined the ability of the semantic space model to compute the semantic similarity of a noun compound without considering the semantic relation holding between the head and the modifier. However, noun compound comprehension is actually a more complicated process, in which semantic relations may have a large influence on similarity computation. For example, some types of relations may yield more emergent meanings than others, and such differences would affect the performance of similarity computation.

One promising way of utilizing semantic relations is to use words or phrases expressing the semantic relation (e.g., "made of", "cause") for vector composition. For example, the vector representation of "apple pie" can be seen as identical to the vector of its paraphrase "pie made of apples". Such a vector can be computed in such a way that a predicate vector for "be made of apples" is computed first from the vectors for "be made of" and "apple", and the sentence vector is then computed from the vectors of the argument "a pie" and the predicate "be made of apples". (A similar approach is taken by Kintsch [2008] for solving analogy problems.)

To this end, a process of identifying semantic relations is necessary. As mentioned in the introduction, computational methods for compound disambiguation have been extensively studied [Costello et al., 2006; Kim and Baldwin, 2005; 2007], but a semantic space model can also identify semantic relations [Turney, 2005]. If one knows that the semantic relation "made-of" holds true of the noun compounds "apple pie" and "strawberry jam", it is quite reasonable to assume that "orange juice" encodes the same semantic relation, since these head nouns can be classified into the same semantic category foods and the modifiers can be classified as fruits.
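The paraphrase-based composition suggested in the discussion above can be sketched as nested composition. The vectors below are random placeholders, and the centroid used as the composition step is only a stand-in for whichever algorithm (e.g., predication) would actually be employed; both are assumptions for illustration.

```python
import numpy as np

# Hypothetical word vectors; in practice these would come from the LSA space.
rng = np.random.default_rng(1)
vec = {w: rng.standard_normal(8) for w in ["pie", "apple", "be_made_of"]}

def compose(*vectors):
    # Stand-in composition step: a plain centroid keeps the sketch short.
    return np.mean(vectors, axis=0)

# Step 1: predicate vector for "be made of apples".
predicate = compose(vec["be_made_of"], vec["apple"])

# Step 2: sentence vector for "a pie (that is) made of apples", used as
# the relation-aware vector of the compound "apple pie".
apple_pie = compose(vec["pie"], predicate)
```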

Similarity judgment by a semantic space model can be used to classify each word into an appropriate semantic category. The same supervised technique may also be applicable to classifying noun compounds into head-centered, modifier-centered, or equally-weighted ones. Such automatic classification can lead to a sophisticated method in which different algorithms are used for different types of compounds.

5 Conclusion

Through the evaluation experiment, this paper has shown the validity of a semantic space model for similarity computation of noun compounds. The findings are summarized as follows:

• The comparison algorithm, on the whole, achieved the best performance among the three algorithms. Its performance did not differ from, or in some cases was superior to, the performance of the original compound vector constructed directly from the corpus. In particular, the computed vectors were appropriate for representing emergent meanings.

• The predication algorithm was not, on the whole, superior to the comparison algorithm, but showed a well-balanced performance. It also outperformed the original compound vector and yielded the best performance when compounds were highly frequent.

• The centroid algorithm showed unexpectedly good performance despite its simplicity. This result may be largely due to the symmetric nature of the algorithm.

• These findings apply to novel compounds, indicating that the algorithms work for novel compounds as well as for familiar compounds.

It would be interesting and vital for further work to develop a method for similarity computation which utilizes semantic relations holding between the head and the modifier, as well as a more efficient method for vector composition. Additionally, I am trying to extend this work to longer noun compounds consisting of three or more nouns.

Acknowledgments

This research was supported in part by Grant-in-Aid for Scientific Research (C) (No. 20500234) from the Japan Society for the Promotion of Science.

References

[Bullinaria and Levy, 2007] John A. Bullinaria and Joseph P. Levy. Extracting semantic representations from word co-occurrence statistics: A computational study. Behavior Research Methods, 39(3):510–526, 2007.
[Costello et al., 2006] Fintan J. Costello, Tony Veale, and Simon Dunne. Using WordNet to automatically deduce relations between words in noun-noun compounds. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 160–167, 2006.
[Gagné, 2000] Christina L. Gagné. Relation-based combinations versus property-based combinations: A test of the CARIN theory and the dual-process theory of conceptual combination. Journal of Memory and Language, 42:365–389, 2000.
[Gentner, 1983] Dedre Gentner. Structure mapping: A theoretical framework for analogy. Cognitive Science, 7:155–170, 1983.
[Girju et al., 2005] Roxana Girju, Dan Moldovan, Marta Tatu, and Daniel Antohe. On the semantics of noun compounds. Computer Speech and Language, 19:479–496, 2005.
[Kim and Baldwin, 2005] Su Nam Kim and Timothy Baldwin. Automatic interpretation of noun compounds using WordNet similarity. In Proceedings of the 2nd International Joint Conference on Natural Language Processing (IJCNLP2005), pages 945–956, 2005.
[Kim and Baldwin, 2007] Su Nam Kim and Timothy Baldwin. Disambiguating noun compounds. In Proceedings of the 22nd Conference on Artificial Intelligence (AAAI-07), pages 901–906, 2007.
[Kintsch, 2001] Walter Kintsch. Predication. Cognitive Science, 25(2):173–202, 2001.
[Kintsch, 2008] Walter Kintsch. How the mind computes the meaning of metaphor: A simulation based on LSA. In R. W. Gibbs, editor, The Cambridge Handbook of Metaphor and Thought, pages 129–142. Cambridge University Press, 2008.
[Landauer and Dumais, 1997] Thomas K. Landauer and Susan T. Dumais. A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104:211–240, 1997.
[Landauer et al., 2007] Thomas K. Landauer, Danielle S. McNamara, Simon Dennis, and Walter Kintsch. Handbook of Latent Semantic Analysis. Lawrence Erlbaum Associates, 2007.
[Padó and Lapata, 2007] Sebastian Padó and Mirella Lapata. Dependency-based construction of semantic space models. Computational Linguistics, 33(2):161–199, 2007.
[Turney, 2005] Peter D. Turney. Measuring semantic similarity by latent relational analysis. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), pages 1136–1141, 2005.
[Utsumi, 2006] Akira Utsumi. Computational exploration of metaphor comprehension processes. In Proceedings of the 28th Annual Meeting of the Cognitive Science Society (CogSci2006), pages 2281–2286, 2006.
[Wilkenfeld and Ward, 2001] Merry J. Wilkenfeld and Thomas B. Ward. Similarity and emergence in conceptual combination. Journal of Memory and Language, 45(1):21–38, 2001.
[Wisniewski, 1996] Edward J. Wisniewski. Construal and similarity in conceptual combination. Journal of Memory and Language, 35:434–453, 1996.
[Yamaguchi, 2006] Tasuku Yamaguchi. Nihongo Dai-Thesaurus CD-ROM. Taishukan Shoten, Tokyo, 2006.
