The Disambiguation of Nominalizations
Maria Lapata∗ University of Edinburgh
This article addresses the interpretation of nominalizations, a particular class of compound nouns whose head noun is derived from a verb and whose modifier is interpreted as an argument of this verb. Any attempt to automatically interpret nominalizations needs to take into account: (a) the selectional constraints imposed by the nominalized compound head, (b) the fact that the relation of the modifier and the head noun can be ambiguous, and (c) the fact that these constraints can be easily overridden by contextual or pragmatic factors. The interpretation of nominalizations poses a further challenge for probabilistic approaches since the argument relations between a head and its modifier are not readily available in the corpus. Even an approximation that maps the compound head to its underlying verb provides insufficient evidence. We present an approach that treats the interpretation task as a disambiguation problem and show how we can “re-create” the missing distributional evidence by exploiting partial parsing, smoothing techniques, and contextual information. We combine these distinct information sources using Ripper, a system that learns sets of rules from data, and achieve an accuracy of 86.1% (over a baseline of 61.5%) on the British National Corpus.
1. Introduction
The automatic interpretation of compound nouns has been a long-standing problem for natural language processing (NLP). Compound nouns in English have three basic properties that present difficulties for their interpretation: (a) the compounding process is extremely productive (this means that a hypothetical system would have to interpret previously unseen instances), (b) the semantic relationship between the compound head and its modifier is implicit (this means that it cannot be easily recovered from syntactic or morphological analysis), and (c) the interpretation can be influenced by a variety of contextual and pragmatic factors.
A considerable amount of effort has gone into specifying the set of semantic relations that hold between a compound head and its modifier (Levi 1978; Warren 1978; Finin 1980; Isabelle 1984). Levi (1978), for example, distinguishes two types of compound nouns: (a) compounds consisting of two nouns that are related by one of nine recoverably deletable predicates (e.g., cause relates onion tears, for relates pet spray; see the examples in (1)) and (b) nominalizations, that is, compounds whose heads are nouns derived from a verb and whose modifiers are interpreted as arguments of the related verb (e.g., a car lover loves cars; see the examples in (2)–(4)). The prenominal modifier can be either a noun or an adjective (see the examples in (2)). The nominalized verb can take a subject (see (3a)), a direct object (see (3b)) or a prepositional object (see (3c)).
(1) a. onion tears       cause
    b. vegetable soup    have
∗ Division of Informatics, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, UK. E-mail: [email protected]