The Assignment of Grammatical Relations in Natural Language Processing
Total Page:16
File Type:pdf, Size:1020Kb
THE ASSIGNMENT OF GRAMMATICAL RELATIONS IN NATURAL LANGUAGE PROCESSING Leonardo Lesmo, Vincenzo Lombardo Dipartimento di Infom~atica - Universila'di Tofino C.~ Svizzera 185 - 10149 Torino - ITALY e-mail: lesmo,[email protected] 1. Introduction ((He) has been seen by l'iero's friends) the passive form does not obey tile law of direct One of the main goals of an interpreter is to map mapping. The example is, however, easily the syntactic descriptions found in the sentence accounted fbr by the relational theories. The into the correct roles that the elements passivization rule induces only changes of (described by the nominals) play in the situation function: the SUBJ becomes the BY- at hand (described by the verb). For instance, complement and the OBJ becomes the SUBJ. we must be able to state that in The importance of grammatical relations, 1) The cat ate the mouse taken as primitives for a universal grammar, is the cat is the "eater" and the mouse is the stated by a number of formalisms often "eaten thing". Of course, if we only talk about collected under the label of Relational roles and situations we miss some significant Grammar. The problem is to map the surface generalizations. In constituents into their correct roles. With 2) The boy drank the water, languages as Italian, which stands in the middle if we say that the boy is the "drinker" and the between configurational and freely ordered water is the "drunk thing", we disregard the languages [Stock 891 some flexibility is evident similarity of the roles of "eater" and required to accomplish this task. One possibility "drinker" in the two situations. The notion of is to adopt 11 neutral syntactic structure, open to deep case arises as the common ground several alternatives in the interpretation process. underlying a number of "apparently" different The head & modifier approach seems to feature roles. Upon this notion some frameworks, that this kind of neutrality, and has effectively been stand at the core of semantic representation and used for dealing with free word order natural language processing, are built (see languages, like the Slavonic languages [Sgall et [Fillmore 68], [Bruce 75] and ISomers 871). al. 861 and Finnish [Jappinen et al. 86]. The hard task is to devise a mapping The dependency formalism we have adopted between the surface descriptions and these deep is presented in [Lesmo, Lombardo 91]. An cases. The complexity of some syntactic example is reported in fig.l, and concerns the phenomena, like passivization, subject and sentence: object raising, long distance dependencies, has 4) La ragazza ebe lavora al guardaroba led many researchers to pose an intermediate fu persuasa da un cliente a level between the linear string of words and the comprare una enciclopedia case system. The concept involved is that of (The girl who works at the wardrobe was "grammatical relation", such as "subject", persuaded by a customer to buy an "direct object", "indirect object". It is claimed, encyclopedia). for example, that "passivizatiou" is universally The daughter nodes that stand on the left of (cross-linguistically) explained if one says that their head precede it in the linear order of the the "object" of an active sentence becomes the sentence, while daughter nodes on the right "subject" in the passive form, rather than by follow it. The arcs that link the nodes in the saying that the NP in the VP is moved to dependency tree are of three types: arcs of replace the NP in S (that is a direct mapping). structural and logical dependency (D&S arcs, In the latter case it is implicit that the partictdar represented by bold arrows in the figure), arcs language under examination has a Subject- of only structural dependency (STR arcs, Verb-Object structure (SVO), as it usually simple arrows in the figure), and arcs of only happens in configurational languages such as logical dependency (DEP arcs, dashed arrows English. In the example in the figure). D&S arcs link two words that 3a) Lo hanno visto gli amici di Piero stand in a "both structural and logical" relation. (Him &tve seen the friends of Piero) STR and DEP split these two functions of arc: 3b) E' stato visto dagli amici di Piero an STR individuatcs a purely superficial AcrEs DE COLING-92. NAN'~S, 23-28 ^o~r 1992 1 0 9 0 PROC. or: COLING-92, NANTF.S.AUG. 23-28, 1992 PEFISUADERE at RAG~2~ A kA LAVORARE ~, CLIENTE COMPRb~E \,oo k ./ ~ \.et ~', .' OEld % L~ .," %%. CI-E A ENCICLOPEDIA GtJ,N:I[I~V~ UNA J IL Fig.1 - An example of dependency tree. Because of space constraints, the figure already in- cludes the grammatical relations that will be described below. The bold labels (e.g. agt, pat) refer to deep cases. The labels immediately below them refer to the initial stratum of grammatical relations. The lowest labels are the last stratum (surface relations). SUB-Goal stands for a GOAL relation expressed via a subordinate sentence. Cho-1 is a chomeur (see text), expressed in Itatian via a BY (DA) complement. attachment, DEP represents a deep dependency hypothesis. On tile contrary, the RG rules between two words that are structurally governing the nlappiug between strata airu at independent. DEP arcs enable us to represent confirming the hypotheses: they are applied on long distance dependencies, the sharing of the basis of the lexical and morphological dependent nodes (i.e. multiple heads, see fig. 1) infonnatioo associated with tile verb, where the and to represent coordinative and comparative lexicon provides tbe first stratum and possible constructions without violating the adjacency constraints on nde applicability. principle [Hudson 84], that applies only to STR and D&S arcs l. An arc involving dependency 2. The assignment of grammatical (of DEP or D&S type) is labelled with the relations grammatical relation that exists between the two We start this section by providing a short nodes that it links (the arrangement in strata is overview of the main ideas of RG. Such ideas explained below). are shared by many fornlalisms, bttt we. will The goal of this paper is to show that the mostly refer to the work described in formalism of Relational Grammar can be [Perlnlutter 83] and [l)erhuutter, Rosen 841, integrated in a useful way in a gcneral NL where it can be tound a comparison with other interpreter, in particular if the surface stntctures RG fonnalisms. are represented via the dependency formalism. Grammatical relations are arranged in a The paper examines the problems associated hierarchy and are usually referred to by with the use of RG in an interpretive (as numbers: 1, which is the highest, corresponds opposed to generative) framework, where the to SUBJECT, 2 to DIRECT OBJEC~I ', 3 to phase of surface relation hypothesization is INDIRECT OBJECT. The key principle of RG critical. The partial configurationality of Italian is the promotion of relations to higher levels in can be exploited as heuristic information aiding the hierarchy. The passive Cml be described as a the interpreter in selecting the preferable initial promotion of 2 to 1 (i.e. DIR-OBJ to SUB J), leaving the previous 1 element "unemployed". The relation "uuemployed", which is technically 1 The adjacency principle intuitively states that a word indicated by the corresponding French word B, that stands between the words A and C in the sentence, results in the santo position if we project Ihe related node.~in the dependency tree onto a line. AcrEs DE COLING-92, NANTES.23-28 AO(n" 1992 l 0 9 1 I)~OC. OF COLING-92, NAN'rES, AUG. 23 28. 1992 (a) (b) give Mary the book John give Mary the book John Fig.2 - The Relational Networks associated with the sentences "John was given the book by Mary" (a) and "The book was given to John by Mary" (b). chomeur, is assigned to an element that cannot logical representations, because of the be involved in any other promotion. Consider resemblance between a Relational Network and 5a) Mary gave the book to John, a Predicate-Argument structure: the initial where Mary is the 1-element, the book is the stratum states which are the grammatical 2-element and John is the 3-element. If we relations (actually the elements at the sentence apply the rule for passivization described level) that act as arguments of the predicate above, the book must be promoted to the 1 identified by P. relation, Mary becomes a chomeur, while From a computational point of view, John is still the 3 element. The chomeur-1 syntactic and semantic clues must be taken into element, i.e. a chomeur element that results account, in order to map the grammatical from the "unemploying" of a l-element, relations onto the surface descriptions. The assumes the surface form of a by-complement mapping is carried out incrementally, i.e. as in English, thus yielding soon as the nominal head of the complement is 5b) The book was given to John by parsed: it is highly language-dependent, and Mary considers features like inflectionality, where Mary cannot be involved in any other configurationality and deep underlying promotion, because of its chomeur condition. A structures. The mapping must also take into similar rule applies to double-accusative account the changes on the surface form that are constructions, as shown in fig.2a. induced by the rules on grammatical relations, At the same level of promotional rules we discussed informally above 2. can posit the lexical rules, that account for the Unfortunately the mapping raises some determination of grammatical relations within difficulties.