
THE ASSIGNMENT OF GRAMMATICAL RELATIONS IN NATURAL LANGUAGE PROCESSING

Leonardo Lesmo, Vincenzo Lombardo
Dipartimento di Informatica - Universita' di Torino
C.so Svizzera 185 - 10149 Torino - ITALY
e-mail: lesmo,[email protected]

1. Introduction

One of the main goals of an interpreter is to map the syntactic descriptions found in the sentence into the correct roles that the elements (described by the nominals) play in the situation at hand (described by the verb). For instance, we must be able to state that in

1) The cat ate the mouse

the cat is the "eater" and the mouse is the "eaten thing". Of course, if we only talk about roles and situations we miss some significant generalizations. In

2) The boy drank the water,

if we say that the boy is the "drinker" and the water is the "drunk thing", we disregard the evident similarity of the roles of "eater" and "drinker" in the two situations. The notion of deep case arises as the common ground underlying a number of "apparently" different roles. Some frameworks that stand at the core of semantic representation and natural language processing are built upon this notion (see [Fillmore 68], [Bruce 75] and [Somers 87]).

The hard task is to devise a mapping between the surface descriptions and these deep cases. The complexity of some syntactic phenomena, like passivization, raising and long distance dependencies, has led many researchers to posit an intermediate level between the linear string of words and the case system. The concept involved is that of "grammatical relation", such as "subject", "direct object", "indirect object". It is claimed, for example, that "passivization" is universally (cross-linguistically) explained if one says that the "object" of an active sentence becomes the "subject" in the passive form, rather than by saying that the NP in the VP is moved to replace the NP in S (that is, a direct mapping). In the latter case it is implicit that the particular language under examination has a Subject-Verb-Object structure (SVO), as usually happens in configurational languages such as English. In the example

3a) Lo hanno visto gli amici di Piero
    (Him have seen the friends of Piero)
3b) E' stato visto dagli amici di Piero
    ((He) has been seen by Piero's friends)

the passive form does not obey the law of direct mapping. The example is, however, easily accounted for by the relational theories. The passivization rule induces only changes of function: the SUBJ becomes the BY-complement and the OBJ becomes the SUBJ. The importance of grammatical relations, taken as primitives for a universal grammar, is stated by a number of formalisms often collected under the label of Relational Grammar.

The problem is to map the surface constituents into their correct roles. With languages such as Italian, which stands in the middle between configurational and freely ordered languages [Stock 89], some flexibility is required to accomplish this task. One possibility is to adopt a neutral syntactic structure, open to several alternatives in the interpretation process. The head & modifier approach seems to feature this kind of neutrality, and has effectively been used for dealing with free word order languages, like the Slavonic languages [Sgall et al. 86] and Finnish [Jappinen et al. 86].

The dependency formalism we have adopted is presented in [Lesmo, Lombardo 91]. An example is reported in fig.1, and concerns the sentence:

4) La ragazza che lavora al guardaroba fu persuasa da un cliente a comprare una enciclopedia
   (The girl who works at the wardrobe was persuaded by a customer to buy an encyclopedia).

The daughter nodes that stand on the left of their head precede it in the linear order of the sentence, while daughter nodes on the right follow it. The arcs that link the nodes in the dependency tree are of three types: arcs of structural and logical dependency (D&S arcs, represented by bold arrows in the figure), arcs of only structural dependency (STR arcs, simple arrows in the figure), and arcs of only logical dependency (DEP arcs, dashed arrows in the figure). D&S arcs link two words that stand in a "both structural and logical" relation. STR and DEP split these two functions of an arc: an STR individuates a purely superficial

Actes de COLING-92, Nantes, 23-28 août 1992    1090    Proc. of COLING-92, Nantes, Aug. 23-28, 1992

Fig.1 - An example of dependency tree. Because of space constraints, the figure already includes the grammatical relations that will be described below. The bold labels (e.g. agt, pat) refer to deep cases. The labels immediately below them refer to the initial stratum of grammatical relations. The lowest labels are the last stratum (surface relations). SUB-Goal stands for a GOAL relation expressed via a subordinate sentence. Cho-1 is a chomeur (see text), expressed in Italian via a BY (DA) complement.

attachment, DEP represents a deep dependency between two words that are structurally independent. DEP arcs enable us to represent long distance dependencies, the sharing of dependent nodes (i.e. multiple heads, see fig. 1) and to represent coordinative and comparative constructions without violating the adjacency principle [Hudson 84], that applies only to STR and D&S arcs (note 1). An arc involving dependency (of DEP or D&S type) is labelled with the grammatical relation that exists between the two nodes that it links (the arrangement in strata is explained below).

The goal of this paper is to show that the formalism of Relational Grammar (RG) can be integrated in a useful way in a general NL interpreter, in particular if the surface structures are represented via the dependency formalism. The paper examines the problems associated with the use of RG in an interpretive (as opposed to generative) framework, where the phase of surface relation hypothesization is critical. The partial configurationality of Italian can be exploited as heuristic information aiding the interpreter in selecting the preferable initial hypothesis. On the contrary, the RG rules governing the mapping between strata aim at confirming the hypotheses: they are applied on the basis of the lexical and morphological information associated with the verb, where the lexicon provides the first stratum and possible constraints on rule applicability.

2. The assignment of grammatical relations

We start this section by providing a short overview of the main ideas of RG. Such ideas are shared by many formalisms, but we will mostly refer to the work described in [Perlmutter 83] and [Perlmutter, Rosen 84], where a comparison with other RG formalisms can be found.

Grammatical relations are arranged in a hierarchy and are usually referred to by numbers: 1, which is the highest, corresponds to SUBJECT, 2 to DIRECT OBJECT, 3 to INDIRECT OBJECT. The key principle of RG is the promotion of relations to higher levels in the hierarchy. The passive can be described as a promotion of 2 to 1 (i.e. DIR-OBJ to SUBJ), leaving the previous 1 element "unemployed". The relation "unemployed", which is technically indicated by the corresponding French word

1 The adjacency principle intuitively states that a word B, that stands between the words A and C in the sentence, results in the same position if we project the related nodes in the dependency tree onto a line.

Fig.2 - The Relational Networks associated with the sentences "John was given the book by Mary" (a) and "The book was given to John by Mary" (b).

chomeur, is assigned to an element that cannot be involved in any other promotion. Consider

5a) Mary gave the book to John,

where Mary is the 1-element, the book is the 2-element and John is the 3-element. If we apply the rule for passivization described above, the book must be promoted to the 1 relation, Mary becomes a chomeur, while John is still the 3-element. The chomeur-1 element, i.e. a chomeur element that results from the "unemploying" of a 1-element, assumes the surface form of a by-complement in English, thus yielding

5b) The book was given to John by Mary

where Mary cannot be involved in any other promotion, because of its chomeur condition. A similar rule applies to double-accusative constructions, as shown in fig.2a.

At the same level as the promotional rules we can posit the lexical rules, which account for the determination of grammatical relations within subordinate untensed sentences, as in 4). Such information is stored within the lexical entry of the verb that governs the subordinate clause. For example, to promise forces the SUBJ of the subordinate clause to be the SUBJ element of the governing clause, as we can see in

6) Mary promised John to write him a letter,

where the SUBJ of write is Mary, the same as that of promise. On the contrary, to persuade forces the SUBJ of the subordinate to be the OBJ element of the governing clause, as in 4). It must be noted that the lexical rules are related to the assignment of relations in the initial stratum, even if they are subsequently changed by promotional rules. For example, in

7) The girl was persuaded by a customer to buy an encyclopedia,

the girl is the element which is still shared by the two clauses, even if it is the SUBJ now.

The semantic interpretation process takes advantage of the functional analysis, i.e. the analysis in terms of grammatical functions: relational structures are easily mapped onto logical representations, because of the resemblance between a Relational Network and a Predicate-Argument structure: the initial stratum states which are the grammatical relations (actually the elements at the sentence level) that act as arguments of the predicate identified by P.

From a computational point of view, syntactic and semantic clues must be taken into account, in order to map the grammatical relations onto the surface descriptions. The mapping is carried out incrementally, i.e. as soon as the nominal head of the complement is parsed: it is highly language-dependent, and considers features like inflectionality, configurationality and deep underlying structures. The mapping must also take into account the changes on the surface form that are induced by the rules on grammatical relations, discussed informally above (note 2).

Unfortunately the mapping raises some difficulties. The bias of the rules at the RG level is of a generative kind. Rules start from the initial stratum, the one which is closer to the deep cases (arguments of the predicate), to produce the final stratum of the surface arrangement; on the contrary, in an interpreter of language, the task is to trace what rules have actually been applied (and in what order) to the initial stratum (and the subsequent strata) in order to achieve the surface realization of the grammatical relations. Useful heuristics are devised to identify the surface clues that give evidence of the application of a particular rule. For example, a passivization is accompanied by a passive form of the verb.

Starting from the surface descriptions, the

2 Of course, the application of such rules also involves changes in focus. Two expressions that turn out to be derivable from each other (3a and 3b), according to phenomena that are explained in terms of grammatical relations, are therefore not strictly equivalent, even if both of them involve the same roles to be played by the individuals in the ground sentence (or, better, in the sentence that has been claimed to be ground).

interpreter is not always able to uniquely determine the assignment of grammatical relations on the basis of syntactic features. Consider

8a) Giovanni il vino lo ha bevuto
    (John the wine [it] has drunk)
8b) Il vino Giovanni lo ha bevuto
    (The wine John [it] has drunk)

Only semantics allows a hearer to realize that in both cases the "drinker" is John and not the wine. Hence a flexible interaction between syntactic and semantic information (selectional restrictions) must be devised.

Since in our system the analysis is incremental, in the sense that parsing and interpretation are synchronous processes, as soon as the dependency tree is extended with the head of a substructure the semantic interpreter is triggered to interpret it: in the case of verbs, all the complements that precede the verb are interpreted when the verb is found, while each complement that follows the verb is interpreted as soon as it is attached to it.

The association of the grammatical relations with the descriptions in the sentence is accomplished by the rules at the relational level (GR rules), which are divided into three groups: the first deals with the initial proposal of relations based on syntactic features (Proposal Rules - PR), the second concerns the movement across the strata (Stratal Rules - SR), and the third, of a lexical kind (hence Lexical Rules - LR), accounts for the sharing of relations in untensed subordinate clauses, as in 4. The verbal lexical entry contains, among other information, its initial stratum of grammatical relations. Once the verb has been found, the GR rules are triggered in order to find out the roles that are played by the elements that precede the verb in the input sentence (incremental interpretation). It is the actual input that determines which of the three groups must be applied. For example, if we have a single active sentence (without subordinate embedded clauses or passive forms), the Proposal Rules are triggered. SR and LR rules are activated only in the presence of special features: a passive form (was eaten), for instance, activates the Passivization rule (belonging to SR), if we have the pair SUBJ and OBJ in the current stratum, while lexical rules are associated with verbs that govern subordinate clauses (e.g. to persuade). The result of the application of one or more rules is the final stratum, against which the assignment of relations guessed by the PR group is matched.

In the PR, the first feature that is taken into account is the syntactic form of the participants. SUBJ and DIR-OBJ require that the corresponding nominals are not preceded by a preposition, and that pronouns be inflected appropriately. For example, in 3a the pronoun lo features an accusative form, thus a DIR-OBJ. If two nominal descriptions are not inflected and, hence, cannot be associated with a particular relation via this marking, as in

9) Il gatto mangio' il topo
   (The cat ate the mouse),

the position of the nominals can be useful, since, in a partially configurational language such as Italian, grammatical relations are usually connected with the canonical positions of the SVO order: with a transitive verb, the SUBJ precedes the verb and the OBJ follows it; with intransitive verbs, the position of the nominal without a preposition does not affect the grammatical relation assigned to it, since it will surely be the SUBJ wherever it is. If the order too does not give an unambiguous assignment of grammatical relations, the last resources are the number agreement for the SUBJ relation and the semantic check. In situations such as

10) Le ragazze Giorgio le ha viste
    (The girls Giorgio [them] has seen)

even if the nominal descriptions stand on the same side of a transitive verb, the latter agrees only with Giorgio in number. On the contrary, only semantics can solve a situation such as 8; moreover, the semantic check can also reject an assignment made on the basis of the syntactic features that we have described. Consider, for example, the sentence

11) Un sasso calcio' il vitello
    (A rock kicked the calf)

Even if the order rules assign un sasso the SUBJ relation and il vitello the OBJ, such an assignment is rejected on semantic grounds. Nevertheless, a system that works correctly cannot be based only on semantics, since a sentence like 11 sounds really strange to a native speaker, unless we are in a particular focussing situation.

3. An example

In figure 1, we can find the result of the interpretation of sentence 4. When the analysis arrives at lavora, in the relative sentence, its initial stratum is retrieved from the lexicon. Since che (who) is a nominal without a prepositional marker, it (or better the element referred to) is the SUBJ of lavora, as stated by the Proposal Rules. Lavora also has an adjunct, al guardaroba (at the wardrobe), a non-term relation of type LOC. The structure for the nominal description la ragazza che lavora al guardaroba has already been built,

when the input word is the verb persuadere. Its lexical entry provides the parser with an initial stratum of grammatical relations that consists of: a SUBJ, an OBJ and a subordinate sentential Goal (SUB-Goal) (note 3), i.e. a persuader, a persuadee and the persuasion. This basic assignment can be related to the deep cases of AGT, PAT and GOAL respectively. Moreover, a lexical rule is contained in the lexical entry:

The SUBJ of the subordinate untensed clause governed by persuadere is the OBJ element of the governing clause.

Since the verb is in the passive form and the current stratum features a SUBJ and an OBJ relation, the Passivization rule in the SR group is triggered, in order to find the actual arrangement of relations in the input sentence. The new stratum consists of SUBJ, Cho-1 and SUB-Goal, against which the proposals made by the PR group are matched. Since the nominal description already found is not inflected and is not marked by a preposition, the positional rules suggest that, since it precedes the verb and agrees with it, a possible assignment of relation is SUBJ. The semantic check, which is activated on the basic relation (i.e. OBJ), validates such an assignment, because a girl that works at the wardrobe may happen to be persuaded. The analysis proceeds to the next nominal description, with the set of relations (Cho-1, SUB-Goal) not assigned yet. Da un cliente (by a customer) has exactly the form of a Cho-1 in Italian. The Proposal Rules, whose hypothesis is confirmed by the semantic check, are sufficient to deal with this situation. When we find the verb comprare (buy), the PR group assigns to such a description the SUB-Goal relation and consequently the lexical rule associated with persuadere assigns the SUBJ relation of the initial stratum of comprare to la ragazza che .... The initial stratum of comprare also features an OBJ relation, that will be assigned to enciclopedia, when it is found.

The completeness of the set of found grammatical relations is checked when the node corresponding to the verb is "closed", i.e. when it cannot have further modifiers.

4. Conclusions

The paper illustrates how RG can be used to map in a principled way surface dependency relations into thematic roles. The main feature of the approach is the strict cooperation among different knowledge sources (lexicon, RG rules and semantics) in carrying out the task: this cooperation is made necessary by the partial configurationality of Italian, where the ordering of constituents can only be considered as the basis for plausible suggestions, but not as the source of strict constraints. The adoption of an unmarked input (an unlabelled dependency tree) makes available a flexible starting point that leaves the RG module the task of making the required inferences.

The ideas expressed herein are implemented in the GULL system (see [Lesmo, Torasso 83, 85a] for the syntactic part, [Di Eugenio, Lesmo 87] for the basic ideas about semantics): both are based on condition-action rules. The system is implemented in Common Lisp and runs on SUN workstations.

3 The Goal relation, such as Instrument or Location, is a non-term relation and participates only in special kinds of promotional rules (see [Perlmutter 83] for details).

References

[Bruce 75] Bruce B., Case Systems for Natural Language, Artificial Intelligence 6, 1975, pp.327-360.
[Di Eugenio, Lesmo 87] Di Eugenio B., Lesmo L., Representation and Interpretation of Determiners in Natural Language, Proc. 10th IJCAI, Milano, Italy, 1987, pp.648-654.
[Fillmore 68] Fillmore C., The case for case, in "Universals in Linguistic Theory", (Bach and Harms eds.), Holt, Rinehart and Winston, 1968.
[Hudson 84] Hudson R., Word Grammar, Basil Blackwell, Oxford, 1984.
[Jappinen et al. 86] Jappinen H., Lehtola A., Valkonen K., Functional Structures for parsing dependency constraints, Proc. COLING 86, Bonn, Germany, 1986, pp.461-463.
[Lesmo, Torasso 83] Lesmo L., Torasso P., A Flexible Natural Language Parser based on a two-level Representation of Syntax, Proceedings of the 1st Conference ACL Europe, Pisa, Italy, 1983, pp.114-121.
[Lesmo, Torasso 85a] Lesmo L., Torasso P., Analysis of Conjunctions in a Rule-Based Parser, Proceedings ACL 85, Chicago, USA, 1985, pp.180-187.
[Lesmo, Lombardo 91] Lesmo L., Lombardo V., A Dependency Syntax for the Surface Structure of Sentences, Proc. of WOCFAI, Paris, July 1991.
[Perlmutter 83] Perlmutter D.M. (ed.), Studies in Relational Grammar 1, Univ. of Chicago Press, 1983.
[Perlmutter, Rosen 84] Perlmutter D.M., Rosen C.G. (eds.), Studies in Relational Grammar 2, The Univ. of Chicago Press, 1984.
[Sgall et al. 86] Sgall P., Hajicova E., Panevova J., The Meaning of the Sentence in its Semantic and Pragmatic Aspects, D. Reidel Publ. Co., 1986.
[Somers 87] Somers H.L., Valency and Case in Computational Linguistics, Edinburgh Univ. Press, 1987.
[Stock 89] Stock O., Parsing with Flexibility, Dynamic Strategies, and Idioms in Mind, Computational Linguistics 15, 1989, pp.1-18.
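
As an illustration of the procedure followed in Section 3, the sketch below shows one possible way of combining a stratal rule (Passivization) with the Proposal Rules for sentence 4. It is only a sketch under stated assumptions: the dictionary layout, the feature names (prep, precedes_verb, untensed_clause) and the function names are hypothetical and are not taken from the GULL system, which is written in Common Lisp. The semantic check and number agreement are deliberately left out.

```python
# Illustrative sketch only: the dictionaries, feature names and function names
# below are assumptions made for this example, not the GULL implementation.

def passivize(stratum):
    """Stratal rule: OBJ (2) is promoted to SUBJ (1), the old SUBJ becomes Cho-1."""
    if "SUBJ" not in stratum or "OBJ" not in stratum:
        return None  # the rule needs both a SUBJ and an OBJ in the current stratum
    new = {rel: case for rel, case in stratum.items() if rel not in ("SUBJ", "OBJ")}
    new["SUBJ"] = stratum["OBJ"]
    new["Cho-1"] = stratum["SUBJ"]
    return new


def propose(comp):
    """Proposal rule: guess a surface relation from syntactic features alone."""
    if comp.get("prep") == "da":
        return "Cho-1"            # BY (DA) complement
    if comp.get("untensed_clause"):
        return "SUB-Goal"         # subordinate untensed sentence
    if comp.get("prep") is None:
        return "SUBJ" if comp["precedes_verb"] else "OBJ"  # canonical SVO position
    return None


def interpret(initial_stratum, passive, complements):
    """Match the proposals against the final stratum and return the deep cases."""
    final = passivize(initial_stratum) if passive else dict(initial_stratum)
    if final is None:
        raise ValueError("passive verb without SUBJ and OBJ in its stratum")
    cases = {}
    for comp in complements:
        rel = propose(comp)
        if rel in final:
            cases[comp["text"]] = final.pop(rel)  # each relation is assigned once
    return cases


# Initial stratum of persuadere, related to deep cases as described in Section 3.
persuadere = {"SUBJ": "AGT", "OBJ": "PAT", "SUB-Goal": "GOAL"}
sentence4 = [
    {"text": "la ragazza", "prep": None, "precedes_verb": True},
    {"text": "un cliente", "prep": "da", "precedes_verb": False},
    {"text": "comprare", "prep": "a", "untensed_clause": True, "precedes_verb": False},
]
result = interpret(persuadere, passive=True, complements=sentence4)
print(result)
```

With these inputs, la ragazza is matched to the promoted SUBJ and therefore receives the deep case PAT, un cliente is matched to Cho-1 (AGT), and comprare to SUB-Goal (GOAL). A fuller version would let a semantic check reject a proposal, as in example 11.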
