I Integration of syntactic and lexical information in a hierarchical ! dependency grammar Cristina Barbero and Leonardo Lesmo and Vincenzo Lombardo Dipartimento di Informatica I Universit~ di Torino - Italy
Paola Merlo Universit6 de Gen~ve - Switzerland I IRCS - University of Pennsylvania
I Abstract level provided by syntactic rules is necessary to avoid In this paper, we propose to introduce syntactic the loss of generalization which would arise if class- classes in a lexicalized dependency formalism. Sub- level information were repeated in all lexical items. i categories of words are organized hierarchically from In parsing, a predictive component is required to a general, abstract level (syntactic categories) to a guarantee the valid prefiz property, namely the ca- word-specific level (single lexical items). The formal- pabifity of detecting as soon as possible whether a ism is parsimonious, and useful for processing. We substring is a valid prefix for the language defined I also sketch a parsing model that uses the hierarchi- by the grammar. Knowledge of syntactic categories, cal mixed-grain representation to make predictions which does not depend on the input, is needed for a on the structure of the input. parser to be predictive. In this paper we address the problem of the in- I 1 Introduction teraction between syntactic and lexical information in dependency grammar. We introduce many inter- Much recent work in linguistics and computational mediate levels between lexical items and syntactic linguistics emphasizes the role of lexical information categories, by organizing the grammar around the i in syntactic representation and processing. notion of subcategorizetion. Intuitively, a subcat- This emphasis given to the lexicon is the result egorization frame for a lexical item L is a specifi- of a gradual process. The original trend in linguis- cation of the number and type of elements that L I tics has been to individuate categories of words hav- requires in order, for ml utterance that contains L, ing related characteristics - the traditional syntactic to be well-formed. For example, within the syntac- categories like verb, noun, adjective, etc. - and to tic category VERB, different verbs require different express the structure of a sentence in terms of con- numbers of nominal dependents for a well-formed I stituents, or phrases, built around these categories. sentence. In Italian (our case study), an intransi- Subsequent considerations lead to a lexicalization of tive verb such as dormirv, "sleep", subcategorizes for grammar. Linguistically, the constraints expressed only one nominal element (the subject), while a tran- on syntactic categories are too general to explain sitive verb such as baciare, "kiss", subcategorizes for I facts about words - e.g. the relation between a verb two nominal elements (the subject and the object) and its nominalization, "destroy the city" and "de- 1. Grammatical relations such as subject and object struction of the city" - or to account uniformly for a are primitive concepts in a dependency paradigm, number of phenomena across languages - e.g. pas- | i.e. they directly define the structure of the sen- sivization. In parsing, the use of individual item tence. Consequently, the dependency paradigm is information reduces the search space of the possi- particularly suitable to define the grammar in terms ble structures of a sentence. From a mathematical of constraints on subcategorization frames. point of view, lexicalized grammars exhibit proper- Our proposal is to use subcategories organized in i ties - like finite ambiguity (Schabes, 1990) - that are of a practical interest (especially in writing real- a hierarchy: the upper level of the hierarchy corre- istic grammars). Dependency grammar is naturally sponds to the syntactic categories, the other levels I suitable for a lexicalization, as the binary relations correspond to subcategories that are more and more representing the structure of a sentence are defined 1We include the subject relation in the subcategorization, with respect to the head (that is a word). or valency, of a verb - cf. (Hudson, 1990) (Mel'cuk, 1988). Pure lexicalized formalisms, however, have also In most constituency theories, on the contrary, the subject is I several disadvantages. Linguistically, the abstract not part of the valency of a verb. I
i 58 I specific as one descends the hierarchy. This repre- 2 A dependency formalism sentation is advantageous because of its compact- The basic idea of dependency is that the syntac- ness, and bemuse the hierarchical mixed-grained or- tic structure of a sentence is described in terms of ganization of the information is useful in processing. binary relations (dependency relations) on pairs of In fact, using the general knowledge at the upper words, a head (or parent), and a dependent (daugh- level of the hierarchy, we can make predictions on ter), respectively; these relations form a tree, the de- the structure of the sentence before encountering the pendency tree. In this section we introduce a formal lexical head. dependency system, which expresses the syntactic Hierarchical formalisms have been proposed in knowledge through dependency rules. The grammar some theories. Pollard and Sag (1987) suggested and the lexicon coindde, since the rules are lexical- a hierarchical organization of lexical information: ized: the head of the rule is a word of a certain cate- as far as subcategorization is concerned, they in- gory, namely the lexical anchor. The formalism is a troduced a "hierarchy of lexical types". A specific shnplified version of (Lombardo and Lesmo, 1998); formalisation of this hierarchy has never reached a we have left out the treatment of long-distance de- wide consensus in the HPSG community, but sev- pendencies to focus on the subcategorization knowl- eral proposals have been developed - see for example edge, which is to be represented in a hierarchy. (Meurers, 1997), that uses head subtypes and lexical principles to express generalizations on the valency A dependency grammar is a five-tuple
59 grammar) and sets of subcategories, L : W --~ 2 T- {}. That is, each word can "belong" to one or mo- re subcategories; D is a set of dependency relations (as in section 2); Q is a set of subcategorization frames. Each subcate- gorization frame is a total mapping q : D -4 Rx 2 T, where R is the set of pairs of natural numbers Figure 1: Dependency tree of the sentence Gg arnici io
60 Vtz, t~ E T Vdr E H DepSetq = (t, <_T t2 ^ dr ~ rules(h) --~ dr ~ rules(t~)) { {<,ogg, N>, #}, In order to specify the correspondence between sub- { where {
61 0), the constraint is assumed to be respected in case I t137 the number of actual instantiations is 0. The ordering relation is transitive, namely: [1,1]j¢" |O,1]P ~ [0,21 ! iN} iN, {Prep) if 63 ZITI~'~IF.ATZOm IFJUCDZC~'~ tJU~XOH I I NOUN~ NOUN~ SP~C/ SP~ ---.2 sP~/ I DETERNr~-] 1 DETZRM~'~ ~. $¢A/@C.T/~ PJU~ZC~ZCW O~OI.g27CW I V-TR [credevano] V-TR ['c~dev..o~ V-TR [clredevano] 4 I ,-.~-~--, ~ 3 ,-.-7.~.~ 2 3 _ I Figure 3: Analysis of the sentence Gli amid |o credevano ~ ~ ~ friends con.~dered him a hero". eral literary genres of written Italian. The whole hierarchy has 6 levels: the top level (class The information required by our formalism -- the VERB) represents the general constraints for Italian I grammatical relations associated to the dependents, verbs, the top+l level distinguishes the constraints their number (Nq(d)) and the set of categories for impersonal(V1), intransitive (VERB-INTR) and (Vq(d)) that can realize them -- was partly obtained transitive (VERB-TRANS) verbs, the top+2, top+3, by consulting Italian dictionaries, partly based on etc. levels represent specific classes of verbs (from I native speakers intuitions, and mostly from the anal- V2 to VS0). ysis of the corpus. 5.3 Results All the sentences containing the verbs under anal- The graph in figure 5 shows the distribution of verbs ysis were automatically extracted from the corpus, I and the subcategorization patterns (rules) exhibited by type, namely how the number of verbs covered by the classes grows in relation to the number of classes. by the verbs in those sentences were manually col° We can see that the first (more common) class covers lected. 15 verbs, the first and second more common classes ! We represented the set of subcategorization patterns together covers 26 verbs, etcetera. With the first 9 (rules) as subcategorization frames, by associating classes we cover 63 verbs, giving rise to a reduction with each grammatical relation - according to the of 85.7% compared to having a distinct subcatego- formalism - the related number (Nv(d)) and value rization frame for each verb. With the first 18 classes (Vq(d)) restrictions computed on the corpus. In this l we cover 81 verbs (reduction of 77.7%). The whole test, we have kept the order between the dependents set of verbs requires, however, 50 classes (reduction of a verb free, so there are no ordering constraints. of 50.5%): in fact, we have found many verbs with its I Each class tt is connected to supexclass t2. very idiosyncratic behaviours. Diff(tl, t2), the difference between the constraints Table 2 shows the distribution of verbs by token associated with tl and the ones associated with t.~, (sum of the occurrences, in the corpus, of all the is expressed by specifying, for each relation that is verbs referring to each class), level by level. The I restricted from t2 to tl, the relation itself with the fact that some rare classes occur is interesting if com- new number and value restrictions. pared to the percentage of reduction in the represen- 5.2 Hierarchy tation. There is a compression of 55,7%, while still taking care of very low frequency patterns, where i Figure 4 illustrates a small portion of the resulting compression is almost 0%. hierarchy. This hierarchy is based on the depen- In Table 3, we show, for each level, the number dency relations for a generic Italian verb summa- of subcategorization patterns represented by all the I rized in Table 1 s. classes of that level, namely the sum of the patterns 6Usually the adjuncts are not indicated as part of the sub- of each class at that level. The number of patterns categorization frames of the verbs: they are not obligatorily decreases rapidly by d~,cending the hierarchy. required by the verbs themselves. We have specified them anyway, as the hierarchy represents the grammar - which in- The representation of the syntactic knowledgeconcerning ad- I cludes all the information about the dependents, adjuncts in- juncts is currently a research goal. Most authors tend to cluded. Moreover, by specifying the information about the avoid it in the representation of subcategorlzation frames - adjuncts at the top level, we maintain the clarity of the rep- see (Hudson, 1990) and the "adjoining" operation in LTAG I resentation and the mapping on the formal grammar. (Josh| and Schabes, 1996). I l 64 t GRAMMATICAL SYNTACTIC CATEGORY EXAMPLE + TRANSLATION RELATION (value restriction) subject nominal group, N Paolo area Maria (Soot) "Paolo loves Maria" embedded clause headed by the Mi diverte che tu dica ci~), complementizer che, "that", Cheaub "It amuses me that you say this" preposition di, "off, Prep[di] Non mi interessa di venire, "I am not interested in coming" infinitive verb, Verb[inf] ~qdar¢ t be/to, "Skiing is nice" object nominal group, N Gianni mangia una mela, (Oct) "Gianni eats an apple" embedded clause headed by the Credo J~e_z~l_d/lffizlm~, complementizer ehe, "that", Chesub "I think it is amusing" preposition d/, "of", Prep[di] Aepetto di partite, "I am waiting to leave" predicative complement nominal group, N Gon.~idero Piero ~ Omico, (PRED) "I consider Piero a friend" adjective group, Adj Luigi ~ gentile, "Luigi is prepositional group, Prep Tuo zio ~ senza ritegno, "Your uncle is without reserve" complements prepositional group, Prep Metro il vaso (COMPL), "I put the vase on the table" at most 3 clitic, Clitic Gli ho dato un libro, "I gave him a book" adjunct prepositional group, Prep Procedevo di buon p~so, (AGGIUNTO) "I was walking at a brisk pace" adverb, Ado Luigi corse veloeemente, ,Luigi ran quickly" conjunction, Conjdip Telefonami fuando puoi, "Call me when you can" verb of non-finite mood, Verb[non.fin] Camminavo ~schiettando, "I was walking while whistling" Table I: The grammatical relations of a generic Italian verb, with their possible realizations and related examples. For the patterns found in the texts, we observe a de- The hierarchical formalism has shown to be effective crease similar but less marked than the grammatical in representing parsimoniously - that is, without re- patterns. Even the more specific classes describe a dundancy - the syntactic and lexical knowledge in good portion of the patterns in the texts, so confirm- an empirical test on 101 Italian verbs. ing the usefulness of very specific information in the Moreover, we have sketched a left-to-right predictive analysis. parsing model that takes advantage of the hierarchi- Table 2 show this point more clearly. The lower, cal knowledge representation in order to make pre- more specific levels, while having fewer classes, still dictions on the structure of the input sentence. cover many occurrences of verbs in the text. In the next future we will address a massive empiri- cal test of Italian corpora, and the formal specifica- 6 Conclusion tion of the parsing model, together with a complex- ity analysis. The paper has presented a hierarchical organization of a dependency formalism. The hierarchy is defined by the subsumption relation on subcategories, de- fined as a mapping between subcategories and sub- 7 Acknowledgements categorization frames. Subcategorization frames, in turn, define the number of possible instantiations This research was partly sponsored by the Swiss of a dependency relation and the subcategories that National Science Foundation, with fellowhip 8210- can realize it. 46569 to P. Merlo. 65 VERB AGGIUNTO lo.,lv llOo,l ",,!,o.,! 1 [O, Lnf] IN, Chesub, {N, Chesub, {N,AdJ, {Prep, {P repo Adv, Verblinf], Prep[dil } Prep} Clltle} ConJdlp, Prep[di]} Verb[no flnl} VERB-ZNTR ~-T~NS SOGG OGG COMPL [0.0l V°.01 10,Ol [1,1l # [0.0] • l {N, Chesub, { } {N, Chesub, {N, Chesub, {} 1} 1} 1} Verb [Infl, Verb[inf] } Prep[dl] } l Prep [dl] } V$O TERMINE I[0,1] |lO, l {Prep[a], {N} {Adj] {Preplda] } Clitic [dat ] } Figure 4: A portion of the hierarchy. Subclasses inherit and restrict the constraints at the top of the hierarchy. The top class, VERB, has three daughters. V1 is the class of impersonal verbs, that can only have adjuncts as dependents - the restriction is on the range, [0, 0], of the other relations. For example, we can say Piove o,i tetti della cittd, "It rains on the roofs of the town". The classes VERB*INTR and VERB-TRANS correspond to intransitive and transitive verbs, respectively. VERB-INTR requires an obligatory subject ([1,1]) and it cannot have a direct object ([0, 0]). VERB-TRANS requires an obligatory subject ([1,1}), that can be headed by a nominal element, a conjunction ch¢ or an infinitive verb, and an obligatory object ([1,1]). A subclass of VERB-INTR, V2, is shown: its only restriction is on the relation COMPL, which is specialized on the subrelation TERMINE, "Indirect Object", having a range [0,1], and having Prep[a]and Clitic[dat]as associated categories (preposition lexically realized by a, "to", and dative clitlc). For example, sembrare~ "seem", is a verb belonging to this class: we can say A Luigi Maria aembra beUissima, "To Lui~ Maria seems very beautiful". VS0 is a subclass of VERB-TRANS: it restrics the sets of categories associated to the relations OGG and PRED, and specializes the relation Co~lPl. on the subrelation SEPARAZIONE, "Separation" (realized by the preposition da, "from", Prep[da)). The verb allontanare, "distance", belongs to VS0: Luigi mi aIlontanb da re, "Luigi distanced me from you". Number of I01 verbs belon?lnq to ~he classes 81 4S, 12. 21,. 15' |234S tD X| SO N~rof ClasBas Figure 5: Distribution of verbs by type. 66 ii LEVEL DISTRIBUTIONOF CLASS SIZE 1 2 5 3 423 322 318 308170 169 148 136 99 77 67 545451 47 46 45 21 20 16 14 12 11 10 3 2 4 321 292 229 116 103 897850 41206 5 5 397332 212 111 52 15 6 299 239 2 Table 2: Distribution of class size by level. PATTERNS L Mel'cuk. 1988. Dependency 5yntaz: Theory and LEVEL GRAMM. [ IN TEXT Practice. SUNY Press, Albany. 1 484531 5674 W.D. Meurers. 1997. Using lexical principles in 2 244533 5674 HPSG to generalize over valence properties. In 3 102986 2643 Proceedings of the Third Conference on Formal 4 20558 1351 Grammar, Aix-en-Provence, France. 5 1166 1135 B. NebeL 1990. Reasoning and Revision in Hy- 6 134 540 brid Representation Systems. In LNAI n. 4~. Springer-Verlag. Table 3: Number of possible and actual patterns at the F. Palazzi and G. Folena. 1992. Dizionario della levels of the hierarchy. lingua italiana. Loescher. C. Pollard and I. Sag. 1987. Information-based syn- taz and semantics, vol. 1 Fundamentals. CSLL References L. Renzi. 1988. Grande grammatica italiana di con- C. Barbero. 1998. On the granularity of information sultazione~ I1 Mulino. in syntactic representation and processing: the use Y. Schabes. 1990. Mathematical and Computational of a hierarchy of syntactic classes. Ph.D. thesis, Aspects of Lezicalized Grammars. Ph.D. thesis, Universith di Torino. University of Pennsylvania. T. Becker. 1993. HyTAG: a new type of Tree Ad- K. Vijay-Shanker and Y. Schabes. 1992. Structure joining Grammars for Hybrid Syntactic Represen- sharing in lexicalized tree-adjoining grammars. In tation of Free Order Languages. Ph.D. thesis, Proceedings of COLING'92. University of Saarbruecken. M.H. Candito. 1996. A principle-based hierarchical representation of LTAGs. In Proceedings of COL- ING'g6. C. Doran, B. Hockey, P. Hopely, J. Rosenzwieg, A. Sarkar, B. Srinivas, F. Xia, A. Nazr, and O. Ranbow. 1997. Maintaining the Forest and Burning out the Underbrush in XTAG. In Com- putational Environments for Grammar Develop- ment and Language Engineering (ENVGRAM). tL Evans, G. Gazdar, and D. Weir. 1995. Encoding Lexicalized Tree Adjoining Grammar with a Non- monotonic Inheritance Hierarchy. In Proceedings of ACL'95. R. Hudson. 1990. English Word Grammar. Black- well. A. Joshi and Y. Schabes. 1996. Tree-Adjoining Grammars. In Handbook of Formal Languages and Automata. Springer-Verlag, Berlin. V. Lombardo and L. Lesmo. 1996. An Earley-type recognizer for Dependency Grammar. In Proceed- ings of COLING'96. V. Lombardo and L. Lesmo. 1998. Formal aspects and parsing issues of dependency theory. In Pro- ceedings of A CL-COLING'g8. 67