
Complements and Adjuncts in Dependency Grammar Parsing Emulated by a Constrained Context-Free Grammar

Tom B.Y. Lai
Dept. of Chinese, Translation and Linguistics
City University of Hong Kong
Tat Chee Avenue, Kowloon
Hong Kong
Dept. of Computer Science and Technology
Tsinghua University, Beijing
[email protected]

Changning Huang
Dept. of Computer Science and Technology
Tsinghua University, Beijing
Beijing 100084
China
[email protected]

Abstract

Generalizing from efforts to parse natural language sentences using the grammar formalism, Dependency Grammar (DG) has been emulated by a context-free grammar (CFG) constrained by grammatical function annotation. Single-headedness and projectivity are assumed. This approach has the benefit of making general constraint-based context-free grammar parsing facilities available to DG analysis. This paper describes an experimental implementation of this approach using unification to realize grammatical function constraints imposed on a dependency structure backbone emulated by a context-free grammar. Treating complements of a head word using subcategorization lists residing in the head word makes it possible to keep phrase-structure-rule-like mechanisms to the minimum. Adjuncts are treated with a syntactic mechanism that does not reside in the lexicon in this generally lexical approach to grammar.

Introduction

The mathematical properties of Dependency Grammar (Tesniere (1959)) are studied by Gaifman (1965) and Hays (1964). Following in their footsteps, Robinson (1970) formulates four axioms to govern the well-formedness of dependency structures:

(a) One and only one element is independent;
(b) All others depend directly on some element;
(c) No element depends directly on more than one other;
(d) If A depends directly on B and some element C intervenes between them (in linear order of string), then C depends directly on A or on B or some other intervening element.

These axioms require that every word should depend on only one other word and that, arranging words in linear order, crossing of dependency links as in Fig. 1 should not be allowed.

Yuyanxue wo zhidao ta xihuan
linguistics I know likes

Fig. 1

These are effectively the requirements of single-headedness and projectivity.

While there are some schools of DG that do not follow Robinson's axioms in their entirety (e.g. Hudson (1984, 1990), Melcuk (1988)), many computational linguists working on DG-based parsing have based their work on these assumptions (e.g. Hellwig (1986), Covington (1990)).
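Robinson's four axioms can be checked mechanically on a candidate structure. The following is a minimal sketch in Python, not part of the original implementation. Words are numbered by linear position and each link is a (dependent, head) pair. The function name `violated_axioms` and the links assumed for the sentence of Fig. 1 are illustrative assumptions, since the figure itself does not list its links explicitly. Axioms (a) and (b) are checked together, and acyclicity is not checked.

```python
from typing import Iterable, List, Tuple


def violated_axioms(n: int, links: Iterable[Tuple[int, int]]) -> List[str]:
    """Return the labels of Robinson's axioms violated by a structure.

    n     -- number of words, indexed 0..n-1 in linear order
    links -- (dependent, head) pairs; a word may appear with several heads
    """
    links = list(links)
    heads = {i: [h for d, h in links if d == i] for i in range(n)}
    violated = set()

    # (a)/(b): exactly one element is independent, all others have a head.
    if sum(1 for i in range(n) if not heads[i]) != 1:
        violated.add("a-b")

    # (c): no element depends directly on more than one other.
    if any(len(hs) > 1 for hs in heads.values()):
        violated.add("c")

    # (d): every element between a dependent and its head must depend on
    # one of the two or on another element lying between them.
    for d, h in links:
        lo, hi = min(d, h), max(d, h)
        for k in range(lo + 1, hi):
            if not any(lo <= g <= hi for g in heads[k]):
                violated.add("d")

    return sorted(violated)


# Assumed analysis of "Yuyanxue wo zhidao ta xihuan" (hypothetical links):
# zhidao is the main element and yuyanxue depends on xihuan across it.
FIG1_LINKS = [(0, 4), (1, 2), (3, 4), (4, 2)]
print(violated_axioms(5, FIG1_LINKS))
```

Under the assumed links, only axiom (d) is reported, because the independent element zhidao lies between yuyanxue and its head xihuan.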

DG parsing of Chinese has used statistical corpus-based algorithms (Huang et al. (1992), Yuan and Huang (1992)), rule-based [...] may or may not take word order into consideration; but they all observe Robinson's axioms in their entirety. They also label dependency relations with grammatical functions like subject and object.

Generalizing over DG-based parsing of Chinese, Lai and Huang (1994) note that, taking linear word order into consideration, this approach to DG can be emulated by a model having a context-free constituent component that is constrained by a grammatical function component, very much in the spirit of the constituent and functional structures of Lexical Functional Grammar (LFG, Bresnan ed. (1982)). The syntactic dependency structure in this approach to DG is however different from a context-free phrase structure in that non-lexical phrasal nodes are not allowed. As in LFG, the grammatical function structure, which provides the constraining mechanism, is mathematically a graph rather than a tree. This relieves the syntactic dependency component of any need to be multiple-headed and non-projective (Lai and Huang (1995)). Following this approach, Lai and Huang (in press) describe a unification-based (Shieber (1986)) experimental parser adapted from the PATR parser in Gazdar and Mellish (1989). Control and simple semantic analysis are handled.

The present paper discusses issues of using a constrained CFG to emulate DG. Section 1 explains the implications of Robinson's axioms and describes Hays' CFG-like formulation of dependency rules. Section 2 formulates the "dependency rule with grammatical function annotation" model and describes its emulation with PATR. Section 3 discusses how the lexical orientation of DG motivates a proper distinction between complements subcategorized for by the head and adjuncts that are not, and describes how this can be accomplished in a constrained CFG emulation using subcategorization lists in the lexicon. A distinction between grammatical information residing and not residing in the lexicon is noted. Section 4 discusses the real nature of the constrained CFG emulation of DG. Though DG can be usefully emulated by a constrained CFG model, the formalism, at least as defined by Robinson's axioms, does have aspects that cannot be modelled elegantly by PSG.

1 Representation of Dependency

The governor-dependent (head-modifier) relationship between words in an utterance can be represented as for the Chinese sentence (from Yuan and Huang (1992)) in Fig. 2:

Na ren zai gongyuan li
person in park inside

Fig. 2

The main (or central) element, using Hays' (1964) terminology, of the sentence is zai. Its immediate dependants are ren and li, which, in turn, have dependants of their own. This sentence can also be represented as in Fig. 3 (Tesniere's stemma):

        zai
       /   \
    ren     li
    /         \
  na        gongyuan

Fig. 3

If we do not mangle up word order in the dependency structure of Fig. 3, it can be seen that it is equivalent to the tree structure in Fig. 2.

Based on the work of Gaifman (1965), Hays (1964) proposes rules of the following form for the generation of dependency structures:

(a) X(A, B, C, ..., H, *, Y, ..., Z)
(b) X(*)
(c) *(X)

Fig. 4

In Fig. 4, (a) states that the governing auxiliary symbol X has dependants A, B, C, ..., H, Y, ..., Z and that X itself (the governor) is situated between H and Y. (b) says that the terminal symbol X occurs without any dependants. (c) says that X occurs without any governor, i.e. it is the main or central element. Gaifman (1965) establishes that a Dependency Grammar obtained in this way is equivalent to a phrase structure grammar in the sense that:

- they have the same terminal alphabet;
- for every string over that alphabet, every structure attributed by either grammar corresponds to a structure attributed by the other.

Robinson's (1970) four axioms (v. supra) license the same kinds of dependency structures as Hays' rules. Unlike these rules, they do not contain unnecessary stipulation involving linear word order. It is easy to see that the third axiom is a requirement of single-headedness. As for the fourth axiom, consider Fig. 5:

Fig. 5

The axiom stipulates that the governor of C must be located between A and B (which are themselves possible governors). Seen in the Haysian cast, this effectively requires that the link between C and its governor should not cross the lines AB' and BB'. This is thus a requirement of projectivity.

2 Dependency Rules and Grammatical Function Annotation

Generalizing over approaches adopted in DG-based parsing of Chinese, Lai and Huang (1994) noted that grammatical functions like subject and object are generally used to label dependency links. These labels are not found in Hays' dependency rules and Robinson's axioms. In a sense, they are entities on a second level. Borrowing the idea of functional annotation from LFG, they proposed annotated dependency rules of the following form:

X(A(fa), B(fb), ..., *, ..., Z(fz))

For example, the following rules account for transitive verbs:

(a) *(TV)
(b) TV(N(subj), *, N(obj))
(c) N(*)

Fig. 6

While this is not a phrase-structure grammar (PSG), Lai and Huang (in press) exploited the obvious affinity between Robinson-style DG and PSG and implemented a DG parser using the PATR of Gazdar and Mellish (1989). In this implementation, the Chinese sentence

Zhang San kanjian Li Si
name saw name

is accounted for by the annotated rules (simplified, with semantic analysis mechanisms stripped for brevity) in Fig. 7:

    Rule TVP ---> [PN1, TV, PN2] :-
        TVP:cat === tv,
        TV:cat === tv,
        PN1:cat === n,
        PN2:cat === n,
        TVP:ds === TV:ds,
        TVP:ds:subj === PN1:ds,
        TVP:ds:obj === PN2:ds.

    Word kanjian :-
        W:cat === tv,
        W:ds:head === kanjian.

    Word 'zhangsan' :-
        W:cat === n,
        W:ds:head === 'zhangsan'.

    Word 'lisi' :-
        W:cat === n,
        W:ds:head === 'lisi'.

Fig. 7
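As a rough illustration of how the constraints in Fig. 7 build a grammatical function structure, the following Python sketch replays the TVP rule on the three lexical entries. It is a simplification under stated assumptions and not the PATR implementation itself: feature structures are plain nested dictionaries, `unify` copies rather than shares structure, and the names `unify`, `LEXICON` and `tvp_rule` are hypothetical.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None if two atomic values clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, val in b.items():
            if key in out:
                merged = unify(out[key], val)
                if merged is None:
                    return None
                out[key] = merged
            else:
                out[key] = val
        return out
    return a if a == b else None


# Lexical entries corresponding to the Word clauses of Fig. 7.
LEXICON = {
    "kanjian": {"cat": "tv", "ds": {"head": "kanjian"}},
    "zhangsan": {"cat": "n", "ds": {"head": "zhangsan"}},
    "lisi": {"cat": "n", "ds": {"head": "lisi"}},
}


def tvp_rule(pn1, tv, pn2):
    """Apply Rule TVP ---> [PN1, TV, PN2]; return the TVP or None on failure."""
    # Category constraints: TV:cat === tv, PN1:cat === n, PN2:cat === n.
    if unify(tv, {"cat": "tv"}) is None:
        return None
    if unify(pn1, {"cat": "n"}) is None or unify(pn2, {"cat": "n"}) is None:
        return None
    # TVP:ds === TV:ds, TVP:ds:subj === PN1:ds, TVP:ds:obj === PN2:ds.
    ds = unify(tv["ds"], {"subj": pn1["ds"], "obj": pn2["ds"]})
    if ds is None:
        return None
    return {"cat": "tv", "ds": ds}  # TVP:cat === tv


parse = tvp_rule(LEXICON["zhangsan"], LEXICON["kanjian"], LEXICON["lisi"])
print(parse["ds"])
```

The resulting `ds` value has kanjian as its head, with the structures of zhangsan and lisi filling its subj and obj functions.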


Besides outputting a grammatical function structure, the following dependency structure is produced:

[tv, [[n, [zhangsan]], [tv, [kanjian]], [n, [lisi]]]]

PATR outputs a phrase-structure tree, but after applying a pruning operation on the two tv's, a structure equivalent to Fig. 8 is obtained:

       TV kanjian
        /      \
  Zhang San   Li Si

Fig. 8

The more complicated sentence, involving a subject-control verb,

Zhang San xiang da Li Si
name want hit name

is accounted for by the following rules (necessary details only, for brevity):

    Rule CVP ---> [N, CV, VS] :-
        CVP:slash:here === no,
        CVP:slash:down === yes,
        CVP:cat === CV:cat,
        CV:cat === v,
        CV:subcat:tran === tv,
        CV:subcat:ctrl === subj,
        N:cat === n,
        VS:cat === v,
        VS:slash:here === subj,
        CVP:ds === CV:ds,
        CV:ds:subj:fill === N:ds,
        CV:ds:obj:fill === VS:ds,
        %Subject-control information follows
        CV:ds:subj:fill === CV:ds:obj:fill:subj:fill.

    Rule VS ---> [V, N] :-
        VS:slash:here === subj,
        VS:cat === V:cat,
        V:cat === v,
        V:subcat:tran === tv,
        N:cat === n,
        VS:ds === V:ds,
        V:ds:obj:fill === N:ds.

    Word xiang3 :-
        W:cat === v,
        W:subcat:tran === tv,
        W:subcat:ctrl === subj,
        W:ds:head === xiang.

Control sentences are one of the motivations for DG grammarians like Hudson (1994) to give up the single-headedness and projectivity requirements, finding it difficult, for example, not to allow the controlled verb da to have both Zhang San and xiang as its heads, violating single-headedness and projectivity at the same time as shown in Fig. 9:

Zhang San xiang da Li Si

Fig. 9

By introducing a level of grammatical function to accommodate such complications, Lai and Huang (1995; in press) preserve single-headedness and projectivity in the syntactic dependency structure as in Fig. 10:

Zhang San xiang da Li Si

Fig. 10

Other difficulties involving raising, extraction, tough-movement and extraposition (Hudson (1994)) can be dealt with similarly.

This two-level approach to DG parsing is essentially a context-free PSG constrained by grammatical function annotations.
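The division of labour between the two levels in the control example can be sketched in Python as follows. This is only an illustration under assumptions: the syntactic links are read off the CVP and VS rules (N and VS depend on CV, N depends on V), the subject-control equation is modelled by letting two slots point to the same Python object, and the names `SYNTACTIC_HEAD` and `governors_of` are hypothetical.

```python
# Syntactic dependency level: every word has at most one head.
SYNTACTIC_HEAD = {
    "Zhang San": "xiang",
    "xiang": None,  # main element
    "da": "xiang",
    "Li Si": "da",
}

# Grammatical function level, built as in the rules for CVP and VS.
zhangsan = {"head": "zhangsan"}
lisi = {"head": "lisi"}
da_ds = {"head": "da", "obj": {"fill": lisi}, "subj": {}}
xiang_ds = {"head": "xiang", "subj": {"fill": zhangsan}, "obj": {"fill": da_ds}}

# Subject control: CV:ds:subj:fill === CV:ds:obj:fill:subj:fill.
# Both slots now refer to one and the same structure.
xiang_ds["obj"]["fill"]["subj"]["fill"] = xiang_ds["subj"]["fill"]


def governors_of(ds, target):
    """List (head, function) pairs for every slot that `target` fills."""
    found = []
    for fun, slot in ds.items():
        if fun == "head":
            continue
        filler = slot["fill"]
        if filler is target:
            found.append((ds["head"], fun))
        found.extend(governors_of(filler, target))
    return found


print(governors_of(xiang_ds, zhangsan))
```

In this sketch the structure for Zhang San fills the subj function of both xiang and da, so the function level is a graph with a shared node, while `SYNTACTIC_HEAD` still assigns each word a single head.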

A grammatical function structure accompanies the dependency structure of a legal sentence just as a functional structure is associated with a constituent structure in LFG. Morphological and semantic constraints (Melcuk (1988)) can also be dealt with on additional levels.

3 Complements and Adjuncts

The constrained CFG emulation of DG described in the previous section inevitably prompts the question whether it is still a DG. In his foreword in Starosta (1988), Hudson mentioned three characteristics of DG. First, DG should be monostratal in the sense that there should be no transformations. Second, dependency should be basic, not derived. Third, the rules of grammar should not be formally distinct from subcategorization facts. Lai and Huang's approach meets the first two criteria. While the proper treatment of adjuncts will be discussed below, the close coupling of the phrase-structure-rule-like dependency rules and subcategorization properties discussed in the previous section also gives the approach the third characteristic.

One may feel somewhat uncomfortable about phrase-structure rules or phrase-structure-rule-like mechanisms playing an important role in an emulation of DG. After all, although it is true that Hays' rules work like phrase structure rules, conformity to Robinson's axioms does not imply that the process of sentence recognition will necessarily have an image in a PSG. The situation is particularly critical in the treatment of adjuncts that are not subcategorized for by a head word. We could quite easily deal with adjuncts in the manner of the following annotated phrase structure rule in LFG:

VP → V   NP           ZP
         (↑obj) = ↓   (↑adjunct) = ↓

But we would then have to let a large number of phrase structure rules not related to subcategorization facts slip into the grammar. This violates the third criterion mentioned above and is obviously undesirable.

This being a critical problem of constrained CFG emulation of DG, we adopt another approach by exploiting the fact that the category labels of a head word and its dominating nodes are the same. Using (simplified) PATR notation, the two generic rules:

X → X Y
    Y:fun = adjunct

X → Y X
    Y:fun = adjunct

Fig. 11

will be able to cover all kinds of adjunct rules. The two X's on the two sides of the arrow are short-hand for two different symbols, say X1 and X2, constrained by the condition X1:cat = X2:cat.

As subcategorization information has to be encoded in the lexical items anyway, there is nothing seriously wrong with phrase-structure-rule-like stipulations about complements. However, we should note that, in Chinese and English, adjuncts generally do not come between a head word and its "unmoved" non-subject complements. This could be taken care of by adding a bar-level feature to the rules in Fig. 11 as in Generalized Phrase Structure Grammar (GPSG, Gazdar et al. (1985)). But then we would be relying more heavily on phrase-structure-rule-like mechanisms. Instead of this, we find the alternative method of treatment in Head-Driven Phrase Structure Grammar (HPSG, Pollard and Sag (1994)) convenient.

First, lexical entries are as in Fig. 12:

gei ('give')
    cat = v
    subcat.left = [n(subj)]
    subcat.right = [n(iobj), n(obj)]

Fig. 12

Rules like the following (necessary details only, for brevity) will take care of complements subcategorized for by the head word:

V → V X
    {cat(fun) = pop(V:subcat.right)}
    % fails if V:subcat = []
    X:cat = cat
    X:fun = fun

V → X V
    {cat(fun) = pop(V:subcat.left)}
    % fails if V:subcat = []
    X:cat = cat
    X:fun = fun

Fig. 13
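The way the complement rules of Fig. 13 consume the subcategorization lists of Fig. 12 can be sketched in Python. This is an illustrative approximation and not the PATR-compatible implementation used in the parser: the `Head` class, the `attach` function and the word shu ('book') are assumptions introduced for the example, and the destructive pop is modelled by returning a new head with a shorter list.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

Comp = Tuple[str, str]  # (category, grammatical function), e.g. ("n", "obj")


@dataclass(frozen=True)
class Head:
    word: str
    cat: str
    left: Tuple[Comp, ...]   # subcat.left, closest complement first
    right: Tuple[Comp, ...]  # subcat.right, closest complement first
    deps: Tuple[Tuple[str, str], ...] = ()  # (function, dependent word)


def attach(head: Head, side: str, word: str, cat: str) -> Optional[Head]:
    """Attach `word` as the next complement on `side` ("left" or "right").

    Mirrors cat(fun) = pop(V:subcat.<side>): fails (returns None) if the
    list is empty or if the category of the word is not the one expected.
    """
    pending = getattr(head, side)
    if not pending:
        return None
    (want_cat, fun), rest = pending[0], pending[1:]
    if cat != want_cat:
        return None
    return replace(head, **{side: rest}, deps=head.deps + ((fun, word),))


# Lexical entry of Fig. 12.
GEI = Head("gei", "v", left=(("n", "subj"),), right=(("n", "iobj"), ("n", "obj")))

# Hypothetical sentence "Zhang San gei Li Si shu", complements taken
# from the head outwards: first the right list, then the left list.
step1 = attach(GEI, "right", "Li Si", "n")
step2 = attach(step1, "right", "shu", "n")
step3 = attach(step2, "left", "Zhang San", "n")
print(step3.deps)
```

Once both lists are empty, any further call to `attach` returns None, which corresponds to the failure condition noted in the comments of Fig. 13.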

Adjuncts are kept from getting in between a head word and its unmoved non-subject complements by adding constraints like

X:subcat.right = []

to the rules in Fig. 11.

As shown in Fig. 12, a lexical entry has two subcategorization lists, one for complements on its left and one for complements on its right, an inspiration from Yuan and Huang (1992). The elements in a subcategorization list are arranged so that the one that occurs closest to the head word is at the head. The rules in Fig. 13 are presented in a form that is easily understood by readers. The pop operation, which is procedural in nature (hence the braces), hands over to the caller the head element of the subcategorization list, removing it from the list at the same time. It is actually implemented in a PATR-compatible manner.

This scheme works for Chinese and English, in which unmoved non-subject complements follow the verb. Adjustments are required for moved complements. Adjustments are also required for other languages.

It should be noted that the rules, in the spirit of Robinson's axioms, try not to meddle with word order as far as possible. In this respect, PATR is inelegant in that it has to have two symmetrical adjunct rules in Fig. 11 and two symmetrical complement rules in Fig. 13. This inelegance seems to be inherent in the PSG nature of PATR.

4 Nature of the Emulation Model

An examination of the real nature of our emulation model is in order. As a computational emulation of a DG conforming to Robinson's four axioms and using grammatical functions to label dependency links, it sanctions sentences with dependency structures that satisfy the single-headed and projective conditions. Well-formed dependency structures are accompanied by grammatical function structures that, inter alia, ensure that subcategorization properties of lexical items are satisfied. Grammatical function structures do not have to conform to the single-headed and projective conditions. Morphological and semantic constraints can be accommodated similarly.

Most grammatical mechanisms in the emulation are triggered by lexical information. Hays-style dependency rules are emulated by phrase-structure mechanisms in PATR. Rules for complements of the head word derive their real power from lexical subcategorization information. They thus meet the criterion that rules of grammar of a DG should not be formally distinct from subcategorization facts.

Adjunct rules are not related to any subcategorization facts. We believe that their existence (in small numbers) is justified in our emulation model. Even in a DG formalism that does not have phrase-structure-like rules, there have to be some general facilities to take care of such non-lexical grammatical mechanisms. DG is lexically orientated, but it has to cope with non-lexical grammatical mechanisms, where they exist, in language.

The dependency rule cum functional constraint emulation in Lai and Huang (1994; 1995; in press) has obviously been influenced by LFG. With the introduction of mechanisms to handle complements and adjuncts in the previous section, the emulation model has moved towards HPSG. Grammatical function constraints, which work like functional annotations in LFG, provide the main facilities to resolve grammatical problems like control. On the other hand, dependency rules, emulated by PATR phrase-structure rules, are kept to a minimum and deal with subcategorization and adjoining with an HPSG-like mechanism.

The emulation model, however, remains an emulation. The Chinese parsing experiments from which the generalization has been made do not all use (context-free) phrase structure rules (e.g.
Yuan et al. (1992); Zhou and Huang (1994)). PATR rules are useful only in so far as they can produce structures that can be transformed to dependency structures.

Conclusion

We have thus been committed in our efforts to emulate a Robinson-style Dependency Grammar with a lexically oriented Context-Free Grammar constrained by grammatical function annotations. Besides providing a formalism for valid and illuminating linguistic analysis, this emulation has enabled us to implement a unification-based parser in PATR. We are however not necessarily committed to the claim that Dependency Grammar is a notational variant of Phrase Structure Grammar. Robinson-style dependency structures have great affinity with phrase structure trees, but they do not have to be generated by phrase structure rules. In fact, phrase structure rule-based emulation is inelegant in handling some phenomena that DG can deal with elegantly.

Acknowledgements

Our thanks go to the National Science Foundation of China for supporting the research reported in this paper.

References

Bresnan J.W., ed. (1982) The Mental Representation of Grammatical Relations. MIT Press, Cambridge, U.S.A., 874p.
Covington M.A. (1990) Parsing Discontinuous Constituents in Dependency Grammar. Computational Linguistics, 16/4, pp. 234-236.
Gaifman H. (1965) Dependency Systems and Phrase-Structure Systems. Information and Control, 8, pp. 304-337.
Gazdar G., Klein E., Pullum G. and Sag I. (1985) Generalized Phrase Structure Grammar. Blackwell, Oxford, 276p.
Gazdar G. and Mellish C. (1989) Natural Language Processing in Prolog. Addison-Wesley, Wokingham, 504p.
Hays D.G. (1964) Dependency Theory: A Formalism and Some Observations. Language, 40, pp. 511-525.
Hellwig P. (1986) Dependency Unification Grammar. Proc. COLING 86, pp. 195-199.
Huang C.N., Yuan C.F. and Pan S.M. (1992) Yuliaoku, Zhishi Huoqu He Jufa Fenxi (Corpora, Knowledge Acquisition and Syntactic Parsing). Journal of Chinese Information Processing, 6/3, pp. 1-6.
Hudson R. (1984) Word Grammar. Blackwell, Oxford, 267p.
Hudson R. (1990) English Word Grammar. Blackwell, Oxford, 445p.
Hudson R. (1994) Discontinuous Phrases in Dependency Grammar. University of London Working Papers in Linguistics, 6, pp. 89-124.
Hudson R. (1995) Dependency Counts. In "Functional Description of Language", E. Hajicova, ed., Faculty of Mathematics and Physics, Charles University, Prague, pp. 85-115.
Lai T.B.Y. and Huang C.N. (1994) Dependency Grammar and the Parsing of Chinese Sentences. The Proceedings of the 1994 Kyoto Conference (Joint ACLIC8 and PACFoCoL2), 10-11 Aug. 1994, pp. 63-71.
Lai T.B.Y. and Huang C.N. (1995) Single-Headedness and Projectivity for Syntactic Dependency. The Linguistics Association of Great Britain Spring Conference, University of Newcastle, 10-12 August, 1995.
Lai T.B.Y. and Huang C.N. (in press) An Approach to Dependency Grammar for Chinese. In "Theoretical Explorations in Chinese Linguistics", Y. Gu, ed., Linguistic Society of Hong Kong, Hong Kong, in press.
Li J.K., Zhou M. and Huang C.N. (1993) Tongji Yu Guize Jiehe De Hanyu Jufa Fenxi Yanjiu (Study on Using Statistics and Rules at the Same Time in Syntactic Analysis). Proc. JSCL93, Xiamen University, pp. 176-181.
Maxwell D. (ms) Unification Dependency Grammar.
Melcuk I.A. (1988) Dependency Syntax: Theory and Practice. State University of New York Press, Albany, 428p.
Pollard C. and Sag I. (1994) Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago, 440p.
Robinson J.J. (1970) Dependency Structures and Transformation Rules. Language, 46, pp. 259-285.
Shieber S.M. (1986) An Introduction to Unification-Based Approaches to Grammar. Chicago University Press, Chicago, 105p.
Starosta S. (1988) The Case for Lexicase. Pinter, London.
Tesniere L. (1959) Elements de Syntaxe Structurale. Klincksieck, Paris.
Yuan C.F. and Huang C.N. (1992) Knowledge Acquisition and Chinese Parsing Based on Corpus. Proc. COLING 92, Nantes, France, pp. 13000-13004.
Zhou M. and Huang C.N. (1994) An Efficient Syntactic Tagging Tool for Corpora. Proc. COLING 94, Kyoto, pp. 949-955.
