An Earley-type recognizer for dependency grammar

Vincenzo Lombardo and Leonardo Lesmo
Dipartimento di Informatica and Centro di Scienza Cognitiva
Università di Torino
c.so Svizzera 185, 10149 Torino, Italy
e-mail: {vincenzo, lesmo}@di.unito.it
Abstract

The paper is a first attempt to fill a gap in the dependency literature, by providing a mathematical result on the complexity of recognition with a dependency grammar. The paper describes an improved Earley-type recognizer with a complexity O(|G|²n³). The improvement is due to a precompilation of the dependency rules into parse tables, that determine the conditions of applicability of two primary actions, predict and scan, used in recognition.

1 Introduction

Dependency and constituency frameworks define different syntactic structures. Dependency grammars describe the structure of a sentence in terms of binary head-modifier (also called dependency) relations on the words of the sentence. A dependency relation is an asymmetric relation between a word called head (governor, parent), and a word called modifier (dependent, daughter). A word in the sentence can play the role of the head in several dependency relations, i.e. it can have several modifiers; but each word can play the role of the modifier exactly once. One special word does not play the role of the modifier in any relation, and it is named the root. The set of the dependency relations that can be defined on a sentence form a tree, called the dependency tree (fig. 1a).

Figure 1. A dependency tree (a) and a p.s. tree (b) for the sentence "The chef cooked a fish". The leftward or rightward orientation of the arrows in the dependency tree represents the order constraints: the modifiers that precede the head stand on its left, the modifiers that follow the head stand on its right.

Although born in the same years, dependency syntax (Tesniere 1959) and constituency, or phrase structure, syntax (Chomsky 1956) (see fig. 1b), have had different impacts. The mainstream of formalisms consists almost exclusively of constituency approaches, but some of the original insights of the dependency tradition have found a role in the constituency formalisms: in particular, the concept of head of a phrase and the use of grammatical relations. The identification of the head within a phrase has been a major point of all the recent frameworks in linguistics: the X-bar theory (Jackendoff 1977) defines phrases as projections of (pre)terminal symbols, i.e. word categories; in GPSG (Gazdar et al. 1985) and HPSG (Pollard, Sag 1987), each phrase structure rule identifies a head and a related subcategorization within its right-hand side; in HG (Pollard 1984) the head is involved in the so-called head-wrapping operations, which allow the formalism to go beyond the context-free power (Joshi et al. 1991).

Grammatical relations are the primitive entities of relational grammar (Perlmutter 1983) (classified as a dependency-based theory in (Mel'cuk 1988)): subject, object, xcomplement, ... label the dependency relations when the head is a verb. Grammatical relations gained much popularity within the unification formalisms in the early 1980's. FUG (Kay 1979) and LFG (Kaplan, Bresnan 1982) exhibit mechanisms for producing a relational (or functional) structure of the sentence, based on the merging of feature representations.

All the recent constituency formalisms acknowledge the importance of the lexicon, and reduce the amount of information brought by the phrasal categories. The "lexicalization" of context-free grammars (Schabes, Waters 1993) points out many similarities between the two paradigms (Rambow, Joshi 1992). Dependency syntax is an extremely lexicalized framework, because the phrase structure component is totally absent. Like the other lexicalized frameworks, the dependency approach does not produce spurious grammars, and this facility is of practical interest, especially in writing realistic grammars. For instance, there are no heavily ambiguous, infinitely ambiguous or cyclic dependency grammars (such as S → SS; S → a; S → ε; see (Tomita 1985), pp. 72-73).
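The well-formedness conditions just stated (each word is a modifier exactly once, a single root, the relations form a tree) can be checked mechanically. The following sketch is purely illustrative; the representation of arcs as position pairs and all names are our own, not the paper's.

```python
def is_dependency_tree(n, arcs):
    """n words (positions 0..n-1); arcs is a set of (head, modifier)
    pairs.  Returns True iff the arcs form a dependency tree."""
    heads = {}
    for h, m in arcs:
        if m in heads:            # a word may be a modifier only once
            return False
        heads[m] = h
    roots = [w for w in range(n) if w not in heads]
    if len(roots) != 1:           # exactly one word is the root
        return False
    # acyclicity: walking up from any word must terminate at the root
    for w in range(n):
        seen = set()
        while w in heads:
            if w in seen:
                return False
            seen.add(w)
            w = heads[w]
    return True

# "the chef cooked a fish" (fig. 1a): cooked (2) governs chef (1) and
# fish (4); each determiner depends on its noun
arcs = {(2, 1), (1, 0), (2, 4), (4, 3)}
print(is_dependency_tree(5, arcs))   # True
```

A cyclic arc set such as {(0, 1), (1, 0)} fails the root test, since no word is left without a head.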
Dependency syntax is attractive because of the immediate mapping of dependency structures on the predicate-arguments structure (accessible by the semantic interpreter), and because of the treatment of free-word-order constructs (Sgall et al. 1986) (Mel'cuk 1988) (Hudson 1990). A number of parsers have been developed for some dependency frameworks (Fraser 1989) (Covington 1990) (Kwon, Yoon 1991) (Sleator, Temperley 1993) (Hahn et al. 1994) (Lai, Huang 1995): however, no result of algorithmic efficiency has been published as far as we know. The theoretical worst-case analysis of O(n³) descends from the (weak) equivalence between projective dependency grammars (a restricted class of dependency grammars) and context-free grammars (Gaifman 1965), and not from an actual parsing algorithm.

A dependency grammar is a quintuple <W, C, S, L, T>, where
W is a finite set of symbols (vocabulary of words of a natural language),
C is a set of syntactic categories (preterminals, in constituency terms),
S is a non-empty set of root categories (S ⊆ C),
L is a set of category assignment rules of the form X: x, where X ∈ C, x ∈ W, and
T is a set of dependency rules of the form X(Y1 Y2 ... Yi-1 # Yi+1 ... Ym), where X ∈ C, Y1 ∈ C, ..., Ym ∈ C, and # is a special symbol that does not belong to C (see fig. 2).

The modifier symbols Yj can take the form Yj*: as usual, this means that an indefinite number of Yj's (zero or more) may appear in an application of the rule¹. In the sample grammar below, this extension allows for several prepositional modifiers under a single verbal or nominal head without introducing intermediate symbols; the predicate-arguments structure is immediately represented by a one-level (flat) dependency structure.

Let x = a1 a2 ... ap ∈ W* be a sentence. A dependency tree of x is a tree such that:
1) the nodes are the symbols ai ∈ W (1 ≤ i ≤ p); ...

Figure 3. The condition of projectivity.
Figure 2 - A dependency rule: X is the governor, and Y1, ..., Ym are the dependents of X in the given order (X is in the # position).

¹ The use of the Kleene star is a notational change with respect to Gaifman: however, it is not uncommon to allow the symbols on the right-hand side of a rule to be regular expressions in order to augment the perspicuity of the syntactic representation, but not the expressive power of the grammar (a similar extension appears in the context-free part of the LFG formalism (Kaplan, Bresnan 1982)).
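A dependency rule with starred modifiers admits a direct encoding. The sketch below (names and the sample rule are our own, assumed for illustration) tests whether the left and right modifier categories of a head satisfy a rule of the form X(Y1 ... # ... Ym), treating a starred symbol as zero or more occurrences.

```python
def matches(rule, left, right):
    """rule: list of category symbols with '#' marking the head
    position, e.g. ['N', '#', 'N', 'P*'];
    left/right: category sequences of the modifiers on each side."""
    i = rule.index("#")
    return _match(rule[:i], left) and _match(rule[i + 1:], right)

def _match(pattern, seq):
    # backtracking match of a category sequence against a pattern
    # whose items may carry a Kleene star ('P*')
    if not pattern:
        return not seq
    head, rest = pattern[0], pattern[1:]
    if head.endswith("*"):
        cat = head[:-1]
        # zero occurrences, or consume one and stay on the star
        return _match(rest, seq) or (bool(seq) and seq[0] == cat
                                     and _match(pattern, seq[1:]))
    return bool(seq) and seq[0] == head and _match(rest, seq[1:])

# Assumed rule V(N # N P*): subject noun on the left, object noun and
# any number of prepositional modifiers on the right
rule = ["N", "#", "N", "P*"]
print(matches(rule, ["N"], ["N", "P", "P"]))  # True
print(matches(rule, [], ["N"]))               # False (missing subject)
```

This flat encoding mirrors the one-level dependency structure described above: the starred symbol absorbs any number of prepositional modifiers without intermediate nodes.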
The condition of projectivity (fig. 3) requires that a dependent is never separated from its governor by anything other than another dependent, together with its subtree, or by a dependent of its own.

As an example, consider the grammar G1, whose transition graphs appear in fig. 4. A state of the transition graph for a category Cat ∈ C is a set of dotted strings of the form ". β", where Cat(αβ) ∈ T and α, β ∈ (C ∪ {#})*; an edge is a triple ⟨s, Y, s'⟩, where s and s' are states and Y ∈ C ∪ {#}. In the construction of a new state s', starred symbols are expanded by the function star:

    if Y is starred
        then s' := s' ∪ star(.Yβ)
        else s' := s' ∪ {.β};
    endfor each dotted string;
    V := V ∪ {s'};

The size of the graphs for a grammar G is at most O(|G|), where |G| is the sum of the lengths of the dependency rules. The length of a dependency rule Cat(α) is the length of α.
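The precompilation of the rules of one governing category into a transition graph can be sketched as a subset-style construction over dot positions. This is a simplified reading of the procedure above, under our own representation (states as frozensets of (rule index, dot) pairs); the names are not the paper's.

```python
from collections import defaultdict

def closure(rule, dot):
    """Dot positions equivalent to `dot`: a starred symbol Y* may be
    skipped, since it admits zero occurrences."""
    dots = {dot}
    while dot < len(rule) and rule[dot].endswith("*"):
        dot += 1
        dots.add(dot)
    return dots

def build_graph(rules):
    """rules: right-hand sides (lists containing '#') of the dependency
    rules of one governing category.  Returns the initial state and an
    edge dict mapping (state, symbol) -> state."""
    start = frozenset((r, d) for r, rule in enumerate(rules)
                      for d in closure(rule, 0))
    edges, seen, todo = {}, {start}, [start]
    while todo:
        state = todo.pop()
        moves = defaultdict(set)
        for r, dot in state:
            rule = rules[r]
            if dot == len(rule):
                continue                  # a rule is completed here
            sym = rule[dot]
            if sym.endswith("*"):         # consume one more repetition
                moves[sym[:-1]] |= {(r, d) for d in closure(rule, dot)}
            else:                         # consume the symbol, advance
                moves[sym] |= {(r, d) for d in closure(rule, dot + 1)}
        for cat, dots in moves.items():
            nxt = frozenset(dots)
            edges[(state, cat)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return start, edges

# An assumed verbal rule V(N # N P*): the graph loops on P at the end
start, edges = build_graph([["N", "#", "N", "P*"]])
```

On this rule the construction yields a chain of states for N, #, N and a final state with a self-loop labelled P, exactly the shape of a transition graph with a starred modifier.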
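Before turning to the parse tables and the complexity analysis, the overall predictor/scanner/completer interplay can be made concrete with a self-contained, simplified Earley-style recognizer over dependency rules. This is an illustration under strong simplifications (no Kleene star, no precompiled tables, a tagged category sequence instead of words); it is not the paper's algorithm, and all names are our own.

```python
def recognize(rules, roots, cats):
    """rules: {category: list of right-hand sides containing '#'};
    roots: set of admissible root categories;
    cats:  category sequence of the input sentence.
    Items are (category, rhs, dot, origin) tuples, one chart per
    input position, processed with Earley's three phases."""
    n = len(cats)
    chart = [set() for _ in range(n + 1)]
    for x in roots:                                # initial predictions
        for rhs in rules[x]:
            chart[0].add((x, tuple(rhs), 0, 0))
    for i in range(n + 1):
        changed = True
        while changed:                             # iterate to fixpoint
            changed = False
            for item in list(chart[i]):
                x, rhs, dot, org = item
                if dot < len(rhs):
                    sym = rhs[dot]
                    if sym == "#":                 # scanner: head word
                        if i < n and cats[i] == x:
                            chart[i + 1].add((x, rhs, dot + 1, org))
                    elif sym in rules:             # predictor: dependent
                        for sub in rules[sym]:
                            new = (sym, tuple(sub), 0, i)
                            if new not in chart[i]:
                                chart[i].add(new)
                                changed = True
                else:                              # completer: attach
                    for px, prhs, pdot, porg in list(chart[org]):
                        if pdot < len(prhs) and prhs[pdot] == x:
                            new = (px, prhs, pdot + 1, porg)
                            if new not in chart[i]:
                                chart[i].add(new)
                                changed = True
    return any(x in roots and dot == len(rhs) and org == 0
               for x, rhs, dot, org in chart[n])

# An assumed toy grammar: V(N # N), N(D #), N(#), D(#); a category
# with no dependents gets the rule ['#']
grammar = {"V": [["N", "#", "N"]], "N": [["D", "#"], ["#"]], "D": [["#"]]}
print(recognize(grammar, {"V"}, ["D", "N", "V", "D", "N"]))  # True
```

The '#' symbol plays the role described in the formalism: it is the point where the governor's own word is scanned, while every other symbol in a rule triggers the prediction of a dependent substructure.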
Fig. 4 - The transition graphs obtained for the grammar G1 (one graph per category V, N, A, P, D).

Figure 5 - The parse tables for the grammar G1.

    parse-table(Cat, t-graphCat):
        initialize PTCat;
        for each edge ⟨s, Y, s'⟩ in E(t-graphCat) do
            if Y ∈ C then
                for each category Z ∈ first(Y) do
                    PTCat( ...

In general, an entry of the parse table can contain more than one action, although this does not happen for our simple grammar G1.

3.2 A dependency recognizer

The dependency recognizer exhibits the same data structures as Earley's algorithm: given an input sentence w0 w1 ... wn-1, it builds one set of items per input position. An item is a quadruple that includes a state of a transition graph, the position where the substructure spanned by the item begins, the category (Category) of its head, and a category (Depcat) that must be completed before continuing in the recognition of the larger structure headed by Category (Depcat = "_" means that the item is not waiting for any completion). An item with state s and position i spans a portion of the input starting at i; in Earley's terms, this item corresponds to all the dotted rules of the form Cat'(. α).

Position is bounded by O(n). The number of such quadruples in a set of items is bounded by O(|G| n), and there are n sets of items.

The time complexity of the recognizer is O(|G|²n³). The phases scanner and predictor execute at most O(|G|) actions per item; the items are at most O(|G|n²), and the cost of these two phases for the whole algorithm is O(|G|²n²). The phase completer executes at most one action per pair of items. The variables of such a pair of items are the two states (O(|G|²)), the two sets that contain them (O(n²)), and the two positions (O(n²)). But the pairs considered are not all the possible pairs: one of the sets has the same index as one of the positions, and so the complexity of the completer is O(|G|²n³). The phase completer prevails over the other two phases, and the total complexity of the algorithm is O(|G|²n³).

Even if the O-analysis is equivalent to Earley's, the phase of precompilation into the parse tables allows us to save much of the computation time needed by the predictor: all the possible predictions are precomputed in the transition to a new state. A similar device is presented in (Schabes 1990) for context-free grammars.

4 Conclusion

The paper has described a recognition algorithm for dependency grammar. The dependency formalism is translated into parse tables, which determine the conditions of applicability of the parser actions. The recognizer is an improved Earley-type algorithm, whose performance is comparable to the best recognizers for context-free grammars, the formalism which is equivalent to the dependency formalism described in this paper. The algorithm has been implemented in Common Lisp and runs under the Unix operating system. The next step in our research will be to relax the condition of projectivity in order to improve the expressive power and to deal with phenomena that go beyond the context-free power. These changes imply the restructuring of some parts of the recognizer, with a plausible increment of the complexity.

References

Chomsky N., Three Models for the Description of Language, IRE Transactions on Information Theory IT-2, 1956, 113-124.
Covington M. A., Parsing Discontinuous Constituents in Dependency Grammar, Computational Linguistics 16, 1990, 234-236.
Covington M. A., An Empirically Motivated Reinterpretation of Dependency Grammar, Res. Rep. AI-1994-01, Univ. of Georgia (also on CompLing server), 1994.
Earley J., An Efficient Context-Free Parsing Algorithm, Comm. of the ACM 13, 1970, 94-102.
Fraser N. M., Parsing and Dependency Grammar, UCL Working Papers in Linguistics, 1989, 296-319.
Fraser N. M., Hudson R. A., Inheritance in Word Grammar, Computational Linguistics 18, 1992, 133-158.
Gaifman H., Dependency Systems and Phrase Structure Systems, Information and Control 8, 1965, 304-337.
Gazdar G., Klein E., Pullum G., Sag I., Generalized Phrase Structure Grammar, Basil Blackwell, Oxford, 1985.
Graham S. L., Harrison M. A., Ruzzo W. L., An Improved Context-Free Recognizer, ACM Trans. on Programming Languages and Systems 2, 1980, 415-462.
Hahn U., Schacht S., Bröker N., Concurrent, Object-Oriented Natural Language Parsing: The ParseTalk Model, CLIF Report 9/94, Albert-Ludwigs-Universität, Freiburg, Germany, 1994.
Hudson R., English Word Grammar, Basil Blackwell, Oxford, 1990.
Jackendoff R., X-bar Syntax: A Study of Phrase Structure, MIT Press, 1977.
Jacobs P. S., Rau L. F., Innovations in Text Interpretation, Artificial Intelligence Journal 63/1-2, 1993, 143-191.
Joshi A. K., Vijay-Shanker K., Weir D., The Convergence of Mildly Context-Sensitive Grammatical Formalisms, in Sells P., Shieber S., Wasow T. (eds.), Foundational Issues in Natural Language Processing, MIT Press, 1991.
Kaplan R., Bresnan J., Lexical-Functional Grammar: A Formal System for Grammatical Representation, in Bresnan J. (ed.), The Mental Representation of Grammatical Relations, MIT Press, 1982.
Kay M., Functional Grammar, Proc. 5th Meeting of the Berkeley Linguistic Society, 1979, 142-158.
Kwon H., Yoon A., Unification-Based Dependency Parsing of Governor-Final Languages, Proc. IWPT 91, 1991, 182-192.
Lai B. Y. T., Huang C., Dependency Grammar and the Parsing of Chinese Sentences, unpublished manuscript on CompLing server, 1995.
Mel'cuk I., Dependency Syntax: Theory and Practice, SUNY Press, Albany, 1988.
Perlmutter D., Studies in Relational Grammar 1, Univ. of Chicago Press, Chicago, 1983.
Pollard C. J., Generalized Phrase Structure Grammars, Head Grammars, and Natural Language, Ph.D. Thesis, Stanford Univ., 1984.
Pollard C. J., Sag I., An Information-Based Syntax and Semantics, vol. 1, Fundamentals, CSLI Lecture Notes 13, CSLI, Stanford, 1987.
Rambow O., Joshi A., A Formal Look at Dependency Grammars and Phrase-Structure Grammars, with Special Consideration of Word-Order Phenomena, Proc. of the Int. Workshop on the Meaning-Text Theory, Darmstadt, 1992.
Schabes Y., Polynomial Time and Space Shift-Reduce Parsing of Arbitrary Context-Free Grammars, Proc. ACL 90, Pittsburgh (PA), 1990, 106-113.
Schabes Y., Waters R. C., Lexicalized Context-Free Grammars, Proc. ACL 93, 1993, 121-129.
Sgall P., Hajicova E., Panevova J., The Meaning of the Sentence in its Semantic and Pragmatic Aspects, D. Reidel Publ. Co., Dordrecht, 1986.
Sleator D. D., Temperley D., Parsing English with a Link Grammar, Proc. of IWPT 93, 1993, 277-291.
Tesnière L., Éléments de Syntaxe Structurale, Klincksieck, Paris, 1959.
Tomita M., Efficient Parsing for Natural Language, Kluwer Acad. Publ., 1985.