
Two Useful Measures of Word O 'der Complexity | i!i Tomfi~ Holan, Vladislav Kubofi t, Karel Oliva IMartin Plfitek§ I i! Abstract Two types of syntactic structures, namely DR - trees (delete-rewrite-trees), and De- fre~s (depen- This paper presents a class of dependency-based for- dency trees), are connected with FODG's. Any DR, real grammars (FODG) which can be parametrized tree can be transformed into a De-tree in an easy by two different but similar measures of non- and uniform w~,. In [Kubofi,Holan.Pl/Ltek.1997] the projectivity. The measures allow to formulate con- measures of non-projectivity are introduced with the straints on the degree of word-order freedom in a lan- help of DR-trees only. Here we discuss one of them, guage described by a FODG. We discuss the problem called node-gaps complexity (Ng). It has some very of the degree of word-order freedom which should be allowed ~, a FODG describing the (surface) syntax interesting properties. CFI.'s are characterized by of Czech. the complexity 0 of Ng. The N9 also character- izes the time complexity of the parser used by the above-n~'entioned grammar-checker. The sets of sen- 1 Introduction tences with the Ny less than a fixed constant are In [Kuboh,Holan:Pl~tek.1997] we have introduced a parsable in a polynomial time. This led us to the class of formal grammars. Robust Free.Order Depen- idea to look for a fixed upper bound of Ng for dency Grammars (RFODG's), in order to provide all Czech sentences with a correct word order. In for a formal foundation to the way we are devel- [Kubofi,Holan,Pbltek.1997] we even worked with the oping a grammar-checker for Czech. a natural lan- conjecture that such an upper bound can be set to 1. guage with a considerable level of word-order free- We will show in Section 5 that. it is theoretically im- dora. The design of RFODG's was inspired by tile possible to find such an upper bound, and that even commutative CF-grammars (see [Huynh.83]), and for practical purposes, e.g.. for grammar-checking. several types of dependency based grammars (cf.. it should be set to a value considerably higher than e.g... [Gaifman.. 1965]. [B61eckij.1967], [Pl~tek,1974], 1. This is shown with the help of the measure dNg [Mer~uk.1988] ). Also in [Kubofi.Holan.Pl~tek,1997]. which is introduced in the same way as :Vg. but on we have introduced different measures of incorrect- the dependency trees, dNg creates tile lower esti- ness and of non-projectivity of a sentence. The mation for Ng. The advantage of d.Vg is that it measures of the non-projectivity create the focus of is linguistically transparent. It allows for an easy our interest in this paper. They are considered as discussion of the complexity of individual examples the measures of word-order freedom. Considering (e.g., Czech ~ntences). On the other hand. it al- this aim we work here with a bit simplified version lows neither for characterizing the class of CFL's, of RFODG's. namely with Fre~-Order Dependency nor for imposing the context-free interpretation for Grammars (FODG's). The measures of word-order some individual rules of a FODG. Also. no useful freedom are used to formulate constraints which can relation between dXg and some upper estimation be imposed on FODG's globally, or on their individ- of the time complexity of parsing has been estab- ual rules. lished yet. These complementary properties of Ny and dNg force us to consider both of them sinmha- • Department of Software and Computer Science Educa- t ion. Faculty of Mat hemat ics and Ph.~asi~. Charles University, neously here. Prague. Czech Republic. e-math holan~aksvi.ms.mff.cuni.cz I Institute of Formal and Applied Linguistics, Faculty of 2 FOD-Grammars Mathematics and Physics. Charles University. Prague, Czech Republic. e-mail:vk~ u fal.ms.mff.cuni.cz The basic notion we work with are free-order de- : C'c,mputalionM Linguistio,, University of Saarland. Saz~r- pendency grammars (FODG's). Ill tile .sequel tile briicken. Germany. e-mail: oliva(Icoli.uni-sb.de FODG's are analytic (recognition) grammars. § D~par!men! of Ther,relical Computer Science. Faeuhy of .Mat 10,-mal i*.'s and i'hysics.. Charles University. Prague. t"z*~'h Dcfinititm -( FODG ). I.'lv,-oldcr ch l,¢ mh ,,'y Ii l~epuhlic, e-math platek~kkl.ms.mff.runi.r~. grammar (FOD(;) is a tupb" (; -- (T.A'..g't./~). ,ili 21 where the union of N and T is denotc~! as I'. T a) U is a ,l-lnl)le of Ihe form [.-I. i. j. el. where .! E is the set of terminals. N is the set of nonterminais. I" (terminal or nonterminal of G). i.j E ,Vat. St C V is the ~.t. of rool-symbols (starling symbols), ¢ is either equal to 0 or it has tile shape L"r. and P is the set. of rewrhing rules of two types of the where k, p E ,Vat. The A is called symbol of I r. form: the number i is called hori:o,tal indcz" of U. j a) A "+x BC, where A E N. B,C E V. X is is called vertical index, c is called domination denoted as the subscript, of the rule, X E {L, R}. index. The horizontal index expresses the cor- b) .4 --+ B. where A E N, B E V. respondence of U with the i-th input symbol. The letters L (R) in the subscripts of the rules The vertical index corresponds to the length in- mean that the first. (second) symbol on the right- creased by I of the path leading bottom-up to hand side of the rule is considered dominant, and U from the leaf with the horizontal index i. The the other dependent. domination index either represents the fact that If a rule has only one symbol on its right-hand no edge starts in U (e = 0) or it represents the side, we consider the symbol to be dominant. final node of the edge starting in U (e = k r, cf. A rule is applied (for a reduction) in the following also the point e) below). way: The dependent symbol is deleted (if there is b) Let U = [A,i,j,e]and j > 1. Then there is one on the right-hand side of the rule) , and the exactly, one node Ul of the form [B, i, j- 1, ij] dominant one is rewritten (replaced) ~, the symbol in Tr, such that the pair {U], U) creates a (ver- standing on the left-hand side of the rule. tical) edge of Tr, and there is a rule in G with The rules A "+L BC, A -~n BC, can be applied A on its left-hand side, and with B in the role for a reduction of a string z to any of the occur- of the dominant symbol of its right-hand side. rences of symbols B, C in z; where B precedes (not necessarily' immediately) C in z. c) Let U = [A, i, j, el. Then U is a leaf if and only For the sake of the following explanations it is if A E T (terminal D, mbol of G), and j = 1. necessary to introduce a notion of a DR-tree (delete- d) Let"U = [A,i.j,e]. U = tit iffit is the single rewrite-tree) according to G. A DR-tree maps the node with the domination index (e) equal to 0. essential part of history of deleting dependent sym- e) Let U = [A,i.j,e]. lfe = k v and k < i (resp. bols and rewriting dominant symbols, performed by k > i). then an oblique edge leads from U (de- the rules applied. pendent node) to its mother node t% with the Put informally', a DR-tree (created by a FODG G) horizontal index k. and vertical index p. Fur- is a finite tree with a root and with the following two ther a vertical edge leads from some node U, types of edges: to Urn. Let C be the symbol from Urn, B from a) vertical: these edges correspond to the rewriting Us. then there exists a rule in G of the shape of the dominant symbol by" the symbol which is C "-~L BA (resp. C "-~n AB). on the left-hand side of the rule (of G) used. f) Let U = [A.i.j.~]. If e = k r. and k = i. then The vertical edge leads {is oriented) from the p = j + 1, and a vertical edge leads (bottom node containing the original dominant symbol up) from U to its mother node with the same to the node containing the symbol from the left- horizontal index i. and with the vertical index hand side of the rule used. j+L b) oblique: these edges correspond to the deletion of a dependent symbol. Any, such edge is ori- We will say that a DR-tree Tr is complete if for any ented from the node with the dependent deleted ofits leaves U = [.4. i. 1, el. where i > 1. it holds that symbol to the node containing the symbol from there is exactly" one leaf with the horizontal index th{ left-hand side of the rule used. i- 1 in Tr. Example 1. This formal example illustrates the Let us now proceed more formally'. The following notion of FODG.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages8 Page
-
File Size-