TAL Recognition in O(M(n2)) Time

Sanguthevar Rajasekaran Dept. of CISE, Univ. of Florida raj~cis.ufl.edu Shibu Yooseph Dept. of CIS, Univ. of Pennsylvania [email protected]

Abstract time needed to multiply two k x k boolean matri- ces. The best known value for M(k) is O(n 2"3vs) We propose an O(M(n2)) time algorithm (Coppersmith, Winograd, 1990). Though our algo- for the recognition of Adjoining Lan- rithm is similar in flavor to those of Graham, Har- guages (TALs), where n is the size of the rison, & Ruzzo (1976), and Valiant (1975) (which input string and M(k) is the time needed were Mgorithms proposed for recognition of Con- to multiply two k x k boolean matrices. text Pree Languages (CFLs)), there are crucial dif- Tree Adjoining Grammars (TAGs) are for- ferences. As such, the techniques of (Graham, et al., malisms suitable for natural language pro- 1976) and (Valiant, 1975) do not seem to extend to cessing and have received enormous atten- TALs (Satta, 1993). tion in the past among not only natural language processing researchers but also al- gorithms designers. The first polynomial 2 Tree Adjoining Grammars time algorithm for TAL parsing was pro- A Tree Adjoining Grammar (TAG) consists of a posed in 1986 and had a run time of O(n6). quintuple (N, ~ U {~}, I, A, S), where Quite recently, an O(n 3 M(n)) algorithm N is a finite of has been proposed. The algorithm pre- nonterminal symbols, sented in this paper improves the run time is a finite set of terminal symbols disjoint from of the recent result using an entirely differ- N, ent approach. is the empty terminal string not in ~, I is a finite set of labelled initial trees, A is a finite set of auxiliary trees, 1 Introduction S E N is the distinguished start symbol The Tree Adjoining Grammar (TAG) formalism was The trees in I U A are called elementary trees. All introduced by :loshi, Levy and Takahashi (1975). internal nodes of elementary trees are labelled with TAGs are tree generating systems, and are strictly nonterminal symbols. Also, every initial tree is la- more powerful than context-free grammars. They belled at the root by the start symbol S and has belong to the class of mildly context sensitive gram- leaf nodes labelled with symbols from ~3 U {E}. An mars (:loshi, et al., 1991). They have been found auxiliary tree has both its root and exactly one leaf to be good grammatical systems for natural lan- (called the foot node ) labelled with the same non- guages (Kroch, Joshi, 1985). The first polynomial terminal symbol. All other leaf nodes are labelled time parsing algorithm for TALs was given by Vi- with symbols in E U {~}, at least one of which has a jayashanker and :loshi (1986), which had a run time label strictly in E. An example of a TAG is given in of O(n6), for an input of size n. Their algorithm figure 1. had a flavor similar to the Cocke-Younger-Kasami A tree built from an operation involving two other (CYK) algorithm for context-free grammars. An trees is called a derived tree. The operation involved Earley-type parsing algorithm has been given by is called adjunction. Formally, adjunction is an op- Schabes and Joshi (1988). An optimal linear time eration which builds a new tree 7, from an auxiliary parallel parsing algorithm for TALs was given by tree fl and another tree ~ (a is any tree - initial, aux- Palls, Shende and Wei (1990). In a recent paper, iliary or derived). Let c~ contain an internal node m Rajasekaran (1995) shows how TALs can be parsed labelled X and let fl be the auxiliary tree with root in time O(n3M(n)). node also labelled X. The resulting tree 7, obtained In this paper, we propose an O(M(n2)) time by adjoining fl onto c~ at node m is built as follows recognition algorithm for TALs, where M(k) is the (figure 2):

166 S S Initial tree O~

EI S Auxiliary tree

G = {{S},{a,b,c,e }, { or}, { ~}, S}

b S*

Figure 1: Example of a TAG

1. The subtree of a rooted at m, call it t, is excised, and proceeds to find the transitive closure b + of this leaving a copy of m behind. matrix. (If b + is the transitive closure, then Ak E

2. The auxiliary tree fl is attached at the copy of b.$,J+. ¢:~ Ak-~ ai .... aj-1) m and its root node is identifed with the copy Instead of finding the transitive closure by the cus- of m. tomary method based on recursively splitting into disjoint parts, a more complex procedure based on 3. The subtree t is attached to the foot node of fl 'splitting with overlaps' is used. The extra cost in- and the root node of t (i.e. m) is identified with volved in such a strategy can be made almost negligi- the foot node of ft. ble. The algorithm is based on the following lemma This definition can be extended to include adjunc- Lemma : Let b be an n x n upper triangular ma- tion constraints at nodes in a tree. The constraints trix, and suppose that for any r > n/e, the tran- include Selective, Null and Obligatory adjunction sitive closure of the partitions [1 < i,j < r] and constraints. The algorithm we present here can he [n- r < i,j < n] are known. Then the closure of b modified to include constraints. can be computed by For our purpose, we will assume that every inter- nal node in an elementary tree has exactly 2 children. I. performing a single matrix multiplication, and Each node in a tree is represented by a tuple < 2. finding the closure of a 2(n - r) × 2(n - r) up- tree, node index, label >. (For brevity, we will refer per triangular matrix of which the closure of the to a node with a single variable m whereever there partitions[1 < i,j < n- r] and [n- r < i,j < is no confusion) 2(n - r)] are known. A good introduction to TAGs can be found in (Partee, et al., 1990). Proof: See (Valiant, 1975)for details The idea behind (Valiant, 1975) is based on visu- 3 Context Free recognition in alizing Ak E b+j as spanning a tree rooted at the O( M(n)) Time node Ak with l~aves ai through aj-1 and internal nodes as nonterminals generated from Ak according The CFG G = (N,~,P, A1), where to the productions in P. Having done this, the fol- N is a set of Nonterminals {A1, A2, .., Ak}, lowing observation is made : is a finite set of terminals, Given an input string al...a, and 2 distinct sym- P is a finite set of productions, bol positions, i and j, and a nonterminal Ak such A1 is the start symbol that Ak E b + ., where i' < i,j' > j, then 3 a non- I P3 is assumed to be in the Chomsky Normal Form. terminal A k, which is a descendent of Ak in the Valiant (1975) shows how the recognition problem tree rooted at Ak, such that A k, E bi + d' . where can be reduced to the problem of finding Transitive Closure and how Transitive Closure can be reduced i" < i, j" > j and A k, has two children Ak~ and Ak2 to Matrix Multiplication. such thatAk~ Eb +, andAk2 Eb +..withi

167 k

t t

Figure 2: Adjunction Operation

1. Find the closure of the first 2/3 ,i.e. all nodes nodes from the first and last 2/3's ,(say r =2/3). spanning trees which are within the first 2/3 . Thus, if we discard the mid 1/3, we will not be able 2. Find the closure of the last 2/3 , i.e. all nodes to infer that the adjunction had indeed taken place spanning trees which are within the last 2/3. at node m. 3. Do a composition operation (i.e. matrix multi- 4 Notations plication) on the nodes got as a result of Step 1 with nodes got as a result of Step 2. Before we introduce the algorithm, we state the no- tations that will be used. 4. Reduce problem size to az...an/zal+2n/3...an We will be making use of a 4-dimensional matrix and find closure of this input. A of size (n + 1) x (n + 1) x (n + 1) x (n + 1), where The point to note is that in step 3, we can get rid n is the size of the input string. of the mid 1/3 and focus on the remaining problem (Vijayashanker, Joshi, 1986) Given a TAG G and size. an input string aza2..an, n > 1, the entries in A will This approach does not work for TALs because of be nodes of the trees of G. We say, that a node m the presence of the adjunction operation. (= < 0, node index, label >) E A(i,j, k, l) iff m is a Firstly, the used, i.e. the 2- node in a derived tree 7 and the subtree of 7 rooted dimensional matrix with the given representation, at m has a yield given by either ai+l...ajXak+l...al is not sufficient as adjunction does not operate on (where X is the footnode of r/, j < k) or ai+l .... az contiguous strings. Suppose a node in a tree domi- (when j = k). nates a frontier which has the substring aiaj to the If a node m E A(i,j,k,l}, we will refer to m as left of the foot node and akat to the right of the spanning a tree (i,j,k,l). footnode. These substrings need not be a contigu- When we refer to a node m being realised as a ous part of the input; in fact, when this tree is used result of composition of two nodes ml and rnP, we for adjunction then a string is inserted between these mean that 3 an elementary tree in which m is the two suhstrings. Thus in order to represent a node, parent of ml and m2. we need to use a matrix of higher dimension, namely A Grown Auxiliary Tree is defined to be either dimension 4, to characterize the substring that ap- a tree resulting from an adjunction involving two pears to the left of the footnode and the substring auxiliary trees or a tree resulting from an adjunction that appears to the right of the footnode. involving an auxiliary tree and a grown auxiliary Secondly, the observation we made about an entry tree. E b + is no longer quite true because of the presence Given a node m spanning a tree (i,j,k,l), we define of adjunction. the last operation to create this tree as follows : Thirdly, the technique of getting rid of the mid if the tree (i,j,k,l) was created in a series of op- 1/3 and focusing on the reduced problem size alone, erations, which also involved an adjunction by an does not work as shown in figure 3: auxiliary tree (or a grown auxiliary tree) (i, Jl, kz, l) Suppose 3' is a derived tree in which 3 a node rn onto the node m, then we say that the last opera- on which adjunction was done by an auxiliary tree tion to create this tree is an adjunction operation; ft. Even if we are able to identify the derived tree else the last operation to create the tree (i,j,k,l) is a 71 rooted at m, we have to first identify fl before we composition. can check for adjunction, fl need not be realised as The concept of last operation is useful in modelling a result of the composition operation involving the the steps required, in a bottom-up fashion, to create

168 n .. x /, Derived tree

71 '3'

Node m has label X 71

Figure 3: Situation where we cannot infer the adjunction if we simply get rid of the mid 1/3 a tree. nodes from Sraa,Vm3, where S,n3 = { m3} U AS- SOC LIST (mS), m3 having label ¢. 5 Algorithm Following is the main procedure, Compute Nodes, which takes as input a sequence rlr2 ..... rp of symbol Given that the set of initial and auxiliary trees can positions (not necessarily contiguous). The proce- have leaf nodes labelled with e, we do some prepro- dure outputs all nodes spanning trees (i,j,k,O, with cessing on the TAG G to obtain an Association List {i, 1} E {rl,r2 ..... ~'ip} and {j,k} E {rl,r I Jr Z,...,rp}. (ASSOC LIST) for each node. ASSOC LIST (m), The procedure is initially called with the sequence where m is a node, will be useful in obtaining chains 012..n corresponding to the input string aa ..... an. of nodes in elementary trees which have children la- The matrix A is updated with every call to this pro- belled ~. cedure and it is updated with the nodes just realised Initialize ASSOC LIST (m) = ¢, V m, and then and also with the nodes in the ASSOC LISTs of the call procedure MAKELIST on each elementary tree, nodes just realised. in a top down fashion starting with the root node.

Procedure Compute Nodes ( rl r2 ..... rp ) Procedure MAKELIST (m) Begin Begin 1. Ifp = 2, then 1. If m is a leaf then quit 2. If m has children ml and me both yielding the a. Compose all nodes E A(rl,j, k, re) with all empty string at their frontiers (i.e. m spans a nodes E A(re,re, re, re), rt < j < k < re. subtree yielding e) then Update A . ASSOC LIST (ml) = ASSOC b. Compose all nodes E A(rl,rl,rl,rx) with LIST (m) u {m) all nodes E A(rt, j, k, r2), rt < j < k < re. ASSOC LIST (m2) = ASSOC Update A . LIST (m) U (m} e. Check for adjunctions involving nodes re- alised from steps a and b. Update A . 3. If m has children m1 and me, with only me yielding the empty string at its frontier, then d. Return ASSOC LIST (ml) = ASSOC 2. Compute Nodes ( rlr2 ..... rep/a ). LIST (m) u {m) 3. Compute Nodes ( rl+p/z ..... rp ). End 4. a. Compose nodes realised from step 2 with nodes realised from step 3. We initially fill A(i,i+l,i+l,i+l) with all nodes b. Update A. from Smt,Vml, where S,~1 = {ml} O AS- SOC LIST (ml), ml being a node with the same 5. a. Check for all possible adjunctions involving label as the input hi+l, for 0 < i < n-1. We also fill the nodes realised as a result of step 4. A(i,i,j,j), i < j, with nodes from S,~2, Vm2, where b. Update A. Sin2 = {me) tJ ASSOC LIST (me), me being a foot node. All entries A(i,i,i,i), 0 < i < n, are filled with 6. Compute Nodes ( rlre...rp/arl+2p/a...r p )

169 End Note that in CI i E {rl,..,rpls}, i E {rl+2p/3 , .., rp}, and 0 _< j < k < n. Steps la,lb and 4a can be carried out in the fol- Ce(qt, rs) = 1 iff m E A(q, r, s, t) lowing manner : -- 0 otherwise Consider the composition of node ml with node Note that inC2 0 r~ 4-1 , and a node m ml E A(i,j,j,l) and m2 E A(l,p, q, r), then their spanning a tree (i,j, k, l) such that i < rs and i _> rt parent, say node m should belong to A(i,p,q,s). with j and k in any of the possible combinations as This can also be handled similar to the manner de- shown in figure 4. scribed for case 1. Update A(i,p,q,s) with {m} U 3 a node m' which is a descendent of the ASSOC LIST (m). node m in the tree (i,j,k,l) and which either Notice that Case 1 also covers step la and Case 2 E ASSOC LIST(ml) or is the same as ml, with also covers step lb. ml having one of the two properties mentioned be- low : Step 5a and Step lc can be carried out in the 1. ml spans a tree (il,jl, kl, 11) such that the last following manner : operation to create this tree was a composition We know that if a node m E A(i,j,k,i), and the operation involving two nodes me and m3 with root ml of an auxiliary tree E A(r, i, i, s), then ad- me spanning (ix, J2, k2, 12) and m3 spanning joining the tree 7/, rooted at ml, onto the node m, (12,j3, ks, ix). (with (r, < l~. < rt), 01 <- r,), results in the node m spanning a tree (rj,k,s), i.e. m (rt < !1) and either (j2 = kz,j3 = jl,k3 = kl) E A(r, j, k, s). or (j2 = jl,k2 = kl,j3 = k3) ) We can essentially use the previous technique of 2. ml spans a tree (il,jl, kl, ll) such that the last reducing to boolean matrix multiplication. Con- operation to create this tree was an adjunction struct two matrices C1 and Ce of sizes (p2/9) x (n + by an auxiliary tree (or a grown auxiliary tree) 1) 2 and (n + 1) 2 x (n + 1) 2, respectively, as follows : (il, j2, ke, Ix), rooted at node me, onto the node Cl(ii, jk) = 1 iff 3ml, root of an auxiliary ml spanning the tree (je,jl, kl, k2) such that tree E A(i, j, k, l), with same label as m and node me has either the property mentioned in Cl(il, jk) = 0 otherwise (1) or belongs to the ASSOC LIST of a node

170 I I rs rt

j k

2 j k

3 j k

4 j k

5 j k

Figure 4: Combinations of j and k being considered

which has the property mentioned in (1). (The in addition to the initialization information, the labels of ml and me being the same) algorithm computes all the nodes spanning trees Any node satisfying the above observation will be (i,i,k,O with (i,l} ~ { r~,r~,..,rp } and i _< i < k, we call one adjunction. (rq,r~+l) a gap, iff rq+l ¢ rq + 1. Identifying min- imal nodes w.r.t, every new gap created, will serve 2. r2 ~ rl + 1, implies that (rl,r2) is a gap. our purpose in determining all the nodes spanning Thus, in addition to the information given trees (i, j, k, 1), with {i, l} e {rl, r2, .., rp}. as per the theorem, a composition involv- ing nodes from A(rl, j, k, r2) with nodes from A(r2,r2, r2,r2) and a composition involving nodes from A(rl,rl,rl,rl) with nodes from Theorem : Given an increasing sequence < A(rl, j, k, r2), (rl < j < k < r2), followed by an rl, r2, .., rp > of symbol positions and given adjunction involving nodes realised as a result of the previous two compositions will be sufficient a. V gaps (rq, rq+l), all nodes spanning trees (i,j,k,l} as the only adjunction to take care of involves with rq < i < j < k < l < rq+l the adjunction of some auxiliary tree onto a b. V gaps (rq, rq+l), all nodes spanning trees (i,j,k,l) node m which yields e, and m E A(rl, rl, rl, rl) such that either rq < i < rq+l or rq < l < rq+l or m E A(r2,r2,r2, r2). c. V gaps (rq,rq+l) , all the minimal nodes for the Induction hypothesis : V increasing sequence gap such that these nodes span trees (i,j,k,l) with < rl,r2, ..,r~ > of symbol positions of length < p, {i,l} E { rl,r2,..,rp } and i <_ 1 (i.e q < p), the algorithm, given the information as

171 (5a) (ab) m m

r r s t

(5c) (M) root of auxiliary m ra tree has property auxiliary A • tree o~,.~//////2X grow. tree ~///J//~ tree ///// ~k//~ i -'i 1 ' l i il ' j k ' ll ! 1 1 Grown aux tree formed by adjoining (Se) i z Ps " P2 Pl I onto root of grown aux tree 7

Root of ~1 has property shown in (Sa)

Figure 5: Identifying minimal nodes required by the theorem, computes all nodes span- j" =j, k" = k) or (j' =j, k'= k,j" = k') ning trees (i,j,k,l) such that {i, l} e { rl, r2, .., rq } ). and i < j < k < I. Induction : Given an increasing b. m spans a tree O,J, k,l) such that the last op- sequence < rl, r~, .., rp, rp+l > of symbol positions eration to create this tree was an adjunc- together with the information required as per parts tion by an auxiliary or grown auxiliary tree a,b,c of the theorem, the algorithm proceeds as fol- (i,j',k',l), rooted at node mI, onto the node lows: m spanning the tree (j',j,k,k') such that node ml has either the property mentioned 1. By the induction hypothesis, the algorithm correctly computes all nodes spanning trees in (1) or it belongs to the ASSOC LIST of (i,j,k,i) within the first 2/3, i.e, {i,l} E { a node which has the property mentioned in (1). (The labels of m and ml being the rt, r2, .., r2(p+D/3 } and i < l . By the hypothe- sis, it also computes all nodes (i',j,k',l')within same) the last 2/3, i.e, { i ~, !~ } E {rl+(p+l)/3, .., rp+z} Note that, in addition to the nodes m captured and i' < i'. from a or b, we will also be realising nodes E ASSOC LIST (m). 2. The composition step involving the nodes The nodes captured as a result of 2 are from the first and last 2/3 of the sequence the minimal nodes with respect to the gap < rl, r2, .., rp, rp+i >, followed by the adjunc- (r(p+l)/a, rl+2(p+l)/3) with the additional property tion step captures all nodes m such that either that the trees (i,j,k,l) they span are such that i E { rl, r2, .., r(p+l)]3 } and l E { rl+2(p+l)]3, .., rp+l }. a. m spans a tree (i,j,k,l)such that the last op- Before we can apply the hypothesis on the se- eration to create this tree was a composi- quence < rx, r2, .., r(p+t)/3, rl+2(p+l)[3, ..rp+l >, we tion operation on two nodes ml and m2 have to make sure that the conditions in parts with ml spanning (i,j',k;l'} and me span- a,b,c of the theorem are met for the new gap ning (r(p+1)/3, rl+2(p+l)/3). It is easy to see that con- (i;j",k",l). (with i E { rl, r2, .., r(p+l)/3 }, ditions for parts a and b are met for this gap. We i E { rl+(p+l)/3,..,r2(p+D/3 } and I E ! have also seen that as a result of step 2, all the mini- ri+2(p+z)/3, .., rp+z }, and either (j' = k, mal nodes w.r.t the gap (r(p+x)/3 , rl+2(p+l)/3), with

172 the desired property as required in part c have been Acknowledgements computed. Thus applying the hypothesis on the This research was supported in part by an NSF Re- sequence < rl, r2, .., r(p+l)[3, rl+2(p+l)/3, ..rp+l >, search Initiation Award CCR-92-09260 and an ARO the algorithm in the end correctly computes all grant DAAL03-89-C-0031. the nodes spanning trees (ij,k,1) with {i,l} E {rl,r2,..,rp+x } andi

173