Tree-Grammar Linear Typing for unified Super-Tagging/Probabilistic Models

Ariane Halber hal [email protected] AI group, dept of Computer-Science, ENST-Paris• Man-Machine Interaction lab, Corporate Research Laboratory, Thomson-CSFt

Abstract must thus take several hypothesis into account. This leaves one with two regrets: first, parsing has still We integrate super-tagging, guided-parsing to find some way to tackle combinatorial ambigu­ and probabilistic parsing in the frame­ ity, and second, there is a lack of synergy between work of an item-based LTAG chart parser. super-tagging and parsing , while they seem to share Items are based on a linear-typing of trees a kuowledge about tree potential-combinations. that encodes their expanding path, starting Probabilistic parsing offers a way to tune the com­ from their anchor. promize between accuracy and speed, by thresh­ olding partial parsing paths according to their cur­ rent Inside probability. This incurs a well-known 1 lntroduction bias (Goodman, 1998): At a given derivation step, Practical implementations of LTAG parsing bave to the lnside-probabilities of parse constituents esti­ face heavy lexical ambiguity and parsing combinato­ mate the relevance of the derivation past, but do rial ambiguity. Main techniques to address these is­ not teil anything about its future. This can be cor­ sues are super-tagging (Joshi and Srinivas, 1994), rected by A* cost functions, or Outside-probability which consists in disambiguating elementary trees estimates. before parsing; guided-parsing, like head.-driven To meet the weak points mentionned above, at parsing (van Noord, 1994) or anchor driven pars­ least partialy, we develop a unified framework for tbe ing (Lavelli and Satta, 1991; Lopez, 1998); and tbree techniques, and push their interactions further. probabilistic parsing (Schabes, 1992; Caroll and Sharing a parsing framework We propose an Weir, 1997). item-based chart-parser, where the parsing scheme All of tbese approaches exploit specific properties is expressed as a deduction system (Shieber, Sch­ of LTAG to improve parsing efficiency, but none is abes, and Pereira, 1994). This framework is totally satisfactory. also amenable for expressing probabilistic pars­ Guided-parsing is a very nsefull means to limit ing (Goodman, 1998). overgeneration of spurious items in the chart, but it does not provide a new ambiguity bound. Besides, Sharing models for super-tagging and item­ lexical ambiguity remains the main factor of com­ pruning. Super-tagging can be seen as a tree­ putational load and this problem is only undirectly sequel)Ce WF-model, and extended to score derived addressed by such techniques. item-sequences in the cbart, wrt their likelihood of Super-tagging strength is to discard elementary completing a parse. This yields a sound threshold­ trees while avoiding to go through actual tree com­ ing technique (Rayner and Carter, 1996; Goodman, binations. lt exploits instead. local models of Well­ 1998). Formedness (WF), as those used for tagging, where Sharing tree-types for item-pruning and parse depencies remain implicit or underspecified. guided-parsing. To support the WF parametric The problem though is that if a single tree is in­ mode!, trees and itcms are abstracted by theit lin­ correct the parse will fail. To be robust, parsing ear type, which consists in a list of connectors that "ENST Paris, 46 rue Barrault, 75634 Paris Cedex 13 represent combination properties. Guided-parsing IThomson-CSF, LCR, Domaine de Corbeville, 91404 relies on a specific ordering of these connectors, so Orsay Cedex, FRANCE that a single type drives the parsing deduction and

54 Item form:

Goal: <•'Jl 1 .JS> n[O,n,-,-1 Axioms:

Anchor Anchor(o·) = U!i r,' r r connectcd walks of a

co-Anchor <:1e,.(. l .(.w,>unrooted(•,1+1.-,-I

Rulcs: if<>r lert expansion, right is symetrical) <.\l !.JX> "'[q.f; .t:l o(J,k,Ji ,Jrl Substitution wrnp-3 and co-Anchor. recogrlition o (i,f.:, 11$1:, Ir fB 1:J

< ·„x, 1„,,rt !'~lrX(r001 J> JJ(•.1 ,-,-11L\(ryl f'df r>a11 ,k,Ji.1rl Right Adjunction ff E {IX*, S} <-rr;ri1r r>o[1,k,Ji .Jr] .i1„J1 .-.-1 0(11,k,/1 ,J,I Lcrt Adjunction ri (t sl spine(o) 0(1,k,fi ,fr] <"l\-1,,,,, 11~1Xroot> ß(„J.-.-1 a!J,k,/1,/,J Lcft Adj on spine r~ E {~X,S} n[J,k,fi./rl <1.\'lrX•> ßf•,J,Ji ,J o(J,k,/„/d Sub-tree extrnction wrnp-1 <·1 l\·(•'I) .\'r1.(. j .J-.\rirXt'l)>a()r,)r.-.-J < .j. X17fdf r> ofj,k,/r.!d >

< "IXl1-X ·> ;Jl•.J.J 1 ,Jr] o(Jt,Jr.f!,J:l Wrap Adj on spinc wrnp-2 adjunction an sub-tree of•.1.1: .1:1 < IXj'll fi ll r> n(1.1.J, ./rl No Lcft Adjunction [ •. J./1 ./rl <1X(rool)r, lfr> .il•./r.-.-l Gap crcation f1 (t {'IX, S}

Table l: Deducti \'e syslcm for an LTAG bidirectional chart-parser, lexica\ly guided and based majoritarily on trces. thanks to a prccomµilation of thcir nodes into left and right walks.

The acllve ('QJIJlector IS µopcd on e•treme ldl (resp. right) of its Stack r, (resp. rr)- Each connector is associated with its node ~. thou11h we do not always mark 11. The spme 1s the path from anchor to root.wr&p-1, wrnp-2, wrap-3 identifie the three steps of a wrapping ndjuncuon an iln mtf:'rnal node. er tijl;Ure L estimates the pruning model. Tyµes are described to . Left and right walks are ex­ in section 2, their use in t.hc deduction systcm, m prcssed as stacks of connectors, so that the extreme section 3. their use for itcm-pruning in section 4. connector is the one to connect the dosest to the anchor 1 An illustration is given in table 2 for the 2 Linear Typing tree in figure 2.

Guiding the Tree expansion \\'e guide the pars­ Typing strategy. In it.s own walk, the foot bears ing by independent left and right connected-walks, the adjunction, with type l or r inversly to the foot inspired from ( Lavelli and Satta. 19!)1) bidirectional side. In the opposit walk, the foot-node may be parscr and (Lopez, 1998) connected routes. Left and reached as weil, provided that there is a direct path right connected walks follow respcctively left- and from root to foot. In the deduction system, in ta­ right- monolonic expansion, out ward. from thc an­ ble 1, the foot-node of a left or right auxiliary tree chor to t hc root, as disµlayed in flgure 2. Thcy list achicves adjunction, but the foot-node of a wrapping node operations considercd as connectors. auxiliaiy tree creates a gap and passes its adjunction 1The derivation is represented as a fully connected Link-Gramnrnr expression To express linear and oricnted graph of trees whose edge labcls arc con­ types. WP "Xploit thc Link-Grammar formalism {Laf­ nector names (pr,,,ided that a sub-tree is extracted to fcrty. Sleator. and Temperley. l UU2). which is close decompose wrapping adjunction, cf. figure 1.

55 Wr&p·I fnhlrJlllW)' itfp ~" 'Wfl(l'l>I -.J;go.,._-n: n'JXU.JS!Xo-ltcT \tl fiJl'lheJ.11' left wnlk meta-rule: 'Sfoot B* NP* -l.did ,.YP* NP -l. NP* N* <-• know right walk meta-rule: know •-+ *rV that-l. Sf"oot {NA*rVP) *rS left and right connector stncks: type: abstraction on connector stacks, removes specialh:ed substitutions: „,~,Jo..lµ"l;'l•-n. ,„ ~ ~~t #1 ~hi..l UH co-Anchor: w-l.-+ LEX.). sub-trec: .\"17.).-+ X --l. jl jr The Four AdJunctlons In lhe char1 figure l: lllustration of the dcduction rules in la­ Table 2: Typing thc tree in figure 2. blc 1. In a right walk, ix• cxpreoses an auxiliary root-node and 'IX, a node expecting adjunct1on, X.j. expresses a Substitution Site and .jX, the root of an initial tree. In a left walk they work the other way around.

clearly the CF-component, so that the parsing be­ haves very nicely whcn faccd wit.h near-CF deriva­ tions, which are a majority in practice.

, , . ::::,,.. Now, regarding complexity, first t.hree "near-CF" 5 n~hl w.ilk rules yield a worst-casc complexity of O(n ), wrap­ ~- · '· ping adjunction on a lexical spine yields 0(116), but 1-·~)-:Jl"M:ho< kh walk ri.'l..'.Ugt11\J11n 7 the sub-tree rule yields O(n ). We could change that figurc 2: Lcft and right trce walks. rule into a "systematic" snb-tree extraction with ar­ 5 bitrary gap frontiers, in order to go down to O(n ), capacities to the root-node, with an opposit type for bnt this would generate a lot of spnrious items. the opposit. sides. Thcrefore we prefer a lexical check with a wrapping lt. can bc noted that each nodc that can receive ad­ auxiliary t.ree, since their occurrence is marginal. junction yields two linked connectors, which bound t he su b-list of connectors of their su b-trce. 4 Probabilistic Thresholding Probabilistic parsing is expressed through thc de­ 3 Deductive Chart Parser ductive system as fo!lows: \Ve wish to get elementary-like lypes on derived = Pr(iitemi) P(item w; ... Wj) structure, so as to use a super-tagging-like model to = ='* P(r11le) prune derived paths. \Vc t.ry thus to keep as close as = possible to trees when driving thc parsing. But we Rufe, [;itemifüitemk] are not aiming at top-down parsing. since we wish ~~~~~~~~~~~~~~d lo evaluate deri,•ed paths that span the input. This (;itemk] = Rule * (;item1] * f)temk] lcads to isolating wrapping adjunction from left- an

56 P([s}po•) =Lu ~· T. P(S~ h ,..\"i p(s]„o,T1 .9 ) 5 Discussion "'' P• 'f s.tU1..mVi .. „[s}po1T1 q,; W \Ve have presented a general framework for deduc­ (3) tive parsing, probabilistic parsing and super-tagging. For an item-path, outside probability accounts for This unified approach opens a lot of perspective in parsing-deductions to come. i.c t hc connectors of the design of efficient and robust LTAG parsing. t.he item stacks. \Vhereas consumed connectors are However. it rc•mains lobe fully validated. rcsponsible for the inside probability. Thcre is no ,\s far as supper-tagging is concerne the essential computations involved. consum1ned stacks: < r;'1r~' >

Pr(U):::::: VIrr • 1··'1·•'t , µro d • P.'o References Po((') :::::: .Vin , r"r·t r Carol!, John and David Weir. 1997. Encoding fre­ (Goodman, 1998) proposes two useful approxima­ quency information in lexicalized grammars. In tion of the outside score of itcm[s]. in ordcr to cor­ ACL/S!GPARSE workshop 011 Pm·sing Technolo­ rect t hc inside probabilily hi

57