ADAPTIVE PROBABILISTIC GENERALIZED LR PARSING

Jerry Wright*, Ave Wrigley* and Richard Sharman+

* Centre for Communications Research, Queen's Building, University Walk, Bristol BS8 1TR, U.K.
+ I.B.M. United Kingdom Scientific Centre, Athelstan House, St Clement Street, Winchester SO23 9DR, U.K.

ABSTRACT

Various issues in the implementation of generalized LR parsing with probability are discussed. A method for preventing the generation of infinite numbers of states is described and the space requirements of the parsing tables are assessed for a substantial natural-language grammar. Because of a high degree of ambiguity in the grammar, there are many multiple entries and the tables are rather large. A new method for grammar adaptation is introduced which may help to reduce this problem. A probabilistic version of the Tomita parse forest is also described.

1. INTRODUCTION

The generalized LR parsing algorithm of Tomita (1986) allows most context-free grammars to be parsed with high efficiency. For applications in speech recognition (and perhaps elsewhere) there is a need for a systematic treatment of uncertainty in language modelling, pattern recognition and parsing. Probabilistic grammars are increasing in importance as language models (Sharman, 1989; Lari and Young, 1990), pattern recognition is guided by predictions of forthcoming words (one application of the recent algorithm of Jelinek (1990)), and the extension of the Tomita algorithm to probabilistic grammars (Wright et al., 1989, 1990) is one approach to the parsing problem. The successful application of the Viterbi beam-search algorithm to connected speech recognition (Lee, 1989), together with the possibility of building grammar-level modelling into this framework (Lee and Rabiner, 1989), is further evidence of this trend. The purpose of this paper is to consider some issues in the implementation of probabilistic generalized LR parsing.

The objectives of our current work on language modelling and parsing for speech recognition can be summarised as follows:
(1) real-time parsing without excessive space requirements,
(2) minimum restrictions on the grammar (ambiguity, null rules, left-recursion all permitted, no need to use a normal form),
(3) probabilistic predictions to be made available to the pattern matcher, with word or phoneme likelihoods received in return,
(4) interpretations, ranked by overall probability, to be made available to the user,
(5) adaptation of the language model and parser, with minimum delay and interaction with the user.

The choice of parser generator is relevant to objectives (1) and (3). All versions satisfy objective (2) but are initially susceptible to generating an infinite number of states for certain probabilistic grammars, and in the case of the canonical parser generator this can happen in two ways. A solution to this problem is described in section 2. The need for probabilistic predictions of forthcoming words or phonemes (objective (3)) is best met by the canonical parser generator, because for all other versions the prior probability distribution can only be found by following up all possible reduce actions in a state, in advance of the next input (Wright et al., 1989, 1990). This consumes both time and space, but the size of the parsing tables produced by the canonical parser generator generally precludes their use. The space requirements for the various parser generators are assessed in section 3. The grammar used for this purpose was developed by I.B.M. from an Associated Press corpus of text (Sharman, 1989).

Objective (4) is met by the parse forest representation, which is a probabilistic version of that employed by Tomita (1986), incorporating sub-node sharing and local ambiguity packing. This is described in section 4.

The final issue (objective (5) and section 5) is crucial to the applicability of the whole approach. We regard a grammar as a probabilistic structured hierarchical model of language as used, not a prescriptive basis for correctness of that use. A relatively compact parsing table presumes a relatively compact grammar, which is therefore going to be inadequate to cope with the range of usage to which it is likely to be exposed. It is essential that the software be made adaptive, and our experimental version operates through the LR parser to synthesise new grammar rules, assess their plausibility, and make incremental changes to the LR parsing tables in order to add or delete rules in the grammar.

2. PARSING TABLE SPACE REQUIREMENTS

2.1 INFINITE SERIES OF STATES: FROM THE ITEM PROBABILITIES

When applied to a probabilistic grammar, the various versions of LR parser generator first produce a series of item sets in which a probability derived from the grammar is attached to each item, and then generate the action and goto tables in which each entry again has an attached probability, representing a frequency for that action, conditional upon the state (Wright et al., 1989, 1990). Sometimes these probabilities can cause a problem in state generation. For example, consider the following probabilistic grammar:

    S → A, p1    |    B, p2
    A → cA, q1   |    a, q2
    B → cB, r1   |    b, r2

where p1, p2 and so on (with p1 + p2 = 1) represent the probabilities of the respective rules. After receiving the terminal symbol c, the state with the (closed) item set shown in Table 1 is entered, with the first column of probabilities for each item. After receiving the terminal symbol c again, a state with the same item set is entered but with the second column of probabilities for each item, and these are different from the first unless q1 = r1. For the probabilistic parser these states must therefore be distinguished, and in fact this process continues to generate an (in principle) infinite sequence of states. Although it may sometimes be sufficient merely to truncate this series at some point, the number of additional states generated when all the "goto" steps have been exhausted can be very large.

Table 1: Item set with probabilities.

    Item       First probability         Second probability
    A → c•A    p1q1/(p1q1 + p2r1)        p1q1^2/(p1q1^2 + p2r1^2)
    B → c•B    p2r1/(p1q1 + p2r1)        p2r1^2/(p1q1^2 + p2r1^2)
    A → •cA    p1q1^2/(p1q1 + p2r1)      p1q1^3/(p1q1^2 + p2r1^2)
    A → •a     p1q1q2/(p1q1 + p2r1)      p1q1^2q2/(p1q1^2 + p2r1^2)
    B → •cB    p2r1^2/(p1q1 + p2r1)      p2r1^3/(p1q1^2 + p2r1^2)
    B → •b     p2r1r2/(p1q1 + p2r1)      p2r1^2r2/(p1q1^2 + p2r1^2)

Table 2: Separated item sets.

    A → c•A        B → c•B
    A → •cA        B → •cB
    A → •a         B → •b

We can avoid this problem by introducing a multiple-shift entry for the terminal symbol c, in the state from which the one just discussed is entered. Multiple action entries in the table are normally confined to cases of shift-reduce and reduce-reduce conflicts, but the purpose here is to force the stack to divide, with a probability p1q1/(p1q1 + p2r1) attached to one branch and p2r1/(p1q1 + p2r1) to the other. These then lead to separate states with item sets as shown in Table 2. The prior probabilities of a, b and c are obtained by combining the two branches and take the same values as before. Further occurrences of the terminal symbol c simply cause the same state to be re-entered in each branch, and eventually an a or b eliminates one branch. If c is replaced by a nonterminal symbol C, the same procedure applies except that a multiple goto entry is required, and this in turn means that a probability has to be attached to each entry (this was not required in the original version of the probabilistic LR parser).

Suppose in general that a grammar has nonterminal and terminal vocabularies N and T respectively. Conditions for the occurrence of an infinite series of states can be summarised as follows: there occurs a state in the closure of which there arise either

(a) two distinct self-recursive nonterminal symbols (A and B say) for which a nonempty string α ∈ (N ∪ T)+ exists such that

    A ⇒ αAβ  and  B ⇒ αBγ

where β, γ ∈ (N ∪ T)*, or

(b) two or more mutually recursive nonterminal symbols for which a nonempty string α ∈ (N ∪ T)+ exists such that

    A ⇒ αBβ  and  B ⇒ αAγ

where β, γ ∈ (N ∪ T)*, and in addition either

(i) a left-most derivation of one symbol from the other is possible: A ⇒ Bδ where δ ∈ (N ∪ T)*, or

(ii) a self-recursive nonterminal symbol (C say, which may coincide with A or B) also arises, such that C ⇒ αCδ where δ ∈ (N ∪ T)*, for the same α.

These conditions ensure that the α-successor of this state also

2.2 INFINITE SERIES OF STATES: FROM THE LOOKAHEAD DISTRIBUTION

For the probabilistic LALR parser generator the lookaheads consist of a set of terminal symbols, as in the case of non-probabilistic grammars. However, for the canonical parser generator there is a full probability distribution of lookaheads, and this creates a second potential source of looping behaviour. It is possible for item sets to have the same item probabilities but different lookahead distributions. Suppose that a state contains an item with a right-recursive nonterminal symbol (C say) after the dot, with nothing following.
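The item probabilities of Table 1 in section 2.1 can be reproduced numerically. The following sketch is illustrative only (the function names and the sample rule probabilities are ours, not the paper's): it computes the kernel-item probabilities after n shifts of the terminal c, scales each closure item by the probability of the rule it predicts, and confirms that successive states differ unless q1 = r1.

```python
def kernel_probs(p1, p2, q1, r1, n):
    """Normalised probabilities of the kernel items A -> c.A and B -> c.B
    after n occurrences of the terminal c (the columns of Table 1)."""
    za = p1 * q1 ** n          # unnormalised weight of the A branch
    zb = p2 * r1 ** n          # unnormalised weight of the B branch
    return za / (za + zb), zb / (za + zb)

def item_set_probs(p1, p2, q1, q2, r1, r2, n):
    """Full (closed) item set: each closure item scales its kernel item's
    probability by the probability of the rule it predicts."""
    a, b = kernel_probs(p1, p2, q1, r1, n)
    return {"A -> c.A": a,       "B -> c.B": b,
            "A -> .cA": a * q1,  "B -> .cB": b * r1,
            "A -> .a":  a * q2,  "B -> .b":  b * r2}

# q1 != r1: the states entered after one c and after two c's carry
# different probabilities, so new states keep being generated.
first  = item_set_probs(0.6, 0.4, 0.7, 0.3, 0.5, 0.5, n=1)
second = item_set_probs(0.6, 0.4, 0.7, 0.3, 0.5, 0.5, n=2)
assert abs(first["A -> c.A"] - second["A -> c.A"]) > 0.01

# q1 == r1: the probabilities coincide and the series terminates.
eq1 = item_set_probs(0.6, 0.4, 0.5, 0.5, 0.5, 0.5, n=1)
eq2 = item_set_probs(0.6, 0.4, 0.5, 0.5, 0.5, 0.5, n=2)
assert all(abs(eq1[k] - eq2[k]) < 1e-12 for k in eq1)
```

The expressions za and zb are exactly the numerators of Table 1, so the assertions amount to checking the table's first and second columns against each other.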
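The claim in section 2.1 that "the prior probabilities of a, b and c are obtained by combining the two branches and take the same values as before" can also be checked directly. This is a minimal sketch under our own naming and data layout: each branch of the divided stack carries the next-terminal distribution implied by its separated item set (Table 2), and the combined prior is the mixture weighted by the branch probabilities.

```python
def branch_weights(p1, p2, q1, r1):
    """Probabilities attached to the two branches when the stack divides
    on the terminal c via the multiple-shift entry."""
    s = p1 * q1 + p2 * r1
    return p1 * q1 / s, p2 * r1 / s

def combined_predictions(branches):
    """Branch-probability-weighted mixture of per-branch next-terminal
    distributions. branches: list of (weight, {terminal: probability})."""
    out = {}
    for w, dist in branches:
        for t, p in dist.items():
            out[t] = out.get(t, 0.0) + w * p
    return out

p1, p2, q1, q2, r1, r2 = 0.6, 0.4, 0.7, 0.3, 0.5, 0.5
wa, wb = branch_weights(p1, p2, q1, r1)
# Branch A predicts c (from A -> .cA) or a (from A -> .a); branch B
# predicts c or b, each with its own rule probabilities.
pred = combined_predictions([(wa, {"c": q1, "a": q2}),
                             (wb, {"c": r1, "b": r2})])

# The combined priors agree with those of the original, undivided state.
den = p1 * q1 + p2 * r1
assert abs(pred["c"] - (p1 * q1 * q1 + p2 * r1 * r1) / den) < 1e-12
assert abs(pred["a"] - p1 * q1 * q2 / den) < 1e-12
assert abs(pred["b"] - p2 * r1 * r2 / den) < 1e-12
assert abs(sum(pred.values()) - 1.0) < 1e-12
```

The final assertion holds because q1 + q2 = 1 and r1 + r2 = 1, so the mixture of the two branch distributions is itself a probability distribution.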
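Condition (a) of section 2.1 can be tested mechanically for simple grammars. The sketch below is deliberately narrower than the condition in the text: it only detects directly self-recursive rules of the form A → αAβ, whereas the paper's condition covers multi-step derivations as well; the names and grammar encoding are ours.

```python
def direct_self_recursive_prefixes(rules):
    """For each nonterminal A, collect the nonempty prefixes alpha such
    that a rule A -> alpha A beta exists (direct self-recursion only)."""
    prefixes = {}
    for lhs, alternatives in rules.items():
        for rhs in alternatives:
            for i, sym in enumerate(rhs):
                if sym == lhs and i > 0:
                    prefixes.setdefault(lhs, set()).add(rhs[:i])
    return prefixes

def condition_a_pairs(rules):
    """Pairs of distinct self-recursive nonterminals sharing a common
    nonempty prefix alpha, which signal an infinite series of states."""
    prefixes = direct_self_recursive_prefixes(rules)
    symbols = sorted(prefixes)
    return [(a, b)
            for i, a in enumerate(symbols)
            for b in symbols[i + 1:]
            if prefixes[a] & prefixes[b]]

# The example grammar of section 2.1: A -> cA and B -> cB share alpha = c,
# so condition (a) applies and the state series would be infinite.
rules = {
    "S": [("A",), ("B",)],
    "A": [("c", "A"), ("a",)],
    "B": [("c", "B"), ("b",)],
}
assert condition_a_pairs(rules) == [("A", "B")]
```

A full checker would close the prefix sets under derivation, since α may be produced indirectly; this one-step version is enough to flag the grammar used as the running example.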