MODULE 18 – LALR Parsing

After studying the CALR (canonical LR) parser, the most powerful of the LR parsers, this module shows how to construct the LALR parser. The CALR parser has a large set of items; the LALR parser is designed to have fewer item sets while still avoiding many of the conflicts that arise in the SLR parser. This module discusses the construction of the LR(1) items needed for LALR parsing, the LALR parsing table, and the parsing of a string using the LALR parser.

18.1 Need for LALR parser

Though the CALR parser is powerful enough to avoid the conflicts of the SLR parser, it suffers from a large set of LR(1) items. This increases the number of entries in the CALR parsing table and therefore the time and space needed to construct and store it. The LALR parsing table reduces the number of item sets by combining the sets that have the same core items but different look-aheads. The resulting parser is less powerful than the CALR parser. Merging cannot introduce a new shift/reduce conflict, because shift actions depend only on the core and not on the look-ahead. However, because items with different look-aheads are combined into one, the LALR parser may introduce reduce/reduce conflicts; this is rarely a problem for the grammars of programming languages.

18.2 LR(1) items

The LR(1) items for the LALR parser are computed by first constructing the LR(1) item sets as for the CALR parser, and then combining the item sets that have the same core but different look-aheads into one item set. The algorithm for constructing the CALR parser's LR(1) items is discussed in Module 17. Only the combining step is discussed here, by means of an example.

Example 18.1 Let us construct the LR(1) items for the grammar given below in order to construct the LALR parsing table.

S → CC
C → cC
C → d

The augmented grammar is given below, and the CALR parser's LR(1) items are repeated in Table 18.1 for quick reference.

S' → S
S → CC
C → cC
C → d

Table 18.1 LR(1) items of CALR parsing.

| Item | Set of items | Goto(I, X) | Comments |
|---|---|---|---|
| I0 | S' → •S, $; S → •CC, $; C → •cC, c/d; C → •d, c/d | | This is the initial item. We have the non-terminal S after the dot, so we add the productions of S with look-ahead FIRST($), since β is ε. We then again have a non-terminal, C, after the dot; here β is C and a is $, so we add the productions of C with look-ahead FIRST(C$). FIRST(C) = {c, d} from the two productions of C. Thus we add two items for each production of C, one with c and the other with d as look-ahead; they are written in combined form as c/d. |
| I1 | S' → S•, $ | (I0, S) | Shifting the dot results in a kernel item; the look-ahead remains the same. |
| I2 | S → C•C, $; C → •cC, $; C → •d, $ | (I0, C) | The dot is shifted one position to the right. Now we have C after the dot; β is ε, so we add the items of C with FIRST($) as look-ahead. |
| I3 | C → c•C, c/d; C → •cC, c/d; C → •d, c/d | (I0, c), (I3, c) | Shifting the dot one position and keeping the look-ahead unchanged gives the first item. Now we have C after the dot; β is ε, so we add the items of C with FIRST(c/d) as look-ahead. |
| I4 | C → d•, c/d | (I0, d), (I3, d) | Kernel item with the same look-ahead. |
| I5 | S → CC•, $ | (I2, C) | Kernel item. |
| I6 | C → c•C, $; C → •cC, $; C → •d, $ | (I2, c), (I6, c) | The dot is shifted one position to the right. Now we have C after the dot; β is ε, so we add the items of C with FIRST($) as look-ahead. |
| I7 | C → d•, $ | (I2, d), (I6, d) | Kernel item. |
| I8 | C → cC•, c/d | (I3, C) | Kernel item. |
| I9 | C → cC•, $ | (I6, C) | Kernel item; no more new items need to be added. |

From Table 18.1, consider the item sets I3 and I6. Both have the same core but differ in their look-aheads, so we combine them and call the result I36:

I36 : goto(I0, c), goto(I2, c), goto(I36, c)
C → c•C, c/d/$
C → •cC, c/d/$
C → •d, c/d/$

Similarly, I4 and I7 are combined into I47, and I8 and I9 are combined into I89.
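The combining step can be sketched in Python as follows. This is a minimal illustration rather than the module's algorithm: the names CALR_STATES and merge_by_core are assumed here, each item is written as a dotted production string with "." as the dot, and only the item sets are merged (the goto transitions would have to be renamed in the same way).

```python
from collections import defaultdict

# Canonical LR(1) item sets of Table 18.1.
# Each state maps a dotted production (the core item) to its look-ahead set.
CALR_STATES = {
    0: {"S'->.S": {"$"}, "S->.CC": {"$"},
        "C->.cC": {"c", "d"}, "C->.d": {"c", "d"}},
    1: {"S'->S.": {"$"}},
    2: {"S->C.C": {"$"}, "C->.cC": {"$"}, "C->.d": {"$"}},
    3: {"C->c.C": {"c", "d"}, "C->.cC": {"c", "d"}, "C->.d": {"c", "d"}},
    4: {"C->d.": {"c", "d"}},
    5: {"S->CC.": {"$"}},
    6: {"C->c.C": {"$"}, "C->.cC": {"$"}, "C->.d": {"$"}},
    7: {"C->d.": {"$"}},
    8: {"C->cC.": {"c", "d"}},
    9: {"C->cC.": {"$"}},
}


def merge_by_core(states):
    """Group item sets with identical cores and union their look-aheads."""
    groups = defaultdict(list)
    for number in sorted(states):
        core = frozenset(states[number])  # the dict keys are the core items
        groups[core].append(number)

    merged = {}
    for core, members in groups.items():
        # Name the merged set by concatenating state numbers, e.g. 3 and 6 -> "36".
        name = "".join(str(n) for n in members)
        merged[name] = {
            item: set().union(*(states[n][item] for n in members))
            for item in core
        }
    return merged


lalr = merge_by_core(CALR_STATES)
print(sorted(lalr))                    # ['0', '1', '2', '36', '47', '5', '89']
print(sorted(lalr["36"]["C->c.C"]))    # ['$', 'c', 'd']
```

Running the sketch leaves seven item sets, named 0, 1, 2, 36, 47, 5 and 89, and every item of the merged sets carries the look-ahead c/d/$.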
The combined item sets I47 and I89 are given below:

• I47 : goto(I0, d), goto(I2, d), goto(I36, d)
  C → d•, c/d/$
• I89 : goto(I36, C)
  C → cC•, c/d/$

Thus the 10 item sets of the CALR parser are reduced by 3, leaving the 7 item sets I0, I1, I2, I36, I47, I5 and I89.

18.3 LALR parsing table

After constructing the LR(1) items and combining the item sets that share a core, we use this reduced collection to construct the LALR parsing table. The table construction is the same as that discussed for the CALR parser in the previous module, except that we work with the LALR parser's LR(1) items. The productions are numbered 1: S → CC, 2: C → cC and 3: C → d. The LALR parsing table for the grammar of Example 18.1 is given in Table 18.2.

Table 18.2 LALR parsing table

| State | Action: c | Action: d | Action: $ | Goto: S | Goto: C | Comments |
|---|---|---|---|---|---|---|
| 0 | s36 | s47 | | 1 | 2 | Goto(I0, c) = I36 => [0, c] = s36; Goto(I0, d) = I47 => [0, d] = s47; Goto(I0, S) = I1 => [0, S] = 1; Goto(I0, C) = I2 => [0, C] = 2 |
| 1 | | | accept | | | I1 contains [S' → S•, $], so [1, $] is the accept action |
| 2 | s36 | s47 | | | 5 | Goto(I2, c) = I36 => [2, c] = s36; Goto(I2, d) = I47 => [2, d] = s47; Goto(I2, C) = I5 => [2, C] = 5 |
| 36 | s36 | s47 | | | 89 | Goto(I36, c) = I36 => [36, c] = s36; Goto(I36, d) = I47 => [36, d] = s47; Goto(I36, C) = I89 => [36, C] = 89 |
| 47 | r3 | r3 | r3 | | | I47 contains C → d•, c/d/$, so [47, c], [47, d] and [47, $] are set to reduce by C → d |
| 5 | | | r1 | | | I5 contains S → CC•, $, so [5, $] is set to reduce by S → CC |
| 89 | r2 | r2 | r2 | | | I89 contains C → cC•, c/d/$, so [89, c], [89, d] and [89, $] are set to reduce by C → cC |

18.4 LALR parsing

The LALR parsing algorithm is the same as the CALR parsing algorithm, except that it consults the LALR parsing table together with the stack and the input. If the grammar is LR(1), merging item sets does not introduce a shift/reduce conflict, but for some grammars it introduces a reduce/reduce conflict; in that case the parser favors reducing by the production that appears first.

Example 18.2 Consider the grammar of Example 18.1 and trace the parsing actions of the LALR parser for the input "ccdd".
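The table-driven loop can be sketched in Python using the entries of Table 18.2. This is an illustrative sketch under stated assumptions: the names PRODUCTIONS, ACTION, GOTO and lalr_parse are chosen here, each input character is treated as one token, and the stack holds only states, so a reduction pops one entry per right-hand-side symbol instead of the two (symbol and state) counted in Table 18.3.

```python
# Productions numbered as in the reduce entries of Table 18.2.
PRODUCTIONS = {1: ("S", ["C", "C"]), 2: ("C", ["c", "C"]), 3: ("C", ["d"])}

# ACTION and GOTO entries copied from Table 18.2 (states named as in the table).
ACTION = {
    (0, "c"): ("s", 36), (0, "d"): ("s", 47),
    (1, "$"): ("acc", None),
    (2, "c"): ("s", 36), (2, "d"): ("s", 47),
    (36, "c"): ("s", 36), (36, "d"): ("s", 47),
    (47, "c"): ("r", 3), (47, "d"): ("r", 3), (47, "$"): ("r", 3),
    (5, "$"): ("r", 1),
    (89, "c"): ("r", 2), (89, "d"): ("r", 2), (89, "$"): ("r", 2),
}
GOTO = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (36, "C"): 89}


def lalr_parse(text):
    """Return the list of parser moves; raise SyntaxError on a blank entry."""
    tokens = list(text) + ["$"]   # append the end marker
    stack = [0]                   # states only; symbols are implied by the states
    pos, moves = 0, []
    while True:
        state, lookahead = stack[-1], tokens[pos]
        if (state, lookahead) not in ACTION:
            raise SyntaxError(f"unexpected {lookahead!r} in state {state}")
        kind, arg = ACTION[(state, lookahead)]
        if kind == "s":           # shift: push the new state, consume the token
            stack.append(arg)
            pos += 1
            moves.append(f"shift {arg}")
        elif kind == "r":         # reduce: pop |rhs| states, then follow GOTO
            lhs, rhs = PRODUCTIONS[arg]
            del stack[-len(rhs):]
            stack.append(GOTO[(stack[-1], lhs)])
            moves.append(f"reduce {arg}")
        else:                     # accept
            moves.append("accept")
            return moves


print(lalr_parse("ccdd"))
```

For the input "ccdd" the sketch performs three shifts, then reduce 3, reduce 2, reduce 2, a shift, reduce 3, reduce 1, and finally accept, which is the sequence of moves traced by hand in Table 18.3.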
Like other parsers, the input string is appended with $, and the parsing actions are shown in Table 18.3.

Table 18.3 Parsing action of the LALR parser

| Stack | Input | Action |
|---|---|---|
| 0 | ccdd$ | [0, c]: shift 36 |
| 0 c 36 | cdd$ | [36, c]: shift 36 |
| 0 c 36 c 36 | dd$ | [36, d]: shift 47 |
| 0 c 36 c 36 d 47 | d$ | [47, d]: reduce 3, pop 2 symbols from the stack, push C, goto(36, C) = 89 |
| 0 c 36 c 36 C 89 | d$ | [89, d]: reduce 2, pop 4 symbols from the stack, push C, goto(36, C) = 89 |
| 0 c 36 C 89 | d$ | [89, d]: reduce 2, pop 4 symbols from the stack, push C, goto(0, C) = 2 |
| 0 C 2 | d$ | [2, d]: shift 47 |
| 0 C 2 d 47 | $ | [47, $]: reduce 3, pop 2 symbols from the stack, push C, goto(2, C) = 5 |
| 0 C 2 C 5 | $ | [5, $]: reduce 1, pop 4 symbols from the stack, push S, goto(0, S) = 1 |
| 0 S 1 | $ | [1, $]: accept, successful parsing |

As Table 18.3 shows, the parser makes the same sequence of shift and reduce moves that the CALR parser makes on this input; the saving is in the size of the table (7 states instead of 10), not in the number of parsing steps.

Example 18.4 For the pointer variable declaration grammar (whose productions, as seen from the items, are S → L=R, S → R, L → *R, L → id and R → L), the modified set of LR(1) items and the parsing table are given in Table 18.4 and Table 18.5 respectively.

Table 18.4 LR(1) items

| Item | Set of items | Goto(I, X) | Comments |
|---|---|---|---|
| I0 | S' → •S, $; S → •L=R, $; S → •R, $; L → •*R, =/$; L → •id, =/$; R → •L, $ | | Initial item. The items for S and R are added with $ as look-ahead. For L there are two look-aheads, = and $: one from S → •L=R and the other from R → •L. |
| I1 | S' → S•, $ | (I0, S) | Kernel item that results in the accept action. |
| I2 | S → L•=R, $; R → L•, $ | (I0, L) | Kernel items. After the dot we have a terminal, so no additional items need to be added. |
| I3 | S → R•, $ | (I0, R) | Kernel item. |
| I4 | L → *•R, =/$; R → •L, =/$; L → •*R, =/$; L → •id, =/$ | (I0, *), (I4, *) | The items of R are added with the same look-ahead, and in turn the items of L. |
| I5 | L → id•, =/$ | (I0, id), (I4, id) | Kernel item. |
| I6 | S → L=•R, $; R → •L, $; L → •*R, $; L → •id, $ | (I2, =) | The items of R are added with the same look-ahead, and in turn the items of L. |
| I7 | L → *R•, =/$ | (I4, R) | Kernel item. |
| I8 | R → L•, =/$ | (I4, L) | Kernel item. |
| I9 | S → L=R•, $ | (I6, R) | Kernel item. |
| I10 | R → L•, $ | (I6, L), (I11, L) | Kernel item. |
| I11 | L → *•R, $; R → •L, $; L → •*R, $; L → •id, $ | (I6, *), (I11, *) | This is a new item set; it differs from I4 because the look-aheads are different. |
| I12 | L → id•, $ | (I6, id), (I11, id) | Kernel item. |
| I13 | L → *R•, $ | (I11, R) | Kernel item. |

The following item sets have the same core and can be combined:

• I4 and I11 are combined into I411
• I5 and I12 are combined into I512
• I7 and I13 are combined into I713
• I8 and I10 are combined into I810

Thus the LALR construction reduces the number of item sets from 14 to 10.
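Whether the merged item sets are still free of conflicts can be checked mechanically. The sketch below is only an illustration under stated assumptions: the names TERMINALS, LALR_STATES and find_conflicts are hypothetical, and LALR_STATES lists only those merged item sets of Example 18.4 that contain a completed item of the original grammar. For contrast, the last lines replace the look-ahead of R → L• in state 2 by FOLLOW(R) = {=, $}, as the SLR construction would.

```python
TERMINALS = {"=", "*", "id", "$"}

# Merged item sets of Example 18.4 that contain a completed item.
# Each item is (lhs, rhs, dot position, look-ahead set).
LALR_STATES = {
    "2":   [("S", ("L", "=", "R"), 1, {"$"}), ("R", ("L",), 1, {"$"})],
    "3":   [("S", ("R",), 1, {"$"})],
    "512": [("L", ("id",), 1, {"=", "$"})],
    "713": [("L", ("*", "R"), 2, {"=", "$"})],
    "810": [("R", ("L",), 1, {"=", "$"})],
    "9":   [("S", ("L", "=", "R"), 3, {"$"})],
}


def find_conflicts(states):
    """Report shift/reduce and reduce/reduce conflicts in each item set."""
    conflicts = []
    for name, items in states.items():
        # Completed items (dot at the end) call for a reduce on their look-aheads.
        reduces = [la for lhs, rhs, dot, la in items if dot == len(rhs)]
        # A terminal right after the dot calls for a shift on that terminal.
        shifts = {rhs[dot] for lhs, rhs, dot, la in items
                  if dot < len(rhs) and rhs[dot] in TERMINALS}
        for i, la in enumerate(reduces):
            if la & shifts:
                conflicts.append((name, "shift/reduce", sorted(la & shifts)))
            for other in reduces[i + 1:]:
                if la & other:
                    conflicts.append((name, "reduce/reduce", sorted(la & other)))
    return conflicts


print(find_conflicts(LALR_STATES))   # []

# SLR-style variant of state 2: reduce R -> L on every symbol of FOLLOW(R).
slr_state_2 = {"2": [("S", ("L", "=", "R"), 1, {"$"}),
                     ("R", ("L",), 1, {"=", "$"})]}
print(find_conflicts(slr_state_2))   # [('2', 'shift/reduce', ['='])]
```

The merged item sets report no conflict, whereas the SLR-style look-ahead for state 2 produces a shift/reduce conflict on =, which is the kind of SLR conflict that the LR(1) look-aheads avoid.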
