LR Parsing
Lecture Notes by Profs Aiken and Necula (UCB)

CS780 (Prasad), L16LR

Outline
• Review of SLR parsing
• Limits of SLR parsing
• LR parsing
• LALR parsing
• Implementation of semantic actions
• Using parser generators

Review of SLR(1) Parsing
• LR parser maintains a stack
    ⟨sym_1, state_1⟩ . . . ⟨sym_n, state_n⟩
  state_n is the final state of the DFA on sym_1 … sym_n
• Goto table: the transition function of the DFA
  – Goto[i,A] = j if state_i →A state_j (an A-transition from state_i to state_j)
• Action table: for each state and terminal:
    Action[i, a] = Shift j
                 | Reduce X → α
                 | Accept
                 | Error

LR Parsing Algorithm
  Let I = w$ be initial input
  Let j = 0
  Let DFA state 1 have item S' → .S
  Let stack = ⟨dummy, 1⟩
  repeat
    case action[top_state(stack), I[j]] of
      shift k:       push ⟨I[j++], k⟩
      reduce X → α:  pop |α| pairs,
                     push ⟨X, Goto[top_state(stack), X]⟩
      accept:        halt normally
      error:         halt and report error

Review. Items
• An item [X → α.β] says that
  – the parser is looking for an X
  – it has an α on top of the stack
  – it expects to find a string derived from β next in the input
• Notes:
  – [X → α.aβ] means that a should follow. Then we can shift it and still have a viable prefix.
  – [X → α.] means that we could reduce X
    • But this is not always a good idea!

SLR(1) Action Table
For each state s_i and terminal a
  – If s_i has item X → α.aβ and Goto[i,a] = j then Action[i,a] = shift j
  – If s_i has item S' → S. then Action[i,$] = accept
  – If s_i has item X → α. and a ∈ Follow(X) and X ≠ S' then Action[i,a] = reduce X → α
  – Otherwise, Action[i,a] = error

Limits of SLR Parsing
• SLR(1) is the simplest LR parsing method
• SLR(1) is almost powerful enough, but …
• … some common programming language constructs are not SLR(1).
• Consider the grammar
    S → L = E | E
    L → * E | id
    E → L

Limits of SLR Parsing (cont.)
• Consider two states of the DFA for recognizing viable prefixes

    State 1            State 2 (reached from state 1 on L)
    S' → . S           S → L . = E
    S → . L = E        E → L .
    S → . E
    L → . * E
    L → . id
    E → . L

• SLR(1) parser in state 2 on input "="
  – shift (item S → L . = E)
  – reduce by E → L (since "=" ∈ Follow(E))

What's The Problem?
• The grammar is not SLR(1), but why?
• Focus on the reduce move in the second state
  – We are in the context of S → E → L
  – No = can follow E in this context
  – Even though = ∈ Follow(E) (in S → L = E → * E = E)
  – The reduce move should not happen if an = follows in this context.

What's The Problem? (Cont.)
• Problem: the SLR table has too many reduce actions.
  – Using Follow is too coarse.
• In any given context, only some elements of Follow can actually follow a non-terminal.
• For example: Follow(E) = {=, $}, but
  – In context S → E only $ can follow E
  – In context S → L = E → * E1 = E only = can follow E1

One Way to Fix The Problem: LR(1) Items
• Idea:
  – refine Follow based on context.
  – The context is described through items.
• An LR(1) item is a pair
    [X → α.β, a]
  where X → αβ is a production and a is the lookahead token or $
• LR(k) is similar but with k tokens of lookahead
  – In practice, k = 1

LR(1) Items. Intuition
• [X → α.β, a] describes a state of the parser:
  – We are trying to find an X, and
  – We have α already on top of the stack, and
  – We expect to see a prefix derived from βa
• Back to reduce actions: have an [X → α., a]
  – Perform the reduce only if next token is a!
  – Will have fewer reduce actions
  – Not for all b ∈ Follow(X)

Constructing Sets of LR(1) Items (1)
• Similar to construction for LR(0).
• The states of the NFA are the LR(1) items of G.
• The start state is [S' → . S, $]

Constructing Sets of LR(1) Items (2)
1. For each LR(1) item [Y → α.Xβ, a]
     Add an X-transition [Y → α.Xβ, a] →X [Y → αX.β, a]
2. For each LR(1) item [Y → α.Xβ, a]
     For each production X → γ
       For each terminal b ∈ First(βa)
         Add an ε-transition [Y → α.Xβ, a] →ε [X → .γ, b]

NFA for Viable Prefixes in Detail (1) to (4)
[Figures: the NFA of LR(1) items for the grammar above, built up over four slides from the start state [S' → . S, $] using S-, L- and E-transitions and ε-transitions. The items shown are [S' → . S, $], [S' → S ., $], [S → . L = E, $], [S → L . = E, $], [S → . E, $], [L → . id, =], [L → . * E, =], [E → . L, $], [E → L ., $], [L → . * E, $] and [L → . id, $].]

An Example Revisited
• Consider the state from last slide
    S → L . = E, $
    E → L ., $
• LR(1) parser on input "="
  – only shift (item S → L . = E)

Constructing LR(1) Parsing Tables
1. Add a dummy S' → S production
2. Construct the NFA of LR(1) items as before
3. Convert the NFA into a DFA
4. Goto is defined exactly as before:
     Goto[i, A] = j if state_i →A state_j
   (the transition function of the DFA)

Constructing LR(1) Parsing Tables (Cont.)
5. For each state s_i of the DFA and terminal a
  – If s_i has item [X → α.aβ, c] and Goto[i, a] = j then action[i,a] = shift j
  – If s_i has item [X → α., a] and X ≠ S' then action[i,a] = reduce X → α
  – If s_i has item [S' → S., $] then action[i,$] = accept
  – Otherwise, action[i,a] = error
• LR(1) grammar ⇔ action[i,a] uniquely defined

LALR Parsing
• Two bottom-up parsing methods: SLR and LR
• Which one do we use? Neither
  – SLR is not powerful enough.
  – LR parsing tables are too big (1000's of states vs. 100's of states for SLR).
• In practice, use LALR(1)
  – Stands for Look-Ahead LR
  – A compromise between SLR(1) and LR(1)

LALR Parsing (Cont.)
• Rough intuition: An LALR(1) parser for G has
  – The number of states of an SLR parser.
  – Some of the lookahead discrimination of LR(1).
• Idea: construct the DFA for LR(1).
• Then merge the DFA states whose items differ only in the lookahead tokens
  – We say that such states have the same core.

The Core of a Set of LR Items
• Definition: The core of a set of LR items is the set of first components.
• Example: the core of
    { [X → α.β, b], [Y → γ.δ, d] }
  is
    { X → α.β, Y → γ.δ }
• The core of an LR item is an LR(0) item.

An LALR(1) DFA
• Repeat until all states have distinct core.
  – Choose two distinct states with same core.
  – Merge the states by creating a new one with the union of all the items.
  – Point edges from predecessors to new state.
  – New state points to all the previous successors.
[Figure: a DFA with states A, B, C, D, E, F before merging, and with states A, C, BE, D, F after states B and E are merged.]

The LALR Parser Can Have Conflicts
• Consider for example the LR(1) states
    {[X → α., a], [Y → β., b]}
    {[X → α., b], [Y → β., a]}
• And the merged LALR(1) state
    {[X → α., a/b], [Y → β., a/b]}
• Has a new reduce-reduce conflict.
• In practice such cases are rare.

LALR vs. LR Parsing
• LALR languages are not natural.
  – They are an efficiency hack on LR languages
• Any reasonable programming language has an LALR(1) grammar.
• LALR(1) has become a standard for programming languages and for parser generators.

A Hierarchy of Grammar Classes
[Figure: hierarchy of grammar classes.]

Semantic Actions
• We can now illustrate how semantic actions are implemented for LR parsing.
• Keep attributes on the stack.
• On shift a, push attribute for a on stack.
• On reduce X → α
  – pop attributes for α
  – compute attribute for X
  – and push it on the stack

Performing Semantic Actions. Example
• Recall the example from earlier lecture
    E → T + E1     { E.val = T.val + E1.val }
      | T          { E.val = T.val }
    T → int * T1   { T.val = int.val * T1.val }
      | int        { T.val = int.val }
• Consider the parsing of the string 3 * 5 + 8

Performing Semantic Actions. Example
    Stack | remaining input       Action
    | int * int + int             shift
    int3 | * int + int            shift
    int3 * | int + int            shift
    int3 * int5 | + int           reduce T → int
    int3 * T5 | + int             reduce T → int * T
    T15 | + int                   shift
    T15 + | int                   shift

Notes
• The previous discussion shows how synthesized attributes are computed by LR parsers.
• It is also possible to compute inherited attributes in an LR parser.
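The attribute-stack scheme from the semantic actions slides can be sketched as a small table-driven parser for the grammar E → T + E1 | T, T → int * T1 | int. This is only an illustration under stated assumptions: the names PRODS, ACTION, GOTO and parse are hypothetical, the tables were worked out by hand with the SLR(1) rules from the review slides (Follow(E) = {$}, Follow(T) = {+, $}), and states are numbered from 0 rather than from 1 as in the slides.

```python
# Sketch of an LR driver that keeps attributes on the parse stack.
# Assumption: the tables below are a hand-derived SLR(1) table for
#   1: E -> T + E1   2: E -> T   3: T -> int * T1   4: T -> int
# with state 0 as the start state (the slides call the start state 1).

# production number -> (left-hand side, length of right-hand side, semantic action)
PRODS = {
    1: ("E", 3, lambda v: v[0] + v[2]),  # E.val = T.val + E1.val
    2: ("E", 1, lambda v: v[0]),         # E.val = T.val
    3: ("T", 3, lambda v: v[0] * v[2]),  # T.val = int.val * T1.val
    4: ("T", 1, lambda v: v[0]),         # T.val = int.val
}

ACTION = {
    (0, "int"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("shift", 4), (2, "$"): ("reduce", 2),
    (3, "*"): ("shift", 5), (3, "+"): ("reduce", 4), (3, "$"): ("reduce", 4),
    (4, "int"): ("shift", 3),
    (5, "int"): ("shift", 3),
    (6, "$"): ("reduce", 1),
    (7, "+"): ("reduce", 3), (7, "$"): ("reduce", 3),
}

GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "E"): 6, (4, "T"): 2, (5, "T"): 7}


def parse(tokens):
    """Parse a list of (kind, value) tokens and return the attribute of E."""
    tokens = tokens + [("$", None)]      # end-of-input marker
    stack = [(0, None)]                  # pairs (state, attribute)
    j = 0
    while True:
        kind, value = tokens[j]
        state = stack[-1][0]
        entry = ACTION.get((state, kind))
        if entry is None:                # blank table entry means error
            raise SyntaxError(f"unexpected {kind!r} in state {state}")
        op, arg = entry
        if op == "shift":
            stack.append((arg, value))   # push the token's attribute
            j += 1
        elif op == "reduce":
            lhs, n, action = PRODS[arg]
            # n >= 1 for every production here (no empty right-hand sides)
            attrs = [a for _, a in stack[-n:]]
            del stack[-n:]               # pop |alpha| pairs
            stack.append((GOTO[(stack[-1][0], lhs)], action(attrs)))
        else:                            # accept
            return stack[-1][1]


# The string 3 * 5 + 8 from the slides
print(parse([("int", 3), ("*", None), ("int", 5), ("+", None), ("int", 8)]))  # 23
```

Storing the attribute next to the state in each stack entry mirrors the ⟨symbol, state⟩ pairs of the parsing algorithm; the grammar symbol itself is not needed at run time because the state already determines it.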

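Returning to the grammar S → L = E | E, L → * E | id, E → L from the limits-of-SLR slides, the LR(1) item rules can be checked mechanically. The sketch below is an assumption-laden illustration, not the notes' own construction: it computes ε-closures directly (the combined effect of building the NFA and converting it to a DFA), relies on the grammar having no ε-productions so that First(βa) is just First of the first symbol, and uses hypothetical names GRAMMAR, first_of_symbol, closure and goto.

```python
# Sketch: LR(1) closure and goto for S -> L = E | E, L -> * E | id, E -> L.
# Assumption: no epsilon-productions, so First(beta a) = First(first symbol).

GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "E"), ("E",)],
    "L": [("*", "E"), ("id",)],
    "E": [("L",)],
}


def first_of_symbol(sym, seen=frozenset()):
    """Terminals that can start a string derived from sym."""
    if sym not in GRAMMAR:               # a terminal or $
        return {sym}
    if sym in seen:                      # already being explored on this path
        return set()
    out = set()
    for rhs in GRAMMAR[sym]:
        out |= first_of_symbol(rhs[0], seen | {sym})
    return out


def closure(items):
    """Epsilon-closure of a set of items (lhs, rhs, dot, lookahead)."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            rest = rhs[dot + 1:] + (la,)             # beta followed by a
            for b in first_of_symbol(rest[0]):       # b in First(beta a)
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0, b)    # [X -> .gamma, b]
                    if item not in items:
                        items.add(item)
                        work.append(item)
    return frozenset(items)


def goto(state, sym):
    """DFA transition: advance the dot over sym, then close."""
    moved = {(l, r, d + 1, a) for l, r, d, a in state
             if d < len(r) and r[d] == sym}
    return closure(moved)


start = closure({("S'", ("S",), 0, "$")})
after_L = goto(start, "L")

# Lookaheads on which the state reached by L reduces, and terminals it shifts
reduce_lookaheads = {a for l, r, d, a in after_L if d == len(r)}
shift_terminals = {r[d] for l, r, d, a in after_L
                   if d < len(r) and r[d] not in GRAMMAR}

print(sorted(reduce_lookaheads), sorted(shift_terminals))  # ['$'] ['=']
```

In the state reached on L, the reduce by E → L is allowed only on $, while = appears only as a shift, so the shift-reduce conflict of the SLR(1) table does not arise.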