LR Parsing, LALR Parser Generators Lecture 10 Outline • Review Of

Outline • Review of bottom-up parsing LR Parsing, LALR Parser Generators • Computing the parsing DFA • Using parser generators Lecture 10 Prof. Fateman CS 164 Lecture 10 1 Prof. Fateman CS 164 Lecture 10 2 Bottom-up Parsing (Review) The Shift and Reduce Actions (Review) • A bottom-up parser rewrites the input string • Recall the CFG: E → int | E + (E) to the start symbol • A bottom-up parser uses two kinds of actions: • The state of the parser is described as •Shiftpushes a terminal from input on the stack α I γ – α is a stack of terminals and non-terminals E + (I int ) ⇒ E + (int I ) – γ is the string of terminals not yet examined •Reducepops 0 or more symbols off the stack (the rule’s rhs) and pushes a non-terminal on • Initially: I x1x2 . xn the stack (the rule’s lhs) E + (E + ( E ) I ) ⇒ E +(E I ) Prof. Fateman CS 164 Lecture 10 3 Prof. Fateman CS 164 Lecture 10 4 LR(1) Parsing. An Example Key Issue: When to Shift or Reduce? int int + (int) + (int)$ shift 0 1 I int + (int) + (int)$ E → int • Idea: use a finite automaton (DFA) to decide E E → int I ( on $, + E I + (int) + (int)$ shift(x3) when to shift or reduce 2 + 3 4 E + (int I ) + (int)$ E → int – The input is the stack accept E int E + (E ) + (int)$ shift – The language consists of terminals and non-terminals on $ I ) E → int E + (E) I + (int)$ E → E+(E) 7 6 5 on ), + E I + (int)$ shift (x3) • We run the DFA on the stack and we examine E → E + (E) + E + (int )$ E → int the resulting state X and the token tok after I on $, + int I E + (E )$ shift –If X has a transition labeled tok then shift ( I 8 9 β E + (E) I $ E → E+(E) –If X is labeled with “A → on tok”then reduce E + E I $ accept 10 11 E → E + (E) Prof. Fateman CS 164 Lecture 10 5 ) on ), + Representing the DFA Representing the DFA. Example • Parsers represent the DFA as a 2D table • The table for a fragment of our DFA: – Recall table-driven lexical analysis int + ( ) $ E ( • Lines correspond to DFA states 3 4 … int 3s4 • Columns correspond to terminals and non- E 4s5 g6 terminals 5 6 5 rE→ int rE→ int • Typically columns are split into: E → int 6s8 s7 – Those for terminals: the action table ) on ), + 7 rE→ E+(E) rE→ E+(E) – Those for non-terminals: the goto table … sk is shift and goto state k 7 rX → α is reduce E → E + (E) gk is goto state k, Prof. Fateman CS 164 Lecture 10 7 on $, + Prof. Fateman CS 164 Lecture 10 8 The LR Parsing Algorithm The LR Parsing Algorithm • After a shift or reduce action we rerun the Let I = w$ be initial input DFA on the entire stack Let j = 0 Let DFA state 0 be the start state – This is wasteful, since most of the work is repeated Let stack = 〈 dummy, 0 〉 repeat • Remember for each stack element on which case action[top_state(stack), I[j]] of state it brings the DFA; use extra memory. shift k: push 〈 I[j++], k 〉 reduce X → A: pop |A| pairs, • LR parser maintains a stack push 〈X, Goto[top_state(stack), X]〉〈 sym1, state1 〉 . 〈 symn, staten 〉 accept: halt normally statek is the final state of the DFA on sym1 …symk error: halt and report error Prof. Fateman CS 164 Lecture 10 9 Prof. Fateman CS 164 Lecture 10 10 Key Issue: How is the DFA Constructed? LR(1) Items • The stack describes the context of the parse • An LR(1) item is a pair: – What non-terminal we are looking for X →α•β, a – What production rhs we are looking for –X → αβ is a production – What we have seen so far from the rhs –ais a terminal (the lookahead terminal) – LR(1) means 1 lookahead terminal • Each DFA state describes several such contexts •[X →α•β, a] describes a context of the parser – E.g., when we are looking for non-terminal E, we – We are trying to find an X followed by an a, and might be looking either for an int or a E + (E) rhs – We have (at least) α already on top of the stack – Thus we need to see next a prefix derived from βa Prof. Fateman CS 164 Lecture 10 11 Prof. Fateman CS 164 Lecture 10 12 Note Convention •The symbol I was used before to separate the • We add to our grammar a fresh new start stack from the rest of input symbol S and a production S → E – α I γ, where α is the stack and γ is the remaining –Where E is the old start symbol string of terminals •In items • is used to mark a prefix of a • The initial parsing context contains: production rhs: S → •E, $ →α β X • , a – Trying to find an S as a string derived from E$ β –Here might contain terminals as well –The stack is empty • In both case the stack is on the left Prof. Fateman CS 164 Lecture 10 13 Prof. Fateman CS 164 Lecture 10 14 LR(1) Items (Cont.) LR(1) Items (Cont.) • In context containing •Consider the item E → E + • ( E ), + E → E + (• E ) , + –If ( follows then we can perform a shift to context • We expect a string derived from E ) + containing • There are two productions for E → • E E + ( E ), + E → int and E → E + ( E) • In context containing • We describe this by extending the context E → E + ( E ) •, + with two more items: – We can perform a reduction with E → E + ( E ) E → • int, ) –But only if a + follows E → • E + ( E ) , ) Prof. Fateman CS 164 Lecture 10 15 Prof. Fateman CS 164 Lecture 10 16 The Closure Operation Constructing the Parsing DFA (1) • Extending the context with items is called the • Construct the start context: Closure({S → •E, $}) closure operation. S → •E, $ E → •E+(E), $ Closure(Items) = E → •int, $ repeat E → •E+(E), + for each [X → α•Yβ, a] in Items E → •int, + for each production Y → γ • We abbreviate as: for each b ∈ First(βa) S → •E, $ γ add [Y → • , b] to Items E → •E+(E), $/+ until Items is unchanged E → •int, $/+ Prof. Fateman CS 164 Lecture 10 17 Prof. Fateman CS 164 Lecture 10 18 Constructing the Parsing DFA (2) The DFA Transitions • A DFA state is a closed set of LR(1) items • A state “State” that contains [X → α•yβ, b] has a transition labeled y to a state that the • The start state contains [S → •E, $] items “Transition(State, y)” – y can be a terminal or a non-terminal • A state that contains [X → α•, b] is labeled Transition(State, y) α with “reduce with X → on b” Items ← ∅ for each [X → α•yβ, b] ∈ State • And now the transitions … add [X → αy•β, b] to Items return Closure(Items) Prof. Fateman CS 164 Lecture 10 19 Prof. Fateman CS 164 Lecture 10 20 Constructing the Parsing DFA. Example. LR Parsing Tables. Notes → • 0 1 S E, $ E → int•, $/+ E → int • Parsing tables (i.e. the DFA) can be E → •E+(E), $/+ int on $, + constructed automatically for a CFG E → •int, $/+ E → E+• (E), $/+ 3 E + 2 • Why study this at all in CS164? We still need S → E•, $ ( to understand the construction to work with E → E•+(E), $/+ E → E+(•E), $/+ 4 parser generators accept E E → •E+(E), )/+ on $ E → •int, )/+ – E.g., they report errors in terms of sets of items E → E+(E•), $/+ int 5 6 • What kind of errors can we expect? E → E•+(E), )/+ E → int•, )/+ E → int on ), + and so on… Prof. Fateman CS 164 Lecture 10 21 Prof. Fateman CS 164 Lecture 10 22 Shift/Reduce Conflicts Shift/Reduce Conflicts • If a DFA state contains both • They are a typical symptom if there is an [X → α•aβ, b] and [Y → γ•, a] ambiguity in the grammar • Classic example: the dangling else S → if E then S | if E then S else S | OTHER •Then on input “a” we could either • Will have DFA state containing –Shift into state [X → αa•β, b], or [S → if E then S•, else] – Reduce with Y → γ [S → if E then S• else S, x] •Ifelse follows then we can shift or reduce • This is called a shift-reduce conflict • Default (bison, CUP, JLALR, etc.) is to shift – Default behavior is right in this case Prof. Fateman CS 164 Lecture 10 23 Prof. Fateman CS 164 Lecture 10 24 More Shift/Reduce Conflicts More Shift/Reduce Conflicts • Consider the ambiguous grammar Some parser generators (YACC, BISON) → E E + E | E * E | int provide precedence declarations. • We will have the states containing – Precedence left PLUS, → → [E E * • E, +] [E E * E•, +] – Precedence left TIMES [E → • E + E, +] ⇒E [E → E • + E, +] – Precedence right EXP …… • Again we have a shift/reduce on input + •Bison, YACC – We need to reduce (* binds more tightly than +) – Declare precedence and associativity: – Solution: somehow impose the precedence of * and %left + + %left * Prof. Fateman CS 164 Lecture 10 25 Prof. Fateman CS 164 Lecture 10 26 More Shift/Reduce Conflicts Reduce/Reduce Conflicts Our LALR generator doesn’t do this. Instead, we • If a DFA state contains both “Stratify” the grammar. (Less explanation!) [X → α•, a] and [Y → β•, a] E → E + E | E * E | int ;; original New –Then on input “a”we don’t know which E → E + E1 | E1 ;; E1+E would be right associative production to reduce E1 → E1 * int | int (Many “layers” may be necessary for elaborate languages.

LR Parsing, LALR Parser Generators Lecture 10 Outline • Review Of

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support