
LR Parsing
Lecture 7
January 30, 2018

Midterm 1 Next Week
- In-class exam
- One sheet of letter paper, two sides
- Some review questions online
- Things to know
  - Cool syntax
  - Compiler/language processor steps
  - Regular languages
  - NFA/DFAs
  - Regular expressions
  - Parsing
  - Context-free grammars
Compiler Construction 2/41

From Last Week
- Top-down parsing strategies
- Tells us how to get from the start symbol of a grammar to a sequence of terminals by following a sequence of productions

Top Down Parsing
  S → a B c
  B → C x B | ε
  C → d
Input string: "adxdxc"
Parse tree, grown one expansion at a time from S:
  S ⇒ a B c
    ⇒ a C x B c
    ⇒ a d x B c
    ⇒ a d x C x B c
    ⇒ a d x d x B c
    ⇒ a d x d x c      (last B → ε)

Top Down Parsing (2)
  E → T + E | T
  T → ( E ) | int | int * T
Input: int * int
- Start at E and try E → T + E.
- Try T → ( E ): "(" does not match the first token, int.
- Back at E → T + E, try T → int: int matches, but the next token is *, not +.
- Try T → int * T: the tokens match, but no + follows, so E → T + E fails altogether.
- Backtrack to E and try E → T, with T → int * T and then T → int: this derives int * int.

Top-Down Summary
- Exhaust all grammatical rule options at each step
- Notice the significant backtracking!
- Backtracking results from having multiple options for rules when considering tokens
- If we restrict our grammar slightly, we can use LL(1) parsing
- Key idea: deterministically select a grammar rule based on one token to eliminate backtracking

LL(1) Summary
- We want a parsing strategy that lets us parse an input string by considering only one token at a time
- Intuition: if the input string x ∈ L(G), there must be a parse tree.
  We want to pick a grammar rule one token at a time so we can quickly find out whether the string is in the language!
- Implementation: parse table
- Intuition: 2-d table indexed by ‘current’ non-terminal and ‘lookahead’ token. Each cell contains the production rule corresponding to the lookahead.

Parse Table Construction
- Constructing the parse table requires a PREDICT set for every non-terminal in the grammar.
- We need a way to know which rule must be used to successfully construct the next part of the parse tree
Consider:
  S → a | b
You know exactly which rule to take based on the first thing you see as part of the S rule.

Parse Table Construction
  S → a A a | b A
  A → c c A | b b A
- Same idea! We always start with S. You can see that the first terminal in the S rules tells you unambiguously which rule is used!
- Similarly, after you consume one ‘a’, the next non-terminal you consider is A. You can decide which A rule to choose based on the first terminal in the A rules!

Parse Table Construction
  S → A a
  A → B b
  B → c B | a A
- A bit trickier, but you can still decide which rule to take with only one terminal

Parse Table Construction
- Sounds like we need to know the first terminal that can appear as a result of expanding a non-terminal!
- This is called the FIRST set: FIRST(A) is the set of terminals that can appear first in the expansion of the non-terminal A
- Note this means recursing through other non-terminals!
- FIRST(X) = { b | X →* bα } ∪ { ε | X →* ε }

Parse Table Construction
- That includes ε rules
  S → a A a
  A → b A | ε

Parse Table Construction
- Recall we’re trying to make a parse table
- It is almost enough to know the FIRST set of all non-terminals. The token will tell us which rule to take next.
- However, how do we know when a non-terminal should expand to ε?
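
The FIRST definition above can be computed as a fixed point: keep adding terminals until no set grows. The following is a minimal sketch, not the course's implementation. The dictionary encoding of a grammar, the name `first_sets`, and the use of an empty tuple for an ε-production are all assumptions made for illustration. The two grammars are the ones from the ε-rule slide and the "a bit trickier" slide, as read from the slides.

```python
# Illustrative sketch only: the grammar encoding and all names are assumptions.
EPS = "ε"

def first_sets(grammar):
    """grammar maps each non-terminal to a list of right-hand sides (tuples).
    An empty tuple stands for an ε-production."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                          # repeat until no set grows
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:
                    if sym not in grammar:  # terminal: it begins the expansion
                        first[nt].add(sym)
                        all_nullable = False
                        break
                    # non-terminal: borrow its FIRST set (minus ε)
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        all_nullable = False
                        break
                if all_nullable:            # every symbol can vanish (or rhs is ε)
                    first[nt].add(EPS)
                changed |= len(first[nt]) != before
    return first

# S -> a A a ; A -> b A | ε
g1 = {"S": [("a", "A", "a")], "A": [("b", "A"), ()]}
# S -> A a ; A -> B b ; B -> c B | a A
g2 = {"S": [("A", "a")], "A": [("B", "b")], "B": [("c", "B"), ("a", "A")]}

f1 = first_sets(g1)
f2 = first_sets(g2)
print(f1)
print(f2)
```

In the second grammar, FIRST(S) is only discovered by passing through A and then B, which is the recursion through other non-terminals that the slide mentions.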

Parse Table Construction
  S → a A B a
  A → b B | ε
  B → c A d | ε
If my input string is aba, after consuming the ‘a’ token, how do we know that the next ‘b’ token results in B → ε?

Parse Table Construction: FOLLOW
- We also need to know the terminals that could come after the previous non-terminal
- FOLLOW(X) = { b | S →* βXbω }
Computing the FOLLOW set
- Compute FIRST sets for all non-terminals
- Add $ to FOLLOW(S)
  - The start symbol always ends with end-of-input
- For all productions Y → ... X A1 ... An
  - Add FIRST(Ai) − {ε} to FOLLOW(X). Stop if ε ∉ FIRST(Ai).
  - If the scan did not stop (every Ai can derive ε, or nothing follows X), add FOLLOW(Y) to FOLLOW(X)

Example FOLLOW Set
  E → T X
  X → + E | ε
  T → ( E ) | int Y
  Y → * T | ε
FOLLOW("+") = { int, ( }
FOLLOW("(") = { int, ( }
FOLLOW(X) = { $, ) }
FOLLOW(Y) = { +, ), $ }

Back to Parsing Tables
- Recall: we want to build an LL(1) parsing table
For each production A → α in G do:
- For each terminal b ∈ FIRST(α) do
  - T[A][b] = α
- If α →* ε, for each b ∈ FOLLOW(A) do
  - T[A][b] = α

Parsing Table
  E → T X
  X → + E | ε
  T → ( E ) | int Y
  Y → * T | ε
Where do we put Y → * T?
- Well, FIRST(*T) = {*}, thus column * of row Y gets *T
Where do we put Y → ε?
- Well, FOLLOW(Y) = {$, +, )}, thus columns $, +, and ) in row Y get Y → ε

      int      *      +      (        )    $
T     int Y                  ( E )
E     T X                    T X
X                     + E             ε    ε
Y              * T    ε               ε    ε

Another Parse Table Example
  E → T X
  X → + T X | ε
  T → F R
  R → * F R | ε
  F → int | ( E )

Production     Prediction
E → T X        (, int
X → + T X      +
X → ε          $, )
T → F R        (, int
R → * F R      *
R → ε          +, $, )
F → int        int
F → ( E )      (

Another Parse Table Example
Filling the table from the predictions:

      +        *        (        )    int    $
E                       T X           T X
X     + T X                      ε           ε
T                       F R           F R
R     ε        * F R             ε           ε
F                       ( E )         int

Notes on LL(1) Parsing Tables
- If any entry is multiply defined then G is not LL(1). This happens when, for example:
  - G is ambiguous
  - G is left-recursive
  - G is not left-factored

Ambiguity in parse tables
  E → E + T
  E → T
  T → T * F
  T → F
  F → id
  F → ( E )
For the E productions, we need FIRST(T) = {(, id} and FIRST(E) = {(, id}.
But now, which rule (E → E + T or E → T) gets put in T[E][(] and T[E][id]?

      +    *    (    )    id    $
E               ?         ?
T
F

LR Parsing
- LL(1) parsing is simple and fast
- However, what if we wanted more expressiveness in the grammar?
- Enter LR parsing, a bottom-up parsing approach
  - As efficient as LL(1)
  - Not quite as easy by hand
  - Preferred practical method (see bison/yacc)

LR Parsing
An LR parser reads tokens from left to right and builds the parse bottom-up, producing a rightmost derivation in reverse.
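
Returning to the notes on LL(1) parsing tables: the check for a multiply defined entry can be sketched by building the table with the FIRST/FOLLOW rules from the earlier slides and recording every cell that two productions compete for. This is a minimal sketch under assumptions. The grammar encoding (a dictionary of right-hand-side tuples, with an empty tuple for ε) and the names `first_of` and `analyse` are hypothetical, and FIRST and FOLLOW are iterated together to a fixed point for brevity instead of in two separate passes.

```python
# Illustrative sketch only: the encoding and all function names are assumptions.
EPS, END = "ε", "$"

def first_of(seq, first, grammar):
    """FIRST of a symbol string; includes EPS if the whole string can vanish."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in grammar else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}

def analyse(grammar, start):
    """Return (table, conflicts) for a grammar given as {A: [rhs tuples]}."""
    first = {a: set() for a in grammar}
    follow = {a: set() for a in grammar}
    follow[start].add(END)              # the start symbol ends with end-of-input
    changed = True
    while changed:                      # grow FIRST and FOLLOW until stable
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                size = len(first[a])
                first[a] |= first_of(rhs, first, grammar)
                changed |= len(first[a]) != size
                for i, x in enumerate(rhs):
                    if x not in grammar:
                        continue        # FOLLOW is only tracked for non-terminals
                    size = len(follow[x])
                    rest = first_of(rhs[i + 1:], first, grammar)
                    follow[x] |= rest - {EPS}
                    if EPS in rest:     # everything after x can vanish
                        follow[x] |= follow[a]
                    changed |= len(follow[x]) != size
    table, conflicts = {}, []
    for a, rhss in grammar.items():
        for rhs in rhss:
            f = first_of(rhs, first, grammar)
            lookaheads = (f - {EPS}) | (follow[a] if EPS in f else set())
            for b in lookaheads:
                if (a, b) in table:     # multiply defined entry: not LL(1)
                    conflicts.append((a, b))
                else:
                    table[(a, b)] = rhs
    return table, conflicts

# E -> T X ; X -> + E | ε ; T -> ( E ) | int Y ; Y -> * T | ε
ll1 = {"E": [("T", "X")], "X": [("+", "E"), ()],
       "T": [("(", "E", ")"), ("int", "Y")], "Y": [("*", "T"), ()]}
# Left-recursive: E -> E + T | T ; T -> T * F | F ; F -> id | ( E )
left_rec = {"E": [("E", "+", "T"), ("T",)], "T": [("T", "*", "F"), ("F",)],
            "F": [("id",), ("(", "E", ")")]}

table1, conflicts1 = analyse(ll1, "E")
table2, conflicts2 = analyse(left_rec, "E")
print(sorted(conflicts1))
print(sorted(conflicts2))
```

The first grammar fills its table without any clash, while the left-recursive grammar reports clashes in the `(` and `id` columns of rows E and T, the same cells marked with question marks in the ambiguity slide.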