Top-Down Parsing

Top-Down Parsing

TD2: LL(1) Parsing Top-down Parsing CMPT 379: Compilers Instructor: Anoop Sarkar anoopsarkar.github.io/compilers-class 1 Parsing - Roadmap • Parser: – decision procedure: builds a parse tree • Top-down vs. bottom-up • LL(1) – Deterministic Parsing – recursive-descent – table-driven • LR(k) – Deterministic Parsing – LR(0), SLR(1), LR(1), LALR(1) • Parsing arbitrary CFGs – Polynomial time parsing 2 Top-Down vs. Bottom Up Grammar: S ® A B Input String: ccbca A ® c | e B ® cbB | ca Top-Down/leftmost Bottom-Up/rightmost S Þ AB S®AB ccbca Ü Acbca A®c Þ cB A®c Ü AcbB B®ca Þ ccbB B®cbB Ü AB B®cbB Þ ccbca B®ca Ü S S®AB 3 Leftmost derivation for id + id * id E ® E + E E Þ E + E E ® E * E Þ id + E E ® ( E ) Þ id + E * E E ® - E Þ id + id * E E ® id Þ id + id * id E Þ*lm id + E \* E 4 Predictive Top-Down Parser • Knows which production to choose based on single lookahead symbol • Need LL(1) grammars – First L: reads input Left to right – Second L: produce Leftmost derivation – 1: one symbol of lookahead • Cannot have left-recursion • Must be left-factored (no left-factors) • Not all grammars can be made LL(1) 5 LL(1) Parser • In recursive-descent – for each non-terminal and input token, many choices of production to use – Backtracking to remove bad choices • In LL(1) – for each non-terminal and each token, only one production S ®* � A � and next input token: t A ®� is the only production � � � 6 Left Factoring • Consider this grammar – E ® T + E | T – T ® id | id * T | ( E ) • Hard to predict because – For T two productions start with id – For E it is not clear how to predict • The grammar must not have left-recursion • The grammar should be left-factored 7 Left Factoring • In general, for rules • Left factoring is achieved by the following grammar transformation: 8 Left Factoring • Recall the grammar – E ® T + E | T – T ® id | id * T | ( E ) • Factor out common prefixes for productions – E ® T X – X ® + E | ε – T ® id Y | ( E ) – Y ® * T | ε 9 Predictive Parsing Table • Can be specified via 2D tables • One dimension for current (leftmost) non-terminal to expand • One dimension for next token • Each table entry contains one production Productions 1 E ® T X + * ( ) id $ 2 X ® e E T X T X 3 X ® + E X + E e e 4 T ® ( E ) 5 T ® id Y T ( E ) id Y 6 Y ® * T Y e * T e e 7 Y ® e 10 Predictive Parsing Table • Consider [E, id] entry • When current non-terminal is E and the next input is id, use production E ® T X Productions 1 E ® T X + * ( ) id $ 2 X ® e E T X T X 3 X ® + E X + E e e 4 T ® ( E ) 5 T ® id Y T ( E ) id Y 6 Y ® * T Y e * T e e 7 Y ® e 11 Predictive Parsing Table • Consider [Y, +] entry • When current non-terminal is Y and the next input is + , get rid of Y • Y can be followed by + only if Y ® e Productions 1 E ® T X + * ( ) id $ 2 X ® e E T X T X 3 X ® + E X + E e e 4 T ® ( E ) 5 T ® id Y T ( E ) id Y 6 Y ® * T Y e * T e e 7 Y ® e 12 Predictive Parsing Table • Blank entries indicate error situations • Consider [E, *] entry • There is no way to derive a string starting with * from non-terminal E Productions 1 E ® T X + * ( ) id $ 2 X ® e E T X T X 3 X ® + E X + E e e 4 T ® ( E ) 5 T ® id Y T ( E ) id Y 6 Y ® * T Y e * T e e 7 Y ® e 13 Predictive Parsing • Method similar to recursive descent, except – For each non-terminal S – We look at the next token a – And chose the production shown at entry [S,a] • We use a stack to keep track of pending non- terminals (frontier of parse tree) • We reject when we encounter an error state • We accept when we encounter end-of-input and empty stack 14 Table-Driven Parsing stack.push($); stack.push(S); Stack: to keep track a = input.read(); of what is pending in the derivation forever do begin X = stack.peek(); if X = a and a = $ then return SUCCESS; elsif X = a and a != $ then stack.pop(X); a = input.read(); elsif X != a and X Î N and M[X,a] not empty then stack.pop(X); stack.push(M[X,a]); /* M[X, a] = Y1…Yn */ else ERROR! X⟶Y1…Yn end 15 + * ( ) id $ E T X T X X + E e e Trace “id*id” T ( E ) id Y Y e * T e e Stack Input Action E E $ id*id$ T X T X $ id*id$ id Y T X id Y X $ id*id$ terminal Y X $ *id$ * T id Y e * T X $ *id$ terminal T X $ id$ id Y * T id Y X $ id$ terminal Y X $ $ e id Y X $ $ e $ $ Accept! e 16 When to pick Y ® e? Productions • Choice between Y ® * T and Y ® e 1 E ® T X • FIRST(*T) = { * } 2 X ® e • For Y ® e we compute FOLLOW(Y) 3 X ® + E • FOLLOW(Y) = ? 4 T ® ( E ) • FOLLOW(Y) = FOLLOW(T) 5 T ® id Y • FOLLOW(T) = ( FIRST(X) – {e} ) + 6 Y ® * T FOLLOW(E) 7 Y ® e • FOLLOW(T) = { + , ) , $ } • FOLLOW(Y) = { + , ) , $ } 17 Predictive Parsing table • Given a grammar produce the predictive parsing table • We need to to know for all rules A ® a | b the lookahead symbol • Based on the lookahead symbol the table can be used to pick which rule to push onto the stack • This can be done using two sets: FIRST and FOLLOW 18 Predictive Parsing Table • For Nonterminal A, rule A ® a, and the token t, M[A, t] = a in two cases: • If a ⇒* t b – a can derive a t in the first position – We say that t Î First(a) • A ® a and a ⇒ * e and S ⇒ * b A t δ – Useful if stack has A, input is t and A cannot derive t – In this case only option is to get rid of A (by a ⇒ * e) • Can work only if t can follow A in at least on derivation – We say t Î Follow(A) 19 FIRST and FOLLOW 20 Conditions for LL(1) • Necessary conditions: – no ambiguity – no left recursion – Left factored grammar • A grammar G is LL(1) if - whenever A ® a | b 1. First(a) Ç First(b) = Æ 2. a Þ* e implies !(b Þ* e) 3. a Þ* e implies First(b) Ç Follow(A) = Æ 21 ComputeFirst(a: string of symbols) // assume a = X1 X2 X3 … Xn if X1 Î T then First[a] := {X1} else begin i:=1; First[a] := ComputeFirst(X1)\{e}; while Xi Þ* e do begin if i < n then First[a] := First[a] È ComputeFirst(Xi+1)\{e}; else First[a] := First[a] È {e}; i := i + 1; end Recursion in computing FIRST end causes problems when faced with recursive grammar rules 22 ComputeFirst; modified foreach X Î T do First[X] := {X}; foreach p Î P : X ® e do First[X] := {e}; repeat foreach X Î N, p : X ® Y1 Y2 Y3 … Yn do begin i:=1; while Yi Þ* e and i <= n do begin First[X] := First[X] È First[Yi]\{e}; i := i+1; end if i = n+1 then First[X] := First[X] È {e}; until no change in First[X] for any X; 23 ComputeFirst; modified foreach X Î T do First[X] := X; foreach p Î P : X ® e do First[X] := {e}; repeat foreach X Î N, p : X ® Y1 Y2 Y3 … Yn do begin i:=1; Non-recursive FIRST computation while Yi Þ* eworksand i with<= n leftdo-recursive begin grammars. First[X] := First[XComputes] È First[Ya fixed i]point\{e}; for FIRST[X] i := i+1; for all non-terminals X in the grammar. end But this algorithm is very inefficient. if i = n+1 then First[X] := First[X] È {e}; until no change in First[X] for any X; 24 First Sets Productions First(+) = {+} First(E) = ? 1 E ® T X First(*) = {*} First(T) ⊆ First(E) 2 X ® e 3 X ® + E First( ‘(‘ ) = {‘(’} First(T) = {id, ‘(‘} 4 T ® ( E ) First( ‘)’ ) = {‘)’} First(E) = {id, ‘(‘} 5 T ® id Y 6 Y ® * T First(id) = {id} First(X) = {+, e} 7 Y ® e First(Y) = {*, e} 25 Follow Sets • Algorithm sketch 1. Add $ to Follow(S) 2. For each production A ⟶ a X b • Add First(b) – {e} to Follow(X) 3. For each A ⟶ a X b where e Î First(b) • Add Follow(A) to Follow(X) – Repeat steps 2-3 until no follow set grows 26 ComputeFollow Follow(S) := {$}; repeat foreach p Î P do case p = A ® aBb begin Follow[B] := Follow[B] È ComputeFirst(b)\{e}; if e Î First(b) then Follow[B] := Follow[B] È Follow[A]; end case p = A ® aB Follow[B] := Follow[B] È Follow[A]; until no change in any Follow[N] 27 Follow Sets. Example Follow(E) = {$, )} Productions Follow(E)⊆ Follow(X) 1 E ® T X Follow(X) = {$, )} Follow(X)⊆ Follow(E) 2 X ® e Follow(T) = {+, $, )} First(X)-{e}⊆ Follow(T) 3 X ® + E Follow(Y) = {+, $, )} 4 T ® ( E ) Follow(E)⊆ Follow(T) Follow(‘(‘) = {(, id} 5 T ® id Y Follow(Y)⊆ Follow(T) Follow(‘)‘) = {+,$, )} 6 Y ® * T Follow(T)⊆ Follow(Y) 7 Y ® e Follow(+) = {(, id} Follow(*) = {(, id} Follow(id) = {*,+,$,)} 28 Building the Parse Table • Compute First and Follow sets • For each production A ® a – For each t Î First(a) • M[A,t] = a – If e Î First(a), for each t Î Follow(A) • M[A,t] = a – If e Î First(a) and $ Î Follow(a) • M[A,$] = a – All undefined entries are errors 29 Predictive Parsing Table First(E) = {id, ‘(‘} First(T) = {id, ‘(‘} Follow(E) = {$, )} Follow(T) = {+, $, )} First(X) = {+, e} First(Y) = {*, e} Follow(X) = {$, )} Follow(Y) = {+, $, )} Productions 1 E ® T X + * ( ) id $ 2 X ® e E T X T X 3 X ® + E X + E e e 4 T ® ( E ) 5 T ® id Y T ( E ) id Y 6 Y ® * T Y e * T e e 7 Y ® e 30 Example First/Follow S ® AB A ® c | e Not an LL(1) grammar B ® cbB | ca First(A) = {c, e} Follow(A) = {c} First(B) = {c} Follow(A) Ç First(cbB) = First(c) = {c} First(ca) = {c} Follow(B) = {$} First(S) = {c} Follow(S) = {$} 31 Converting to LL(1) S ® AB Note that grammar A ® c | e is regular: c? (cb)* ca B ® cbB | ca c (c b c b … c b) c a c c (b c b … c b c) a (c b c b … c b) c a c (b c b … c b c) a S ® cAa same as: A ® cB | B c c? (bc)* a B ® bcB | e 32 Verifying LL(1) using F/F sets S ® cAa A ® cB | B B ® bcB | e First(A) = {b, c, e} Follow(A) = {a} First(B) = {b, e} Follow(B) = {a} First(S) = {c} Follow(S) = {$} 33 Building the Parse Table • Compute First and Follow sets • For each production A ® a – foreach a Î First(a) add A ® a to M[A,a] – If e Î First(a) add A ® a to M[A,b] for each b in Follow(A) – If e Î First(a) add A ® a to M[A,$] if $ Î Follow(a) – All undefined entries are errors 34 Predictive Parsing Table Productions 1 T ® F T’ FIRST(T) = {id, (} FOLLOW(T) = {$, )} 2 T’ ® e FIRST(T’) = {*, e} FOLLOW(T’) = {$,)} 3 T’ ® * F T’ FIRST(F) = {id, (} FOLLOW(F) = {*,$,)} 4 F ® id 5 F ® ( T ) * ( ) id $ T T ® F T’ T ® F T’ T’ T’ ® * F T’ T’ ® e T’ ® e F F ® ( T ) F ® id 35 Revisit conditions for LL(1) • A grammar G is LL(1) iff - whenever A ® a | b 1.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    45 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us