Top-Down Parsing & Bottom-Up Parsing I

Top-Down Parsing & Bottom-Up Parsing I

Top-Down Parsing and Introduction to Bottom-Up Parsing Lecture 7 Instructor: Fredrik Kjolstad Slide design by Prof. Alex Aiken, with modifications 1 Predictive Parsers • Like recursive-descent but parser can “predict” which production to use – By looking at the next few tokens – No backtracking • Predictive parsers accept LL(k) grammars – L means “left-to-right” scan of input – L means “leftmost derivation” – k means “predict based on k tokens of lookahead” – In practice, LL(1) is used 2 LL(1) vs. Recursive Descent • In recursive-descent, – At each step, many choices of production to use – Backtracking used to undo bad choices • In LL(1), – At each step, only one choice of production – That is • When a non-terminal A is leftmost in a derivation • And the next input symbol is t • There is a unique production A ® a to use – Or no production to use (an error state) • LL(1) is a recursive descent variant without backtracking 3 Predictive Parsing and Left Factoring • Recall the grammar E ® T + E | T T ® int | int * T | ( E ) • Hard to predict because – For T two productions start with int – For E it is not clear how to predict • We need to left-factor the grammar 4 Left-Factoring Example • Recall the grammar E ® T + E | T T ® int | int * T | ( E ) • Factor out common prefixes of productions E ® T X X ® + E | e T ® int Y | ( E ) Y ® * T | e 5 LL(1) Parsing Table Example • Left-factored grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • The LL(1) parsing table: next input token int * + ( ) $ E T X T X X + E e e T int Y ( E ) Y * T e e e rhs of production to use 6 leftmost non-terminal E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e LL(1) Parsing Table Example • Consider the [E, int] entry – “When current non-terminal is E and next input is int, use production E ® T X” – This can generate an int in the first position int * + ( ) $ E T X T X X + E e e T int Y ( E ) Y * T e e e 7 E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e LL(1) Parsing Tables. Errors • Consider the [Y,+] entry – “When current non-terminal is Y and current token is +, get rid of Y” – Y can be followed by + only if Y ® e int * + ( ) $ E T X T X X + E e e T int Y ( E ) Y * T e e e 8 E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e LL(1) Parsing Tables. Errors • Consider the [Y,(] entry – “There is no way to derive a string starting with ( from non-terminal Y” – Blank entries indicate error situations int * + ( ) $ E T X T X X + E e e T int Y ( E ) Y * T e e e 9 Using Parsing Tables • Method similar to recursive descent, except – For the leftmost non-terminal S – We look at the next input token a – And choose the production shown at [S,a] • A stack records frontier of parse tree – Non-terminals that have yet to be expanded – Terminals that have yet to matched against the input – Top of stack = leftmost pending terminal or non-terminal • Reject on reaching error state • Accept on end of input & empty stack 10 LL(1) Parsing Algorithm (using the table) initialize stack = <S $> and next repeat case stack of <X, rest> : if T[X,*next] = Y1…Yn then stack ¬ <Y1…Yn, rest>; else error (); <t, rest> : if t == *next ++ then stack ¬ <rest>; else error (); until stack == < > 11 LL(1) Parsing Algorithm $ marks bottom of stack initialize stack = <S $> and next repeat For non-terminal X on top of stack, case stack of lookup production <X, rest> : if T[X,*next] = Y1…Yn then stack ¬ <Y1…Yn, rest>; else error (); Pop X, push production <t, rest> : if t == *next ++ rhs on stack. For terminal t on top of then stack ¬ <rest>; Note stack, check t matches next else error (); leftmost input token. symbol of rhs is on top of until stack == < > the stack. 12 E ® T X T ® int Y | ( E ) LL(1) Parsing Example X ® + E | e Y ® * T | e Stack Input Action E $ int * int $ T X T X $ int * int $ int Y int Y X $ int * int $ terminal Y X $ * int $ * T * T X $ * int $ terminal T X $ int $ int Y int Y X $ int $ terminal Y X $ $ e X $ $ e $ $ ACCEPT 13 Constructing Parsing Tables: The Intuition • Consider non-terminal A, production A ® a, and token t • T[A,t] = a in two cases: 1. If A ® a ®* t b – a can derive a t in the first position – We say that t Î First(a) 2. If A ® a ®* e and S ®* γ A t d – Useful if stack has A, input is t, and A cannot derive t – In this case only option is to get rid of A (by deriving e) • Can work only if t can follow A in at least one derivation – We say t Î Follow(A) 14 Computing First Sets Definition First(X) = { t | X ®* ta} È {e | X ®* e} Algorithm sketch: 1. First(t) = { t } 2. e Î First(X) • if X ® e or • if X ® A1 … An and e Î First(Ai) for all 1 £ i £ n 3. First(a) Í First(X) • if X ® A1 … An a and e Î First(Ai) for all 1 £ i £ n 15 First Sets: Example 1. First(t) = { t } 2. e Î First(X) – if X ® e or – if X ® A1…An and e Î First(Ai) for all 1 £ i £ n 3. First(a) Í First(X) – if X ® A1…An a and e Î First(Ai) for all 1 £ i £ n E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e First( E ) = First( X ) = First( T ) = First( Y ) = 16 First Sets: Example • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • First sets First( ( ) = { ( } First( T ) = {int, ( } First( ) ) = { ) } First( E ) = {int, ( } First( int) = { int } First( X ) = {+, e } First( + ) = { + } First( Y ) = {*, e } First( * ) = { * } 17 Computing Follow Sets • Definition: Follow(X) = { t | S ®* b X t d } • Intuition – If X ® A B then First(B) Í Follow(A) and Follow(X) Í Follow(B) • if B ®* e then Follow(X) Í Follow(A) – If S is the start symbol then $ Î Follow(S) 18 Computing Follow Sets (Cont.) Algorithm sketch: 1. $ Î Follow(S) 2. For each production A ® aXb – First(b) - {e} Í Follow(X) 3. For each production A ® aXb where e Î First(b) – Follow(A) Í Follow(X) 19 Follow Sets: Example • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • Follow sets Follow( + ) = { int, ( } Follow( * ) = { int, ( } Follow( ( ) = { int, ( } Follow( E ) = {), $} Follow( X ) = {$, ) } Follow( T ) = {+, ) , $} Follow( ) ) = {+, ) , $} Follow( Y ) = {+, ) , $} Follow( int) = {*, +, ) , $} 20 Computing the Follow Sets (for the Non-Terminals) • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • $ Î Follow(E) 21 Computing the Follow Sets (for the Non-Terminals) • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • $ Î Follow(E) • First(X) Í Follow(T) • Follow(E) Í Follow(X) • Follow(E) Í Follow(T) because e Î First(X) 22 Computing the Follow Sets (for the Non-Terminals) • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • $ Î Follow(E) • First(X) Í Follow(T) • Follow(E) Í Follow(X) • Follow(E) Í Follow(T) because e Î First(X) • ) Î Follow(E) 23 Computing the Follow Sets (for the Non-Terminals) • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • $ Î Follow(E) • First(X) Í Follow(T) • Follow(E) Í Follow(X) • Follow(E) Í Follow(T) because e Î First(X) • ) Î Follow(E) • Follow(T) Í Follow(Y) 24 Computing the Follow Sets (for the Non-Terminals) • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • $ Î Follow(E) • First(X) Í Follow(T) • Follow(E) Í Follow(X) • Follow(E) Í Follow(T) because e Î First(X) • ) Î Follow(E) • Follow(T) Í Follow(Y) • Follow(X) Í Follow(E) 25 Computing the Follow Sets (for the Non-Terminals) • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • $ Î Follow(E) • First(X) Í Follow(T) • Follow(E) Í Follow(X) • Follow(E) Í Follow(T) because e Î First(X) • ) Î Follow(E) • Follow(T) Í Follow(Y) • Follow(X) Í Follow(E) • Follow(Y) Í Follow(T) 26 Computing the Follow Sets (for the Non-Terminals) • Recall the grammar E ® T X X ® + E | e T ® ( E ) | int Y Y ® * T | e • $ Î Follow(E) • First(X) Í Follow(T) Follow(X) • Follow(E) Í Follow(X) $ Follow(E) • Follow(E) Í Follow(T) Follow(T) • ) Î Follow(E) ) First(X) • Follow(T) Í Follow(Y) • Follow(X) Í Follow(E) Follow(Y) • Follow(Y) Í Follow(T) 27 Computing the Follow Sets (for the Non-Terminals) Follow(X) $ Follow(E) Follow(T) ) + First(X) Follow(Y) 28 Computing the Follow Sets (for the Non-Terminals) $,) $,) Follow(X) $ Follow(E) $,),+ Follow(T) ) + First(X) $,),+ Follow(Y) 29 Computing the Follow Sets (for all symbols) $,) $,) Follow(X) $ Follow(E) $,),+ Follow(T) ) + First(X) (,int (,int $,),+ Follow(+) First(E) $,),+ Follow()) Follow(Y) (,int $,),+,* Follow(() * (,int (,int First(Y) Follow(int) First(T) Follow(*) 30 Constructing LL(1) Parsing Tables • Construct a parsing table T for CFG G • For each production A ® a in G do: – For each terminal t Î First(a) do • T[A, t] = a – If e Î First(a), then for each t or $ Î Follow(A) do • T[A, t] = a 31 Notes on LL(1) Parsing Tables • If any entry is multiply defined then G is not LL(1) – If G is ambiguous – If G is left recursive – If G is not left-factored – And in other cases as well • Most programming language CFGs are not LL(1) 32 Bottom-Up Parsing • Bottom-up parsing is more general than top- down parsing – And just as efficient – Builds on ideas in top-down parsing • Bottom-up is the preferred method • Concepts today, algorithms next time 33 An Introductory Example • Bottom-up parsers don’t need left-factored grammars • Revert to the “natural” grammar for our example: E ® T + E | T T ® int * T | int | (E) • Consider the string: int * int + int 34 The Idea Bottom-up parsing reduces a string to the start symbol by inverting productions: int * int + int T ® int int * T + int T ® int * T T + int T ® int T + T E ® T T + E E ® T + E E 35 Observation • Read the productions in reverse (from bottom to top) • This is a reverse rightmost derivation! int * int + int

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    66 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us