Top-Down Parsing

Originated from Prof. Aiken, CS 143. Modified by Yu Zhang.

Outline
• Recursive Descent Parsing
• Predictive Parsers

Intro to Top-Down Parsing: The Idea
• The parse tree is constructed
  – From the top
  – From left to right
• Terminals are seen in order of appearance in the token stream: t2 t5 t6 t8 t9
[Figure: parse tree with internal nodes 1, 3, 4, 7 and terminal leaves t2, t5, t6, t8, t9.]

Recursive Descent Parsing
• Consider the grammar
    E → T | T + E
    T → int | int * T | ( E )
• Token stream is: ( int5 )
• Start with top-level non-terminal E
  – Try the rules for E in order

Recursive Descent Parsing: walkthrough on ( int5 )
1. Start with E and try E → T.
2. Try T → int. Mismatch: int is not (. Backtrack.
3. Try T → int * T. Mismatch: int is not (. Backtrack.
4. Try T → ( E ). The ( matches. Advance input.
5. For the inner E, try E → T and then T → int. The int matches. Advance input.
6. The ) matches. Advance input.
7. End of input, accept.

A Recursive Descent Parser: Preliminaries
• Let TOKEN be the type of tokens
  – Special tokens INT, OPEN, CLOSE, PLUS, TIMES
• Let the global next point to the next token

A (Limited) Recursive Descent Parser (2)
• Define boolean functions that check the token string for a match of
  – A given token terminal
      bool term(TOKEN tok) { return *next++ == tok; }
  – The nth production of S:
      bool Sn() { … }
  – Try all productions of S:
      bool S() { … }

A (Limited) Recursive Descent Parser (3)
• For production E → T
    bool E1() { return T(); }
• For production E → T + E
    bool E2() { return T() && term(PLUS) && E(); }
• For all productions of E (with backtracking)
    bool E() {
      TOKEN *save = next;
      return (next = save, E1())
          || (next = save, E2()); }

A (Limited) Recursive Descent Parser (4)
• Functions for non-terminal T
    bool T1() { return term(INT); }
    bool T2() { return term(INT) && term(TIMES) && T(); }
    bool T3() { return term(OPEN) && E() && term(CLOSE); }
    bool T() {
      TOKEN *save = next;
      return (next = save, T1())
          || (next = save, T2())
          || (next = save, T3()); }
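
The functions above can be assembled into one compilable unit. The following is a minimal sketch, not the course's reference code: the END sentinel token, the const qualifiers, and the parse driver are assumptions added here so that the parser can tell when the whole input has been consumed.

```c
#include <stdbool.h>

/* Token kinds from the slides; END is an assumed end-of-input sentinel. */
typedef enum { INT, OPEN, CLOSE, PLUS, TIMES, END } TOKEN;

static const TOKEN *next;            /* global cursor into the token array */

static bool E(void);
static bool T(void);

/* Compare the current token with tok and always advance the cursor. */
static bool term(TOKEN tok) { return *next++ == tok; }

/* E -> T | T + E */
static bool E1(void) { return T(); }
static bool E2(void) { return T() && term(PLUS) && E(); }
static bool E(void) {
    const TOKEN *save = next;        /* remember where this attempt started */
    return (next = save, E1())
        || (next = save, E2());
}

/* T -> int | int * T | ( E ) */
static bool T1(void) { return term(INT); }
static bool T2(void) { return term(INT) && term(TIMES) && T(); }
static bool T3(void) { return term(OPEN) && E() && term(CLOSE); }
static bool T(void) {
    const TOKEN *save = next;
    return (next = save, T1())
        || (next = save, T2())
        || (next = save, T3());
}

/* Hypothetical driver (not in the slides): the token array must end with END. */
bool parse(const TOKEN *tokens) {
    next = tokens;
    return E() && *next == END;
}
```

Under these assumptions, parse accepts the token sequence OPEN INT CLOSE END. It rejects INT TIMES INT END even though that string is in the language: T1 succeeds on the first int, so T and E return true without ever trying T2, and the driver then finds unconsumed input.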

Recursive Descent Parsing: Example
• Grammar
    E → T | T + E
    T → int | int * T | ( E )
• Input: ( int )
• The parser consists of the functions term, E1, E2, E, T1, T2, T3 and T given above

Recursive Descent Parsing: Notes
• To start the parser
  – Initialize next to point to first token
  – Invoke E()
• Notice how this simulates the example parse
• Easy to implement by hand
  – But not completely general
  – Cannot backtrack once a production is successful
  – Works for grammars where at most one production can succeed for a non-terminal

When Recursive Descent Does Not Work
• Consider a production S → S a
    bool S1() { return S() && term(a); }
    bool S() { return S1(); }
• S() goes into an infinite loop
• A left-recursive grammar has a non-terminal S such that S →+ S α for some α
• Recursive descent does not work in such cases

Elimination of Left Recursion
• Consider the left-recursive grammar
    S → S α | β
• S generates all strings starting with a β and followed by a number of α
• Can rewrite using right-recursion
    S → β S'
    S' → α S' | ε

More Elimination of Left-Recursion
• In general
    S → S α1 | … | S αn | β1 | … | βm
• All strings derived from S start with one of β1,…,βm and continue with several instances of α1,…,αn
• Rewrite as
    S → β1 S' | … | βm S'
    S' → α1 S' | … | αn S' | ε

General Left Recursion
• The grammar
    S → A α | δ
    A → S β
  is also left-recursive because S →+ S β α
• This left-recursion can also be eliminated
• See Dragon Book for general algorithm
  – Section 4.3

Summary of Recursive Descent
• Simple and general parsing strategy
  – Left-recursion must be eliminated first
  – … but that can be done automatically
• Unpopular because of backtracking
  – Thought to be too inefficient
• In practice, backtracking is eliminated by restricting the grammar

Predictive Parsers
• Like recursive-descent but parser can "predict" which production to use
  – By looking at the next few tokens
  – No backtracking
• Predictive parsers accept LL(k) grammars
  – L means "left-to-right" scan of input
  – L means "leftmost derivation"
  – k means "predict based on k tokens of lookahead"
  – In practice, LL(1) is used

LL(1) vs. Recursive Descent
• In recursive-descent,
  – At each step, many choices of production to use
  – Backtracking used to undo bad choices
• In LL(1),
  – At each step, only one choice of production
  – That is
    • When a non-terminal A is leftmost in a derivation
    • The next input symbol is t
    • There is a unique production A → α to use
      – Or no production to use (an error state)
• LL(1) is a recursive descent variant without backtracking

Predictive Parsing and Left Factoring
• Recall the grammar
    E → T + E | T
    T → int | int * T | ( E )
• Hard to predict because
  – For T two productions start with int
  – For E it is not clear how to predict
• We need to left-factor the grammar

Left-Factoring Example
• Recall the grammar
    E → T + E | T
    T → int | int * T | ( E )
• Factor out common prefixes of productions
    E → T X
    X → + E | ε
    T → ( E ) | int Y
    Y → * T | ε

Left-Recursion & Left-Factoring Example
• Recall the grammar
    E → E + T | T          -- +: left assoc.
    T → int | int * T | ( E )
• Eliminate left-recursion & factor out common prefixes
    E → T X
    X → + T X | ε
    T → ( E ) | int Y
    Y → * T | ε

LL(1) Parsing Table Example
• Left-factored grammar
    E → T X
    X → + E | ε
    T → ( E ) | int Y
    Y → * T | ε
• The LL(1) parsing table (rows: leftmost non-terminal; columns: next input token; entries: rhs of production to use)

         int      *      +      (        )      $
    E    T X                    T X
    X                    + E             ε      ε
    T    int Y                  ( E )
    Y             * T    ε               ε      ε

LL(1) Parsing Table Example (Cont.)
• Consider the [E, int] entry
  – "When current non-terminal is E and next input is int, use production E → T X"
  – This can generate an int in the first position
• Consider the [Y,+] entry
  – "When current non-terminal is Y and current token is +, get rid of Y"
  – Y can be followed by + only if Y → ε
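
The claim that LL(1) is recursive descent without backtracking can be sketched directly on the left-factored grammar E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε. The code below is an illustrative sketch under assumptions: the END sentinel, the look cursor, and the names match, parseE, parseX, parseT, parseY and parse_predictive are all hypothetical. Each function picks its production by inspecting one lookahead token and never resets the cursor.

```c
#include <stdbool.h>

/* Token kinds from the slides; END is an assumed end-of-input sentinel. */
typedef enum { INT, OPEN, CLOSE, PLUS, TIMES, END } TOKEN;

static const TOKEN *look;            /* points at the single lookahead token */

static bool parseE(void);
static bool parseT(void);

/* Consume tok if it is the lookahead; never advance on a mismatch. */
static bool match(TOKEN tok) {
    if (*look != tok) return false;
    ++look;
    return true;
}

/* X -> + E | epsilon */
static bool parseX(void) {
    if (*look == PLUS) return match(PLUS) && parseE();
    /* Simplification: take epsilon on any other token, not only on ) and $.
       A token that cannot follow X is then rejected by the caller. */
    return true;
}

/* Y -> * T | epsilon */
static bool parseY(void) {
    if (*look == TIMES) return match(TIMES) && parseT();
    return true;                     /* epsilon, same simplification as in parseX */
}

/* T -> ( E ) | int Y : the lookahead selects the production. */
static bool parseT(void) {
    switch (*look) {
    case OPEN: return match(OPEN) && parseE() && match(CLOSE);
    case INT:  return match(INT) && parseY();
    default:   return false;         /* blank table entry: error */
    }
}

/* E -> T X */
static bool parseE(void) { return parseT() && parseX(); }

/* Hypothetical driver: the token array must end with END. */
bool parse_predictive(const TOKEN *tokens) {
    look = tokens;
    return parseE() && *look == END;
}
```

With this sketch, INT TIMES INT END is accepted, because after left-factoring the decision between int and int * T is deferred to Y, where one token of lookahead is enough.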

LL(1) Parsing Tables: Errors
• Blank entries indicate error situations
• Consider the [E,*] entry
  – "There is no way to derive a string starting with * from non-terminal E"

Predictive Parsing
[Figure: model of a predictive parser. The predictive parsing program reads the input a + b $, keeps a stack holding X Y Z $, consults the parsing table M, and produces the output.]

Using Parsing Tables
• Method similar to recursive descent, except
  – For the leftmost non-terminal S
  – We look at the next input token a
  – And choose the production shown at [S,a]
• A stack records frontier of parse tree
  – Non-terminals that have yet to be expanded
  – Terminals that have yet to be matched against the input
  – Top of stack = leftmost pending terminal or non-terminal
• Reject on reaching error state
• Accept on end of input & empty stack

LL(1) Parsing Algorithm
    initialize stack = <S $> and next
    repeat
      case stack of
        <X, rest> : if T[X,*next] == Y1…Yn
                      then stack ← <Y1… Yn rest>;
                      else error ();
        <t, rest> : if t == *next++
                      then stack ← <rest>;
                      else error ();
    until stack == < >
• $ marks bottom of stack
• For non-terminal X on top of stack, lookup production
• Pop X, push production rhs on stack. Note leftmost symbol of rhs is on top of the stack.
• For terminal t on top of stack, check t matches next input token

LL(1) Parsing Example

    Stack          Input          Action
    E $            int * int $    T X
    T X $          int * int $    int Y
    int Y X $      int * int $    terminal
    Y X $          * int $        * T
    * T X $        * int $        terminal
    T X $          int $          int Y
    int Y X $      int $          terminal
    Y X $          $              ε
    X $            $              ε
    $              $              ACCEPT

Constructing Parsing Tables: The Intuition
• Consider non-terminal A, production A → α, & token t
• T[A,t] = α in two cases:
• If α →* t β
  – α can derive a t in the first position
  – We say that t ∈ First(α)
• If A → α and α →* ε and S →* β A t δ
  – Useful if stack has A, input is t, and A cannot derive t
  – In this case only option is to get rid of A (by deriving ε)
    • Can work only if t can follow A in at least one derivation
  – We say t ∈ Follow(A)

Computing First Sets
Definition
    First(X) = { t | X →* t α } ∪ { ε | X →* ε }
Algorithm sketch:
1. First(t) = { t }
2. ε ∈ First(X)
   • if X → ε
   • if X → A1 … An and ε ∈ First(Ai) for 1 ≤ i ≤ n
3. First(α) ⊆ First(X) if X → A1 … An α
   – and ε ∈ First(Ai) for 1 ≤ i ≤ n
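
The three rules of the algorithm sketch can be run as a fixed-point iteration. The following is a minimal sketch for the left-factored grammar E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε. The bitset representation, the EPS marker, the Production struct and the names compute_first and in_first are assumptions made for illustration, not part of the slides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Terminals, an epsilon marker, then non-terminals (all names are assumed). */
enum { INT, TIMES, PLUS, OPEN, CLOSE, EPS, NUM_TERMS,
       NT_E = NUM_TERMS, NT_X, NT_T, NT_Y, NUM_SYMS };

#define MAX_RHS 3
typedef struct { int lhs; int len; int rhs[MAX_RHS]; } Production;

static const Production grammar[] = {
    { NT_E, 2, { NT_T, NT_X } },          /* E -> T X     */
    { NT_X, 2, { PLUS, NT_E } },          /* X -> + E     */
    { NT_X, 0, { 0 } },                   /* X -> epsilon */
    { NT_T, 3, { OPEN, NT_E, CLOSE } },   /* T -> ( E )   */
    { NT_T, 2, { INT, NT_Y } },           /* T -> int Y   */
    { NT_Y, 2, { TIMES, NT_T } },         /* Y -> * T     */
    { NT_Y, 0, { 0 } },                   /* Y -> epsilon */
};
#define NUM_PRODS (sizeof grammar / sizeof grammar[0])
#define BIT(s) (1u << (s))

/* first[s] is a bitset: bit t set means t (a terminal or EPS) is in First(s). */
static unsigned first[NUM_SYMS];

void compute_first(void) {
    /* Rule 1: First(t) = { t } for every terminal t. */
    for (int s = 0; s < NUM_SYMS; ++s)
        first[s] = (s < EPS) ? BIT(s) : 0;

    bool changed = true;
    while (changed) {                     /* repeat until no set grows */
        changed = false;
        for (size_t p = 0; p < NUM_PRODS; ++p) {
            const Production *pr = &grammar[p];
            unsigned add = 0;
            int i = 0;
            /* Rule 3: add First of each rhs symbol while all earlier ones are nullable. */
            for (; i < pr->len; ++i) {
                add |= first[pr->rhs[i]] & ~BIT(EPS);
                if (!(first[pr->rhs[i]] & BIT(EPS))) break;
            }
            /* Rule 2: the rhs is empty or every rhs symbol is nullable. */
            if (i == pr->len) add |= BIT(EPS);
            if ((first[pr->lhs] | add) != first[pr->lhs]) {
                first[pr->lhs] |= add;
                changed = true;
            }
        }
    }
}

bool in_first(int sym, int t) { return (first[sym] & BIT(t)) != 0; }
```

After calling compute_first(), in_first(NT_E, INT) and in_first(NT_E, OPEN) hold, while in_first(NT_X, EPS) and in_first(NT_Y, EPS) record that X and Y can derive ε. A bitset was chosen only to keep the sketch short; any set representation would do.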
