Computer Science 332 Compiler Construction

4.4: Top-Down Parsing
(Skip Sections on Transition Diagrams and Error Recovery)

Top-Down Parsing

Goal:
  • Find a leftmost derivation for an input string, or
  • Construct a parse tree for the input starting from the root and creating nodes of the parse tree in preorder (parent, then children)

Discussed deterministic special case (predictive parsing) in 2.4.

General case is nondeterministic (backtracking). Of more theoretical than practical interest.

Recursive-Descent Parsing

Requires backtracking. Consider grammar

  S → cAd
  A → ab | a

Parse input string w = cad:

  1. Start with root S; input pointer at c.
  2. Expand S → cAd. Leaf c matches input c; advance to a.
  3. Expand A with first alternative, A → ab. Leaf a matches input a; advance to d.
  4. Leaf b does not match input d: FAIL. Back up to A and reset input pointer to a.
  5. Expand A with second alternative, A → a. Leaf a matches input a; advance to d.
  6. Leaf d matches input d: SUCCEED.

Nonrecursive Predictive Parsing

Maintain stack explicitly, instead of relying on run-time support for recursion.
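For comparison with the explicit-stack approach, the backtracking parse of w = cad from the recursive-descent example can be sketched in Python. This is only an illustrative sketch, not the course's implementation: the names GRAMMAR, derive, derive_sequence, and accepts are assumptions introduced here, and Python generators stand in for the "back up and try the next alternative" step.

```python
# Backtracking recursive descent for S -> cAd, A -> ab | a (illustrative sketch).
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],   # alternatives are tried in this order
}

def derive(symbol, text, pos):
    """Yield every input position reachable after matching `symbol` at `pos`."""
    if symbol not in GRAMMAR:                 # terminal: must match the input
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for body in GRAMMAR[symbol]:              # nonterminal: try each alternative
        yield from derive_sequence(body, text, pos)

def derive_sequence(symbols, text, pos):
    """Yield every position reachable after matching all of `symbols` in order."""
    if not symbols:
        yield pos
        return
    for mid in derive(symbols[0], text, pos):
        # If the rest fails, the loop resumes derive() and tries the next
        # alternative from the saved position: this is the backtracking.
        yield from derive_sequence(symbols[1:], text, mid)

def accepts(text):
    """True if the start symbol S derives the whole input."""
    return any(end == len(text) for end in derive("S", text, 0))
```

Calling accepts("cad") first tries A → ab, fails on b against d, and then succeeds with A → a, following the same order as the trace.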
Components:
  • Input buffer: w$
  • Stack: terminals and nonterminals
  • Parsing table: nonterminal × input symbol → production
  • Output stream: derivation

Nonrecursive Predictive Parsing

Table M determines action based on stack symbol X and input symbol a.

Initial stack is start symbol on top of $.

Possibilities are

  1. X = a = $: halt successfully
  2. X = a ≠ $: pop X and advance input pointer
  3. X = nonterminal: consult table entry M[X, a]. If empty, report error. Else pop X and push table entry.

Predictive Parsing Algorithm

  set input pointer ip to first symbol of w$
  repeat
      let X be the top stack symbol and a the symbol pointed to by ip
      if X is a terminal or $ then
          if X = a then
              pop X from the stack and advance ip
          else error()
      else /* X is a nonterminal */
          if M[X, a] = X → Y1 Y2 ... Yk then begin
              pop X from the stack
              push Yk, Yk-1, ..., Y1 onto the stack with Y1 on top /* order? */
              output the production X → Y1 Y2 ... Yk
          end
          else error()
  until X = $ /* stack is empty */

Predictive Parsing Example

Grammar (note elimination of left recursion):

  E  → T E'
  E' → + T E' | ε
  T  → F T'
  T' → * F T' | ε
  F  → ( E ) | id

Input: id + id * id

Table:

                Input Symbol
  Nonterminal   id          +            *            (           )         $
  E             E → T E'                              E → T E'
  E'                        E' → +T E'                            E' → ε    E' → ε
  T             T → F T'                              T → F T'
  T'                        T' → ε       T' → *F T'               T' → ε    T' → ε
  F             F → id                                F → ( E )

FIRST and FOLLOW

  • Recall FIRST from Chapter 2: FIRST(α) is set of terminals that begin strings derived from α.
  • Together with FOLLOW, helps us build parse table from grammar.
  • FOLLOW(A) is set of terminals a that can appear immediately to the right of A in some sentential form; i.e., a such that S ⇒* αAaβ.

COMPUTING FIRST

  1. If X is terminal, then FIRST(X) is {X}.
  2. If X → ε is a production, add ε to FIRST(X).
  3. If X → Y1 Y2 ... Yk is a production, place a in FIRST(X) if for some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1) ... FIRST(Yi-1); that is, Y1 ... Yi-1 ⇒* ε. If ε is in FIRST(Yj) for all j = 1, 2, ..., k, then add ε to FIRST(X). For example, everything in FIRST(Y1) is surely in FIRST(X). If Y1 does not derive ε, then we add nothing more to FIRST(X), but if Y1 ⇒* ε, then we add FIRST(Y2) and so on.

COMPUTING FOLLOW

  1. Place $ in FOLLOW(S), where S is the start symbol.
  2. If there is a production A → αBβ, then everything in FIRST(β) except for ε is placed in FOLLOW(B).
  3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε (i.e., β ⇒* ε), then everything in FOLLOW(A) is in FOLLOW(B).

Exercise: Compute FIRST, FOLLOW for nonterminals in grammar:

  E  → T E'
  E' → + T E' | ε
  T  → F T'
  T' → * F T' | ε
  F  → ( E ) | id

Construction of Predictive Parse Tables

Input: Grammar G
Output: Parsing table M

  1. For each production A → α of the grammar, do steps 2 and 3.
  2. For each terminal a in FIRST(α), add A → α to M[A, a].
  3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].
  4. Make each undefined entry of M be error.

LL(1) Grammars

Ambiguous grammars will have more than one entry M[A, a] for some nonterminal A, terminal a.

E.g., ambiguous if / then / else grammar:

  S  → iEtSS' | a
  S' → eS | ε
  E  → b

This grammar produces a table M containing entry M[S', e] = {S' → ε, S' → eS} (because FOLLOW(S') = {e, $}).

LL(1) Grammars

A grammar without such duplicate entries is called LL(1).

  • First L means "read input Left to right".
  • Second L means "build Leftmost derivation".
  • 1 means one symbol of lookahead in input to make decisions.

No ambiguous or left-recursive grammar can be LL(1).

So what to do when M has multiply-defined entries?

Can try to make G LL(1) by eliminating left recursion, and left factoring the result; this may produce an LL(1) grammar.

Won't work for some grammars, like our if / then / else example.
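The FIRST and FOLLOW rules, the table construction, and the predictive parsing loop can be tried out on the expression grammar with a short Python sketch. It is not a reference implementation: the names GRAMMAR, first_of, compute_first, compute_follow, build_table, and parse are assumptions made for illustration, ε-productions are written as empty lists, and the input is assumed to be already split into tokens.

```python
EPS, END = "ε", "$"

# Expression grammar from the example; [] stands for an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
START = "E"

def first_of(symbols, first):
    """FIRST of a string of grammar symbols (FIRST rule 3, applied left to right)."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}   # rule 1 for terminals
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)            # every symbol derives ε, or the string is empty
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:             # repeat until no set grows
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)                        # FOLLOW rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of(body[i + 1:], first)
                    new = rest - {EPS}            # FOLLOW rule 2
                    if EPS in rest:
                        new |= follow[head]       # FOLLOW rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def build_table(first, follow):
    """Map (nonterminal, terminal) to a list of bodies; missing keys mean error."""
    table = {}
    for head, bodies in GRAMMAR.items():
        for body in bodies:
            body_first = first_of(body, first)
            lookaheads = body_first - {EPS}       # table step 2
            if EPS in body_first:
                lookaheads |= follow[head]        # table step 3 ($ included)
            for a in lookaheads:
                table.setdefault((head, a), []).append(body)
    return table

def parse(tokens, table):
    """Return the productions used, in order, or raise SyntaxError."""
    tokens = tokens + [END]
    stack, ip, output = [END, START], 0, []
    while stack:
        top, a = stack.pop(), tokens[ip]
        if top not in GRAMMAR:                    # terminal or $
            if top != a:
                raise SyntaxError(f"expected {top!r}, got {a!r}")
            ip += 1
        else:
            entries = table.get((top, a))
            if not entries:
                raise SyntaxError(f"no entry M[{top}, {a}]")
            body = entries[0]
            stack.extend(reversed(body))          # push Yk ... Y1, so Y1 is on top
            output.append((top, body))
    return output
```

For example, with first = compute_first(), follow = compute_follow(first), and table = build_table(first, follow), the call parse(["id", "+", "id", "*", "id"], table) returns the productions of a leftmost derivation. A table entry holding more than one body signals a multiply-defined entry; this sketch simply takes the first one.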
For grammars like the if / then / else example, we may be able to eliminate all but one of the multiple entries; e.g., change M[S', e] = {S' → ε, S' → eS} to M[S', e] = S' → eS.

But this must be done on a case-by-case basis; there are no universal rules.

More technically: Grammar G is LL(1) iff for A → α | β,

  1. For no terminal a do both α and β derive strings beginning with a.
  2. At most one of α and β can derive the empty string.
  3. If β ⇒* ε, then α does not derive any string beginning with a terminal in FOLLOW(A).
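The three LL(1) conditions can be checked mechanically for one pair of alternatives once FIRST(α), FIRST(β), and FOLLOW(A) are known. The Python sketch below is only an illustration: the function name ll1_violations and the set-based representation are assumptions, condition 3 is also applied with α and β swapped, and the FIRST sets passed in the usage note are worked out by hand (FOLLOW(S') = {e, $} is the value given above).

```python
EPS = "ε"

def ll1_violations(first_alpha, first_beta, follow_a):
    """Return the numbers of the LL(1) conditions violated by A -> α | β.

    Arguments are FIRST(α), FIRST(β), and FOLLOW(A), each given as a set.
    """
    alpha_terms = first_alpha - {EPS}
    beta_terms = first_beta - {EPS}
    violations = []
    # 1. No terminal may begin strings derived from both α and β.
    if alpha_terms & beta_terms:
        violations.append(1)
    # 2. At most one of α and β may derive the empty string.
    if EPS in first_alpha and EPS in first_beta:
        violations.append(2)
    # 3. If one alternative derives ε, the other must not begin with a
    #    terminal in FOLLOW(A) (checked in both directions).
    if (EPS in first_beta and alpha_terms & follow_a) or \
       (EPS in first_alpha and beta_terms & follow_a):
        violations.append(3)
    return violations
```

For S' → eS | ε, the call ll1_violations({"e"}, {"ε"}, {"e", "$"}) reports condition 3, which corresponds to the doubly-defined entry M[S', e]. For E' → + T E' | ε with FOLLOW(E') = {")", "$"}, the same check reports no violations.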
