Lecture 11: Parsing Parsing Arithmetic Grammar Recursive Descent Parser

Lecture 11: Parsing Parsing Arithmetic Grammar Recursive Descent Parser

Parsing Lecture 11: Parsing • Build parse tree from an expression! CSC 131! Interested in abstract syntax tree! Fall, 2014! • ! - drops irrelevant details from parse tree Kim Bruce Arithmetic grammar Recursive Descent Parser Base recognizer (ignore building tree now) on productions:! <exp> ::= <exp> <addop> <term> ! <exp> ::= <exp> <addop> <term>! | <term> ! ! <term> ::= <term> <mulop> <factor> ! addop (fst:rest) = if fst==’+’ or fst==’-‘ then rest! | <factor> ! else error ...! <factor> ::= ( <exp> ) ! ! | NUM ! exp input = let! | ID ! inputAfterExp = exp input! <addop> ::= + | -! inputAfterAddop = addOp inputAfterExp! <mulop> ::= * | / rest = term inputAfterAddop! in ! rest or Look at parse tree & abstract syntax tree for 2 * 3 + 7 fun exp input = term(addOp(exp input)); Problems Rewrite Grammar <exp> ::= <term> <termTail> (1)! <termTail> ::= <addop> <term> <termTail> (2)! | ε (3)! <term> ::= <factor> <factorTail> (4)! <factorTail> ::= <mulop> <factor> <factorTail> (5)! • How do we select which production to use | ε (6)! when alternatives?! <factor> ::= ( <exp> ) (7)! | NUM (8)! Left-recursive - never terminates • | ID (9)! <addop> ::= + | - (10)! <mulop> ::= * | / (11) No lef recursion" How do we know which production to take? Predictive Parsing FIRST Goal: a1a2...an! • Intuition: b ∈ First(X) i% there is a derivation ! X →* bω for some ω.! S → α! 1.First(b) = b for b a terminal or the empty string! ...! 2.If have X ::= ω1 | ω2 | ... | ωn then → a1a2Xβ First(X) = First(ω1) ∪ ... ∪ First(ωn) Want next terminal character derived to be a3! 3.For any right hand side u1u2...un! ! a3 in First(γ) - First(u1) ⊆ First(u1u2...un) ! Need to apply a production X ::= γ where ! if all of u , u ..., u can derive the empty string then 1) γ can eventually derive a string starting with a3 or! - 1 2 i-1 2) If X can derive the empty string, and also ! also First(ui) ⊆ First(u1u2...un) if β can derive a string starting with a3. - empty string is in First(u1u2...un) i% all of u1, u2..., un can derive the empty string a3 in Folow(X) First for Arithmetic Follow • Intuition: A terminal b ∈ Follow(X) i% there is a FIRST(<addop>) = & +, - ' ! derivation S →* vXbω for some v and ω.! FIRST(<mulop>) = & *, / ' ! 1.If S is the start symbol then put EOF ∈ Follow(S) FIRST(<factor>) = & (, NUM, ID ' ! 2.For all rules of the form A ::= wXv,! FIRST(<term>) = & (, NUM, ID ' ! a.Add all elements of First(v) to Follow(X) FIRST(<exp>) = & (, NUM, ID ' ! If v can derive the empty string then add all elts of FIRST(<termTail>) = & +, -, ε ' ! b. Follow(A) to Follow(X)" FIRST(<factorTail>) = & *, /, ε ' • Follow(X) only used if can derive empty string from X. Follow for ArithmeticOnly needed to Predictive Parsing, redux calculate for <termTail>, Goal: a1a2...an! FOLLOW(<exp>) = & EOF, ) ' ! <factorTail> ! ! FOLLOW(<termTail>) = FOLLOW(<exp>) = & EOF, ) ' ! S → α! FOLLOW(<term>) = FIRST(<termTail>) ∪ ! ...! FOLLOW(<exp>) ∪ FOLLOW(<termTail>) ! → a1a2Xβ = & +, -, EOF, ) ' ! Want next terminal character derived to be a FOLLOW(<factorTail>) = & +, -, EOF, ) ' ! 3! ! FOLLOW(<factor>) = & *, /, +, -, EOF ( " Need to apply a production X ::= γ where ! FOLLOW(<addop>) = & (, NUM, ID ( Not needed!" 1) γ can eventually derive a string starting with a3 or! FOLLOW(<mulop>) = & (, NUM, ID ( ' 2) If X can derive the empty string, then see ! if β can derive a string starting with a3. Building Table Need Unambiguous • Put X ::= α in entry (X,a) if either! • No table entry should have more than one production to ensure it’s unambiguous, as otherwise we - a in First(α), or! don’t know which rule to apply.! - e in First(α) and a in Follow(X) • Laws of predictive parsing:! • Consequence: X ::= α in entry (X,a) i% there is - If A ::= α1 | ...| αn then for all i() j, a derivation s.t. applying production can First(αi) ∩ First(αj) = ∅. ! eventually lead to string starting with a. - If X →* ε, then First(X) ∩ Follow(X) = ∅. See ArithParse.hs Non- ID NUM Addop Mulop ( ) EOF • Laws of predictive parsing:! terminals <exp> 1 1 1 - If A ::= α1 | ...| αn then for all i() j, First(αi) ∩ First(αj) = ∅. ! <termTail> 2 3 3 - If X →* ε, then First(X) ∩ Follow(X) = ∅.! <term> 4 4 4 • 2nd is OK for arithmetic:! <factTail> 6 5 6 6 - FIRST(<termTail>) = & +, -, " ' ! ' <factor> 9 8 7 - FOLLOW(<termTail>) = & EOF, ) ' ! no overlap! <addop> 10 - FIRST(<factorTail>) = & *, /, " '! <mulop> 11 - FOLLOW(<factorTail>) = & +, -, EOF, ) ' ' Read o) fom table which production to apply! Table-Driven Stack-based Parser • http://en.wikipedia.org/wiki/LL_parser! • Start with “S $” on stack and “input $” to be recognized.! Alternatives to Recursive Descent • Use table to replace non-terminals on top of Parsers stack.! • If terminal on top of stack matches next input then erase both and proceed.! • Success if end up clearing stack and input! • Show with ID * (NUM + NUM)$ Another alternative More Options • LR(1) parsers -- bottom up, gives right-most derivation. Also stack-based.! Parser Combinators! • YACC is LR(1). ANTLR is LL(1).! • - Domain specific language for parsing.! k in LL(k) and LR(k) indicates how many • - Even easier to tie to grammar than recursive descent! letters of look ahead are necessary -- e.g. length of strings in columns of table.! - Build into Haskell and Scala, definable elsewhere! • Talk about when cover Scala • Compiler writers are happiest with k=1 to avoid exponential blow-up of table. May have to rewrite grammars..

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    5 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us