Lecture 11: Parsing • Build parse tree from an expression CSC 131 Interested in abstract syntax tree Fall, 2014 • - drops irrelevant details from parse tree Kim Bruce
Arithmetic grammar Recursive Descent Parser Base recognizer (ignore building tree now) on productions:
Predictive Parsing FIRST
Goal: a1a2...an • Intuition: b ∈ First(X) iff there is a derivation X →* bω for some ω. S → α 1.First(b) = b for b a terminal or the empty string ... 2.If have X ::= ω1 | ω2 | ... | ωn then → a1a2Xβ First(X) = First(ω1) ∪ ... ∪ First(ωn) Want next terminal character derived to be a3 3.For any right hand side u1u2...un a3 in First(γ) - First(u1) ⊆ First(u1u2...un) Need to apply a production X ::= γ where if all of u , u ..., u can derive the empty string then 1) γ can eventually derive a string starting with a3 or - 1 2 i-1 2) If X can derive the empty string, and also also First(ui) ⊆ First(u1u2...un) if β can derive a string starting with a3. - empty string is in First(u1u2...un) iff all of u1, u2..., un can derive the empty string a3 in Folow(X) First for Arithmetic Follow
• Intuition: A terminal b ∈ Follow(X) iff there is a FIRST(
Follow for ArithmeticOnly needed to Predictive Parsing, redux calculate for
• Put X ::= α in entry (X,a) if either • No table entry should have more than one production to ensure it’s unambiguous, as otherwise we - a in First(α), or don’t know which rule to apply. - e in First(α) and a in Follow(X) • Laws of predictive parsing: • Consequence: X ::= α in entry (X,a) iff there is - If A ::= α1 | ...| αn then for all i̸≠ j, a derivation s.t. applying production can First(αi) ∩ First(αj) = ∅. eventually lead to string starting with a. - If X →* ε, then First(X) ∩ Follow(X) = ∅.
See ArithParse.hs
Non- ID NUM Addop Mulop ( ) EOF • Laws of predictive parsing: terminals
Another alternative More Options
• LR(1) parsers -- bottom up, gives right-most derivation. Also stack-based. Parser Combinators • YACC is LR(1). ANTLR is LL(1). • - Domain specific language for parsing. k in LL(k) and LR(k) indicates how many • - Even easier to tie to grammar than recursive descent letters of look ahead are necessary -- e.g. length of strings in columns of table. - Build into Haskell and Scala, definable elsewhere • Talk about when cover Scala • Compiler writers are happiest with k=1 to avoid exponential blow-up of table. May have to rewrite grammars.