LL and LR Parsing Lecture 6

February 5, 2018
Compiler Construction

Context-free Grammars

A context-free grammar consists of
  • A set of non-terminals N
      • Written in uppercase throughout these notes
  • A set of terminals T comprised of tokens
      • Lowercase or punctuation throughout these notes
  • A start symbol S (a non-terminal)
  • A set of productions (rewrite rules)

Assuming E ∈ N, each production has the form
    E → ε, or
    E → Y1 Y2 ... Yn where Yi ∈ N ∪ T

Context-free?

Production rules hint at expressiveness!

    Regular              A → aB, C → ε
    Context-free         A → α
    Context-sensitive    αAβ → αγβ
    Type-0               α → β

    α, β, γ ∈ {N ∪ T}*

“What just happened? We must be missing some context...”

Parsing and Context-free Grammars
  • Lexical Analysis
      • Regular Expressions specify a Regular Language containing strings of characters (lexemes) that correspond to a token
  • Parsing
      • Context-free Grammars specify a Context-free Language containing strings of tokens that correspond to a grammatical rule (production)

Generativeness
  • Regular expressions and context-free grammars are generative
  • You can generate every string in the language using the regex or grammar!

Generating Strings
  • Consider regex: ab*a
      • You can generate aa, aba, abba, abbba, ...
  • Consider context-free grammar: E → (E)E | ε
      • You can generate ε, (), (()), (())(), ...
  • Generating strings with a grammar can be thought of as creating a parse tree!

Language membership
  • We care about whether an input string of tokens is syntactically correct (e.g., obeys our language’s grammar)
  • So far, we have looked at theoretical implications of grammars

    L(G) = { a1...an | S →* a1...an }

For an input string x, is x ∈ L(G)?

Parsing part 1: We need a yes/no answer!

Language membership

    S → a B | b C
    B → b b C
    C → c c

What strings are in this language? (Hint: there are only two!)

If my input string is “dabc”, we ask: can the grammar generate this string?
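One way to explore the membership question for the small grammar S → a B | b C, B → b b C, C → c c is to enumerate everything the grammar can generate and then look the input up. The sketch below is only an illustration under stated assumptions: the names `GRAMMAR`, `generate`, and `is_member` and the `limit` safety bound are hypothetical, each symbol is a single character, and exhaustive enumeration only terminates on its own for finite languages like this one. It is not how a real parser decides membership.

```python
from collections import deque

# Grammar from the slide: uppercase keys are non-terminals,
# lowercase characters are terminals.
GRAMMAR = {
    "S": ["aB", "bC"],
    "B": ["bbC"],
    "C": ["cc"],
}


def generate(grammar, start="S", limit=1000):
    """Enumerate terminal strings by breadth-first leftmost derivation.

    `limit` (an assumption of this sketch) caps the number of sentential
    forms explored, because most grammars generate infinitely many strings.
    """
    language = set()
    queue = deque([start])
    explored = 0
    while queue and explored < limit:
        form = queue.popleft()
        explored += 1
        # Locate the leftmost non-terminal, if there is one.
        idx = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if idx is None:
            language.add(form)  # only terminals left: a string of L(G)
            continue
        # Apply every production of that non-terminal.
        for rhs in grammar[form[idx]]:
            queue.append(form[:idx] + rhs + form[idx + 1:])
    return language


def is_member(x, grammar=GRAMMAR):
    """Yes/no answer to: is x in L(G)?"""
    return x in generate(grammar)
```

Calling `is_member("dabc")` gives the yes/no answer for the input string discussed above, and `generate(GRAMMAR)` lists the strings of the language.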
(No)
  • N.B. it doesn’t matter how from a theoretical perspective; that’s the job of the parsing algorithm!

Parsing Algorithms
  • LL (top down)
      • Reads input from left to right and uses leftmost derivations to construct a parse tree
  • LR (bottom up)
      • Reads input from left to right and constructs a rightmost derivation in reverse to build a parse tree
  • Both algorithms are driven by the input grammar and the input to be parsed.

Parsing Algorithm Intuition
  • You start with a sequence of tokens, t1 t2 t3 t4 t5
  • and also a grammar!
  • Two general approaches to constructing the parse tree
      • top-down parsing is when you predict the grammatical rule used to produce the tokens seen so far
      • bottom-up parsing is when you consider tokens one at a time until you match a grammatical rule

Top Down Parsing

    S → a B c
    B → C x B | ε
    C → d

Input string: “adxdxc”

The parse tree grows from the root S, expanding the leftmost non-terminal at each step:

    S ⇒ a B c
      ⇒ a C x B c        (B → C x B)
      ⇒ a d x B c        (C → d)
      ⇒ a d x C x B c    (B → C x B)
      ⇒ a d x d x B c    (C → d)
      ⇒ a d x d x c      (B → ε)

Bottom-up Parsing

Same grammar, same input string: “adxdxc”

Tokens right now, after each step:

    a
    ad
    aC          (C → d)
    aCx
    aCxd
    aCxC        (C → d)
    aCxCx
    aCxCxε
    aCxCxB      (B → ε)
    aCxB        (B → C x B)
    aB          (B → C x B)
    aBc
    S           (S → a B c)

LL(k) parsing

An LL parser reads tokens from left to right and constructs a top-down leftmost derivation. LL(k) parsing predicts which production rule to use from k tokens of lookahead. LL(1) parsing is a special case using one token of lookahead.
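To make one-token lookahead concrete, here is a minimal sketch of a predictive top-down parser for the example grammar S → a B c, B → C x B | ε, C → d. The function names, the `'$'` end-of-input marker, and the choice to return the list of productions used are all assumptions of this sketch rather than anything prescribed by the lecture.

```python
def parse(tokens):
    """Parse with one token of lookahead.

    Returns the productions used, in leftmost-derivation order,
    or raises SyntaxError if the input is not in the language.
    """
    toks = list(tokens) + ["$"]  # '$' marks end of input (sketch assumption)
    pos = 0
    used = []

    def peek():
        return toks[pos]

    def match(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(
                f"expected {expected!r}, got {peek()!r} at position {pos}"
            )
        pos += 1

    def S():
        used.append("S -> a B c")
        match("a")
        B()
        match("c")

    def B():
        if peek() == "d":
            # 'd' can only begin C, so predict B -> C x B.
            used.append("B -> C x B")
            C()
            match("x")
            B()
        else:
            # Any other lookahead: predict B -> ε and consume nothing.
            used.append("B -> ε")

    def C():
        used.append("C -> d")
        match("d")

    S()
    match("$")  # the whole input must be consumed
    return used
```

For the input "adxdxc" the returned productions follow the same order as the top-down walkthrough: the lookahead `d` selects B → C x B twice, and the lookahead `c` selects B → ε.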
LL(1) parsing is fast and easy, but does not work if the grammar is ambiguous, left-recursive, or not left-factored.

General LL(1) Algorithm
  • Process 1 token at a time
  • Consider a ‘current’ non-terminal symbol, start with S
  • While input is not empty
      • Given the next token t and ‘current’ non-terminal N, choose a rule R of the form N → α
      • For each element X in rule R from left to right
          • If X is a non-terminal, ‘expand’ X by recursing! Set ‘current’ to X and consider the same token t.
          • If X is a terminal, check whether t matches X. If it matches, consume t from the input and loop.
  • Note the need for particular types of grammars! What if we have a rule S → Sα?

Recursive Descent Parsing
  • Recursive Descent Parsing can parse LL(k) grammars with backtracking
  • We can use RDP to parse LL(1) grammars by recursing through the rules of the grammar based upon the next available token
  • Intuition: Construct mutually-recursive functions that consume tokens according to the grammar rules!
  • TL;DR “Try all productions exhaustively, backtrack”

Recursive Descent Parsing

    E → T + E | T
    T → (E) | int | int * T

Input: int * int

1. Try E0 → T1 + E2
2. Try T1 → (E3)
  • Nope! token ‘int’ does not match ‘(’ in T1 → (E3)
3. Try T1 → int. Match!
  • But the next token ‘*’ does not match ‘+’ from E0
4. Try T1 → int * T2
  • Matches ‘int’, but ‘+’ from E0 remains unmatched
5. Exhausted choices for T1, so we backtrack to E0

Recursive Descent Parsing (2)

6.
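The "try all productions exhaustively, backtrack" idea for the grammar E → T + E | T, T → (E) | int | int * T can be sketched with Python generators, where each function yields every position at which its non-terminal could end. This is an illustrative sketch under assumptions: the names `parse_E`, `parse_T`, and `accepts` are hypothetical, the input is assumed to be already lexed into a list of token strings, and the code only answers yes/no rather than building a parse tree.

```python
def parse_E(toks, i):
    """Yield every index where an E starting at index i can end."""
    # First alternative: E -> T + E
    for j in parse_T(toks, i):
        if j < len(toks) and toks[j] == "+":
            yield from parse_E(toks, j + 1)
    # Second alternative (reached on backtracking): E -> T
    yield from parse_T(toks, i)


def parse_T(toks, i):
    """Yield every index where a T starting at index i can end."""
    # T -> ( E )
    if i < len(toks) and toks[i] == "(":
        for j in parse_E(toks, i + 1):
            if j < len(toks) and toks[j] == ")":
                yield j + 1
    if i < len(toks) and toks[i] == "int":
        # T -> int
        yield i + 1
        # T -> int * T
        if i + 1 < len(toks) and toks[i + 1] == "*":
            yield from parse_T(toks, i + 2)


def accepts(toks):
    """True if some sequence of production choices consumes all tokens."""
    return any(end == len(toks) for end in parse_E(toks, 0))
```

With the input `["int", "*", "int"]`, the first alternative E → T + E fails for every choice of T because no `+` follows, mirroring steps 1 to 5 above, and the search then falls back to E → T. The productions are tried in the same order as on the slide. Generators keep the untried alternatives suspended, so backtracking happens implicitly when the caller asks for the next candidate.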
