CS345H: Programming Languages
Lecture 5: Introduction to Parsing

Thomas Dillig

Outline
  • Limitations of Regular Languages
  • Parser Overview
  • Context-free Grammars (CFGs)
  • Derivations
  • Ambiguity

Regular Languages
  • Last time, we saw that regular languages are very useful for partitioning input into tokens
  • But regular languages are not expressive enough to turn a stream of tokens into structure
  • For this, we need a more expressive formal language

Beyond Regular Languages
  • Many languages are not regular
  • Classic example: strings of balanced parentheses: $\{\, (^i\, )^i \mid i \geq 0 \,\}$
  • Question: Why is there no finite automaton that can recognize this language?

What Can Regular Languages Express?
  • Languages requiring counting modulo a fixed integer
  • Intuition: A finite automaton that runs long enough must repeat states
  • A finite automaton cannot remember the number of times it has visited a particular state

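The contrast between the two slides above can be sketched in Python. This is an illustration only, not code from the lecture, and the function names are hypothetical. `count_mod` simulates an automaton with exactly k states, which is enough to count modulo a fixed k. `is_balanced_block` recognizes strings of the form (^i )^i, but it relies on an integer counter that can grow without bound, which no fixed set of states can provide.

```python
def count_mod(s: str, symbol: str, k: int) -> int:
    """Simulate a k-state automaton: the state is the count of `symbol` modulo k."""
    state = 0
    for ch in s:
        if ch == symbol:
            state = (state + 1) % k  # only k distinct states are ever needed
    return state


def is_balanced_block(s: str) -> bool:
    """Recognize i opening parentheses followed by i closing ones (i >= 0)."""
    depth = 0            # unbounded counter: this is not a finite state
    seen_close = False
    for ch in s:
        if ch == "(":
            if seen_close:   # "(" after ")" is outside this language
                return False
            depth += 1
        elif ch == ")":
            seen_close = True
            depth -= 1
            if depth < 0:    # more ")" than "(" so far
                return False
        else:
            return False     # any other character is rejected
    return depth == 0
```

For any fixed number of states n, an input with more than n opening parentheses forces the automaton to repeat a state, at which point it can no longer tell two different depths apart. The counter in `is_balanced_block` avoids this only because Python integers are unbounded.
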
Side Note: Comments in L
  • Recall: Comments in L start with (*, end with *), and can be nested
  • Also recall: Comments are removed during lexing
  • Question: Are comments in L a regular language?

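Nested comments need the same kind of unbounded counting as balanced parentheses. One common way to handle them is a depth counter kept outside the regular-expression machinery. The sketch below assumes comments are delimited by `(*` and `*)` as on the slide. The function name `strip_comments` is hypothetical, and the sketch ignores string literals and other details a real lexer for L would need.

```python
def strip_comments(src: str) -> str:
    """Remove (* ... *) comments, which may nest, from src."""
    out = []
    depth = 0   # current comment nesting depth
    i = 0
    while i < len(src):
        pair = src[i:i + 2]
        if pair == "(*":
            depth += 1
            i += 2
        elif pair == "*)" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:          # keep text only outside comments
                out.append(src[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated comment")
    return "".join(out)
```

Calling `strip_comments("a (* b (* c *) d *) e")` keeps only the text at depth zero. The inner `*)` closes the inner comment rather than the outer one, which is exactly the behavior a single regular expression cannot express for arbitrary nesting depth.
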
The Functionality of the Parser
  • Input: sequence of tokens from the lexer
  • Output: parse tree of the program

Example
  • Consider the following L expression: if x<>y then 1 else 2
  • Parser Input: TOKEN_IF TOKEN_ID("x") TOKEN_NEQ TOKEN_ID("y") TOKEN_THEN TOKEN_INT(1) TOKEN_ELSE TOKEN_INT(2)
  • Parser Output:

        Branch
          NEQ
            ID:x
            ID:y
          INT:1
          INT:2

Parsing vs. Lexing

  Phase    Input                  Output
  Lexer    String of characters   String of tokens
  Parser   String of tokens       Parse tree

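The token-to-tree step for the `if x<>y then 1 else 2` example can be sketched as a tiny recursive-descent parser in Python. Everything here is an assumption made for illustration: the lecture has not yet introduced a parsing algorithm, tokens are modeled as `(kind, payload)` tuples, `Node` is a hypothetical tree type, and only the constructs that appear in the example are handled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Token = Tuple[str, Optional[object]]  # (kind, payload); assumed representation


@dataclass
class Node:
    label: str
    children: List["Node"]


def peek(tokens: List[Token], pos: int) -> Token:
    """Return the token at pos, or a synthetic EOF token past the end."""
    return tokens[pos] if pos < len(tokens) else ("EOF", None)


def expect(tokens: List[Token], pos: int, kind: str) -> int:
    """Check that the token at pos has the given kind and step past it."""
    if peek(tokens, pos)[0] != kind:
        raise SyntaxError(f"expected {kind} at position {pos}")
    return pos + 1


def parse_atom(tokens: List[Token], pos: int) -> Tuple[Node, int]:
    kind, value = peek(tokens, pos)
    if kind in ("ID", "INT"):
        return Node(f"{kind}:{value}", []), pos + 1
    raise SyntaxError(f"unexpected {kind} at position {pos}")


def parse_expr(tokens: List[Token], pos: int) -> Tuple[Node, int]:
    if peek(tokens, pos)[0] == "IF":
        cond, pos = parse_expr(tokens, pos + 1)
        pos = expect(tokens, pos, "THEN")
        then_branch, pos = parse_expr(tokens, pos)
        pos = expect(tokens, pos, "ELSE")
        else_branch, pos = parse_expr(tokens, pos)
        return Node("Branch", [cond, then_branch, else_branch]), pos
    left, pos = parse_atom(tokens, pos)
    if peek(tokens, pos)[0] == "NEQ":
        right, pos = parse_atom(tokens, pos + 1)
        return Node("NEQ", [left, right]), pos
    return left, pos


def parse(tokens: List[Token]) -> Node:
    """Parse a whole token list; reject leftover tokens."""
    tree, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree
```

Given the token sequence from the slide, `parse` builds a `Branch` node whose children are the `NEQ` comparison and the two integer leaves. A sequence such as `IF THEN` raises `SyntaxError`, so the same function also separates valid from invalid token strings.
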
The Role of the Parser
  • Not all strings of tokens are programs.
  • Parser must distinguish between valid and invalid strings of tokens
  • We need:
      • A language for describing valid strings of tokens
      • A method for recognizing if a string of tokens is in this language or not

Context-free Grammars (CFGs)
  • Programming language constructs have recursive structure
  • Example: An L expression is expression + expression, if expression then expression else expression, ...
  • Context-free grammars are a natural notation for this recursive structure

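The recursive structure described above can be written down as data. The Python sketch below encodes the two expression forms from the slide as productions for a nonterminal `E`. The `id` and `int` productions are assumptions added so that derivations can terminate, since the slide elides the remaining cases. `derive_leftmost` is a hypothetical helper that rewrites the leftmost nonterminal at each step.

```python
from typing import Dict, List

# Productions for E; "id" and "int" are assumed terminals, not from the slide.
GRAMMAR: Dict[str, List[List[str]]] = {
    "E": [
        ["E", "+", "E"],
        ["if", "E", "then", "E", "else", "E"],
        ["id"],
        ["int"],
    ],
}


def derive_leftmost(start: str, choices: List[int]) -> List[List[str]]:
    """Apply one production per choice to the leftmost nonterminal.

    Returns every sentential form, starting with [start].
    """
    form = [start]
    steps = [form[:]]
    for choice in choices:
        idx = next((i for i, sym in enumerate(form) if sym in GRAMMAR), None)
        if idx is None:
            raise ValueError("no nonterminal left to expand")
        form = form[:idx] + GRAMMAR[form[idx]][choice] + form[idx + 1:]
        steps.append(form[:])
    return steps
```

For example, `derive_leftmost("E", [1, 2, 3, 3])` first expands `E` into the if-then-else form and then replaces the three inner `E` symbols from left to right, ending in the token string `if id then int else int`.
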
