Context Free Grammars

https://courses.missouristate.edu/anthonyclark/333/

Some notes adapted from Professor GianLuigi Ferrari at University of Pisa

Outline

Topics and Learning Objectives
• Formal grammar theory
• Context free grammars

Assessments
• Context free grammars

Formal Grammars

[Figure: interpreter pipeline. Picture from “Crafting Interpreters”]

Source Code (Plain Text)
• Self-evident

    // GCD Program (in C)
    int main() {
        int i = getint(), j = getint();
        while (i != j) {
            if (i > j) i = i - j;
            else j = j - i;
        }
        putint(i);
    }

Lexer/Scanner (Regular Expressions and Finite State Machines)
• Group characters into smallest meaningful units: Lexemes/Tokens
• You’ve already started this

    int main ( ) { int i = getint ( ) , j = getint ( ) ; while ( i != j ) { if ( i > j ) i = i - j ; else j = j - i ; } putint ( i ) ; }

Parser (Push-Down Automata)
• Put tokens in a tree-like data structure that represents the semantics of the program
• Assignments 5 through 7
• We’ll create our own simple calculator language

Evaluate/Interpret
• “Walk” the tree and evaluate the program given optional input from outside of the program (User input / File Input / Sockets / Etc.)
• Produces the Result
• Assignment 8
• Tree-walking is pretty straightforward
• We’ll build our own representation

[Figure: Interpreter components. I/O and Console; Lexer (read); Parser (request token / send token); Tree Walker (request AST / send AST)]

Chomsky Hierarchy
• Type-0: Turing machine
• Type-1: Linear bounded automaton
• Type-2: Pushdown automaton
• Type-3: Finite state automaton

Scanning vs Parsing
• Regular expressions for regular languages recognized by lexers
• Context free grammars for context-free languages recognized by parsers

REs cannot “count”
• Cannot balance parentheses
• Cannot balance if then else expressions
• Etc.

Abstract Syntax Tree

[Figure: Lexemes/Tokens and the tree built from them. Example from: Writing An Interpreter In Go]
• We now care about syntax!
• No parentheses, semicolons, braces, etc.
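The lexer stage described above can be sketched with regular expressions. This is a minimal illustration, not the course's lexer: the token categories and the `tokenize` name are assumptions made for this example, and only the handful of C lexemes in the GCD program are covered.

```python
import re

# Hypothetical token categories for this sketch; a real C lexer has many more.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"!=|==|<=|>=|[-+*/<>=]"),
    ("PUNCT", r"[(){},;]"),
    ("SKIP", r"\s+"),
]
# One alternation with a named group per category.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return the lexemes of `source`, dropping whitespace and comments."""
    lexemes = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # No category matches here: an invalid lexeme.
            raise SyntaxError(f"invalid lexeme at offset {pos}: {source[pos]!r}")
        if match.lastgroup not in ("SKIP", "COMMENT"):
            lexemes.append(match.group())
        pos = match.end()
    return lexemes


print(tokenize("while (i != j) { if (i > j) i = i - j; else j = j - i; }"))
```

Each pattern here is an ordinary regular expression, which is enough to split characters into lexemes. Nothing in it checks that the braces or parentheses balance; that job is left to the parser.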
Grammar

A grammar is a tool for describing a language
A grammar is a set of rules (productions) for creating valid strings

    grammar English;
    sentence   : subject verbPhrase object;
    subject    : 'This' | 'Computers' | 'I';
    verbPhrase : adverb verb | verb;
    adverb     : 'never';
    verb       : 'is' | 'run' | 'am' | 'tell';
    object     : 'the' noun | 'a' noun | noun;
    noun       : 'university' | 'world' | 'cheese' | 'lies';

Generating Strings

We can use the grammar to generate strings
• Start with the top-level rule: sentence
• Replace each rule name on the RHS with one of its alternatives until only terminals remain
• For example: This is a university

    sentence ⇒ subject verbPhrase object ⇒ This verbPhrase object ⇒ This verb object ⇒ This is object ⇒ This is a noun ⇒ This is a university

Syntax vs Semantics
• You can also create syntactically valid strings that do not make sense semantically
  • “Computers run cheese”
  • “This am a lies”
• These are valid sentences, but they do not have any real meaning
• The same can be true for our programming language rules
• We’ll worry about semantics later.
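Because the English grammar above has no recursion, every sentence it generates can be listed. The sketch below does that in Python; the dictionary encoding and the `expand` helper are assumptions made for this example, not part of the slides.

```python
from itertools import product

# The English grammar from the slides: rule name -> list of alternatives.
# Each alternative is a list of symbols; a symbol that is not a key is a terminal.
GRAMMAR = {
    "sentence": [["subject", "verbPhrase", "object"]],
    "subject": [["This"], ["Computers"], ["I"]],
    "verbPhrase": [["adverb", "verb"], ["verb"]],
    "adverb": [["never"]],
    "verb": [["is"], ["run"], ["am"], ["tell"]],
    "object": [["the", "noun"], ["a", "noun"], ["noun"]],
    "noun": [["university"], ["world"], ["cheese"], ["lies"]],
}


def expand(symbol):
    """Return every list of terminals derivable from `symbol`.

    Only safe for grammars without recursion, such as this one.
    """
    if symbol not in GRAMMAR:
        return [[symbol]]  # a terminal stands for itself
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the choices.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for piece in combo for word in piece])
    return results


sentences = {" ".join(words) for words in expand("sentence")}
print("This is a university" in sentences)   # syntactically valid and meaningful
print("Computers run cheese" in sentences)   # syntactically valid, semantically odd
```

Both lookups succeed, because the grammar only constrains the order of the words and says nothing about what they mean.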
    def f(): return "hi"
    float y = f() + 5;

Error types

Invalid lexemes (all languages will catch this problem early)
    var x = 5 @ "6"    # '@' is not a valid operator in most languages

Valid lexemes, invalid syntax (all languages will catch this problem early)
    x var= 5 5 *       # all valid lexemes, but in the wrong order

Valid lexemes, valid syntax, invalid semantics (catch at compile time or runtime)
    var x = 5 * "6"    # many languages will not multiply an integer and a string

Valid lexemes, valid syntax, valid semantics
    var x = 5 * 6

Formal Grammars

Grammar        a set of rules for creating valid strings
Nonterminal    a grammar symbol that can be replaced by a sequence of symbols
Terminal       a word in the language (cannot be replaced with something else)
Production     a single rule in the grammar (X → Y1 Y2 …)
Derivation     a sequence of rule applications that produces a valid string
Start Symbol   the rule used to start all derivations
Null Symbol    the ε symbol is used to say that a nonterminal can be replaced with nothing

Example

Write a regular expression for the following regular language.
At least 1 zero followed by at least 1 one

    0 0* 1 1*

Now write a grammar for the same language

Example
• Define a regular expression where you have n 0’s followed by n 1’s
• Define a context-free grammar where you have some number of 0’s followed by the same number of 1’s

Activity

1. Write three strings that can be generated by the following grammar.

    S → 1S | 0T | ε
    T → 1T | 0S

   What does this recognize?
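One way to explore the Activity 1 grammar is to turn each nonterminal into a function that tries its productions. The sketch below is a hand-written recognizer under that assumption; the function names are invented for this example. Try it on your own strings before deciding what language the grammar describes.

```python
def derives_from_s(text):
    """True if `text` can be derived from S, where S -> 1S | 0T | ε."""
    if text == "":
        return True  # S -> ε
    if text[0] == "1":
        return derives_from_s(text[1:])  # S -> 1S
    if text[0] == "0":
        return derives_from_t(text[1:])  # S -> 0T
    return False  # any other character is not a terminal of this grammar


def derives_from_t(text):
    """True if `text` can be derived from T, where T -> 1T | 0S."""
    if text == "":
        return False  # T has no ε alternative
    if text[0] == "1":
        return derives_from_t(text[1:])  # T -> 1T
    if text[0] == "0":
        return derives_from_s(text[1:])  # T -> 0S
    return False


for candidate in ["", "1", "0", "00", "100", "0101", "01"]:
    print(repr(candidate), derives_from_s(candidate))
```

Every production in this grammar places its single nonterminal at the right end, so the recognizer consumes one character per step and never needs to backtrack.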
Grammar        rules for creating valid strings of a language
Nonterminal    can be replaced by a sequence of symbols
Terminal       a word in the language
Production     a single rule in the grammar
Derivation     a sequence of rule applications that produces a valid string
Start Symbol   the rule used to start all derivations
Null Symbol    the ε symbol, a nonterminal can be replaced with nothing
Language       set of all strings that can be derived from a grammar

Example derivation:

    S ⇒ 1S ⇒ 11S ⇒ 110T ⇒ 1101T ⇒ 11010S ⇒ 11010

Strings that can be generated:
• 1
• “”
• 0101
• 1111
• 100
• 00
• …

2. Write a grammar for palindromes where your terminals are the symbols ‘a’ and ‘b’.

3. Write a grammar for representing all strings that start with x number of a’s, followed by y number of b’s, followed by z number of a’s, where y = x + z.
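A candidate grammar for activities 2 and 3 can be checked by listing the short strings it generates and comparing them with what you expect. The sketch below is one possible way to do that, shown on the Activity 1 grammar so the exercises stay open. The `generate` helper and the single-character encoding of symbols are assumptions made for this example.

```python
from collections import deque

# Grammar from Activity 1. Keys are nonterminals; "" stands for ε.
ACTIVITY_1 = {
    "S": ["1S", "0T", ""],
    "T": ["1T", "0S"],
}


def generate(grammar, start, max_length):
    """Return all terminal strings of length <= max_length derivable from `start`.

    Assumes no rule can grow a form without adding terminals (for example
    S -> SS), otherwise the search below would not terminate.
    """
    found = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal in the sentential form.
        index = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if index is None:
            found.add(form)  # only terminals remain
            continue
        for rhs in grammar[form[index]]:
            new_form = form[:index] + rhs + form[index + 1:]
            terminals = sum(1 for ch in new_form if ch not in grammar)
            if terminals <= max_length and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(found, key=lambda s: (len(s), s))


print(generate(ACTIVITY_1, "S", 2))  # ['', '1', '00', '11']
```

To test your own answer, replace `ACTIVITY_1` with a dictionary for your grammar, using one character per symbol, and look for strings that should not be there or that are missing.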
