Introduction to Grammars and Parsing Techniques

Introduction to Grammars and Parsing Techniques

Introduction to Grammars and Parsing Techniques Paul Klint Introduction to Grammars and Parsing Techniques 1 Grammars and Languages are one of the most established areas of Computer Science N. Chomsky, Aspects of the theory of syntax, 1965 A.V. Aho & J.D. Ullman, The Theory of Parsing, Translation and Compilimg, Parts I + II, 1972 A.V. Aho, R. Sethi, J.D. Ullman, Compiler, Principles, Techniques and Tools, 1986 D. Grune, C. Jacobs, Parsing Techniques, A Practical Guide, 2008 Why are Grammars and Parsing Techniques relevant? ● A grammar is a formal method to describe a (textual) language ● Programming languages: C, Java, C#, JavaScript ● Domain-specific languages: BibTex, Mathematica ● Data formats: log files, protocol data ● Parsing: ● Tests whether a text conforms to a grammar ● Turns a correct text into a parse tree Introduction to Grammars and Parsing Techniques 7 How to define a grammar? ● Simplistic solution: finite set of acceptable sentences ● Problem: what to do with infinite languages? ● Realistic solution: finite recipe that describes all acceptable sentences ● A grammar is a finite description of a possibly infinite set of acceptable sentences Introduction to Grammars and Parsing Techniques 8 Example: Tom, Dick and Harry ● Suppose we want describe a language that contains the following legal sentences: ● Tom ● Tom and Dick ● Tom, Dick and Harry ● Tom, Harry, Tom and Dick ● ... ● How do we find a finite recipe for this? Introduction to Grammars and Parsing Techniques 9 The Tom, Dick and Harry Grammar ● Name -> tom Non-terminals: Name, Sentence, List, End ● Name -> dick ● Name -> harry Terminals: tom, dick, harry, and, , ● Sentence -> Name ● Sentence -> List End Start Symbol: ● List -> Name Sentence ● List -> List , Name ● , Name End -> and Name Introduction to Grammars and Parsing Techniques 10 Variations in Notation ● Name -> tom | dick | harry ● <Name> ::= “tom” | “dick” | “harry” ● “tom” | “dick” | “harry” -> Name Introduction to Grammars and Parsing Techniques 11 Chomsky’s Grammar Hierarchy ● Type-0: Recursively Enumerable ● Rules: α -> β (unrestricted) ● Type-1: Context-sensitive ● Rules: αAβ -> αγβ ● Type-2: Context-free ● Rules: A -> γ ● Type-3: Regular ● Rules: A -> a and A -> aB Introduction to Grammars and Parsing Techniques 12 Context-free Grammar for TDH ● Name -> tom | dick | harry ● Sentence -> Name | List and Name ● List -> Name , List | Name Introduction to Grammars and Parsing Techniques 13 In practice ... ● Regular grammars used for lexical syntax: ● Keywords: if, then, while ● Constants: 123, 3.14, “a string” ● Comments: /* a comment */ ● Context-free grammars used for structured and nested concepts: ● Class declaration ● If statement Introduction to Grammars and Parsing Techniques 14 A sentence position := initial + rate * 60 p o s i t i o n : = i n i t i a l + r a t e * 6 0 Introduction to Grammars and Parsing Techniques 15 Lexical syntax ● Regular expressions define lexical syntax: ● Literal characters: a,b,c,1,2,3 ● Character classes: [a-z], [0-9] ● Operators: sequence (space), repetition (* or +), option (?) ● Examples: ● Identifier: [a-z][a-z0-9]* ● Number: [0-9]+ ● Floating constant: [0-9]*.[0-9]*(e-[0-9]+) Introduction to Grammars and Parsing Techniques 16 Lexical syntax ● Regular expressions can be implemented with a finite automaton ● Consider [a-z][a-z0-9]* [a-z0-9]* [a-z] Start End Introduction to Grammars and Parsing Techniques 17 Lexical Tokens Identifier Identifier Identifier Number position := initial + rate ** 60 p o s i t i o n : = i n i t i a l + r a t e * 6 0 Introduction to Grammars and Parsing Techniques 18 Context-free syntax ● Expression -> Identifier ● Expression -> Number ● Expression -> Expression + Expression ● Expression -> Expression * Expression ● Statement -> Identifier := Expression Introduction to Grammars and Parsing Techniques 19 Parse Tree Statement Expression Expression Expression Expressionexpression Expression Identifieridentifier Identifier Identifier Number position := initial + rate ** 60 p o s i t i o n : = i n i t i a l + r a t e * 6 0 Introduction to Grammars and Parsing Techniques 20 Ambiguity: one sentence, but several trees Expression Expression Expression Expression Number Number Number Number Number Number 1 + 2 * 3 1 + 2 * 3 Introduction to Grammars and Parsing Techniques 21 Two solutions ● Add priorities to the grammar: ● * > + ● Rewrite the grammar: ● Expression -> Expression + Term ● Expression -> Term ● Term -> Term * Primary ● Term -> Primary ● Primary -> Number ● Primary -> Identifier Introduction to Grammars and Parsing Techniques 22 Unambiguous Parse Tree Expression Expression Term Term Term Primary Primary Primary Number Number Number 1 + 2 * 3 Introduction to Grammars and Parsing Techniques 23 Some Grammar Transformations ● Left recursive production: ● A -> A α | β ● Example: Exp -> Exp + Term | Term ● Left recursive productions lead to loops in some kinds of parsers (recursive descent) ● Removal: ● A -> β R ● R -> α R | ε (ε is the empty string) Introduction to Grammars and Parsing Techniques 24 Some Grammar Transformations ● Left factoring: ● S -> if E then S else S | if E then S ● For some parsers it is better to factor out the common parts: ● S -> if E then S P ● P -> else S | ε Introduction to Grammars and Parsing Techniques 25 A Recognizer Grammar l -> r 1 1 l -> r 2 2 ... l -> r Source text n n Recognizer Yes or No Introduction to Grammars and Parsing Techniques 26 A Parser Grammar l -> r 1 1 l -> r 2 2 ... l -> r Source text n n Parse tree or Errors Parser Introduction to Grammars and Parsing Techniques 27 General Approaches to Parsing ● Top-Down (Predictive) ● Each non-terminal is a goal ● Replace each goal by subgoals (= elements of rule) ● Parse tree is built from top to bottom ● Bottom-Up ● Recognize terminals ● Replace terminals by non-terminals ● Replace terminals and non-terminals by left-hand side of rule Introduction to Grammars and Parsing Techniques 28 Top-down versus Bottom-up Top-down Bottom-up Expression Expression Expression Expression Number Number Number Number Number Number 4 * 5 + 6 1 + 2 * 3 Introduction to Grammars and Parsing Techniques 29 How to get a parser? ● Write parser manually ● Generate parser from grammar Introduction to Grammars and Parsing Techniques 30 Example ● Given grammar: ● Expr -> Expr + Term ● Expr -> Expr – Term ● Expr -> Term ● Term -> [0-9] ● Remove left recursion: ● Expr -> Term Rest ● Rest -> + Term Rest | - Term Rest | ε ● Term -> [0-9] Introduction to Grammars and Parsing Techniques 31 Recursive Descent Parser Expr(){ Term(); Rest(); } Rest(){ if(lookahead == '+'){ Match('+'); Term(); Rest(); } else if( lookahead == '-'){ Match('-'); Term(); Rest(); } else ; } Term(){ if(isdigit(lookahead)){ Match(lookahead); } else Error(); } Introduction to Grammars and Parsing Techniques 32 A Trickier Case: Backtracking ● Example ● S -> aSbS | aS | c ● Naive approach (input ac): ● Try aSbS, but this fails hence error ● Backtracking approach: ● First try aSbS but this fails ● Go back to initial input position and try aS, this succeeds. Introduction to Grammars and Parsing Techniques 33 Automatic Parser Generation Syntax of L Parser Generator L Parser L text tree for L Introduction to Grammars and Parsing Techniques 34 Some Parser Generators ● Bottom-up ● Yacc/Bison, LALR(1) ● CUP, LALR(1) ● SDF, SGLR ● Top-down: ● ANTLR, LL(k) ● JavaCC, LL(k) ● Rascal ● Except Rascal and SDF, all depend on a scanner generator Introduction to Grammars and Parsing Techniques 35 Assessment parser implementation ● Manual parser construction + Good error recovery + Flexible combination of parsing and actions - A lot of work ● Parser generators + May save a lot of work - Complex and rigid frameworks - Rigid actions - Error recovery more difficult Introduction to Grammars and Parsing Techniques 36 Further Reading ● http://en.wikipedia.org/wiki/Chomsky_hierarchy ● D. Grune & C.J.H. Jacobs, Parsing Techniques: A Practical Guide, Second Edition, Springer, 2008 Introduction to Grammars and Parsing Techniques 37 Further Reading ● http://en.wikipedia.org/wiki/Chomsky_hierarchy ● Paul Klint, Quick introduction to Syntax Analysis, http://www.meta-environment.org/ doc/books//syntax/syntax-analysis/syntax- analysis.html ● D. Grune & C.J.H. Jacobs, Parsing Techniques: A Practical Guide, Second Edition, Springer, 2008 Introduction to Grammars and Parsing Techniques 38.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    38 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us