1 2
Introduction to Programming Lecture 9: Describing the Syntax Michela Pedroni
16 November 2003
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
Why describe the syntax formally? 3 Syntax: Conditional 4
We know syntax descriptions for human languages: A conditional instruction consists, in order, of: An “If part”, of the form if condition. e.g. grammar for German, French, … A “Then part” of the form then compound. Zero or more “Else if parts”, each of the form Expressed in natural language elseif condition then compound. Zero orone“Else part”of theform else compound Good enough for human use The keyword end.
Ambiguous, like human languages themselves Here each condition is a boolean expression, and each compound is a compound instruction.
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
Why describe the syntax formally? 5 Why describe the syntax formally? 6
Programming languages need better descriptions: Compilers use algorithms to
More precise: must tell us unambiguously Determine if input is correct program text whether given program text is legal or not Analyse program text to extract specimens Use formalism similar to mathematics Translate program text to machine instructions Can be fed into compilers for automatic processing of programs Compilers need strict formal definition of programming language
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
1 Formal Description of Syntax 7 History 8
1954 FORTRAN: First widely recognized Use formal language to describe programming programming language (developed by John Backus languages. et Al.) 1958 ALGOL 58: Joint work of European and American groups 1960 ALGOL 60: Preparation showed a need for a Languages used to describe other languages are formal description Æ John Backus (member of called Meta-Languages ALGOL team) proposed Backus-Normal-Form (BNF) 1964: Donald Knuth suggested acknowledging Peter Naur for his contribution Æ Backus-Naur- Meta-Language used to describe Eiffel: Form BNF-E (Variant of the Backus-Naur-Form, BNF) Many variants since then, e.g. graphical variant by Niklaus Wirth
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
Formal description of a language 9 Formal Description of Syntax 10
A language is a set of phrases BNF lets us describe syntactical properties of a language A phrase is a finite sequence of tokens from a certain “vocabulary” Remember: Description of a programming language also includes lexical and semantic properties Æ other tools Not every possible sequence is a phrase of the language
A grammar specifies which sequences are phrases and which are not
BNF is used to define a grammar for a programming language
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
Grammar 11 Elements of a grammar: Terminals 12
Definition A Grammar for a language is a finite set of rules for producing phrases, such that: Terminals 1. Any sequence obtained by a finite number of applications of rules from the grammar is a phrase Tokens of the language that are not defined by of the language. a production of the grammar. E.g. keywords from Eiffel such as if, then, end 2. Any phrase of the language can be obtained by a or symbols such as the semicolon “;”or the finite number of applications of rules from the assignment “:=” grammar.
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
2 Elements of a grammar: Nonterminals 13 Elements of a grammar: Productions 14
Nonterminals Productions
Names of syntactical structures or Rules that define nonterminals of the grammar substructures used to build phrases. using a combination of terminals and (other) nonterminals
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
An example production 15 BNF Elements: Concatenation 16
Conditional: Graphical representation: else Instruction A B if Condition then Instruction end BNF: A B Terminal
Nonterminal Meaning: A followed by B
Production
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
BNF Elements: Optional 17 BNF Elements: Choice 18
Graphical representation: Graphical representation: A A
B
BNF: [ A ] BNF: A | B Meaning: A or nothing Meaning: either A or B
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
3 BNF Elements: Repetition 19 BNF Elements: Repetition, once or more 20
Graphical representation: Graphical representation:
A
A
BNF: { A }* BNF: { A }+
Meaning: sequence of zero or more A Meaning: sequence of one or more A
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
BNF elements: Overview 21 BNF Elements Combined 22
Concatenation: AB A B Conditional: else instruction Optional: [ A ] A
A if condition then instruction end Choice: A | B B written in BNF: A Repetition (zero or more): { A }* Conditional ,
Repetition (at least once): { A }+ if condition then instruction [ else instruction ] end A
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
BNF: Conditional with elseif 23 Different Grammar for Conditional 24
Conditional , If_part Then_part Else_list end Conditional , if Then_part_list []Else_part end If_part , if Boolean_expression Then_part_list , Then_part { elseif Then_part }* Then_part , then Compound Then_part , Boolean_expression then Compound Else_list , {]Elseif_part }* [ else Compound
Else_part , else Compound Elseif_part , elseif Boolean_expression Then_part
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
4 BNF elements: Overview 25 BNF-E 26
Concatenation: AB A B Used in official description of Eiffel. Every Production is one of Optional: [ A ] A Concatenation A A , BC[ D ] Choice: A | B B Choice A A , B | C | D Repetition (zero or more): { A }* A , [ B { terminal B }* ] Repetition A , { B terminal ... }* Repetition (at least once): { A }+ A
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
BNF-E Rules 27 Conditional with elseif (BNF) 28
Conditional , if Then_part_list []Else_part end Every nonterminal must appear on the left-hand side of exactly one production, called its Then_part_list , Then_part { elseif Then_part }* defining production
Every production must be of one kind: Then_part , Boolean_expression then Compound Concatenation, Choice or Repetition
Else_part , else Compound
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
BNF-E: Conditional 29 Recursive grammars 30
Conditional , if Then_part_list []Else_part end Constructs may be nested
+ Then_part_list , { Then_part elseif ... } Express this in BNF with recursive grammars
Then_part , Boolean_expression then Compound Recursion: circular dependency of productions Else_part , else Compound
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
5 Recursive grammars 31 Recursive grammars 32
Conditionals can be nested within conditionals: Production name can be used in its own definition
Definition of Then_part_list with repetition: Else_part , else Compound Then_part_list , { Then_part elseif …}*
* Compound , { Instruction ; …} Recursive definition of Then_part_list:
Then_part_list Then_part [ elseif Then_part_list ] Instruction , Conditional |||Loop Call ... ,
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
Guidelines for Grammars 33 Guidelines for Grammars 34
Keep productions short. Treat lexical constructs like terminals Identifiers easier to read Constant values better assessment of language size
Identifier , Letter (Letter | Digit | "_")* Conditional , Integer_constant , Digit+ if Boolean_expression then Compound Floating_point , [-] Digit*“." Digit+ { elseif Boolean_expression then Compound }* [ else Compound ] end Letter , "A" | "B" | ... | "Z" | "a" | ... | "z" Digit , "0" | "1" | ... | "9“
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
Guidelines for Grammars 35 Writing a Parser 36
Use unambiguous productions. One feature per Production Applicable production can be found by looking at one lexical element at a time Concatenation: Sequence of feature calls for Nonterminals, checks for Terminals Conditional , if Then_part_list [ Else_part ] end Choice: Compound , { Instruction }* Conditional with Compound per alternative Instruction , Conditional | Loop | Call | ... Repetition: Loop
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
6 Writing a Parser: EiffelParse 37 Writing a Parser: Tools 38
Automatic generation of abstract syntax tree for Yooc phrase Translates BNF-E to EiffelParse classes Based on BNF-E Yacc / Bison One class per production Translates BNF to C parser Classes inherit from predefined classes AGGREGATE, CHOICE, REPETITION, TERMINAL
Feature production defines Production
Chair of Software Engineering Intro – Lecture 9 Chair of Software Engineering Intro – Lecture 9
39
End lecture 9
Chair of Software Engineering Intro – Lecture 9
7