Specifying Syntax Components of a Grammar 1

Specifying Syntax Components of a Grammar 1

Specifying Syntax Components of a Grammar 1. Terminal symbols or terminals, Σ Language Specification 2. Nonterminal symbols or syntactic Syntax categories, N Form of phrases Vocabulary = Σ ∪ N Physical arrangement of symbols Semantics 3. Productions or rules, α ::= β, Meaning of phrases where α,β∈ (Σ ∪ N)* Result of executing a program 4 .Start Symbol (a nonterminal) Pragmatics Usage of the language Features of implementations Chapter 1 1 Chapter 1 2 Types of Grammars An English Grammar Type 0: Unrestricted grammars <sentence> ::= <noun phrase> <verb phrase> . α ::= β α <noun phrase> ::= <determiner> <noun> where contains a nonterminal | <determiner> <noun> <prepositional phrase> <verb phrase> ::= <verb> | <verb> <noun phrase> Type 1: Context-sensitive grammars | <verb> <noun phrase> <prepositional phrase> αBγ ::= αβγ <prepositional phrase> ::= <preposition> <noun phrase> Type 2: Context-free grammars <noun> ::= boy | girl | cat | telescope A ::= string of vocabulary symbols | song | feather <determiner> ::= a | the Type 3: Regular grammars A ::= a or A ::= aB <verb> ::= saw | touched | surprised | sang <preposition> ::= by | with Chapter 1 3 Chapter 1 4 A Derivation (for Figure 1.4) Concrete Syntax for Wren <sentence> ⇒ <noun phrase> <verb phrase> . ⇒ <determiner> <noun> <verb phrase> . Phrase-structure syntax: ⇒ the <noun> <verb phrase> . ⇒ the girl <verb phrase> . <program> ::= program <ident> is <block> ⇒ the girl <verb> <noun phrase> . <block> ::= <declaration seq> begin <cmd seq> end ⇒ the girl touched <noun phrase> . ε ⇒ the girl touched <determiner> <declaration seq> ::= | <declaration> <declaration seq> <noun> <prep phrase> . ⇒ the girl touched the <noun> <prep phrase> . <declaration> ::= var <var list> : <type> ; ⇒ the girl touched the cat <prep phrase> . <type> ::= integer | boolean ⇒ the girl touched the cat <preposition> <var list> ::= <var> | <var> , <var list> <noun phrase> . <cmd seq> ::= <cmd> | <cmd> ; <cmd seq> ⇒ the girl touched the cat with <noun phrase> . <cmd> ::= <var> := <expr> | skip ⇒ the girl touched the cat with <determiner> <noun> . | read <var> | write <int expr> ⇒ the girl touched the cat with a <noun> . | while <bool expr> do <cmd seq> end while ⇒ the girl touched the cat with a feather . | if <bool expr> then <cmd seq> end if | if <bool expr> then <cmd seq> Note example of a context-sensitive grammar. else <cmd seq> end if Chapter 1 5 Chapter 1 6 Lexical syntax: <expr> ::= <int expr> | <bool expr> <relation> ::= <= | < | = | > | >= | <> <integer expr> ::= <term> <weak op> ::= + | – <strong op> ::= * | / | <int expr> <weak op> <term> <ident> ::= <letter> | <ident> <letter> <term> ::= <element> | <ident> <digit> | <term> <strong op> <element> <letter> ::= a | b | c | d | e | f | g | h | i | j | k | l | m <element> ::= <numeral> | <var> | n | o | p | q | r | s | t | u | v | w | x | y | z | ( <int expr> ) | – <element> <numeral> ::= <digit> | <digit> <numeral> <bool expr> ::= <bool term> | <bool expr> or <bool term> <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 <bool term> ::= <bool element> | <bool term> and <bool element> <bool element> ::= true | false | <var> Tokens: | <comparison> | not ( <bool expr> ) | ( <bool expr> ) <token> ::= <identifier> | <numeral> <comparison> ::= <int expr> <relation> <int expr> | <reserved word> | <relation> | <weak op> | <strong op> <var> ::= <ident> | := | ( | ) | , | ; | : <reserved word> ::= program | is | begin | end | var | integer | boolean | read | write | skip | while | do | if | then | else | and | or | true | false | not Chapter 1 7 Chapter 1 8 Sample Wren Program Ambiguity program prime is var num,divisor : integer; One phrase has two different derivation trees, thus two different syntactic structures. var done : boolean; begin Ambiguous structure may lead to ambiguous read num; semantics. while num>0 do Dangling “else” divisor := 2; done := false; Associativity of expressions (ex 2, page 16) while divisor<= num/2 and not(done) do Ada: done := num = divisor*(num/divisor); Is “a(2)” a subscripted variable divisor := divisor+1 or a function call? end while; if done then write 0 Classifying Errors in Wren else write num Context-free syntax end if; — specified by concrete syntax read num • lexical syntax end while • phrase-structure syntax end Context-sensitive syntax (context constraints) • also called “static semantics” Semantic or dynamic errors Chapter 1 9 Chapter 1 10 Context Constraints in Wren Syntax of Wren 1. The program name identifier may not be declared elsewhere in the program. <token>* — abitrary strings of tokens 2. All identifiers that appear in a block must be <program> — strings derived from declared in that block. the context-free grammar 3. No identifier may be declared more than once Syntactically well formed in a block. Wren programs derived from <program> and also satisfying the context constraints 4. The identifier on the left side of an assignment command must be declared as a variable, and the expression on the right must be of the same type. 5. An identifier occurring as an (integer) Semantic Errors in Wren element must be an integer variable. 1. An attempt is made to divide by zero. 6. An identifier occurring as a Boolean element 2. A variable that has not been initialized is must be a Boolean variable. accessed. 3. A read command is executed when input file 7. An identifier occurring in a read command empty. must be an integer variable. 4. An iteration command (while) does not terminate. Chapter 1 11 Chapter 1 12 Regular Expressions Denote Languages Nonessential Ambiguity ∅ Command Sequences: ε <cmd seq> ::= <command> a for each a∈Σ | <command> ; <cmd seq> (E | F) union (E • F) concatenation <cmd seq> ::= <command> | <cmd seq> ; <command> (E*) Kleene closure May omit some parentheses following the <cmd seq> ::= <command> ( ; <command> )* precedence rules Or include command sequences with Abbreviations commands: E1 E2 = E1 • E2 <command> ::= <command> ; <command> | skip | read <var> | … E+ = E • E* En = E • E • … • E, repeated n times E? = ε | E Language Processing: scan : Character* → Token* → Note: E* = ε | E1 | E2 | E3 | … parse : Token* ???? Chapter 1 13 Chapter 1 14 Concrete Syntax (BNF) read m; if m=-9 then m:=0 end if; write 22*m Derivation Tree • Specifies the physical form of programs. <cmd seq> • Guides the parser in determining the phrase structure of a program. <command> ; <cmd seq> read <variable> • Should not be ambiguous. <command> ; <cmd seq> var(m) <command> • A derivation proceeds by replacing one if <bool expr> then <cmd seq> end if nonterminal at a time by its corresponding write <int expr> right-hand side in some production for that <bool term> <command> nonterminal. <term> <bool elem> <variable> := <expr> <term> <str op> <elem> • A derivation following a BNF definition of a <comparison> var(m) <int expr> language can be summarized in a derivation <element> * var(m) (parse) tree. <int expr> <rel> <int expr> <term> num(22) <term> <term> • Although a parser may not produce a = <element> derivation tree, the structure of the tree is <element> <element> num(0) embodied in the parsing process. var(m) - <element> num(9) Chapter 1 15 Chapter 1 16 Abstract Syntax read m; if m=-9 then m:=0 end if; write 22*m • Representation of the structure of language constructs as determined by the parser. Abstract Syntax Tree: commands • The hierarchical structure of language constructs can be represented as abstract syntax trees. read if write var(m) times equal • The goal of abstract syntax is to reduce commands num(22) var(m) language constructs to their “essential” assign var(m) minus properties. num(9) var(m) num(0) • Specifying the abstract syntax of a language means providing the possible templates for the OR various constructs of the language. if write expr • Abstract syntax generally allows ambiguity expr since parsing has already been done. timesvar(m) num(22) equal var(m) minus • Semantic specifications based on abstract num(9) syntax are much simpler. Chapter 1 17 Chapter 1 18 Assume we only have to deal with programs that Designing Abstract Syntax are syntactically correct. What basic forms can an expression take? Consider concrete syntax for expressions: <int expr> <weak op> <term> <expr> ::= <int expr> | <bool expr> <term> <strong op> <element> <integer expr> ::= <term> <numeral> <variable> | <int expr> <weak op> <term> – <element> <term> ::= <element> | <term> <strong op> <element> <bool expr> or <bool term> <element> ::= <numeral> | <var> <bool term> and <bool element> | ( <int expr> ) | – <element> true false <bool expr> ::= <bool term> not ( <bool expr> ) | <bool expr> or <bool term> <int expr> <relation> <int expr> <bool term> ::= <bool element> | <bool term> and <bool element> Abstract Syntax: <bool element> ::= true | false | <var> Expression ::= Numeral | Identifier | true | false | <comparison> | ( <bool expr> ) | Expression Operator Expression | not ( <bool expr> ) | – Expression | not(Expression) <comparison> ::= Operator ::= + | – | * | / | or | and <int expr> <relation> <int expr> | <= | < | = | > | >= | <> Chapter 1 19 Chapter 1 20 Abstract Syntax Abstract Syntactic Categories Alternative Abstract Syntax Program Type Operator Block Command Numeral Abstract Production Rules Declaration Expression Identifier Program ::= prog (Identifier, Block) Abstract Production Rules Block ::= block (Declaration*, Command+) Program ::= program Identifier is Block Declaration ::= dec (Identifier+, Type) Block ::= Declaration* begin Command+ end Type ::= integer | boolean Declaration ::= var Identifier+ :

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    6 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us