Grammars Context-Free Languages

Unit 7
Context-free Grammars, Context-free Languages
Reading: Sipser, chapter 2.1; Hopcroft et al., chapter 5

Grammar
• A DFA/NFA describes a regular language in a computational way: given a word, we run it on the FA and check whether it stops in an accepting state.
• A regular expression is another method; it describes a language in a syntactic way.
• A grammar is a set of rules that also describes a language in a syntactic way: we start from the start variable and form the word according to the grammar rules until we have the desired output (parsing).

The Origin of Grammar
• The name grammar for this computational model comes from natural languages, where a grammar is a collection of rules.
• This collection defines what is legal in the language and what is not.

Example of a grammar
• Symbols: Σ = {a, b}; Variables: V = {S, B}
• The following grammar generates a*b*:
  S → aS   (S is the starting variable)
  S → B
  B → Bb
  B → ε
• How can we generate ab, aab, ε?

The Grammars Formalism
A grammar is composed of:
1. Terminals Σ = symbols of the alphabet of the language being defined.
2. Variables V = a finite set of other symbols, each of which represents a language.
3. Start symbol S ∈ V = the variable whose language is the one being defined.
4. A collection of production rules.

Common Notations
• Terminals: lower case, beginning of the alphabet (a, b, c).
• Variables: upper case, beginning of the alphabet (A, B, C).
• Strings of terminals: lower case, end of the alphabet (x, y, u, v, w).
• Mixed strings (terminals + variables): lower case Greek letters (α, β, γ).
• Starting variable: S

Production Rules
• A production rule has the form: α → β
• It means:
  – α can be replaced by β
  – α constructs β
  – α produces β

Example
• Symbols: Σ = {0, 1}; Variables: V = {S}
• The language: L = {0^n 1^n | n > 0}
  S → 0S1   (S is the starting variable)
  S → 01

Another Example
• Σ = {a, b, #}
• The following grammar generates {a^n b^m # b^m a^n | n, m ≥ 0}:
  S → aSa
  S → B
  B → bBb
  B → #

Derivation of a word
1. Write down the start variable.
2. Find a variable A that is written down and a rule A → α.
3. Replace the variable A with the string α.
4. Repeat steps 2 and 3 until no variables remain.

⇒ notation
• We use the notation "⇒" to represent an actual derivation: α ⇒ β
• It means: the string β was derived from α using a production rule.
• We can derive αAβ ⇒ αγβ if A → γ is a production rule.
• Example: S → 01; S → 0S1.
  S ⇒ 0S1 ⇒ 00S11 ⇒ 000111

Production of w=aacb
  1. S → aS
  2. S → bS
  3. S → cS
  4. S → ε
• When a variable has several production rules, they can all be written in one line:
  S → aS | bS | cS | ε
• How can the word w=aacb be produced? (The number above each arrow is the rule applied.)
  S ⇒(1) aS ⇒(1) aaS ⇒(3) aacS ⇒(2) aacbS ⇒(4) aacb

Parsing / Parsing Tree
• Producing a word according to a given grammar is called parsing.
• We can represent the same production sequence by a parsing tree.
• Each node in the tree is either a variable or a terminal.
• A terminal node is a leaf.
• The resulting word is the concatenation of the labels of the leaves in left-to-right order (preorder traversal).
• This is called the yield of the parsing tree.

Parsing Tree of w=aacb
[Figure: parsing tree of w=aacb. The root S has children a and S; the nested S nodes expand in turn to a S, c S, b S, and finally ε.]

Parsing Tree of w=aacb (cont.)
[Figure: the same tree built as a step-by-step derivation, adding one production at a time.]

Context-Free Grammar (CFG)
A context-free grammar (CFG) G is a 4-tuple G = (V, Σ, S, R), where
1. V is a finite set called the variables.
2. Σ is a finite set, disjoint from V, called the terminals.
3. S ∈ V is the start symbol.
4. R is a finite set of production rules of the form A → α, where A ∈ V and α ∈ (V ∪ Σ)*.

Derivation in CFG
• Let α, β and γ be strings of variables and terminals.
• If A → γ is a rule in the grammar, we say that αAβ derives αγβ, written αAβ ⇒ αγβ.
• We write x ⇒* y if x = y or there exists a sequence x1, x2, ..., xk, k ≥ 0, such that x ⇒ x1 ⇒ x2 ⇒ ... ⇒ xk ⇒ y.
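
The derivation relation can be tried out mechanically. The following Python sketch is only an illustration, not part of the unit: the encoding (one character per symbol, uppercase letters as variables, the empty string for ε) and the names GRAMMAR, one_step and derives are assumptions made for this example. The search is bounded by a maximum length, which is only safe for grammars whose rules never shrink a string, such as the grammar with rules S → 0S1 and S → 01.

```python
from collections import deque

# Assumed encoding: a variable is one uppercase character, a terminal is any
# other character, and "" stands for the empty word.
GRAMMAR = {"S": ["0S1", "01"]}


def one_step(form, grammar):
    """Return every string y such that form derives y in one step."""
    results = set()
    for i, symbol in enumerate(form):
        # Only variables have entries in the grammar; terminals are skipped.
        for rhs in grammar.get(symbol, []):
            results.add(form[:i] + rhs + form[i + 1:])
    return results


def derives(start, target, grammar, max_len):
    """Breadth-first search for a derivation of target from start.

    Strings longer than max_len are pruned, so the answer is only
    reliable when no rule makes a string shorter (assumption).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for nxt in one_step(form, grammar):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For example, one_step("0S1", GRAMMAR) contains both 00S11 and 0011, and derives("S", "000111", GRAMMAR, 6) follows the chain S, 0S1, 00S11, 000111.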
• ⇒ means derives in one step
• ⇒+ means derives in one or more steps
• ⇒* means derives in zero or more steps

Context Free Languages
• The language of the grammar is L(G) = {w ∈ Σ* | S ⇒* w}
• The language generated by a context-free grammar (CFG) is called a context-free language (CFL).

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {0, 00, 1}
• G = (V={S}, Σ={0,1}, S, R)
• R: S → 0
     S → 00
     S → 1
• Alternatively: S → 0 | 00 | 1

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {0^n 1^n | n ≥ 0}
• G = (V={S}, Σ={0,1}, S, R) where R: S → 0S1 | ε
• Example: let's parse the word 0011
  S ⇒ 0S1 ⇒ 00S11 ⇒ 0011

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {0^n 1^n | n ≥ 1}
• G = (V={S}, Σ={0,1}, S, R) where R: S → 0S1 | 01
• Let's parse the word w=00001111

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {0^n 1^m | m ≥ n > 0}
• G = (V={S,B}, Σ={0,1}, S, R) where
  R: S → 0S1 | 0B1
     B → B1 | ε

Examples over Σ={0,1}
• Construct a grammar for the following language: L = 0*1+
• G = (V={S,B}, Σ={0,1}, S, R) where
  R: S → 0S | 1B
     B → 1B | ε
• What about 0*1*?

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {0^(2i+1) | i ≥ 0}
• G = (V={S}, Σ={0,1}, S, R) where R: S → 00S | 0
• Alternatively: S → 0S0 | 0

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {0^(i+1) 1^i | i ≥ 0}
• G = (V={S}, Σ={0,1}, S, R) where R: S → 0S1 | 0

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {w ∈ {0,1}* | |w| mod 2 = 1}
• G = (V={S}, Σ={0,1}, S, R) where R: S → 0 | 1 | 1S1 | 0S0 | 1S0 | 0S1
• Let's parse: 011100101
• Alternatively: S → DSD | D ; D → 0 | 1
• Alternatively: S → DDS | D ; D → 0 | 1

Examples over Σ={0,1}
• Construct a grammar for the following language: L = {0^n 1^n | n > 0} ∪ {1^n 0^n | n ≥ 0}
• G = (V={S,A,B}, Σ={0,1}, S, R) where
  R: S → A | B
     A → 0A1 | 01
     B → 1B0 | ε

Exercise
Construct grammars for the following languages over Σ={0,1}, where #1(w) and #0(w) are the numbers of 1s and 0s in w:
1. L1 = {w | #1(w) is even}
2. L2 = {w | #1(w) is odd}
3. L3 = {w | #1(w) = #0(w)}
4. L4 = {0^n 1 0^m 1 0^(n+m) | n, m ≥ 0}
Solution: In class

Define the Language for a CFG
• Give a description of L(G) for the following grammar G:
  S → 0S0 | 1
• L(G) = {0^n 1 0^n | n ≥ 0}

Define the Language for a CFG
• Give a description of L(G) for the following grammar G:
  S → 0S0 | 1S1 | ε
  – L(G) = {w ∈ {0,1}* | w = w^R, |w| is even}
  S → 0S0 | 1S1 | 1 | 0 | ε
  – L(G) = {w ∈ {0,1}* | w = w^R}

Define the Language for a CFG
• Give a description of L(G) for the following grammar G:
  S → 0A | 0B
  A → 1S
  B → 1
• L(G) = {(01)^n | n ≥ 1}
• Simpler version: S → 01S | 01

Define the Language for a CFG
• Give a description of L(G) for the following grammar G:
  S → 0S11 | 0
• L(G) = {0^(n+1) 1^(2n) | n ≥ 0}

Define the Language for a CFG
• Give a description of L(G) for the following grammar G:
  S → E | NE
  N → D | DN
  D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
  E → 0 | 2 | 4 | 6
• L(G) = {w | w represents an even octal number}

Define the Language for a CFG
• Give a description of L(G) for the following grammar G:
  S → N.N | -N.N
  N → D | DN
  D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• L(G) = {w | w represents a decimal rational number (that has a finite representation)}

Exercise
Give a description of L(G) for each of the following grammars over Σ={a,b,$}:
  G1: S → aSb | A
      A → Aa | ε
  G2: S → aSb | SS | ε
  G3: S → aSa | bSb | aS | bS | $
Solution: In class

Exercise
Give a description of L(G) for the following grammar over Σ={0,1,...,9,+,x}:
  E → E+E | ExE | D
  D → 0 | 1 | 2 | .. | 9
• Let's parse the string 3+4x5
  – E ⇒ E+E ⇒ D+E ⇒ 3+E ⇒ 3+ExE ⇒ … ⇒ 3+4x5
  – E ⇒ ExE ⇒ ExD ⇒ Ex5 ⇒ E+Ex5 ⇒ … ⇒ 3+4x5

Exercise (cont.)
  E → E+E | ExE | D
  D → 0 | 1 | 2 | .. | 9
• The string 3+4x5 can be produced in several ways:
[Figure: two parsing trees for 3+4x5. In one the root is E+E and the right E expands to ExE; in the other the root is ExE and the left E expands to E+E.]

Ambiguous Production
• So if we use this grammar to define a programming language, we will have several possible computations for 3+4x5.
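
To make the "several computations" concrete, the sketch below enumerates every parsing tree of a token string under E → E+E | ExE | D and evaluates each tree, reading + as addition and x as multiplication. That reading of the operators, and the function name parse_values, are assumptions made for this illustration; the slides give only the grammar.

```python
from functools import lru_cache


def parse_values(tokens):
    """Return one value per parsing tree of tokens under E -> E+E | ExE | D.

    tokens is a sequence of single characters: digits, '+' and 'x'.
    Reading '+' as addition and 'x' as multiplication is an assumption.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def values(lo, hi):
        # Rule E -> D followed by D -> digit: a single digit token.
        if hi - lo == 1:
            return (int(tokens[lo]),) if tokens[lo] in "0123456789" else ()
        out = []
        # Rules E -> E+E and E -> ExE: try each operator as the root.
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "x"):
                for left in values(lo, k):
                    for right in values(k + 1, hi):
                        if tokens[k] == "+":
                            out.append(left + right)
                        else:
                            out.append(left * right)
        return tuple(out)

    return list(values(0, len(tokens)))


print(parse_values("3+4x5"))  # [23, 35]
```

The two results correspond to the two trees: 23 comes from the tree with + at the root (3 + (4x5)), and 35 from the tree with x at the root ((3+4) x 5). A string with a single operator, such as 3+4, has exactly one tree.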
