Automata and Grammars

Context-Free Grammars and Languages

  • More powerful than finite automata.
  • Used to describe document formats, e.g. via a DTD (document type definition) in XML (Extensible Markup Language).
  • We introduce context-free grammar notation and parse trees.
  • There exists a machine model, the 'pushdown automaton', that describes all and only the context-free languages. It will be described later.

Automata and Grammars Grammars 6 March 30, 2017

Palindrome example

A palindrome is a string that reads the same forward and backward, like otto or Madam, I'm Adam. Formally, w is a palindrome iff w = w^R.

The language Lpal of palindromes is not a regular language. We use the pumping lemma:

  • If Lpal is a regular language, let n be the associated constant, and consider w = 0^n 1 0^n.
  • For regular Lpal, we can break w = xyz such that y consists of one or more 0's from the first group.
  • Thus xz would also be in Lpal if Lpal were regular. But xz has fewer 0's before the 1 than after it, so it is not a palindrome, which is a contradiction.

A context-free grammar for palindromes:

  1. P → λ
  2. P → 0
  3. P → 1
  4. P → 0P0
  5. P → 1P1

A context-free grammar such as this one uses one or more variables that represent classes of strings, i.e., languages.

Definition (Grammar)
A grammar G = (V, T, P, S) consists of:

  • A finite set of terminal symbols (terminals) T, like {0, 1} in the previous example.
  • A finite set of variables V (nonterminals, syntactic categories), like {P} in the previous example.
  • A start symbol S, a variable that represents the language being defined; P in the previous example.
  • A finite set of rules (productions) P that represent the recursive definition of the language. Each has the form
        αAβ → ω,  A ∈ V,  α, β, ω ∈ (V ∪ T)*,
    so the left side (head) contains at least one variable. A production consists of the head (the left side), the production symbol →, and the body (the right side).

Definition (Context-free grammar, CFG)
A context-free grammar (CFG) is a grammar G = (V, T, P, S) that has only productions of the form
    A → α,  A ∈ V,  α ∈ (V ∪ T)*.
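The recursive reading of the five palindrome productions can be sketched as a membership test. This is a minimal illustration, not part of the original slides; the function names in_lpal and pumped_down are hypothetical, and the second function only builds the string xz from the pumping-lemma argument under the assumption that y = 0^k is cut from the first block of 0's.

```python
def in_lpal(w: str) -> bool:
    """Membership test that mirrors productions 1-5 of the palindrome grammar."""
    # Productions 1-3: P -> λ | 0 | 1
    if w in ("", "0", "1"):
        return True
    # Productions 4-5: P -> 0P0 | 1P1 (same symbol at both ends, P in between)
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "01":
        return in_lpal(w[1:-1])
    return False


def pumped_down(n: int, k: int) -> str:
    """Return xz for w = 0^n 1 0^n when y = 0^k (1 <= k <= n) is removed
    from the first group of 0's."""
    return "0" * (n - k) + "1" + "0" * n
```

For example, in_lpal("0110") holds, while in_lpal(pumped_down(4, 2)) does not, which matches the contradiction used in the pumping-lemma argument.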
Chomsky hierarchy

Grammar types according to the productions allowed:

  • Type 0 (recursively enumerable languages L0): general rules
        α → β,  α, β ∈ (V ∪ T)*,  α contains at least one variable.
  • Type 1 (context-sensitive languages L1): productions of the form
        αAβ → αωβ,  A ∈ V,  α, β ∈ (V ∪ T)*,  ω ∈ (V ∪ T)+,
    with the only exception S → λ; in that case S does not appear on the right side of any production.
  • Type 2 (context-free languages L2): productions of the form
        A → ω,  A ∈ V,  ω ∈ (V ∪ T)*.
  • Type 3 (regular (right linear) languages L3): productions of the form
        A → ωB,  A → ω,  A, B ∈ V,  ω ∈ T*.

The classes of languages are ordered: L0 ⊇ L1 ⊇ L2 ⊇ L3. Later we show that the inclusions are proper: L0 ⊃ L1 ⊃ L2 ⊃ L3.

  • L0 ⊇ L1: recursively enumerable languages contain the context-sensitive languages, since productions αAβ → αωβ have the variable A in the head.
  • L2 ⊇ L3: context-free languages contain the regular languages, since productions A → ωB, A → ω have in the body a string from (V ∪ T)*.
  • L1 ⊇ L2: context-sensitive languages contain the context-free languages; we have to eliminate the rules A → λ, which can be done (shown later).

Derivations Using a Grammar

Definition (One step derivation)
Suppose G = (V, T, P, S) is a grammar. Let α, ω, η, ν ∈ (V ∪ T)* and let α → ω be a production rule of G. Then one derivation step is
    ηαν ⇒_G ηων,  or just  ηαν ⇒ ηων.

We extend ⇒ to any number of derivation steps as follows.

Definition (Derivation ⇒*)
Let G = (V, T, P, S) be a CFG.

  • Basis: any string α ∈ (V ∪ T)* derives itself, α ⇒*_G α.
  • Induction: if α ⇒*_G β and β ⇒_G γ, then α ⇒*_G γ.

If the grammar G is understood, we write ⇒* in place of ⇒*_G.
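The production forms of Types 2 and 3 and the one-step derivation relation can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the slides' own code: every symbol is a single character, a production is a (head, body) pair of strings, the empty string stands for λ, and the function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Production = Tuple[str, str]  # (head, body); "" stands for λ


def is_context_free(prods: Iterable[Production], variables: Set[str]) -> bool:
    """Type 2 check: every head is a single variable A."""
    return all(len(head) == 1 and head in variables for head, _ in prods)


def is_right_linear(prods: Iterable[Production], variables: Set[str]) -> bool:
    """Type 3 check: every production is A -> wB or A -> w with w in T*."""
    for head, body in prods:
        if len(head) != 1 or head not in variables:
            return False
        # A variable may occur only as the last symbol of the body.
        if any(sym in variables for sym in body[:-1]):
            return False
    return True


def one_step(form: str, prods: Iterable[Production]) -> Set[str]:
    """All strings reachable from `form` by one derivation step ηαν => ηων.
    Assumes non-empty heads, as required for Type 0 grammars."""
    results: Set[str] = set()
    for head, body in prods:
        start = form.find(head)
        while start != -1:
            results.add(form[:start] + body + form[start + len(head):])
            start = form.find(head, start + 1)
    return results


# The palindrome grammar with productions P -> λ | 0 | 1 | 0P0 | 1P1.
PALINDROME: List[Production] = [
    ("P", ""), ("P", "0"), ("P", "1"), ("P", "0P0"), ("P", "1P1"),
]
```

With these definitions, is_context_free(PALINDROME, {"P"}) holds but is_right_linear(PALINDROME, {"P"}) does not, and one_step("0P0", PALINDROME) lists the five sentential forms obtained by rewriting the single occurrence of P.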
Example (derivation E ⇒* a ∗ (a + b00))
    E ⇒ E ∗ E ⇒ I ∗ E ⇒ a ∗ E ⇒ a ∗ (E) ⇒ a ∗ (E + E) ⇒ a ∗ (I + E) ⇒ a ∗ (a + E)
      ⇒ a ∗ (a + I) ⇒ a ∗ (a + I0) ⇒ a ∗ (a + I00) ⇒ a ∗ (a + b00)

The Language of a Grammar, Notation

Convention for CFG derivations:

  • a, b, c: terminals
  • A, B, C: variables
  • w, z: strings of terminals
  • X, Y: either terminals or variables
  • α, β, γ: strings of terminals and/or variables

Definition (The Language of a Grammar)
Let G = (V, T, P, S) be a CFG. The language L(G) of G is the set of terminal strings that have derivations from the start symbol:
    L(G) = {w ∈ T* | S ⇒*_G w}.
The language of a variable A ∈ V is defined as L(A) = {w ∈ T* | A ⇒*_G w}.

Example (Not a CFL)
L = {0^n 1^n 2^n | n ≥ 1} is not context-free; there is no CFG generating it.

Type 3 grammars and regular languages

Productions have the form A → wB, A → w, A, B ∈ V, w ∈ T*.

An example of a derivation:
    P: S → 0S | 1A | λ,  A → 0A | 1B,  B → 0B | 1S
    S ⇒ 0S ⇒ 01A ⇒ 011B ⇒ 0110B ⇒ 01101S ⇒ 01101

Observations:

  • Each sentential form contains exactly one variable (except the last one).
  • The variable is always in the rightmost position.
  • The production A → w is the last one of the derivation.
  • Every step generates a terminal string and (possibly) changes the variable.

The relation between the grammar and a finite automaton:

  • variable = state of the finite automaton
  • productions = transition function

Example of the reduction FA to a grammar

Example (G, FA: binary numbers divisible by 5)
L = {w | w ∈ {0, 1}* & w is a binary number divisible by 5}

[Figure: finite automaton with states A, B1, C2, D3, E4 and transitions labeled 0 and 1.]

    A → 1B | 0A | λ
    B → 0C | 1D
    C → 0E | 1A
    D → 0B | 1C
    E → 0D | 1E

Derivation examples:
    A ⇒ 0A ⇒ 0                                 (0)
    A ⇒ 1B ⇒ 10C ⇒ 101A ⇒ 101                  (5)
    A ⇒ 1B ⇒ 10C ⇒ 101A ⇒ 1010A ⇒ 1010         (10)
    A ⇒ 1B ⇒ 11D ⇒ 111C ⇒ 1111A ⇒ 1111         (15)

FA to Grammar reduction

Theorem (L regular ⇒ L ∈ L3)
For any language recognized by a finite automaton there exists a Type 3 grammar generating the language.

Proof: FA to Grammar reduction
Let L = L(A) for some automaton A = (Q, Σ, δ, q0, F). We define a grammar G = (Q, Σ, P, q0) with productions P:
    p → aq  iff  δ(p, a) = q
    p → λ   iff  p ∈ F

Is L(A) = L(G)?

  • λ ∈ L(A) ⇔ q0 ∈ F ⇔ (q0 → λ) ∈ P ⇔ λ ∈ L(G)
  • a1 ... an ∈ L(A) ⇔ ∃ q0, ..., qn ∈ Q such that δ(q_i, a_{i+1}) = q_{i+1} and qn ∈ F
      ⇔ (q0 ⇒ a1 q1 ⇒ ... ⇒ a1 ... an qn ⇒ a1 ... an) is a derivation of a1 ... an
      ⇔ a1 ... an ∈ L(G)

Grammar to FA reduction

For the opposite direction, we aim to construct a finite automaton from a grammar:

  • Productions A → aB are encoded into the transition function.
  • Productions A → λ define the accepting states.
  • Productions A → a1 ... an B and A → a1 ... an with more terminals are rewritten: we introduce new variables H2, ..., Hn and define productions A → a1 H2, H2 → a2 H3, ..., Hn → an B, or A → a1 H2, H2 → a2 H3, ..., Hn → an.
  • Productions A → B correspond to λ-transitions.

Standard form of a Type 3 grammar

Lemma
For any Type 3 grammar there exists a Type 3 grammar generating the same language with all productions of the form A → aB, A → B, A → λ, where A, B ∈ V, a ∈ T.

Proof.
We define G' = (V', T, P', S). For each rule we introduce a set of new variables Y2, ..., Yn, Z1, ..., Zn and define:

    P                   P'
    A → aB              A → aB
    A → λ               A → λ
    A → a1 ... an B     A → a1 Y2, Y2 → a2 Y3, ..., Yn → an B
    Z → a1 ... an       Z → a1 Z1, Z1 → a2 Z2, ..., Z_{n−1} → an Zn, Zn → λ

We may also eliminate the rules A → B:

  • compute the transitive closure U(A) = {B | B ∈ V & A ⇒* B};
  • add A → w for all Z ∈ U(A) and (Z → w) ∈ P'.

Theorem (Reduction of a Type 3 grammar to a λ–NFA)
For any language L of a Type 3 grammar there exists a λ–NFA recognizing the same language.
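The FA to grammar construction (p → aq iff δ(p, a) = q, p → λ iff p ∈ F) can be sketched on the divisible-by-5 example. This is a minimal sketch with assumptions: states are the single letters A to E, the transition table is built from the rule (2r + b) mod 5 for remainder r and bit b (which agrees with the productions listed in the example), and the helper names dfa_to_grammar and generates are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Production = Tuple[str, str]  # (head, body); "" stands for λ


def dfa_to_grammar(delta: Dict[Tuple[str, str], str],
                   accepting: Set[str]) -> List[Production]:
    """Build the Type 3 productions p -> aq and p -> λ from a DFA."""
    prods: List[Production] = []
    for (p, a), q in sorted(delta.items()):
        prods.append((p, a + q))      # p -> aq  iff  δ(p, a) = q
    for p in sorted(accepting):
        prods.append((p, ""))         # p -> λ   iff  p ∈ F
    return prods


def generates(prods: List[Production], start: str, word: str) -> bool:
    """Check whether `word` has a derivation, assuming every production is
    either p -> aq (one terminal, one single-letter variable) or p -> λ."""
    current = {start}
    for ch in word:
        current = {body[1:] for head, body in prods
                   if head in current and body[:1] == ch}
    return any((v, "") in prods for v in current)


# DFA for binary numbers divisible by 5: state names[r] means remainder r.
names = "ABCDE"
DELTA = {(names[r], str(b)): names[(2 * r + b) % 5]
         for r in range(5) for b in (0, 1)}
GRAMMAR = dfa_to_grammar(DELTA, {"A"})
```

Here generates(GRAMMAR, "A", "1010") follows the derivation A ⇒ 1B ⇒ 10C ⇒ 101A ⇒ 1010A ⇒ 1010 from the example, and the empty word is accepted because of A → λ.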
Proof: Reduction of a Type 3 grammar to a λ–NFA
We take a grammar G = (V, T, P, S) generating L with all productions of the form A → aB, A → B, A → λ, where A, B ∈ V, a ∈ T (previous lemma). We define a nondeterministic λ–NFA A = (V, T, δ, {S}, F), where:

    F = {A | (A → λ) ∈ P}
    δ(A, a) = {B | (A → aB) ∈ P}
    δ(A, λ) = {B | (A → B) ∈ P}

Then L(G) = L(A):

  • λ ∈ L(G) ⇔ (S → λ) ∈ P ⇔ S ∈ F ⇔ λ ∈ L(A), when λ is derived without unit productions; if unit productions A → B are used, the case below with n = 0 applies.
  • a1 ... an ∈ L(G) ⇔ there exists a derivation (S ⇒* a1 H1 ⇒* ... ⇒* a1 ... an Hn ⇒ a1 ... an)
      ⇔ there exist variables H0, ..., Hm ∈ V, one per derivation step, such that H0 = S, Hm ∈ F, and
            H_{i+1} ∈ δ(H_i, a_k) for the step a1 ... a_{k−1} H_i ⇒ a1 ... a_{k−1} a_k H_{i+1},
            H_{i+1} ∈ δ(H_i, λ) for the step a1 ... a_k H_i ⇒ a1 ... a_k H_{i+1}
      ⇔ a1 ... an ∈ L(A)

Left (and right) linear grammars

Definition (Left (and right) linear grammars)
Type 3 grammars are also called right linear (the variable is always at the right end). A grammar G is left linear iff all productions have the form A → Bw, A → w, where A, B ∈ V, w ∈ T*.
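The construction in the proof (F from A → λ, δ(A, a) from A → aB, δ(A, λ) from A → B) can be sketched as follows. This is a hedged illustration, not the slides' own code: it assumes the grammar is already in the standard form of the lemma, that every symbol is a single character, and that a body of length one is a variable (a unit production). The names grammar_to_nfa, closure, accepts, and the small grammar UNIT are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Production = Tuple[str, str]  # (head, body); "" stands for λ
Delta = Dict[Tuple[str, str], Set[str]]


def grammar_to_nfa(prods: List[Production]) -> Tuple[Delta, Set[str]]:
    """Build δ and F of the λ-NFA from productions A -> aB, A -> B, A -> λ."""
    delta: Delta = {}
    accepting: Set[str] = set()
    for head, body in prods:
        if body == "":                       # A -> λ : A is accepting
            accepting.add(head)
        elif len(body) == 1:                 # A -> B : λ-transition
            delta.setdefault((head, ""), set()).add(body)
        else:                                # A -> aB: transition on a
            delta.setdefault((head, body[0]), set()).add(body[1])
    return delta, accepting


def closure(states: Set[str], delta: Delta) -> Set[str]:
    """All states reachable from `states` by λ-transitions only."""
    seen = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, ""), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def accepts(delta: Delta, accepting: Set[str], start: str, word: str) -> bool:
    """Simulate the λ-NFA on `word` from the start state."""
    current = closure({start}, delta)
    for ch in word:
        moved: Set[str] = set()
        for s in current:
            moved |= delta.get((s, ch), set())
        current = closure(moved, delta)
    return bool(current & accepting)


# The Type 3 example grammar S -> 0S | 1A | λ, A -> 0A | 1B, B -> 0B | 1S.
EXAMPLE = [("S", "0S"), ("S", "1A"), ("S", ""), ("A", "0A"),
           ("A", "1B"), ("B", "0B"), ("B", "1S")]
# Hypothetical grammar with a unit production: X -> Y, Y -> aY | λ.
UNIT = [("X", "Y"), ("Y", "aY"), ("Y", "")]
```

Running accepts on the automaton built from EXAMPLE with the word 01101 traces the same path as the derivation S ⇒ 0S ⇒ 01A ⇒ 011B ⇒ 0110B ⇒ 01101S ⇒ 01101. The UNIT grammar shows why the λ-closure is needed: the empty word is accepted only through the λ-transition from X to Y.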
