Automata and Grammars

Prof. Dr. Friedrich Otto

Fachbereich Elektrotechnik/Informatik, Universität Kassel 34109 Kassel, Germany E-mail: [email protected]

SS 2018

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 1 / 294
Lectures and Seminar SS 2018
Lectures: Thursday 9:00 - 10:30, Room S 11. Start: Thursday, February 22, 2018, 9:00.
Seminar: Thursday 10:40 - 12:10, Room S 11. Start: Thursday, February 22, 2018, 10:40.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 2 / 294
Exercises: From Thursday to Thursday of the next week!
To pass the seminar: at least two successful presentations in class (10 %) and passing of the midterm exam on April 5 (10:40 - 11:40) (20 %).
Final Exam (dates have been fixed!):
Written exam (2 hours) on May 31 (9:00 - 11:30) in S 11
Oral exam (30 minutes per student) on June 14 (9:00 - 12:30) in S 10
Written exam (2nd try) on June 21 (9:00 - 11:30) in S 11
Oral exam (2nd try) on June 28 (9:00 - 12:30) in S 11
The written exam accounts for 50 % of the final grade, while the oral exam accounts for 20 % of the final grade.
Moodle: Course “Automata and Grammars” NTIN071.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 3 / 294
Literature:
P.J. Denning, J.B. Dennis, J.E. Qualitz; Machines, Languages, and Computation. Prentice-Hall, Englewood Cliffs, N.J., 1978.
M. Harrison; Introduction to Formal Language Theory. Addison-Wesley, Reading, M.A., 1978.
J.E. Hopcroft, R. Motwani, J.D. Ullman; Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Boston, 2nd ed., 2001.
H.R. Lewis, Ch.H. Papadimitriou; Elements of the Theory of Computation. Prentice Hall, Englewood Cliffs, N.J., 1981.
G. Rozenberg, A. Salomaa (editors); Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. Springer, Heidelberg, 1997.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 4 / 294 1. Introduction

Formal Languages (not natural languages): A formal language is a set of (finite) sequences of symbols (words, strings) from a fixed finite set of symbols (alphabet). How to describe a formal language? There are various options: - by a complete enumeration, but:

L = { w ∈ {a, b, c}∗ | |w| = 20 } : |L| = 3^20 ≈ 3.5 × 10^9

- by a mathematical expression, but: how to check that a given word belongs to the language described? - by a generative device: a grammar or a rewriting system. A grammar tells you how to generate (all) the words of a language! - by an analytical device: an automaton or an algorithm. An automaton recognizes exactly the words of a given language!

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 5 / 294 1. Introduction

The Roots of the Theory of Formal Languages:
- Combinatorics on Words (A. Thue 1906, 1912)
- Semigroup and Group Theory (M. Dehn 1911)
- Logic (A. Turing 1936, E. Post 1936, A. Church 1936)
Church’s Thesis
A function is effectively computable iff there is a Turing machine that computes this function.
Let f : Σ∗ ⇀ Γ∗ be a (partial) function.
dom(f) = { u ∈ Σ∗ | f(u) is defined } : domain
range(f) = { v ∈ Γ∗ | ∃u ∈ Σ∗ : f(u) = v } : range
ker(f) = { u ∈ Σ∗ | f(u) = ε } : kernel
graph(f) = { u#f(u) | u ∈ dom(f) } : graph
f is computable iff there is an “algorithm” that accepts the language graph(f).
Classes of Automata: Restrictions of the Turing machine

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 6 / 294 1. Introduction

- Linguistics (N. Chomsky 1956) : phrase structure grammars

- Biology (A. Lindenmayer 1968) : L-systems (T. Head 1987) : DNA-computing, H-systems (G. Paun 1999) : Membrane computing, P-systems

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 7 / 294 1. Introduction

Overview: Chapter 2: Regular Languages and Finite Automata Chapter 3: Context-free Languages and Pushdown Automata Chapter 4: Turing Machines and Recursively Enumerable and Context-Sensitive Languages Chapter 5: Summary

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 8 / 294 2. Regular Languages and Finite Automata

Chapter 2:

Regular Languages and Finite Automata

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 9 / 294 2. Regular Languages and Finite Automata 2.1. Words, Languages, and Morphisms

An alphabet Σ is a finite set of symbols or letters.
For all n ∈ N : Σ^n = { u : [1, n] → Σ } is the set of words of length n over Σ, that is, u = (u(1), u(2), ..., u(n)) = u1u2...un.
Σ^0 = {ε} : ε = empty word
Σ+ = ⋃_{n≥1} Σ^n : set of all non-empty words over Σ
Σ∗ = ⋃_{n≥0} Σ^n : set of all words over Σ
The concatenation · : Σ∗ × Σ∗ → Σ∗ is defined through u · v = uv. For u ∈ Σ^m and v ∈ Σ^n, uv ∈ Σ^{m+n}.

The length function |·| : Σ∗ → N is defined through |u| = n for all u ∈ Σ^n, n ≥ 0.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 10 / 294 2. Regular Languages and Finite Automata 2.1. Words, Languages, and Morphisms

Lemma 2.1

(a) The operation of concatenation is associative, that is, (u · v) · w = u · (v · w). (b) For all u ∈ Σ∗, u · ε = ε · u = u. (c) If u · v = u · w, then v = w (Left cancellability). (d) If u · w = v · w, then u = v (Right cancellability). (e) If u · v = x · y, then exactly one of the following cases holds: (1) |u| = |x|, u = x, and v = y. (2) |u| > |x|, and there exists z ∈ Σ+ such that u = x · z and y = z · v. (3) |u| < |x|, and there exists z ∈ Σ+ such that x = u · z and v = z · y.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 11 / 294 2. Regular Languages and Finite Automata 2.1. Words, Languages, and Morphisms

Abbreviations: uv stands for u · v. u^0 = ε, u^1 = u, u^{n+1} = u^n u for all u ∈ Σ∗, n ≥ 1.
Further Basic Notions:
The mirror function ·^R : Σ∗ → Σ∗ is defined through ε^R = ε, (ua)^R = a u^R for all u ∈ Σ∗, a ∈ Σ. Hence, (abbc)^R = c(abb)^R = cb(ab)^R = cbba.
If uv = w, then u is a prefix and v is a suffix of w. If u ≠ ε (v ≠ ε), then v is a proper suffix (u is a proper prefix) of w.
If uvz = w, then v is a factor of w. It is a proper factor if uz ≠ ε.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 12 / 294 2. Regular Languages and Finite Automata 2.1. Words, Languages, and Morphisms

A language L over Σ is a subset L ⊆ Σ∗.

The cardinality of L is denoted by |L|.

Let L, L1, L2 ⊆ Σ∗.

L1 · L2 = { uv | u ∈ L1, v ∈ L2 } is the product of L1 and L2. It is also called the concatenation of L1 and L2.

L^0 = {ε}, L^1 = L, L^{n+1} = L^n · L for all n ≥ 1.

L+ = ⋃_{n≥1} L^n and L∗ = ⋃_{n≥0} L^n are the plus closure and the star closure (Kleene closure) of L.

L^R = { w^R | w ∈ L } is the mirror language of L.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 13 / 294 2. Regular Languages and Finite Automata 2.1. Words, Languages, and Morphisms

Example:

(a) Σ1 := {0, 1} : ε, 0, 10, 110 ∈ Σ1∗.
L1 = { 10^i | i ≥ 0 }:
L1^R = { 0^j 1 | j ≥ 0 } and L1 · L1^R = { 10^i 1 | i ≥ 0 }.
L2 = { 0^n 1^n | n ≥ 1 }.

(b) Σ2 := {a, b, c} :
L3 = { w c w^R | w ∈ {a, b}∗ }: marked palindromes of even length
L4 = { ww | w ∈ {a, b}∗ }: copy language

L5 = { w | |w|_a ≡ 0 mod 2 and |w|_b ≡ 1 mod 3 }.
(c) Σ3 := {0, 1, ..., 9, ., +, −, e} : 112.45, −23e + 17 ∈ Σ3+
L6 = { unsigned integers in PASCAL }

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 14 / 294 2. Regular Languages and Finite Automata 2.1. Words, Languages, and Morphisms

Let Σ and ∆ be two alphabets. A mapping h :Σ∗ → ∆∗ is a morphism, if the equality h(uv) = h(u) · h(v) holds for all u, v ∈ Σ∗. Lemma 2.2 If h :Σ∗ → ∆∗ is a morphism, then h(ε) = ε.

For a set S, 2^S denotes the power set of S.
A mapping ϕ : Σ∗ → 2^{∆∗} is a substitution if the two following conditions are satisfied:
– ∀u, v ∈ Σ∗ : ϕ(uv) = ϕ(u) · ϕ(v).
– ϕ(ε) = {ε}.
For L ⊆ Σ∗, h(L) = { h(w) | w ∈ L } and ϕ(L) = ⋃_{w∈L} ϕ(w).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 15 / 294 2. Regular Languages and Finite Automata 2.1. Words, Languages, and Morphisms

Example: Let Σ = {a, b, c} and ∆ = {0, 1}.
If h : Σ∗ → ∆∗ is defined through a ↦ 01, b ↦ 1, c ↦ ε, then h(baca) = 10101.
If ϕ : Σ∗ → 2^{∆∗} is defined through a ↦ {01, 001}, b ↦ {1^i | i ≥ 1}, c ↦ {ε}, then
ϕ(baca) = { 1^i 0101, 1^i 01001, 1^i 00101, 1^i 001001 | i ≥ 1 }.

A morphism h is called ε-free if h(a) ≠ ε for all a ∈ Σ.
A substitution ϕ is called ε-free if ε ∉ ϕ(a) for all a ∈ Σ.
A substitution ϕ is called finite if ϕ(a) is a finite set for all a ∈ Σ.
h⁻¹ : ∆∗ → 2^{Σ∗} is defined through h⁻¹(v) = { u ∈ Σ∗ | h(u) = v }.
ϕ⁻¹ : ∆∗ → 2^{Σ∗} is defined through ϕ⁻¹(v) := { u ∈ Σ∗ | v ∈ ϕ(u) }.
These inverse mappings are in general not substitutions!

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 16 / 294 2. Regular Languages and Finite Automata 2.2 Regular Grammars

A semi-Thue system (string-rewriting system) on an alphabet Σ is a (finite) set S of pairs of words over Σ:

S = { ℓ1 → r1, ..., ℓn → rn } (n ≥ 0, ℓ1, ..., ℓn, r1, ..., rn ∈ Σ∗).

S induces a number of binary relations on Σ∗:

the single-step derivation relation →S:
u →S v iff ∃(ℓ → r) ∈ S ∃x, y ∈ Σ∗ : u = xℓy and v = xry.
the derivation relation →S^∗:
u →S^0 v iff u = v,
u →S^1 v iff u →S v,
u →S^{n+1} v iff ∃w ∈ Σ∗ : u →S^n w and w →S v,
u →S^+ v iff ∃n ≥ 1 : u →S^n v,
u →S^∗ v iff ∃n ≥ 0 : u →S^n v.
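A small Python sketch (not from the lecture; the function name is illustrative) of the single-step relation →S: it enumerates all words v with u →S v, where the system S is given as a list of pairs (ℓ, r).

def one_step_successors(u, S):
    # all v with u ->_S v, i.e. u = x l y and v = x r y for some rule (l, r) in S
    result = set()
    for l, r in S:
        start = 0
        while True:
            i = u.find(l, start)
            if i == -1:
                break
            result.add(u[:i] + r + u[i + len(l):])
            start = i + 1
    return result

# Example: S = { ab -> ba } applied to u = abab
print(one_step_successors("abab", [("ab", "ba")]))   # {'baab', 'abba'}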

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 17 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Lemma 2.3
(a) The relation →S^∗ is the smallest reflexive and transitive binary relation on Σ∗ that contains →S.
(b) If u →S^∗ v, then xuy →S^∗ xvy for all x, y ∈ Σ∗.

A phrase-structure grammar G is a 4-tuple G = (N, T, S, P), where
– N is a finite alphabet of nonterminals (variables),
– T is a finite alphabet of terminals (terminal symbols), where N ∩ T = ∅,
– S ∈ N is the start symbol, and
– P ⊆ (N ∪ T)∗ × (N ∪ T)∗ is a finite semi-Thue system on N ∪ T, the elements of which are called productions.
For each production (ℓ → r) ∈ P, it is required that |ℓ|_N ≥ 1, that is, the left-hand side ℓ contains at least one nonterminal.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 18 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

For A ∈ N,
L̂(G, A) = { α ∈ (N ∪ T)∗ | A →P^∗ α } is the set of A-sentential forms, and

L(G, A) = { w ∈ T∗ | A →P^∗ w } is the set of terminal words that are derivable from A.
L̂_G = L̂(G, S) is the set of sentential forms that are derivable in G, and L_G = L(G, S) is the language generated by G.
The grammar G = (N, T, S, P) is called left regular if ℓ ∈ N and r ∈ T∗ ∪ N · T∗ for each production (ℓ → r) ∈ P.
G is called right regular if ℓ ∈ N and r ∈ T∗ ∪ T∗ · N for each production (ℓ → r) ∈ P.
Finally, G is called regular if it is left or right regular.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 19 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Example:

(a) G1 = (N, T, S, P), where N = {S, A}, T = {0, 1}, and P = {S → 0A, A → 10A, A → ε}.
G1 is right regular, and L(G1) = { 0(10)^i | i ≥ 0 }.

(b) G2 = (N, T, S, P), where N = {S}, T = {0, 1}, and P = {S → S10, S → 0}.
G2 is left regular, and L(G2) = { 0(10)^i | i ≥ 0 } = L(G1).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 20 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

A language L ⊆ Σ∗ is called regular if there exists a regular grammar G such that L_G = L.
REG(Σ) = set of regular languages over Σ
REG = class of all regular languages

Remark 2.4

(a) Each finite language is regular.
(b) If L is a regular language, then so is its mirror language L^R.
(c) If L ∈ REG(Σ), and if h : Σ∗ → ∆∗ is a morphism, then h(L) ∈ REG(∆), that is, the class REG is closed under morphisms.
(d) REG is also closed under finite substitutions.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 21 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

A right regular grammar G = (N, T, S, P) is in right normal form if r ∈ T · N ∪ T for each production (ℓ → r) ∈ P. In addition, G may contain the production (S → ε) if S does not occur on the right-hand side of any production.
If G is in right normal form, then it does not contain any productions of the following forms:
A → ε (A ≠ S): ε-rule,
A → B (A, B ∈ N): chain rule,
A → wB or A → w, where w ∈ T∗, |w| ≥ 2.

Theorem 2.5 From a right regular grammar G, a grammar Gˆ in right normal form can be constructed such that L(Gˆ ) = L(G).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 22 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Proof of Theorem 2.5.
Let G = (N, T, S, P) be a right regular grammar that is not in right normal form. First we eliminate the ε-rules from P.
(1) Determine the set N1 = { A ∈ N | A →P^∗ ε }:
N1^(1) := { A ∈ N | (A → ε) ∈ P },
N1^(k+1) := N1^(k) ∪ { A ∈ N | ∃B ∈ N1^(k) : (A → B) ∈ P }.
Then N1 = ⋃_{k≥1} N1^(k) = N1^(|N|).
(2) Remove all ε-rules.
(3) For each production B → wA, where |w| > 0 and A ∈ N1, add the production B → w.
(4) If S ∈ N1, then introduce a new start symbol Ŝ and the productions Ŝ → ε and Ŝ → α for each production S → α.
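The fixed-point computation of N1 in step (1) can be sketched in Python as follows (a hedged illustration, not part of the lecture; productions are (A, rhs) pairs over one-character nonterminals, with ε written as the empty string, as in the example on the next slide).

def nullable_nonterminals(nonterminals, productions):
    # N1^(1): all A with an eps-rule A -> eps
    n1 = {A for A, rhs in productions if rhs == ""}
    changed = True
    while changed:                      # compute N1^(k+1) from N1^(k)
        changed = False
        for A, rhs in productions:
            # chain rule A -> B with a nullable B
            if rhs in nonterminals and rhs in n1 and A not in n1:
                n1.add(A)
                changed = True
    return n1

# The grammar of the following example yields N1 = {S, D, B, A, C}
N = {"S", "A", "B", "C", "D"}
P = [("S", ""), ("S", "abA"), ("S", "B"), ("A", "abS"), ("A", "B"),
     ("B", "C"), ("B", "bbbC"), ("B", "D"), ("C", "A"), ("C", "aab"),
     ("D", ""), ("D", "a"), ("D", "aab"), ("D", "abD")]
print(sorted(nullable_nonterminals(N, P)))   # ['A', 'B', 'C', 'D', 'S']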

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 23 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Example: Let G = (N, T, S, P), where N = {S, A, B, C, D}, T = {a, b}, and
P = {S → ε, S → abA, S → B, A → abS, A → B, B → C, B → b³C, B → D, C → A, C → aab, D → ε, D → a, D → aab, D → abD}.
Elimination of ε-rules:
(1) N1^(1) = {S, D}, N1^(2) = {S, D, B}, N1^(3) = {S, D, B, A}, N1^(4) = {S, D, B, A, C} = N = N1.
(2) Delete the productions S → ε and D → ε.
(3) Add the productions S → ab, A → ab, B → bbb, and D → ab.
(4) Introduce Ŝ and Ŝ → ε, Ŝ → abA, Ŝ → ab, and Ŝ → B.
Then G1 = (N ∪ {Ŝ}, T, Ŝ, P1) is equivalent to G, where
P1 = {Ŝ → ε, Ŝ → abA, Ŝ → ab, Ŝ → B, S → abA, S → ab, S → B, A → abS, A → ab, A → B, B → C, B → b³C, B → b³, B → D, C → A, C → aab, D → a, D → aab, D → abD, D → ab}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 24 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Proof of Theorem 2.5 (cont.).

Next we eliminate the chain rules from P1.

(1) Two nonterminals A, B ∈ N1 are called equivalent (A ↔ B) if A →P1^+ B and B →P1^+ A.
For A ∈ N1, [A] = { B ∈ N1 | A ↔ B }.
Pick A′ ∈ [A] and replace all B ∈ [A] in P1 by A′. Remove the resulting productions of the form (A′ → A′).
(2) Order the remaining nonterminals such that (A → B) ∈ P1 implies A < B. If B is the largest nonterminal occurring on the right-hand side of a chain rule A → B, then delete this chain rule and add productions A → α for all productions B → α.
(3) Repeat (2) until no chain rules are left.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 25 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Example (cont.):

P1 contains the chain rules Ŝ → B, S → B, A → B, B → C, B → D, and C → A.
Hence, A ↔ B ↔ C and [A] = {A, B, C}. We pick A as representative and replace B and C by A:
P2 = {Ŝ → ε, Ŝ → abA, Ŝ → ab, Ŝ → A, S → abA, S → ab, S → A, A → abS, A → ab, A → A, A → A, A → b³A, A → b³, A → D, A → A, A → aab, D → a, D → aab, D → abD, D → ab}.
To eliminate the chain rule A → D, add the productions A → a and A → abD.
To eliminate the chain rules Ŝ → A and S → A, add corresponding productions with left-hand sides Ŝ and S.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 26 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Example (cont.):
This process yields G3 = (N3, T, Ŝ, P3) with N3 = {Ŝ, S, A, D} and
P3 = {Ŝ → ε | abA | ab | abS | b³A | b³ | aab | a | abD,
S → abA | ab | abS | b³A | b³ | aab | a | abD,
A → abS | ab | b³A | b³ | aab | a | abD,
D → a | aab | abD | ab}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 27 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Proof of Theorem 2.5 (cont.).

Finally we eliminate long productions from P3.

For each production (A → a1a2⋯am B) ∈ P3, where m ≥ 2, introduce new nonterminals A1, A2, ..., Am−1, and replace the above production by
A → a1A1, A1 → a2A2, ..., Am−2 → am−1Am−1, Am−1 → amB.

For each production (A → a1a2⋯am) ∈ P3, where m ≥ 2, introduce new nonterminals A1, A2, ..., Am−1, and replace the above production by
A → a1A1, A1 → a2A2, ..., Am−2 → am−1Am−1, Am−1 → am.
The resulting grammar Ĝ is in right normal form and L(Ĝ) = L(G).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 28 / 294 2. Regular Languages and Finite Automata 2.2 Regulare Grammars

Example (cont.):

P3 contains the following A-productions: A → abS, A → b³A, A → b³, A → aab, A → ab, A → a, and A → abD.

To replace the long ones, we introduce the nonterminals A1, A2, ..., A9 and the following productions:

A → aA1, A1 → bS, A → bA2, A2 → bA3, A3 → bA, A → bA4, A4 → bA5, A5 → b, A → aA6, A6 → aA7, A7 → b, A → aA8, A8 → b, A → aA9, A9 → bD.

For example: A → aA1 → abS, A → bA2 → bbA3 → bbbA, A → aA6 → aaA7 → aab.
The long Ŝ- and S-productions are replaced analogously.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 29 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

[Figure: an input tape read by a one-way head that is attached to a finite-state control emitting a signal for acceptance.]

A finite-state automaton reads an input word letter by letter from left to right and accepts or rejects.
finite-state automaton ⇝ language of accepted words

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 30 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Definition 2.6
A deterministic finite-state automaton (DFA) A is given through a 5-tuple A = (Q, Σ, δ, q0, F), where
– Q is a finite set of (internal) states,
– Σ is a finite (input) alphabet,
– δ : Q × Σ ⇀ Q is a (partial) transition function,
– q0 ∈ Q is the initial state, and
– F ⊆ Q is a set of accepting states.
If δ is defined for all (q, a) ∈ Q × Σ, then A is a complete DFA.
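As a hedged illustration (not part of the slides; names such as delta_hat are ad hoc), a partial transition function can be stored as a Python dict, and acceptance is a simple loop; this already anticipates the extension δ̂ of δ to words that is defined a few slides below.

def delta_hat(delta, q, w):
    # extended transition function; returns None where the partial delta is undefined
    for a in w:
        if (q, a) not in delta:
            return None
        q = delta[(q, a)]
    return q

def accepts(delta, q0, F, w):
    return delta_hat(delta, q0, w) in F

# the complete DFA of the following example (F = {q3})
delta = {("q0", "a"): "q1", ("q0", "b"): "q3",
         ("q1", "a"): "q2", ("q1", "b"): "q0",
         ("q2", "a"): "q3", ("q2", "b"): "q1",
         ("q3", "a"): "q0", ("q3", "b"): "q2"}
print(accepts(delta, "q0", {"q3"}, "aabaa"))   # True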

A DFA can be described by a state graph G = (Kn, Ka):
– Kn := Q (the nodes),
– Ka (the edges): if δ(q1, a) = q2, then there is a directed edge with label a from q1 to q2,
– q0 and the states q ∈ F are marked in a special way.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 31 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Example:

A = (Q, Σ, δ, q0, F), where Q = {q0, q1, q2, q3}, Σ = {a, b}, F = {q3}, and
δ(q0, a) = q1, δ(q0, b) = q3,
δ(q1, a) = q2, δ(q1, b) = q0,
δ(q2, a) = q3, δ(q2, b) = q1,
δ(q3, a) = q0, δ(q3, b) = q2.

State graph: [the four states arranged in a cycle; a-edges q0 → q1 → q2 → q3 → q0 and b-edges q0 → q3 → q2 → q1 → q0; q0 is the initial and q3 the accepting state.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 32 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Example (cont.): Input: aabaa

[Figure: the input tape a a b a a being read by the finite-state control.]

Computation: q0 →(a) q1 →(a) q2 →(b) q1 →(a) q2 →(a) q3
Result: aabaa is accepted, since q3 ∈ F.


Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 33 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Let A = (Q, Σ, δ, q0, F) be a DFA. The function δ : Q × Σ ⇀ Q can be extended to a function δ̂ : Q × Σ∗ ⇀ Q, where q ∈ Q, u ∈ Σ∗, and a ∈ Σ:
– δ̂(q, ε) := q,
– δ̂(q, ua) := δ(δ̂(q, u), a), if δ̂(q, u) is defined, and δ̂(q, ua) is undefined otherwise.

L_{A,q1,q2} := { u ∈ Σ∗ | δ̂(q1, u) = q2 } is the set of words that take A from state q1 to state q2.
If δ̂(q0, w) ∈ F, then the word w is accepted by the DFA A.
The set L(A) := ⋃_{q∈F} L_{A,q0,q} of all words that are accepted by A is called the language accepted by A.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 34 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Notice:

For all q ∈ Q and a1, a2, ..., an ∈ Σ,
δ̂(q, a1a2...an) = δ̂(δ(q, a1), a2...an) = δ̂(δ(δ(q, a1), a2), a3...an) = ... = δ(δ(...δ(δ(q, a1), a2)..., an−1), an).
For all q ∈ Q and u, v ∈ Σ∗, δ̂(q, uv) = δ̂(δ̂(q, u), v).

Example (cont.):
L(A) = { x ∈ Σ∗ | |x|_a − |x|_b ≡ 3 mod 4 }.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 35 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

A word of the form qu ∈ Q · Σ∗ is called a configuration of A.
The DFA A induces a computation relation ⊢_A^∗ on Q · Σ∗.
It follows that L(A) = { u ∈ Σ∗ | q0u ⊢_A^∗ q for some q ∈ F }.
Lemma 2.7
From a given DFA A, a complete DFA B can be constructed such that L(B) = L(A).

Proof.

Let A = (Q, Σ, δ, q0, F) be an incomplete DFA.
We define a DFA B = (Q ∪ {⊥}, Σ, δ′, q0, F) through
δ′(q, a) := δ(q, a), if δ(q, a) is defined, and δ′(q, a) := ⊥ otherwise,
δ′(⊥, a) := ⊥ for all a ∈ Σ.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 36 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Proof of Lemma 2.7 (cont.)
Then the following equality holds for all q ∈ Q and all u ∈ Σ∗:

δ̂′(q, u) = δ̂(q, u), if δ̂(q, u) is defined, and δ̂′(q, u) = ⊥ otherwise.
Hence,

L(A) = { u ∈ Σ∗ | δ̂(q0, u) ∈ F } = { u ∈ Σ∗ | δ̂′(q0, u) ∈ F } = L(B). □
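A minimal sketch of this completion (illustrative only): every missing transition is redirected to a fresh non-accepting sink state that loops on every letter and plays the role of ⊥.

def complete(states, alphabet, delta, sink="SINK"):
    # return a total transition function; undefined moves go to the sink state
    total = dict(delta)
    for q in list(states) + [sink]:
        for a in alphabet:
            total.setdefault((q, a), sink)
    return total

delta = {("p", "0"): "q"}                                   # incomplete DFA over {0, 1}
print(complete({"p", "q"}, {"0", "1"}, delta)[("q", "1")])  # SINK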

Let L(DFA) denote the class of languages that are accepted by DFAs. Theorem 2.8 L(DFA) is closed under the operations of union, intersection, and complement.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 37 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Proof.

Let L = L(A), where A = (Q, Σ, δ, q0, F) is a complete DFA.
Then A^C := (Q, Σ, δ, q0, Q ∖ F) is a complete DFA for Σ∗ ∖ L(A) = L^C.
Hence, L(DFA) is closed under the operation of complement.
Let A_i = (Q_i, Σ, δ_i, q0^(i), F_i) be a complete DFA for L_i, i = 1, 2.
A DFA for the intersection L1 ∩ L2 is obtained by taking A = (Q, Σ, δ, q0, F), where
Q := Q1 × Q2, q0 := (q0^(1), q0^(2)), F := F1 × F2, and
δ((q1, q2), a) := (δ1(q1, a), δ2(q2, a)) for all q1 ∈ Q1, q2 ∈ Q2, a ∈ Σ.
Then δ(q0, w) = (δ1(q0^(1), w), δ2(q0^(2), w)) ∈ F = F1 × F2 iff w ∈ L1 and w ∈ L2, that is, L(A) = L1 ∩ L2.

A is called the product automaton of A1 and A2. By De Morgan’s law, closure under union follows.
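A sketch of the product construction under the assumptions of the proof (both DFAs complete, given as dicts; the function name is illustrative). Taking F1 × Q2 ∪ Q1 × F2 instead of F1 × F2 would directly give a DFA for the union.

from itertools import product

def product_dfa(Q1, d1, q01, F1, Q2, d2, q02, F2, alphabet):
    # states are pairs; both components are advanced simultaneously
    states = set(product(Q1, Q2))
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for (p, q) in states for a in alphabet}
    F = {(p, q) for p in F1 for q in F2}   # accept iff both components accept
    return states, delta, (q01, q02), F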

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 38 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Theorem 2.9 If a language L is accepted by a DFA, then L is right regular.

Proof.
Let L ⊆ Σ∗ be accepted by a complete DFA A = (Q, Σ, δ, q0, F).
We define a right regular grammar G := (V, Σ, P, S) as follows:
– V := Q,
– S := q0,
– P := { q1 → aq2 | δ(q1, a) = q2 } ∪ { q1 → a | δ(q1, a) = q2 and q2 ∈ F }.

If q0 ∈ F, i.e., ε ∈ L(A), then we add the new start symbol q̂0 and the following productions:

{ q̂0 → ε } and { q̂0 → aq2 | δ(q0, a) = q2 } and { q̂0 → a | δ(q0, a) ∈ F }.
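A sketch of this grammar extraction (illustrative; it ignores the special treatment of ε ∈ L(A) via the new start symbol q̂0 described above, and it assumes states and letters are single characters so that a + q2 is a word over V ∪ Σ).

def dfa_to_right_regular(delta, F):
    # productions q1 -> a q2 for every transition, plus q1 -> a whenever q2 is accepting
    productions = []
    for (q1, a), q2 in delta.items():
        productions.append((q1, a + q2))
        if q2 in F:
            productions.append((q1, a))
    return productions

print(dfa_to_right_regular({("A", "0"): "B"}, {"B"}))   # [('A', '0B'), ('A', '0')]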

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 39 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Proof of Theorem 2.9 (cont.).

Claim. ∀x ∈ Σ+ : x ∈ L(A) iff x ∈ L(G).

Proof.
x = a1a2...an ∈ L(A)
iff ∃q1, q2, ..., qn ∈ Q : qn ∈ F and δ(qi−1, ai) = qi (1 ≤ i ≤ n)
iff ∃q1, ..., qn−1 ∈ V : q0 → a1q1 → ... → a1...an−1qn−1 → a1...an−1an
iff x = a1a2...an ∈ L(G). □

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 40 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

There are many DFAs that accept the same language. Among these, is there a smallest DFA? Let L ⊆ Σ∗.
Definition 2.10
The Nerode relation R_L ⊆ Σ∗ × Σ∗ for L is defined through
x R_L y iff ∀z ∈ Σ∗ : (xz ∈ L iff yz ∈ L).

Lemma 2.11
– R_L is an equivalence relation on Σ∗.
– If x R_L y, then xw R_L yw for all w ∈ Σ∗.
– R_L partitions Σ∗ (and L) into disjoint equivalence classes.

The index of R_L is the number of equivalence classes of R_L.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 41 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Theorem 2.12 (Myhill, Nerode)

A language L is accepted by a DFA iff the relation R_L has finite index.

Proof.
“⇒”: Let A = (Q, Σ, δ, q0, F) be a complete DFA for L. We define a binary relation R_A ⊆ Σ∗ × Σ∗ : x R_A y iff δ̂(q0, x) = δ̂(q0, y).

Claim. R_A ⊆ R_L.

Proof.
Assume that x R_A y, and let z ∈ Σ∗.
Then: xz ∈ L iff δ̂(q0, xz) ∈ F iff δ̂(δ̂(q0, x), z) ∈ F iff δ̂(δ̂(q0, y), z) ∈ F iff δ̂(q0, yz) ∈ F iff yz ∈ L.

Hence: x R_L y. □

Thus: index(R_L) ≤ index(R_A) ≤ |Q| < ∞. Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 42 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Proof of Theorem 2.12 (cont.)

“⇐”: Assume that R_L has finite index.
Then there are finitely many words x1, x2, ..., xk ∈ Σ∗ such that Σ∗ = [x1] ∪ [x2] ∪ ... ∪ [xk].

We define a complete DFA A0 := (Q0, Σ, δ, q0, F):
– Q0 := { [x1], [x2], ..., [xk] },
– q0 := [ε],
– F := { [xi] | xi ∈ L },
– δ([xi], a) := [xi a] (for all i = 1, 2, ..., k and a ∈ Σ).
Then δ̂([ε], x) = [x] for all x ∈ Σ∗, that is,
x ∈ L(A0) iff δ̂(q0, x) ∈ F iff δ̂([ε], x) ∈ F iff [x] ∈ F iff x ∈ L. □

A0 is called the equivalence class automaton for L.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 43 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Example 1: L = { a^n b^n | n ≥ 1 }
Then: [ab] = L,
[a^2 b] = {a^2 b, a^3 b^2, a^4 b^3, ...},
[a^3 b] = {a^3 b, a^4 b^2, a^5 b^3, ...},
...
[a^k b] = { a^{k+i−1} b^i | i ≥ 1 }.
For all i ≠ j, a^i b and a^j b are not equivalent w.r.t. R_L, that is, index(R_L) = ∞.
Hence, L is not accepted by any DFA.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 44 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Example 2: L = { x ∈ {0, 1}∗ | x has suffix 00 }
Then: [ε] = { x | x does not have suffix 0 },
[0] = { x | x has suffix 0, but not suffix 00 },
[00] = { x | x has suffix 00 }.
{0, 1}∗ = [ε] ∪ [0] ∪ [00], and hence, index(R_L) = 3.

Equivalence class automaton A0 for L:

[Figure: state graph of A0 with states [ε], [0], [00]; 0-edges [ε] → [0] → [00] (with a 0-loop on [00]) and 1-edges from each state back to [ε]; [00] is the accepting state.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 45 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Assume that L is accepted by a complete DFA A, and let A0 be the equivalence class automaton for L.

Then: R_A ⊆ R_L = R_{A0}, and so: |Q| ≥ |Q0| = index(R_L).

In fact: If |Q| = |Q0|, then R_A = R_L, that is, A and A0 are isomorphic.

Thus: A0 is the minimal automaton for L.
How can we determine whether a given DFA A is minimal?
A is not minimal iff ∃q, q′ ∈ Q, q ≠ q′ : ∀x ∈ Σ∗ : δ̂(q, x) ∈ F iff δ̂(q′, x) ∈ F.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 46 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Algorithm “minimal automaton”

Input: a complete DFA A = (Q, Σ, δ, q0, F).

(0) Delete all states that are not reachable from q0.
(1) Initialize a table of all pairs {q, q′} such that q ≠ q′.
(2) Mark all pairs {q, q′} for which {q, q′} ∩ F ≠ ∅ and {q, q′} ∩ (Q ∖ F) ≠ ∅.
(3) For each unmarked pair {q, q′} and each a ∈ Σ, check whether {δ(q, a), δ(q′, a)} is marked. If so, then mark {q, q′} as well.
(4) Repeat (3) until no further pairs are marked.
(5) For each unmarked pair {q, q′}, merge the states q and q′ into a joint state.
Output: The minimal automaton for L(A).
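A compact sketch of the marking procedure, steps (1)-(4) (hedged; the DFA is assumed complete and already restricted to the states reachable from q0). It returns the unmarked pairs, which step (5) then merges.

from itertools import combinations

def equivalent_pairs(states, alphabet, delta, F):
    # step (2): mark pairs with exactly one accepting state
    marked = {frozenset(p) for p in combinations(states, 2)
              if (p[0] in F) != (p[1] in F)}
    changed = True
    while changed:                                   # steps (3) and (4)
        changed = False
        for q, q2 in combinations(states, 2):
            pair = frozenset((q, q2))
            if pair in marked:
                continue
            if any(frozenset((delta[(q, a)], delta[(q2, a)])) in marked
                   for a in alphabet):
                marked.add(pair)
                changed = True
    return [p for p in map(frozenset, combinations(states, 2)) if p not in marked]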

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 47 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Example 1:

[Figure: the state graph of A (states q0, ..., q4) and the marking table produced by steps (1)-(4) of the algorithm.]
Merging the states of each unmarked pair yields the minimal DFA A0, and
L(A) = L(A0) = { x | x contains 00 as a factor }.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 48 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Example 2:

[Figure: the state graph of A (states q0, ..., q7) and the marking table; the unmarked pairs group the states into the classes {q1, q2} and {q3, q4, q5, q6}.]
Hence, merging {q1, q2} and {q3, q4, q5, q6} yields the minimal DFA A0, which reads any three letters along the chain q0 → {q1, q2} → {q3, q4, q5, q6} → q7 and then stays in the accepting state q7, that is,
L(A) = L(A0) = { w ∈ {0, 1}∗ | |w| ≥ 3 }. Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 49 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Let h : Σ∗ → ∆∗ be a morphism. Then h⁻¹ : ∆∗ → 2^{Σ∗} is defined through h⁻¹(v) := { u ∈ Σ∗ | h(u) = v }. For a language L ⊆ ∆∗, h⁻¹(L) is defined as

h⁻¹(L) := { u ∈ Σ∗ | h(u) ∈ L }.

Theorem 2.13
The language class L(DFA) is closed under inverse morphisms, that is, if h : Σ∗ → ∆∗ is a morphism and L ⊆ ∆∗ is accepted by a DFA, then h⁻¹(L) is also accepted by a DFA.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 50 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Proof of Theorem 2.13.

Let A = (Q, ∆, δ, q0, F) be a complete DFA such that L(A) = L, and let h : Σ∗ → ∆∗ be a morphism.
We construct a DFA B := (Q, Σ, δ′, q0, F) by taking

∀q ∈ Q ∀a ∈ Σ : δ′(q, a) := δ(q, h(a)).

Claim:
For all x ∈ Σ∗, δ′(q0, x) = δ(q0, h(x)).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 51 / 294 2. Regular Languages and Finite Automata 2.3 Deterministic Finite-State Automaton

Proof by induction on |x|:
If x = ε, then δ′(q0, x) = q0 = δ(q0, ε) = δ(q0, h(x)).
Assume that the claim has been proved for some n ≥ 0, let x be a word of length n, and let a ∈ Σ. Then:

δ′(q0, xa) = δ′(δ′(q0, x), a) = δ′(δ(q0, h(x)), a) (by I.H.) = δ(δ(q0, h(x)), h(a)) = δ(q0, h(x)h(a)) = δ(q0, h(xa)). □

Thus, for all x ∈ Σ∗ : x ∈ L(B) iff δ′(q0, x) ∈ F iff δ(q0, h(x)) ∈ F iff h(x) ∈ L iff x ∈ h⁻¹(L), that is, L(B) = h⁻¹(L).
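A sketch of the automaton B built in this proof (illustrative; A is assumed complete, and the morphism h is given as a dict on the letters of Σ).

def run(delta, q, w):
    # follow the word w in the complete DFA A
    for a in w:
        q = delta[(q, a)]
    return q

def inverse_morphism_dfa(delta, states, sigma, h):
    # delta'(q, a) := delta(q, h(a)); the initial state and the accepting set stay the same
    return {(q, a): run(delta, q, h[a]) for q in states for a in sigma}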

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 52 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

[Figure: nondeterminism — from state q0, on reading a, the automaton may move either to q′ or to q″.]

Definition 2.14
A nondeterministic finite-state automaton (NFA) A is given through a 5-tuple (Q, Σ, δ, S, F), where
– Q is a finite set of (internal) states,
– Σ is a finite (input) alphabet, Q ∩ Σ = ∅,
– S ⊆ Q is a set of initial states,
– F ⊆ Q is a set of final states, and
– δ : Q × Σ → 2^Q is a transition relation.
An NFA can be described by a state graph just like a DFA.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 53 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Definition 2.14 (cont.)
Let A = (Q, Σ, δ, S, F) be an NFA. The transition relation δ is extended to a function δ̂ : 2^Q × Σ∗ → 2^Q:
δ̂(P, ε) := P for all P ⊆ Q,
δ̂(P, ax) := ⋃_{q∈P} δ̂(δ(q, a), x) for all P ⊆ Q, a ∈ Σ, and x ∈ Σ∗.
L(A) := { x ∈ Σ∗ | δ̂(S, x) ∩ F ≠ ∅ } is the language accepted by A.

Remark
δ̂(S, a1a2...an) ∩ F ≠ ∅
iff ∃q0, q1, ..., qn ∈ Q : q0 ∈ S ∧ qn ∈ F ∧ qi ∈ δ(qi−1, ai) (1 ≤ i ≤ n)
iff ∃q0 ∈ S ∃qn ∈ F : in the state graph of A, there is a directed path from q0 to qn with label a1a2...an.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 54 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Example:

(a) A1: [Figure: state graph with a 0,1-loop on q0 and 0-edges q0 → q1 → q2; the initial states are q0 and q1, and q2 is the final state.]

Let x := 1100. Then:
S = {q0, q1} →(1) {q0} →(1) {q0} →(0) {q0, q1} →(0) {q0, q1, q2}.

Hence, x ∈ L(A1). In fact:
L(A1) = { x ∈ {0, 1}∗ | x = 0 or ∃y ∈ {0, 1}∗ : x = y00 }.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 55 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Example (cont.):

(b) A2: a, b b a a q q q q 0 1 2 3

∗    L(A2) = { w ∈ {a, b} | w has suffix baa}. 

A DFA for this language:

A3: a a b b a a q q q q 0 1 2 3

 b b 

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 56 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Theorem 2.15 (Rabin,Scott) From an NFA A, a DFA B can be constructed such that L(B) = L(A).

Proof.

Let A = (Q, Σ, δA, S, F) be an NFA.

Goal: A DFA B = (P, Σ, δ_B, p0, G) such that L(B) = L(A).
Power set construction:
P := 2^Q, p0 := S, G := { Q′ ⊆ Q | Q′ ∩ F ≠ ∅ },
δ_B(Q′, a) := ⋃_{q∈Q′} δ_A(q, a) = δ̂_A(Q′, a)
for all Q′ ⊆ Q and all a ∈ Σ.
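A hedged sketch of the power set construction; it builds only the subsets reachable from S (the ‘lazy’ variant recommended in the remark a few slides below). The NFA is given as a dict mapping (state, letter) to a set of states.

from collections import deque

def subset_construction(delta, S, F, alphabet):
    start = frozenset(S)
    d_B, seen, queue = {}, {start}, deque([start])
    while queue:
        P = queue.popleft()
        for a in alphabet:
            Q = frozenset(q2 for q in P for q2 in delta.get((q, a), ()))
            d_B[(P, a)] = Q
            if Q not in seen:
                seen.add(Q)
                queue.append(Q)
    accepting = {P for P in seen if P & frozenset(F)}
    return d_B, start, accepting

# the NFA A1 from the example above: S = {q0, q1}, F = {q2}
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "0"): {"q2"}}
d_B, start, acc = subset_construction(delta, {"q0", "q1"}, {"q2"}, "01")
print(frozenset({"q0", "q1", "q2"}) in acc)   # True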

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 57 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Proof of Theorem 2.15 (cont.)

Claim: L(B) = L(A).

Proof.
ε ∈ L(A) iff S ∩ F ≠ ∅ iff S ∈ G iff ε ∈ L(B).

Now let x = a1a2...an ∈ Σ+.
Then: x ∈ L(A) iff δ̂_A(S, x) ∩ F ≠ ∅
iff ∃Q1, Q2, ..., Qn ⊆ Q : δ_B(S, a1) = Q1, δ_B(Q1, a2) = Q2, ..., δ_B(Qn−1, an) = Qn and Qn ∩ F ≠ ∅
iff δ̂_B(S, x) ∈ G iff x ∈ L(B). □

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 58 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Example (cont.): A is the NFA A1 from above (a 0,1-loop on q0 and the chain q0 →(0) q1 →(0) q2).

B: [Figure: the power set automaton of A, with states such as {q0, q1}, {q0, q1, q2}, {q0}, ..., ∅ and the transitions computed by the construction.]

Remark
Only compute the subautomaton of B that is reachable from S!

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 59 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Example:
Lk := { x = x1...xn ∈ {0, 1}∗ | |x| = n ≥ k and x_{n−k+1} = 0 }

An NFA Ak with L(Ak) = Lk:
[Figure: a 0,1-loop on p0, a 0-edge from p0 to p1, and 0,1-edges p1 → p2 → ... → p_{k−1} → p_k; p0 is the initial and p_k the final state.]

Let B = (Q, {0, 1}, δ, q0, F) be a DFA such that L(B) = Lk .

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 60 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Example (cont.):
Claim: |Q| ≥ 2^k.
Proof.
Assume that Q = {q0, q1, ..., qr}, where r < 2^k − 1.
Then: ∃y1, y2 ∈ {0, 1}^k : y1 ≠ y2 and δ̂(q0, y1) = δ̂(q0, y2).

Let i be the leftmost position such that y1(i) ≠ y2(i), w.l.o.g. y1 = u0v1 and y2 = u1v2. Let w ∈ {0, 1}^{i−1} be arbitrary.

Then: y1w = u0v1w with |v1w| = k − i + i − 1 = k − 1, so y1w ∈ Lk,
and y2w = u1v2w with |v2w| = k − 1, so y2w ∉ Lk.
But: δ̂(q0, y1w) = δ̂(δ̂(q0, y1), w) = δ̂(δ̂(q0, y2), w) = δ̂(q0, y2w), a contradiction! □

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 61 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Theorem 2.16 From a right regular grammar G, one can construct an NFA A such that L(A) = L(G).

Proof.
Based on Theorem 2.5, we can first transform the grammar G into an equivalent grammar G′ = (N, T, S, P) that is in right normal form, that is, it only has productions of the form

A → aB and A → a, where A, B ∈ N and a ∈ T,

and possibly the production S → ε.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 62 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Proof of Theorem 2.16 (cont.)
We take A := (Q, T, δ, S0, F), where
– Q := N ∪ {X} (X a new symbol),
– S0 := {S},
– F := {S, X}, if (S → ε) ∈ P, and F := {X} otherwise,
– δ(A, a) := { B | (A → aB) ∈ P } ∪ { X | (A → a) ∈ P } for all A ∈ N and a ∈ T.
Then: ε ∈ L(G) iff (S → ε) ∈ P iff S ∈ F iff ε ∈ L(A).
For all n ≥ 1 : a1a2...an ∈ L(G) iff a1a2...an ∈ L(G′)
iff ∃A1, ..., An−1 ∈ N : S → a1A1 → ... → a1a2...an−1An−1 → a1...an
iff ∃A1, ..., An−1 ∈ N : A1 ∈ δ(S, a1), A2 ∈ δ(A1, a2), ..., X ∈ δ(An−1, an)
iff a1a2...an ∈ L(A). □
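A sketch of this construction (illustrative; nonterminals and terminals are single characters, productions are (A, rhs) pairs with rhs of the form aB or a, plus possibly (S, '') for S → ε).

def grammar_to_nfa(start, productions, accept="X"):
    # states: the nonterminals plus one new accepting state (called X, as in the proof)
    delta = {}
    for A, rhs in productions:
        if rhs == "":
            continue                         # S -> eps only affects the set of final states
        a, rest = rhs[0], rhs[1:]
        target = rest if rest else accept    # A -> aB moves to B, A -> a moves to X
        delta.setdefault((A, a), set()).add(target)
    finals = {accept} | ({start} if (start, "") in productions else set())
    return delta, {start}, finals

# an illustrative right-normal-form grammar for { 0(10)^i | i >= 0 }
delta, S0, F = grammar_to_nfa("S", [("S", "0A"), ("S", "0"), ("A", "1B"), ("B", "0A"), ("B", "0")])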

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 63 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Theorem 2.17 The class of languages L(DFA) is closed under the operation of taking the mirror image.

Proof.

Let A = (Q, Σ, δ, q0, F) be a DFA.
By A^R we denote the NFA A^R = (Q, Σ, δ^R, F, {q0}), where δ^R : Q × Σ → 2^Q is defined as follows:

δ^R(p, a) := { q | δ(q, a) = p } (p ∈ Q, a ∈ Σ),

that is, initial and final states are interchanged, and each transition is simply reversed. Then L(A^R) = (L(A))^R.
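A sketch of the reversal construction A^R (illustrative names):

def reverse_dfa(delta, q0, F):
    # reverse every edge; initial and final states are interchanged
    delta_r = {}
    for (q, a), p in delta.items():
        delta_r.setdefault((p, a), set()).add(q)
    return delta_r, set(F), {q0}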

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 64 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Corollary 2.18 For any language L, the following statements are equivalent: (1) L is a regular language. (2) L is generated by a right regular grammar. (3) L is generated by a left regular grammar. (4) There exists a DFA A such that L = L(A). (5) There exists an NFA B such that L = L(B).

Proof.
(2) → (5): Theorem 2.16. (5) → (4): Theorem 2.15. (4) → (2): Theorem 2.9.
(3) → (5): From a left regular grammar for L, we obtain a right regular grammar for L^R by reversing the right-hand sides of all productions. The above and Theorem 2.17 yield an NFA for (L^R)^R = L.
The remaining cases follow analogously.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 65 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Definition 2.19 A nondeterministic finite-state automaton with ε-transitions (ε-NFA) A is given through a 5-tuple A = (Q, Σ, δ, S, F), where Q, Σ, S, and F are defined as for an NFA, while the transition relation δ has the form δ : Q × (Σ ∪ {ε}) → 2Q.

Example: Let A be the ε-NFA that is given by the following state graph:

[Figure: states q0, q1, q2 with loops labelled a, b, c, respectively, and ε-edges q0 → q1 → q2; q0 is the initial and q2 the final state.]

Then L(A) = { a^i b^j c^k | i, j, k ≥ 0 }.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 66 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Theorem 2.20 From an ε-NFA A, a DFA B can be constructed such that L(B) = L(A).

Proof. Let A = (Q, Σ, δ, S, F) be an ε-NFA. For q ∈ Q, ε-closure(q) ⊆ Q denotes the set of states of A that can be reached from state q without reading any input symbol, that is,

ε-closure(q) := { p ∈ Q | ∃n ≥ 0 ∃p0, p1, ..., pn ∈ Q : p0 = q, pn = p, and pi+1 ∈ δ(pi, ε), i = 0, 1, ..., n − 1 }.
Further, for P ⊆ Q, ε-closure(P) := ⋃_{q∈P} ε-closure(q).
We define the DFA B := (2^Q, Σ, δ_B, q0, G) through a power set construction:

– q0 := S,
– G := { P ⊆ Q | ε-closure(P) ∩ F ≠ ∅ },
– δ_B(P, a) := δ(ε-closure(P), a) for all P ⊆ Q and a ∈ Σ.
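A sketch of the ε-closure computation used here (illustrative; ε-transitions are stored under the key '').

def eps_closure(delta, P):
    # all states reachable from P by epsilon-moves only
    stack, closure = list(P), set(P)
    while stack:
        q = stack.pop()
        for p in delta.get((q, ""), ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure

# the eps-NFA from the example: q0 --eps--> q1 --eps--> q2
delta = {("q0", ""): {"q1"}, ("q1", ""): {"q2"},
         ("q0", "a"): {"q0"}, ("q1", "b"): {"q1"}, ("q2", "c"): {"q2"}}
print(sorted(eps_closure(delta, {"q0"})))   # ['q0', 'q1', 'q2']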

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 67 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Proof of Theorem 2.20 (cont.)

Claim. L(B) = L(A).

Proof.

ε ∈ L(A) iff ε-closure(S) ∩ F ≠ ∅ iff q0 ∈ G iff ε ∈ L(B).
a1a2...an ∈ L(A)
iff ∃s ∈ S ∃s1, q1, p1, q2, p2, ..., qn, pn ∈ Q :
s −ε∗→ s1 −a1→ q1 −ε∗→ p1 −a2→ q2 −ε∗→ ⋯ −ε∗→ pn−1 −an→ qn −ε∗→ pn ∈ F
iff ∃Q1, Q2, ..., Qn ⊆ Q : Q1 = δ(ε-closure(S), a1) ∧ Qn ∈ G ∧ Qi+1 = δ(ε-closure(Qi), ai+1) (1 ≤ i ≤ n − 1)
iff a1a2...an ∈ L(B).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 68 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Example (cont.): The ‘lazy’ form of this construction yields the following DFA from the given ε-NFA A:
[Figure: three accepting states; the first carries an a-loop and edges labelled b and c, the second a b-loop and a c-edge, and the third a c-loop.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 69 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Theorem 2.21 The language class L(DFA) is closed under the operations of union, product, and star.

Proof.

Union: Let A_i = (Q_i, Σ, δ_i, S_i, F_i), i = 1, 2, be two NFAs such that Q1 ∩ Q2 = ∅.
Then A := (Q1 ∪ Q2, Σ, δ, S1 ∪ S2, F1 ∪ F2), where

δ(q, a) = δ1(q, a), if q ∈ Q1, and δ(q, a) = δ2(q, a), if q ∈ Q2,

accepts the language L(A) = L(A1) ∪ L(A2).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 70 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Proof of Theorem 2.21 (cont.)

Product: Let A_i = (Q_i, Σ, δ_i, S_i, F_i), i = 1, 2, be two ε-NFAs such that Q1 ∩ Q2 = ∅.

Define A := (Q1 ∪ Q2, Σ, δ, S1, F2), where

δ(q, a) = δ1(q, a) for all q ∈ Q1 and a ∈ Σ,
δ(q, ε) = δ1(q, ε) for all q ∈ Q1 ∖ F1,
δ(q, ε) = δ1(q, ε) ∪ S2 for all q ∈ F1,
δ(q, a) = δ2(q, a) for all q ∈ Q2 and a ∈ Σ ∪ {ε}.

Then A is an ε-NFA satisfying L(A) = L(A1) · L(A2).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 71 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Proof of Theorem 2.21 (cont.) Graphical representation of A:

[Figure: the ε-NFA A consists of A1 followed by A2, with ε-edges from the final states F1 of A1 to the initial states S2 of A2.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 72 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Proof of Theorem 2.21 (cont.) Kleene Star:

Let A1 = (Q1, Σ, δ1, S1, F1) be an ε-NFA.

Define A = (Q1 ∪ {q0}, Σ, δ, {q0}, F1 ∪ {q0}), where

δ(q0, ε) := S1,
δ(q, ε) := δ1(q, ε) ∪ S1 for all q ∈ F1,
δ(q, ε) := δ1(q, ε) for all q ∈ Q1 ∖ F1,
δ(q, a) := δ1(q, a) for all q ∈ Q1 and a ∈ Σ.

Then A is an ε-NFA satisfying L(A) = (L(A1))∗.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 73 / 294 2. Regular Languages and Finite Automata 2.4 Nondeterministic Finite-State Automaton

Proof of Theorem 2.21 (cont.) Graphical representation of A:

[Figure: a new initial and accepting state q0 with an ε-edge to the initial states S1 of A1, and ε-edges from the final states F1 of A1 back to S1.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 74 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Definition 2.22
A deterministic two-way finite-state automaton (2DFA) A is defined through a 7-tuple A = (Q, Σ, ▷, ◁, δ, q0, F), where
– Q, Σ, q0, and F are defined as for a DFA,
– ▷, ◁ ∉ Σ are two new letters that serve as border markers for the left and the right end of the tape, and
– δ : Q × (Σ ∪ {▷, ◁}) → Q × {L, R} is the transition function satisfying the following restrictions:

∀q, q′ ∈ Q : δ(q, ▷) ≠ (q′, L) and δ(q, ◁) ≠ (q′, R).
These restrictions ensure that A’s head cannot fall off the tape.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 75 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Example:
Let A = ({q0, q1, ..., qk, q+}, {0, 1}, ▷, ◁, δ, q0, {q+}), where δ is given through the following table (undefined entries are marked −):

δ    | ▷       | 0       | 1       | ◁
q0   | (q0, R) | (q0, R) | (q0, R) | (q1, L)
q1   | −       | (q2, L) | (q2, L) | −
q2   | −       | (q3, L) | (q3, L) | −
⋯    | ⋯       | ⋯       | ⋯       | ⋯
qk−1 | −       | (qk, L) | (qk, L) | −
qk   | −       | (q+, R) | −       | −
q+   | −       | (q+, R) | (q+, R) | −

Let k = 3: [Figure: the tape ▷ 0 1 0 1 0 ◁ with the finite-state control initially in state q0 on the left end marker.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 76 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Definition 2.22 (cont.) A configuration of the 2DFA A is described by a word of the form

q ▷ w ◁ or ▷ w1 q w2 ◁,

where q ∈ Q and w, w1, w2 ∈ Σ∗.
The transition function δ induces a computation relation ⊢_A^∗ on the set of configurations, which is the reflexive and transitive closure of the following single-step computation relation ⊢_A:

▷a1...ai−1 q ai...an ◁ ⊢_A ▷a1...ai−2 q′ ai−1...an ◁, if δ(q, ai) = (q′, L), and
▷a1...ai−1 q ai...an ◁ ⊢_A ▷a1...ai q′ ai+1...an ◁, if δ(q, ai) = (q′, R).

The initial configuration for input w ∈ Σ∗ is q0 ▷ w ◁, and a configuration of the form ▷wq◁, where q ∈ F, is an accepting configuration. W.l.o.g. we can assume that δ(q, ◁) is undefined for all q ∈ F.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 77 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Definition 2.22 (cont.)
The language L(A) accepted by A is defined as follows:
L(A) = { w ∈ Σ∗ | q0 ▷ w ◁ ⊢_A^∗ ▷wq◁ for some q ∈ F },
and L(2DFA) is the class of languages accepted by 2DFAs.
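A hedged sketch of a 2DFA run according to this definition (not from the lecture): the end markers are written '>' and '<', δ maps (state, symbol) to (state, 'L' or 'R') and is assumed to respect the end-marker restrictions, and the run is cut off after a bounded number of steps because a 2DFA may loop.

def run_2dfa(delta, q0, F, w, left=">", right="<", max_steps=10000):
    tape = left + w + right
    q, i = q0, 0                                   # the head starts on the left end marker
    for _ in range(max_steps):
        if (q, tape[i]) not in delta:              # no move defined: the run halts
            return q in F and i == len(tape) - 1   # accepting configuration  > w q <
        q, move = delta[(q, tape[i])]
        i += 1 if move == "R" else -1
    return False                                   # a looping run is rejected (cut off)

# the 2DFA with k = 3 from the example above
delta = {("q0", ">"): ("q0", "R"), ("q0", "0"): ("q0", "R"),
         ("q0", "1"): ("q0", "R"), ("q0", "<"): ("q1", "L"),
         ("q1", "0"): ("q2", "L"), ("q1", "1"): ("q2", "L"),
         ("q2", "0"): ("q3", "L"), ("q2", "1"): ("q3", "L"),
         ("q3", "0"): ("q+", "R"),
         ("q+", "0"): ("q+", "R"), ("q+", "1"): ("q+", "R")}
print(run_2dfa(delta, "q0", {"q+"}, "01010"))   # True: the third-to-last symbol is 0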

Example (cont.): Let k = 3. On input w = 01010, A executes the following computation:
q0 ▷ 01010 ◁ ⊢_A ▷ q0 01010 ◁ ⊢_A^5 ▷ 01010 q0 ◁ ⊢_A ▷ 0101 q1 0 ◁ ⊢_A ▷ 010 q2 10 ◁
⊢_A ▷ 01 q3 010 ◁ ⊢_A ▷ 010 q+ 10 ◁ ⊢_A^2 ▷ 01010 q+ ◁,
that is, A accepts on input w = 01010. In fact, it is easily seen that L(A) is the language
Lk = { x = x1...xn ∈ {0, 1}∗ | |x| = n ≥ k and x_{n−k+1} = 0 }. Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 78 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton Example:

Let A = ({q0, q1, q2, q3}, {0, 1}, ▷, ◁, δ, q0, {q1, q2, q3}), where δ is given through the following table (undefined entries are marked −):

δ  | ▷       | 0       | 1       | ◁
q0 | (q1, R) | −       | −       | −
q1 | −       | (q1, R) | (q2, R) | −
q2 | −       | (q2, R) | (q3, L) | −
q3 | −       | (q1, R) | (q3, L) | −

On input w = 101001, A executes the following computation:

q0 ▷ 101001 ◁ ⊢_A ▷ q1 101001 ◁ ⊢_A ▷ 1 q2 01001 ◁
⊢_A ▷ 10 q2 1001 ◁ ⊢_A ▷ 1 q3 01001 ◁
⊢_A ▷ 10 q1 1001 ◁ ⊢_A ▷ 101 q2 001 ◁
⊢_A^2 ▷ 10100 q2 1 ◁ ⊢_A ▷ 1010 q3 01 ◁ ⊢_A ▷ 10100 q1 1 ◁ ⊢_A ▷ 101001 q2 ◁,
that is, A accepts on input w = 101001.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 79 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

To describe the behaviour of a 2DFA A = (Q, Σ, ., /, δ, q0, F), we introduce the notion of a crossing sequence. We consider a computation of A on an input w ∈ Σ∗, observing the path taken by the head of A. Each time the head crosses the boundary between two tape squares we note down the new state underneath the corresponding boundary. In this way we obtain a sequence of states for each boundary. Such a sequence is called the crossing sequence (CS) of A on input w at the current position. Example (cont.): The 2DFA A executes the above computation on input w = 101001, which yields the following graphical representation:

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 80 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton Example (cont.):

[Figure: the computation of A on ▷101001◁, drawn as the path of the head over the tape; underneath each boundary between tape squares, the states in which the head crosses that boundary are listed.]

From left to right, the crossing sequences are (q1) after ▷, (q2) after ▷1, (q2, q3, q1) after ▷10, (q2) after ▷101, (q2) after ▷1010, (q2, q3, q1) after ▷10100, and (q2) after ▷101001.

Thus, (q1) is the CS between ▷ and 101001◁,
(q2, q3, q1) is the CS between ▷10 and 1001◁, and
(q2, q3, q1) is also the CS between ▷10100 and 1◁.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 81 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Lemma 2.23
Let (q_{i1}, ..., q_{im}) be a CS of a 2DFA A on input w ∈ Σ∗ between ▷w1 and w2◁. Then the following statements hold:

(a) In state q_{i1}, A performs a move-right step from the last letter of ▷w1 to the first letter of w2◁.

(b) In states q_{ij} with j ≡ 1 mod 2, A performs move-right steps, while in states q_{ij} with j ≡ 0 mod 2, it performs move-left steps.
(c) If w ∈ L(A), then each CS of A on input w has odd length.

(d) If w ∈ L(A), then q_{ij} ≠ q_{i(j+2r)} for all j ≥ 1 and all r ≥ 1.

(c) holds, as A starts on the left delimiter ▷ and accepts on the right delimiter ◁, and (d) holds, as q_{ij} = q_{i(j+2r)} would imply that A has reached a cycle within its computation, that is, it would not terminate.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 82 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

A sequence of states (q1,..., qk ) is called a valid CS, if k is odd, all states with even index are pairwise distinct, and also all states with odd index are pairwise distinct. If s = |Q|, then a valid CS has length at most 2s − 1. Hence, there are only finitely many valid CSs for A. Theorem 2.24 A language is regular iff it is accepted by a 2DFA, that is, REG = L(2DFA).

Proof. We construct an NFA B from a given 2DFA A such that L(A) = L(B). The states of B will correspond to the valid CSs of A. For this construction we need to be able to check whether neighbouring CSs are consistent with each other and with the actual input symbol.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 83 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Proof of Theorem 2.24 (cont.)

Let (q1, q2, ..., qk) be the left CS of a tape square containing the letter a ∈ Σ, and let (p1, p2, ..., p_ℓ) be the right CS of that tape square. We define right-matching and left-matching pairs of CSs:
1. The empty sequence left- and right-matches the empty sequence.

2. If (q3, ..., qk) right-matches (p1, ..., p_ℓ) and δ(q1, a) = (q2, L), then (q1, q2, q3, ..., qk) right-matches (p1, ..., p_ℓ).

3. If (q2, ..., qk) left-matches (p2, ..., p_ℓ) and δ(q1, a) = (p1, R), then (q1, q2, ..., qk) right-matches (p1, p2, ..., p_ℓ).

4. If (q1, ..., qk) left-matches (p3, ..., p_ℓ) and δ(p1, a) = (p2, R), then (q1, ..., qk) left-matches (p1, p2, p3, ..., p_ℓ).

5. If (q2, ..., qk) right-matches (p2, ..., p_ℓ) and δ(p1, a) = (q1, L), then (q1, q2, ..., qk) left-matches (p1, p2, ..., p_ℓ).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 84 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Example (cont.): For the 2DFA A, we consider a tape square containing the letter 1. The empty sequence left-matches the empty sequence and δ(q1, 1) = (q2, R). Hence, by (3), (q1) right-matches (q2).

As δ(q2, 1) = (q3, L), (q2, q3, q1) right-matches (q2) by (2).

Proof of Theorem 2.24 (cont.)
Let A = (Q, Σ, ▷, ◁, δ, q0, F), and let B = (Q′, Σ ∪ {▷, ◁}, δ′, {q′0}, F′) be the NFA that is defined as follows:
- Q′ consists of all valid CSs of A;
- q′0 = (q0);
- F′ consists of all valid CSs that end in a state from F;
- δ′((q1, ..., qk), a) = { (p1, ..., p_ℓ) | (p1, ..., p_ℓ) is a valid CS such that (q1, ..., qk) right-matches (p1, ..., p_ℓ) for the input letter a } for all a ∈ Σ ∪ {▷, ◁}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 85 / 294 2. Regular Languages and Finite Automata 2.5 Two-Way Finite-State Automaton

Proof of Theorem 2.24 (cont.)
While B reads the word ▷w◁ (where w ∈ Σ∗) letter by letter from left to right, it guesses valid CSs of A and checks whether the current CS, the new CS, and the current letter are compatible with each other. It can now be shown that L(B) = ▷ · L(A) · ◁. Hence, the language ▷ · L(A) · ◁ is regular, from which it can be concluded that L(A) is regular.

The 2DFA can be generalized to the nondeterministic two-way finite-state automaton (2NFA). Theorem 2.25 A language is regular iff it is accepted by a 2NFA, that is, REG = L(2NFA).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 86 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

A Moore automaton is a DFA in which an output symbol is assigned to each state. Accordingly, a Moore automaton A is given through a 6-tuple A = (Q, Σ, ∆, δ, σ, q0), where
– Q is a finite set of (internal) states,
– Σ is a finite input alphabet,
– ∆ is a finite output alphabet,
– q0 ∈ Q is the initial state,
– δ : Q × Σ → Q is the transition function, and
– σ : Q → ∆ is the output function.

For u = a1a2...an, let qi := δ(q0, a1a2...ai), i = 1, 2, ..., n, that is, on input u, A visits the sequence of states q0, q1, q2, ..., qn. During this computation A generates the output

σ(q0)σ(q1)σ(q2)...σ(qn) ∈ ∆^{n+1}.
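A small sketch of a Moore automaton run (illustrative; the transition table used below is the mod-3 automaton that the following example describes).

def moore_output(delta, sigma, q0, u):
    # output word sigma(q0) sigma(q1) ... sigma(qn) along the run on u
    q, out = q0, [sigma[q0]]
    for a in u:
        q = delta[(q, a)]
        out.append(sigma[q])
    return out

# state i means: the binary number read so far is congruent to i mod 3
delta = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 2, (1, "1"): 0, (2, "0"): 1, (2, "1"): 2}
sigma = {0: 0, 1: 1, 2: 2}
print(moore_output(delta, sigma, 0, "1011"))   # [0, 1, 2, 2, 2]; indeed 1011 = 11 and 11 mod 3 = 2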

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 87 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Example: Let A be the Moore automaton that is given by the following graph, where the output symbols are written as external markings to the various states:

[State diagram: the states q0, q1, q2 carry the outputs 0, 1, 2, respectively; on reading a bit b ∈ {0, 1}, the automaton moves from qi to qj with j ≡ 2i + b mod 3 (in particular δ(q0, 0) = q0 and δ(q0, 1) = q1).]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 88 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Example (cont.):

Claim:
If u = b1b2 · · · bn, then σ(u) = r0r1 · · · rn, where ri ≡ Σ_{j=1}^{i} bj · 2^{i−j} mod 3.
If u = bin(m), that is, m = Σ_{j=1}^{n} bj · 2^{n−j}, then the last symbol rn of the output σ(u) is just the remainder of m mod 3.

Proof.

For i = 0, 1, 2, σ(qi) = i. Hence, it suffices to prove the following:
(∗) For all u = b1b2 · · · bn, δ(q0, u) = qi, where i ≡ Σ_{j=1}^{n} bj · 2^{n−j} mod 3.

If n = 0, then u = ε, and δ(q0, u) = q0.
If n = 1, then u = b1 ∈ {0, 1}. Hence, δ(q0, u) = q0 for b1 = 0, and δ(q0, u) = q1 for b1 = 1.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 89 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Example (cont.):

Proof (cont.) Assume that the statement (∗) has been verified for some n ≥ 1, and let u = b1b2 ... bnbn+1.

Then δ(q0, u) = δ(δ(q0, b1b2 ... bn), bn+1). For i = n, the statement holds by the induction hypothesis. For index n + 1, the following can be checked by case analysis:

δ(δ(q0, b1b2 · · · bn), bn+1) = qi, where i ≡ Σ_{j=1}^{n+1} bj · 2^{n+1−j} mod 3.

This completes the proof. 2

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 90 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

A Mealy automaton is a DFA that outputs a symbol during each transition. Accordingly, a Mealy automaton A is specified by a 6-tuple A = (Q, Σ, ∆, δ, σ, q0), where Q, Σ, ∆, δ, and q0 are defined as for a Moore automaton, while σ : Q × Σ → ∆ is the output function. The output function σ can be extended to a function σ : Q × Σ∗ → ∆∗:

σ(q, ε) := ε for all q ∈ Q, σ(q, ua) := σ(q, u) · σ(δ(q, u), a) for all q ∈ Q, u ∈ Σ∗, a ∈ Σ.
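A direct iterative computation of this extended output function is sketched below in Python; the two-state automaton used as input is a hypothetical example, not the one from the lecture.

```python
# Sketch of the extended Mealy output function sigma(q, u) (illustrative only).
# delta maps (state, input symbol) -> state; sigma maps (state, input symbol) -> output symbol.

def mealy_output(delta, sigma, q0, u):
    """Compute sigma(q0, u) via sigma(q, ua) = sigma(q, u) . sigma(delta(q, u), a)."""
    q, out = q0, []
    for a in u:
        out.append(sigma[(q, a)])   # symbol emitted during the transition on a
        q = delta[(q, a)]
    return "".join(out)             # the output has length |u|

# Hypothetical example: emit "y" for input 1 and "n" for input 0.
delta = {("s", "0"): "s", ("s", "1"): "s"}
sigma = {("s", "0"): "n", ("s", "1"): "y"}
print(mealy_output(delta, sigma, "s", "0110"))    # -> "nyyn"
```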

Example: Let A be the Mealy automaton given through the following graph:

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 91 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Example (cont.):

[Transition diagram: from the initial state q0, input 0 leads to p0 with output n and input 1 leads to p1 with output n; in p0, input 0 stays in p0 with output y and input 1 leads to p1 with output n; in p1, input 1 stays in p1 with output y and input 0 leads to p0 with output n.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 92 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Example (cont.):
Let L = {0, 1}∗ · {00, 11}. On input u = b1b2 · · · bm ∈ {0, 1}^m, A produces the output v = c1c2 · · · cm ∈ {y, n}^m, where ci = y if b1b2 · · · bi ∈ L, and ci = n if b1b2 · · · bi ∉ L.

In state p0 or p1, A “stores” the latest input symbol.

Let A = (Q, Σ, ∆, δ, σ, q0) be a Moore automaton and let B = (Q′, Σ, ∆, δ′, σ′, q′0) be a Mealy automaton. For u ∈ Σ∗, let FA(u) ∈ ∆∗ and FB(u) ∈ ∆∗ be the output words that are generated by A and B on input u. Then |FA(u)| = |u| + 1 and |FB(u)| = |u|. The automata A and B are called equivalent, if

FA(u) = σ(q0) · FB(u) for all u ∈ Σ∗. Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 93 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Theorem 2.26 (a) For each Moore automaton, there exists an equivalent Mealy automaton. (b) For each Mealy automaton, there exists an equivalent Moore automaton.

Proof. As an exercise! In fact, the equivalent automata can be constructed effectively!

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 94 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output A finite-state transducer (FST) T is given through a 6-tuple

T = (Q, Σ, ∆, δ, q0, F), where Q, Σ, ∆, and q0 are defined as for a Mealy automaton, F ⊆ Q is a set of final states, and δ : D → 2^{Q×∆∗} is the transition and output function. Here D is a finite subset of Q × Σ∗, and δ associates a finite subset of Q × ∆∗ to each pair (q, u) ∈ D. Example: Let T be the following finite-state transducer:

[Transition diagram of T with states q0, q1, q2 and transitions labelled b/101, aa/0, b/ε, a/0, and ε/11.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 95 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

For u ∈ Σ∗, v ∈ ∆∗ is a possible output of T , if u admits a factorisation of the form u = u1u2 ··· un such that there are states q1, q2,..., qn ∈ Q and transitions

δ(q0, u1) ∋ (q1, v1), δ(q1, u2) ∋ (q2, v2), . . . , δ(qn−1, un) ∋ (qn, vn) such that qn ∈ F and v = v1v2 · · · vn. By T(u) ⊆ ∆∗ we denote the set of all possible outputs of T for input u. In this way T induces a (partial) mapping T : Σ∗ → 2^{∆∗}. A mapping ϕ : Σ∗ → 2^{∆∗} is called a finite transduction, if there exists a finite-state transducer T such that T(w) = ϕ(w) for all w ∈ Σ∗.
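The set T(u) can be computed by a straightforward search over partial runs. The following Python sketch assumes that the transducer has no cycle of transitions with empty input (otherwise T(u) may be infinite); the transducer at the end is a made-up example, not the one from the slides.

```python
# Sketch: compute the set T(u) of possible outputs of a finite-state transducer.
# trans is a list of tuples (q, x, p, v) meaning (p, v) in delta(q, x), with x in Sigma* and v in Delta*.

def fst_outputs(trans, q0, finals, u):
    results, seen = set(), set()
    stack = [(q0, 0, "")]                      # (current state, position in u, output produced so far)
    while stack:
        q, i, out = stack.pop()
        if (q, i, out) in seen:
            continue
        seen.add((q, i, out))
        if i == len(u) and q in finals:
            results.add(out)                   # a complete accepting run was found
        for (src, x, tgt, v) in trans:
            if src == q and u.startswith(x, i):
                stack.append((tgt, i + len(x), out + v))
    return results

# Hypothetical transducer: double every a, delete every b.
trans = [("q", "a", "q", "aa"), ("q", "b", "q", "")]
print(fst_outputs(trans, "q", {"q"}, "aba"))   # -> {'aaaa'}
```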

Let T = (Q, Σ, ∆, δ, q0, F) be an FST. For a language L ⊆ Σ∗, T(L) := ⋃_{w∈L} T(w) is the image of L w.r.t. T.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 96 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output For a language L ⊆ ∆∗, T^{−1}(L) := { u ∈ Σ∗ | T(u) ∩ L ≠ ∅ } is the preimage of L w.r.t. T. By RT ⊆ Σ∗ × ∆∗ we denote the relation

RT := { (u, v) | v ∈ T (u) }. Relations of this form are called rational relations. Example (cont.):

T(aabb) = {010111, 010110111}, T(bbba) = {101110, 101101110, 101101101110}, T(ε) = ∅, T(aaab) = ∅, T({b, ba}) = {10111, 101110}, T^{−1}({10111, 101110}) = { b^{n+1}, b^{n+1}a | n ≥ 0 }.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 97 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Theorem 2.27 (Nivat 1968) Let Σ and ∆ be finite alphabets, and let R ⊆ Σ∗ × ∆∗. The relation R is a rational relation if and only if there exist a finite alphabet Γ, a regular language L ⊆ Γ∗, and morphisms g :Γ∗ → Σ∗ and h :Γ∗ → ∆∗ such that R = { (g(w), h(w)) | w ∈ L }.

Proof. “⇒”: Let R be a rational relation, that is, ∗ R = RT = { (u, v) | u ∈ Σ and v ∈ T (u) } for some FST T = (Q, Σ, ∆, δ, q0, F). We must show that R = { (g(w), h(w)) | w ∈ L } for some L ∈ REG(Γ) and morphisms g :Γ∗ → Σ∗ and h :Γ∗ → ∆∗.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 98 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Proof of Theorem 2.27 (cont.) Let Γ := { [q, u, v, p] | (p, v) ∈ δ(q, u) }. From the definition of T it follows that Γ is a finite set, the elements of which we interpret as letters. We define morphisms g :Γ∗ → Σ∗ and h :Γ∗ → ∆∗ through g([q, u, v, p]) := u and h([q, u, v, p]) := v. Finally, let L ⊆ Γ∗ be defined as follows: [q1, u1, v1, p1][q2, u2, v2, p2] ··· [qn, un, vn, pn] ∈ L iff

1 q1 = q0,

2 pn ∈ F, and

3 for all i = 1,..., n − 1, pi = qi+1. One can easily define a DFA for this language, that is, L ∈ REG(Γ).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 99 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Proof of Theorem 2.27 (cont.) The word

[q1, u1, v1, p1][q2, u2, v2, p2] ··· [qn, un, vn, pn] ∈ L describes an accepting computation of T for input u := u1u2 ··· un producing output v := v1v2 ··· vn.

Thus, if q0 6∈ F, then

R = RT = { (u, v) | v ∈ T (u) } = { (g(w), h(w)) | w ∈ L }.

0 If q0 ∈ F, then we consider the language L := L ∪ {ε}, since then the empty computation is an accepting computation of T for input ε that produces the output ε.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 100 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Proof of Theorem 2.27 (cont.) “⇐”: Now let R = { (g(w), h(w)) | w ∈ L } for a regular language L ⊆ Γ∗. Then there is a DFA A = (Q, Γ, δ, q0, F) such that L(A) = L. Let T be the FST T = (Q, Σ, ∆, δ′, q0, F) that is obtained from A by replacing each transition q →^c q′ of A by q →^{g(c)/h(c)} q′. Then

R = { (g(w), h(w)) | w ∈ L } = { (u, v) | v ∈ T (u) }. 2

Corollary 2.28 If T is a finite transduction, then so is T −1.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 101 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Corollary 2.29 The class REG is closed under finite transductions and inverse finite transductions, that is, if T ⊆ Σ∗ × ∆∗ is a finite transduction, then the following implications hold: 1 If L ∈ REG(Σ), then T(L) ∈ REG(∆). 2 If L ∈ REG(∆), then T^{−1}(L) ∈ REG(Σ).

Proof. By Corollary 2.28 it suffices to prove (1). Let T ⊆ Σ∗ × ∆∗ be a finite transduction, and let L1 ∈ REG(Σ). We must prove that T(L1) ∈ REG(∆). By Theorem 2.27, there are an alphabet Γ, a language L ∈ REG(Γ), and morphisms g : Γ∗ → Σ∗ and h : Γ∗ → ∆∗ such that

T = { (g(w), h(w)) | w ∈ L }.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 102 / 294 2. Regular Languages and Finite Automata 2.6 Automata with Output

Proof of Corollary 2.29 (cont.) We obtain the following sequence of equivalent statements:

T(L1) = { v ∈ ∆∗ | ∃u ∈ L1 : v ∈ T(u) }

= { h(w) | w ∈ L and ∃u ∈ L1 : g(w) = u }

= { h(w) | w ∈ L and g(w) ∈ L1 }
= { h(w) | w ∈ L ∩ g^{−1}(L1) }
= h(L ∩ g^{−1}(L1)).

By Theorem 2.13, g^{−1}(L1) ∈ REG(Γ), by Theorem 2.8, REG is closed under intersection, which yields L ∩ g^{−1}(L1) ∈ REG(Γ), and by Remark 2.4(c), REG is closed under morphisms, that is, T(L1) = h(L ∩ g^{−1}(L1)) is a regular language.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 103 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions 2.7 Regular Expressions

Let Σ be an alphabet, and let Γ be the following alphabet:

Γ := Σ ∪˙ {∅, ε, +, ∗, (, )}.

The regular expressions RA(Σ) on Σ are defined as follows, where a language L(r) ⊆ Σ∗ is associated to each expression r:

(1) ∅ ∈ RA(Σ) : L(∅) := ∅, (2) ε ∈ RA(Σ) : L(ε) := {ε}, (3) ∀a ∈ Σ: a ∈ RA(Σ) : L(a) := {a}. (4) For all r, s ∈ RA(Σ), (r + s) ∈ RA(Σ) : L(r + s) := L(r) ∪ L(s). (5) For all r, s ∈ RA(Σ), (rs) ∈ RA(Σ) : L(rs) := L(r) · L(s). (6) For all r ∈ RA(Σ), (r ∗) ∈ RA(Σ) : L(r ∗) := (L(r))∗.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 104 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Example: r = (0 + 1)∗00(0 + 1)∗ : L(r) = { u ∈ {0, 1}∗ | u contains the factor 00 }. s = (0 + ε)(1 + 10)∗ : L(s) = {0, 1}∗ r L(r). t = (a + b)((a + b)(a + b))∗ : L(t) = { u ∈ {a, b}∗ | |u| ≡ 1 mod 2 }. If L = {x1, x2, ..., xm}, then L = L((x1 + (x2 + (x3 + · · · + xm) · · · ))).

Theorem 2.30

From a regular expression r, an ε-NFA Ar can be constructed such that L(Ar ) = L(r).

Proof. By induction on the number of operations used to build r:

(1) For r = ∅, we take the DFA Ar with the two states q0 and q1 and no final state.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 105 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Proof of Theorem 2.30 (cont.)

(2) For r = ε, we take the DFA Ar consisting of the single state q0, which is both initial and final.

(3) For r = a (a ∈ Σ), we take the DFA Ar with the single transition q0 →^a q1, where q0 is the initial and q1 the final state.

(4-6) By the induction hypothesis, there are ε-NFAs A1 and A2 such that L(r1) = L(A1) and L(r2) = L(A2). By Theorem 2.21, L(ε-NFA) is closed under union, product, and Kleene star. Hence, from A1 and A2, we can ∗ construct ε-NFAs for L(r1 + r2), L(r1r2), and L(r1 ).
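The inductive construction can be programmed directly. The Python sketch below builds an ε-NFA from a regular expression given as a nested tuple; the AST encoding and the helper build are illustrative choices of this sketch, not the formal construction of the lecture.

```python
# Sketch: epsilon-NFA from a regular expression AST (illustrative only).
# AST forms: ("empty",), ("eps",), ("sym", a), ("union", r, s), ("concat", r, s), ("star", r).
# An automaton is returned as (start, accept, transitions); a transition is
# (state, label, state), where label "" denotes an epsilon-move.
import itertools
_states = itertools.count()

def build(r):
    s, f = next(_states), next(_states)
    kind = r[0]
    if kind == "empty":
        return s, f, []                          # no path from s to f at all
    if kind == "eps":
        return s, f, [(s, "", f)]
    if kind == "sym":
        return s, f, [(s, r[1], f)]
    if kind in ("union", "concat"):
        s1, f1, t1 = build(r[1])
        s2, f2, t2 = build(r[2])
        if kind == "union":
            extra = [(s, "", s1), (s, "", s2), (f1, "", f), (f2, "", f)]
        else:
            extra = [(s, "", s1), (f1, "", s2), (f2, "", f)]
        return s, f, t1 + t2 + extra
    if kind == "star":
        s1, f1, t1 = build(r[1])
        return s, f, t1 + [(s, "", s1), (f1, "", s), (s, "", f)]
    raise ValueError("unknown expression: " + kind)

# (0 + 1)* 0 0  -- an epsilon-NFA for the words over {0, 1} that end in 00:
ast = ("concat", ("concat", ("star", ("union", ("sym", "0"), ("sym", "1"))), ("sym", "0")), ("sym", "0"))
start, accept, trans = build(ast)
print("start", start, "accept", accept, "transitions", len(trans))
```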

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 106 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Theorem 2.31 From a DFA A, one can construct a regular expression r such that L(A) = L(r). Proof. Let A = (Q, Σ, δ, q0, F) be a DFA for L = L(A) ⊆ Σ∗. Assume that Q = {q0, q1, q2, ..., qn}. For all i, j ∈ {0, 1, ..., n} and all k ∈ {0, 1, ..., n + 1},
R^k_{i,j} := { x ∈ Σ∗ | δ(qi, x) = qj, and for all v, w ∈ Σ+, if x = vw and δ(qi, v) = qℓ, then ℓ < k }.

[Diagram: a computation qi →^{x1} qj1 →^{x2} qj2 → · · · →^{xm} qj with all intermediate states of index < k.] Thus, x ∈ R^k_{i,j} iff the DFA A, starting in state qi and reading input x, reaches state qj, and all states qℓ encountered strictly in between satisfy the condition that ℓ < k. Then L = ⋃_{qj∈F} R^{n+1}_{0,j}. Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 107 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Proof of Theorem 2.31 (cont.)
R^0_{i,j} = { a ∈ Σ | δ(qi, a) = qj } (i ≠ j)
R^0_{i,i} = { a ∈ Σ | δ(qi, a) = qi } ∪ {ε}
Thus, the R^0_{i,j} are finite languages, that is, there exist regular expressions α^0_{i,j} for them.
R^{k+1}_{i,j} = R^k_{i,j} ∪ R^k_{i,k} (R^k_{k,k})∗ R^k_{k,j}
α^{k+1}_{i,j} := ( α^k_{i,j} + α^k_{i,k} (α^k_{k,k})∗ α^k_{k,j} )
Now L = ⋃_{qj∈F} R^{n+1}_{0,j}, that is, if F = {qi1, qi2, ..., qim}, then γ = ( α^{n+1}_{0,i1} + (α^{n+1}_{0,i2} + · · · + α^{n+1}_{0,im}) ). 2
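The recursion can be carried out mechanically. The Python sketch below produces an (unsimplified) regular-expression string from a DFA by computing the expressions α^k_{i,j}; the function name and the use of plain strings are implementation choices of this sketch.

```python
# Sketch of the R^k_{i,j} recursion: DFA -> regular expression (no simplification).

def dfa_to_regex(states, alphabet, delta, start, finals):
    n = len(states)
    idx = {q: i for i, q in enumerate(states)}
    # alpha^0_{i,j}: single letters, plus epsilon on the diagonal
    R = [[None] * n for _ in range(n)]
    for i, p in enumerate(states):
        for j, q in enumerate(states):
            letters = [a for a in alphabet if delta[(p, a)] == q]
            if i == j:
                letters.append("ε")
            R[i][j] = "(" + "+".join(letters) + ")" if letters else "∅"
    # alpha^{k+1}_{i,j} = (alpha^k_{i,j} + alpha^k_{i,k}(alpha^k_{k,k})* alpha^k_{k,j})
    for k in range(n):
        R = [["(" + R[i][j] + "+" + R[i][k] + "(" + R[k][k] + ")*" + R[k][j] + ")"
              for j in range(n)] for i in range(n)]
    exprs = [R[idx[start]][idx[q]] for q in finals]
    return "+".join(exprs) if exprs else "∅"

# The DFA of the example below (as reconstructed there):
delta = {("q0", "a"): "q1", ("q0", "b"): "q0", ("q1", "a"): "q1", ("q1", "b"): "q1"}
print(dfa_to_regex(["q0", "q1"], "ab", delta, "q0", ["q1"]))
```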

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 108 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Example: Let A be the DFA with the two states q0 (initial) and q1 (final) and the transitions δ(q0, b) = q0, δ(q0, a) = q1, δ(q1, a) = q1, δ(q1, b) = q1. We compute the sets R^k_{i,j} for i, j ∈ {0, 1} and k ∈ {0, 1, 2}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 109 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Example (cont.):

R^0_{0,0} = {ε, b}, R^0_{0,1} = {a}, R^0_{1,0} = ∅, R^0_{1,1} = {ε, a, b}
R^1_{0,0} = R^0_{0,0} ∪ R^0_{0,0} · (R^0_{0,0})∗ · R^0_{0,0} = {ε, b} ∪ {ε, b}·{ε, b}∗·{ε, b} = b∗
R^1_{0,1} = R^0_{0,1} ∪ R^0_{0,0} · (R^0_{0,0})∗ · R^0_{0,1} = {a} ∪ {ε, b}·{ε, b}∗·{a} = b∗ · a
R^1_{1,0} = R^0_{1,0} ∪ R^0_{1,0} · (R^0_{0,0})∗ · R^0_{0,0} = ∅ ∪ ∅·{ε, b}∗·{ε, b} = ∅
R^1_{1,1} = R^0_{1,1} ∪ R^0_{1,0} · (R^0_{0,0})∗ · R^0_{0,1} = {ε, a, b} ∪ ∅·{ε, b}∗·{a} = {ε, a, b}
R^2_{0,0} = R^1_{0,0} ∪ R^1_{0,1} · (R^1_{1,1})∗ · R^1_{1,0} = b∗ ∪ b∗a·{ε, a, b}∗·∅ = b∗
R^2_{0,1} = R^1_{0,1} ∪ R^1_{0,1} · (R^1_{1,1})∗ · R^1_{1,1} = b∗a ∪ b∗a·{ε, a, b}∗·{ε, a, b} = b∗a·{a, b}∗
R^2_{1,0} = R^1_{1,0} ∪ R^1_{1,1} · (R^1_{1,1})∗ · R^1_{1,0} = ∅ ∪ {ε, a, b}·{ε, a, b}∗·∅ = ∅
R^2_{1,1} = R^1_{1,1} ∪ R^1_{1,1} · (R^1_{1,1})∗ · R^1_{1,1} = {ε, a, b} ∪ {ε, a, b}·{ε, a, b}∗·{ε, a, b} = {a, b}∗
L(A) = R^2_{0,1} = b∗ · a · {a, b}∗ = { w ∈ {a, b}∗ | |w|_a ≥ 1 } 2

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 110 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Corollary 2.32 (Kleene’s Theorem) A language L is regular iff there exists a regular expression r such that L(r) = L.

A substitution ϕ : Σ∗ → 2^{∆∗} is called regular, if ϕ(a) ∈ REG(∆) for all a ∈ Σ. Corollary 2.33 The language class REG is closed under regular substitutions, that is, if L ∈ REG(Σ) and if ϕ : Σ∗ → 2^{∆∗} is a regular substitution, then ϕ(L) ∈ REG(∆).

Proof. For L ∈ REG(Σ), there exists a regular expression r ∈ RA(Σ) such that L = L(r).

For each letter a ∈ Σ, there exists a regular expression ra ∈ RA(∆) such that ϕ(a) = L(ra).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 111 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Proof of Corollary 2.33 (cont.) Let s ∈ RA(∆) be the regular expression that we obtain from r by replacing each occurrence of each letter a ∈ Σ by the expression ra.

Claim: L(s) = ϕ(L).

Proof by induction on the structure of r:

1 If r = ∅, then L = ∅ = ϕ(L) and s = ∅, that is, ϕ(L) = L(s). 2 If r = ε, then L = {ε} = ϕ(L) and s = ε.

3 If r = a ∈ Σ, then L = {a} and s = ra. Hence,

ϕ(L) = ϕ(a) = L(ra) = L(s).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 112 / 294 2. Regular Languages and Finite Automata 2.7 Regular Expressions

Proof (cont.)

4 If r = (r1 + r2), then L = L(r1) ∪ L(r2) and s = (s1 + s2). By the ind. hyp., L(si ) = ϕ(L(ri )), i = 1, 2. Hence,

ϕ(L) = ϕ(L(r1)) ∪ ϕ(L(r2)) = L(s1) ∪ L(s2) = L(s).

5 If r = (r1r2), then L = L(r1) · L(r2) and s = (s1s2). By the ind. hyp., L(si ) = ϕ(L(ri )), i = 1, 2. Hence,

ϕ(L) = ϕ(L(r1)) · ϕ(L(r2)) = L(s1) · L(s2) = L(s).

6 If r = (r1∗), then L = (L(r1))∗ and s = (s1∗). By the ind. hyp., L(s1) = ϕ(L(r1)). Hence, ϕ(L) = (ϕ(L(r1)))∗ = (L(s1))∗ = L(s).

It follows that ϕ(L) ∈ REG(∆).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 113 / 294 2. Regular Languages and Finite Automata 2.8 The Pumping Lemma 2.8 The Pumping Lemma

Theorem 2.34 (Pumping Lemma) Let L be a regular language. Then there exists a positive integer n such that each word x ∈ L of length |x| ≥ n admits a factorization of the form x = uvw that satisfies all of the following properties: (1) |v| ≥ 1, (2) |uv| ≤ n, (3) uv^i w ∈ L for all i ∈ N.

Proof.

Let A = (Q, Σ, δ, q0, F) be a DFA for L. We choose n := |Q|, and consider a word x ∈ L satisfying |x| ≥ n. The accepting computation of A on x has the form
q0 →^{x1} q1 →^{x2} q2 → · · · →^{xn−1} qn−1 →^{xn} qn →^{x′} q′ ∈ F,
where x = x1x2 · · · xn x′, x1, x2, ..., xn ∈ Σ, and x′ ∈ Σ∗.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 114 / 294 2. Regular Languages and Finite Automata 2.8 The Pumping Lemma

Proof of Theorem 2.34 (cont.)

Then there are r and s such that 0 ≤ r < s ≤ n and qr = qs. Hence, x can be written as x = uvw, where

|u| = r, 1 ≤ |v| = s − r ≤ n, δ̂(q0, u) = qr, δ̂(qr, v) = qs = qr, δ̂(qs, w) = q′ ∈ F.

It follows that δ̂(q0, uv^i w) = δ̂(qr, v^i w) = δ̂(qs, v^{i−1} w) = δ̂(qr, v^{i−1} w) = · · · = δ̂(qs, w) = q′ ∈ F, that is, uv^i w ∈ L(A) = L. 2
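The proof is constructive: from a DFA and a sufficiently long accepted word one can read off a concrete factorization. The Python sketch below does exactly this; the three-state DFA at the end is a hypothetical example.

```python
# Sketch: extract a pumping factorization x = u v w from a DFA run (illustrative only).

def pumping_split(delta, q0, x):
    seen = {q0: 0}                     # state -> position at which it was first visited
    q = q0
    for i, a in enumerate(x, start=1):
        q = delta[(q, a)]
        if q in seen:                  # q_r = q_s with 0 <= r < s <= |Q|
            r, s = seen[q], i
            return x[:r], x[r:s], x[s:]    # u, v, w with |v| >= 1 and |uv| <= |Q|
        seen[q] = i
    return None                        # |x| < |Q|: nothing needs to be pumped

# Hypothetical DFA over {a} whose three states count |x| mod 3:
delta = {("0", "a"): "1", ("1", "a"): "2", ("2", "a"): "0"}
u, v, w = pumping_split(delta, "0", "aaaaa")
print(repr(u), repr(v), repr(w))        # u = '', v = 'aaa', w = 'aa'
assert len(v) >= 1 and len(u + v) <= 3  # conditions (1) and (2) with n = 3
```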

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 115 / 294 2. Regular Languages and Finite Automata 2.8 The Pumping Lemma

Example 1: Claim: L = { a^m b^m | m ≥ 1 } is not regular.

Proof (indirect): Assume that L were regular. Then L satisfies the Pumping Lemma, that is, ∃n ∈ N+ ∀x ∈ L : |x| ≥ n ⇒ ∃x = uvw : |v| ≥ 1, |uv| ≤ n, and uv^i w ∈ L for all i ≥ 0. Consider the word x := a^n b^n : x ∈ L and |x| = 2n > n. Hence: ∃x = uvw s. t. |v| ≥ 1, |uv| ≤ n, and uv^i w ∈ L for all i ≥ 0. x = a^n b^n = uvw, where |uv| ≤ n ⇒ u = a^r, v = a^s, and w = a^{n−s−r} b^n for certain integers r, s satisfying r ≥ 0, s ≥ 1, r + s ≤ n. Thus: uv^0 w = a^r a^{n−s−r} b^n = a^{n−s} b^n ∉ L, a contradiction! As L does not satisfy the Pumping Lemma, it is not regular. 2

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 116 / 294 2. Regular Languages and Finite Automata 2.8 The Pumping Lemma

Example 2: Claim: L = { 0^m | m is a square number } is not regular.

Proof (indirect) Assume that L were regular. Let n be the constant for L from the Pumping Lemma. Consider the word x := 0^{n²} : x ∈ L and |x| = n² > n. Hence: ∃x = uvw s. t. |v| ≥ 1, |uv| ≤ n, and uv^i w ∈ L for all i ≥ 0. Now consider the word uv^2 w: n² = |uvw| < |uv^2 w| = |uvw| + |v| = n² + |v| ≤ n² + n < n² + 2n + 1 = (n + 1)², that is, |uv^2 w| is not a square number, and so, uv^2 w ∉ L. This contradiction shows that L is not regular. 2

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 117 / 294 2. Regular Languages and Finite Automata 2.8 The Pumping Lemma

Example 3: Let L = { c^m a^n b^n | m, n ≥ 0 } ∪ {a, b}∗. Claim: L satisfies the Pumping Lemma with the constant k = 1.

Proof. Let x ∈ L, where |x| ≥ k. (i) x ∈ {a, b}∗ : obvious. (ii) x = c^m a^n b^n for some m ≥ 1: Choose u := ε, v := c, w := c^{m−1} a^n b^n. Then: x = uvw, 1 ≤ |v|, |uv| = 1 ≤ k, uv^i w = c^{i+m−1} a^n b^n ∈ L for all i ≥ 0.

Claim: L is not regular.

Proof. L has infinite index, and hence, by Theorem 2.12, it is not regular.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 118 / 294 2. Regular Languages and Finite Automata 2.9 Decision Problems 2.9 Decision Problems

The membership problem for a regular language: INSTANCE : x ∈ Σ∗. QUESTION : Is x in L? This problem is solvable in time |x| using a DFA. The emptiness problem for a DFA (NFA): INSTANCE : A DFA (NFA) A. QUESTION : Is L(A) = ∅? This is decidable in time O(|A|2), as L(A) 6= ∅ iff a final state is reachable from the initial state in the graph of A.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 119 / 294 2. Regular Languages and Finite Automata 2.9 Decision Problems

The finiteness problem for a DFA (NFA): INSTANCE : A DFA (NFA) A. QUESTION : Is L(A) finite? This is decidable in polynomial time, as L(A) is infinite iff ∃q0 initial state ∃q1 ∃q2 final state: q0 →∗ q1 →+ q1 →∗ q2, which can be checked using the graph of A.

The intersection emptiness problem for regular grammars (or NFAs):

INSTANCE : Two grammars G1 and G2. QUESTION : Is L(G1) ∩ L(G2) empty? This is decidable in quadratic time: Construct an NFA (or a grammar) for L(G1) ∩ L(G2) and test for emptiness.
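If the two languages are given by DFAs, the product construction makes this test explicit. The following Python sketch searches the product automaton for a reachable pair of final states; the two DFAs at the end are hypothetical examples standing in for the grammars G1 and G2.

```python
# Sketch: intersection-emptiness test via the product automaton (illustrative only).
from collections import deque

def intersection_empty(d1, q1, F1, d2, q2, F2, alphabet):
    start = (q1, q2)
    seen, queue = {start}, deque([start])
    while queue:
        p, q = queue.popleft()
        if p in F1 and q in F2:
            return False                        # some word is accepted by both DFAs
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])      # one step in the product automaton
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True                                 # no reachable pair of final states

# Hypothetical DFAs over {a, b}: L1 = words with an even number of a's, L2 = words containing a b.
d1 = {("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"}
d2 = {("0", "a"): "0", ("0", "b"): "1", ("1", "a"): "1", ("1", "b"): "1"}
print(intersection_empty(d1, "e", {"e"}, d2, "0", {"1"}, "ab"))   # False, e.g. the word b
```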

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 120 / 294 2. Regular Languages and Finite Automata 2.9 Decision Problems

The inclusion problem for regular languages:

INSTANCE : Two regular grammars G1 and G2. QUESTION : Is L(G1) a subset of L(G2)?

Decidable, as L(G1) ⊆ L(G2) iff L(G1) ∩ (Σ∗ r L(G2)) = ∅, and a DFA for L(G1) ∩ (Σ∗ r L(G2)) can be constructed from G1 and G2.

If L(G1) and L(G2) are given through DFAs, then this problem is decidable in quadratic time. The equivalence problem for regular languages:

INSTANCE : Two regular grammars G1 and G2. QUESTION : Are L(G1) and L(G2) equal?

Decidable, as L(G1) = L(G2) iff L(G1) ⊆ L(G2) and L(G2) ⊆ L(G1).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 121 / 294 3. Context-Free Languages and Pushdown Automata

Chapter 3:

Context-Free Languages and Pushdown Automata

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 122 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars 3.1. Context-Free Grammars

A phrase-structure grammar G = (N, T, S, P) is called context-free, if ℓ ∈ N for each production (ℓ → r) ∈ P. A language L ⊆ T∗ is called context-free, if there exists a context-free grammar G satisfying L(G) = L. By CFL(Σ) we denote the class of all context-free languages over Σ, and CFL is the class of all context-free languages. Obviously, we have REG ⊆ CFL.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 123 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Let G = (N, T, S, P) be a context-free grammar, and let α0 →G α1 →G · · · →G αn be a derivation in G. Then there exist βi, γi ∈ (N ∪ T)∗ and (Ai → ri) ∈ P such that

αi = βi Ai γi and αi+1 = βi ri γi (0 ≤ i ≤ n − 1).

This derivation is called a left derivation if |βi | ≤ |βi+1| for all i, it is called a right derivation if |γi | ≤ |γi+1| for all i, ∗ and it is called a leftmost derivation if βi ∈ T for all i.

If αi → αi+1 is a step in a leftmost derivation, this is denoted as αi →lm αi+1. Remark: If α0 ∈ N and αn ∈ T∗, then each left derivation α0 → α1 → · · · → αn is necessarily leftmost.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 124 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Example:

(a) G1 := ({A}, {a}, A, {A → AA, A → a}).

(b) G2 := ({S}, {a, b}, S, {S → aSa, S → bSb, S → ε}).

(c) G3 := ({A}, {[, ], a, #, ↑}, A, {A → [ A#A ], A → [ A ↑ A ], A → a}). (a): The derivation A → AA → Aa → AAa → aAa → aaa is neither a left nor a right derivation. (b): Each derivation that starts with S is a left and a right derivation. (c): The derivation A → [ A#A ] → [ A#[ A ↑ A ]] → [ a#[ A ↑ A ]] → [ a#[ a ↑ A ]] is neither a left nor a right derivation, but the derivation A → [ A#A ] → [ a#A ] → [ a#[ A ↑ A ]] → [ a#[ a ↑ A ]] is a leftmost derivation.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 125 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

The grammar G = (N, T, S, P) is called unambiguous w.r.t. A ∈ N if there exists exactly one left derivation A →∗_P α for each α ∈ L(G, A). It is called unambiguous if it is unambiguous w.r.t. all its nonterminals. It is called ambiguous if it is not unambiguous. A context-free language is called unambiguous if it is generated by a context-free grammar G = (N, T, S, P) that is unambiguous. A context-free language L is called inherently ambiguous if it is not generated by any unambiguous grammar.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 126 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Example (cont.):

(a) G1 is not unambiguous, as A → AA → AAA → aAA → aaA → aaa and A → AA → aA → aAA → aaA → aaa are two different left derivations for aaa.

(b) G2 is trivially unambiguous.

(c) It can be shown that G3 is unambiguous.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 127 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Example: Let G = ({S, A, B}, {a, b, c}, P, S), where P := {S → aB, S → Ac, A → ab, B → bc}:

[Two syntax trees for the word abc: in one, S has the children a and B, and B has the children b and c; in the other, S has the children A and c, and A has the children a and b.] G is ambiguous. G′ := ({S}, {a, b, c}, {S → abc}, S) is unambiguous and L(G′) = L(G). The language L := { a^i b^j c^k | i = j or j = k } is inherently ambiguous.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 128 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Example: G = ({T}, {1, 2, ..., 9, +, −}, {T → T + T | T − T | 1 | 2 | · · · | 9}, T):

[Two syntax trees for the word 9 − 4 + 3: one tree groups the word as (9 − 4) + 3 and evaluates to 8, the other groups it as 9 − (4 + 3) and evaluates to 2. Hence, the word 9 − 4 + 3 has two different syntax trees.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 129 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars Syntax Trees

Let G = (V, T, P, S) be a context-free grammar, and let S = x0 → x1 → x2 → · · · → xn = x be a derivation in G for a word x ∈ L(G). We can describe this derivation through a syntax tree: Start: Introduce a root with label S. Step i: If xi−1 = uAv → urv = xi, where u, v ∈ (V ∪ T)∗ and (A → r) ∈ P, then add |r| children to the node with label A which are labelled from left to right with the symbols of r.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 130 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Example: G = ({S}, {a, b}, {S → aS, S → b}, S):

[Syntax tree for the derivation S → aS → aaS → aaaS → aaab: each S-node has the children a and S, and the lowest S-node has the single child b.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 131 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Example: Let G = ({E, F, T}, {a, ∗}, E, {E → T, T → F, T → T ∗ F, F → a}): (1.) E → T → T ∗ F → F ∗ F → a ∗ F → a ∗ a (2.) E → T → T ∗ F → T ∗ a → F ∗ a → a ∗ a

[The common syntax tree of both derivations: E has the child T; this T has the children T, ∗, and F; the inner T has the child F with child a, and the right F has the child a.]

(1) Left derivation (2) Right derivation

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 132 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Theorem 3.1 Let G = (N, T, S, P) be a context-free grammar. (a) If A ∈ N and x ∈ (N ∪ T)∗ such that A →∗ x, then there exists a syntax tree with root labelled A and leaves labelled with x. (b) If there exists a syntax tree with root labelled A ∈ N and leaves labelled with x ∈ T∗, then A →∗_lm x. (c) For each nonterminal A ∈ N and each word x ∈ L(G, A), the number of leftmost derivations of x from A coincides with the number of syntax trees with root labelled A and leaves labelled with x.

Corollary 3.2 A context-free grammar G is unambiguous if and only if there exists a unique syntax tree with root labelled A and leaves labelled x for each A ∈ N and each word x ∈ L(G, A).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 133 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

We could consider rightmost derivations instead of leftmost derivations. Theorem 3.1 also holds for rightmost derivations. Thus, for each word w ∈ L(G, A), there are as many leftmost derivations as there are rightmost derivations.

A context-free grammar G = (N, T, S, P) is called proper, if it satisfies the following conditions: (1) ∀A ∈ N : L(G, A) = { w ∈ T∗ | A →∗_P w } ≠ ∅, and (2) ∀A ∈ N ∃u, v ∈ T∗ : S →∗_P uAv. Thus, in a proper context-free grammar, there are no useless nonterminals.

Lemma 3.3 It is decidable whether a given context-free grammar G = (N, T , S, P) is proper.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 134 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Proof of Lemma 3.3. Inductively, we first determine the set of nonterminals Vterm = { A ∈ N | L(G, A) ≠ ∅ }:
V1 := { A ∈ N | ∃x ∈ T∗ : (A → x) ∈ P };
Vi+1 := Vi ∪ { A ∈ N | ∃α ∈ (Vi ∪ T)∗ : (A → α) ∈ P } (i ≥ 1).
Then Vterm = ⋃_{i≥1} Vi = V_{|N|}.
Next, we inductively determine the set of reachable nonterminals Vreach = { A ∈ N | ∃α, β ∈ (Vterm ∪ T)∗ : S →∗_P αAβ }:
U1 := { A ∈ Vterm | ∃α, β ∈ (Vterm ∪ T)∗ : (S → αAβ) ∈ P };
Ui+1 := Ui ∪ { A ∈ Vterm | ∃B ∈ Ui ∃α, β ∈ (Vterm ∪ T)∗ : (B → αAβ) ∈ P } (i ≥ 1).
Then Vreach = ⋃_{i≥1} Ui = U_{|N|}.

Now G is proper iff Vreach = Vterm = N.
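Both fixed points are easy to compute. The Python sketch below determines Vterm and Vreach for a grammar given as a dictionary from nonterminals to lists of righthand sides (each righthand side a list of symbols); this representation and the toy grammar are assumptions of the sketch.

```python
# Sketch of the computations of V_term and V_reach from the proof of Lemma 3.3.

def proper_check(productions, start):
    nonterminals = set(productions)
    vterm, changed = set(), True            # V_term: nonterminals generating some terminal word
    while changed:
        changed = False
        for A, rhss in productions.items():
            if A not in vterm and any(all(s in vterm or s not in nonterminals for s in r)
                                      for r in rhss):
                vterm.add(A)
                changed = True
    vreach = set()                          # V_reach: reachable via righthand sides over V_term and T
    frontier = {start} & vterm
    while frontier:
        vreach |= frontier
        nxt = set()
        for A in frontier:
            for r in productions[A]:
                if all(s in vterm or s not in nonterminals for s in r):
                    nxt |= {s for s in r if s in vterm}
        frontier = nxt - vreach
    return vterm, vreach, vterm == nonterminals and vreach == nonterminals

# Toy grammar: S -> aB | b, B -> bB (B generates no terminal word), C -> a (C is unreachable).
prods = {"S": [["a", "B"], ["b"]], "B": [["b", "B"]], "C": [["a"]]}
print(proper_check(prods, "S"))             # G is not proper
```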

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 135 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Lemma 3.4 From any given context-free grammar G = (N, T , S, P), one can construct a proper context-free grammar G0 = (N0, T , S, P0) such that L(G0) = L(G).

Proof. 0 Just take N = Vreach and

0 ∗ P = { (A → α) ∈ P | A ∈ Vreach and α ∈ (Vreach ∪ T ) }.

Then G0 is a proper context-free grammar and L(G0) = L(G).

A production (ℓ → r) ∈ P is called terminal if r ∈ T∗. It is called an ε-production if r = ε. If ε ∈ L(G), then G must contain at least one ε-production. In fact, it can be required that (S → ε) is the only ε-production in G.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 136 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Lemma 3.5 From a given context-free grammar G = (N, T , S, P), one can construct an equivalent context-free grammar G0 = (N0, T , S0, P0) such that S0 does not occur on the righthand side of any production in P0, and G0 contains no ε-production with the only possible exception of (S0 → ε). In fact, (S0 → ε) ∈ P0 iff ε ∈ L(G) = L(G0).

Proof. First we consider the case that ε ∉ L(G). Task: Elimination of all ε-productions.
(1.) Determine V1 := { A ∈ N | A →∗_G ε }:
V1^(1) := { A ∈ N | (A → ε) ∈ P };
V1^(i+1) := V1^(i) ∪ { A ∈ N | ∃r ∈ (V1^(i))∗ : (A → r) ∈ P }.
Then V1 = ⋃_{i≥1} V1^(i) = V1^(|N|).
(2.) Remove all ε-productions.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 137 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Proof of Lemma 3.5 (cont.)

(3.) For every production B → xAy such that A ∈ V1 and xy ≠ ε: Introduce the new production B → xy. Then G′ = (N, T, S, P′) does not contain any ε-productions and L(G′) = L(G). If ε ∈ L(G), then first apply the construction above, which yields a context-free grammar G′ = (N, T, S, P′) for L(G) r {ε} without ε-productions. Then introduce a new nonterminal S′ and the productions (S′ → ε) and (S′ → S), where S′ is taken as the new start symbol. The resulting grammar G″ = (N ∪ {S′}, T, S′, P″) generates L(G), (S′ → ε) is the only ε-production in P″, and the start symbol S′ does not occur on the righthand side of any production.
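Steps (1.) and (3.) can be implemented directly; the Python sketch below computes the set V1 of nullable nonterminals and then adds, for every production, all variants obtained by dropping nullable occurrences. The grammar format is the same dictionary representation as in the previous sketch, and the toy grammar is an assumption.

```python
# Sketch of epsilon-elimination (steps (1.) and (3.) of the proof of Lemma 3.5).
from itertools import product

def nullable(productions):
    v1, changed = set(), True
    while changed:
        changed = False
        for A, rhss in productions.items():
            if A not in v1 and any(all(s in v1 for s in r) for r in rhss):
                v1.add(A)                       # covers r = [] (an epsilon-production) as well
                changed = True
    return v1

def without_eps(productions):
    v1 = nullable(productions)
    new = {A: [] for A in productions}
    for A, rhss in productions.items():
        for r in rhss:
            positions = [i for i, s in enumerate(r) if s in v1]
            for keep in product([True, False], repeat=len(positions)):
                drop = {p for p, k in zip(positions, keep) if not k}
                short = [s for i, s in enumerate(r) if i not in drop]
                if short and short not in new[A]:
                    new[A].append(short)        # the new production A -> xy
    return new

prods = {"S": [["A", "b", "A"]], "A": [["a"], []]}    # A is nullable
print(without_eps(prods))    # S -> AbA | Ab | bA | b,  A -> a
```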

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 138 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Let G = (N, T , S, P) be a context-free grammar. A production (A → B) ∈ P, where A, B ∈ N, is called a chain rule. Example: A grammar for arithmetic expressions with brackets: G = (N, T , E, P), where N = {E, T , F}, T = {a, +, −, ∗, /, (, )}, and P contains the following productions: E → T | E + T | E − T , T → F | T ∗ F | T /F, F → a | (E). This grammar contains the chain rules (E → T ) and (T → F).

Lemma 3.6 From a given context-free grammar G = (N, T , S, P), one can construct an equivalent context-free grammar G0 that does not contain any chain rules.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 139 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Proof of Lemma 3.6. By Lemma 3.5, G does not contain any ε-productions. We define an equiv. relation ∼ on N: A ∼ B iff A →∗_P B and B →∗_P A. Let [A] = { B ∈ N | A ∼ B } denote the equivalence class of A.

For each A ∈ N, choose a unique representative A0 ∈ [A], replace all occurrences of all nonterminals B ∈ [A] within P by A0, and delete all productions of the form (A0 → A0).

Let V = {A1, A2, ..., An} be the remaining nonterminals. W.l.o.g. we can assume for each remaining chain rule (Ai → Aj) that i < j. For k = n − 1, n − 2, ..., 2, 1: If (Ak → Ak′) ∈ P (where k′ > k) and if Ak′ → x1 | x2 | ... | xm are all Ak′-productions, then replace (Ak → Ak′) by Ak → x1 | x2 | ... | xm. (By the I.H., |xi| ≥ 2 or xi ∈ T, i = 1, 2, ..., m.) The resulting grammar is equivalent to G and it does not contain any chain rules.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 140 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Example (cont.): E → T | E + T | E − T T → F | T ∗ F | T /F F → a | (E) Remove the chain rules (T → F) and (E → T ): There are no equivalent nonterminals. We order the set N through E < T < F: (T → F) is replaced by: T → a | (E) | T ∗ F | T /F, (E → T ) is replaced by: E → a | (E) | T ∗ F | T /F | E + T | E − T . Thus, the new grammar has 12 productions, while the given one had only 8.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 141 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Definition 3.7 Let T be a (terminal) alphabet, and let T̄ := { ā | a ∈ T } be a set of ‘copies’ of T such that T ∩ T̄ = ∅. The Dyck language D∗_T on T is generated by the grammar G = ({S, A}, T ∪ T̄, S, P), where P contains the following productions: P = { S → AS, S → ε, A → aSā | a ∈ T }.

The words from the language D_T := L(G, A) are called Dyck primes.

Examples: abaāb̄ā, aā, baāb̄ ∈ D_T and abb̄ābaāb̄, ε ∈ D∗_T r D_T. As T = {a, b} contains just two letters, D_T and D∗_T are denoted as D2 and D∗_2.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 142 / 294 3. Context-Free Languages and Pushdown Automata 3.1. Context-Free Grammars

Corollary 3.8

REG ⊊ CFL.

Proof. As observed before, REG ⊆ CFL. On the other hand, D∗_T ∩ { a^n ā^m | n, m ≥ 1 } = { a^n ā^n | n ≥ 1 } is not regular. The class REG is closed under intersection (Theorem 2.8). If D∗_T ∈ REG, then also { a^n ā^n | n ≥ 1 } ∈ REG, a contradiction. Thus, D∗_T ∈ CFL r REG.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 143 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars 3.2. Normal Forms of Context-Free Grammars

A context-free grammar G = (N, T, S, P) is in weak Chomsky Normal Form, if r ∈ N∗ ∪ T ∪ {ε} for each production (ℓ → r) ∈ P. The grammar G is in Chomsky Normal Form (CNF), if r ∈ N² ∪ T ∪ {ε} for each production (ℓ → r) ∈ P.

Theorem 3.9 (Chomsky 1959) Given a context-free grammar G, one can effectively construct an equivalent context-free grammar G0 that is in Chomsky Normal Form. In addition, it can be ensured that the start symbol S0 of G0 does not occur on the righthand side of any production and that G0 does not contain any ε-production apart from possibly (S0 → ε).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 144 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Proof of Theorem 3.9. From the previous subsection we recall that we can construct a context-free grammar G1 = (N1, T, S1, P1) from G such that L(G1) = L(G) and G1 contains no chain rules and satisfies the condition on ε-productions. Let N′1 := { Aa | a ∈ T } be a new alphabet of nonterminals. We take G2 := (N2, T, S2, P2), where

N2 := N1 ∪ N′1, S2 := S1, and P2 := P2,1 ∪ { Aa → a | a ∈ T }.

Here P2,1 is obtained from P1 by replacing in each righthand side r, where |r| > 1, each occurrence of a terminal symbol a ∈ T by Aa ∈ N′1. Then P2 contains three types of productions:

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 145 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Proof of Theorem 3.9 (cont.).

(1.) possibly the ε-production (S2 → ε),

(2.) the terminal productions of the form (A → b) ∈ P1, and the new terminal productions (Aa → a)(a ∈ T ), (3.) nonterminal productions of the form (A → r), where r ∈ N∗ and |r| ≥ 2.

Thus, G2 is in weak Chomsky Normal Form.

If (A → B1B2 · · · Bm) is a nonterminal production s.t. m > 2, then we introduce new nonterminals A^(1), A^(2), ..., A^(m−2), and we replace this production by the following ones:

(A → B1A^(1)), (A^(1) → B2A^(2)), ..., (A^(m−2) → Bm−1Bm).

We obtain a grammar G′ in CNF that is equivalent to G2 and therewith to G. In addition, G′ satisfies the condition on ε-productions. 2

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 146 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Example: Let G = ({S, A, B}, {a, b}, S, P), where P is the following system:

P := {S → bA | aB, A → bAA | aS | a, B → aBB | bS | b}.

Then G is a proper context-free grammar.

The first step yields the grammar G2 = (N2, {a, b}, S, P2), where N2 = {S, A, B, Aa, Ab} and

P2 = { S → AbA | AaB, A → AbAA | AaS | a, B → AaBB | AbS | b, Aa → a, Ab → b }, which is in weak CNF.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 147 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Example (cont.): From G2 we obtain the grammar G′ = (N′, {a, b}, S, P′), where N′ = {S, A, B, Aa, Ab, C1, C2} and

P′ = {S → AbA | AaB, A → AbC1 | AaS | a, C1 → AA, B → AaC2 | AbS | b, C2 → BB, Aa → a, Ab → b}.

This grammar is in CNF, and it is equivalent to G. 2

A context-free grammar G = (N, T , S, P) is in Greibach Normal Form, if r ∈ T · N∗ holds for each production (` → r) ∈ P.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 148 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Theorem 3.10 (Greibach 1965) Given a context-free grammar G such that ε 6∈ L(G), one can effectively construct an equivalent context-free grammar G0 that is in Greibach Normal Form.

Lemma 3.11

Let G = (N, T, S, P) be a context-free grammar, let (A → α1Bα2) ∈ P, and let (B → β1), (B → β2), ..., (B → βr) ∈ P be the set of all B-productions in G. Further, let G1 = (N, T, S, P1) be the grammar that is obtained from G by replacing the production (A → α1Bα2) by the productions

(A → α1β1α2), (A → α1β2α2),..., (A → α1βr α2).

Then L(G1) = L(G).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 149 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Lemma 3.12 Let G = (N, T , S, P) be a context-free grammar and let

(A → Aα1), (A → Aα2), ..., (A → Aαr) ∈ P be the set of A-productions the righthand side of which has the prefix A. Let (A → β1), (A → β2), ..., (A → βs) ∈ P be the other A-productions of G. The grammar G1 = (N ∪ {B}, T, S, P1) is obtained from G by introducing the new nonterminal B and by replacing the A-productions of G by the following productions:

(A → β1),..., (A → βs), (A → β1B),..., (A → βsB), (B → α1),..., (B → αr ), (B → α1B),..., (B → αr B).

Then L(G1) = L(G).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 150 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Proof of Theorem 3.10.

Let G be in CNF with N = {A1, A2,..., Am}.

(1.) Modify the productions such that (Ai → Aj α) ∈ P implies i < j:
FOR i := 1 TO m DO
  FOR j := 1 TO i − 1 DO
    FOR ALL (Ai → Aj α) ∈ P DO
      Let Aj → β1 | β2 | ... | βn be all Aj-productions.
      Add the productions Ai → β1α | β2α | ... | βnα and delete (Ai → Aj α)
    END;
  END;
  (2.) Remove left-recursive productions using Lemma 3.12:
  IF there is a production of the form (Ai → Ai α) THEN
    use Lemma 3.12 with the new nonterminal Bi
  END
END.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 151 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Proof of Theorem 3.10 (cont.).

(3.) The righthand side of each Am-production begins with a terminal symbol. We now enforce this also for all Ai-productions, i = m − 1, m − 2, ..., 1:
FOR i := m − 1 DOWNTO 1 DO
  FOR ALL (Ai → Aj α) ∈ P, j > i, DO
    Let Aj → β1 | β2 | ... | βn be all Aj-productions.
    Add the productions Ai → β1α | β2α | ... | βnα and delete (Ai → Aj α)
  END
END

(4.) Modify the Bi-productions (Bi → Aj α) (1 ≤ i ≤ n):
  Let Aj → β1 | β2 | ... | βn be all Aj-productions.
  Add the productions Bi → β1α | β2α | ... | βnα and delete (Bi → Aj α). 2

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 152 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Example:

Let G = ({A1, A2, A3}, {a, b}, A1, P), where

P :(A1 → A2A3), (A2 → A3A1), (A2 → b), (A3 → A1A2), (A3 → a).

Step 1: The A1- and A2-productions are already in the correct form,

(A3 → A1A2) is replaced by (A3 → A2A3A2),

(A3 → A2A3A2) is replaced by (A3 → A3A1A3A2) and (A3 → bA3A2).

Step 2: Lemma 3.12 is applied to the A3-productions:

(A3 → bA3A2), (A3 → a), (A3 → bA3A2B3), (A3 → aB3),

(B3 → A1A3A2), (B3 → A1A3A2B3).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 153 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

Example (cont.):

Step 3: All Ai -productions are brought into Greibach form:

(A3 → bA3A2), (A3 → a), (A3 → bA3A2B3), (A3 → aB3),

(A2 → bA3A2A1), (A2 → aA1), (A2 → bA3A2B3A1),

(A2 → aB3A1), (A2 → b),

(A1 → bA3A2A1A3), (A1 → aA1A3),

(A1 → bA3A2B3A1A3), (A1 → aB3A1A3), (A1 → bA3).

Step 4: The B3-productions are brought into Greibach form:

(B3 → bA3A2A1A3A3A2), (B3 → aA1A3A3A2), (B3 → bA3A2B3A1A3A3A2), (B3 → aB3A1A3A3A2),

(B3 → bA3A3A2), (B3 → bA3A2A1A3A3A2B3),

(B3 → aA1A3A3A2B3), (B3 → bA3A2B3A1A3A3A2B3),

(B3 → aB3A1A3A3A2B3), (B3 → bA3A3A2B3). While G has only 5 short productions, G0 has 24 long ones. 2

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 154 / 294 3. Context-Free Languages and Pushdown Automata 3.2. Normal Forms of Context-Free Grammars

A context-free grammar G = (N, T , S, P) is in quadratic Greibach Normal Form, if r ∈ T · N∗ and |r| ≤ 3 for each production (` → r) ∈ P, that is, r contains at most two nonterminals.

Theorem 3.13 (Rosenkrantz 1967) Given a context-free grammar G such that ε 6∈ L(G), one can effectively construct an equivalent context-free grammar G0 that is in quadratic Greibach Normal Form.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 155 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages 3.3. A Pumping Lemma for Context-Free Languages

Theorem 3.14 (Pumping Lemma: Bar-Hillel, Perles, Shamir 1961) Let L be a context-free language on Σ. Then there exists a constant k that depends on L such that each word z ∈ L, |z| ≥ k, has a factorization of the form z = uvwxy that satisfies all of the following conditions: (1) |vx| ≥ 1, (2) |vwx| ≤ k, (3) uv^i wx^i y ∈ L for all i ≥ 0.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 156 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Proof of Theorem 3.14. Let G = (N, Σ, P, S) be a context-free grammar in CNF for L − {ε}, and let n = |N|. We choose k := 2^n. Now let z ∈ L such that |z| ≥ k. We consider a syntax tree for z:

[Diagram: the syntax tree for z consists of a binary tree B, built from the nonterminal productions, whose leaves are expanded by terminal productions into the letters of z.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 157 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Proof of Theorem 3.14 (cont.)

B has |z| ≥ k = 2^n leaves. Thus, B contains a path of length ℓ ≥ n. We consider a path Pa of maximal length ℓ:

[Diagram: the syntax tree for z with the path Pa running from the root S down to a leaf.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 158 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Proof of Theorem 3.14 (cont.) Pa contains ℓ + 1 ≥ n + 1 nodes labelled with nonterminals, that is, at least one nonterminal occurs twice at the nodes of Pa.

[Diagram: the two occurrences of a repeated nonterminal A on the path Pa split z into the factors u, v, w, x, y.]

We choose the first such repetition in Pa starting from the leaf.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 159 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Proof of Theorem 3.14 (cont.) S →∗ uAy, A → BC →∗ vAx, A →∗ w. Chomsky Normal Form: |vx| ≥ 1. Height of the upper node with label A is ≤ n: |vwx| ≤ 2^n = k. Pumping: [Diagram: replacing the subtree below the upper occurrence of A by the subtree below the lower occurrence yields uv^0wx^0y = uwy; nesting the subtree of the upper A into itself yields uv^2wx^2y, and so on.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 160 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Lemma 3.15 Let B be a binary tree, each inner node of which has two sons. If B has at least 2^n leaves, then B contains a path of length at least n.

Proof by induction on n: n = 0: the number of leaves is ≥ 2^0 = 1, and B contains a path of length ≥ 0. n → n + 1: B has ≥ 2^{n+1} leaves. [Diagram: at the root of B, one of the two subtrees, B′, has ≥ 2^n leaves.] By the I.H., B′ contains a path of length ≥ n. Hence, B contains a path of length ≥ n + 1.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 161 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Example: Claim: L = { a^m b^m c^m | m ≥ 1 } is not context-free.

Proof (indirect). Assume that L is context-free. Then L satisfies the Pumping Lemma, that is, ∃k ∈ N+ ∀z ∈ L : |z| ≥ k ⇒ ∃z = uvwxy : |vx| ≥ 1, |vwx| ≤ k, and uv^i wx^i y ∈ L for all i ≥ 0. Consider the word z := a^k b^k c^k : z ∈ L and |z| = 3k ≥ k. Hence: ∃z = uvwxy s.t. vx ≠ ε, |vwx| ≤ k, and uv^i wx^i y ∈ L for all i ≥ 0. |vwx| ≤ k : |vx|_a = 0 or |vx|_c = 0 ⇒ uv^0 wx^0 y = uwy ∉ L. Contradiction! Thus, L is not context-free.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 162 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Example: The language L := { a^n b^m c^n d^m | n, m ≥ 1 } is not context-free, as for z = a^k b^k c^k d^k, no factor vwx satisfying |vwx| ≤ k can possibly contain a's and c's or b's and d's.

Example: Let L := { a^i b^j c^k d^ℓ | i, j, k, ℓ ≥ 0, and i > 0 implies j = k = ℓ }, and let n > 0 be a constant. If z = b^j c^k d^ℓ, |z| ≥ n, then we choose vwx as a factor of b^j, c^k or d^ℓ. For all m ≥ 0, uv^m wx^m y ∈ L. If z = a^i b^j c^j d^j, |z| ≥ n and i > 0, then we choose vwx as a factor of a^i, and it follows that uv^m wx^m y ∈ L for all m ≥ 0. Thus, L satisfies the Pumping Lemma, but we will see later that L is not context-free.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 163 / 294 3. Context-Free Languages and Pushdown Automata 3.3. A Pumping Lemma for Context-Free Languages

Theorem 3.16 Each context-free language over a one-letter alphabet is regular.

Proof. Let L be a context-free language over {a}, and let k be the constant from the Pumping Lemma for L: ∀z ∈ L : |z| ≥ k ⇒ ∃z = uvwxy : vx ≠ ε, |vwx| ≤ k, and uv^i wx^i y ∈ L for all i ≥ 0. However, L ⊆ {a}∗, so |z| = m implies z = a^m. Hence: ∀m ≥ k ∃n, ℓ ≥ 0 : m = n + ℓ, 1 ≤ ℓ ≤ k, and a^n a^{i·ℓ} ∈ L for all i ≥ 0.

Choose q := k! : each such ℓ divides q. Choose q′ ≥ q such that the following condition is met: ∀m ≥ q : a^m ∈ L ⇒ ∃p with q ≤ p ≤ q′ : m ≡ p mod q and a^p ∈ L. Then L = { x ∈ L | |x| < q } ∪ { a^r a^{iq} | q ≤ r ≤ q′, a^r ∈ L, i ∈ N }, which shows that L is regular.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 164 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata 3.4. Pushdown Automata

A pushdown automaton (PDA) is an ε-NFA that has an additional external memory in the form of a pushdown.

A PDA M is defined through a 7-tuple M = (Q, Σ, Γ, δ, q0, Z0, F), where – Q is a finite set of (internal) states, – Σ is a finite input alphabet, – Γ is a finite pushdown alphabet,

– q0 ∈ Q is the initial state,

– Z0 ∈ Γ is the bottom marker of the pushdown, – F ⊆ Q is the set of accepting states, and – δ : Q × (Σ ∪ {ε}) × Γ → 2^{Q×Γ∗} is the transition relation. For each q ∈ Q, a ∈ Σ ∪ {ε}, and b ∈ Γ, δ(q, a, b) is a finite subset of Q × Γ∗.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 165 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata A PDA can be picturered as follows: Input tape,

[Diagram: the input tape contains a word a1 a2 a3 a4 · · · an ∈ Σ∗ and is scanned by a read head; a finite-state control holds the current state q ∈ Q; the pushdown, accessed through a read/write head, contains a word b1 b2 · · · ∈ Γ∗ with the bottom marker Z0 at the bottom.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 166 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

A configuration of M is a triple (q, γ, w) ∈ Q × Γ∗ × Σ∗, where q is the current state, γ is the current content of the pushdown, and w is the remaining input. Here the last symbol of γ is the topmost symbol on the pushdown.

The PDA M induces a computation relation `∗M on the set CONF := Q × Γ∗ × Σ∗ of configurations, which is the reflexive and transitive closure of the following single-step relation `M :

(q, γZ , aw) `M (p, γβ, w), if (p, β) ∈ δ(q, a, Z ) , and (q, γZ , w) `M (p, γβ, w), if (p, β) ∈ δ(q, ε, Z ).
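The following Python sketch implements the single-step relation and acceptance with empty pushdown by a bounded breadth-first search over configurations; the encoding of δ as a dictionary and the toy PDA for { a^n b^n | n ≥ 0 } are assumptions of this sketch, not the example of the lecture.

```python
# Sketch: simulate a PDA and test acceptance with empty pushdown (illustrative only).
# delta maps (state, a, Z) with a in Sigma or "" to a set of pairs (state, pushed string);
# as in the definition, the topmost pushdown symbol is the LAST symbol of the pushdown word.
from collections import deque

def accepts_empty_pushdown(delta, q0, Z0, w, max_steps=100000):
    start = (q0, Z0, w)
    seen, queue, steps = {start}, deque([start]), 0
    while queue and steps < max_steps:
        steps += 1
        q, gamma, rest = queue.popleft()
        if gamma == "" and rest == "":
            return True                                # a configuration (p, eps, eps) was reached
        if gamma == "":
            continue                                   # empty pushdown: no further step possible
        Z = gamma[-1]
        moves = [("", rest)]                           # an epsilon-move ...
        if rest:
            moves.append((rest[0], rest[1:]))          # ... or read the next input letter
        for a, rest2 in moves:
            for p, beta in delta.get((q, a, Z), ()):
                conf = (p, gamma[:-1] + beta, rest2)   # replace the topmost symbol Z by beta
                if conf not in seen:
                    seen.add(conf)
                    queue.append(conf)
    return False

# Hypothetical PDA accepting { a^n b^n | n >= 0 } with empty pushdown (# is the bottom marker):
delta = {("p", "a", "#"): {("p", "#A")}, ("p", "a", "A"): {("p", "AA")},
         ("p", "", "#"): {("r", "#")},  ("p", "", "A"): {("r", "A")},
         ("r", "b", "A"): {("r", "")},  ("r", "", "#"): {("r", "")}}
print(accepts_empty_pushdown(delta, "p", "#", "aabb"))   # True
print(accepts_empty_pushdown(delta, "p", "#", "aab"))    # False
```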

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 167 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata Example:

Let M = (Q, Σ, Γ, δ, q0, Z0, F), where Q = {q0, q1, q2, q3}, F = {q3}, Σ = {a, b}, Γ = {A, B, Z0}, and let δ be given by the following table: δ(q0, a, Z0) = {(q1, Z0A)}, δ(q0, b, Z0) = {(q1, Z0B)}, δ(q0, ε, Z0) = {(q3, ε)}, δ(q1, a, Z) = {(q1, ZA)}, δ(q1, b, Z ) = {(q1, ZB)}, δ(q1, ε, Z) = {(q2, Z )}, δ(q2, a, A) = {(q2, ε)}, δ(q2, b, B) = {(q2, ε)}, δ(q2, ε, Z0) = {(q3, ε)}. On input abba, the PDA M can execute the following computation:

(q0, Z0, abba) ` (q1, Z0A, bba) ` (q1, Z0AB, ba) ` (q2, Z0AB, ba) ` (q2, Z0A, a) ` (q2, Z0, ε) ` (q3, ε, ε).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 168 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Depending on its mode of operation, a PDA M accepts one of two possible languages: L(M) denotes the language

L(M) := { w ∈ Σ∗ | (q0, Z0, w) `∗M (p, γ, ε) for some p ∈ F and γ ∈ Γ∗ }, that is, L(M) is the language that M accepts with final states, and N(M) denotes the language

N(M) := { w ∈ Σ∗ | (q0, Z0, w) `∗M (p, ε, ε) for some p ∈ Q }, that is, N(M) is the language that M accepts with empty pushdown.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 169 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Example (cont.): L(M) = N(M) = { u u^R | u ∈ {a, b}∗ }.

Theorem 3.17

For each PDA M1, there exists a PDA M2 such that L(M1) = N(M2).

Proof.

Let M1 = (Q, Σ, Γ, δ, q0, Z0, F), and let L = L(M1), that is,

L = { w ∈ Σ∗ | (q0, Z0, w) `∗M1 (p, γ, ε) for some p ∈ F and γ ∈ Γ∗ }.

The PDA M2 will simulate the PDA M1 step by step.

Essentially M2 must solve the following two problems:

- M2 must be able to recognize when M1 empties its pushdown without being in a final state, as in this situation, M2 must not accept.

- M2 must empty its pushdown when M1 accepts.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 170 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.17 (cont.)
Let M2 = (Q ∪ {qℓ, q′0}, Σ, Γ ∪ {X0}, δ′, q′0, X0, ∅) be defined as follows:
(1) δ′(q′0, ε, X0) = {(q0, X0Z0)},
(2) δ′(q, a, Z) ⊇ δ(q, a, Z) for all q ∈ Q, a ∈ Σ ∪ {ε}, and Z ∈ Γ,
(3) δ′(q, ε, Z) ∋ (qℓ, Z) for all q ∈ F and Z ∈ Γ ∪ {X0},
(4) δ′(qℓ, ε, Z) = {(qℓ, ε)} for all Z ∈ Γ ∪ {X0}.

By (1) M2 enters the initial configuration of M1, with the symbol X0 below the bottom marker of M1.

By (2) M2 simulates the computation of M1 step by step.

If and when M1 reaches a final state, then M2 empties its pushdown using (3) and (4). It follows that L(M1) ⊆ N(M2).

If M1 empties its pushdown without being in a final state, then M2 gets stuck in the corresponding configuration (q, X0, aw).

It can now be shown that N(M2) = L(M1).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 171 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Theorem 3.18

For each PDA M2, there exists a PDA M1 such that L(M1) = N(M2).

Theorem 3.19 From a given context-free grammar G, one can effectively construct a PDA M such that N(M) = L(G).

Proof. Let G = (N, Σ, S, P) be a context-free grammar. By Theorem 3.10 we can assume that G is in Greibach Normal Form. To simplify the discussion we assume that ε ∉ L(G). We define a PDA M = ({q}, Σ, N, δ, q, S, ∅) by taking δ(q, a, A) := { (q, γ^R) | (A → aγ) ∈ P } for all a ∈ Σ and A ∈ N.
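This construction is one transition per production. The Python sketch below derives the transition table of the one-state PDA from a grammar in Greibach Normal Form (same dictionary format as in the earlier sketches); the toy grammar for { a^n b^n | n ≥ 1 } is an assumption. The resulting table can be fed into the PDA simulator sketched in Section 3.4, with S as the bottom marker and acceptance with empty pushdown.

```python
# Sketch: one-state PDA from a grammar in Greibach Normal Form (illustrative only).

def gnf_to_pda(productions):
    delta = {}
    for A, rhss in productions.items():
        for r in rhss:                         # r = [a, B1, ..., Bk] with a terminal first symbol
            a, gamma = r[0], r[1:]
            delta.setdefault(("q", a, A), set()).add(("q", "".join(reversed(gamma))))
    return delta                               # delta(q, a, A) = { (q, gamma^R) | (A -> a gamma) in P }

# Toy GNF grammar for { a^n b^n | n >= 1 }:  S -> aSB | aB,  B -> b.
prods = {"S": [["a", "S", "B"], ["a", "B"]], "B": [["b"]]}
delta = gnf_to_pda(prods)
print(sorted(delta[("q", "a", "S")]))          # [('q', 'B'), ('q', 'BS')]
```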

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 172 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.19 (cont.)

Claim: ∀x ∈ Σ∗ ∀α ∈ N∗ : S →∗lm xα iff (q, S, x) `∗M (q, α^R, ε).

Proof. By induction on i, we will prove the following:
(∗) If (q, S, x) `^i_M (q, α^R, ε), then S →^i_lm xα.
If i = 0, then α = S and x = ε. Assume that (∗) has already been shown for some integer i − 1, and assume that (q, S, x) `^{i−1}_M (q, β^R, x′) `M (q, α^R, ε). As M has no ε-transitions, x = ya and x′ = a for some y ∈ Σ^{i−1} and a ∈ Σ, that is, the computation above has the form

(q, S, ya) `^{i−1}_M (q, β^R, a) `M (q, α^R, ε).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 173 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.19 (cont.)

Proof of Claim (cont.) Thus, on input y, M can execute the following computation:

(q, S, y) `^{i−1}_M (q, β^R, ε).

By our I.H. this gives a derivation S →^{i−1}_lm yβ. As (q, β^R, a) `M (q, α^R, ε), we have β^R = γ^R A for some A ∈ N and α^R = γ^R η^R for a production (A → aη) ∈ P. Hence, we obtain the following derivation in G:

S →^{i−1}_lm yβ = yAγ →lm yaηγ = xα. This proves the implication from right to left.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 174 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.19 (cont.)

Proof of Claim (cont.) Now we establish the converse implication:
(∗∗) If S →^i_lm xα, then (q, S, x) `^i_M (q, α^R, ε),
again proceeding by induction on i. If i = 0, then x = ε and α = S. As (q, S, ε) `^0_M (q, S, ε), the claim holds in this situation. Assume that the implication has already been proved for some integer i − 1, and let S →^{i−1}_lm yAγ →lm yaηγ be a derivation in G, where (A → aη) ∈ P. Then x = ya and α = ηγ. By our I.H. (q, S, y) `^{i−1}_M (q, γ^R A, ε), which yields (q, S, x) = (q, S, ya) `^{i−1}_M (q, γ^R A, a).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 175 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.19 (cont.)

Proof of Claim (cont.) Because of the transition (q, η^R) ∈ δ(q, a, A), we obtain the following computation of M:

(q, S, x) = (q, S, ya) `^{i−1}_M (q, γ^R A, a) `M (q, γ^R η^R, ε) = (q, α^R, ε).

For all x ∈ Σ+, we have the following sequence of equivalent statements:

x ∈ L(G) iff S →^i_lm x for some i ≥ 1 iff (q, S, x) `^i_M (q, ε, ε) for some i ≥ 1 iff x ∈ N(M).

Thus, N(M) = L(G) follows.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 176 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Example: (S → aBC), (B → bC), (C → cDD), (D → a) ∈ P : S → aBC → abCC → abcDDC →^2 abcaaC → abcaacDD →^2 abcaacaa. Computation of M: (q, S, abcaacaa) ` (q, CB, bcaacaa) ` (q, CC, caacaa) ` (q, CDD, aacaa) ` (q, CD, acaa) ` (q, C, caa) ` (q, DD, aa) ` (q, D, a) ` (q, ε, ε). Thus, each context-free language is accepted by a PDA that has only a single state and that uses no ε-transitions.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 177 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Theorem 3.20 The language N(M) is context-free for each PDA M.

Proof.

Let M = (Q, Σ, Γ, δ, q0, #, F) be a PDA, and let L = N(M). W.l.o.g. we can assume the following: For all (q′, B1B2 · · · Bk) ∈ δ(q, a, A), we have k ≤ 2. Otherwise, we replace (q′, B1B2 · · · Bk) ∈ δ(q, a, A) (k > 2) by the following transitions:

(q1, B1B2) ∈ δ(q, a, A), {(q2, B2B3)} = δ(q1, ε, B2), ..., {(q′, Bk−1Bk)} = δ(qk−2, ε, Bk−1), where q1, q2, ..., qk−2 are new states.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 178 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.20 (cont.) Idea: A grammar G that simulates the computations of M through leftmost derivations. Let G := (N, Σ, P, S), where N := {S} ∪ [Q × Γ × Q],

P := { S → [q0, #, q] | q ∈ Q } ∪ { [q, A, q′] → a | (q′, ε) ∈ δ(q, a, A) }
∪ { [q, A, q′] → a[q1, B, q′] | (q1, B) ∈ δ(q, a, A), q′ ∈ Q }
∪ { [q, A, q′] → a[q1, C, q2][q2, B, q′] | (q1, BC) ∈ δ(q, a, A), q′, q2 ∈ Q }.
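Generating these productions mechanically is straightforward. The Python sketch below produces the triple productions from a PDA whose transitions push at most two symbols (nonterminals are encoded as tuples); the helper name pda_to_cfg and the reuse of the toy a^n b^n PDA from the earlier sketch are assumptions of this illustration.

```python
# Sketch of the triple construction PDA -> CFG (illustrative only).
# delta maps (q, a, A) with a in Sigma or "" to a set of (p, beta) with len(beta) <= 2;
# in beta = B C, the symbol C (the last one) is the new topmost pushdown symbol.
from itertools import product

def pda_to_cfg(states, delta, q0, bottom):
    P = [("S", [("N", q0, bottom, q)]) for q in states]        # S -> [q0, #, q]
    for (q, a, A), moves in delta.items():
        word = [a] if a else []                                 # a may be the empty word
        for (q1, beta) in moves:
            if len(beta) == 0:
                P.append((("N", q, A, q1), word))
            elif len(beta) == 1:
                for qp in states:
                    P.append((("N", q, A, qp), word + [("N", q1, beta[0], qp)]))
            else:
                B, C = beta[0], beta[1]
                for qp, q2 in product(states, states):
                    P.append((("N", q, A, qp),
                              word + [("N", q1, C, q2), ("N", q2, B, qp)]))
    return P

# The toy a^n b^n PDA from the earlier sketch:
delta = {("p", "a", "#"): {("p", "#A")}, ("p", "a", "A"): {("p", "AA")},
         ("p", "", "#"): {("r", "#")},  ("p", "", "A"): {("r", "A")},
         ("r", "b", "A"): {("r", "")},  ("r", "", "#"): {("r", "")}}
print(len(pda_to_cfg(["p", "r"], delta, "p", "#")), "productions")
```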

Claim: ∀p, q ∈ Q ∀A ∈ Γ ∀x ∈ Σ∗ : [q, A, p] →∗lm x iff (q, A, x) `∗M (p, ε, ε).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 179 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.20 (cont.)

Proof of Claim. First we prove by induction on i that (q, A, x) `^i_M (p, ε, ε) implies that [q, A, p] →^i_lm x. For i = 1, if (q, A, x) `M (p, ε, ε), then (p, ε) ∈ δ(q, x, A) and x ∈ Σ ∪ {ε}.

Hence, G contains the production [q, A, p] → x, and so, [q, A, p] →lm x. Now assume that the above implication has been established for i − 1, let x = ay, and let

i−1 (q, A, ay) `M (q1, B2B1, y) `M (p, ε, ε), that is, (q1, B2B1) ∈ δ(q, a, A).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 180 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.20 (cont.)

Proof of Claim (cont.) As M can only read and replace the topmost symbol on the pushdown, it follows that y = y1y2, and that the above computation can be factored as follows:

(q1, B1, y1) `M^{k1} (q2, ε, ε) and (q2, B2, y2) `M^{k2} (p, ε, ε), where q2 ∈ Q and k1 + k2 = i − 1. From the I.H. we obtain the leftmost derivations

[q1, B1, q2] →lm^{k1} y1 and [q2, B2, p] →lm^{k2} y2.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 181 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.20 (cont.)

Proof of Claim (cont.) Because of the transition in step 1, we have

([q, A, p] → a[q1, B1, q2][q2, B2, p]) ∈ P, and therewith we obtain the following leftmost derivation in G:

[q, A, p] →lm a[q1, B1, q2][q2, B2, p] →lm^{k1} a y1 [q2, B2, p] →lm^{k2} a y1 y2 = ay = x,

that is, we have [q, A, p] →lm^i x.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 182 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.20 (cont.)

Proof of Claim (cont.) Now we prove the converse implication, again by induction on i. For i = 1, [q, A, p] → x implies that x ∈ Σ ∪ {ε} and (p, ε) ∈ δ(q, x, A), that is, (q, A, x) `M (p, ε, ε). Now assume that the implication has been proved for all step numbers smaller than i, and assume that

[q, A, p] →lm a[q1, B1, q2][q2, B2, p] →lm^{i−1} x ∈ Σ∗.

Then x = ax1x2, where [q1, B1, q2] →lm^{k1} x1, [q2, B2, p] →lm^{k2} x2, and k1 + k2 = i − 1. From the I.H. we obtain (q1, B1, x1) `M^{k1} (q2, ε, ε) and (q2, B2, x2) `M^{k2} (p, ε, ε).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 183 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.20 (cont.)

Proof of Claim (cont.) Hence, we have
(q1, B2B1, x1) `M^{k1} (q2, B2, ε).

As (q1, B2B1) ∈ δ(q, a, A), this yields the following computation:

(q, A, x) = (q, A, ax1x2) `M (q1, B2B1, x1x2) `M^{k1} (q2, B2, x2) `M^{k2} (p, ε, ε),

that is, we have (q, A, x) `M^i (p, ε, ε).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 184 / 294 3. Context-Free Languages and Pushdown Automata 3.4. Pushdown Automata

Proof of Theorem 3.20 (cont.) For all x ∈ Σ∗ we have the following equivalent statements:

x ∈ L(G) iff S →lm^∗ x
iff S → [q0, #, q] →lm^i x for some q ∈ Q and i ≥ 1
iff (q0, #, x) `M^i (q, ε, ε) for some q ∈ Q and i ≥ 1.
It follows that L(G) = N(M).

Corollary 3.21 For each language L ⊆ Σ∗, the following statements are equivalent: (1) L ∈ CFL(Σ), that is, L is generated by a context-free grammar. (2) L is generated by a context-free grammar in CNF. (3) L is generated by a context-free grammar in Greibach NF. (4) There exists a PDA M such that L = N(M). (5) There exists a PDA M such that L = L(M).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 185 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties 3.5. Closure Properties

Theorem 3.22 The class of context-free languages CFL is closed under the operations of union, product, Kleene star, and morphism.

Proof.

(a) Let G1 = (N1, Σ, P1, S1), G2 = (N2, Σ, P2, S2), N1 ∩ N2 = ∅.

For G3 := (N1 ∪ N2 ∪ {S}, Σ, P1 ∪ P2 ∪ {S → S1|S2}, S), we have L(G3) = L(G1) ∪ L(G2).

(b) For G4 := (N1 ∪ N2 ∪ {S}, Σ, P1 ∪ P2 ∪ {S → S1S2}, S), we have L(G4) = L(G1) · L(G2).

(c) W.l.o.g., S1 does not occur on the right-hand side of any production in P1. For G5 := (N1 ∪ {S}, Σ, (P1 ∪ {S → ε | S1 | SS1}) r {S1 → ε}, S), we have L(G5) = (L(G1))∗.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 186 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

Proof of Theorem 3.22 (cont.)

(d) Let h : Σ∗ → Γ∗ be a morphism. For G6 := (N1, Γ, { A → h(r) | (A → r) ∈ P1 }, S1), where h is extended to a morphism h : (N1 ∪ Σ)∗ → (N1 ∪ Γ)∗ by taking h(A) = A for all A ∈ N1, we have L(G6) = h(L(G1)).
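As an illustration of parts (a) and (b), the union and product constructions can be written down directly. This is a sketch under our own grammar encoding (a grammar is a tuple (N, Σ, P, S) with P a set of pairs (A, rhs), rhs a tuple over N ∪ Σ), and it assumes, as in the proof, that the two nonterminal sets are disjoint.

def union_grammar(G1, G2, S="S_new"):
    (N1, T1, P1, S1), (N2, T2, P2, S2) = G1, G2
    P = set(P1) | set(P2) | {(S, (S1,)), (S, (S2,))}         # S -> S1 | S2
    return (N1 | N2 | {S}, T1 | T2, P, S)

def concat_grammar(G1, G2, S="S_new"):
    (N1, T1, P1, S1), (N2, T2, P2, S2) = G1, G2
    P = set(P1) | set(P2) | {(S, (S1, S2))}                  # S -> S1 S2
    return (N1 | N2 | {S}, T1 | T2, P, S)

The constructions of parts (c) and (d) for the Kleene star and for morphisms can be written analogously.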

Theorem 3.23 The class CFL is closed neither under intersection nor under complement.

Proof. L1 := { a^i b^j c^j | i, j > 0 } ∈ CFL and L2 := { a^i b^i c^j | i, j > 0 } ∈ CFL, but L1 ∩ L2 = { a^i b^i c^i | i > 0 } ∉ CFL, that is, CFL is not closed under intersection.

L1 ∩ L2 = (L1^c ∪ L2^c)^c, that is, CFL is not closed under complement, either.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 187 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

Theorem 3.24 The class CFL is closed under intersection with regular languages, that is, if L ∈ CFL and R ∈ REG, then L ∩ R ∈ CFL.

Proof.

Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA, and let A = (P, Σ, η, p0, G) be a DFA. We define a PDA M′ := (Q × P, Σ, Γ, δ′, (q0, p0), Z0, F × G) by taking

δ′((q, p), a, A) = { ((q′, p′), α) | (q′, α) ∈ δ(q, a, A), p′ = η(p, a) },
δ′((q, p), ε, A) = { ((q′, p), α) | (q′, α) ∈ δ(q, ε, A) },
where q, q′ ∈ Q, p, p′ ∈ P, a ∈ Σ, A ∈ Γ, and α ∈ Γ∗. Then it is easily seen that L(M′) = L(M) ∩ L(A).
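A compact sketch of this product construction (our own dictionary encoding, not the slides' notation): δ maps (q, a, A), with a ∈ Σ or "" for ε, to a set of pairs (q′, α), and η is the DFA transition function given as a dictionary keyed by (p, a).

def product_delta(delta, eta, dfa_states):
    """Transition function of M' = M x A as in the proof of Theorem 3.24."""
    dprime = {}
    for (q, a, A), targets in delta.items():
        for p in dfa_states:
            if a == "":                       # ε-move: the DFA component stays put
                moves = {((q1, p), alpha) for (q1, alpha) in targets}
            else:                             # reading a: both components move
                moves = {((q1, eta[(p, a)]), alpha) for (q1, alpha) in targets}
            dprime.setdefault(((q, p), a, A), set()).update(moves)
    return dprime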

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 188 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

Theorem 3.25 Let L ⊆ Σ∗ be a context-free language, and let w ∈ Σ∗. Then the left quotient w \ L := { u ∈ Σ∗ | wu ∈ L } is a context-free language.

Proof.

Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA without ε-transitions, and let w ∈ Σ∗. Actually, we only consider the special case that w = b ∈ Σ. We define a PDA M′ := (Q ∪ {p0}, Σ, Γ, δ′, p0, Z0, F) by taking

δ′(p0, ε, Z0) = δ(q0, b, Z0) and δ′(q, a, A) = δ(q, a, A) for all q ∈ Q, a ∈ Σ, and A ∈ Γ.

Then N(M′) = b \ N(M).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 189 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

Remark: We saw in Section 3 that the language L := { a^i b^j c^k d^ℓ | i, j, k, ℓ ≥ 0, and i > 0 implies j = k = ℓ } satisfies the Pumping Lemma for context-free languages.

Claim: L is not context-free.

Proof. Assume that L is context-free. Then also the language L′ := L ∩ a · b∗ · c∗ · d∗ = { ab^n c^n d^n | n ≥ 0 } is context-free, and so is the language a \ L′ = { b^n c^n d^n | n ≥ 0 }, which, however, is not context-free by the Pumping Lemma, a contradiction! Thus, it follows that L is not context-free.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 190 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

Theorem 3.26 The class CFL is closed under inverse morphisms, that is, if L ∈ CFL(∆), and if h : Σ∗ → ∆∗ is a morphism, then h^{-1}(L) = { w ∈ Σ∗ | h(w) ∈ L } ∈ CFL(Σ).

Proof. Let h : Σ∗ → ∆∗ be a morphism, and let M = (Q, ∆, Γ, δ, q0, Z0, F) be a PDA s.t. L(M) = L. We construct a PDA M′ = (Q′, Σ, Γ, δ′, q0′, Z0, F′) for L′ := h^{-1}(L) = { w ∈ Σ∗ | h(w) ∈ L }:
– Q′ := { [q, x] | q ∈ Q, and x is a suffix of h(a) for some a ∈ Σ },
– q0′ := [q0, ε],
– F′ := { [q, ε] | q ∈ F }, and
– δ′ is defined by:

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 191 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

Proof of Theorem 3.26 (cont.) (1) δ0([q, ax], ε, A) = { ([p, x], γ) | (p, γ) ∈ δ(q, a, A) } ∪ { ([p, ax], γ) | (p, γ) ∈ δ(q, ε, A) }, (2) δ0([q, ε], a, A) = {([q, h(a)], A)} for all a ∈ Σ.

The second component of the states of M0 is used as a “buffer” for storing the word h(a) for a letter a ∈ Σ read. This word is then processed through a simulation of a computation of M. It follows that h−1(L) ⊆ L(M0). 0 Concersely, let w = a1a2 ··· an ∈ L(M ). A transition reading a letter a ∈ Σ can only be applied if and when the buffer is empty, that is, each accepting computation of M0 on input w can be written as follows:

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 192 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

Proof of Theorem 3.26 (cont.)

([q0, ε], Z0, a1a2 ··· an)
`M′^∗ ([p1, ε], α1, a1a2 ··· an)   (simulating ε-transitions of M)
`M′ ([p1, h(a1)], α1, a2 ··· an)   (h(a1) is “read”)
`M′^∗ ([p2, ε], α2, a2 ··· an)   (for (p1, α1, h(a1)) `M^∗ (p2, α2, ε))
...

`M′ ([pn, h(an)], αn, ε)   (h(an) is “read”)
`M′^∗ ([pn+1, ε], αn+1, ε)   (for (pn, αn, h(an)) `M^∗ (pn+1, αn+1, ε)).
It follows that (q0, Z0, h(a1a2 ··· an)) = (q0, Z0, h(a1)h(a2) ··· h(an)) `M^∗ (pn+1, αn+1, ε). As M′ accepts, pn+1 ∈ F, which implies that M accepts, too. Thus, L(M′) = h^{-1}(L), which yields h^{-1}(L) ∈ CFL(Σ).
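The buffer construction can also be written out as a short sketch (our own encoding, for illustration only): h is a dictionary mapping each letter of Σ to a word over ∆, and δ maps (q, a, A), with a ∈ ∆ or "" for ε, to a set of pairs (p, γ). Rule (1) handles a nonempty buffer, rule (2) refills an empty buffer.

def inverse_morphism_delta(delta, h, states, stack_alphabet):
    suffixes = {x[i:] for x in h.values() for i in range(len(x) + 1)}
    dprime = {}
    for q in states:
        for A in stack_alphabet:
            # (2): with an empty buffer, read a letter a of Sigma and load h(a).
            for a, image in h.items():
                dprime.setdefault(((q, ""), a, A), set()).add(((q, image), A))
            # (1): with buffer u = a0 x, simulate M on a0 or perform an ε-move of M.
            for u in suffixes:
                if u:
                    a0, x = u[0], u[1:]
                    moves = {((p, x), g) for (p, g) in delta.get((q, a0, A), set())}
                    moves |= {((p, u), g) for (p, g) in delta.get((q, "", A), set())}
                    if moves:
                        dprime[((q, u), "", A)] = moves
    return dprime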

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 193 / 294 3. Context-Free Languages and Pushdown Automata 3.5. Closure Properties

In analogy to Corollary 2.29 we have the following closure property. Corollary 3.27 The class CFL is closed under finite transductions, that is, if T ⊆ Σ∗ × ∆∗ is a finite transduction, and if L ∈ CFL(Σ), then T (L) ∈ CFL(∆).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 194 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems 3.6. Decision Problems

Theorem 3.28 The following Emptiness Problem is decidable in polynomial time: INSTANCE: A context-free grammar G. QUESTION: Is L(G) ≠ ∅?

Proof. Let G = (N, T, S, P) be a context-free grammar. In polynomial time we can determine the set Vterm of useful nonterminals of G (Lemma 3.3), where

Vterm = { A ∈ N | L(G, A) ≠ ∅ }.

As L(G) ≠ ∅ iff S ∈ Vterm, this yields the desired algorithm.
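A sketch of the underlying fixed-point computation (the function names are ours): a nonterminal is productive, i.e., belongs to Vterm, as soon as it has a production whose right-hand side consists only of terminals and already-productive nonterminals.

def productive_nonterminals(productions):
    """productions: iterable of (A, rhs) with rhs a string over N ∪ T."""
    nonterminals = {A for A, _ in productions}
    productive, changed = set(), True
    while changed:                      # at most |N| rounds, hence polynomial time
        changed = False
        for A, rhs in productions:
            if A not in productive and all(
                    X not in nonterminals or X in productive for X in rhs):
                productive.add(A)
                changed = True
    return productive

def language_nonempty(productions, start):
    return start in productive_nonterminals(productions)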

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 195 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Theorem 3.29 The following Finiteness Problem is decidable: INSTANCE: A context-free grammar G. QUESTION: Is L(G) a finite language?

Proof. First we transform the given grammar into a grammar G1 = (N, T, S, P) in CNF s.t. L(G1) = L(G) ∩ T^+ (Theorem 3.9).

In addition, we can assume that G1 is proper and that it contains no ε-productions.

From G1 we construct a directed graph (V , E) as follows: – V := N, that is, there exists a node for each nonterminal, and – (A → B) ∈ E iff there exists a nonterminal C such that P contains the production A → BC or A → CB (or both).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 196 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Proof of Theorem 3.29 (cont.)

Claim: The language L(G) is infinite iff the graph (V , E) contains a cycle.

Proof.

If A0, A1,..., Am, A0 is a cycle in (V , E), then we have a derivation of the following form in G1:

A0 →P α1A1β1 →P α1α2A2β2β1 →P · · · →P α1 ··· αmAmβm ··· β1

→P α1 ··· αm αm+1 A0 βm+1 ··· β1,
where αi, βi ∈ N^∗ and |αi| + |βi| = 1 for all i = 1, 2, ..., m + 1. As G1 is proper, S →P^∗ u0 A0 v0 for some u0, v0 ∈ T^∗, and αi →P^∗ ui ∈ T^∗, βi →P^∗ vi ∈ T^∗ for i = 1, 2, ..., m + 1. As G1 contains no ε-productions, it follows that |ui vi| ≥ |αi| + |βi| = 1.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 197 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Proof of Theorem 3.29 (cont.)

Proof of Claim (cont.)

Thus, we obtain the following derivations in G1:
S →P^∗ u0 A0 v0 →P^∗ u0 u1 ··· um+1 A0 vm+1 ··· v1 v0
→P^∗ u0 (u1 ··· um+1)^k A0 (vm+1 ··· v1)^k v0
→P^∗ u0 (u1 ··· um+1)^k x (vm+1 ··· v1)^k v0 ∈ T^∗
for some x ∈ T^∗ and all k ≥ 1. As |u1 ··· um+1 vm+1 ··· v1| ≥ m + 1, all these words differ from one another, that is, L(G) is infinite.
If (V, E) does not contain any cycle, then each path in each syntax tree of each derivation from S to some word v ∈ T^∗ has length at most |N| + 1. Thus, there are only finitely many syntax trees for G1, and hence, L(G) is finite.

It is decidable in time O(|N|^2) whether (V, E) contains a cycle.
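A sketch of this construction and test (our own function names; the productions are assumed to be in CNF with single-character symbols): build the edge relation and run a depth-first search for a back edge.

def grammar_has_cycle(productions):
    # Edge A -> B whenever P contains A -> BC or A -> CB for some nonterminal C.
    edges = {}
    for A, rhs in productions:
        if len(rhs) == 2:                          # a CNF production A -> BC
            edges.setdefault(A, set()).update(rhs)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}
    def dfs(A):
        colour[A] = GREY
        for B in edges.get(A, ()):
            c = colour.get(B, WHITE)
            if c == GREY or (c == WHITE and dfs(B)):
                return True                        # back edge found: cycle
        colour[A] = BLACK
        return False
    return any(colour.get(A, WHITE) == WHITE and dfs(A) for A in edges)

By the claim above, L(G) is infinite iff grammar_has_cycle applied to the productions of G1 returns True.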

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 198 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Example: Let G = ({S, A, B, C}, {a, b}, S, P), where P = {S → AB, A → BC, A → a, B → CC, B → b, C → a}. G is a proper context-free grammar in CNF without ε-productions. The directed graph (V, E) for G has the nodes S, A, B, C and the edges S → A and S → B (from S → AB), A → B and A → C (from A → BC), and B → C (from B → CC).

(V, E) does not contain any cycle, that is, L(G) is finite. In fact, it is easily seen that L(G) = {ab, a^3, a^3 b, b a^3, bab, a^5}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 199 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Theorem 3.30 The following Membership Problem is decidable in polynomial time: INSTANCE: A context-free grammar G in CNF, and a word w ∈ T ∗. QUESTION: Is w ∈ L(G)?

Proof. Let G = (N, T, S, P) be a context-free grammar in CNF, let N = {A1, A2, ..., Am}, and let S = A1. We can assume that (S → ε) is the only ε-production (if any), and that A1 does not occur on the right-hand side of any production.

Hence, ε ∈ L(G) iff (A1 → ε) ∈ P.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 200 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Proof of Theorem 3.30 (cont.)

Let w = x1x2 ··· xn, where x1, x2, ..., xn ∈ T. For all i ∈ {1, 2, ..., n} and j ∈ {1, 2, ..., n + 1 − i}, let Vi,j ⊆ N be defined as follows:
Vi,j := { A ∈ N | A →P^∗ xi xi+1 ··· xi+j−1 }.

Then w ∈ L(G) iff S = A1 ∈ V1,n.

We now compute all the sets Vi,j through the method of dynamic programming. This algorithm goes back to J. Cocke, T. Kasami, and D. Younger (1967).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 201 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Proof of Theorem 3.30 (cont.)
procedure CKY;
begin
  for i := 1 to n do
    Vi,1 := { A | (A → xi) ∈ P };
  for j := 2 to n do
    for i := 1 to n − j + 1 do
      begin
        Vi,j := ∅;
(∗)     for k := 1 to j − 1 do
          Vi,j := Vi,j ∪ { A | (A → BC) ∈ P, B ∈ Vi,k, C ∈ Vi+k,j−k }
      end
end.

As A →P^∗ xi ··· xi+j−1 iff ∃(A → BC) ∈ P s.t. B →P^∗ xi ··· xi+k−1 and C →P^∗ xi+k ··· xi+j−1 for some k, the set Vi,j is computed correctly in (∗). Obviously, this algorithm runs in polynomial time.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 202 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Example: Let G = ({S, A, B, C, D, E, F}, {a, b, c}, P, S), where
P = { S → AB, A → CD, A → CF, B → c, B → EB, C → a, D → b, E → c, F → AD }.
Let x = aaabbbcc. The sets Vi,j (empty sets omitted):

  i:      1    2    3    4    5    6    7     8
  xi:     a    a    a    b    b    b    c     c
  j = 1:  C    C    C    D    D    D    E,B   E,B
  j = 2:            A                   B
  j = 3:            F
  j = 4:       A
  j = 5:       F
  j = 6:  A
  j = 7:  S
  j = 8:  S

As S ∈ V1,8, it follows that x ∈ L(G).
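The CKY procedure is easy to run directly. The following Python sketch (our own encoding of CNF productions as pairs (A, rhs); the input word is assumed to be nonempty) reproduces the table above for the grammar and the word of the example.

def cyk(productions, start, w):
    n = len(w)
    V = {}
    for i in range(1, n + 1):                       # Vi,1 from the terminal rules
        V[i, 1] = {A for A, rhs in productions if rhs == w[i - 1]}
    for j in range(2, n + 1):
        for i in range(1, n - j + 2):
            V[i, j] = set()
            for k in range(1, j):
                V[i, j] |= {A for A, rhs in productions
                            if len(rhs) == 2
                            and rhs[0] in V[i, k] and rhs[1] in V[i + k, j - k]}
    return start in V[1, n]

# The grammar and word from the example:
P = [("S", "AB"), ("A", "CD"), ("A", "CF"), ("B", "c"), ("B", "EB"),
     ("C", "a"), ("D", "b"), ("E", "c"), ("F", "AD")]
print(cyk(P, "S", "aaabbbcc"))      # True, since S ∈ V1,8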

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 203 / 294 3. Context-Free Languages and Pushdown Automata 3.6. Decision Problems

Theorem 3.31 The following problems cannot be solved algorithmically:

INSTANCE: Two context-free grammars G1, G2.

(1.) QUESTION: Is L(G1) ∩ L(G2) = ∅ ?

(2.) QUESTION: Is |L(G1) ∩ L(G2)| = ∞ ?

(3.) QUESTION: Is L(G1) ∩ L(G2) context-free ?

(4.) QUESTION: Is L(G1) ⊆ L(G2)?

(5.) QUESTION: Is L(G1) = L(G2)?

(6.) QUESTION: Is G1 unambiguous?
(7.) QUESTION: Is L(G1)^c context-free?

(8.) QUESTION: Is L(G1) regular?

(9.) QUESTION: Is L(G1) deterministic context-free?

Proof: Later!

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 204 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages 3.7. Deterministic Context-Free Languages

A PDA M = (Q, Σ, Γ, δ, q0, Z0, F) is deterministic, that is, M is a DPDA, if in each configuration there is at most one applicable transition. This is equivalent to the following two conditions, where q ∈ Q and z ∈ Γ:
(1) For all a ∈ Σ ∪ {ε}, |δ(q, a, z)| ≤ 1.
(2) If δ(q, ε, z) ≠ ∅, then δ(q, a, z) = ∅ for all a ∈ Σ.
A language L is deterministic context-free, if there exists a DPDA M such that L = L(M). DCFL denotes the class of deterministic context-free languages.

Remark: For a DPDA M, if L0 = N(M), that is, L0 is accepted by empty pushdown, then L0 is prefix-free: for all u ∈ L0, u · Σ^+ ∩ L0 = ∅. Hence, a^+ ≠ N(M) for each DPDA M.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 205 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Example: L := { a^m b^m a^n b^n | m, n ≥ 0 } ∈ DCFL, but L ≠ N(M) for each DPDA M.

A DPDA M = (Q, Σ, Γ, δ, q0, Z0, F) is in normal form, if one of the following statements holds for each δ(q, a, z) = (p, γ):
(i) γ = ε, that is, the symbol z is popped from the pushdown,
(ii) γ = z, that is, the pushdown is not changed, or
(iii) γ = zz′ for some z′ ∈ Γ, that is, the symbol z′ is pushed onto the pushdown.

Theorem 3.32 If L ∈ DCFL(Σ), then there exists a DPDA M in normal form such that L = L(M).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 206 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Let M be a DPDA, and let u ∈ L(M). Then M reads input u completely and accepts. For v ∈ Σ∗ r L(M), M will in general not read input v completely. In fact, one of the following cases can occur before M has read v completely:
– M reaches a configuration to which no transition applies,
– M empties its pushdown completely,
– M enters an infinite computation consisting entirely of ε-transitions.

Lemma 3.33 For each DPDA M, there exists an equivalent DPDA M′ that always reads its input completely.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 207 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Proof of Lemma 3.33.

Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a DPDA in normal form.
- We introduce a new pushdown symbol X0 and a new initial state q0′ together with the transition δ(q0′, ε, Z0) = (q0, X0Z0).
- We add a new state d s.t. δ(d, a, z) = (d, z) for all a ∈ Σ and z ∈ Γ. If δ(q, a, z) ∪ δ(q, ε, z) = ∅ for some q ∈ Q, a ∈ Σ, and z ∈ Γ, then we take δ(q, a, z) = (d, z).

The resulting DPDA M1 accepts the same language as M.

If M1 does not read an input completely, this means that, starting in some state q, M1 executes an infinite sequence of ε-transitions without removing the topmost symbol z from the pushdown.
- In this situation we take δ(q, ε, z) = (d, z), provided that no final state is reached through this sequence of ε-steps. If, however, a final state is reached, then we take δ(q, ε, z) := (e, z) for a new final state e and δ(e, ε, z) := (d, z). The DPDA M′ obtained in this way has the desired property.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 208 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Remark: The construction in the proof of Lemma 3.33 is effective.

Theorem 3.34 The language class DCFL is closed under complementation, that is, for each L ∈ DCFL(Σ), L^c := (Σ∗ r L) ∈ DCFL(Σ), too.

Proof.

Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a DPDA such that L(M) = L, and assume that M always reads its input completely. Unfortunately, L^c does in general not coincide with the language accepted by the DPDA M′ = (Q, Σ, Γ, δ, q0, Z0, Q r F), as M may have a computation of the following form:

(q0, Z0, w) `M^∗ (q, α, ε) `M (q′, β, ε) for some q ∈ F and q′ ∉ F.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 209 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Proof of Theorem 3.34 (cont.) We must ensure that the DPDA M cannot execute ε-transitions in a final state.

We define a DPDA M1 = (Q1, Σ, Γ, δ1, q1, Z0, F1) as follows:

– Q1 := { [q, k] | q ∈ Q and k ∈ {1, 2, 3} },
– q1 := [q0, 1], if q0 ∈ F, and q1 := [q0, 2], if q0 ∉ F,
– F1 := { [q, 3] | q ∈ Q },
– and the transition function δ1 is defined by:
(1) δ1([q, k], ε, z) := ([p, k′], γ), if δ(q, ε, z) = (p, γ) and k ∈ {1, 2}, where k′ = 1, if k = 1 or p ∈ F, and k′ = 2, otherwise;
(2) δ1([q, 2], ε, z) := ([q, 3], z), if δ(q, a, z) = (p, γ) for some a ∈ Σ;
(3) δ1([q, 1], a, z) := ([p, k], γ) and δ1([q, 3], a, z) := ([p, k], γ), if δ(q, a, z) = (p, γ), where k = 1, if p ∈ F, and k = 2, otherwise.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 210 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Proof of Theorem 3.34 (cont.)

Claim: L(M1) = Σ∗ r L.

Proof.

Let u = a1a2 ··· an ∈ L. On input u, M executes a computation of the following form:
(q0, Z0, u) `M^∗ (p1, α, an) `M (p2, β, ε) `M^∗ (p3, γ, ε),
where p3 ∈ F, but p2 and the subsequent states before p3 are from Q r F. For M1, we have the following computation:
([q0, ·], Z0, u) `M1^∗ ([p1, ·], α, an) `M1^{≤2} ([p2, 2], β, ε) `M1^∗ ([p3, 1], γ, ε),
that is, further ε-transitions cannot lead to a final state of M1. Thus, L(M1) ⊆ Σ∗ r L.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 211 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Proof of Theorem 3.34 (cont.)

Proof of Claim (cont.)

Conversely, for u = a1a2 ··· an ∉ L, M has a computation of the form:
(q0, Z0, u) `M^∗ (p1, α, an) `M (p2, β, ε) `M^∗ (p3, γ, ε),
where p2, ..., p3 ∉ F, and in (p3, γ, ε), no further ε-transition is applicable. As M reads each input completely, there exists a letter a ∈ Σ s.t. δ(p3, a, top(γ)) is defined. Hence, M1 executes the following computation:
([q0, ·], Z0, u) `M1^∗ ([p2, 2], β, ε) `M1^∗ ([p3, 2], γ, ε) `M1 ([p3, 3], γ, ε),
where [p3, 3] ∈ F1, that is, Σ∗ r L = L(M1). This shows that L^c ∈ DCFL(Σ).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 212 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Corollary 3.35 Each language L ∈ DCFL is accepted by a DPDA that does not execute any ε-transitions in any final state.

Corollary 3.36 The class DCFL is closed under intersection with regular languages.

Proof.

Let L1 ∈ DCFL. Then there exists a DPDA M that accepts L1 and that reads each input completely.

For L2 ∈ REG, there exists a DFA A such that L(A) = L2.

From M and A, one can construct a DPDA for L1 ∩ L2, that is, L1 ∩ L2 ∈ DCFL.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 213 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Obviously, DCFL ⊆ CFL. Now we will prove that this is a proper inclusion. Let LGl := { w ¢ w^R ¢ w | w ∈ {a, b}∗ } be the so-called Gladkij language [Gladkij 1964].

Using the Pumping Lemma 3.14 it is easily shown that LGl is not context-free. Let LGl^c := ({a, b, ¢}∗ r LGl).
Lemma 3.37
LGl^c ∈ CFL r DCFL.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 214 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Proof. If LGl^c ∈ DCFL, then by Theorem 3.34, LGl ∈ DCFL. As LGl ∉ CFL, we see that LGl^c ∉ DCFL. It remains to prove that LGl^c ∈ CFL. For a word w ∈ {a, b, ¢}∗, we have w ∈ LGl^c iff one of the following conditions is met:
(1) |w|¢ ≠ 2, or
(2) w = w1 ¢ w2 ¢ w3 with w1, w2, w3 ∈ {a, b}∗ and w1 ≠ w2^R, or
(3) w = w1 ¢ w2 ¢ w3 with w1, w2, w3 ∈ {a, b}∗ and w3 ≠ w2^R.
Let H1 := { u ∈ {a, b, ¢}∗ | |u|¢ ≠ 2 },
H2 := { w1 ¢ w2 ¢ w3 | w1, w2, w3 ∈ {a, b}∗, w1 ≠ w2^R } and
H3 := { w1 ¢ w2 ¢ w3 | w1, w2, w3 ∈ {a, b}∗, w3 ≠ w2^R }.
Then H1 ∈ REG and H2, H3 ∈ CFL, implying that LGl^c = H1 ∪ H2 ∪ H3 ∈ CFL.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 215 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Corollary 3.38

DCFL ⊊ CFL.

Each DFA can be interpreted as a DPDA that does not use its pushdown. As { a^n b^n | n ≥ 1 } ∈ DCFL r REG, we obtain the following proper inclusion.

Corollary 3.39

REG ⊊ DCFL.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 216 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Theorem 3.40 (Ogden’s Lemma for DCFL (see Har78))
Let L ∈ DCFL(Σ). Then there exists a constant k that depends on L such that each word z ∈ L containing δ(z) ≥ k marked positions has a factorization z = uvwxy that satisfies all of the following properties:
(1) v ≠ ε,
(2) u v^i w x^i y ∈ L for all i ≥ 0,
(3) u, v, and w all contain marked positions, or w, x, and y all contain marked positions,
(4) δ(vwx) ≤ k,
(5) if y ≠ ε, then the following equivalence holds for all m, n ≥ 0 and all α ∈ Σ∗: u v^{m+n} w x^n α ∈ L iff u v^m w α ∈ L.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 217 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Theorem 3.41 L := { a^n b^n, a^n b^{2n} | n ≥ 1 } ∈ CFL r DCFL.

Proof. It is easily seen that L is context-free. In fact, L is the union of the two deterministic context-free languages L1 := { a^n b^n | n ≥ 1 } and L2 := { a^n b^{2n} | n ≥ 1 }. We claim that L is not deterministic context-free. Assume that L is deterministic context-free. Let k be the corresponding constant from Thm. 3.40, and let p := k!. Let z := a^p b^p ∈ L, where we mark all occurrences of b. Then z = a^p b^p has a factorization z = uvwxy that satisfies conditions (1) to (5) of the theorem. As z′ = uwy ∈ L, we have v = a^i and x = b^i for some i ∈ {1, 2, ..., k}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 218 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Proof of Theorem 3.41 (cont.) Because of (3) this implies that w, x, and y contain marked positions, that is, u = a^{n1}, v = a^i, w = a^{p−n1−i} b^j, x = b^i, and y = b^{p−i−j} for some j ∈ {1, 2, ..., k}. In fact, we have i + j < k. In particular, we have y ≠ ε. Now we choose m = 1 and n = 1, and we take α = b^{p−j+p+i}. Then
u v^{n+m} w x^n α = a^{n1} a^{2i} a^{p−n1−i} b^j b^i b^{p−j+p+i} = a^{p+i} b^{2(p+i)} ∈ L,
but
u v^m w α = a^{n1} a^i a^{p−n1−i} b^j b^{p−j+p+i} = a^p b^{2p+i} ∉ L,
a contradiction! Thus, it follows that L ∉ DCFL.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 219 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Corollary 3.42 The language class DCFL is not closed under union.

By using the same technique it can be shown that

L′ := { a^n b^n c, a^n b^{2n} d | n ≥ 1 } is not in DCFL. On the other hand, the language

L′^R = { c b^n a^n, d b^{2n} a^n | n ≥ 1 } obviously belongs to DCFL.
Corollary 3.43 The class DCFL is not closed under reversal.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 220 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Also L″ := { c a^n b^n, d a^n b^{2n} | n ≥ 1 } is in DCFL. The morphism ϕ : c ↦ ε, d ↦ ε, a ↦ a, b ↦ b maps L″ onto L.

Corollary 3.44 The class DCFL is not closed under morphisms.

Theorem 3.45 The class DCFL is closed under inverse morphisms.

Theorem 3.46 The class DCFL is not closed under product and Kleene star.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 221 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Theorem 3.47 The following problems are decidable:
(1) INSTANCE: L ∈ DCFL(Σ) and R ∈ REG(Σ). QUESTION: Is L = R?
(2) INSTANCE: L ∈ DCFL(Σ) and R ∈ REG(Σ). QUESTION: Is R ⊆ L?
(3) INSTANCE: L ∈ DCFL(Σ). QUESTION: Is L^c = ∅?
(4) INSTANCE: L ∈ DCFL(Σ). QUESTION: Is L regular?

(5) INSTANCE: L1, L2 ∈ DCFL(Σ). QUESTION: Is L1 = L2?

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 222 / 294 3. Context-Free Languages and Pushdown Automata 3.7. Deterministic Context-Free Languages

Proof. (1) Let L1 := (L ∩ R^c) ∪ (L^c ∩ R).

Then L = R iff L1 = ∅.

From a DPDA for L and a DFA for R one can construct a PDA for L1. By Theorem 3.28 it is decidable whether L1 = ∅.
(2) R ⊆ L iff L^c ∩ R = ∅. In analogy to (1) this is decidable.
(3) This is obvious, as by Theorem 3.28 emptiness of context-free languages is decidable.
(4) See (Stearns 1967). (5) See (Senizergues 1997).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 223 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages

Chapter 4:

Turing Machines and Recursively Enumerable and Context-Sensitive Languages

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 224 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines 4.1. Turing Machines

The Turing machine is a mathematical model of a computing device. Informally, it looks as follows:

A finite-state control is connected to k read/write heads, one scanning each of k two-way infinite tapes.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 225 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Formally, a Turing machine (TM) with k ≥ 1 tapes is defined through a 7-tuple M = (Q, Σ, Γ, □, δ, q0, q1), where
– Q is a finite set of (internal) states,
– Σ is a finite input alphabet,
– Γ ⊋ Σ is a finite tape alphabet,
– □ ∈ Γ r Σ is the blank symbol,
– δ : Q × Γ^k ; Q × Γ^k × {L, 0, R}^k is the transition function,

– q0 ∈ Q is the initial state, and

– q1 ∈ Q is the halting state.
A TM works in discrete steps, where each step consists of four parts:
(1) read the current symbols under the heads,
(2) write new symbols via the heads,
(3) move the heads,
(4) change the state.
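For a single tape, these four parts are easy to mimic in a few lines. The following simulator is a sketch under our own encoding (a dictionary-based transition function and a sparse tape), not a definition from the slides.

def run_tm(delta, q0, q_halt, tape, blank="_", max_steps=10_000):
    """delta: dict (state, symbol) -> (new_state, new_symbol, move),
    move in {"L", "0", "R"}. Returns the final state and tape contents."""
    cells = dict(enumerate(tape))          # sparse tape, blank everywhere else
    pos, q = 0, q0
    for _ in range(max_steps):
        if q == q_halt:
            break
        a = cells.get(pos, blank)
        if (q, a) not in delta:            # no applicable transition: halt
            break
        q, b, move = delta[(q, a)]
        cells[pos] = b                     # write, then move the head
        pos += {"L": -1, "0": 0, "R": 1}[move]
    written = "".join(cells[i] for i in sorted(cells))
    return q, written.strip(blank)

A k-tape machine can be simulated in the same way by keeping k position/tape pairs and looking δ up on the k-tuple of currently scanned symbols; for instance, the nine transitions of the +1 machine below fit directly into this dictionary format.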

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 226 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Example:

A TM M = ({q0, ..., q3}, {1, 2}, {1, 2, □}, □, δ, q0, q1) for computing the function +1 on dyadic representations (□ denotes the blank):

q0 1 → q0 1 R      q2 1 → q2 1 L      q3 1 → q2 2 L
q0 2 → q0 2 R      q2 2 → q2 2 L      q3 2 → q3 1 L
q0 □ → q3 □ L      q2 □ → q1 □ R      q3 □ → q1 1 0

read/write head ··· 2 1 2 1 1 2 1 2 2 ···

q0

finite-state control

Started in state q0 on this tape contents, M eventually halts in the halting state q1 with the dyadic representation of the successor on its tape.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 227 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Example: A TM M for deciding membership in { w ∈ {a, b}∗ | w = w^R }. On input w = abaabaaba, the TM M starts with the following tape contents:

Start: ··· a b a a b a a b a ···

Let Γ = {□, a, b} and Q = {qa, qb, qa′, qb′, q0, q1, q2, q3}, where these states are to be used as follows:

qa (qb): Remember a (b) and move right.
qa′ (qb′): Make one step to the left and test for a (b).
q0: Start and remember the first letter.

q1 : Halt.

q2 : Test was positive, return to the left.

q3 : Test was negative, move left, erase the tape, and halt.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 228 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Example (cont.): The transition function δ is defined as follows (□ denotes the blank):

q0 □ → q1 a        qa′ □ → q1 a        q2 b → q2 b L
q0 a → qa □ R      qb a → qb a R       q2 □ → q0 □ R
q0 b → qb □ R      qb b → qb b R       q3 a → q3 □ L
qa a → qa a R      qb □ → qb′ □ L      q3 b → q3 □ L
qa b → qa b R      qb′ a → q3 □ L      q3 □ → q1 b
qa □ → qa′ □ L     qb′ b → q2 □ L
qa′ a → q2 □ L     qb′ □ → q1 a
qa′ b → q3 □ L     q2 a → q2 a L

read/write head
··· □ a b b b b a □ ···

q0

finite-state control

Started in state q0 on the tape contents shown above, M repeatedly erases matching letters at both ends of the word; since abbbba is a palindrome, the computation finally halts in state q1 with the symbol a written on the otherwise empty tape.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 229 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Let M = (Q, Σ, Γ, □, δ, q0, q1) be a k-tape TM, let q ∈ Q, and let a1, a2, ..., am ∈ Σ be input symbols.

By S(q, a1a2 ··· am) we denote the configuration of M, in which M is in state q and in which the contents of the tapes and the head positions are as follows:

tape 1: ··· a1 a2 ··· am ··· ; tapes 2, ..., k: empty.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 230 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Definition 4.1

(a) Let M = (Q, Σ, Γ, □, δ, q0, q1) be a TM, Σ1 ⊆ Σ r {∗}, where ∗ is a symbol from Σ, Σ2 ⊆ Σ, and n ≥ 0. A function f : (Σ1∗)^n ; Σ2∗ is computed by M, if the following holds for all x1, x2, ..., xn ∈ Σ1∗:

(i) If f(x1, x2, ..., xn) is defined, then starting from the configuration S(q0, x1 ∗ x2 ∗ ··· ∗ xn), M will eventually reach the configuration S(q1, f(x1, x2, ..., xn)).

(ii) If f(x1, x2, ..., xn) is undefined, then starting from the configuration S(q0, x1 ∗ x2 ∗ ··· ∗ xn), M will not halt at all, or it will halt in a configuration that is not of the form S(q1, y) for any y ∈ Σ∗.
(b) A function f : (Σ1∗)^n ; Σ2∗ is called (Turing) computable, if there exists a TM that computes f.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 231 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Definition 4.1 (cont.)

(c) A function f : N^n ; N is called (Turing) computable, if the following dyadic encoding ψ : ({1, 2}∗)^n ; {1, 2}∗ of f is (Turing) computable:

ψ(x1, x2, ..., xn) := dya(f(dya^{-1}(x1), dya^{-1}(x2), ..., dya^{-1}(xn))).

(d) TM denotes the class of functions that are (Turing) computable.

Instead of the dyadic encoding one could choose an r-adic encoding for any r > 2. As the functions dya ◦ ad_r^{-1} : {1, 2, ..., r}∗ → {1, 2}∗ and ad_r ◦ dya^{-1} : {1, 2}∗ → {1, 2, ..., r}∗ are (Turing) computable, the same class TM would be obtained.
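For concreteness, here is a sketch of the dyadic encoding and its inverse (function names are ours; the conventions dya(0) = ε and most-significant digit first are assumptions).

def dya(n):
    """Dyadic (bijective base-2) representation of n over the digits 1, 2."""
    digits = []
    while n > 0:
        d = 2 if n % 2 == 0 else 1
        digits.append(str(d))
        n = (n - d) // 2
    return "".join(reversed(digits))

def dya_inv(w):
    """Inverse of dya: the number represented by a word over {1, 2}."""
    n = 0
    for c in w:
        n = 2 * n + int(c)
    return n

assert all(dya_inv(dya(n)) == n for n in range(1000))
assert dya(5) == "21" and dya_inv("12") == 4

An r-adic encoding ad_r is obtained in the same way with the digits 1, ..., r.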

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 232 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Example (cont.): The function S(x) := x + 1 is (Turing) computable.

A 1-Turing machine (1-TM) is a TM with a single tape that is unbounded to the right only:

q

a1 a2 ··· am ··· Half tape

A transition qa1 → q′bL, applied on the leftmost square a1, has the same effect as qa1 → q′b0.

The global situation above is denoted by S(q, a1a2 ··· am). Using 1-TMs, we obtain the class 1-TM of functions on N that are computable by 1-TMs.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 233 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Lemma 4.2 Let f be an n-ary function that is (1-)Turing computable. Then there exists a (1-)TM M = (Q, Σ, Γ, □, δ, q0, q1) that computes f and that satisfies the following condition: for all x1, x2, ..., xn ∈ Σ1∗, M halts eventually starting from the configuration S(q0, x1 ∗ x2 ∗ ··· ∗ xn) if and only if f(x1, x2, ..., xn) is defined.

Theorem 4.3 TM ⊆ 1-TM.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 234 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.3.

Let M = (Q, Σ, Γ, □, δ, q0, q1) be a TM with k tapes that computes the function f : (Σ1∗)^n ; Σ2∗, where Σ1 ⊆ Σ r {∗} and Σ2 ⊆ Σ. By Lemma 4.2 we can assume that, for all x1, x2, ..., xn ∈ Σ1∗, M halts eventually starting from S(q0, x1 ∗ x2 ∗ ··· ∗ xn) iff f(x1, x2, ..., xn) is defined. We describe a 1-TM M′ = (Q′, Σ, Γ′, □, δ′, q0′, q1′) that computes the function f by simulating M. The tape alphabet Γ′ contains Γ together with some additional marker symbols, among them ∇ and delimiters used to separate the encoded tapes. On the tape of M′, each configuration of M is encoded in a special way.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 235 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.3 (cont.) Assume that M is in the configuration that is given by the state q ∈ Q and the following tape situation:

Tape 1: ··· a^1_1 a^1_2 ··· a^1_{i1} ··· a^1_{r1} ···, with the head on a^1_{i1},
Tape 2: ··· a^2_1 a^2_2 ··· a^2_{i2} ··· a^2_{r2} ···, with the head on a^2_{i2},
...
Tape k: ··· a^k_1 a^k_2 ··· a^k_{ik} ··· a^k_{rk} ···, with the head on a^k_{ik}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 236 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.3 (cont.) This configuration is encoded by the configuration of M′ that is given by the state (q, 1) ∈ Q′ and the following tape situation: each tape j is encoded as the block a^j_1 ··· a^j_{ij−1} ∇ a^j_{ij} ··· a^j_{rj}, with ∇ marking the position of the j-th head, and the k blocks are written one after the other on the single tape, enclosed by delimiter symbols.

The 1-TM M′ proceeds as follows.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 237 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.3 (cont.)

(1) At first the initial configuration

x1 ∗ x2 ∗ ··· ∗ xn ···   (state q0′, head on the leftmost cell)

is transformed into the encoding of M’s initial configuration:

♦ ∇ x1 ∗ x2 ∗ ··· ∗ xn ♦ ∇ □ ♦ ∇ □ ♦ ··· ♦ ∇ □ ♦ ···   (state (q0, 1), head on the leftmost cell)

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 238 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.3 (cont.)

(2) Then M′ simulates the TM M step by step. For simulating a single step of M, M′ moves its head from left to right across the tape and stores the symbols currently scanned by M (those marked by ∇) in its finite-state control. As soon as all k of these symbols have been read, M′ moves its head back to the left, updating the scanned symbols and the positions of M's heads. Observe that a transition of M may increase the length of the contents of some of its tapes. In that case M′ must enlarge the space reserved for encoding these tapes by shifting the corresponding suffix of its tape contents to the right.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 239 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.3 (cont.)

(3) If M does not halt on input x1 ∗ x2 ∗ ··· ∗ xn, then neither does M′. If, however, M halts eventually, then M′ halts with tape contents of the following form:

♦ ··· ∇ f(x1, ···, xn) ··· ♦ ··· ∇ ··· ♦ ··· ♦ ··· ∇ ··· ♦ ···   (state (q1, 1))

This tape contents must now be reformatted as follows:

f(x1, x2, ···, xn) ···   (state q1′)

It follows that M′ computes the function f.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 240 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

A nondeterministic Turing machine (NTM) M with k ≥ 1 tapes is given by a 7-tuple M = (Q, Σ, Γ, □, δ, q0, q1), where Q, Σ, Γ, □, q0, and q1 are defined as for TMs, and

δ : Q × Γ^k → 2^{Q × Γ^k × {L,0,R}^k}

is the transition relation. If

δ(q, a1, a2, ..., ak) = { (q^(i), b1^(i), ..., bk^(i), m1^(i), ..., mk^(i)) | i = 1, ..., r },

then M has r options for its next step, if it is in state q reading symbol aj on tape j (1 ≤ j ≤ k).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 241 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Assume that M is in state q ∈ Q and that the tape contents and head positions are as given in the following diagram:

Tape 1 : ··· u1 a1 v1 ···   (head on a1)
Tape 2 : ··· u2 a2 v2 ···   (head on a2)
   ⋮
Tape k : ··· uk ak vk ···   (head on ak)

This configuration is encoded as

(u1qa1v1, u2qa2v2, ..., ukqakvk).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 242 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Let CONF be the set of all encodings of all configurations of M. Then M induces a binary relation ⊢M on CONF. The language L(M) accepted by M is defined as

L(M) := { w ∈ Σ∗ | ∃K ∈ CONF : (q0w, q0, ..., q0) ⊢∗M K and K contains the halting state q1 }.

Lemma 4.4 Each k-tape-(N)TM can be simulated by a 1-(N)TM.

Theorem 4.5 For each NTM M, there exists a TM M′ such that L(M) = L(M′).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 243 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.5. W.l.o.g. let M be a 1-NTM. We construct a 2-tape-TM M′ that simulates M.
Start of the simulation: Tape 1 contains the initial configuration of M, and tape 2 is empty.
A step of the simulation: Tape 1 (or 2) contains a list of all configurations that M can reach within 2i (2i + 1) steps from the given initial configuration, and the other tape is empty. For each configuration of M stored on tape 1 (2), all immediate successor configurations are written onto tape 2 (1). As soon as a halting configuration is encountered, M′ halts; otherwise, tape 1 (2) is erased. Now tape 2 (1) contains all configurations that M can reach within 2i + 1 (2i + 2) steps from the given initial configuration, and the next step of the simulation can be executed.
It follows that L(M′) = L(M).
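The breadth-first idea behind this simulation can be illustrated by the following Python sketch, in which two sets play the roles of the two tapes; the function names and the optional round bound are ad hoc additions for the illustration only.

# Sketch: deterministic breadth-first exploration of a nondeterministic machine.
def accepts(initial_config, successors, is_halting, max_rounds=None):
    """successors(c) yields the immediate successor configurations of c;
    is_halting(c) tells whether c contains the halting state.
    max_rounds is an ad hoc cut-off for this illustration only."""
    current = {initial_config}              # plays the role of tape 1
    rounds = 0
    while current:
        if any(is_halting(c) for c in current):
            return True                     # a halting configuration was reached
        nxt = set(current)                  # keep configurations reachable in fewer steps
        for c in current:
            nxt.update(successors(c))
        current = nxt                       # move to the next level (the other tape)
        rounds += 1
        if max_rounds is not None and rounds >= max_rounds:
            return False                    # give up (illustration only)
    return False

# Toy example (not a Turing machine, just to exercise the search):
# starting from 1, each step may add 1 or double; can the value 10 be reached?
print(accepts(1, lambda n: {n + 1, 2 * n}, lambda n: n == 10, max_rounds=20))  # True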

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 244 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Let L(NTM) denote the class of languages that are accepted by NTMs, and let L(TM) be the class of languages that are accepted by TMs.

Corollary 4.6 L(NTM) = L(TM).

A language L ⊆ Σ∗ is called recursively enumerable (r.e.) if there exists a TM M such that L = L(M). By RE we denote the class of all recursively enumerable languages, that is, L(NTM) = L(TM) = RE.
A language L ⊆ Σ∗ is called recursive if there exists a TM M satisfying the following conditions:
– ∀w ∈ L : q0w ⊢∗M q11,
– ∀w ∈ Σ∗ ∖ L : q0w ⊢∗M q10.
By REC we denote the class of all recursive languages.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 245 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Theorem 4.7

(1) REC ⊆ RE. (2) The class REC is closed under complementation.

Theorem 4.8 A language L is recursive iff L and Lc are recursively enumerable.

Proof. If L is recursive, then L and Lᶜ are obviously r.e. It remains to prove the converse implication. Let M1 be a TM such that L(M1) = L, and let M2 be a TM such that L(M2) = Σ∗ ∖ L. W.l.o.g. we can assume that M1 and M2 are 1-TMs.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 246 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.1. Turing Machines

Proof of Theorem 4.8 (cont.) Let M′ be the 2-tape-TM that proceeds as follows:

(1) At first M′ copies its input from tape 1 onto tape 2.
(2) On tape 1, M′ simulates the TM M1, and on tape 2, it simulates the TM M2. These simulations are executed in parallel, step by step.
If M1 reaches its halting state q1, then M′ produces output 1 and halts, as then w ∈ L(M1) = L;
if M2 reaches its halting state q1, then M′ produces output 0 and halts, as then w ∈ L(M2) = Lᶜ.
Since L(M1) ∪̇ L(M2) = Σ∗, exactly one of these two cases occurs for each input w ∈ Σ∗. Thus, M′ decides correctly whether or not w belongs to L, which shows that L is recursive.
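The following Python sketch illustrates the underlying dovetailing idea with two abstract step-bounded recognizers; the parameter names semi_L and semi_co_L are ad hoc stand-ins for the two semi-decision procedures.

# Sketch: decide membership by running two semi-deciders in parallel, step by step.
from itertools import count

def decide(w, semi_L, semi_co_L):
    """semi_L(w, steps) / semi_co_L(w, steps) return True if the respective
    recognizer halts (accepts) within the given number of steps, else False.
    Since L and its complement partition Sigma*, one of them halts eventually."""
    for steps in count(1):
        if semi_L(w, steps):
            return True       # w belongs to L
        if semi_co_L(w, steps):
            return False      # w belongs to the complement of L

# Toy example: L = { w | |w| is even }. Each recognizer "halts" only on inputs
# of its own language, and only after more than |w| simulated steps.
semi_even = lambda w, steps: len(w) % 2 == 0 and steps > len(w)
semi_odd  = lambda w, steps: len(w) % 2 == 1 and steps > len(w)
print(decide("abab", semi_even, semi_odd))  # True
print(decide("aba",  semi_even, semi_odd))  # False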

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 247 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability 4.2. Undecidability

A decision problem is called undecidable, if there does not exist a TM that answers each instance correctly after finitely many steps. If L ⊆ Σ∗, then the Membership Problem for L is the following decision problem:

INSTANCE: A word w ∈ Σ∗. QUESTION: Is w ∈ L?

Thus, this problem is decidable iff the language L is recursive. In what follows we are interested in the Halting Problem for TMs:

INSTANCE: A TM M and an input word w ∈ Σ∗. QUESTION: When starting with input w, will M halt eventually?

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 248 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

In order to study this problem we must encode the instance (M, w) in some way.

Let M = (Q, {0, 1}, {0, 1, □}, □, δ, q0, qn) be a 1-TM on Σ = {0, 1}, and let Q = {q0, q1, ..., qn}. We will encode M through a word c(M) ∈ Σ+.

Let δ = {(qi1, ai1, qj1, aj1, mj1), ..., (qim, aim, qjm, ajm, mjm)},

where qie, qje ∈ Q, aie, aje ∈ Σ ∪ {□}, and mje ∈ {L, 0, R}.

Each 5-tuple (qie, aie, qje, aje, mje) is encoded as

c(qie, aie, qje, aje, mje) := 0^{ie+1} 1 0^{e(aie)} 1 0^{je+1} 1 0^{e(aje)} 1 0^{e(mje)},

where e(a) := 1, if a = 0; 2, if a = 1; 3, if a = □,   and   e(m) := 1, if m = L; 2, if m = 0; 3, if m = R.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 249 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

The function δ is interpreted as a sequence of 5-tuples. Assuming that this sequence is sorted in lexicographical order, we take

c(M) := 111 · 0^{n+1} · 11111 · c(qi1, ai1, qj1, aj1, mj1) · 11 ·
        ... · 11 · c(qim, aim, qjm, ajm, mjm) · 111.
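For concreteness, the encoding c(M) can be computed by a small Python sketch like the following; the symbol '□' stands for the blank, and the function names are ad hoc.

# Sketch: build c(M) from a transition table. States are indices 0..n,
# tape symbols are '0', '1', '□', and moves are 'L', '0', 'R'.
E_SYM  = {'0': 1, '1': 2, '□': 3}      # e(a)
E_MOVE = {'L': 1, '0': 2, 'R': 3}      # e(m)

def c_tuple(i, a, j, b, m):
    """Encoding of a single transition (q_i, a, q_j, b, m)."""
    return ('0' * (i + 1) + '1' + '0' * E_SYM[a] + '1' +
            '0' * (j + 1) + '1' + '0' * E_SYM[b] + '1' + '0' * E_MOVE[m])

def c_machine(n, delta):
    """Encoding c(M) of a 1-TM with states q_0,...,q_n and transition list delta,
    given as 5-tuples (i, a, j, b, m), joined in lexicographical order."""
    body = '11'.join(c_tuple(*t) for t in sorted(delta))
    return '111' + '0' * (n + 1) + '11111' + body + '111'

# Example: a tiny machine with states q_0, q_1 and the single transition q_0 0 -> q_1 1 R.
print(c_machine(1, [(0, '0', 1, '1', 'R')]))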

Lemma 4.9 The set

{ c(M) | M is a 1-TM on Σ = {0, 1} and Γ = {0, 1, □} }

is recursive.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 250 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Lemma 4.9. Let w ∈ {0, 1}∗. If w is not an element of the regular language

1^3 · 0^{n+1} · 1^5 · (0^{1≤i≤n} · 1 · 0^{1≤i≤3} · 1 · 0^{1≤i≤n+1} · 1 · 0^{1≤i≤3} · 1 · 0^{1≤i≤3} · 11)^{≤3·n} · 1,

then w is not the encoding of a 1-TM. If, however, w is an element of the above regular language, then one can try to reconstruct M from w. This reconstruction is successful iff w describes a function

δ : {q0, ..., qn−1} × {0, 1, □} ⇝ {q0, ..., qn} × {0, 1, □} × {L, 0, R}.

This function δ then yields the TM M satisfying c(M) = w.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 251 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

By M∞ we denote the following TM, which does not halt on any input:

({q0, q1}, {0, 1}, {0, 1, □}, □, {(q0, a, q0, a, 0) | a ∈ {0, 1, □}}, q0, q1).

With each word w ∈ Σ∗, we now associate a TM Mw:
Mw := M, if c(M) = w,   and   Mw := M∞, if w is not the encoding of any TM.

By Lemma 4.9, the TM Mw can be reconstructed from w. Now let K ⊆ {0, 1}∗ be the following language:

K := { w ∈ {0, 1}∗ | Mw halts on input w }.

Theorem 4.10 The language K is not recursive.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 252 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Theorem 4.10. Assume to the contrary that K is recursive. Then there exists a 1-TM M0 that decides membership in K , that is,

q0^(0)w ⊢∗M0 q1^(0)1, if w ∈ K,   and   q0^(0)w ⊢∗M0 q1^(0)0, if w ∉ K.

By modifying M0 we obtain a new TM M1 that behaves as follows:
q0^(0)w ⊢∗M1 q1^(0)a ⊢M1 q2a ⊢M1 q2a ⊢M1 ···,  if a = 1,
q0^(0)w ⊢∗M1 q1^(0)a ⊢M1 q10,                  if a = 0.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 253 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Theorem 4.10 (cont.) Hence, for all w ∈ Σ∗: M1 halts on input w iff q0^(0)w ⊢∗M0 q1^(0)0, that is, iff w ∉ K.

Now let u := c(M1). Then Mu = M1, and we have the following sequence of equivalent statements:

M1 halts on input u iff u ∉ K iff Mu does not halt on input u iff M1 does not halt on input u, a contradiction! This contradiction shows that the language K is not recursive.

Corollary 4.11

K ∈ RE ∖ REC and Kᶜ ∉ RE.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 254 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Corollary 4.12 The Halting Problem for TMs is undecidable.

Proof. Let H be the following language:

H := { (w, u) | Mw halts on input u }.

Then w ∈ K iff (w, w) ∈ H. If H were recursive, then K would be recursive, too. Thus, H is not recursive, that is, the Halting Problem for (1-)TMs is undecidable.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 255 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Let S be a set of recursively enumerable languages on {0, 1}. We interpret S as a property of recursively enumerable languages. We say that a language L has property S, if L ∈ S. The property S is called trivial, if S = ∅ or S = RE({0, 1}).

Finally, let LS := { c(M) | L(M) ∈ S }.

Theorem 4.13 (Rice 1953)

The language LS is non-recursive for each non-trivial property S of recursively enumerable languages, that is, given a TM M, it is in general undecidable whether the language L(M) has property S.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 256 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Theorem 4.13. W.l.o.g. we can assume that ∅ ∉ S, as otherwise we could consider the set Sᶜ := RE({0, 1}) ∖ S instead of S. As S is non-trivial, there exists a language L ∈ S with L ≠ ∅.

Let ML be a TM such that L(ML) = L.

Assume that the property S is decidable, that is, LS ∈ REC({0, 1}). Then there is a TM MS for deciding LS. From ML and MS, we now construct a TM for the Halting Problem H.
Let M be a TM, and let w ∈ {0, 1}∗ be an input word. From M and w, we can construct a TM M′M,w that, on input x ∈ {0, 1}∗, executes the following program:
(1) simulate M on input w;

(2) if M halts on input w then simulate ML on input x.
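The reduction can be pictured with ordinary functions standing in for Turing machines, as in the following Python sketch; run_M, run_ML, and decides_S are hypothetical stand-ins (in the actual proof, MS receives the encoding c(M′M,w), not a function object).

# Sketch of the reduction in Rice's theorem, with functions in place of machines.
def build_M_prime(run_M, w, run_ML):
    """run_M(w) simulates M on w (and may loop forever); run_ML(x) is a
    recognizer for the fixed language L in S. The result plays the role of M'_{M,w}."""
    def m_prime(x):
        run_M(w)            # (1) simulate M on input w (loops if M does not halt)
        return run_ML(x)    # (2) afterwards, behave exactly like M_L on input x
    return m_prime

def halts(run_M, w, run_ML, decides_S):
    """If decides_S could decide the property S, this would decide the
    Halting Problem -- which is impossible."""
    return decides_S(build_M_prime(run_M, w, run_ML))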

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 257 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Theorem 4.13 (cont.)
Then L(M′M,w) = ∅, if w ∉ L(M),   and   L(M′M,w) = L, if w ∈ L(M).

By our hypothesis, ∅ ∉ S and L ∈ S. Hence, c(M′M,w) ∈ LS iff w ∈ L(M). Thus, the TM MS accepts on input c(M′M,w) iff w ∈ L(M), and otherwise MS rejects this input.

It follows that, using MS, membership in H can be decided. As H is undecidable, this is a contradiction!

Hence, LS is non-recursive.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 258 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Corollary 4.14 The following properties are undecidable for recursively enumerable languages: – emptiness, – finiteness, – regularity, – context-freeness.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 259 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Let M = (Q, Σ, Γ, □, δ, q0, q1) be a 1-TM, and let ∆ := Γ ∪̇ Q ∪̇ {#}, where # is an additional symbol.
A valid computation of M is a word of the form
w = w1 # w2^R # w3 # w4^R ··· # w2m^R # (w2m+1 #)^µ ∈ ∆+,
where µ ∈ {0, 1} and n := 2m, if µ = 0, and n := 2m + 1, if µ = 1,
that satisfies the following conditions:
(1) ∀i = 1, 2, ..., n : wi ∈ Γ∗ · Q · Γ∗, where wi does not end with the symbol □;
(2) w1 = q0x for some x ∈ Σ∗, that is, w1 is an initial configuration of M;
(3) wn ∈ Γ∗ · q1 · Γ∗, that is, wn is a halting configuration of M;
(4) ∀i = 1, 2, ..., n − 1 : wi ⊢M wi+1.

By GB(M) we denote the language on ∆ that consists of all valid computations of M.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 260 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Lemma 4.15 From a given 1-TM M, one can effectively construct two context-free grammars G1 and G2 such that L(G1) ∩ L(G2) = GB(M).

Proof.

Let L3 be the language

L3 := { y # z^R | y, z ∈ Γ∗ · Q · Γ∗ such that y ⊢M z }.

From M one can easily construct a PDA that accepts L3.

From L3 we obtain the language L1:

L1 := (L3 · #)∗ · ({ε} ∪ (Γ∗ · q1 · Γ∗ · #)).

From M we can construct a context-free grammar for the language L1 (Theorem 3.20, Theorem 3.22).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 261 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Lemma 4.15 (cont.)

Further, let L4 be the language

L4 := { y^R # z | y, z ∈ Γ∗ · Q · Γ∗ such that y ⊢M z },

and let L2 be obtained from L4 as follows:

L2 := q0Σ∗ · # · (L4 · #)∗ · ({ε} ∪ (Γ∗ · q1 · Γ∗ · #)).

From M we can construct a context-free grammar for L2.

Claim.

L1 ∩ L2 = GB(M).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 262 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Lemma 4.15 (cont.)

Proof of Claim. Let w = w1 # w2^R # ··· # wn # such that n ≡ 1 mod 2. If w ∈ GB(M), then properties (1) to (4) imply that w ∈ L1 ∩ L2.

Conversely, if w ∈ L1 ∩ L2, then we see from the definitions of L1 and L2 that w satisfies (1) and (4). As w ∈ L2, w1 = q0x for some x ∈ Σ∗, and as w ∈ L1, wn ∈ Γ∗ · q1 · Γ∗, that is, w ∈ GB(M). For w = w1 # ··· # wn^R # such that n ≡ 0 mod 2, the proof is analogous. Thus, L1 ∩ L2 = GB(M).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 263 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Let M be a 1-TM. Then L(M) ≠ ∅ iff GB(M) ≠ ∅.

Now let G1 and G2 be two context-free grammars such that L(G1) ∩ L(G2) = GB(M).

Then L(M) ≠ ∅ iff L(G1) ∩ L(G2) ≠ ∅. As emptiness is undecidable for L(M), this yields the following result.

Corollary 4.16 The following Intersection Emptiness Problem is undecidable:

INSTANCE: Two context-free grammars G1 and G2. QUESTION: Is L(G1) ∩ L(G2) = ∅?

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 264 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

The set ∆∗ ∖ GB(M) = GB(M)ᶜ is called the set of invalid computations of M.

Lemma 4.17 For each 1-TM M, GB(M)ᶜ ∈ CFL(∆).

As L(M) = ∅ iff GB(M)ᶜ = ∆∗, we obtain the following undecidability result.

Corollary 4.18 The following Universality Problem is undecidable: INSTANCE: A context-free grammar G on ∆. QUESTION: Is L(G) = ∆∗?

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 265 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Theorem 4.19 The following problems are undecidable:

(1) INSTANCE: Two context-free grammars G1 and G2.
    – QUESTION: Is L(G1) = L(G2)?
    – QUESTION: Is L(G1) ⊆ L(G2)?
    – QUESTION: Is L(G1) ∩ L(G2) context-free?
    – QUESTION: Is L(G1) ∩ L(G2) regular?
(2) INSTANCE: A context-free grammar G and a regular set R.
    – QUESTION: Is L(G) = R?
    – QUESTION: Is R ⊆ L(G)?
(3) INSTANCE: A context-free grammar G.
    – QUESTION: Is L(G)ᶜ context-free?
    – QUESTION: Is L(G)ᶜ regular?

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 266 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof. Let G1 be a context-free grammar s.t. L(G1) = R = Σ∗. Then the following holds for each context-free grammar G2:

R = L(G1) = L(G2)   iff   R = L(G1) ⊆ L(G2)   iff   L(G2) = Σ∗.

It follows from Corollary 4.18 that the first two problems of (1) and the two problems of (2) are undecidable. The language GB(M) is finite, and hence regular, if L(M) is finite; on the other hand, if L(M) is infinite, then GB(M) is not even context-free, which can be shown with the Pumping Lemma (Theorem 3.14), provided that M makes at least 3 steps on each input.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 267 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.2. Undecidability

Proof of Theorem 4.19 (cont.)

Let M be an arbitrary 1-TM. From M one can construct a 1-TM M′ that accepts the same language as M, but that executes at least 3 steps on each input. Now L(M) is finite iff GB(M′) = (GB(M′)ᶜ)ᶜ is context-free (regular). Further, from M′ we obtain two context-free grammars G1 and G2 such that L(G1) ∩ L(G2) = GB(M′). As finiteness of L(M) is undecidable, it follows that it is also undecidable whether (GB(M′)ᶜ)ᶜ, respectively L(G1) ∩ L(G2), is context-free (regular).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 268 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.3. General Phrase-Structure Grammars 4.3. General Phrase-Structure Grammars A phrase-structure grammar is a 4-tuple G = (N, T, S, P), where P ⊆ (N ∪ T)∗ × (N ∪ T)∗ is a finite semi-Thue system on N ∪ T such that |ℓ|N ≥ 1 for all (ℓ → r) ∈ P (see Section 2). Theorem 4.20 For each phrase-structure grammar G, the language L(G) is recursively enumerable.

Proof. For G = (N, T, S, P) we describe a 2-tape-NTM M that accepts the language L(G): The input w is stored on tape 1, and on tape 2, M guesses a derivation of G, starting from S. As soon as a terminal word z has been generated on tape 2, M checks whether z = w holds. If z = w, then M halts; otherwise, M enters an infinite loop. Hence, L(M) = L(G).
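A deterministic variant of this idea, which systematically enumerates derivations instead of guessing one, can be sketched in Python as follows; the bound max_rounds is an ad hoc cut-off that a true semi-decision procedure would not have, and the name derives is chosen only for this illustration.

# Sketch: enumerate sentential forms of a phrase-structure grammar.
def derives(rules, start, w, max_rounds):
    """rules: list of (l, r) string pairs; start: start symbol; w: target word.
    Collects all sentential forms derivable in at most max_rounds steps."""
    forms = {start}
    for _ in range(max_rounds):
        if w in forms:
            return True
        new_forms = set(forms)
        for s in forms:
            for l, r in rules:
                i = s.find(l)
                while i != -1:                       # rewrite every occurrence of l
                    new_forms.add(s[:i] + r + s[i + len(l):])
                    i = s.find(l, i + 1)
        forms = new_forms
    return w in forms

# Example: S -> aSb | ab generates { a^n b^n | n >= 1 }.
print(derives([("S", "aSb"), ("S", "ab")], "S", "aaabbb", 10))  # True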

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 269 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.3. General Phrase-Structure Grammars

Theorem 4.21 If L is a recursively enumerable language, then there exists a phrase-structure grammar G such that L(G) = L.
Proof. Let L ⊆ Σ∗ be r.e., and let M = (Q, Σ, Γ, □, δ, q0, q1) be a 1-TM s.t. L(M) = L. We construct a grammar G = (N, Σ, A1, P) by taking N := ((Σ ∪ {ε}) × Γ) ∪ Q ∪ {A1, A2, A3}, and defining P as follows:
(1) A1 → q0A2,
(2) A2 → [a, a]A2 for all a ∈ Σ,
(3) A2 → A3,
(4) A3 → [ε, □]A3,
(5) A3 → ε.
Using these productions, G generates a sentential form

q0[a1, a1][a2, a2] ··· [an, an][ε, □] ··· [ε, □]
for a word w = a1a2 ··· an ∈ Σ∗. This sentential form encodes the initial configuration of the TM M on input w.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 270 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.3. General Phrase-Structure Grammars

Proof of Theorem 4.21 (cont.) Further, P contains the following productions for simulating M:
(6) q[a, X] → [a, Y]p, if δ(q, X) = (p, Y, R),
(7) [b, Z]q[a, X] → p[b, Z][a, Y] for all b ∈ Σ ∪ {ε} and Z ∈ Γ, if δ(q, X) = (p, Y, L),
(8) q[a, X] → p[a, Y], if δ(q, X) = (p, Y, 0).

If q0w□^m ⊢∗M upv holds for some u, v ∈ Γ∗, p ∈ Q, and m ≥ 0, then

q0[a1, a1][a2, a2] ··· [an, an][ε, □]^m →∗P

[a1, u1] ··· [ak , uk ]p[ak+1, v1] ··· [an, vn−k ][ε, vn−k+1] ··· [ε, vn+m−k ],

where u = u1u2 ... uk and v = v1v2 ... vn+m−k .

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 271 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.3. General Phrase-Structure Grammars

Proof of Theorem 4.21 (cont.) Finally, P also contains some productions for treating halting configurations:
(9) [a, X]q1 → q1aq1,   q1[a, X] → q1aq1,   q1 → ε,   for all a ∈ Σ ∪ {ε} and X ∈ Γ.
If q0w ⊢∗M uq1v, that is, if w ∈ L(M), then:
A1 →∗ q0[a1, a1][a2, a2] ··· [an, an][ε, □]^m
   →∗ [a1, u1] ··· [ak, uk]q1[ak+1, v1] ··· [an, vn−k] ··· [ε, vn+m−k]
   →∗ a1a2 ··· an = w,
which shows that L(M) ⊆ L(G). Conversely, if w ∈ L(G), then we see from the construction above that M halts on input w, which implies that L(G) = L(M).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 272 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.3. General Phrase-Structure Grammars

Corollary 4.22 A language L is recursively enumerable if and only if it is generated by a phrase-structure grammar.

From the various characterizations of the class of r.e. languages, we obtain the following closure properties.

Corollary 4.23

(a) The class RE is closed under union, intersection, product, Kleene star, reversal, and inverse morphisms. (b) The class RE is not closed under complementation.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 273 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages 4.4. Context-Sensitive Languages

A grammar G = (N, T, S, P) is context-sensitive if each production (ℓ → r) ∈ P has the form αXβ → αuβ, where X ∈ N, α, β, u ∈ (N ∪ T)∗, and u ≠ ε. In addition, G may contain the production (S → ε), if S does not occur on the right-hand side of any production.
A language L is called context-sensitive if there exists a context-sensitive grammar G such that L = L(G). By CSL(Σ) we denote the set of all context-sensitive languages on Σ, and CSL denotes the class of all context-sensitive languages.
A grammar G = (N, T, S, P) is called monotone if |ℓ| ≤ |r| holds for each production (ℓ → r) ∈ P. Also a monotone grammar may contain the production (S → ε) if S does not occur on the right-hand side of any production.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 274 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Theorem 4.24

(a) For each context-sensitive grammar G, there is a monotone grammar G′ such that L(G′) = L(G).
(b) For each monotone grammar G′, there is a context-sensitive grammar G such that L(G) = L(G′).

Proof.
(a): By definition, each context-sensitive grammar is monotone.
(b): Let G′ = (N′, T, S′, P′) be a monotone grammar. From G′ we construct G″ = (N″, T, S′, P″) by taking
N″ := N′ ∪ { Aa | a ∈ T } and P″ := h(P′) ∪ { Aa → a | a ∈ T },
where h is defined through A ↦ A (A ∈ N′) and a ↦ Aa (a ∈ T).
G″ is monotone, and all the new productions are context-sensitive. It remains to replace the productions in h(P′) by context-sensitive productions.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 275 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages Proof of Theorem 4.24 (cont.) Let A1 ··· Am → B1 ··· Bn (2 ≤ m ≤ n) be a production from h(P′), where Ai, Bj ∈ N″. We introduce new nonterminals Z1, Z2, ..., Zm and replace the above production by the following ones:

A1 ··· Am → Z1A2 ··· Am,
Z1A2 ··· Am → Z1Z2A3 ··· Am,
   ⋮
Z1Z2 ··· Zm−1Am → Z1 ··· Zm Bm+1 ··· Bn,
Z1 ··· Zm Bm+1 ··· Bn → B1Z2 ··· Zm Bm+1 ··· Bn,
   ⋮
B1 ··· Bm−1 Zm Bm+1 ··· Bn → B1 ··· Bm−1 Bm Bm+1 ··· Bn.
The new productions are context-sensitive, and it is obvious that they simulate the old production. On the other hand, the new productions can only be used for this purpose, as the new nonterminals Z1, Z2, ..., Zm do not occur in any other production. By repeating the process above for each production from h(P′), we obtain a context-sensitive grammar G such that L(G) = L(G′).
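The replacement of a single monotone production by this chain can be generated mechanically, as the following Python sketch illustrates; the fresh nonterminals Z1, Z2, ... are generated per production, and all names are ad hoc.

# Sketch: replace A1...Am -> B1...Bn (2 <= m <= n) by context-sensitive productions.
def split_monotone(lhs, rhs, fresh_prefix="Z"):
    """lhs, rhs: lists of nonterminal names with 2 <= len(lhs) <= len(rhs).
    Returns productions (as pairs of symbol lists), each rewriting one symbol
    inside a fixed context."""
    m, n = len(lhs), len(rhs)
    assert 2 <= m <= n
    Z = [f"{fresh_prefix}{i + 1}" for i in range(m)]        # fresh nonterminals
    prods = []
    # First phase: replace A_i by Z_i from left to right; in the last step the
    # tail B_{m+1}...B_n is appended as well.
    for i in range(m):
        left  = Z[:i] + lhs[i:]
        right = Z[:i + 1] + lhs[i + 1:] + (rhs[m:] if i == m - 1 else [])
        prods.append((left, right))
    # Second phase: replace Z_i by B_i from left to right.
    for i in range(m):
        left  = rhs[:i] + Z[i:] + rhs[m:]
        right = rhs[:i + 1] + Z[i + 1:] + rhs[m:]
        prods.append((left, right))
    return prods

# Example: AB -> BCD (m = 2, n = 3).
for l, r in split_monotone(["A", "B"], ["B", "C", "D"]):
    print(" ".join(l), "->", " ".join(r))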

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 276 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Example. Let G be the following monotone grammar:

G = ({S, B}, {a, b, c}, S, {S → aSBc, S → abc, cB → Bc, bB → bb}).

We claim that L(G) = L := { a^n b^n c^n | n ≥ 1 }.

Claim 1: L ⊆ L(G).

Proof of Claim 1. We show that a^n b^n c^n ∈ L(G) by induction on n. For n = 1, we have S → abc. Assume that S →∗ a^n b^n c^n has been shown for some n ≥ 1. We consider the following derivation:
S → aSBc →∗ a · a^n b^n c^n · Bc →∗ a^{n+1} b^n B c^{n+1} → a^{n+1} b^{n+1} c^{n+1}.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 277 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Example (cont.)

Claim 2: L(G) ⊆ L.

Proof of Claim 2. By applying production 1 repeatedly, we obtain a sentential form a^n S (Bc)^n, which is rewritten into a sentential form a^n S α by production 3, where α ∈ {B, c}+ and |α|B = |α|c = n. In order to get rid of S, production 2 must be applied, that is, we obtain a^n S α → a^{n+1} bcα. To get rid of all nonterminals B in α, all occurrences of c must be moved to the right and then all B are rewritten into b by production 4, that is, a^{n+1} bcα →∗ a^{n+1} b B^n c^{n+1} →∗ a^{n+1} b^{n+1} c^{n+1}. Hence, L(G) ⊆ L.

Together Claims 1 and 2 show that L(G) = L.
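As a quick sanity check of Claims 1 and 2 for small n, the ad hoc derives() sketch given after Theorem 4.20 can be reused:

# Reusing the derivation-search sketch from Section 4.3:
rules = [("S", "aSBc"), ("S", "abc"), ("cB", "Bc"), ("bB", "bb")]
print(derives(rules, "S", "aabbcc", 10))   # True:  a^2 b^2 c^2 is generated
print(derives(rules, "S", "aabbc", 10))    # False: not of the form a^n b^n c^n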

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 278 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages Each context-free grammar in CNF (Theorem 3.9) is context-sensitive. On the other hand, { a^n b^n c^n | n ≥ 1 } ∉ CFL (see the first example after Theorem 3.14). Corollary 4.25

CFL ⊊ CSL.
A grammar G = (N, T, S, P) is in Kuroda Normal Form if it only contains productions of the following forms:

(A → a), (A → BC), (AB → CD), where a ∈ T and A, B, C, D ∈ N. With respect to the production (S → ε), we have the same restriction as before.
Theorem 4.26 Given a context-sensitive grammar G, one can effectively construct an equivalent context-sensitive grammar G′ that is in Kuroda Normal Form.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 279 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Proof of Theorem 4.26. As in the proof of Theorem 3.9 we can first revise G in such a way that terminals only occur on the right-hand side of productions of the form (A → a).

Next we replace productions of the form (A → B1B2 ··· Bn) (n > 2) by new productions

(A → B1Z2), (Z2 → B2Z3),..., (Zn−1 → Bn−1Bn),

where Z2, Z3,..., Zn−1 are new nonterminals.
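This splitting step can be written down mechanically, as in the following Python sketch; the fresh nonterminals Z2, ..., Zn−1 are generated per production, and all names are ad hoc.

# Sketch: split A -> B1 B2 ... Bn (n > 2) into Kuroda-form chain productions.
def split_long_rhs(a, rhs, fresh_prefix="Z"):
    """a: left-hand nonterminal; rhs: list of nonterminals with len(rhs) > 2."""
    n = len(rhs)
    assert n > 2
    Z = {i: f"{fresh_prefix}{i}" for i in range(2, n)}     # fresh nonterminals Z2..Z_{n-1}
    prods = [(a, [rhs[0], Z[2]])]                          # A -> B1 Z2
    for i in range(2, n - 1):
        prods.append((Z[i], [rhs[i - 1], Z[i + 1]]))       # Zi -> Bi Z_{i+1}
    prods.append((Z[n - 1], [rhs[n - 2], rhs[n - 1]]))     # Z_{n-1} -> B_{n-1} Bn
    return prods

# Example: A -> B C D E.
for l, r in split_long_rhs("A", ["B", "C", "D", "E"]):
    print(l, "->", " ".join(r))
# A -> B Z2, Z2 -> C Z3, Z3 -> D E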

Finally, let (A1A2 ··· Am → B1 ··· Bn) be a production s.t. m > 1 and m + n > 4. We choose new nonterminals Z2, Z3, ..., Zn−1 and replace the production above by the following productions in Kuroda form:

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 280 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Proof of Theorem 4.26 (cont.)

(A1A2 → B1Z2), (Z2A3 → B2Z3),
   ⋮
(Zm−1Am → Bm−1Zm), (Zm → BmZm+1), (Zm+1 → Bm+1Zm+2),
   ⋮
(Zn−1 → Bn−1Bn).
The new productions can simulate the old one. On the other hand, they cannot be used in any other way, as the new nonterminals do not occur in any other productions. By repeating this process for all productions of the form above, we obtain a grammar G′ in Kuroda Normal Form s.t. L(G′) = L(G).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 281 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages A linear-bounded automaton (LBA) M is a 1-NTM

M = (Q, Σ, Γ, □, δ, q0, q1) with two special symbols ¢, $ ∈ Γ. For w ∈ Σ∗, q0¢w$ is the initial configuration of M on input w, and
L(M) := { w ∈ Σ∗ | ∃α, β ∈ Γ∗ : q0¢w$ ⊢∗M ¢αq1β$ }
is the language accepted by M. An LBA can be depicted as follows:

[Figure: the tape ¢ a1 a2 a3 ··· an−1 an $, scanned by a read/write head that is attached to a finite-state control.]

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 282 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Theorem 4.27 For each L ∈ CSL, there exists an LBA M such that L = L(M).

Proof. We proceed as in the proof of Theorem 4.20, only that here our NTM M has a tape with two tracks instead of two tapes: On track 1, the input word w is stored, and on track 2, a derivation of the context-sensitive grammar G is simulated nondeterministically. For doing so, we can assume that G is in Kuroda Normal Form. If the simulated derivation generates the word w, then M accepts. If another terminal word is obtained, or if the space provided by the length of the input w does not allow for another step, then M enters an infinite loop. Thus, M is an LBA such that L(M) = L.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 283 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Theorem 4.28 For each LBA M, there exists a context-sensitive grammar G such that L(G) = L(M).

Proof. Here we proceed as in the proof of Theorem 4.21, that is, from a given LBA M, we construct a grammar G that simulates the computations of M. Given a word w = a1a2 ··· an ∈ Σ∗ as input, the corresponding initial configuration q0¢w$ of M is encoded by the word

( ¢   a1     a2   ···   an−1   an   $ )
( ¢   q0a1   a2   ···   an−1   an   $ )

(the upper track records the input, the lower track holds the current configuration of M), which is derived from the start symbol S of G by applying some context-free productions.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 284 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Proof of Theorem 4.28 (cont.) Then a computation of the LBA M is simulated on the lower track, using only productions (ℓ → r) satisfying |ℓ| = |r|. If w ∈ L(M), then a word of the following form can be derived:

( ¢   a1   a2   ···   ai     ···   an−1   an   $ )
( ¢   b1   b2   ···   q1bi   ···   bn−1   bn   $ ),

which can be rewritten into the word a1a2 ··· an−1an = w by using monotone productions. Hence, G is a context-sensitive grammar satisfying L(G) = L(M).

Corollary 4.29 A language L is context-sensitive iff there exists an LBA M such that L(M) = L.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 285 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages A language L is called deterministic context-sensitive if it is accepted by a deterministic LBA. We denote the corresponding class of languages by DCSL. Obviously, CFL ⊊ DCSL ⊆ CSL, but it is still open whether the inclusion DCSL ⊆ CSL is proper. This is the famous LBA Problem (see [Hartmanis, Hunt 1974]).

Corollary 4.30 CSL ⊆ REC, that is, each context-sensitive language has a decidable membership problem.

Proof. Let G be a context-sensitive grammar. In order to decide whether w ∈ L(G), it suffices to generate all sentential forms α of G that satisfy |α| ≤ |w| and that can be derived from the start symbol S. As context-sensitive productions never decrease the length of a sentential form, this search space is finite and contains every sentential form that can occur in a derivation of w. If w is found in this way, then w ∈ L(G); otherwise, w ∉ L(G).
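The bounded search described in this proof can be sketched in Python as follows; it prunes every sentential form longer than |w|, which is sound because the productions are length-non-decreasing. The name csl_member is an ad hoc choice for this sketch.

# Sketch: decide membership for a monotone (context-sensitive) grammar.
def csl_member(rules, start, w):
    """rules: list of (l, r) pairs with len(l) <= len(r); start: start symbol."""
    assert all(len(l) <= len(r) for l, r in rules)
    bound = len(w)
    forms, frontier = {start}, {start}
    while frontier:
        new_frontier = set()
        for s in frontier:
            for l, r in rules:
                i = s.find(l)
                while i != -1:
                    t = s[:i] + r + s[i + len(l):]
                    if len(t) <= bound and t not in forms:   # prune long forms
                        forms.add(t)
                        new_frontier.add(t)
                    i = s.find(l, i + 1)
        frontier = new_frontier
    return w in forms

# Example with the monotone grammar for { a^n b^n c^n | n >= 1 } from above:
rules = [("S", "aSBc"), ("S", "abc"), ("cB", "Bc"), ("bB", "bb")]
print(csl_member(rules, "S", "aaabbbccc"))  # True
print(csl_member(rules, "S", "aabbbccc"))   # False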

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 286 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Corollary 4.31 For each recursively enumerable language L on Σ, there exists a context-sensitive language L′ on an alphabet Γ ⊋ Σ such that ΠΣ(L′) = L. Here ΠΣ : Γ∗ → Σ∗ is the morphism that is defined by a ↦ a (a ∈ Σ) and b ↦ ε (b ∈ Γ ∖ Σ).

Proof. Let L be an r.e. language on Σ. Then there exists a 1-TM M = (Q, Σ, ∆, □, δ, q0, q1) s.t. L(M) = L. We take Γ := Σ ∪ {$} and L′ := { w$^n | w ∈ L and M accepts w in space |w| + n }.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 287 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Proof of Corollary 4.31 (cont.) For w ∈ Σ∗ and m ∈ N, M is given the input w$^m. Now M runs until it either accepts, which implies that w$^m ∈ L′, or until it needs more space than |w| + m, or until it gets into a loop. In the latter two cases, w$^m ∉ L′. From M we easily obtain an LBA that accepts L′. Hence, L′ ∈ CSL and L = ΠΣ(L′).

Because of Corollary 4.31, CSL is called a basis for the class RE. Corollary 4.32 The class CSL is not closed under morphisms.

Proof. This follows from the inclusions CSL ⊆ REC ⊊ RE and from Corollary 4.31.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 288 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Theorem 4.33 The class CSL is closed under union, product, Kleene star, and ε-free morphisms.

Theorem 4.34 The class CSL is closed under intersection and inverse morphisms.

Theorem 4.35 (Immerman 1987, Szelepcsényi 1987) The class CSL is closed under complementation.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 289 / 294 4. Turing Machines and R. E. and Context-Sensitive Languages 4.4. Context-Sensitive Languages

Corollary 4.36

CSL ⊊ REC. Actually, the membership problem for a context-sensitive language is decidable in exponential time. On the other hand, Corollaries 4.14 and 4.31 imply that the following problems are undecidable for CSL: – finiteness, – emptiness, – regularity, – context-freeness, – inclusion, and – equality.

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 290 / 294 5. Summary

Chapter 5:

Summary

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 291 / 294 5. Summary

Summary on Characterizations:

Language class                    Grammars       Automata
Type 3 (regular)                  regular        DFA, NFA
det. context-free                 —              DPDA
Type 2 (context-free)             context-free   PDA
Type 1 (context-sensitive)        monotone       LBA
Type 0 (recursively enumerable)   general        TM, NTM

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 292 / 294 5. Summary

Summary on Closure Properties:

Operation               REG   DCFL   CFL   CSL   RE
Union                    +     -      +     +     +
Intersection             +     -      -     +     +
Intersection with REG    +     +      +     +     +
Complementation          +     +      -     +     -
Product                  +     -      +     +     +
Kleene star              +     -      +     +     +
Morphism                 +     -      +     -     +
Inverse morphism         +     +      +     +     +

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 293 / 294 5. Summary

Summary on Decision Problems:

Decision problem   REG   DCFL   CFL   CSL   RE
Membership          +     +      +     +     –
Emptiness           +     +      +     –     –
Finiteness          +     +      +     –     –
Equality            +     +      –     –     –
Inclusion           +     –      –     –     –
Regularity          +     +      –     –     –
(+ decidable, – undecidable)

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 294 / 294