Improved Learning of AC0 Functions

Total Page:16

File Type:pdf, Size:1020Kb

Improved Learning of AC0 Functions 0 Improved Learning of AC Functions Merrick L. Furst Jeffrey C. Jackson Sean W. Smith School of Computer Science School of Computer Science School of Computer Science Carnegie Mellon University Carnegie Mellon University Carnegie Mellon University Pittsburgh, PA 15213 Pittsburgh, PA 15213 Pittsburgh, PA 15213 [email protected] [email protected] [email protected] Abstract A variant of one of our algorithms gives a more general learning method which we conjecture produces reasonable 0 Two extensions of the Linial, Mansour, Nisan approximations of AC functions for a broader class of in- 0 put distributions. A brief outline of the general algorithm AC learning algorithm are presented. The LMN method works when input examples are drawn is presented along with a conjecture about a class of distri- uniformly. The new algorithms improve on theirs butions for which it might perform well. by performing well when given inputs drawn Our two learning methods differ in several ways. The from unknown, mutually independent distribu- direct algorithm is very similar to the LMN algorithm and tions. A variant of the one of the algorithms is depends on a substantial generalization of their theory. This conjectured to work in an even broader setting. algorithm is straightforward, but our bound on its running time is very sensitive to the probability distribution on the inputs. The indirect algorithm, discovered independently 1 INTRODUCTION by Umesh Vazirani [Vaz], is more complicated but also relatively simple to analyze. Its time bound is only mildly Linial, Mansour, and Nisan [LMN89] introduced the use affected by changes in distribution, but for distributions not too far from uniform it is likely greater than the direct of the Fourier transform to accomplish Boolean function 0 bound. We suggest a possible hybrid of the two methods learning. They showed that AC functions are well- characterized by their low frequency Fourier spectra and which may have a better running time than either method alone for certain distributions. gave an algorithm which approximates such functions rea- sonably well from uniformly chosen examples. While the The outline of the paper is as follows. We begin with 0 class AC is provablyweak in that it does not contain modu- de®nitions and proceed to discuss the key idea behind our lar counting functions, from a learning theory point of view direct extension: we use an appropriate change of basis for 0 AC it is fairly rich. For example, contains polynomial-size the space of n-bit functions. Next, we prove that under 0 DNF and addition. Thus the LMN learning procedure is po- this change AC functions continue to exhibit the low- tentially powerful, but the restriction that their algorithm be order spectral property which the LMN result capitalizes given examples drawn according to a uniform distribution on. After giving an overview of the direct algorithm and is particularly limiting. analyzing its running time, we discuss the indirect approach and compare it with the ®rst. Finally, we indicate some A further limitation of the LMN algorithm is its running time. Valiant's [Val84] learning requirements are widely directions for further research. accepted as a baseline characterization of feasible learning. They include that a learning algorithm should be distribu- 2 DEFINITIONS AND NOTATION tion independent and run in polynomial time. The LMN poly log n algorithm runs in quasi-polynomial (O 2 ) time. ; ;ng n All sets are subsets of f1 ... , where as usual repre- In this paper we develop two extensions of the LMN learn- sents the number of variables in the function to be learned. ing algorithm which produce good approximatingfunctions Capital letters denote set variables unless otherwise noted. X when samples are drawn according to unknown distribu- The complement of a set X is indicated by , although we tions which assign values to the input variables indepen- abuse the concept of complementation somewhat: in cases S dently. Call such a distribution mutually independent since where X is speci®ed to be a subset of some other set , X = S X X = f ;ngX it is the joint probability distribution corresponding to a set ,otherwise 1 ; ... Strings of of mutually independent random variables [Fel57]. The 0 =1 variables are referred to by barred lower case letters x running times of the new algorithms are dependent on the (e.g. ) which may be superscripted to indicate one of a j x x i distributionÐthe farther the distribution is from uniform sequence of strings (e.g. ). i refers to the th variable x the higher the boundÐand, as is the case for the LMN in a string . Barred constants (e.g. 0, ) indicate strings of algorithm, the time bounds are quasi-polynomial in n. the given value with length implied by context. Unless otherwise speci®ed, all functions are assumed to 3 AN ORTHONORMAL BASIS FOR n ; g have as domain the set of strings f0 1 . The range of BOOLEAN FUNCTIONS SAMPLED g Boolean functions will sometimes be f0,1 , particularly when we are dealing with circuit models, but will usually be UNDER MUTUALLY INDEPENDENT ; g f1 1 for reasons that should become clear subsequently. DISTRIBUTIONS Frequently we will write sets as arguments where strings X f x f would be expected, e.g. f rather than for a 3.1 RATIONALE n ; g f X function on f0 1 . In such cases is a shorthand for v cX cX f where is a characteristic function de®ned Given some element of a vector space and a basis for v X = i 2 X c this space, a discrete Fourier transform expresses as the by i 0if and 1 otherwise. Note that the sense of this function is opposite the natural one in which 0 coef®cients of the linear combination of basis vectors repre- represents set absence. senting v . We are interested in learning Boolean functions 0 on n bits, which can be represented as Boolean vectors of n An AC function is a Boolean function which can be com- length 2 . Linial et al. used as the basis for this space the puted by a family of acyclic circuits (one circuit for each n characters of the group Z2 in R. The characters are given n number n of inputs) consisting of AND and OR gates plus by the 2 functions negations only on inputs and satisfying two properties: jA\X j : X = 1 A The number of gates in each circuit (its size) is bounded jAj f j Each is simply a polynomial of degree ,and A A by a ®xed polynomial in n. Aj k g j spans the space of polynomials of degree not The maximum number of gates between an input and more than k. With an inner product de®ned by the output (the circuit depth) is a ®xed constant. X n f xg x f; gi = h 2 x A random restriction p;q is a function which given input n ; g x2f 0 1 x p maps i to with ®xed probability and assigns 0's and 1's to the other variables according to a probability distribution p f k = hf; f i and norm k the characters form an orthonor- q x f .If represents the input to a function then p;q induces mal basis for the space of real-valued functions on n bits. d another function f which has variables corresponding to The Fourier coef®cients of a function f with respect to this the stars and has the other variables of f ®xedto0or1. ^ f = hf; i f basis are simply A , the projections of on the This is a generalization of the original de®nition [FSS81] A basis vectors. Given a suf®cient number m of uniformly P q in which is the uniform distribution. The subscripts of m j j j j x ;fx f x x =m selected examples , is a A j = are generally dropped and their values understood from 1 ^ f context. good estimate of A [LMN89]. The function obtained by setting a certain subset S of the What happens if the examples are chosen according to some q variables of f to the values indicated by the characteristic nonuniform distribution ? If we blindly applied the LMN S f dS X function of a subset X is denoted by or, learning algorithm we would actually be calculating an P f x xq x A S f dX approximation to n for each . when is implied by context, simply . For example, if A x2f ; g 0 1 = f ; g X = f g f dX f S 1 3 and 3 then is the function with That is, we would be calculating the expected value of the x x q variable 1 setto1and 3 to 0. product f with respect to rather than with respect to the A We will use several parameters of the probability distribu- uniform distribution. This leads to the following observa- tion: if we modify the de®nition of the inner product to be q = [x = ] i tion throughout the sequel. We de®ne i Pr 1 , with respect to the distribution q on the inputs and modify where the probability is with respect to q . Another param- eter which we will use frequently is the basis to be orthonormal under this new inner product then the estimated coef®cients should on average be close = = ; = : i max 1 i 1 1 to the appropriate values in this new basis. i We assume that this value is ®nite, since in®nite implies More precisely, de®ne the inner product to be some variable is actually a constant and can be ignored by X hf; gi = f xg xq x the learning procedure.
Recommended publications
  • Introduction to the Theory of Computation, Michael Sipser
    Introduction to the Theory of Computation, Michael Sipser Chapter 0: Introduction Automata, Computability and Complexity: · They are linked by the question: o “What are the fundamental capabilities and limitations of computers?” · The theories of computability and complexity are closely related. In complexity theory, the objective is to classify problems as easy ones and hard ones, whereas in computability theory he classification of problems is by those that are solvable and those that are not. Computability theory introduces several of the concepts used in complexity theory. · Automata theory deals with the definitions and properties of mathematical models of computation. · One model, called the finite automaton, is used in text processing, compilers, and hardware design. Another model, called the context – free grammar, is used in programming languages and artificial intelligence. Strings and Languages: · The string of the length zero is called the empty string and is written as e. · A language is a set of strings. Definitions, Theorems and Proofs: · Definitions describe the objects and notions that we use. · A proof is a convincing logical argument that a statement is true. · A theorem is a mathematical statement proved true. · Occasionally we prove statements that are interesting only because they assist in the proof of another, more significant statement. Such statements are called lemmas. · Occasionally a theorem or its proof may allow us to conclude easily that other, related statements are true. These statements are called corollaries of the theorem. Chapter 1: Regular Languages Introduction: · An idealized computer is called a “computational model” which allows us to set up a manageable mathematical theory of it directly.
    [Show full text]
  • A Quantum Query Complexity Trichotomy for Regular Languages
    A Quantum Query Complexity Trichotomy for Regular Languages Scott Aaronson∗ Daniel Grier† Luke Schaeffer UT Austin MIT MIT [email protected] [email protected] [email protected] Abstract We present a trichotomy theorem for the quantum query complexity of regular languages. Every regular language has quantum query complexity Θ(1), Θ˜ (√n), or Θ(n). The extreme uniformity of regular languages prevents them from taking any other asymptotic complexity. This is in contrast to even the context-free languages, which we show can have query complex- ity Θ(nc) for all computable c [1/2,1]. Our result implies an equivalent trichotomy for the approximate degree of regular∈ languages, and a dichotomy—either Θ(1) or Θ(n)—for sensi- tivity, block sensitivity, certificate complexity, deterministic query complexity, and randomized query complexity. The heart of the classification theorem is an explicit quantum algorithm which decides membership in any star-free language in O˜ (√n) time. This well-studied family of the regu- lar languages admits many interesting characterizations, for instance, as those languages ex- pressible as sentences in first-order logic over the natural numbers with the less-than relation. Therefore, not only do the star-free languages capture functions such as OR, they can also ex- press functions such as “there exist a pair of 2’s such that everything between them is a 0.” Thus, we view the algorithm for star-free languages as a nontrivial generalization of Grover’s algorithm which extends the quantum quadratic speedup to a much wider range of string- processing algorithms than was previously known.
    [Show full text]
  • CS 273 Introduction to the Theory of Computation Fall 2006
    Welcome to CS 373 Theory of Computation Spring 2010 Madhusudan Parthasarathy ( Madhu ) [email protected] What is computable? • Examples: – check if a number n is prime – compute the product of two numbers – sort a list of numbers – find the maximum number from a list • Hard but computable: – Given a set of linear inequalities, maximize a linear function Eg. maximize 5x+2y 3x+2y < 53 x < 32 5x – 9y > 22 Theory of Computation Primary aim of the course: What is “computation”? • Can we define computation without referring to a modern c computer? • Can we define, mathematically, a computer? (yes, Turing machines) • Is computation definable independent of present-day engineering limitations, understanding of physics, etc.? • Can a computer solve any problem, given enough time and disk-space? Or are they fundamental limits to computation? In short, understand the mathematics of computation Theory of Computation - What can be computed? Computability - Can a computer solve any problem, given enough time and disk-space? - How fast can we solve a problem? Complexity - How little disk-space can we use to solve a problem -What problems can we solve given Automata really very little space? (constant space) Theory of Computation What problems can a computer solve? Not all problems!!! Computability Eg. Given a C-program, we cannot check if it will not crash! Verification of correctness of programs Complexity is hence impossible! (The woe of Microsoft!) Automata Theory of Computation What problems can a computer solve? Even checking whether a C-program will Computability halt/terminate is not possible! input n; assume n>1; No one knows Complexity while (n !=1) { whether this if (n is even) terminates on n := n/2; on all inputs! else n := 3*n+1; Automata } 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.
    [Show full text]
  • Complexity Theory Lectures 1–6
    Complexity Theory 1 Complexity Theory Lectures 1–6 Lecturer: Dr. Timothy G. Griffin Slides by Anuj Dawar Computer Laboratory University of Cambridge Easter Term 2009 http://www.cl.cam.ac.uk/teaching/0809/Complexity/ Cambridge Easter 2009 Complexity Theory 2 Texts The main texts for the course are: Computational Complexity. Christos H. Papadimitriou. Introduction to the Theory of Computation. Michael Sipser. Other useful references include: Computers and Intractability: A guide to the theory of NP-completeness. Michael R. Garey and David S. Johnson. Structural Complexity. Vols I and II. J.L. Balc´azar, J. D´ıaz and J. Gabarr´o. Computability and Complexity from a Programming Perspective. Neil Jones. Cambridge Easter 2009 Complexity Theory 3 Outline A rough lecture-by-lecture guide, with relevant sections from the text by Papadimitriou (or Sipser, where marked with an S). Algorithms and problems. 1.1–1.3. • Time and space. 2.1–2.5, 2.7. • Time Complexity classes. 7.1, S7.2. • Nondeterminism. 2.7, 9.1, S7.3. • NP-completeness. 8.1–8.2, 9.2. • Graph-theoretic problems. 9.3 • Cambridge Easter 2009 Complexity Theory 4 Outline - contd. Sets, numbers and scheduling. 9.4 • coNP. 10.1–10.2. • Cryptographic complexity. 12.1–12.2. • Space Complexity 7.1, 7.3, S8.1. • Hierarchy 7.2, S9.1. • Descriptive Complexity 5.6, 5.7. • Cambridge Easter 2009 Complexity Theory 5 Complexity Theory Complexity Theory seeks to understand what makes certain problems algorithmically difficult to solve. In Data Structures and Algorithms, we saw how to measure the complexity of specific algorithms, by asymptotic measures of number of steps.
    [Show full text]
  • Natural Proofs Barrier
    CS 388T: Advanced Complexity Theory Spring 2020 Natural Proofs Prof. Dana Moshkovitz Scribe: Maruth Goyal 1 Overview Complexity theorists have long been working on proving relations between various complexity classes. Their proofs would often treat Turing Machines as black boxes. However, they seemed to be unable to prove equality or separation for NP 6⊂ P. A reason for this was found with Baker, Gill, and Solovay's Relativization barrier [BGS75], which should any proof for such a separation must be non-relativizing. i.e., it must not hold with respect to arbitrary oracles. Hence, researchers turned to circuits as a way of analyzing various classes by considering more explicit constructions. This begs the question, is their a similar barrier for separating NP from P=poly? To separate a function f from some circuit complexity class C, mathematicians often consider some larger property of f, and then separate that property from C. For instance, in the case of PARITY [FSS84], we observed it cannot be simplified even under arbitrary random restrictions, while AC0 functions can be. While this seemed like a powerful proof technique, in 1997 Razborov and Rudich [RR] showed that to separate NP from P=poly, there is a large classification of properties which cannot be the basis for such a proof. They called these properties \natural properties". 2 Natural Properties Definition 1. A property P is said to be natural if: 1. Useful: 8g 2 C:g2 = P but f 2 P 2. Constructivity: Given g : f0; 1gn ! f0; 1g can check if g 2 P in time 2O(n).
    [Show full text]
  • Theory of Computer Science May 22, 2017 — E2
    Theory of Computer Science May 22, 2017 | E2. P, NP and Polynomial Reductions Theory of Computer Science E2.1 P and NP E2. P, NP and Polynomial Reductions E2.2 Polynomial Reductions Malte Helmert University of Basel E2.3 NP-Hardness and NP-Completeness May 22, 2017 E2.4 Summary Malte Helmert (University of Basel) Theory of Computer Science May 22, 2017 1 / 30 Malte Helmert (University of Basel) Theory of Computer Science May 22, 2017 2 / 30 Overview: Course Overview: Complexity Theory contents of this course: logic I X Complexity Theory . How can knowledge be represented? . How can reasoning be automated? E1. Motivation and Introduction E2. P, NP and Polynomial Reductions I automata theory and formal languages X . What is a computation? E3. Cook-Levin Theorem I computability theory X E4. Some NP-Complete Problems, Part I . What can be computed at all? E5. Some NP-Complete Problems, Part II I complexity theory . What can be computed efficiently? Malte Helmert (University of Basel) Theory of Computer Science May 22, 2017 3 / 30 Malte Helmert (University of Basel) Theory of Computer Science May 22, 2017 4 / 30 Further Reading (German) Further Reading (English) Literature for this Chapter (German) Literature for this Chapter (English) Theoretische Informatik { kurz gefasst Introduction to the Theory of Computation by Uwe Sch¨oning(5th edition) by Michael Sipser (3rd edition) I Chapters 3.1 and 3.2 I Chapter 7.1{7.4 Malte Helmert (University of Basel) Theory of Computer Science May 22, 2017 5 / 30 Malte Helmert (University of Basel) Theory of Computer Science May 22, 2017 6 / 30 E2.
    [Show full text]
  • Theory of Computation Michael Sipser 18.404/6.840 Fall 2021 Course Information
    Theory of Computation Michael Sipser 18.404/6.840 Fall 2021 Course Information Instructor: Michael Sipser, 2{438, 617{253{4992, [email protected], office hours: Tu 4:15{5, zoom hours Mo 1-2pm (see website for zoom link) Homepage: math.mit.edu/18.404 TAs: Fadi Atieh, is [email protected] Kerri Lu, [email protected] Zhezheng Luo, [email protected] Rene Reyes, [email protected] George Stefanakis, [email protected] Sam Tenka, [email protected] Nathan Weckwerth, [email protected] Course Outline I Automata and Language Theory (2 weeks). Finite automata, regular expressions, push-down automata, context free grammars, pumping lemmas. II Computability Theory (3 weeks). Turing machines, Church{Turing thesis, decidability, halting problem, reducibility, recursion theorem. { Midterm Exam: Thursday, October 21, 2021. III Complexity Theory (7 weeks). Time and space measures of complexity, complexity classes P, NP, L, NL, PSPACE, BPP and IP, complete problems, P versus NP conjecture, quantifiers and games, hierarchy theorems, provably hard problems, relativized computation and oracles, probabilistic computation, interactive proof systems. { Final Exam: 3 hours, emphasizing second half of the course. Prerequisites To succeed in this course you require experience and skill with mathematical concepts, theorems, and proofs. If you did reasonably well in 6.042, 18.200, or any other proof-oriented mathematics course, you should be fine. The course moves quickly, covering about 90% of the textbook. The homework assignments generally require proving some statement, and creativity in finding proofs will be necessary. Note that 6.045 has a significant overlap with 18.404, depending on who is teaching 6.045, and it often uses the same book though it generally covers less, with less depth, and with easier problem sets.
    [Show full text]
  • A Fixed-Depth Size-Hierarchy Theorem for AC0 for Any S(N) Exp(No(1))
    A Fixed-Depth Size-Hierarchy Theorem for AC0[ ] via the Coin ⊕ Problem Nutan Limaye ∗ Karteek Sreenivasaiah † Srikanth Srinivasan ‡ Utkarsh Tripathi § S. Venkitesh ¶ February 21, 2019 Abstract In this work we prove the first Fixed-depth Size-Hierarchy Theorem for uniform AC0[ ]. In ⊕ 0 particular, we show that for any fixed d, the class d,k of functions that have uniform AC [ ] formulas of depth d and size nk form an infinite hierarchy.C We show this by exibiting the first⊕ class of explicit functions where we have nearly (up to a polynomial factor) matching upper and lower bounds for the class of AC0[ ] formulas. The explicit functions are derived⊕ from the δ-Coin Problem, which is the computational problem of distinguishing between coins that are heads with probability (1+ δ)/2 or (1 δ)/2, where δ is a parameter that is going to 0. We study the complexity of this problem and− make progress on both upper bound and lower bound fronts. Upper bounds. For any constant d 2, we show that there are explicit monotone • AC0 formulas (i.e. made up of AND and≥ OR gates only) solving the δ-coin problem that have depth d, size exp(O(d(1/δ)1/(d−1))), and sample complexity (i.e. number of inputs) poly(1/δ). This matches previous upper bounds of O’Donnell and Wimmer (ICALP 2007) and Amano (ICALP 2009) in terms of size (which is optimal) and improves the sample complexity from exp(O(d(1/δ)1/(d−1))) to poly(1/δ). Lower bounds.
    [Show full text]
  • Arxiv:Quant-Ph/0001106V1 28 Jan 2000
    Quantum Computation by Adiabatic Evolution Edward Farhi, Jeffrey Goldstone∗ Center for Theoretical Physics Massachusetts Institute of Technology Cambridge, MA 02139 Sam Gutmann† Department of Mathematics Northeastern University Boston, MA 02115 Michael Sipser‡ Department of Mathematics Massachusetts Institute of Technology Cambridge, MA 02139 MIT CTP # 2936 quant-ph/0001106 Abstract We give a quantum algorithm for solving instances of the satisfiability problem, based on adiabatic evolution. The evolution of the quantum state is governed by a time-dependent Hamiltonian that interpolates between an initial Hamiltonian, whose ground state is easy to construct, and a final Hamiltonian, whose ground state encodes the satisfying assignment. To ensure that the system evolves to the desired final ground state, the evolution time must be big enough. The time required depends on the minimum energy difference between the two lowest states of the interpolating Hamiltonian. We are unable to estimate this gap in general. We give some special symmetric cases of the satisfiability problem where the symmetry allows us to estimate the gap and we show that, in these cases, our algorithm runs in polynomial time. 1 Introduction We present a quantum algorithm for the satisfiability problem (and other combinatorial search prob- lems) that works on the principle of quantum adiabatic evolution. arXiv:quant-ph/0001106v1 28 Jan 2000 An n-bit instance of satisfiability is a formula C C C (1.1) 1 ∧ 2 ∧···∧ M where each clause Ca is True or False depending on the values of some subset of the bits. For a single clause, involving only a few bits, it is easy to imagine constructing a quantum device that evolves to a state that encodes the satisfying assignments of the clause.
    [Show full text]
  • P = NP Problem, and Consider One of the Great Open Problems of Science
    P =? NP Scott Aaronson∗ Abstract In 1950, John Nash sent a remarkable letter to the National Security Agency, in which— seeking to build theoretical foundations for cryptography—he all but formulated what today ? we call the P = NP problem, and consider one of the great open problems of science. Here I survey the status of this problem in 2016, for a broad audience of mathematicians, scientists, and engineers. I offer a personal perspective on what it’s about, why it’s important, why it’s reasonable to conjecture that P = NP is both true and provable, why proving it is so hard, the landscape of related problems,6 and crucially, what progress has been made in the last half- century toward solving those problems. The discussion of progress includes diagonalization and circuit lower bounds; the relativization, algebrization, and natural proofs barriers; and the recent works of Ryan Williams and Ketan Mulmuley, which (in different ways) hint at a duality between impossibility proofs and algorithms. Contents 1 Introduction 3 ? 1.1 The Importance of P = NP ................................ 4 ? 1.2 Objections to P = NP ................................... 5 1.2.1 TheAsymptoticObjection . .... 5 1.2.2 The Polynomial-Time Objection . ...... 6 1.2.3 TheKitchen-SinkObjection. ...... 7 1.2.4 The Mathematical Snobbery Objection . ....... 7 1.2.5 TheSourGrapesObjection . .... 8 1.2.6 TheObviousnessObjection . ..... 8 1.2.7 TheConstructivityObjection . ....... 9 1.3 FurtherReading .................................. .... 9 ? 2 Formalizing P = NP and Central Related Concepts 10 2.1 NP-Completeness...................................... 12 2.2 OtherCoreConcepts............................... ..... 14 2.2.1 Search, Decision, and Optimization . ......... 15 2.2.2 The Twilight Zone: Between P and NP-complete ...............
    [Show full text]
  • Lecture Notes
    ECS 120 Notes David Doty based on Introduction to the Theory of Computation by Michael Sipser ii Copyright c December 1, 2015, David Doty No part of this document may be reproduced without the expressed written consent of the author. All rights reserved. Based heavily (some parts copied) on Introduction to the Theory of Computation, by Michael Sipser. These are notes intended to assist in lecturing from Sipser's book; they are not entirely the original work of the author. Some proofs are also taken from Automata and Computability by Dexter Kozen. Contents Introduction v 0.1 Mathematical Background . viii 0.1.1 Implication Statements . viii 0.1.2 Sets . viii 0.1.3 Sequences and Tuples . ix 0.1.4 Functions and Relations . .x 0.1.5 Strings and Languages . xi 0.1.6 Graphs . xii 0.1.7 Boolean Logic . xii 0.2 Proof by Induction . xii 0.2.1 Proof by Induction on Natural Numbers . xii 0.2.2 Induction on Other Structures . xiii 1 Regular Languages 1 1.1 Finite Automata . .1 1.1.1 Formal Definition of a Finite Automaton (Syntax) . .1 1.1.2 More Examples . .3 1.1.3 Formal Definition of Computation by a DFA (Semantics) . .4 1.1.4 The Regular Operations . .5 1.2 Nondeterminism . .8 1.2.1 Formal Definition of an NFA (Syntax) . .9 1.2.2 Formal Definition of Computation by an NFA (Semantics) . 10 1.2.3 Equivalence of DFAs and NFAs . 11 1.3 Regular Expressions . 14 1.3.1 Formal Definition of a Regular Expression .
    [Show full text]
  • Introduction to the Theory of Computation
    Introduction to the Theory of Computation Michael Sipser Now Available from PWS Publishing Company. Errata CONTENTS OF THE FIRST EDITION 0. Introduction 1. AUTOMATA, COMPUTABILITY, AND COMPLEXITY Complexity theory - Computability theory - Automata theory 2. MATHEMATICAL NOTIONS AND TERMINOLOGY Sets - Sequences and tuples - Functions and relations - Graphs - Strings and languages - Boolean logic - Summary of mathematical terms 3. DEFINITIONS, THEOREMS, AND PROOFS Finding proofs 4. TYPES OF PROOF Proof by construction - Proof by contradiction - Proof by induction PART ONE: AUTOMATA AND LANGUAGES 1. Regular Languages 1. FINITE AUTOMATA Formal definition of a finite automaton - Examples of finite automata - Formal definition of computation - Designing finite automata - The regular operations 2. NONDETERMINISM Equivalence of NFAs and DFAs - Closure under the regular operations 3. REGULAR EXPRESSIONS Formal definition of a regular expression - Equivalence with finite automata 4. NONREGULAR LANGUAGES The pumping lemma for regular languages 2. Context-Free Languages 1. CONTEXT-FREE GRAMMARS Formal definition of a context-free grammar - Examples of context-free grammars - Designing context-free grammars - Ambiguity - Chomsky normal form 2. PUSHDOWN AUTOMATA Formal definition of a pushdown automaton - Examples of pushdown automata - Equivalence with context-free grammars 3. NON-CONTEXT-FREE LANGUAGES The pumping lemma for context-free languages PART TWO: COMPUTABILITY THEORY 3. The Church-Turing Thesis 1. TURING MACHINES Formal definition of a Turing machine - Examples of Turing machines 2. VARIANTS OF TURING MACHINES Multitape Turing machines - Nondeterministic Turing machines - Enumerators - Equivalence with other models 3. THE DEFINITION OF ALGORITHM Hilbert's problems - Terminology for describing Turing machines 4. Decidability 1. DECIDABLE LANGUAGES 2. THE HALTING PROBLEM The diagonalization method - The halting problem is undecidable - A Turing- unacceptable language 5.
    [Show full text]