Verified Programming of Turing Machines in Coq
Total Page:16
File Type:pdf, Size:1020Kb
Verified Programming of Turing Machines in Coq Yannick Forster Fabian Kunze Maximilian Wuttke Saarland University Saarland University Saarland University Saarbrücken, Germany Saarbrücken, Germany Saarbrücken, Germany [email protected] [email protected] [email protected] Abstract of their biggest disadvantages: When it comes to detailed We present a framework for the verified programming of or formal reasoning, Turing machines soon become very multi-tape Turing machines in Coq. Improving on prior work hard to treat. This is maybe best reflected by the fact that by Asperti and Ricciotti in Matita, we implement multiple while many basic areas of computer science, like logic, gram- layers of abstraction. The highest layer allows a user to im- mars, automata, or programming languages theory have plement nontrivial algorithms as Turing machines and verify been formalised in proof assistants, formalisations of even 1 their correctness, as well as time and space complexity com- basic complexity-theoretic results are not available. While positionally. The user can do so without ever mentioning constructing Turing machines on paper might be possible, states, symbols on tapes or transition functions: They write verifying a non-trivial machine defined in terms of states programs in an imperative language with registers contain- and transition functions in a proof assistant seems entirely ing values of encodable data types, and our framework con- infeasible. structs corresponding Turing machines. There were several attempts of formalising Turing ma- As case studies, we verify a translation from multi-tape to chines in proof assistants. Asperti and Ricciotti[2015] and single-tape machines as well as a universal Turing machine, Xu, Zhang, and Urban[2013] verify universal Turing ma- both with polynomial time overhead and constant factor chines in Matita and Isabelle/HOL, respectively, and Ciaffagli- space overhead. one[2016] formalises the undecidability of Turing machine halting in Coq. However, none of these results analyse time CCS Concepts • Theory of computation → Turing ma- or space complexity of their machines. chines; Type theory. The main difficulty for detailed reasoning about Turing Keywords Turing machines, verification, universal machine, machines is their lack of compositionality. For example, it is Coq not clear at all how to compose a two-tape Turing machine with a three-tape Turing machine that works on a different ACM Reference Format: alphabet. Therefore, it is common to rely on pseudo code or Yannick Forster, Fabian Kunze, and Maximilian Wuttke. 2020. Veri- prose describing the intended behaviour. The exact imple- fied Programming of Turing Machines in Coq. In Proceedings of mentation as well as its correctness or resource analysis is the 9th ACM SIGPLAN International Conference on Certified Pro- left as an exercise to the reader. In a mechanised proof, those grams and Proofs (CPP ’20), January 20–21, 2020, New Orleans, LA, USA. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/ details cannot be left out. Luckily, it is possible to hide those 3372885.3373816 details behind suitable abstractions. We present a framework that aims to have the cake and 1 Introduction eat it too when it comes to mechanising computation in terms of Turing machines: Algorithms are stated in the style Turing machines are, at least on paper, the foundation of of a register based while-language; a corresponding Turing modern computability and complexity theory, in part due machine is automatically constructed behind the scene. Our to the conceptual simplicity of their definition. However, framework furthermore characterises the semantics by de- this simplicity leads to a lack of structure, which is also one riving two relations for each machine, one witnessing partial Permission to make digital or hard copies of all or part of this work for correctness, which can subsume a space-consumption analy- personal or classroom use is granted without fee provided that copies sis, and one witnessing termination, which can subsume a are not made or distributed for profit or commercial advantage and that running time analysis. These relations are similar to relations copies bear this notice and the full citation on the first page. Copyrights a Hoare-like logic would derive for the algorithm, especially for components of this work owned by others than the author(s) must in that they follow the internal structure of the program. The be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific only task left for the user is to simplify those synthesised permission and/or a fee. Request permissions from [email protected]. relations into a more high-level, hand-written description of CPP ’20, January 20–21, 2020, New Orleans, LA, USA the semantics. © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-7097-4/20/01...$15.00 1For example the Cook–Levin theorem (SAT is NP-complete), the inclusions https://doi.org/10.1145/3372885.3373816 % ⊆ NP ⊆ PSPACE ⊆ EXP, or the time and space hierarchy theorems. CPP ’20, January 20–21, 2020, New Orleans, LA, USA Yannick Forster, Fabian Kunze, and Maximilian Wuttke Our imperative abstractions for Turing machines are shal- the value of a tape that contains a number, or to copy the lowly embedded into Coq’s type theory: Primitive operations value from one tape to another tape. are predefined Turing machines performing primitive tasks. As case studies, we give a mechanisation of a translation Control-flow operators like If or While are Coq-functions from multi-tape to single-tape machines as well as a mechan- constructing new Turing machines from existing ones. isation of a universal Turing machine in Coq. Both machines To make reasoning and programming of Turing machines have a polynomial time and constant factor space overhead. feasible, we introduce three layers of abstractions, L1 - L3. To the best of our knowledge, this is the first formalised The lower layers are heavily inspired by the definitions As- universal machine verified w.r.t. time and space complexity perti and Ricciotti[2015] use to formalise multi-tape Turing for any model of computation in any proof assistant. machines in the proof assistant Matita. Definitions, lemmas, and theorems in the PDF-version of On the lowest layer L0 (which is actually not an abstrac- this document are hyperlinked with the accompanying Coq = " = 2 tion), we define -tape Turing machines : TMΣ over finite development ; those links are marked by a -symbol. alphabets Σ and their semantics in Coq, based on the defini- Structure Section2 introduces notation and type-theoretic tions by Asperti and Ricciotti. preliminaries. Sections3 to6 give an overview over the layers Layer L introduces labelled machines " : TM= ¹!º, which 1 Σ L - L . We report on the implementation of a translation additionally contain an arbitrary finite type ! together with a 0 3 from multi-tape to single-tape machines and a universal function labelling every state of the machine with an element machine in Sections7 and8. We conclude by commenting of this type. Labels can be seen as a partitioning of states, on the mechanised proofs (Section9), an overview of related abstracting away from implementation details. Based on work (Section 10), and a discussion of our results and future this notion we define two verification primitives: realisation work (Section 11). The appendix [Forster et al. 2019] contains (partial correctness) and termination. A labelled machine " = ! ' = ! = a full definition of Turing machines with all details. : TMΣ ¹ º realises a relation ⊆ TapeΣ × ¹ × TapeΣº, " ' written ⊨ , if for every terminating computation, the 2 Preliminaries input tapes are in relation with the label of the terminating state and the output tapes. Dually, a machine terminates in We work in constructive type theory with inductive types ) = " ) and an impredicative universe of propositions. a running time relation ⊆ TapeΣ × N, written # , if for every related pair of input tapes C and step counts :, the The basic inductive types we use are the Booleans B ::= 1 machine actually terminates in : steps on input C. Realisation true j false, the unit type ::= ¹º, the natural numbers = = - . and termination can be used to express the correctness of a N ::= 0 j S for : N, product types × , sum types - . = machine. ¸ , and the type with exactly elements F=. Given a - - G Programming machines on layer L is quite hard because type , we further define options O¹ º ::= ; j b c and lists 1 - G 퐴 G - 퐴 - we have to define concrete transition functions. We thus L¹ º ::= »¼ j :: for : and : L¹ º. These notations are shared with vectors G® : - = of fixed length = : N. We introduce layer L2 with primitive machines that can read, G - = 8 write and move the head as well as lifting operators and index elements of vectors ® : by indices : F= writing G 8 5 퐴 control-flow operators. ®» ¼. We write @ to map a function over a list, vector or Lifting operators and control-flow operators can be used tape. To inline case distinctions over inductive values inside formulas, we write match B with ? ) A j ... with patterns to compose machines. For example, given machines "1 : 1 1 = ! " < !0 ?1 and results A8 for pattern matching. TMΣ ¹ º and 2 : TMΓ ¹ º, we can use tape-lifting operators to obtain machines " 0 : TM=¸< ¹!º and " 0 : TM=¸< ¹!0º, and We encode relations as functions returning a proposition. 1 Σ 2 Γ _ =< . = < alphabet-liftings to obtain machines " 00 : TM=¸< ¹!º and Thus ¹ : Nº = defines the identity relation on 1 Σ¸Γ ' 퐴 퐵 " 00 : TM=¸< ¹!0º. We can then use control-flow operators to natural numbers. We write ⊆ × as an abbreviation 2 Σ¸Γ ' 퐴 퐵 ' ' compose them into a machine " 00; " 00 : TM=¸< ¹!0º.