Formalization of the Undecidability of the Halting Problem for a Functional Language

Thiago Mendonça Ferreira Ramos¹, César Muñoz², Mauricio Ayala-Rincón¹, Mariano Moscato³, Aaron Dutle², and Anthony Narkawicz²

¹ University of Brasília, Brasília, Brazil
[email protected], [email protected]
² NASA Langley Research Center, Hampton, VA, USA
{cesar.a.munoz,aaron.m.dutle,anthony.narkawicz}@nasa.gov
³ National Institute of Aerospace, Hampton, VA, USA
[email protected]

Abstract. This paper presents a formalization of the proof of the undecidability of the halting problem for a functional programming language. The computational model consists of a simple first-order functional language called PVS0 whose operational semantics is specified in the Prototype Verification System (PVS). The formalization is part of a termination analysis library in PVS that includes the specification and equivalence proofs of several notions of termination. The proof of the undecidability of the halting problem required classical constructions such as mappings between naturals and PVS0 programs and inputs. These constructions are then used to show that the existence of a PVS0 program deciding termination of other programs gives rise to a contradiction.

1 Introduction

In computer science, program termination is the quintessential example of a property that is undecidable, a fact that is well-known as the undecidability of the halting problem [12]. This undecidability implies that it is not possible to build a compiler that would verify whether a program terminates for any given input. Despite this undecidability, it is possible to construct algorithms that partially decide termination, i.e., they correctly answer whether an input program "terminates or not", but may also answer "do not know".

Termination analysis of programs is an active area of research.
Indeed, substantial progress in this area is regularly presented in meetings such as the International Workshop on Termination and the Annual International Termination Competition.

To formally verify correctness of termination analysis algorithms, it is often necessary to specify and prove equivalence among multiple notions of termination. Given a formal model of computation, one natural notion of termination is specified as follows: for every input, there exists an output under the operational semantics of the model. Another notion of termination could be specified by considering whether or not the depth of the expansion tree of computation steps is finite for all inputs. These two notions rely on the semantics of the computational model. A more syntactic approach, attributed to Turing [13], is to verify that the actual arguments decrease in any repeating control structure of the program, e.g., recursion, unbounded loop, fix-point, etc., according to some well-founded relation. This notion is used in the majority of proof assistants, where the user must provide the well-founded relation.

The main contribution of this work is the formalization in the Prototype Verification System (PVS) [10] of the theorem of undecidability of the halting problem for a model of computation given by a functional language called PVS0. The formal development includes the definition of PVS0, its operational semantics, and the specification and proof of several concepts used in termination analysis of PVS0 programs. For the undecidability proof of the halting problem, only the semantic notions of termination are used. Turing termination for the language PVS0 is also discussed to show how semantic and syntactic termination criteria are related.

The formalization is available as part of the NASA PVS Library under the directory PVS0.⁴ All lemmas and theorems presented in this paper were formalized and verified in PVS.
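The syntactic criterion attributed to Turing above can be illustrated with a small sketch. It is written in Python rather than PVS, and it only checks the decrease condition at run time on the calls that actually occur, so it is not a proof. The function name `ackermann` and the extra `parent` parameter are assumptions of this sketch. The well-founded relation used is the lexicographic order on pairs of naturals.

```python
from typing import Optional, Tuple


def ackermann(m: int, n: int, parent: Optional[Tuple[int, int]] = None) -> int:
    """Ackermann function with a run-time check of the decrease condition.

    `parent` holds the arguments of the calling instance (None at top level).
    """
    # Python compares tuples lexicographically; on pairs of naturals this
    # order is well-founded, so a strict decrease at every recursive call
    # rules out infinite chains of calls.
    if parent is not None:
        assert (m, n) < parent, "actual arguments did not decrease"
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1, (m, n))
    return ackermann(m - 1, ackermann(m, n - 1, (m, n)), (m, n))
```

A proof assistant discharges the same three decrease conditions once and for all as proof obligations, instead of testing them on individual calls as done here.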
2 Semantic Termination

The PVS0 language is a simple functional language whose expressions are described by the following grammar.

    expr ::= cnst | vr | op1(expr) | op2(expr, expr) | rec(expr) | ite(expr, expr, expr)

The grammar above is specified in PVS through the abstract data type:

    PVS0Expr[T:TYPE+] : DATATYPE
    BEGIN
      cnst(get_val:T) : cnst?
      vr : vr?
      op1(get_op:nat, get_arg:PVS0Expr) : op1?
      op2(get_op:nat, get_arg1,get_arg2:PVS0Expr) : op2?
      rec(get_arg:PVS0Expr) : rec?
      ite(get_cond,get_if,get_else:PVS0Expr) : ite?
    END PVS0Expr

A PVS specification of an abstract data type includes the constructors, e.g., ite, the accessors, e.g., get_cond, and the recognizers, e.g., ite?. In this data type, T is a parametric nonempty type that represents the type of values that serve as inputs and outputs of PVS0 programs. Furthermore, cnst is the constructor of constant values of type T, vr is the unique variable constructor, op1 and op2 are constructors of unary and binary operators, respectively, rec is the constructor of recursion, and ite is the constructor of conditional "if-then-else" expressions. The first parameter of the constructors op1 and op2 is an index representing built-in unary and binary operators, respectively.

⁴ https://github.com/nasa/pvslib

The uninterpreted type T and uninterpreted unary and binary operators enable the encoding of arbitrary first-order PVS functions as programs in PVS0. Indeed, the operational semantics of PVS0Expr is given in terms of a non-empty set $Val$, which interprets the type T. The type PVS0[$Val$] of PVS0 programs with values in $Val$ consists of all 4-tuples of the form $(O_1, O_2, \bot, expr)$, such that

– $O_1$ is a list of PVS functions of type $Val \to Val$, where $O_1(i)$, i.e., the $i$-th element of the list $O_1$, interprets the index $i$ in the constructor op1,
– $O_2$ is a list of PVS functions of type $Val \times Val \to Val$, where $O_2(i)$, i.e., the $i$-th element of the list $O_2$, interprets the index $i$ in the constructor op2,
– $\bot$ is a constant of type $Val$ representing the Boolean value false in the conditional construction ite, and
– $expr$ is a PVS0Expr[$Val$], which is the syntactic representation of the program itself.

Henceforth, $|O_1|$ and $|O_2|$ represent the lengths of the lists $O_1$ and $O_2$, respectively. The choice of lists of functions for interpreting unary and binary operators helps in the enumeration of PVS0 programs, which is necessary in the undecidability proof.

Given a program $(O_1, O_2, \bot, e_f)$ of type PVS0[$Val$], the semantic evaluation of an expression $e$ of type PVS0Expr[$Val$] is given by the curried inductive relation $\varepsilon$ of type PVS0[$Val$] $\to$ (PVS0Expr[$Val$] $\times$ $Val$ $\times$ $Val$) $\to$ bool defined below. Intuitively, the relation $\varepsilon(O_1, O_2, \bot, e_f)(e, v_i, v_o)$ holds when, given a program $(O_1, O_2, \bot, e_f)$, the evaluation of the expression $e$ on the input value $v_i$ is the value $v_o$.

\[
\begin{array}{l}
\varepsilon(O_1, O_2, \bot, e_f)(e, v_i, v_o) := \texttt{CASES } e \texttt{ OF} \\
\quad \texttt{cnst}(v) : v_o = v; \\
\quad \texttt{vr} : v_o = v_i; \\
\quad \texttt{op1}(j, e_1) : j < |O_1| \wedge \exists\, v' \in Val : \\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge v_o = O_1(j)(v'); \\
\quad \texttt{op2}(j, e_1, e_2) : j < |O_2| \wedge \exists\, v', v'' \in Val : \\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge {} \\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_2, v_i, v'') \wedge {} \\
\qquad v_o = O_2(j)(v', v''); \\
\quad \texttt{rec}(e_1) : \exists\, v' \in Val : \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge {} \\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_f, v', v_o); \\
\quad \texttt{ite}(e_1, e_2, e_3) : \exists\, v' : \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge {} \\
\qquad \texttt{IF } v' \neq \bot \texttt{ THEN } \varepsilon(O_1, O_2, \bot, e_f)(e_2, v_i, v_o) \\
\qquad \texttt{ELSE } \varepsilon(O_1, O_2, \bot, e_f)(e_3, v_i, v_o).
\end{array}
\]

In the definition of $\varepsilon$, the parameters $e_f$ and $e$ are needed since the evaluation of a program $(O_1, O_2, \bot, e_f)$ leads to evaluation of subexpressions $e$ of $e_f$ and, when a recursive call is evaluated, the whole expression $e_f$ should be considered again (see the recursive case $\texttt{rec}(e_1)$ above).

For example, consider below a PVS0 program that computes the Ackermann function.

Example 1.
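Let $Val$ be the set $\mathbb{N} \times \mathbb{N}$ of pairs of natural numbers, $\top = (1, 0)$, $\bot = (0, 0)$, and $a$ be the PVS0 program $(O_1, O_2, \bot, e_a)$, where

\[
\begin{array}{l}
O_1(0)(m, n) := \texttt{IF } m = 0 \texttt{ THEN } \top \texttt{ ELSE } \bot, \\
O_1(1)(m, n) := \texttt{IF } n = 0 \texttt{ THEN } \top \texttt{ ELSE } \bot, \\
O_1(2)(m, n) := (n + 1, 0), \\
O_1(3)(m, n) := \texttt{IF } m > 0 \texttt{ THEN } (m - 1, 1) \texttt{ ELSE } \bot, \\
O_1(4)(m, n) := \texttt{IF } n > 0 \texttt{ THEN } (m, n - 1) \texttt{ ELSE } \bot, \\
O_2(0)((m, n), (i, j)) := \texttt{IF } m > 0 \texttt{ THEN } (m - 1, i) \texttt{ ELSE } \bot, \\
e_a := \texttt{ite}(\texttt{op1}(0, \texttt{vr}), \texttt{op1}(2, \texttt{vr}), \\
\qquad \texttt{ite}(\texttt{op1}(1, \texttt{vr}), \texttt{rec}(\texttt{op1}(3, \texttt{vr})), \\
\qquad\quad \texttt{rec}(\texttt{op2}(0, \texttt{vr}, \texttt{rec}(\texttt{op1}(4, \texttt{vr})))))).
\end{array}
\]

It is proved in PVS that $a$ computes the Ackermann function, i.e., for any $n, m, k \in \mathbb{N}$, $ackermann(m, n) = k$ if and only if $\varepsilon(a)(e_a, (m, n), (k, i))$, for some $i$, where $ackermann$ is the recursive function defined in PVS as

\[
\begin{array}{l}
ackermann(m, n) := \texttt{IF } m = 0 \texttt{ THEN } n + 1 \\
\qquad \texttt{ELSIF } n = 0 \texttt{ THEN } ackermann(m - 1, 1) \\
\qquad \texttt{ELSE } ackermann(m - 1, ackermann(m, n - 1)).
\end{array}
\]

In the definition of $a$, the type $Val$ encodes the two inputs of the Ackermann function, but also the output of the function, which is given by the first entry of the second pair, i.e., $k$ in $(k, i)$.

The proof of one of the implications in the statement of Example 1 proceeds by induction using a lexicographic order on $(m, n)$. The other implication is proved using the induction schema generated for the inductive relation $\varepsilon$. Although it is not logically deep, this proof is tedious and long.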
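To make the evaluation relation and Example 1 concrete, the following is a minimal executable sketch, in Python rather than PVS, of an evaluator for PVS0 expressions together with the Ackermann program $a$. It is not the PVS development itself. The class names, the function `evaluate`, and the `fuel` parameter (a bound on the nesting depth of rec evaluations, added so that the evaluator always returns) are assumptions of this sketch. The sketch also assumes that `None` is not a value in $Val$, since it uses `None` to mean that no output was found within the bound.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Constructors mirroring the PVS0Expr datatype (Python names are hypothetical).
@dataclass(frozen=True)
class Cnst:
    val: Any

@dataclass(frozen=True)
class Vr:
    pass

@dataclass(frozen=True)
class Op1:
    op: int
    arg: Any

@dataclass(frozen=True)
class Op2:
    op: int
    arg1: Any
    arg2: Any

@dataclass(frozen=True)
class Rec:
    arg: Any

@dataclass(frozen=True)
class Ite:
    cond: Any
    then: Any
    els: Any

def evaluate(prog, e, vi, fuel: int) -> Optional[Any]:
    """Evaluate e on input vi in program prog = (O1, O2, bot, e_f).

    Returns the output value, or None when an operator index is out of
    range or when rec evaluations nest deeper than `fuel`.
    """
    o1, o2, bot, ef = prog
    if isinstance(e, Cnst):
        return e.val
    if isinstance(e, Vr):
        return vi
    if isinstance(e, Op1):
        if e.op >= len(o1):
            return None
        v = evaluate(prog, e.arg, vi, fuel)
        return None if v is None else o1[e.op](v)
    if isinstance(e, Op2):
        if e.op >= len(o2):
            return None
        v1 = evaluate(prog, e.arg1, vi, fuel)
        v2 = evaluate(prog, e.arg2, vi, fuel)
        return None if v1 is None or v2 is None else o2[e.op](v1, v2)
    if isinstance(e, Rec):
        if fuel == 0:
            return None
        v = evaluate(prog, e.arg, vi, fuel)
        # A recursive call restarts evaluation of the whole body e_f.
        return None if v is None else evaluate(prog, ef, v, fuel - 1)
    if isinstance(e, Ite):
        c = evaluate(prog, e.cond, vi, fuel)
        if c is None:
            return None
        return evaluate(prog, e.then if c != bot else e.els, vi, fuel)
    raise TypeError(f"not a PVS0 expression: {e!r}")

# The program a of Example 1, with Val = pairs of naturals.
TOP, BOT = (1, 0), (0, 0)
O1 = [
    lambda p: TOP if p[0] == 0 else BOT,
    lambda p: TOP if p[1] == 0 else BOT,
    lambda p: (p[1] + 1, 0),
    lambda p: (p[0] - 1, 1) if p[0] > 0 else BOT,
    lambda p: (p[0], p[1] - 1) if p[1] > 0 else BOT,
]
O2 = [lambda p, q: (p[0] - 1, q[0]) if p[0] > 0 else BOT]
E_A = Ite(Op1(0, Vr()), Op1(2, Vr()),
          Ite(Op1(1, Vr()), Rec(Op1(3, Vr())),
              Rec(Op2(0, Vr(), Rec(Op1(4, Vr()))))))
ACK = (O1, O2, BOT, E_A)

result = evaluate(ACK, E_A, (2, 3), fuel=50)  # (9, 0): ackermann(2, 3) = 9
```

With too small a bound, e.g. `fuel=3` on the same input, the sketch returns `None`. This corresponds to a "do not know" answer rather than to a claim of non-termination.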
