Formalization of the Undecidability of the Halting Problem for a Functional Language
Thiago Mendonça Ferreira Ramos¹, César Muñoz², Mauricio Ayala-Rincón¹, Mariano Moscato³, Aaron Dutle², and Anthony Narkawicz²
¹ University of Brasília, Brasília, Brazil [email protected], [email protected]
² NASA Langley Research Center, Hampton, VA, USA {cesar.a.munoz,aaron.m.dutle,anthony.narkawicz}@nasa.gov
³ National Institute of Aerospace, Hampton, VA, USA [email protected]
Abstract. This paper presents a formalization of the proof of the undecidability of the halting problem for a functional programming language. The computational model consists of a simple first-order functional language called PVS0 whose operational semantics is specified in the Prototype Verification System (PVS). The formalization is part of a termination analysis library in PVS that includes the specification and equivalence proofs of several notions of termination. The proof of the undecidability of the halting problem required classical constructions such as mappings between naturals and PVS0 programs and inputs. These constructs are used to disprove the existence of a PVS0 program that decides termination of other programs, which gives rise to a contradiction.
1 Introduction
In computer science, program termination is the quintessential example of a property that is undecidable, a fact that is well-known as the undecidability of the halting problem [12]. This undecidability implies that it is not possible to build a compiler that would verify whether a program terminates for any given input. Despite this undecidability, it is possible to construct algorithms that partially decide termination, i.e., they correctly answer whether an input program “terminates or not”, but may also answer “do not know”. Termination analysis of programs is an active area of research. Indeed, substantial progress in this area is regularly presented in meetings such as the International Workshop on Termination and the Annual International Termination Competition.

To formally verify correctness of termination analysis algorithms, it is often necessary to specify and prove equivalence among multiple notions of termination. Given a formal model of computation, one natural notion of termination states that for every input there exists an output under the operational semantics of the model. Another notion of termination considers whether the depth of the expansion tree of computation steps is finite for all inputs. These two notions rely on the semantics of the computational model. A more syntactic approach, attributed to Turing [13], is to verify that the actual arguments decrease in any repeating control structure of the program, e.g., recursion, unbounded loop, fix-point, etc., according to some well-founded relation. This notion is used in the majority of proof assistants, where the user must provide the well-founded relation.

The main contribution of this work is the formalization in the Prototype Verification System (PVS) [10] of the theorem of undecidability of the halting problem for a model of computation given by a functional language called PVS0. The formal development includes the definition of PVS0, its operational semantics, and the specification and proof of several concepts used in termination analysis of PVS0 programs. For the undecidability proof of the halting problem, only the semantic notions of termination are used. Turing termination for the language PVS0 is also discussed to show how semantic and syntactic termination criteria are related. The formalization is available as part of the NASA PVS Library under the directory PVS0.⁴ All lemmas and theorems presented in this paper were formalized and verified in PVS.
2 Semantic Termination
The PVS0 language is a simple functional language whose expressions are described by the following grammar.

    expr ::= cnst | vr | op1(expr) | op2(expr, expr) | rec(expr) | ite(expr, expr, expr)

The grammar above is specified in PVS through the abstract data type:
    PVS0Expr[T:TYPE+] : DATATYPE
    BEGIN
      cnst(get_val:T) : cnst?
      vr : vr?
      op1(get_op:nat,get_arg:PVS0Expr) : op1?
      op2(get_op:nat,get_arg1,get_arg2:PVS0Expr) : op2?
      rec(get_arg:PVS0Expr) : rec?
      ite(get_cond,get_if,get_else:PVS0Expr) : ite?
    END PVS0Expr
A PVS specification of an abstract data type includes the constructors, e.g., ite, the accessors, e.g., get_cond, and the recognizers, e.g., ite?. In this data type, T is a parametric nonempty type that represents the type of values that serve as inputs and outputs of PVS0 programs. Furthermore, cnst is the constructor of constant values of type T, vr is the unique variable constructor, op1 and op2 are constructors of unary and binary operators, respectively, rec is the constructor of recursion, and ite is the constructor of conditional “if-then-else” expressions.
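To make the roles of constructors, accessors, and recognizers concrete, the data type can be mirrored outside PVS. The following is a minimal Python sketch, not part of the formalization: the class names, the base class, and the helper is_ite are assumptions chosen for illustration, while the field names follow the PVS accessors.

```python
from dataclasses import dataclass
from typing import Any


class PVS0Expr:
    """Base class standing in for the PVS data type PVS0Expr[T]."""


@dataclass(frozen=True)
class Cnst(PVS0Expr):
    get_val: Any          # constant value of type T


@dataclass(frozen=True)
class Vr(PVS0Expr):
    pass                  # the unique variable


@dataclass(frozen=True)
class Op1(PVS0Expr):
    get_op: int           # index of a built-in unary operator
    get_arg: PVS0Expr


@dataclass(frozen=True)
class Op2(PVS0Expr):
    get_op: int           # index of a built-in binary operator
    get_arg1: PVS0Expr
    get_arg2: PVS0Expr


@dataclass(frozen=True)
class Rec(PVS0Expr):
    get_arg: PVS0Expr     # argument of the recursive call


@dataclass(frozen=True)
class Ite(PVS0Expr):
    get_cond: PVS0Expr
    get_if: PVS0Expr
    get_else: PVS0Expr


def is_ite(e: PVS0Expr) -> bool:
    """Hypothetical counterpart of the PVS recognizer ite?."""
    return isinstance(e, Ite)


# Constructor, accessor, and recognizer applied to one expression.
e = Ite(Vr(), Rec(Op1(0, Vr())), Cnst(0))
print(is_ite(e), e.get_cond, is_ite(e.get_if))
```

In this sketch a constructor is a class call, an accessor is a field lookup, and a recognizer is an isinstance test; PVS generates all three from the single DATATYPE declaration.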
⁴ https://github.com/nasa/pvslib.
The first parameter of the constructors op1 and op2 is an index representing built-in unary and binary operators, respectively.

The uninterpreted type T and uninterpreted unary and binary operators enable the encoding of arbitrary first-order PVS functions as programs in PVS0. Indeed, the operational semantics of PVS0Expr is given in terms of a non-empty set Val, which interprets the type T. The type PVS0[Val] of PVS0 programs with values in Val consists of all 4-tuples of the form $(O_1, O_2, \bot, \mathit{expr})$, such that

– $O_1$ is a list of PVS functions of type $\mathit{Val} \to \mathit{Val}$, where $O_1(i)$, i.e., the $i$-th element of the list $O_1$, interprets the index $i$ in the constructor op1,
– $O_2$ is a list of PVS functions of type $\mathit{Val} \times \mathit{Val} \to \mathit{Val}$, where $O_2(i)$, i.e., the $i$-th element of the list $O_2$, interprets the index $i$ in the constructor op2,
– $\bot$ is a constant of type Val representing the Boolean value false in the conditional construction ite, and
– $\mathit{expr}$ is a PVS0Expr[Val], which is the syntactic representation of the program itself.
Henceforth, $|O_1|$ and $|O_2|$ represent the length of the lists $O_1$ and $O_2$, respectively. The choice of lists of functions for interpreting unary and binary operators helps in the enumeration of PVS0 programs, which is necessary in the undecidability proof.

Given a program $(O_1, O_2, \bot, e_f)$ of type PVS0[Val], the semantic evaluation of an expression $e$ of type PVS0Expr[Val] is given by the curried inductive relation $\varepsilon$ of type $\mathrm{PVS0}[\mathit{Val}] \to (\mathrm{PVS0Expr}[\mathit{Val}] \times \mathit{Val} \times \mathit{Val} \to \mathrm{bool})$ defined as follows. Intuitively, the relation $\varepsilon(O_1, O_2, \bot, e_f)(e, v_i, v_o)$ defined below holds when, given a program $(O_1, O_2, \bot, e_f)$, the evaluation of the expression $e$ on the input value $v_i$ is the value $v_o$.

$$
\begin{array}{l}
\varepsilon(O_1, O_2, \bot, e_f)(e, v_i, v_o) := \text{CASES } e \text{ OF}\\
\quad \mathtt{cnst}(v) : v_o = v;\\
\quad \mathtt{vr} : v_o = v_i;\\
\quad \mathtt{op1}(j, e_1) : j < |O_1| \wedge \exists\, v' \in \mathit{Val} :\\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge v_o = O_1(j)(v');\\
\quad \mathtt{op2}(j, e_1, e_2) : j < |O_2| \wedge \exists\, v', v'' \in \mathit{Val} :\\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge{}\\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_2, v_i, v'') \wedge{}\\
\qquad v_o = O_2(j)(v', v'');\\
\quad \mathtt{rec}(e_1) : \exists\, v' \in \mathit{Val} : \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge{}\\
\qquad \varepsilon(O_1, O_2, \bot, e_f)(e_f, v', v_o);\\
\quad \mathtt{ite}(e_1, e_2, e_3) : \exists\, v' : \varepsilon(O_1, O_2, \bot, e_f)(e_1, v_i, v') \wedge{}\\
\qquad \text{IF } v' \neq \bot \text{ THEN } \varepsilon(O_1, O_2, \bot, e_f)(e_2, v_i, v_o)\\
\qquad \text{ELSE } \varepsilon(O_1, O_2, \bot, e_f)(e_3, v_i, v_o).
\end{array}
$$

In the definition of $\varepsilon$, the parameters $e_f$ and $e$ are needed since the evaluation of a program $(O_1, O_2, \bot, e_f)$ leads to the evaluation of subexpressions $e$ of $e_f$ and, when a recursive call is evaluated, the whole expression $e_f$ should be considered again (see the recursive case $\mathtt{rec}(e_1)$ above).
For example, consider below a PVS0 program that computes the Ackermann function.
Example 1. Let Val be the set $\mathbb{N} \times \mathbb{N}$ of pairs of natural numbers, $\top = (1, 0)$, $\bot = (0, 0)$, and $a$ be the PVS0 program $(O_1, O_2, \bot, e_a)$, where
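$$
\begin{array}{lcl}
O_1(0)(m, n) & := & \text{IF } m = 0 \text{ THEN } \top \text{ ELSE } \bot,\\
O_1(1)(m, n) & := & \text{IF } n = 0 \text{ THEN } \top \text{ ELSE } \bot,\\
O_1(2)(m, n) & := & (n + 1, 0),\\
O_1(3)(m, n) & := & \text{IF } m > 0 \text{ THEN } (m - 1, 1) \text{ ELSE } \bot,\\
O_1(4)(m, n) & := & \text{IF } n > 0 \text{ THEN } (m, n - 1) \text{ ELSE } \bot,\\
O_2(0)((m, n), (i, j)) & := & \text{IF } m > 0 \text{ THEN } (m - 1, i) \text{ ELSE } \bot,\\
e_a & := & \mathtt{ite}(\mathtt{op1}(0, \mathtt{vr}),\ \mathtt{op1}(2, \mathtt{vr}),\\
& & \quad \mathtt{ite}(\mathtt{op1}(1, \mathtt{vr}),\ \mathtt{rec}(\mathtt{op1}(3, \mathtt{vr})),\\
& & \qquad \mathtt{rec}(\mathtt{op2}(0, \mathtt{vr}, \mathtt{rec}(\mathtt{op1}(4, \mathtt{vr})))))).
\end{array}
$$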
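The program of Example 1 can be run directly once the evaluation relation is read as a recursive procedure. The sketch below is an illustration in Python, not the PVS development: the nested-tuple encoding of expressions and the names evaluate, TOP, BOT, O1, O2, EA, and A are assumptions made here. It transcribes the operators and the expression of Example 1 and compares the first component of the result with a direct recursive definition of the Ackermann function on a few small inputs.

```python
# Expressions are nested tuples tagged by constructor name (an encoding
# assumed here for brevity): ("cnst", v), ("vr",), ("op1", j, e),
# ("op2", j, e1, e2), ("rec", e), ("ite", e1, e2, e3).

def evaluate(prog, e, vi):
    """Compute the output related to (e, vi) by the evaluation relation.

    Raises IndexError when an operator index is out of range, and may fail
    to return (RecursionError in practice) when no output exists.
    """
    o1, o2, bot, ef = prog
    tag = e[0]
    if tag == "cnst":
        return e[1]
    if tag == "vr":
        return vi
    if tag == "op1":
        return o1[e[1]](evaluate(prog, e[2], vi))
    if tag == "op2":
        return o2[e[1]](evaluate(prog, e[2], vi), evaluate(prog, e[3], vi))
    if tag == "rec":
        # Evaluate the argument, then restart the whole program body on it.
        return evaluate(prog, ef, evaluate(prog, e[1], vi))
    if tag == "ite":
        # Any value different from bot counts as true.
        branch = e[2] if evaluate(prog, e[1], vi) != bot else e[3]
        return evaluate(prog, branch, vi)
    raise ValueError(f"unknown constructor: {tag!r}")


TOP, BOT = (1, 0), (0, 0)

# Unary and binary operators of Example 1, acting on pairs of naturals.
O1 = [
    lambda v: TOP if v[0] == 0 else BOT,
    lambda v: TOP if v[1] == 0 else BOT,
    lambda v: (v[1] + 1, 0),
    lambda v: (v[0] - 1, 1) if v[0] > 0 else BOT,
    lambda v: (v[0], v[1] - 1) if v[1] > 0 else BOT,
]
O2 = [lambda v, w: (v[0] - 1, w[0]) if v[0] > 0 else BOT]

VR = ("vr",)
EA = ("ite", ("op1", 0, VR), ("op1", 2, VR),
      ("ite", ("op1", 1, VR), ("rec", ("op1", 3, VR)),
       ("rec", ("op2", 0, VR, ("rec", ("op1", 4, VR))))))
A = (O1, O2, BOT, EA)


def ackermann(m, n):
    """Reference definition, following the recursive PVS function."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


# The first component of the output pair carries the result.
for m in range(3):
    for n in range(4):
        assert evaluate(A, EA, (m, n))[0] == ackermann(m, n)
print("PVS0 program agrees with ackermann on the sampled inputs")
```

Such a test only samples finitely many inputs and relies on the input pair being ordered as (m, n); the equivalence for all inputs is what the PVS proof establishes.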
It is proved in PVS that $a$ computes the Ackermann function, i.e., for any $n, m, k \in \mathbb{N}$, $\mathit{ackermann}(m, n) = k$ if and only if $\varepsilon(a)(e_a, (m, n), (k, i))$, for some $i$, where $\mathit{ackermann}$ is the recursive function defined in PVS as

$$
\begin{array}{l}
\mathit{ackermann}(m, n) := \text{IF } m = 0 \text{ THEN } n + 1\\
\quad \text{ELSIF } n = 0 \text{ THEN } \mathit{ackermann}(m - 1, 1)\\
\quad \text{ELSE } \mathit{ackermann}(m - 1, \mathit{ackermann}(m, n - 1)).
\end{array}
$$

In the definition of $a$, the type Val encodes the two inputs of the Ackermann function, but also the output of the function, which is given by the first entry of the second pair.

The proof of one of the implications in the statement of Example 1 proceeds by induction using a lexicographic order on $(m, n)$. The other implication is proved using the induction schema generated for the inductive relation $\varepsilon$. Although it is not logically deep, this proof is tedious and long. However, it is mechanizable assuming that the PVS function and the PVS0 program share the same syntactical structure. As part of the work presented in this paper, a PVS strategy that automatically discharges equivalences between PVS functions and PVS0 programs was developed. This strategy is convenient since one of the objectives of this work is to reason about computational aspects of PVS functions through their embeddings in PVS0.

Example 1 also illustrates the use of built-in operators in PVS0. Despite its simplicity, this language is not minimal from a fundamental point of view. Indeed, since the type T is generic, any PVS function can be used as a building block in the construction of a PVS0 program. This feature is justified since all PVS functions are total. Therefore, they can be considered as atomic built-in operators. However, in contrast to proof assistants based on constructive logic, PVS allows for the definition of non-computable functions. The consequences of these features will be clear in the undecidability proof of the halting problem.

The following lemma states that the semantic evaluation relation $\varepsilon$ is deterministic.
Lemma 1. Let pvso be a program of type PVS0[Val]. For any expression $e$ of type PVS0Expr[Val] and all values $v_i, v_o', v_o'' \in \mathit{Val}$,

$$
\varepsilon(\mathit{pvso})(e, v_i, v_o') \text{ and } \varepsilon(\mathit{pvso})(e, v_i, v_o'') \text{ implies } v_o' = v_o''.
$$
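The proof of this lemma uses the induction schema generated for the inductive relation $\varepsilon$. The relation $\varepsilon$ is functional but not total, i.e., there are programs pvso and values $v_i$ for which there is no value $v_o$ that satisfies $\varepsilon(\mathit{pvso})(\mathit{pvso}_e, v_i, v_o)$, where $\mathit{pvso}_e$ is the program expression in pvso. For instance, no output exists for a program whose expression is rec(vr), since evaluating it on an input requires evaluating the same expression on the same input again. This suggests the following definition of the semantic termination predicate.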
$$
T_\varepsilon(\mathit{pvso}, v_i) := \exists\, v_o \in \mathit{Val} : \varepsilon(\mathit{pvso})(\mathit{pvso}_e, v_i, v_o).
$$
This predicate states that for a given program pvso and input $v_i$, the evaluation of the program’s expression $\mathit{pvso}_e$ on the value $v_i$ terminates with some output value $v_o$. The program pvso is total with respect to $\varepsilon$ if it satisfies the following predicate.⁵

$$
T_\varepsilon(\mathit{pvso}) := \forall\, v \in \mathit{Val} : T_\varepsilon(\mathit{pvso}, v).
$$

Semantic termination can also be specified by a function of type $\mathrm{PVS0}[\mathit{Val}] \to (\mathrm{PVS0Expr}[\mathit{Val}] \times \mathit{Val} \times \mathbb{N} \to \mathit{Val} \cup \{\diamondsuit\})$ defined as follows.