Lost in Extraction, Recovered an Approach to Compensate for the Loss of Properties and Type Dependencies When Extracting Coq Components to Ocaml

Extended Abstract @ ML Family Workshop 2015 Lost in Extraction, Recovered An Approach to Compensate for the Loss of Properties and Type Dependencies when Extracting Coq Components to OCaml Eric´ Tanter Nicolas Tabareau PLEIAD Lab Inria Computer Science Department (DCC) Nantes, France University of Chile, Santiago, Chile [email protected] [email protected] Abstract The situation can be even worse with type dependencies, since Extracting Coq programs to OCaml enables a model of software extracting dependent structures to OCaml typically introduces un- development in which some critical components of a system can safe operations. Consequently, it is easy for an OCaml client to be developed in Coq and proven correct, before being extracted to produce segfaults. Consider the following example adapted from OCaml and integrated with the rest of the system. However, due CPDT [1], in which the types of the instructions for a stack ma- to the gap between Coq and OCaml, key features like properties chine are explicit about their effect on the size of the stack: and type dependencies disappear upon extraction, leading to po- Inductive dinstr : nat ! nat ! Set := tentially unsafe and unsound executions of certified components. j IConst : 8 n, nat ! dinstr n (S n) We describe an approach to protect extracted components through j IPlus : 8 n, dinstr (S (S n)) (S n). mostly automatic generation of runtime checks, ensuring that the assumptions made by certified components are checked dynami- An IConst instruction operates on any stack of size n, and cally when interacting with non-certified components. produces a stack of size (S n), where S is the successor constructor of nats. Similarly, an IPlus instruction consumes two values from the stack (hence the stack size must have the form (S (S n))), and 1. Introduction pushes back one value. A dependently-typed stack of depth n is OCaml Programmers should be able to exploit program extraction represented by nested pairs: to integrate Coq certified components within their systems, while ensuring that the assumptions made by the certified components Fixpoint dstack( n : nat): Set := are enforced. However, extraction of Coq components to OCaml match n with suffers from a number of limitations that can lead to unsound or j O ) unit even unsafe executions, e.g. yielding segmentation faults. j S n’ ) nat × dstack n’ While some unsafe executions can be produced by using ef- end%type. fectful functions as arguments to functions extracted from Coq— The exec function, which executes an instruction on a given because referential transparency is broken—we focus on two fun- stack and returns the new stack can be defined as follows: damental issues that manifest even when the OCaml code is pure: properties and type dependencies. Definition exec nm (i : dinstr nm ): dstack n ! dstack m := Let us consider subset types, the canonical way to attach a match i with property to a value [1,5]. Subset types are of the form fa:A j Pa g, j IConst n ) fun s ) (n, s) denoting the elements a of type A for which property P a holds. j IPlus ) fun s ) More precisely, an inhabitant of fa:A j Pa g is a dependent pair (a let ’(arg1,( arg2, s’)) := s in ; p), where a is a term of type A, and p is a proof term of type P a. (arg1 + arg2, s’) Suppose a function divide of type nat ! fn:nat j n > 0g ! nat end. defined in Coq. To define divide, the programmer works under the Of special interest is the fact that in the IPlus case, the stack s assumption that the second argument is strictly positive. However, is deconstructed by directly grabbing the top two elements through this guarantee is lost when extracting the function: Coq establishes pattern matching, without having to check that the stack has at least a difference between programs (in Type), which have computa- two elements— this is guaranteed by the type dependencies. tional content, and proofs (in Prop), which are devoid of computa- Because such type dependencies are absent in OCaml, the exec tional meaning and are therefore erased during extraction. Ideally, function is extracted into a function that relies on unsafe coercions: we would like to extract divide to an OCaml function that checks that the second argument is indeed positive, or fails otherwise: (* exec : int -> int -> dinstr -> dstack -> dstack *) let exec n m i s = let divide m n = if n > 0 then ... match i with else failwith "invalid argument" | IConst (n0, n1) -> Obj.magic (Pair (n1, s)) This would allow the original code (...) to be executed in a sound | IPlus n0 -> context, where n > 0 indeed holds. Such an approach to transfer let Pair (arg1, t) = Obj.magic s in properties as runtime checks can be supported as long as the stated let Pair (arg2, s') = t in property is decidable. Obj.magic (Pair ((add arg1 arg2), s')) 1 2015/8/12 The dstack indexed type from Coq cannot be expressed in OCaml, able type class library.) The decision procedure for a given property so the extracted code defines the (plain) type dstack as: is synthesized automatically through type class resolution. For instance, the OCaml code for divide’ calls our cast function type dstack = Obj.t after applying the decidable le nat decision procedure for ≤: where Obj.t is the abstract internal representation type of any let divide' n m = value. Therefore, the type system has in fact no information at all cast_fun_dom (decidable_le_nat 1) (divide n) about stacks: the unsafe coercion Obj.magic (of type 'a -> 'b) is used to convert from and to this internal representation type. The Cast errors. Notice how the OCaml client code above does not dangerous coercion is the one in the IPlus case, when coercing s obtain optional values when using divide’, but instead obtains plain to a nested pair of depth at least 2. Consequently, applying exec nats if the cast succeeds, or a runtime error otherwise. To avoid with an improper stack yields a segmentation fault:1 imposing a monadic style to deal with the potential for cast errors, we can represent cast failure in Coq as an axiom.4 # exec 0 0 (IPlus 0) [];; Specifically, we introduce one axiom, failed cast, which states Segmentation fault: 11 that for any indexed property on elements of type A, we can build a While it might be possible to preserve some type dependencies value of type fa:A j Pa g: by exploiting advanced features of the target language type system Axiom failed cast: (in particular, GADTs in OCaml), there will inevitably be cases 8f A:Typeg fP : A ! Propg (a:A)(msg: Prop), fa : A j Pa g. where the dependencies expressed in Coq outsmart the target type This axiom is obviously a lie, but it allows us to provide a cast system and extraction relies on unsafe features. operator as a function of type A ! fa:A j Pa g, even though a We propose an approach in which a richly-typed Coq function 2 cast can fail. Hence programmers can use cast, denoted ?, without can be casted to a function with a standard type signature that having to deal with optional values. matches the target language type system, in a way that automatically embeds the necessary runtime checks in the casted function. Definition cast( A:Type)(P : A ! Prop) (dec : 8 a, Decidable (Pa )) : A !f a : A j Pa g := fun a: A ) 2. Recovering Properties match deca with We develop safe casts3 for Coq, paving the way for gradual certified j inl p ) (a ; p) programming; notably, we show that this is feasible entirely within j inr ) failed cast a (Pa ) standard Coq, without extending the underlying theory and imple- end. mentation. Note that the focus of this presentation is on OCaml The cast operator applies the decision procedure to the given extraction, not on gradual certified programming in Coq (the latter value and, depending on the outcome, returns either the dependent is presented in full details in [6]). pair with the obtained proof, or a failed cast. Considering the Concretely, we can cast divide from type nat ! fn:nat j n > definition of cast, we see that a cast fails if and only if the property 0g ! nat to a function divide’ with the plain type nat ! nat ! nat d P a does not hold according to the decision procedure. A subtlety in a safe manner, using the ? cast operator, which weakens func- in the definition of cast is that the casted value must not be exposed tion domains: as a dependent pair if the decision procedure fails. An alternative Definition divide’ n :=? d (divide n). definition could always return (a ; x) where x is some error axiom When applied, the extracted divide’ function checks that its if the cast failed. Doing so, however, would ruin the interest of second argument is strictly positive, and raises a runtime cast error the cast framework in the context of program extraction: because otherwise: all properties (in Prop) are erased, a casted value would not be extracted to a value associated with a runtime check, but just to a # divide' 4 2;; plain, unchecked value. - : nat = 2 The cast operator denoted ?d weakens the domain of a plain # divide' 10 0;; function type fa : A j Pa g ! B, which expects a value of a Exception: Failure "Cast has failed". subset type as argument, into a standard function type A ! B, by downcasting the argument: There are two major challenges to supporting casts of properties: decidability and the potential for cast errors.

Load more