Overcoming Restraint: Modular Refinement using Cogent’s Principled Foreign Function Interface

Louis Cheung, UNSW Sydney, Australia
Liam O’Connor, University of Edinburgh, United Kingdom
Christine Rizkallah, UNSW Sydney, Australia

Abstract Cogent is a restricted functional language designed to reduce the cost of developing verified systems code. However, Cogent supports neither recursion nor iteration, and it imposes restrictions that are sometimes too strong for low-level systems programming. To overcome these restrictions, Cogent provides a foreign function interface (FFI) between Cogent and C, which allows the parts of a system that cannot be expressed in Cogent, such as data structures and iterators over these data structures, to be implemented in C and called from Cogent. The Cogent framework automatically guarantees correctness of the overall Cogent-C system, provided proofs that the C components are functionally correct and satisfy Cogent’s FFI constraints. We previously implemented file systems in Cogent and verified key file system operations. However, the C components and the FFI constraints that define the Cogent-C interoperability were axiomatized. In this paper, we verify the correctness and FFI constraints of the C implementation of word arrays used in the file systems. We demonstrate how these proofs modularly compose with existing Cogent theorems, resulting in a functional correctness theorem for the overall Cogent-C system. This demonstrates that Cogent’s FFI constraints ensure correct and safe inter-language interoperability.

Keywords and phrases compilers, verification, type-systems, language interoperability, data-structures

Acknowledgements We thank Abdallah Saffidine, Amos Robinson, Jashank Jeremy, and Vincent Jackson for their feedback on the paper. Special thanks to Jashank for the heavy editing. We highly appreciate Vincent’s proof engineering work, over the past couple of years, on maintaining the compiler proof chain and the BilbyFs proofs on ever-changing versions of Cogent and Isabelle/HOL.

1 Introduction

Cogent [14] is a restricted pure functional language and certifying compiler [17, 14] designed to ease the creation of high-assurance systems [3]. It has a foreign function interface (FFI) that enables implementing parts of a system in C. Cogent’s main restrictions are the purposeful lack of recursion and iteration, which ensures totality, and its uniqueness type system, which enforces a uniqueness invariant that guarantees memory safety. Even in the restricted target domains of Cogent, real programs contain some amount of iteration, primarily over specific data structures. This is achieved through Cogent’s integrated FFI: engineers provide data structures and functions to operate on these data structures, including iterators, in a special dialect of C, and import them into Cogent, including into formal reasoning. This special C code, called anti-quoted C, is C code that can refer to Cogent data types and functions; it is translated into standard C along with the Cogent program by the Cogent compiler. As long as the C components respect Cogent’s foreign function interface (i.e., are correct and respect the uniqueness invariant), the Cogent framework guarantees soundness for the overall Cogent-C system.

We previously implemented two real-world Linux file systems, ext2 and BilbyFs, in Cogent, and verified key operations of BilbyFs [3]. This demonstrated Cogent’s suitability for implementing systems code, and significantly reduced the cost of verification. The implementations of ext2 and BilbyFs rely on an external C library of data structures. This library includes fixed-length word arrays along with iterators for implementing loops, as well as Cogent stubs for accessing a range of the Linux kernel’s internal APIs. The library was carefully designed to ensure compatibility with Cogent’s FFI constraints. We had previously only verified the Cogent parts of these file system operations, and axiomatized statements of the underlying C correctness and of the FFI constraints defining Cogent-C interoperability. To fully verify a system written in Cogent and C, one needs to provide manually-written abstractions of the C parts, and manually prove refinement through Cogent’s FFI. The effort required for this manual verification remains substantial, but the reusability of these libraries allows this verification cost to be amortised across different systems. In this paper, we verify the correctness and uniqueness invariants of the word array implementation used in the BilbyFs file system. This demonstrates that it is possible for the C components of a real-world Cogent-C system to satisfy Cogent’s FFI conditions. We demonstrate that the compiler-generated refinement proofs guaranteeing semantic preservation and memory safety compose with manual C proofs at each intermediate level, up to Cogent’s generated shallow embedding. Our specification of word arrays as Isabelle/HOL lists, and our verification of array operations against abstract Isabelle/HOL functions on lists, including an array iterator specified as a map-accumulate function over a fragment of a list, are reusable beyond the context of Cogent.
Moreover, our proofs are reusable in the verification of any future Cogent-C system that uses word arrays. Our results show that Cogent’s two-language approach is not only suitable for developing real-world systems but also for verification. We show how to modularly compose compiler generated theorems with external theorems. This demonstrates how language interoperability can help guarantee correctness of an overall system. More generally, this work provides our community with a case-study demonstrating how a principled foreign function interface between a high-level functional language with a safe type-system and a low-level imperative language with an unsafe type system can be structured in order to modularly achieve refinement guarantees of the overall system. In particular, our work supports, and is well-described by, Ahmed’s claim that: “compositional compiler correctness is, in essence, a language interoperability problem: for viable solutions in the long term, high-level languages must be equipped with principled foreign-function interfaces that specify safe interoperability between high-level and low-level components, and between more precisely and less precisely typed code.” [1] Our code and proofs are available online [4].

2 Cogent

The Cogent programming language [14] is intended for the implementation of operating system components such as file systems. It is a purely functional language, but it can be compiled into efficient C code suitable for systems programming. The Cogent compiler produces three artefacts: C code, a shallow embedding of the Cogent code in Isabelle/HOL [13], and a formal refinement theorem relating the two [14, 17] (arrow (1) in Figure 1). The refinement theorem and proof rely on several intermediate embeddings, also generated by the Cogent compiler, some related through language-level

[Figure 1: on the left, the Cogent compiler takes the Cogent program and template C data structures and produces a C system and C data structures, which are imported into Isabelle. On the right, the verification is organised as two refinement towers, one for the system (generated proofs) and one for the data structures (manual proofs, with abstract function environments ξv and ξu): the C code at the bottom, refined by monomorphic Cogent in the update semantics (which also yields memory safety), then monomorphic and polymorphic Cogent in the value semantics, up to the shallow embedding and its correctness specification; arrows (1)-(6) mark the proof obligations referenced in the text.]

Figure 1 An overview of the Cogent framework (left) and the verification phases (right).

proofs, and others through translation validation phases (Section 2.3). The compiler certificate guarantees that correctness theorems proven on top of the shallow embedding also hold for the generated C, which eases verification and serves as the basis for further functional correctness proofs (Figure 1, arrow (3)). A key part of the compiler certificate describes Cogent’s uniqueness type system, which enforces that each mutable heap object has exactly one active pointer in scope at any point in time. This uniqueness invariant allows modelling imperative computations as pure functions: the allocations and repeated copying commonly found in functional programs can be replaced with destructive updates, and the need for garbage collection is eliminated, resulting in predictable and efficient code. Well-typed Cogent programs have two interpretations: a pure functional value semantics, which has no notion of a heap and treats all objects as immutable values, and an imperative update semantics, describing the destructive mutation of heap objects. These two semantic interpretations correspond (Section 2.3), meaning that any reasoning about the functional semantics also applies to the imperative semantics. This correspondence further guarantees that well-typed Cogent programs are memory safe (Figure 1, arrow (2)). As mentioned, the underlying C data structures of our two file systems were carefully designed to ensure compatibility with Cogent’s FFI constraints (Section 2.2). But the verification of the file systems previously assumed this compatibility (Figure 1, arrows (4), (5)) as well as the correctness of the C code (Figure 1, arrow (6)). Our verification discharges these assumptions, giving us greater confidence in both the file systems and Cogent’s FFI.

2.1 Language Design and Examples

Cogent has unit, numeric, and boolean primitive types, as well as functions, sum types (i.e., variants) and product types (i.e., tuples and records). Users can declare additional abstract

types τ, τ′ ::= () | Bool | U8 | U16 | U32 | U64   (primitive types)
             | (τ, τ′)                             (tuples)
             | ⟨K τ⟩                               (variant types; constructors K)
             | {fi : τi}                           (record types; field names fi)
             | A τ                                 (abstract types; constructor A)

Overlines indicate a set (order is not important).

Figure 2 Grammar for desugared types in core Cogent.

types in Cogent, and define them externally (e.g., in C). Cogent does not support closures, so partial application via currying is not common. Thus, functions of multiple arguments take a tuple or record of those arguments. Cogent types are shown in Figure 2. Similar to ML, Cogent supports parametric polymorphism on top-level functions. From a polymorphic C template, the compiler generates multiple specialised C implementations, one for each concrete instantiation used in the Cogent code. Variables of polymorphic type are by default linear (they must be used exactly once); this allows a polymorphic type variable to be instantiated to any type. Types and functions provided in external C code are called abstract in Cogent. As mentioned, the Cogent compiler has infrastructure for linking the C implementations with the compiled Cogent code, and allows embedding Cogent types and expressions inside the C code using quasi-quotation. Abstract types may also be given type parameters. Polymorphic functions and parameterised types are translated into a family of automatically generated C functions and types, one for each concrete type used in the Cogent code.

▶ Example 1 (Array).

type Array a
length : (Array a)! → U32
get : ((Array a)!, U32) → a
put : (Array a, U32, a) → Array a

Example 1 is a Cogent program that declares an externally-defined array library interface, where array indices and the array length are unsigned 32-bit integers (U32). Though the Array type interface may appear purely functional, Cogent assumes that all abstract types are linear, ensuring that the uniqueness invariant applies to variables of type Array. Therefore, any implementation of the abstract put function is free to destructively update the provided Array, without contradicting the purely functional semantics of Cogent. When functions only need to read from a data structure, uniqueness types can complicate a program unnecessarily by requiring the programmer to thread through all state, even unchanged state. The !-operator helps to avoid this by converting linear, writable types into read-only types that can be freely shared or discarded; this is analogous to a borrow in Rust. The length and get functions can read from the given array, but may not write to it. Cogent ensures that, in Cogent code, read-only references are never simultaneously live with writable references. This restriction is necessary to be able to reason equationally about Cogent programs, and to preserve the refinement theorem connecting Cogent’s two semantics. This condition is assumed of abstract functions by the Cogent semantics, and must be proven of the C implementations for pointers that are visible to Cogent. As Cogent does not support recursion, iteration is expressed using abstract higher-order functions, providing basic traversal combinators such as map and fold for abstract types.
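The interplay between put’s pure specification and its destructive implementation, discussed above, can be sketched in Python (a hypothetical model, not part of the Cogent framework): under uniqueness, an in-place update and a pure copy-and-replace are observationally the same.

```python
# Hypothetical model: Cogent's value semantics sees `put` as a pure
# function, while the C implementation updates the array in place.
# Uniqueness (no other live pointer to the array) makes both views agree.

def put_spec(xs, i, v):
    # Pure specification: return a fresh list with index i replaced.
    return xs[:i] + [v] + xs[i + 1:]

def put_impl(xs, i, v):
    # Imperative implementation: destructive update, returns the same object.
    xs[i] = v
    return xs

spec = put_spec([1, 2, 3], 1, 9)
arr = [1, 2, 3]
impl = put_impl(arr, 1, 9)
assert spec == impl == [1, 9, 3]
assert impl is arr  # the update really was in place
```

Because no other reference to `arr` is live, no caller can observe the difference between the two behaviours, which is exactly what the uniqueness invariant licenses.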

▶ Example 2 (Simple Array Iterators).

map : (a → a, Array a) → Array a
fold : ((a!, b) → b, b, (Array a)!) → b

In Example 2, map is passed a function of type a → a. As such, it is able to destructively overwrite the array with the result of the function applied to each element. While Cogent supports higher-order functions (functions that accept functions as arguments or return functions), it does not support nested lambda abstractions or closures, as these can require allocation if they capture variables. Thus, to invoke the map or fold functions, a separate top-level function must be defined, such as in this function to sum an array of 32-bit unsigned integers:

▶ Example 3 (Sum).

add : (U32, U32) → U32
add (x, y) = x + y
sum : (Array U32)! → U32
sum arr = fold (add, 0, arr)

Instead of relying on closure captures, we provide alternative iterator functions which carry an additional observer read-only input (of type c!). Example 4 presents the types of the word array iterators mapAccum and fold that are used in BilbyFs and ext2. They account for this extra observer input and allow for iterating over an interval of the array. The function mapAccum is a generalised version of map, similar to that in Haskell. It allows threading an accumulating argument through the map function. We present the verification of these iterators in Section 3.

▶ Example 4 (Iterators).

mapAccum : {f: (a, b, c!) → (a, b), acc: b, arr: Array a,
            frm: U32, to: U32, obsv: c!} → (Array a, b)
fold : {f: (a, b, c!) → b, acc: b, arr: (Array a)!,
       frm: U32, to: U32, obsv: c!} → b

We re-frame our sum example in terms of BilbyFs’s generalised version of fold, and present its verification in Section 4.1. The observer input is not needed, so we pass the unit type, which also allows us to omit the !.

▶ Example 5 (Sum).

add : (U32, U32, ()) → U32
add (x, y, z) = x + y
sum : (Array U32)! → U32
sum arr = fold {f: add, acc: 0, arr: arr, frm: 0, to: length arr, obsv: ()}
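The generalised fold of Examples 4 and 5 can be modelled in Python (a hypothetical sketch; `wa_fold` follows the f: (a, b, c!) → b argument order from Example 4, and the `frm`/`to` bounds select the interval being folded over):

```python
def wa_fold(f, acc, arr, frm, to, obsv):
    # Fold f over arr[frm:to], threading acc; obsv is a read-only observer.
    for el in arr[frm:to]:
        acc = f(el, acc, obsv)
    return acc

def add(x, y, z):
    # z is the (unused) unit observer, as in Example 5.
    return x + y

assert wa_fold(add, 0, [1, 2, 3, 4], 0, 4, ()) == 10
assert wa_fold(add, 0, [1, 2, 3, 4], 1, 3, ()) == 5  # only elements 2 and 3
```

Passing `()` for the observer mirrors Example 5’s use of the unit type when no shared read-only state is needed.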

2.2 Dynamic Semantics and Uniqueness Invariant

Cogent’s big-step value semantics is defined through the judgement ξv, V ⊢ e ⇓v v. Given a function ξv : fid → (v → v) that, for an abstract function with identifier fid, defines its value semantics, this judgement evaluates an expression e under a context V to a value v. The imperative update semantics, which additionally may manipulate a store µ, is defined through the judgement ξu, U ⊢ e|µ ⇓u u|µ′. Given a function ξu : fid → (u × µ → u × µ) that similarly defines the update semantics of abstract functions, this judgement maps a context U, an expression e, and a store µ to a value u and a new store µ′. The correspondence between Cogent’s two semantics relies on a uniqueness invariant and on type preservation. To ensure memory safety, Cogent’s FFI places restrictions on how evaluation may affect the heap. Given an input set of writable pointers wi, an input heap µi, an output set of pointers wo, and an output heap µo, the relation wi | µi frame wo | µo ensures three

properties for any pointer p:

inertia: p ∉ wi ∪ wo −→ µi(p) = µo(p)

leak freedom: p ∈ wi −→ p ∉ wo −→ µo(p) = ⊥

fresh allocation: p ∉ wi −→ p ∈ wo −→ µi(p) = ⊥

Inertia ensures that pointers not in the frame remain unchanged; leak freedom ensures that pointers removed from the frame no longer point to anything; and fresh allocation ensures that pointers added to the frame were not initially pointing to anything. The framing relation implies that the correspondence relation between the value and update semantics is unaffected by updates to separate parts of the heap.
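The three framing properties can be checked directly on a toy heap model in Python (a hypothetical sketch: heaps are dicts from pointers to values, and an absent key models ⊥):

```python
# Hypothetical model of the framing relation wi | mu_i frame wo | mu_o:
# check inertia, leak freedom, and fresh allocation for every pointer.

def frame(wi, mu_i, wo, mu_o, pointers):
    for p in pointers:
        if p not in wi and p not in wo:          # inertia
            if mu_i.get(p) != mu_o.get(p):
                return False
        if p in wi and p not in wo:              # leak freedom
            if p in mu_o:
                return False
        if p not in wi and p in wo:              # fresh allocation
            if p in mu_i:
                return False
    return True

ptrs = {1, 2, 3}
# Pointer 1 stays in the frame and is updated; pointer 2 is untouched.
assert frame({1}, {1: "a", 2: "b"}, {1}, {1: "c", 2: "b"}, ptrs)
# Updating pointer 2 without owning it violates inertia.
assert not frame({1}, {1: "a", 2: "b"}, {1}, {1: "a", 2: "x"}, ptrs)
# Freeing pointer 3: it leaves the frame, so it must leave the heap too.
assert frame({1, 3}, {1: "a", 3: "c"}, {1}, {1: "a"}, ptrs)
```

The second check illustrates why the frame relation suffices for modular reasoning: an abstract function that touched a pointer outside its frame would be rejected.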

2.3 The Cogent Verification Framework

Figure 1 gives an overview of the verification framework. The overall proof that the C code refines the purely functional shallow embedding in Isabelle/HOL is broken into a number of sub-proofs and translation validation phases. The compiler generates four embeddings: a top-level shallow embedding in terms of pure functions; a polymorphic deep embedding of the Cogent program, which is interpreted using the value semantics; a monomorphic deep embedding of the Cogent program, which can be interpreted using either the value or update semantics; and an Isabelle/HOL representation of the C code generated by the compiler, imported into Isabelle/HOL by the C parser used in the seL4 project [8]. The C parser generates a deep embedding of C in Isabelle/HOL, which, using AutoCorres [5, 6], is then abstracted to a corresponding monadic embedding. The compiler also generates proofs that the program is well-typed, that the pure interpretation of the polymorphic deep embedding is a refinement of the top-level shallow embedding, that the monomorphic embedding is a refinement of the polymorphic one, and that the C code is a refinement of the imperative interpretation of the monomorphic deep embedding. These refinement proofs, along with the typing proof and the refinement theorem between the two semantic interpretations, are composed to produce a refinement theorem stretching from the C code all the way up to the pure shallow embedding.

2.4 Type Preservation and Uniqueness Invariant

The static semantics are defined through a standard typing judgement A; Γ ⊢ e : τ, with an additional context A that tracks assumptions about the linearity of type variables in τ. Dynamic values in the value semantics are typed by the simple judgement v : τ, whereas update semantics values must be typed together with the store µ, to type the parts of the value that are on the heap. Update semantics values are typed by the judgement u|µ : τ[r ∗ w], which additionally includes the heap footprint, consisting of the sets of read-only (r) and writable (w) pointers the value can access. Disjointness requirements are added to the definition of this judgement to ensure that the uniqueness invariant is maintained. We use the same notation for value typing on contexts. Each of the key refinement and typing theorems for Cogent code must be manually proved about the semantics of our abstract functions ξv and ξu. For our value semantics functions, our type preservation lemma statements are of the form:

▶ Theorem 6 (Value Semantics Type Preservation).

A; Γ ⊢ e : τ ∧ ξv, V ⊢ e ⇓v v −→ v : τ

In other words, if the expression e has type τ under the typing context Γ and typing assumptions A, the evaluation environment V has types Γ, and e evaluates to v under V and abstract function environment ξv, then v has type τ. To handle abstract higher-order functions, the abstract function environment ξv contains all lower-order functions, and only abstract functions defined in ξv can be evaluated. The typing theorems are progressively built up from first order to second order and so on. In our libraries, second-order functions are the highest order needed. In the update semantics, we prove type preservation and the frame constraints simultaneously, as the heap footprints in the value typing relation show which pointers are accessible from a value, and the frame relation ensures that the evaluation of update semantics functions does indeed only access pointers in the heap footprint. The lemma statements for proving type preservation and the frame constraint for update semantics functions are of the form:

▶ Theorem 7 (Update Semantics Type Preservation and Frame Constraint).

A; Γ ⊢ e : τ ∧ U|µ : Γ[r ∗ w] ∧ ξu, U ⊢ e|µ ⇓u u|µ′ −→ ∃r′ w′. r′ ⊆ r ∧ w|µ frame w′|µ′ ∧ u|µ′ : τ[r′ ∗ w′]

For functions written in Cogent, this theorem is proven once and for all, but for our abstract functions, we proved these statements manually. The proofs follow from the definitions of value typing and, in the update semantics, the frame relation. To show correspondence of the two semantics, we combine these two value typing relations into one combined judgement u|µ : v : τ [r ∗ w]. We show that this relation is preserved across evaluation in both semantics, and use it to show the refinement theorem connecting the two semantics:

▶ Theorem 8 (Value ⇒ Update refinement). For any program e where A; Γ ⊢ e : τ, if U|µ : V : Γ[r ∗ w] and ξu, U ⊢ e|µ ⇓u u|µ′, then there exists a value v and pointer sets r′ ⊆ r and w′ such that ξv, V ⊢ e ⇓v v, and u|µ′ : v : τ[r′ ∗ w′] and w|µ frame w′|µ′.

Just as with type preservation, this relation is automatically preserved for Cogent functions, but must be proven manually for our abstract functions. Specifically, we show that the type preservation theorems hold for these functions: for any abstract function f of type τ → τ′, if the argument values u and v correspond and are well-typed, i.e., if v : τ, u|µ : τ[r ∗ w], ξv f v = v′, and ξu f (u, µ) = (u′, µ′), then v′ : τ′ and u′|µ′ : τ′[r′ ∗ w′] for some r′ ⊆ r and w′ such that w|µ frame w′|µ′. The well-typedness requirement ensures that the returned value does not include internal aliasing, and the frame requirements ensure that the abstract function does not manipulate any state other than the values passed into it. In order to show refinement, we must also assume that abstract functions evaluate in the value semantics whenever they evaluate in the update semantics.

3 Word Arrays

Word arrays are arrays stored on the heap with elements having any primitive numeric type supported by Cogent (Figure 2). We verify five array operations: the length, get, and put functions in Example 1, and the fold and mapAccum iterators in Example 4.

3.1 Specification/Axiomatization

Word arrays are abstracted as Isabelle/HOL lists. Operations on word arrays are specified as Isabelle functions on lists. The specification for the word array functions presented in

length : [a] → word32
length xs = of_nat (List.length xs)

get : [a] → word32 → a
get xs i = if unat i < List.length xs then xs ! (unat i) else 0

put : [a] → word32 → a → [a]
put xs i v = xs[unat i := v]

fold : (b → a → c → b) → b → [a] → word32 → word32 → c → b
fold f acc xs frm to obs = List.fold (λ el acc. f acc el obs) (slice frm to xs) acc

mapAccum : (b → a → c → (a, b)) → b → [a] → word32 → word32 → c → ([a], b)
mapAccum f acc xs frm to obs =
  let (xs′, acc′) = List.fold (λ el (xs, acc). let (el′, acc′) = f acc el obs in (xs @ [el′], acc′))
                              (slice frm to xs) ([], acc)
  in (take frm xs @ xs′ @ drop (max frm to) xs, acc′)

Figure 3 Functional correctness specification of the word array operations in Isabelle/HOL.

Examples 1 and 4 is provided in Figure 3. The majority of the word array operations behave similarly to Isabelle/HOL’s list operations. The Isabelle library function unat converts a 32-bit word to a natural number, of_nat converts a natural number to a 32-bit word, take n xs returns the first n elements of the list xs, drop n xs returns xs with its first n elements removed, slice n m xs returns the sublist of xs that starts at index n and ends at index m, List.length returns the length of a list, List.fold is the fold function for lists, and @ is list concatenation. One exception is get, which behaves differently from its corresponding list operation: our implementation of get does not raise an exception when the given index is out of bounds, but instead returns the value 0. Another exception is mapAccum, which is not part of the standard list library in Isabelle/HOL, and must be implemented in terms of fold.
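The Figure 3 specification can be transcribed almost line for line into Python to exercise it on small inputs (a hypothetical model; the names mirror the Isabelle functions, not any Cogent artefact):

```python
# Python model of the Figure 3 specification of get, put, and mapAccum.

def slice_(frm, to, xs):
    # slice frm to xs: the sublist from index frm up to (excluding) to.
    return xs[frm:to]

def get(xs, i):
    # Total: out-of-bounds reads return 0 rather than failing.
    return xs[i] if i < len(xs) else 0

def put(xs, i, v):
    # Pure list update; like Isabelle's xs[i := v], identity out of bounds.
    return xs[:i] + [v] + xs[i + 1:] if i < len(xs) else xs

def map_accum(f, acc, xs, frm, to, obsv):
    out = []
    for el in slice_(frm, to, xs):
        el2, acc = f(acc, el, obsv)
        out.append(el2)
    # take frm xs @ mapped fragment @ drop (max frm to) xs
    return xs[:frm] + out + xs[max(frm, to):], acc

assert get([4, 5, 6], 5) == 0
assert put([4, 5, 6], 1, 9) == [4, 9, 6]
xs, total = map_accum(lambda acc, el, o: (el * 2, acc + el), 0, [1, 2, 3, 4], 1, 3, ())
assert xs == [1, 4, 6, 4] and total == 5
```

The last check shows the reconstruction in mapAccum: only the slice [frm, to) is mapped, while the prefix and suffix of the list pass through unchanged.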

3.2 Proving Refinement

The Cogent compiler automatically proves refinement for Cogent functions in four main phases: C to update, update to monomorphic value, monomorphic to polymorphic value, and polymorphic value to shallow. This means that the compiler must automatically generate, for each semantic layer, an embedding of the program, and a value relation for all Cogent types between the semantic layers in each phase. Since Cogent’s refinement proof relies on the typing information to show that no aliasing occurs, a value typing relation for each Cogent type in the update and value semantics is also generated. In our verification, we prove refinement analogously to the automatic proofs of Cogent code. Thus we also define embeddings for word arrays and their operations for each semantic layer; value relations for word arrays between each semantic layer; and value typing for word arrays in both the update semantics and the value semantics.

Embeddings

Since we use AutoCorres in our verification, we do not need to define our own embedding of word arrays and our functions for the C semantic layer, as AutoCorres automatically does this for us. For the top-level shallow semantic layer, we have already defined our specification for

(a) Deep embedding in the update semantics:

datatype UArray = UWA “type” “word32” “ptrtyp”

upd_length (µ, x) (µ′, y) =
  ∃ t len arr. (x = UWA t len arr) ∧ (y = len) ∧ (µ = µ′)

(b) Deep embedding in the value semantics:

datatype (a, b) VArray = VWA “type” “(a, b) vval”

val_length x y =
  ∃ t xs. (x = VWA t xs) ∧ (List.length xs = unat y)

Figure 4 Deep embedding of word arrays and the length operation in Isabelle/HOL.

word arrays and their operations in Isabelle/HOL in our axiomatisation, so this specification forms our embedding for the shallow semantic layer. Our embedding of word arrays in the update semantics, shown in Figure 4a, contains the type of the elements, the length of the array, and the pointer to the first element of the array. This is very similar to the word array struct in C; it differs slightly in that it contains the type of the elements, so that we can reuse the embedding for word arrays with different element types. The similarity with the word array struct in C has the benefit of making the value relation very simple, i.e., if the lengths and pointers are equal then they are related. In addition, we can make all pointers associated with a word array visible from Cogent, which in turn forces us to prove memory safety of all pointers associated with a word array. The word array operations are defined as input-output relations; Figure 4a shows the embedding for length. The embedding of word arrays in the value semantics, shown in Figure 4b, consists of the type of the elements and a list of values, which should be pairwise related to the elements of the array. The value semantics embedding is very similar to the shallow embedding of word arrays, which is simply a list containing all of the elements, so that the value relation between the shallow and the value semantics is simple. The embeddings of the word array operations are also defined as input-output relations, and are similar to the corresponding embeddings in the update semantics, as shown in Figure 4b.
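The two embeddings and the value relations between neighbouring layers can be sketched in Python (a hypothetical model; `UWA`, `VWA`, and the relation names mirror Figure 4, and the heap is a toy dict with one word per cell):

```python
from dataclasses import dataclass

@dataclass
class UWA:            # update-semantics embedding: type, length, pointer
    elem_type: str
    length: int
    ptr: int

@dataclass
class VWA:            # value-semantics embedding: type and list of values
    elem_type: str
    elems: list

def rel_u_v(u, v, heap):
    # Update/value relation: same element type, matching length, and the
    # heap cells pairwise equal to the value-level list.
    return (u.elem_type == v.elem_type
            and u.length == len(v.elems)
            and all(heap.get(u.ptr + i) == v.elems[i] for i in range(u.length)))

def rel_v_shallow(v, xs):
    # Value/shallow relation: the shallow embedding is just the list.
    return v.elems == xs

heap = {100: 7, 101: 8, 102: 9}
u = UWA("U32", 3, 100)
v = VWA("U32", [7, 8, 9])
assert rel_u_v(u, v, heap)
assert rel_v_shallow(v, [7, 8, 9])
```

Keeping each relation this close to a pairwise equality check is what makes the per-layer refinement obligations tractable.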

Value Typing

In our verification, word arrays may only have elements which are primitive numeric types. This makes the value typing relation fairly simple, as the value typing of primitive numeric types is also simple. The value typing in the value semantics is defined as:

▶ Definition 9.

(VWA t xs): n τs s = (n = “Array”) ∧ (τs = [t]) ∧ (∀i < length xs. xs ! i : t)

An abstract type consists of its name, a list of type arguments, and its sigil, which records whether it is on the heap and whether it is read-only. Hence our value typing relation checks that the value is of the correct type by checking that the name matches and that the type in the embedding matches the type list. It also checks that each element types to the type in the type list, i.e., the word array’s element type. In the update semantics, the value typing is defined as:

▶ Definition 10 (Update Semantics: Value Typing for Array).

(UWA t len p)|µ : (n τs s)[r ∗ w] =
  (n = “Array”) ∧ (τs = [t]) ∧ (unat(len) × size(t) ≤ max_word) ∧
  (∀i < len. ∃v. µ(p + size(t) × i) = Some v ∧ v|µ : t[∅ ∗ ∅]) ∧
  (s = readonly −→ r = {p + i | i < len} ∧ w = ∅) ∧
  (s = writable −→ w = {p + i | i < len} ∧ r = ∅)

where max_word is 2^n − 1, with n the number of bits in a word.

This checks for the same things as the value typing in the value semantics, and additionally checks the following: if the word array is read-only then all of its pointers are read-only, otherwise they are writable; all pointers point to valid elements; and the size of the array is less than the maximum size of the heap. The constraint on the length ensures that the array does not wrap around the address space. As mentioned earlier, Cogent has constraints on the value typing relation for abstract types. In our case, these constraints were fairly easy to discharge, as they followed from the definition of our value relation or from case analysis.
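The shape of this update-semantics typing check, including the heap footprint it computes, can be illustrated with a small Python function (a hypothetical model of Definition 10 on a toy heap with one cell per element; `value_type_uwa` is not a Cogent artefact):

```python
# Hypothetical model of Definition 10: check that a word array value is
# well-typed in a toy update semantics and compute its heap footprint.

SIZE = 1          # size(t): one heap cell per element in this toy model

def value_type_uwa(len_, p, mu, sigil):
    # All elements must be allocated on the heap.
    if not all(p + SIZE * i in mu for i in range(len_)):
        return None
    ptrs = {p + SIZE * i for i in range(len_)}
    # A read-only array contributes only read pointers; a writable one
    # only write pointers, keeping the two footprints disjoint.
    return (ptrs, set()) if sigil == "readonly" else (set(), ptrs)

mu = {100: 1, 101: 2, 102: 3}
assert value_type_uwa(3, 100, mu, "writable") == (set(), {100, 101, 102})
assert value_type_uwa(3, 100, mu, "readonly") == ({100, 101, 102}, set())
assert value_type_uwa(4, 100, mu, "writable") is None   # cell 103 missing
```

The returned (r, w) pair plays the role of the footprint in u|µ : τ[r ∗ w]: it is exactly the set of pointers the frame relation then constrains.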

Value Relation

As mentioned earlier, the value relations between the C and update semantics, and between the value semantics and the shallow embedding, are very simple. In the former, the two are related if the length values and the pointer values are equal, and the element types are related. In the latter, the two are related if the lengths of the lists are the same, the elements are pairwise related, and the element types are related. For example, the value relation between the update semantics and C is:

▶ Definition 11 (Value Relation between Update and C for 32-bit Word Arrays).

VC(UWA t len arr, x) = (t = U32) ∧ (len = lenC x) ∧ (arr = arrC x)

where lenC and arrC extract, respectively, the length of the array and the pointer to the first element of the array from the word array embedding in the C semantics.

The value relation between the update and the value semantics was also fairly straightforward, since a word array value in the update semantics is related to a word array value in the value semantics if the lengths are the same, the elements are pairwise related, and the element types are the same. The value relation also calls the value typing relations to ensure that the word array values are valid.

Refinement

Once we have the embeddings, value typing, and value relations, we have all the ingredients we need to prove refinement in each phase for our abstract functions. Below we outline how to prove refinement for first-order abstract functions. The approach is similar for second-order abstract functions, except that we additionally assume that refinement holds for all first-order abstract functions. The refinement relation corres between C and the update semantics is defined as:

▶ Definition 12 (C ⇒ Update correspondence).

corres R e pm ξu U Γ µ σ =
  (∃r w. U|µ : Γ[r ∗ w]) −→ (µ, σ) ∈ R −→
  (¬failed(pm σ) ∧ (∀vm σ′. (vm, σ′) ∈ results(pm σ) −→
    (∃µ′ u. ξu, U ⊢ e|µ ⇓u u|µ′ ∧ (µ′, σ′) ∈ R ∧ VC(u, vm))))

In essence, this means that if the arguments are of the correct type, the Cogent state µ and the C state σ correspond, i.e., are in the set of corresponding states R, and the C computation succeeds, i.e., has no undefined behaviour as indicated by ¬failed, then the corresponding update semantics embedding evaluates, and the resulting values and states correspond; here results is a set of pairs consisting of return values and new C states. We can then prove that if the arguments to the functions are related, then corres holds between the C and update semantics. More formally, we have:

▶ Theorem 13 (C ⇒ Update refinement).

∀u vm. VC(u, vm) −→ corres R (f x) pm ξu (x ↦ u) (x : τ) µ σ

To bridge the gap between the update semantics and the value semantics, we prove a refinement theorem that depends on value typing preservation (Theorems 6 and 7), the frame relation, and the semantic correspondence (Theorem 8), as described in Subsection 2.4. To make the shift from monomorphic to polymorphic, we prove that if evaluation occurs after monomorphising expressions (Me) with the mapping rpm, and similarly with values (Mv), then evaluation also occurs without monomorphising. More formally, we prove:

▶ Theorem 14 (Monomorphisation).

∀v′. ξm, (x ↦ Mv(rpm, v)) ⊢ Me(rpm, (f x)) ⇓v Mv(rpm, v′) −→ ξp, (x ↦ v) ⊢ (f x) ⇓v v′

Note that ξm is the monomorphised abstract function environment corresponding to ξp. Finally, we bridge the gap between the deep and shallow embeddings, i.e., our functional correctness specification, by showing the following:

▶ Theorem 15 (Polymorphic Value ⇒ Shallow refinement).

∀vs v. VS(vs, v) −→ ((ps vs) ⊑sh[ξv, (x ↦ v)] (f x))

where the refinement relation (s ⊑sh[ξv, V] e) is defined as ∀r. ξv, V ⊢ e ⇓v r −→ VS(ξv s, r).

Together, this means that if the arguments are value related, and if the deep embedding executes, then so will the shallow embedding, and the results will be related. For abstract functions that do not call other functions, the proofs of Theorems 8 and 13–15 followed from the evaluation rules and the relevant definitions. For abstract functions that do call other functions, we prove variants of Theorems 8 and 13–15 which assume that Theorems 6–8 and 13–15 hold for all of their function arguments, and that those arguments are type correct. These variants then followed from the assumptions, the evaluation rules, and the relevant definitions. Our proofs can be reused for different word sizes, e.g., 32-bit and 64-bit, with only minor changes.

4 Modular Verification of a Cogent-C System

The Cogent compiler automatically generates the type correctness proofs for all functions, and the refinement proofs for each phase for all Cogent functions. For Cogent functions that do not call any abstract functions, it emits the theorem:

▶ Theorem 16 (Shallow ⇒ C refinement). If all the abstract functions defined in ξu and ξv preserve typing and the frame constraints in the update and monomorphic value semantics respectively, then:

∀µ σ. V(rpm, ξp, µ, vc, vu, vp, vs) ∧ ξm ∼A Mev rpm ξp ∧ V : Γ ∧ ξu ∼A ξm −→
  correspondence rpm R (ps vs) (f x) pc ξu ξm ξp U V Γ µ σ

where U = (x ↦ vu), V = (x ↦ Mv(rpm, vp)), and Γ = (x ↦ τ).

Essentially, this theorem states that the monadic shallow embedding derived from the C code refines our shallow embedding transitively, i.e., it refines the update semantics, then the value semantics, and then the shallow embedding, provided the following hold: the arguments are all value related (V); all abstract functions in the update semantics correspond to their embeddings in the monomorphic value semantics, and upward propagation occurs (ξu ∼A ξm); the monomorphised abstract functions refine their polymorphic counterparts (ξm ∼A Mev rpm ξp); and the evaluation environment V in the monomorphic value semantics value types to the type environment Γ. The value relation V is defined as:

▶ Definition 17 (Shallow to C value relation).

V(rpm, ξp, µ, vc, vu, vp, vs) =

∃r w τ. VS(ξp vs, vp) ∧ µ | vu : Mv(rpm, vp) : τ [r ∗ w] ∧ VC(vu, vc)

This is simply the composition of all the value relations for each phase. correspondence is defined as:

▶ Definition 18 (Shallow ⇒ C refinement).

correspondence rpm R vs e pc ξu ξm ξp U V Γ µ σ =
  (µ, σ) ∈ R −→ (∃r w. µ | U : V : Γ [r ∗ w]) −→
  (¬failed(pc σ) ∧
   (∀vc σ′. (vc, σ′) ∈ results(pc σ) −→
    (∃µ′ vu vp. ξu, U ⊢ Me(rpm, e) | µ ⇓u vu | µ′ ∧
     ξm, V ⊢ Me(rpm, e) ⇓v Mv(rpm, vp) ∧
     (µ′, σ′) ∈ R ∧ V(rpm, ξp, µ′, vc, vu, vp, vs))))

In essence, this means that if the states are related, the arguments to the functions are of the correct type and are value related, and the C code executes successfully, then so will the update and value semantics embeddings, and the final results will all be related. For Cogent programs that call abstract functions, the refinement theorem the compiler produces is similar to Theorem 16; however, it also contains extra assumptions about the abstract functions called from the Cogent program. The assumptions state that the abstract functions used in the Cogent program refine their specifications and execute on the arguments passed by the Cogent program. Once the generated assumptions about abstract functions are discharged, we obtain a shallow embedding to C refinement theorem about the linked Cogent-C program. Further correctness theorems can be proven on top of the shallow embedding. To show this in action, we use the sum program presented in Example 5.

4.1 Example: Summing Array Elements

In Example 5, we define a Cogent function that calculates the sum of a 32-bit word array. After compiling this program and linking it against the word array implementation described in Section 3, the compiler produces a refinement theorem similar to Theorem 16; however, it also adds the additional assumptions:
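For intuition, the computation that the linked system performs can be sketched in plain C. This is a hypothetical simplification: `wordarray_fold`, `add`, `sum` and the bounds handling are illustrative stand-ins for the real ADT library code and the compiled Cogent, not their actual definitions.

```c
#include <stdint.h>

/* Hypothetical simplification of the word-array fold underlying sum:
 * folds f over elements arr[frm..to) with accumulator acc.  All
 * arithmetic is modulo 2^32, matching unsigned C and Cogent's U32. */
typedef uint32_t (*fold_fn)(uint32_t elem, uint32_t acc);

static uint32_t add(uint32_t elem, uint32_t acc) { return elem + acc; }

static uint32_t wordarray_fold(const uint32_t *arr, uint32_t len,
                               uint32_t frm, uint32_t to,
                               fold_fn f, uint32_t acc)
{
    if (to > len) to = len;            /* clamp to the array bounds */
    for (uint32_t i = frm; i < to; i++)
        acc = f(arr[i], acc);
    return acc;
}

/* sum instantiates the fold with add over the whole array. */
static uint32_t sum(const uint32_t *arr, uint32_t len)
{
    return wordarray_fold(arr, len, 0, len, add, 0);
}
```

The modulo-2^32 behaviour of `add` is why the correctness theorems below carry side conditions bounding the list length and the sum.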

▷ Claim 19 (Update ⇒ C refinement assumption for length).

∀u vm. VC(u, vm) −→ corres R (lengthu x) lengthc ξu (x ↦ u) (x : Array U32 r) µ σ

▷ Claim 20 (Update ⇒ C refinement assumption for fold).

∀u vm. VC(u, vm) ∧ (fC vm = FUN_ENUM_add) −→ corres R (foldu x) foldc ξu (x ↦ u) (x : τ) µ σ

where fC extracts the field f from a struct, FUN_ENUM_add is the function ID for the function add, τ = {arr : Array U32, frm : U32, to : U32, f : τf, acc : U32, obsv : ()} and τf = {elem : U32, acc : U32, obsv : ()} → U32.

▶ Note 21. At the time of writing this paper, Cogent's automatic proof generation was not working for functions that call abstract functions in the polymorphic shallow to deep refinement phase. It fails to produce additional assumptions similar to Claim 19 for first order abstract functions and to Claim 20 for higher order functions. These assumptions would then be proven manually using Theorem 15 and a variant of it, for first order and higher order abstract functions respectively.

We can discharge Claim 19 since we proved Theorem 13 for length. We can also discharge Claim 20 using a theorem similar to Theorem 13, which differs only in that it assumes Theorem 13 has been proven for the function f that fold may call, and that f is type correct. In this case, f is the add function, and Cogent automatically proves these results for that function. So our refinement theorem for the function sum is the same as Theorem 16. We additionally prove that all abstract functions preserve typing and satisfy the frame constraints, and that all abstract functions refine their specifications in each of the refinement phases. Thus, our final refinement theorem for the function sum becomes:

▶ Theorem 22 (Shallow ⇒ C refinement for sum).

∀µ σ. V(rpm, ξp, µ, vc, vu, vp, vs) −→ correspondence rpm R (sums vs) (sumd x) sumc ξu ξm ξp U V Γ µ σ

where U = (x ↦ vu), V = (x ↦ Mv(rpm, vp)), and Γ = (x ↦ τ). Functional correctness of the generated shallow embedding of sum is shown by equating it to the built-in sum_list in Isabelle, under the assumption that the list is representable in C, i.e., its length fits in a 32-bit word; this is proved by induction on the list:

▶ Theorem 23 (Correctness of sum). List.length xs < 2^32 −→ sums xs = sum_list xs

▶ Corollary 24 (Correctness of sum on natural numbers). List.length xs ≤ 2^32 −→ sum_list (map unat xs) < 2^32 −→ unat (sums xs) = (∑x←xs. unat x)

where (∑x←xs. unat x) is Isabelle's shorthand for sum_list (map unat xs).

4.2 Example: BilbyFs

We previously verified the functional correctness of key operations in BilbyFs [3]. This proof relied on the correctness of abstract functions, including an axiomatization of word array operations [2]. We removed the axiomatization for the five core word array operations that we verified, and were able to show that the functional correctness proofs compose for the combined system. The word array operations that remain axiomatized are: create, set, copy, and cmp, which all depend on platform specific functions such as malloc; and modify, which updates an element in an array with a function, but which for word arrays can be implemented in terms of our verified put.
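As a sketch of that last point, modify can be expressed in terms of get and put. All function names and the out-of-bounds behaviour below are illustrative assumptions, not the ADT library's actual code.

```c
#include <stdint.h>

/* Minimal sketch: modify derived from get and put.  Here out-of-bounds
 * get returns 0 and out-of-bounds put is a no-op, one plausible
 * totalization; the real library's behaviour may differ. */
static uint32_t wa_get(const uint32_t *arr, uint32_t len, uint32_t i)
{
    return i < len ? arr[i] : 0;
}

static void wa_put(uint32_t *arr, uint32_t len, uint32_t i, uint32_t v)
{
    if (i < len) arr[i] = v;
}

/* modify updates element i with f applied to the old value. */
static void wa_modify(uint32_t *arr, uint32_t len, uint32_t i,
                      uint32_t (*f)(uint32_t))
{
    wa_put(arr, len, i, f(wa_get(arr, len, i)));
}

/* Example update function used below. */
static uint32_t incr(uint32_t x) { return x + 1; }
```

Because modify factors through the verified get and put, its specification could be discharged without touching the memory-safety argument again.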

5 Discussion and Lessons Learned

Although it is possible to directly prove that a program refines its functional correctness specification, intermediate embeddings ease this process. In Cogent, there are clear, distinct refinement phases designed to abstract away one particular implementation feature at a time. For instance, the refinement phase between the update and the value semantics abstracts away mutation and pointers. This eases further refinement proofs on top of the value semantics. In our experience, even when verifying abstract functions by hand, we benefit from this directed approach to refinement.

Cogent automatically proves refinement for all Cogent functions, assuming that all the abstract functions refine their specifications, and thus we have a partial proof of compilation correctness for all Cogent functions. We can reduce this partiality, or even eliminate it entirely and prove whole program compilation correctness, by manually proving the functional correctness of the necessary abstract functions. The value typing for abstract types is provided by the user, who may define this relation however they like, so long as it still satisfies Cogent's FFI constraints. These constraints ensure that the refinement theorems generated by the Cogent compiler are not invalidated. This modular framework greatly reduces the burden on proof engineers, as they need only focus their attention on the important tasks, leaving the grunt work to the Cogent compiler.

Cogent's refinement theorem depends on memory safety. Therefore, simultaneously with the refinement theorem, it automatically proves memory safety properties, for all memory visible from Cogent, for all Cogent functions. To prevent abstract functions from invalidating this guarantee, the value typing judgement identifies the available pointers, and the frame relation ensures that this heap footprint is accurate.
At the bare minimum, we need to show that all abstract functions do not mutate any parts of Cogent memory unless they have been given explicit permission, i.e., the piece of memory is passed as an argument to an abstract function with writable permission. However, to guarantee memory safety globally, all pointers must be visible from Cogent. In our word array verification, we accomplish this by defining the read-only and writable pointer sets in our value typing for word arrays (in the update semantics) as the set of all pointers to each element in the array. This is only possible because we define our embedding of word arrays (also in the update semantics) as a pointer to the first element in the array and the length of the array. If we instead chose to define our embedding in the update semantics to look similar to our embedding in the value semantics, we would be unable to make use of the frame constraints to show memory safety.

Whilst verifying word arrays, we found that our Isabelle/HOL formalisation of Cogent's FFI value typing constraints was missing some constraints that were present in our pen-and-paper formalisation. We also found that the automatically generated Cogent refinement theorems require a small amount of manual intervention for integration with the manual proofs about abstract functions (e.g., for passing in the ξ function defining the semantics of abstract functions). Currently, Cogent's refinement proof generator for the shallow to value refinement phase uses a cheat tactic to assume the correctness of abstract functions rather than listing them as explicit assumptions (see Note 21). Listing these assumptions explicitly would make integrating their manual proofs easier than it is currently (similar to what is done in the update-to-C refinement phase). These minor inconveniences did not appear earlier because this is the first verification effort that verifies a Cogent-C system using Cogent's FFI interface.
They provide an avenue for further improvements to Cogent's proof FFI. Cogent's automatic refinement framework assumes users will provide the embeddings, value relations and environments for abstract functions after the automatic proof has run. One problem we faced was that Cogent's evaluation relation only terminates if all abstract functions terminate, and higher order abstract functions only terminate if the evaluation relation terminates, i.e., they are mutually recursive. We cannot define them as mutually recursive, however, since abstract functions are defined by users later in the process, so we cannot close the recursion before users have provided these definitions. This led us to impose the constraint that an order-n abstract function can only call functions of order less than n. This forces us to prove refinement for all functions of order n − 1 or less before we can prove refinement for an abstract function of order n. This constraint is not burdensome, and is already satisfied by all Cogent codebases.

Although not a serious issue, we found that Cogent's refinement framework uses both Isabelle/HOL's consts and type classes for its value relations: between the shallow and value semantics, and between the update and monadic shallow C semantics, respectively. In our verification of word arrays, we found that we could prove more general lemmas and theorems when the value relation was defined as a type class rather than as a consts, and the type class version was much neater. In hindsight, we should have used a type class for both, as we could then take advantage of the fact that we can add constraints to type classes whilst we cannot for consts.

6 Related Work

Modular Refinement
Patterson and Ahmed [15] have defined a spectrum of compiler verification theorems focusing on compositional compiler correctness, extensible through linking. Cogent's certifying compiler does not neatly fall on this spectrum, as the compiler itself does the linking of the Cogent-C system, and we are linking with manually verified C rather than with other compiler outputs. Nevertheless, the FFI constraints we prove would remain the same if we were to link with other compiler outputs. Similar to Cogent, the Cito language [19] supports cross-language linking and modular verification in an interactive theorem prover. Unlike Cogent, Cito is a low-level C-like language without a sophisticated type system, so the FFI requirements are simpler: it does not ensure memory safety, and any verification on top of Cito is done through a Hoare-style axiomatic semantics rather than through pure equational reasoning.

An important part of verified partial compilation is how the semantics of linking is defined for the source language. The simplest way to define linking on the source language is to require that all programs that can be linked must refine a program definable in the source language. SepCompCert [7] and Pilsner [12] take this approach. This approach is suitable for languages that, unlike Cogent, are expressive enough to define all the programs that we want to link with. As Cogent is a restricted language, we require C expressions to refine

more expressive Isabelle/HOL specifications, not Cogent ones. Our Cogent compiler generates shallow embeddings of Cogent programs in Isabelle/HOL, which enables linking at the Isabelle specification level. Note that if we were to change the Cogent language, then so long as we are still able to shallowly embed the new language in Isabelle/HOL, linking would remain unaffected. Hence, Cogent supports source independent linking.

Verified compilation typically occurs in more than one pass. That is, we show that the source program is refined by some intermediate representation, which in turn is refined by some other intermediate representation, and so on until it is refined by our target program. Then, by transitivity, we know that the target program refines the source program. However, these passes may have subtle dependencies on each other [18, 16], which can necessitate reproving correctness for all existing passes when new passes are introduced. On the other hand, if the verification passes can occur in isolation, then the compiler supports vertical compositionality, and is thus quite modular. The Cogent compiler is technically not a verified compiler, but like a verified compiler it consists of many refinement phases which are independent of each other, and thus we also benefit from this vertical compositionality.

Verified Data Structures
Word arrays are widely used data structures, particularly in low-level systems programming. The Cogent compiler uses the C-parser [20] to import C code into Isabelle/HOL, and AutoCorres [5, 6] to simplify the imported C code to a monadic shallow embedding in Isabelle/HOL. Within AutoCorres, there are a number of example verified programs, including binary search and quicksort, which use 32-bit word arrays. Thus, AutoCorres also includes verified get and put operations for word arrays. For just these two operations, their verification is much smaller in terms of lines of proof (roughly 250 lines, about half of ours for these functions), which can be attributed to the fact that we have more refinement steps. On the other hand, our proofs from the update semantics upwards are reusable for any primitive element type, whereas AutoCorres only verifies 32-bit word arrays.

If we were only concerned with proving functional correctness of a word array implementation, we could have taken a top-down approach, i.e., generating an implementation from the specification using existing tools or frameworks [9, 10, 11], rather than a bottom-up approach, i.e., proving that an implementation refines its specification. The top-down approach would likely lessen the verification burden; however, the bottom-up approach allows for more control over the exact implementation produced, and is suitable for cases where an implementation already exists. Our aim was to verify an existing implementation of word arrays for BilbyFs, as well as to demonstrate the technique for more sophisticated data structures where the top-down approach may not be viable.

7 Conclusion

We have presented a case study demonstrating a cross-language approach to proving refinement between Cogent, a safe functional language, and C, an unsafe imperative language. Namely, we verified the word array implementation provided in Cogent's ADT library, which was used in the implementation of two real-world Linux file systems, and showed that it respects Cogent's FFI. We demonstrated proving correctness of Cogent's FFI linking. We also proved that our specification of word arrays discharges the axiomatization of word arrays in the BilbyFs functional correctness specification. These case studies demonstrate that our verification can be modularly composed with Cogent's refinement proofs, as well as with functional correctness proofs, to ensure the correctness of an overall Cogent-C system.

References

1 Amal Ahmed. Compositional Compiler Verification for a Multi-Language World. In Delia Kesner and Brigitte Pientka, editors, 1st International Conference on Formal Structures for Computation and Deduction (FSCD 2016), volume 52 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1:1–1:1, Dagstuhl, Germany, 2016. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. URL: http://drops.dagstuhl.de/opus/volltexte/2016/5968, doi:10.4230/LIPIcs.FSCD.2016.1.

2 Sidney Amani. A Methodology for Trustworthy File Systems. PhD thesis, CSE, UNSW, Sydney, Australia, August 2016.

3 Sidney Amani, Alex Hixon, Zilin Chen, Christine Rizkallah, Peter Chubb, Liam O'Connor, Joel Beeren, Yutaka Nagashima, Japheth Lim, Thomas Sewell, Joseph Tuong, Gabriele Keller, Toby Murray, Gerwin Klein, and Gernot Heiser. Cogent: Verifying high-assurance file system implementations. In International Conference on Architectural Support for Programming Languages and Operating Systems, pages 175–188, Atlanta, GA, USA, April 2016. doi:10.1145/2872362.2872404.

4 Cogent FFI examples (code and proofs). https://github.com/NICTA/cogent/tree/wordarray-example/itp2021-artefact, 2021. Accessed February 2021.

5 David Greenaway, June Andronick, and Gerwin Klein. Bridging the gap: Automatic verified abstraction of C. In International Conference on Interactive Theorem Proving, pages 99–115, Princeton, New Jersey, USA, August 2012.

6 David Greenaway, Japheth Lim, June Andronick, and Gerwin Klein. Don't sweat the small stuff: Formal verification of C code without the pain. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 429–439, Edinburgh, UK, June 2014. ACM. doi:10.1145/2594291.2594296.

7 Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis. Lightweight verification of separate compilation. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '16, pages 178–190, New York, NY, USA, 2016. Association for Computing Machinery. doi:10.1145/2837614.2837642.

8 Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Winwood. seL4: Formal verification of an OS kernel. In ACM Symposium on Operating Systems Principles, pages 207–220, Big Sky, MT, USA, October 2009.

9 Peter Lammich. Generating verified LLVM from Isabelle/HOL. In John Harrison, John O'Leary, and Andrew Tolmach, editors, 10th International Conference on Interactive Theorem Proving, ITP 2019, September 9-12, 2019, Portland, OR, USA, volume 141 of LIPIcs, pages 22:1–22:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.ITP.2019.22.

10 Peter Lammich. Refinement to Imperative HOL. Journal of Automated Reasoning, 62(4):481–503, Apr 2019. doi:10.1007/s10817-017-9437-1.

11 Peter Lammich and Andreas Lochbihler. The Isabelle collections framework. In Matt Kaufmann and Lawrence C. Paulson, editors, Interactive Theorem Proving, pages 339–354, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg.

12 Georg Neis, Chung-Kil Hur, Jan-Oliver Kaiser, Craig McLaughlin, Derek Dreyer, and Viktor Vafeiadis. Pilsner: A compositionally verified compiler for a higher-order imperative language. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming, ICFP 2015, pages 166–178, New York, NY, USA, 2015. Association for Computing Machinery. doi:10.1145/2784731.2784764.

13 Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL - A Proof Assistant for Higher-Order Logic, volume 2283 of Lecture Notes in Computer Science. Springer, 2002.

14 Liam O'Connor, Zilin Chen, Christine Rizkallah, Sidney Amani, Japheth Lim, Toby Murray, Yutaka Nagashima, Thomas Sewell, and Gerwin Klein. Refinement through restraint: Bringing down the cost of verification. In International Conference on Functional Programming, Nara, Japan, September 2016.

15 Daniel Patterson and Amal Ahmed. The next 700 compiler correctness theorems (functional pearl). Proc. ACM Program. Lang., 3(ICFP):85:1–85:29, 2019. doi:10.1145/3341689.

16 James T. Perconti and Amal Ahmed. Verifying an open compiler using multi-language semantics. In Zhong Shao, editor, Programming Languages and Systems, pages 128–148, Berlin, Heidelberg, 2014. Springer Berlin Heidelberg.

17 Christine Rizkallah, Japheth Lim, Yutaka Nagashima, Thomas Sewell, Zilin Chen, Liam O'Connor, Toby Murray, Gabriele Keller, and Gerwin Klein. A framework for the automatic formal verification of refinement from Cogent to C. In International Conference on Interactive Theorem Proving, Nancy, France, August 2016.

18 Gordon Stewart, Lennart Beringer, Santiago Cuellar, and Andrew W. Appel. Compositional CompCert. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '15, pages 275–287, New York, NY, USA, 2015. Association for Computing Machinery. doi:10.1145/2676726.2676985.

19 Peng Wang, Santiago Cuellar, and Adam Chlipala. Compiler verification meets cross-language linking via data abstraction. In Andrew P. Black and Todd D. Millstein, editors, Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA 2014, part of SPLASH 2014, Portland, OR, USA, October 20-24, 2014, pages 675–690. ACM, 2014. doi:10.1145/2660193.2660201.

20 Simon Winwood, Gerwin Klein, Thomas Sewell, June Andronick, David Cock, and Michael Norrish. Mind the gap: A verification framework for low-level C. In S. Berghofer, T. Nipkow, C. Urban, and M. Wenzel, editors, International Conference on Theorem Proving in Higher Order Logics, pages 500–515, Munich, Germany, August 2009. Springer. doi:10.1007/978-3-642-03359-9_34.