Functional programming languages Part VI: towards Coq mechanization
Xavier Leroy
INRIA Paris
MPRI 2-4, 2016-2017
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 1 / 58 Prologue: why mechanization? Why formalize programming languages?
To obtain mathematically-precise definitions of languages. (Dynamic semantics, type systems, . . . )
To obtain mathematical evidence of soundness for tools such as type systems (well-typed programs do not go wrong) type checkers and type inference static analyzers (e.g. abstract interpreters) program logics (e.g. Hoare logic, separation logic) deductive program provers (e.g. verification condition generators) interpreters compilers.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 2 / 58 Prologue: why mechanization? Challenge 1: scaling up
From calculi (λ, π) and toy languages (IMP, MiniML) to real-world languages (in all their ugliness), e.g. Java, C, JavaScript.
From abstract machines to optimizing, multi-pass compilers producing code for real processors.
From textbook abstract interpreters to scalable and precise static analyzers such as Astr´ee.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 3 / 58 Prologue: why mechanization? Challenge 2: trusting the maths
Authors struggle with huge LaTeX documents.
Reviewers give up on checking huge but rather boring proofs. Proofs written by computer scientists are boring: they read as if the author is programming the reader. (John C. Mitchell)
Few opportunities for reuse and incremental extension of earlier work.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 4 / 58 Prologue: why mechanization? Opportunity: machine-assisted proofs
Mechanized theorem proving has made great progress in the last 20 years. (Cf. the monumental theorems of Gonthier et al: 4 colors, Feit-Thompson.)
Emergence of engineering principles for large mechanized proofs.
The kind of proofs used in programming language theory are a good match for theorem provers: large definitions, many cases mostly syntactic techniques, no deep mathematics concepts.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 5 / 58 Prologue: why mechanization? The POPLmark challenge
In 2005, Aydemir et al challenged the POPL community: How close are we to a world where every paper on programming languages is accompanied by an electronic appendix with machine-checked proofs? 12 years later, about 20% of the papers at recent POPL conferences come with such an electronic appendix.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 6 / 58 Prologue: why mechanization? Proof assistants
1 A formal specification language to write definitions and state theorems. 2 Commands to build proofs, often in interaction with the user + some proof automation. 3 Often, an independent, automatic checker for the proofs thus built.
Popular proof assistants in programming language research: Coq, HOL4, Isabelle/HOL.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 7 / 58 Examples of mechanization Outline
1 Prologue: why mechanization?
2 Examples of mechanization
3 Bound variables and alpha-conversion de Bruijn indices Higher-Order Abstract Syntax Nominal approaches Locally nameless
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 8 / 58 Examples of mechanization Examples of Coq mechanization
See the commented Coq development on the course Web site.
Module Contents From lecture Sequences Library on sequences of transitions Semantics Small-step and big-step semantics; Leroy’s lecture #1 equivalence proofs Typing Simply-typed λ-calculus and type R´emy’slecture #1 soundness proof Compiler Compilation to Modern SECD and Leroy’s lecture #2 correctness proof CPS CPS conversion and correctness proof Leroy’s lecture #3
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 9 / 58 Examples of mechanization Coq in a nutshell, 1: Computations and functions A pure functional language in the style of ML, with recursive functions defined by pattern-matching.
Fixpoint factorial (n: nat) := match n with | O => 1 | S p => n * factorial p end.
Fixpoint concat (A: Type) (l1 l2: list A) := match l1 with | nil => l2 | h :: t => h :: concat t l2 end.
Note: all functions must terminate.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 10 / 58 Examples of mechanization Coq in a nutshell, 2: mathematical logic
The usual logical connectors and quantifiers.
Definition divides (a b: N) := exists n: N, b = n * a.
Theorem factorial_divisors: forall n i, 1 <= i <= n -> divides i (factorial n).
Definition prime (p: N) := p > 1 /\ (forall d, divides d p -> d = 1 \/ d = p).
Theorem Euclid: forall n, exists p, p >= n /\ prime p.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 11 / 58 Examples of mechanization Coq in a nutshell, 3: inductive types
Like Generalized Abstract Data Types in OCaml and Haskell.
Inductive nat: Type := | O: nat | S: nat -> nat.
Inductive list: Type -> Type := | nil: forall A, list A | cons: forall A, A -> list A -> list A.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 12 / 58 Examples of mechanization Coq in a nutshell, 4: inductive predicates
Similar to definitions by axioms and inference rules.
Inductive even: nat -> Prop := | even_zero: even O | even_plus_2: forall n, even n -> even (S (S n)).
Compare with the inference rules:
even n even O even (S(S(n))) Technically: even is the GADT that describes derivations of even n statements.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 13 / 58 Examples of mechanization Lessons learned
Definitions and statements of theorems are close to what we did on paper.
Proof scripts are ugly but no one has to read them.
Ratio specs : proof scripts is roughly 1 : 1.
No alpha-conversion? (much more on this later).
Good reuse potential, for small extensions (e.g. primitive let, booleans, etc) and not so small ones (e.g. subtyping).
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 14 / 58 Bound variables and alpha-conversion Outline
1 Prologue: why mechanization?
2 Examples of mechanization
3 Bound variables and alpha-conversion de Bruijn indices Higher-Order Abstract Syntax Nominal approaches Locally nameless
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 15 / 58 Bound variables and alpha-conversion Bound variables and alpha-conversion
Most programming languages provide constructs that bind variables: Function abstractions λx.a Definitions let x = a in b Pattern-matching match a with (x, y) → b Quantifiers (in types) ∀α. α → α
It is customary to identify terms up to alpha-conversion, that is, renaming of bound variables:
λx. x + 1 ≡α λy. y + 1 ∀α. α list ≡α ∀β. β list
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 16 / 58 Bound variables and alpha-conversion Substitutions, capture, and alpha-conversion
In the presence of binders, textual substitution a{x ← b} must avoid capturing a free variable of b by a binder in a:
(λy. x + y){x ← 2 × z} = λy. 2 × z + y (λy. x + y){x ← 2 × y} = λy. 2 × y + y
In the second case, the free y is captured by λy.
A major reason for considering terms up to alpha-conversion is that capture can always be avoided by renaming bound variables on the fly during substitution:
(λy. x + y){x ← 2 × y} = (λz. x + z){x ← 2 × y} = λz. 2 × y + z
This is called capture-avoiding substitution.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 17 / 58 Bound variables and alpha-conversion An oddity: no alpha-conversion ?!?
The Coq development uses names for free and bound variables, but terms are not identified up to renaming of bound variables.
Fun(x, Var x) 6= Fun(y, Var y) if x 6= y
Likewise, substitution is not capture-avoiding:
(Fun(y, a)){x ← b} = Fun(y, a{x ← b}) if x 6= y
Correct only if b is closed.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 18 / 58 Bound variables and alpha-conversion To alpha-convert or not to alpha-convert? α-conversion not needed α-conversion required For semantics Weak reductions of Strong reductions closed programs Symbolic evaluation For type systems Simply-typed Polymorphism in general (∀, ∃) Hindley-Milner (barely) System F and above. Dependent types. For program transformations Compilation to abstract mach. Compile-time reductions Naive CPS conversion (barely) One-pass CPS conversion Naive monadic conversion Administrative reductions
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 19 / 58 Bound variables and alpha-conversion What is so hard with alpha-conversion?
Working with a quotient set:
λx.a ≡α λy. a{x ← y} if y not free in a
Need to ensure that every definition and statement is compatible with this equivalence relation.
Example: the free variables of a term are defined by recursion on a representative of an equivalence class:
FV (x) = x FV (λx.a) = FV (a) \{x} FV (a b) = FV (a) ∪ FV (b)
Must prove compatibility: FV (a) = FV (b) if a ≡α b.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 20 / 58 Bound variables and alpha-conversion What is so hard with alpha-conversion?
Need to define appropriate induction principles:
To prove a property P of a term by induction on the term, in the case P(λx.a), what can I assume as induction hypothesis? just that P(a) holds? or that P(a{x ← y}) also holds for any y such that λx.a ≡α λy.a{x ← y}?
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 21 / 58 Bound variables and alpha-conversion What is so hard with alpha-conversion?
Just defining capture-avoiding substitution:
(λy.a){x ← b} = λz. (a{y ← z}) {x ← b} with z fresh
Need to define what “z fresh” means, e.g. z is the smallest variable not free in a nor in b.
Then, we get a recursion that is not structural( a{y ← z} is not a syntactic subterm of λy.a) and therefore not directly accepted by Coq. Must use a different style of recursion, e.g. on the size of the term rather than on its structure.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 22 / 58 Bound variables and alpha-conversion Mechanizing alpha-conversion
Several approaches to the handling of binders and alpha-conversion using interactive theorem provers. Described next: 1 de Bruijn indices 2 higher-order abstract syntax 3 nominal logics 4 locally nameless.
The “POPLmark challenge” compare these approaches (and more) on a challenge problem (type soundness for F<:). 15 solutions in 5 proof assistants. No consensus.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 23 / 58 Bound variables and alpha-conversion de Bruijn indices Outline
1 Prologue: why mechanization?
2 Examples of mechanization
3 Bound variables and alpha-conversion de Bruijn indices Higher-Order Abstract Syntax Nominal approaches Locally nameless
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 24 / 58 Bound variables and alpha-conversion de Bruijn indices de Bruijn indices (Nicolaas de Bruijn, the Automath prover, 1967)
a ::= n variable (n ∈ N) | λ.a abstraction, binds variable 0 | a1 a2 application
Represent every variable by its distance to its binder.
λx.λy. x y ≡ λ.λ. 1 0 ≡ λz.λw. z w
A canonical representation: α-convertible terms are equal.
Used in many early mechanizations (e.g. G. Huet, Residual Theory in lambda-Calculus: A Formal Development, J. Funct. Program. 4(3), 1994; B. Barras and B. Werner, Coq in Coq, 1997).
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 25 / 58 Bound variables and alpha-conversion de Bruijn indices Substitution with de Bruijn indices The definition of substitution is not obvious: x if x < n x{n ← a} = a if x = n x − 1 if x > n
(λ.b){n ← a} = λ. b{n + 1 ← ⇑0 a} (b c){n ← a} = b{n ← a} c{n ← a} For abstractions, the indices of free variables in a must be incremented by one to avoid capturing variable 0 bound by the λ. This is achieved by the lifting operator ⇑n, which increments all variables ≥ n. ( x if x < n ⇑n x = x + 1 if x ≥ n
⇑n λ.a = λ. ⇑n+1 a
⇑n (a b) = (⇑n a)(⇑n b)
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 26 / 58 Bound variables and alpha-conversion de Bruijn indices Properties of substitution and lifting
Some technical commutation lemmas: if p ≥ n,
(⇑n a){n ← b} = a
⇑n (a{p ← b}) = (⇑n a){p + 1 ←⇑n b}
⇑p (a{n ← b}) = (⇑p+1 a){n ←⇑p b}
a{n ← b}{p ← c} = a{p + 1 ←⇑n c}{n ← b{p ← c}}
Pro: systematic proofs, easy to automate. Cons: not intuitive, unclear what indices really mean.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 27 / 58 Bound variables and alpha-conversion de Bruijn indices Issues with de Bruijn indices
Statements of theorems are polluted by adjustments of indices for free variables and don’t look quite like what we’re used to.
For example, in textbook, named notation, the weakening lemma is:
0 Weakening : E1, E2 ` a : τ =⇒ E1, x : τ , E2 ` a : τ
with the same a in hypothesis and conclusion. (Plus: implicit hypothesis that x is not bound in E1 or E2.)
With de Bruijn indices, the second a is lifted by an amount that is equal to the number of binders after that of x, namely, the length of the E1 environment:
0 Weakening : E1, E2 ` a : τ =⇒ E1, τ , E2 `⇑|E1| a : τ
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 28 / 58 Bound variables and alpha-conversion de Bruijn indices Issues with de Bruijn indices
Likewise, for stability of typing by substitution, the top-level statement is quite readable:
τ 0, E ` a : τ ∧ E ` b : τ 0 =⇒ E ` a{0 ← b} : τ
but the statement that can be proved by induction is less intuitive:
0 0 E1, τ , E2 ` a : τ ∧ E1, E2 ` b : τ =⇒ E1, E2 ` a{|E1| ← b} : τ
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 29 / 58 Bound variables and alpha-conversion de Bruijn indices Summary
de Bruijn indices make it possible to mechanize the theory of many languages, including difficult features such as multiple kinds of variables (e.g. type variables and term variables in System F) binders that bind multiple variables (e.g. ML pattern-matching).
The statements of theorems look somewhat different from what we do on paper using names, and take time getting used to.
Some “boilerplate” (definitions and properties of substitutions and lifting) is necessary, but can be automated to some extent. See for instance F. Pottier’s DBlib library, https://github.com/fpottier/dblib.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 30 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Outline
1 Prologue: why mechanization?
2 Examples of mechanization
3 Bound variables and alpha-conversion de Bruijn indices Higher-Order Abstract Syntax Nominal approaches Locally nameless
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 31 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Higher-Order Abstract Syntax (HOAS) (Pfenning, Elliott, Miller, Nadathur, 1987–1988; the Twelf logical framework)
Leverage the power of your logic: it has α-conversion built-in!
Represent binding in terms using functions of your logic/language.
type term = ---> type term = | Var of name | Lam of name * term | Lam of term -> term | App of term * term | App of term * term
λx.λy. x y is represented as Lam(fun x -> Lam(fun y -> App(x,y))).
Substitution is just function application!
let betared (Lam f) arg = f arg
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 32 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Higher-Order Abstract Syntax (HOAS)
How to reason about non-closed terms? Using logical implications and hypothetical judgments!
Example: typing rules for simply-typed λ-calculus
assm(x, τ) ∀x, (assm(x, τ1) ⇒ ` f x : τ2)
` x : τ ` Lam f : τ1 → τ2
The typing environment is encoded as the set of logical assumptions assm(x, τ) available in the logical environment.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 33 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Why no HOAS in Coq?
Problem 1: “exotic” function terms. In Coq or Caml, the type term -> term contains more functions than just HOAS encodings of λ-terms. The latter are parametric in their arguments, while Coq/Caml functions can also discriminate over their arguments, e.g.
f = fun x -> match x with App(_,_) -> x | Lam _ -> App(x,x)
These “exotic” functions invalidate some expected properties of substitutions. For example, we expect
a{x ← b} = a{x ← c} =⇒ b = c or ∀b, a{x ← b} = a
However, this property fails with the “exotic” function f above,
f (Lam g) = App(Lam g, Lam g) = f (App (Lam g, Lam g))
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 34 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Why no HOAS in Coq?
Problem 2: contravariant inductive definitions are not allowed in Coq.
Inductive term := Lam : (term -> term) -> term
It would make it possible to define nonterminating computations:
Definition delta (t: term): term := match t with Lam f => f t end. Definition omega : term := delta (Lam delta).
Likewise, inductive predicates with contravariant (negative) occurrences lead to logical inconsistency:
Inductive bot : Prop := Bot : (bot -> False) -> bot.
By inversion, bot implies bot -> False and therefore False.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 35 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax The Twelf system (F, Pfenning, C. Schurmann et al; http://twelf.plparty.org/)
A logical framework that supports defining and reasoning upon languages and logics using Higher-Order Abstract Syntax: Terms of the language are encoded in LF using HOAS for binders (cf. lecture #3 of Y. R´egis-Gianas). Properties of terms (such as well-typedness) and meta-properties (such as soundness of typing) are specified as inference rules in λ-Prolog.
The Coq difficulties with HOAS (exotic functions, nontermination) are avoided by restricting the expressiveness of the functional language (LF): it has no destructors (no pattern-matching), hence can only express functions that are “parametric” in their arguments.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 36 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Parametric HOAS (Adam Chlipala, 2008)
A form of HOAS that is compatible with Coq and similar proof assistants.
Idea 1: break the contravariant recursion term = Lam of term -> term by introducing a type V standing for variables:
Inductive term (V: Type): Type := | Var: V -> term V | Fun: (V -> term V) -> term V | App: term V -> term V -> term V.
Example: the identity function λx.x is represented by
Fun (fun (x: V) -> Var x)
for any type V of your choice.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 37 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Parametric HOAS For a fixed type V (say, nat), term V still contains exotic functions:
Fun (fun n => match n with O => Var nat 1 | _ => Var nat n end)
Idea 2: rule out exotic functions by quantifying over all possible types V. Thus, the type of closed terms is
Definition term0 := forall V, term V.
and the type of terms having one free variable is
Definition term1 := forall V, V -> term V.
We can then define constructors for closed terms:
Definition APP (a b: term0) : term0 := fun (V: Type) => App (a V) (b V). Definition FUN (a: term1) : term0 := fun (V: Type) => Fun (a V).
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 38 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax Substitution in parametric HOAS
Substitution is almost function application:
Definition subst (f: term1) (a: term0) : term0 := fun (V: Type) => squash (f (term V) (a V))
squash is the canonical injection from term (term V) into term V obtained by erasing the Var constructors:
Fixpoint squash (V: Type) (a: term (term V)) : term V := match a with | Var x => x | Fun f => Fun (fun v => f (Var v)) | App b c => App (squash b) (squash c) end.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 39 / 58 Bound variables and alpha-conversion Higher-Order Abstract Syntax For more information
Chapter 17 of Certified Programming with Dependent Types, A. Chlipala.
The Lambda Tamer library http://ltamer.sourceforge.net/
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 40 / 58 Bound variables and alpha-conversion Nominal approaches Outline
1 Prologue: why mechanization?
2 Examples of mechanization
3 Bound variables and alpha-conversion de Bruijn indices Higher-Order Abstract Syntax Nominal approaches Locally nameless
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 41 / 58 Bound variables and alpha-conversion Nominal approaches Nominal approaches Nominal logic (A. Pitts, 2001): A first-order logic that internalizes the notions of names, binding, and equivariance (invariance under α-conversion). The Nominal package for Isabelle/HOL (Ch. Urban et al, 2006): A library for the Isabelle/HOL proof assistant that implements the main notions of nominal logic and automates the definition of data types with binders (e.g. AST for programming languages) modulo α-conversion. Example 1 (Nominal Isabelle definition) atom_decl name nominal_datatype lam = Var "name" | App "lam × lam" | Lam "name lam"
Automatically defines the correct equality-modulo-α over type lam, as well as an appropriate induction principle.
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 42 / 58 Bound variables and alpha-conversion Nominal approaches Names and swaps
α-equivalence is defined not in terms of renamings {x ← y} but in terms of swaps (permutations of two names)
x = {x ← y; y ← x} y
Unlike renamings, swaps are bijective (self-inverse) and can be applied uniformly both to free and bound variables, with no risk of capture:
x x x (λz.a) = λ (z). (a) y y y
X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 43 / 58 Bound variables and alpha-conversion Nominal approaches Nominal types
A nominal type is a type equipped with a swap operation satisfying common-sense properties such as