Functional Programming Languages Part VI: Towards Coq Mechanization

Functional programming languages Part VI: towards Coq mechanization Xavier Leroy INRIA Paris MPRI 2-4, 2016-2017 X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 1 / 58 Prologue: why mechanization? Why formalize programming languages? To obtain mathematically-precise definitions of languages. (Dynamic semantics, type systems, . ) To obtain mathematical evidence of soundness for tools such as type systems (well-typed programs do not go wrong) type checkers and type inference static analyzers (e.g. abstract interpreters) program logics (e.g. Hoare logic, separation logic) deductive program provers (e.g. verification condition generators) interpreters compilers. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 2 / 58 Prologue: why mechanization? Challenge 1: scaling up From calculi (λ, π) and toy languages (IMP, MiniML) to real-world languages (in all their ugliness), e.g. Java, C, JavaScript. From abstract machines to optimizing, multi-pass compilers producing code for real processors. From textbook abstract interpreters to scalable and precise static analyzers such as Astrée. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 3 / 58 Prologue: why mechanization? Challenge 2: trusting the maths Authors struggle with huge LaTeX documents. Reviewers give up on checking huge but rather boring proofs. Proofs written by computer scientists are boring: they read as if the author is programming the reader. (John C. Mitchell) Few opportunities for reuse and incremental extension of earlier work. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 4 / 58 Prologue: why mechanization? Opportunity: machine-assisted proofs Mechanized theorem proving has made great progress in the last 20 years. (Cf. the monumental theorems of Gonthier et al: 4 colors, Feit-Thompson.) Emergence of engineering principles for large mechanized proofs. The kind of proofs used in programming language theory are a good match for theorem provers: large definitions, many cases mostly syntactic techniques, no deep mathematics concepts. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 5 / 58 Prologue: why mechanization? The POPLmark challenge In 2005, Aydemir et al challenged the POPL community: How close are we to a world where every paper on programming languages is accompanied by an electronic appendix with machine-checked proofs? 12 years later, about 20% of the papers at recent POPL conferences come with such an electronic appendix. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 6 / 58 Prologue: why mechanization? Proof assistants 1 A formal specification language to write definitions and state theorems. 2 Commands to build proofs, often in interaction with the user + some proof automation. 3 Often, an independent, automatic checker for the proofs thus built. Popular proof assistants in programming language research: Coq, HOL4, Isabelle/HOL. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 7 / 58 Examples of mechanization Outline 1 Prologue: why mechanization? 2 Examples of mechanization 3 Bound variables and alpha-conversion de Bruijn indices Higher-Order Abstract Syntax Nominal approaches Locally nameless X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 8 / 58 Examples of mechanization Examples of Coq mechanization See the commented Coq development on the course Web site. Module Contents From lecture Sequences Library on sequences of transitions Semantics Small-step and big-step semantics; Leroy's lecture #1 equivalence proofs Typing Simply-typed λ-calculus and type Rémy'slecture #1 soundness proof Compiler Compilation to Modern SECD and Leroy's lecture #2 correctness proof CPS CPS conversion and correctness proof Leroy's lecture #3 X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 9 / 58 Examples of mechanization Coq in a nutshell, 1: Computations and functions A pure functional language in the style of ML, with recursive functions defined by pattern-matching. Fixpoint factorial (n: nat) := match n with | O => 1 | S p => n * factorial p end. Fixpoint concat (A: Type) (l1 l2: list A) := match l1 with | nil => l2 | h :: t => h :: concat t l2 end. Note: all functions must terminate. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 10 / 58 Examples of mechanization Coq in a nutshell, 2: mathematical logic The usual logical connectors and quantifiers. Definition divides (a b: N) := exists n: N, b = n * a. Theorem factorial_divisors: forall n i, 1 <= i <= n -> divides i (factorial n). Definition prime (p: N) := p > 1 /\ (forall d, divides d p -> d = 1 \/ d = p). Theorem Euclid: forall n, exists p, p >= n /\ prime p. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 11 / 58 Examples of mechanization Coq in a nutshell, 3: inductive types Like Generalized Abstract Data Types in OCaml and Haskell. Inductive nat: Type := | O: nat | S: nat -> nat. Inductive list: Type -> Type := | nil: forall A, list A | cons: forall A, A -> list A -> list A. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 12 / 58 Examples of mechanization Coq in a nutshell, 4: inductive predicates Similar to definitions by axioms and inference rules. Inductive even: nat -> Prop := | even_zero: even O | even_plus_2: forall n, even n -> even (S (S n)). Compare with the inference rules: even n even O even (S(S(n))) Technically: even is the GADT that describes derivations of even n statements. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 13 / 58 Examples of mechanization Lessons learned Definitions and statements of theorems are close to what we did on paper. Proof scripts are ugly but no one has to read them. Ratio specs : proof scripts is roughly 1 : 1. No alpha-conversion? (much more on this later). Good reuse potential, for small extensions (e.g. primitive let, booleans, etc) and not so small ones (e.g. subtyping). X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 14 / 58 Bound variables and alpha-conversion Outline 1 Prologue: why mechanization? 2 Examples of mechanization 3 Bound variables and alpha-conversion de Bruijn indices Higher-Order Abstract Syntax Nominal approaches Locally nameless X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 15 / 58 Bound variables and alpha-conversion Bound variables and alpha-conversion Most programming languages provide constructs that bind variables: Function abstractions λx:a Definitions let x = a in b Pattern-matching match a with (x; y) ! b Quantifiers (in types) 8α: α ! α It is customary to identify terms up to alpha-conversion, that is, renaming of bound variables: λx: x + 1 ≡α λy: y + 1 8α: α list ≡α 8β: β list X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 16 / 58 Bound variables and alpha-conversion Substitutions, capture, and alpha-conversion In the presence of binders, textual substitution afx bg must avoid capturing a free variable of b by a binder in a: (λy: x + y)fx 2 × zg = λy: 2 × z + y 4 (λy: x + y)fx 2 × yg = λy: 2 × y + y 8 In the second case, the free y is captured by λy. A major reason for considering terms up to alpha-conversion is that capture can always be avoided by renaming bound variables on the fly during substitution: (λy: x + y)fx 2 × yg = (λz: x + z)fx 2 × yg = λz: 2 × y + z 4 This is called capture-avoiding substitution. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 17 / 58 Bound variables and alpha-conversion An oddity: no alpha-conversion ?!? The Coq development uses names for free and bound variables, but terms are not identified up to renaming of bound variables. Fun(x, Var x) 6= Fun(y, Var y) if x 6= y Likewise, substitution is not capture-avoiding: (Fun(y; a))fx bg = Fun(y; afx bg) if x 6= y Correct only if b is closed. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 18 / 58 Bound variables and alpha-conversion To alpha-convert or not to alpha-convert? α-conversion not needed α-conversion required For semantics Weak reductions of Strong reductions closed programs Symbolic evaluation For type systems Simply-typed Polymorphism in general (8, 9) Hindley-Milner (barely) System F and above. Dependent types. For program transformations Compilation to abstract mach. Compile-time reductions Naive CPS conversion (barely) One-pass CPS conversion Naive monadic conversion Administrative reductions X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 19 / 58 Bound variables and alpha-conversion What is so hard with alpha-conversion? Working with a quotient set: λx:a ≡α λy: afx yg if y not free in a Need to ensure that every definition and statement is compatible with this equivalence relation. Example: the free variables of a term are defined by recursion on a representative of an equivalence class: FV (x) = x FV (λx:a) = FV (a) n fxg FV (a b) = FV (a) [ FV (b) Must prove compatibility: FV (a) = FV (b) if a ≡α b. X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 20 / 58 Bound variables and alpha-conversion What is so hard with alpha-conversion? Need to define appropriate induction principles: To prove a property P of a term by induction on the term, in the case P(λx:a), what can I assume as induction hypothesis? just that P(a) holds? or that P(afx yg) also holds for any y such that λx:a ≡α λy:afx yg? X. Leroy (INRIA) Functional programming languages MPRI 2-4, 2016-2017 21 / 58 Bound variables and alpha-conversion What is so hard with alpha-conversion? Just defining capture-avoiding substitution: (λy:a)fx bg = λz: (afy zg) fx bg with z fresh Need to define what \z fresh" means, e.g. z is the smallest variable not free in a nor in b.

Functional Programming Languages Part VI: Towards Coq Mechanization

Experience Implementing a Performant Category-Theory Library in Coq

Testing Noninterference, Quickly

An Anti-Locally-Nameless Approach to Formalizing Quantifiers Olivier Laurent

Mathematics in the Computer

Xmonad in Coq: Programming a Window Manager in a Proof Assistant

A Survey of Engineering of Formally Verified Software

Metamathematics in Coq Metamathematica in Coq (Met Een Samenvatting in Het Nederlands)

Experience Implementing a Performant Category-Theory Library in Coq

A Toolchain to Produce Verified Ocaml Libraries

Dependently Typed Programming

Basics of Type Theory and Coq Logic Types ∧, ∨, ⇒, ¬, ∀, ∃ ×, +, →, Q, P Michael Shulman

Lecture 4 : Advanced Functional Programming with COQ