
Quick Intro to MLTT*

Sergey Goncharov FAU Erlangen-Nürnberg

April 28, 2021

* (intensional) Martin-Löf Type Theory

Roadmap

1 A Cherry-Banana Calculus

2 Formal Systems

3 Principles of Type Theory

4 Formal MLTT†

† adapted from the HoTT book

A Cherry-Banana Calculus

Consider a Cherry-Banana Calculus:

X X Y X (i) (ii) (iii) (iv) X XY X

Each rule scheme represents an infinite number of rules, obtained by replacing the variables X, Y with non-empty finite sequences of cherry and banana symbols

We can build proofs or derivations, like (i) (i) (iv) (ii) (iii)

starting from rules without premises, i.e. axioms

Derivability

- We have shown that […] is derivable
- But […] is not derivable: no rule has it as the conclusion
- The fact that […] is not derivable is a meta-property of this system; it is provable in a very weak meta-logic
- But […] is not derivable either: neither derivation

? ? ? (ii) (iii)

can be finished. This requires a stronger meta-logic, supporting induction. For a proof: define the derivation height, and show by induction that it cannot be finite

Derivable and Admissible Rules

- A rule (scheme) is derivable if it can be obtained by combining existing rules
- A rule (scheme) is admissible if by adding it one can derive nothing new
- Derivable rules are always admissible, but admissible rules need not be derivable

Formal Systems

A formal system consists of
- a finite alphabet of symbols
- a language over that alphabet, determined by a grammar
- inference rules

Example:
- Alphabet: cherry, banana
- Language: X ::= cherry | banana | cherry X | banana X
- Inference rules: as above

Schematization ensures that an infinite set of rules can be organized in a finite way. More generally, one typically requires the set of rules to be recursive or recursively enumerable

More Examples

λ-calculus:
  - Language: M, N ::= x | λx. M | N M
  - Rules: η-conversion, β-reduction

Operational semantics:
  - Language for judgments M → N (small-step) or M ⇓ v (big-step); requires a sublanguage for terms M, N and values v
  - Custom set of rules

First-order logic (Hilbert style):
  - Language: first-order formulas
  - Lots of axioms + two rules, modus ponens and generalization:

\[
\frac{\varphi \quad \varphi \to \psi}{\psi}\ (\text{modus ponens})
\qquad
\frac{\varphi}{\forall x.\,\varphi}\ (\text{generalization})
\]

First-order logic (Gentzen style):
  - Language of sequents φ1, ..., φn ⊢ ψ1, ..., ψm of formulas
  - Lots of rules + one axiom, φ ⊢ φ

Foundations

- Formal systems (since Hilbert) are used in foundations of mathematics: the inference process captures building new knowledge from old; inferred judgments are theorems
- Two big questions:
  - (In-)consistency: Can we derive contradictions, i.e. both φ and ¬φ?
  - (In-)completeness: Can we always derive either φ or ¬φ?
- These questions are "meta" (!): we need another formal system to deal with them for a given system
- However, we can also encode the meta-logic in the system itself, provided the system is sufficiently expressive
- Already for systems far weaker than what is needed for formalizing mathematics (Robinson's "induction-free" arithmetic), neither consistency nor completeness can be established by the system itself. This is what Gödel incompleteness means

Foundations:

- Most mathematicians in the world work in a specially chosen foundation of mathematics: set theory, say ZFC, if you really press them for detail
- ZFC = Zermelo–Fraenkel set theory + the axiom of Choice
- ZFC is manufactured in two stages:
  - A formal system for first-order logic
  - A first-order theory, i.e. the axioms that sets must satisfy
- By Gödel, ZFC is incomplete (or inconsistent), i.e. there are formulas that are neither provable nor disprovable, e.g. the continuum hypothesis, existence of large cardinals, etc.
- By adding them (assuming consistency) one obtains another incomplete system

Set Theory: Issues

- There are lots of issues with set theory that type theory aims to fix
- The language of set theory is unsuitable for direct use: numbers, functions, smooth manifolds must all be encoded as sets. Think of π ∩ ℝ
- Logic is decoupled from the structures we work with, i.e. we care a great deal about how true statements are derived, but not about how complex structures are built from their parts
- The axioms are arbitrary and do not help to orient proofs towards reaching goals
- The background logic of sets features debatable principles that prevent computational implementation and hence impede automation, such as excluded middle and impredicativity

Example: Total Computable Functions

- Total computable functions are those computable functions (normally partial) which are total
- But what does total mean? Totality is a logical property, which must be proven. Proven where?
  - The theory of computable functions can be developed in the arithmetic of natural numbers, and then the question is which axioms we choose. There exist extremely slowly converging functions (see Goodstein's theorem) whose totality cannot be proven in Peano arithmetic
  - Also in set theory, because of incompleteness, there are total computable functions which are not provably total
- Logic and computation are thus intimately connected, but that connection is obscured by dumb encodings into set theory and by running along different paths

Practical Considerations

- Set theory is a very theoretical theory: it is done by set theorists; the rest do not mind and just do math
- Theoretically, set theory is relevant to the whole of math, though. That is an example of the general situation: theoretically, there is no difference between theory and practice. Practically, there is
- Type theory aims at an inclusive, unifying platform, reconciling various aspects of algebra, logic and computability in one package. This is of direct use for programming, verification and proof automation, but also for abstract math. Citing Voevodsky:

And I now do my mathematics with a proof assistant. I have a lot of wishes in terms of getting this proof assistant to work better, but at least I don’t have to go home and worry about having made a mistake in my work. I know that if I did something, I did it, and I don’t have to come back to it nor do I have to worry about my arguments being too complicated or about how to convince others that my arguments are correct. I can just trust the computer. There are many people in computer science who are contributing to our program, but most mathematicians still don’t believe that it is a good idea. And I think that is very wrong.

Principles of Type Theory

Types

- Simply-typed λ-calculus is a good start
- By adding types to the untyped λ-calculus, we prevent non-termination via (λx. xx)(λx. xx) and enable strong normalization: every well-typed λ-term has a unique normal form, regardless of the reduction strategy
- We need two kinds of rules: term formation rules

\[
\frac{\Gamma \vdash f : A \to B \quad \Gamma \vdash t : A}{\Gamma \vdash f\,t : B}
\qquad
\frac{\Gamma, x : A \vdash t : B}{\Gamma \vdash \lambda x.\,t : A \to B}
\]

and reduction rules for αβη-reduction (omitted)
- Types are used in programming to prevent errors, to structure code, to avoid boilerplate, etc.
- We need a far more advanced type system than simply-typed λ-calculus, more advanced even than what most advanced programming languages have
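The term formation rules can be read as an inductive definition of the typing judgment. The following is a minimal Lean 4 sketch and not part of the original slides: it assumes de Bruijn indices instead of named variables and a single base type, and the names `Ty`, `Tm`, `Lookup` and `HasType` are hypothetical.

```lean
-- Simple types: one base type and function types A → B.
inductive Ty where
  | base : Ty
  | arr  : Ty → Ty → Ty

-- Raw terms with de Bruijn indices (an assumption; the slides use named variables).
inductive Tm where
  | var : Nat → Tm
  | lam : Ty → Tm → Tm
  | app : Tm → Tm → Tm

-- Lookup Γ n A: position n of the context Γ holds the type A.
inductive Lookup : List Ty → Nat → Ty → Prop where
  | here  {Γ : List Ty} {A : Ty} : Lookup (A :: Γ) 0 A
  | there {Γ : List Ty} {A B : Ty} {n : Nat} :
      Lookup Γ n A → Lookup (B :: Γ) (n + 1) A

-- HasType Γ t A plays the role of the judgment Γ ⊢ t : A.
inductive HasType : List Ty → Tm → Ty → Prop where
  | var {Γ : List Ty} {n : Nat} {A : Ty} :
      Lookup Γ n A → HasType Γ (Tm.var n) A
  | lam {Γ : List Ty} {A B : Ty} {t : Tm} :
      HasType (A :: Γ) t B → HasType Γ (Tm.lam A t) (Ty.arr A B)
  | app {Γ : List Ty} {A B : Ty} {f t : Tm} :
      HasType Γ f (Ty.arr A B) → HasType Γ t A → HasType Γ (Tm.app f t) B

-- The identity function on the base type is well typed in the empty context.
example : HasType [] (Tm.lam Ty.base (Tm.var 0)) (Ty.arr Ty.base Ty.base) :=
  HasType.lam (HasType.var Lookup.here)
```

Each constructor of `HasType` corresponds to one rule, so a derivation tree is literally a term built from these constructors.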

Propositions as Types

If we add to the simply-typed λ-calculus the unit type 1, the empty type 0 and binary coproducts, we obtain a beautiful correspondence between propositions and types

Proposition   Type
False         0
True          1
A ∧ B         A × B
A ∨ B         A + B
A → B         A → B

- It is advisable to spell this out for every row. For example, for products: to have an inhabitant of A × B, we have to provide an inhabitant both of A and of B
- A constructive view of logic and types is thus coded in!

Proofs as Programs

- A judgment Γ ⊢ t : A (where Γ = (x1 : A1, ..., xn : An)) can now be viewed as a proof of A1 ∧ ... ∧ An → A
- Example: a proof of (A → B) → (B ∨ A) → B:

f : A → B ⊢ λx : B + A. case x of inl y ↦ y; inr z ↦ f z : B + A → B

- The derivation rules of propositional logic in the natural deduction (viz. Fitch) style precisely correspond to the formation rules of λ-terms
- Computational interpretation: a proposition is derivable iff we can construct a program, i.e. a proof, which it types
- The exception is the law of excluded middle, φ ∨ (φ → False), which fails to have a computational interpretation in this sense (!)
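The example term can be replayed in a proof assistant. Below is a hedged Lean 4 sketch (the names `proofTerm` and `proofProp` are hypothetical): the same case analysis is written once as a program over the coproduct `Sum B A` and once as a proof about the disjunction `B ∨ A`.

```lean
-- The term from the slide, with Sum B A in the role of B + A.
def proofTerm {A B : Type} (f : A → B) : Sum B A → B :=
  fun x =>
    match x with
    | Sum.inl y => y      -- left injection: we already hold a B
    | Sum.inr z => f z    -- right injection: apply f to the A

-- The same program read as a proof of (A → B) → (B ∨ A) → B.
theorem proofProp {A B : Prop} (f : A → B) : B ∨ A → B :=
  fun x =>
    match x with
    | Or.inl y => y
    | Or.inr z => f z
```

The two definitions differ only in whether `A` and `B` are read as types of data or as propositions.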

Induction as Recursion

- If we add the type of natural numbers N (and other so-called inductive types), we obtain a connection between induction (logic) and structural recursion (programs)
- zero : 1 → N and suc : N → N are constructors
- Recursion on N (primitive recursion) produces a unique f̂ : X × N → Y for any given fzero : X → Y and fsuc : Y × N → Y, such that

\[
\hat f(x, \mathsf{zero}) = f_{\mathsf{zero}}(x)
\qquad
\hat f(x, \mathsf{suc}\ n) = f_{\mathsf{suc}}(\hat f(x, n), n)
\]

- A proof "by induction" = constructing a witness by recursion
- General (non-structural) recursion fix : (X → X) → X (like in Haskell) would contradict both
  - logic: every proposition X is provable with fix(id : X → X), and
  - computational interpretation: (fix f) is generally a non-terminating program (infinite proof?)
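As an illustration, here is a small Lean 4 sketch of primitive recursion with a parameter, together with a proof by induction written as a recursive function. It is an assumption-laden rendering: the arguments are curried rather than paired, the step function takes the previous result of type `Y`, and the names `primRec`, `addVia` and `zeroAdd` are hypothetical.

```lean
-- Primitive recursion on Nat with a parameter x, by structural recursion.
def primRec {X Y : Type} (fzero : X → Y) (fsuc : Y → Nat → Y) (x : X) : Nat → Y
  | 0     => fzero x
  | n + 1 => fsuc (primRec fzero fsuc x n) n

-- Addition as an instance: start from x and apply successor n times.
def addVia : Nat → Nat → Nat :=
  primRec (fun (x : Nat) => x) (fun (acc : Nat) (_ : Nat) => acc + 1)

example : addVia 2 3 = 5 := rfl

-- A proof by induction is the same kind of object: a structurally recursive
-- function producing, for each n, a witness of 0 + n = n.
theorem zeroAdd : (n : Nat) → 0 + n = n
  | 0     => rfl
  | n + 1 => congrArg Nat.succ (zeroAdd n)
```

Lean accepts both definitions only because the recursive calls are on a structurally smaller argument; an unrestricted `fix` is not definable in this total fragment.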

Dependent Types

- Still, we can extend the type system in many other "safe" ways
- Consider the type Vec(n) of vectors of length n. Then the type of all vectors is the dependent sum Σ(n : N) Vec(n)
- n ↦ (0, ..., n − 1) is an inhabitant of the dependent product Π(n : N) Vec(n) (compare with N → Σ(n : N) Vec(n))
- Σ and Π extend the propositions-as-types correspondence:

Proposition        Type
(∃x : X) φ(x)      Σ(x : X) φ(x)
(∀x : X) φ(x)      Π(x : X) φ(x)

- Type theories supporting dependent types are called dependent type theories. System F, behind Haskell, is not a dependent type theory; MLTT is
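The vector example can be sketched in Lean 4 as follows. This is only an illustration: `Vec`, `countdown` and `someVec` are hypothetical names, and `countdown n` builds (n − 1, ..., 0) rather than (0, ..., n − 1) because that order is the easiest to define by recursion.

```lean
-- Vec A n: vectors of length n, as a family of types indexed by Nat.
inductive Vec (A : Type) : Nat → Type where
  | nil  : Vec A 0
  | cons : {n : Nat} → A → Vec A n → Vec A (n + 1)

-- An inhabitant of the dependent product Π(n : N) Vec(n):
-- the result type depends on the argument n.
def countdown : (n : Nat) → Vec Nat n
  | 0     => Vec.nil
  | n + 1 => Vec.cons n (countdown n)

-- An inhabitant of the dependent sum Σ(n : N) Vec(n):
-- a length paired with a vector of exactly that length.
def someVec : (n : Nat) × Vec Nat n := ⟨2, countdown 2⟩

-- The weaker, non-dependent type only promises a vector of some length.
def weaker : Nat → (n : Nat) × Vec Nat n :=
  fun n => ⟨n, countdown n⟩
```

The type of `countdown` records that the output length equals the input, which the type of `weaker` does not.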

Proof Relevance

- Proof relevance is what makes the distinction between set theory and type theory so massive
  - In set theory we hasten to prove things and to forget the proofs, remembering however that proofs exist
  - In type theory we are bound to carry all proofs that have ever been done forever (if we still want to use the entailed theorems, of course), in the manner of a blockchain
- A typical example is a proof of existence: for an inhabitant of Σ(x : X) φ(x) we need a pair (x : X, p : φ(x)), i.e. not only must we show that the predicate φ is satisfiable, but we also need to supply a witness x that really satisfies it
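A hedged Lean 4 illustration of the existence example follows; the names `fourWitness` and `fourExists` are hypothetical. Note that Lean is not plain MLTT: it has a separate proof-irrelevant universe `Prop`, so the sketch contrasts a proof-relevant pair (a subtype) with Lean's proof-irrelevant `∃`.

```lean
-- Proof-relevant existence: a pair of a witness and evidence about it.
def fourWitness : { n : Nat // n + n = 4 } := ⟨2, rfl⟩

-- The witness can be projected out and computed with.
example : fourWitness.val = 2 := rfl

-- Lean's ∃ lives in Prop: it is built from the same data, but the witness
-- cannot in general be projected out into a program afterwards.
theorem fourExists : ∃ n : Nat, n + n = 4 := ⟨2, rfl⟩
```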

- Proof relevance can be seriously suppressed by using a special axiom for equality. But that is incompatible with some interpretations of type theory, such as HoTT

Universes

- If we are serious about using type theory for foundations, (almost) everything must have a type, because things that we want to talk about (terms, proofs, equations) must belong to corresponding classes and then we also need to talk about these classes, hence they must be internalized
- Can we have the type of all types U, so that X : U for every type X? Recall Russell's paradox:

{X | X ∉ X} (the barber who shaves exactly those who do not shave themselves)

- A hurdle of this sort in type theory is Girard's paradox
- Hence a cumulative hierarchy of universes U0 ⊆ U1 ⊆ ...
- Un+1 is the "planet B" where we go after we have overcrowded Un

(Im-)Predicativity

- Among the many things that are wrong with Russell's beast, it is impredicative: it aims to inhabit a totality by alluding to its a priori existence
- For an analogy: "a least upper bound in a partial order" secretly presupposes that it exists, although not every partial order has a least upper bound
- Before set theory was fixed, impredicativity was considered a possible source of trouble (but not anymore)
- With types, impredicativity can be efficiently tamed, and type theories often allow it in some forms (e.g. the Calculus of Inductive Constructions, behind Coq), but not MLTT (!)

Equality

- Recall that simply-typed λ-calculus is strongly normalizing, so we could regard the reduction relation as equality: two terms are equal if they reduce to the same normal form. Type-theoretically, this is definitional or judgmental equality
- If definitional equality were enough, we could solve any problem by mere normalization
- This motivates propositional equality via identity types x ≡_A y, inhabited by witnesses of the fact that x, y : A are equal. Thus equality becomes a proposition, like any other property
- Identity types are the key, and the most sophisticated, ingredient of MLTT
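The difference between the two equalities shows up in practice. In the Lean 4 sketch below (an illustration, relying on the fact that Lean's addition recurses on its second argument), `rfl` succeeds exactly when both sides are definitionally equal; `Nat.zero_add` is a lemma from Lean's core library.

```lean
-- Definitional: both sides compute to the same normal form, so rfl suffices.
example : 2 + 2 = 4 := rfl
example (n : Nat) : n + 0 = n := rfl

-- Propositional only: with n a variable, 0 + n does not reduce,
-- so the proof goes through a lemma that is proved by induction.
example (n : Nat) : 0 + n = n := Nat.zero_add n
```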

Formal MLTT†

† adapted from the HoTT book

Judgments

There are two basic judgments:
- The first, a : A, asserts that a term a has type A
- The second, a = b : A, states that the two terms a and b are judgmentally equal at type A

These judgments are inductively defined by a set of inference rules, which we describe next

Contexts

- To construct an element a of a type A is to derive a : A
- Formally, the above judgments occur in an ambient context, or list of assumptions, of the form

x1 : A1, ..., xn : An

- An element xi : Ai of the context expresses the assumption that the variable xi has type Ai. The variables x1, ..., xn appearing in the context must be distinct
- We abbreviate contexts with the letters Γ and ∆

Context Judgments

- The judgment a : A in context Γ is written

Γ ⊢ a : A

and means a : A under the assumptions listed in Γ
- When the list of assumptions is empty, we write ⊢ a : A or · ⊢ a : A. Analogously for the equality judgments
- Importantly, contexts must be well-formed, which motivates the third kind of judgment:

(x1 : A1, ..., xn : An) ctx

expressing the fact that each Ai is a type in the context x1 : A1, x2 : A2, ..., xi−1 : Ai−1. This implies that each Ai contains only the variables x1, ..., xi−1

Binding and Substitution

- To bind or abstract a variable x in an expression B is to embed both x and B in a context where occurrences of x can only be changed simultaneously
- Various binding notations: x ↦ B, λx. B, and x. B
- Also, each variable xi of a judgment

x1 : A1, ..., xn : An ⊢ a : A

can be regarded as bound in its scope, consisting of the expressions Ai+1, ..., An, a, and A
- We identify expressions up to α-conversion, e.g. x. B is the same as y. B[y/x] (if y is not free in B)
- More generally,

B[a1/x1, ..., an/xn]

substitutes simultaneously a1, ..., an for x1, ..., xn

Rules

- In summary, we have three kinds of judgments

Γ ctx        Γ ⊢ a : A        Γ ⊢ a = a′ : A

subject to the rules we define next
- A general rule has the form

\[
\frac{\mathcal{J}_1 \ \cdots\ \mathcal{J}_k}{\mathcal{J}}\ (\text{Name})
\]

- It says that we may derive the conclusion J, provided that we have already derived the hypotheses J1, ..., Jk
- There may be extra side conditions that need to be checked before the rule is applicable

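Example Derivation

A derivation of a judgment is a finite tree constructed from inference rules, with the judgment at the root of the tree

Example:

\[
\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{}{\cdot\ \mathsf{ctx}}\ (\text{ctx-emp})}
{\vdash \mathbf{1} : \mathcal{U}_0}\ (\mathbf{1}\text{-form})}
{x : \mathbf{1}\ \mathsf{ctx}}\ (\text{ctx-ext})}
{x : \mathbf{1} \vdash x : \mathbf{1}}\ (\text{Vble})}
{\cdot \vdash \lambda x{:}\mathbf{1}.\,x : \mathbf{1} \to \mathbf{1}}\ (\Pi\text{-intro})
\]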
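In a proof assistant, the type checker reconstructs such a derivation automatically. A minimal Lean 4 sketch, assuming Lean's built-in `Unit` as a stand-in for the unit type 1:

```lean
-- The root judgment  · ⊢ λx : 1. x : 1 → 1  of the derivation above.
example : Unit → Unit := fun (x : Unit) => x

-- The (1-form) step: Unit is a type in the lowest universe.
#check (Unit : Type)
```

Accepting the `example` amounts to the checker having found a derivation tree with this judgment at its root.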

Rules for Contexts

The judgment Γ ctx expresses the fact that Γ is a well-formed context, and is governed by the rules

\[
\frac{}{\cdot\ \mathsf{ctx}}\ (\text{ctx-emp})
\qquad
\frac{x_1 : A_1, \ldots, x_{n-1} : A_{n-1} \vdash A_n : \mathcal{U}_i}{(x_1 : A_1, \ldots, x_n : A_n)\ \mathsf{ctx}}\ (\text{ctx-ext})
\]

Side condition: xn must be distinct from x1, ..., xn−1

It is a meta-property that if x1 : A1, ..., xn : An ⊢ b : B is derivable, then (x1 : A1, ..., xn : An) must be well-formed; thus (ctx-ext) need not hypothesize well-formedness of the context to the left of xn

Structural Rules

The fact that the context holds assumptions is expressed by:

\[
\frac{(x_1 : A_1, \ldots, x_n : A_n)\ \mathsf{ctx}}{x_1 : A_1, \ldots, x_n : A_n \vdash x_i : A_i}\ (\text{Vble})
\]

Substitution and weakening are admissible principles. For the typing judgments these principles are manifested as

\[
\frac{\Gamma \vdash a : A \quad \Gamma, x : A, \Delta \vdash b : B}{\Gamma, \Delta[a/x] \vdash b[a/x] : B[a/x]}\ (\text{Subst}_1)
\]

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma, \Delta \vdash b : B}{\Gamma, x : A, \Delta \vdash b : B}\ (\text{Wkg}_1)
\]

There are analogous rules for judgmental equalities

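Structural Rules, Continued

Judgmental equality is an equivalence relation:

\[
\frac{\Gamma \vdash a : A}{\Gamma \vdash a = a : A}
\qquad
\frac{\Gamma \vdash a = b : A}{\Gamma \vdash b = a : A}
\qquad
\frac{\Gamma \vdash a = b : A \quad \Gamma \vdash b = c : A}{\Gamma \vdash a = c : A}
\]

respected by typing:

\[
\frac{\Gamma \vdash a : A \quad \Gamma \vdash A = B : \mathcal{U}_i}{\Gamma \vdash a : B}
\qquad
\frac{\Gamma \vdash a = b : A \quad \Gamma \vdash A = B : \mathcal{U}_i}{\Gamma \vdash a = b : B}
\]

For all the type formers we assume rules stating that constructors preserve definitional equality, e.g.

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma, x : A \vdash B : \mathcal{U}_i \quad \Gamma, x : A \vdash b = b' : B}{\Gamma \vdash \lambda x{:}A.\,b = \lambda x{:}A.\,b' : \prod_{(x:A)} B}\ (\Pi\text{-intro-eq})
\]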

Type Universes

We postulate an infinite hierarchy of type universes

U0, U1, U2, ...

Each Ui is contained in Ui+1, and any type in Ui is in Ui+1:

\[
\frac{\Gamma\ \mathsf{ctx}}{\Gamma \vdash \mathcal{U}_i : \mathcal{U}_{i+1}}\ (\mathcal{U}\text{-intro})
\qquad
\frac{\Gamma \vdash A : \mathcal{U}_i}{\Gamma \vdash A : \mathcal{U}_{i+1}}\ (\mathcal{U}\text{-cumul})
\]

- It will be entailed that Γ ⊢ a : A implies Γ ⊢ A : Ui for some i. In other words, if A is a type then it is in some universe
- Another property: Γ ⊢ a = b : A implies Γ ⊢ a : A and Γ ⊢ b : A
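For comparison, here is how a universe hierarchy looks in Lean 4, where `Type`, `Type 1`, ... play the roles of U0, U1, .... This is a sketch of Lean's behaviour, not of MLTT as presented here: Lean has the analogue of (U-intro) but, unlike the rules above, its universes are not cumulative, so lifting has to be explicit.

```lean
-- Analogue of (U-intro): each universe is an element of the next one.
#check (Type : Type 1)
#check (Type 1 : Type 2)
example : Type 1 := Type

-- No analogue of (U-cumul): Nat : Type does not give Nat : Type 1.
-- An explicit lift is needed instead.
example : Type 1 := ULift.{1} Nat
```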

Type Formation Rules

- Each type former will be introduced independently of the others
- For each type former, the rules are classified as follows
  - a formation rule, stating when the type former can be applied
  - introduction rules, stating how to inhabit the type
  - elimination rules, or an induction principle, stating how to use an element of the type
  - computation rules: judgmental equalities explaining what happens when elimination rules are applied to results of introduction rules
  - optional uniqueness principles: judgmental equalities explaining how every element of the type is uniquely determined by the results of elimination rules applied to it

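The Empty Type 0

\[
\frac{\Gamma\ \mathsf{ctx}}{\Gamma \vdash \mathbf{0} : \mathcal{U}_i}\ (\mathbf{0}\text{-form})
\qquad
\frac{\Gamma, x : \mathbf{0} \vdash C : \mathcal{U}_i \quad \Gamma \vdash a : \mathbf{0}}{\Gamma \vdash \mathsf{ind}_{\mathbf{0}}(x.C, a) : C[a/x]}\ (\mathbf{0}\text{-elim})
\]

- In ind0, x is bound in C
- There are no introduction rules and no computation rules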
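The eliminator of the empty type can be mimicked in Lean 4 with the built-in type `Empty`. The sketch below is illustrative only; `ind0` and `exFalso` are hypothetical names.

```lean
universe u

-- (0-elim): from a : 0 we obtain an element of any C a.
-- There are no constructors, hence no cases to handle.
def ind0 (C : Empty → Sort u) (a : Empty) : C a := nomatch a

-- The non-dependent special case is "ex falso quodlibet".
def exFalso {B : Type} (a : Empty) : B := ind0 (fun _ => B) a
```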

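The Unit Type 1

\[
\frac{\Gamma\ \mathsf{ctx}}{\Gamma \vdash \mathbf{1} : \mathcal{U}_i}\ (\mathbf{1}\text{-form})
\qquad
\frac{\Gamma\ \mathsf{ctx}}{\Gamma \vdash \star : \mathbf{1}}\ (\mathbf{1}\text{-intro})
\]

\[
\frac{\Gamma, x : \mathbf{1} \vdash C : \mathcal{U}_i \quad \Gamma \vdash c : C[\star/x] \quad \Gamma \vdash a : \mathbf{1}}{\Gamma \vdash \mathsf{ind}_{\mathbf{1}}(x.C, c, a) : C[a/x]}\ (\mathbf{1}\text{-elim})
\]

\[
\frac{\Gamma, x : \mathbf{1} \vdash C : \mathcal{U}_i \quad \Gamma \vdash c : C[\star/x]}{\Gamma \vdash \mathsf{ind}_{\mathbf{1}}(x.C, c, \star) = c : C[\star/x]}\ (\mathbf{1}\text{-comp})
\]

- In ind1, x is bound in C
- We do not postulate a judgmental uniqueness principle: propositional uniqueness is provable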
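A Lean 4 sketch of these rules, with `Unit` and `()` standing in for 1 and ⋆; `ind1` and `unitUniq` are hypothetical names. One caveat: unlike the system described here, Lean additionally has a definitional η-rule for `Unit`, so the uniqueness statement would also hold there by `rfl`; the proof below deliberately uses only the eliminator.

```lean
universe u

-- (1-elim): to define something for every a : 1, it suffices to handle ⋆.
def ind1 (C : Unit → Sort u) (c : C ()) : (a : Unit) → C a
  | () => c

-- (1-comp) holds by computation.
example (C : Unit → Type) (c : C ()) : ind1 C c () = c := rfl

-- Propositional uniqueness, derived from the eliminator alone.
theorem unitUniq (a : Unit) : a = () :=
  ind1 (fun x => x = ()) rfl a
```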

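Dependent Function Types: Formation, Introduction, Elimination

For the dependent function type these rules are:

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma, x : A \vdash B : \mathcal{U}_i}{\Gamma \vdash \prod_{(x:A)} B : \mathcal{U}_i}\ (\Pi\text{-form})
\]

\[
\frac{\Gamma, x : A \vdash b : B}{\Gamma \vdash \lambda x{:}A.\,b : \prod_{(x:A)} B}\ (\Pi\text{-intro})
\]

\[
\frac{\Gamma \vdash f : \prod_{(x:A)} B \quad \Gamma \vdash a : A}{\Gamma \vdash f(a) : B[a/x]}\ (\Pi\text{-elim})
\]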

Dependent Function Types: Computation, Uniqueness

\[
\frac{\Gamma, x : A \vdash b : B \quad \Gamma \vdash a : A}{\Gamma \vdash (\lambda x{:}A.\,b)(a) = b[a/x] : B[a/x]}\ (\Pi\text{-comp})
\]

\[
\frac{\Gamma \vdash f : \prod_{(x:A)} B}{\Gamma \vdash f = (\lambda x{:}A.\,f(x)) : \prod_{(x:A)} B}\ (\Pi\text{-uniq})
\]

The expression λx : A. b binds free occurrences of x in b, as Π(x : A) B does for B. When x does not occur freely in B, so that B does not depend on A, we obtain as a special case the ordinary function type A → B :≡ Π(x : A) B. We take this as the definition of →.
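Both judgmental equalities can be observed in Lean 4, where β and η for functions are part of definitional equality. A minimal sketch:

```lean
-- (Π-comp), i.e. β: applying a λ-abstraction substitutes the argument.
example {A : Type} {B : A → Type} (b : (x : A) → B x) (a : A) :
    (fun x => b x) a = b a := rfl

-- (Π-uniq), i.e. η: every function equals its own η-expansion.
example {A : Type} {B : A → Type} (f : (x : A) → B x) :
    f = (fun x => f x) := rfl

-- The ordinary function type is the special case of a constant family.
example {A B : Type} : (A → B) = ((_ : A) → B) := rfl
```

Since `rfl` only succeeds for definitionally equal terms, each `example` confirms that the corresponding equality is judgmental rather than merely provable.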

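Coproduct Types: Formation, Introduction

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma \vdash B : \mathcal{U}_i}{\Gamma \vdash A + B : \mathcal{U}_i}\ (+\text{-form})
\]

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma \vdash B : \mathcal{U}_i \quad \Gamma \vdash a : A}{\Gamma \vdash \mathsf{inl}\ a : A + B}\ (+\text{-intro}_1)
\]

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma \vdash B : \mathcal{U}_i \quad \Gamma \vdash b : B}{\Gamma \vdash \mathsf{inr}\ b : A + B}\ (+\text{-intro}_2)
\]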

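Coproduct Types: Elimination, Computation

\[
\frac{\Gamma, z : A + B \vdash C : \mathcal{U}_i \quad \Gamma, x : A \vdash c : C[\mathsf{inl}\ x/z] \quad \Gamma, y : B \vdash d : C[\mathsf{inr}\ y/z] \quad \Gamma \vdash e : A + B}{\Gamma \vdash \mathsf{ind}_{A+B}(z.C, x.c, y.d, e) : C[e/z]}\ (+\text{-elim})
\]

\[
\frac{\Gamma, z : A + B \vdash C : \mathcal{U}_i \quad \Gamma, x : A \vdash c : C[\mathsf{inl}\ x/z] \quad \Gamma, y : B \vdash d : C[\mathsf{inr}\ y/z] \quad \Gamma \vdash a : A}{\Gamma \vdash \mathsf{ind}_{A+B}(z.C, x.c, y.d, \mathsf{inl}\ a) = c[a/x] : C[\mathsf{inl}\ a/z]}\ (+\text{-comp}_1)
\]

\[
\frac{\Gamma, z : A + B \vdash C : \mathcal{U}_i \quad \Gamma, x : A \vdash c : C[\mathsf{inl}\ x/z] \quad \Gamma, y : B \vdash d : C[\mathsf{inr}\ y/z] \quad \Gamma \vdash b : B}{\Gamma \vdash \mathsf{ind}_{A+B}(z.C, x.c, y.d, \mathsf{inr}\ b) = d[b/y] : C[\mathsf{inr}\ b/z]}\ (+\text{-comp}_2)
\]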
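The coproduct eliminator and its computation rules can be sketched in Lean 4 using the built-in `Sum`. The name `indSum` is hypothetical, and binders are passed as functions rather than as the bound expressions x.c and y.d.

```lean
universe u

-- (+-elim): dependent case analysis on an element of A + B.
def indSum {A B : Type} (C : Sum A B → Sort u)
    (c : (x : A) → C (Sum.inl x)) (d : (y : B) → C (Sum.inr y)) :
    (e : Sum A B) → C e
  | Sum.inl a => c a
  | Sum.inr b => d b

-- (+-comp1) and (+-comp2) hold by computation.
example {A B : Type} (C : Sum A B → Type)
    (c : (x : A) → C (Sum.inl x)) (d : (y : B) → C (Sum.inr y)) (a : A) :
    indSum C c d (Sum.inl a) = c a := rfl

example {A B : Type} (C : Sum A B → Type)
    (c : (x : A) → C (Sum.inl x)) (d : (y : B) → C (Sum.inr y)) (b : B) :
    indSum C c d (Sum.inr b) = d b := rfl
```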

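Natural Numbers: Formation, Introduction, Elimination

\[
\frac{\Gamma\ \mathsf{ctx}}{\Gamma \vdash \mathbb{N} : \mathcal{U}_i}\ (\mathbb{N}\text{-form})
\qquad
\frac{\Gamma\ \mathsf{ctx}}{\Gamma \vdash 0 : \mathbb{N}}\ (\mathbb{N}\text{-intro}_1)
\qquad
\frac{\Gamma \vdash n : \mathbb{N}}{\Gamma \vdash \mathsf{suc}\ n : \mathbb{N}}\ (\mathbb{N}\text{-intro}_2)
\]

\[
\frac{\Gamma, x : \mathbb{N} \vdash C : \mathcal{U}_i \quad \Gamma \vdash c_0 : C[0/x] \quad \Gamma, x : \mathbb{N}, y : C \vdash c_s : C[\mathsf{suc}\ x/x] \quad \Gamma \vdash n : \mathbb{N}}{\Gamma \vdash \mathsf{ind}_{\mathbb{N}}(x.C, c_0, x.y.c_s, n) : C[n/x]}\ (\mathbb{N}\text{-elim})
\]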

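Natural Numbers: Computation

\[
\frac{\Gamma, x : \mathbb{N} \vdash C : \mathcal{U}_i \quad \Gamma \vdash c_0 : C[0/x] \quad \Gamma, x : \mathbb{N}, y : C \vdash c_s : C[\mathsf{suc}\ x/x]}{\Gamma \vdash \mathsf{ind}_{\mathbb{N}}(x.C, c_0, x.y.c_s, 0) = c_0 : C[0/x]}\ (\mathbb{N}\text{-comp}_1)
\]

\[
\frac{\Gamma, x : \mathbb{N} \vdash C : \mathcal{U}_i \quad \Gamma \vdash c_0 : C[0/x] \quad \Gamma, x : \mathbb{N}, y : C \vdash c_s : C[\mathsf{suc}\ x/x] \quad \Gamma \vdash n : \mathbb{N}}{\Gamma \vdash \mathsf{ind}_{\mathbb{N}}(x.C, c_0, x.y.c_s, \mathsf{suc}\ n) = c_s[n/x, \mathsf{ind}_{\mathbb{N}}(x.C, c_0, x.y.c_s, n)/y] : C[\mathsf{suc}\ n/x]}\ (\mathbb{N}\text{-comp}_2)
\]

We omit analogous inductively defined types (lists, trees, etc.)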
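A hedged Lean 4 rendering of the induction principle for natural numbers follows; `indNat` and `doubleVia` are hypothetical names, and the binders x.y.cs are again passed as a function.

```lean
universe u

-- (N-elim): a base case for 0 and a step from x to x + 1.
def indNat (C : Nat → Sort u) (c0 : C 0)
    (cs : (x : Nat) → C x → C (x + 1)) : (n : Nat) → C n
  | 0     => c0
  | n + 1 => cs n (indNat C c0 cs n)

-- (N-comp1) and (N-comp2) hold by computation.
example (C : Nat → Type) (c0 : C 0) (cs : (x : Nat) → C x → C (x + 1)) :
    indNat C c0 cs 0 = c0 := rfl

example (C : Nat → Type) (c0 : C 0) (cs : (x : Nat) → C x → C (x + 1))
    (n : Nat) : indNat C c0 cs (n + 1) = cs n (indNat C c0 cs n) := rfl

-- A program defined through the eliminator: doubling.
def doubleVia (n : Nat) : Nat := indNat (fun _ => Nat) 0 (fun _ y => y + 2) n

example : doubleVia 3 = 6 := rfl
```

With a constant family `C` the eliminator is ordinary primitive recursion; with a family of propositions it is proof by induction.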

Dependent Pair Types: Formation, Introduction

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma, x : A \vdash B : \mathcal{U}_i}{\Gamma \vdash \sum_{(x:A)} B : \mathcal{U}_i}\ (\Sigma\text{-form})
\]

\[
\frac{\Gamma, x : A \vdash B : \mathcal{U}_i \quad \Gamma \vdash a : A \quad \Gamma \vdash b : B[a/x]}{\Gamma \vdash (a, b) : \sum_{(x:A)} B}\ (\Sigma\text{-intro})
\]

- When B does not contain free occurrences of x, we obtain as a special case the cartesian product A × B :≡ Σ(x : A) B. We take this as the definition of the cartesian product
- Judgmental uniqueness principles could be added. Currently, their propositional versions are derivable

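Dependent Pair Types: Elimination, Computation

\[
\frac{\Gamma, z : \sum_{(x:A)} B \vdash C : \mathcal{U}_i \quad \Gamma, x : A, y : B \vdash g : C[(x, y)/z] \quad \Gamma \vdash p : \sum_{(x:A)} B}{\Gamma \vdash \mathsf{ind}_{\sum_{(x:A)} B}(z.C, x.y.g, p) : C[p/z]}\ (\Sigma\text{-elim})
\]

\[
\frac{\Gamma, z : \sum_{(x:A)} B \vdash C : \mathcal{U}_i \quad \Gamma, x : A, y : B \vdash g : C[(x, y)/z] \quad \Gamma \vdash a : A \quad \Gamma \vdash b : B[a/x]}{\Gamma \vdash \mathsf{ind}_{\sum_{(x:A)} B}(z.C, x.y.g, (a, b)) = g[a/x, b/y] : C[(a, b)/z]}\ (\Sigma\text{-comp})
\]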
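The Σ-eliminator and the projections derived from it can be sketched in Lean 4 with the built-in `Sigma`. The names `indSigma`, `fst'` and `snd'` are hypothetical.

```lean
universe u

-- (Σ-elim): to define something for every pair, handle the canonical pair (x, y).
def indSigma {A : Type} {B : A → Type} (C : Sigma B → Sort u)
    (g : (x : A) → (y : B x) → C ⟨x, y⟩) : (p : Sigma B) → C p
  | ⟨a, b⟩ => g a b

-- (Σ-comp) holds by computation.
example {A : Type} {B : A → Type} (C : Sigma B → Type)
    (g : (x : A) → (y : B x) → C ⟨x, y⟩) (a : A) (b : B a) :
    indSigma C g ⟨a, b⟩ = g a b := rfl

-- First projection: a non-dependent use of the eliminator.
def fst' {A : Type} {B : A → Type} (p : Sigma B) : A :=
  indSigma (B := B) (fun _ => A) (fun x _ => x) p

-- Second projection: its type mentions the first projection,
-- so the dependent eliminator is genuinely needed.
def snd' {A : Type} {B : A → Type} (p : Sigma B) : B (fst' p) :=
  indSigma (B := B) (fun z => B (fst' z)) (fun _ y => y) p
```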

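Identity Types: Formation, Introduction

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma \vdash a : A \quad \Gamma \vdash b : A}{\Gamma \vdash a \equiv_A b : \mathcal{U}_i}\ (\equiv\text{-form})
\]

\[
\frac{\Gamma \vdash A : \mathcal{U}_i \quad \Gamma \vdash a : A}{\Gamma \vdash \mathsf{refl}_a : a \equiv_A a}\ (\equiv\text{-intro})
\]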

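Identity Types: Elimination, Computation

\[
\frac{\Gamma, x : A, y : A, p : x \equiv_A y \vdash C : \mathcal{U}_i \quad \Gamma, z : A \vdash c : C[z/x, z/y, \mathsf{refl}_z/p] \quad \Gamma \vdash a : A \quad \Gamma \vdash b : A \quad \Gamma \vdash p' : a \equiv_A b}{\Gamma \vdash \mathsf{ind}_{\equiv_A}(x.y.p.C, z.c, a, b, p') : C[a/x, b/y, p'/p]}\ (\equiv\text{-elim})
\]

\[
\frac{\Gamma, x : A, y : A, p : x \equiv_A y \vdash C : \mathcal{U}_i \quad \Gamma, z : A \vdash c : C[z/x, z/y, \mathsf{refl}_z/p] \quad \Gamma \vdash a : A}{\Gamma \vdash \mathsf{ind}_{\equiv_A}(x.y.p.C, z.c, a, a, \mathsf{refl}_a) = c[a/z] : C[a/x, a/y, \mathsf{refl}_a/p]}\ (\equiv\text{-comp})
\]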
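To see the eliminator at work, here is a hedged Lean 4 sketch that defines it for Lean's built-in equality and derives symmetry and transport from it. The names `J`, `symmJ` and `transport` are hypothetical. An important caveat: Lean's `=` lives in the proof-irrelevant universe `Prop`, so it behaves like the identity type of a theory with uniqueness of identity proofs, not like the intensional identity type of MLTT; only the shape of the rules carries over.

```lean
universe u

-- (≡-elim): to prove C for all x, y and p : x = y, it suffices to treat refl.
def J {A : Type} (C : (x y : A) → x = y → Sort u)
    (c : (z : A) → C z z rfl) (a b : A) (p : a = b) : C a b p :=
  match b, p with
  | _, rfl => c a

-- (≡-comp) holds by computation.
example {A : Type} (C : (x y : A) → x = y → Type)
    (c : (z : A) → C z z rfl) (a : A) : J C c a a rfl = c a := rfl

-- Symmetry, derived from J alone.
theorem symmJ {A : Type} (a b : A) (p : a = b) : b = a :=
  J (A := A) (fun x y _ => y = x) (fun _ => rfl) a b p

-- Transport along an equality, derived from J alone.
def transport {A : Type} (P : A → Type) {a b : A} (p : a = b) : P a → P b :=
  J (A := A) (fun x y _ => P x → P y) (fun _ t => t) a b p
```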

MLTT: The Goods

- Proof relevance
- Predicativity: types are inhabited layer by layer, universe by universe, with no circularity in introducing new constructs
- Strong normalization: every closed term has a unique normal form, which is a term made of constructors only (think of propositional equality: every proof reduces to reflexivity)
- Definitional equality is decidable; type checking is decidable
- The logic of propositions is the familiar first-order logic with equality

MLTT: The Bads

- Proof relevance
- No quotients, i.e. there are lists (inductive type: a free algebra, with no equations), but no finite multisets (quotient inductive type: lists + equations for permuting elements)
- No sets (alternative: setoids)
- No function extensionality: Π(x : X) f(x) ≡ g(x) does not entail f ≡ g
- No proposition extensionality: equivalence of propositions does not entail equality


The J-eliminator

- If you think that the J-eliminator

\[
\frac{\Gamma, x : A, y : A, p : x \equiv_A y \vdash C : \mathcal{U}_i \quad \Gamma, z : A \vdash c : C[z/x, z/y, \mathsf{refl}_z/p] \quad \Gamma \vdash a : A \quad \Gamma \vdash b : A \quad \Gamma \vdash p' : a \equiv_A b}{\Gamma \vdash \mathsf{J}(x.y.p.C, z.c, a, b, p') : C[a/x, b/y, p'/p]}
\]

is an overkill, that is OK
- By strong normalization, any (closed) identity proof normalizes to refl. Does this entail that refl is the only inhabitant of ≡?
- No, but... S/N is a meta-property (!) From the inside, MLTT is incomplete w.r.t. this aspect. So,
  - It can be completed so that refl is the only inhabitant (e.g. set theory interpretation)
  - It can be completed so that there are many other inhabitants (e.g. HoTT interpretation)

The K-eliminator

- What would really enforce equality of all identity proofs is the K-eliminator:

\[
\frac{\Gamma, p : a \equiv_A a \vdash C : \mathcal{U}_i \quad \Gamma \vdash c : C[\mathsf{refl}_a/p] \quad \Gamma \vdash p' : a \equiv_A a}{\Gamma \vdash \mathsf{K}(p.C, c, p') : C[p'/p]}
\]

- It is then provable that any two identity proofs are equal, or that any identity proof is equal to refl
- The tension between types that satisfy uniqueness of identity proofs and the rest is at the heart of HoTT
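Lean 4 is an example of a system on the "uniqueness of identity proofs" side of this divide: its equality is proof-irrelevant by definition, so both statements below hold trivially. This is a sketch of Lean's behaviour (with hypothetical names `uip` and `K`), and it is exactly what fails to be derivable in MLTT without K.

```lean
universe u

-- Uniqueness of identity proofs: any two proofs of a = b are equal.
-- In Lean this is definitional proof irrelevance, hence rfl.
theorem uip {A : Type} {a b : A} (p q : a = b) : p = q := rfl

-- The K-eliminator: a property of refl transfers to every p : a = a.
-- Because p and rfl are definitionally equal in Lean, c itself typechecks.
def K {A : Type} {a : A} (C : a = a → Sort u) (c : C rfl) (p : a = a) : C p := c
```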

Welcome to HoTT!