
U.U.D.M. Project Report 2016:47

Double Negation Interpretations for Typed Logical Systems

Jonne Mickelin Sätherblom

Examensarbete i matematik (degree project in mathematics), 15 hp
Handledare (supervisor): Erik Palmgren, Stockholms universitet
Ämnesgranskare (subject reviewer): Vera Koponen
Examinator (examiner): Jörgen Östensson
December 2016

Department of Mathematics Uppsala University

Double Negation Interpretations for Typed Logical Systems

Jonne Mickelin Sätherblom

November 26, 2016

Abstract

The Gödel-Gentzen negative translation provides a method of proving conservativity of classical logic over intuitionistic logic for a wide class of formulas. Together with Dragalin's and Friedman's A-translation, this can be extended to simple existence statements. We discuss how to apply this to translate theorems from classical Peano arithmetic as well as from a stronger arithmetical theory based on Gödel's system T. Finally, we mention how the translations can be used to extract algorithms from classical proofs, and give a negative result in the form of a theory where the negative translations cannot be used to translate theorems into constructive logic.

Sammanfattning (Swedish summary)

The Gödel-Gentzen negative translation gives a method for showing that classical logic is conservative over intuitionistic logic for certain classes of formulas. Together with the Dragalin-Friedman A-translation, one can show that this also holds for simple existence statements. We discuss how these can be applied to theorems from classical Peano arithmetic as well as a stronger arithmetical theory based on Gödel's system T. Finally, we mention how the translations can be used to extract algorithms from classical proofs, and give a negative result in the form of a theory where negative translations cannot be used to translate theorems into constructive logic.

Contents

1. Introduction
   1.1. Many-sorted logic
   1.2. Free variables and substitution
   1.3. Formula schemas and theories
   1.4. Classical, intuitionistic and minimal logic
2. The Gödel-Gentzen negative translation
3. Consequences of the negative translation
   3.1. Properties of schemas
   3.2. Conservativity results
   3.3. Identifying wiping, spreading and isolating formulas
4. Heyting and Peano arithmetic
   4.1. The arithmetical hierarchy
   4.2. Towards translations of arithmetic
5. Provably recursive functions
   5.1. Markov's rule
   5.2. The Dragalin-Friedman A-translation
   5.3. Conservativity for Π2-formulas in HA
6. Extracting algorithms
   6.1. Gödel's system T
   6.2. Interpreting atomic formulas and optimizing extracted programs
   6.3. Extraction of program terms
   6.4. Formulas as specifications
   6.5. Applications to classical proofs
7. Finite type arithmetic and the axiom of choice
   7.1. A model of E-HAω
   7.2. The axiom of choice and constructive mathematics
   7.3. Translations of HAω
   7.4. Translations of choice
A. Proof of Lemma 1.1
B. Proof of the soundness for program extraction

1. Introduction

Many proofs in mathematics make use of the so-called law of excluded middle (LEM), which says that for any statement ϕ, either ϕ or its negation holds. In symbols:

ϕ ∨ ¬ϕ. (LEM)

Proofs that rely on this principle often appear somewhat toothless, as illustrated by the following example:

Theorem. There are two irrational numbers a, b such that ab is rational.

Proof. Consider the number √2^√2. It is either rational or irrational, by the law of excluded middle. In the first case, we can pick a = b = √2, since √2 is irrational. In the second case, we set a = √2^√2 and b = √2, since both are irrational and

a^b = (√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2.

While the proof tells us that one of the pairs (√2, √2) and (√2^√2, √2) witnesses the theorem, it gives us no way of knowing which one it is. By extension, merely having a proof of a theorem that claims the existence of an object with a given property generally gives us no way to actually find such an object.

The law of excluded middle was one of the protagonists of a dispute in the mathematical community during the beginning of the last century. The discovery of paradoxes in Cantor's naive set theory had caused some distress in the mathematical community with regard to the foundations on which they built their practice. Possibly fuelled by these doubts, a topologist by the name of L.E.J. Brouwer presented a novel philosophy of the foundations of mathematics. Mathematics, he argued, should have a basis in mental constructions rather than mechanical manipulations of strings of symbols on a paper. Moreover, mathematics is a purely human endeavour, and not an exploration of objects living in some ideal world. This put him at odds with both the traditional Platonist philosophy and Hilbert's new ideas of Formalism.

Among other, perhaps more controversial ideas, Brouwer argued that proofs themselves should be given as constructions. In his intuitionistic mathematics, we regard a statement ϕ(x) as "true" if we can construct a proof of ϕ(x), and "false" if we can construct a proof that the assumption ϕ(x) leads to a contradiction. With this in mind, we cannot claim that the law of excluded middle holds without simultaneously giving either a proof or a refutation for every mathematical statement.

Although Brouwer himself only implicitly formulated his idea of what constitutes a construction, some of his students gave a more formal description of how to construct proofs. This is the so-called Brouwer-Heyting-Kolmogorov (BHK) interpretation:

(i) A proof of a conjunction ϕ ∧ ψ is a pair containing a proof of ϕ together with a proof of ψ.

(ii) A proof of a disjunction ϕ ∨ ψ is either a proof of ϕ or a proof of ψ, together with some information that indicates which of the two it is.

(iii) A proof of an implication ϕ → ψ is a method that transforms a proof of ϕ into a proof of ψ.

(iv) A proof of a universal quantification ∀x.ϕ(x) is a method that, for any element d, gives a proof of ϕ(d).

(v) A proof of an existential quantification ∃x.ϕ(x) is an element d together with a proof of ϕ(d).

(vi) There is no proof of absurdity ⊥. A proof of a negation ¬ϕ := ϕ → ⊥ is then a method that, given some hypothetical proof of ϕ, produces a proof of absurdity.

A proof of the general statement ϕ ∨ ¬ϕ under the BHK interpretation consists of either a proof of ϕ or a proof of ¬ϕ, together with a label that tells us which of the two it is. This means that we cannot prove the disjunction until we can claim to have a proof of one of the two disjuncts. Besides LEM, several other commonly used principles fail to have a construction. For example, the least number principle, which says that every predicate on the natural numbers is well-founded:

∃x ∈ N.ϕ(x) → ∃x ∈ N.(ϕ(x) ∧ ∀y < x.¬ϕ(y)). (LNP)

Consider what would happen if ϕ(x) were

x = 2 ∨ (x = 1 ∧ ψ) ∨ (x = 0 ∧ ¬ψ).

for some ψ. We know that ϕ(2) is true, so by LNP there is a smallest number x satisfying ϕ. This number cannot be 2: if it were, neither x = 1 nor x = 0 would satisfy ϕ(x), so we would have both ¬ψ and ¬¬ψ, which is a contradiction. Thus x = 0 or x = 1, but ϕ(1) and ϕ(0) are equivalent to ψ and ¬ψ respectively, so ψ ∨ ¬ψ. So the least number principle implies LEM.

Note, however, that we can prove ¬¬LEM and ¬¬LNP, which means we cannot possibly hope to refute these principles. All we have shown is that the principles are not compatible with Brouwer's idea of what should constitute a proof. Such principles are called weak counterexamples, to distinguish them from ordinary counterexamples, which immediately refute a formula.

Philosophical objections aside, connections have been found between constructivism and computer science: while the word "method" in the BHK interpretation is left undefined, we can interpret it to mean "algorithm". As such, the witnesses for constructive formulas are computer programs! This is, among other things, the basis for constructive type theory, which has seen a rise in popularity in recent years.

This text will be concerned with translations of classical formulas into constructive logic. As it turns out, a considerable subset of classical logic, and indeed classical mathematics, can be made constructive. We can do this by trying to manually adapt non-constructive proofs, but for sufficiently well-behaved theories and theorems, the process can be automated.
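Reading "method" as "algorithm", the BHK clauses can be mirrored directly in a programming language. The following Python sketch (all names are hypothetical illustrations, not from the text) represents proofs as ordinary data and functions:

```python
# A proof of a conjunction phi /\ psi is a pair of proofs.
def conj_intro(p, q):
    return (p, q)

# A proof of a disjunction phi \/ psi is a proof of one disjunct,
# tagged so we know which of the two it is.
def disj_intro_left(p):
    return ("left", p)

# A proof of an implication phi -> psi is a function from proofs to proofs;
# here a proof of phi -> (phi /\ phi).
def impl_example(p):
    return (p, p)

# A proof of an existential statement is a witness together with evidence.
# Hypothetical example: "there is an even m greater than n".
def exists_even_gt(n):
    m = n + 1 if (n + 1) % 2 == 0 else n + 2
    return (m, m > n and m % 2 == 0)

witness, evidence = exists_even_gt(7)   # witness == 8, evidence is True
```

The pair returned by `exists_even_gt` is exactly the constructive content of an existence statement: a concrete witness together with evidence that it has the required property.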

In general, for theories T1 and T2 in languages L1 ⊆ L2 and with possibly different underlying logical systems, we say that T2 is a conservative extension of T1 if T1 ⊆ T2 and for any formula ϕ ∈ L1 we have

T2 ` ϕ ⇒ T1 ` ϕ.

If the implication holds for a class Γ of formulas, we say that T2 is a conservative extension of T1 with respect to Γ-formulas. In the following, we will prove for several classes Γ that certain classical theories are conservative over their constructive counterparts. While it is assumed that the reader has some basic familiarity with formal logic, we dedicate Section 1.1 to defining the logical systems we will use and summarizing some basic notions regarding formulas and derivations. In Section 2 we define the G¨odel-Gentzen negative translation and show how classical logic can be embedded within intuitionistic logic. Section 3 investigates sufficient conditions for proving conservativity of the classical version of a theory over its intuitionistic counterpart, with respect to certain formulas.

This will be applied to Peano arithmetic beginning in Section 4, and extended in Section 5 using the so-called Dragalin-Friedman A-translation. The aim of Section 6 is to make the notion of "proofs-as-programs" concrete, and to develop the necessary tools for extracting actual computer programs from intuitionistic proofs. We also give an example of a program extracted from a non-trivial arithmetical proof. Finally, in Section 7 we try to apply the translations to a stronger theory, in which it is possible to speak about higher-order functions and collections of objects, and show how our methods from the previous sections break down in the presence of the axiom of choice. Sections 1–4 are self-contained, while Sections 5–7 presuppose some knowledge of recursion theory and typed λ-calculus, as formulated in, for example, [6] and [1] respectively.

Acknowledgements

I would like to thank my advisor Erik Palmgren for his patience during my slightly delayed work on this thesis and for initially proposing the idea and introducing me to the subject. I also want to thank Alvar Bjerkeng van Keppel for many rewarding discussions and for extensive proofreading. A final thank you goes to my friend Valentina Chapovalova for her untiring support and encouragement.

1.1. Many-sorted logic

First-order logic, as it is often presented, allows for quantification over a single, homogeneous, universe of objects. This tends to be inconvenient in practice, as many mathematical theories deal with several different classes, or sorts, of objects. In linear algebra, for example, we make, at the very least, a distinction between scalars and vectors, and may also speak about subspaces, homomorphisms, and so on. Algebra and mathematics in general contains numerous other examples of this phenomenon, but it is also encountered in computer science and higher-order logics such as type theory. The usual way to codify this distinction within first-order logic is to introduce predicates for every sort of objects, such as

K(x) ⇐⇒ "x is a scalar"
V(x) ⇐⇒ "x is a vector"

and then restrict, or "relativize", the quantifications using these predicates. A universal quantification over vectors would then look like ∀x.V(x) → ϕ(x) and an existential one would have the form ∃x.V(x) ∧ ϕ(x).

A more natural approach is taken by many-sorted logic, a variant of first-order logic that makes the notion of sorts inherent to the language. Suppose we are given a set I of sorts. The language of many-sorted logic will contain the following symbols:

• Parentheses: (, )
• Logical symbols: ∧, ∨, →, ⊥, ∀, ∃
• For every sort σ ∈ I, a countable supply x1σ, x2σ, . . . of variables of sort σ.
• For every sort σ ∈ I, a class of constant symbols whose elements are said to be of sort σ.

• For sorts σ1, . . . , σn ∈ I, a class of predicate symbols whose elements are said to have signature (or arity) (σ1, . . . , σn).

• For sorts σ1, . . . , σn, τ ∈ I, a class of function symbols whose elements are said to have signature (or arity) (σ1, . . . , σn, τ).

Then we define the terms of the language. Every term will be associated with a sort in such a way that the signatures of function symbols are respected.

Definition (Terms).

• If x is a variable or constant of sort σ, then x is a term of sort σ.

• If t1, . . . , tn are terms of sorts σ1, . . . , σn respectively, and f is a function symbol of arity (σ1, . . . , σn, τ), then f(t1, . . . , tn) is a term of sort τ.

Notation. We use the notation xσ for variables of sort σ, and likewise for constants and terms. Further, we will indicate that the arity of a function f is (σ1, . . . , σn, τ) by writing fσ1,...,σn,τ, and similarly for predicates. The superscripts may be omitted when the sort or arity is clear from context.

Definition (Formulas).

• Atomic formulas: ⊥ is an atomic formula. If t1σ1, . . . , tnσn are terms and Rσ1,...,σn is a relation symbol, then R(t1, . . . , tn) is an atomic formula. A formula of this kind will also be referred to as a prime formula.
• Connectives: If ϕ, ψ are formulas, then ϕ ∧ ψ, ϕ ∨ ψ and ϕ → ψ are formulas.
• Quantifiers: If ϕ is a formula and x is a variable of sort σ, then ∀xσ.ϕ and ∃xσ.ϕ are formulas.

The quantifiers should be interpreted as a restriction of the usual quantifier symbols to the elements of a specific sort. For example, a formula ∀xσ.ϕ will be a universal quantification over just the elements of σ.

Notation. We will use the notation ϕ ≡ ψ to denote syntactic equivalence between the formulas ϕ and ψ, that is, ϕ and ψ are the same formula, up to renaming of bound variables (see Section 1.2).

Definition. We define ¬ϕ ≡ ϕ → ⊥, ϕ ↔ ψ ≡ (ϕ → ψ) ∧ (ψ → ϕ) and ⊤ ≡ ¬⊥.

Remark. We can view ordinary first-order logic as a special case of many-sorted logic, where I only contains one sort. We refer to this logic as single-sorted first-order logic.
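To illustrate how term formation respects the signatures of function symbols, here is a small Python sketch; the sorts and symbols are hypothetical examples in the spirit of the linear-algebra discussion above, not anything defined in the text:

```python
# Hypothetical signatures: argument sorts and result sort for each symbol.
SIGNATURES = {"dot": (("vec", "vec"), "scalar"),    # inner product
              "scale": (("scalar", "vec"), "vec")}  # scalar multiplication

def var(name, sort):
    """A variable of the given sort."""
    return ("var", name, sort)

def sort_of(t):
    """The sort associated with a term."""
    return t[2] if t[0] == "var" else SIGNATURES[t[1]][1]

def apply_fun(f, *args):
    """Form f(args), checking that the argument sorts match f's signature."""
    arg_sorts, _result = SIGNATURES[f]
    actual = tuple(sort_of(a) for a in args)
    if actual != arg_sorts:
        raise TypeError(f"{f} expects sorts {arg_sorts}, got {actual}")
    return ("fun", f, args)

v, w, k = var("v", "vec"), var("w", "vec"), var("k", "scalar")
t = apply_fun("dot", apply_fun("scale", k, v), w)   # a term of sort "scalar"
```

Attempting `apply_fun("dot", k, v)` raises a `TypeError`, mirroring the fact that ill-sorted strings are simply not terms of the language.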

1.2. Free variables and substitution

Definition. We define the set FV of free variables of a term recursively:

FV(xσ) = {xσ} if xσ is a variable
FV(cσ) = ∅ if cσ is a constant symbol

FV(f(t1, . . . , tn)) = FV(t1) ∪ ... ∪ FV(tn)

This is then extended to formulas as follows:

FV(⊥) = ∅

FV(P(t1, . . . , tn)) = FV(t1) ∪ . . . ∪ FV(tn)
FV(ϕ ∧ ψ) = FV(ϕ) ∪ FV(ψ)
FV(ϕ ∨ ψ) = FV(ϕ) ∪ FV(ψ)
FV(ϕ → ψ) = FV(ϕ) ∪ FV(ψ)
FV(∀xσ.ϕ) = FV(ϕ) \ {xσ}
FV(∃xσ.ϕ) = FV(ϕ) \ {xσ}

A variable that occurs in a formula but is not free is called bound. The set of bound variables of a formula ϕ is denoted BV(ϕ).
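The recursive clauses for FV translate directly into code. A Python sketch, using a hypothetical tuple encoding of terms and formulas (sorts omitted for brevity; nothing here is from the text itself):

```python
# Terms:    ("var", name) | ("const", name) | ("fun", name, [terms])
# Formulas: "bot" | ("atom", name, [terms]) | ("and"/"or"/"imp", A, B)
#           | ("forall"/"exists", var_name, body)

def fv_term(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    out = set()                 # FV(f(t1,...,tn)) = FV(t1) u ... u FV(tn)
    for a in t[2]:
        out |= fv_term(a)
    return out

def fv(phi):
    if phi == "bot":
        return set()
    tag = phi[0]
    if tag == "atom":
        out = set()
        for a in phi[2]:
            out |= fv_term(a)
        return out
    if tag in ("and", "or", "imp"):
        return fv(phi[1]) | fv(phi[2])
    return fv(phi[2]) - {phi[1]}    # quantifiers bind their variable

example = ("forall", "x", ("atom", "P", [("var", "x"), ("var", "y")]))
# fv(example) == {"y"}: x is bound by the quantifier, y remains free
```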

Notation. We write ϕ(xσ) to indicate that the variable xσ is free in the formula ϕ.

Definition (Substitution). For terms sτ and tσ, we define the substitution s[t/x] of a variable xσ by t in s as follows:

y[t/x] ≡ t if x ≡ y, and y[t/x] ≡ y otherwise, for variables y
c[t/x] ≡ c for constants c

f(s1, . . . , sn)[t/x] ≡ f(s1[t/x], . . . , sn[t/x]) for function symbols f.

Likewise, for formulas ϕ, we define ϕ[t/x]:

⊥[t/x] ≡ ⊥

P(t1σ1, . . . , tnσn)[t/x] ≡ P(t1[t/x], . . . , tn[t/x])
(ϕ ∧ ψ)[t/x] ≡ ϕ[t/x] ∧ ψ[t/x]
(ϕ ∨ ψ)[t/x] ≡ ϕ[t/x] ∨ ψ[t/x]
(ϕ → ψ)[t/x] ≡ ϕ[t/x] → ψ[t/x]
(∀yσ.ϕ)[t/x] ≡ ∀yσ.ϕ if x ≡ y, and ∀yσ.ϕ[t/x] if x ≢ y
(∃yσ.ϕ)[t/x] ≡ ∃yσ.ϕ if x ≡ y, and ∃yσ.ϕ[t/x] if x ≢ y

Notation. For formulas ϕ(x1, x2, . . . , xn) and terms t1, t2, . . . , tn we write

ϕ[t1, t2, . . . , tn/x1, x2, . . . , xn] for ϕ[t1/x1][t2/x2] ··· [tn/xn].

We will also make use of textual substitution of subformulas: For expressions ϕ, ψ and χ, we define ϕ[χ/ψ] as the result of simultaneously replacing every occurrence of a subformula ψ in ϕ by χ. It might happen that free variables of χ become bound in ϕ[χ/ψ], which may not always be desired.
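The substitution clauses can be implemented the same way. A Python sketch using the same hypothetical tuple encoding as above (sorts omitted), showing in particular that a quantifier binding x stops the substitution:

```python
# Terms:    ("var", name) | ("const", name) | ("fun", name, [terms])
# Formulas: "bot" | ("atom", name, [terms]) | ("and"/"or"/"imp", A, B)
#           | ("forall"/"exists", var_name, body)

def subst_term(s, t, x):
    tag = s[0]
    if tag == "var":
        return t if s[1] == x else s
    if tag == "const":
        return s
    return ("fun", s[1], [subst_term(a, t, x) for a in s[2]])

def subst(phi, t, x):
    """phi[t/x]: replace free occurrences of the variable x by the term t."""
    if phi == "bot":
        return phi
    tag = phi[0]
    if tag == "atom":
        return ("atom", phi[1], [subst_term(a, t, x) for a in phi[2]])
    if tag in ("and", "or", "imp"):
        return (tag, subst(phi[1], t, x), subst(phi[2], t, x))
    # forall/exists: substitute only when the bound variable differs from x
    if phi[1] == x:
        return phi
    return (tag, phi[1], subst(phi[2], t, x))

p = ("imp", ("atom", "P", [("var", "x")]),
            ("forall", "x", ("atom", "P", [("var", "x")])))
# subst(p, ("const", "c"), "x") replaces only the free occurrence of x;
# the occurrence under the quantifier forall x is left untouched.
```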

1.3. Formula schemas and theories

We will also consider formula (or axiom) schemas. These are built like formulas, except that they may contain placeholders ρ1, ρ2, . . .. These are syntactically similar to relation symbols: every placeholder has a signature, and may occur at any place where a relation symbol of that signature is allowed. For example, if ρ has signature (σ, τ) and f is a function symbol of signature (τ, σ), the following is a formula schema:

∀xσ.∃yτ.(ρ(x, y) ∧ ρ(f(y), y))

A set of formula schemas is called a theory.

Remark. Every formula is a formula schema without any placeholders.

Definition. Suppose that F is a formula schema containing a placeholder ρ of signature (σ1, . . . , σn) and that ψ is a formula containing free variables x1σ1, . . . , xnσn. We define the substitution F[ψ/ρ] of ψ for ρ in F by replacing every occurrence ρ(t1σ1, . . . , tnσn) in F with ψ[t1, . . . , tn/x1, . . . , xn].

We write F [ψ1, . . . , ψn/ρ1, . . . , ρn] for successive substitution of ψi for ρi in F (for i = 1, . . . , n).

If ϕ is a formula such that ϕ ≡ F [ψ1, . . . , ψn/ρ1, . . . , ρn] for some ψ1, ρ1, . . . , ψn, ρn, we say that ϕ is an instance of the formula schema F . We extend the notions of free variables, substitution of terms and textual substitution to formula schemas in the straightforward manner.

1.4. Classical, intuitionistic and minimal logic

The systems we use will be based on so-called natural deduction. Natural deduction proofs are represented as trees where each node is of the form

D1 D2 . . . Dn
---------------
ψ

such that each Di is a proof tree and the root ψ is a formula (called the conclusion or consequent of the proof tree). The leaves of the tree (that is, nodes with no subproofs) are the assumptions or antecedents of the proof tree. We write

D
ψ

to indicate that the proof tree D has root ψ, and

[ϕ1, . . . , ϕn]
D

to indicate that D contains 0 or more occurrences of the formulas ϕ1, . . . , ϕn as assumptions. Our proof trees are freely generated by a set of rules defined below. Rules have the form

ϕ1 ϕ2 . . . ϕn
--------------- L
ψ

which means that the rule named L may be applied to proof trees Di with root ϕi (i = 1, . . . , n) to form a proof tree

D1 D2 . . . Dn
---------------
ψ

This is called an instance of the rule. A rule of the form

[ϕ1, . . . , ϕn]
.
.
ψ
--------- L′
χ

additionally requires that its instances are of the form

[ϕ1, . . . , ϕn]
D
ψ
--------- L′
χ

9 Some rules limit the scope of their assumptions. When we write

[ϕ1, . . . , ϕn]h
.
.

ψ
--------- L′, h
χ

we mean that the assumptions in the list marked by h may not be used following the application of the rule. We say that the rule discharges the assumptions. Assumptions that are not discharged are called open.

Definition. The system of classical many-sorted logic CQC is given by the following rules:

⊥i: from ⊥ infer ϕ.

⊥c, h: from a derivation of ⊥ that may use the assumption [¬ϕ]h, infer ϕ, discharging h.

∧I: from ϕ and ψ infer ϕ ∧ ψ. ∧E1, ∧E2: from ϕ ∧ ψ infer ϕ (respectively ψ).

∨I1, ∨I2: from ϕ (respectively ψ) infer ϕ ∨ ψ. ∨E, h1, h2: from ϕ ∨ ψ, a derivation of χ that may use [ϕ]h1 and a derivation of χ that may use [ψ]h2, infer χ, discharging h1 and h2.

→I, h: from a derivation of ψ that may use [ϕ]h, infer ϕ → ψ, discharging h. →E: from ϕ → ψ and ϕ infer ψ.

∀I: from ϕ(xσ) infer ∀xσ.ϕ. ∀E: from ∀xσ.ϕ infer ϕ[Mσ/xσ] for any term Mσ.

∃I: from ϕ[Mσ/xσ] infer ∃xσ.ϕ. ∃E, h: from ∃xσ.ψ and a derivation of χ that may use [ψ(xσ)]h, infer χ, discharging h.

In order to apply some of the rules, we must make sure that certain side-conditions are satisfied: in ∀I, x may not occur free in an open assumption in the derivation of ϕ. For ∃E, we require that every assumption in the derivation of χ that contains x free is discharged at the same time as ψ(x). Moreover, χ itself may not contain x free.

The system of intuitionistic many-sorted logic, IQC, is given by the same rules, with the exception of the rule ⊥c. If we also remove the rule ⊥i, we obtain the system of minimal many-sorted logic, MQC.

Notation. We write Γ `c ϕ if there is a derivation within CQC of ϕ, all of whose open assumptions are elements in Γ. Similarly, for IQC and MQC, we write Γ `i ϕ and Γ `m ϕ respectively. If the formal system is clear by context, we may instead write Γ ` ϕ. For a theory T we write T ` ϕ if there is a derivation whose open assumptions are instances of schemas in T.

Notation. We will occasionally omit parts of proof trees that have already been proved elsewhere. The omitted parts will be indicated by a doubly stroked line in the tree. For example, if we have previously proved ϕ `m ψ, then we may write a derivation of ϕ, χ `m ψ ∧ χ as follows:

ϕ
═══
ψ       χ
----------- ∧I
ψ ∧ χ

The following collection of derivations will be used throughout the text:

Lemma 1.1. For all formulas ϕ, ψ

(i) ϕ `m ¬¬ϕ

(ii) ¬¬¬ϕ `m ¬ϕ

(iii) ¬¬⊥ `m ⊥

(iv) ϕ ∨ ψ `m ¬(¬ϕ ∧ ¬ψ)

(v) ϕ → ψ `m ¬ψ → ¬ϕ

(vi) ϕ → ψ `m ¬¬ϕ → ¬¬ψ

(vii) ¬(ϕ → ψ) `i ¬(¬ϕ ∨ ψ)

(viii) ∃xσ.ϕ `m ¬∀xσ.¬ϕ

(ix) ¬∃xσ.ϕ `m ∀xσ.¬ϕ

(x) ∀xσ.¬ϕ `m ¬∃xσ.ϕ

(xi) ¬¬ϕ ∧ ¬¬ψ `m ¬¬(ϕ ∧ ψ)

(xii) ¬¬(ϕ ∧ ψ) `m ¬¬ϕ ∧ ¬¬ψ

(xiii) ¬¬ϕ → ¬¬ψ `i ¬¬(ϕ → ψ)

(xiv) ∃xσ.¬¬ϕ `m ¬¬∃xσ.ϕ

(xv) ¬¬∀xσ.ϕ `m ∀xσ.¬¬ϕ

(xvi) ¬∀xσ.¬ϕ `m ¬¬∃xσ.ϕ

Proof. See Appendix A.

Lemma 1.2. If FV(ϕ) ∩ BV(Γ ∪ {ψ}) = ∅, then Γ `m ψ ⇒ Γ[ϕ/⊥] `m ψ[ϕ/⊥].

Proof. In MQC the symbol ⊥ carries no special meaning, so it can be replaced by any other formula.

Further reading

For an extensive treatment of many-sorted logic and how it relates to single-sorted logic, see [16]. The natural deduction style proof systems for many-sorted logic used in this thesis are based on the one in [11], page 225.

2. The Gödel-Gentzen negative translation

The main protagonists of this text will be certain translations of logical formulas. Given logical systems S1 and S2 with languages L1 and L2, a syntactic translation of S1 into S2 is a function F : L1 → L2 such that, for any set Γ ⊆ L1 and formula ϕ ∈ L1, we have

Γ `S1 ϕ if and only if F(Γ) `S2 F(ϕ).

Here F(Γ) denotes the set {F(ϕ) | ϕ ∈ Γ}.[1] We will consider translations between the systems CQC, IQC and MQC. The first such translation is one from CQC into MQC, formulated independently by K. Gödel and G. Gentzen in the 1930s. A similar, and equivalent, translation was found earlier by A. Kolmogorov, but remained unknown to his contemporaries until some years later. The presentation in this section and the next largely follows Troelstra and van Dalen [23].

Definition. For every formula ϕ we define the Gödel-Gentzen negative translation ϕg as follows:

pg ≡ ¬¬p for prime formulas p
(ϕ ∧ ψ)g ≡ ϕg ∧ ψg
(ϕ ∨ ψ)g ≡ ¬(¬ϕg ∧ ¬ψg)
(ϕ → ψ)g ≡ ϕg → ψg
(∀xσ.ϕ)g ≡ ∀xσ.ϕg
(∃xσ.ϕ)g ≡ ¬∀xσ.¬ϕg

We have `c p ↔ pg, `c (ϕ ∨ ψ)g ↔ ϕg ∨ ψg and `c (∃xσ.ϕ)g ↔ ∃xσ.ϕg, since reductio ad absurdum and the De Morgan laws hold in CQC. By induction, this gives the following theorem:

Theorem 2.1. For all formulas ϕ, `c ϕ ↔ ϕg.
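The translation is a straightforward structural recursion on formulas. A Python sketch using a hypothetical tuple encoding of formulas (sorts omitted; the encoding is an illustration, not from the text):

```python
# Formulas: "bot" | ("atom", name, [terms]) | ("and"/"or"/"imp", A, B)
#           | ("forall"/"exists", var_name, body)

def neg(a):
    """The abbreviation not-A := A -> bot."""
    return ("imp", a, "bot")

def gg(phi):
    """The Godel-Gentzen negative translation phi^g."""
    if phi == "bot" or phi[0] == "atom":
        return neg(neg(phi))                          # p^g = not not p
    tag = phi[0]
    if tag == "and":
        return ("and", gg(phi[1]), gg(phi[2]))
    if tag == "or":                                   # not(not A^g and not B^g)
        return neg(("and", neg(gg(phi[1])), neg(gg(phi[2]))))
    if tag == "imp":
        return ("imp", gg(phi[1]), gg(phi[2]))
    if tag == "forall":
        return ("forall", phi[1], gg(phi[2]))
    return neg(("forall", phi[1], neg(gg(phi[2]))))   # not forall x. not A^g

p = ("atom", "P", [])
lem_g = gg(("or", p, neg(p)))   # the translation of LEM for P,
                                # derivable in MQC by Theorem 2.5
```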

The following pages will show that (·)g is a translation from CQC into MQC. Let us however start by making some remarks on the formulas in the image of the translation. First of all, such formulas will never contain ∨ or ∃. Secondly, since every prime formula (including ⊥) is mapped to its double negation, every atom in a formula will occur as a negation. We give such formulas a special name.

Definition. A formula ϕ is said to be negative if the only atomic subformulas of ϕ are negated and ϕ does not contain ∨ or ∃.
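Negativity is likewise a simple syntactic check. A Python sketch using the same hypothetical tuple encoding as above (here ⊥ itself is counted as negative, since it appears inside every negation ¬p ≡ p → ⊥):

```python
def is_negative(phi):
    """No \/ or exists, and every atom other than bot occurs negated."""
    if phi == "bot":
        return True
    tag = phi[0]
    if tag in ("atom", "or", "exists"):
        return False           # bare atoms, disjunctions, existentials are out
    if tag == "imp":
        a, b = phi[1], phi[2]
        if a != "bot" and a[0] == "atom":
            return b == "bot"  # not-p, a negated prime formula, is negative
        return is_negative(a) and is_negative(b)
    if tag == "and":
        return is_negative(phi[1]) and is_negative(phi[2])
    return is_negative(phi[2])  # forall

p = ("atom", "P", [])
not_p = ("imp", p, "bot")
# not_p, not_p /\ not_p and forall x. not_p are negative; p and p \/ p are not
```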

So the negative translation of any formula is negative.

Theorem 2.2. For all negative formulas ϕ,

`m ¬¬ϕ → ϕ

Proof. We proceed by induction on the structure of ϕ. Since ϕ is negative, we only have to consider four cases:

• If ϕ ≡ ¬p for some atomic p we have `m ¬¬ϕ → ϕ by Lemma 1.1(ii).

[1] This general definition of syntactic translations is taken from [17].

• If ϕ ≡ ψ ∧ χ then ψ and χ are negative, so by the induction hypothesis we have derivations D1 of ¬¬ψ → ψ and D2 of ¬¬χ → χ. Assume ¬¬(ψ ∧ χ). By Lemma 1.1(xii) this gives ¬¬ψ ∧ ¬¬χ, hence ¬¬ψ and ¬¬χ, and then ψ and χ by D1 and D2, so ψ ∧ χ. Discharging the assumption yields ¬¬(ψ ∧ χ) → ψ ∧ χ.

• If ϕ ≡ ψ → χ then in particular χ is negative, giving a derivation D of ¬¬χ → χ. Assume ¬¬(ψ → χ) and ψ; by D it suffices to derive ¬¬χ. So assume ¬χ: from a hypothetical ψ → χ and ψ we get χ, and with ¬χ we get ⊥, so ¬(ψ → χ); together with ¬¬(ψ → χ) this gives ⊥. Hence ¬¬χ, so χ, and discharging the assumptions gives ¬¬(ψ → χ) → (ψ → χ).

• If ϕ ≡ ∀xσ.ψ then ψ is negative, so we have a derivation D of ¬¬ψ(Mσ) → ψ(Mσ). Assume ¬¬∀xσ.ψ. To derive ¬¬ψ(Mσ), assume ¬ψ(Mσ): from a hypothetical ∀xσ.ψ we get ψ(Mσ) by ∀E, hence ⊥, so ¬∀xσ.ψ, and with ¬¬∀xσ.ψ we get ⊥. Thus ¬¬ψ(Mσ), so ψ(Mσ) by D, and since Mσ was arbitrary, ∀xσ.ψ by ∀I. Discharging gives ¬¬∀xσ.ψ → ∀xσ.ψ.

Lemma 2.3. For all negative formulas ϕ

`m ϕg → ϕ

Proof. Since ϕ is negative, it will not contain ∨ or ∃. The translation will thus only replace every negated atomic subformula ¬p ≡ p → ⊥ of ϕ with ¬¬p → ¬¬⊥. By Lemma 1.1(iii), `m (¬¬p → ¬¬⊥) → (¬¬p → ⊥); the formula ¬¬p → ⊥ is ¬¬¬p, so by Lemma 1.1(ii),

`m (¬¬p → ¬¬⊥) → ¬p.

The lemma follows by induction.

Lemma 2.4. If Γ is a set of formulas and ϕ is in the ∧, →, ∀, ⊥-fragment of many-sorted logic, then

Γ `c ϕ ⇒ Γg `m ϕg.

Proof. We prove this by recursively translating a deduction tree D in CQC for ϕ into a tree Dg in MQC for ϕg. We do this for all deductions of ϕ and argue that if D only contains open assumptions from Γ, then its translation only contains open assumptions from Γg.

• In the base case, where D consists of an assumption ϕ, simply set Dg to ϕg, so if ϕ ∈ Γ then ϕg ∈ Γg.

• For an application of one of the rules concerning ∧, → and ∀, the translation is simply another instance of that same rule. For example, if D ends in an application of ∧I to subderivations D1 of ϕ and D2 of ψ, we first translate D1 and D2 into trees D1g of ϕg and D2g of ψg, and then apply ∧I to obtain ϕg ∧ ψg. Also note that if one of these rules discharges an assumption of its premises, the corresponding assumption will also be discharged in the translation: the translation of a derivation of ψ → χ by →I, discharging [ψ]1 in a subderivation D1 of χ, is the derivation of ψg → χg by →I, discharging [ψg]1 in D1g.

• For ∀I, we must also show that the side-condition of the rule is preserved. Suppose D ends in an application of ∀I to a subderivation D1 of ψ(xσ), with conclusion ∀xσ.ψ. By the side-condition of ∀I, D1 cannot contain an open assumption with x free. But the induction hypothesis and the fact that the negative translation preserves free variables imply that the same holds for D1g. Then we can translate D into the application of ∀I to D1g, with conclusion ∀xσ.ψg.

• If D ends in an application of ⊥c, deriving ϕ from a subderivation of ⊥ whose open assumptions may include [¬ϕ], we first construct a tree Dg deriving ⊥g from open assumptions [¬ϕg], and then replace the application of the rule as follows: from [¬ϕg]1 and Dg we obtain ⊥g, and hence ⊥ by Lemma 1.1(iii). Discharging [¬ϕg]1 gives ¬¬ϕg, and applying →E with the derivation D′ of ¬¬ϕg → ϕg given by Theorem 2.2 (ϕg is negative) yields ϕg.

• Similarly, for a tree ending in ⊥i, deriving ϕ from a subderivation of ⊥, we first construct a tree deriving ⊥g by adding the open assumption ¬ϕg to the tree D1g given by the induction hypothesis, and then replace it by a tree similar to the one above.

Theorem 2.5. For all formulas ϕ and sets Γ of formulas

Γ `c ϕ ⇐⇒ Γg `m ϕg.

Proof. For the first direction, first note that Γ `c ϕ ⇒ Γ `c ϕg (by Theorem 2.1). ϕg is negative, so we can apply Lemma 2.4 to get Γg `m (ϕg)g. Lemma 2.3 then gives us Γg `m ϕg. The other direction follows directly from Theorem 2.1.

3. Consequences of the negative translation

The negative translation is a map between formulas, but, as Theorem 2.5 shows, it also admits a corresponding translation of proofs of classical logic into minimal logic. This means that, in a sense, classical reasoning is valid in minimal logic, within the image of the negative translation. Now we are interested in sufficient conditions for inverting the negative translation within minimal (or at least intuitionistic) logic. That is, when does

Γg `m ϕg =⇒ Γ `m ϕ or Γg `m ϕg =⇒ Γ `i ϕ

hold? Together with Theorem 2.5 this would show that every classical proof of ϕ can be translated into a constructive one. We will consider the more general problem of translations of proofs within a theory.

Definition. A theory T is M-closed under g if for every instance ϕ of a schema in T we have T `m ϕg.

Lemma 3.1. If T is M-closed under g and T `c ϕ, then T `m ϕg.

Proof. T `c ϕ means that there is a set ∆ containing instances of schemas from T such that ∆ `c ϕ. From Theorem 2.5 we have ∆g `m ϕg. Since T is M-closed under g, we have T `m ϕg.

This means that if T is M-closed under g, it suffices to consider formulas ϕ for which we can derive ϕg → ϕ. One class of such formulas is given by Lemma 2.3:

Theorem 3.2. If T is M-closed under g and ϕ is negative, then T `c ϕ ⇐⇒ T `m ϕ.

In other words, classical logic is a conservative extension of intuitionistic logic for negative formulas! The remainder of this section will be devoted to the study of other classes of formulas ϕ such that T ` ϕg → ϕ.

3.1. Properties of schemas

Definition. If ϕ[ρ1, . . . , ρn] is a formula schema and ψ1, . . . , ψn are formulas, we say that ϕ is

• spreading if `m ϕ[ψ1g, . . . , ψng] → ϕ[ψ1, . . . , ψn]g
• wiping if `m ϕ[ψ1, . . . , ψn]g → ϕ[ψ1g, . . . , ψng]
• isolating if `m ϕ[ψ1, . . . , ψn]g → ¬¬ϕ[ψ1g, . . . , ψng]

Similarly, we use the terms I-spreading, I-wiping and I-isolating for the corresponding notions in IQC.

Remark. The above definitions also apply to formulas, since a formula is a schema without placeholders. For example, a formula ϕ is spreading if `m ϕ → ϕg.

Definition. A schema is essentially isolating if it is of the form ∀xσ.(ψ → ∀yτ.χ) with ψ spreading and χ isolating.

Theorem 3.3. If every schema ϕ in a theory T is spreading, then T is M-closed under g.

Proof. Let ϕ[ψ1, . . . , ψn] be an instance of ϕ. Then, since ϕ is spreading, we have

` ϕ[ψ1g, . . . , ψng] → ϕ[ψ1, . . . , ψn]g

But ϕ[ψ1g, . . . , ψng] is also an instance of ϕ, so

T ` ϕ[ψ1g, . . . , ψng] and hence T ` ϕ[ψ1, . . . , ψn]g.

As we will see, there is a similar result that allows us to use the above classes of schemas to invert the negative translation. Before we can formulate this result, however, we will need some machinery. Given an implication ϕ → ψ, we say that the left hand side ϕ is negative, and the right hand side ψ is positive. Compare this to the occurrence of ϕ in the formula ϕ → ⊥, versus the one in ⊥ → ϕ. This notion can be extended to any subformula of ϕ and ψ.

Definition. The class of formula contexts CON is generated by the following grammar:

F := ∗ | ϕ ∧ F | F ∧ ϕ | ϕ ∨ F | F ∨ ϕ | ϕ → F | F → ϕ | ∀xσ.F | ∃xσ.F where ϕ ranges over formulas and σ over sorts.

A formula context can be thought of as a formula containing exactly one "hole" (namely ∗). If we substitute a formula ϕ for ∗ in some context F, we get a formula, which we denote F[ϕ].

Definition. A subformula occurrence (sfo) of a formula ϕ is a pair (F, ψ) where ψ is a formula and F ∈ CON, such that F [ψ] ≡ ϕ. We will also refer to such an sfo as a (subformula) occurrence of ψ in ϕ. Definition. The positive contexts, POS, and negative contexts, NEG are defined simultaneously as follows, where P ∈ POS and N ∈ NEG:

P := ∗ | ϕ ∧ P | P ∧ ϕ | ϕ ∨ P | P ∨ ϕ | ϕ → P | N → ϕ | ∀xσ.P | ∃xσ.P

N := ϕ ∧ N | N ∧ ϕ | ϕ ∨ N | N ∨ ϕ | ϕ → N | P → ϕ | ∀xσ.N | ∃xσ.N

A positive (negative) occurrence of ψ in ϕ is an sfo (F, ψ) of ϕ such that F is positive (negative).
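Positive and negative occurrences can be computed by a recursion that flips polarity on the left of every implication. A Python sketch using a hypothetical tuple encoding of formulas (an illustration, not from the text):

```python
# Formulas: "bot" | ("atom", name, [terms]) | ("and"/"or"/"imp", A, B)
#           | ("forall"/"exists", var_name, body)

def occurrences(phi, target, positive=True):
    """List the polarities ("pos"/"neg") of all occurrences of target in phi,
    in left-to-right order."""
    found = ["pos" if positive else "neg"] if phi == target else []
    if phi == "bot" or phi[0] == "atom":
        return found
    tag = phi[0]
    if tag in ("and", "or"):
        return found + occurrences(phi[1], target, positive) \
                     + occurrences(phi[2], target, positive)
    if tag == "imp":
        # the antecedent sits in a negative context: polarity flips
        return found + occurrences(phi[1], target, not positive) \
                     + occurrences(phi[2], target, positive)
    return found + occurrences(phi[2], target, positive)   # quantifiers

p = ("atom", "p", [])
# p occurs negatively in p -> bot, positively in bot -> p,
# and positively again under a double negation (p -> bot) -> bot.
```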

Lemma 3.4. If F ∈ CON and z̄ is a list of variables that are free in ψ and χ, but bound in F[ψ] and F[χ], then

F ∈ POS ⇒ ∀z̄.(ψ → χ) `m F[ψ] → F[χ] and

F ∈ NEG ⇒ ∀z̄.(ψ → χ) `m F[χ] → F[ψ]

Proof. We use induction on the structure of F. The case F ≡ ∗ is immediate.

Suppose F ∈ POS. If F ≡ ϕ ∧ P for some P ∈ POS, then by the induction hypothesis there is a derivation D of P[ψ] → P[χ] from ∀z̄.(ψ → χ). Assuming ϕ ∧ P[ψ], we obtain ϕ and P[ψ] by ∧E, then P[χ] by D, and hence ϕ ∧ P[χ]; discharging the assumption gives ϕ ∧ P[ψ] → ϕ ∧ P[χ]. The cases F ≡ P ∧ ϕ, P ∨ ϕ, ϕ ∨ P, ϕ → P are similar.

Suppose F ≡ ∀xσ.P where P ∈ POS. Without loss of generality, we can assume z̄ ≡ xσ, ȳ, since the variables in a universal quantification can be reordered. Note that P[θ][Mσ/xσ] ≡ P[Mσ/xσ][θ[Mσ/xσ]] for any formula θ. From ∀xσ, ȳ.(ψ → χ) we obtain ∀ȳ.(ψ[Mσ/xσ] → χ[Mσ/xσ]), and hence by the induction hypothesis (since P[Mσ/xσ] ∈ POS) a derivation of P[ψ][Mσ/xσ] → P[χ][Mσ/xσ]. Assuming ∀xσ.P[ψ], we obtain P[ψ][Mσ/xσ] by ∀E, hence P[χ][Mσ/xσ], and then ∀xσ.P[χ] by ∀I; discharging gives ∀xσ.P[ψ] → ∀xσ.P[χ]. The case F ≡ ∃xσ.P is similar, using ∃E and ∃I.

If F ≡ N → ϕ for some N ∈ NEG, the induction hypothesis gives a derivation D of N[χ] → N[ψ]. Assuming N[ψ] → ϕ and N[χ], we obtain N[ψ] by D and hence ϕ by →E; discharging gives (N[ψ] → ϕ) → (N[χ] → ϕ).

The case where F ∈ NEG is completely analogous.

Lemma 3.5. Let ϕ, ψ and χ be formulas such that FV(χ) ∩ BV(ϕ) = ∅. If z̄ is a list of variables free in ψ and χ and bound in ϕ, then

(i) If ψ occurs only positively in ϕ, then ∀z̄.(ψ → χ) `m ϕ → ϕ[χ/ψ]

(ii) If ψ occurs only negatively in ϕ, then ∀z̄.(ψ → χ) `m ϕ[χ/ψ] → ϕ

Proof.

(i) We can find F1,...,Fn ∈ POS such that

F1[ψ] ≡ ϕ

F2[ψ] ≡ F1[χ] . .

Fn[ψ] ≡ Fn−1[χ]

and Fn[χ] ≡ ϕ[χ/ψ]. That is, Fi[ψ] is ϕ with i − 1 occurrences of ψ replaced with χ. Then, by successive application of Lemma 3.4 we have

∀z̄.(ψ → χ) `m Fi[ψ] → Fi[χ]

for i = 1, . . . n. By construction of Fi, we have

∀z̄.(ψ → χ) `m Fi−1[χ] → Fi[χ]

for i = 2, . . . , n and ∀z̄.(ψ → χ) `m F1[ψ] → F1[χ], so that

∀z̄.(ψ → χ) `m F1[ψ] → Fn[χ].

But F1[ψ] ≡ ϕ and Fn[χ] ≡ ϕ[χ/ψ], which concludes the proof. (ii) Similar to (i).

Lemma 3.6. Let ϕ[ρ̄] be a formula schema and ψ and χ be formulas such that FV(χ) ∩ BV(ϕ) = ∅. If Ā = A1, . . . , An is a list of formulas, and z̄ is a list of variables free in ψ and χ and bound in ϕ[Ā/ρ̄], then

(i) If ψ occurs only positively in ϕ, then ∀z̄.(ψ → χ) `m ϕ[Ā/ρ̄] → ϕ[χ/ψ][Ā/ρ̄].

(ii) If ψ occurs only negatively in ϕ, then ∀z̄.(ψ → χ) `m ϕ[χ/ψ][Ā/ρ̄] → ϕ[Ā/ρ̄].

Proof. By a similar construction to Lemma 3.5.

3.2. Conservativity results

Theorem 3.7. If T is M-closed under g and ϕ is wiping (I-wiping), then T `c ϕ if and only if T `m ϕ (T `i ϕ).

Proof. Follows directly from Lemma 3.1 and the definition of (I-)wiping formulas.

Theorem 3.8. Suppose T is M-closed under g, ϕ is isolating, and that T `c ϕ.

(i) If ⊥ does not occur in T or ϕ, then T `m ϕ.

(ii) If ⊥ occurs only positively in T, and only negatively in ϕ, then T `i ϕ.

Proof. (i) From Lemma 3.1 we have T `m ϕg, which gives T `m ¬¬ϕ since ϕ is isolating. We use Lemma 1.2 to replace ⊥ by ϕ in T and in ¬¬ϕ ≡ (ϕ → ⊥) → ⊥, giving

T[ϕ/⊥] `m (ϕ[ϕ/⊥] → ϕ) → ϕ,

but by assumption ⊥ does not occur in T or ϕ, so T `m (ϕ → ϕ) → ϕ. Now, `m ϕ → ϕ, so T `m ϕ. (ii) As in (i) we derive T[ϕ/⊥] `m (ϕ[ϕ/⊥] → ϕ) → ϕ.

It suffices to show two things, namely that `i ϕ[ϕ/⊥] → ϕ and that for any instance ψ[Ā/ρ̄] of a schema ψ[ρ̄] in T[ϕ/⊥], we have T `i ψ[Ā/ρ̄].

The former follows from Lemma 3.5, which gives ⊥ → ϕ `m ϕ[ϕ/⊥] → ϕ[⊥/⊥], that is, ⊥ → ϕ `m ϕ[ϕ/⊥] → ϕ, so that `i ϕ[ϕ/⊥] → ϕ. For the latter, if ψ[ρ̄] ∈ T[ϕ/⊥] then there is some ψ∗[ρ̄] ∈ T such that ψ∗[ϕ/⊥] ≡ ψ. By Lemma 3.6 we have ⊥ → ϕ `m ψ∗[Ā/ρ̄] → ψ∗[ϕ/⊥][Ā/ρ̄], since, by assumption, ⊥ occurs only positively in ψ∗. But then `i ψ∗[Ā/ρ̄] → ψ∗[ϕ/⊥][Ā/ρ̄], so that T `i ψ∗[ϕ/⊥][Ā/ρ̄].

Theorem 3.9. Suppose T is M-closed under g, ϕ is essentially isolating, and that T `c ϕ.

(i) If ⊥ does not occur in T or ϕ, then T `m ϕ.

(ii) If ⊥ occurs only positively in T, and only negatively in ϕ, then T `i ϕ.

Proof. Since ϕ is essentially isolating, it is of the form ∀xσ.(ψ → ∀yτ.χ) where ψ is spreading and χ is isolating.

(i) Since ψ is spreading, we have T, ψ `m ψg, so the theory T, ψ is M-closed under g. Since T, ψ `c χ we can apply Theorem 3.8 to get a derivation T, ψ `m χ, so that T `m ∀xσ.(ψ → ∀yτ.χ).

(ii) Similar, with the observation that a negative occurrence of a formula in ∀xσ.(ψ → ∀yτ.χ) is a positive occurrence in ψ or a negative occurrence in χ, whence we can apply the second part of Theorem 3.8.

3.3. Identifying wiping, spreading and isolating formulas

The above results can be applied to suitable wiping, spreading and isolating formula schemas to show conservativity results in various theories. However, it would be tedious to provide formal proofs whenever we want to show that a particular schema is, say, isolating, or that a theory is M-closed under g. Instead, we formulate sufficient conditions that allow us to show the above by inspection of the structure of formula schemas.

Definition. The classes S, W and I are defined simultaneously as follows, using the parameters S, S1,S2 ∈ S, W, W1,W2 ∈ W and J, J1,J2 ∈ I:

S := ⊥ | P | ρ | S1 ∧ S2 | S1 ∨ S2 | ∀xσ.S | ∃xσ.S | J → S
W := ⊥ | ρ | W1 ∧ W2 | ∀xσ.W | S → W
J := P | W | J1 ∧ J2 | J1 ∨ J2 | ∃xσ.J

where P ranges over prime formulas different from ⊥ and ρ over placeholders.

Theorem 3.10.

(i) If ϕ ∈ S then ϕ is spreading.

(ii) If ϕ ∈ W then ϕ is wiping.

(iii) If ϕ ∈ I then ϕ is isolating.

Proof. By induction on the structure of ϕ using Lemma 1.1.

There are similar classes of formulas for intuitionistic logic: The classes Si, Wi and Ii are generated by the same syntactic categories as above, with the modification that J has the form

J := P | W | J1 ∧ J2 | J1 ∨ J2 | ∃xσ.J | S → J

Theorem 3.11.

(i) If ϕ ∈ Si then ϕ is I-spreading.

(ii) If ϕ ∈ Wi then ϕ is I-wiping.

(iii) If ϕ ∈ Ii then ϕ is I-isolating.

Proof. By induction on the structure of ϕ using Lemma 1.1.

4. Heyting and Peano arithmetic

We now demonstrate the tools from Sections 2 and 3 on Peano arithmetic, which we formulate as a many-sorted theory containing only one sort, called 0, meant to represent the natural numbers. The language contains the constant 0, the unary function symbol S, function symbols for all primitive recursive functions, and one binary predicate =. They satisfy the following axioms:

(i) Axioms for equality

x = x
x = y → y = x
x = y ∧ y = z → x = z

(ii) For any n-ary function symbol f:

x1 = y1 ∧ ... ∧ xn = yn → f(x1, . . . , xn) = f(y1, . . . , yn)

(iii) ¬S(x) = 0

(iv) S(x) = S(y) → x = y

(v) The defining equations for all primitive recursive functions. For example:

0 + n = n
S(m) + n = S(m + n)
0 · n = 0
S(m) · n = n + m · n

(vi) The induction axiom schema:

ρ(0) → ∀x.(ρ(x) → ρ(S(x))) → ∀x.ρ(x)
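The defining equations in (v) are instances of primitive recursion on the first argument and can be read directly as recursive programs; a small illustrative sketch (the function names are ours):

```python
def add(m, n):
    # 0 + n = n ;  S(m) + n = S(m + n)
    return n if m == 0 else 1 + add(m - 1, n)

def mul(m, n):
    # 0 * n = 0 ;  S(m) * n = n + m * n
    return 0 if m == 0 else add(n, mul(m - 1, n))
```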

By tradition, Peano arithmetic within IQC is called Heyting arithmetic. We use the notation PA and HA, respectively, for the classical and intuitionistic versions of the theory.

4.1. The arithmetical hierarchy

Definition. A relation R ⊆ Nᵏ is said to be (PA)-definable if there is a PA-formula ϕ(x1, . . . , xk) such that N |= ϕ(n1, . . . , nk) if and only if (n1, . . . , nk) ∈ R, where N is the standard model of PA.

Definition. A function f : N → N is (PA)-definable if its graph {(n, f(n)) | n ∈ N} is definable.

Definition. Define x ≤ y ≡ ∃z.x + z = y and x < y ≡ x ≤ y ∧ ¬x = y.

For quantifiers, we use the notation

∀x ≤ z.ϕ(x) ≡ ∀x.x ≤ z → ϕ(x)

∃x ≤ z.ϕ(x) ≡ ∃x.x ≤ z ∧ ϕ(x), and likewise for <.

Definition (The arithmetical hierarchy). A formula is bounded if all quantifiers that occur in the formula are of the form ∀x ≤ z.ψ(x) or ∃x ≤ z.ψ(x). In this case we also say that the formula belongs to the class ∆0.

We define, for n ∈ N, the classes Σn and Πn of formulas as follows:

• ϕ is in both Σ0 and Π0 if ϕ is bounded.

• ϕ is in Σn+1 if ϕ ≡ ∃x.ψ(x) where ψ ∈ Πn.

• ϕ is in Πn+1 if ϕ ≡ ∀x.ψ(x) where ψ ∈ Σn.

Remark. There is a standard way to encode a list of natural numbers as a single number, via a bijective primitive recursive function ⋃k∈N Nᵏ → N, with corresponding primitive recursive projection functions N → N (see [19], page 70). Since these functions have corresponding function symbols in PA, every quantification of the form Qx1 ··· Qxk.ϕ is equivalent to a formula Qx.ϕ′ for some suitable formula ϕ′. Thus, every formula of the form ∃x1 ··· ∃xk.ϕ with ϕ ∈ Πn is equivalent to a Σn+1-formula (and similarly for Πn+1-formulas). We use this fact implicitly throughout the remainder of the thesis.

Theorem 4.1. Equality in HA is decidable, that is,

HA `i ∀x.∀y.x = y ∨ ¬x = y.

Proof. By induction in HA.
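The coding of tuples mentioned in the remark above can be illustrated with the Cantor pairing function, one standard primitive recursive bijection of N × N onto N (a sketch; the thesis cites [19], which may use a different coding):

```python
def pair(x, y):
    # Cantor pairing: enumerate N x N along the finite diagonals x + y = s.
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    # Invert by a bounded search for the diagonal, hence primitive recursive.
    s = 0
    while (s + 1) * (s + 2) // 2 <= z:
        s += 1
    y = z - s * (s + 1) // 2
    return s - y, y
```

Iterating pair codes triples, quadruples, and so on, which is how a block of like quantifiers can be contracted to a single one.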

Theorem 4.2. For every ∆0-formula ϕ(¯x) in HA, there is a term tϕ(¯x) such that

HA `i tϕ(¯x) = 0 ↔ ϕ(¯x).

Proof. By induction on ϕ. We set t⊥ ≡ 1, so that

HA `i 1 = 0 ↔ ⊥, and for atomic formulas of the form x = y, we set tx=y(x, y) = |x − y|.

Now suppose that HA `i tψ(x¯) = 0 ↔ ψ(x¯) and HA `i tχ(y¯) = 0 ↔ χ(y¯) for some terms tψ and tχ. We define

• tψ∧χ(¯x, y¯) = tψ(¯x) + tχ(¯y)

• tψ∨χ(x̄, ȳ) = tψ(x̄) · tχ(ȳ)

• tψ→χ(x̄, ȳ) = (1 ∸ tψ(x̄)) · tχ(ȳ)

• t∀x≤z.ψ(x,y¯)(¯y) = Σx≤ztψ(x, y¯)

• t∃x≤z.ψ(x,ȳ)(ȳ) = Πx≤z tψ(x, ȳ)

where ∸ is given by

x ∸ y = x − y if x > y, and 0 otherwise.

Note that all the above functions are primitive recursive, hence HA contains the appropriate function symbols to define these terms inside the theory.

The reader may verify that HA `i tϕ(¯x) = 0 ↔ ϕ(¯x) for all of the above.
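The clauses in the proof can be read as an evaluator that, for each bounded formula, computes a number which is 0 exactly when the formula is true. A sketch over a small hypothetical AST, simplified so that atomic formulas compare variables only:

```python
def monus(x, y):
    # truncated subtraction
    return x - y if x > y else 0

def t(phi, env):
    """Value of the characteristic term t_phi under the assignment env."""
    tag = phi[0]
    if tag == 'false':                        # t_bot = 1
        return 1
    if tag == 'eq':                           # t_{x=y} = |x - y|
        return abs(env[phi[1]] - env[phi[2]])
    if tag == 'and':
        return t(phi[1], env) + t(phi[2], env)
    if tag == 'or':
        return t(phi[1], env) * t(phi[2], env)
    if tag == 'imp':
        return monus(1, t(phi[1], env)) * t(phi[2], env)
    if tag == 'all':                          # ('all', x, z, body): sum over x <= z
        return sum(t(phi[3], {**env, phi[1]: i}) for i in range(env[phi[2]] + 1))
    if tag == 'ex':                           # ('ex', x, z, body): product over x <= z
        res = 1
        for i in range(env[phi[2]] + 1):
            res *= t(phi[3], {**env, phi[1]: i})
        return res
    raise ValueError(tag)
```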

Corollary 4.3. For any ∆0-formula ϕ in HA

HA `i ϕ ∨ ¬ϕ.

Proof. By Theorems 4.1 and 4.2.

23 4.2. Towards translations of arithmetic

Theorem 4.4. HA is M-closed under g.

Proof. By inspection, we see that every axiom belongs to the class S, so by Theorems 3.10 and 3.3, HA is M-closed under g.

Furthermore, the only occurrence of ⊥ in HA is in Axiom (iii), where it occurs positively. This means that any formula provable in Peano arithmetic is provable in Heyting arithmetic if it is isolating and only contains ⊥ negatively.

Theorem 4.5. For any ∆0-formula ϕ(¯x) in HA

HA `i ϕ(x̄)g → ϕ(x̄).

Proof. Immediate from 4.3.

Corollary 4.6. For any Π1-formula ϕ in HA

HA `i ϕ(x̄)g → ϕ(x̄).

Proof. Immediate from 4.5.

Theorem 4.7. HA is consistent if and only if PA is consistent.

Proof. Suppose PA `c ⊥. Then HA `i ⊥g by Theorems 4.4 and 3.1, and ⊥g ≡ ¬¬⊥, which is equivalent to ⊥ by Lemma 1.1. The converse holds trivially since HA is a subsystem of PA.

5. Provably recursive functions

A fundamental result from recursion theory states that a (partial) function f : N → N is recursive if and only if its graph is definable by a Σ1-formula. That is, f is recursive if and only if

f is defined at x and f(x) = y ⇐⇒ ∃z.ψ(x, y, z) is true in the standard model for some ∆0-formula ψ. Hence f is defined at x if and only if the formula

∃y.∃z.ψ(x, y, z) holds in the standard model, and it is total if and only if

∀x.∃y.∃z.ψ(x, y, z) holds in N .

Definition. A partial function f : N → N is provably recursive in a theory T that includes arithmetic if

(a) The graph of f is definable by a Σ1-formula ϕ and (b) T ` ∃y.ϕ(x, y)(f is provably total).

(c) T ` ϕ(x, y1) ∧ ϕ(x, y2) → y1 = y2 (f is provably univalent).

In fact, there is a ∆0-formula T(e, x, z) and a primitive recursive function U : N → N such that for every recursive function f there is a number e such that f(x) is defined and equal to y if and only if

∃z.T(e, x, z) ∧ U(z) = y

is true in N.

The formula T(e, x, z) is usually called Kleene's T-predicate and states that the Turing machine coded by e terminates on input x and that the computation is coded by z. The function U extracts the result of running the machine from z.

Notation. Analogously to regular function application and λ-abstraction, we write {e}(x) for the result of applying the function coded by e to x (if defined), and given a term t(x), we write Λx.t(x) for the code t′ such that {t′}(x) = t(x).

5.1. Markov’s rule

A natural question to ask is whether Peano and Heyting arithmetic have the same provably recursive functions.

Note that both formulas in (b) and (c) above are equivalent to Σ1-formulas: ϕ(x, y) is Σ1 and hence so is ∃y.ϕ(x, y). Furthermore, if ϕ(x, y) ≡ ∃z̄.ψ(x, y, z̄) for some ∆0-formula ψ, then

(ϕ(x, y1) ∧ ϕ(x, y2) → y1 = y2) is minimally equivalent to

(∃z̄1.ψ(x, y1, z̄1)) ∧ (∃z̄2.ψ(x, y2, z̄2)) → y1 = y2, which is equivalent to ∃z̄1∃z̄2.ψ(x, y1, z̄1) ∧ ψ(x, y2, z̄2) → y1 = y2.

First, does PA `c ∃y.ϕ(x, y) imply HA `i ∃y.ϕ(x, y)?

Let us try to apply the Gödel–Gentzen translation: If

PA `c ∃y.ϕ(x, y)

then

HA `i ¬∀y.¬ϕ(x, y)g.

Hence, by Lemma 1.1,

HA `i ¬¬∃y.ϕ(x, y)g.

By assumption, ϕ(x, y) was ∆0, so indeed

HA `i ¬¬∃y.ϕ(x, y)

by Theorem 4.5. Both of the above criteria would hold in Heyting arithmetic if we could only apply the following proof rule (dubbed Markov's Rule for Primitive Recursive matrices):

¬¬∃n.ϕ(n, m) ` ∃n.ϕ(n, m)   (where ϕ ∈ ∆0)   (MRPR)

As we recall, reductio ad absurdum does not hold in general in IQC, and we can even show that it is not derivable as a formula in HA even when limited to Σ1-formulas. However, we could still make a case for why this rule makes sense constructively: Suppose ¬¬∃n.ϕ(n, m) and that we want to find a number n that satisfies ϕ(n, m). Let us check for every number i = 0, 1, 2,... whether ϕ(i, m) is true and return the smallest such number. This process should terminate: ϕ(i, m) is ∆0, so its truth can be checked using a simple computation. The only way that it would not terminate would be if there were no i satisfying ϕ(i, m), but this is not the case, by assumption. Hence our search will eventually find a witness i to ∃n.ϕ(n, m). While it is possible to motivate such a construction semantically, we can show (using the realizability interpretation of Section 6) that MRPR is not derivable in HA as a rule. That is, there is no single proof tree (with placeholders) that can replace every possible instance of the rule

¬¬∃n.ϕ(n, m) ∃n.ϕ(n, m) in IQC and using assumptions from HA.
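The informal justification of MRPR above is an unbounded search; as a sketch, with the decidable ∆0 matrix standing in as an ordinary Python predicate (a hypothetical interface):

```python
def markov_search(phi, m):
    """Least n with phi(n, m); terminates only if a witness exists."""
    n = 0
    while not phi(n, m):
        n += 1
    return n
```

The classically justified assumption ¬¬∃n.ϕ(n, m) is exactly what rules out non-termination of the loop.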

However, we can prove that HA is closed under addition of MRPR. In other words, if

HA + MRPR `i ψ then we can show that HA `i ψ. We do so with a negative translation of IQC into MQC.

5.2. The Dragalin-Friedman A-translation

This section introduces the A-translation as discovered by Friedman [8], but the presentation follows Leivant ([15] and [14]). Let us begin with a discussion of the role of the rule ⊥i in IQC. Suppose Γ `i ϕ.

Lemma 5.1. If Γ `i ϕ, then there is a proof of ϕ from Γ that only invokes the rule ⊥i to derive atomic formulas.

Proof. Suppose the proof contains an instance

[∆] D ⊥ ϕ ⊥i

Since ∆ `i ⊥, we can invoke the ⊥i rule to derive any subformula of ϕ as needed, and then use the introduction rules to derive ϕ from its subformulas.

Suppose we have a derivation T `i ϕ for some formula ϕ and theory T. Consider the structure of such a derivation:

• If it contains no applications of the rule ⊥i, then it’s already a derivation in MQC, so T `m ϕ.

• If it does contain some instance of ⊥i, we can assume without loss of generality that it is of the form D ⊥ ⊥ P i with P atomic. Whenever we encounter such an instance we may replace it with D ⊥ ∨I P ∨ ⊥ 1 and continue the proof of ϕ “as usual”, but everywhere using P ∨ ⊥ instead of P . This will be a proof in MQC. Let us define a translation ϕ⊥ of formulas, that replaces every non-⊥ atomic subformula P of ϕ with P ∨ ⊥. Formally: Definition. Let

⊥⊥ ≡ ⊥
P(x̄)⊥ ≡ P(x̄) ∨ ⊥
(ϕ ◦ ψ)⊥ ≡ ϕ⊥ ◦ ψ⊥, for ◦ ∈ {∧, ∨, →}
(∀xσ.ϕ)⊥ ≡ ∀xσ.ϕ⊥
(∃xσ.ϕ)⊥ ≡ ∃xσ.ϕ⊥

Theorem 5.2. For any formula ϕ, `i ϕ ↔ ϕ⊥.

Proof. For atoms, we have ϕ⊥ ≡ ϕ ∨ ⊥, which is intuitionistically equivalent to ϕ. The proof now follows by induction on ϕ.

Lemma 5.3. For any formula ϕ, ⊥ `m ϕ⊥.

Proof. By induction on the structure of ϕ.

Lemma 5.4. For any formula ϕ, if Γ `i ϕ, then Γ⊥ `m ϕ⊥.

Proof. By induction on the length of the proof for Γ `i ϕ. The case where the derivation is a single application of the rule ⊥i is covered by Lemma 5.3.

So intuitionistic logic holds within the image of (·)⊥ in MQC. As we have noted earlier, ⊥ has no special meaning in minimal logic, so we could do the above reasoning with any formula in place of ⊥.

Definition. Let A be a formula and ϕ a formula with FV (A) ∩ BV (ϕ) = ∅. Define

ϕA ≡ (ϕ⊥)[A/⊥]

This is the Dragalin–Friedman A-translation.

Remark. Because of the requirement on ϕ and A in the above definition, we have (ϕ[M σ/xσ])A ≡ ϕA[M σ/xσ] for all xσ,M σ. Also note that variables can always be renamed to meet this requirement.
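Since ϕA ≡ (ϕ⊥)[A/⊥], the translation can be computed in a single pass: ⊥ becomes A, every other atom P becomes P ∨ A, and the connectives and quantifiers are left untouched. A sketch over a hypothetical tuple encoding of formulas (the side condition FV(A) ∩ BV(ϕ) = ∅ is assumed, not checked):

```python
def a_translate(phi, A):
    """Compute phi^A = (phi^bot)[A/bot] on a tuple-encoded formula."""
    tag = phi[0]
    if tag == 'bot':
        return A                                 # bot^bot = bot, then [A/bot]
    if tag == 'atom':
        return ('or', phi, A)                    # P -> P or bot -> P or A
    if tag in ('and', 'or', 'imp'):
        return (tag, a_translate(phi[1], A), a_translate(phi[2], A))
    if tag in ('forall', 'exists'):              # (tag, var, body)
        return (tag, phi[1], a_translate(phi[2], A))
    raise ValueError(tag)

# not P, i.e. P -> bot, translates to (P or A) -> A.
```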

Theorem 5.5. If ϕ `i ψ, and ϕA, ψA are defined, then ϕA `i ψA.

Proof. Lemma 5.4 gives ϕ⊥ `m ψ⊥. From Lemma 1.2 we conclude

ϕ⊥[A/⊥] `m ψ⊥[A/⊥],

that is,

ϕA `m ψA.

Corollary 5.6. If ϕA is defined, then A `i ϕA.

Proof. Follows immediately from the previous theorem, since ⊥ `i ϕ.

The A-translation is thus a translation of intuitionistic into minimal logic, where A takes the role of falsity, and where the above Corollary corresponds to the rule ex falso quodlibet. We have the following generalization of Theorem 5.2:

Theorem 5.7. If ϕA is defined, then

¬A `i ϕ ↔ ϕA.

Proof. Similar to the proof of Theorem 5.2, noting that

¬A ∧ A `i ⊥, so that ¬A `i ϕ ∨ A → ϕ.

Similarly to the Gödel–Gentzen translation, we say that a theory T is closed under the A-translation if T `m ϕA for every instance ϕ of a schema in T.

Theorem 5.8. If T is closed under the A-translation for some A, then

T `i ϕ ⇒ T `m ϕA.

Theorem 5.9. HA is closed under the A-translation.

Proof. We have (s = t)A ≡ s = t ∨ A, so HA trivially proves the A-translation of all its axioms of this form.
For the axioms of the form ⋀i(ti = si) → t′ = s′, we have

(⋀i(ti = si) → t′ = s′)A ≡ ⋀i(ti = si ∨ A) → (t′ = s′ ∨ A).

An informal proof of this goes as follows: Suppose ⋀i(ti = si ∨ A). Then we can either derive A or we can derive ti = si for all i. In the former case we are done, and in the latter we can apply the axiom to find t′ = s′.

Finally, for an instance IAϕ ≡ ϕ(0) → (∀x.ϕ(x) → ϕ(S(x))) → ∀x.ϕ(x) of the induction axiom, we have

IAϕA ≡ ϕ(0)A → (∀x.ϕ(x)A → ϕ(S(x))A) → ∀x.ϕ(x)A,

which is itself an instance of the induction axiom, hence provable.

5.3. Conservativity for Π2-formulas in HA

The Dragalin-Friedman translation can be used to show that HA is closed under Markov's rule.

Definition. We say that a formula schema ϕ[ρ̄] is exporting if for all formulas ψ̄ there is a derivation

ϕ[ψ̄]⊥ `m ϕ[ψ̄⊥] ∨ ⊥.

Remark. If ϕ[ρ̄] is exporting, then for any A such that ϕ[ψ̄]A is defined,

ϕ[ψ̄]A `m ϕ[ψ̄A] ∨ A.

Theorem 5.10. If T is closed under (·)A, ϕ is exporting and T `i ¬¬ϕ, then T `m ϕ.

Proof. By Theorem 5.8,

T `m (¬¬ϕ)ϕ,

that is,

T `m (ϕϕ → ϕ) → ϕ.

Since ϕ is exporting, we find T `m (ϕ ∨ ϕ → ϕ) → ϕ, but ϕ ∨ ϕ → ϕ is a theorem in MQC, so T `m ϕ.

Lemma 5.11. Any HA-formula of the form ∃x.t(x, ȳ) = 0 is exporting.

Proof. We have (∃x.t(x, y¯) = 0)⊥ ≡ ∃x.t(x, y¯) = 0 ∨ ⊥, which gives the following proof tree:

[t(x, y¯) = 0]1 ∃x.t(x, y¯) = 0 [⊥]2 [t(x, y¯) = 0 ∨ ⊥]3 (∃x.t(x, y¯) = 0) ∨ ⊥ (∃x.t(x, y¯) = 0) ∨ ⊥ 1, 2 [∃x.t(x, y¯) = 0 ∨ ⊥] (∃x.t(x, y¯) = 0) ∨ ⊥ 3 (∃x.t(x, y¯) = 0) ∨ ⊥

Theorem 5.12. If ϕ(n, m¯ ) is ∆0, then

HA `i ¬¬(∃n.ϕ(n, m¯ )) ⇒ HA `i ∃n.ϕ(n, m¯ ).

Proof. By Lemma 4.2 there is a term tϕ(n, m¯ ) with

HA `i ϕ(n, m¯ ) ↔ tϕ(n, m¯ ) = 0.

From 5.9, 5.10 and 5.11 it now follows that

HA `i ¬¬(∃n.tϕ(n, m¯ )) ⇒ HA `m ∃n.tϕ(n, m¯ ).

The above equivalence finishes the proof.

As promised in the beginning of this section, we can apply the tools we’ve developed to show that theories closed under (·)A are closed under addition of other rules as well.

Theorem 5.13. If T is closed under (·)A, then T is closed under independence of premiss, that is, the following rule:

T `i ¬ψ → ∃x.χ ⇒ T `i ∃x.(¬ψ → χ),

where x is not free in ψ.

Proof. By Theorem 5.7, we have in particular

A `i ϕ → (¬A → ϕ) for all ϕ, A such that ϕA is defined. ¬¬ψ ¬¬ψ Suppose T `i ¬ψ → ∃x.χ. Then, by Theorem 5.8, we have T `m (¬ψ) → ∃x.χ . With ϕ ≡ χ ¬¬ψ ¬¬ψ above and Lemma 1.1, we find that `i χ → (¬ψ → χ), so that T `i (¬ψ) → ∃x.(¬ψ → χ). ¬¬ψ ¬¬ψ ¬¬ψ Now, (¬ψ) ≡ ψ → ¬¬ψ and we have `i ψ → (¬ψ → ψ) (with ϕ ≡ ψ above). By a simple ¬¬ψ ¬¬ψ proof, `i (¬ψ → ψ) → ¬¬ψ, and hence `i ψ → ¬¬ψ, that is `i (¬ψ) .

This gives T `i ∃x.(¬ψ → χ).

6. Extracting algorithms

As mentioned in the introduction, the BHK interpretation tells us that constructive proofs encode algorithms or methods, but the actual nature of these methods was left undefined. Brouwer, who originally formulated the idea and was more philosophically motivated, never actually formalized this. After all, his philosophy stated that formalization is secondary to the practice of doing mathematics. Nevertheless, algorithmic interpretations did eventually show up after the advent of computability theory. The first to formalize the idea was Kleene in [12], where he introduced a semantics for Heyting arithmetic called the realizability interpretation, in which these algorithms were expressed as Gödel codes for partial recursive functions. It is based on the observation that constructive proofs can contain information not visible in the conclusion of the proof, namely that of the witnesses of proofs of ∨- and ∃-formulas. The interpretation maps HA-formulas to new HA-formulas where this information is made explicit, by encoding the "methods" from the BHK interpretation as numbers inside HA. Not surprisingly, Gödel codes turn out to be somewhat unwieldy as descriptions of actual computer programs. If we want to extract the computational content from our proofs, we need a different programming language. Here we will use a modified version of the realizability interpretation that interprets the methods from the BHK interpretation as programs in a simply typed λ-calculus. The program corresponding to a proof of a formula ϕ will be called a realizer of ϕ. In essence, realizers of →- and ∀-formulas will be functions that take terms and compute realizers for their antecedents. From the realizers of ∃-formulas, functions or numbers with certain properties can be extracted. Realizers for ∨-formulas will give an indication of which of the two cases is true.
Modified realizability was first introduced by Kreisel, although this section follows the work of Schwichtenberg [18], and uses a slightly altered version of modified realizability.

6.1. Gödel's system T

The extracted programs will be written in Gödel's system T, based on ordinary typed λ-calculus, but equipped with an additional combinator rσ that allows us to define terms of type σ via recursion, and a ground type 0, representing the natural numbers. Every term in the language will have a type, defined as follows:

Definition. The set of finite types T is defined inductively as follows:

• 0 ∈ T

• If σ, τ ∈ T then σ × τ, σ → τ ∈ T.

We indicate that a term M has type σ by writing M : σ. Terms N : σ × τ should be understood as pairs of terms of type σ and τ, while those of type σ → τ as functions that compute terms of type τ when given terms of type σ. The language contains the following constants: • 0 : 0 • S : 0 → 0

• pσ,τ : σ → (τ → (σ × τ))

• p0σ,τ :(σ × τ) → σ

• p1σ,τ :(σ × τ) → τ

• rσ : σ → ((σ → (0 → σ)) → (0 → σ))

If M : σ → τ and N : σ, then we say that M is applicable to N. A term is either a constant, a variable, a λ-abstraction λxσ. t(xσ) where x is a variable and t is a term, or t1 t2 where t1, t2 are terms and t1 is applicable to t2. In order to save on parentheses, we adopt the convention that application of terms is left-associative, that is, M N K ≡ (M N) K for all terms M, N, K. To give an operational semantics of this language, we define the following transition relation:

p0 (p M N) ▷β M
p1 (p M N) ▷β N
r M N 0 ▷β M
r M N (S K) ▷β N (r M N K) K

and if M ▷β M′ then M N ▷β M′ N and N M ▷β N M′.

Remark. It can be shown that this transition relation is strongly normalizing, that is, that every sequence of transition steps terminates, and that they all reduce to a unique term.

Notation. We write t =β s if t and s normalize to the same term.

λ-abstraction lets us define higher order functionals, between any types σ and τ. Furthermore, we can use the recursive combinator r to construct not only numbers (that is, terms of type 0), but higher order functions.

Example. The Ackermann function A : N² → N is defined as follows:

A(0, n) = S(n)
A(S(m), 0) = A(m, 1)
A(S(m), S(n)) = A(m, A(S(m), n))

This function is famously not primitive recursive. Nonetheless, it can be defined as a term a0×0→0 in T, which shows that the system contains more functions than the primitive recursive ones:

a = λv0×0. r0→0 S (λf0→0.λm0.λn0. r0 (f (S 0)) (λk0.λj0. f k) n) (p0 v) (p1 v)

We can also define booleans in T : Definition. If b : 0, x : σ and y : σ, we define

true ≡ S 0 false ≡ 0

if b then x else y ≡ rσ y (λn.λm. x) b

Hence if true then x else y =β x and if false then x else y =β y.
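The recursor's β-rules, the Ackermann term of the example, and the boolean conditional can all be mimicked in an ordinary language, with Python closures standing in for T-terms (a sketch; the names are ours):

```python
def rec(base, step, n):
    # r M N 0 = M ;  r M N (S k) = N (r M N k) k   (step is curried)
    acc = base
    for k in range(n):
        acc = step(acc)(k)
    return acc

def ack(m, n):
    # Mirrors a = lam v. r S (lam f. lam m. lam n. r (f (S 0)) (lam k. lam j. f k) n) ...
    succ = lambda x: x + 1
    step = lambda f: lambda _m: (lambda x: rec(f(1), lambda k: lambda j: f(k), x))
    return rec(succ, step, m)(n)

def if_then_else(b, x, y):
    # if b then x else y = r y (lam n. lam m. x) b, with true = 1, false = 0
    return rec(y, lambda n: lambda m: x, b)
```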

6.2. Interpreting atomic formulas and optimizing extracted programs

We will soon define the realizability interpretation, but first we make some observations. What would be the realizer of closed atomic formulas? By the reasoning outlined in the introduction to this section, a proof of an atomic formula contains no extra hidden information to be added to the formula. Consequently, an atomic formula could be realized by any arbitrary term. Another way to look at it is that proofs of atomic formulas contain no computational content, since the truth of such a formula can be verified by a mechanical process, which gives us no interesting information. By extension, implications whose consequent is atomic cannot possibly be used to give us any computational content. The same is true for conjunctions and universal quantifications. Generally, we define the Harrop formulas as the class of formulas generated by

H := ⊥ | P | H1 ∧ H2 | ϕ → H1 | ∀x.H1

where H1, H2 are given by H, and ϕ is an arbitrary formula. We observe that no useful information can be extracted from the proof of a Harrop formula. In order to avoid having the programs we extract waste time on computing these useless terms, the extraction will be defined in such a way as to remove the subterms resulting from proofs of Harrop formulas. This also results in somewhat more understandable programs.

6.3. Extraction of program terms

Next we explain how to extract programs from proofs in IQC as terms in T. We will augment the type system given in Section 6.1 with an additional type, ∗, which has a single element u : ∗. Call this new system T∗. Any proof of a Harrop formula will simply give the trivial program u. For every sort in the many-sorted language, we define a type T(ρ) which represents objects of sort ρ. In the case of HA, which contains only a single sort 0, we set T(0) = 0. We extend T to a mapping from formulas to T∗:

T(⊥) = ∗
T(P) = ∗

T(ϕ ∧ ψ) = T(ϕ) if T(ψ) = ∗
         = T(ψ) if T(ϕ) = ∗
         = T(ϕ) × T(ψ) otherwise

T(ϕ ∨ ψ) = T(0) if T(ϕ) = ∗ = T(ψ)
         = T(0) × T(ϕ) if T(ϕ) ≠ ∗ = T(ψ)
         = T(0) × T(ψ) if T(ϕ) = ∗ ≠ T(ψ)
         = T(0) × (T(ϕ) × T(ψ)) otherwise

T(ϕ → ψ) = T(ψ) if T(ϕ) = ∗
         = ∗ if T(ψ) = ∗
         = T(ϕ) → T(ψ) otherwise

T(∀xσ.ϕ) = ∗ if T(ϕ) = ∗
         = T(σ) → T(ϕ) otherwise

T(∃xσ.ϕ) = T(σ) if T(ϕ) = ∗
         = T(σ) × T(ϕ) otherwise

Remark. If ϕ is ∃–∨-free, then T(ϕ) = ∗.
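The type assignment T can be computed mechanically from the table above; a sketch over a hypothetical formula AST, writing '*' for the unit type and building compound types as nested tuples (for the sketch we conflate a sort with its type T(σ)):

```python
STAR = '*'

def T(phi):
    tag = phi[0]
    if tag in ('bot', 'atom'):
        return STAR                               # Harrop base cases
    if tag == 'and':
        a, b = T(phi[1]), T(phi[2])
        if b == STAR: return a
        if a == STAR: return b
        return ('prod', a, b)
    if tag == 'or':
        a, b = T(phi[1]), T(phi[2])
        if a == STAR and b == STAR: return '0'    # just the 0/1 tag
        if b == STAR: return ('prod', '0', a)
        if a == STAR: return ('prod', '0', b)
        return ('prod', '0', ('prod', a, b))
    if tag == 'imp':
        a, b = T(phi[1]), T(phi[2])
        if a == STAR: return b
        if b == STAR: return STAR
        return ('fun', a, b)
    if tag == 'forall':                           # ('forall', sort, body)
        b = T(phi[2])
        return STAR if b == STAR else ('fun', phi[1], b)
    if tag == 'exists':
        b = T(phi[2])
        return phi[1] if b == STAR else ('prod', phi[1], b)
    raise ValueError(tag)

# phi(x) = (x = 0) or (exists y. S(y) = x), as in the example below.
phi = ('or', ('atom', 'x = 0'), ('exists', '0', ('atom', 'S(y) = x')))
```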

In order to realize an ex falso proof, we need a way to construct an arbitrary term in any given type, as follows:

Definition. For every type τ ∈ T∗ we define a term cτ as follows:

c0 = 0
cτ→τ′ = λxτ.cτ′
cτ×τ′ = p cτ cτ′
c∗ = u.

We will now define a function ⟦·⟧ that takes proofs and produces terms in T∗. Since passing around and drawing trees inside mathematical formulas is tedious, we introduce a shorthand for denoting proof trees in a linear fashion, using proof terms. Each tree will have a corresponding term, and vice versa. We annotate the usual proof trees from IQC with their corresponding terms: If D is a tree D′ ending in the conclusion M : ϕ, then the proof term of D is M. Then, for example, a rule of the form

ϕ ψ will be written as

M : ϕ
N(M) : ψ

which should be read as "if M is the proof term corresponding to a proof of ϕ, then the term N(M) is a proof of ψ". For assumptions (axioms), we write u : ϕ to indicate that u is a proof variable (proof constant) corresponding to the assumption (axiom) ϕ. Unless stated otherwise, letters u, v, . . . indicate proof variables or constants. Note that a proof term will indicate exactly which rule was used to derive the proof, and furthermore it will contain every derivation leading up to it as subterms. By examining a term, we can then reconstruct the proof tree.

M ⊥ : ⊥ M ϕ : ϕ N ψ : ψ ⊥i ∧I !(M ⊥)ϕ : ϕ hM ϕ,N ψiϕ∧ψ : ϕ ∧ ψ

M ϕ∧ψ : ϕ ∧ ψ M ϕ∧ψ : ϕ ∧ ψ ∧E ∧E ϕ∧ψ ϕ 1 ϕ∧ψ ψ 2 π0(M ) : ϕ π1(M ) : ψ

M ϕ : ϕ M ψ : ψ ∨I1 ∨I2 inl(M ϕ)ϕ∨ψ : ϕ ∨ ψ inr(M ψ)ϕ∨ψ : ϕ ∨ ψ

[uϕ : ϕ]h1 [vψ : ψ]h2 . . . . ϕ∨ψ χ χ M : ϕ ∨ ψ s1(u) : χ s2(v) : χ χ ∨E, h1, h2 Du,v(M, s1, s2) : χ

[uϕ : ϕ]h . . M(u)ψ : ψ M ϕ→ψ : ϕ → ψ N ϕ : ϕ →I, h →E (λuϕ.M ψ)ϕ→ψ : ϕ → ψ (M ϕ→ψ N ϕ)ψ : ψ

σ σ M ϕ(x ) : ϕ(xσ) M ∀x .ϕ : ∀xσ.ϕ σ σ ∀I σ σ σ ∀E (λxσ.M ϕ(x ))∀x .ϕ : ∀xσ.ϕ (M ∀x .ϕ tσ)ϕ[t /x ] : ϕ[tσ/xσ]

[uϕ(y) : ϕ[y/x]]h . . σ σ σ M ϕ[t /x ] : ϕ[tσ/xσ] M ∃x .ϕ : ∃xσ.ϕ N(u, y)ψ : ψ ∃E, h σ ϕ[tσ/xσ ] σ ∃I ∃xσ.ϕ ψ ht ,M i : ∃x .ϕ Eu,y(M ,N ): ψ

For every proof term M : ϕ we define a T∗-term ⟦M⟧ : T(ϕ). If T(ϕ) = ∗, we set ⟦Mϕ⟧ = u. Otherwise, we define it as follows²:

⟦uϕ⟧ = xu : T(ϕ), where xu is a fixed variable corresponding to u, if u is a proof variable, and xu is a fixed term corresponding to u, if u is a proof constant

⟦!(M⊥)ϕ⟧ = cT(ϕ)

⟦⟨Mϕ, Nψ⟩⟧ = ⟦M⟧ if T(ψ) = ∗
            = ⟦N⟧ if T(ϕ) = ∗
            = pT(ϕ),T(ψ) ⟦M⟧ ⟦N⟧ otherwise

⟦π0(Mϕ∧ψ)⟧ = u if T(ϕ) = ∗
            = ⟦M⟧ if T(ψ) = ∗
            = p0T(ϕ),T(ψ) ⟦M⟧ otherwise

⟦π1(Mϕ∧ψ)⟧ = ⟦M⟧ if T(ϕ) = ∗
            = u if T(ψ) = ∗
            = p1T(ϕ),T(ψ) ⟦M⟧ otherwise

⟦inl(Mϕ)ϕ∨ψ⟧ = 0 if T(ϕ) = ∗ = T(ψ)
              = ⟨0, ⟦M⟧⟩ if T(ϕ) ≠ ∗ = T(ψ)
              = ⟨0, cT(ψ)⟩ if T(ϕ) = ∗ ≠ T(ψ)
              = ⟨0, ⟦M⟧, cT(ψ)⟩ otherwise

⟦inr(Mψ)ϕ∨ψ⟧ = 1 if T(ϕ) = ∗ = T(ψ)
              = ⟨1, cT(ϕ)⟩ if T(ϕ) ≠ ∗ = T(ψ)
              = ⟨1, ⟦M⟧⟩ if T(ϕ) = ∗ ≠ T(ψ)
              = ⟨1, cT(ϕ), ⟦M⟧⟩ otherwise

⟦Du,v(Mϕ∨ψ, s1χ, s2χ)⟧ = if ⟦M⟧ then ⟦s2⟧ else ⟦s1⟧ if T(ϕ) = ∗ = T(ψ)
  = if (p0 ⟦M⟧) then ⟦s2⟧ else ⟦s1⟧[p1 ⟦M⟧/xu] if T(ϕ) ≠ ∗ = T(ψ)
  = if (p0 ⟦M⟧) then ⟦s2⟧[p1 ⟦M⟧/xv] else ⟦s1⟧ if T(ϕ) = ∗ ≠ T(ψ)
  = if (p0 ⟦M⟧) then ⟦s2⟧[p1 p1 ⟦M⟧/xv] else ⟦s1⟧[p0 p1 ⟦M⟧/xu] otherwise

⟦λuϕ Mψ⟧ = ⟦M⟧ if T(ϕ) = ∗
         = λxuT(ϕ) ⟦M⟧ otherwise

⟦Mϕ→ψ Nϕ⟧ = ⟦M⟧ if T(ϕ) = ∗
           = ⟦M⟧ ⟦N⟧ otherwise

⟦λxσ Mϕ⟧ = λxσ ⟦M⟧

⟦M∀xσ.ϕ tσ⟧ = ⟦M⟧ t

⟦⟨tσ, Mϕ[tσ/xσ]⟩⟧ = t if T(ϕ) = ∗
                  = p t ⟦M⟧ otherwise

⟦Eu,y(M∃xσ.ϕ, Nψ)⟧ = ⟦N⟧[⟦M⟧, u / y, xu] if T(ϕ) = ∗
                   = ⟦N⟧[p0 ⟦M⟧, p1 ⟦M⟧ / y, xu] otherwise

²Note that [18] restricts itself to the ∃–∨-free fragment of minimal logic. We lift this restriction by also defining ⟦·⟧ for the rules corresponding to ⊥, ∨ and ∃.

For program extractions of proofs in a theory we also have to define ⟦uϕ⟧ for every axiom ϕ.

In HA we define for the induction axiom schema IAρ : ρ(0) → ∀x.(ρ(x) → ρ(S(x))) → ∀x.ρ(x)

⟦uIAρ⟧ = rT(ρ) : T(ρ) → (0 → T(ρ) → T(ρ)) → (0 → T(ρ))

and for every other axiom ϕ we have T(ϕ) = ∗, hence ⟦uϕ⟧ = u.

Example. Given ϕ(x) ≡ x = 0 ∨ ∃y.S(y) = x, we have the following proof of ∀x.ϕ(x) in minimal HA:

d2 : x = x

d3 : S(x) = S(x)

hx, d3i : ∃y.S(y) = S(x)

d1 : 0 = 0 inr(hx, d3i): ϕ(S(x)) IAϕ u : IAϕ inl(d1): ϕ(0) λu.inr(hx, d3i): ϕ(x) → ϕ(S(x)) IAϕ u inl(d1):(∀x.ϕ(x) → ϕ(S(x))) → ∀x.ϕ(x) λx.λu.inr(hx, d3i): ∀x.ϕ(x) → ϕ(S(x))

IAϕ u inl(d1)(λx.λu.inr(hx, d3i)) : ∀x.ϕ(x)

We have

T(ϕ(x)) = T(x = 0 ∨ ∃y.S(y) = x) = T(0) × T(∃y.S(y) = x) = T(0) × T(0) = 0 × 0

and hence T(∀x.ϕ(x)) = 0 → 0 × 0.

⟦uIAϕ inl(d1) (λx.λu.inr(⟨x, d3⟩))⟧ = rT(ϕ) ⟦inl(d1)⟧ ⟦λx.λu.inr(⟨x, d3⟩)⟧

= r0×0 ⟦inl(d1)⟧ ⟦λx.λu.inr(⟨x, d3⟩)⟧

= r0×0 ⟨0, cT(∃y.S(y)=x)⟩ ⟦λx.λu.inr(⟨x, d3⟩)⟧

= r0×0 ⟨0, c0⟩ ⟦λx.λu.inr(⟨x, d3⟩)⟧

= r0×0 ⟨0, 0⟩ (λx.λxu.⟦inr(⟨x, d3⟩)⟧)

= r0×0 ⟨0, 0⟩ (λx.λxu.⟨1, ⟦⟨x, d3⟩⟧⟩)

= r0×0 ⟨0, 0⟩ (λx.λxu.⟨1, x⟩)

Let us analyze this term a bit. If x = 0, then r0×0 ⟨0, 0⟩ (λx.λxu.⟨1, x⟩) x = ⟨0, 0⟩, and if x = S y for some y, then it normalizes to ⟨1, y⟩. If we interpret the first coordinate of the tuples as a flag indicating the success of the algorithm, we can treat the term as a definition of the predecessor function

pred 0 = error
pred (S x) = x.

Thus, the proof of the existence of a predecessor for each non-zero number gives us a method for computing it.
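The behaviour of the extracted term can be checked by running it; a sketch of r0×0 ⟨0, 0⟩ (λx.λxu.⟨1, x⟩) with the recursor's β-rules implemented as a Python loop (names are ours):

```python
def rec(base, step, n):
    # r M N 0 = M ;  r M N (S k) = N (r M N k) k
    acc = base
    for k in range(n):
        acc = step(acc)(k)
    return acc

def pred_program(x):
    # first component: success flag, second component: the predecessor (if any)
    return rec((0, 0), lambda prev: lambda k: (1, k), x)
```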

6.4. Formulas as specifications

We have defined, for every proof term M : ϕ, an extracted program ⟦M⟧ in T, which is intended to reflect the construction within M. For example, the program extracted from an existence proof will allow us to compute an explicit witness for the existence. A slightly different perspective is that we start with a formula ϕ representing a specification for a computer program and extract an explicit program that satisfies that specification. If, for example, ϕ is of the form ∀x.∃y.ψ(x, y), we will get a function that takes an x and computes a y in such a way that it satisfies ψ(x, y). In order to formally state and prove these results we will need a theory that deals with programs in T. This theory can be formulated in a couple of different ways, but we follow the presentation given in [24].

Definition. The theory of finite type arithmetic, HAω, is a many-sorted theory defined as follows: The sorts are the types in T. For all sorts σ, τ, ρ ∈ T we have the following constants:

• 0 of sort 0

• S of sort 0 → 0

• pσ,τ of sort σ → (τ → (σ × τ))

• p0σ,τ of sort (σ × τ) → σ

• p1σ,τ of sort (σ × τ) → τ

• kσ,τ of sort σ → (τ → σ)

• sρ,σ,τ of sort (ρ → (σ → τ)) → ((ρ → σ) → (ρ → τ))

• rσ of sort σ → ((σ → (0 → σ)) → (0 → σ))

Furthermore, it contains functions Apσ,τ of arity ((σ → τ), σ, τ) and predicates =σ of arity (σ, σ).

Notation. Whenever the meaning is clear, we will omit writing out Apσ,τ, and drop the subscripts to the above constants and the =σ predicates. We will also assume that the sorts of the arguments to Apσ,τ are “compatible”. That is, instead of writing Apσ,(τ→σ)(kσ,τ^{σ→(τ→σ)}, xσ) we write k x.

The axioms of HAω are the universal closures of the following:

x = x
x = y → y = x
x = y ∧ y = z → x = z
y = z → x y = x z
x = y → x z = y z

p0 (p x y) = x
p1 (p x y) = y
p (p0 z) (p1 z) = z
k x y = x
s x y z = x z (y z)
r x y 0 = x
r x y (S z) = y (r x y z) z
S x = S y → x = y
¬ S x = 0

and the induction schema ρ(0) → ∀x(ρ(x) → ρ(S x)) → ∀y ρ(y).

The theory HAω uses the combinators s and k rather than the more familiar notion of λ-abstraction. Nevertheless, as the following lemma shows, they are sufficient for defining λ-abstraction explicitly:

Lemma 6.1. Given a variable xσ and a term t(x)τ, there is a term λxσ.t(x) of sort σ → τ such that

(i) (λxσ. t1(x)) t2 = t1[t2/x]

(ii) (λxσ. t x) = t for x ∉ FV(t)

(iii) If x ∉ FV(t1) ∪ FV(t2) then t1 = t2 → λxσ. t[t1/y] = λxσ. t[t2/y].

Proof. We define λxσ.t by induction on the structure of t.

(a) If t is a variable or constant, let λxσ. x = s k k and λxσ. t = k t if t ≢ x.

(b) Suppose t = Ap(t1, t2) for some terms t1 : σ1 → σ2 and t2 : σ1.

If t2 ≡ x, let

λxσ.t1 t2 = s (λxσ. t1) (λxσ. t2)   if x ∈ FV(t1)
λxσ.t1 t2 = t1                      if x ∉ FV(t1)

If t2 ≢ x, let

λxσ.t1 t2 = s (λxσ. t1) (λxσ. t2)   if x ∈ FV(t1) ∪ FV(t2)
λxσ.t1 t2 = k (t1 t2)               if x ∉ FV(t1) ∪ FV(t2)

That this satisfies (ii) is clear from the construction. The other two properties follow from a simple induction on the structure of t.
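The clauses of this proof are exactly the classical bracket-abstraction algorithm, and they can be executed. A small Python sketch (representation and names are ours: applications are pairs, `'S'` and `'K'` denote the combinators, and any other string is a variable):

```python
def free_vars(t):
    """Free variables of an applicative term (pairs are applications)."""
    if isinstance(t, tuple):
        return free_vars(t[0]) | free_vars(t[1])
    return set() if t in ('S', 'K') else {t}

def abstract(x, t):
    """lambda* x. t, following the case split in the proof of Lemma 6.1."""
    if t == x:
        return (('S', 'K'), 'K')                      # lambda x. x = s k k
    if not isinstance(t, tuple) or x not in free_vars(t):
        return ('K', t)                               # x not free: k t
    f, a = t
    if a == x and x not in free_vars(f):
        return f                                      # eta: lambda x. f x = f
    return (('S', abstract(x, f)), abstract(x, a))    # s (lambda x.f)(lambda x.a)

def norm(t):
    """Weak combinator reduction: K a b -> a and S f g a -> f a (g a)."""
    if not isinstance(t, tuple):
        return t
    f, a = norm(t[0]), norm(t[1])
    if isinstance(f, tuple) and f[0] == 'K':
        return f[1]
    if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == 'S':
        return norm(((f[0][1], a), (f[1], a)))
    return (f, a)
```

Property (i) can then be checked on examples: `norm((abstract('x', t), s))` equals t with s substituted for x.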

This allows us to translate closed terms from T into terms of HAω, and vice versa (since the combinators can easily be defined via λ-abstraction). From here on, we will do this implicitly, allowing us to treat the terms of the two languages as identical. In particular, the program term ⟦M⟧ extracted from a proof M without open assumptions is closed, and hence has a corresponding term in HAω. In a similar vein, we define T(σ) = σ for every sort σ. The following definition introduces the tool that will bridge the gap between logical formulas and the programs we extract from their proofs. Informally, we can think of the formula t mr ϕ as saying that the program term t meets the specification ϕ.

Definition (Modified realizability). For terms t in T∗ and formulas ϕ in the language of HAω we define a formula t mr ϕ in L(HAω) as follows:

u mr ⊥ = ⊥
u mr P = P

t mr (ϕ ∧ ψ) =
  u mr ϕ ∧ t mr ψ          if T(ϕ) = ∗
  t mr ϕ ∧ u mr ψ          if T(ψ) = ∗
  p0 t mr ϕ ∧ p1 t mr ψ    otherwise

t mr (ϕ ∨ ψ) =
  (t = 0 → u mr ϕ) ∧ (t ≠ 0 → u mr ψ)                     if T(ϕ) = ∗ = T(ψ)
  (p0 t = 0 → p1 t mr ϕ) ∧ (p0 t ≠ 0 → u mr ψ)            if T(ϕ) ≠ ∗ = T(ψ)
  (p0 t = 0 → u mr ϕ) ∧ (p0 t ≠ 0 → p1 t mr ψ)            if T(ϕ) = ∗ ≠ T(ψ)
  (p0 t = 0 → p0 p1 t mr ϕ) ∧ (p0 t ≠ 0 → p1 p1 t mr ψ)   if T(ϕ) ≠ ∗ ≠ T(ψ)

t mr (ϕ → ψ) =
  u mr ϕ → t mr ψ                 if T(ϕ) = ∗
  ∀x^{T(ϕ)}(x mr ϕ → u mr ψ)      if T(ϕ) ≠ ∗ and T(ψ) = ∗
  ∀x^{T(ϕ)}(x mr ϕ → t x mr ψ)    otherwise

t mr (∀xσ.ϕ) =
  ∀xσ(u mr ϕ)      if T(ϕ) = ∗
  ∀xσ(t x mr ϕ)    otherwise

t mr (∃xσ.ϕ) =
  u mr ϕ[t/x]            if T(ϕ) = ∗
  (p1 t) mr ϕ[p0 t/x]    otherwise
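The case distinctions in the definition are driven by the sort T(ϕ) of potential realizers of ϕ. The following Python sketch is our reconstruction of that sort computation, read off from the mr clauses (formulas are tagged tuples; `'*'` marks formulas without computational content, and the representation is an assumption of ours):

```python
def realizer_sort(phi):
    """Sort of potential realizers of a formula; '*' means no content."""
    tag = phi[0]
    if tag in ('atom', 'bot'):
        return '*'
    if tag == 'and':
        a, b = realizer_sort(phi[1]), realizer_sort(phi[2])
        if a == '*':
            return b
        if b == '*':
            return a
        return ('prod', a, b)
    if tag == 'or':
        a, b = realizer_sort(phi[1]), realizer_sort(phi[2])
        if a == '*' and b == '*':
            return '0'                        # just the flag
        if a == '*':
            return ('prod', '0', b)
        if b == '*':
            return ('prod', '0', a)
        return ('prod', '0', ('prod', a, b))
    if tag == 'imp':
        a, b = realizer_sort(phi[1]), realizer_sort(phi[2])
        if b == '*':
            return '*'
        if a == '*':
            return b
        return ('fun', a, b)
    if tag == 'all':
        b = realizer_sort(phi[2])
        return '*' if b == '*' else ('fun', phi[1], b)
    if tag == 'ex':
        b = realizer_sort(phi[2])
        return phi[1] if b == '*' else ('prod', phi[1], b)
    raise ValueError(tag)
```

On ϕ(x) ≡ x = 0 ∨ ∃y.S(y) = x from Section 6.3 this yields 0 × 0, and 0 → 0 × 0 for ∀x.ϕ(x), as computed there.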

Lemma 6.2. If ϕ is ∃–∨-free, then u mr ϕ ≡ ϕ.

Proof. We have T (ϕ) = ∗. The proposition follows directly by induction on the structure of ϕ.

Theorem 6.3 (Soundness). If Γ ⊢i M : ϕ, then there is a proof term µ(M) such that

Γ′ ⊢i µ(M) : ⟦M⟧ mr ϕ,

where Γ′ = {t(u) mr χ | (uχ : χ) ∈ Γ} and t(uχ) = xu if T(χ) ≠ ∗ and t(uχ) = u otherwise.

Proof. See Appendix B.

It now follows that a proof of a formula ∀xσ.∃yτ.ϕ(x, y) will give a function f satisfying the specification ϕ(x, f(x)):

Corollary 6.4. If Γ and ϕ are ∃–∨-free and

Γ ⊢i M : ∀xσ.∃yτ.ϕ,

then

Γ ⊢i ∀xσ.ϕ[⟦M⟧ x/y].

Proof. We have

⟦M⟧ mr ∀xσ.∃yτ.ϕ ≡ ∀xσ.(⟦M⟧ x mr ∃yτ.ϕ)
≡ ∀xσ.(u mr ϕ[⟦M⟧ x/y])
≡ ∀xσ.ϕ[⟦M⟧ x/y]

since T(ϕ) = ∗ and thus u mr ϕ ≡ ϕ. By Theorem 6.3, we have

Γ ⊢i µ(M) : ∀xσ.ϕ[⟦M⟧ x/y].

6.5. Applications to classical proofs

The previous steps have outlined an algorithm for extracting computational content from certain classically provable sentences:

(i) Given a classical proof of ϕ formalized in HA,

(ii) apply a negative translation (·)^N (such as (·)^g or ((·)^g)^A for a suitable A) so that

⊢i ϕ^N,

(iii) if ϕ belongs to a class of formulas such that

⊢i ϕ^N ↔ ϕ,

then we have found an intuitionistic proof of ϕ, from which we extract a program in T using ⟦·⟧.

Actually formalizing proofs of interesting number-theoretic theorems sometimes turns out to be a tedious task, well beyond the scope of this thesis. We will however mention an example given in [18]: The article formalizes the following proof of the fact that the greatest common divisor of any two numbers can be written as a linear combination of the numbers. The above extraction method is then applied to yield a program for computing the coefficients³. We refer to the article for the full details.
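The translation step of this pipeline is purely mechanical. A Python sketch of one standard formulation of the Gödel–Gentzen clauses (the formula representation and function names are ours; the thesis's own (·)^g is defined in Section 2 and may differ in inessential details):

```python
def neg(phi):
    """Negation, defined as phi -> bot."""
    return ('imp', phi, ('bot',))

def g(phi):
    """Goedel-Gentzen negative translation: double-negate atoms and
    replace 'or' and 'exists' by their de Morgan duals."""
    tag = phi[0]
    if tag == 'bot':
        return phi
    if tag == 'atom':
        return neg(neg(phi))
    if tag in ('and', 'imp'):
        return (tag, g(phi[1]), g(phi[2]))
    if tag == 'or':
        return neg(('and', neg(g(phi[1])), neg(g(phi[2]))))
    if tag == 'all':
        return ('all', phi[1], g(phi[2]))
    if tag == 'ex':
        return neg(('all', phi[1], neg(g(phi[2]))))
    raise ValueError(tag)
```

The translated formula contains no ∨ or ∃, which is exactly what makes the conservativity arguments of Section 3 applicable.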

Theorem 6.5. For all a1, a2 ∈ N such that 0 < a2, there are k1 and k2 such that |k1a1 − k2a2| divides a1 and a2 and 0 < |k1a1 − k2a2|.

3The authors also make a small modification to the A-translation in order to weed out some ineffective steps in the extracted program, which we will not mention further here.

Proof. Consider the ideal (a1, a2) generated by a1 and a2. Since a2 > 0, the ideal has a least positive element c = |k1a1 − k2a2|, where k1, k2 ∈ N. Suppose c is not a common divisor of a1 and a2. Then

ai = cq + r where q, r ∈ N, 0 < r < c, for some i ∈ {1, 2}. But then r ∈ (a1, a2) and r < c, which contradicts the assumption that c was the smallest positive element.

Suppose d is a common divisor of a1 and a2. Then d divides |k1a1 − k2a2| = c, so in particular d ≤ c.

Note that the theorem is a Π2-formula and that the proof uses classical reasoning when it assumes that c is not a common divisor of a1 and a2 and derives a contradiction. Furthermore, it invokes the least number principle to find c, which, as we noted in Section 1, is not in general intuitionistically valid. Using the above extraction method, the authors of the article extracted the following algorithm, which computes the witnesses k1 and k2:

gcdaux(a1, a2) = r_{0×0→0×0} (λk. p 0 0)
  (λn0. λh^{0×0→0×0}. λk^{0×0}.
    if 0 < r2(k) then h (p (k1 · q2(k)) (f(a2, a1, k2, k1, q2(k))))
    else if 0 < r1(k) then h (p (f(a1, a2, k1, k2, q1(k))) (k2 · q1(k)))
    else k)
  (a2 + 1) (p 0 1)

where k1 = p0 k, k2 = p1 k,

f(b1, b2, l1, l2, q) = q · l1 − 1   if l2 · b2 < l1 · b1 and 0 < q
f(b1, b2, l1, l2, q) = q · l1 + 1   otherwise

and qi(k) and ri(k) are, respectively, the quotient and remainder of ai after division by µ(k) = |k1a1 − k2a2|.

More informally this program can be written as gcdaux(a1, a2) = h(a2 + 1, ⟨0, 1⟩) where

h(0, k) = undefined

h(n + 1, ⟨k1, k2⟩) =
  if 0 < r2(⟨k1, k2⟩) then h(n, ⟨k1 · q2(k1, k2), f(a2, a1, k2, k1, q2(k1, k2))⟩)
  else if 0 < r1(⟨k1, k2⟩) then h(n, ⟨f(a1, a2, k1, k2, q1(k1, k2)), k2 · q1(k1, k2)⟩)
  else ⟨k1, k2⟩

We can then compute the greatest common divisor as

gcd(a1, a2) = µ(gcdaux(a1, a2)).
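The informal program runs as written. A direct Python transcription (our naming; the `undefined` clause h(0, k) becomes an exception, and the projections p0/p1 become tuple components):

```python
def gcd_aux(a1, a2):
    """Fuel-bounded iteration from the extracted program; returns (k1, k2)."""
    def f(b1, b2, l1, l2, q):
        return q * l1 - 1 if l2 * b2 < l1 * b1 and 0 < q else q * l1 + 1

    k1, k2 = 0, 1                    # initial pair <0, 1>
    for _ in range(a2 + 1):          # fuel a2 + 1
        m = abs(k1 * a1 - k2 * a2)   # mu(k) = |k1*a1 - k2*a2|
        q1, r1 = divmod(a1, m)
        q2, r2 = divmod(a2, m)
        if 0 < r2:
            k1, k2 = k1 * q2, f(a2, a1, k2, k1, q2)
        elif 0 < r1:
            k1, k2 = f(a1, a2, k1, k2, q1), k2 * q1
        else:
            return k1, k2
    raise ValueError("out of fuel")  # h(0, k) = undefined

def gcd(a1, a2):
    """gcd(a1, a2) = mu(gcd_aux(a1, a2)), assuming 0 < a2."""
    k1, k2 = gcd_aux(a1, a2)
    return abs(k1 * a1 - k2 * a2)
```

For a1 = 35, a2 = 21 the returned coefficients are (1, 2), with |1·35 − 2·21| = 7, as Theorem 6.5 promises.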

7. Finite type arithmetic and the axiom of choice

In the previous section we introduced the theory HAω as a means for formulating modified realizability and proving the extraction theorem, but it turns out to be suitable for formalizing not only analysis, but many other branches of modern mathematics. The idea is as follows: Since it contains functionals of sort 0 → 0, we ought to be able to quantify over (characteristic functions of) subsets of natural numbers. Using higher sorts, this process could then be iterated as needed to define iterated “power sets” of 0. In order to define the characteristic function of any property in higher-order arithmetic, it is not sufficient to merely be able to quantify over objects: In order to prove the existence of an actual functional corresponding to the characteristic function, we also need a variant of the axiom of choice:

∀xσ.∃yτ.ρ(x, y) → ∃f^{σ→τ}.∀xσ.ρ(x, f x)    (ACσ,τ)

HAω makes no claim about the nature of functionals. There are models where they behave more like computer programs, that is, where every functional is a code for some program. In these models we consider programs to be equal only if their code is identical. Other models treat functionals more like the mathematical functions we are used to: Functions are equal if they take the same values on the same arguments. We add this as an axiom, for each σ, τ:

∀f^{σ→τ}.∀g^{σ→τ}.((∀xσ.f x =τ g x) → f =σ→τ g)    (Extσ,τ)

Notation. Let Ext refer to the collection Extσ,τ for all sorts σ, τ and similarly for AC. We write E-HAω for HAω + Ext and E-PAω for PAω + Ext.

The above two axioms actually let us prove a comprehension principle for objects of higher types (similar to the comprehension axiom from set theory). As a result, E-HAω + AC is sufficient for expressing much the same things as set theory. Goodman and Myhill have conjectured (see [9]) that all of Bishop’s constructive analysis (see [4]) can be formalized within the system.

7.1. A model of E-HAω

We can define a model for finite type arithmetic within recursion theory, where every functional is interpreted as the index of a recursive function, as follows:

Definition. The model of hereditarily effective operations, HEO is given by a set HEOσ for every sort σ, along with interpretations of the relation and function symbols as follows:

HEO0 = N
x =0 y ≡ x = y
HEOσ→τ = {x ∈ N | ∀y, z. y =σ z → {x}(y) =τ {x}(z)}
x =σ→τ y ≡ x ∈ HEOσ→τ ∧ y ∈ HEOσ→τ ∧ ∀z ∈ HEOσ.({x}(z) =τ {y}(z))
HEOσ×τ = {x ∈ N | j1(x) ∈ HEOσ ∧ j2(x) ∈ HEOτ}
x =σ×τ y ≡ j1(x) =σ j1(y) ∧ j2(x) =τ j2(y)

(where j1 and j2 are the projection functions N → N induced by the standard Cantor pairing function j : N² → N). The function symbol Apσ,τ(s, t) is interpreted as {s^HEO}(t^HEO), and the constant symbols are interpreted as follows:

0^HEO = 0
S^HEO = Λx.x + 1
k^HEO = Λx.Λy.x
s^HEO = Λx.Λy.Λz.{{x}(z)}({y}(z))
p^HEO = Λx.Λy.j(x, y)
p0^HEO = Λx.j1(x)
p1^HEO = Λx.j2(x)

and r^HEO = r, where r is a number such that

{r}(x, y, 0) = x
{r}(x, y, S z) = {y}({r}(x, y, z), z).

(Such an r can be found using the well-known Recursion Theorem from recursion theory.)

Remark. The restriction of the universes of functional types to functions treating extensionally equal arguments as identical is necessary for the axiom

y = z → xy = xz.

The reader can verify that this model satisfies the axioms of E-HAω.

Remark. The universes and equality predicates of HEO are HA-definable, so every HAω-formula ϕ can in fact be interpreted as an HA-formula ϕ^HEO by the above construction.
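The pairing machinery underlying HEO is easy to make concrete. A Python sketch of one standard version of the Cantor pairing function and its projections (the names j, j1, j2 follow the definition above; the particular formula is our choice, since any pairing bijection would do):

```python
import math

def j(x, y):
    """Cantor pairing bijection N x N -> N, enumerating by diagonals."""
    return (x + y) * (x + y + 1) // 2 + y

def j_inv(n):
    """Inverse of j, giving both projections at once."""
    w = (math.isqrt(8 * n + 1) - 1) // 2   # index of the diagonal x + y = w
    y = n - w * (w + 1) // 2
    return w - y, y

def j1(n):
    return j_inv(n)[0]

def j2(n):
    return j_inv(n)[1]
```

With such a j, a single natural number codes a pair, which is what lets HEOσ×τ live inside N.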

7.2. The axiom of choice and constructive mathematics

Theorem 7.1. E-HAω is conservative over HA.

Proof. We can show by induction on proofs in E-HAω that

E-HAω ⊢i ϕ ⇒ HA ⊢i ϕ^HEO.

Now we show that

HA ⊢i ϕ ↔ ϕ^HEO

for all ϕ ∈ L(HA). First, for any primitive recursive function symbol f in HA, we have

HA ⊢i f(x̄)^HEO = tf(x̄),

where tf is the HA-term defining f, given by Theorem 4.2. From this it follows by induction over terms that

HA ⊢i t(x̄)^HEO = t(x̄)

for all HA-terms t. Now, by induction over formulas ϕ in L(HA), we find that

HA ⊢i ϕ ↔ ϕ^HEO,

which concludes the proof.

(For details on this and further results about HEO, see [24].) There is a theorem due to Goodman that states the following:

Theorem. E-HAω + AC is conservative over HA.

However, the proof requires some machinery beyond the scope of the current text. For our intents and purposes, it is sufficient to prove a special case:

Theorem. E-HAω + AC is conservative over HA with respect to Π1-formulas.

Unfortunately, a proof of this would also require some extra tools. We instead prove conservativity for HAω + AC and show that the same approach is not possible when adding Ext. For a proof of the full theorem, the reader is referred to [2].

Theorem 7.2. HAω + AC is conservative over HA with respect to Π1-formulas.

Proof. Suppose HAω + AC ⊢i ϕ for an HA-formula ϕ of the form ∀x0.ψ(x),

where ψ is ∆0. Then HAω + AC ⊢i ∀x0.tψ(x) = 0, where tψ(x) is the HA-term corresponding to ψ given by Theorem 4.2. By Theorem 6.3, we have

HAω + Γ ⊢i f mr ∀x0.tψ(x) = 0

for some term f, where

Γ = {xuσ,τ mr ACσ,τ | uσ,τ is the proof term for ACσ,τ and σ, τ arbitrary}.

But

f mr ∀x0.tψ(x) = 0 ≡ ∀x0.(u mr tψ(x) = 0)
≡ ∀x0.tψ(x) = 0,

since T(tψ(x) = 0) = ∗, hence

HAω + Γ ⊢i ϕ.

Now we note that ACσ,τ is realizable by the term

tσ,τ ≡ λz^{σ→(τ×T(ϕ))}.⟨λxσ. p0 (z x), λxσ. p1 (z x)⟩ if T(ϕ) ≠ ∗, and

tσ,τ ≡ λz^{σ→τ}.λxσ. z x if T(ϕ) = ∗,

that is,

HAω ⊢i tσ,τ mr ACσ,τ for all σ, τ.

Then, since every xuσ,τ was an arbitrary realizer for ACσ,τ, we have

HAω ⊢i ϕ.

Finally, since HAω is conservative over HA,

HA `i ϕ.

The problem with mirroring this proof for E-HAω + AC is that we would also need to prove that HAω ⊢i xuExt mr Ext, that is, that Ext is realizable. However,

Ext ≡ ∀f^{σ→τ}.∀g^{σ→τ}.(∀xσ.f x =τ g x) → f =σ→τ g

is ∃–∨-free, so T(Ext) = ∗ and xuExt mr Ext ≡ Ext, which is problematic, since Ext is independent of HAω: Instead of the extensional model mentioned in Section 7.1, we can construct an intensional model where every term is interpreted as an algorithm. Since different algorithms can compute the same function, this model does not satisfy Ext. As a result, deriving HAω ⊢ xuExt mr Ext is not possible. More on intensional models of HAω can be found in [24].
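The failure of Ext under an intensional reading can be made concrete: the two Python functions below compute the same function on every argument, yet their compiled code differs, so a model that identifies functionals with their code will not identify them. (This is a toy illustration of the phenomenon, not the model construction itself.)

```python
double_by_add = lambda n: n + n   # one algorithm
double_by_mul = lambda n: 2 * n   # a different algorithm for the same function

# Extensionally equal on any test range ...
extensionally_equal = all(double_by_add(n) == double_by_mul(n)
                          for n in range(10_000))

# ... but intensionally distinct: the compiled code objects differ
# (different bytecode and different constant tables).
intensionally_equal = (double_by_add.__code__.co_code ==
                       double_by_mul.__code__.co_code and
                       double_by_add.__code__.co_consts ==
                       double_by_mul.__code__.co_consts)
```

In an intensional model, equality of functionals is equality of code, so these two "functionals" are distinct even though ∀n.(f n = g n) holds.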

7.3. Translations of HAω

We now return to the subject of negative translations. As we saw in previous sections, many of the conservativity results for the arithmetic HA rely on the fact that HA is M-closed under g. This turns out to be the case for HAω and E-HAω as well.

Theorem 7.3. HAω is M-closed under g.

Proof. From Theorem 3.3 since all axioms of HAω are spreading.

Theorem 7.4. HAω ⊢i ∀x0, y0.¬¬(x =0 y) → x =0 y.

Proof. By induction in HAω.

Definition. For any sort σ, we define a binary predicate =e_σ as follows:

x =e_0 y ≡ x =0 y
x =e_{σ×τ} y ≡ p0 x =e_σ p0 y ∧ p1 x =e_τ p1 y
x =e_{σ→τ} y ≡ ∀zσ.(x z =e_τ y z)

Lemma 7.5. E-HAω ⊢m ∀xσ, yσ.(x =σ y ↔ x =e_σ y).

Proof. By induction on σ: If σ = 0, then x =e_0 y ≡ x =0 y, so we are done. Suppose

E-HAω ⊢m ∀xτ, yτ.(x =τ y ↔ x =e_τ y)

and

E-HAω ⊢m ∀xρ, yρ.(x =ρ y ↔ x =e_ρ y).

If σ = τ × ρ, then

x =e_{τ×ρ} y ≡ p0 x =e_τ p0 y ∧ p1 x =e_ρ p1 y,

which is equivalent to p0 x =τ p0 y ∧ p1 x =ρ p1 y by the induction hypothesis. We see immediately from the axioms of HAω that

HAω ⊢m x =τ×ρ y ↔ p0 x =τ p0 y ∧ p1 x =ρ p1 y.

If σ = τ → ρ, then

x =e_{τ→ρ} y ≡ ∀zτ.(x z =e_ρ y z),

which is equivalent to ∀zτ.(x z =ρ y z) by the induction hypothesis. The implication

(∀zτ.x z =ρ y z) → x =τ→ρ y

is given by Ext, while the converse follows from the axioms of HAω.

This equivalence between =e_σ and the normal equality predicate can be found in [24], where it is discussed in further detail.

Lemma 7.6. For any sort σ,

E-HAω ⊢i ¬¬(x =σ y) → x =σ y.

Proof. By Lemma 7.5, we can instead consider the formula ¬¬(x =e_σ y) → x =e_σ y. We proceed by induction on σ: The base case σ = 0 is given by Theorem 7.4. If σ = τ × ρ, we have

¬¬(x =e_σ y) ≡ ¬¬(p0 x =e_τ p0 y ∧ p1 x =e_ρ p1 y),

but

¬¬(p0 x =e_τ p0 y ∧ p1 x =e_ρ p1 y) ⊢m ¬¬(p0 x =e_τ p0 y) ∧ ¬¬(p1 x =e_ρ p1 y)

by Lemma 1.1. This is equivalent to

p0 x =e_τ p0 y ∧ p1 x =e_ρ p1 y

by the induction hypothesis, that is, x =e_{τ×ρ} y. If σ = τ → ρ, then

¬¬(x =e_{τ→ρ} y) ≡ ¬¬(∀zτ.x z =e_ρ y z).

By Lemma 1.1, this implies ∀zτ.¬¬(x z =e_ρ y z), and by the induction hypothesis, ∀zτ.(x z =e_ρ y z), that is, x =e_{τ→ρ} y.

Theorem 7.7. E-HAω ⊢i Ext^g.

Proof. We have

Ext^g ≡ ∀f^{σ→τ}, g^{σ→τ}.(∀xσ.¬¬(f x =τ g x)) → ¬¬(f =σ→τ g),

which is equivalent to Ext by Lemma 7.6 and Lemma 1.1.

This gives us the following generalization of Theorem 7.3:

Theorem 7.8. E-HAω is M-closed under g.

Corollary 7.9. If ϕ is (essentially) isolating and all occurrences of ⊥ in ϕ are negative, then

E-PAω ⊢c ϕ ⟹ E-HAω ⊢i ϕ.

Proof. By Theorems 3.8 and 3.9, since the only occurrence of ⊥ in E-HAω is in the axiom S x = 0 → ⊥, where it is positive.

7.4. Translations of choice

Unlike the case in set theory, the type-theoretical axiom of choice is in fact an intuitionistically rather weak axiom, not at all at odds with constructivity: It is justified directly by the BHK interpretation of ∀∃-formulas! A proof of ∀x.∃y.ϕ(x, y) is a method f such that f(x) is a proof of ∃y.ϕ(x, y), but such a proof consists of a pair (y0, p) such that p is a proof of ϕ(x, y0). By applying f to an element x, we get a witness y0 to ϕ. This is a choice function! So the axiom of choice seems to be rather unproblematic from a constructive point of view. What about its negative translation? We have

AC^g_{σ,τ} ≡ ∀xσ.¬∀yτ.¬ϕ(x, y)^g → ¬∀f^{σ→τ}.¬∀xσ.ϕ(x, f(x))^g.

46 By Lemma 1.1(viii) and 1.1(xvi), this is equivalent to

∀xσ.¬¬∃yτ.ϕ(x, y)^g → ¬¬∃f^{σ→τ}.∀xσ.ϕ(x, f(x))^g.

We can embed this theory in minimal logic as usual:

E-PAω + AC ⊢c ϕ if and only if (E-HAω)^g + AC^g ⊢m ϕ^g.

As we observed, E-HAω is M-closed under g, so this can be refined to

E-PAω + AC ⊢c ϕ if and only if E-HAω + AC^g ⊢i ϕ^g.

However, it turns out that AC^g is intuitionistically much stronger than AC. To see this, we show that we can interpret an arithmetical theory known to be stronger than PA in E-PAω + AC. The theory of second order arithmetic, PA², is a two-sorted, classical theory where the first sort, 0, is meant to range over numbers, and the second one, 1, ranges over sets of elements of type 0. It includes a relation symbol ∈ of arity (0, 1). The axioms are those of Peano arithmetic, induction over elements of sort 0, and the following two axioms: The arithmetical comprehension axiom

∃y1.∀x0.x ∈ y ↔ ϕ(x)

for any formula ϕ, and extensionality:

∀x1.∀y1.((∀z0.z ∈ x ↔ z ∈ y) ↔ x = y).

Theorem 7.10. Second order arithmetic PA² can be interpreted in E-PAω + AC.

Proof (Sketch). Among the terms of sort 0 → 0 we can find the characteristic function of any predicate on natural numbers. We use this to interpret and prove the second order comprehension axiom from PA² in E-PAω + AC. Consider a predicate ϕ(x0), and define the graph of its characteristic function as

ψ(x, y) ≡ (ϕ(x) ∧ y = 1) ∨ (¬ϕ(x) ∧ y = 0)

By LEM, we have

E-PAω + AC ⊢c ϕ(x) ∨ ¬ϕ(x).

In the former case, we have ψ(x, 1) and in the latter ψ(x, 0), so that in both cases

E-PAω + AC ⊢c ∃y.ψ(x, y).

By AC0,0, there is a choice function f^{0→0} such that

E-PAω + AC ⊢c ψ(x, f x).

Thus,

E-PAω + AC ⊢c ∃f^{0→0}.∀x.ψ(x, f x),

that is,

E-PAω + AC ⊢c ∃f^{0→0}.∀x.(ϕ(x) ∧ f x = 1) ∨ (¬ϕ(x) ∧ f x = 0).

Compare the last formula to the second order comprehension axiom from PA². The axiom of extensionality follows immediately from Ext.

(See [20] for a thorough account of PA² and the above theorem.) We may use this to show that E-PAω + AC is stronger than E-HAω + AC. Gödel's second incompleteness theorem famously states⁴ that the formula

ConPA ≡ ¬∃p.PrfPA(p, ⌜0 ≠ 0⌝)

in L(PA), expressing the consistency of PA, is derivable in PA if and only if PA is inconsistent. (Here PrfPA(x, y) is a ∆0-formula expressing that x is a proof in PA of the formula coded by y, and ⌜0 ≠ 0⌝ is a code for the formula 0 ≠ 0.) However, this is not the case for second order arithmetic:

Theorem 7.11. PA² ⊢c ConPA.

Proof (Sketch). PA² lets us inductively construct the set TN containing (the Gödel codes of) all closed formulas that are true in the standard model N of PA. We can then show that ⌜0 ≠ 0⌝ ∉ TN and that ∃p.PrfPA(p, x) → x ∈ TN, and conclude that ¬∃p.PrfPA(p, ⌜0 ≠ 0⌝) ≡ ConPA is provable within PA².

(A full proof can be found in [20].) Now, finally, we can combine the above theorems to show the following:

Corollary 7.12. If HA is consistent, then E-HAω + AC ⊬i AC^g.

Proof. By Theorems 7.10 and 7.11 we have that

E-PAω + AC ⊢c ConPA,

so

(E-HAω)^g + AC^g ⊢i (ConPA)^g.

Note that ConPA is Π1 (by Lemma 1.1), so by Corollary 4.6,

HA ⊢i (ConPA)^g → ConPA.

The same proof can of course be carried out in HAω. Now, E-HAω is M-closed under g, so

E-HAω + AC^g ⊢i ConPA.

Suppose AC^g were derivable from E-HAω + AC. Then E-HAω + AC ⊢i ConPA, but since E-HAω + AC is conservative over HA, HA ⊢i ConPA. This would mean that PA (and hence HA) is inconsistent, by Gödel's second incompleteness theorem.

The theory E-HAω + AC, which turns out to be a classically rather strong theory that allows us to formalize much of mathematics, is thus not M-closed under g. This means that, although they can be used to carry over many classical number-theoretic results into intuitionistic logic, the negative translations we have presented are by no means a silver bullet. Perhaps this is not completely surprising, but it is disappointing nonetheless.

4See, for example [19].

A. Proof of Lemma 1.1

Lemma 1.1. For all formulas ϕ, ψ

(i) ϕ ⊢m ¬¬ϕ

(ii) ¬¬¬ϕ ⊢m ¬ϕ

(iii) ¬¬⊥ ⊢m ⊥

(iv) ϕ ∨ ψ ⊢m ¬(¬ϕ ∧ ¬ψ)

(v) ϕ → ψ ⊢m ¬ψ → ¬ϕ

(vi) ϕ → ψ ⊢m ¬¬ϕ → ¬¬ψ

(vii) ¬(ϕ → ψ) ⊢i ¬(¬ϕ ∨ ψ)

(viii) ∃xσ.ϕ ⊢m ¬∀xσ.¬ϕ

(ix) ¬∃xσ.ϕ ⊢m ∀xσ.¬ϕ

(x) ∀xσ.¬ϕ ⊢m ¬∃xσ.ϕ

(xi) ¬¬ϕ ∧ ¬¬ψ ⊢m ¬¬(ϕ ∧ ψ)

(xii) ¬¬(ϕ ∧ ψ) ⊢m ¬¬ϕ ∧ ¬¬ψ

(xiii) ¬¬ϕ → ¬¬ψ ⊢i ¬¬(ϕ → ψ)

(xiv) ∃xσ.¬¬ϕ ⊢m ¬¬∃xσ.ϕ

(xv) ¬¬∀xσ.ϕ ⊢m ∀xσ.¬¬ϕ

(xvi) ¬∀xσ.¬ϕ ⊢m ¬¬∃xσ.ϕ

Proof. We have the following derivations: (i) [¬ϕ]1 [ϕ]

⊥ 1 ¬¬ϕ (ii) [¬ϕ]1 [ϕ]2 ⊥ [¬¬¬ϕ] ¬¬ϕ 1 ⊥ ¬ϕ 2 (iii) [⊥]1 1 [(⊥ → ⊥) → ⊥] ⊥ → ⊥ ⊥ 2 (iv) [¬ϕ ∧ ¬ψ]1 [¬ϕ ∧ ¬ψ] ¬ϕ [ϕ]3 ¬ψ [ψ]3

⊥ 1 ⊥ 2 [ϕ ∨ ψ] ¬(¬ϕ ∧ ¬ψ) ¬(¬ϕ ∧ ¬ψ) 3 ¬(¬ϕ ∧ ¬ψ) (v) [ϕ → ψ][ϕ]1 [¬ψ]2 ψ

⊥ 1 ¬ϕ 2 ¬ψ → ¬ϕ

49 (vi) [ϕ → ψ] (Theorem 1.1(v)) ¬ψ → ¬ϕ (Theorem 1.1(v)) ¬¬ϕ → ¬¬ψ (vii) [¬ϕ]1 [ϕ]2 ⊥ ⊥ ψ i [ψ]1 2 3 [¬ϕ ∨ ψ]4 ϕ → ψ ϕ → ψ 1 [¬(ϕ → ψ)] ϕ → ψ

⊥ 4 ¬(¬ϕ ∨ ψ) (viii) [∀xσ.¬ϕ]1 ¬ϕ(xσ)[ϕ(xσ)]2

⊥ 1 [∃xσ.ϕ] ¬∀xσ.¬ϕ 2 ¬∀xσ.¬ϕ (ix) [ϕ]1 [¬∃xσ.ϕ] ∃xσ.ϕ

⊥ 1 ¬ϕ ∀xσ.¬ϕ (x) [∀xσ.¬ϕ] ¬ϕ [ϕ]1 [∃xσ.ϕ]2 ⊥ 1 ⊥ 2 ¬∃xσ.ϕ (xi) [ϕ]1 [ψ]2 [¬(ϕ ∧ ψ)]3 ϕ ∧ ψ [¬¬ϕ ∧ ¬¬ψ]4 ⊥ 1 ¬¬ϕ ¬ϕ [¬¬ϕ ∧ ¬¬ψ] ⊥ 2 ¬¬ψ ¬ψ

⊥ 3 ¬¬(ϕ ∧ ψ) (xii) [ϕ ∧ ψ]3 [ϕ ∧ ψ]1 4 [¬ϕ]2 ϕ [¬ψ] ψ ⊥ ⊥ 1 3 [¬¬(ϕ ∧ ψ)] ¬(ϕ ∧ ψ) [¬¬(ϕ ∧ ψ)] ¬(ϕ ∧ ψ) ⊥ ⊥ 2 4 ¬¬ϕ ¬¬ψ ¬¬ϕ ∧ ¬¬ψ

50 (xiii) [¬(ϕ → ψ)]3 3 (Theorem 1.1(vii)) [¬ϕ]1 [¬(ϕ → ψ)] 2 ¬(¬ϕ ∨ ψ) ¬ϕ ∨ ψ (Theorem 1.1(vii)) [ψ] ¬(¬ϕ ∨ ψ) ¬ϕ ∨ ψ ⊥ 1 [¬¬ϕ → ¬¬ψ] ¬¬ϕ ⊥ 2 ¬¬ψ ¬ψ

⊥ 3 ¬¬(ϕ → ψ) (xiv) [ϕ(xσ)]1 [¬∃xσ.ϕ]2 ∃xσ.ϕ

⊥ 1 [¬¬ϕ(xσ)]3 ¬ϕ(xσ)

⊥ 2 [∃xσ.¬¬ϕ] ¬¬∃xσ.ϕ 3 ¬¬∃xσ.ϕ (xv) [∀xσ.ϕ]1 [¬ϕ]2 ϕ

⊥ 1 [¬¬∀xσ.ϕ] ¬∀xσ.ϕ

⊥ 2 ¬¬ϕ ∀xσ.¬¬ϕ (xvi) [ϕ[t/xσ]]1 [¬∃xσ.ϕ]2 ∃xσ.ϕ

⊥ 1 ¬ϕ[t/xσ] [¬∀xσ.¬ϕ] ∀xσ.¬ϕ

⊥ 2 ¬¬∃xσ.ϕ

B. Proof of the soundness theorem for program extraction

Theorem 6.3 (Soundness). If Γ ⊢i M : ϕ, then there is a proof term µ(M) such that

Γ′ ⊢i µ(M) : ⟦M⟧ mr ϕ,

where Γ′ = {t(u) mr χ | (uχ : χ) ∈ Γ} and t(uχ) = xu if T(χ) ≠ ∗ and t(uχ) = u otherwise.

Proof. By induction on the structure of M: • If uϕ : ϕ, then ϕ is an open assumption and ϕ ∈ Γ. Hence, when T (ϕ) 6= ∗,

ϕ qu y mr ϕ ≡ xu mr ϕ

and we have 0 0 u : u mr ϕ `i u : u mr ϕ, J K J K so we set µ(uϕ) = u0. (The case T (ϕ) = ∗ is similar.) • Suppose !(M ⊥)ϕ : ϕ. By induction hypothesis, there is a proof

µ(M): M mr ⊥. J K We have M = u since T (⊥) = ∗, so J K M mr ⊥ ≡ ⊥, J K and hence we can construct a proof

cT (ϕ) mr ϕ !(µ(M)) : cT (ϕ) mr ϕ ≡ q!(M)y mr ϕ.

• Suppose hM ϕ,N ψi : ϕ ∧ ψ. By the induction hypothesis there are proofs

µ(M): M mr ϕ J K µ(N): N mr ψ. J K – If T (ϕ) = ∗, then M = u J K so

qhM,Niy mr(ϕ ∧ ψ) ≡ N mr(ϕ ∧ ψ) J K ≡ u mr ϕ ∧ N mr ψ J K ≡ M mr ϕ ∧ N mr ψ. J K J K Then we have a proof hµ(M), µ(N)i : qhM,Niy mr(ϕ ∧ ψ). (The case T (ψ) = ∗ is similar.)

52 – If T (ϕ) 6= ∗= 6 T (ψ), then qhM,Niy = p M N J KJ K so

qhM,Niy mr(ϕ ∧ ψ) ≡ p0 qhM,Niy mr ϕ ∧ p1 qhM,Niy mr ψ

≡ p0 (p M N ) mr ϕ ∧ p1 (p M N ) mr ψ J KJ K J KJ K ≡ M mr ϕ ∧ N mr ψ J K J K Then we have a proof hµ(M), µ(N)i : qhM,Niy mr(ϕ ∧ ψ).

ϕ∧ψ • Suppose π0(M ): ϕ. By the induction hypothesis there is a proof

µ(M): M mr(ϕ ∧ ψ). J K – If T (ψ) = ∗, then qπ0(M)y = M J K so

qπ0(M)y mr ϕ ≡ M mr ϕ. J K Then we have a proof π0(µ(M)) : qπ0(M)y mr ϕ since M mr(ϕ ∧ ψ) ≡ M mr ϕ ∧ u mr ψ. J K J K – If T (ϕ) = ∗ and T (ψ) 6= ∗, then qπ0(M)y = u so

qπ0(M)y mr ϕ ≡ u mr ϕ.

Then we have a proof π0(µ(M)) : qπ0(M)y mr ϕ since M mr(ϕ ∧ ψ) ≡ u mr ϕ ∧ M mr ψ. J K J K – If T (ψ) 6= ∗= 6 T (ϕ), then qπ0(M)y = p0 M J K so

qπ0(M)y mr ϕ ≡ p0 M mr ϕ. J K Then we have a proof π0(µ(M)) : qπ0(M)y mr ϕ since M mr(ϕ ∧ ψ) = p0 M mr ϕ ∧ p1 M mr ψ. J K J K J K ϕ∧ψ (The case π1(M ): ψ is similar.)

53 • Suppose inl(M ϕ)ϕ∨ψ : ϕ ∨ ψ. By the induction hypothesis, we have a proof µ(M): M mr ϕ. J K If T (ϕ) = ∗, we have M = u. J K Then, either T (ψ) = ∗ and qinl(M)y = 0, or T (ψ) 6= ∗ and qinl(M)y = h0, cT (ψ)i. In the former case

qinl(M)y mr ϕ ∨ ψ ≡ 0 mr ϕ ∨ ψ ≡ (0 = 0 → u mr ϕ) ∧ (0 6= 0 → u mr ψ) ≡ (0 = 0 → M mr ϕ) ∧ (0 6= 0 → u mr ψ). J K In the latter case, we have

qinl(M)y mr ϕ ∨ ψ ≡ h0, cT (ψ)i mr ϕ ∨ ψ

≡ (0 = 0 → u mr ϕ) ∧ (0 6= 0 → cT (ψ) mr ψ)

≡ (0 = 0 → M mr ϕ) ∧ (0 6= 0 → cT (ψ) mr ψ). J K Now if T (ϕ) 6= ∗, either T (ψ) = ∗ and

qinl(M)y = h0, M i, J K or T (ψ) 6= ∗ and qinl(M)y = h0, h M , cT (ψ)ii. J K In the former case

qinl(M)y mr ϕ ∨ ψ ≡ h0, M i mr ϕ ∨ ψ J K ≡ (0 = 0 → M mr ϕ) ∧ (0 6= 0 → u mr ψ). J K In the latter case, we have

qinl(M)y mr ϕ ∨ ψ ≡ h0, h M , cT (ψ)ii mr ϕ ∨ ψ J K ≡ (0 = 0 → M mr ϕ) ∧ (0 6= 0 → cT (ψ) mr ψ). J K In all of the above cases we can construct a proof

h(λu0=0.µ(M)), λv06=0.!(v06=0 u0=0)χi : qinl(M)y mr ϕ ∨ ψ. with a suitably chosen χ. (The case inr(M ψ)ϕ∨ψ : ϕ ∨ ψ is similar.) ϕ∨ψ χ χ • Suppose Du,v(M , s1 , s2 ): χ. By the induction hypothesis, there are proofs

Γ `i µ(M): M mr ϕ ∨ ψ J K and

0 Γ ∪ {u : t(u) mr ϕ} `i µ(s1(u)) : s1 mr χ 0 J K Γ ∪ {v : t(v) mr ψ} `i µ(s2(v)) : s2 mr χ J K for all u, v.

54 – If T (ϕ) = ∗ = T (ψ), then

qDu,v(M, s1, s2)y = if M then s2 else s1 J K J K J K and t mr ϕ ∨ ψ ≡ (t = 0 → u mr ϕ) ∧ (t 6= 0 → u mr ψ) for all t. Furthermore, t(u) = t(v) = u.

Now, T (ϕ ∨ ψ) = 0, so either M 7→β 0 or M 7→β S n for some n. J K J K In the former case, we have

0=0 Γ `i π0(µ(M)) t1 : u mr ϕ

0=0 for some proof t1 and if M then s2 else s1 = s1 . J K J K J K J K Hence, since from the induction hypothesis we have a proof

0 Γ `i λu .µ(s1): u mr ϕ → s1 mr χ, J K that is, 0 Γ `i λu .µ(s1): u mr ϕ → qDu,v(M, s1, s2)y mr χ, we have a proof

 0     Γ `i λu .µ(s1) π0 µ(M) t1 : qDu,v(M, s1, s2)y mr χ.

In the latter case, we have S n6=0 Γ `i π1(µ(M)) t2 : u mr ψ S n6=0 for some proof t2 and

if M then s2 else s1 = s2 , J K J K J K J K whence we get a proof

 0     Γ `i λv .µ(s2) π1 µ(M) t2 : qDu,v(M, s1, s2)y mr χ.

– If T (ϕ) 6= ∗ = T (ψ), then h i qDu,v(M, s1, s2)y = if p0 M then s2 else s1 p1 M /xu J K J K J K J K and t mr ϕ ∨ ψ ≡ (p0 t = 0 → p1 t mr ϕ) ∧ (p0 t 6= 0 → u mr ψ)

for all t. Furthermore, t(u) = xu and t(v) = u.

Now, T (ϕ ∨ ψ) = 0 × T (ϕ), so either p0 M 7→β 0 or p0 M 7→β S n for some n. J K J K In the former case, we have

0=0 Γ `i π0(µ(M)) t1 : u mr ϕ

0=0 for some proof t1 and h i if M then s2 else s1 = s1 p1 M /xu . J K J K J K J K J K

55 Hence, since from the induction hypothesis we have a proof

0 Γ `i λxu.λu .µ(s1): xu mr ϕ → s1 (xu) mr χ, J K we have    0      h i Γ `i λxu.λu .µ(s1) p1 M π0 µ(M) t1 : s1 p1 M /xu mr χ, J K J K J K that is    0      Γ `i λxu.λu .µ(s1) p1 M π0 µ(M) t1 : qDu,v(M, s1, s2)y mr χ, J K

If p0 M 7→β S n, we proceed similarly to the previous case. J K (The remaining cases are similar.) • Suppose λuϕ.M ψ : ϕ → ψ. By the induction hypothesis, there is a proof

0 Γ ∪ {u : t(u) mr ϕ} `i µ(M): M mr ψ. J K – If T (ϕ) = ∗, then λu.M = M , J K J K so λu.M mr(ϕ → ψ) ≡ u mr ϕ → λu.M mr ψ J K J K ≡ u mr ϕ → M mr ψ. J K Furthermore, t(u) = u, so the induction hypothesis gives

0 Γ ∪ {u : u mr ϕ} `i µ(M): M mr ψ. J K This gives a proof λu0.µ(M): λu.M mr(ϕ → ψ). J K – If T (ϕ) 6= ∗ and T (ψ) = ∗, then rM ψz = u so λu.M mr(ϕ → ψ) ≡ ∀xσ.(x mr ϕ → u mr ψ) J K ≡ ∀xσ.(x mr ϕ → M mr ψ). J K Hence, λxσ.λu0.µ(M): λu.M mr(ϕ → ψ). J K – If T (ϕ) 6= ∗ and T (ψ) 6= ∗, then

λu.M = λxu. M J K J K and λu.M mr(ϕ → ψ) ≡ ∀xσ.(x mr ϕ → λu.M x mr ψ) J K σ J K  ≡ ∀x .(x mr ϕ → λxu. M x mr ψ) J K σ h i ≡ ∀x .(x mr ϕ → M x/xu mr ψ), J K so λxσ.λu0.µ(M): λu.M mr(ϕ → ψ). J K

56 • Suppose M ϕ→ψ N ϕ : ψ. By the induction hypothesis there are proofs

µ(M): M mr(ϕ → ψ) J K µ(N): N mr ϕ. J K – If T (ϕ) = ∗, then MN = M J K J K and N = u, J K so

M mr(ϕ → ψ) ≡ u mr ϕ → M mr ψ J K J K ≡ N mr ϕ → MN mr ψ. J K J K Then we have a proof µ(M) µ(N) : MN mr ψ. J K – If T (ϕ) 6= ∗ and T (ψ) = ∗, then T (ϕ → ψ) = ∗ so MN = u. J K We have

M mr(ϕ → ψ) ≡ ∀xσ(x mr ϕ → u mr ψ) J K ≡ ∀xσ(x mr ϕ → MN mr ψ) J K and hence we have a proof   µ(M) N  µ(N) : MN mr ψ. J K J K

– If T (ϕ) 6= ∗ and T (ψ) 6= ∗, then MN = M N , J K J KJ K so

M mr(ϕ → ψ) ≡ ∀xσ(x mr ϕ → M x mr ψ). J K J K Then we have a proof

µ(M) N : N mr ϕ → M N mr ψ ≡ N mr ϕ → MN mr ψ J K J K J KJ K J K J K and   µ(M) N  µ(N) : MN mr ψ. J K J K • Suppose λxσ.M ϕ : ∀xσ.ϕ. By the induction hypothesis, there is a proof  µ M[t/x] : qM[t/x]y mr ϕ[t/x].

for every tσ.

57 – If T (ϕ) = ∗, then M = u, J K so

qλxσ.My mr(∀xσ.ϕ) ≡ ∀xσ.u mr ϕ ≡ ∀xσ. M mr ϕ. J K Then we have a proof λxσ.µ(M): qλxσ.My mr(∀xσ.ϕ).

– If T (ϕ) 6= ∗, then, since qλxσ.My = λxσ. M , J K we have

qλxσ.My mr(∀xσ.ϕ) ≡ ∀xσ.qλxσ.My x mr ϕ ≡ ∀xσ.λxσ. M  x mr ϕ J K ≡ ∀xσ. M mr ϕ. J K Then we have a proof λxσ.µ(M): qλxσ.My mr(∀xσ.ϕ).

σ • Suppose rM ∀x .ϕ tz : ϕ[t/x]. Then M t = M t. J K J K By the induction hypothesis, there is a proof

µ(M): M mr(∀xσ.ϕ). J K – If T (ϕ) = ∗, then q(M x)ϕy = u for every xσ so

M mr(∀xσ.ϕ) ≡ ∀xσ. (u mr ϕ) J K ≡ ∀xσ. M x mr ϕ . J K Then we have a proof µ(M) t : M t mr ϕ[t/x]. J K – If T (ϕ) 6= ∗, then

M mr(∀xσ.ϕ) ≡ ∀xσ. M x mr ϕ J K J K ≡ ∀xσ. M x mr ϕ . J K Then we have a proof µ(M) t : M t mr ϕ[t/x]. J K • Suppose htσ,M ϕ[t/x]i : ∃xσ.ϕ. By the induction hypothesis, there is a proof

µ(M): M mr ϕ[t/x]. J K

58 – If T (ϕ) = ∗, then qht, Miy = t, and M = u J K so h i qht, Miy mr(∃xσ.ϕ) ≡ u mr ϕ qht, Miy/x ≡ u mr ϕ[t/x] ≡ M mr ϕ[t/x], J K but we already have a proof µ(M): M mr ϕ[t/x]. J K – If T (ϕ) 6= ∗, then qht, Miy = p t M , J K so

σ h i qht, Miy mr(∃x .ϕ) ≡ p1qht, Miy mr ϕ p0qht, Miy/x  h  i ≡ p1 p t M mr ϕ p0 p t M /x J K J K  h  i ≡ p1 p t M mr ϕ p0 p t M /x J K J K ≡ M mr ϕ[t/x], J K but we already have a proof µ(M): M mr ϕ[t/x]. J K ∃xσ .ϕ(x) ϕ(y) ψ • Suppose Eu,y(M ,N(u , y) ): ψ. By the induction hypothesis, there is a proof

σ Γ `i µ(M): M mr ∃x .ϕ(x), J K and a proof n 0 o  Γ ∪ u : t(u) mr ϕ[y/x] `i µ N(u, y) : qN(u, y)y mr ψ for all u, y. – If T (ϕ) = ∗, then h i qEu,y(M,N)y = N M , u/y, xu , J K J K and h i M mr ∃xσ.ϕ ≡ u mr ϕ M /x . J K J K With y = M above, since t(u) = u if T (ϕ) = ∗, we have a proof J K   0 h i  h i Γ ∪ u : u mr ϕ M /x `i µ N(u, M ) : N M , u/y, xu mr ψ, J K J K J K J K whence

0  h i h i Γ `i λu .µ N(u, M ) : u mr ϕ M /x → N M , u/y, xu mr ψ. J K J K J K J K

59 This gives a proof    0  h i Γ `i λu .µ N(u, M ) µ(M) : N M , u/y, xu mr ψ. J K J K J K That is,    0   Γ `i λu .µ N(u, M ) µ(M) : rEu,y M,N(u, M ) z mr ψ. J K J K

– If T (ϕ) 6= ∗, then h i qEu,y(M,N)y = N p0 M , p1 M /y, xu , J K J K J K and

σ  h i M mr ∃x .ϕ ≡ p1 M mr ϕ p0 M /x . J K J K J K

With y = p0 M above, since t(u) = xu if T (ϕ) 6= ∗, we have a proof J K   0 h i  h i Γ ∪ u : xu mr ϕ p0 M /x `i µ N(u, p0 M ) : N p0 M /y mr ψ, J K J K J K J K whence

0  h i h i Γ `i λu .µ N(u, p0 M ) : xu mr ϕ p0 M /x → N p0 M /y mr ψ J K J K J K J K and

Γ ⊢i λxu.λu′.µ(N(u, p₀⟦M⟧)) : ∀xuT(ϕ). (xu mr ϕ[p₀⟦M⟧/x]) → (⟦N⟧[p₀⟦M⟧/y] mr ψ).

This gives a proof

Γ ⊢i (λxu.λu′.µ(N(u, p₀⟦M⟧))) (p₁⟦M⟧) µ(M) : ⟦N⟧[p₀⟦M⟧, p₁⟦M⟧/y, xu] mr ψ.

That is,

Γ ⊢i (λxu.λu′.µ(N(u, p₀⟦M⟧))) (p₁⟦M⟧) µ(M) : ⟦Eu,y(M, N(u, y))⟧ mr ψ.
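The bookkeeping in this last case reduces to a few equations on the pairing combinators: p₀(p t M) = t, p₁(p t M) = M, and the fact that ⟦Eu,y(M, N)⟧ substitutes the two projections of ⟦M⟧ for y and xu in ⟦N⟧. As an informal illustration only (a minimal Python sketch, not the formal term calculus of this text; all function names are ours), the extraction clauses for the existential rules in the case T(ϕ) ≠ ∗ can be modelled as:

```python
# Minimal sketch of the pairing combinators p, p0, p1 and of the
# extraction clauses for the existential rules when T(phi) != *.
# This is an illustration outside the formal calculus; names are ours.

def p(t, m):
    """Pairing combinator: p t M."""
    return (t, m)

def p0(pair):
    """First projection: recovers the witness t."""
    return pair[0]

def p1(pair):
    """Second projection: recovers the realizer of phi[t/x]."""
    return pair[1]

def extract_intro(t, m_extracted):
    """Extraction of an exists-introduction <t, M>: the pair p t [[M]]."""
    return p(t, m_extracted)

def extract_elim(m_extracted, n_extracted):
    """Extraction of an exists-elimination E_{u,y}(M, N): substitute
    p0 [[M]] for y and p1 [[M]] for x_u in [[N]]."""
    return n_extracted(p0(m_extracted), p1(m_extracted))

# The equations p0 (p t M) = t and p1 (p t M) = M used in the proof:
pair = extract_intro(5, "realizer")
assert p0(pair) == 5
assert p1(pair) == "realizer"
```

Here extract_elim plays the role of the β-reductions performed by applying (λxu.λu′. …) to p₁⟦M⟧ and µ(M) in the final step above.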

References

[1] H. P. Barendregt. Lambda calculus with types. In S. Abramsky, D. Gabbay, and T. Maibaum, editors, Handbook of Logic in Computer Science, volume 2. Oxford University Press, 1992.
[2] M. Beeson. Goodman's theorem and beyond. Pacific Journal of Mathematics, 84(1):1–16, 1979.
[3] M. Beeson. Foundations of Constructive Mathematics. Springer-Verlag, 1985.
[4] E. Bishop. Foundations of Constructive Analysis. McGraw-Hill Series in Higher Mathematics. McGraw-Hill, 1967.
[5] T. Coquand. Computational content of classical logic. Semantics and Logics of Computation, 14:33, 1997.
[6] N. Cutland. Computability: An Introduction to Recursive Function Theory. Cambridge University Press, 1980.
[7] H.-D. Ebbinghaus. Mathematical Logic. Springer, 1994.
[8] H. Friedman. Classically and intuitionistically provably recursive functions. In Higher Set Theory, pages 21–27. Springer, 1978.
[9] N. D. Goodman and J. Myhill. The formalization of Bishop's constructive mathematics. In Toposes, Algebraic Geometry and Logic, pages 83–96. Springer, 1972.
[10] P. Hájek and P. Pudlák. Metamathematics of First-Order Arithmetic. Perspectives in Mathematical Logic. Springer-Verlag, 1993.
[11] B. Jacobs. Categorical Logic and Type Theory, volume 141. Elsevier, 1999.
[12] S. C. Kleene. On the interpretation of intuitionistic number theory. The Journal of Symbolic Logic, 10(4):109–124, 1945.
[13] U. Kohlenbach. Applied Proof Theory: Proof Interpretations and Their Use in Mathematics. Springer, 2008.
[14] D. Leivant. Intuitionistic formal systems. In L. A. Harrington, M. D. Morley, A. Ščedrov, and S. G. Simpson, editors, Harvey Friedman's Research on the Foundations of Mathematics. Elsevier, 1985.
[15] D. Leivant. Syntactic translations and provably recursive functions. The Journal of Symbolic Logic, 50(3):682–688, 1985.
[16] M. Manzano. Introduction to many-sorted logic. In K. Meinke and J. V. Tucker, editors, Many-Sorted Logic and its Applications. Wiley Professional Computing, 1993.
[17] D. Prawitz and P.-E. Malmnäs. A survey of some connections between classical, intuitionistic and minimal logic. In Contributions to Mathematical Logic, Proceedings of the Logic Colloquium, Hannover, 1966. North-Holland.
[18] H. Schwichtenberg. Refined program extraction from classical proofs: Some case studies, 1999.
[19] H. Schwichtenberg and S. S. Wainer. Proofs and Computations. Cambridge University Press, 2011.
[20] J. R. Shoenfield. Mathematical Logic, volume 21. Addison-Wesley, Reading, 1967.
[21] M. H. Sørensen and P. Urzyczyn. Lectures on the Curry–Howard Isomorphism, volume 149. Elsevier, 2006.
[22] A. S. Troelstra. Metamathematical Investigation of Intuitionistic Arithmetic and Analysis. Springer, 1973.
[23] A. S. Troelstra and D. van Dalen. Constructivism in Mathematics, volume 1. North-Holland, 1988.
[24] A. S. Troelstra and D. van Dalen. Constructivism in Mathematics, volume 2. North-Holland, 1988.