<<

Proof and Mining II: Formal of Analysis

Jeremy Avigad

Department of Philosophy and Department of Mathematical Sciences Carnegie Mellon University

July 2015 Sequence of Topics

1. Computable Analysis 2. Formal Theories of Analysis 3. The Dialectica and Applications 4. Ultraproducts and Nonstandard Analysis Formal theories of analysis

Analysis can be formalized in theory.

But Part I showed that we don’t always need big sets: many objects of analysis can be represented by sets of natural numbers.

Hilbert and Bernays’ Grundlagen der Mathematik, volume II, 1936, showed that portions of analysis can be formalized in second-order arithmetic (with third-order parameters).

SOA is often called “analysis” for that reason.

Kreisel: provability in restricted theories provides additional information. Axiomatic theories

We will consider: • Primitive recursive arithmetic (PRA) • First-order arithmetic (PA, HA) (and subsystems) • Second-order arithmetic (PA2 , HA2 ) (and subsystems)

These come in classical and intuitionistic versions.

We will also consider theories of finite type. Primitive recursive arithmetic

Recall that the primitive recursive functions include • zero, 0 • successor, S n • projections, pi (x1,..., xn) = xi and are closed under

• composition: f (~x) = h(g1(~x),..., gn(~x)) • primitive :

f (0, ~z) = g(~z), f (x + 1, ~z) = h(f (x, ~z), x, ~z) Primitive recursive arithmetic

Primitive recursive arithmetic is an axiomatic theory, with • 0 6= S(x), S(x) = S(y) → x = y • defining equations for the primitive recursive functions • quantifier-free induction: ϕ(0) ϕ(x) → ϕ(x + 1) ϕ(t)

In PRA, can handle “finitary” proofs. Primitive recursive arithmetic

Just as all “reasonable” computable functions are primitive recursive, all “reasonable” facts about them can be proved in PRA.

In fact, it is surprisingly hard to find ordinary mathematical that can be stated in the language of PRA but not proved there. Primitive recursive arithmetic

PRA can be presented either as a first-order theory (classical or intuitionistic) or as a quantifier-free calculus.

Theorem. Suppose first-order classical PRA proves ∀x ∃y ϕ(x, y), with ϕ quantifier-free. Then for some f , quantifier-free PRA proves ϕ(x, f (x)).

Proof. First, show PRA has a set of universal . Define bounded quantification, and replace induction by

∀y (ϕ(0) ∧ ∀x < y (ϕ(x) → ϕ(S(x))) → ϕ(y)).

Then apply Herbrand’s . Herbrand’s theorem

Theorem. Suppose T is a universal axiomatized first-order theory, and T ` ∃~x ϕ(~x), ϕ quantifier free. Then there are sequences of terms ~t1,...~tn and a quantifier-free proof of

ϕ(~t1) ∨ ... ∨ ϕ(~tn) from instances of axioms of T .

Proof. WLOG, assume T has a constant symbol.

If the conclusion fails, T ∪ {¬ϕ(~t)} is consistent, where ~t ranges over all tuples of terms.

Let M be a model. Let M0 be the submodel with universe {tM}. Then M0 |= T ∪ {∀~x ¬ϕ(~x)}. First-order arithmetic

First-order arithmetic is essentially PRA plus induction. Peano arithmetic (PA) is classical, Heyting arithmetic (HA) is intuitionistic.

Language: 0, S, +, ×, <.

Axioms: quantifier-free defining axioms, induction.

A formula is

• ∆0 if every quantifier is bounded

• Σ1 if of the form ∃~x ϕ, ϕ ∈ ∆0

• Π1 if of the form ∀~x ϕ, ϕ ∈ ∆0

• ∆1 if equivalent to Σ1 and Π1

Primitive recursive functions / relations have Σ1/∆1 definitions. First-order arithmetic

IΣ1 is the restriction of PA with induction for only Σ1 formulas.

This theory suffices to define the primitive recursive functions, and hence interpret PRA. Conversely:

Theorem (Parsons, Mints, Takeuti). IΣ1 is conservative over PRA for Π2 sentences: if

IΣ1 ` ∀x ∃y ϕ(x, y), with ϕ quantifier-free, then

PRA ` ϕ(x, f (x)) for some function symbol f . Conservativity of IΣ1 over PRA

There are various ways to prove this theorem.

Syntactic proofs: • Using cut elimination or normalization. • Using the Dialectica interpretation (plus normalization).

Model-theoretic proofs: • A model-theoretic due to Friedman. • A “saturation” construction. Double-negation translations

The G¨odel-Gentzendouble-negation translation interprets classical in minimal logic: • AN ≡ ¬¬A for atomic A • (ϕ ∨ ψ)N ≡ ¬(¬ϕN ∧ ¬ψN ) • (∃x ϕ)N ≡ ¬∀x ¬ϕN . The translation commutes with ∀, ∧, →.

Theorem. If Γ ` ϕ classically, ΓN ` ϕN in minimal logic.

The proof uses induction on derivations. For example, (ϕ ∨ ¬ϕ)N is easily proved. Reducing PA to HA

Theorem. If PA proves a formula, ϕ, then HA proves ϕN .

Proof. For each of ϕ of PA, HA proves ϕN .

Corollary. If PA proves ∀x ∃y R(x, y), where R is primitive recursive, then HA proves ∀x ¬∀y ¬R(x, y).

Better theorem. If PA proves ∀x ∃y R(x, y), where R is primitive recursive, then HA proves ∀x ∃y R(x, y).

There are various ways to prove this; later, we will use the Dialectica interpretation. Second order arithmetic

The language is two-sorted: • variables x, y, z,... and functions 0, S, +, × on one sort • variables X , Y , Z,... on the other sort • a relation t ∈ X between the two sorts

Axioms: • axioms of PA, with induction extended to the bigger language • comprehension: ∃X ∀y (y ∈ X ↔ ϕ(y, ~z))

The “standard model” is (N, P(N),...), but there are smaller ones.

An ω-model is a model where the first-order part is standard, i.e. N. Second order arithmetic

We can define equality on the second sort by

X = Y := ∀z (z ∈ X ↔ z ∈ Y ).

The usual axioms of equality (including substitution) follow.

This is known as “extensional equality.”

Alternatively, we can take equality as a basic symbol, and add the axiom above.

This second system is interpreted in the first. Second order arithmetic

Functions vs. sets: • With functions basic, interpret sets as characteristic functions:

x ∈ S ≡ χS (x) = 1.

• With relations basic, interpret functions as functional relations, ∀x ∃!y R(x, y)

Can add choice axioms:

∀x ∃y ϕ(x, y) → ∃f ∀y ϕ(x, f (x)). Second order arithmetic

From a proof-theoretic perspective, second-order arithmetic is very strong.

We obtain weaker systems by: • restricting comprehension • restricting induction Subsystems of second-order arithmetic

The big five: 0 • RCA0 : recursive (∆1) comprehension (formalized computable analysis)

• WKL0 : weak K¨onig’slemma (a form of compactness)

• ACA0 : arithmetic comprehension (analytic principles like the least-upper bound principle.)

• ATR0 : transfinitely iterated arithmetic comprehension (transfinite constructions) 1 1 • Π1 −CA0 :Π1 comprehension (strong analytic principles)

We will focus on the first three. RCA0

The axioms of RCA0 are as follows: • quantifier-free axioms for 0, S, +, ×, <

• induction, restricted to Σ1 formulas (with both number and set parameters):

ϕ(0) ∧ ∀x (ϕ(x) → ϕ(x + 1)) → ∀x ϕ(x)

• the recursive comprehension axiom, (RCA):

∀x (ϕ(x) ↔ ψ(x)) → ∃Y ∀x (x ∈ Y ↔ ϕ(x))

where ϕ is Σ1 and ψ is Π1. RCA0

Notice that the induction schema includes set induction:

0 ∈ Y ∧ ∀x (x ∈ Y → x + 1 ∈ Y ) → ∀x (x ∈ Y ).

It is slightly stronger.

Since RCA0 includes IΣ1 , we can act as though primitive recursive arithmetic is “built-in.” RCA0

(RCA) says that if a set exists if it has a definition as well as a co-computably enumerable definition (relative to others sets in the universe).

Roughly, it allows you to define computable sets and relations.

Let REC denote the set of recursive sets. Then (N, REC,...) is the minimal ω-model.

Analysis in RCA0 is roughly “formalized computable analysis.” WKL0

Remember that a on {0, 1} is a set T of finite binary sequences closed under initial segments:

Tree(T ) ≡ ∀σ, τ (σ ∈ T ∧ τ ⊆ σ → τ ∈ T ).

We can say T is infinite as follows:

Infinite(T ) ≡ ∀n ∃σ (σ ∈ T ∧ length(σ) = n).

A set P is a path through T if, when viewed as an infinite binary sequence, every initial segments is in T :

Path(P, T ) ≡ ∀σ ((∀i < length(σ) , i ∈ P ↔ (σ)i = 1) → σ ∈ T ). WKL0

Weak K¨onig’slemma (WKL) says that every infinite binary tree has a path:

∀T (Tree(T ) ∧ Infinite(T ) → ∃P Path(P, T )).

The theory WKL0 is RCA0 + (WKL). WKL0

Since there are computable infinite binary trees with no computable path, REC is not an ω-model.

It has a model where every set is computable from 00.

Remarkably, (WKL) has no minimal ω-model, but the intersection of all ω-models is REC. WKL0

Over RCA0 ,(WKL) is equivalent to each of these: • the Heine-Borel theorem (for [0, 1]) • Every open cover of {0, 1}ω has a finite subcover. • Every continuous function on [0, 1] is uniformly continuous • Every continuous function on [0, 1] is bounded

Here, [0, 1] can be replaced by any compact space.

(More on this below.) ACA0

A formula is arithmetic if it has only first-order quantifiers.

ACA0 adds to RCA0 the arithmetic comprehsion axiom,(ACA):

∃Y ∀x (x ∈ Y ↔ ϕ(x)) where ϕ is arithmetic (possibly with number and set parameters).

The smallest ω-model is the collection of arithmetically definable sets.

Notice that set-induction now gives us induction for every arithmetic formula. ACA0

Theorem. Over RCA0 ,(ACA) is equivalent to each of these: • Every bounded increasing sequence of real numbers has a least upper bound. • Every bounded sequence of real nubmers has a least upper bound. • Every bounded sequence of real numbers has a convergence subsequence. • Every sequence of points in a compact metric space has a convergent subsequence. of SOSOA

We have considered subsystems of second-order arithmetic from the point of view of: • their minimal models • what they can be prove

Knowing that a mathematical theorem is provable in a restricted theory, T , provides additional information. Conservation results

Many central proof theoretic results have the following form: 0 For any ϕ ∈ Γ, if T1 ` ϕ, then T2 ` ϕ . These can be used to reduce • infinitary theories to finitary ones • classical theories to constructive ones • impredicative theories to predicative ones • nonstandard theories to standard ones • higher-order theories to first-order ones • first-order theories to quantifier-free ones Conservation results

From a foundational point of view, conservations results explain or interpret theories that are more • abstract, • mysterious, or • dubious, in terms of ones that are more • concrete, • familiar, or • trustworthy.

From a practical point of view, they provide additional information. Conservation results

We have already seen a few examples:

• PRA is conservative over quantifier-free PRA for Π2 formulas (in an appropriate sense)

• IΣ1 is conservative over PRA for Π2 formulas. • If PA proves ϕ, HA proves ϕN . Conservation results

There are at least two approaches to proving “if T1 proves ϕ, then 0 T2 proves ϕ ”:

• Syntactic (proof theoretic): translate a proof of ϕ in T1 to a proof of ϕ in T2.

• Semantic (model theoretic): transform a model of T2 ∪ {¬ϕ} into a model of T1 ∪ {¬ϕ}.

The second relies on soundness and completeness, and may a priori provide no information about how to translate proofs.

A slight variant works for intuitionistic theories. Conservation results

Syntactic approaches: • “Global” transformations • “Local” interpretations

Global transformations like cut-elimination and normalization allow iterated exponential increase in proof length.

Local interpretations proceed line by line. Conservation results

Interpretations: • Double-negation translations • The Friedman-Dragalin interpretation • Semantic interpretations • Forcing translations • Realizability interpretations • Functional interpretations

The last two work best on intuitionistic theories. Metamathematics of RCA0

RCA0 is interpretable in IΣ1 .

Simply interpret the set variables as ranging over codes for computable sets.

For example, we can take the codes to be indices of Turing machines that halt on every input and return 0 or 1.

So RCA0 is conservative over IΣ1 for all first-order sentences.

Hence it is conservative over PRA for Π2 sentences.

So whenever RCA0 proves ∀x ∃y R(x, y), R primitive recursive, there is a primitive recursive function witnessing this. Weak K¨onig’slemma

Theorem (Friedman). WKL0 is conservative over PRA for Π2 sentences.

First syntactic proof, using cut-elimination, by Sieg.

We’ll later see a proof by Kohlenbach using the Dialectica interpretation.

1 Theorem (Harrington). WKL0 is conservative over RCA0 for Π1 sentences.

Harrington’s model-theoretic proof is inspired by the low-basis theorem.

Syntactic proofs by H´ajek,Avigad, Ferreira and Ferreira. Metamathematics of ACA0

Theorem. ACA0 is conservative over PA.

Proof. Suppose PA doesn’t prove ϕ. Let M be a model of PA ∪ {¬ϕ}.

Turn this into a model ACA0 ∪ {¬ϕ}, taking the second-order part to be the collection of sets that are arithmetically definable from parameters.

It is not hard to see that (ACA) and the broader schema of induction hold in this model, while the value of ϕ does not change.

1 (A slight variation shows conservativity for Π1 sentences, in an appropriate sense.) Metamathematics of ACA0

We will see later that whenever PA proves ∀x ∃y R(x, y) for a primitive recursive R, then there is a primitive recursive functional f of type N → N such that ∀x R(x, f (x)) holds.

As a corollary, the same holds for ACA0 .

In other words, the probably total computable functions of

• RCA0 and WKL0 are primitive recursive, and those of

• ACA0 can be defined by primitive recursion in the higher types. Formalizing analysis

In the language of PRA, one can define integers, rational numbers, and other finitary objects in natural ways.

Define the real numbers to be Cauchy sequences of rationals with a fixed rate of convergence:

−n ∀n ∀m ≥ n (|an − am| < 2 ).

Equality is a Π1 notion:

−n+1 a = b ≡ ∀n (|an − bn| ≤ 2 ).

Less-than is a Σ1 notion:

−n+1 a < b ≡ ∃n (an + 2 < bn). Complete separable metric spaces

Definition. A complete separable metric space X = Ab consists of a set A together with a function d : A × A → R satisfying: • d(x, x) = 0 • d(x, y) = d(y, x) • d(x, z) ≤ d(x, y) + d(y, z).

A point of Ab is a sequence (an) of elements of A such that for −n every n and m ≥ n we have d(an, am) < 2 .

Outside the theory, we can think of A as “coding” or “representing” the metric space.

Within the theory, we say that A “is” the space. Compactness

Three notions of compactness for a CSM: • Totally bounded: for every rational ε > 0, there is a finite ε-net. • Heine-Borel compact: every covering by open sets has a finite subcover. • Sequentially compact: every sequence has a convergent subsequence.

In weak theories:

• RCA0 proves e.g. [0, 1] is totally bounded. • Totally bounded ⇒ Heine-Borel requires weak K¨onig’slemma. • Totally bounded ⇒ sequentially compact requires arithmetic comprehension.

In constructive mathematics, one usually uses “totally bounded.” Continuity

A function f between CSM’s is uniformly continuous if

∀ε > 0 ∃δ > 0 ∀x, y (d(x, y) < δ → d(f (x), f (y)) < ε).

A modulus of uniform continuity for f is a function g(ε) returning such a δ for each ε:

∀x, y, ε > 0 (d(x, y) < g(ε) → d(f (x), f (y)) < ε).

Theorem. In RCA0 , the statement that every continuous function from a compact space to R has modulus of uniform continuity is equivalent to (WKL).

In Bishop’s constructive mathematics, functions are assumed to come with such moduli (“avoidance of pseudo-generality”). Closed sets

Two notions of a closed set: • closed = complement of a sequence of basic open balls • separably closed = the closure of a sequence of points

Theorem (Brown). Over RCA0 : 1. For compact spaces, “closed ⇒ separably closed” is equivalent to (ACA). 2. In general, “closed ⇒ separably closed” is equivalent to 1 (Π1 −CA). 3. “separably closed ⇒ closed” is equivalent to (ACA) Distance

Theorem (Avigad and Simic): Over RCA0 , the following are equivalent to (ACA): 1. In a compact space, if C is any closed set and x is any point, then d(x, C) exists. 2. If C is any closed of [0, 1], then d(0, C) exists. 1 The following are equivalent to (Π1 −CA): 1. In an arbitrary space, if C is any closed set and x is any point, then d(x, C) exists.

2. In a compact space, if S is any Gδ set and x is any point, then d(x, S) exists.

3. If S is a Gδ subset of [0, 1], then d(0, S) exists.

In constructive mathematics, sets are often assumed to be located. Hilbert space and Banach spaces

As in computable analysis, we can define a Hilbert space H = Ab to be a countable vector space A over Q together with a function h·, ·i : A × A → R satisfying 1. hx, xi ≥ 0 2. hx, yi = hy, xi 3. hax + by, zi = ahx, zi + bhy, zi

Define ||x|| = hx, xi1/2 and d(x, y) = ||x − y||, and think of H as the completion of A. of ergodic theory

Recall the mean ergodic theorem: let T : H → H be a nonexpansive on a Hilbert space, and for each n, let 1 A f = (f + Tf + ... + T n−1f ). n n The mean ergodic theorem asserts that this sequence of averages converges in the Hilbert space norm.

Over RCA0 , this is equivalent to (ACA). Reverse mathematics of ergodic theory

More precisely: • Let M = {f ∈ H | Tf = f } • Let N be the closure of {Tg − g | g ∈ H}

The mean ergodic theorem says: • M is the orthogonal complement of N

• Anf converges in norm to PM f Reverse mathematics of ergodic theory

Consider the three statements:

1. Anf converges

2. PN f exists

3. PM f exists

Theorem (Avigad and Simic) In RCA0 , 1 and 2 are equivalent, but showing that 3 implies either 1 or 2 requires (ACA).

In fact, even the statement “if PM f = 0, then Anf converges” requires (ACA). Higher types

Recall that the finite types are defined as follows: • N is a finite type • If σ and τ are finite types, so are σ × τ and σ → τ

The primitive recursive functionals of finite type allow: • λ abstraction, application, pairing, projection • Higher-type primitive recursion:

F (0) = G, F (n + 1) = H(F (n), n) Higher-type arithmetic

The theory PRAω (a.k.a. G¨odel’stheory T ) axiomatizes these, just as PRA axiomatizes the primitive recursive functions.

• There is a sort for each type. • Basic constants and combinators allow us to define functions using λ abstraction, application, pairing, projection. • “Recursors” allow definition by primitive recursion. Higher-type arithmetic

Higher-type arithmetic: • PAω = PRAω + induction ω ω • HA = PRAi + induction These are conservative extensions of PA and HA respectively.

In fact, one can add quantifier-free choice axioms (QF −AC) to PAω, and full choice (AC) to HAω.

Restricted versions are conservative over PRA: ω • PRAd + (QF −AC) ω • PRAd i + (AC)

We will obtain even stronger results with the Dialectica interpretation.

One can take type N equality to be basic, then define higher-type equality extensionally:

f = g ≡ ∀x (fx = gx)

The extensionality axiom says that functions respect this equality:

f = g → Ff = Fg.

Alternatively, one can take = to be basic at each type, and have axioms asserting that this corresponds to the extensional notion.

Theorem (Luckhardt). Extensionality can be interpreted, preserving the second-order (type 0/1) fragment.

More generally, though, the issues are subtle. Second-order vs. finite types

Notice that adding higher types alone doesn’t make a theory strong.

It is the presence of set comprehension that makes second-order arithmetic so much stronger than arithmetic.

Indeed, PRAω can be interpreted in HA, by internalizing either HEO or HRO. Conservation results summarized

The following theories are “finitary”: • PRA

• IΣ1

• RCA0 , WKL0 ω • PRAd + (QF −AC) + (WKL) The following theories are “arithmetic”: • PRAω • PA

• ACA0 • HAω + (AC) + (MP) + (WKL). • PAω + (QF −AC) + (ACA) A finite-type variant of ACA0

ω ω Let ACA0 denote PRA + (QF −AC) + (ACA).

This is a finite-type variant of ACA0 , and a .

It is natural to consider an “arithmetic comprehension functional,” µ: ∃x (f (x) = 0) → f (µ(f )) = 0 for f : N → N.

ω Theorem (Hunter). ACA0 + (µ) is a conservative extension of ACA0 . A conservation theorem for measure theory

Code sets as characteristic functions, i.e. interpret x ∈ Y as χY (x) = 1.

We can define unions and intersections. Using µ, we can also define countable unions and intersections.

Add a symbol λ for Lebesgue measure, and axioms: • ∀X ∈ P(2N), λ(X ) ≥ 0 • λ(∅) = 0 S P • ∀(Xn) , (∀i, j , j 6= i → Xi ∩ Xj = ∅) → λ( Xn) = λ(Xn) • ∀σ ∈ 2

ω 1 Theorem (Kreuzer). ACA0 + (µ) + (λ) is a Π2-conservative extension of ACA0 .

The proof uses the Dialectica interpretation, together with delicate normalization .