U.U.D.M. Project Report 2016:32
The Unprovability of the Continuum Hypothesis Using the Method of Forcing
Alvar Bjerkeng van Keppel
Examensarbete i matematik, 15 hp Handledare: Vera Koponen Examinator: Jörgen Östensson Juni 2016
Department of Mathematics Uppsala University
Abstract
The continuum hypothesis is the statement that no set has cardinality strictly greater than that of ℕ and strictly smaller than that of ℝ. We show that if ZFC is consistent, the continuum hypothesis is not provable from ZFC. This is done using the method of forcing, pioneered by Paul Cohen in 1963. Proofs and concepts are given in considerable detail to make reading as simple as possible.
1 Introduction
This thesis is meant to be a self-contained, gentle and to-the-point introduction to the method of forcing. The reader is expected to have come in contact with mathematical logic and ZFC set theory. For ease of reading, a lot of basic results in set theory are stated (but not proven) to refresh the reader's memory.
Forcing is a way of expanding a certain kind of model of set theory to a larger model. In this thesis forcing is used to prove that if ZFC is consistent, so is ZFC together with the negation of the continuum hypothesis (CH). We denote this by Cons(ZFC) ⇒ Cons(ZFC + ¬CH). This is one of the two parts in proving that the continuum hypothesis is independent of ZFC. The result was first published by Paul Cohen in his 1963 article “The Independence of the Continuum Hypothesis” [1]. In this article Cohen refers to earlier work by Kurt Gödel in 1940 proving (among other things) the other direction, Cons(ZFC) ⇒ Cons(ZFC + CH). We will only look at the Cons(ZFC) ⇒ Cons(ZFC + ¬CH) direction. This thesis mainly follows the approach of Kunen's excellent book [3].
2 The Logic of ZFC
The underlying logic of Zermelo-Fraenkel with the axiom of choice (ZFC) is classical first order logic with equality and a binary relation symbol ∈. Thus the formulas will be of the form
(x = y), (x ∈ y), (φ ∧ ψ), (¬φ), (∃x φ)
where x and y are variables and φ, ψ are formulas. In practice we drop parentheses if no clarity is lost. We let φ ∨ ψ abbreviate the logically equivalent ¬(¬φ ∧ ¬ψ), and similarly for φ → ψ and φ ↔ ψ. In the same vein, ∀x φ abbreviates ¬∃x ¬φ. This is done so that proofs by induction on the structure of formulas have fewer cases. As a final convenience, the symbol ⊥ is used in a few places to denote falsehood in a formula. A possible definition of this symbol is thus
⊥ :⇔ ∃x(x ∈ x ∧ x ∉ x).
At a glance, the language of first order logic does not seem expressive enough, or at least not terse enough, to formulate interesting notions in. We deal with this by defining new notation as shorthand for long formulas. Two examples of this are subset
x ⊆ y :⇔ ∀z(z ∈ x → z ∈ y)
and set comprehension
x = {y ∈ Y | φ(y)} :⇔ ∀y(y ∈ x ↔ y ∈ Y ∧ φ(y)).
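To make the role of the abbreviations concrete, the following is a minimal Python sketch that expands them back into the five primitive forms. The tuple encoding and tag names are assumptions of this sketch, and so are the particular expansions chosen for → and ↔, since the text only says they are handled "similarly" to ∨.

```python
# Formulas are nested tuples. The encoding is hypothetical, chosen for this sketch.
# Primitive forms: ("eq", x, y), ("in", x, y), ("and", p, q), ("not", p), ("exists", x, p)
# Abbreviations:   ("or", p, q), ("implies", p, q), ("iff", p, q), ("forall", x, p)

def desugar(f):
    """Rewrite a formula so that only the five primitive forms remain."""
    tag = f[0]
    if tag in ("eq", "in"):
        return f
    if tag == "not":
        return ("not", desugar(f[1]))
    if tag == "and":
        return ("and", desugar(f[1]), desugar(f[2]))
    if tag == "exists":
        return ("exists", f[1], desugar(f[2]))
    if tag == "or":       # p ∨ q  becomes  ¬(¬p ∧ ¬q), as in the text
        return ("not", ("and", ("not", desugar(f[1])), ("not", desugar(f[2]))))
    if tag == "implies":  # p → q  becomes  ¬(p ∧ ¬q)  (assumed expansion)
        return ("not", ("and", desugar(f[1]), ("not", desugar(f[2]))))
    if tag == "iff":      # p ↔ q  becomes  (p → q) ∧ (q → p)  (assumed expansion)
        p, q = f[1], f[2]
        return ("and", desugar(("implies", p, q)), desugar(("implies", q, p)))
    if tag == "forall":   # ∀x p  becomes  ¬∃x ¬p, as in the text
        return ("not", ("exists", f[1], ("not", desugar(f[2]))))
    raise ValueError(f"unknown formula tag: {tag!r}")

# x ⊆ y abbreviates ∀z (z ∈ x → z ∈ y)
subset_xy = ("forall", "z", ("implies", ("in", "z", "x"), ("in", "z", "y")))
expanded = desugar(subset_xy)
```

An induction on the structure of formulas then only needs cases for the five primitive tags, since every formula can be passed through `desugar` first.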
If the reader wants to refresh his or her memory about the axioms of ZFC they are included in an appendix at the back.
3 Metatheory and Formalism
As in the vast majority of mathematical works, we will be using mathematical prose as opposed to some formal (and therefore tedious) proof system. We will however implicitly assume that any of our proofs can be translated into a formal proof in some proof system for classical first order logic, except in a few special cases. These cases touch on the two separate types of logic used. On the one hand, we have the logic of ZFC and on the other hand, the logic of the metatheory. One problematic proof is of the form “for all formulas φ1, . . . , φn, X is true” where X depends on φ1, . . . , φn. Since we cannot quantify over formulas within first order logic, such statements and proofs cannot be translated as is. However, in the few cases where this pops up, we will see that for each concrete choice of formulas φ1, . . . , φn a formal proof in first order logic can be extracted for X in a straightforward way.
A special case of this is proofs by induction on the structure of formulas. From such a proof we can extract a first order proof by manually unwinding the inductive steps until no reference to the induction hypothesis is needed. This is akin to proving P(83) without induction given a proof by induction for ∀n P(n). We then have a proof for P(0) and a proof for ∀n(P(n) → P(n + 1)) where we for simplicity assume no induction is used. By specializing ∀n(P(n) → P(n + 1)) for each n = 0, . . . , 82 and applying modus ponens 83 times, we get a proof for P(83) without using induction.
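The unwinding of an induction into a finite chain of modus ponens steps can be sketched in Python as follows. The step encoding and the names `unwind` and `check` are hypothetical, introduced only for this illustration. The checker accepts a step only if its premises were established earlier.

```python
def unwind(n):
    """Build an induction-free derivation of P(n) as a list of steps."""
    steps = [("base", 0)]               # P(0), from the proof of the base case
    for k in range(n):
        steps.append(("instance", k))   # P(k) -> P(k+1), specializing the inductive step
        steps.append(("mp", k))         # modus ponens on P(k) and P(k) -> P(k+1)
    return steps

def check(steps):
    """Return the set of all m for which P(m) has been derived by the steps."""
    proved, implications = set(), set()
    for kind, k in steps:
        if kind == "base":
            if k != 0:
                raise ValueError("only P(0) is available as a base case")
            proved.add(0)
        elif kind == "instance":
            implications.add(k)
        elif kind == "mp":
            if k not in proved or k not in implications:
                raise ValueError(f"modus ponens at {k} lacks a premise")
            proved.add(k + 1)
        else:
            raise ValueError(f"unknown step kind: {kind!r}")
    return proved

proof = unwind(83)
derived = check(proof)
```

Here `proof` contains one base step, 83 instances of the inductive step and 83 applications of modus ponens, and no step refers to an induction principle.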
The underlying structure of the proof is as follows. To prove Cons(ZFC) ⇒ Cons(ZFC + ¬CH), we show the equivalent
(ZFC + ¬CH ⊢ ⊥) ⇒ (ZFC ⊢ ⊥), (⋆)
where T ⊢ φ stands for “there is a formal proof of φ using assumptions from T”. To do this we assume the finite assumption property of the formal proof system: if T ⊢ ψ for some possibly infinite list of formulas T, then there is a finite sublist φ1, φ2, . . . , φn of T such that φ1, φ2, . . . , φn ⊢ ψ. We use this to reformulate (⋆) into cases, one for each finite sublist φ1, φ2, . . . , φn of ZFC, where we let Φ ≡ φ1 ∧ φ2 ∧ ... ∧ φn:
(Φ ∧ ¬CH ⊢ ⊥) ⇒ (ZFC ⊢ ⊥). (†)
Let us consider a fixed Φ. We prove (†) by constructing a model¹ of Φ ∧ ¬CH in ZFC using the method of forcing pioneered by Paul Cohen [1]. This is the meat of the thesis. This model M[G] is the expansion of a base model M which we augment with an extra set G, similar to how the field extension F(g) is the smallest subfield of G containing F ∪ {g}, where F ⊂ G are fields and g ∈ G. To make “a model of” more precise, we define in our metalogic a transformation (·)^N on formulas, where φ^N is read as “φ relativized to N”. It simply bounds all quantifiers by a set N and leaves everything else intact. As an example,
(∃x (x ∈ y → x = z))^N becomes ∃x ∈ N (x ∈ y → x = z).
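Relativization is a purely syntactic recursion on formulas, which the following Python sketch tries to illustrate. The tuple encoding, the tag `exists_in` for a bounded quantifier, and the function names are assumptions of this sketch. It also assumes that the name chosen for the bounding set does not occur as a bound variable in the formula.

```python
# Hypothetical encoding: ("eq", x, y), ("in", x, y), ("not", p), ("and", p, q),
# ("implies", p, q), ("exists", x, p), and ("exists_in", x, N, p) for ∃x ∈ N p,
# which abbreviates ∃x (x ∈ N ∧ p).

def relativize(f, N):
    """Bound every quantifier in f by the set named N; leave the rest intact."""
    tag = f[0]
    if tag in ("eq", "in"):
        return f
    if tag == "not":
        return ("not", relativize(f[1], N))
    if tag in ("and", "implies"):
        return (tag, relativize(f[1], N), relativize(f[2], N))
    if tag == "exists":
        return ("exists_in", f[1], N, relativize(f[2], N))
    raise ValueError(f"unknown formula tag: {tag!r}")

def show(f):
    """Render a formula as a string."""
    tag = f[0]
    if tag == "eq":
        return f"{f[1]} = {f[2]}"
    if tag == "in":
        return f"{f[1]} ∈ {f[2]}"
    if tag == "not":
        return f"¬({show(f[1])})"
    if tag == "and":
        return f"({show(f[1])} ∧ {show(f[2])})"
    if tag == "implies":
        return f"({show(f[1])} → {show(f[2])})"
    if tag == "exists":
        return f"∃{f[1]} {show(f[2])}"
    if tag == "exists_in":
        return f"∃{f[1]} ∈ {f[2]} {show(f[3])}"
    raise ValueError(f"unknown formula tag: {tag!r}")

example = ("exists", "x", ("implies", ("in", "x", "y"), ("eq", "x", "z")))
relativized = relativize(example, "N")
```

Applied to the example from the text, `show(relativized)` yields the bounded formula ∃x ∈ N (x ∈ y → x = z), while the free variables y and z are left untouched.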
The construction of the model can then be stated as
ZFC ⊢ ∃M[G] (Φ ∧ ¬CH)^M[G].
We will gloss over something that is intuitively clear but requires a rigorous metalogical framework to define, namely that we can use deduction inside models N:
(ψ ⊢ χ) ⇒ (ZFC ⊢ ψ^N → χ^N).
This is the last piece needed to prove (†).
ZFC ⊢ ∃M[G] (Φ ∧ ¬CH)^M[G]         the constructed model
Φ ∧ ¬CH ⊢ ⊥                        the supposition in (†)
ZFC ⊢ (Φ ∧ ¬CH)^M[G] → ⊥^M[G]      deduction inside M[G]
ZFC ⊢ ∃M[G] ⊥^M[G]                 by lines one and three
ZFC ⊢ ⊥                            simplification
This concludes the rough sketch of the overarching structure of the proof. Our metatheory thus needs to support manipulation of the formal proof system, be able to prove the finite assumption property and support induction and recursion over formulas.
4 Well-founded Relations
Two concepts that are not really interesting to formalize outside of set theory are that of classes and of class functions. The intuition is simply that a class is a collection of sets that may be “bigger” than a set and a class function is a function whose domain and range are classes. Examples of classes are the class of all singleton sets and the class of all groups. An example of a class function is the Kuratowski operator that sends arbitrary sets x and y to {{x}, {x, y}}.
Definition 1. Let s1, . . . , sn be sets.
• A class A is defined by a formula φA(x, z1, . . . , zn) and the sets si. We write a ∈ A if φA(a, s1, . . . , sn) holds.
• A class function F is defined by a formula ψF(x1, . . . , xm, y, z1, . . . , zn) and the sets si where for all sets r1, . . . , rm there is a unique set y such that ψF(r1, . . . , rm, y, s1, . . . , sn) holds. If F is a class function and ψF(a1, . . . , am, b, s1, . . . , sn) is true we use F(a1, . . . , am) as a shorthand for b.
¹ The exact meaning of “model” will be explained later.
The class of all sets, called the universe of sets, is denoted by V. The choice of formula for a class or class function is usually implicit. For φV any tautology will do, x = x being the canonical choice. Every set is a class (fix z as the desired set in φ(x, z) :⇔ x ∈ z) but there are classes which are not sets. It should also be noted that classes and class functions are not objects in the underlying first-order logic; they are just an abstraction used to hide the formula they represent. Keep this in mind, as certain operations on classes cannot be justified by ZFC (such as forming the collection of all subclasses) while others yield new classes without causing any trouble. If A is a class it is, for example, sane to speak of the class {a ∈ A : φ(a)} for some φ since this is represented by the formula φA(x) ∧ φ(x).
Definition 2. A well-founded class W is a class W with a binary relation R (a class whose members are pairs of elements from W) where every nonempty subset of W has an R-minimal element; or expressed by a formula,
∀S ⊆ W (S ≠ ∅ → ∃m ∈ S ∀s ∈ S ¬sRm).
Note that for any well-founded class, ¬xRx (otherwise the set {x} has no minimal element). By using the axiom of choice there is an alternative characterization of well-founded relations that hopefully sheds more light on what structure well-foundedness imposes.
Lemma 3. The following two statements are equivalent.
• (W, R) is well-founded.
• Every function f : ℕ → W, such that for all n ∈ ℕ either f(n + 1)Rf(n) or f(n + 1) = f(n), satisfies ∃N ∈ ℕ ∀n ≥ N f(N) = f(n).
The second statement can be phrased as: every R-descending ℕ-sequence converges, that is, is eventually constant.
Proof. Assume that (W, R) is well-founded and take a function f as above. Take an N ∈ ℕ such that f(N) is an R-minimal element of range(f). Then f(N) = f(N + 1) = ··· = f(N + n) = ···, since by induction f(N + i + 1)Rf(N + i) is impossible by minimality of f(N + i) = f(N) for all i ∈ ℕ. In the other direction, assume that every descending function f converges and, towards a contradiction, assume that there is a nonempty set S ⊆ W with no R-minimal element. Let cS be a choice function on the nonempty subsets of S and let s ∈ S. Then
f(0) = s, f(n + 1) = cS({x ∈ S | xRf(n)})
defines a function, since the set {x ∈ S | xRf(n)} being empty is equivalent to f(n) being a minimal element in S. This f contradicts our assumption since f(n + 1)Rf(n), and hence f(n) ≠ f(n + 1), for all n ∈ ℕ.
This tells us that however we drop a bouncy ball down a well-founded staircase, it will only bounce on a finite number of stairs. The canonical example of a well-founded class is of course (V, ∈). That this is a well-founded class follows directly from the axiom of foundation:
∀x [∃y(y ∈ x) → ∃y(y ∈ x ∧ ¬∃z(z ∈ x ∧ z ∈ y))].
Definition 4. The powerset of a set A is denoted by 𝒫A.
Definition 5. The union {x | ∃a ∈ A(x ∈ a)} of a set A is denoted by ⋃A or ⋃_{a∈A} a.
Definition 6. A relation R on a class A is set-like if for every a ∈ A the class {x ∈ A : xRa} is a set.
Definition 7. Let R be a set-like relation on a class A and let x ∈ A. We define pred (as in predecessor) and closure by recursion as follows:
pred(A, x, R) = {a ∈ A : aRx}
pred_0(A, x, R) = pred(A, x, R)
pred_{n+1}(A, x, R) = ⋃{pred(A, a, R) : a ∈ pred_n(A, x, R)}
The set {pred_n(A, x, R) : n ∈ ℕ} exists by the replacement axiom, so
closure(A, x, R) = ⋃{pred_n(A, x, R) : n ∈ ℕ}
is well defined.
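For a finite set A the recursion in Definition 7 can be computed directly, which may help in reading the definition. The following Python sketch assumes that A is a finite Python set and that R is given as a set of pairs (a, b) meaning aRb. The function names mirror the definition, and the early exit in `closure` is a shortcut that is valid only because A is finite here.

```python
def pred(A, x, R):
    """pred(A, x, R) = {a in A : a R x}, with R a set of pairs (a, b) meaning a R b."""
    return frozenset(a for a in A if (a, x) in R)

def pred_n(A, x, R, n):
    """The n-th iterate: pred_0 is pred, and pred_{n+1} unions pred over pred_n."""
    level = pred(A, x, R)
    for _ in range(n):
        level = frozenset().union(*(pred(A, a, R) for a in level))
    return level

def closure(A, x, R):
    """Union of all pred_n(A, x, R).

    Once a level adds no new elements, no later level can either,
    so on a finite A it is safe to stop at that point.
    """
    result = set()
    level = pred(A, x, R)
    while not level <= result:
        result |= level
        level = frozenset().union(*(pred(A, a, R) for a in level))
    return frozenset(result)

# Example: the successor relation a R b iff b = a + 1 on {0, ..., 5}.
A = set(range(6))
R = {(a, a + 1) for a in range(5)}
below_three = closure(A, 3, R)
```

In this example pred(A, 3, R) contains only 2, while closure(A, 3, R) collects 2, 1 and 0, so the closure plays the role of the set of all elements reachable by going down along R.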
Proposition 8. If a relation R is set-like and well-founded on a class W then any nonempty subclass W′ of W has an R-minimal element.
Proof. This is seen by fixing a v ∈ W′ and taking an R-minimal element m of the set
S = W′ ∩ ({v} ∪ closure(W′, v, R)).
If some m′ ∈ W′ satisfied m′Rm, then m′ ∈ S by construction of S, contradicting the minimality of m in S. So m must be an R-minimal element of W′.
We now have all the machinery to prove that induction over well-founded sets is valid. To get a familiar special case of this general induction principle, take (W, R) = (ℕ, <). This becomes