The Lebesgue Measure and Large Cardinals

Jacinto Carlos Ferreira Monteiro de Freitas França

Thesis to obtain the Master of Science Degree in Mathematics and Applications

Supervisor: Prof. António Marques Fernandes

Examination Committee
Chairperson: Prof. Maria Cristina Sales Viana Serôdio Sernadas
Supervisor: Prof. António Marques Fernandes
Members of the Committee: Prof. Luís Pereira, Prof. Fernando Ferreira

July 2017

Dedicated to my Father and to the Three Gems

Acknowledgments

This work would not have been possible without the support of generous people during the research and writing process and throughout my years at Instituto Superior Técnico. First I would like to express my gratitude to my supervisor, Prof. António Marques Fernandes, for his guidance and expertise during the research and writing of this thesis. In addition to the countless technical corrections, and to the generosity with his time, his advice and thoughtful discussions revealed numerous times the layers of subtlety and complexity in the field of set theory that I had not noticed. I would like to thank Prof. Ana Leonor Silvestre for her support during the MSc. program. I would also like to thank Prof. João Rasga for the high quality of his supervision of my BSc. final project. I am grateful to Prof. Paulo Pinto and to Prof. Cristina Câmara for their encouragement and support of my academic aspirations. I am also grateful to Dr. Carlos Filipe for the decisive impact of his help in my life. And I am indebted for the second opportunity that Instituto Superior Técnico gave me, and for the high quality of the education provided by this institution and its faculty members. I am thankful to my family for their encouragement, support, and advice during the difficult periods of my life. My deepest gratitude is to my father, João França, for the immeasurable help he gave me throughout my life and for the strength of character he so often revealed. Finally, I would like to thank the Triple Gem for providing the invaluable instructions that allowed me to surpass the enormous difficulties that characterized my earlier life.

Resumo

One of the factors that motivated major advances in set theory was the Lebesgue measure. The Lebesgue measure is the fundamental component of Lebesgue's theory of integration, which is a generalization of Riemann's theory of integration. Although the Lebesgue integral has many advantages, the Lebesgue measure cannot measure all sets of reals. However, there is an alternative when we adopt a weaker version of the axiom of choice, namely the principle of dependent choices (DC). Solovay's Theorem states that if there is a model of ZFC with an inaccessible cardinal, then there is a model of ZF+DC in which all sets of reals are Lebesgue measurable. On the other hand, Shelah's Theorem clarifies that when there is a model of ZF+DC in which all sets of reals are Lebesgue measurable, there is also a model of ZFC with an inaccessible cardinal. This thesis is a detailed exposition of the theorems of Solovay and Shelah, and of their respective proofs. The prerequisites for the proofs are mainly the model theory of sets; the method of forcing, in particular the Lévy collapse and the random algebra; and descriptive set theory, with a focus on Lebesgue measurability. These prerequisites are presented after a brief review of the Lebesgue measure. The two main theorems are then proved in due detail.

Keywords: Lebesgue Measure, Inaccessible Cardinal, Principle of Dependent Choices, Lévy Collapse, Random Algebra, Rapid Filter

Abstract

One of the significant driving factors behind major developments in set theory has been the Lebesgue measure. The Lebesgue measure is the cornerstone of Lebesgue's theory of integration, which is a generalization of Riemann's theory of integration. Despite the many advantages of the Lebesgue integral, the Lebesgue measure cannot measure every set of reals. However, an alternative can be found if we adopt a weaker version of the axiom of choice, namely the principle of dependent choices (DC). Solovay's Theorem establishes that if there is a model of ZFC with an inaccessible cardinal, then there is a model of ZF+DC in which every set of reals is Lebesgue measurable. On the other hand, Shelah's Theorem clarifies that when there is a model of ZF+DC in which every set of reals is Lebesgue measurable, there is also a model of ZFC with an inaccessible cardinal. This thesis is a detailed exposition of the Solovay and Shelah theorems, and their respective proofs. The prerequisites for the proofs are mainly the model theory of sets; the method of forcing, particularly the Lévy collapse and the random algebra; and descriptive set theory, focused on the aspect of Lebesgue measurability. These prerequisites are presented after a brief overview of the Lebesgue measure. Then the two main theorems are proved in due detail.

Keywords: Lebesgue Measure, Inaccessible Cardinal, Principle of Dependent Choices, Lévy Collapse, Random Algebra, Rapid Filter

Contents

Acknowledgments
Resumo
Abstract
List of Figures
Nomenclature
Glossary

1 Introduction

2 The Lebesgue Measure
    2.1 The Lebesgue Measure

3 Preliminaries of Set Theory
    3.1 Models of Set Theory
    3.2 Forcing and the Lévy Collapse
        3.2.1 The Theory of Forcing
        3.2.2 Examples
        3.2.3 The Lévy Collapse
    3.3 The Lebesgue Measure and Descriptive Set Theory

4 The Lebesgue Measure and Large Cardinals
    4.1 Solovay's Theorem
    4.2 Shelah's Theorem
    4.3 Further Results

5 Conclusions

Bibliography

A Proof of Solovay's Technical Lemma

B Proof of Shelah's Auxiliary Lemma

List of Figures

3.1 Geometric representation of the forcing method

Chapter 1

Introduction

Set theory is a branch of Mathematics pioneered by Georg Cantor in the late XIX century. He established for the first time that there are different sizes of infinity and that there is no upper bound for them. In the subsequent period, set theory was developed and expanded. However, it was still in a nonaxiomatic stage in the early XX century. There was a naive assumption that the set of all sets existed, but Bertrand Russell found a logical paradox following from this assumption. To avoid paradoxes of this kind, mathematicians had to find an appropriate axiomatization for set theory. The axiom system that eventually became the standard axiomatization of set theory was the Zermelo-Fraenkel axioms with choice, abbreviated to ZFC. Virtually all of Mathematics can be formalized in ZFC. Because of this, set theory is regarded as the foundation of Mathematics.

Set theory became a branch of mathematical research in its own right in great part due to Kurt Gödel. He proved that the axiom of choice (AC) and the continuum hypothesis (CH) are consistent with ZF, provided that ZF is consistent. Gödel accomplished this by defining the class L of constructible sets and proving that L is a model of ZFC+CH. This put to rest early concerns about the axiom of choice. Gödel also established his famous incompleteness theorems in 1931. These theorems entail that any consistent system of axioms strong enough to interpret first-order arithmetic is necessarily incomplete. These groundbreaking results eliminated the possibility of deciding all statements from a single set of sufficiently strong axioms, such as the axioms of set theory. Among the undecidable statements in ZFC are the large cardinal hypotheses. Gödel conjectured that if we assumed the existence of a large cardinal, in addition to the ZFC axioms, then we could prove new results that were previously undecidable in ZFC. Gödel also conjectured that this process could be repeated by adding even larger cardinals to improve the strength of the previously strengthened axiom system. Since there is no maximum cardinal, the process of adding larger cardinal hypotheses would eventually decide all propositions by resorting to sufficiently strengthened systems of axioms. This was Gödel's program for set theory.

The next great innovation in set theory came from Paul Cohen, in 1963. He proved the independence of AC and CH from ZF using the method of forcing, a mathematical technique that he created. Forcing is a method of adding a convenient object to a proposed model of set theory to force the resulting extended model to satisfy desired properties. This method, in its multiple forms, proved to be incredibly powerful.

It has been used to establish that there are propositions that are consistent relative to ZFC, or even independent of ZFC. It has also been used to measure the strength of certain propositions by comparing them with large cardinal hypotheses. Among the most common of these propositions are the axiom of choice, the continuum hypothesis, and the axiom of constructibility (V = L). A (partial) hierarchy of the size of large cardinals was established. As Gödel conjectured, large cardinal hypotheses made it possible to prove new results and to shed light on the other areas of set theory. The study of large cardinals is now inseparably interwoven with other areas of set theory. Despite their many successes, the known large cardinal hypotheses do not decide CH.

Set theory is a branch of mathematics that is more isolated from the other branches than the norm. Its direct consequences for mathematics as a whole have been comparatively limited, at least for now. But the study of large cardinals, for example, may have an enormous impact on the whole of mathematics. We only have to look at Euclid's axioms, the new geometries that were created by changing these axioms, and modern differential geometry, to realize the potentially dramatic consequences of the study of large cardinals for mathematics. Additionally, numerous models of ZFC can be found with the method of forcing. This abundance of models, in a way, makes set theory resemble group theory more than the theory of a single desired model of the theory of sets. Thus, set theory and the study of large cardinals may have profound implications for mathematics, as did non-Euclidean geometry and abstract algebra.

The other main theme of this thesis is the Lebesgue measure. The Lebesgue measure was created to solve important limitations in the theory of integration, thus providing a better way to measure the length, area, or volume of a given set. But, despite hopes to the contrary, the Lebesgue measure is not able to measure all sets of reals. A counterexample was devised by Giuseppe Vitali in 1905 which required the AC. If the AC is substituted by a weaker axiom, namely the principle of dependent choices (DC), then it is possible to circumvent Vitali's counterexample and find a model of ZF+DC in which all sets of reals are Lebesgue measurable. To find this model, we assume that there is a model of ZFC with an inaccessible cardinal. This result was established in 1964 by Robert Solovay ([15]). In fact, Solovay proved that if there is a model of ZFC with an inaccessible cardinal, then there is a model of ZF+DC in which every set of reals is Lebesgue measurable, has the Baire property, and has the perfect set property. Even though inaccessible cardinals are among the smallest of the large cardinals, their existence is sufficient to find a model with these three regularity properties.

It is natural to ask if the existence of an inaccessible cardinal is not just sufficient, but also necessary in Solovay's Theorem. Since 1957 it has been known, due to Ernst Specker, that an inaccessible cardinal is necessary for the perfect set property. In 1984, Saharon Shelah published a paper showing that the inaccessible cardinal hypothesis is not necessary for the Baire property, but is necessary for the Lebesgue measurability of all sets of reals ([14]). The main goal of this thesis is to present the two theorems established by Solovay and Shelah, with exclusive focus on Lebesgue measurability, and to explain the strategies involved in each of the proofs.

In Chapter 2, we review the Lebesgue measure in the traditional context of R, so that we can see the problems that gave rise to it, and so that we can use the Lebesgue measure's properties in later chapters in the context of set theory. Then we present the preliminary set-theoretic material that is required for the main theorems of this thesis. We start in Chapter 3.1 with the basic model theory of sets, by defining some of the most common models of set theory, and reviewing their properties. We proceed in Chapter 3.2 to introduce the theory of forcing, to explore some notable examples of this mathematical technique, and to analyze in detail a particular notion of forcing that is used in the proof of Solovay's Theorem, called the Lévy collapse. Then we introduce the basic concepts of descriptive set theory in Chapter 3.3, with a predominant emphasis on the Lebesgue measure. In Chapters 4.1 and 4.2 we prove Solovay's Theorem and Shelah's Theorem, respectively. We end in Chapter 4.3 with a brief overview of notable results in set theory related to the Lebesgue measure.

In this thesis, we try to strike a balance in the exposition, avoiding material that is too basic for the experts or too advanced for mathematics students unfamiliar with set theory. So we assume that the reader is familiar with the most basic notions of set theory, such as well-orderings, ordinals, cardinals, and the standard axioms of set theory. Building on this assumption, we present the preliminary set-theoretic material without proofs, except when it is directly relevant to one of the main theorems.

Bibliographical Sources

The primary bibliographical source for the Lebesgue measure in Chapter 2 was Ricou [11], and the secondary source was Bronshtein, et al. [2]. For the basic model theory of sets in Chapter 3.1, the main source was Jech [4]. The primary source for the forcing method in Chapter 3.2 was Schindler [12], and the secondary source was Kunen [6]. The main sources for descriptive set theory in Chapter 3.3 were Schindler [12] and Kanamori [5], and the auxiliary source, regarding topology, was Munkres [8]. For Solovay’s Theorem in Chapter 4.1, the primary source was Schindler [12]. The main sources for Shelah’s Theorem in Chapter 4.2 were Bekkali [1] and Semmes [13]. For the concise overview of results in set theory related to the Lebesgue measure in Chapter 4.3, the primary source was Foreman and Kanamori [3].

Chapter 2

The Lebesgue Measure

Introduction

The area between the graph of a nonnegative function f : [a, b] → R and the x axis is measured by the integral of f. The common notion of integral, taught in introductory calculus courses, is that of the Riemann integral. However, this form of integration has significant shortcomings:

(1) It is not always true that $\int_a^b \lim_{n\to\infty} f_n(t)\,dt = \lim_{n\to\infty} \int_a^b f_n(t)\,dt$.

(2) It is not always true that $\int_a^b \sum_{n=1}^{\infty} f_n(t)\,dt = \sum_{n=1}^{\infty} \int_a^b f_n(t)\,dt$.

(3) It is not always true that if $f, g : [a, b] \to \mathbb{R}$ satisfy $f(t) = g(t)$, except on a null subset¹ of $[a, b]$, then $\int_a^b f(t)\,dt = \int_a^b g(t)\,dt$.

(4) The Riemann integral does not provide a satisfactory relation between the two fundamental theorems of calculus.

(i) (First Fundamental Theorem of Calculus). Let $f$ be a Riemann-integrable function on the interval $I = [a, b]$, and $F$ be given by $F(x) = \int_a^x f(t)\,dt + F(a)$ on $I$. Then $F$ is continuous on $I$ and $F'(x) = f(x)$, except on a null subset of $I$.

(ii) (Second Fundamental Theorem of Calculus). Let $F$ be continuous on $I$, $f$ be Riemann-integrable on $I$, and $F'(t) = f(t)$, except on a finite set $D \subset I$. Then $F(x) - F(a) = \int_a^x f(t)\,dt$.

The conclusion of the Second Fundamental Theorem of Calculus does not hold if we assume that D is null. Therefore, the sense in which these two theorems are the conceptual inverses of each other has a significant limitation.

This list, although not exhaustive, encapsulates the main problems with the Riemann integral. The Lebesgue integral is a generalization of the Riemann integral that does not have these shortcomings. This makes the Lebesgue integral a better theoretical framework for integration. The core notion of this theory of integration is the Lebesgue measure. Once the Lebesgue measure is defined, Lebesgue's theory of integration arises naturally and elegantly from it.

¹ A set $A \subset \mathbb{R}$ is null if, for every $\varepsilon > 0$, there is a countable collection of intervals $I_k = \,]a_k, b_k[$ such that $\sum_k (b_k - a_k) < \varepsilon$ and $A \subset \bigcup_k I_k$. See Definition 2.1.13 below.
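The notion of a null set used in shortcoming (3) can be illustrated with the rationals of [0, 1], which form a countable set and hence a null set. The following is a minimal Python sketch, not taken from the thesis: the function names and the particular enumeration are assumptions made for illustration. It covers the k-th rational by an open interval of length ε/2^(k+2), so the total length of any finite part of the cover stays below ε/2, and the full infinite cover has total length exactly ε/2 < ε.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(count):
    """Return the first `count` rationals of [0, 1] in a fixed enumeration
    (by increasing denominator, fractions in lowest terms)."""
    found = []
    q = 1
    while len(found) < count:
        for p in range(q + 1):
            if gcd(p, q) == 1:  # lowest terms, so each rational appears once
                found.append(Fraction(p, q))
                if len(found) == count:
                    break
        q += 1
    return found

def cover(points, eps):
    """Cover the k-th point by an open interval of length eps / 2**(k + 2)."""
    intervals = []
    for k, x in enumerate(points):
        half = eps / 2 ** (k + 3)  # half of the interval length
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)
print(total_length < eps)
```

Exact rational arithmetic is used so that the comparison with ε is not affected by rounding. The sketch only ever handles finitely many intervals; the statement about the whole of Q ∩ [0, 1] follows from the geometric series.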

Lebesgue presented this theory of integration in his doctoral thesis in 1902. However, despite hopes to the contrary, it was soon discovered that not all sets of reals are Lebesgue measurable. A counterexample was devised by Giuseppe Vitali in 1905 using the Axiom of Choice (AC). This contributed to the controversy surrounding AC during the crisis in the foundations of mathematics. The controversy was eventually put to rest after Gödel's proof of the relative consistency of AC with ZF in 1935.

However, it is possible to use weaker versions of AC, such as the Principle of Dependent Choices² (DC), and still obtain most of the results of mathematical analysis. Then a question arises naturally: is it possible to find a model of ZF + DC in which all sets of reals are Lebesgue measurable? The answer is yes, and this result was established by Solovay in 1964 (published in 1970). But he used one extra assumption, the existence of an inaccessible cardinal. This assumption is not redundant because the existence of an inaccessible cardinal cannot be derived from ZFC. And, if this assumption is dropped, Solovay's Theorem does not hold. That is, if every set of reals in a given model of ZF + DC is Lebesgue measurable, then there exists a model of ZFC with an inaccessible cardinal. This converse result was established by Shelah in 1979 (published in 1984). Henceforth we will refer to this result as Shelah's Theorem.

The study of the Lebesgue measure inspired other important developments in set theory. Notable among these are the measurable cardinals, which arise from a generalized version of the measure problem originally proposed by Lebesgue. In this chapter, we review the basic definitions and results about the Lebesgue measure. This foundational material will be used later in the set-theoretical context.

2.1 The Lebesgue Measure

The Lebesgue measure of a subset of $\mathbb{R}$, $\mathbb{R}^2$, or $\mathbb{R}^3$ is a generalization of the traditional concepts of length, area, and volume, respectively. The goal of this generalization is to be able to measure as many sets as possible (and thus to integrate as many functions as possible), while obtaining a good theory of integration. Let us start with the easiest sets to measure.

Definition 2.1.1. Let $I_1, \dots, I_n \subset \mathbb{R}$ be intervals.
(1) A set $I \subset \mathbb{R}^n$ is called an n-rectangle if $I = I_1 \times \dots \times I_n$.
(2) If $I = [a_1, b_1] \times \dots \times [a_n, b_n]$, then we define the n-volume of $I$ as $v_n(I) = (b_1 - a_1) \cdots (b_n - a_n)$. The n-volume of $I$ is computed in the same way in case any of the $I_k$ is not closed, but still bounded.
(3) If $I$ is an unbounded n-rectangle, then the convention used for computing its n-volume is:
(a) If some $I_k$ satisfies $v_1(I_k) = 0$, then $v_n(I) = 0$;
(b) If no $I_k$ satisfies $v_1(I_k) = 0$, and there is at least one $I_k$ such that $v_1(I_k) = +\infty$, then $v_n(I) = +\infty$.
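The convention of Definition 2.1.1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the thesis: each side of the rectangle is represented by a pair of endpoints (a, b) with a ≤ b, endpoints may be ±infinity, and the names `length` and `n_volume` are hypothetical. Whether the endpoints belong to the interval is irrelevant for the volume, so it is not recorded.

```python
import math

def length(interval):
    """1-volume of an interval given by its endpoints (a, b), with a <= b."""
    a, b = interval
    return b - a

def n_volume(sides):
    """n-volume of the rectangle I_1 x ... x I_n, following Definition 2.1.1."""
    lengths = [length(side) for side in sides]
    if any(l == 0 for l in lengths):
        return 0  # case (3)(a): a degenerate side wins, even against +infinity
    if any(math.isinf(l) for l in lengths):
        return math.inf  # case (3)(b): no degenerate side, one unbounded side
    return math.prod(lengths)  # case (2): bounded rectangle

print(n_volume([(0, 2), (1, 4)]))                 # bounded 2-rectangle
print(n_volume([(0, 0), (0, math.inf)]))          # degenerate side
print(n_volume([(0, 1), (-math.inf, math.inf)]))  # unbounded strip
```

The degenerate case is tested first because a naive product would evaluate 0 · (+∞), which floating-point arithmetic reports as NaN and not as the value 0 required by the convention.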

² See Definition 3.1.1 (2).

Measures on σ-Algebras

The main distinguishing feature of the Lebesgue measure is σ-additivity, which allows us to measure a countable union of disjoint sets by adding the measures of the individual sets. But before we can define the Lebesgue measure, we need the appropriate framework on which to define it. That framework is a σ-algebra.

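Definition 2.1.2. Let $X$ be an arbitrary set. A nonempty collection $\mathcal{A} \subset P(X)$ is called a σ-algebra on $X$ when the following two conditions hold:
(a) If $A \in \mathcal{A}$, then $X - A \in \mathcal{A}$;
(b) If $A_1, \dots, A_n, \dots \in \mathcal{A}$, then $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{A}$.

Proposition 2.1.3. Let $\mathcal{A}$ be a σ-algebra on $X$. Then
(1) $\emptyset, X \in \mathcal{A}$.
(2) If $A, B \in \mathcal{A}$, then $B - A \in \mathcal{A}$.
(3) If $A_1, \dots, A_n, \dots \in \mathcal{A}$, then $\bigcap_{n \in \mathbb{N}} A_n \in \mathcal{A}$.
(4) If $\{\mathcal{A}_i : i \in I\}$ is a collection of σ-algebras, then $\bigcap_{i \in I} \mathcal{A}_i$ is a σ-algebra.

We can now define a measure on a σ-algebra.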

Definition 2.1.4. Let $\mathcal{A}$ be a σ-algebra on $X$.
(1) A measure on $\mathcal{A}$ is a function $\mu : \mathcal{A} \to [0, +\infty]$ such that
(a) $\mu(A) \geq 0$;
(b) $\mu(\emptyset) = 0$;
(c) If $A_1, \dots, A_n, \dots \in \mathcal{A}$ are pairwise disjoint sets, then $\mu(\bigcup_{n \in \mathbb{N}} A_n) = \sum_{n=1}^{\infty} \mu(A_n)$ ($\mu$ is σ-additive).
(2) If $\mu$ is a measure, then the triplet $(X, \mathcal{A}, \mu)$ is called a measure space. The elements of $\mathcal{A}$ are called $\mathcal{A}$-measurable, or just measurable.

Example 2.1.5. A probability space is a measure space $(X, \mathcal{A}, P)$, where $P$ satisfies $P(X) = 1$. The measure is the probability function $P$. In this case no set has infinite measure because all sets $A \in \mathcal{A}$ satisfy $P(A) \leq 1$.
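On a finite set, Definitions 2.1.2 and 2.1.4 and Example 2.1.5 can be checked by direct computation, because closure under countable unions reduces to closure under finite unions. The following Python sketch is only an illustration under that finiteness assumption; the names `generated_sigma_algebra` and `P`, the set X, and the generators are all hypothetical choices. It builds the smallest σ-algebra containing the given generators and equips it with the uniform probability measure.

```python
from fractions import Fraction
from itertools import combinations

def generated_sigma_algebra(X, generators):
    """Smallest collection of subsets of the finite set X that contains the
    generators and is closed under complements and (finite) unions."""
    X = frozenset(X)
    algebra = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for A in list(algebra):
            if X - A not in algebra:  # closure under complements
                algebra.add(X - A)
                changed = True
        for A, B in combinations(list(algebra), 2):
            if A | B not in algebra:  # closure under unions
                algebra.add(A | B)
                changed = True
    return algebra

def P(A, X):
    """Uniform probability of A inside the finite set X."""
    return Fraction(len(A), len(X))

X = frozenset({1, 2, 3, 4})
algebra = generated_sigma_algebra(X, [{1}, {2}])
print(len(algebra))                                # number of measurable sets
print(P(frozenset({1}), X) + P(frozenset({2}), X)) # additivity on disjoint sets
```

With these generators the measurable sets are exactly the unions of the three blocks {1}, {2}, and {3, 4}, so the singleton {3} is not measurable even though it is a subset of X.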

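Proposition 2.1.6. Let $\mu$ be a measure on a σ-algebra $\mathcal{A}$, and $A, B, A_1, \dots, A_n, \dots \in \mathcal{A}$.
(1) If $A \subset B$, then $\mu(A) \leq \mu(B)$ ($\mu$ is monotone).
(2) $\mu(\bigcup_{n \in \mathbb{N}} A_n) \leq \sum_{n=1}^{\infty} \mu(A_n)$ ($\mu$ is σ-subadditive).
(3) If $A_1 \subset A_2 \subset \dots \subset A_n \subset \dots$, then $\mu(\bigcup_{n \in \mathbb{N}} A_n) = \lim_{n \to \infty} \mu(A_n)$ ($\mu$ is continuous from below).
(4) If $\mu(A_1) < +\infty$, and $A_1 \supset A_2 \supset \dots \supset A_n \supset \dots$, then $\mu(\bigcap_{n \in \mathbb{N}} A_n) = \lim_{n \to \infty} \mu(A_n)$ ($\mu$ is continuous from above).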

The most important σ-algebras in measure theory are the Borel algebra and the Lebesgue algebra.

Definition 2.1.7. The Borel algebra on $\mathbb{R}^n$, denoted by $\mathcal{B}(\mathbb{R}^n)$, is the smallest σ-algebra containing all the open subsets of $\mathbb{R}^n$. That is, $\mathcal{B}(\mathbb{R}^n)$ is the intersection of every σ-algebra containing all the open subsets of $\mathbb{R}^n$. The elements of $\mathcal{B}(\mathbb{R}^n)$ are called Borel sets.

The definition of the Lebesgue algebra is less simple. We need to introduce outer measures.

Outer Measures

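Definition 2.1.8. A function $\mu^* : P(\mathbb{R}^n) \to [0, +\infty]$ is called an outer measure if the following conditions are satisfied:
(a) $\mu^*(\emptyset) = 0$;
(b) If $A \subset B$, then $\mu^*(A) \leq \mu^*(B)$;
(c) If $A_1, \dots, A_n, \dots \subset \mathbb{R}^n$, then $\mu^*(\bigcup_{n \in \mathbb{N}} A_n) \leq \sum_{n=1}^{\infty} \mu^*(A_n)$.

Example 2.1.9. Let $m_L^* : P(\mathbb{R}^n) \to [0, +\infty]$ be a function defined by

$$m_L^*(A) = \inf \Big\{ \sum_{k=1}^{\infty} v_n(I_k) : A \subset \bigcup_{k \in \mathbb{N}} I_k \Big\},$$

where the infimum is taken over all the coverings of $A \subset \mathbb{R}^n$ by countable collections of n-rectangles $I_k$. It is easy to see that $m_L^*$ is an outer measure.

Definition 2.1.10. Let $\mu^*$ be an outer measure on $\mathbb{R}^n$. A set $A \subset \mathbb{R}^n$ is called $\mu^*$-measurable (or just measurable) if, for all $Y \subset \mathbb{R}^n$,

$$\mu^*(Y) = \mu^*(Y \cap A) + \mu^*(Y - A).$$

Theorem 2.1.11 (Carathéodory). Let $\mu^*$ be an outer measure on $\mathbb{R}^n$, and $\mathcal{M}(\mathbb{R}^n)$ be the collection of $\mu^*$-measurable subsets of $\mathbb{R}^n$. Then $\mathcal{M}(\mathbb{R}^n)$ is a σ-algebra, and the restriction $\mu^* \restriction \mathcal{M}(\mathbb{R}^n)$ is a measure on $\mathcal{M}(\mathbb{R}^n)$.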

The Lebesgue Measure

Definition 2.1.12. The Lebesgue algebra, denoted by $\mathcal{L}(\mathbb{R}^n)$, is the σ-algebra obtained by the Carathéodory theorem for the outer measure $m_L^*$. If $A \in \mathcal{L}(\mathbb{R}^n)$, then we say that $A$ is Lebesgue measurable, or just L-measurable. We write $m_L(A) = m_L^*(A)$ for the L-measure of $A$.

It follows that all n-rectangles $I$ are L-measurable, and $m_L(I) = v_n(I)$. Before stating the basic properties of the L-measure, we need the following definitions.

Definition 2.1.13. An $F_\sigma$ set is a set expressible as a countable union of closed sets. A $G_\delta$ set is a set expressible as a countable intersection of open sets. A null set $A$ is a set that satisfies $m_L^*(A) = 0$.

Theorem 2.1.14. Let $m_L : \mathcal{L}(\mathbb{R}^n) \to [0, +\infty]$ be the Lebesgue measure.
(1) Every null set is L-measurable.
(2) Every subset of a null set is null.
(3) If $A_1, \dots, A_n, \dots$ are null sets, then $\bigcup_{n \in \mathbb{N}} A_n$ is a null set.
(4) If $A_1, \dots, A_n, \dots \in \mathcal{L}(\mathbb{R}^n)$ are pairwise disjoint sets, then $m_L(\bigcup_{n \in \mathbb{N}} A_n) = \sum_{n=1}^{\infty} m_L(A_n)$ ($m_L$ is σ-additive).
(5) If $A$ is L-measurable, then $m_L(A) = \inf \{ m_L(U) : U \subset \mathbb{R}^n \text{ is open and } A \subset U \}$.
(6) If $A$ is L-measurable, then $m_L(A) = \sup \{ m_L(K) : K \subset \mathbb{R}^n \text{ is compact and } K \subset A \}$.
(7) If $A$ is L-measurable, then there is an $F_\sigma$ set $F$, and a $G_\delta$ set $G$, such that $F \subset A \subset G$ and $m_L(G - F) = 0$.
(8) All Borel sets are L-measurable, i.e., $\mathcal{B}(\mathbb{R}^n) \subset \mathcal{L}(\mathbb{R}^n)$.

We can obtain the L-measurable sets using only the Borel sets and the null sets. In fact, this will be the definition of Lebesgue measurability in the set-theoretical context.

Theorem 2.1.15. A set $A \subset \mathbb{R}^n$ is L-measurable iff there is a Borel set $B$ such that the symmetric difference $A \Delta B = (A - B) \cup (B - A)$ is a null set.

A property is said to hold almost everywhere on $A$, or a.e. on $A$, if it holds on $A$ except on a null subset of $A$.

Theorem 2.1.16 (Fubini). Let $A \subset \mathbb{R} \times \mathbb{R}$ be L-measurable. Then the sections $A_x = \{ y \in \mathbb{R} : (x, y) \in A \}$ are null a.e. on $\mathbb{R}$ iff $A$ is null, iff the sections $A^y = \{ x \in \mathbb{R} : (x, y) \in A \}$ are null a.e. on $\mathbb{R}$.

The Measure Problem

The measure space $(\mathbb{R}^n, \mathcal{L}(\mathbb{R}^n), m_L)$ can be characterized by four properties.

Theorem 2.1.17. $(\mathbb{R}^n, \mathcal{L}(\mathbb{R}^n), m_L)$ is the unique measure space of $\mathbb{R}^n$ with these four properties:
(1) If $I$ is an n-rectangle, then $I \in \mathcal{L}(\mathbb{R}^n)$, and $m_L(I) = v_n(I)$.
(2) For every $x \in \mathbb{R}^n$, and $A \in \mathcal{L}(\mathbb{R}^n)$, the set $x + A = \{ x + a : a \in A \} \in \mathcal{L}(\mathbb{R}^n)$, and $m_L(A) = m_L(x + A)$ ($m_L$ is translation invariant).
(3) If $A \in \mathcal{L}(\mathbb{R}^n)$ is null, and $B \subset A$, then $B \in \mathcal{L}(\mathbb{R}^n)$, and it is null.
(4) $\mathcal{B}(\mathbb{R}^n) \subset \mathcal{L}(\mathbb{R}^n)$.

We have been studying the L-measure on a σ-algebra. However, when Lebesgue formulated the Measure Problem in his doctoral thesis he was trying to find a measure for all subsets of $\mathbb{R}$. The original formulation is the following.

Definition 2.1.18 (The Measure Problem). Is there a function $\mu : P(\mathbb{R}) \to [0, +\infty]$ such that:
(1) $\mu([a, b]) = b - a$;
(2) If $x \in \mathbb{R}$, then $\mu(A) = \mu(x + A)$; and
(3) $\mu$ is σ-additive?

The Lebesgue measure $m_L$ on $\mathcal{L}(\mathbb{R})$ satisfies the three conditions, so long as $A \in \mathcal{L}(\mathbb{R})$ in condition (2). But not all subsets of $\mathbb{R}$ are L-measurable. There is a counterexample, devised by Vitali, showing that $\mathcal{L}(\mathbb{R}) \neq P(\mathbb{R})$.

Vitali’s Example

Let us define x ∼ y iff x − y ∈ Q. It is easy to see that ∼ is an equivalence relation. The equivalence class of x is [x] = {x + q : q ∈ Q}. Fix x ∈ R. Then every open interval contains representatives of [x]. In particular, there is a q ∈ Q, such that 0 < x + q < 1. The real number v = x + q is a representative of [x] in ]0, 1[, but it is not unique. We can pick a unique representative of each [x] in ]0, 1[ to form a set V ⊂ ]0, 1[ . Note that the choice of the representatives involves the AC.

Theorem 2.1.19 (Vitali). The set V is not L-measurable.

Proof. Let $R = \,]-1, 1[\, \cap \mathbb{Q} = \{ r_1, r_2, \dots, r_n, \dots \}$. For each $r_n$ we define $V_n = V + r_n$. If $n \neq m$, then $V_n \cap V_m = (V + r_n) \cap (V + r_m) = \emptyset$, because $r_n \neq r_m$ and the representatives in $V$ are unique. Indeed, if $v + r_n = v' + r_m$ with $v, v' \in V$, then $v - v' \in \mathbb{Q}$, so $v$ and $v'$ represent the same class and $v = v'$, which forces $r_n = r_m$. So we can form the union of pairwise disjoint sets $G = \bigcup_{n \in \mathbb{N}} V_n$.

Suppose $V$ is L-measurable. Then $m_L(V) = m_L(V_n)$ for every $n \in \mathbb{N}$, because $m_L$ is translation invariant. Since $m_L$ is also σ-additive, we get $m_L(G) = \sum_{n=1}^{\infty} m_L(V_n)$. There are two cases:
(1) If $m_L(V) > 0$, then $m_L(G) = \sum_{n=1}^{\infty} m_L(V) = +\infty$. However $G \subset \,]-1, 2[$, which implies $m_L(G) \leq 3$. This is a contradiction.
(2) If $m_L(V) = 0$, then $m_L(G) = \sum_{n=1}^{\infty} m_L(V) = 0$. But if we prove that $]0, 1[\, \subset G$, then we can get the contradiction $m_L(G) \geq 1$. So, let $x \in \,]0, 1[$. Then we can find a $v \in V$, such that $x - v = r \in \mathbb{Q}$. Since $x, v \in \,]0, 1[$, we get $r \in \,]-1, 1[$. But this means that $r = r_n$, for some $n \in \mathbb{N}$. Therefore $x = v + r_n \in V_n \subset G$, which means that $]0, 1[\, \subset G$.

Vitali’s example not only involves the AC but, in fact, requires its use. Otherwise, it would not be possible to find a model of ZF+DC+LM, where LM stands for “all sets of reals are Lebesgue measurable”. It is Solovay’s Theorem that establishes the necessity of the AC in Vitali’s example.

Chapter 3

Preliminaries of Set Theory

3.1 Models of Set Theory

The Language and Axioms of Set Theory

The language of set theory, L∈, is the set of well-formed formulas with the usual logical connectives ¬, ∧, ∨, ⇒, and ⇔, the quantifiers ∃ and ∀, and the binary predicate symbols = and ∈.

The interpretation structures for $L_\in$ are of the form $\mathcal{M} = (M, E)$, where the universe $M$ can be a set or a class, and the binary predicate symbol ∈ is interpreted by a relation $E \subset M \times M$. By a class we always mean a collection of sets defined by a formula, $C = \{ x : \varphi(x) \}$. Some collections of sets are not sets themselves because they entail a paradox. Russell's Paradox is a classic example: let $C = \{ x : x \notin x \}$. If $C \in C$, then $C \notin C$. And if $C \notin C$, then $C \in C$. So $C$ cannot be a set. When a collection $C = \{ x : \varphi(x) \}$ is not a set we call it a proper class.

Sometimes we consider the language of set theory enriched with additional predicate symbols or function symbols. For example, if $\dot{A}$ is a unary predicate symbol, then the corresponding structure for this language is of the form $(M, E, A)$, where $A \subset M$. Here, the intended meaning of $\dot{A}$ is that $\dot{A}(x)$ holds iff $x \in A$.

The standard axioms of set theory are the ZFC axioms, i.e., the axioms of extensionality, pairing, union, infinity, power set, regularity, choice, and the axiom schemes of separation and replacement. We often work with ZF, i.e., the ZFC axioms except for the Axiom of Choice (AC).

Definition 3.1.1.
(1) (Axiom of Choice). Let $A$ be a nonempty collection of nonempty sets. Then there is a function $f : A \to \bigcup A$, such that $f(x) \in x$ for every $x \in A$. This $f$ is called a choice function.
(2) (Principle of Dependent Choices). Let $A$ be a nonempty set, and $R$ be a binary relation on $A$, such that, for every $x \in A$, there is a $y \in A$, satisfying $x \mathrel{R} y$. Then there is a sequence $\langle x_n : n \in \omega \rangle$ of elements of $A$, such that $x_n \mathrel{R} x_{n+1}$ for every $n \in \omega$.

It is sometimes convenient to substitute AC by the Principle of Dependent Choices (DC), as we shall see in the cases of Solovay's Theorem and Shelah's Theorem. The purpose of DC is to be able to make a countable number of consecutive choices. This method of creating a sequence is often employed in mathematical analysis with the help of AC. So, even though AC is a stronger axiom than DC, most applications of AC in mathematical analysis require only DC.
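The shape of a sequence produced by DC, where each choice depends on the previous one, can be sketched in Python for a finite prefix. This is only an illustration under strong assumptions: the set A is taken to be the positive integers, the relation is given by a hypothetical function `proper_multiples` that returns some R-successors of x, and the choice is made by taking the least candidate. Because the integers are well-ordered, this particular choice is definable and needs no choice principle at all; DC matters precisely when no such canonical selection is available.

```python
def dependent_sequence(start, successors, length):
    """Return [x_0, ..., x_{length-1}] with x_{n+1} chosen among successors(x_n)."""
    seq = [start]
    while len(seq) < length:
        options = successors(seq[-1])  # nonempty, by the hypothesis of DC
        seq.append(min(options))       # the choice depends on the previous term
    return seq

def proper_multiples(x):
    """Some y with x R y, where x R y means y is a proper multiple of x
    (a hypothetical relation chosen for this sketch)."""
    return {2 * x, 3 * x}

seq = dependent_sequence(1, proper_multiples, 5)
print(seq)
```

Every consecutive pair in the resulting list satisfies the relation, which is the defining property of the sequence in Definition 3.1.1 (2).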

The universe of all sets, V, can be defined by transfinite recursion:
(1) $V_0 = \emptyset$;
(2) $V_{\alpha+1} = P(V_\alpha)$;
(3) if $\beta$ is a limit ordinal, then $V_\beta = \bigcup_{\alpha < \beta} V_\alpha$;
(4) if Ord is the class of all ordinals, then $V = \bigcup_{\alpha \in \mathrm{Ord}} V_\alpha$.

V contains all sets because it is possible to prove, with the help of the regularity axiom, that for every set $x$, there is an ordinal $\alpha$, such that $x \in V_\alpha$. The least ordinal $\alpha$ such that $x \in V_{\alpha+1}$ is called the rank of $x$, and is denoted by rank($x$). The universe V is a proper class. There are collections of sets $C \subset V$ which are not of the form $C = \{ x : \varphi(x) \}$, often called meta classes. Another important class is the constructible universe L, which is obtained like V, but through the iteration of a much more restricted version of the power set operation.
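The first finite levels of this hierarchy are small enough to be computed explicitly. The following Python sketch is an illustration only: it represents hereditarily finite sets as nested `frozenset` values, and the names `power_set`, `V`, and `rank` are hypothetical. The function `rank` uses the recursive characterization rank(x) = sup{rank(y) + 1 : y ∈ x}, which agrees with the least n such that x ∈ V_{n+1}.

```python
from itertools import combinations

def power_set(s):
    """The set of all subsets of the finite set s."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )

def V(n):
    """The n-th level of the cumulative hierarchy, for a natural number n."""
    level = frozenset()  # V_0 is the empty set
    for _ in range(n):
        level = power_set(level)  # V_{k+1} = P(V_k)
    return level

def rank(x):
    """rank(x) = sup{rank(y) + 1 : y in x}, with rank of the empty set 0."""
    return max((rank(y) + 1 for y in x), default=0)

sizes = [len(V(n)) for n in range(5)]
print(sizes)
```

The sketch stops at V_4 on purpose: the sizes grow as iterated powers of two, so V_5 already has 65536 elements and V_6 is far beyond what can be enumerated.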

The Constructible Universe L

Definition 3.1.2.
(1) If $X \in V$, or $X \subset V$, and there is a formula $\varphi(v)$ such that $\forall v\, (v \in X \Leftrightarrow \varphi(v))$, then we say that $X$ is definable.
(2) We say that a set $X \subset M$ is definable over $\mathcal{M} = (M, E)$ if there is a formula $\varphi(x_0, x_1, \dots, x_n)$, and parameters $a_1, \dots, a_n \in M$, such that $x \in X$ iff $\mathcal{M} \models \varphi(x, a_1, \dots, a_n)$.

The collection of sets definable over $\mathcal{M}$ is denoted by $\mathrm{def}(\mathcal{M}) = \{ X \subset M : X \text{ is definable over } \mathcal{M} \}$.

Note that, after an appropriate encoding of the syntax and semantics, the relation "$\mathcal{M} \models \varphi(a_1, \dots, a_n)$", between $\mathcal{M}$, $\varphi$, and $a_1, \dots, a_n$, is definable. If $M$ is a set, then the relation is definable for all formulas simultaneously. However, if $M$ is a proper class, then the relation is only definable for each fixed formula.

The constructible universe L is obtained by transfinite recursion, through the iteration of the notion of definability:
(1) $L_0 = \emptyset$;
(2) $L_{\alpha+1} = \mathrm{def}(L_\alpha)$;
(3) if $\beta$ is a limit ordinal, then $L_\beta = \bigcup_{\alpha < \beta} L_\alpha$;
(4) $L = \bigcup_{\alpha \in \mathrm{Ord}} L_\alpha$.

The elements of L are called the constructible sets. It is often important to determine whether all sets in a structure $(M, E)$ are constructible. There is a formula $\psi(v)$, such that $\psi(X)$ holds iff the set $X$ is constructible. The formula $\forall v\, \psi(v)$, which states that every set is constructible, is written as V = L. It is also called the Axiom of Constructibility.

L admits a well-ordering. As such, all the elements in L are well-orderable, i.e., L satisfies the principle of well-ordering. Since this principle is equivalent to AC, L satisfies AC. L is also a model of the Continuum Hypothesis (CH), which states that $2^{\aleph_0} = \aleph_1$. In fact, L is a model of the Generalized Continuum Hypothesis (GCH), which states that $2^{\aleph_\alpha} = \aleph_{\alpha+1}$ for every ordinal $\alpha$. Naturally, GCH implies CH. Gödel proved that GCH is entailed by the Axiom of Constructibility. Thus L is a remarkably well-behaved model.

Theorem 3.1.3 (Gödel). L is a model of ZFC + V = L + GCH.

It is important to clarify that the proof that L satisfies V = L is not obvious. L was constructed inside V. But, in order to prove that every x ∈ L is constructible from the point of view of L, we have to construct L inside L itself.

An inner model of ZFC is a model of ZFC of the form (M, ∈), where M is a transitive class that contains all the ordinals. A prominent example of an inner model is L. In fact, it is provable that if (M, ∈) is an inner model, then L^M = L, where L^M is the construction of L inside M. Thus L is the smallest inner model of ZFC.

The Mostowski Collapse

We want to define other classes which are models of ZFC. But, before doing so, we introduce some useful definitions and results.

Definition 3.1.4. Let M be a class, and E be a binary relation on M. We say that (1) E is well-founded if every nonempty subset of M has an E-minimal element. (2) E is extensional if {z ∈ M : zEx} = {z ∈ M : zEy} implies x = y, for each x, y ∈ M. (3) If M is a proper class, then E is set-like when {y ∈ M : yEx} is a set, for every x ∈ M.

Lemma 3.1.5. Let E be a binary relation on M. Then E is well-founded iff there is a unique function ρ : M → Ord, such that ρ(x) = sup({ρ(y) + 1 : yEx}), for all x ∈ M. The ordinal ρ(x) is called the E-rank of x. If M is a set, then ran(ρ) ∈ Ord.

Definition 3.1.6. Let X be a class. (1) We say that X is transitive if for every x ∈ X we have x ⊂ X. (2) The transitive closure of a set X is defined as TC(X) = ⋂{T : X ⊂ T ∧ T is transitive}.
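For hereditarily finite sets, the transitive closure can be computed by repeatedly adding elements of elements. The sketch below assumes sets are modelled as Python frozensets; the function names are illustrative and not taken from the thesis. It computes the least transitive superset of X, which coincides with the intersection in Definition 3.1.6.

```python
def is_transitive(X):
    """X is transitive if every element of an element of X is itself in X."""
    return all(y in X for x in X for y in x)

def transitive_closure(X):
    """Least transitive set that contains X as a subset."""
    closure = set(X)
    frontier = list(X)
    while frontier:
        x = frontier.pop()
        for y in x:
            if y not in closure:
                closure.add(y)
                frontier.append(y)
    return frozenset(closure)

empty = frozenset()
one = frozenset({empty})       # {∅}
a = frozenset({one})           # {{∅}}
X = frozenset({a})             # {{{∅}}}, which is not transitive

tc = transitive_closure(X)     # {a, one, empty}
```

Here TC({a}) plays the role that TC({A}) plays as the starting stage L_0(A) in the construction of L(A).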

Theorem 3.1.7 (Mostowski Collapse). Let M be a class, and E be a binary relation on M. (1) If M is a set, and E is extensional and well-founded, then there is a unique isomorphism π between (M,E) and (N, ∈), for some transitive N. (2) If M is a proper class, and E is extensional, well-founded, and set-like, then there is a unique isomorphism π between (M,E) and (N, ∈), for some transitive N.

Note that π is a morphism pertaining to relations, not a morphism between interpretation structures. The isomorphism π can be defined, through well-founded E-recursion, by π(x) = {π(z): zEx}. In other words, π collapses M by collapsing its elements. If T ⊂ M is transitive, then π(x) = x, for all x ∈ T .
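The recursion π(x) = {π(z) : zEx} can be carried out directly on a finite structure. The following is a small sketch under stated assumptions: M is a finite list of labels, E is a set of pairs (z, x) meaning zEx, and E is well-founded (the code does not check this and would not terminate on a cycle). The name mostowski_collapse and the sample relation are hypothetical.

```python
def mostowski_collapse(M, E):
    """Return the map pi with pi(x) = {pi(z) : z E x}.

    Assumes E is a well-founded relation on the finite collection M."""
    preds = {x: [z for z in M if (z, x) in E] for x in M}
    pi = {}

    def collapse(x):
        if x not in pi:
            pi[x] = frozenset(collapse(z) for z in preds[x])
        return pi[x]

    for x in M:
        collapse(x)
    return pi

# An extensional, well-founded relation on three points: a E b, a E c, b E c.
M = ["a", "b", "c"]
E = {("a", "b"), ("a", "c"), ("b", "c")}
pi = mostowski_collapse(M, E)
N = frozenset(pi.values())   # the transitive collapse of (M, E)
```

In this example π sends a, b, c to ∅, {∅}, {∅, {∅}}, so (M, E) is isomorphic to the ordinal 3 with the membership relation.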

The Models L(A) and L[A]

We generalize the construction of L in two ways. The first is by defining L(A), which is the smallest inner model that contains the set A as an element. Instead of starting the hierarchy of L(A) with L_0(A) = ∅, we start it with L_0(A) = TC({A}), so that L(A) is transitive, in addition to A ∈ L(A). The rest of the definition is the standard definition by recursion: L_{α+1}(A) = def(L_α(A)); and if β is a limit ordinal, then L_β(A) = ⋃_{α<β} L_α(A). The class of constructible sets over A is L(A) = ⋃_{α∈Ord} L_α(A).

Theorem 3.1.8. Let A be a set. (1) L(A) is an inner model of ZF.

(2) If L(A) ⊨ “TC({A}) is well-orderable”, then L(A) is a model of ZFC.

The other way we generalize the construction of L is by enriching the language of set theory and including a unary predicate symbol Ȧ. In this case we have a different notion of definability. X is definable from A in M when there is a formula ϕ(x_0, x_1, . . . , x_n) in the enriched language, and parameters a_1, . . . , a_n ∈ M, such that

x ∈ X iff (M, ∈, A ∩ M) ⊨ ϕ(x, a_1, . . . , a_n).

Let def_A(M) = {X ⊂ M : X is definable over (M, ∈, A ∩ M)}. Then we recursively define L[A] as follows: (1) L_0[A] = ∅; (2) L_{α+1}[A] = def_A(L_α[A]); (3) if β is a limit ordinal, then L_β[A] = ⋃_{α<β} L_α[A]. The class of constructible sets from A is L[A] = ⋃_{α∈Ord} L_α[A]. The class L[A] is the smallest inner model M, such that if x ∈ M, then x ∩ A ∈ M.

Theorem 3.1.9. L[A] is a model of ZFC.

L[A] satisfies AC because it admits a well-ordering relation. In general, L[A] need not satisfy GCH. However, it is possible to prove in L[A] that if A is a set, then there is an ordinal α_0 such that, for all α ≥ α_0, we have 2^{ℵ_α} = ℵ_{α+1}.

The Models HOD, HOD[A] and HOD(A)

The construction of L, L(A) and L[A] is based on the notion(s) of constructibility. There is another notion that generates useful models, that of hereditarily ordinal-definable sets.

Definition 3.1.10. Let X be a set.

(1) X is ordinal-definable if there is a formula ϕ(x_0, x_1, . . . , x_n), and ordinals α_1, . . . , α_n, such that X = {y : ϕ(y, α_1, . . . , α_n)}. The class of ordinal-definable sets is denoted by OD. (2) X is hereditarily ordinal-definable if TC({X}) ⊂ OD. The class of the hereditarily ordinal-definable sets is denoted by HOD.

Note that we use Ord to construct HOD from the “top down”, but we use Ord to construct L from the “bottom up”. Note also that, by the Reflection Theorem¹, OD and HOD are definable. For each formula ϕ(x_0, x_1, . . . , x_n), all ordinal numbers α_1, . . . , α_n, and each β such that y ∈ V_β, there is an ordinal γ such that ϕ(y, α_1, . . . , α_n) holds iff ϕ(y, α_1, . . . , α_n) holds in V_γ. So, if X ∈ OD, and α_1, . . . , α_n are fixed, then there are sufficiently large ordinals β and γ, such that the set X is given by X = {y ∈ V_β : V_γ ⊨ ϕ(y, α_1, . . . , α_n)}.

Theorem 3.1.11. HOD is an inner model of ZFC.

We can, again, generalize the HOD construction in two directions similar to the ones before.

Definition 3.1.12. Let A and X be sets. (1) We say that X is ordinal-definable from A, or that X ∈ OD[A], if there is a formula ϕ, and ordinal

numbers α1, . . . , αn, such that X = {y : ϕ(y, α1, . . . , αn,A)}.

¹ See Theorem 3.1.23 below.

(2) We say that X is hereditarily ordinal-definable from A if TC({X}) ⊂ OD[A]. The class of the hereditarily ordinal-definable sets from A is denoted by HOD[A].

In this definition A fulfills the role of a unary predicate.

Theorem 3.1.13. HOD[A] is an inner model of ZFC.

Definition 3.1.14. Let A and X be sets. (1) We say that X is ordinal-definable over A, or that X ∈ OD(A), if there is a formula ϕ, ordinal numbers α_1, . . . , α_n, and a finite sequence (a_1, . . . , a_k) of elements of A, such that the set X is given by X = {y : ϕ(y, α_1, . . . , α_n, A, (a_1, . . . , a_k))}. (2) We say that X is hereditarily ordinal-definable over A if TC({X}) ⊂ OD(A). The class of the hereditarily ordinal-definable sets over A is denoted by HOD(A).

Theorem 3.1.15. HOD(A) is an inner model of ZF, but not necessarily of ZFC.

There is an additional generalization of HOD that corresponds more directly to L(A) than HOD(A).

Definition 3.1.16. Let A and X be sets.

(1) We say that X is ordinal-definable from elements of A, or that X ∈ OD_A, if there is a formula ϕ, ordinal numbers α_1, . . . , α_n, and elements a_1, . . . , a_k of A, such that the set X is given by X = {y : ϕ(y, α_1, . . . , α_n, a_1, . . . , a_k)}.

(2) We say that X is hereditarily ordinal-definable from elements of A if TC({X}) ⊂ OD_A. The class of the hereditarily ordinal-definable sets from elements of A is denoted by HOD_A.

A case of particular importance for this thesis is when A = ^ωω, the set of all functions from ω to ω. In this case, we have X ∈ OD_{^ωω} iff X = {y : ϕ(y, α_1, . . . , α_n, y_1, . . . , y_k)}, where y_1, . . . , y_k ∈ ^ωω. We can codify the y_i into a single sequence a, such that a(0) = y_1(0), a(1) = y_2(0), . . . , a(k − 1) = y_k(0), a(k) = y_1(1), and so on. With this codification, it is possible to prove that X ∈ OD_{^ωω} iff there is an ordinal α, and a formula ψ, such that V_α ⊨ ψ(X, a).
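The codification of finitely many sequences into a single one is a plain interleaving. As a minimal sketch, assuming elements of the set of functions from ω to ω are represented by Python functions on the non-negative integers (the names interleave and component are hypothetical), the coding and its inverse can be written as follows.

```python
def interleave(ys):
    """Code y_1, ..., y_k into one sequence a with a(i + k*m) = y_{i+1}(m)."""
    k = len(ys)
    return lambda n: ys[n % k](n // k)

def component(a, k, i):
    """Recover the i-th coded sequence (0-based) from a, given the number k of sequences."""
    return lambda m: a(i + k * m)

# Three sample sequences: the identity, the squares, and a constant sequence.
ys = [lambda m: m, lambda m: m * m, lambda m: 7]
a = interleave(ys)

print([a(n) for n in range(6)])   # y_1(0), y_2(0), y_3(0), y_1(1), y_2(1), y_3(1)
```

Since each y_i is recoverable from a and the number k, a formula with parameters y_1, . . . , y_k can be rewritten as a formula with the single parameter a.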

Theorem 3.1.17. Let A be a set.

(1) HOD_A is a transitive model of ZF.

(2) If there is a well-ordering of A in OD_A, then HOD_A is a transitive model of ZFC.

The Lévy Hierarchy and Elementary Embeddings

The formulas in L∈ can be organized in a hierarchy according to their unbounded quantifiers. A quantifier is said to be bounded if it is of the form ∀x ∈ y, or ∃x ∈ y. The expression ∀x ∈ y is an abbreviation of ∀x (x ∈ y ⇒ ...), and the expression ∃x ∈ y is an abbreviation of ∃x (x ∈ y ∧ ...).

Definition 3.1.18. The Lévy hierarchy of formulas is recursively defined as follows:

(1) A formula is Σ0 (or Π0, or ∆0) if all its quantifiers are bounded.

(2) A formula is Σn+1 if it is of the form ∃x1, . . . , xk ϕ(x1, . . . , xl), where ϕ(x1, . . . , xl) is Πn, and k ≤ l.

(3) A formula is Πn+1 if it is of the form ∀x1, . . . , xk ϕ(x1, . . . , xl) where ϕ(x1, . . . , xl) is Σn, and k ≤ l.

A property given by a Σ_n formula is called a Σ_n property. And a Π_n property is one given by a Π_n formula. A Σ_n property which is also Π_n is called a ∆_n property.

∆_0 properties are absolute for transitive sets or classes. That is, if M ⊂ N are transitive, if ϕ(x_1, . . . , x_n) is a ∆_0 formula, and if a_1, . . . , a_n ∈ M, then M ⊨ ϕ(a_1, . . . , a_n) iff N ⊨ ϕ(a_1, . . . , a_n).

Examples of absolute properties, given by ∆_0 formulas, include “x is empty”, “x is transitive”, “x is an ordinal”, “x is a limit ordinal”, “x = ω”, “R is a relation”, and “f is a function”. ∆_1 properties are also absolute. Examples of ∆_1 properties include “R is a well-founded relation” and “R is a well-ordering”. Absoluteness is important for the theory of forcing because it allows us to work with functions, or ordinals, or any absolute concept, without having to specify the model in which these properties are being considered.
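The absoluteness of a ∆_0 property can be checked by hand on small transitive sets. The sketch below is only an illustration under stated assumptions: sets are Python frozensets, the two transitive sets M ⊂ N are chosen arbitrarily, and is_transitive_in is a hypothetical helper that evaluates the ∆_0 formula ∀y ∈ x ∀z ∈ y (z ∈ x) with its quantifiers restricted to a given model.

```python
empty = frozenset()
one = frozenset({empty})          # {∅}
two = frozenset({empty, one})     # {∅, {∅}}
s = frozenset({one})              # {{∅}}, which is not transitive

def is_transitive_in(M, x):
    """Evaluate 'x is transitive' with both bounded quantifiers ranging over M."""
    return all(z in x for y in M if y in x for z in M if z in y)

def is_transitive_set(X):
    """Unrelativized check that every element of an element of X is in X."""
    return all(z in X for y in X for z in y)

M = frozenset({empty, one, s})        # a transitive set
N = frozenset({empty, one, two, s})   # a larger transitive set

for x in M:
    # The truth value agrees in M and in N for every parameter taken from M.
    print(is_transitive_in(M, x), is_transitive_in(N, x))
```

Because M is transitive, the witnesses y ∈ x and z ∈ y already lie in M, which is why enlarging the model to N cannot change the truth value.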

Σ_1 properties are upward absolute for transitive sets or classes. That is, if M ⊂ N are transitive, if ϕ(x_1, . . . , x_n) is a Σ_1 formula, if a_1, . . . , a_n ∈ M, and if M ⊨ ϕ(a_1, . . . , a_n), then N ⊨ ϕ(a_1, . . . , a_n). This is because the elements of M which witness the unbounded existential quantifiers are also elements of N. Therefore these elements can also witness ϕ(a_1, . . . , a_n) in N. Examples of upward absolute properties include “|X| ≤ |Y|”, and “|X| = |Y|”.

Finally, Π_1 properties are downward absolute for transitive sets or classes. That is, if M ⊂ N are transitive, if ϕ(x_1, . . . , x_n) is a Π_1 formula, and if a_1, . . . , a_n ∈ M are such that N ⊨ ϕ(a_1, . . . , a_n), then M ⊨ ϕ(a_1, . . . , a_n). Examples of downward absolute properties include “α is a cardinal”, “α is a regular cardinal”, and “α is a limit cardinal”.

Definition 3.1.19. Let ℳ = (M, E) and 𝒩 = (N, E) be two structures, where M and N are sets.

(1) If there is an injective map j : M → N, such that for every formula ϕ(x_1, . . . , x_n), and every a_1, . . . , a_n ∈ M, we have

ℳ ⊨ ϕ(a_1, . . . , a_n) iff 𝒩 ⊨ ϕ(j(a_1), . . . , j(a_n)),

then we call j an elementary embedding.

(2) If M ⊂ N, and j is the inclusion map, then we say that ℳ is an elementary substructure of 𝒩, and write ℳ ≺ 𝒩.

If j is an elementary embedding, then M and N satisfy the same sentences. When M,N are proper classes, it is not possible to formalize elementary embeddings in ZFC. But it is possible to formulate a reasonable alternative for inner models.

Definition 3.1.20. Let ℳ = (M, ∈) and 𝒩 = (N, ∈) be two inner models.

(1) If there is an injective map j : M → N, such that for every Σ_n formula ϕ(x_1, . . . , x_m), and every a_1, . . . , a_m ∈ M, we have

ℳ ⊨ ϕ(a_1, . . . , a_m) iff 𝒩 ⊨ ϕ(j(a_1), . . . , j(a_m)),

then we say that j is a Σ_n-elementary embedding.

(2) If M ⊂ N, and j is the inclusion map, then we say that ℳ is a Σ_n-elementary substructure of 𝒩, and write ℳ ≺_n 𝒩.

Note that if j is a Σ_n-elementary embedding, and ϕ(x_1, . . . , x_m) is a Π_n formula, then ℳ ⊨ ϕ(a_1, . . . , a_m) iff 𝒩 ⊨ ϕ(j(a_1), . . . , j(a_m)).

Proposition 3.1.21. Let ℳ = (M, ∈) and 𝒩 = (N, ∈) be two inner models, and j : M → N be a Σ_1-elementary embedding. Then, for each particular natural number n, the map j is a Σ_n-elementary embedding.

This proposition shows that Σ_1-elementary embeddings provide an adequate formalization in ZFC of the informal concept of elementary embedding for inner models. This is because in any particular proof we only need to resort to finitely many instances of the elementarity schema, ℳ ⊨ ϕ(a_1, . . . , a_n) iff 𝒩 ⊨ ϕ(j(a_1), . . . , j(a_n)).

Theorem 3.1.22 (Löwenheim-Skolem Theorem). Let ℳ = (M, E) be a structure, let X ⊂ M, and let |X| ≤ κ ≤ |M|, where κ is an infinite cardinal. Then there is an elementary substructure 𝒩 = (N, E) of ℳ with cardinality κ, such that X ⊂ N. In particular, every infinite structure ℳ has a countable elementary substructure 𝒩.

The relativization of a formula ϕ(x_1, . . . , x_n) to M is the formula ϕ^M(x_1, . . . , x_n) resulting from the restriction of the quantified variables of ϕ(x_1, . . . , x_n) to M. Therefore, ϕ^M(x_1, . . . , x_n) holds (in V) iff M ⊨ ϕ(x_1, . . . , x_n). It is implicitly assumed that the variables x_i of ϕ^M(x_1, . . . , x_n) range over M.

Theorem 3.1.23 (Reflection Principle). Let ϕ(x_1, . . . , x_n) be a formula. If M is a set, then there is a limit ordinal α, such that M ⊂ V_α, and ϕ(a_1, . . . , a_n) ⇔ ϕ^{V_α}(a_1, . . . , a_n), for all a_1, . . . , a_n ∈ M. In this case we say that V_α reflects ϕ.

Using the Reflection Principle, we can derive the next proposition.

Proposition 3.1.24. Let T be a set of axioms extending ZFC, and φ_1, . . . , φ_n be any axioms of T. Then

T ⊢ ∃M, F (M is transitive ∧ F : ω → M is a bijection ∧ (φ_1^M ∧ . . . ∧ φ_n^M)).

In particular, ZFC proves the existence of a countable transitive model M for any finite fragment of ZFC. Recall that the Compactness Theorem states that if every finite subset of formulas in a theory T has a model, then T has a model. This suggests that an application of the Compactness Theorem would ensure the existence of a model of ZFC, and thus entail the consistency of ZFC. However, Gödel’s Second Incompleteness Theorem implies that ZFC cannot prove its own consistency. Therefore it is important to clarify the apparent contradiction.

If we start with a finite number of previously fixed axioms of ZFC, φ_1, . . . , φ_n, then ZFC proves that there is a (countable transitive) model of φ_1, . . . , φ_n. However, if we do not fix the φ_1, . . . , φ_n beforehand, and try to quantify over every finite set of axioms of ZFC, then ZFC cannot prove that every finite set of axioms of ZFC has a model. This is because, when we try to formalize this universal quantification over finite sets of axioms of ZFC, we accidentally capture infinite sets of axioms which are regarded as finite by some models. As such, ZFC does not prove of itself that every finite set of axioms of ZFC has a model. So, when we say that M is a model of ZFC, we are saying that, for each axiom φ of ZFC that we choose, we have M ⊨ φ.

Proposition 3.1.25. Let M be a set.

(1) If M is a transitive model of ZFC + V = L, then there is a limit ordinal α, such that M = Lα.

(2) If M is a transitive model of ZFC+V = L[A], then there is a limit ordinal α, such that M = Lα[A].

Inaccessible Cardinals

We finish this chapter by introducing inaccessible cardinals and exploring some of their properties.

Definition 3.1.26. Let κ be a cardinal. (1) The cofinality of κ is the smallest ordinal α ≤ κ such that there is a function f : α → κ whose range ran(f) is not bounded in κ. We write this as cf(κ) = α. (2) κ is called a singular cardinal if cf(κ) < κ. If cf(κ) = κ we say that κ is a regular cardinal. (3) κ is called a successor cardinal if there is a cardinal λ < κ such that κ is the least cardinal greater than λ. We write this as λ^+ = κ. If κ is not a successor cardinal it is called a limit cardinal. (4) κ is called a strong limit cardinal if for every cardinal λ < κ we have 2^λ < κ. (5) κ is called a weakly inaccessible cardinal if it is a regular, uncountable, limit cardinal. (6) κ is called a (strongly) inaccessible cardinal if it is a regular, uncountable, strong limit cardinal.

It follows from the definitions that cf(κ) is a regular cardinal. If κ is a regular cardinal, and X is a set with cardinality |X| = κ, then it is possible to show that X cannot be expressed as X = ⋃_{ξ<γ} X_ξ, where γ < κ and |X_ξ| < κ. Therefore, if λ, κ are cardinals, such that λ is regular and λ > κ, then for every function f : λ → κ, there is an α ∈ κ, such that |f^{-1}({α})| = λ.

An inaccessible cardinal κ is called inaccessible because it cannot be “reached” through repeated exponentiation of a smaller cardinal. Alternatively, it cannot be “reached” through the repeated application of the power set operation to V_α, where α < κ. Note that, if we assume GCH, then a weakly inaccessible cardinal is inaccessible.

Definition 3.1.27. Let φ(x) be a formula, and κ be a cardinal such that φ(κ) holds. We say that κ is a large cardinal if the formula ∃x φ(x) is independent of ZFC. In this case, the property φ(x) is called a large cardinal property. When we add to ZFC the formula ∃x φ(x), or the existence of a specific large cardinal κ, we call this new hypothesis a large cardinal hypothesis.

Theorem 3.1.28. Let κ be the least inaccessible cardinal. Then (ZFC + “κ exists”) ⊢ (V_κ ⊨ ZFC).

Since ZFC cannot prove its own consistency, this theorem shows that the existence of inaccessible cardinals is independent from ZFC. Thus inaccessible cardinals are large cardinals. Weakly inaccessible cardinals are also large cardinals. When we add to the ZFC axioms the statement “There exists an inaccessible cardinal”, we write this as ZFC + I.

3.2 Forcing and the Lévy Collapse

Introduction

Forcing is a method used to extend a model of set theory so that the extended model has some desired properties. This method was first used by Paul Cohen in 1963 to establish that CH is independent from ZFC, and that AC is independent from ZF. Kurt Gödel had already established in 1935 that L ⊨ AC, and in 1937 that L ⊨ GCH. This means that AC and GCH are consistent with ZF. Paul Cohen used his forcing method to extend a given model of ZFC so that the new extended model satisfied ¬CH. His proof that there is a model for ZF + ¬AC also used a forcing method. However, the proof has an additional step to extract from the extended model a substructure which is a model of ZF + ¬AC. As such, AC is independent of ZF, and both CH and GCH are independent from ZFC. Paul Cohen earned the Fields Medal in 1966 for these results, and the innovative techniques used to prove them.

Robert Solovay proved in 1964 that if ZFC + I is consistent, then there is a model of ZF + DC in which every set of reals is Lebesgue measurable (LM). There are some sets of reals which are not Lebesgue measurable, as Vitali had already established. But Vitali’s example resorts to AC. Therefore, in order to prove that there is a model of set theory in which every set of reals is Lebesgue measurable, Solovay devised a model of ZF in which AC fails. For this he used a forcing method to extend a model of ZFC + I. Then he extracted from the extended model a substructure which is a model of ZF + DC + LM.

Each of the above proofs involves different versions of the forcing method. Solovay used a type of forcing called the Lévy collapse. Informally, the Lévy collapse, Col(µ, < κ), is a type of forcing that turns the cardinals of the original model which lie strictly between µ and κ into ordinals of cardinality µ in the extended model.

In this chapter we will first review the basic definitions and results of the theory of forcing. Then we will study some notable examples of forcing. Finally we proceed to explore the properties of the Lévy collapse in greater detail. We will present the proofs of the results which are directly relevant to the Solovay and Shelah theorems.

3.2.1 The Theory of Forcing

In this section we introduce the central concepts and results of the theory of forcing. We adopt the convention of saying that M is a transitive model, to mean that the structure M = (M, ∈) is a transitive model of ZFC. This convention is adopted in the rest of this thesis.

Prerequisites

The idea behind the method of forcing is to start with a transitive model M and to extend it to a new transitive model M[G] by adding a carefully crafted object G ∉ M. The definition of G, and of the extended model M[G], is done in such a way that M[G] is forced to have some desired properties. Let us start by defining the concepts that allow us to use the method of forcing to extend M to M[G].

Definition 3.2.1. A partial order is a pair (P, ≤), where ≤ is a binary relation in the set P, such that (a) p ≤ p; (b) p ≤ q and q ≤ p implies p = q; and (c) p ≤ q and q ≤ r implies p ≤ r, where p, q, r ∈ P. We use the same symbol P for the partially ordered set P and the partial order (P, ≤).

In the context of forcing, a partial order P is called a notion of forcing. If p ∈ P we call p a forcing condition. If p, q ∈ P and q ≤ p we say that q is stronger than p. We will always assume that P has a maximum element, 1_P. This element is 1_P = ∅ in all the concrete examples of notions of forcing defined in the next sections.

Suppose p, q ∈ P. If there is an r ∈ P such that r ≤ p and r ≤ q, we write it as p ∥ q and say that p and q are compatible. If there is no such r, we say that p and q are incompatible, and write it as p ⊥ q.

Definition 3.2.2. Let P be a partial order. (1) P is called atomless if ∀p ∈ P ∃q, r ∈ P (q ≤ p ∧ r ≤ p ∧ q ⊥ r). (2) P is called separative if, whenever p ≰ q, there is some r ≤ p such that r ⊥ q.

As we shall see, when P is atomless we can guarantee that the new object G is not in M. However, the usefulness of P being separative can only be understood later.

Definition 3.2.3. Let P be a partial order. A filter on P is a nonempty set F ⊂ P, such that (a) If p ∈ F and p ≤ q, then q ∈ F ; (b) If p, q ∈ F , then ∃r ∈ F such that r ≤ p and r ≤ q.

When a set F ⊂ P satisfies the condition (a), we say that F is upwardly closed.

Definition 3.2.4. Let P be a partial order. (1) D ⊂ P is called open if ∀p ∈ D ∀q ≤ p (q ∈ D). (2) D ⊂ P is called dense if ∀p ∈ P ∃q ∈ D (q ≤ p). (3) D ⊂ P is called predense if ∀p ∈ P ∃q ∈ D (q ∥ p). (4) A ⊂ P is called an antichain if ∀p, q ∈ A (p ≠ q ⇒ p ⊥ q). (5) A ⊂ P is called a maximal antichain if A is an antichain and there is no antichain A′ ⊋ A.

Note that if A is a maximal antichain, and p ∈ P, then there is a q ∈ A, such that q ∥ p.
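These notions can be tested on the partial order of finite binary sequences ordered by reverse extension. The following is a small sketch, assuming conditions are Python tuples of bits and that q ≤ p means that q extends p; the function names are illustrative. It checks that the sequences of a fixed length form an antichain, and that every sampled condition is compatible with one of them, which is the property stated in the note above.

```python
from itertools import product

def compatible(p, q):
    """p and q have a common extension iff one is an initial segment of the other."""
    n = min(len(p), len(q))
    return p[:n] == q[:n]

def is_antichain(A):
    """True if any two distinct members of A are incompatible."""
    A = list(A)
    return all(not compatible(p, q) for i, p in enumerate(A) for q in A[i + 1:])

def level(n):
    """All binary sequences of length n, as tuples."""
    return set(product((0, 1), repeat=n))

A = level(3)                                       # candidate maximal antichain
sample = [p for k in range(6) for p in level(k)]   # all conditions of length < 6

print(is_antichain(A))
print(all(any(compatible(p, q) for q in A) for p in sample))
```

The second check only inspects conditions of length below 6, so it illustrates maximality of A rather than proving it; the general argument is that any p either extends, or is extended by, some sequence of length 3.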

Definition 3.2.5. Let P be a partial order. (1) D ⊂ P is called dense below p if ∀q ≤ p ∃r ∈ D (r ≤ q). (2) D ⊂ P is called predense below p if ∀q ≤ p ∃r ∈ D (r ∥ q). (3) A ⊂ P is called an antichain below p if ∀q, r ∈ A (q ≤ p ∧ r ≤ p ∧ (q ≠ r ⇒ q ⊥ r)).

The new object G that we want to add to M is a special kind of filter.

Definition 3.2.6. Let M be a transitive model, P ∈ M be a partial order, and G ⊂ P be a filter. We say that G is P-generic over M if, for every D ∈ M which is dense in P, we have G ∩ D ≠ ∅.

When the context is clear we often omit the model M over which G is P-generic, or omit the fact that G is a filter. Let us see an example that clarifies the general purpose of this type of filters.

Example 3.2.7. Let P = ^{<ω}2, the set of finite binary sequences. We define p ≤ q if p ⊃ q. Let G be a P-generic filter. The elements of G are finite binary sequences. If a sequence is in G, then all of its initial segments are in G because G is upwardly closed. And, since all p, q ∈ G are compatible, we have p ≤ q or q ≤ p. In other words, G is linearly ordered by ≤. So ⋃G is a binary sequence.

Now, for each n ∈ ω, let Dn = {p ∈ P : n ∈ dom(p)}. We claim that every Dn is dense in P. Let q ∈ P and n ∈ ω. Then either n ∈ dom(q), or there is an extension p of the sequence q such that n ∈ dom(p).

In other words, either q ∈ Dn or there is a p ∈ Dn such that p ≤ q.

We assumed G to be P-generic. Therefore G ∩ D_n ≠ ∅, for each n ∈ ω. This means that there are sequences p ∈ G of any length n ∈ ω. As such ⋃G is a function with domain ω. It is because G is a filter that the elements of G gradually approximate a function f = ⋃G. And it is because G is P-generic that the elements of G approximate f with arbitrary precision, and force f to be a function with dom(f) = ω.

Theorem 3.2.8. Let M be a transitive model, P ∈ M be a partial order, and DM be the family of sets

D ∈ M which are dense in P. If DM is countable, then, for every p ∈ P, there is a P-generic filter G with p ∈ G. In particular, if M is countable, then, for every p ∈ P, there is a P-generic filter G with p ∈ G.
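The usual argument for this theorem enumerates the dense sets and builds a descending chain of conditions that meets them one at a time. The sketch below illustrates only the inductive step, for a finite list of dense sets in the partial order of finite binary sequences; a genuinely generic filter has to meet all countably many dense sets of M. Conditions are tuples of bits, each dense set is represented by a hypothetical function that extends a condition into that set, and all names are illustrative.

```python
def is_stronger(q, p):
    """q ≤ p in the order of finite binary sequences: q extends p."""
    return q[:len(p)] == p

def dense_length(n):
    """Extension map for D_n = {q : n is in dom(q)}."""
    def extend(p):
        return p if len(p) > n else p + (0,) * (n + 1 - len(p))
    return extend

def dense_one_after(n):
    """Extension map for the dense set {q : q(k) = 1 for some k ≥ n}."""
    def extend(p):
        if any(bit == 1 for bit in p[n:]):
            return p
        return p + (0,) * max(0, n - len(p)) + (1,)
    return extend

def meet_all(p, extenders):
    """Descending chain p = p_0 ≥ p_1 ≥ ... where p_{i+1} lies in the i-th dense set."""
    chain = [p]
    for extend in extenders:
        chain.append(extend(chain[-1]))
    return chain

def generated_filter(chain):
    """Upward closure of the chain: all initial segments of its last element."""
    last = chain[-1]
    return {last[:k] for k in range(len(last) + 1)}

chain = meet_all((1,), [dense_length(3), dense_one_after(5), dense_length(8)])
G = generated_filter(chain)
print(chain[-1])
```

The resulting set G is a filter that contains the starting condition and meets each of the three listed dense sets, which is the finite analogue of the conclusion of the theorem.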

Theorem 3.2.9. Let M be a transitive model, P ∈ M be a partial order, and G ⊂ P be a filter. Then the following are equivalent:

(1) G is P-generic over M. (2) G ∩ D ≠ ∅ for every open dense D ⊂ P in M. (3) G ∩ D ≠ ∅ for every predense set D ⊂ P in M. (4) G ∩ A ≠ ∅ for every maximal antichain A ⊂ P in M.

Theorem 3.2.10. Let M be a transitive model, P ∈ M be a partial order, and G ⊂ P be a filter. Then, if p ∈ G, the following are equivalent:

(1) G is P-generic over M. (2) G ∩ D ≠ ∅, for every D ⊂ P in M that is dense below p. (3) G ∩ D ≠ ∅, for every D ⊂ P in M that is predense below p. (4) G ∩ A ≠ ∅ for every A ⊂ P in M that is a maximal antichain below p.

Theorem 3.2.11. Let M be a transitive model, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. If P is atomless, then G ∉ M.

The Construction of M[G]

Now that we have the necessary tools, let us start the construction of M[G]. First we need to define a prototype of M[G] inside M.

Definition 3.2.12. Let M be a transitive model, and P ∈ M be a partial order. For α ∈ M ∩ Ord, we define the sets M^P_α, by recursion on α, as follows

M^P_α = {τ ∈ M : τ is a binary relation and ∀(σ, p) ∈ τ (p ∈ P ∧ ∃β < α (σ ∈ M^P_β))}.

We call M^P = ⋃_{α∈M∩Ord} M^P_α the class of P-names in M. If τ is a P-name, then rank_P(τ) is the least α, such that τ ∈ M^P_{α+1}.

There is an alternative way of defining M^P that produces the same result. We start with the universe V and proceed to the construction of V^P in a similar manner. In this case, we have M^P = (V^P)^M = V^P ∩ M. The definition of V^P has the advantage of providing M^P for any transitive model M. M^P is the prototype of M[G] inside M. By definition we have M^P ⊂ M. Strictly speaking, the elements of M^P are not elements of M × P, but this picture is a useful intuition about M^P. Since M^P ⊂ M we need an object which is not in M that can generate M[G] from M^P in such a way that M ⊊ M[G].

Definition 3.2.13. Let M be a transitive model, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. We define the G-interpretation of τ ∈ M^P, by recursion on rank_P(τ), as

τ^G = {σ^G : ∃p ∈ G ((σ, p) ∈ τ)}.

Note that τ^G is computed in V, not in M. In fact, we can compute τ^G inside any transitive model N of ZF, such that M ⊂ N and G ∈ N, and obtain the same object. Let us compute the G-interpretation for two special cases.

Example 3.2.14. The canonical name of x ∈ M is defined, by recursion on rank(x), as

x̌ = {(y̌, 1) : y ∈ x}.

To see that x̌ ∈ M^P note that, for every (y̌, 1) ∈ x̌, we have 1 ∈ P, and β = rank_P(y̌) < rank_P(x̌) = α. The reason we call x̌ the canonical name of x is that x̌^G = {σ^G : ∃p ∈ G ((σ, p) ∈ x̌)} = {y̌^G : y ∈ x} = x, where the middle equality uses 1 ∈ G.

We also define the P-name Ġ = {(p̌, p) : p ∈ P} so that its G-interpretation is G. In fact, we have Ġ^G = {σ^G : ∃p ∈ G ((σ, p) ∈ Ġ)} = {p̌^G : p ∈ G} = G.
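The recursion behind names and their interpretation is easy to run on hereditarily finite data. The sketch below makes several assumptions that are not in the thesis: sets are Python frozensets, conditions are tuples of bits with the empty tuple as the maximum element 1, a name is a frozenset of pairs (name, condition), and the sets G0 and G1 are small filters chosen by hand (they are not generic over any model). The function names check and interpret are illustrative.

```python
def check(x):
    """Canonical name of x: the set of pairs (check(y), 1) for y in x, with 1 = ()."""
    return frozenset((check(y), ()) for y in x)

def interpret(tau, G):
    """G-interpretation of the name tau: {sigma^G : (sigma, p) in tau for some p in G}."""
    return frozenset(interpret(sigma, G) for (sigma, p) in tau if p in G)

empty = frozenset()
one = frozenset({empty})           # {∅}
two = frozenset({empty, one})      # {∅, {∅}}

G0 = {(), (0,), (0, 1)}            # a filter that contains the condition (0,)
G1 = {(), (1,), (1, 1)}            # a filter that contains the condition (1,)

# A name whose value depends on the filter: it contains ∅ if (0,) is in G,
# and {∅} if (1,) is in G.
tau = frozenset({(check(empty), (0,)), (check(one), (1,))})

print(interpret(check(two), G0) == two)
print(interpret(tau, G0) == one, interpret(tau, G1) == frozenset({one}))
```

Canonical names are interpreted as the same set under every filter, because each pair in them carries the condition 1, while the value of tau depends on which conditions the filter contains.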

Now we can define M[G] using G-interpretation to generate M[G] from M^P.

Definition 3.2.15. Let M be a transitive model, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. The generic extension of M is defined by

M[G] = {τ^G : τ ∈ M^P}.

So M[G] is constructed by taking all the P-names τ ∈ M^P and using G-interpretation to “project” to M[G] the elements (σ, p) of τ whose second component is a p ∈ G. Note that M[G] is computed in V.

Theorem 3.2.16. Let M be a transitive model, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. Then (1) M[G] is transitive and M ∪ {G} ⊂ M[G]. (2) M and M[G] have the same ordinals, i.e., M ∩ Ord = M[G] ∩ Ord.

One of the key features of the forcing method is that M[G] has the same ordinals as M, but more objects than M. The forcing method may introduce in M[G] new bijections between ordinals. This can force the same ordinal α in M and M[G] to have smaller cardinality |α| in M[G] than in M. This phenomenon is called cardinal collapse. Cardinal collapse is possible because being a cardinal is not an upward absolute property. Figure 3.1 below is a geometric representation of what happens when we apply the forcing method. The central line represents the class of all ordinals. Since the ordinals in M and M[G] are the same, M has the same height as M[G]. However, the new object G is inside M[G], but outside M. So the extended model M[G] is wider than the base model M, but both models have the same height. Note that all the elements in Vω, i.e., all finite sets, are in M and M[G].

Figure 3.1: Geometric representation of the forcing method.

The Forcing Relation

The next step is to establish that M[G] is also a transitive model of ZFC. This is accomplished using the forcing relation and the forcing language. In the forcing language, a formula ϕ uses the names in M^P as constant symbols. A detailed definition of the forcing language will not be studied in this thesis, since we can use the language of set theory to intuitively represent the forcing language without hindering the exposition. The central matter is the forcing relation.

Definition 3.2.17 (Forcing Relation). Let M be a transitive model, P ∈ M be a partial order, ϕ(x_1, . . . , x_n) be a formula, τ_1, . . . , τ_n ∈ M^P, and p ∈ P. We say that p forces ϕ(τ_1, . . . , τ_n), written as p ⊩^P_M ϕ(τ_1, . . . , τ_n), if we have M[G] ⊨ ϕ(τ_1^G, . . . , τ_n^G), for all P-generic filters G with p ∈ G.

The forcing relation mimics the logical consequence relation of first-order logic. It allows us to determine if M[G] ⊨ ϕ(τ_1^G, . . . , τ_n^G) using only M^P, G and the forcing relation, and without having to know the elements τ_1^G, . . . , τ_n^G of M[G]. We will substantiate this claim below with the Forcing Theorem.

Theorem 3.2.18. Let M be a transitive model, and P ∈ M be a partial order. Suppose ϕ(x_1, . . . , x_n) is a fixed formula. Then the forcing relation, F_{ϕ,P} = {(p, τ_1, . . . , τ_n) : p ⊩^P_M ϕ(τ_1, . . . , τ_n)}, is definable in M from the parameter P.

In fact F_{ϕ,P} is uniformly definable. That is, for each fixed formula ϕ(x_1, . . . , x_n), the forcing relation F_{ϕ,P} is definable in V and the relativization F^M_{ϕ,P} to any transitive model M gives the corresponding forcing relation in M. The next theorem lists the main properties of the forcing relation.

Theorem 3.2.19. Let M be a transitive model, P ∈ M be a partial order, and ϕ, ψ be formulas.

(1) If p ⊩^P_M ϕ, and q ≤ p, then q ⊩^P_M ϕ.

(2) It is not possible to have both p ⊩^P_M ϕ and p ⊩^P_M ¬ϕ.

(3) For every p, there is a q ≤ p, such that either q ⊩^P_M ϕ or q ⊩^P_M ¬ϕ (q decides ϕ).

(4) p ⊩^P_M ¬ϕ iff there is no q ≤ p such that q ⊩^P_M ϕ.

(5) p ⊩^P_M (ϕ ∧ ψ) iff p ⊩^P_M ϕ and p ⊩^P_M ψ.

(6) p ⊩^P_M (ϕ ∨ ψ) iff, for all q ≤ p, there is an r ≤ q, such that r ⊩^P_M ϕ or r ⊩^P_M ψ.

(7) p ⊩^P_M ∀x ϕ(x) iff p ⊩^P_M ϕ(τ), for every τ ∈ M^P.

(8) p ⊩^P_M ∃x ϕ(x) iff, for all q ≤ p, there exists an r ≤ q, and a τ ∈ M^P, such that r ⊩^P_M ϕ(τ).

(9) If p ⊩^P_M ∃x ϕ(x), then p ⊩^P_M ϕ(τ), for some τ ∈ M^P.

Recall that, when q ≤ p, we say that q is stronger than p. The word stronger is used because q contains all the information that p does. This fact also clarifies property (1). Property (9), which is called the maximality principle, is the only property in this list that requires AC, in addition to ZF.

Theorem 3.2.20 (Forcing Theorem). Let M be a transitive model, P ∈ M be a partial order, ϕ(x_1, . . . , x_n) be a formula, and τ_1, . . . , τ_n ∈ M^P. Suppose that, for every p ∈ P, there is a P-generic filter G with p ∈ G. Then

M[G] ⊨ ϕ(τ_1^G, . . . , τ_n^G) ⇐⇒ ∃p ∈ G (p ⊩^P_M ϕ(τ_1, . . . , τ_n)).

By combining the Forcing Theorem, and Theorem 3.2.8, we can derive an important corollary.

Corollary 3.2.21. Let M be a countable transitive model, P ∈ M be a partial order, ϕ(x_1, . . . , x_n) be a formula, and τ_1, . . . , τ_n ∈ M^P. Then

M[G] ⊨ ϕ(τ_1^G, . . . , τ_n^G) ⇐⇒ ∃p ∈ G (p ⊩^P_M ϕ(τ_1, . . . , τ_n)).

This equivalence is crucial to many proofs related to forcing. Therefore, we will assume from now on that M is a countable transitive model of ZFC, or a CTM. In the next subsection we will see why this assumption is always possible.

Theorem 3.2.22. Let M be a CTM, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. Then (1) M[G] is a CTM. (2) If N is a CTM, and M ∪ {G} ⊂ N, then M[G] ⊂ N. In other words, M[G] is the smallest CTM that contains M ∪ {G}.

The Formal Underpinning of Forcing

When we use the method of forcing we assume the consistency of ZFC and then generate a transitive model M[G] with some desired properties. Suppose these properties are given by a formula ϕ. In this case we say that ϕ is consistent with ZFC, or that ZFC + ϕ is consistent relative to ZFC. We also write this as Con(ZFC) ⇒ Con(ZFC + ϕ). In relative consistency proofs, we focus on the mathematical aspects of the construction of M[G] from M. Usually, we assume there is a transitive model of ZFC and use a convenient notion of forcing to prove that M[G] satisfies the properties we desire. However, when we assume the consistency of ZFC this does not mean that there is a transitive model of ZFC, since the latter statement is stronger than the former. But there are formal aspects of forcing that provide a finitistic proof of relative consistency and that allow us to proceed as if the ground model M is a transitive model of ZFC.

We can enrich the language of set theory and add further axioms to ZFC so that the resulting theory provides a CTM (when we assume its consistency). Let L∗ be a language that extends the language of set theory, formed with the nonlogical symbols ∈, C, and F, where C and F are constant symbols. Let ZFC∗ be the theory in L∗ consisting of the usual axioms of ZFC, plus the relativized sentences φ^C for every axiom φ of ZFC, plus the sentences “C is transitive” and “F : ω → C is a bijection”.

Lemma 3.2.23. Let ϕ ∈ L∈. If ZFC∗ ⊢ ϕ, then ZFC ⊢ ϕ. In particular, Con(ZFC) ⇒ Con(ZFC∗).

Proof. Suppose that ZFC∗ ⊢ ϕ. This proof can only involve a finite number of axioms of ZFC relativized to C. Let φ1, . . . , φn be these axioms, and let

ψ(x, y) ≡ “x is transitive ∧ y : ω → x is a bijection ∧ (φ1^x ∧ . . . ∧ φn^x)”.

Then ZFC + ψ(C, F) ⊢ ϕ. As such, ZFC + ∃x, y ψ(x, y) ⊢ ϕ. But ZFC ⊢ ∃x, y ψ(x, y), by Proposition 3.1.24. Therefore ZFC ⊢ ϕ. In particular, if ZFC∗ derives a contradiction ϕ, then so does ZFC.

Now we can assume Con(ZFC∗) and carry out the forcing arguments in ZFC∗. We consider C as the ground model and produce an extension C[G]. This extension employs a convenient P-generic filter G so that we can prove φ^{C[G]}, for every sentence φ in ZFC + ϕ. This means that Con(ZFC∗) ⇒ Con(ZFC + ϕ). Thus, if we assume Con(ZFC), we get Con(ZFC + ϕ), due to the previous Lemma. It is important to mention that there are several ways to formalize relative consistency proofs, with varying degrees of detail. In practice, however, we will focus on the mathematical construction of M[G] from M, while resting assured that this naive approach has a solid formal underpinning.

Forcing and Cardinals

Recall that “κ is a cardinal” is a downward absolute property for transitive models.

Proposition 3.2.24. Let M be a CTM, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. If κ is a cardinal in M[G], then it is also a cardinal in M.

Definition 3.2.25. Let P be a partial order.

(1) We say that P satisfies the κ-chain condition, or the κ-c.c., if every antichain A ⊂ P has size |A| < κ. If P satisfies the ℵ1-c.c., we say that P satisfies the countable chain condition, or the c.c.c.

(2) Let κ be an infinite regular cardinal. P is called κ-complete if every decreasing sequence ⟨pη : η < γ⟩ of conditions in P, with γ < κ, has a lower bound q ∈ P, i.e., q ≤ pη for all η < γ.

From definition (1) it readily follows that, if |P| = κ, then P satisfies the κ+-c.c. The role of these properties of P is to indicate that certain cardinals are preserved when we apply forcing with P.

Theorem 3.2.26. Let M be a CTM, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. Suppose κ is a regular cardinal in M such that M ⊨ “P is κ-complete”. Then M[G] ∩ (^{<κ}M) = M ∩ (^{<κ}M).

In particular, there are no bijective functions in M[G], with domain µ < κ, that were not in M. Therefore the cardinals µ < κ are preserved.

Theorem 3.2.27. Let M be a CTM, P ∈ M be a partial order, and G ⊂ P be a P-generic filter. Let X ∈ M[G], where X ⊂ M, and write µ = |X|^{M[G]}. Suppose κ is a cardinal in M such that P satisfies the κ-c.c. in M. Then
(1) There is some Y ∈ M such that X ⊂ Y and
(i) If κ ≤ µ^+, then M ⊨ |Y| ≤ µ;
(ii) If κ ≥ µ^+ and µ < cf(κ), then M ⊨ |Y| < κ;
(iii) If κ ≥ µ and µ ≥ cf(κ), then M ⊨ |Y| ≤ κ.
(2) In particular, if λ is a cardinal in M such that λ > κ, or such that λ = κ and λ is regular in M, then λ remains a cardinal in M[G].

Point (2) clarifies the role of the κ-c.c. in the preservation of cardinals. Note that, in this case, we can have (λ = ℵα)^M, and (λ = ℵβ)^{M[G]}, where β < α.

Corollary 3.2.28. If P satisfies the c.c.c. in M, then every cardinal in M remains a cardinal in M[G].

Dense Homomorphisms and the Product Lemma

We are now interested in sequentially extending M to M[G], and then to M[G][H], using two notions of forcing P, Q ∈ M.

Definition 3.2.29. Let P = (P, ≤P) and Q = (Q, ≤Q) be partial orders. The map π : P → Q is called a homomorphism if, for all p, q ∈ P,

(a) p ≤P q implies π(p) ≤Q π(q), and

(b) p ⊥P q implies π(p) ⊥Q π(q).

The terms endomorphism and automorphism have the analogous meaning to their counterparts in abstract algebra. Note that if p ∥P q, then π(p) ∥Q π(q). Therefore, p ∥P q iff π(p) ∥Q π(q). Equivalently, p ⊥P q iff π(p) ⊥Q π(q).

Definition 3.2.30. A homomorphism π : P → Q is called dense if, for every q ∈ Q, there is some p ∈ P such that π(p) ≤Q q. In other words, π is a dense homomorphism when the range of π is dense in Q.

Proposition 3.2.31. Let M be a CTM, P, Q ∈ M be partial orders, and π : P → Q be a dense homomorphism, where π ∈ M.

(1) If G ⊂ P is a P-generic filter, and H = {q ∈ Q : ∃p ∈ G π(p) ≤Q q}, then (i) G = {p ∈ P : π(p) ∈ H}; (ii) H is a filter which is Q-generic over M; and (iii) M[G] = M[H].

(2) If H ⊂ Q is a Q-generic filter, then (i) G = {p ∈ P : π(p) ∈ H} is a filter which is P-generic over M; and (ii) M[H] = M[G].

If M is a CTM, P ∈ M is a partial order, and π ∈ M is a dense endomorphism of P, then π induces a map π̃ : M^P → M^P defined through recursion by π̃(τ) = {(π̃(σ), π(p)) : (σ, p) ∈ τ}. This allows us to establish that dense endomorphisms “preserve” the forcing relation in the following sense.

Proposition 3.2.32. Let M be a CTM, P ∈ M be a partial order, π : P → P be a dense endomorphism in M, ϕ(x1, . . . , xn) be a formula, and τ1, . . . , τn ∈ M^P. If p ∈ P, then

p ⊩^P_M ϕ(τ1, . . . , τn) iff π(p) ⊩^P_M ϕ(π̃(τ1), . . . , π̃(τn)).

Proof. Let G be P-generic over M, let H = {r ∈ P : ∃p ∈ G (π(p) ≤ r)}, and let K = {q ∈ P : π(q) ∈ G}. Then, by the previous proposition, we have that G = {p ∈ P : π(p) ∈ H}, H is P-generic over M, and K is P-generic over M.

Claim 1: For all τ ∈ M^P, we have τ^G = π̃(τ)^H and τ^K = π̃(τ)^G.

σ^G ∈ τ^G iff (σ, p) ∈ τ for some p ∈ G
iff (π̃(σ), π(p)) ∈ π̃(τ) for some p ∈ G
iff (π̃(σ), π(p)) ∈ π̃(τ) for some π(p) ∈ H
iff π̃(σ)^H ∈ π̃(τ)^H

Thus τ^G = π̃(τ)^H since the induction hypothesis is σ^G = π̃(σ)^H. On the other hand, τ^K = π̃(τ)^G because of the following equivalences, and of the induction hypothesis σ^K = π̃(σ)^G:

σ^K ∈ τ^K iff (σ, q) ∈ τ for some q ∈ K
iff (π̃(σ), π(q)) ∈ π̃(τ) for some q ∈ K
iff (π̃(σ), π(q)) ∈ π̃(τ) for some π(q) ∈ G
iff π̃(σ)^G ∈ π̃(τ)^G

Claim 2: If p ⊩^P_M ϕ(τ1, . . . , τn), then π(p) ⊩^P_M ϕ(π̃(τ1), . . . , π̃(τn)).
Since we are working with a CTM M, we can always find a P-generic filter G with π(p) ∈ G, by Theorem 3.2.8. Suppose that π(p) ∈ G. Then p ∈ K. Since p ⊩^P_M ϕ(τ1, . . . , τn), we have M[K] ⊨ ϕ(τ1^K, . . . , τn^K) by the Forcing Theorem. But τ1^K = π̃(τ1)^G, . . . , τn^K = π̃(τn)^G by Claim 1, and M[K] = M[G] by Proposition 3.2.31. Therefore M[G] ⊨ ϕ(π̃(τ1)^G, . . . , π̃(τn)^G). Hence, we get π(p) ⊩^P_M ϕ(π̃(τ1), . . . , π̃(τn)).

Claim 3: If π(p) ⊩^P_M ϕ(π̃(τ1), . . . , π̃(τn)), then p ⊩^P_M ϕ(τ1, . . . , τn).
Suppose that p ∈ G. Then π(p) ∈ H. Since π(p) ⊩^P_M ϕ(π̃(τ1), . . . , π̃(τn)), the Forcing Theorem implies M[H] ⊨ ϕ(π̃(τ1)^H, . . . , π̃(τn)^H). But τ1^G = π̃(τ1)^H, . . . , τn^G = π̃(τn)^H by Claim 1, and M[G] = M[H] by Proposition 3.2.31. Therefore M[G] ⊨ ϕ(τ1^G, . . . , τn^G). Hence p ⊩^P_M ϕ(τ1, . . . , τn).

Definition 3.2.33. Let P and Q be partial orders. The product of P and Q is the partial order defined as P × Q = (P × Q, ≤P×Q), where (p, q) ≤P×Q (p′, q′) iff p ≤P p′ and q ≤Q q′.

The Product Lemma expresses the relation between the consecutive use of generic filters G ⊂ P and H ⊂ Q, and the use of a single generic filter K ⊂ P × Q.

Lemma 3.2.34 (Product Lemma). Let M be a CTM, and let P, Q ∈ M be partial orders. (1) If G is P-generic over M, and H is Q-generic over M[G], then G × H is (P × Q)-generic over M. (2) Let K be (P × Q)-generic over M, and set G = {p ∈ P : ∃q ∈ Q, (p, q) ∈ K} and H = {q ∈ Q : ∃p ∈ P, (p, q) ∈ K}. Then G is P-generic over M and H is Q-generic over M[G]. (3) Let K = G × H be (P × Q)-generic over M. Then M[K] = M[G][H].

Homogeneity and Forcing

The homogeneity properties allow us to establish that certain formulas are decidable in the forcing language. They can also allow us to extract from M[G] a substructure N = HOD_A^{M[G]} which is contained in M, for every A ∈ M. The purpose of this subsection is to prove these two facts.

Definition 3.2.35. Let M be a transitive model.

(1) A partial order P is called homogenous if, for all p, q ∈ P, there is a dense endomorphism π : P → P such that π(p) ∥ q.

(2) A P-name τ ∈ M^P is called homogenous if every dense endomorphism π : P → P in M satisfies π̃(τ) = τ.

(3) If τ1, . . . , τn ∈ M^P, then a partial order P is called homogenous with respect to τ1, . . . , τn if, for all p, q ∈ P, there is a dense endomorphism π : P → P such that π(p) ∥ q and π̃(τ1) = τ1, . . . , π̃(τn) = τn.

It follows that if τ1, . . . , τn ∈ M^P are homogenous, and P is homogenous with respect to σ1, . . . , σk, then P is homogenous with respect to σ1, . . . , σk, τ1, . . . , τn. Note that if P is homogenous with respect to τ1, . . . , τn, then P is homogenous.

Proposition 3.2.36. Let M be a CTM, and P ∈ M be a separative partial order. Then for every map π̃ : M^P → M^P induced by a dense homomorphism, and every x ∈ M, we have π̃(x̌) = x̌.

Proof. Let π : P → P be any dense homomorphism and π̃ : M^P → M^P the induced map. We must have π(1) = 1. Suppose instead that π(1) < 1. Then 1 ≰ π(1). Since P is separative, there is some r ≤ 1 such that r ⊥ π(1). By the density of ran(π), there is some s such that π(s) ≤ r. Therefore π(s) ⊥ π(1), which is equivalent to s ⊥ 1. This contradicts our convention that 1 is the maximum of P.

We can now prove that π̃(x̌) = x̌ by induction on the ∈-rank of x. First note that

π̃(x̌) = {(π̃(y̌), π(1)) : (y̌, 1) ∈ x̌} = {(π̃(y̌), 1) : (y̌, 1) ∈ x̌}.

Base: Note that ∅̌ = {(y̌, 1) : y ∈ ∅} = ∅. Therefore π̃(∅̌) = ∅.
Step: We have seen that π̃(x̌) = {(π̃(y̌), 1) : (y̌, 1) ∈ x̌}. By the induction hypothesis, π̃(y̌) = y̌. Therefore π̃(x̌) = {(y̌, 1) : (y̌, 1) ∈ x̌} = x̌.

So all the canonical names are homogenous when P is separative. When P is not separative, we can replace it by a separative partial order, called the separative quotient, that will produce the same generic extension.²

² See Lemmas 14.11, 14.12, and 14.13 in Jech [4].

Theorem 3.2.37. Let M be a CTM, P ∈ M be a partial order, and ϕ(x1, . . . , xn) be a formula. Suppose P is homogenous with respect to τ1, . . . , τn ∈ M^P. Then either 1 ⊩^P_M ϕ(τ1, . . . , τn), or 1 ⊩^P_M ¬ϕ(τ1, . . . , τn).

Proof. Suppose, instead, that there are p, q ∈ P such that p ⊩^P_M ϕ(τ1, . . . , τn) and q ⊩^P_M ¬ϕ(τ1, . . . , τn). Pick a dense endomorphism π : P → P such that π(p) ∥ q and π̃(τ1) = τ1, . . . , π̃(τn) = τn. Then we have π(p) ⊩^P_M ϕ(π̃(τ1), . . . , π̃(τn)) because π is a dense homomorphism. And we have π(p) ⊩^P_M ϕ(τ1, . . . , τn) because P is homogenous with respect to τ1, . . . , τn.

On the other hand, since π(p) ∥ q, there exists r ∈ P such that r ≤ π(p) and r ≤ q. Therefore we have r ⊩^P_M ϕ(τ1, . . . , τn) and r ⊩^P_M ¬ϕ(τ1, . . . , τn). This is not possible by Theorem 3.2.19 (2).

The next corollary is a crucial element in the proof of Solovay’s Theorem.

Corollary 3.2.38. Let M be a CTM, P ∈ M be a separative homogenous partial order, let ϕ(x1, . . . , xn) be a formula, and let τ̌1, . . . , τ̌n ∈ M^P be canonical names. Then we either have 1 ⊩^P_M ϕ(τ̌1, . . . , τ̌n), or 1 ⊩^P_M ¬ϕ(τ̌1, . . . , τ̌n).

Proof. Since P is a separative partial order, all the canonical names are homogenous. And since P is homogenous, it is homogenous with respect to the canonical names. Thus we can apply the previous theorem to ϕ(τ ˇ1,..., τˇn).

Let us now turn to the second goal of this subsection.

Lemma 3.2.39. Let M be a CTM, P be a separative partial order, G ⊂ P be P-generic over M, and x ∈ M[G] such that x ⊂ M. Suppose that P is also homogenous with respect to τ1, . . . , τn ∈ M^P and that M[G] ⊨ ∀y (y ∈ x ⇔ ϕ(y, τ1^G, . . . , τn^G)). Then x ∈ M.

Proof. Let us assume that M[G] ⊨ ∀y (y ∈ x ⇔ ϕ(y, τ1^G, . . . , τn^G)). In this case, we have

y ∈ x iff M[G] ⊨ ϕ(y, τ1^G, . . . , τn^G) iff ∃p ∈ G (p ⊩^P_M ϕ(y̌, τ1, . . . , τn)).

Since P is separative, it is homogenous with respect to y̌ and to τ1, . . . , τn. So, by the previous theorem, we either have 1 ⊩^P_M ϕ(y̌, τ1, . . . , τn), or 1 ⊩^P_M ¬ϕ(y̌, τ1, . . . , τn). And, since 1 ⊩^P_M ϕ(y̌, τ1, . . . , τn) iff ∃p ∈ G (p ⊩^P_M ϕ(y̌, τ1, . . . , τn)), we have y ∈ x iff 1 ⊩^P_M ϕ(y̌, τ1, . . . , τn).

As such, we may compute x inside M as {y : 1 ⊩^P_M ϕ(y̌, τ1, . . . , τn)}. Therefore x ⊂ Vα^M is a set.

As a corollary, we can extract from M[G] a proper substructure HOD_A^{M[G]} which is contained in M, for every A ∈ M, and models ZF.

Corollary 3.2.40. Let M be a CTM, P be a separative partial order, G ⊂ P be P-generic over M, and x ∈ M[G] such that x ⊂ M. If x ∈ OD_A^{M[G]}, where A ∈ M, then x ∈ M. In particular, HOD_A^{M[G]} ⊂ M, for every A ∈ M.

Proof. Recall that x ∈ OD_A iff x = {y : ϕ(y, α1, . . . , αn, a1, . . . , ak)}, where α1, . . . , αn are ordinals, and a1, . . . , ak are parameters in A. Recall also that y̌ satisfies y̌^G = y. Therefore

M[G] ⊨ ∀y (y ∈ x ⇔ ϕ(y̌^G, α̌1^G, . . . , α̌n^G, ǎ1^G, . . . , ǎk^G)).

Since P is separative, it is homogenous with respect to canonical names. Therefore we can apply the previous lemma to obtain x ∈ M. In particular, HOD_A^{M[G]} ⊂ M for every A ∈ M, since HOD_A ⊂ OD_A for each A ∈ M.

Having reviewed the central notions of the theory of forcing, let us explore some notable examples.

3.2.2 Examples

Cohen Forcing

Definition 3.2.41. Let C = ^{<ω}ω be the set of finite sequences of natural numbers. For p, q ∈ C, let p ≤ q iff p ⊃ q. Then C = (C, ≤) is called the Cohen forcing.
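As an informal illustration of this definition, the following Python sketch models Cohen conditions as tuples of natural numbers. The function names is_stronger and are_compatible are hypothetical and are introduced only for this example; they are not part of the formal development.

```python
# Conditions of the Cohen forcing are modelled as tuples of natural numbers.
# The names below are illustrative assumptions, not notation from the text.

def is_stronger(p, q):
    """Return True when p <= q in the forcing order, i.e. p extends q."""
    return len(p) >= len(q) and p[:len(q)] == q


def are_compatible(p, q):
    """Two conditions are compatible when some condition extends both.

    For finite sequences this happens exactly when one of them is an
    initial segment of the other.
    """
    return is_stronger(p, q) or is_stronger(q, p)


p = (3, 1, 4)
q = (3, 1)
r = (3, 2)
print(is_stronger(p, q))     # p extends q, so p is stronger than q
print(are_compatible(p, r))  # p and r disagree at position 1
```

In this sketch the empty tuple plays the role of the maximum 1 of the partial order, since every condition extends it.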

The Cohen forcing takes a CTM and adds a new real number to it, called a Cohen real. A Cohen real is a particular example of a generic real.

Definition 3.2.42. Let M be a CTM. We say that x ∈ ^ωω is generic over M if there is a partial order P ∈ M, and a P-generic filter G, such that M[G] is the smallest CTM that satisfies M ∪ {x} ⊂ M[G]. In this case, we also write M[x] instead of M[G].

One of the fortunate advantages of the forcing method is that it adds a real number x to M, so that M[x] is a CTM, without generating contradictions. This is not as straightforward as one might think because a real number can codify a large amount of information. For example, x can codify the order-type³ of the ordinals in M. In this case, x encodes a surjective function from ω ∈ M to (Ord ∩ M) ⊂ M. However, the range of this function is not a set in M. Therefore, if we add x to M, then we get a contradiction with the replacement schema because Ord ∩ M = Ord ∩ M[x] is not a set in M[x].

Definition 3.2.43. Let κ ≥ ω be a cardinal. Let Cκ = ^{<κ}κ be the set of all functions f : γ → κ, where γ < κ. For p, q ∈ Cκ, let p ≤ q iff p ⊃ q. Then (Cκ, ≤) is called the Cohen forcing at κ.

This definition is a generalization of the Cohen forcing, since Cω = C. The forcing notion Cκ takes a CTM and adds a new subset of κ to it.

Product Forcing

If we want to find a model for 2^{ℵ0} > ℵ1, we need to add many Cohen reals at once. For this purpose we use the following notion of forcing.

Definition 3.2.44. Let α be an ordinal and C be the Cohen forcing. For p ∈ ^αC let the support of p, supp(p), be the set of all η < α with p(η) ≠ ∅. Let

C(α) = {p ∈ ^αC : |supp(p)| < ℵ0}.

For p, q ∈ C(α), define p ≤ q iff p(η) ⊃ q(η), for all η < α. Then (C(α), ≤) is called the product of α Cohen forcings with finite support.

³ If S is a well-ordered set, then it is order isomorphic to a unique ordinal. This ordinal is the order-type of S.

Lemma 3.2.45. C(α) satisfies the c.c.c.

Now we can prove that the Continuum Hypothesis is independent from ZFC. We have already mentioned that L satisfies CH. Therefore Con(ZFC) ⇒ Con(ZFC + CH). Cohen used forcing for the first time to prove the following theorem.

Theorem 3.2.46 (Cohen). Con(ZFC) ⇒ Con(ZFC + ¬CH).

Proof. Let M be a CTM, and α ∈ M be an ordinal such that |α|^M ≥ ℵ2^M. Since α ∈ M, we have C(α) ∈ M. Let G be C(α)-generic over M. Inside M[G] we may define F : α → ^ωω by setting F(η) = ⋃{p(η) : p ∈ G}, for η < α. Let

Dk,η = {p ∈ C(α) : k ∈ dom(p(η))} ∈ M.

Fix k < ω and η < α. Then, for all p : α → C with finite support, there is some q ∈ Dk,η such that q ≤ p. Therefore Dk,η is dense in C(α). Hence F is a well-defined function. For η, η′ < α, with η ≠ η′, let

Dη,η′ = {p ∈ C(α) : ∃k ∈ (dom(p(η)) ∩ dom(p(η′))), such that p(η)(k) ≠ p(η′)(k)} ∈ M.

Fix any η and η′ in the above conditions. Then, for all p : α → C with finite support, there is some q ∈ Dη,η′ such that q ≤ p. Therefore Dη,η′ is dense in C(α). As such, F(η) ≠ F(η′), for η ≠ η′. Hence F ∈ M[G] is an injection from α into ^ωω. In particular, |α|^{M[G]} ≤ |^ωω|^{M[G]} = (2^{ℵ0})^{M[G]}.

Finally, since C(α) satisfies the c.c.c., M and M[G] have the same cardinals. Therefore we get |α|^{M[G]} ≥ ℵ2^{M[G]}. As such, ℵ2^{M[G]} ≤ (2^{ℵ0})^{M[G]}. This shows that M[G] ⊨ ¬CH.
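The definition of F in the proof above can be illustrated on a finite scale. In the following Python sketch, a finite chain of pairwise compatible conditions stands in for the generic filter G, which of course cannot be represented on a computer. Conditions are given as dictionaries from coordinates to tuples, and the function name coordinate_union is a hypothetical choice made for this example.

```python
def coordinate_union(chain, eta):
    """Union of p(eta) over a finite chain of pairwise compatible conditions.

    Each condition is a dict {coordinate: tuple}; a missing coordinate
    stands for the empty sequence. Because the conditions are compatible,
    the tuples at a coordinate are initial segments of one another, so
    their union is simply the longest of them.
    """
    longest = ()
    for p in chain:
        s = p.get(eta, ())
        shorter, longer = sorted((s, longest), key=len)
        # Compatibility check: the longer tuple must extend the shorter one.
        if longer[:len(shorter)] != shorter:
            raise ValueError("conditions are incompatible at this coordinate")
        longest = longer
    return longest


# A finite stand-in for (part of) a generic filter on C(alpha).
chain = [
    {0: (1,)},
    {0: (1, 0), 5: (7,)},
    {0: (1, 0, 2), 5: (7, 7)},
]
print(coordinate_union(chain, 0))
print(coordinate_union(chain, 5))
```

The two coordinates already disagree at position 0, which is the finite counterpart of meeting the dense set Dη,η′ for η = 0 and η′ = 5.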

We can generalize the definition of C(α) to Cκ(α) as follows.

Definition 3.2.47. Let α be an ordinal and Cκ be the Cohen forcing at κ, where κ is an infinite regular cardinal. For p ∈ ^α(Cκ) let the support of p, supp(p), be the set of all η < α with p(η) ≠ ∅. Let

Cκ(α) = {p ∈ ^α(Cκ) : |supp(p)| < κ}.

For p, q ∈ Cκ(α), define p ≤ q iff p(η) ⊃ q(η), for all η < α. Then (Cκ(α), ≤) is called the product of α Cohen forcings at κ, with support < κ.

In general, if we have a collection {Pη : η < α} of α notions of forcing, we can define the product forcing.

Definition 3.2.48. Let {(Pη, ≤Pη) : η < α} be a collection of α notions of forcing.
(1) The product forcing of Pη is the pair (∏_{η<α} Pη, ≤Π), where
(a) ∏_{η<α} Pη = {p ∈ ^α(⋃_{η<α} Pη) : p(η) ∈ Pη}; and
(b) p ≤Π q iff p(η) ≤Pη q(η), for all η < α.
(2) If κ < α, the κ-product forcing of Pη is the set {p ∈ ^α(⋃_{η<α} Pη) : p(η) ∈ Pη ∧ |supp(p)| < κ}, with the order ≤Π.

From this definition we can easily see that C(α) is the ℵ0-product forcing of α Cohen forcings C. It is also easy to see that Cκ(α) is the κ-product forcing of α Cohen forcings at κ.

3.2.3 The Lévy Collapse

This section is dedicated to the Lévy collapse, which is the notion of forcing used in Solovay’s Theorem. Before defining the Lévy collapse let us define a simpler notion of forcing that can collapse cardinals.

Definition 3.2.49. Let µ be a regular cardinal, and let κ ≥ µ. Let Col(µ, κ) = ^{<µ}κ be the set of all functions f : γ → κ, where γ < µ. For p, q ∈ Col(µ, κ), let p ≤ q iff p ⊃ q. Then (Col(µ, κ), ≤) is called the collapse of κ to µ.

Note that Col(µ, µ) = Cµ. In particular, this notion of forcing does not always collapse cardinals.

Definition 3.2.50 (Lévy Collapse). Let µ be an infinite regular cardinal, and X be a set of ordinals which are all of size ≥ µ. Let

Col∗(µ, X) = {p : p is a function with domain D ⊂ X and ∀η ∈ D, p(η) ∈ Col(µ, η)}.

For p ∈ Col∗(µ, X), we define the support of p by supp(p) = dom(p). Now let

Col(µ, X) = {p ∈ Col∗(µ, X) : |supp(p)| < µ}.

For p, q ∈ Col(µ, X), define p ≤ q iff supp(p) ⊃ supp(q) and p(η) ⊃ q(η), for all η ∈ supp(q). If κ > µ, then we write Col(µ, < κ) for Col(µ, [µ, κ[). Then (Col(µ, < κ), ≤) is called the Lévy collapse of κ to µ.
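The shape of the conditions in this definition can be illustrated by a toy version in Python. This is only a sketch under strong simplifying assumptions: µ is taken to be ω, so that each p(η) is a finite sequence, and small natural numbers are used as stand-ins for the infinite ordinals in X. The names is_condition and is_stronger are hypothetical.

```python
def is_condition(p, X):
    """Check that p is a finite partial function on X such that every
    p(eta) is a finite sequence of values below eta.

    This is a toy version of Col(omega, X): natural numbers stand in
    for the ordinals in X, which is an assumption of this sketch.
    """
    return all(
        eta in X and all(0 <= v < eta for v in seq)
        for eta, seq in p.items()
    )


def is_stronger(p, q):
    """p <= q iff supp(p) contains supp(q) and p(eta) extends q(eta)
    for every eta in supp(q)."""
    return all(
        eta in p and p[eta][:len(seq)] == seq
        for eta, seq in q.items()
    )


X = range(3, 10)
q = {4: (0, 3)}
p = {4: (0, 3, 1), 7: (6,)}
print(is_condition(p, X), is_stronger(p, q))
```

Here the empty dictionary plays the role of the maximum 1, since every condition is stronger than it.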

Given the definition of the Lévy collapse we can anticipate that, in the Col(µ, < κ)-generic extension M[G] of a CTM, every ordinal in [µ, κ[ will have the same cardinality as µ. That is, if (|µ| = ℵα)^{M[G]}, and δ ∈ [µ, κ[ is an ordinal, then (|δ| = ℵα)^{M[G]}.

Proposition 3.2.51. Col(µ, < κ) is an atomless partial order.

Proof. Let p ∈ Col(µ, < κ) and η ∉ supp(p). It is possible to extend p to q and q′ in Col(µ, < κ), in such a way that q ⊥ q′, as follows: let q(η) : {∅} → η and q′(η) : {∅} → η be functions that satisfy q(η)(∅) ≠ q′(η)(∅). Then q ⊥ q′.

This proposition ensures that a Col(µ, < κ)-generic extension M[G] of a CTM M is a proper extension because G ∉ M.

Proposition 3.2.52. Col(µ, < κ) is a separative partial order.

Proof. Let us consider p, q ∈ Col(µ, < κ) such that p ≰ q. In particular, p ≠ q. If p ⊥ q then the result holds trivially. So, let p ∥ q. Then p and q have a common extension. Therefore, either p(η) ⊂ q(η) or p(η) ⊃ q(η), for each η ∈ [µ, κ[ and p(η), q(η) ∈ Col(µ, η). Since p ≰ q and p ≠ q, there is an η ∈ [µ, κ[ such that p(η) ⊊ q(η). If dom(p(η)) = α, and dom(q(η)) = β, then α < β < µ. Thus we can extend p(η) to a function fη : (α + 1) → η, with fη ∈ Col(µ, η), that satisfies fη(α) ≠ q(η)(α). Finally, define r ∈ Col(µ, < κ) as r(η) = fη; and r(η′) = p(η′) for η′ ≠ η. Then r ≤ p and r ⊥ q.

Corollary 3.2.53. Let M be a CTM, and Col(µ, < κ) be the Lévy collapse in M.
(1) Col(µ, < κ) is homogenous.
(2) If π : Col(µ, < κ) → Col(µ, < κ) is a dense endomorphism, then π̃(x̌) = x̌ and π̃(1) = 1. Thus Col(µ, < κ) is homogenous with respect to canonical names.
(3) Let ϕ(v1, . . . , vn) be a formula, and x̌1, . . . , x̌n ∈ M^{Col(µ,<κ)} be canonical names. Then either 1 ⊩^{Col(µ,<κ)}_M ϕ(x̌1, . . . , x̌n), or 1 ⊩^{Col(µ,<κ)}_M ¬ϕ(x̌1, . . . , x̌n).

Proof. (1) Let p, q ∈ Col(µ, < κ). We will prove that there is an automorphism π of Col(µ, < κ), such that π(p) ∥ q. Note that this automorphism is necessarily a dense endomorphism. There is a bijective function f : κ → κ, such that f(supp(p)) ∩ supp(q) = ∅. If r ∈ Col(µ, < κ), and β < κ, then we define π(r)(f(β))(ξ) = r(β)(ξ), where f(β) ∈ supp(π(r)) iff β ∈ supp(r). It follows from the definition that π is bijective, and that it preserves the order and the incompatibility between elements of Col(µ, < κ). Thus π is an automorphism. Finally, if f(β) ∈ supp(π(p)), then β ∈ supp(p), which means that f(β) ∉ supp(q). Therefore supp(π(p)) ∩ supp(q) = ∅. As such, π(p) ∥ q.

(2) Since Col(µ, < κ) is separative, it is homogenous with respect to x̌1, . . . , x̌n.
(3) It follows from Corollary 3.2.38.

This is an important property of the Lévy collapse. It tells us that all generic extensions of M that use the Lévy collapse evaluate the properties of the elements of M in the same way. This will be a crucial step in the proof of Solovay’s Theorem.

Theorem 3.2.54. Col(µ, < κ) is µ-complete.

Proof. Let γ < µ, and ⟨pη : η < γ⟩ be a decreasing sequence of conditions in Col(µ, < κ). Note that |supp(pη)| < µ, for all η < γ. Since µ is regular, the set D = ⋃_{η<γ} supp(pη) has cardinality |D| < µ. Let q be a forcing condition such that supp(q) = D and q(ξ) = ⋃{pη(ξ) : η < γ ∧ supp(pη) ∋ ξ}. Each q(ξ) is a well-defined member of Col(µ, ξ), because pη1(ξ) ⊂ pη2(ξ), for η1 < η2. As such, q ∈ Col(µ, < κ) is well-defined. Since, by construction, q ≤ pη for all η < γ, we conclude that Col(µ, < κ) is µ-complete.

This theorem implies that all cardinals below µ remain the same when we force with the Lévy collapse. To prove that the cardinals above κ are preserved we need the ∆-Lemma.

Definition 3.2.55. Let B be a collection of sets. B is called a ∆-system if there is a set r such that every x, y ∈ B, with x ≠ y, satisfy x ∩ y = r. The set r is called the root of B.
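For finite families, the ∆-system condition can be checked directly. The following Python sketch is only an illustration of the definition; the function name delta_system_root and the convention of returning None for families with fewer than two distinct members (where every set r would qualify) are assumptions of this example.

```python
from itertools import combinations


def delta_system_root(family):
    """Return the root if `family` is a Delta-system, otherwise None.

    `family` is an iterable of finite sets. With fewer than two distinct
    members the condition holds vacuously for every r, so no particular
    root is singled out and None is returned in that case as well.
    """
    members = {frozenset(x) for x in family}
    if len(members) < 2:
        return None
    pairs = combinations(members, 2)
    x, y = next(pairs)
    root = x & y
    # Every remaining pair must have the same intersection.
    if all(a & b == root for a, b in pairs):
        return root
    return None


print(delta_system_root([{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]))
print(delta_system_root([{1, 2}, {2, 3}, {1, 3}]))
```

Note that a family of pairwise disjoint sets is a ∆-system whose root is the empty set.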

Lemma 3.2.56 (∆-Lemma).⁴ Let κ be an uncountable regular cardinal, and µ < κ be an infinite regular cardinal, such that λ^γ < κ, for all λ < κ and γ < µ. Let A ⊂ {x ⊂ κ : |x| < µ} be a collection of sets with size |A| = κ. Then there is a collection B ⊂ A, with size |B| = κ, which forms a ∆-system.

Theorem 3.2.57. Let κ be an uncountable regular cardinal, and µ < κ be an infinite regular cardinal, such that λ^γ < κ, for all λ < κ and γ < µ. Then Col(µ, < κ) satisfies the κ-c.c.

Proof. Suppose there is an antichain C ⊂ Col(µ, < κ), such that |C| ≥ κ. Let A = {supp(p) : p ∈ C}. We have supp(p) ⊂ [µ, κ[, and |supp(p)| < µ. Hence A ⊂ {x ⊂ κ : |x| < µ}. There are two possibilities:

(1) Let |A| < κ. Since κ is regular, there is some supp(q) ∈ A that corresponds to κ conditions p ∈ C with support supp(p) = supp(q). If we can prove that there are fewer than κ conditions p ∈ C with supp(p) = supp(q), then we get a contradiction. For each p ∈ C and λ ∈ supp(p), we have p(λ) ∈ Col(µ, λ) = ^{<µ}λ. We have |^{<µ}λ| = |⋃_{γ<µ}(^γλ)| < κ, for each λ ∈ supp(p). Since supp(p) is not cofinal in κ, there is a cardinal ρ such that, for all λ ∈ supp(p), we have |^{<µ}λ| ≤ ρ < κ. Therefore, the number of conditions p with supp(p) = supp(q) is less than or equal to ρ^{|supp(p)|} < κ. This is the contradiction.

(2) Let |A| ≥ κ. Then, by the ∆-Lemma, there is a B ⊂ A such that |B| = κ and B has a root r. Every distinct supp(p), supp(p′) ∈ B satisfy supp(p) ∩ supp(p′) = r. On the other hand, we have p ⊥ p′ because p, p′ ∈ C. The incompatibility between p and p′ must arise in r. But, similarly to case (1), it can be shown that there are at most ρ^{|r|} < κ conditions p ∈ C, such that supp(p ↾ r) = r. This contradicts |C| = κ.

⁴ The proof can be found, in simplified form, in pages 105-106 of Schindler [12].

As a consequence of this theorem we get the main theorem about the effects of the Lévy collapse on cardinals.

Corollary 3.2.58. Let M be a CTM, κ be an uncountable regular cardinal, and µ < κ be an infinite regular cardinal, such that λ^γ < κ, for all λ < κ and γ < µ. If G is Col(µ, < κ)-generic over M, then
(1) All M-cardinals outside of [µ, κ[ will remain cardinals in M[G];
(2) All M-cardinals in [µ, κ[ will be ordinals of size µ in M[G];
(3) κ = µ^+ in M[G].

Proof. (1) Since Col(µ, < κ) is µ-complete, the M-cardinals < µ remain the same in M[G]. On the other hand, Col(µ, < κ) satisfies the κ-c.c., and κ is regular. So all the M-cardinals ≥ κ remain cardinals in M[G].
(2) Let G be a Col(µ, < κ)-generic filter over M, and let D = ⋃_{p∈G} supp(p). Then, using the genericity of the filter G, we can prove that D = [µ, κ[, and that for each δ ∈ D, there is a surjective function f(δ) : µ → δ in M[G], defined by f(δ) = ⋃{p(δ) : p ∈ G ∧ supp(p) ∋ δ}. Therefore the cardinality of every ordinal in [µ, κ[ is µ in M[G].
(3) As a consequence of (1) and (2), we have κ = µ^+ in M[G].

The specific form of the Lévy collapse that will be used below is Col(ω, < κ), where κ is an inaccessible cardinal. As such, we summarize the properties of Col(ω, < κ) in the next corollary.

Corollary 3.2.59. Let M be a CTM, κ be an inaccessible cardinal, ϕ(v1, . . . , vn) be a formula, and x̌1, . . . , x̌n ∈ M^{Col(ω,<κ)} be canonical names. If G is Col(ω, < κ)-generic over M, then
(1) Col(ω, < κ) is an atomless and separative partial order.
(2) Col(ω, < κ) is homogenous with respect to canonical names.
(3) Either 1 ⊩^{Col(ω,<κ)}_M ϕ(x̌1, . . . , x̌n), or 1 ⊩^{Col(ω,<κ)}_M ¬ϕ(x̌1, . . . , x̌n).
(4) Col(ω, < κ) satisfies the κ-c.c.
(5) All M-cardinals outside of [ω, κ[ will remain cardinals in M[G]. All M-cardinals in [ω, κ[ will be ordinals of size ℵ0 in M[G]. And κ = ℵ1 in M[G].

3.3 The Lebesgue Measure and Descriptive Set Theory

Introduction

In this chapter we present the Lebesgue measure in the context of set theory, particularly of descriptive set theory. Descriptive set theory consists in the study of the structure of definable sets of reals. Examples of these sets include the Borel sets and the projective sets. The projective sets can be obtained from Borel sets by continuous images and complements. An equivalent definition of the projective sets is that they are subsets of R that can be obtained from closed subsets of R^n by a combination of projecting to a lower dimension and taking complements.

To see how this relates to definability, consider the projection of a subset A ⊂ R^2 to the x-axis. The result will be the set of all x such that there exists y with (x, y) ∈ A. Thus, projection corresponds to existential quantification. On the other hand, taking complements corresponds to negation. So we can combine the two and obtain universal quantification as well. We can therefore think of a projective set as a set that is definable from a closed set in a finite number of steps.
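The correspondence between projections, complements, and quantifiers can be checked on a small finite example. The following Python sketch replaces R by finite sets X and Y, which is an assumption made purely for illustration; the function names projection and complement are hypothetical.

```python
def projection(A):
    """Projection of a set of pairs to the first coordinate:
    {x : there exists y with (x, y) in A}."""
    return {x for (x, _) in A}


def complement(S, universe):
    """Complement of S relative to the given universe."""
    return set(universe) - set(S)


X = {0, 1, 2}
Y = {0, 1}
plane = {(x, y) for x in X for y in Y}
A = {(0, 0), (0, 1), (1, 0)}

# Existential quantification: {x : exists y, (x, y) in A}.
exists_y = projection(A)

# Universal quantification, obtained as "not exists not":
# {x : for all y, (x, y) in A}.
forall_y = complement(projection(complement(A, plane)), X)

print(exists_y, forall_y)
```

In this example only x = 0 satisfies (x, y) ∈ A for every y ∈ Y, while both 0 and 1 satisfy it for some y.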

We want to define the Lebesgue measure in a framework better suited for set theory. We also want to see how to translate the previous results about the Lebesgue measure in R to this set theoretical framework. In set theory R can be constructed through Dedekind cuts. Alternatively, we can start with ω, then consecutively build equivalence classes, upon equivalence classes, and define R as the set of equivalence classes of Cauchy sequences of rational numbers. However, these constructions are too cumbersome to be used regularly in the context of set theory.

Instead of working with R, it is convenient to work with the set of sequences of natural numbers, ^ωω, or with the set of binary sequences, ^ω2. This is because ^ω2 and ^ωω have the same cardinality as R, yet each of these representations is suitable for different situations. The representation by ^ωω is particularly convenient, since there are homeomorphisms that give us ^ωω ≅ (^ωω)^k ≅ ω^j × (^ωω)^k ≅ (^ωω)^ω, where j, k ∈ ω. This is not true for R. The dimension of R^k is invariant under homeomorphisms, which is an essential geometric property of R^k. But the products of ^ωω are not dimensionally invariant. This makes the representation of the reals in set theory by ^ωω more malleable. And it allows us to compress in ^ωω the information contained in the sets (^ωω)^k, or ω^j × (^ωω)^k, or even in (^ωω)^ω.

We use ^ωω more often than ^ω2 in this chapter. But in order to include both ^ωω and ^ω2 in the definitions and results, we adopt the unified notation ^ωK, whenever it is possible and convenient.

We start this chapter by introducing topological notions for ^ωK. We proceed to study the Borel and the projective sets, as well as the hierarchies of sets associated to them. Then we introduce the Lebesgue measure on ^ωK and, with the help of Borel codes, we prove that the Lebesgue measure, and other important properties, are absolute between transitive models of ZF + DC. After this, we introduce an important forcing notion related to Borel sets, the random algebra. Then we describe the Baire property and the perfect sets. We end the chapter by studying filters, ideals, ultrafilters and trees.

The Topology on ^ωK

Let us start by defining a topology for ^ωK, where K is either ω, or 2 = {0, 1}. This is accomplished with the help of finite sequences in K. If s ∈ ^{<ω}K is a finite sequence, then its length is denoted by |s|. Note that |s| is the ordinal |s| = {0, 1, . . . , |s| − 1}. In particular, |s| = dom(s). If |s| = n, and k ∈ K, then we define s ∗ k as the extension of s, with length n + 1, that satisfies (s ∗ k)(n) = k.

Definition 3.3.1. Let s ∈ ^{<ω}K.
(1) A set of the form O(s) = {x ∈ ^ωK : s ⊂ x} is called a basic open set.
(2) The set A ⊂ ^ωK is called open if it is the union of basic open sets.
(3) The set A ⊂ ^ωK is called closed if ^ωK − A is an open set.

Proposition 3.3.2. The collection of all open sets is a topology for ^ωK.

Proof. Any basic open set is open. In particular, O(∅) = ^ωK is open. Any union of open sets is again an open set, because it is expressible as a union of basic open sets. In particular, the empty union of open sets is open, i.e., the set ∅ is open. Finally, a finite intersection of open sets Ui is an open set, since the intersection ⋂_{i=0}^{n} Ui = ⋂_{i=0}^{n} (⋃_k O(s_k^i)) can be determined by the intersection of the finite sequences s_k^i, and the basic open sets associated to them.

Note that the basic open sets O(s) are also closed sets, because ^ωK − O(s) = ⋃{O(t) : |t| = |s| ∧ t ≠ s}. The topology on ^ωK coincides with the product topology on K^ω, where each factor K is equipped with the discrete topology. We can also form the metric space (^ωK, d). The distance between distinct x, y ∈ ^ωK is defined by d(x, y) = 1/2^n, where n is the least natural number for which x(n) ≠ y(n). The topology of (^ωK, d) also coincides with the topology on ^ωK generated by the basic open sets.
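The metric and the basic open sets can be illustrated with a short Python sketch, in which an infinite sequence is represented by a function on the natural numbers. Since equality of two infinite sequences cannot be decided in finite time, the search for the first disagreement stops at a cut-off; the parameter search_bound, like the function names, is a hypothetical device of this example.

```python
from fractions import Fraction


def distance(x, y, search_bound=1000):
    """d(x, y) = 1/2^n for the least n with x(n) != y(n).

    x and y are functions from natural numbers to K. If no disagreement
    is found below `search_bound` (an assumed cut-off), the sequences
    are treated as equal and the distance 0 is returned.
    """
    for n in range(search_bound):
        if x(n) != y(n):
            return Fraction(1, 2 ** n)
    return Fraction(0)


def in_basic_open_set(x, s):
    """Check x in O(s), i.e. the finite sequence s is an initial segment of x."""
    return all(x(n) == s[n] for n in range(len(s)))


zeros = lambda n: 0
x = lambda n: 1 if n == 3 else 0

print(distance(zeros, x))
print(in_basic_open_set(x, (0, 0, 0)), in_basic_open_set(x, (0, 0, 0, 0)))
```

In this representation, y belongs to O(x ↾ n) exactly when d(x, y) ≤ 1/2^n, which is one way to see that the two topologies have the same basic neighbourhoods.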

Definition 3.3.3. Let ωK be equipped with the topology just defined. If K = ω, then ωω is called the Baire space. If K = 2, then ω2 is called the Cantor space.

Lemma 3.3.4. If A ⊂ ωK is open, then it can be expressed as a union of pairwise disjoint basic open sets.

Proof. Let $A = \bigcup_{s_n \in S} O(s_n)$ be an open set. In order to eliminate the unnecessary overlap between basic open sets $O(s_n)$ we eliminate the sequences which are not $\subset$-minimal. Specifically, let us define

$R = \{ s_n \in S : s_n \text{ has no proper restriction in } S \}$.

Then the basic open sets $O(s_n)$ with $s_n \in R$ are pairwise disjoint. But their union remains the same, $A = \bigcup_{s_n \in S} O(s_n) = \bigcup_{s_n \in R} O(s_n)$.
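The passage from S to R in this proof can be sketched in Python for a finite family of sequences. The finiteness of S and the names `is_proper_prefix` and `minimal_sequences` are assumptions made only for this illustration; in the lemma itself S may be countably infinite.

```python
def is_proper_prefix(s, t):
    """True when s is a proper restriction (proper initial segment) of t."""
    return len(s) < len(t) and t[:len(s)] == s


def minimal_sequences(S):
    """Return R: the sequences of S that have no proper restriction in S."""
    S = {tuple(s) for s in S}
    return {s for s in S if not any(is_proper_prefix(t, s) for t in S)}
```

Any two distinct elements of the returned set are incomparable, so the corresponding basic open sets are pairwise disjoint, while every sequence of S extends some element of the returned set.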

Note that the basic open sets form a basis for the topology. Since the number of basic open sets is countable, the space ωK is second-countable, and thus first-countable. We finish this subsection with a characterization of the compact subsets of the Cantor space, which will be useful in the proof of Shelah’s Theorem.

Lemma 3.3.5. The Cantor space ω2 is a compact Hausdorff space. In particular, the compact subsets of ω2 are precisely its closed subsets.

Proof. The topology on ${}^{\omega}2$ coincides with the product topology on $2^{\omega}$. The topological space $2 = \{0, 1\}$ is compact for the discrete topology, since there are only a finite number of open sets to cover $\{0, 1\}$. As such, the Tychonoff Theorem implies that ${}^{\omega}2$ is compact. Therefore the closed subsets of ${}^{\omega}2$ are compact.

Let $x, y$ be distinct elements of ${}^{\omega}2$. Then there is a least $n \in \omega$, such that $x \upharpoonright n \neq y \upharpoonright n$. In this case, $O(x \upharpoonright n) \cap O(y \upharpoonright n) = \emptyset$, but $x \in O(x \upharpoonright n)$, and $y \in O(y \upharpoonright n)$. Therefore ${}^{\omega}2$ is a Hausdorff space. As such, the compact subsets of ${}^{\omega}2$ are closed.

Borel and Projective Sets

In this subsection, we assume that X is a Polish space, which is a complete separable metric space.

Examples of Polish spaces include the real line R, the Baire space ωω, and the Cantor space ω2.

Definition 3.3.6. The Borel σ-algebra on X is the smallest σ-algebra on X that contains all of its open sets. It is denoted by B(X). We call B ⊂ X a Borel set when B ∈ B(X).

It follows from the definition that the open sets, and the closed sets, are Borel sets. It also follows that the Fσ sets, and the Gδ sets, are Borel sets. That is, the countable unions of closed sets, and the countable intersections of open sets, are Borel sets. The Borel sets can be arranged in a hierarchy that describes them in greater detail.

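Definition 3.3.7 (The Borel Hierarchy). Let $\alpha < \omega_1$. We define, by transfinite recursion on $\alpha$, the collections $\mathbf{\Sigma}^0_\alpha$ and $\mathbf{\Pi}^0_\alpha$ of subsets of $X$:
(1) $\mathbf{\Sigma}^0_1$ is the collection of all open sets.
(2) $\mathbf{\Pi}^0_1$ is the collection of all closed sets.
(3) $\mathbf{\Sigma}^0_\alpha$ is the collection of all sets of the form $A = \bigcup_{n \in \omega} A_n$, where each $A_n$ is in $\mathbf{\Pi}^0_\beta$ for some $\beta < \alpha$.
(4) $\mathbf{\Pi}^0_\alpha$ is the collection of the complements of all the sets in $\mathbf{\Sigma}^0_\alpha$.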

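The elements of $\mathbf{\Sigma}^0_\alpha$ and $\mathbf{\Pi}^0_\alpha$ are all Borel sets. This can be easily established by transfinite induction on $\alpha$. It is also possible to use transfinite induction to prove that $\mathbf{\Sigma}^0_\alpha \subset \mathbf{\Sigma}^0_\beta$; that $\mathbf{\Sigma}^0_\alpha \subset \mathbf{\Pi}^0_\beta$; that $\mathbf{\Pi}^0_\alpha \subset \mathbf{\Pi}^0_\beta$; and that $\mathbf{\Pi}^0_\alpha \subset \mathbf{\Sigma}^0_\beta$, where $\alpha < \beta$. This entails $\bigcup_{\alpha < \omega_1} \mathbf{\Sigma}^0_\alpha = \bigcup_{\alpha < \omega_1} \mathbf{\Pi}^0_\alpha$.

The construction of the Borel hierarchy ensures that $\bigcup_{\alpha < \omega_1} \mathbf{\Sigma}^0_\alpha$ is closed for complements and countable unions, i.e., that it is a $\sigma$-algebra. Hence

$\mathcal{B}(X) = \bigcup_{\alpha < \omega_1} \mathbf{\Sigma}^0_\alpha = \bigcup_{\alpha < \omega_1} \mathbf{\Pi}^0_\alpha$.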

By definition, the complement of a Borel set, as well as the countable union of Borel sets, is a Borel set. However, the continuous image of a Borel set need not be a Borel set. The study of the continuous images of Borel sets, and their projections, leads us to the projective hierarchy. The projection of A ⊂ X × Y onto X is the set P = {x ∈ X : ∃y (x, y) ∈ A}. Let us start by defining the analytic and coanalytic sets.

Definition 3.3.8. Let A be a subset of a Polish space X. We say that A is analytic if there is a continuous function f : ωω → X, such that A = f(ωω). If X − A is analytic, we say that A is coanalytic.

Proposition 3.3.9. Let A be a subset of a Polish space X. Then the following are equivalent:
(1) A is the continuous image of ωω.
(2) A is the continuous image of a Borel set B, where B is contained in some Polish space Y.
(3) A is the projection of a Borel set in X × Y, for some Polish space Y.
(4) A is the projection of a closed set in X × ωω.

We can now define the projective sets.

Definition 3.3.10 (The Projective Hierarchy). We define the collections $\mathbf{\Sigma}^1_n$ and $\mathbf{\Pi}^1_n$ of subsets of $X$, by recursion on the natural number $n \geq 1$:
(1) $\mathbf{\Sigma}^1_1$ is the collection of all analytic sets in $X$.
(2) $\mathbf{\Pi}^1_1$ is the collection of all coanalytic sets in $X$.
(3) $\mathbf{\Sigma}^1_{n+1}$ is the collection of the projections of all $\mathbf{\Pi}^1_n$ sets in $X \times {}^{\omega}\omega$.
(4) $\mathbf{\Pi}^1_n$ is the collection of the complements of all $\mathbf{\Sigma}^1_n$ sets in $X$.
If a set $A$ belongs to any of the collections $\mathbf{\Sigma}^1_n$ or $\mathbf{\Pi}^1_n$, then we say that $A$ is a projective set.

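Let $\mathbf{\Delta}^1_n = \mathbf{\Sigma}^1_n \cap \mathbf{\Pi}^1_n$. It is possible to use induction on $n \geq 1$ to prove that $\mathbf{\Delta}^1_n \subset \mathbf{\Sigma}^1_n \subset \mathbf{\Delta}^1_{n+1}$, and that $\mathbf{\Delta}^1_n \subset \mathbf{\Pi}^1_n \subset \mathbf{\Delta}^1_{n+1}$. This entails $\bigcup_{n \geq 1} \mathbf{\Delta}^1_n = \bigcup_{n \geq 1} \mathbf{\Sigma}^1_n = \bigcup_{n \geq 1} \mathbf{\Pi}^1_n$.

The $\mathbf{\Sigma}^1_1$, $\mathbf{\Pi}^1_1$, and the Borel sets are directly connected.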

Theorem 3.3.11 (Suslin). If $A \subset X$ is both analytic and coanalytic, then $A$ is a Borel set. Equivalently, $\mathbf{\Delta}^1_1 = \mathcal{B}(X)$.

The Borel and projective hierarchies are also known as the boldface hierarchies, to distinguish them from the lightface hierarchies — also known as the effective hierarchies.

The Effective Hierarchies

In this subsection we define the effective hierarchies, their relativization to a real number, and then prove how the relativized hierarchies are linked to the Borel and projective hierarchies. The effective hierarchies are hierarchies of sets, defined in second-order arithmetic, and stratified by the complexity of the formulas which define their sets. Second-order arithmetic is an interpretation structure in second-order logic. Informally, second-order logic allows us to quantify over all the subsets of the domain, in addition to quantifying over the domain of a structure. Since |P(ω)| = |ωω|, we use ωω to define the second-order arithmetic.

Definition 3.3.12. The second-order arithmetic is the structure A2 = (ω, ωω, ap, +, ×, <, 0, 1). The connection between ω and ωω is made by the application ap : ωω × ω → ω, defined by ap(x, m) = x(m).

We implicitly use the variables $m_1, m_2, \ldots$ to range over $\omega$, and the variables $x_1, x_2, \ldots$ to range over ${}^{\omega}\omega$. If $(m_1, \ldots, m_i, x_1, \ldots, x_j) \in \omega^i \times ({}^{\omega}\omega)^j$, then we write $A(m_1, \ldots, m_i, x_1, \ldots, x_j)$ to mean

$(m_1, \ldots, m_i, x_1, \ldots, x_j) \in A$.

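Definition 3.3.13. Let $A \subset \omega^i \times ({}^{\omega}\omega)^j$. We say that $A$ is arithmetical if $A$ is definable by a formula in $\mathcal{A}^2$ that does not quantify over ${}^{\omega}\omega$. The arithmetical hierarchy on $\omega^i \times ({}^{\omega}\omega)^j$ is defined by
(1) $A \in \Sigma^0_n$ iff $\forall w\, (A(w) \Leftrightarrow \exists m_1 \forall m_2 \ldots Q m_n\, \varphi(m_1, \ldots, m_n, w))$;
(2) $A \in \Pi^0_n$ iff $\forall w\, (A(w) \Leftrightarrow \forall m_1 \exists m_2 \ldots Q m_n\, \varphi(m_1, \ldots, m_n, w))$,
where the formula $\varphi$ has only bounded quantifiers ranging over $\omega$, and $Q$ is such that we have alternating quantifiers.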

A formula that defines an arithmetical set is also called arithmetical. Note that it is implicit in this definition that it is possible to contract consecutive instances of the same quantifier into a single instance, and shift the bounded quantifiers to the right. If an arithmetical set $A$ is in $\Sigma^0_1 \cap \Pi^0_1$, then we call it a recursive set.

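Definition 3.3.14. Let $A \subset \omega^i \times ({}^{\omega}\omega)^j$. We say that $A$ is analytical if $A$ is definable by a formula in $\mathcal{A}^2$. The analytical hierarchy on $\omega^i \times ({}^{\omega}\omega)^j$ is recursively defined by
(1) $\Sigma^1_0 = \Sigma^0_1$;
(2) $\Pi^1_0 = \Pi^0_1$;
(3) $A \in \Sigma^1_n$ iff $\forall w\, (A(w) \Leftrightarrow \exists x_1 \forall x_2 \ldots Q x_n\, \varphi(x_1, \ldots, x_n, w))$;
(4) $A \in \Pi^1_n$ iff $\forall w\, (A(w) \Leftrightarrow \forall x_1 \exists x_2 \ldots Q x_n\, \varphi(x_1, \ldots, x_n, w))$,
where $\varphi$ is arithmetical, and $Q$ is such that we have alternating quantifiers.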

Note that it is implicit in this definition that we can shift the number quantifiers to the right, and the function quantifiers to the left; and that we can contract consecutive instances of the same quantifier into a single instance. These implicit facts, and the similar ones above, are proved in Kanamori, §12.

The arithmetical and analytical hierarchies are linked to the Borel and projective hierarchies, respectively. The connection is made using relativization to real parameters.

Definition 3.3.15. Let $a \in {}^{\omega}\omega$, and let $\dot{a}$ be a binary predicate symbol, such that $\dot{a}(m, n)$ iff $a(m) = n$. The second-order arithmetic relative to $a$ is the structure $\mathcal{A}^2(a) = (\omega, {}^{\omega}\omega, \mathrm{ap}, +, \times, <, 0, 1, \dot{a})$, where $\mathrm{ap}(x, m) = x(m)$.

Now that we have extended the language to include a, we can relativize the effective hierarchies to a.

Definition 3.3.16. Let $A \subset \omega^i \times ({}^{\omega}\omega)^j$. We say that $A$ is arithmetical in $a$, if $A$ is definable by a formula in $\mathcal{A}^2(a)$ that does not quantify over ${}^{\omega}\omega$. The arithmetical hierarchy relativized to $a$ is defined in the extended language by
(1) $A \in \Sigma^0_n(a)$ iff $\forall w\, (A(w) \Leftrightarrow \exists m_1 \forall m_2 \ldots Q m_n\, \varphi(m_1, \ldots, m_n, w))$;
(2) $A \in \Pi^0_n(a)$ iff $\forall w\, (A(w) \Leftrightarrow \forall m_1 \exists m_2 \ldots Q m_n\, \varphi(m_1, \ldots, m_n, w))$,
where the formula $\varphi$ has only bounded quantifiers ranging over $\omega$, and $Q$ represents alternating quantifiers.

Definition 3.3.17. Let $A \subset \omega^i \times ({}^{\omega}\omega)^j$. We say that $A$ is analytical in $a$ if $A$ is definable by a formula in $\mathcal{A}^2(a)$. The analytical hierarchy relativized to $a$ is recursively defined by
(1) $\Sigma^1_0(a) = \Sigma^0_1(a)$;
(2) $\Pi^1_0(a) = \Pi^0_1(a)$;
(3) $A \in \Sigma^1_n(a)$ iff $\forall w\, (A(w) \Leftrightarrow \exists x_1 \forall x_2 \ldots Q x_n\, \varphi(x_1, \ldots, x_n, w))$;
(4) $A \in \Pi^1_n(a)$ iff $\forall w\, (A(w) \Leftrightarrow \forall x_1 \exists x_2 \ldots Q x_n\, \varphi(x_1, \ldots, x_n, w))$,
where the formula $\varphi$ is arithmetical in $a$, and $Q$ represents alternating quantifiers.

Given this definition, we can now see how the relativized effective hierarchies are refinements of the Borel and projective hierarchies.

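Theorem 3.3.18. Let $A \subset ({}^{\omega}\omega)^k$, and $n \geq 1$. Then
(1) $A \in \mathbf{\Sigma}^0_n$ iff $A \in \Sigma^0_n(a)$, for some $a \in {}^{\omega}\omega$.
    $A \in \mathbf{\Pi}^0_n$ iff $A \in \Pi^0_n(a)$, for some $a \in {}^{\omega}\omega$.
(2) $A \in \mathbf{\Sigma}^1_n$ iff $A \in \Sigma^1_n(a)$, for some $a \in {}^{\omega}\omega$.
    $A \in \mathbf{\Pi}^1_n$ iff $A \in \Pi^1_n(a)$, for some $a \in {}^{\omega}\omega$.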

Remark 3.3.19. We provide a sketch of the proof. But before doing so, it is useful to mention that there is an enumeration of the finite sequences, $\{ s_i \in {}^{<\omega}\omega : 1 \leq i \in \omega \}$, such that if $s_i \subset s_j$, then $i \leq j$. The enumeration can be constructed in a similar way to the standard zigzag bijection between $\mathbb{N}$ and $\mathbb{N}^2$. Using this enumeration we can also enumerate the basic open sets $\{ O(s_i) : s_i \in {}^{<\omega}\omega \wedge 1 \leq i \in \omega \}$.

However, these enumerations are only used when necessary. Thus the notation $O(s_n)$ does not necessarily refer to the $n$th basic open set, corresponding to the $n$th sequence $s_n$. When we need to distinguish the $n$th sequence $s_n$ from an arbitrary $s_n$, we write the latter as $s^n$.
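One possible enumeration with the property that initial segments are listed first can be sketched in Python. The ordering used here, by the weight |s| + (sum of the entries of s), is an assumption chosen for this illustration and is not necessarily the enumeration intended in the text; the names `compositions`, `enumerate_sequences` and `first_sequences` are likewise hypothetical. A proper initial segment always has strictly smaller weight, so it receives a smaller index.

```python
from itertools import count, islice


def compositions(total, parts):
    """Yield all tuples of `parts` natural numbers whose sum is `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def enumerate_sequences():
    """Yield (i, s_i) for i = 1, 2, ..., ordered by weight len(s) + sum(s)."""
    i = 1
    for weight in count(0):
        for length in range(weight + 1):
            for s in compositions(weight - length, length):
                yield i, s
                i += 1


def first_sequences(n):
    """Return the list [s_1, ..., s_n] of the first n enumerated sequences."""
    return [s for _, s in islice(enumerate_sequences(), n)]
```

Each weight class is finite, so every finite sequence of natural numbers is reached after finitely many steps, and it is listed exactly once.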

Proof. (1) Let $A \in \mathbf{\Sigma}^0_1$. Then $A$ is open in $({}^{\omega}\omega)^k$, and is expressible as the union of basic open sets in $({}^{\omega}\omega)^k$. The basic open sets of $({}^{\omega}\omega)^k$ are of the form $O(s^1) \times \ldots \times O(s^k)$. Since the finite sequences can be enumerated, each of the $s^j \in {}^{<\omega}\omega$, with $1 \leq j \leq k$, corresponds to an $1 \leq i \in \omega$, such that $s^j = s_i$. We can use this enumeration to determine all the basic open sets $O(s^1) \times \ldots \times O(s^k)$, whose union is $A$. Note that $w \in A$ iff $w \in O(s^1) \times \ldots \times O(s^k)$, for some $s^1, \ldots, s^k$. The trick is to use a real number $a \in {}^{\omega}\omega$ that codifies which of the sequences $s_i$ are involved in the union that makes up $A$. If the union of open sets that make up $A$ is indexed by $m$, we have

$A(w) \Leftrightarrow \exists m\, (w \in O(s_{a(mk)}) \times O(s_{a(mk+1)}) \times \ldots \times O(s_{a(mk+k-1)}))$.

Hence, $A \in \Sigma^0_1(a)$.

Conversely, let $A \in \Sigma^0_1(a)$, for some $a \in {}^{\omega}\omega$. Then $A(w) \Leftrightarrow \exists m\, \varphi(m, w)$, for each $w \in ({}^{\omega}\omega)^k$. The formula $\varphi(m, w)$ does not have unbounded quantifiers ranging over $\omega$. And the bounded quantifiers are of the form $\exists n \in S$, or $\forall n \in S$, where the set $S$ is an element of $\omega$. That is, $S$ is a natural number $q = \{0, 1, \ldots, q - 1\}$.

Let us fix $m \in \omega$. Then the set $R_m = \{w : \varphi(m, w)\}$ is determined by a formula with bounded quantifiers. Thus, by the previous paragraph, the elements $w$ of $R_m$ are determined by finitely many natural numbers. Therefore there are only a finite number of restrictions on the $w \in R_m$. As such, each $R_m$ is the union of basic open sets, which means each $R_m$ is open. Therefore $A = \bigcup_m R_m$ is open, i.e., $A \in \mathbf{\Sigma}^0_1$.

For $A \in \mathbf{\Pi}^0_n$, note that

$A \in \mathbf{\Pi}^0_n$ iff $(({}^{\omega}\omega)^k - A) \in \mathbf{\Sigma}^0_n$
iff $(({}^{\omega}\omega)^k - A) \in \Sigma^0_n(a)$, for some $a \in {}^{\omega}\omega$
iff $A \in \Pi^0_n(a)$, for some $a \in {}^{\omega}\omega$.

The $\mathbf{\Sigma}^0_{n+1}$ case follows from the $\mathbf{\Pi}^0_n$ case by an argument similar to the $\mathbf{\Sigma}^0_1$ case.

(2) The base case is the same, by the definition of the relativized analytical hierarchy. The induction step is proved in Kanamori, §12.

The Lebesgue Measure

In this subsection we define the Lebesgue measure on ωK, and we see how it is possible to translate results about the Lebesgue measure on R to the Lebesgue measure on the Baire space ωω. Let us start by defining the Lebesgue measure for the Borel subsets of ωω.

Definition 3.3.20. The Lebesgue measure on $\mathcal{B}({}^{\omega}\omega)$ is a function $m_L : \mathcal{B}({}^{\omega}\omega) \to [0, 1]$, definable by recursion as follows:
(1) If $s = \emptyset$, then $m_L(O(\emptyset)) = m_L({}^{\omega}\omega) = 1$.
(2) If $m_L(O(s))$ is defined, then $m_L(O(s * k)) = \frac{1}{2^{k+1}} \cdot m_L(O(s))$.
(3) Let $\{A_i : i \in \omega\}$ be a collection of pairwise disjoint sets, such that $m_L(A_i)$ is defined for every $i$. Then $m_L(\bigcup_{i \in \omega} A_i) = \sum_{i \in \omega} m_L(A_i)$.
(4) If $m_L(A)$ is defined, then $m_L({}^{\omega}\omega - A) = 1 - m_L(A)$.

This function $m_L$ exists. In the Cantor space, the Lebesgue measure is defined similarly, with the exception of point (2). If $O(s) \subset {}^{\omega}2$, and $m_L(O(s))$ is defined, then $m_L(O(s * k)) = \frac{1}{2} \cdot m_L(O(s))$, where $k \in \{0, 1\}$. Note that these definitions implicitly use the fact that an open set $A$ can always be expressed as $A = \bigcup_{n \in \omega} O(s_n)$, where the $O(s_n)$ are pairwise disjoint. In order to extend the definition of $m_L$ beyond $\mathcal{B}({}^{\omega}K)$ we resort to null sets.
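The recursion on basic open sets in points (1) and (2) can be made concrete with a small Python sketch using exact rational arithmetic. The function names `measure_baire` and `measure_cantor` are hypothetical, and finite sequences are assumed to be given as tuples.

```python
from fractions import Fraction


def measure_baire(s):
    """m_L(O(s)) in the Baire space: each entry k contributes a factor 1 / 2**(k + 1)."""
    m = Fraction(1)
    for k in s:
        m *= Fraction(1, 2 ** (k + 1))
    return m


def measure_cantor(s):
    """m_L(O(s)) in the Cantor space: each entry (0 or 1) contributes a factor 1/2."""
    return Fraction(1, 2 ** len(s))
```

In the Baire space the measures of the sets O(s ∗ k), for k ∈ ω, form a geometric series whose sum is the measure of O(s), in agreement with point (3); the partial sums can be checked directly with `measure_baire`.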

Definition 3.3.21. We say that $A \subset {}^{\omega}K$ is a null set when $\inf \{ m_L(U) : A \subset U \wedge U \text{ is open} \} = 0$.

Now we can use null sets and Borel sets to define all the Lebesgue measurable sets.

Definition 3.3.22. A set A ⊂ ωK is Lebesgue measurable, or L-measurable, if there is a Borel set B, such that the symmetric difference A∆B is a null set. In this case, the L-measure of A is mL(A) = mL(B).

It follows that every null set, and every Borel set, is L-measurable. But there are L-measurable sets which are not Borel sets, such as the analytic sets which are not coanalytic.

Many results about the L-measure on $\mathbb{R}$ apply to the L-measure on the Baire space ${}^{\omega}\omega$. First we can reduce the study of the L-measure on $\mathbb{R}$ to its study on $[0, 1]$, since $m_L$ is $\sigma$-additive and translation invariant. Now, let $I = [0, 1] - \mathbb{Q}$ be the set of irrational numbers in $[0, 1]$, equipped with the subspace topology. Let $A, A' \subset [0, 1]$, and $A' = I \cap A$. Then $A'$ is L-measurable iff $A$ is L-measurable because $A - A'$ is at most countable. Finally, there is a homeomorphism $f : I \cong {}^{\omega}\omega$ which puts the Borel sets of each space in bijective correspondence, and preserves the null sets.⁵ Therefore $f$ preserves L-measurability.

Based on the properties of $f$, and on the preceding discussion, we will often use results about the L-measure on $\mathbb{R}$ that are applicable to ${}^{\omega}\omega$, without further explanation. We will do the same for the Cantor space ${}^{\omega}2$. We end this subsection with the Lebesgue Density Theorem, which is involved in the proof of Shelah’s Theorem. First, let us define density.

Definition 3.3.23. Let $A \subset {}^{\omega}2$ be L-measurable, let $x \in A$, and let

$d(x) = \lim_{n \to \infty} \frac{m_L(A \cap O(x \upharpoonright n))}{m_L(O(x \upharpoonright n))}$.

If the limit exists we call $d(x)$ the density of $A$ at $x$.

⁵ The proof that $I$ is homeomorphic to ${}^{\omega}\omega$ can be found on p. 5 of Miller [7].

Theorem 3.3.24 (Lebesgue Density Theorem).⁶ Let $A \subset {}^{\omega}2$ be L-measurable. Then $d(x) = 1$ for almost every $x \in A$.
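For a very simple set A, the quotients whose limit defines d(x) can be computed exactly. The following Python sketch assumes that A is a finite union of pairwise disjoint basic open sets of the Cantor space, given by a list of 0/1 tuples; this restriction, and the names `measure_of_intersection` and `density_ratio`, are assumptions of the illustration only.

```python
from fractions import Fraction


def is_prefix(s, t):
    """True when the tuple s is an initial segment of the tuple t."""
    return len(s) <= len(t) and t[:len(s)] == s


def measure_of_intersection(s, t):
    """m_L(O(s) & O(t)) in the Cantor space, for finite 0/1 tuples s and t."""
    if is_prefix(s, t):
        return Fraction(1, 2 ** len(t))
    if is_prefix(t, s):
        return Fraction(1, 2 ** len(s))
    return Fraction(0)


def density_ratio(disjoint_basis, x_prefix):
    """m_L(A & O(x|n)) / m_L(O(x|n)), where A is the union of the pairwise
    disjoint sets O(s) for s in disjoint_basis, and x_prefix is x restricted to n."""
    numerator = sum(measure_of_intersection(s, x_prefix) for s in disjoint_basis)
    return numerator / Fraction(1, 2 ** len(x_prefix))
```

With A = O((0,)) ∪ O((1, 1)), the quotient along any x ∈ A is eventually constant and equal to 1, while along a point outside A, such as one extending (1, 0), it is eventually 0.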

Borel Codes

Every Borel set is in the Borel hierarchy. Since this hierarchy is indexed by $\alpha < \omega_1$, we can obtain any Borel set in less than $\omega_1$ steps. The steps to obtain a Borel set are the information contained in a Borel code. Before defining the Borel codes we need the following auxiliary definitions. Let $\Gamma : \omega \times \omega \to \omega$ be the canonical bijection, defined by $\Gamma(i, n) = i + \frac{1}{2} \cdot (i + n)(i + n + 1)$. If $c \in {}^{\omega}\omega$, then we define the sequence $u(c)$ by $u(c)(n) = c(n + 1)$. We also define the sequences $v_i(c)$ by $v_i(c)(n) = c(\Gamma(i, n) + 1)$, for all $i \in \omega$.
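The pairing function Γ and the decoding maps u and v_i can be sketched in Python as follows. Reals c ∈ ωω are assumed to be represented as Python functions on the natural numbers; the names `gamma`, `gamma_inverse`, `u` and `v` are hypothetical and chosen for this illustration.

```python
def gamma(i, n):
    """Gamma(i, n) = i + (i + n)(i + n + 1) / 2."""
    return i + (i + n) * (i + n + 1) // 2


def gamma_inverse(m):
    """Return the unique pair (i, n) with gamma(i, n) == m."""
    d = 0  # d will be i + n, the index of the diagonal containing m
    while (d + 1) * (d + 2) // 2 <= m:
        d += 1
    i = m - d * (d + 1) // 2
    return i, d - i


def u(c):
    """The sequence u(c), defined by u(c)(n) = c(n + 1)."""
    return lambda n: c(n + 1)


def v(i, c):
    """The sequence v_i(c), defined by v_i(c)(n) = c(gamma(i, n) + 1)."""
    return lambda n: c(gamma(i, n) + 1)
```

Since Γ is a bijection, the sequences v_0(c), v_1(c), ... read pairwise disjoint sets of positions of c, which is what allows a single real to store countably many codes.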

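Definition 3.3.25. Let $0 < \alpha < \omega_1$. We recursively define the sets of reals $\Sigma_\alpha$ and $\Pi_\alpha$, as follows:
(1) $c \in \Sigma_1$ if $c(0) > 1$;
(2) $c \in \Pi_\alpha$ if either $c \in \Sigma_\beta \cup \Pi_\beta$, for some $\beta < \alpha$, or $c(0) = 0$ and $u(c) \in \Sigma_\alpha$;
(3) $c \in \Sigma_\alpha$, where $\alpha > 1$, if either $c \in \Sigma_\beta \cup \Pi_\beta$ for some $\beta < \alpha$, or $c(0) = 1$ and $v_i(c) \in \bigcup_{\beta < \alpha} (\Sigma_\beta \cup \Pi_\beta)$ for all $i \in \omega$.

If $c \in \Sigma_\alpha$, we call $c$ a $\mathbf{\Sigma}^0_\alpha$-code. Similarly, we call $c$ a $\mathbf{\Pi}^0_\alpha$-code if $c \in \Pi_\alpha$. The set of all Borel codes is $BC = \bigcup_{\alpha < \omega_1} \Sigma_\alpha = \bigcup_{\alpha < \omega_1} \Pi_\alpha$.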

We can associate to each Borel code a corresponding Borel set.

Definition 3.3.26. Let $c \in BC$. We define the Borel set $A_c$ as follows:
(1) If $c \in \Sigma_1$ then $A_c = \bigcup \{O(s_i) : c(i) = 1\}$.
(2) If $c \in \Pi_\alpha$, and $c(0) = 0$, then $A_c = {}^{\omega}\omega - A_{u(c)}$.
(3) If $c \in \Sigma_\alpha$, and $c(0) = 1$, then $A_c = \bigcup_{i \in \omega} A_{v_i(c)}$.

It follows that if $c \in \Sigma_\alpha$, then $A_c \in \mathbf{\Sigma}^0_\alpha$. Similarly, if $c \in \Pi_\alpha$, then $A_c \in \mathbf{\Pi}^0_\alpha$. In particular, if $c \in \Sigma_1$, then $A_c$ is open; if $c \in \Pi_1$, then $A_c$ is closed; if $c \in \Sigma_2$, then $A_c$ is an $F_\sigma$ set; and if $c \in \Pi_2$, then $A_c$ is a $G_\delta$ set.

It is possible to show, by induction on $\alpha$, that for each Borel set $B \in \mathbf{\Sigma}^0_\alpha$, there is a code $c \in \Sigma_\alpha$, such that $B = A_c$. And for each $B \in \mathbf{\Pi}^0_\alpha$ there is a code $c \in \Pi_\alpha$, such that $B = A_c$. Thus $\{A_c : c \in BC\}$ is the collection of all Borel sets.

If $A_c$ is a Borel set coded by $c$, then the relativization of $A_c$ to a model $M$ is denoted by $A_c^M$. Let $M, N$ be transitive models of ZF + DC, such that $M \subset N$. Some properties of $A_c^M$ hold in $M$ iff they hold for $A_c^N$ in $N$. That is, the properties are absolute between $M$ and $N$.

Theorem 3.3.27. Let $M, N$ be transitive models of ZF + DC such that $M \subset N$, let $c, d, e, c_1, c_2, \ldots \in M$ be Borel codes in $M$, and let $x \in M$.
(1) Being a Borel code is an absolute property between $M$ and $N$.
(2) The following properties are absolute between $M$ and $N$:

$A_c = \emptyset$, $x \in A_c$, $A_c = A_d$,
$A_c \subset A_d$, $A_e = A_c \cap A_d$, $A_e = A_c \cup A_d$,
$A_d = {}^{\omega}\omega - A_c$, $A_e = A_c \Delta A_d$, $A_d = \bigcup_{n \in \omega} A_{c_n}$.

(3) $m_L^M(A_c^M) = m_L^N(A_c^N)$. In particular, being null is an absolute property between $M$ and $N$.

⁶ The proof can be found on p. 17 of Oxtoby [9].

Proof. We provide a sketch of the proof of (2) and (3).

(2) Since $\omega$ is absolute, so are all the $s_i \in {}^{<\omega}\omega$. Thus, the basic open sets determined by every $s_i$ are absolute. Since the Borel codes in $M$ are also absolute, and they determine how the basic open sets interact to form Borel sets in $N$, we conclude that all the equalities in point (2) are absolute between $M$ and $N$.

(3) We prove this by induction on the complexity of the Borel sets. Let $O(s)$ be a basic open set. Then it is absolute, since it is determined by $s \in {}^{<\omega}\omega$. On the other hand, the L-measure of $O(s)$ is always determined by the sequence $s$. Therefore the L-measure of the basic open sets is absolute. This proves the base step.

If $c \in \Sigma_1$ then $A_c = \bigcup \{O(s_i) : c(i) = 1\}$. Recall that we can express $A_c$ as a pairwise disjoint union of basic open sets, by Lemma 3.3.4. In the proof of this fact we extracted a subsequence of the $s_i$, whose definition is absolute, that gives us the pairwise disjoint union. So we can define a subsequence $s_{i_k}$, such that $A_c = \bigcup_{i_k} O(s_{i_k})$ is a pairwise disjoint union. Hence,

$m_L^M(A_c^M) = \sum_{i_k} m_L^M(O(s_{i_k})^M) = \sum_{i_k} m_L^N(O(s_{i_k})^N) = m_L^N(A_c^N)$.

If $c \in \Pi_1$, and $c(0) = 0$, then $A_c = {}^{\omega}\omega - A_{u(c)}$, where $u(c)$ is in $M$. Then

$m_L^M(A_c^M) = 1 - m_L^M(A_{u(c)}^M) = 1 - m_L^N(A_{u(c)}^N) = m_L^N(A_c^N)$.

It follows from this theorem that we may unambiguously refer to the unique Borel set $B^N \in N$ that corresponds to the Borel set $B^M \in M$. In particular, if $B^M \in M$, then there is a unique corresponding set $B^{M[G]} \in M[G]$. When the context is clear we write $B^*$, instead of $B^N$, or $B^{M[G]}$. Using this notation, note that it also follows from this theorem that
(1) $(({}^{\omega}\omega \cap M) - A)^* = ({}^{\omega}\omega \cap N) - A^*$; and
(2) $(\bigcap_{n \in \omega} A_n)^* = \bigcap_{n \in \omega} A_n^*$.
It is important to observe that the absolute properties in this theorem hold despite the fact that the extension of each Borel set may change between transitive models $M \subset N$ of ZF + DC.

We end this subsection by determining the size of the Borel $\sigma$-algebra $\mathcal{B}({}^{\omega}\omega)$ with the help of Borel codes. The result also holds for $\mathcal{B}({}^{\omega}2)$.

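Lemma 3.3.28. $|\mathcal{B}({}^{\omega}\omega)| = 2^{\aleph_0}$.

Proof. Sets with only one real number are closed. Thus we have $|\mathcal{B}({}^{\omega}\omega)| \geq 2^{\aleph_0}$.

On the other hand, every Borel set has at least one corresponding Borel code in ${}^{\omega}\omega$. Therefore

$|\mathcal{B}({}^{\omega}\omega)| \leq |{}^{\omega}\omega| = 2^{\aleph_0}$.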

The Random Algebra

A set $A \subset {}^{\omega}K$ is L-measurable iff $A \Delta B$ is null, for some Borel set $B$. As such, there is a close relation between Borel sets and L-measurable sets. In order to study L-measurability in the context of set theory it is convenient to have a notion of forcing that involves the Borel sets. That notion of forcing is the random algebra.

Let $A, B \in \mathcal{B}({}^{\omega}K)$. We say that $A \preceq B$ iff $A - B$ is a null set. In other words, $A \preceq B$ iff $A \subset B$, apart from a null set. It is easy to see that this relation is reflexive and transitive. We write $A \approx B$ when $A \preceq B$ and $B \preceq A$, i.e., when $A \Delta B$ is a null set. Note that the null sets in $\mathcal{B}({}^{\omega}K)$ are $\preceq$-minimal elements. It is not desirable to have minimal elements in a notion of forcing. This is because a generic filter $G$ is a gradual approximation of a minimal object $f$ that is not in the ground model.

Let $\mathcal{B}^*({}^{\omega}K) = \{B \in \mathcal{B}({}^{\omega}K) : m_L(B) > 0\}$ be equipped with the order $\preceq$. We take the quotient of $\mathcal{B}^*({}^{\omega}K)$ by the relation $\approx$, so that the resulting partial order is separative. The equivalence class of $A \in \mathcal{B}^*({}^{\omega}K)$ is $[A] = \{B \in \mathcal{B}^*({}^{\omega}K) : B \approx A\}$. The elements of $[A]$ differ only by a null set. We say that $[A] \leq [B]$ iff $A \preceq B$. The order $\leq$ is well defined because if $A' \in [A]$, and $B' \in [B]$, then $A' \approx A \preceq B \approx B'$. So $A' \preceq B'$ iff $A \preceq B$.

Definition 3.3.29. The random algebra on ${}^{\omega}K$ is the set $\mathcal{B}_0({}^{\omega}K) = \{[A] : A \in \mathcal{B}^*({}^{\omega}K)\}$, equipped with the partial order $\leq$ induced by $\preceq$.

Let us now establish the central properties of the random algebra, and prove the results related to this notion of forcing that are central to the proof of the Solovay Theorem.

Proposition 3.3.30. $(\mathcal{B}_0({}^{\omega}K), \leq)$ is an atomless and separative partial order that satisfies the c.c.c.

Proof. (1) Atomless: Let $[A] \in \mathcal{B}_0({}^{\omega}K)$. Then $A$ is a Borel set with positive L-measure. As such, we can find two disjoint Borel sets $B_1, B_2 \subset A$ with positive L-measure. Since $B_1 \cap B_2 = \emptyset$, the only class of Borel sets which is below both $[B_1]$ and $[B_2]$ is the class $[\emptyset]$. Since $\emptyset$ is a null set, $[\emptyset] \notin \mathcal{B}_0({}^{\omega}K)$. Thus we get $[B_1] \perp [B_2]$. And, since $[B_1] \leq [A]$ and $[B_2] \leq [A]$, we conclude that the order is atomless.

(2) Separative: Let $[A], [B] \in \mathcal{B}_0({}^{\omega}K)$ and $[A] \not\leq [B]$. Then $A \not\preceq B$. That is, $A - B$ has positive L-measure. Let $C = A - B$. Then $C - A = \emptyset$ is null, and thus $C \preceq A$. Additionally, $C \cap B = \emptyset$ is null. Therefore $[C] \leq [A]$ and $[C] \perp [B]$.

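(3) C.c.c.: Suppose that $\{[B_\alpha] : \alpha < \omega_1\}$ is an antichain in $\mathcal{B}_0({}^{\omega}K)$ with size $\aleph_1$. Then the intersections of the corresponding Borel sets, $B_\alpha \cap B_\beta$, are null sets (where $\alpha \neq \beta$). Therefore $\bigcup_{\beta < \alpha} (B_\beta \cap B_\alpha)$ is a null set, for each countable $\alpha < \omega_1$.

Let $B'_\alpha = B_\alpha - (\bigcup_{\beta < \alpha} B_\beta)$. Then $B'_\alpha \cap B'_\beta = \emptyset$, whenever $\alpha \neq \beta$. Additionally,

$B'_\alpha = (B_\alpha - (\bigcup_{\beta < \alpha} (B_\beta \cap B_\alpha))) \approx B_\alpha$.

That is, $[B'_\alpha] = [B_\alpha]$, for all $\alpha < \omega_1$. Therefore, the $B'_\alpha$ have positive L-measure for all $\alpha < \omega_1$.

Since $\omega_1 = \aleph_1$ is a regular cardinal, and $\omega_1 > \omega$, there are $\aleph_1$ many ordinals $\alpha < \omega_1$, such that $m_L(B'_\alpha) > \frac{1}{n}$, for a fixed $n < \omega$. This contradicts $m_L({}^{\omega}K) \leq 1$.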

Proposition 3.3.31. Let $M$ be a CTM, and let $G$ be a $\mathcal{B}_0({}^{\omega}K)$-generic filter. Then there is a unique $x_G \in {}^{\omega}K \cap M[G]$, such that, for all the Borel sets $B$ that are encoded by sequences in $M$, we have

$x_G \in B^* \Leftrightarrow [B] \in G$,

where $B^*$ is computed in V. In particular, $M[x_G] = M[G]$.

Proof. First we show uniqueness. Let $x, y \in {}^{\omega}K \cap M[G]$ be such that $x \in B^* \Leftrightarrow [B] \in G \Leftrightarrow y \in B^*$ for all $B \in \mathcal{B}({}^{\omega}K)$. In particular, $x \in O(s)^* \Leftrightarrow [O(s)] \in G \Leftrightarrow y \in O(s)^*$, for all $s \in {}^{<\omega}K$. Suppose that $x \neq y$. Then there is some $s \in {}^{<\omega}K$ with $|s| > 0$, such that $x \in O(s)^*$ and $y \notin O(s)^*$. But this contradicts $x \in O(s)^* \Leftrightarrow y \in O(s)^*$.

Let us now show existence. Working in $M[G]$, we recursively define a set $\{s_n : n < \omega\} \subset {}^{<\omega}K$, such that $|s_n| = n$, and $[O(s_n)] \in G$ for each $n \in \omega$. The union $\bigcup_{n \in \omega} s_n$ will be the $x_G$.

Set $s_0 = \emptyset$. Then $[O(\emptyset)] = [{}^{\omega}K] \in G$, since $G$ is upwardly closed.

Given $s_n$ with length $|s_n| = n$, and $[O(s_n)] \in G$, we will pick $s_{n+1}$ using the density of $G$ below $[O(s_n)]$. First, note that $O(s_n) = \bigcup_{k \in K} O(s_n * k)$, and that $m_L(O(s_n)) = \sum_{k \in K} m_L(O(s_n * k))$. Let $[B] \leq [O(s_n)]$, where $[B] \in \mathcal{B}_0({}^{\omega}K)$. So $B$ has positive L-measure. Therefore, there is at least one $k \in K$, such that $B \cap O(s_n * k)$ has positive L-measure. So $[B \cap O(s_n * k)] \leq [O(s_n * k)]$. This argument shows that the set $D_n = \{[B] \in \mathcal{B}_0({}^{\omega}K) : \exists k\, [B] \leq [O(s_n * k)]\}$ is dense below $[O(s_n)]$. Hence there is a $[B] \in D_n \cap G$. And since $G$ is upwardly closed, there is a $k \in K$, such that $[O(s_n * k)] \in G$. We define $s_{n+1} = s_n * k$ for this $k$.

Now let $x = \bigcup_{n < \omega} s_n$ be the prospective $x_G$. We claim that

x ∈ B∗ ⇔ [B] ∈ G, for all B ∈ B(ωK) (encoded by sequences in M). We prove this claim for all Borel sets by induction. The base step can be reduced to proving the claim for basic open sets: since the filter G cannot have incompatible elements, the claim is true for each basic open set, by the construction of x. The induction steps pertain to complements of Borel sets and to countable unions of Borel sets.

Note that $\{[B], [{}^{\omega}K - B]\}$ is a maximal antichain in $\mathcal{B}_0({}^{\omega}K)$. Therefore, $G \cap \{[B], [{}^{\omega}K - B]\} \neq \emptyset$, by Theorem 3.2.9. Since $G$ does not contain incompatible elements, either $[B] \in G$, or $[{}^{\omega}K - B] \in G$, but not both. Therefore, if the claim is true for $B \in \mathcal{B}({}^{\omega}K)$, then it is also true for ${}^{\omega}K - B$.

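Let $\{B_n : n \in \omega\} \subset \mathcal{B}({}^{\omega}K)$, where $x \in B_n^* \Leftrightarrow [B_n] \in G$, for every $n \in \omega$. First we want to prove that $D = \{[B_n] : n \in \omega\} \cup \{[{}^{\omega}K - (\bigcup_{n < \omega} B_n)]\}$ is predense in $\mathcal{B}_0({}^{\omega}K)$. Note that

$B_0 \cup B_1 \cup \ldots \cup B_m \cup \ldots \cup ({}^{\omega}K - (\bigcup_{n < \omega} B_n)) = {}^{\omega}K$.

Let $[C] \in \mathcal{B}_0({}^{\omega}K)$ and suppose that there is no element in $D$ that is compatible with $[C]$. This means that $C \cap B_n$ is null for each $n \in \omega$, and that $C \cap ({}^{\omega}K - (\bigcup_{n < \omega} B_n))$ is null. However,

$C = C \cap {}^{\omega}K = (C \cap B_0) \cup (C \cap B_1) \cup \ldots \cup (C \cap B_m) \cup \ldots \cup (C \cap ({}^{\omega}K - (\bigcup_{n < \omega} B_n)))$.

This means that $C$ is a countable union of null sets and is therefore a null set. This is impossible, because $[C] \in \mathcal{B}_0({}^{\omega}K)$. Therefore there must be an element of $D$ that is compatible with $[C]$, which means that $D$ is predense in $\mathcal{B}_0({}^{\omega}K)$. Thus $D \cap G \neq \emptyset$. Now we can prove the induction step concerning countable unions of Borel sets:

$x \in (\bigcup_{n < \omega} B_n)^* = \bigcup_{n < \omega} B_n^*$ iff $x \in B_n^*$, for some $n \in \omega$
iff $[B_n] \in G$, for some $n \in \omega$ (by the induction hypothesis)
iff $[\bigcup_{n < \omega} B_n] \in G$,

because $[\bigcup_{n < \omega} B_n]$ is incompatible with $[{}^{\omega}K - (\bigcup_{n < \omega} B_n)]$ and $G \cap D \neq \emptyset$.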

The xG characterizes G, since G can be recovered from xG. Therefore, M[xG] = M[G] and xG is a generic real.

Definition 3.3.32. Let $M$ be a CTM, $G$ be a $(\mathcal{B}_0({}^{\omega}K))^M$-generic filter, and $x_G$ be the unique real number such that $x_G \in B^* \Leftrightarrow [B] \in G$, for all Borel sets of $M$ (where $B^* \in$ V). Then $x_G$ is called a random real over $M$.

The set of random reals can be characterized in terms of the null Borel sets in M.

Proposition 3.3.33. Let M be a CTM. Then x ∈ ωK is a random real over M iff x ∉ B∗, for all the Borel sets B ∈ (B(ωK))M which are null in M.

Proof. (⇒) Let $x = x_G \in {}^{\omega}K$ be random over $M$, and $B \in (\mathcal{B}({}^{\omega}K))^M$ be null in $M$. Then we have $[({}^{\omega}K \cap M) - B] = [{}^{\omega}K \cap M] \in G$, and $x \in [{}^{\omega}K \cap M]$. Since $x$ is a random real over $M$, we have $x \in (({}^{\omega}K \cap M) - B)^* = ({}^{\omega}K \cap M[G]) - B^*$. As such, $x \notin B^*$.

(⇐) Suppose that $x \in {}^{\omega}K$ satisfies $x \notin B^*$, for all the $B \in (\mathcal{B}({}^{\omega}K))^M$ that are null in $M$. Let $G = \{[B] : B \in (\mathcal{B}({}^{\omega}K))^M \wedge x \in B^*\}$. Given the definition of $G$, it suffices to verify that $G$ is a $(\mathcal{B}_0({}^{\omega}K))^M$-generic filter over $M$.

First we prove that $G$ is a filter. Let $[B_1], [B_2] \in G$. Then $x \in B_1^*$ and $x \in B_2^*$. That is, $x$ is an element of $B_1^* \cap B_2^* = (B_1 \cap B_2)^*$. As such, $[B_1 \cap B_2] \in G$. Now let $[B] \in G$ and $[B] \leq [C]$. Then $B \preceq C$. That is, $B - C$ is a null set, and so is $(B - C)^*$. But this means that $(B - C) \cup C \in [C]$. Therefore, if $x \in B^*$, then $x \in ((B - C) \cup C)^*$. So $[C] \in G$.

Now let $A \in M$ be a maximal antichain. The goal is to find an element in $A \cap G$. Since $M \models$ "$\mathcal{B}_0({}^{\omega}K)$ satisfies the c.c.c.", the antichain $A$ is countable in $M$. Let $A = \{[B_n] : n \in \omega\}$, where $B_n \in M$ for all $n \in \omega$. Since $A$ is a maximal antichain, $({}^{\omega}K \cap M) - (\bigcup_{n \in \omega} B_n)$ must be a null set in $M$. But then $x \notin (({}^{\omega}K \cap M) - (\bigcup_{n \in \omega} B_n))^*$, and hence $x \in (\bigcup_{n \in \omega} B_n)^* = \bigcup_{n \in \omega} B_n^*$. Therefore $x \in B_n^*$ for some $n \in \omega$, so that $[B_n] \in G$. Thus $A \cap G \neq \emptyset$.

The contrapositive of this proposition is that each real that is not random over $M$ must be in a Borel set $B^* \in$ V, such that $B^M$ is null in $M$. We can ensure that the set of nonrandom reals in a generic extension $M[G]$ is null under the right circumstances. These circumstances will arise naturally in the proof of Solovay’s Theorem.

Proposition 3.3.34. Let $M$ be a CTM. Then $A = \{x \in {}^{\omega}K \cap M[G] : x \text{ is not random over } M\}$ is a null set when $(2^{\aleph_0})^M$ is countable in $M[G]$.

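Proof. From the previous proposition, $x \in A$ iff $x \in \bigcup \{B^* : B \in (\mathcal{B}({}^{\omega}K))^M \wedge B \text{ is null in } M\}$. As such, $A$ is a countable union because $|(\mathcal{B}({}^{\omega}K))^M| = (2^{\aleph_0})^M$, and $(2^{\aleph_0})^M$ is countable in $M[G]$. Since being null is absolute between transitive models, each $B^*$ in this union is null. Therefore $A$ is a countable union of null sets, and so $A$ is null.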

Regularity Properties

The Lebesgue measure is one of the main regularity properties of sets of reals that are studied in descriptive set theory. Two other main properties are the Baire property and the perfect set property.

Definition 3.3.35. Let $A \subset {}^{\omega}K$. (1) If $D \subset A$, we say that $D$ is dense in $A$, if $\overline{D} = A$, where $\overline{D}$ is the closure of $D$ in $A$. (2) We say that $A$ is nowhere dense if the complement of $A$ contains an open dense set. (3) We say that $A$ is a meager set if it is a countable union of nowhere dense sets. (4) We say that $A$ has the Baire property if there is an open set $U$, such that $A \Delta U$ is a meager set.

The definition of the Baire property resembles the definition of L-measurability. While in the case of L-measurability, the idea of a small set is captured by a null set, in the case of the Baire property, the idea of a small set is captured by a meager set — a purely topological concept.

Definition 3.3.36. Let A, P ⊂ ωK. (1) Let P be nonempty and closed. If P has no isolated points, then we call it a perfect set. (2) We say that A has the perfect set property if A is either countable or has a perfect subset.

We finish this subsection with two central properties of the perfect sets.

Proposition 3.3.37. Every perfect set has cardinality $2^{\aleph_0}$.

Theorem 3.3.38 (Cantor-Bendixson). Let F ⊂ ωK be an uncountable closed set. Then F = P ∪ C, where P is a perfect set and C is at most countable.

Filters, Ideals, Ultrafilters, and Trees

We end this chapter by defining supplementary notions that are required in later chapters, and by establishing their essential properties.

Definition 3.3.39. Let S be a set, and F ⊂ P(S) be nonempty. We say that F is a filter on S when (1) ∅ ∉ F; (2) A, B ∈ F implies A ∩ B ∈ F; and (3) A ∈ F and A ⊂ B implies B ∈ F.

This notion of filter is not the same as the one we have defined previously for partial orders. However, since (P(S), ⊂) is a partial order, this notion of filter is a particular case of a filter on a partial order. When the context is clear we use the word filter without further clarification.

We can interpret a filter F as a collection of large subsets of S, since the intersection of two elements in F is still in F. In this sense, there are also collections of small subsets of S, which are called ideals.

Definition 3.3.40. Let S be a set, and I ⊂ P(S) be nonempty. We say that I is an ideal on S when (1) S ∉ I; (2) A, B ∈ I implies A ∪ B ∈ I; and (3) A ∈ I and B ⊂ A implies B ∈ I.

Example 3.3.41. (1) Let S and P be nonempty sets such that P ⊂ S. The collection F = {A ⊂ S : P ⊂ A} is called a principal filter. It is easy to verify that it is a filter: ∅ ∉ F because P is nonempty. If A, B ∈ F, then P ⊂ A ∩ B, so that A ∩ B ∈ F. Finally, if A ∈ F and A ⊂ B, then P ⊂ B, so that B ∈ F.
(2) Let S be an infinite set. The collection of cofinite subsets of S, i.e., F = {A ⊂ S : S − A is finite}, is called the Fréchet filter. To verify that F is a filter we note that S − ∅ = S cannot be finite, so that ∅ ∉ F. If A, B ∈ F, then S − (A ∩ B) = (S − A) ∪ (S − B) is a finite set. Therefore A ∩ B ∈ F. And if A ∈ F and A ⊂ B, then S − B ⊂ S − A is finite. Thus B ∈ F.
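On a finite set S the axioms of Definitions 3.3.39 and 3.3.40 can be checked by brute force. The following Python sketch does this for principal filters and their dual ideals; the Fréchet filter requires an infinite S and is therefore outside the scope of this illustration. All function names are hypothetical.

```python
from itertools import combinations


def powerset(S):
    """All subsets of the finite set S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


def is_filter(F, S):
    """Check that F is nonempty and satisfies (1)-(3) of Definition 3.3.39."""
    F = {frozenset(A) for A in F}
    if not F or frozenset() in F:
        return False
    meets = all((A & B) in F for A in F for B in F)
    upward = all(B in F for A in F for B in powerset(S) if A <= B)
    return meets and upward


def is_ideal(I, S):
    """Check that I is nonempty and satisfies (1)-(3) of Definition 3.3.40."""
    S = frozenset(S)
    I = {frozenset(A) for A in I}
    if not I or S in I:
        return False
    joins = all((A | B) in I for A in I for B in I)
    downward = all(B in I for A in I for B in powerset(A))
    return joins and downward


def principal_filter(P, S):
    """The family {A subset of S : P subset of A}."""
    P = frozenset(P)
    return {A for A in powerset(S) if P <= A}


def dual(family, S):
    """The family of complements {S - A : A in family}."""
    S = frozenset(S)
    return {S - frozenset(A) for A in family}
```

Taking complements turns the principal filter generated by a nonempty P into an ideal, which matches the reading of filters as collections of large sets and ideals as collections of small sets.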

Definition 3.3.42. Let S be a set and U, I ⊂ P(S). (1) U is called an ultrafilter on S when, for every A ⊂ S, either A ∈ U or S − A ∈ U. (2) I is called a prime ideal on S when P(S) − I is an ultrafilter.

It is easy to see that ultrafilters and prime ideals are maximal filters and ideals, respectively.

Definition 3.3.43. Let F be a filter on S, I be an ideal on S, and κ be a regular uncountable cardinal. We say that

(1) $F$ is $\kappa$-complete when, for all $\gamma < \kappa$, any collection $\{X_\xi : \xi < \gamma\}$ of sets in $F$ satisfies $\bigcap_{\xi<\gamma} X_\xi \in F$;

(2) $I$ is $\kappa$-complete when, for all $\gamma < \kappa$, any collection $\{Y_\xi : \xi < \gamma\}$ of sets in $I$ satisfies $\bigcup_{\xi<\gamma} Y_\xi \in I$.

When $F$ is $\aleph_1$-complete, we say that $F$ is $\sigma$-complete. Similarly, when $I$ is $\aleph_1$-complete, we say that $I$ is $\sigma$-complete. The completeness of a filter $F$ on $S$ is the least cardinal $\kappa$ for which there is a collection $\{X_\xi : \xi < \kappa\} \subset F$ of $\kappa$ sets, such that $\bigcap_{\xi<\kappa} X_\xi \notin F$. In this case we write $\mathrm{Comp}(F) = \kappa$. The completeness of $F$ is at most $|S|$ because we can always define a collection of at most $|S|$ sets in $F$ whose intersection is $\emptyset$. Namely, $\{S - \{s\} \in F : s \in S\}$.

Lemma 3.3.44. Let U be an ultrafilter on S, and κ be a regular uncountable cardinal. Then U is κ-complete iff the dual prime ideal, I = P(S) − U, is κ-complete.

Proof. (⇒) Let $U$ be $\kappa$-complete, and let $\{Y_\xi : \xi < \gamma\} \subset I$, where $\gamma < \kappa$. To get to a contradiction, suppose that $\bigcup_{\xi<\gamma} Y_\xi \notin I$. Then $S - \bigcup_{\xi<\gamma} Y_\xi \notin U$. However, $S - \bigcup_{\xi<\gamma} Y_\xi = \bigcap_{\xi<\gamma} (S - Y_\xi) \in U$, which is a contradiction. (⇐) The proof is the same, mutatis mutandis.

Lemma 3.3.45. Let $U$ be an ultrafilter on $S$, and $\kappa$ be a regular uncountable cardinal. Then $U$ is $\kappa$-complete iff there is no partition $S = \bigcup_{\xi<\gamma} X_\xi$, where $\gamma < \kappa$, such that $X_\xi \notin U$ for all $\xi < \gamma$.

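Proof. (⇒) Let $U$ be $\kappa$-complete, and $I = \mathcal{P}(S) - U$. Suppose there is said partition $\{X_\xi : \xi < \gamma\}$. Since $X_\xi \notin U$, we have $X_\xi \in I$, for every $\xi < \gamma$. Therefore $\bigcup_{\xi<\gamma} X_\xi \in I$ because $I$ is $\kappa$-complete. However, since $I$ is an ideal, we have $\bigcup_{\xi<\gamma} X_\xi = S \notin I$. This is a contradiction. (⇐) If $U$ is not $\kappa$-complete, then there is a family of sets $\{X_\xi : \xi < \gamma\} \subset U$, such that $\bigcap_{\xi<\gamma} X_\xi = \emptyset$. Let $Y_\xi = S - X_\xi$. Then $S = \bigcup_{\xi<\gamma} Y_\xi$. If we set $Z_0 = Y_0$, and $Z_\xi = Y_\xi - \bigcup_{\zeta<\xi} Y_\zeta$, then the $Z_\xi$ are disjoint. Furthermore, $Z_\xi \subset Y_\xi$, so that $Z_\xi \notin U$. Finally, $S = \bigcup_{\xi<\gamma} Z_\xi$.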
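The disjointification step $Z_\xi = Y_\xi - \bigcup_{\zeta<\xi} Y_\zeta$ used in the (⇐) direction can be illustrated for a finite list of sets. This is a minimal Python sketch under that finiteness assumption; `disjointify` is a hypothetical name. Some pieces may come out empty, which is harmless for the argument.

```python
def disjointify(cover):
    """Given Y_0, Y_1, ..., return Z_i = Y_i minus the union of the earlier Y_j."""
    seen = set()
    pieces = []
    for y in cover:
        y = set(y)
        pieces.append(y - seen)  # remove everything already covered
        seen |= y
    return pieces

Y = [{0, 1, 2}, {2, 3}, {1, 3, 4}]
Z = disjointify(Y)
print(Z)
```

Each $Z_i$ is a subset of $Y_i$, the pieces are pairwise disjoint, and their union is the union of the $Y_i$, which are the three facts the proof relies on.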

We now turn our attention to trees and the basic associated definitions.

Definition 3.3.46. Let $(T, \leq_T)$ be a partial order. If for all $s \in T$, the set $\{t \in T : t \leq_T s\}$ is well-ordered, then $T$ is called a tree.

Definition 3.3.47. Let T be a tree, and s ∈ T .

(1) The level of $s$ in $T$ is the order-type of $\{t \in T : t <_T s\}$, and is denoted by $\mathrm{lv}_T(s)$.

(2) The height of $T$ is the ordinal given by $\sup(\{\mathrm{lv}_T(s) + 1 : s \in T\})$, and is denoted by $\mathrm{ht}(T)$.

(3) A set $b \subset T$ is called a branch through $T$ if $(b, \leq_T)$ is a linear order and, for all $s \in b$, if $t <_T s$, then $t \in b$.

Definition 3.3.48. Let $X$ be a set.

(1) A (sequential) tree on $X$ is a subset $T \subset {}^{<\omega}X$, such that if $s \in T$, then $s \upharpoonright n \in T$, for all $n \in \omega$. A tree on $X$ is ordered by reverse inclusion, i.e., $s_1 \leq_T s_2$ iff $s_1 \supset s_2$.

(2) If $T$ is a tree on $X$, then $[T] = \{x \in {}^{\omega}X : \forall n \in \omega \, (x \upharpoonright n \in T)\}$ is called the set of the infinite paths through $T$.

(3) If $Y \subset {}^{\omega}X$, then we define $T_Y = \{f \upharpoonright n : f \in Y \wedge n \in \omega\}$, which is a tree on $X$.

(4) The restriction of $T_Y$ to the $n$-th level is $T_Y(n) = \{f \upharpoonright k : f \in Y \wedge k \leq n\}$.
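To make the restricted tree $T_Y(n)$ concrete, the following Python sketch builds it for a finite collection $Y$ of sequences. It is an illustration only: infinite sequences are modelled as Python functions on the natural numbers, and the names `restrict`, `tree_up_to_level`, and `is_closed_under_restriction` are hypothetical.

```python
def restrict(f, n):
    """The finite sequence (f(0), ..., f(n-1)), i.e. f restricted to n."""
    return tuple(f(k) for k in range(n))

def tree_up_to_level(Y, n):
    """T_Y(n): all restrictions of members of Y to some k <= n."""
    return {restrict(f, k) for f in Y for k in range(n + 1)}

def is_closed_under_restriction(T):
    """Check that every initial segment of a member of T is again in T."""
    return all(s[:k] in T for s in T for k in range(len(s) + 1))

zeros = lambda k: 0            # the constant sequence 0, 0, 0, ...
alternating = lambda k: k % 2  # the sequence 0, 1, 0, 1, ...
T = tree_up_to_level([zeros, alternating], 3)
print(sorted(T))
```

The two sequences share the initial segments of length 0 and 1 and then separate, so the finite tree records exactly where they branch apart.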

A tree on X allows us to analyze a set of infinite sequences using an adequately defined set of finite sequences. Note that a tree on X is technically not a particular case of a tree, due to the difference between the definitions of the respective order relations. However, a tree on X ordered by inclusion is a tree. We now have all the required mathematical tools to prove the main theorems of this thesis.

Chapter 4

The Lebesgue Measure and Large Cardinals

4.1 Solovay’s Theorem

Introduction

Theorem 4.1.1 (Solovay). If ZFC + “κ is an inaccessible cardinal” is consistent, then there is a model of ZF + DC in which all sets of reals are Lebesgue measurable.

The key ingredients in the proof of Solovay’s Theorem are the Lévy collapse $\mathrm{Col}(\omega, <\kappa)$, with $\kappa$ inaccessible, the homogeneity of $\mathrm{Col}(\omega, <\kappa)$, the random algebra and the Solovay sets.

Solovay Sets and Random Reals

Definition 4.1.2. Let $M$ be a CTM, and $A \subset {}^{\omega}\omega$. We say that $A$ is Solovay over $M$ if there is a formula $\varphi$, and parameters $a_1, \dots, a_k \in M$, such that if $x \in {}^{\omega}\omega$ is generic over $M$, then

$x \in A$ iff $M[x] \models \varphi(x, a_1, \dots, a_k)$.

The first goal is to find a sufficient condition for all Solovay sets A over M to be L-measurable. In order to do that we start by partitioning A into random reals and nonrandom reals.

Lemma 4.1.3. Let $M$ be a CTM, and $A \subset {}^{\omega}\omega$ be Solovay over $M$. Then there is a Borel set $B \subset {}^{\omega}\omega$, such that $x \in A$ iff $x \in B$, for every $x \in {}^{\omega}\omega$ that is random over $M$.

Proof. Let $G$ be a $(B_0({}^{\omega}\omega))^M$-generic filter over $M$, let $x_G$ be the random real corresponding to $G$, and let $\tau \in M^{(B_0({}^{\omega}\omega))^M}$ be a name such that $\tau^G = x_G$. Recall that $x_G$ is generic over $M$ and that $M[x_G] = M[G]$.

Since $A$ is Solovay over $M$ there is a formula $\varphi$, and parameters $a_1, \dots, a_k \in M$, such that, for all the reals $x$ that are generic over $M$, we have $x \in A$ iff $M[x] \models \varphi(x, a_1, \dots, a_k)$.

Let $\mathcal{D} = \{[D] \in (B_0({}^{\omega}\omega))^M : [D] \Vdash^{(B_0({}^{\omega}\omega))^M}_{M} \varphi(\tau, \check{a}_1, \dots, \check{a}_k)\}$, and let $E \subset \mathcal{D}$ be an antichain that is maximal in $\mathcal{D}$. Then $E$ is at most countable because $(B_0({}^{\omega}\omega))^M$ satisfies the c.c.c. So $E = \{[D_n] : n \in \omega\}$.

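Let $B = (\bigcup_{n\in\omega} D_n)^* = \bigcup_{n\in\omega} D_n^*$. Since the $D_n^*$ are Borel sets, $B$ is a Borel set. We have

$x_G \in A$ iff $M[x_G] \models \varphi(x_G, a_1, \dots, a_k)$

iff $M[G] \models \varphi(x_G, a_1, \dots, a_k)$

iff $(\exists [D_j] \in G) \; [D_j] \Vdash^{(B_0({}^{\omega}\omega))^M}_{M} \varphi(\tau, \check{a}_1, \dots, \check{a}_k)$

iff $\exists j \in \omega \; ([D_j] \in E \cap G)$

iff $[\bigcup_{n\in\omega} D_n] \in G$

iff $x_G \in (\bigcup_{n<\omega} D_n)^*$.

Hence, for any random real $x_G$, we have $x_G \in A$ iff $x_G \in B$.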

This lemma implies that the elements of $A \Delta B$ are all nonrandom reals. Recall that Proposition 3.3.34 says that if $(2^{\aleph_0})^M$ is countable in $M[G]$, then the set of nonrandom reals in $M[G]$ is null.

Corollary 4.1.4. Let $M$ be a CTM, $\mathbb{P} \in M$ be a partial order, $G \subset \mathbb{P}$ be a $\mathbb{P}$-generic filter, and $(2^{\aleph_0})^M$ be countable in $M[G]$. If $A \in M[G]$ is a set of reals that is Solovay over $M$, then $A$ is L-measurable.

The Strategy For the Proof of Solovay’s Theorem

It is possible to prove that all the sets of reals $A \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$ are Solovay sets. If we forced $(2^{\aleph_0})^M$ to be countable in $M[G]$, then the set of nonrandom reals in $A$ would be null, making $A$ an L-measurable set. Then we would extract the substructure $\mathrm{HOD}^{M[G]}_{{}^{\omega}\omega}$ from $M[G]$ and be done with the proof.

However, it is not so simple because there are technical difficulties that arise in relation to the Solovay sets. It is not possible to prove that each set of reals $A \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$ is Solovay over the same submodel of $M[G]$. Each $A$ is Solovay over a submodel $M_A$ of $M[G]$ that depends on $A$. Additionally, in each of the $M_A$ we need $(|(2^{\aleph_0})^{M_A}| = \aleph_0)^{M[G]}$ to prove that $A$ is L-measurable. This is where the inaccessible cardinal has a crucial role.

If $\kappa$ is an inaccessible cardinal, and $\aleph_\alpha < \kappa$, then $2^{\aleph_\alpha} < \kappa$. In order to obtain $(|(2^{\aleph_0})^{M_A}| = \aleph_0)^{M[G]}$ for every $M_A$, we should collapse all the cardinalities $(2^{\aleph_0})^{M_A}$. This can be accomplished with the Lévy collapse $\mathrm{Col}(\omega, <\kappa)$, where $\kappa$ is an inaccessible cardinal. The Lévy collapse will force the nonrandom reals of a Solovay set $A$ over $M_A$ to be a null set, making $A$ an L-measurable set. This is how the forcing notions $\mathrm{Col}(\omega, <\kappa)$ and $B_0({}^{\omega}\omega)$ “interact” in the proof of Solovay’s Theorem.

We need a careful application of the Lévy collapse. For each $A$, the submodel $M_A$ of $M[G]$ is a generic extension of $M$, such that $M \subsetneq M_A \subsetneq M[G]$. We will find a general method to factorize $M[G]$ so that, in each particular factorization related to $A$, it includes $M_A$ as a factor. This factorization method allows us to prove that each set of reals $A \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$ is Solovay over $M_A$, where $(|(2^{\aleph_0})^{M_A}| = \aleph_0)^{M[G]}$, and thus prove the L-measurability of $A$.

Once the technical difficulties are removed, the proof is straightforward. After the Lévy collapse we extract the substructure $N = \mathrm{HOD}^{M[G]}_{{}^{\omega}\omega}$ from $M[G]$, and prove it is a model of ZF+DC in which all sets of reals are L-measurable.

Solovay’s Technical Lemma

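The next lemma is the first step to find the adequate factorization method of $M[G]$. Let us start with two preliminary definitions: (1) the collection of sets hereditarily of cardinality less than $\lambda$ is defined as $H_\lambda = \{x : |\mathrm{TC}(x)| < \lambda\}$, where $\lambda$ is a cardinal; (2) if $G$ is a $\mathrm{Col}(\omega, <\kappa)$-generic filter, we define $G \upharpoonright \lambda = \{p \upharpoonright (\mathrm{supp}(p) \cap \lambda) : p \in G\}$, and $G \upharpoonright [\lambda, \kappa[ = \{p \upharpoonright (\mathrm{supp}(p) \cap [\lambda, \kappa[) : p \in G\}$. So $(G \upharpoonright \lambda) \times (G \upharpoonright [\lambda, \kappa[)$ is isomorphic to $G$.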

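Lemma 4.1.5. Let $M$ be a CTM, $G$ be a $\mathrm{Col}(\omega, <\kappa)$-generic filter, and $N$ be a transitive set with $\omega \in N$.

(1) For each $f \in {}^{\omega}N \cap M[G]$, there is a cardinal $\lambda < \kappa$, such that $f \in M[G \upharpoonright \lambda]$.

(2) In particular, if $x \in {}^{\omega}\omega \cap M[G]$, then there is a cardinal $\lambda < \kappa$, such that $x \in M[G \upharpoonright \lambda]$.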

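Proof. Suppose $f \in {}^{\omega}N \cap M[G]$. Let $\tau \in M^{\mathrm{Col}(\omega,<\kappa)}$ be a name such that $\tau^G = f$, and

$\tau = \{(\check{(n, \nu)}, p) : n \in \omega \wedge \nu \in N \wedge p \in A_n \wedge p \Vdash \tau(\check{n}) = \check{\nu}\}$,

where each $A_n \in M$ is an antichain maximal in $D = \{p \in \mathrm{Col}(\omega, <\kappa) : \exists \nu \in N \; (p \Vdash \tau(\check{n}) = \check{\nu})\}$. Every $A_n$ is an element of $H_\kappa$ because $\mathrm{Col}(\omega, <\kappa)$ satisfies the $\kappa$-c.c. And since $\kappa$ is inaccessible, there is a cardinal $\lambda < \kappa$, such that every $A_n$ is in $H_\lambda$. Therefore

$f(n) = \nu$ iff $\exists p \in (A_n \cap G) \; (p \Vdash \tau(\check{n}) = \check{\nu})$

iff $\exists p \in (A_n \cap G \upharpoonright \lambda) \; (p \Vdash \tau(\check{n}) = \check{\nu})$.

$G \upharpoonright \lambda$ is $\mathrm{Col}(\omega, <\lambda)$-generic over $M$, by the Product Lemma. Thus $f \in M[G \upharpoonright \lambda]$.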

This proof can be adapted to a finite number of reals. That is, if $x_1, \dots, x_n \in {}^{\omega}\omega \cap M[G]$, then there is some $\lambda < \kappa$, such that $x_1, \dots, x_n \in M[G \upharpoonright \lambda]$. The next lemma is the specific factorization lemma that we need to surpass the technical difficulties mentioned above. Its proof is long, and technical in nature. We present it in Appendix A so that the technical details do not obscure the main ideas of the proof of Solovay’s Theorem.

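Lemma 4.1.6. Let $M$ be a CTM, $\mathbb{P} \in H_\kappa^{M[G \upharpoonright \lambda]}$ be a partial order where $\lambda < \kappa$, and $x \in {}^{\omega}\omega \cap M[G]$ be $\mathbb{P}$-generic over $M[G \upharpoonright \lambda]$. Then there is a filter $H \in M[G]$ which is $\mathrm{Col}(\omega, <\kappa)$-generic over $M[G \upharpoonright \lambda][x]$, such that $M[G] = M[G \upharpoonright \lambda][x][H]$.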

Now we have the proper tools to complete the proof.

Solovay’s Theorem

Theorem 4.1.7. Let $M$ be a CTM, $\kappa$ be an inaccessible cardinal, and $G$ be a $\mathrm{Col}(\omega, <\kappa)$-generic filter. Then every set of reals $A \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$ is L-measurable.

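Proof. Let us fix an $A \subset {}^{\omega}\omega$, such that $A \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$. Then there is a corresponding formula $\varphi_A$, ordinals $\alpha_1, \dots, \alpha_k$, and reals $x_1, \dots, x_l \in {}^{\omega}\omega \cap M[G]$, such that

$x \in A$ iff $M[G] \models \varphi_A(x, \alpha_1, \dots, \alpha_k, x_1, \dots, x_l)$,

for all $x \in {}^{\omega}\omega \cap M[G]$. Since $x_1, \dots, x_l \in {}^{\omega}\omega \cap M[G]$, there is a cardinal $\lambda < \kappa$, such that $G \upharpoonright \lambda$ is $\mathrm{Col}(\omega, <\lambda)$-generic over $M$, and $x_1, \dots, x_l \in M[G \upharpoonright \lambda]$. We have the factorization $M[G] = M[G \upharpoonright \lambda][G \upharpoonright [\lambda, \kappa[]$, by the Product Lemma. Thus, the cardinal $(2^{\aleph_0})^{M[G \upharpoonright \lambda]}$ is countable from the point of view of $M[G]$. Therefore, in order to prove that $A$ is L-measurable in $M[G]$, it is enough to prove that $A$ is Solovay over $M[G \upharpoonright \lambda]$.

So, let $x \in {}^{\omega}\omega \cap M[G]$ be generic over $M[G \upharpoonright \lambda]$. Then the respective $\mathbb{P}$ is in $H_\kappa^{M[G \upharpoonright \lambda]}$.$^{1}$ Thus, by the technical lemma, there is a filter $H \in M[G]$ which is $\mathrm{Col}(\omega, <\kappa)$-generic over $M[G \upharpoonright \lambda][x]$, such that $M[G] = M[G \upharpoonright \lambda][x][H]$. Therefore

$x \in A$ iff $M[G] \models \varphi_A(x, \alpha_1, \dots, \alpha_k, x_1, \dots, x_l)$

iff $M[G \upharpoonright \lambda][x][H] \models \varphi_A(x, \alpha_1, \dots, \alpha_k, x_1, \dots, x_l)$

iff $\exists p \in H \; (p \Vdash^{\mathrm{Col}(\omega,<\kappa)}_{M[G \upharpoonright \lambda][x]} \varphi_A(\check{x}, \check{\alpha}_1, \dots, \check{\alpha}_k, \check{x}_1, \dots, \check{x}_l))$.

Since $\mathrm{Col}(\omega, <\kappa)$ is homogeneous, $1_{\mathrm{Col}(\omega,<\kappa)}$ decides $\varphi_A(\check{x}, \check{\alpha}_1, \dots, \check{\alpha}_k, \check{x}_1, \dots, \check{x}_l)$, by Corollary 3.2.59. As such,

$x \in A$ iff $1_{\mathrm{Col}(\omega,<\kappa)} \Vdash^{\mathrm{Col}(\omega,<\kappa)}_{M[G \upharpoonright \lambda][x]} \varphi_A(\check{x}, \check{\alpha}_1, \dots, \check{\alpha}_k, \check{x}_1, \dots, \check{x}_l)$.

Since the forcing relation $F_{\varphi_A, \mathrm{Col}(\omega,<\kappa)}$ is uniformly definable, we can encapsulate

$1_{\mathrm{Col}(\omega,<\kappa)} \Vdash^{\mathrm{Col}(\omega,<\kappa)}_{M[G \upharpoonright \lambda][x]} \varphi_A(\check{x}, \check{\alpha}_1, \dots, \check{\alpha}_k, \check{x}_1, \dots, \check{x}_l)$

in $M[G \upharpoonright \lambda][x]$ by a formula $\psi_A(x, \alpha_1, \dots, \alpha_k, x_1, \dots, x_l)$. As such

$x \in A$ iff $M[G \upharpoonright \lambda][x] \models \psi_A(x, \alpha_1, \dots, \alpha_k, x_1, \dots, x_l)$,

which means that $A$ is Solovay over $M[G \upharpoonright \lambda]$. Thus $A$ is L-measurable.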

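Note that $M[G \upharpoonright \lambda]$ corresponds to the above mentioned $M_A$. Note also that the homogeneity of $\mathrm{Col}(\omega, <\kappa)$ allows us to make a decisive step to show that a set of reals $A \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$ is Solovay over $M[G \upharpoonright \lambda]$, by using the fact that $1_{\mathrm{Col}(\omega,<\kappa)}$ decides $\varphi_A(\check{x}, \check{\alpha}_1, \dots, \check{\alpha}_k, \check{x}_1, \dots, \check{x}_l)$ in $M[G \upharpoonright \lambda][x]$.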

Theorem 4.1.8 (Solovay). Let $N = \mathrm{HOD}^{M[G]}_{{}^{\omega}\omega}$. If $M$ is a model of ZFC + “$\kappa$ is an inaccessible cardinal”, then $N \models$ ZF + DC + LM.

Proof. We know that $N \models$ ZF, by Theorem 3.1.17. Thus we only need to prove that every set of reals in $N$ is L-measurable, and that $N \models$ DC.

Since $N$ and $M[G]$ have the same reals, they have the same Borel codes. If $c$ is a Borel code, then $x \in A_c$ is absolute between $N$ and $M[G]$, by Lemma 3.3.27 (2). Therefore, $N$ and $M[G]$ have the same Borel sets. Let $A \in N$ be a set of reals. Then $A \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$ and, by the previous theorem, $A$ is an L-measurable set in $M[G]$. As such, there is a Borel set $B$ such that $A \Delta B$ is null in $M[G]$. Since being null is absolute between transitive models of ZF, the set $A \Delta B$ is null in $N$. Therefore every set of reals in $N$ is L-measurable.

$^{1}$ See Lemma 15.43 of Jech [4].

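Let $A \in N$, and let $R$ be a binary relation on $A$, such that, for each $x \in A$, there is a $y \in A$, satisfying $x \, R \, y$. By hypothesis, $R \in N$. We want to prove that there is a sequence $\langle x_n : n \in \omega \rangle$ of elements of $A$ that belongs to $N$ and satisfies $x_n \, R \, x_{n+1}$, for all $n \in \omega$.

Let us pick an $x_0 \in A$. Then $x_0 \in N$ and thus $x_0 \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$. As such, there are ordinals $\alpha_0, \gamma_0$ in $M[G]$, a formula $\psi_0$, and a real $a_0 \in {}^{\omega}\omega \cap M[G]$, such that $V_{\gamma_0}^{M[G]} \models \psi_0(x_0, \alpha_0, a_0)$. This is true of every $x \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$.

Let us define $x_1 \in A$, resorting to $x_0$, such that it satisfies $x_0 \, R \, x_1$. We take the minimum of the triples $(\psi_1, \alpha_1, \gamma_1)$ (ordered by the lexicographical order), such that $V_{\gamma_1}^{M[G]} \models \psi_1(x_1, \alpha_1, a_1)$, for some real $a_1 \in {}^{\omega}\omega \cap M[G]$, and $x_0 \, R \, x_1$. Thus, $x_1$ is defined using only $(\psi_0, \alpha_0, \gamma_0)$, $a_0$, and $a_1$, since $\psi_1$, $\alpha_1$, and $\gamma_1$ were determined by minimization. We can proceed to define $x_2$ using the minimum of the triples $(\psi_2, \alpha_2, \gamma_2)$, such that $V_{\gamma_2}^{M[G]} \models \psi_2(x_2, \alpha_2, a_2)$, for some $a_2 \in {}^{\omega}\omega \cap M[G]$, and $x_1 \, R \, x_2$. Thus, $x_2$ is defined using only $(\psi_0, \alpha_0, \gamma_0)$, $a_0$, $a_1$, and $a_2$.

This recursive procedure provides a definition of $\langle x_n : n \in \omega \rangle$, using only $(\psi_0, \alpha_0, \gamma_0)$, and the real numbers $a_0, a_1, \dots, a_n, \dots \in {}^{\omega}\omega \cap M[G]$. The sequence $\langle a_n : n \in \omega \rangle$ is in $M[G]$ because $M[G]$ satisfies AC. We can codify the real numbers $a_0, a_1, \dots, a_n, \dots$ with a single real number $a$: if $\Gamma : \omega \times \omega \to \omega$ is the canonical bijection$^{2}$, then we define $a(\Gamma(i, j)) = a_i(j)$. Therefore, we only need the real number $a$ and the triple $(\psi_0, \alpha_0, \gamma_0)$ to define the sequence $\langle x_n : n \in \omega \rangle$. This means that $\langle x_n : n \in \omega \rangle \in \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$. Since we have $\langle x_n : n \in \omega \rangle \subset \mathrm{HOD}^{M[G]}_{{}^{\omega}\omega}$, we conclude that $\mathrm{TC}(\{\langle x_n : n \in \omega \rangle\}) \subset \mathrm{OD}^{M[G]}_{{}^{\omega}\omega}$. Therefore $\langle x_n : n \in \omega \rangle \in \mathrm{HOD}^{M[G]}_{{}^{\omega}\omega} = N$.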
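The coding of countably many reals by a single real, $a(\Gamma(i, j)) = a_i(j)$, can be sketched in Python. This is an illustration under two assumptions: reals are modelled as functions on the natural numbers, and the Cantor pairing function stands in for $\Gamma$, whose actual definition is given elsewhere in the thesis and may differ. The names `pair`, `unpair`, `encode`, and `decode` are hypothetical.

```python
from math import isqrt

def pair(i, j):
    """Cantor pairing: a bijection from pairs of naturals to naturals
    (an assumed stand-in for the canonical bijection of the text)."""
    return (i + j) * (i + j + 1) // 2 + j

def unpair(n):
    """Inverse of pair."""
    w = (isqrt(8 * n + 1) - 1) // 2   # w = i + j
    j = n - w * (w + 1) // 2
    return w - j, j

def encode(family):
    """family(i) is the i-th sequence a_i; return a with a(pair(i, j)) = a_i(j)."""
    def a(n):
        i, j = unpair(n)
        return family(i)(j)
    return a

def decode(a, i):
    """Recover the i-th sequence a_i from the single code a."""
    return lambda j: a(pair(i, j))

family = lambda i: (lambda j: (i * j) % 2)  # a sample countable family
a = encode(family)
print(decode(a, 3)(5) == family(3)(5))
```

Because the pairing is a bijection, every $a_i$ can be recovered from $a$ alone, which is what allows the whole sequence of parameters to be replaced by one real.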

A question arises naturally from Solovay’s Theorem: is it necessary to suppose the existence of an inaccessible cardinal? The answer to this question was given by Shelah, and is the subject of the next section.

4.2 Shelah’s Theorem

Introduction

Shelah’s Theorem is the converse of Solovay’s Theorem. In its original form, it states that:

If ZF+DC+“All $\Sigma^1_3$ sets are L-measurable” is consistent, then $L \models$ “$\omega_1^V$ is an inaccessible cardinal”.

The proof involves the construction of a nonmeasurable filter $F(x)$ that is a $\Sigma^1_3(x)$ set, and requires a detailed analysis of the place of $F(x)$ in the analytical hierarchy. Jean Raisonnier published a more straightforward proof of this result in 1984 [10]. In this thesis we prove a modified version of the original theorem, which we also call the Shelah Theorem.

Theorem 4.2.1 (Shelah). If there is a model of ZF+DC in which every set of reals is L-measurable, then there is a model of ZFC with an inaccessible cardinal.

This theorem requires the L-measurability of all sets of reals, instead of the L-measurability of just the $\Sigma^1_3$ sets of reals. However, this strengthening of the hypothesis is rewarded with the simplicity of its proof, since it does not involve the analysis of the place of $F(x)$ in the analytical hierarchy. The exposition of this proof is based on Bekkali [1], and on the simplified adaptation of Jean Raisonnier’s work presented by Brian Semmes [13].

$^{2}$ The definition of $\Gamma$ is given right before Definition 3.3.25.

In this chapter, we will work with filters on $\omega$ which can be thought of as subsets of ${}^{\omega}2$. As such, a real number $a \in {}^{\omega}2$ is considered, when appropriate, as the characteristic function of the corresponding subset $a^{-1}(\{1\}) \subset \omega$; or as that very subset $a^{-1}(\{1\}) \subset \omega$. We adopt the abusive notation $a$ for both cases to avoid cumbersome proofs with heavier notation.

Nonprincipal Ultrafilters on ω Are Not Lebesgue Measurable

The proof of Shelah’s Theorem is by contradiction: we suppose that ZF+DC+“Every set of reals is L-measurable” is consistent and that $\omega_1^V$ is not inaccessible in $L$, and then show that there is a set of reals that is not L-measurable. Since the goal is to find a nonmeasurable set of reals, we will prove in this subsection that a nonprincipal ultrafilter on $\omega$ is not L-measurable, and discuss the implications afterward. A nonprincipal ultrafilter on $\omega$ is a particular kind of set.

Definition 4.2.2. Let $A \subset {}^{\omega}2$. Suppose that, for every $a \in A$, if $a \upharpoonright (\omega - n) = b \upharpoonright (\omega - n)$ for some $n$, then $b \in A$. Then $A$ is called a tail set.

Note that $n$ is being regarded in $\omega - n$ as the set of the predecessors of the natural number $n$, i.e., as the ordinal $n$. In a tail set, the membership is not affected when two elements have different finite “tails”. We can generalize the definition of tail sets to sets $A \subset {}^{\omega}2 \times {}^{\omega}2$. In this case, $A$ is called a tail set when, for every $(a, a') \in A$, if $(a \upharpoonright (\omega - n), a' \upharpoonright (\omega - n)) = (b \upharpoonright (\omega - n), b' \upharpoonright (\omega - n))$ for some $n$, then $(b, b') \in A$. An example of a tail set is the Fréchet filter on $\omega$ because any two cofinite subsets of $\omega$ with different finite tails intersect to form a cofinite set. The L-measurable tail sets satisfy an important property.

Proposition 4.2.3. If $A$ is an L-measurable tail set, and $B$ is any L-measurable set, then $A$ and $B$ are independent, i.e., $m_L(A \cap B) = m_L(A) \cdot m_L(B)$.

Proof. We can represent ${}^{\omega}2$ as the cartesian product ${}^{n}2 \times {}^{(\omega - n)}2$, for each positive $n \in \omega$. Using this framework, a tail set $A$ can be represented as $A = {}^{n}2 \times T_n$, for any positive $n \in \omega$, since the membership to $A$ is not affected by the finite “tails”.

Let $s \in {}^{n}2$, and let $O(s) = \{s\} \times {}^{(\omega - n)}2$ be a basic open set. By the definition of L-measure on ${}^{\omega}2$, we have $m_L(\{s\} \times T_n) = \frac{1}{2^{|s|}} \cdot m_L({}^{n}2 \times T_n) = m_L(O(s)) \cdot m_L({}^{n}2 \times T_n)$. Since $O(s) \cap A = \{s\} \times T_n$, we get $m_L(A \cap O(s)) = m_L(A) \cdot m_L(O(s))$. Note that this equality holds for any finite binary sequence $s$, of any length $|s| \in \omega$, because the tail set $A$ can always be represented as $A = {}^{|s|}2 \times T_{|s|}$.

Let $B \subset {}^{\omega}2$ be L-measurable. Then, similarly to when $B \subset \mathbb{R}$, there is a decreasing sequence of open sets $U_i \subset {}^{\omega}2$ containing $B$, such that $m_L(B) = \lim_{i\to\infty} m_L(U_i)$. Each open set $U_i \subset {}^{\omega}2$ can be expressed as the union of pairwise disjoint basic open sets, $U_i = \bigcup_k O(s^i_k)$. Therefore $m_L(U_i) = \sum_k m_L(O(s^i_k))$. Then $A \cap B$ is L-measurable, and we have

$m_L(A \cap B) = \lim_{i\to\infty} m_L(A \cap U_i)$

$= \lim_{i\to\infty} m_L(A \cap (\bigcup_k O(s^i_k)))$

$= \lim_{i\to\infty} \sum_k m_L(A \cap O(s^i_k))$

$= \lim_{i\to\infty} \sum_k m_L(A) \cdot m_L(O(s^i_k))$

$= \lim_{i\to\infty} m_L(A) \cdot (\sum_k m_L(O(s^i_k)))$

$= \lim_{i\to\infty} m_L(A) \cdot m_L(U_i)$

$= m_L(A) \cdot m_L(B)$.
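The key identity $m_L(A \cap O(s)) = m_L(A) \cdot m_L(O(s))$ can be checked by exact counting in a finite truncation of ${}^{\omega}2$. The sketch below is only a finite analogue, not the measure-theoretic argument itself: sequences have an assumed length `N`, the measure is the uniform one on $\{0,1\}^N$, and the names `measure`, `cylinder`, and `tail_event` are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 6  # length of the truncated sequences (an illustrative choice)

def measure(event):
    """Uniform measure on {0,1}^N of the set of sequences satisfying event."""
    hits = sum(1 for x in product((0, 1), repeat=N) if event(x))
    return Fraction(hits, 2 ** N)

def cylinder(s):
    """Finite analogue of O(s): the sequences extending s."""
    return lambda x: x[:len(s)] == tuple(s)

def tail_event(n, tails):
    """Finite analogue of the product of all length-n prefixes with T_n:
    membership depends only on the coordinates from n onwards."""
    return lambda x: x[n:] in tails

s = (1, 0, 1)
A = tail_event(3, {(0, 0, 1), (1, 1, 0), (1, 1, 1)})
both = lambda x: A(x) and cylinder(s)(x)
print(measure(both), measure(A) * measure(cylinder(s)))
```

The product rule holds here because membership in `A` ignores the first three coordinates while the cylinder constrains only those coordinates, mirroring the representation $A = {}^{|s|}2 \times T_{|s|}$.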

Corollary 4.2.4 (Kolmogorov’s 0-1 Law). If $A$ is an L-measurable tail set, then either $m_L(A) = 0$ or $m_L(A) = 1$.

Proof. We have $m_L(A) = m_L(A \cap A) = m_L(A) \cdot m_L(A)$. Therefore $m_L(A)$ is either 0 or 1.

Kolmogorov’s 0-1 Law also holds when $A$ is an L-measurable tail subset of ${}^{\omega}2 \times {}^{\omega}2$.

Proposition 4.2.5. Let $U$ be a nonprincipal ultrafilter on $\omega$. Then $U$, regarded as a subset of ${}^{\omega}2$, is not L-measurable.

Proof. $U$ is a tail set in ${}^{\omega}2$, since $U$ is closed for finite intersections, and is not principal. So, if $U$ is L-measurable, then $m_L(U) = 0$ or $m_L(U) = 1$. We assume $U$ is L-measurable to obtain a contradiction.

Let $T : {}^{\omega}2 \to {}^{\omega}2$ be the bijection defined by $T(a)(n) = 1 - a(n)$. Let us prove that $T$ preserves L-measurability, and the values of $m_L$ themselves. If $A \subset {}^{\omega}2$ is L-measurable, then there is a decreasing sequence of open sets $V_i \supset A$ such that $m_L(V_i - A) \to 0$. Let $V_i = \bigcup_k O(s^i_k)$, where the $O(s^i_k)$ are pairwise disjoint basic open sets. Then $T(V_i) = T(\bigcup_k O(s^i_k)) = \bigcup_k O(T(s^i_k))$, where $T(s^i_k)$ is the finite sequence defined by $T(s)(n) = 1 - s(n)$. Therefore each $T(V_i)$ is open and, as such, L-measurable. Furthermore, the L-measure of each $O(s^i_k)$ is the same as the L-measure of each $O(T(s^i_k))$, since it only depends on the length of $s^i_k$. It follows that $T$ preserves the L-measure of basic open sets. Since the $O(s^i_k)$ are pairwise disjoint basic open sets, $T$ preserves the L-measure of open sets. Thus $T$ preserves the L-measure of the complements of the open sets, i.e., of the closed sets. Recall that the closed subsets of ${}^{\omega}2$ are its compact subsets. Thus $m_L(T(V_i) - T(A)) \to 0$ because otherwise there would be a compact set $W \subset (\bigcap_i T(V_i)) - T(A)$ of positive L-measure, contradicting $m_L(V_i - A) \to 0$. Given that $\bigcap_i T(V_i)$ is a Borel set such that $T(A) \Delta (\bigcap_i T(V_i))$ is a null set, $T(A)$ is L-measurable and $m_L(T(A)) = m_L(A)$.

Since $U$ is an ultrafilter, $a \in U \Leftrightarrow T(a) \notin U$, so that $T(U) = {}^{\omega}2 - U$. Therefore $U \cap T(U) = \emptyset$, and $U \cup T(U) = {}^{\omega}2$. On the other hand, $m_L(U) = m_L(T(U))$. So $m_L(U) = m_L(T(U)) = \frac{1}{2}$, which is neither 0 nor 1. But this contradicts Kolmogorov’s 0-1 Law, since $U$ is a tail set.

Rapid Filters Are Not Lebesgue Measurable

The definitions, results, and proofs from the previous subsection will be useful below. But we have to replace the notion of an ultrafilter with a different one in order to prove Shelah’s Theorem.

Definition 4.2.6. A filter $F \subset {}^{\omega}2$ is called rapid if for every increasing sequence $\langle n_i : i \in \omega \rangle$ of natural numbers, there is an $a \in F$ such that $|a \cap n_i| \leq i$, for all $i \in \omega$.

Note that $a$ is being regarded in the intersection $a \cap n_i$ as a subset of $\omega$. The goal of this subsection is to show that rapid filters are not L-measurable.

Proposition 4.2.7. Let $F$ be a filter on $\omega$ such that $\omega - n \in F$, for every $n$. If $F$, regarded as a subset of ${}^{\omega}2$, is L-measurable, then $m_L(F) = 0$.

Proof. Note that $F$ is a tail set because it is closed for finite intersections, and it is not principal. Thus, if $F$ is L-measurable, then its L-measure is either 0 or 1. It cannot be 1 because $F \cap T(F) = \emptyset$ entails $m_L(F) + m_L(T(F)) = 2 > 1$. Therefore $m_L(F) = 0$.

Proposition 4.2.8. Suppose that $A \subset {}^{\omega}2$ intersects every compact subset of ${}^{\omega}2$ of positive L-measure. Then the outer measure of $A$ is positive, $m_L^*(A) > 0$.

Proof. The outer measure of $A$ is given by $m_L^*(A) = \inf\{m_L(U) : A \subset U \wedge U \text{ is open}\}$. We suppose $m_L^*(A) = 0$ in order to find a compact $K$ such that $m_L(K) > 0$, and $K \cap A = \emptyset$. Recall that, by Lemma 3.3.5, the compact subsets of ${}^{\omega}2$ are precisely the closed subsets. Since the basic open sets are also closed, it is enough to find some $t \in {}^{<\omega}2$ such that $m_L(O(t)) > 0$, and $O(t) \cap A = \emptyset$.

Let $m_L^*(A) = 0$. Then, for $0 < \varepsilon < 1$, there is an open set $U_\varepsilon$, such that $A \subset U_\varepsilon$ and $m_L(U_\varepsilon) < \varepsilon$. We can express $U_\varepsilon$ as a pairwise disjoint union $U_\varepsilon = \bigcup_{k \in K} O(s_k)$. Since it is a pairwise disjoint union, the sequences $s_k$ are not compatible. Hence they form an antichain $\{s_k : k \in K\}$ in ${}^{<\omega}2$. However, $\{s_k : k \in K\}$ is not a maximal antichain because $\bigcup_{k \in K} O(s_k) \neq {}^{\omega}2$. Therefore there is a sequence $t$ which is incompatible with all the $s_k$. For this $t$ we have $O(t) \cap (\bigcup_{k \in K} O(s_k)) = \emptyset = O(t) \cap A$, and $m_L(O(t)) > 0$.
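The last step of the proof, finding a sequence $t$ incompatible with every $s_k$ when the cover has measure less than 1, can be illustrated for a finite antichain. The Python sketch below is a simplification under that finiteness assumption (in the proof the antichain may be infinite); binary sequences are modelled as strings, and `compatible`, `cover_measure`, and `uncovered_string` are hypothetical names.

```python
from fractions import Fraction
from itertools import product

def compatible(s, t):
    """Two finite binary strings are compatible when one extends the other."""
    return s.startswith(t) or t.startswith(s)

def cover_measure(antichain):
    """Measure of the disjoint union of the basic open sets O(s)."""
    return sum(Fraction(1, 2 ** len(s)) for s in antichain)

def uncovered_string(antichain):
    """Return a string t incompatible with every member, or None if none exists."""
    depth = max((len(s) for s in antichain), default=0)
    for bits in product("01", repeat=depth):
        t = "".join(bits)
        if not any(compatible(s, t) for s in antichain):
            return t
    return None

cover = ["00", "010", "11"]
t = uncovered_string(cover)
print(cover_measure(cover), t)
```

Searching only at the maximal length of the antichain suffices in the finite case, since a string of that length is compatible with $s_k$ exactly when it extends $s_k$.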

Proposition 4.2.9. Let F be a rapid filter on ω. Then F is not L-measurable.

Proof. Let us first prove that $F$ cannot be principal. If $F$ is principal there is some nonempty $X \subset \omega$ such that $\bigcap F = X$. Suppose $X$ is finite, and let $\langle n_i : i \in \omega \rangle$ be a sequence such that $n_0 > \max(X)$. Then we have $|a \cap n_0| > 0$, for any $a \in F$, which is a contradiction. Suppose $X$ is infinite, and let $X = \{x_1, x_2, \dots\}$, where the $x_i$’s are in increasing order. For each $i$, let $n_i > x_{i+1}$. Then $|a \cap n_i| > i$, for any $a \in F$. Thus $\bigcap F = \emptyset$.

For each $n \in \omega$, there are distinct $a_1, \dots, a_n \in F$ such that $n \cap (a_1 \cap \dots \cap a_n) = \emptyset$. In particular $(a_1 \cap \dots \cap a_n) \subset \omega - n$. As such, $\omega - n \in F$, for each $n \in \omega$ because $F$ is upwardly closed. Therefore, if $F$ is L-measurable, then $m_L(F) = 0$ by Proposition 4.2.7. In particular, $m_L^*(F) = 0$.

In order to complete this proof we will prove the next lemma, which states that $F$ intersects every compact subset of ${}^{\omega}2$ with positive L-measure. Then we obtain the contradiction $m_L^*(F) > 0$, which implies that $F$ is not L-measurable.

Lemma 4.2.10. Let $F$ be a rapid filter on $\omega$. Then $F$ intersects every compact subset $A \subset {}^{\omega}2$ of positive L-measure.

Proof. Before proceeding, recall that if $A \subset {}^{\omega}2$, then $T_A = \{f \upharpoonright n : f \in A \wedge n \in \omega\}$. Let $A$ be a compact subset of ${}^{\omega}2$, such that $m_L(A) > 0$. We want to prove $F \cap A \neq \emptyset$. For this purpose, we can pick a compact subset $B \subset A$, still of positive L-measure, such that:

(1.1) There is a sequence $\langle T^i_B : i \in \omega \rangle$ of finite maximal antichains of $T_B$ such that, whenever $i < j$, we have $\max\{|s| : s \in T^i_B\} < \min\{|t| : t \in T^j_B\}$;

(1.2) For all $i \in \omega$, and all $s \in T^i_B$, $m_L(O(s) \cap B) > (1 - \frac{1}{2^{i+1}}) \cdot m_L(O(s))$.

The justification for the possibility of choosing this compact $B$ will not be given. We mention only that (1.1) is always possible, while (1.2) requires the Lebesgue Density Theorem. Given our choice of $B$, it is sufficient to prove $B \cap F \neq \emptyset$.

Let $m_i$ be the least natural number larger than $\max\{|s| : s \in T^i_B\}$. For the increasing sequence $\langle m_i : i \in \omega \rangle$, there is an $a \in F$, such that $|a \cap m_i| \leq i$, for all $i \in \omega$. Let us define the increasing sequence $\langle n_i : i \in \omega \rangle$, where $n_0 = 0$, and $n_i = m_i$ for $i \geq 1$. Then $|a \cap n_0| = 0$, and $|a \cap n_i| \leq i$, for $i \geq 1$.

We will recursively define a chain of sequences $\{s_i \in {}^{<\omega}2 : i \in \omega\}$, ordered by $\subset$, such that:

(2.1) $s_i \in T^i_B$;

(2.2) $O(s_i) \cap B \neq \emptyset$;

(2.3) If $m \in a \cap n_i$, then $s_i(m) = 1$.

Let $s_0$ be any sequence in $T^0_B$. Then $s_0$ satisfies (2.1) and (2.2). And since $|a \cap n_0| = 0$, $s_0$ satisfies (2.3). So let us then define $s_{i+1}$, assuming that we have $s_i$.

Let $H = \{x \in {}^{\omega}2 : x(n) = 1, \text{ for all } n \in a \cap [|s_i|, n_{i+1}[\}$. Our choice of $a$ implies $|a \cap [|s_i|, n_{i+1}[| \leq i + 1$. So each $x \in H$ has at most $i + 1$ values of $x(n)$ restricted to 1. Therefore $m_L(H) \geq \frac{1}{2^{i+1}}$.

Since the restrictions on the elements of $H$ do not overlap with the restrictions on the elements of $O(s_i)$, we have $m_L(H \cap O(s_i)) = m_L(H) \cdot m_L(O(s_i)) \geq \frac{1}{2^{i+1}} \cdot m_L(O(s_i))$. So $H \cap O(s_i) \neq \emptyset$. This implies $H \cap O(s_i) \cap B \neq \emptyset$ because otherwise we would derive an absurdity:

$m_L(O(s_i) \cap (B \cup H)) = m_L((O(s_i) \cap B) \cup (O(s_i) \cap H))$

$= m_L(O(s_i) \cap B) + m_L(O(s_i) \cap H)$

$> (1 - \frac{1}{2^{i+1}}) \cdot m_L(O(s_i)) + \frac{1}{2^{i+1}} \cdot m_L(O(s_i))$

$= m_L(O(s_i))$.

Now fix some $x \in H \cap O(s_i) \cap B$, and let $s_{i+1}$ be the unique element of $T^{i+1}_B$, such that $s_{i+1} \subset x$. This $s_{i+1}$ exists because $T^{i+1}_B$ is a maximal antichain. Therefore $s_{i+1}$ satisfies (2.1). Since $s_{i+1} \subset x$, we have $x \in O(s_{i+1}) \cap B \neq \emptyset$. Thus $s_{i+1}$ satisfies (2.2). Furthermore, $s_{i+1}$ satisfies (2.3) because, for $m \in \mathrm{dom}(s_{i+1})$,

$s_{i+1}(m) = \begin{cases} s_i(m) = 1, & m \in a \cap n_i \\ x(m) = 1, & m \in a \cap [|s_i|, n_{i+1}[ \end{cases}$

Let $b = \bigcup_i s_i$. Then $b(m) = 1$, for each $m \in a$. Therefore $a \subset b$, when considered as subsets of $\omega$. Since $F$ is upwardly closed, $b \in F$. Finally, to show $b \in B$, note first that ${}^{\omega}2 - B$ is open. Suppose that $b \notin B$. Then $\exists s_i \in {}^{<\omega}2$, such that $s_i \subset b$ and $O(s_i) \cap B = \emptyset$, which is absurd. Thus $B \cap F \neq \emptyset$.

Since we have shown that rapid filters are not L-measurable, we shall define a rapid filter to be used as the nonmeasurable set required in the proof of Shelah’s Theorem.

Defining the Rapid Filter

Throughout this subsection, we assume that there exists an uncountable well-orderable set $A \subset {}^{\omega}2$. Before defining the rapid filter, we need a preliminary definition.

Definition 4.2.11. Let $W \subset {}^{<\omega}2$, and let $a \in {}^{\omega}2$.

(1) We say that $W$ captures $a$ if $\exists n \, \forall m \geq n \, (a \upharpoonright m \in W)$.

(2) We say that $W$ captures $A$ if $W$ captures every $a$ in $A$.

(3) We say that $W$ splits on $n$ if there is an $s \in W$, with length $|s| = n$, such that the extensions $s * 0$ and $s * 1$ are both in $W$.

Definition 4.2.12. Let $F \subset {}^{\omega}2$ be the set defined as follows:

$a \in F$ iff $\exists W \subset {}^{<\omega}2$ ($W$ captures $A$ and, for all $n \in \omega$, $W$ splits on $n$ iff $a(n) = 1$).

$F$ is the prospective rapid filter.
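The relation between a set $W$ and the sequence $a$ it determines can be made concrete for a finite $W$. The following Python sketch only illustrates clause (3) of Definition 4.2.11 and the condition "$W$ splits on $n$ iff $a(n) = 1$"; capturing concerns infinite sequences and is not modelled. Binary sequences are strings, and `splits_on` and `split_pattern` are hypothetical names.

```python
def splits_on(W, n):
    """W splits on n if some s in W of length n has both s*0 and s*1 in W."""
    return any(len(s) == n and s + "0" in W and s + "1" in W for s in W)

def split_pattern(W, length):
    """Initial segment (a(0), ..., a(length-1)) where a(n) = 1 iff W splits on n."""
    return [1 if splits_on(W, n) else 0 for n in range(length)]

W = {"", "0", "1", "00", "10", "000", "100", "101"}
pattern = split_pattern(W, 4)
print(pattern)
```

Here the empty sequence and the sequence `10` are the only nodes with both one-step extensions in `W`, so the associated $a$ has value 1 exactly at positions 0 and 2 among the first four.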

Lemma 4.2.13. F is a filter on ω.

Proof. We first show that $F$ is not empty. Let $a = \omega$ and $W = {}^{<\omega}2$. Then $W$ captures $A$, it splits on every $n$, and $a(n)$ is always 1. Therefore $W$ witnesses that $a \in F$.

Let $a \subset b$, and let $W_a$ witness that $a \in F$. Then $W_a$ captures $A$. Consider

$W_b = W_a \cup \{s * 0 \in {}^{<\omega}2 : |s| = n \text{ and } b(n) = 1\} \cup \{s * 1 \in {}^{<\omega}2 : |s| = n \text{ and } b(n) = 1\}$.

Then $W_b$ splits on $n$ iff $b(n) = 1$. Since $W_b$ also captures $A$, we get $b \in F$.

Let $W_a$ and $W_b$ witness that $a, b \in F$, respectively, and let $W_c = W_a \cap W_b$. Since $W_a$ and $W_b$ both capture $A$, $W_c$ captures $A$. Furthermore, $W_c$ corresponds to a $c \subset a \cap b$. Therefore $a \cap b \in F$, by the previous paragraph.

Finally we show that $\emptyset \notin F$ by proving that any $W$ which never splits cannot capture $A$. Suppose $W$ never splits. Then each $s \in W$ can only help $W$ capture at most one sequence $a \in {}^{\omega}2$. But, since $W$ is countable, $W$ can only capture countably many $a \in {}^{\omega}2$. This means that $W$ cannot capture $A$ because $A$ is uncountable.

The next lemma is an auxiliary result with a technically intricate proof. Presenting it here would interrupt the flow of the main ideas in the proof of Shelah’s Theorem. As such, the proof is presented in Appendix B.

Lemma 4.2.14. Let $A \subset {}^{\omega}2$ be an uncountable well-orderable set. Suppose that the union of any sequence of $\omega_1$ null sets is null. Then, for any increasing sequence $\langle n_i : i \in \omega \rangle$, there is a $W \subset {}^{<\omega}2$, such that $W$ captures $A$ and $|\{s \in W : |s| = n_i\}| \leq i$, for all $i \in \omega$.

Under the right circumstances, which we cannot yet guarantee, the next result allows us to prove that F is a rapid filter.

Lemma 4.2.15. Let $A \subset {}^{\omega}2$ be an uncountable well-orderable set. Suppose that for any increasing sequence $\langle n_i : i \in \omega \rangle$, there is a $W \subset {}^{<\omega}2$, such that $W$ captures $A$ and $|\{s \in W : |s| = n_i\}| \leq i$, for all $i \in \omega$. Then $F$ is rapid.

Proof. Let $\langle n_i : i \in \omega \rangle$ be an increasing sequence, and let $W$ be as in the hypothesis. Set $a(i) = 1$ iff $W$ splits on $i$. Then $W$ witnesses that $a \in F$. Since $W$ captures $A$, we can assume that for any $s \in W$ with $|s| = i$, there is a $t \in W$, such that $s \subset t$ and $|t| = i + 1$. Indeed, we can remove those $s \in W$ that do not conform to this assumption and it does not alter the fact that $W$ captures $A$, nor does it alter the fact that $|\{s \in W : |s| = n_i\}| \leq i$, for all $i \in \omega$. Then $W$ can split on $i$ iff $|\{s \in W : |s| = i\}| < |\{s \in W : |s| = i + 1\}|$. As such,

$|a \cap n_i| = |\{n < n_i : W \text{ splits on } n\}|$

$= |\{n < n_i : \exists s \in W \, (|s| = n \wedge s * 0, s * 1 \in W)\}|$

$\leq |\{s \in W : |s| < n_i \wedge s * 0, s * 1 \in W\}|$

$\leq |\{s \in W : |s| = n_i\}| \leq i$.

Therefore, the filter F is rapid.
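The counting step, that the number of splitting levels below $n$ is bounded by the number of nodes of $W$ of length $n$, can be checked on a small example. The Python sketch below assumes a simplified setting that the lemma does not state: $W$ is taken to be the set of all initial segments of finitely many strings of equal length, so that every node below the top level has an extension in $W$. The names `prefixes`, `level`, and `split_levels_below` are hypothetical.

```python
def prefixes(strings):
    """All initial segments of the given equal-length binary strings."""
    return {s[:k] for s in strings for k in range(len(s) + 1)}

def level(W, n):
    """The members of W of length n."""
    return {s for s in W if len(s) == n}

def split_levels_below(W, n):
    """Levels m < n at which some node of W has both one-step extensions in W."""
    return [m for m in range(n)
            if any(s + "0" in W and s + "1" in W for s in level(W, m))]

W = prefixes(["00000", "00110", "01011", "11100"])
counts = [(len(split_levels_below(W, n)), len(level(W, n))) for n in range(6)]
print(counts)
```

Each pair lists the number of splitting levels below $n$ and the size of level $n$; in this example the first component never exceeds the second, as the chain of inequalities in the proof predicts.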

We can summarize the results of this subsection as follows.

Proposition 4.2.16. If there exists an uncountable well-orderable set $A \subset {}^{\omega}2$, and the union of $\omega_1$ null sets is null, then there exists a rapid filter $F$.

Shelah’s Theorem

In light of the previous proposition we need to make sure that two facts follow from the hypotheses of Shelah’s Theorem: (1) The union of $\omega_1$ null sets is null; (2) There exists an uncountable well-orderable set $A \subset {}^{\omega}2$.

Proposition 4.2.17. Assume that every set of reals is L-measurable. Then the union of $\omega_1$ null sets is null.

Proof. Let $B$ be null. Every null set can be enlarged to form a null tail set. Consider the left shift function $u : {}^{\omega}2 \to {}^{\omega}2$, defined by $u(a)(n) = a(n + 1)$. If $a \in {}^{\omega}2$, then we can concatenate a finite sequence $s \in {}^{n}2$ with the sequence $a$ shifted $n$ times, $u^n(a)$. Let $s * u^n(a)$ denote this concatenation, and let us form the set

\[
C = \{ s * u^{|s|}(a) : s \in {}^{<\omega}2,\ a \in B \}.
\]

It follows that $B \subset C$. The set $C$ is a tail set because each $s * u^n(a)$ results from substituting the first $n$ values of $a$ by the values of $s$, and because we have made this substitution for all finite sequences. As such, membership in $C$ is not affected by changing finitely many values. Furthermore, since there are only countably many finite sequences, $C$ is a countable union of null sets, and thus is a null set.

Let $\{T_\alpha \subset {}^{\omega}2 : \alpha < \omega_1\}$ be a collection of $\omega_1$ null sets. We can suppose the $T_\alpha$ to be null tail sets. We can also suppose that the $T_\alpha$ are pairwise disjoint, or else we could form the sets $S_\alpha = T_\alpha - (\bigcup_{\beta<\alpha} T_\beta)$. Since $\bigcup_{\alpha<\omega_1} S_\alpha = \bigcup_{\alpha<\omega_1} T_\alpha$, it is sufficient to prove that $T = \bigcup_{\alpha<\omega_1} T_\alpha$ is null.

$T$ is a tail set and, by hypothesis, $T$ is L-measurable. So, by Kolmogorov’s 0-1 law, it suffices to show that $T$ does not have L-measure 1. Let

\[
U = \{(a, b) : \exists \alpha, \beta \in \omega_1\, (\alpha < \beta \wedge a \in T_\alpha \wedge b \in T_\beta)\}.
\]

Then $U$ is also a tail set, which means that its L-measure is either 0 or 1. We will derive the contradiction that $U$ can neither have L-measure 0 nor L-measure 1, based on the supposition that $m_L(T) = 1$.

Let $a \in T$, say $a \in T_\alpha$, where $\alpha$ is a countable ordinal. Since $m_L(\bigcup_{\xi\le\alpha} T_\xi) = 0$, it must be that $m_L(\bigcup_{\beta>\alpha} T_\beta) = m_L(\{b : (a, b) \in U\}) = 1$. Likewise, if $b \in T$, say $b \in T_\beta$, then $\{a : (a, b) \in U\} = \bigcup_{\alpha<\beta} T_\alpha$ is a countable union of null sets, so $m_L(\{a : (a, b) \in U\}) = 0$.

So for all $a \in T$, $m_L(\{b : (a, b) \in U\}) = 1$; and for all $b \in T$, $m_L(\{a : (a, b) \in U\}) = 0$. By Fubini’s theorem it follows, respectively, that $U$ cannot have L-measure 0; and that $({}^{\omega}2 \times {}^{\omega}2) - U$ cannot have L-measure 0. Therefore $T$ cannot have L-measure 1, which means that $m_L(T) = 0$.
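The appeal to Fubini’s theorem at the end of the proof can be spelled out as follows. This is a sketch under the supposition $m_L(T) = 1$; the section notation $U_a = \{b : (a, b) \in U\}$ and $U^b = \{a : (a, b) \in U\}$ is introduced here for convenience and is not used elsewhere in the text. Note that $U_a = \emptyset$ for $a \notin T$ and $U^b = \emptyset$ for $b \notin T$.

```latex
\[
  (m_L \times m_L)(U) \;=\; \int m_L(U_a)\, da \;=\; \int_T 1 \, da \;=\; m_L(T) \;=\; 1,
\]
\[
  (m_L \times m_L)(U) \;=\; \int m_L(U^b)\, db \;=\; \int_T 0 \, db \;=\; 0 .
\]
```

The two iterated integrals compute the same product measure of the measurable set $U$, so they cannot be $1$ and $0$ simultaneously; this is the contradiction that rules out $m_L(T) = 1$.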

As a consequence of this proposition, we can obtain a nonmeasurable set.

Theorem 4.2.18. Let $A \subset {}^{\omega}2$ be an uncountable well-orderable set. Then there is a set that is not L-measurable.

Proof. Suppose that every set of reals is L-measurable. Then the union of $\omega_1$ null sets is null, by Proposition 4.2.17. Since the set $A \subset {}^{\omega}2$ is uncountable and well-orderable, there is a rapid filter on $\omega$, by Proposition 4.2.16. But rapid filters are not L-measurable, which contradicts the supposition. Hence, if every set of reals is L-measurable, there are no uncountable well-orderable sets of reals.

The only missing piece is the existence of an uncountable well-orderable set of reals. If we assume that $\omega_1^V$ is not inaccessible in $L$, then we can find an uncountable well-ordered set of reals in $L[x]$, for a specific $x$. Once this is established, Shelah’s Theorem follows easily.

Theorem 4.2.19 (Shelah). If ZF+DC+“every set of reals is L-measurable” holds, then $L \models$ “$\omega_1^V$ is inaccessible”.

Proof. Suppose $L \models$ “$\omega_1^V$ is not inaccessible”. Then either $\omega_1^V$ is not regular, or it is not uncountable, or it is not a strong limit cardinal.

$\omega_1^V$ is regular in $V$ and the property of being regular is downward absolute. Therefore $\omega_1^V$ is regular in $L$. If we suppose that $\omega_1^V$ is countable in $L$, then it is countable in $V$, which is absurd. Suppose that $\omega_1^V$ is not a strong limit cardinal. Recall that being a strong limit cardinal in $L$ and being a limit cardinal in $L$ coincide because GCH holds in $L$. Therefore we are supposing that $\omega_1^V$ is not a limit cardinal in $L$, i.e., that $\omega_1^V$ is a successor in $L$.

Let $L \models \omega_1^V = \alpha^+$, where $\alpha < \omega_1^V$ is countable in $V$. Let $x \in {}^{\omega}2$ be a code for a well-ordering of $\omega$ of order-type $\alpha$. Then $L[x] \models$ “$\alpha$ is countable”. Recall that the equality between the cardinality of two sets, $|X| = |Y|$, is upward absolute. Therefore we get $L[x] \models \omega_1^V = \alpha^+$. Hence $\omega_1^V$ is the first uncountable cardinal of $L[x]$. Since $L[x] \models \omega_1 \le 2^{\omega}$, the set ${}^{\omega}2 \cap L[x]$ is uncountable. And since $L[x]$ is well-orderable, ${}^{\omega}2 \cap L[x]$ is a well-orderable set of reals. Thus, by the previous theorem, there is a nonmeasurable set of reals. This contradicts the hypothesis that every set of reals is L-measurable.

Therefore we conclude that $L \models$ “$\omega_1^V$ is inaccessible”.

Together with Solovay’s Theorem, Shelah’s Theorem entails the equiconsistency of ZFC+I with ZF+DC+LM. Equivalently, there is a model of ZFC+I iff there is a model of ZF+DC+LM. In this sense, the inaccessible cardinal hypothesis in Solovay’s Theorem is necessary.

It is important to mention that we have focused on just one of the regularity properties featured in Solovay’s original article. The regularity properties are indicative of well-behaved sets of reals, of which prominent examples are Lebesgue measurability, having the Baire property, and having the perfect set property. Solovay proved that if there is a model of ZFC+I, then there is a model of ZF+DC in which all sets of reals have these three regularity properties. Before Solovay proved this theorem in 1963, Specker had established in 1957 that an inaccessible cardinal is necessary for the perfect set property. It was only in 1984 that Shelah published a proof that an inaccessible cardinal is necessary for Lebesgue measurability but, surprisingly, not necessary for the Baire property.

4.3 Further Results

There are many classical results in set theory related to the Lebesgue measure and large cardinals. We provide a brief overview of the main results.

We have already mentioned that all analytic sets and all coanalytic sets are L-measurable. That is, all $\Sigma^1_1$ sets and all $\Pi^1_1$ sets are L-measurable. Therefore, it is natural to ask whether $\Sigma^1_2$ sets are L-measurable. The answer to this question is undecidable in ZFC: with forcing one can construct models where all $\Sigma^1_2$ sets are L-measurable but, on the other hand, there are $\Sigma^1_2$ sets in $L$ that are not L-measurable.

If we focus on the analytical hierarchy, then we can find more results that clarify the limitations on the L-measurability of projective sets: (1) If $a \in {}^{\omega}\omega$, then the $\Sigma^1_2(a)$ sets of reals are L-measurable iff almost all reals are random over $L[a]$. (2) If $a \in {}^{\omega}\omega$, then the $\Delta^1_2(a)$ sets of reals are L-measurable iff there is a real random over $L[a]$. (3) If $\aleph_1$ many random reals are generically added to $L[a]$, then the $\Sigma^1_2(a)$ set ${}^{\omega}\omega \cap L[a]$ is not L-measurable in the extension.

The consistency of ZFC plus the proposition that every $\Sigma^1_2$ set is L-measurable does not require large cardinal hypotheses. But if we add large cardinal hypotheses we get more L-measurable sets of reals. In particular, Solovay’s Theorem proves that if ZFC + I is consistent, then there is a model of ZFC in which all projective sets of reals are L-measurable. However, this does not mean that the projective sets are L-measurable in all models of ZFC when we assume an inaccessible cardinal. Even cardinals as large as the measurable cardinals do not entail the L-measurability of $\Delta^1_3$ sets in every model of ZFC.

In order to further analyze the impact of large cardinals on the L-measurability of sets of reals, we introduce new large cardinals.

Definition 4.3.1. Let $\kappa$ be a cardinal.

(1) $\kappa$ is a measurable cardinal if it is uncountable and if there is a nonprincipal $\kappa$-complete ultrafilter $U$ on $\kappa$.

(2) $\kappa$ is a Woodin cardinal if for all $A \subset V_\kappa$ there are arbitrarily large $\lambda < \kappa$, such that for all $\alpha < \kappa$, there exists an elementary embedding $j : V \to M$ into an inner model $M$, with critical point $\lambda$, such that $j(\lambda) > \alpha$, $V_\alpha \subset M$, and $A \cap V_\alpha = j(A) \cap V_\alpha$.

(3) If $\lambda \ge \kappa$, then $\kappa$ is $\lambda$-supercompact if there is an elementary embedding $j : V \to M$ into an inner model $M$, with critical point $\kappa$, such that $\lambda < j(\kappa)$, and ${}^{\lambda}M \subset M$. A cardinal $\kappa$ is supercompact if $\kappa$ is $\lambda$-supercompact for all regular cardinals $\lambda \ge \kappa$.

If $\kappa$ is supercompact, then it is Woodin. Woodin cardinals lie between measurable and supercompact cardinals in the hierarchy of large cardinals. Shelah and Woodin proved that, assuming the existence of a supercompact cardinal, every set of reals in $L(\mathbb{R})$ is L-measurable. However, the assumption that a supercompact cardinal exists is a very strong hypothesis, so it is natural to seek weaker large cardinal assumptions that suffice to prove the L-measurability of all sets of reals in $L(\mathbb{R})$.

Theorem 4.3.2 (Shelah-Woodin).

(1) If $n \in \omega$ and there are $n$ Woodin cardinals with a measurable cardinal above them, then every $\Sigma^1_{n+2}$ set of reals is L-measurable.

(2) If there are infinitely many Woodin cardinals with a measurable cardinal above them, then every set of reals in $L(\mathbb{R})$ is L-measurable.

These are the optimal hypotheses for the L-measurability of the projective sets, since these sets are in $L(\mathbb{R})$. The hypotheses of (1) sequentially lead to the L-measurability of every projective set. Yet the marginally stronger hypotheses of (2) entail the L-measurability of every set of reals in $L(\mathbb{R})$.

A different way to explore L-measurability is by determining winning strategies for a type of game. Given a set $A \subset {}^{\omega}\omega$, consider the following infinite game associated with $A$: there are two players, I and II, who alternately choose a natural number $n_i$. To begin with, player I plays $n_0$, then player II plays $n_1$, to which I answers by playing $n_2$, and so on. At the end of the run, the players have produced an infinite sequence $a = \langle n_i : i \in \omega \rangle \in {}^{\omega}\omega$. If $a \in A$, then player I wins the game, and player II wins otherwise.

The game is determined if one of the two players has a winning strategy. Formally, a strategy for player II is a function $f$ that assigns a natural number to each finite sequence of odd length. It is a winning strategy if player II always wins the game when he plays $f(n_0, n_1, \ldots, n_{2k})$ in his $k$-th turn, whatever moves are made by player I. Similarly, one can define a winning strategy for I. We say that the set $A$ is determined if the game associated with $A$ is determined. For example, if $A = {}^{\omega}\omega$, then player I has a winning strategy (however badly he chooses to play). Similarly, if $A = \emptyset$, then player II has a winning strategy. It is possible to prove that every finite set and every is determined. In fact, every Borel set is determined. One might guess that every game is determined, but it is possible to use AC to prove the existence of a game that is not determined.

The axiom of determinacy (AD) asserts that all sets of reals are determined. AD implies that every set of reals is L-measurable, has the Baire property and has the perfect set property. While AD rules out the existence of pathologically irregular sets of reals, it implies the negation of AC. But, despite AD being inconsistent with ZFC, it is consistent with ZF + DC. The AD axiom can be considered as a restriction of the classical notion of a set, leading to a smaller universe, say of determined sets, which reflects some physical intuitions that are not fulfilled by the classical sets.
For example, the Banach-Tarski Paradox defies our basic physical intuitions: given a solid ball in $\mathbb{R}^3$, it is possible to partition it into finitely many pieces and reassemble them through translations and rotations to form two solid balls, each identical in size to the first. Note that these pieces cannot all be L-measurable, because otherwise this would contradict the invariance of the Lebesgue measure under isometries of $\mathbb{R}^3$. The Axiom of Choice is indispensable to the proof of the Banach-Tarski paradox. However, the Banach-Tarski paradox is eliminated by AD.

There are weaker versions of AD that are compatible with ZFC. If we assume that the projective sets are determined, then they share nearly all the classical properties of Borel and analytic sets. Since the determinacy of all projective sets is independent of ZFC, and since it allows for the extension of the theory of Borel and analytic sets to all projective sets in a very elegant way, it constitutes an excellent candidate for a new set-theoretic axiom. This axiom is known as projective determinacy (PD). It implies that every projective set is L-measurable, has the Baire property and has the perfect set property. Hence there is no projective counterexample to CH.

Since AD is independent of ZF, it was speculated where, or indeed whether, determinacy fitted in with the hierarchy of large cardinal hypotheses. Woodin settled this question definitively by proving that ZFC plus the existence of infinitely many Woodin cardinals is equiconsistent with ZF + AD. In fact, if we assume that there are infinitely many Woodin cardinals with a measurable cardinal above them all, then AD holds in $L(\mathbb{R})$.

If we focus our attention on the projective sets, we can see how large cardinals affect the determinacy of these sets. If there exists a measurable cardinal, then every analytic set, and therefore also every coanalytic set, is determined. Martin and Steel proved that PD is consistent with ZFC when we assume the following large cardinal hypotheses: if there exist $n$ Woodin cardinals with a measurable cardinal above them, then every $\Pi^1_{n+1}$ game is determined. Subsequently, Woodin showed that the hypothesis that for each $n$ it is consistent that there exist $n$ Woodin cardinals is necessary in order to obtain the consistency of PD. Thus the existence of infinitely many Woodin cardinals is a sufficient, and essentially necessary, assumption for extending the classical theory of Borel and analytic sets to all projective sets of reals.
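The notions of strategy and determinacy used in this section can be made concrete for games of finite length, where determinacy follows from a simple backward induction. The Python sketch below is only an illustration and is not part of the results discussed above: the function name `first_player_wins`, the finite move set, and the toy payoff sets are assumptions made for the example. It says nothing about the infinite games on ${}^{\omega}\omega$, where AC yields undetermined games.

```python
from functools import lru_cache


def first_player_wins(payoff, moves, length):
    """Decide whether player I has a winning strategy in a finite game.

    Players I and II alternately pick elements of `moves`, player I first,
    until `length` moves have been played. Player I wins the run iff
    `payoff(run)` is true. Exactly one player has a winning strategy.
    """

    @lru_cache(maxsize=None)
    def wins(position):
        if len(position) == length:
            return bool(payoff(position))
        successors = (wins(position + (m,)) for m in moves)
        if len(position) % 2 == 0:
            # Player I to move: one good move suffices.
            return any(successors)
        # Player II to move: player I must win against every reply.
        return all(successors)

    return wins(())


def even_sum(run):
    """Toy payoff set: player I wins iff the sum of all moves is even."""
    return sum(run) % 2 == 0


# With four moves player II moves last and controls the parity.
print(first_player_wins(even_sum, (0, 1), 4))
# With three moves player I moves last and controls the parity.
print(first_player_wins(even_sum, (0, 1), 3))
```

The two trivial cases mentioned in the text also appear here: if the payoff set contains every run, player I wins regardless of how he plays, and if it is empty, player II does.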

Chapter 5

Conclusions

The Lebesgue measure inspired many important advancements in set theory. The fact that there are nonmeasurable sets of reals encouraged mathematicians to use set theoretical tools to determine what kinds of sets of reals are Lebesgue measurable, or under which conditions these sets are Lebesgue measurable. In the last chapter we had a glimpse into the scope of the work developed to further analyze Lebesgue measurability. This work involved descriptive set theory, large cardinals, determinacy and model theoretical techniques, showing how these apparently independent areas are inextricably connected.

In this thesis we focused on Solovay’s Theorem and Shelah’s Theorem. These are classic results in set theory and their proofs involve concepts and techniques drawn from different areas of set theory which were further developed by Solovay and Shelah. Together, these results establish that the axiom systems ZF+DC+LM and ZFC+I are equiconsistent. In particular, Shelah’s Theorem shows that the inaccessible cardinal hypothesis of Solovay’s Theorem is not superfluous.

Large cardinal hypotheses have had an enormous impact on all of set theory, and even beyond it. In spite of this success, their status as true axioms of set theory is still a matter of debate, not least because the known large cardinal hypotheses do not decide one of the oldest problems in set theory, the size of the continuum.

Bibliography

[1] M. Bekkali. Topics in Set Theory, volume 1476 of Lecture Notes in Mathematics. Springer, Berlin, 1991.

[2] I. N. Bronshtein, H. Muehlig, G. Musiol, and K. A. Semendyayev, editors. Handbook of Mathematics. Springer, Berlin, 5th edition, 2007.

[3] M. Foreman and A. Kanamori, editors. Handbook of Set Theory, volume 1, 2, and 3. Springer, Berlin, 2010.

[4] T. Jech. Set Theory. Springer, Berlin, 3rd edition, 2002.

[5] A. Kanamori. The Higher Infinite. Springer, Berlin, 2nd edition, 2009.

[6] K. Kunen. Set Theory, An Introduction to Independence Proofs. Elsevier, Amsterdam, 1980.

[7] A. W. Miller. Descriptive Set Theory and Forcing, volume 4 of Lecture Notes in Logic. Springer, Berlin, 1995.

[8] J. R. Munkres. Topology. Prentice-Hall, 2nd edition, 2000.

[9] J. Oxtoby. Measure and Category. Springer, Berlin, 2nd edition, 1980.

[10] J. Raisonnier. A mathematical proof of S. Shelah’s theorem on the measure problem and related results. Israel Journal of Mathematics, 48:48–56, 1984.

[11] M. Ricou. Measure and Integration. Unpublished Lecture Notes in Portuguese, Instituto Superior Técnico, 2013.

[12] R. Schindler. Set Theory, Exploring Independence and Truth. Springer, Berlin, 2014.

[13] B. Semmes. The Raisonnier-Shelah construction of a nonmeasurable set. Master’s thesis, University of Amsterdam, 1997.

[14] S. Shelah. Can you take Solovay’s inaccessible away? Israel Journal of Mathematics, 48:1–47, 1984.

[15] R. M. Solovay. A model of set theory in which every set of reals is Lebesgue measurable. Annals of Mathematics, 92:1–56, 1970.

Appendix A

Proof of Solovay’s Technical Lemma

In this appendix we present the proof of the specific version of the factorization lemma that is necessary for the proof of Solovay’s Theorem. We need two auxiliary lemmas which will be stated without proof.¹

Lemma A.0.3. Let $\kappa$ be an infinite cardinal, and $P$ be an atomless partial order, such that $1_P \Vdash |\check{\kappa}| = \aleph_0$. Then, for every $p \in P$, there is an antichain $A \subset \{q \in P : q \le_P p\}$ of size $\kappa$.

Lemma A.0.4. Let $\kappa$ be an infinite cardinal, and $P$ be a separative partial order, such that $|P| = \kappa$, and $1_P \Vdash |\check{\kappa}| = \aleph_0$. Then there is a dense homomorphism $\pi : \mathrm{Col}(\omega, \kappa) \to P$.

Lemma A.0.5. Let $M$ be a CTM, $P \in H_\kappa^{M[G \upharpoonright \lambda]}$ be a partial order, where $\lambda < \kappa$, and $s \in M[G]$ be $P$-generic over $M[G \upharpoonright \lambda]$. Then there is some $H \in M[G]$ which is $\mathrm{Col}(\omega, <\kappa)$-generic over $M[G \upharpoonright \lambda][s]$, such that $M[G] = M[G \upharpoonright \lambda][s][H]$.

Proof. In order to simplify the notation in this long proof, let us assume that $\lambda = 0$, so that $[G \upharpoonright \lambda]$ can be dropped from the factorization. The proof for $\lambda > 0$ can be adapted from this particular case without significant obstacles.

So let us fix a partial order P ∈ Hκ, and let s ∈ M[G] be P-generic over M. We aim to construct some H ∈ M[G] which is Col(ω, < κ)-generic over M[s] such that M[G] = M[s][H].

With the help of Lemma 4.1.5, we can assume that $s \in M[G \upharpoonright (\mu + 1)]$, where $\mu + 1$ is the successor ordinal of a cardinal $\mu < \kappa$. The forcing notion $\mathrm{Col}(\omega, <(\mu + 1))$ has the same cardinality as $\mu$ because it is a set of finite sequences on $\mu > \omega$. Since $1_{\mathrm{Col}(\omega, <(\mu+1))} \Vdash |\check{\mu}| = \aleph_0$, Lemma A.0.4 implies that there is a dense homomorphism

\[
i : \mathrm{Col}(\omega, \mu) \to \mathrm{Col}(\omega, <(\mu + 1)).
\]

By Proposition 3.2.31,

\[
G_0 = \{p \in \mathrm{Col}(\omega, \mu) : i(p) \in G \upharpoonright (\mu + 1)\}
\]

is a $\mathrm{Col}(\omega, \mu)$-generic filter over $M$, with $M[G_0] = M[G \upharpoonright (\mu + 1)]$. Additionally, $\mathrm{Col}(\omega, \mu)$ is isomorphic to $\mathrm{Col}(\omega, \{\mu + 1\})$, so that if $j : \mathrm{Col}(\omega, \mu) \to \mathrm{Col}(\omega, \{\mu + 1\})$ is an isomorphism, then Proposition 3.2.31 implies that

\[
G_1 = \{p \in \mathrm{Col}(\omega, \mu) : \exists q \in G\ \, j(p) = q(\mu + 1)\}
\]

is a $\mathrm{Col}(\omega, \mu)$-generic filter over $M[G_0]$. Note that $M[G_0][G_1] = M[G \upharpoonright (\mu + 2)]$. Recall that we have assumed $s \in M[G \upharpoonright (\mu + 1)] = M[G_0]$.

¹ The two lemmas correspond to Lemma 6.49 and Lemma 6.51 from Schindler [12], and the proofs can be found on pages 114 and 115.

Claim: There is a $\mathrm{Col}(\omega, \mu)$-generic filter $H^*$ over $M[s]$ with $M[s][H^*] = M[G_0][G_1]$.

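Suppose this claim to be true. Then we have

\[
M[G] = M[G \upharpoonright (\mu + 2)][G \upharpoonright [\mu + 2, \kappa[\,] = M[G_0][G_1][G \upharpoonright [\mu + 2, \kappa[\,] = M[s][H^*][G \upharpoonright [\mu + 2, \kappa[\,].
\]

Let $i' : \mathrm{Col}(\omega, \mu) \to \mathrm{Col}(\omega, <(\mu + 2))$ be a dense homomorphism, and let us set

\[
H_0 = \{p \in \mathrm{Col}(\omega, <(\mu + 2)) : \exists q \in H^*\ \, i'(q) \le p\}.
\]

Then $H_0$ is $\mathrm{Col}(\omega, <(\mu + 2))$-generic over $M[s]$, by Proposition 3.2.31, and $M[s][H^*] = M[s][H_0]$. If we finally set

\[
H = \{p \in \mathrm{Col}(\omega, <\kappa) : p \upharpoonright (\mu + 2) \in H_0 \wedge p \upharpoonright [\mu + 2, \kappa[\, \in G \upharpoonright [\mu + 2, \kappa[\,\},
\]

then $H$ is $\mathrm{Col}(\omega, <\kappa)$-generic over $M[s]$ and

\[
M[s][H] = M[s][H_0][G \upharpoonright [\mu + 2, \kappa[\,] = M[s][H^*][G \upharpoonright [\mu + 2, \kappa[\,] = M[G],
\]

as desired. Thus, all that remains to complete the proof of Lemma A.0.5 is to prove the claim. □ (A.0.5)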

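Proof of the Claim. We want to find a filter $H^*$ which is $\mathrm{Col}(\omega, \mu)$-generic over $M[s]$, such that $M[s][H^*] = M[G_0][G_1]$. Since $s \in M[G_0]$, we may pick a $\tau \in M^{\mathrm{Col}(\omega,\mu)}$, such that $s = \tau^{G_0}$.

Let us recursively define inside $M[s]$ a sequence $\langle Q_\alpha : \alpha \in \mathrm{Ord} \rangle$ of subsets of $\mathrm{Col}(\omega, \mu)$ as follows. Set $p \in Q_0$ iff, for all $r \in P$,

\[
\big((p \Vdash \check{r} \in \tau) \Rightarrow r \in s\big) \wedge \big((p \Vdash \check{r} \notin \tau) \Rightarrow r \notin s\big). \qquad (\dagger)
\]

Having defined $Q_\alpha$, set $p \in Q_{\alpha+1}$ iff, for all open dense sets $D \subset \mathrm{Col}(\omega, \mu)$, where $D \in M$, we have

\[
\exists p' \le p\ \, (p' \in D \cap Q_\alpha).
\]

If $\lambda$ is a limit ordinal, and $Q_\alpha$ is defined for every $\alpha < \lambda$, then we set $Q_\lambda = \bigcap_{\alpha<\lambda} Q_\alpha$.

For each $\alpha$, if $p \in Q_\alpha$ and $p \le p' \in \mathrm{Col}(\omega, \mu)$, then $p' \in Q_\alpha$. Therefore, if $\alpha \le \beta$, then $Q_\beta \subset Q_\alpha$. Let $\delta$ be least such that $Q_{\delta+1} = Q_\delta$. Set

\[
\bar{Q} = Q_\delta \quad \text{and} \quad Q = \bar{Q} \times \mathrm{Col}(\omega, \mu).
\]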
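The shape of this recursion, a decreasing sequence of sets that is iterated until it stabilizes, can be seen on a finite toy example. The Python sketch below is only an illustration under strong simplifying assumptions: the poset is the finite set of binary strings of length at most 2 ordered by extension (so it has atoms, unlike $\mathrm{Col}(\omega, \mu)$), the family of open dense sets in $M$ is replaced by an explicit one-element list, the starting set `q0` is an arbitrary upward-closed set rather than the one given by $(\dagger)$, and no limit stages arise. The names `refine` and `stabilize` are hypothetical.

```python
from itertools import product


def refine(elements, leq, dense_sets, q):
    """Successor step: p is kept iff every dense set D has some p' <= p in D and q."""
    return frozenset(
        p
        for p in elements
        if all(any(leq(p2, p) and p2 in d and p2 in q for p2 in elements)
               for d in dense_sets)
    )


def stabilize(elements, leq, dense_sets, q0):
    """Iterate `refine` from q0; return the stable set and the least stable index."""
    q, index = frozenset(q0), 0
    while True:
        nxt = refine(elements, leq, dense_sets, q)
        if nxt == q:
            return q, index
        q, index = nxt, index + 1


# Toy poset: binary strings of length <= 2; p2 <= p means "p2 extends p".
ELEMENTS = ["".join(bits) for n in range(3) for bits in product("01", repeat=n)]


def extends(p2, p):
    return p2.startswith(p)


# A single open dense set: the strings of maximal length.
DENSE_SETS = [frozenset(e for e in ELEMENTS if len(e) == 2)]

# An upward-closed starting set (closed under passing to prefixes).
Q0 = {"", "0", "1", "00"}

Q_BAR, DELTA = stabilize(ELEMENTS, extends, DENSE_SETS, Q0)
print(sorted(Q_BAR), DELTA)
```

In this toy run the condition `"1"` is discarded at the first step because no extension of it lies in both the dense set and `Q0`, and the sequence is stable from index 1 onwards.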

We construe $\bar{Q}$ and $Q$ as partial orders, with the order relation given by the restriction of the order relation of $\mathrm{Col}(\omega, \mu)$ and $\mathrm{Col}(\omega, \mu) \times \mathrm{Col}(\omega, \mu)$ to $\bar{Q}$ and $Q$, respectively. If $p \in \bar{Q}$, $p \le q$, and $q \in \mathrm{Col}(\omega, \mu)$, then $q \in \bar{Q}$. Note that $Q$ was defined inside $M[s]$, and the parameters we need for this are $\mu$, $P$, $\tau$, and $s$. Let us write $\Psi(v_0, v_1, v_2, v_3, v_4)$ for the defining formula, i.e.,

\[
M[s] \models \forall Q'\, (Q' = Q \Leftrightarrow \Psi(Q', \mu, P, \tau, s)).
\]

Subclaim 1: $G_0 \subset \bar{Q}$.

Suppose that $p \in G_0 - Q_\beta$, where $\beta$ is minimal such that $G_0 - Q_\beta \ne \emptyset$. $\beta$ cannot be a limit ordinal. We also cannot have $\beta = 0$ because $s = \tau^{G_0}$, and because of the definition of $Q_0$. Therefore $\beta = \alpha + 1$, for some $\alpha$. We may pick some open dense set $D \subset \mathrm{Col}(\omega, \mu)$, where $D \in M$, such that

\[
\forall p' \le p\ \, (p' \in D \Rightarrow p' \notin Q_\alpha). \qquad (\ddagger)
\]

Let $p^* \in D \cap G_0$, and $p' \in G_0$, such that $p' \le p^*, p$. Then $p' \in D$ because $D$ is open, and hence $p' \notin Q_\alpha$ by $(\ddagger)$. But then $p' \in G_0 - Q_\alpha$, hence $G_0 - Q_\alpha \ne \emptyset$, which contradicts the choice of $\beta$. Therefore $G_0 \subset \bar{Q}$.

Subclaim 1 implies that $\bar{Q} \ne \emptyset$. Let $p_1 \nleq p_2$, where $p_1, p_2 \in \bar{Q}$, and let $D = \{r \in \mathrm{Col}(\omega, \mu) : r \perp p_2\}$. Then $D$ is predense below $p_1$. As such, $D \cap G_0 \ne \emptyset$ and there is an $r \le p_1$, such that $r \perp p_2$. Therefore $\bar{Q}$ is separative. Since $\mathrm{Col}(\omega, \mu)$ is also separative, $Q = \bar{Q} \times \mathrm{Col}(\omega, \mu)$ is separative. $Q$ has the same cardinality as $\mu$ inside $M[s]$.

Subclaim 2: Let $p \in \bar{Q}$. Then there is a $G_0' \in M[G]$ which is $\mathrm{Col}(\omega, \mu)$-generic over $M$, such that $p \in G_0'$ and $s = \tau^{G_0'}$.

Let $\bar{p} \in \bar{Q}$, and let $D \subset \mathrm{Col}(\omega, \mu)$ be open dense in $\mathrm{Col}(\omega, \mu)$, where $D \in M$. By the definition of the $Q_\alpha$, if $\bar{p} \in \bar{Q} = Q_{\delta+1}$, then there is a $p' \le \bar{p}$, with $p' \in D \cap Q_\delta = D \cap \bar{Q}$. In $M[G]$ there are only countably many dense subsets of $\mathrm{Col}(\omega, \mu)$ which are in $M$. So, given $p \in \bar{Q}$, we may work in $M[G]$ and produce some $G_0'$ which is $\mathrm{Col}(\omega, \mu)$-generic over $M$, such that $p \in G_0'$ and $G_0' \subset \bar{Q}$.

Let $r \in P$, and suppose that $p' \in G_0'$ decides $\check{r} \in \tau$. As $p' \in G_0' \subset \bar{Q} \subset Q_0$, we must have $(\dagger)$. Therefore, $G_0'$ is as desired.

With subclaim 2, G0 × G1 ⊂ Q is a filter.

Subclaim 3: G0 × G1 is Q-generic over M[s].

Let $D \in M[s]$ be dense in $Q$. We need to see that $D \cap (G_0 \times G_1) \ne \emptyset$. Suppose, instead, that $D \cap (G_0 \times G_1) = \emptyset$. Let $\rho \in M^P$ be such that $\rho^s = D$, and let $D_M = \{E \in M : E \text{ is dense in } P\}$.

Recall the formula $\Psi(v_0, v_1, v_2, v_3, v_4)$, and let $\Phi(v_0, v_1, v_2, v_3, v_4, v_5)$ be a formula such that, if $G_0'' \times G_1''$ is $\mathrm{Col}(\omega, \mu) \times \mathrm{Col}(\omega, \mu)$-generic over $M$, then

\[
M[G_0'' \times G_1''] \models \Phi(G_0'', G_1'', P, \tau, \rho, D_M)
\]

iff the following holds true: if $s' = \tau^{G_0''}$, then $s'$ is $P$-generic over $M$; and if

\[
D' = \{(p, q) \in \mathrm{Col}(\omega, \mu) \times \mathrm{Col}(\omega, \mu) : \exists r \in s',\ r \Vdash^{P}_{M} (p, q)\check{\ } \in \rho\},
\]

and $\Psi(Q', \mu, P, \tau, s')$ holds inside $M[s']$ for exactly one $Q'$, then $D'$ is dense in $Q'$ and $D' \cap (G_0'' \times G_1'') = \emptyset$.

By hypothesis, $M[G_0 \times G_1] \models \Phi(G_0, G_1, P, \tau, \rho, D_M)$. Let $(p, q) \in (G_0 \times G_1)$ be such that

\[
(p, q) \Vdash^{\mathrm{Col}(\omega,\mu) \times \mathrm{Col}(\omega,\mu)}_{M} \Phi(\dot{G}_0, \dot{G}_1, \check{P}, \check{\tau}, \check{\rho}, \check{D}_M), \qquad (\dagger\dagger)
\]

where $\dot{G}_n \in M^{\mathrm{Col}(\omega,\mu) \times \mathrm{Col}(\omega,\mu)}$ is the canonical name for $G_n$, for $n \in \{0, 1\}$.

By Subclaim 1, $(p, q) \in Q$. Since $D$ is dense in $Q$, there is some $(p', q') \le (p, q)$ such that $(p', q') \in D$. By Subclaim 2, there is some $G_0' \times G_1'$ inside $M[G]$ which is $\mathrm{Col}(\omega, \mu) \times \mathrm{Col}(\omega, \mu)$-generic over $M$, such that $(p', q') \in G_0' \times G_1'$, and $s = \tau^{G_0'}$.

By $(\dagger\dagger)$, $M[G_0' \times G_1'] \models \Phi(G_0', G_1', P, \tau, \rho, D_M)$. By $s = \tau^{G_0'}$, the $s'$ which $\Phi(G_0', G_1', P, \tau, \rho, D_M)$ describes in $M[G_0' \times G_1']$ is equal to $s$. In turn, the $D'$ which $\Phi(G_0', G_1', P, \tau, \rho, D_M)$ describes in $M[G_0' \times G_1']$ must be equal to $D$, and the $Q'$ which $\Psi(Q', \mu, P, \tau, s)$ describes in $M[G_0' \times G_1']$, as part of $\Phi(G_0', G_1', P, \tau, \rho, D_M)$, must be equal to $Q$. Therefore, $M[G_0' \times G_1'] \models \Phi(G_0', G_1', P, \tau, \rho, D_M)$ yields that $D \cap (G_0' \times G_1') = \emptyset$. However, $(p', q') \in (G_0' \times G_1') \cap D$, which is a contradiction.

Now, since $Q$ is separative and has the same cardinality as $\mu$ inside $M[s]$, there is a dense homomorphism $k : \mathrm{Col}(\omega, \mu) \to Q$. By Subclaim 3, if we set

\[
H^* = \{p \in \mathrm{Col}(\omega, \mu) : k(p) \in G_0 \times G_1\},
\]

then $H^*$ is $\mathrm{Col}(\omega, \mu)$-generic over $M[s]$, and $M[s][H^*] = M[s][G_0 \times G_1] = M[G_0][G_1]$, since $s \in M[G_0]$ and $M[G_0]$ is the least model containing $s$. Therefore, $H^*$ is as desired. □ (Claim)

Appendix B

Proof of Shelah’s Auxiliary Lemma

If $v \in {}^{<\omega}2$, then we adopt the simplified notation $\mu_v(A)$ for $\dfrac{m_L(A \cap O(v))}{m_L(O(v))}$.

Lemma B.0.6. Let $A \subset {}^{\omega}2$ be an uncountable well-orderable set. Suppose that the union of any sequence of $\omega_1$ null sets is null. Then, for any increasing sequence $\langle n_i : i \in \omega \rangle$, there is a $W \subset {}^{<\omega}2$, such that $W$ captures $A$ and $|\{s \in W : |s| = n_i\}| \le i$, for all $i \in \omega$.

Proof. Note that, in the proof of Shelah’s Theorem, it is enough to suppose that the size of $A$ is $\omega_1$. As such, we will suppose exactly that.

Let $z$ be a bijection between ${}^{<\omega}2$ and $\omega - \{0\}$. For $i \in \omega - \{0\}$ and $s \in {}^{<\omega}2$ we define $B_{i,s} \subset {}^{<\omega}2$ by

\[
B_{i,s} = \{u * 1^i \in {}^{<\omega}2 : |u| = i \times z(s)\},
\]

where $u * 1^i$ is a sequence $u$ concatenated with the sequence of 1’s with length $i$. From this definition it follows that if $i_1 \ne i_2$ or $s_1 \ne s_2$, then the length of the sequences in $B_{i_1,s_1}$ is different from the length of the sequences in $B_{i_2,s_2}$. So, if $O(B_{i,s})$ is the set of all infinite binary sequences that contain some sequence in $B_{i,s}$, then the sets $O(B_{i_1,s_1})$ and $O(B_{i_2,s_2})$ are independent, i.e.,

\[
\mu_v(O(B_{i_1,s_1}) \cap O(B_{i_2,s_2})) = \mu_v(O(B_{i_1,s_1})) \cdot \mu_v(O(B_{i_2,s_2})).
\]

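Let $v \in {}^{<\omega}2$, and let $i > |v|$. Then

\[
\mu_v(O(B_{i,s})) = \frac{m_L(O(B_{i,s}) \cap O(v))}{m_L(O(v))} = \frac{m_L(\{a \in O(B_{i,s}) : v \subset u * 1^i \subset a \wedge u * 1^i \in B_{i,s}\})}{m_L(O(v))} = \frac{1}{2^i}.
\]

For a finite $S \subset {}^{<\omega}2$, and for $i > |v|$, we have

\[
\mu_v\Big(\bigcup_{s \in S} O(B_{i,s})\Big) = 1 - \mu_v\Big({}^{\omega}2 - \bigcup_{s \in S} O(B_{i,s})\Big) = 1 - \mu_v\Big(\bigcap_{s \in S} ({}^{\omega}2 - O(B_{i,s}))\Big) = 1 - \Big(1 - \frac{1}{2^i}\Big)^{|S|}.
\]

Recall that if $x > 0$, then $1 - x < e^{-x}$. Thus, if $|S| \ge 2^i$, it follows that

\[
\Big(1 - \frac{1}{2^i}\Big)^{|S|} \le \Big(1 - \frac{1}{2^i}\Big)^{2^i} < \Big(e^{-\frac{1}{2^i}}\Big)^{2^i} = e^{-1} < \frac{1}{2}.
\]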
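The two computations above can be checked exactly on a small instance. The Python sketch below is only an illustration under stated assumptions: it takes $v = \emptyset$, identifies each $s$ with its code $z(s) \ge 1$, and uses the fact that an infinite sequence lies in $O(B_{i,s})$ iff it has 1’s at the positions $i \cdot z(s), \ldots, i \cdot z(s) + i - 1$, so that the measure can be computed by enumerating finite prefixes. The function name `union_measure` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import exp


def union_measure(i, codes):
    """Exact measure of the union over z in `codes` of the sets
    {a : a(i*z + k) = 1 for all k < i}, computed by enumerating prefixes."""
    n_bits = i * (max(codes) + 1)
    hits = 0
    for bits in product((0, 1), repeat=n_bits):
        if any(all(bits[i * z + k] == 1 for k in range(i)) for z in codes):
            hits += 1
    return Fraction(hits, 2 ** n_bits)


# Instance with i = 2 and |S| = 2^i = 4 (codes 1..4 are an arbitrary choice).
I, CODES = 2, [1, 2, 3, 4]
measure = union_measure(I, CODES)
closed_form = 1 - (1 - Fraction(1, 2 ** I)) ** len(CODES)
print(measure, closed_form, measure > Fraction(1, 2))

# Numerical check of (1 - 2^-i)^(2^i) < e^-1 < 1/2 for small i.
bound_ok = all(
    float((1 - Fraction(1, 2 ** i)) ** (2 ** i)) < exp(-1) < 0.5
    for i in range(1, 11)
)
print(bound_ok)
```

For this instance the enumeration agrees with the closed form $1 - (1 - 2^{-i})^{|S|}$, and the resulting value exceeds $\frac{1}{2}$, as the inequality predicts when $|S| \ge 2^i$.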

So if $i > |v|$, and $|S| \ge 2^i$, then $\mu_v(\bigcup_{s \in S} O(B_{i,s})) > \frac{1}{2}$. Now for $a \in A$ let

\[
U_n^a = \bigcup_{i > n} O\big(B_{i,\, a \upharpoonright n_{2^{2(i+1)}}}\big).
\]

Then $m_L(U_n^a) \le \sum_{i>n} \frac{1}{2^i} = \frac{1}{2^n}$, because the length of the sequences in $B_{i,\, a \upharpoonright n_{2^{2(i+1)}}}$ is never $\le n$. It follows that $\bigcap_{n \in \omega} U_n^a$ is a null set. Since the union of any sequence of $\omega_1$ null sets is null by hypothesis, we have that $\bigcup_{a \in A} (\bigcap_{n \in \omega} U_n^a)$ is null.

Let $V$ be an open set containing $\bigcup_{a \in A} (\bigcap_{n \in \omega} U_n^a)$, such that $m_L(V) < \frac{1}{2}$. For $v \in {}^{<\omega}2$ and $i \in \omega$ we define $S(v, i)$ to be the set of those $s$ such that $\mu_v(O(B_{i,s}) - V) = 0$ and $\mu_v(V) < \frac{1}{2}$. Note that if $\mu_v(V) \ge \frac{1}{2}$, then $S(v, i) = \emptyset$. We have that

\[
0 = \sum_{s \in S(v,i)} \mu_v(O(B_{i,s}) - V) \ge \mu_v\Big(\bigcup_{s \in S(v,i)} (O(B_{i,s}) - V)\Big) > \mu_v\Big(\bigcup_{s \in S(v,i)} O(B_{i,s})\Big) - \frac{1}{2}.
\]

It follows that if $i > |v|$, then $|S(v, i)| < 2^i$, because otherwise we would get $\mu_v(\bigcup_{s \in S(v,i)} O(B_{i,s})) > \frac{1}{2}$, and thus $\sum_{s \in S(v,i)} \mu_v(O(B_{i,s}) - V) > 0$, contradicting our choice of $S(v, i)$. Define

\[
W' = \{s \in {}^{<\omega}2 : \exists i\, \exists v\ (|v| < i,\ s \in S(v, i), \text{ and } |s| = n_{2^{2(i+1)}})\}.
\]

By definition, the lengths of the sequences in $W'$ are of the form $n_{2^{2(i+1)}}$. Since we wish for $W$ to capture $A$ we must fill the gaps in length. For any $i \in \omega$, if there is an $s \in W'$ of length $n_{2^{2(i+1)}}$ and a $t \in W'$ of length $n_{2^{2(i+2)}}$ that extends $s$, then we add to $W'$ all sequences $v$ such that $s \subset v \subset t$. Let $W$ be this new set obtained from $W'$.

If $|v| < i$, then $|S(v, i)| < 2^i$, as computed above. Therefore $|\{s \in W : |s| = n_{2^{2(i+1)}}\}| < 2^i \times i < 2^{2(i+1)}$. Similarly, $|\{s \in W : |s| = n_{2^{2(i+2)}}\}| < 2^{i+1} \times (i + 1) < 2^{2(i+2)}$. Now, let $2^{2(i+1)} < k < 2^{2(i+2)}$, let $|s| = n_{2^{2(i+1)}}$, and let $|t| = n_{2^{2(i+2)}}$. Then the number of sequences $v$ in $W$, with length $n_k$, such that $s \subset v \subset t$, is less than $2^{i+1} \times (i + 1) \le 2^{2(i+1)} < k$. Therefore $|\{s \in W : |s| = n_k\}| < k$, for all $k \in \omega$.

To show that $W$ captures $A$ we first prove the following claim.

Claim: For every $a \in A$ there is an $n \in \omega$ and a $v \in {}^{<\omega}2$ such that $\mu_v(V) < \frac{1}{2}$ and $\mu_v(U_n^a - V) = 0$.

Proof. Suppose not. Then there is an $a$ such that $\mu_v(V) \ge \frac{1}{2}$ or $\mu_v(U_n^a - V) > 0$ for every $n$ and $v$. We will recursively define a $\subset$-increasing chain of finite sequences $\{s_n : n \in \omega\}$ to satisfy $\mu_{s_n}(V) < \frac{1}{2}$ and $O(s_{n+1}) \subset U_n^a$ for all $n$. The real number $c = \bigcup_{n \in \omega} s_n$ will provide a contradiction.

Let $s_0 = \emptyset$. Then $O(s_0) = {}^{\omega}2$, and $\mu_{s_0}(V) = m_L(V) < \frac{1}{2}$. Therefore $\mu_{s_0}(U_0^a - V) > 0$. Assume $s_n$ is given such that $\mu_{s_n}(V) < \frac{1}{2}$. Then $\mu_{s_n}(U_n^a - V) > 0$. Since $U_n^a$ is open there is a $u \supset s_n$ such that $O(u) \subset U_n^a$ and $\mu_{s_n}(O(u) - V) > 0$. By the Lebesgue density theorem, there is a $b \in (O(u) - V)$, such that $d(b) \to 1$. Since $b$ is outside $V$, there is a long enough sequence $t$, such that $u \subset t \subset b$, and

\[
\mu_t(V) = \frac{m_L(V \cap O(t))}{m_L(O(t))} < \frac{1}{2}.
\]

Let $s_{n+1} = t$. This completes the construction of $\{s_n : n \in \omega\}$.

Let $c = \bigcup_{n \in \omega} s_n$. Then $c \in (\bigcap_{n \in \omega} U_n^a) \subset V$ because $O(s_{n+1}) \subset U_n^a$ for each $n$. But, for each $n$, we have $O(s_n) \not\subseteq V$. That is, the arbitrarily small neighborhoods $O(s_n)$ of $c$ are not contained in $V$. Therefore $V$ is not open, which is a contradiction. □

Now we are ready to complete the proof by showing that $W$ captures $A$. Let $a \in A$, let $n$ and $v$ be as in the claim, and let $i > |v|$ and $i > n$. Since $W$ was defined by filling the gaps in length, it suffices to show that $a \upharpoonright n_{2^{2(i+1)}} \in W'$. In other words, it is sufficient to prove that $\mu_v(O(B_{i,\, a \upharpoonright n_{2^{2(i+1)}}}) - V) = 0$. But this follows immediately from the fact that $O(B_{i,\, a \upharpoonright n_{2^{2(i+1)}}}) \subset U_n^a$, and from the claim.
