Florida State University Libraries

Electronic Theses, Treatises and Dissertations The Graduate School

2009 A Defense of Platonic Realism in Mathematics: Problems About the Axiom of Choice Wataru Asanuma

FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

A DEFENSE OF PLATONIC REALISM IN MATHEMATICS:

PROBLEMS ABOUT THE AXIOM OF CHOICE

By

WATARU ASANUMA

A Dissertation submitted to the Department of Philosophy in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Degree Awarded: Spring Semester, 2009

The members of the Committee approve the Dissertation of Wataru Asanuma defended on April 7, 2009.

______Russell M. Dancy Professor Directing Dissertation

______Philip L. Bowers Outside Committee Member

______J. Piers Rawling Committee Member

______Joshua Gert Committee Member

Approved:

______J. Piers Rawling, Chair, Philosophy

______Joseph Travis, Dean, College of Arts and Sciences

The Graduate School has verified and approved the above named committee members.


ACKNOWLEDGMENTS

First of all, I would like to thank all my dissertation committee members, Dr. Russell Dancy, Dr. Piers Rawling, Dr. Joshua Gert and Dr. Philip Bowers. Special thanks go to my major professor, Dr. Dancy, who has guided my dissertation every step of the way. The courses they offered, especially Dr. Dancy's "Plato's 'Unwritten Doctrines'," Dr. Rawling's "Modern Logic I & II" and Dr. Gert's "Philosophy of Mathematics," formed the backbone of my dissertation. I would also like to thank Dr. Bowers (Department of Mathematics) for serving as an outside committee member. I was exceptionally fortunate to have the opportunity to present parts of my dissertation at the 2008 Logic, Mathematics and Physics Graduate Philosophy Conference at the University of Western Ontario. The conference marked a major turning point for me, and continues to inspire and motivate further research. I am especially grateful to Dr. John Bell and Dr. William Harper for their kind and insightful comments on my presentation. In addition, I had opportunities to present parts of my dissertation at the 7th Hawaii International Conference on Arts and Humanities and at the 12th Northeast Florida Student Philosophy Conference. I would like to express my appreciation to the audiences for their attention.


TABLE OF CONTENTS

List of Figures
Abstract

INTRODUCTION

1. WHAT IS THE AXIOM OF CHOICE?

1.1 The Axioms of ZF Set Theory
1.2 The Axiom of Choice and its Equivalents
1.3 The Consequences of the Axiom of Choice
1.4 A Weaker Form of the Axiom of Choice
1.5 The Axiom of Choice and the Continuum Hypothesis
1.6 Fictionalism or Instrumentalism

2. LEBESGUE'S THEORY OF MEASURE

2.1 Lebesgue's Theory of Integration
2.2 Measurable Cardinals

3. THE BANACH-TARSKI PARADOX

3.1 Preliminaries
3.2 Non-Lebesgue Measurable Sets
3.3 The Hausdorff Paradox
3.4 The Banach-Tarski Paradox
3.5 What is a Paradox?
3.6 What cannot Happen?
3.7 A Paradox without the Axiom of Choice?
3.8 Some Philosophical Implications

4. GÖDEL'S INCOMPLETENESS THEOREMS

4.1 Gödel's First Incompleteness Theorem
4.2 Turing Machines
4.3 Recursive Functions and Recursive Sets
4.4 The Halting Problem
4.5 The Undecidability of First-order Logic and Arithmetic
4.6 Undecidability and Incompleteness
4.7 Gödel's Second Incompleteness Theorem
4.8 Relative Consistency Proofs

5. MODEL-THEORETIC ARGUMENTS

5.1 What is a Model?
5.2 The Löwenheim-Skolem Theorem and the Skolem Paradox
5.3 Quine's Thesis of the Indeterminacy of Translation
5.4 Putnam's Model-Theoretic Arguments against Metaphysical Realism
5.5 The Lessons from the Model-Theoretic Arguments

6. WHAT IS MATHEMATICAL EXISTENCE?

6.1 Benacerraf's Challenges to Platonism
6.2 Some Applications of the Axiom of Choice
6.3 The Axiom of Determinacy
6.4 The Axiom of Constructibility
6.5 Anselm's Argument of the Existence of God
6.6 Locke on Essences
6.7 Essence and Existence in Mathematics
6.8 What is a Maximally Consistent Theory?

CONCLUSION

APPENDIX

BIBLIOGRAPHY

BIOGRAPHICAL SKETCH


LIST OF FIGURES

Figure 1: The Axiom of Choice

Figure 2: Riemann Integral vs. Lebesgue Integral

Figure 3: Example of a partially ordered set

Figure 4: Partial order by inclusion of the power set of a set S = {0, 1, 2}

Figure 5: The use of the Axiom of Choice in the proof of Zorn's Lemma

Figure 6: Example of a non-Lebesgue measurable set

Figure 7: G-paradoxical

Figure 8: G-equidecomposable

Figure 9: The Hausdorff Paradox

Figure 10: The existence of a set T containing exactly one element from each G-orbit

Figure 11: Hausdorff's paradoxical decomposition

Figure 12: The Weak Form of the Banach-Tarski Paradox (Two Spheres from One Version)

Figure 13: The Weak Form of the Banach-Tarski Paradox (The Pea and the Sun Version)

Figure 14: Example of a computer program to decide whether x is an even number

Figure 15: Example of a computer program to decide whether x is an odd number

Figure 16: Example of a computer program to compute x+y

Figure 17: Recursively enumerable (r.e.)

Figure 18: Decidable (recursive)

Figure 19: Undecidable

Figure 20: Example of a recursive set

Figure 21: Example of a recursively enumerable but not recursive set

Figure 22: The analogy between the Halting Problem and Cantor's Theorem

Figure 23: T(0)↓ is undecidable

Figure 24: Approach to Gödel's First Incompleteness Theorem from the theory of computability

Figure 25: The Skolem Paradox

Figure 26: The comparison of the indeterminacy of translation and the underdetermination of scientific theory

Figure 27: Example of actual practice of mathematics

Figure 28: Example of the interrelationships among the models


ABSTRACT

The conflict between Platonic realism and Constructivism marks a watershed in philosophy of mathematics. Among other things, the controversy over the Axiom of Choice is typical of the conflict. Platonists accept the Axiom of Choice, which allows a set consisting of the members resulting from infinitely many arbitrary choices, while Constructivists reject the Axiom of Choice and confine themselves to sets consisting of effectively specifiable members. Indeed there are seemingly unpleasant consequences of the Axiom of Choice. The non-constructive nature of the Axiom of Choice leads to the existence of non-Lebesgue measurable sets, which in turn yields the Banach-Tarski Paradox. But the Banach-Tarski Paradox is so called in the sense that it is a counter-intuitive theorem, not a logical contradiction. To corroborate my view that mathematical truths are of a non-constructive nature, I shall draw upon Gödel's Incompleteness Theorems. These theorems also show the limitations inherent in formal methods. Indeed the Löwenheim-Skolem Theorem and the Skolem Paradox seem to pose a threat to Platonists. In this light, Quine's and Putnam's arguments come to take on a clear meaning. According to the model-theoretic arguments, the Axiom of Choice depends for its truth-value upon the model in which it is placed. In my view, however, this is another limitation inherent in formal methods, not a defect for Platonists. To see this, we shall examine how mathematical models have been developed in the actual practice of mathematics. I argue that most mathematicians accept the Axiom of Choice because the existence of non-Lebesgue measurable sets and the Well-Ordering of reals open the possibility of more fruitful mathematics. Finally, after responding to Benacerraf's challenge to Platonism, I conclude that in mathematics, as distinct from natural sciences, there is a close connection between essence and existence. Actual mathematical theories are the parts of the maximally logically consistent theory that describes mathematical reality.


INTRODUCTION

A fundamental problem of philosophy of mathematics boils down to the conflict between Platonic realism and Constructivism, and this conflict marks a watershed in philosophy of mathematics. By "Platonic realism" I mean the philosophical view that posits mathematical entities, such as numbers, sets, functions and so on, as super-spatio-temporal ones. Indeed, owing to this view, mathematical knowledge was extended further and further. At the turn of the last century, however, a variety of paradoxes, such as Russell's Paradox, were discovered by mathematicians and logicians in the wake of the attempts to base the whole of mathematics on set theory, and gave rise to the so-called "crisis in the foundations of mathematics."

Against Platonic realism, there arose an anti-realistic doctrine called "Constructivism." Constructivism avoids positing mathematical entities dogmatically and restricts them to those that are legitimately constructible in space and time.1 But this view carries the danger of a high price: the sacrifice of many productive results of classical mathematics. This is the reason why philosophers of mathematics take pains to seek some middle ground between the two extreme camps. At this point the problem of how to deal with the Law of Excluded Middle, impredicative definition, the Axiom of Choice, actual or potential infinity and so on becomes a controversial issue.

Among other things, the controversy over the Axiom of Choice is typical of the conflict between Platonic realism and Constructivism. Not only is the Axiom of Choice the most interesting axiom in axiomatic set theory, but it also plays an important role in many other areas of mathematics. So the problem of the Axiom of Choice is one of the significant topics in philosophy of mathematics.

First of all, we shall see what the Axiom of Choice is and where the problem with the Axiom lies. In particular, we shall focus on what we can do in the presence of the Axiom of Choice that we could not do otherwise. Platonists accept the Axiom of Choice, which allows a set consisting of the members resulting from infinitely many arbitrary choices, while Constructivists reject the Axiom of Choice and confine themselves to sets consisting of effectively specifiable members (Chapter 1).

Lebesgue's theory of measure will set the stage for discussing the Banach-Tarski Paradox and the existence of measurable cardinals in later chapters. Also, since Lebesgue is one of the French Constructivists, it is interesting to see that the non-constructive nature of Lebesgue measure creates an irreconcilable tension with Lebesgue's skeptical attitude toward the Axiom of Choice (Chapter 2).

The Hausdorff Paradox is the prototype of the Banach-Tarski Paradox. Informally, the Hausdorff Paradox states that a sphere can be decomposed into a finite number of pieces and reassembled by rigid motions to form two copies of almost the same size as the original. Here "almost" means "except on a countable subset." Banach and Tarski improved on the Hausdorff Paradox by eliminating the need to exclude a countable subset from a sphere. Informally, the Banach-Tarski Paradox states that a sphere can be decomposed into a finite number of pieces and reassembled by rigid motions to form two copies of exactly the same size as the original. The Banach-Tarski Paradox deepened the skepticism about the Axiom of Choice. But the Banach-Tarski Paradox is so called in the sense that it is a counter-intuitive theorem, as distinct from a logical contradiction or a piece of fallacious reasoning. I argue that we should accept the Banach-Tarski Paradox as a Platonic truth and reject an epistemology based on mathematical intuition (Chapter 3).

Next, from a slightly different perspective, I corroborate my view that mathematical truths are of a non-constructive nature. Once we have the undecidability of Peano Arithmetic (PA), Gödel's First Incompleteness Theorem is immediate. The set of true sentences in the language of PA is not recursively enumerable. But the set of theorems (provable sentences) of PA is recursively enumerable. So it is easy to see that there is a sentence that is true but unprovable. This implies that there are some arithmetical truths we cannot get access to in an effective way. We also have to note that Gödel's Incompleteness Theorems show that there are limitations inherent in formal methods (Chapter 4).

The Löwenheim-Skolem Theorem and the Skolem Paradox seem to pose a threat to Platonists. In the light of the Löwenheim-Skolem results, both Quine's thesis of the indeterminacy of translation and Putnam's model-theoretic arguments against metaphysical realism come to take on a clear meaning. According to the model-theoretic arguments, the Axiom of Choice depends for its truth-value upon the model in which it is placed. In my view, however, this is another limitation inherent in formal methods, not a defect for Platonists (Chapter 5).

Finally, I meet Benacerraf's epistemological and ontological challenges to Platonism by examining how mathematical models have been developed in the actual practice of mathematics. Most mathematicians prefer the Axiom of Choice to the Axiom of Determinacy in favor of the existence of non-Lebesgue measurable sets and the Well-Ordering of reals. Also, most mathematicians reject the Axiom of Constructibility in favor of the existence of a measurable cardinal. In both cases, working mathematicians are driven by Platonic realism rather than Constructivism. I conclude that in mathematics, as distinct from natural sciences, there is a close connection between essence and existence, actuality and possibility. The actual mathematical theories are the parts of the maximally logically consistent theory that describes mathematical reality (Chapter 6).2

1 I will use the word "Constructivism" in a broader sense than Brouwer's Constructivism. In Brouwer's Constructivism mathematical entities are constructible in our mind. But I will use the word "Constructivism" in a narrower sense than Gödel's Axiom of Constructibility. Gödel's Axiom of Constructibility is a much stronger assumption than Constructivism as I call it.

2 I should credit the sources on which I drew for the mathematical technicalities. Throughout the process of writing the dissertation, I referred to Cameron (1998), Hamilton (1988), Jech (1978), Kunen (1980), Levy (1979). They offer a panoramic view of set theory overall. The former two are concise but useful introductions, whereas the latter three provide detailed and exhaustive information. For Lebesgue's theory of measure, Hawkins (1975) is a good guide to the historical background. For how the Lebesgue integral overcomes the difficulties of the Riemann integral, see e.g. Weir (1973), Wilcox and Myers (1978). For the Banach-Tarski Paradox, one can find technical details in Wagon (1985). Wapner (2005) gives a more informal presentation of the Banach-Tarski Paradox. When discussing Gödel's First Incompleteness Theorem, I focus on the approach from the theory of computability. For this approach, Boolos and Jeffrey (1974) is a classic, although a wholesale revision has been made in the 4th edition of the same title (2002). Also, I consulted Cohen (1987), Cutland (1980), Ebbinghaus, Flum and Thomas (1994). Franzén (2005) warns against a prevalent misconception of Gödel's First Incompleteness Theorem and a conflation of distinct senses of "completeness" and "undecidability." Manin (1977), Mendelson (1997) are good guides for the Skolem Paradox.


CHAPTER 1

WHAT IS THE AXIOM OF CHOICE?

Introduction

In this chapter, first of all, I recapitulate the Axioms of Zermelo-Fraenkel (ZF) set theory (Section 1). Then I state the Axiom of Choice and give a couple of its equivalents: the Well-Ordering Theorem and the Multiplicative Axiom (Section 2). Next, I shall show that the Axiom of Choice has some useful consequences, e.g., the Aleph Theorem. At the same time, we shall see that there were many opponents of the Axiom of Choice, and that it has some unpleasant consequences as well (Section 3). Also, I shall discuss a weaker form of the Axiom of Choice, the Denumerable Axiom of Choice, and some of its consequences (Section 4). Moreover, I shall examine the relation between the Axiom of Choice and the Continuum Hypothesis (Section 5). Finally, I provisionally conclude that the debate over the Axiom of Choice favors Platonic realism.

1.1 The Axioms of ZF Set Theory

Before I state the Axiom of Choice, I shall review the Axioms of ZF set theory. In 1930 Zermelo proposed ZF set theory in a form closely related to that used today, which consisted of the following seven Axioms.

(i) Axiom of Extensionality: If two sets x, y have exactly the same members, then they are equal.

(ii) Power Set Axiom: For any set x, the power set of x is a set. Here the power set is the set of all subsets of x.

(iii) Axiom of Union: For any set x, the union of x is a set. The union, denoted by ⋃x, is the set of all members of the members of a set x.

(iv) Axiom of Pairing: For any sets x, y, {x, y} is a set.

(v) Axiom of Separation: If a propositional function P(x) is definite for a set z, there is a set y containing exactly the members x of z for which P(x) holds. The Axiom allows us to separate the members with some property from a set and form a set consisting of these members.

(vi) Axiom of Replacement: If F is a function, then for every set x, F[x] is a set. F[x] is called the image of x under the mapping F.

(vii) Axiom of Foundation: If x ≠ 0 then there exists y ∈ x such that y ∩ x = 0. This means that there is no infinite descending ∈-sequence.

Also, there were two Axioms that were not included in this system but had occurred in his system of 1908: the Axiom of Infinity and the Axiom of Choice. Since I shall discuss the Axiom of Choice in detail below, I shall mention just the Axiom of Infinity here.

Axiom of Infinity: There exists a set x such that 0 ∈ x and whenever y ∈ x then y ∪ {y} ∈ x. This means that if we pick any member y of the set x, then the immediate successor of y, namely y ∪ {y}, is also in x.

Zermelo did not include the Axiom of Infinity in his system of 1930 because he believed that it did not belong to the general theory of sets. He did not include the Axiom of Choice on the ground that it differed in nature from the other Axioms. Contemporary ZFC set theory includes the seven Axioms postulated above, the Axiom of Infinity, and the Axiom of Choice.

1.2 The Axiom of Choice and its Equivalents

The Axiom of Choice

First of all, we shall see what the Axiom of Choice says: For every family F of disjoint nonempty sets S, there exists a set C containing exactly one member from each member S of F (i.e., for each S ∈ F the set S ∩ C is a singleton). Using the notion of a function we can paraphrase this as follows: For every family F of disjoint nonempty sets S, there exists a choice function f on F such that f(S) ∈ S for each set S in the family F. For instance, we can classify all natural numbers by the residues that result when they are divided by 3 (i.e., form the set T of the sets S of numbers congruent to each other modulo 3).

T = {S1 = {0, 3, 6, …},
S2 = {1, 4, 7, …},
S3 = {2, 5, 8, …}}

Then it is easy to see that there exists a set C containing exactly one member from each member S1, S2, S3 of T (e.g., C = {0, 4, 8}). In fact, the use of the Axiom of Choice is dispensable in the case of a family of finitely many disjoint non-empty sets, and even in the case of a family of infinitely many disjoint non-empty sets if we can specify the rule by which to perform the choices. In our case, we can make sure that there exists such a set without appealing to the Axiom of Choice, for instance, by following the rule of choosing the least member of each Sn (i.e., C = {0, 1, 2}). The problem of the Axiom of Choice is concerned only with infinitely many arbitrary choices.
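
The rule-governed case just described can be made concrete with a small sketch. The function names `residue_class` and `least_member` are illustrative assumptions introduced here, not part of the original text; the point is only that a definable rule ("take the least member") produces the set C = {0, 1, 2} without any appeal to the Axiom of Choice.

```python
from itertools import count

MODULUS = 3  # the text's example classifies natural numbers modulo 3


def residue_class(r):
    """Lazily enumerate S = {n in N : n mod 3 == r} in increasing order."""
    return (n for n in count(0) if n % MODULUS == r)


def least_member(r):
    """The rule from the text: choose the least member of the class with residue r."""
    return next(residue_class(r))


# C contains exactly one member of each of S1, S2, S3, obtained by a rule.
C = {least_member(r) for r in range(MODULUS)}
print(sorted(C))  # [0, 1, 2]
```

Because each class is an infinite set that is never fully listed, the sketch only ever inspects its first member. The choice is effective precisely because the rule singles that member out in advance.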

Figure 1: The Axiom of Choice.

The Well-Ordering Theorem

The most useful form of the Axiom of Choice is the Well-Ordering Theorem: Every set can be well-ordered. In fact, the Axiom of Choice is equivalent to the Well-Ordering Theorem. But since the Well-Ordering Theorem requires proof, we cannot regard it as an axiom despite its usefulness. So it is important to show the equivalence of the Axiom of Choice and the Well-Ordering Theorem. But first we have to define a well-ordering.


In order to define a well-ordering exactly, we need to define the notion of "an R-minimal member":

x is an R-minimal member of A if and only if x ∈ A & (∀y)(y ∈ A → ~(yRx)).

Also, we need to define the notion of "connected":

R is connected in A if and only if (∀x)(∀y)(x, y ∈ A & x ≠ y → xRy ∨ yRx).

We shall next define a well-ordering:

R well-orders A if and only if every nonempty subset of A has an R-minimal member & R is connected in A.

Roughly speaking, the notion of "an R-minimal member" guarantees the existence of a least member of every nonempty subset of A under the relation R. The notion of "connected" guarantees that any two distinct members of A are comparable; together with the first condition, which excludes the possibility of circularity, this makes R a linear ordering on A. In Appendix (I), I shall show that the Axiom of Choice is equivalent to the Well-Ordering Theorem.

In 1904 Zermelo explicitly formulated the Axiom of Choice and proved that the Axiom of Choice is equivalent to the Well-Ordering Theorem. As we shall see in Section 3, there arose much controversy over the non-constructive nature of the Axiom of Choice. In response to his critics, in 1908 Zermelo reformulated the Axiom of Choice and his proof. There Zermelo attempted to strip the Axiom of Choice of every constructivist appearance by replacing a system of successive choices with a system of simultaneous ones, and put more emphasis on its super-temporality. We can clearly see the figure of Zermelo as a Platonic realist here. In the same year Zermelo launched the axiomatization of set theory. It is often said that the discovery of set-theoretic paradoxes motivated Zermelo to axiomatize set theory. Under these circumstances, however, we could safely conclude that Zermelo wanted to secure the status of the Axiom of Choice by creating a rigorous system of axioms for set theory and to lay down firm foundations for set theory and mathematics in general.

The Multiplicative Axiom

We also have to notice that there are many other equivalents of the Axiom of Choice.
For instance, in abstract algebra one of the equivalents of the Axiom of Choice, Zorn's Lemma, is applied earlier than the Well-Ordering Theorem. This means that the Axiom of Choice is not an ad hoc principle formed in the development of mathematics, but a stable principle which is widely applicable in many branches of mathematics. But here, in connection with axiomatic set theory, I shall confine my attention to Russell's Multiplicative Axiom. In Principia Mathematica Russell introduces the Axiom of Choice in the following way: "If κ is a class of mutually exclusive classes, no one of which is null, there is at least one class which takes one and only one member from each member of κ."3 Russell calls it the "Multiplicative Axiom," probably because of the Axiom's connection with cardinal multiplication, i.e., the construction of a set for the product of a denumerable infinity of cardinals.
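
As a hedged gloss on the connection with cardinal multiplication, the Multiplicative Axiom can be restated in modern notation as the claim that a product of nonempty sets is nonempty. The index set I and the family (S_i) are notation assumed here for illustration; they are not Russell's own symbols.

```latex
% Assumed notation: I is an index set, (S_i)_{i \in I} a family of sets.
% \text requires the amsmath package.
\[
  \bigl(\forall i \in I\,(S_i \neq \emptyset)\bigr)
  \;\Longrightarrow\;
  \prod_{i \in I} S_i \neq \emptyset ,
  \qquad\text{where}\quad
  \prod_{i \in I} S_i
  = \Bigl\{\, f : I \to \bigcup_{i \in I} S_i \;\Bigm|\;
      \forall i \in I\,\bigl(f(i) \in S_i\bigr) \Bigr\}.
\]
```

Each member f of the product is a choice function for the family. If the product of the cardinals of the S_i is defined as the cardinality of this product set, it is nonzero exactly when such an f exists.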

3 Russell, B. and Whitehead, A. N. [1910], vol. I, p. 536.

Russell takes as an example the millionaire who bought ℵ₀ pairs of boots and ℵ₀ pairs of socks.4 The question is how many boots and how many socks the millionaire had in all. Although it is natural to suppose that he had 2×ℵ₀ boots and 2×ℵ₀ socks, we know that ℵ₀ is not increased by doubling it, that is, 2×ℵ₀ = ℵ₀. So the answer is that he had ℵ₀ boots and ℵ₀ socks. In general, the sum of ℵ₀ pairs must have ℵ₀ members. But we have to notice that this result presupposes the existence of a set that consists of one member of each pair. In some cases we can have such a set without the Multiplicative Axiom, whereas in other cases we cannot unless we assume the Axiom. In our case, within a pair of boots we can distinguish left from right and thus choose all the right boots and then all the left boots. Since there are no such distinguishing features within a pair of socks, however, we have no specific rule by which to choose one sock from each pair. Therefore, in the case of socks the use of the Multiplicative Axiom is essential to show that there exists a set consisting of one sock from each pair.

1.3 The Consequences of the Axiom of Choice

As we have seen above, if we assume the Axiom of Choice, then, by the Well-Ordering Theorem, every set can be well-ordered. So the set R of all real numbers can be well-ordered. This is one of the most significant consequences of the Axiom of Choice. This does not mean that in the absence of the Axiom of Choice we know little about the set R. Actually, we know by the Cantorian diagonal argument that the cardinality of the set R is greater than that of the set N of all natural numbers, and that the cardinality of the set R, or of the continuum, is that of the power set of the set N, i.e., 2^ℵ₀. On the basis of ZF set theory without the Axiom of Choice, however, we cannot prove whether or not the set R can be well-ordered, and therefore we do not even know whether or not the cardinality of the set R is an aleph.5 Only in the presence of the Axiom of Choice do we know that the set R can be well-ordered, and that the cardinality of the set R is an aleph. And only then can we ask which aleph is its cardinal.

The set N of all natural numbers can be well-ordered by the less-than relation. Using the terminology of ZF set theory, the set N can be well-ordered by the membership relation. One of the strengths of ZF set theory is that the less-than relation can be replaced by the membership relation. The set N can be well-ordered by the less-than relation because every nonempty subset of the set N has a least member. On the other hand, the set N cannot be well-ordered by the greater-than relation because there are many subsets, N itself among them, that do not have a greatest member.

The set Q of all rational numbers cannot be well-ordered by magnitude. But it is easy to see how the set Q can be well-ordered: using the enumeration that emerges from the proof that the cardinality of the rational numbers is the same as that of the natural numbers, the set Q can be put into one-to-one correspondence with the set of all natural numbers, and the well-ordering of the latter carries over to Q.

But the situation is quite different with the set R of all real numbers. Intuitively speaking, we do not know how the set R can be well-ordered. Even so, by the Well-Ordering Theorem, which states that every set can be well-ordered, the set R can be well-ordered. As with the set Q, it is obvious that the set R cannot be well-ordered by magnitude. But unlike the set Q, there is no obvious ordering to hand that does the trick.
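
The well-ordering of Q by enumeration, as opposed to magnitude, can be illustrated with a short sketch. The particular enumeration below (lowest-terms fractions ordered by numerator plus denominator) and the names `rationals` and `rank` are assumptions chosen for illustration; any one-to-one enumeration of Q would serve equally well.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def rationals():
    """Enumerate every rational exactly once: 0, then +p/q and -p/q
    in lowest terms, ordered by increasing p + q."""
    yield Fraction(0)
    for total in count(2):          # total = p + q
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:      # skip repeats such as 2/4
                yield Fraction(p, q)
                yield Fraction(-p, q)


def rank(x):
    """Position of x in the enumeration; comparing ranks orders Q like N."""
    return next(i for i, y in enumerate(rationals()) if y == x)


print([str(q) for q in islice(rationals(), 7)])
# ['0', '1', '-1', '1/2', '-1/2', '2', '-2']

# The set {1, 1/2, 1/3, ...} has no least member by magnitude,
# but every sample of it has a least member by rank.
sample = [Fraction(1, n) for n in range(1, 6)]
print(min(sample, key=rank))  # 1
```

Since `rank` maps Q one-to-one into N, every nonempty set of rationals has a member of least rank, which is exactly the R-minimal member required by the definition of a well-ordering. For R no enumeration of this kind is possible, since R is uncountable.
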
However, the Well-Ordering Theorem tells us that there is some relation by which the set R can be well-ordered, though we do not know what it is specifically. We can see even from this that the Well-Ordering Theorem indeed makes a very strong and powerful claim.

4 Russell, B. [1919], p. 126.

5 Alephs are the cardinals of infinite well-ordered sets.

The Aleph Theorem, The Trichotomy of Cardinals

Moreover, since the Well-Ordering Theorem claims that every set can be well-ordered, it is not just the set R that can be well-ordered. So it follows from the Well-Ordering Theorem that all the cardinals are ordinals, which leads us to the Aleph Theorem that every infinite cardinal is an aleph. Thus the Well-Ordering Theorem simplifies addition and multiplication of infinite cardinal numbers, which would be more complicated otherwise. Also, all cardinals are taken to be initial ordinals. In particular, any two sets are comparable in terms of cardinality. Therefore the Trichotomy of Cardinals is true: For all cardinals m and n, either m < n, or m = n, or m > n. Furthermore, as a corollary of the Aleph Theorem, the following equalities hold for every infinite cardinal m: m = 2m = m². In this fashion the fundamental propositions true for alephs are extended to all infinite cardinals. If we assume the Axiom of Choice, then by the Well-Ordering Theorem, we do not have to worry about the existence of sets that cannot be well-ordered. We know much more about the cardinals of well-orderable sets than about the cardinals of sets that cannot be well-ordered. As a consequence, once we assume the Axiom of Choice, which implies the Well-Ordering Theorem, the theory of cardinals is considerably simplified.

But in fact there arose much controversy over the Axiom of Choice and Zermelo's proof of its equivalence to the Well-Ordering Theorem. Hadamard, Hausdorff, and Keyser defended the proof in full generality.
Roughly speaking, however, German critics such as Bernstein and Schoenflies disputed the proof on the ground that the Burali-Forti paradox lies hidden in the proof, while French Constructivists such as Lebesgue, Borel, and Baire opposed the Axiom of Choice itself on the ground that it does not provide the specific rule by which to perform the choices.6 Though Zermelo met the first criticism by rejecting the assertion that the collection W of all ordinals is a set, the second one was more serious because of the stark philosophical difference underlying that criticism.7

6 Poincaré, who is often said to be a conventionalist, accepted the Axiom of Choice; what he rejected was not the Well-Ordering Theorem but Zermelo's proof of it, because it makes use of impredicative definition.

The fundamental opposition in philosophy of mathematics is that between Platonic realism, which posits mathematical entities outside of space and time, on the one hand, and anti-realism, which restricts them to those which are legitimately constructible in space and time, on the other. Against this philosophical background, the Platonic realists accept the Axiom of Choice, which allows a set consisting of the members resulting from infinitely many arbitrary choices, while the anti-realists reject the Axiom of Choice and confine themselves to sets consisting of effectively specifiable members. Hence some mathematicians have claimed that we should avoid the Axiom of Choice wherever possible, treating it just as a heuristic device for finding a new theorem, which is then to be proved without appeal to the Axiom.

Though, as we have seen above, the legitimacy of the Axiom of Choice was already controversial, skepticism about the Axiom of Choice was deepened when in 1914 Hausdorff discovered an unpleasant consequence of it, which is called Hausdorff's paradox: half of a sphere is congruent to a third of the same sphere. Later Banach and Tarski improved this result to the Banach-Tarski paradox: any sphere S can be decomposed into a finite number of pieces and reassembled into two spheres with the same radius as S. In fact, Borel believed Hausdorff's paradox to show that contradictions follow from the Axiom of Choice and that as a result the Axiom of Choice should be rejected.

1.4 A Weaker Form of the Axiom of Choice

Given the controversial character of the Axiom of Choice, it is natural to attempt to weaken it in some way acceptable to its opponents. We can then save some of its consequences, although we have to sacrifice others. Precisely speaking, I have thus far confined myself to the so-called full Axiom of Choice in distinction from its weaker form.
Since the full Axiom of Choice is independent of ZF, the weaker form of the Axiom of Choice should be

7 The following two objections against the Axiom of Choice can be expected: (1) The Axiom of Choice should be constructibly justifiable. (2) Even if the Axiom of Choice cannot be justified constructibly, we should be able to justify constructibly the Well-Ordering of reals which is most wanted. I doubt that both are legitimate criticisms.

too strong for theorems of ZF, but too weak for the full Axiom of Choice. In other words, the weaker form of the Axiom of Choice should be a theorem T of ZFC. More specifically, when we ask first whether or not it's a theorem of ZF and then whether or not it's equivalent to the full Axiom of Choice, both of the questions should be answered in the negative. For it's supposed to have power intermediate between the theorems of ZF and the full Axiom of Choice.

The Denumerable Axiom of Choice, The Principle of Dependent Choices

An example in point is the Denumerable Axiom of Choice, which restricts infinitely many arbitrary choices to the case of denumerably many sets. To put it precisely, the Denumerable Axiom of Choice runs as follows:

Every family of denumerably many nonempty sets has a choice function.

The Denumerable Axiom of Choice is closely related to the Principle of Dependent Choices:

If $R$ is a relation on a set $S$ such that for every $x \in S$ there exists $y \in S$ such that $xRy$, then there is a sequence $x_0, x_1, x_2, \ldots$ of members of $S$ such that $x_0Rx_1, x_1Rx_2, \ldots, x_nRx_{n+1}, \ldots$

This principle enables us to make a countable number of consecutive choices. In Appendix (II), I shall show that the Principle of Dependent Choices implies the Denumerable Axiom of Choice.

The Countable Union Axiom

If we assume the Denumerable Axiom of Choice, then we can get the Countable Union Axiom:8

The union of countably many countable sets is countable.

In Appendix (III), I shall show this.

Every infinite set has a countable subset, Every Dedekind-finite set is finite, The restricted form of Trichotomy of Cardinals

8 A set is called denumerable if it is equinumerous with ω. A set is called countable if it is either equinumerous with ω or finite.


Also, if we assume the Denumerable Axiom of Choice, then we can prove that every infinite set has a denumerable subset. In Appendix (IV), I shall show this. In sum, I have shown above that the Principle of Dependent Choices implies the Denumerable Axiom of Choice and this in turn implies that every infinite set has a denumerable subset. Incidentally, neither of these implications can be reversed. The last fact means that every Dedekind-finite set is finite. A set S is Dedekind-finite if and only if there is no proper subset of S equipollent to S. It is a matter of significance that if we don't assume the Denumerable Axiom of Choice we cannot prove the equivalence of the notions of Dedekind-finite set and finite set. For this means that in the absence of the Denumerable Axiom of Choice there might exist sets which are infinite in one sense but finite in another. Russell and Whitehead were seriously concerned that there might exist mediate cardinals which were too large to be finite but too small to be Dedekind-infinite. At the same time, it is worth noting that the Denumerable Axiom of Choice, instead of the full Axiom of Choice, suffices to reject such a possibility. Thus, every cardinal is comparable with $\aleph_0$, and the restricted form of the Trichotomy of Cardinals does hold, i.e., $|x| < \aleph_0$, or $|x| = \aleph_0$, or $|x| > \aleph_0$ for any $x$. But the Principle of Dependent Choices has its limitations; it does not, for instance, imply the existence of a well-ordering of the set R of all real numbers.

Historically speaking, Borel, who rejected the full Axiom of Choice, accepted only the Denumerable Axiom of Choice, while unlike Borel, Hobson rejected even denumerably many arbitrary choices, though he was sympathetic with Borel's critique.

1.5 The Axiom of Choice and the Continuum Hypothesis

In Section 3, we have seen that the cardinality of the set R of all real numbers is greater than that of the set N of all natural numbers, and that it is that of the power set of the set N, i.e., $2^{\aleph_0}$. But there we have also seen that only in the presence of the Axiom of Choice, which implies the Well-Ordering Theorem, do we know that the set R can be well-ordered and that the cardinality of the set R is thus an aleph, so that we can ask which aleph is its cardinal.

That is, we can ask whether the cardinality of the set R is the successor cardinal $\aleph_1$ of that of the set N, or whether the successor cardinal $\aleph_1$ lies between the cardinality of the set N and that of the set R. The Continuum Hypothesis claims that the cardinality of the set R is the successor cardinal $\aleph_1$ of that of the set N, i.e., $2^{\aleph_0} = \aleph_1$. This means that the Continuum Hypothesis presupposes the Axiom of Choice. To generalize this, the Generalized Continuum Hypothesis claims that the cardinality of the power set of a set S is the successor cardinal of that of the set S, i.e., $2^{\aleph_{n-1}} = \aleph_n$.

In this connection, it is interesting to see that Brouwer claims that for the intuitionists the Continuum Hypothesis doesn't make sense.9 $\aleph_0$ is the only infinite cardinality whose existence the intuitionists can accept. For the intuitionists real numbers are the rule-governed sequences constructed by a finite number of steps. Therefore for the intuitionists the set of all real numbers which contains free choice sequences is meaningless. So Brouwer claims that for the intuitionists it has no meaning to ask whether or not the cardinality of the set of all real numbers is greater than $\aleph_1$, and whether or not the cardinality of the set of all real numbers is the second smallest infinite cardinality. Given that the Continuum Hypothesis presupposes the Axiom of Choice, it comes as no surprise that Brouwer believes that for the intuitionists the Continuum Hypothesis doesn't make sense. But it is interesting to see that Brouwer admits that a set S is infinite if and only if S is equipollent to one of its proper subsets. As we have seen in Section 4, this is exactly the definition of a Dedekind-infinite set. This means that even Brouwer uses the Denumerable Axiom of Choice implicitly.

To see how the Generalized Continuum Hypothesis works, we shall introduce the function ℶ. The letter ℶ (beth) is the second letter of the Hebrew alphabet.

$$\beth_0 = \aleph_0,$$

$$\beth_{\alpha+1} = 2^{\beth_\alpha},$$

$$\beth_\lambda = \bigcup_{\alpha < \lambda} \beth_\alpha, \text{ where } \lambda \text{ is a limit ordinal.}$$

This definition makes sense only if we assume the Axiom of Choice because only in the presence of the Axiom of Choice, which implies the Well-Ordering Theorem, every set can

be well-ordered and all cardinals are ordinals. For all $\alpha$, $\beth_\alpha \geq \aleph_\alpha$, since $\beth_\alpha \geq 2^{\aleph_{\alpha-1}} \geq \aleph_\alpha$. In particular, if the Generalized Continuum Hypothesis holds, then $\beth_\alpha = \aleph_\alpha$.

9 Brouwer [1999], in Jacquette (ed) (2002), p. 271-4.

Also, we shall see that under the Generalized Continuum Hypothesis an inaccessible cardinal is the first weakly inaccessible ordinal. An ordinal $\alpha$ is called weakly inaccessible if $\alpha$ is a limit cardinal $\aleph_\beta$ for a limit ordinal $\beta$. We can get the concept of an inaccessible cardinal stronger than that of a weakly inaccessible cardinal by replacing the moderately increasing sequence from $\aleph_\beta$ to $\aleph_{\beta+1}$ by the exponentially and thus more rapidly increasing sequence from $\aleph_\beta$ to $2^{\aleph_\beta}$. Then we can ask how big an inaccessible cardinal is. Since an inaccessible cardinal is stronger than a weakly inaccessible cardinal, it is at least as big as the first weakly inaccessible ordinal. If we assume the Generalized Continuum Hypothesis, since $2^{\aleph_\beta} = \aleph_{\beta+1}$, an inaccessible cardinal is the first weakly inaccessible ordinal.

1.6 Fictionalism or Instrumentalism

I believe that a deep-rooted and far-reaching topic in philosophy of mathematics is the debate between those who claim mathematical objects to exist over and above space and time (Platonic realism) and those who take them to be constructed within space and time (Constructivism). Indeed there is a fictionalist or instrumentalist account of mathematical objects, but I don't believe that fictionalism or instrumentalism is a good account of mathematical objects. According to fictionalism, mathematical statements are simply false, whereas according to instrumentalism, mathematical statements are neither true nor false. The difference between fictionalism and instrumentalism is only in the letter but not in the spirit. Philosophers of this sort attempt to explain the usefulness of mathematics in natural science by means of the conservation theorem. This means that the mathematical theory preserves the truth of the scientific theory, but facilitates the deductions which could otherwise be made only at greater length and with greater difficulty. A mathematical object plays a role like a catalyst in chemistry, that is, a substance facilitating a chemical reaction while itself remaining unchanged.

Fictionalists make their claim by refuting the indispensability argument. Roughly speaking, indispensabilists accept the existence of mathematical objects, insofar as those mathematical objects are indispensable to explain natural sciences. So fictionalists attempt

to show that mathematical objects are dispensable to explain natural sciences. But we have to note that one can reach the fictionalist conclusion by refuting the indispensability argument only if the indispensability argument is the most promising argument for mathematical realism. If there is a better argument for mathematical realism than the indispensability argument, fictionalists will have much more work to do in order to deny the existence of mathematical objects. So we shall examine the Quine/Putnam indispensability argument in detail.

The Quine/Putnam indispensability argument aims to establish a realm of mathematical objects by showing that if a scientific theory is accepted as true, then any mathematical theory which is indispensable to formulate that scientific theory must also be accepted as true.

[Q]uantification over mathematical entities is indispensable for science, both formal and physical; therefore we should accept such quantification; but this commits us to accepting the existence of the mathematical entities in question. This type of argument stems, of course, from Quine, who has for years stressed both the indispensability of quantification over mathematical entities and the intellectual dishonesty of denying the existence of what one daily presupposes.10

We must notice that Quine and Putnam are claiming not only that the truth of mathematics is presupposed by its use in science, but also that the mathematics employed in our best scientific theories enjoys empirical support. The upshot of the Quine/Putnam indispensability argument is that the mathematics employed in a scientific theory is confirmed indirectly by the confirming evidence for the scientific theory in which it is contained. According to Quine, just as we accept the existence of molecules, atoms, and quarks if by so doing we have the best scientific theory that organizes and explains our experience, so we accept the existence of mathematical objects. Putnam stresses that scientific theories cannot even be formulated without the use of mathematics. Physical laws, such as Newton's law of gravitation, are formulated using equations. Thus, Putnam claims that they cannot be stated in a nominalistic language, that is, one in which no reference is made to numbers, functions, sets, etc. If this is the case, scientific theories refer to mathematical

objects and so we cannot accept our best scientific theories without accepting the existence of mathematical objects. Putnam says, "mathematics and physics are integrated in such a way that it is not possible to be a realist with respect to physical theory and a nominalist with respect to mathematical theory."11

The indispensability argument is supported by the idea that mathematical objects are on a par with physical objects. Quine in "Two Dogmas of Empiricism" maintains that our statements about the external world face the tribunal of sense experience not individually but only as a corporate body. This means that logical reflection and sense experience together shape the total theory. Quine introduces a policy of "minimum mutilation" in which we revise less central beliefs rather than more central ones. The logical and mathematical beliefs are the most central beliefs. This is why we are seldom tempted to revise them in the light of experience. According to Quine, however, the logical and mathematical beliefs do not enjoy some special sort of nonempirical justification. The logical and mathematical beliefs differ just in degree from the empirical beliefs. Thus, our belief in the validity of modus tollens is just more central than our vernacular beliefs. Quine claims that the logical and mathematical beliefs are central because they apply to a lot of situations and play an important role in organizing how we think about these situations. Even if modus tollens is central, however, I don't think that we could say that every theorem in pure mathematics is like this. A result in some recondite area of algebraic topology, for instance, might play little or no general role in organizing how we think about the world. Likewise, the parts of mathematics, such as advanced set theory, that go beyond this role are not accepted as true.

The drawback of the indispensability argument is that it conflicts with the actual practice of mathematics. The history of mathematics after the nineteenth century shows how mathematics separated and developed itself independently from natural sciences and took its own course. I think that, when we say that mathematics is indispensable, it's very important to specify whether mathematics is indispensable to natural sciences or to mathematics itself, especially considering the autonomous developments of actual practice of mathematics after the

10 Putnam, Mathematics, Matter and Method, p. 347.
11 Putnam, Mathematics, Matter and Method, p. 74.

nineteenth century. If we interpret it as indispensable to mathematics itself, such as the self-organization of mathematics, it amounts to much the same thing as the Platonistic claim that a consistent mathematical theory describes at least a part of the mathematical universe. But in this case it seems to me that indispensability is not the best way to represent the characteristic feature of mathematical objects. So if indispensabilists have something to say different from Platonists, we have to interpret it as indispensable to natural sciences. For instance, if indispensabilists accepted the existence of mathematical objects which are indispensable to mathematics itself, since the Axiom of Choice is indispensable to mathematics itself, especially to the axiomatization of Cantorian set theory, they would have to accept the Axiom of Choice. Against Platonists, however, indispensabilists reject the Axiom of Choice in its own right. If indispensabilists accept the existence of mathematical objects which are indispensable to natural sciences, since the Axiom of Choice is dispensable to natural sciences, they can reject the Axiom of Choice as required. So when we use the indispensability argument to justify the ontological status of mathematical objects, we have to make it clear that mathematics is indispensable to natural sciences.

It might be objected that, even if indispensabilists accepted the existence of mathematical objects which are indispensable to mathematics itself, they could differentiate themselves from Platonists in the sense that, as we shall see later, the Axiom of Choice is dispensable to prove the Banach-Tarski Paradox. But I shall claim that, even if the Banach-Tarski Paradox can be reformulated without the Axiom of Choice, it does not necessarily deal a blow to the Platonists. I shall ask whether or not the proof without the Axiom of Choice depends on extremely complex or ad hoc principles, compared with the proof with the Axiom of Choice. If by invoking the Axiom of Choice the Banach-Tarski Paradox can be proved in a simpler, more systematic and more unified way, I believe the proof with the Axiom of Choice reflects the fact of the matter rather than the proof without the Axiom of Choice.

In any case, I don't believe that the indispensability argument is the most promising argument for mathematical realism. If we believe mathematical theories applicable to natural sciences, by extension we should believe mathematical theories not applicable to natural sciences. For instance, if we believe a weaker form of the Axiom of


Choice, there is no good reason to disbelieve the full Axiom of Choice. In Chapter 6, I shall submit the argument for mathematical realism I believe is the best. So even if fictionalists succeed in the nominalization of mathematical objects in natural sciences, since there is a better argument for mathematical realism, I don't believe that fictionalism or instrumentalism is a correct account of mathematical objects.

Conclusion

The main problem with the Axiom of Choice concerns the issue of whether or not infinitely many arbitrary choices should be accepted in mathematics. The Platonists admit the possibility of making a set consisting of indefinable members of a certain kind. On the other hand, the constructivists allow only the existence of sets consisting of members that are specifiable by a finite number of steps. But the Axiom of Choice largely contributed to the systematization of Cantorian set theory. It is interesting to note that even some of the opponents of the Axiom of Choice used it implicitly. For instance, though Russell was skeptical of the Well-Ordering Theorem and the Trichotomy of Cardinals, he used the proposition that every infinite set has a denumerable subset in order to prove that a set is Dedekind-finite if and only if it is finite. But the proof of this proposition makes essential use of the Denumerable Axiom of Choice. This is a good example of the deductive power of the Axiom of Choice. Mathematics, then, is severely curtailed if we reject the Axiom of Choice. In light of this, I provisionally conclude that the debate over the Axiom of Choice favors Platonic realism.


CHAPTER 2

LEBESGUE’S THEORY OF MEASURE

Introduction

In this chapter, I shall trace the theory of large cardinals back to its origin, Lebesgue's theory of measure, and claim that the non-constructive nature of the Lebesgue measure lies in the notion of σ-additivity (or, more generally, κ-additivity). To that aim, in the first half of the chapter, we shall see that the Lebesgue integral based upon the Lebesgue measure was devised in attempts to solve the problems with the Riemann integral based upon Jordan's content. The Lebesgue integral enabled the integration of functions that are not Riemann integrable, and also made much improvement on Riemann's theory of convergence properties. In the second half of this chapter, we shall investigate how Lebesgue's theory of measure is applied to the theory of large cardinals. The cogent relationship between Lebesgue's theory of measure and the theory of large cardinals can be detected in the theorem to the effect that if there exist measurable cardinals, they are (strongly) inaccessible. Finally, I shall point out that the non-constructive nature of the Lebesgue measure as shown above creates an irreconcilable tension with Lebesgue's skeptical attitude toward the Axiom of Choice.

2.1 Lebesgue's Theory of Integration

We can see the nature of the Riemann integral in the method to find the area bounded by a continuous function $f(x)$ and the x-axis. The Riemann integral involves partitioning the domain of $f(x)$ and approximating $f(x)$ by means of the upper and lower step functions bracketing $f(x)$ from without and within respectively. A partition $P$ of $[a, b]$ is a set $\{a_0, a_1, \ldots, a_n\}$ such that

$$a = a_0 < a_1 < \cdots < a_n = b.$$

Let $S_i = \{x \mid a_{i-1} \leq x < a_i\}$ $(i = 1, 2, \ldots, n)$.

$\chi_{S_i}(x)$ is the characteristic function of the set $S_i$, defined by

$$\chi_{S_i}(x) = \begin{cases} 1 & \text{if } x \in S_i \\ 0 & \text{if } x \notin S_i \end{cases}$$

Also, let $M_i = \sup_{S_i} f(x)$ and $m_i = \inf_{S_i} f(x)$. Then, the upper step function

$$\Phi(x) = M_1\chi_{S_1} + M_2\chi_{S_2} + \cdots + M_n\chi_{S_n} = \sum_{i=1}^{n} M_i\chi_{S_i}.$$

Similarly, the lower step function

$$\varphi(x) = m_1\chi_{S_1} + m_2\chi_{S_2} + \cdots + m_n\chi_{S_n} = \sum_{i=1}^{n} m_i\chi_{S_i}.$$

Now, the upper sum $U(f, P)$ is the area bounded by the upper step function $\Phi(x)$ and the x-axis:

$$U(f, P) = M_1(a_1 - a_0) + M_2(a_2 - a_1) + \cdots + M_n(a_n - a_{n-1}) = \sum_{i=1}^{n} M_i(a_i - a_{i-1}).$$

In the same way, the lower sum $L(f, P)$ is the area bounded by the lower step function $\varphi(x)$ and the x-axis:

$$L(f, P) = m_1(a_1 - a_0) + m_2(a_2 - a_1) + \cdots + m_n(a_n - a_{n-1}) = \sum_{i=1}^{n} m_i(a_i - a_{i-1}).$$

As $n \to \infty$, we have the upper integral

$$\overline{\int_a^b} f = \inf\{U(f, P)\}$$

and the lower integral

$$\underline{\int_a^b} f = \sup\{L(f, P)\}.$$

Finally, $f$ is Riemann integrable iff $\overline{\int_a^b} f = \underline{\int_a^b} f$.

Although the Riemann integral will do for most practical uses, there are some problems with the Riemann integral when it comes to advanced fields of mathematics. First of all, there exist a lot of functions that are not Riemann integrable. Secondly, the convergence properties of the Riemann integral are too restrictive. The Lebesgue integral extends the range of integrable functions, taking over the nice properties of the Riemann integral. The turn from the Riemann to the Lebesgue integral could be characterized as the one from constructive to non-constructive mathematics, as it were.

In order to overcome the difficulties with the Riemann integral, Lebesgue substituted the Lebesgue measure for Jordan's content that provided the foundation for the Riemann integral. Earlier concepts of measure such as Jordan's content were only finitely additive in the sense

that $m(A \cup B) = m(A) + m(B)$ for any two disjoint measurable sets $A$, $B$, and these led to more limited theories of integration. On the other hand, the Lebesgue measure is σ-additive (countably additive) in the sense that

$$m\Big(\bigcup_{\xi} X_\xi\Big) = \sum_{\xi} m(X_\xi)$$

for any countably many pairwise disjoint measurable sets $X_\xi$. The measure defined in this way fits our intuition that the measure should be a length in one dimension, an area in two dimensions, and a volume in three dimensions. The upshot of this definition is that though the countable union of sets with measure zero is again of measure zero, an uncountable union of sets with measure zero can have positive measure. Due to the property of σ-additivity of the Lebesgue measure, the Lebesgue integral based on the Lebesgue measure is more powerful than the Riemann integral based on Jordan's content.

In order to see how significant the notion of σ-additivity is, it is useful to reconsider Zeno's argument and the fifth-century Atomists' reaction to it.12 According to Zeno's argument, if finite extension is infinitely divisible, either the resulting least parts have no size or they have some positive size. If they have no size, however, when put together they result in something with no size. If they have a positive size, no matter how small it may be, when an infinite number of them are put together, the result is something of infinite size. Either way, we cannot form the original object by reassembling its parts. So, the Atomists avoid this argument by claiming that bodies are ultimately composed of indivisibles. To Zeno's argument, Lebesgue would reply that a countable set of points with measure zero remains of measure zero, and only an uncountable set of points with measure zero can have positive measure.

We define the Lebesgue measure more precisely. Let $E$ be the unit interval $[0, 1]$. The outer measure can be obtained by approximating the set from without by open sets. That is, the outer measure of $A$ is the infimum of the measures of the open sets containing $A$. In symbols,

12 For this, see McKirahan, Philosophy before Socrates: an Introduction with Texts and Commentary, p. 310.


$m_e(A) = \inf\{m(G) \mid A \subseteq G,\ G \text{ open set}\}$.13

On the other hand, the inner measure can be obtained by approximating the set from within by compact (i.e., closed and bounded) sets.14 That is, the inner measure of $A$ is the supremum of the measures of the compact sets contained in $A$. In symbols,

$m_i(A) = \sup\{m(K) \mid A \supseteq K,\ K \text{ compact set}\} = 1 - m_e(E \setminus A)$

$E \setminus A$ is the difference of $E$ and $A$. Most importantly, a set $A \subseteq E$ is Lebesgue measurable if $m_e(A) = m_i(A)$. A set with Lebesgue measure zero is called a null set. Here we must be careful not to confuse 'empty set' with 'null set' in this sense. Actually, since we define the empty set to be of measure zero, the empty set is a null set. But a set containing just a single point, that is, a singleton, is a null set as well. Due to the property of σ-additivity of the Lebesgue measure, any countable set of points is also a null set. Only an uncountable set can have positive measure. But we have to note that some uncountable sets of points can be null sets. A case in point is the Cantor set.

The Cantor set is constructed as follows: Take the unit interval $[0, 1]$. Divide it into three equal intervals and remove the middle open third, leaving the set $C_1$.

$$C_1 = [0, 1/3] \cup [2/3, 1]$$

Then, divide each of the two intervals into three equal intervals and remove the middle open third of each interval, leaving the set $C_2$.

$$C_2 = [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1]$$

…

At the nth step, we get the set $C_n$.

$$C_n = [0, 1/3^n] \cup [2/3^n, 1/3^{n-1}] \cup \cdots \cup [1 - 1/3^{n-1}, 1 - 2/3^n] \cup [1 - 1/3^n, 1]$$

Repeat this process again and again. After ω steps, we get the Cantor set $C$.
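The construction just described can be sketched in code. What follows is only a minimal illustration in Python, not part of the original argument: the function names `cantor_step`, `cantor_level` and `total_length` are hypothetical, and exact rational arithmetic is assumed so that the endpoints can be compared directly with those of C1 and C2 above.

```python
from fractions import Fraction


def cantor_step(intervals):
    """Remove the open middle third of every closed interval (lo, hi)."""
    result = []
    for lo, hi in intervals:
        third = (hi - lo) / 3
        result.append((lo, lo + third))   # left closed third is kept
        result.append((hi - third, hi))   # right closed third is kept
    return result


def cantor_level(n):
    """Return the closed intervals whose union is C_n, starting from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals


def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(hi - lo for lo, hi in intervals)


if __name__ == "__main__":
    for n in range(1, 4):
        level = cantor_level(n)
        print(n, len(level), total_length(level))
```

Each step doubles the number of intervals and divides their length by three, which is the only fact about the construction the sketch relies on.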

13 A set $S$ is open if, for any $x \in S$, $S$ contains an open ball of center $x$. In symbols, $\forall x \in S\ \exists \varepsilon > 0\ ((x - \varepsilon, x + \varepsilon) \subseteq S)$.
14 A set $S$ is closed if its complement $S^C$ is open. A set $S$ is bounded if it is contained in some ball. Note that a closed set is not necessarily bounded. According to this definition, there are closed and unbounded sets. For instance, an infinite half-open interval $(-\infty, 1]$ is closed because its complement $(1, \infty)$ is open, but unbounded because it is an infinite interval.


Now, we prove that the Cantor set is of measure zero. As we can clearly see from the process of constructing the Cantor set, at the nth step we get $2^n$ many intervals of length $1/3^n$. Therefore, $C_n$ has the total length $2^n \times 1/3^n = (2/3)^n$. Since $(2/3)^n \to 0$ as $n \to \infty$, the Cantor set is of measure zero. To see the unique nature of the Cantor set, for instance, we shall divide the unit interval $[0, 1]$ into two equal intervals and remove the one half, leaving the other half. Indeed, after ω steps, we get the set $A$ of measure zero since $(1/2)^n \to 0$ as $n \to \infty$. Unlike the Cantor set $C$, however, $A$ converges to a single point, so it is no wonder that $A$ is of measure zero.

It remains to show that the Cantor set is uncountable. The key to this proof is to see that the Cantor set is the set of reals in $[0, 1]$ that can be expressed, in the ternary system, only by 0 and 2 (i.e., without 1). This is the reason why the Cantor set is often called the Cantor ternary set. In general, when an integer $N$ in the decimal system has a ternary expansion:

$$\cdots + a_2 3^2 + a_1 3^1 + a_0 3^0 \quad (a_n \in \{0, 1, 2\}),$$

$N$ is written, in the ternary system, as

$$\cdots a_2 a_1 a_0.$$

Applying this notation to a decimal $n \in [0, 1]$, when $n$ has a ternary expansion:

$$a_1/3^1 + a_2/3^2 + a_3/3^3 + \cdots \quad (a_n \in \{0, 1, 2\}),$$

$n$ is written, in the ternary system, as

$$0.a_1 a_2 a_3 \cdots$$

For instance, the number expressed by 34 in the decimal system is expressed by 1021 in the ternary system because 34 has a ternary expansion: $1 \cdot 3^3 + 0 \cdot 3^2 + 2 \cdot 3^1 + 1 \cdot 3^0$. Likewise, the number expressed by 0.5 in the decimal system is expressed by 0.111… in the ternary system because 0.5 has a ternary expansion: $1/3^1 + 1/3^2 + 1/3^3 + \cdots$. A possible ambiguity is that the end points can be written in two ways. For instance, 1/3 can be written as both 0.1 and 0.022… and 2/3 can be written as both 0.2 and 0.122… But this does not cause much trouble, considering similar cases encountered in the decimal system, such as $1 = 0.999\ldots$ We shall adopt the rule according to which:

If the last non-zero place is 1, we choose the non-terminating expression.


Otherwise (i.e., if the last non-zero place is 2), we choose the terminating expression.

Then, 1/3 is written as 0.022… and 2/3 is written as 0.2. Using the terminology of the ternary system, we can put the process of constructing the Cantor set more simply. Every number in $[0, 1]$, expressed in the ternary system, is of the form:

$$0.a_1 a_2 a_3 \cdots \quad (a_n \in \{0, 1, 2\})$$

The construction of the Cantor set in the ternary system is as follows:

$x \in C_1$ iff $a_1 \in \{0, 2\}$

$x \in C_2$ iff $a_1 \in \{0, 2\}$ & $a_2 \in \{0, 2\}$

…

$x \in C_n$ iff $a_1 \in \{0, 2\}$ & $a_2 \in \{0, 2\}$ & … & $a_n \in \{0, 2\}$

Therefore, the Cantor set consists of reals in $[0, 1]$ expressed, in the ternary system, by 0 and 2 (i.e., without 1) as stated above. Now, from the diagonal argument, we can show that the Cantor set is uncountable. Suppose that the Cantor set is put in one-to-one correspondence with natural numbers as follows:

$1 \leftrightarrow 0.a_{11}a_{12}a_{13}\cdots$

$2 \leftrightarrow 0.a_{21}a_{22}a_{23}\cdots$

…

$n \leftrightarrow 0.a_{n1}a_{n2}a_{n3}\cdots$

As we have seen, each $a_{ij}$ is 0 or 2. So, let $b_n = 2$ if $a_{nn} = 0$ and $b_n = 0$ if $a_{nn} = 2$. Then, we get a number

$$0.b_1 b_2 b_3 \cdots$$

This is exactly the number different in the nth place from that corresponding to $n$, and therefore it cannot be found in the list above, although its digits are all 0 or 2 and so it belongs to the Cantor set. Hence the Cantor set is uncountable. This shows that even though it is uncountable, the Cantor set is so scattered that it is negligible from the Lebesgue measure-theoretical point of view.

Now, we shall discuss what impact the Lebesgue measure has on the Lebesgue

integral. The upshot of the Lebesgue integral is that the values of a function $f(x)$ at points $x$ that form a null set don't affect the value of the integral $\int f(x)\,dx$. The Lebesgue integral can give the explicit answer to the question of how many points can be removed without altering the value of the integral. The answer is as many points as form a null set. In other words, points that do not form a null set determine the value of the integral. We already know that due to the property of σ-additivity of the Lebesgue measure, a countable set of points forms a null set. Therefore, most importantly, even if the values of a function $f(x)$ are altered at a countable set of points, the value of the integral remains the same. This is another way to show that the set of reals is not countable. This also tells us that the boundaries make no difference to the area. If some property holds except on a null set, the property is said to hold almost everywhere (abbreviated a.e.), or presque partout (abbreviated p.p.). If two different functions satisfy $f(x) = g(x)$ almost everywhere, we cannot distinguish $f(x)$ from $g(x)$. Then, the Lebesgue theory tells us that we can regard these two functions as being virtually equal.

The Lebesgue integral made possible the integration of functions that are not Riemann integrable. The characteristic function $\chi_Q(x)$ of the set $Q$ of rationals is an example of functions that are not Riemann integrable but Lebesgue integrable. The reason why $\chi_Q(x)$ is not Riemann integrable is as follows. No matter how small the partitions

$(x_0, x_1), (x_1, x_2), \ldots, (x_{n-1}, x_n)$

of $[0, 1]$ are, rationals and irrationals coexist in the same partition. Therefore, it's possible to choose a rational from any partition and then an irrational from any partition in such a way that the upper and lower step functions bracketing $\chi_Q(x)$ cannot coincide with each other.

The stock-in-trade of the Lebesgue integral is to approximate $f(x)$ not by means of 'vertical strips,' i.e., the upper and lower step functions, but by means of 'horizontal strips.' The Lebesgue integral involves a partition of the range of $f(x)$ rather than a partition of the domain as for the Riemann integral. Thus, in the Lebesgue integral we partition the range of $f(x)$ and approximate $f(x)$ by means of the upper and lower step functions bracketing $f(x)$ from without and within respectively. A partition $P$ of $[a, b]$ is a set $\{a_0, a_1, \ldots, a_n\}$ such

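that

$$a = a_0 < a_1 < \cdots < a_n = b.$$

Let $S_i = \{x \mid a_{i-1} \leq f(x) < a_i\}$ $(i = 1, 2, \ldots, n)$. Then, the upper function

$$\Phi(x) = a_1\chi_{S_1} + a_2\chi_{S_2} + \cdots + a_n\chi_{S_n} = \sum_{i=1}^{n} a_i\chi_{S_i}.$$

This function is called a simple function (or generalized step function). Similarly, the lower function

$$\varphi(x) = a_0\chi_{S_1} + a_1\chi_{S_2} + \cdots + a_{n-1}\chi_{S_n} = \sum_{i=1}^{n} a_{i-1}\chi_{S_i}.$$

Now, the upper sum

$$U(f, P) = a_1 m(S_1) + a_2 m(S_2) + \cdots + a_n m(S_n) = \sum_{i=1}^{n} a_i m(S_i).$$

15 In the same way, the lower sum

$$L(f, P) = a_0 m(S_1) + a_1 m(S_2) + \cdots + a_{n-1} m(S_n) = \sum_{i=1}^{n} a_{i-1} m(S_i).$$

As with the Riemann integral, $f$ is Lebesgue integrable iff the infimum of the upper sums is the supremum of the lower sums. Therefore, $\chi_Q(x)$ is Lebesgue integrable on $R$, and

$$\int_R \chi_Q(x)\,dx = 1 \times m(Q) + 0 \times m(R \setminus Q) = 0$$

because $Q$ is a null set, so $m(Q) = 0$.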
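The contrast between vertical and horizontal strips can be made concrete with a small numerical sketch. The Python code below is only an illustration under stated assumptions, not a general implementation: the function names `riemann_sums` and `lebesgue_sums` are hypothetical, the test function f(x) = x² on [0, 1] is chosen for convenience, and f is assumed to be increasing so that each level set S_i is an interval whose Lebesgue measure is simply its length.

```python
import math


def riemann_sums(f, a, b, n):
    """Lower and upper sums over n equal vertical strips of the domain [a, b].

    Assumes f is increasing, so the infimum on each strip is attained at the
    left endpoint and the supremum at the right endpoint.
    """
    width = (b - a) / n
    lower = sum(f(a + i * width) * width for i in range(n))
    upper = sum(f(a + (i + 1) * width) * width for i in range(n))
    return lower, upper


def lebesgue_sums(f_inverse, top, n):
    """Lower and upper sums over n equal horizontal strips of the range [0, top].

    Assumes f is increasing with inverse f_inverse, so that
    S_i = {x | a_{i-1} <= f(x) < a_i} is the interval
    [f_inverse(a_{i-1}), f_inverse(a_i)) and m(S_i) is its length.
    """
    levels = [top * i / n for i in range(n + 1)]
    lower = 0.0
    upper = 0.0
    for i in range(1, n + 1):
        measure = f_inverse(levels[i]) - f_inverse(levels[i - 1])
        lower += levels[i - 1] * measure   # a_{i-1} * m(S_i)
        upper += levels[i] * measure       # a_i * m(S_i)
    return lower, upper


if __name__ == "__main__":
    print(riemann_sums(lambda x: x * x, 0.0, 1.0, 1000))
    print(lebesgue_sums(math.sqrt, 1.0, 1000))
```

For a function as well behaved as this one, both pairs of sums bracket the same value; the difference between the two integrals shows up only for functions such as the characteristic function of the rationals, where the level sets are measurable but no vertical partition separates the values.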

15 $m(S_1)$ stands for the Lebesgue measure of $S_1$.


Figure 2: Riemann Integral vs. Lebesgue Integral.


Also, there are two major convergence theorems involving the Lebesgue integral: the Monotone Convergence Theorem and the Lebesgue Dominated Convergence Theorem, neither of which is true with regard to Riemann integrable functions. Both are concerned with when the limit of the integrals is the integral of the limit, that is, when we can interchange the order of the limit and the integral.

(Monotone Convergence Theorem)

Let $\{f_n(x)\}$ be a monotonically increasing sequence of measurable functions

$0 \le f_1(x) \le f_2(x) \le \dots$

such that $f_n(x)$ converges (pointwise) to $f(x)$.

Then, $\lim_{n\to\infty} \int f_n(x)\,dx = \int \left(\lim_{n\to\infty} f_n(x)\right) dx = \int f(x)\,dx$.

(Lebesgue Dominated Convergence Theorem)

Let $\{f_n(x)\}$ be a sequence of measurable functions such that $\lim_{n\to\infty} f_n(x) = f(x)$. If the sequence is "dominated" by an integrable function $g(x)$ in the sense that

$|f_n(x)| \le g(x)$,

then $\lim_{n\to\infty} \int f_n(x)\,dx = \int \left(\lim_{n\to\infty} f_n(x)\right) dx = \int f(x)\,dx$.
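To see why some hypothesis such as monotonicity or domination is needed, it may help to look at a standard counterexample that is not taken from the text: the "spike" functions $f_n(x) = n$ on $(0, 1/n)$ and $0$ elsewhere. The sketch below uses hypothetical helper names and exact rational arithmetic; the integral is computed from the height and width of the spike rather than by numerical quadrature.

```python
from fractions import Fraction

def f(n, x):
    """Spike function: f_n(x) = n on the open interval (0, 1/n), else 0."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_of_f(n):
    """Integral of f_n over [0, 1]: height n times width 1/n."""
    return n * Fraction(1, n)

# The integrals stay at 1 for every n ...
print([integral_of_f(n) for n in (1, 10, 100)])

# ... yet at each fixed x the values are eventually 0, so the pointwise
# limit is the zero function, whose integral is 0.
x = Fraction(1, 7)
print([f(n, x) for n in (1, 5, 7, 8, 100)])
```

Here the limit of the integrals is 1 while the integral of the limit is 0. The sequence is not monotonically increasing, and no integrable $g$ dominates it, since the supremum of the $f_n(x)$ grows roughly like $1/x$ near $0$; so neither theorem applies.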

Here I need to clarify the distinction between uniform convergence and pointwise convergence. The sequence of functions $f_1(x), f_2(x), \dots$ is said to converge uniformly to $f(x)$ if16

$\forall \varepsilon > 0\ \exists N\ \forall x$ (if $n \ge N$, then $|f(x) - f_n(x)| < \varepsilon$).

On the other hand, the sequence of functions $f_1(x), f_2(x), \dots$ is said to converge pointwise to $f(x)$ if

$\forall \varepsilon > 0\ \forall x\ \exists N$ (if $n \ge N$, then $|f(x) - f_n(x)| < \varepsilon$).

We note that the order of $\forall x$ and $\exists N$ is reversed. In uniform convergence the sequence converges to $f(x)$ for all $x$ simultaneously, whereas in pointwise convergence the sequence could converge to $f(x)$ in a different way at each $x$. This is the reason why $\exists N$ precedes $\forall x$ in uniform convergence while $\forall x$ precedes $\exists N$ in pointwise convergence. Obviously, uniform convergence implies pointwise convergence, but this implication cannot be reversed. Uniform convergence is nice in the sense that if each function of the sequence $f_1(x), f_2(x), \dots$ has some property such as continuity or measurability, so does the limit function $f(x)$.

16 This would be easier to understand if interpreted in such a way that $\forall \varepsilon > 0\ \exists N\ \forall x$ ($n \ge N$ is large enough so that $|f(x) - f_n(x)| < \varepsilon$).

We shall take a couple of examples to see in concreto the difference between uniform convergence and pointwise convergence. The sequence of functions $f_n(x) = (1 - 1/n)x$ ($x \in [0, 1]$) converges uniformly to the function $f(x) = x$ on $[0, 1]$. To see this, for a given $\varepsilon$ choose $N > 1/\varepsilon$. Then, for all $n \ge N$ and for all $x \in [0, 1]$, $|f(x) - f_n(x)| = x/n \le 1/n < \varepsilon$, as required. On the other hand, the sequence of functions $f_n(x) = x^n$ ($x \in [0, 1]$) converges pointwise to the function $f(x) = 0$ if $0 \le x < 1$ and $f(x) = 1$ if $x = 1$. This convergence is not uniform because when $\varepsilon = 1/2$, say, then, no matter how large $n$ is, $|f(x) - f_n(x)| \ge 1/2$ for $x \in [1/\sqrt[n]{2}, 1)$. Notice that in the former case each $f_n(x)$ and $f(x)$ are continuous on $[0, 1]$, while in the latter case each $f_n(x)$ is indeed continuous on $[0, 1]$ but $f(x)$ is discontinuous at $x = 1$. These examples show precisely that in uniform convergence $f(x)$ inherits a nice property of each $f_n(x)$, i.e., continuity.

The significance of the Lebesgue integral, not least of Lebesgue's Convergence Theorems, is that it provides a foundation for functional analysis. Functional analysis is a branch of mathematics that discusses Banach spaces and Hilbert spaces in a rigorous manner, and is applied to the theory of integral equations or, beyond mathematics, to quantum physics. Since functional analysis has to deal with discontinuous functions, a sequence of functions does not necessarily converge uniformly. Therefore, Lebesgue's Convergence Theorems, which make it possible to interchange the order of the limit and the integral not only in uniform convergence but also in pointwise convergence, play a significant role in functional analysis.
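The two examples of convergence discussed above can be probed numerically. The following is a rough sketch, with hypothetical function names, that samples $|f(x) - f_n(x)|$ on a finite grid in $[0, 1)$; a finite grid can only suggest the behaviour of the supremum, so this illustrates the distinction rather than proving it.

```python
def sup_gap(f_n, f, n, grid):
    """Largest value of |f(x) - f_n(x)| over the sample points in grid."""
    return max(abs(f(x) - f_n(n, x)) for x in grid)

# Sample points in [0, 1); x = 1 is left out so that the second
# example is probed only where its limit function equals 0.
grid = [k / 10_000 for k in range(10_000)]

# First example: f_n(x) = (1 - 1/n) x, with limit f(x) = x.
def scaled(n, x):
    return (1 - 1 / n) * x

def identity(x):
    return x

# Second example: f_n(x) = x**n, with limit 0 on [0, 1).
def power(n, x):
    return x ** n

def zero(x):
    return 0.0

for n in (10, 100, 1000):
    print(n, sup_gap(scaled, identity, n, grid), sup_gap(power, zero, n, grid))
```

On this grid the first gap shrinks like $1/n$, whereas the second stays well above $1/2$ for the values of $n$ shown, in line with the choice $\varepsilon = 1/2$ in the argument.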


2.2 Measurable Cardinals

Now we consider how Lebesgue's theory of measure is applied to the theory of large cardinals. Large cardinals are uncountable cardinals that cannot be reached from "below." Actually, mathematicians assume various kinds of large cardinals, e.g., inaccessible, Mahlo, weakly compact, ineffable, and measurable cardinals, listed in increasing order of magnitude. That is, the least weakly compact cardinal, if any, is much bigger than the least Mahlo cardinal, which is, in turn, much bigger than the least inaccessible cardinal. Measurable cardinals are very large cardinals, and the least measurable cardinal, if any, is greater than many weakly compact and even ineffable cardinals. We have the proof that $2^{\aleph_0}$ is not a measurable cardinal. The concept of a measurable cardinal plays a much more major role in the theory of large cardinals than the weakly compact and the ineffable cardinals.

Now we shall see a measurable cardinal in connection with Lebesgue's theory of measure. We begin with the definition of measure on a set $S$. A measure on a set $S$ is a map $m$ from $P(S)$ to $[0, 1]$ such that

(i) $m(\emptyset) = 0$ and $m(S) = 1$
(ii) Monotonicity: If $A \subseteq B$, then $m(A) \le m(B)$
(iii) Non-triviality: $m(\{a\}) = 0$ for $a \in S$
(iv) $\sigma$-additivity: If the $X_\xi$'s ($\xi < \omega$) are pairwise disjoint, then $m(\bigcup_{\xi} X_\xi) = \sum_{\xi} m(X_\xi)$.

As to the Lebesgue measure, it is natural to ask whether or not there is a non-trivial translation-invariant countably additive measure on all subsets of the reals. In 1905 Vitali showed the existence of non-Lebesgue measurable sets of reals by using the Axiom of Choice. The concept of a measurable cardinal arose in response to Vitali's construction of a non-Lebesgue measurable set of real numbers. (1) If the measure does not need to be translation-invariant, is there a non-trivial countably additive measure on all sets of real numbers? (2) Is there such a measure for all subsets of some set $S$? These questions led to the theory of large cardinal numbers, which had a great impact both on pure set theory and on descriptive set theory. We define the notion of $\kappa$-additivity by generalizing the notion of $\sigma$-additivity:

If $\kappa > \aleph_0$ is regular, $\lambda < \kappa$, and the $X_\xi$'s are pairwise disjoint for any $\xi < \lambda$, then $m(\bigcup_{\xi<\lambda} X_\xi) = \sum_{\xi<\lambda} m(X_\xi)$.

Though in general $m(X) \ne \sum_{x \in X} m(\{x\})$ because measure zero is assigned to singletons, the upshot of this definition is that the union of fewer than $\kappa$ sets with measure zero remains of measure zero. We are now ready to define measurable cardinals:

$\kappa > \aleph_0$ is measurable if and only if $\kappa$ has a two-valued, $\kappa$-additive measure.17

Intuitively, a cardinal $\kappa$ is measurable iff $\kappa$ is an uncountable cardinal and the union of fewer than $\kappa$ sets of measure zero is of measure zero. It is worth noting that we put $\kappa > \aleph_0$ in the definition. If it were not for this condition, $\aleph_0$ would be measurable. The cardinal 2 would also be measurable. In this case, according to the conditions that a measure must satisfy, $m(\emptyset) = 0$, $m(\{0\}) = 0$, $m(\{1\}) = 0$, and $m(\{0, 1\}) = 1$. Obviously, $m$ is a 2-additive measure on 2. But the core of the definition of measurable cardinals is that the union of countably many sets of measure zero is again of measure zero. Therefore, cardinals $\kappa \le \aleph_0$ should not be considered as measurable. This is why we put $\kappa > \aleph_0$ in the definition.

We have to note that there is a slippage between the naming of $\sigma$-additivity and the naming of $\kappa$-additivity. For $\sigma$-additivity is so called by paying attention to the point up to which the union of sets with measure zero remains of measure zero (therefore, an uncountable union of sets with measure zero can have positive measure), whereas $\kappa$-additivity is so called by paying attention to the point beyond which the union of sets with measure zero can have positive measure (therefore, a union of fewer than $\kappa$ sets with measure zero remains of measure zero). Thus, $\sigma$-additivity is the same as $\aleph_1$-additivity.18

Since $\kappa > \aleph_0$, $\aleph_0$ is a non-measurable cardinal. Then, a question arises: Is $2^{\aleph_0}$ also a non-measurable cardinal? We can show that the answer is yes by reductio ad absurdum:

17 Since we put non-triviality into the definition of measure above, we dropped it from the definition of measurable cardinals here. But we could separate non-triviality from the definition of measure, and put it into the definition of measurable cardinals: $\kappa > \aleph_0$ is measurable if and only if $\kappa$ has a two-valued, $\kappa$-additive, non-trivial measure.
18 Fortunately, there arises no ambiguity here because $\aleph_0$ is the first aleph.

If we suppose that $2^{\aleph_0}$ is a measurable cardinal, we shall reach a contradiction. The proof runs as follows:

Suppose that $2^{\aleph_0}$ is a measurable cardinal, that is, we can assign a two-valued, $\kappa$-additive measure (with $\kappa = 2^{\aleph_0}$) to all the subsets of $2^{\aleph_0}$. There are $2^{2^{\aleph_0}}$ many such subsets. Now we take a function $f: \aleph_0 \to \{0, 1\}$. But note that this function is still not the one that assigns a measure to all the subsets of $2^{\aleph_0}$. And we have the set of such functions: ${}^{\omega}2 = \{f_1, f_2, f_3, \dots, f_{2^{\aleph_0}}\}$. Assigning a measure to all the subsets of $2^{\aleph_0}$ boils down to assigning a measure to all the subsets of ${}^{\omega}2$. When we discuss the set of functions, we denote it by ${}^{\omega}2$ in distinction from cardinal exponentiation $2^{\omega}$ in order to avoid confusion.

According to definition (i) of measure above, $m({}^{\omega}2) = 1$. Here we can divide ${}^{\omega}2$ into two disjoint sets: the set of functions satisfying $f(0) = 0$ and the set of functions satisfying $f(0) = 1$. That is,

${}^{\omega}2 = \{f \in {}^{\omega}2 \mid f(0) = 0\} \cup \{f \in {}^{\omega}2 \mid f(0) = 1\}$.

Then, we have to assign measure 1 to either $\{f \in {}^{\omega}2 \mid f(0) = 0\}$ or $\{f \in {}^{\omega}2 \mid f(0) = 1\}$, no matter which it may be, because if both of them were of measure zero, they would not add up to $m({}^{\omega}2) = 1$, contrary to definition (iv) of measure above. Just for the sake of argument, we shall assume that $m(\{f \in {}^{\omega}2 \mid f(0) = 1\}) = 1$. Again, we can divide the set $\{f \in {}^{\omega}2 \mid f(0) = 1\}$ into two disjoint sets: the set of functions satisfying $f(0) = 1 \,\&\, f(1) = 0$ and the set of functions satisfying $f(0) = 1 \,\&\, f(1) = 1$. That is,

$\{f \in {}^{\omega}2 \mid f(0) = 1\} = \{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 0\} \cup \{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1\}$.

For the same reason as before, we have to assign measure 1 to either $\{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 0\}$ or $\{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1\}$. Just for the sake of argument, we shall assume that $m(\{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1\}) = 1$. We let this process go on and on. Here is the upshot of this proof. The Axiom of Choice guarantees that after $\aleph_0$ steps, we get the set which consists of the unique function:

$\{f_n\} = \{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1 \,\&\, \dots\}$


Then we ask: Is $m(\{f_n\}) = 0$ or $1$? According to $\kappa$-additivity, all the sets to which we have assigned measure zero so far cannot add up to measure 1, but $m({}^{\omega}2) = 1$, so $m(\{f_n\}) = m({}^{\omega}2) = 1$. According to definition (iii) of measure above, however, a singleton is of measure zero and the set $\{f_n\}$ is a singleton, so $m(\{f_n\}) = 0$. A contradiction. This completes the proof. This proof also tells us that if ${}^{\omega}2$ has a two-valued, $\kappa$-additive measure, it should be trivial.

It is interesting to capture measurable cardinals in connection with the notion of an ultrafilter. But first we have to define the notion of a filter: A filter on a non-empty set $S$ is a collection $F$ of subsets of $S$ such that for any $A, B \subseteq S$,

(i) $S \in F$ and $\emptyset \notin F$.
(ii) If $A, B \in F$, then $A \cap B \in F$.
(iii) If $A \in F$ and $A \subseteq B$, then $B \in F$ (in words, any set $B$ that contains a member $A$ of the filter $F$ is also in that filter).

A trivial filter is $F = \{S\}$. We define a principal filter. Let $X_0$ be a non-empty subset of $S$. A principal filter is $F = \{X \subseteq S \mid X_0 \subseteq X\}$. This means that there is no infinite regress in that filter. Thus every filter on a finite set is a principal filter. Take as an example the filters on the set $\{0, 1, 2, 3\}$. A trivial filter is $F = \{\{0, 1, 2, 3\}\}$. A filter is $F = \{\{0, 1, 2\}, \{0, 1, 2, 3\}\}$. Another filter is $F = \{\{0, 1\}, \{0, 1, 2\}, \{0, 1, 3\}, \{0, 1, 2, 3\}\}$.

I shall also define the dual notion of a filter, i.e., an ideal: An ideal on a non-empty set $S$ is a collection $I$ of subsets of $S$ such that for any $A, B \subseteq S$,

(i) $\emptyset \in I$ and $S \notin I$.
(ii) If $A, B \in I$, then $A \cup B \in I$.
(iii) If $A \in I$ and $B \subseteq A$, then $B \in I$ (in words, any set $B$ that is contained in a member $A$ of the ideal $I$ is also in that ideal).

There is a remarkable relationship between a filter $F$ and an ideal $I$ on $S$: $I = \{S - X \mid X \in F\}$ or, equivalently, $F = \{S - X \mid X \in I\}$. So, for the set $\{0, 1, 2, 3\}$ mentioned above, an ideal is $I = \{\emptyset, \{3\}\}$. Another ideal is $I = \{\emptyset, \{2\}, \{3\}, \{2, 3\}\}$.
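The finite examples of filters and ideals on $\{0, 1, 2, 3\}$ given above can be checked mechanically. The sketch below is only an illustration: the helper names (`is_filter`, `is_ideal`, `dual`, `principal_filter`) are hypothetical, and the checks simply run through conditions (i) to (iii) of each definition by brute force over the power set.

```python
from itertools import combinations

S = frozenset({0, 1, 2, 3})

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, S):
    """Check conditions (i)-(iii) of the definition of a filter on S."""
    if S not in F or frozenset() in F:
        return False
    if any(A & B not in F for A in F for B in F):
        return False
    # Upward closure: every superset (within S) of a member is a member.
    return all(B in F for A in F for B in powerset(S) if A <= B)

def is_ideal(I, S):
    """Check conditions (i)-(iii) of the definition of an ideal on S."""
    if frozenset() not in I or S in I:
        return False
    if any(A | B not in I for A in I for B in I):
        return False
    # Downward closure: every subset of a member is a member.
    return all(B in I for A in I for B in powerset(A))

def dual(F, S):
    """The collection {S - X : X in F}."""
    return {S - X for X in F}

def principal_filter(X0, S):
    """The principal filter {X subset of S : X0 subset of X}."""
    return {X for X in powerset(S) if X0 <= X}

F1 = principal_filter(frozenset({0, 1, 2}), S)
F2 = principal_filter(frozenset({0, 1}), S)
print(is_filter(F1, S), is_filter(F2, S))
print(sorted(map(sorted, dual(F2, S))))
print(is_ideal(dual(F1, S), S), is_ideal(dual(F2, S), S))
```

Taking the dual of each filter reproduces the two ideals listed in the text, which is one way to confirm the relationship $I = \{S - X \mid X \in F\}$ on a small case.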


We now turn to the notion of an ultrafilter, which is closely related to measurable cardinals. An ultrafilter is a filter $F$ on a set $S$ such that for any $X \subseteq S$ either $X \in F$ or $X^C \in F$, where $X^C$ is the complement of $X$. Again, for the set $\{0, 1, 2, 3\}$ mentioned above, an ultrafilter is $U = \{\{0\}, \{0, 1\}, \{0, 2\}, \{0, 3\}, \{0, 1, 2\}, \{0, 1, 3\}, \{0, 2, 3\}, \{0, 1, 2, 3\}\}$. Tarski's Theorem tells us that every filter can be extended to an ultrafilter. It is known that the proof of Tarski's Theorem uses the Axiom of Choice.

A prime ideal is the dual notion of an ultrafilter. A prime ideal on the set $\{0, 1, 2, 3\}$ is $\{\emptyset, \{1\}, \{2\}, \{3\}, \{1, 2\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}$. The Prime Ideal Theorem, which is the counterpart of Tarski's Theorem for an ultrafilter, tells us that every ideal can be extended to a prime ideal. It is also known that we need the Axiom of Choice in order to prove the Prime Ideal Theorem. The subsets of a set are thus classified into an ultrafilter or a prime ideal.

Measure-theoretically, if $m$ is a two-valued measure, we have to note that an ultrafilter is a collection $U$ of sets to which measure 1 is assigned. That is, $U = \{X \subseteq S \mid m(X) = 1\}$. To characterize measurable cardinals using the notion of an ultrafilter: $\kappa$ is measurable if and only if there exists a $\kappa$-complete non-principal ultrafilter on $\kappa$. In analogy with $\kappa$-additivity, we can give a definition of a $\kappa$-complete filter on $S$:

If $\kappa > \aleph_0$ is regular, $\lambda < \kappa$, and $X_\xi \in F$ for any $\xi < \lambda$, then $\bigcap_{\xi<\lambda} X_\xi \in F$.

Assuming that there exist measurable cardinals, the following theorem clearly shows the nature of measurable cardinals: If $\kappa$ is measurable, then $\kappa$ is (strongly) inaccessible. In what follows, omitting "strongly" we shall call it just "inaccessible."

$\kappa$ is inaccessible if and only if $\kappa > \aleph_0$ is regular and strong limit.

In contrast, $\kappa$ is weakly inaccessible if and only if $\kappa > \aleph_0$ is regular and limit.

We can show this theorem by just modifying the proof that $2^{\aleph_0}$ is a non-measurable cardinal.19

We need to define the notion of regular cardinals. But first we have to define the notion of the cofinality of a set. Let $(S, \le)$ be a well-ordered set. A subset $A$ of $S$ is called cofinal in $S$ if every element of $S$ is less than or equal to some element of $A$; in particular, if $S$ has a maximal element, $A$ must contain it. Also, the cofinality of a set $S$, denoted by $cf(S)$, is the least number of elements of a cofinal subset. For an ordinal $\alpha$, if $cf(\alpha) < \alpha$, $\alpha$ is called singular, whereas if $cf(\alpha) = \alpha$, $\alpha$ is called regular. Therefore,

(i) $cf(0) = 0$, so 0 is regular.
(ii) For any successor ordinal $\alpha^+$, $cf(\alpha^+) = 1$, so 1 is regular and every other successor ordinal is singular.
(iii) $cf(\omega) = \omega$, so $\omega$ is regular.

Take the ordinal 3 as an example. The cofinal subsets of the ordinal 3 are $\{2\}$, $\{0, 2\}$, $\{1, 2\}$, $\{0, 1, 2\}$, and $cf(3) = 1$ (the number of elements of the cofinal subset $\{2\}$). Thus 3 is singular, as desired. The upshot of this definition is that a subset of $\omega$ that is cofinal in $\omega$ must have $\omega$ many elements, as we can see from the fact that any cofinal subset of $\omega$, such as $\{0, 1, \dots\}$, $\{1, 2, \dots\}$, $\{0, 2, \dots\}$, has $\omega$ many elements.

We can get the concept of an inaccessible cardinal, which is stronger than that of a weakly inaccessible cardinal, by replacing the moderately increasing sequence from $\aleph_n$ to $\aleph_{n+1}$ by the exponentially and thus more rapidly increasing sequence from $\aleph_n$ to $2^{\aleph_n}$.

A limit cardinal is $\sup\{\aleph_0, \aleph_1, \aleph_2, \dots\}$.
A strong limit cardinal is $\sup\{\alpha, 2^{\alpha}, 2^{2^{\alpha}}, \dots\}$.

Interestingly enough, however, if the Generalized Continuum Hypothesis holds, the first weakly inaccessible cardinal is identical with the first inaccessible cardinal. $\aleph_0$ is an example of a strong limit cardinal, though $\aleph_0$ is of course not inaccessible because an inaccessible cardinal has to be greater than $\aleph_0$. Also, we can say that $\aleph_0$ is to finite cardinals what an inaccessible cardinal is to smaller cardinals. To put it more precisely, the theorem that every measurable cardinal is inaccessible means that measurable cardinals (if any) are not

19 We prove that first of all $\kappa$ is a regular cardinal and then $\kappa$ is a strong limit. For the latter, we need only replace $\aleph_0$ by $\lambda$ in the proof that $2^{\aleph_0}$ is a non-measurable cardinal.

constructible from below by the set-theoretic operations. Therefore, every cardinal that is constructible by the set-theoretic operations is non-measurable. Another way to see this is Dana Scott's simple result in 1960 to the effect that if the Axiom of Constructibility does hold, there are no measurable cardinals. Since the existence of measurable cardinals opens up a more fruitful mathematical universe, most mathematicians reject the Axiom of Constructibility.

Conclusion

Tait finds it strange that, although both the Law of Excluded Middle and the Axiom of Choice are non-constructive principles, there is a remarkable difference in the attitudes of mathematicians toward them. More specifically, the so-called French Constructivists, i.e., Borel, Baire, and Lebesgue, did not challenge the Law of Excluded Middle, but only the Axiom of Choice. According to Tait, however, the non-constructive nature of the Axiom of Choice ought to be attributed to the Law of Excluded Middle, which should therefore be rejected. Tait formulates the Axiom of Choice by using the term "type" instead of "set" in order to express that mathematical objects are to be constructed as objects of some type A. In addition, if we understand the existential quantifier correctly, Tait claims, the Axiom of Choice is constructively justifiable.

In my view, however, as DeVidi precisely points out, the Axiom of Choice contains more information than can be saved by the constructive understanding of the existential quantifier. In order to justify the Axiom of Choice in the way Tait suggests, we are required to change the standard notion of a set drastically. The restricted form of the Axiom of Choice, whatever form it may take, must be strictly distinguished from the Axiom of Choice in its original form. In light of this, I point out that the non-constructive nature of Lebesgue measure as shown above is incompatible with Lebesgue's skeptical attitude toward the Axiom of Choice.

In 1905 Vitali showed the existence of non-Lebesgue measurable sets by using the Axiom of Choice. Nevertheless, Lebesgue was convinced that the Axiom of Choice was false. But we have seen that the theory of large cardinals is based on the notion of $\kappa$-additivity, which is the generalization of the notion of $\sigma$-additivity, which in turn requires the Axiom of Choice in its proof.20 Hence, I provisionally conclude that Lebesgue should have accepted the Axiom of Choice with no restrictions. In the next chapter we shall go on to consider the existence of non-Lebesgue measurable sets in connection with the Banach-Tarski paradox in more detail.

20 Lebesgue implicitly used the Axiom of Choice in order to prove that the Lebesgue measure on the real line is $\sigma$-additive. See Tait (1994), p. 47 and p. 64, n.6.


CHAPTER 3

THE BANACH-TARSKI PARADOX

Introduction

In this chapter, we discuss the Banach-Tarski Paradox in a mathematically rigorous way by tracing it back to its original form, the Hausdorff Paradox. The basic ideas of the former can already be detected in the latter. So I believe that the Hausdorff Paradox holds the key to correctly understanding the Banach-Tarski Paradox.

In Section 1, we set the stage. In Section 2, I shall give a concrete example of a non-Lebesgue measurable set derived using the Axiom of Choice. In Section 3, I shall introduce two interconnected notions, "G-paradoxical" and "G-equidecomposable." Using these notions we can formulate the Hausdorff Paradox and the Banach-Tarski Paradox in a rigorous manner. Then, we shall discuss the Hausdorff Paradox. I put more emphasis on a concrete example of a paradoxical decomposition than on the formal proof of the Paradox. We find out that showing the existence of a non-Lebesgue measurable set by appeal to the Axiom of Choice plays a crucial role in a paradoxical decomposition. In Section 4, we shall discuss the Banach-Tarski Paradox. The Banach-Tarski Paradox improved on the Hausdorff Paradox, so we shall show how the Hausdorff Paradox was reformed by the Banach-Tarski Paradox. In Section 5, we shall distinguish three different senses of paradox. The Banach-Tarski Paradox is so called in the sense that it is a counterintuitive theorem, as distinct from a logical contradiction or fallacious reasoning. In Section 6, we shall discuss what cannot happen in the Banach-Tarski Paradox. That is, the Paradox doesn't hold in $\mathbb{R}^1$ and $\mathbb{R}^2$, and a paradoxical decomposition cannot be performed using fewer than five pieces. In Section 7, we shall see a paradox without invoking the Axiom of Choice. This result is in favor of the Platonists. For even if the use of the Axiom of Choice yields a paradox, it doesn't follow from this that we should reject the Axiom of Choice. In Section 8, however, I mention a version of the Banach-Tarski Paradox that does not require the Axiom of Choice. Although it seems that this result strikes a blow to the Platonists, I claim that the ontological status of the Paradox should be determined by carefully examining which of the two, the Platonists or the Constructivists, can explain the nature of the Paradox more systematically. Finally, I point out that the Banach-Tarski Paradox represents a characteristic nature of mathematics as distinct from natural science.

3.1 Preliminaries

First of all, we shall define an upper bound and the least upper bound or supremum. An upper bound $x$ of a set $S$ is such that, for any element $s$ in $S$, $x$ is equal to or greater than $s$. In symbols, $\forall s \in S\ (s \le x)$. Let $U$ be the set of all upper bounds of the set $S$. $U$ may be empty, but otherwise the least element of $U$ is called the least upper bound or supremum of $S$. A lower bound and the greatest lower bound or infimum of $S$ are defined analogously. The point is that $S$ does not necessarily attain the least upper bound or the greatest lower bound in $S$. Take the open unit interval $(0, 1)$ as an example. According to the definition, 0 and 1 are the greatest lower bound and the least upper bound of $(0, 1)$ respectively, but are not in $(0, 1)$.

Note that we need to make a distinction between a maximal element and the greatest element. $a$ is a maximal element in $S$ if there exist no elements greater than $a$ in $S$. On the other hand, $b$ is the greatest element if $b$ is greater than any other element in $S$. Let $(S, \le)$ be a partially ordered set as below.

Figure 3: Example of a partially ordered set.

Since $x_0$ and $y_0$ are not comparable, there is no greatest element in $S$. But both $x_0$ and $y_0$ are maximal elements in $S$ because there is no element greater than $x_0$ or $y_0$. The distinction between a minimal element and the least element is analogous.

We shall define relations $R$ on a set $S$ as follows:21

$R$ is reflexive if $xRx$
$R$ is irreflexive if $\sim(xRx)$
$R$ is symmetric if $xRy \to yRx$
$R$ is anti-symmetric if $(xRy \,\&\, yRx) \to x = y$
$R$ is transitive if $(xRy \,\&\, yRz) \to xRz$
$R$ is connected if $\forall x \forall y\ (xRy \lor yRx)$

21 In a logically correct order, the relation $R$ is denoted by $(x, y) \in R$, $R(x, y)$, $xRy$. So, precisely, a reflexive relation $xRx$ should be denoted by $(x, x) \in R$.

Also, $R$ is an equivalence relation iff it is reflexive, symmetric and transitive. Let $R$ be an equivalence relation on a set $S$. Then, we shall define the set of $x$ such that $x$ is in $S$ and $x$ is in the relation $R$ to $a$. This set is called the equivalence class of $a$ under the equivalence relation $R$, denoted by $R[a]$. In symbols, $R[a] = \{x \mid x \in S \,\&\, xRa\}$. It is an important fact that the equivalence relation $R$ on a set $S$ partitions $S$, that is, divides $S$ into equivalence classes so that any two distinct equivalence classes are disjoint. This is known as the Equivalence Relation Theorem. For instance, $x \equiv y \pmod{n}$ is an equivalence relation $R$ on the set $\mathbb{Z}$ of all integers, and partitions $\mathbb{Z}$, that is, divides $\mathbb{Z}$ so that any two distinct residue classes are disjoint. This means nothing other than that we can classify $\mathbb{Z}$ by the remainders when divided by $n$ (i.e., the remainders $0, 1, 2, \dots, n-1$). Later we shall use the Equivalence Relation Theorem to prove the existence of non-Lebesgue measurable sets.

In order to prove the Equivalence Relation Theorem, we first have to show that any $a \in S$ belongs to some equivalence class. For any $a \in S$, since $R$ is reflexive, $aRa$. Therefore, for any $a \in S$, $a \in R[a]$. We next show that any two distinct equivalence classes are disjoint. This means that for any $a, b \in S$, either $R[a] = R[b]$ or $R[a] \cap R[b] = \emptyset$. Now we suppose that $R[a] \cap R[b] \ne \emptyset$, and then show that $R[a] = R[b]$. Let $c \in R[a] \cap R[b]$. Then $c \in R[a] \,\&\, c \in R[b]$. So $cRa \,\&\, cRb$. But, since $R$ is symmetric, $aRc \,\&\, cRb$. Since $R$ is transitive, $aRb$. Now we prove that if $aRb$, then $R[a] = R[b]$. Let $x \in R[a]$. Then $xRa$. By assumption $aRb$. Since $R$ is transitive, $xRb$. So $x \in R[b]$. Therefore $x \in R[a]$ implies $x \in R[b]$. This means $R[a] \subseteq R[b]$. Let $x \in R[b]$. Then $xRb$. By assumption $aRb$. Since $R$ is symmetric, $bRa$. Since $R$ is transitive, $xRa$. So $x \in R[a]$. Therefore $x \in R[b]$ implies $x \in R[a]$. This means $R[b] \subseteq R[a]$. From $R[a] \subseteq R[b] \,\&\, R[b] \subseteq R[a]$, we get $R[a] = R[b]$. This completes the proof.

We shall define some important ordering relations, using the relations mentioned above. A strict partial ordering of a set $S$ is a relation on $S$ which is irreflexive, anti-symmetric and transitive. A non-strict partial ordering of a set $S$ is a relation on $S$ which is reflexive, anti-symmetric and transitive. In extension of the relation of magnitude between numbers, a strict ordering is denoted by $<$, representing an irreflexive relation, and a non-strict ordering is denoted by $\le$, representing a reflexive relation. A total ordering of a set $S$ is a connected partial ordering of $S$. A connected ordering is also expressed as trichotomy: for any $x, y \in S$, either $x < y$, or $x = y$, or $x > y$. In a totally ordered set, a maximal element is identical with the greatest element. A well-ordering of a set $S$ is a well-founded total ordering of $S$. A partially ordered set $S$ is well-founded if every non-empty subset of $S$ has a minimal element. Since, by the analogue of what we have just seen, in a total ordering a minimal element is identical with the least element, a well-ordering of a set $S$ amounts to a total ordering of $S$ with the property that every non-empty subset of $S$ has a least element. Note that in a well-ordered set $S$ every non-empty subset of $S$, not just the set $S$ itself, has a least element. Take the closed unit interval $[0, 1]$. We have seen before that the Axiom of Choice is equivalent to the Well-Ordering Theorem: Every set is well-orderable.
So $[0, 1]$ is well-orderable. But it is not well-ordered by the order of magnitude because, for instance, the open interval $(1/3, 2/3)$ is a subset of $[0, 1]$ but does not have a least element in the order of magnitude.

Roughly speaking, a partially ordered set is organized in a network of a number of one-way tracks, while a totally ordered set is arranged along a single one-way track. Also, a well-ordered set is aligned one by one, each element next to the other, with a beginning. For instance, the power set of a set $S = \{0, 1, 2\}$ is partially ordered by inclusion $\subseteq$. To see this, the power set of $S$ is $\{\emptyset, \{0\}, \{1\}, \{2\}, \{0, 1\}, \{0, 2\}, \{1, 2\}, \{0, 1, 2\}\}$. The ordering is shown in the figure below.

Figure 4: Partial order by inclusion of the power set of a set S{0, 1, 2}.
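The claim that inclusion partially, but not totally, orders the power set of $\{0, 1, 2\}$ can be verified by brute force. The following is a small sketch with hypothetical names; it checks the defining properties of a non-strict partial ordering directly against the definitions of reflexivity, anti-symmetry, transitivity and connectedness given earlier.

```python
from itertools import combinations

S = (0, 1, 2)
subsets = [frozenset(c) for r in range(len(S) + 1)
           for c in combinations(S, r)]

def leq(A, B):
    """The ordering by inclusion: A is below B when A is a subset of B."""
    return A <= B

reflexive = all(leq(A, A) for A in subsets)
antisymmetric = all(A == B for A in subsets for B in subsets
                    if leq(A, B) and leq(B, A))
transitive = all(leq(A, C) for A in subsets for B in subsets for C in subsets
                 if leq(A, B) and leq(B, C))
connected = all(leq(A, B) or leq(B, A) for A in subsets for B in subsets)

print(reflexive, antisymmetric, transitive)  # properties of a partial ordering
print(connected)                             # connectedness fails here

# Pairs that cannot be compared, e.g. {0} and {1}.
incomparable = [(set(A), set(B)) for A, B in combinations(subsets, 2)
                if not leq(A, B) and not leq(B, A)]
print(len(incomparable))
```

The incomparable pairs are exactly what prevents the ordering from being total, which is why only certain subcollections of the power set are linearly arranged.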

A totally ordered subset of a partially ordered set is called a chain. In this example, since $\emptyset \subseteq \{0\} \subseteq \{0, 2\} \subseteq \{0, 1, 2\}$, the subset $\{\emptyset, \{0\}, \{0, 2\}, \{0, 1, 2\}\}$ of the power set of $S$ constitutes a chain. Now we are ready to state what Zorn's Lemma is.

Zorn's Lemma: Let $(S, \le)$ be a partially ordered set. If every chain $C$ in $S$ has an upper bound in $S$, then $S$ has a maximal element.22

It is known that Zorn's Lemma is equivalent to the Axiom of Choice. This means that even if every chain $C$ in $S$ has an upper bound in $S$, without the Axiom of Choice, we cannot say that $S$ has a maximal element. Also, note that Zorn's Lemma does not concern how many maximal elements there exist in $S$.

22 In regard to a minimal element, the analogue of Zorn's Lemma does hold: Let $(S, \le)$ be a partially ordered set. If every chain $C$ in $S$ has a lower bound in $S$, then $S$ has a minimal element.

We shall show how to use the Axiom of Choice in order to prove Zorn's Lemma. The Axiom of Choice guarantees us that there is a choice function $f$ on the power set of $S$. By assumption, for any element $c \in C$, we can define a set $U_c = \{x \mid c \le x\}$. Consider, for some element $c_1 \in C$, $U_{c_1}$. We can find $U_{c_1}$ in the power set of $S$. Note that $U_{c_1} \ne \emptyset$ because every chain $C$ in $S$ has an upper bound in $S$. So $f(U_{c_1}) = c_1' \in C$. Let $c_1' = c_2$. Consider, for the element $c_2 \in C$, $U_{c_2}$. We can find $U_{c_2}$ in the power set of $S$. $U_{c_2} \ne \emptyset$, so $f(U_{c_2}) = c_2' \in C$. Let $c_2' = c_3$. We continue this process on and on until $U_{c_i}$ is a singleton. Then $f(U_{c_i}) = c_i'$ is a maximal element.

Figure 5: The use of the Axiom of Choice in the proof of Zorn‘s Lemma.


It is obviously false to claim that there is a maximal element in the set $\mathbb{Z}$ of all integers. But does Zorn's Lemma prove this? It is incorrect to apply Zorn's Lemma to this claim. For there is a chain that does not have an upper bound in $\mathbb{Z}$. That is, $\mathbb{Z}$ itself, $\{0, \pm 1, \pm 2, \dots\} = \mathbb{Z}$. So this claim does not satisfy the assumption of Zorn's Lemma.

3.2 Non-Lebesgue Measurable Sets

In 1905 Vitali showed that not all sets of real numbers are Lebesgue measurable. But he used the Axiom of Choice in order to derive a non-Lebesgue measurable set. Therefore, Lebesgue was convinced that the Axiom of Choice was false. But Lebesgue himself implicitly used the Axiom of Choice to derive the $\sigma$-additive nature of Lebesgue measure. Indeed, Lebesgue used a weaker form of the Axiom of Choice that restricts its application to countable sets. As we shall see, in order to derive a non-Lebesgue measurable set, we need a stronger form of the Axiom of Choice, which extends its application to uncountable sets. But then naturally there arises a question of why we should restrict the use of the Axiom of Choice to countable sets.

The existence of non-Lebesgue measurable sets is surprising and counter-intuitive. In what follows, I shall derive a non-Lebesgue measurable set using the Axiom of Choice in concreto. We define a relation $R$ on the closed unit interval $[0, 1]$: $xRy$ if $y - x$ is a rational. Note that $y - x$ is in $[-1, 1]$. For instance, $\frac{1}{3} R \frac{1}{2}$ because $1/2 - 1/3 = 1/6$, which is a rational. Also, $(\pi/10 + 1/4)\,R\,(\pi/10 + 1/3)$ because $(\pi/10 + 1/3) - (\pi/10 + 1/4) = 1/12$, which is a rational. But $\sim[\frac{\pi}{6} R \frac{\pi}{5}]$ because $\pi/5 - \pi/6 = \pi/30$, which is an irrational. Since the relation $R$ is reflexive, symmetric and transitive, it is an equivalence relation. As we have seen above, the Equivalence Relation Theorem tells us that an equivalence relation $R$ on a set $S$ partitions $S$, that is, divides $S$ into equivalence classes so that any two distinct equivalence classes are disjoint. The difference of any two numbers that belong to distinct equivalence classes is an irrational. But the difference of any two numbers that belong to the same equivalence class is a rational. Therefore, each equivalence class is countable. Moreover, since the relation $R$ partitions $[0, 1]$, which is uncountable, there are uncountably many distinct equivalence classes. (Note that a countable union of countable sets is countable, so countably many classes could not cover $[0, 1]$.) Now, the Axiom of Choice guarantees us that there exists a set $S$ containing exactly one element from each equivalence class. We shall show that $S$ is non-Lebesgue measurable.

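Consider the translates of $S$ given by $S_n = S + r_n$, where $r_1, r_2, \dots$ enumerate the rationals in $[-1, 1]$. The $S_n$ are pairwise disjoint. To see this, suppose for reductio that $x \in S_p \cap S_q$ with $p \ne q$. So for some $s_p, s_q \in S$, $s_p + r_p = s_q + r_q$. Then $s_p - s_q = r_q - r_p$, which is a non-zero rational. As we have seen above, however, the difference of two distinct elements of $S$ is an irrational. A contradiction. So the $S_n$ are pairwise disjoint.

Consider $S' = \bigcup_{n=1}^{\infty} S_n$. Since $r_n$ is a rational in $[-1, 1]$ and the set of the $r_n$ is countable,

$[0, 1] \subseteq S' = \bigcup_{n=1}^{\infty} S_n \subseteq [-1, 2]$.

By the translation-invariant nature of Lebesgue measure, the Lebesgue measures of the $S_n$ are all the same.23 So,

$m([0, 1]) = 1 \le m(S') = \sum_{n=1}^{\infty} m(S_n) = m(S) \times \infty \le m([-1, 2]) = 3$.

Now we ask: $m(S) = 0$ or $m(S) > 0$? If $m(S) = 0$, then, by the $\sigma$-additive nature of Lebesgue measure, $m(S') = 0$. On the other hand, if $m(S) > 0$, then $m(S') = \infty$. Either way, this contradicts $1 \le m(S') \le 3$. Thus $S$ is non-Lebesgue measurable.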

23 The translation-invariant property means that the distance between two points $a, b$ remains the same even if each of them is shifted by $t$ along the real line. That is, $d(a, b) = d(a + t, b + t) = |b - a|$. In this way, the translation-invariant property is essential to derive a non-Lebesgue measurable set. If the measure does not need to be translation-invariant, is there a non-trivial countably additive measure on all sets of real numbers? This is Lebesgue's Measure Problem.


Figure 6: Example of a non-Lebesgue measurable set.

3.3 The Hausdorff Paradox

In order to discuss the Hausdorff Paradox in a rigorous manner, we need to define two technical notions first.

G-paradoxical

Let $X$ be a set and $G$ be a group that acts on $X$. $X$ is G-paradoxical if there are pairwise disjoint subsets $A_1, A_2, \dots, A_i, B_1, B_2, \dots, B_j$ of $X$ and $\sigma_1, \sigma_2, \dots, \sigma_i, \tau_1, \tau_2, \dots, \tau_j \in G$ such that

$\sigma_1(A_1) \cup \sigma_2(A_2) \cup \dots \cup \sigma_i(A_i) = \bigcup_{k=1}^{i} \sigma_k(A_k) = X$

and

$\tau_1(B_1) \cup \tau_2(B_2) \cup \dots \cup \tau_j(B_j) = \bigcup_{k=1}^{j} \tau_k(B_k) = X$.

The figure below shows the case where $i = 4$, $j = 3$. Note that $i$ could be different from $j$. Also, note that, as the figure shows, $\bigcup_{k=1}^{i} A_k \cup \bigcup_{k=1}^{j} B_k$ is not necessarily $X$ itself but could be a proper subset of $X$.


Figure 7: G-paradoxical.

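G-equidecomposable

In order to define “G-equidecomposable,” we need to define “G-congruent” first.

$A_i$ is G-congruent to $B_i$ (i.e., $A_i \cong_G B_i$) if $B_i = g_i(A_i)$ for some $g_i \in G$.

A is G-equidecomposable to B (i.e., $A \sim_G B$) if A and B can be decomposed into the same finite number of pairwise disjoint pieces, $A = A_1 \cup \cdots \cup A_n$ and $B = B_1 \cup \cdots \cup B_n$, such that $A_i \cong_G B_i$ for each i.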

Figure 8: G-equidecomposable.

We are ready to formulate the Hausdorff Paradox.


The Hausdorff Paradox claims that there is a countable subset Ω of $S^2$ such that $S^2 \setminus \Omega$ is $SO_3$-paradoxical. $S^2$ is the unit sphere centered at the origin. The difference $S^2 \setminus \Omega$ is the set of all elements which belong to $S^2$ but not to Ω: $S^2 \setminus \Omega = \{x \mid x \in S^2 \ \&\ x \notin \Omega\}$. The special orthogonal group in three dimensions, denoted by $SO_3$, is the group of rotations of $R^3$. The following is the flow chart of the proof of the Hausdorff Paradox.

1. If a group G is paradoxical and acts on X freely, X is G-paradoxical.
2. A free group of rank 2 is paradoxical.
3. $SO_3$ has a free subgroup of rank 2.
4. This free subgroup acts freely on $S^2 \setminus \Omega$.
5. Therefore, $S^2 \setminus \Omega$ is $SO_3$-paradoxical.

The Axiom of Choice is indispensable in Step 1. Since the Axiom of Choice is used so generally in the major premise, however, we can’t clearly see into which pieces the sphere is actually decomposed and how these pieces are reassembled. So, in what follows, we shall consider a concrete example of Hausdorff’s paradoxical decomposition.

We shall start off with the definition of a group. A group G is a set, with a binary operation, satisfying the following axioms:
(1) Closure
(2) Associativity
(3) Identity e (One under multiplication)
(4) Inverse (Every element in G has an inverse)

Let G be a group generated by α and β. Let β be a counterclockwise rotation by 120° around the z-axis and let α be a counterclockwise rotation by 180° around another line through the origin. But it is not the case that any line through the origin does the job. From the angles of the rotations α and β, we have $\alpha^2 = \beta^3 = e$ (where e is the identity). Using this equality, $\alpha^3 \beta^5 \alpha^5 \beta^4$ can be reduced to $\alpha \beta^2 \alpha \beta$. Then we get two kinds of reduced products of α and β:

$\alpha \beta^{\varepsilon_1} \alpha \beta^{\varepsilon_2} \cdots$ ($\varepsilon_n = 1$ or 2)

$\beta^{\varepsilon_1} \alpha \beta^{\varepsilon_2} \alpha \cdots$ ($\varepsilon_n = 1$ or 2)
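The reduction of products described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the generators are written as the strings "a" (the 180° rotation α) and "b" (the 120° rotation about the z-axis, written β here), and the function name `reduce_word` is hypothetical. It uses nothing beyond the two relations α² = e and β³ = e.

```python
def reduce_word(word):
    """Reduce a product of powers of the generators using a^2 = e and b^3 = e.

    `word` is a list of (generator, exponent) pairs read left to right,
    e.g. [("a", 3), ("b", 5)] stands for a^3 b^5.
    """
    order = {"a": 2, "b": 3}  # orders of the two generators
    stack = []                # reduced prefix built so far
    for gen, exp in word:
        exp %= order[gen]
        if exp == 0:
            continue          # this factor is the identity
        if stack and stack[-1][0] == gen:
            # Adjacent powers of the same generator are merged.
            merged = (stack[-1][1] + exp) % order[gen]
            if merged == 0:
                stack.pop()
            else:
                stack[-1][1] = merged
        else:
            stack.append([gen, exp])
    return [(g, e) for g, e in stack]


# The example from the text: a^3 b^5 a^5 b^4 reduces to a b^2 a b.
example = reduce_word([("a", 3), ("b", 5), ("a", 5), ("b", 4)])
```

The result alternates between a and a power of b with exponent 1 or 2, which is exactly the shape of the two kinds of reduced products listed above.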


But different reduced products could still be substantially the same. For instance, suppose that α is a counterclockwise rotation by 120° around the y-axis. A quick thought experiment shows that αααααα. In preparation for the following argument, we want to avoid a situation like this. By imposing a constraint on the axis of α, we can make arrangements so that different reduced products represent different rotations. Now, suppose that the axis of α is a line in the xz-plane which makes an angle of θ° with the z-axis. If two different reduced products are actually equal, $\cos 2\theta$ satisfies an algebraic equation, so $\cos 2\theta$ is a number that can be a solution of an algebraic equation (i.e., an algebraic number). Conversely, if $\cos 2\theta$ is a number that cannot be a solution of an algebraic equation (i.e., a transcendental number), different reduced products are certainly different. Thus, we take θ so that $\cos 2\theta$ is a transcendental number.
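The dependence on the choice of axis can be checked numerically. The following is a minimal sketch, not part of the original argument: it writes the 120° rotation about the z-axis as `beta`, builds both rotations as 3×3 matrices with Rodrigues' formula, and compares a badly chosen axis (perpendicular to the z-axis, θ = 90°) with an arbitrarily chosen one (θ = 1 radian, picked only for illustration). All function names are hypothetical.

```python
import math


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(axis, angle):
    """Rodrigues' formula: rotation by `angle` radians about the unit vector `axis`."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def product(*ms):
    out = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for m in ms:
        out = matmul(out, m)
    return out


def is_identity(m, tol=1e-9):
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))


def generators(theta):
    """alpha: half-turn about the line in the xz-plane at angle theta from the z-axis;
    beta: 120-degree turn about the z-axis."""
    alpha = rotation((math.sin(theta), 0.0, math.cos(theta)), math.pi)
    beta = rotation((0.0, 0.0, 1.0), 2 * math.pi / 3)
    return alpha, beta


# Axis perpendicular to the z-axis: the extra relation (alpha beta)^2 = e appears.
a_bad, b_bad = generators(math.pi / 2)
bad_relation_holds = is_identity(product(a_bad, b_bad, a_bad, b_bad))

# Illustrative generic axis: alpha^2 = beta^3 = e still hold, the extra relation does not.
a, b = generators(1.0)
basic_relations_hold = is_identity(product(a, a)) and is_identity(product(b, b, b))
extra_relation_holds = is_identity(product(a, b, a, b))
```

A floating-point check of this kind can only rule out one relation at a time for one angle; it does not replace the transcendence argument, which excludes all unwanted relations at once.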

Figure 9: The Hausdorff Paradox.


G is called a free group if $\alpha^m = \beta^n = e$ is the only relation between α and β. In other words, no relation other than $\alpha^m = \beta^n = e$ holds between α and β. In that sense, α and β are “free” of relations. The upshot of this definition is that two different reduced products represent two different transformations. For if any two reduced products were the same, it would mean that there is some relation between α and β other than $\alpha^m = \beta^n = e$. So, the combinations of α and β as presented in the Hausdorff Paradox form a free group generated by α and β.

Here we define “G acts on X freely.” It is trivial that any point is fixed by the identity. So, a point that is fixed by an element in G other than the identity, if any, is called a non-trivial fixed point. G acts on X freely if there are no non-trivial fixed points in X, that is, every point in X is moved to another point of X by every element in G except for the identity.

We have to note that a free group G doesn’t necessarily act on X freely. In the Hausdorff Paradox, G doesn’t act on $S^2$ freely. We notice that an element in G is a rotation around some axis. So any element in G other than the identity fixes exactly two points (i.e., the intersections of the sphere and the axis of rotation). Since there are at most countably many combinations of α and β, the number of fixed points is also countable. Let Ω be the set of all points of $S^2$ fixed by some element in G other than the identity. Then $S^2 \setminus \Omega$ has an interesting feature: a point in $S^2 \setminus \Omega$ is fixed only by the identity. The upshot is that an element in $S^2 \setminus \Omega$ is moved to another element of $S^2 \setminus \Omega$ by any element in G other than the identity. Technically speaking, G acts on $S^2 \setminus \Omega$ freely (i.e., with no non-trivial fixed points).
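The claim that a non-identity rotation fixes exactly the two points where its axis meets the sphere can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the proof: `beta` denotes the 120° rotation about the z-axis, the angle `theta = 1.0` between the axes is chosen only for illustration, and the helper names are hypothetical. It takes one sample element g = αβ, recovers its axis from the antisymmetric part of the matrix, and checks that the two poles are fixed while another point is moved.

```python
import math


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def half_turn(u):
    """Rotation by 180 degrees about the unit vector u, i.e. 2 u u^T - I."""
    return [[2 * u[i] * u[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]


def turn_about_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def poles(m):
    """The two points of the unit sphere lying on the axis of the rotation m.

    Uses the antisymmetric part of m, so m must be neither the identity
    nor a half-turn (in those cases the antisymmetric part vanishes).
    """
    v = (m[2][1] - m[1][2], m[0][2] - m[2][0], m[1][0] - m[0][1])
    n = math.sqrt(sum(c * c for c in v))
    p = tuple(c / n for c in v)
    return p, tuple(-c for c in p)


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


theta = 1.0  # illustrative angle between the two axes (an assumption)
alpha = half_turn((math.sin(theta), 0.0, math.cos(theta)))
beta = turn_about_z(2 * math.pi / 3)
g = matmul(alpha, beta)  # a sample non-identity element of G

north, south = poles(g)
moved = apply(g, (1.0, 0.0, 0.0))  # image of a point that is not on the axis
```

Collecting the two poles of every non-identity element of G gives the countable set Ω removed from the sphere in the text.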

Now we consider the set of all points to which $x_1$ is moved by the elements of G. It is called the G-orbit of $x_1$: $O_{x_1} = \{x \mid x = g x_1,\ g \in G\}$. We can prove that for two points $x_1$ and $x_2$ in $S^2 \setminus \Omega$, either $O_{x_1} = O_{x_2}$ or $O_{x_1} \cap O_{x_2} = \emptyset$ by showing that if $O_{x_1} \cap O_{x_2} \neq \emptyset$ then $O_{x_1} = O_{x_2}$. Note that $S^2 \setminus \Omega$ is uncountable because $S^2$ is uncountable and Ω is countable.

So the G-orbit of $x_1$ is the equivalence class of $x_1$. The Equivalence Relation Theorem tells us that an equivalence relation R on a set X partitions X, that is, divides X into equivalence classes so that any two distinct equivalence classes are disjoint. Indeed, since the elements in G are distinct, a point $x_1$ in $S^2 \setminus \Omega$ is moved to different points by different elements in G. Since there are countably many combinations of α and β, however, the number of elements in G is at most countable, and so each orbit is countable. So $S^2 \setminus \Omega$ is partitioned into uncountably many equivalence classes. Then the Axiom of Choice guarantees the existence of a set T containing exactly one element from each equivalence class.
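A finite toy analogue may help to see what the set T does. The sketch below is an illustration only, with made-up data: the cyclic group of order 3 acts freely on the twelve residues mod 12 by adding multiples of 4. Here the representatives can be picked by an explicit rule (the smallest element of each orbit), so no Axiom of Choice is involved; the point is only that the translates of T by the group elements partition the whole set, which is the pattern exploited in the decomposition of $S^2 \setminus \Omega$.

```python
def orbits(points, group):
    """Partition `points` into orbits under `group`, a list of functions."""
    seen, result = set(), []
    for x in points:
        if x in seen:
            continue
        orbit = {g(x) for g in group}
        seen |= orbit
        result.append(orbit)
    return result


# Toy free action: Z_3 acting on Z_12 by x -> x + 4k (mod 12), k = 0, 1, 2.
X = range(12)
G = [lambda x, k=k: (x + 4 * k) % 12 for k in range(3)]

classes = orbits(X, G)

# One representative per orbit, chosen by an explicit rule (the minimum).
T = sorted(min(orbit) for orbit in classes)

# The translates g(T), one for each group element g, partition X.
pieces = [sorted(g(t) for t in T) for g in G]
```

In the Hausdorff setting there are uncountably many orbits and no such rule is available, which is where the Axiom of Choice enters.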

Figure 10: The existence of a set T containing exactly one element from each G-orbit.

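But we note that T itself is not a piece into which $S^2 \setminus \Omega$ is decomposed, although we use T to derive a paradoxical decomposition. Hausdorff decomposed $S^2 \setminus \Omega$ into three pieces, $S_1$, $S_2$, $S_3$, depending on the last rotation applied to a point of T, i.e., the leftmost factor of the reduced product (Hausdorff decomposition).

$S_1 = \{x \mid \exists t \in T;\ x = \alpha \cdots (t)\}$

$S_2 = \{x \mid \exists t \in T;\ x = \beta \cdots (t)\}$

$S_3 = \{x \mid \exists t \in T;\ x = \beta^2 \cdots (t)\}$

Thus, $\alpha(S_1) = S_2 \cup S_3$, $\beta(S_1) = S_2$, $\beta^2(S_1) = S_3$.

Hence, $S_1 \cong S_2 \cong S_3 \cong S_2 \cup S_3$.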


Figure 11: Hausdorff‘s paradoxical decomposition.

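In order to make this agree with the definition of G-paradoxical, by dividing $S_1$ further into two pieces, we decompose the sphere into four pieces, $A_1$, $A_2$, $B_1$, $B_2$, classified by the last rotation applied to a point of T as follows:

$A_1 = \{x \mid \exists t \in T;\ x = t \text{ or } x = \alpha\beta \cdots (t)\}$

$A_2 = \{x \mid \exists t \in T;\ x = \alpha t \text{ or } x = \alpha\beta^2 \cdots (t)\}$

$B_1 = \{x \mid \exists t \in T;\ x = \beta t \text{ or } x = \beta\alpha \cdots (t)\}$

$B_2 = \{x \mid \exists t \in T;\ x = \beta^2 t \text{ or } x = \beta^2\alpha \cdots (t)\}$

These four pieces are the ones we use in a paradoxical decomposition.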
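Because G acts freely, each point of $S^2 \setminus \Omega$ corresponds to exactly one pair (reduced product, point of T), so the four pieces can be examined purely at the level of reduced products. The following sketch does this for words of bounded length. The encoding is an assumption made for the example: "a" stands for α, "b" for the 120° rotation (written β here), "B" for β², and the empty string for the identity. It checks that every reduced word falls into exactly one of the four classes and that left multiplication by β carries $A_1 \cup A_2$ to $B_1$, $B_1$ to $B_2$, and $B_2$ back to $A_1 \cup A_2$, a fact used in the measure argument further on.

```python
def reduced_words(max_len):
    """All reduced words with at most max_len syllables ('B' stands for beta^2)."""
    words, frontier = [""], [""]
    for _ in range(max_len):
        nxt = []
        for w in frontier:
            # A reduced word alternates between 'a' and a power of beta.
            if w == "":
                choices = "abB"
            elif w[-1] == "a":
                choices = "bB"
            else:
                choices = "a"
            nxt.extend(w + c for c in choices)
        words.extend(nxt)
        frontier = nxt
    return words


def piece(w):
    """Name of the piece containing w(t), read off from the leftmost syllables of w."""
    if w == "" or w.startswith("ab"):
        return "A1"
    if w == "a" or w.startswith("aB"):
        return "A2"
    return "B1" if w[0] == "b" else "B2"


def times_beta(w):
    """Left-multiply the reduced word w by beta and reduce using beta^3 = e."""
    if w.startswith("b"):
        return "B" + w[1:]
    if w.startswith("B"):
        return w[1:]
    return "b" + w


words = reduced_words(5)
images = {w: piece(times_beta(w)) for w in words}
```

Only words up to a fixed length are inspected, so this is a sanity check of the bookkeeping rather than a proof.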

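Next we consider the subsets $A_1'$, $A_2'$, $B_1'$, $B_2'$ of these pieces.

$A_1' = \{x \mid \exists t \in T;\ x = \alpha\beta \cdots (t)\}$

$A_2' = \{x \mid \exists t \in T;\ x = \alpha\beta^2 \cdots (t)\}$

$B_1' = \{x \mid \exists t \in T;\ x = \beta\alpha\beta^{\varepsilon_1} \cdots (t)\}$

$B_2' = \{x \mid \exists t \in T;\ x = \beta^2\alpha\beta^{\varepsilon_1} \cdots (t)\}$

Note that $(A_1' \cup A_2' \cup B_1' \cup B_2') \subseteq (A_1 \cup A_2 \cup B_1 \cup B_2)$. Also,

$A_1 \cup A_2 \cup B_1 \cup B_2 = \beta^2\alpha A_1' \cup \alpha\beta^2 B_1'$

$A_1 \cup A_2 \cup B_1 \cup B_2 = \beta\alpha A_2' \cup \alpha\beta B_2'$

This shows that $S^2 \setminus \Omega$ is G-paradoxical.

We can show that these four pieces used in a paradoxical decomposition are non-Lebesgue measurable. We have $\beta(A_1 \cup A_2) = B_1$, $\beta B_1 = B_2$, $\beta B_2 = (A_1 \cup A_2)$. So,

$(A_1 \cup A_2) \equiv B_1 \equiv B_2$. Measure is preserved by rotations. Hence, $m(A_1 \cup A_2) = m(B_1) = m(B_2)$

But we also have $\alpha(A_1 \cup A_2) \setminus T = (B_1 \cup B_2)$. For the same reason, $m(A_1 \cup A_2) = m(B_1 \cup B_2)$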

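This is impossible: since $B_1$ and $B_2$ are disjoint, the two equalities together would force all these measures to be zero, whereas the pieces together make up $S^2 \setminus \Omega$, which has positive measure. Therefore, we know that $A_1$, $A_2$, $B_1$, $B_2$ are all non-Lebesgue measurable.

We have to determine how to count the number of pieces into which we decompose to derive a paradox. First of all, we need to define “G-equidecomposable using n pieces.” Let X be a set and G a group that acts on X. For $A \subseteq X$, A is G-equidecomposable with X using n pieces if

$A = A_1 \cup A_2 \cup \cdots \cup A_n$

$X = g_1(A_1) \cup g_2(A_2) \cup \cdots \cup g_n(A_n)$

Suppose that $X = A \cup B$. If A is equidecomposable with X using i pieces and B is equidecomposable with X using j pieces (in symbols, $A \sim_i X \sim_j B$), then X is G-paradoxical using $i + j = k$ pieces. For instance, suppose that

$A = A_1 \cup A_2 \cup A_3$, $B = B_1 \cup B_2$

$X = g_1(A_1) \cup g_2(A_2) \cup g_3(A_3) = h_1(B_1) \cup h_2(B_2)$

Then $A \sim_3 X \sim_2 B$, so X is G-paradoxical using five pieces.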


This is exactly what we mean below when we refer to the minimum number of pieces required in a paradoxical decomposition. In this sense, X cannot be G-paradoxical using fewer than four pieces. This definition is important because in some cases the number of pieces differs depending on the order of rigid motions. In the Hausdorff Paradox, it seems that we manage with only three pieces, $S_1$, $S_2$, $S_3$, but according to this definition we need four pieces, $A_1$, $A_2$, $B_1$, $B_2$.

3.4 The Banach-Tarski Paradox

The Banach-Tarski Paradox claims that $S^2$ is $SO_3$-paradoxical. The Banach-Tarski Paradox improved on the Hausdorff Paradox by eliminating the need to exclude a countable subset Ω from $S^2$. The rough sketch of the proof runs as follows:

1. If $X \sim_G Y$ & X is G-paradoxical, then Y is also G-paradoxical.
2. For any countable subset Ω of $S^2$, $S^2 \sim_{SO_3} S^2 \setminus \Omega$.
3. $S^2 \setminus \Omega$ is $SO_3$-paradoxical. (The Hausdorff Paradox)
4. Therefore, $S^2$ is $SO_3$-paradoxical. (The Banach-Tarski Paradox)

To see how Banach and Tarski eliminate the need to exclude Ω from $S^2$, we shall take a look at how to form the set $A = [0, \infty)$ from $A \setminus \{0\}$ only by a translation. First, we divide $A \setminus \{0\}$ into the following two sets:

$A_1 = A \setminus \{0, 1, 2, 3, \ldots\}$

$A_2 = \{1, 2, 3, \ldots\}$

Let τ be a translation by +1.

$\tau^{-1} A_2 = \{0, 1, 2, 3, \ldots\}$

Therefore, $A = A_1 \cup \tau^{-1} A_2$.

Using the same technique, we want to divide $S^2 \setminus \Omega$ into the following two sets. For some rotation ρ,

$D_1 = S^2 \setminus (\Omega \cup \rho\Omega \cup \rho^2\Omega \cup \cdots)$

$D_2 = \rho\Omega \cup \rho^2\Omega \cup \cdots$

But can we define such a rotation? Suppose that $\Omega = \{x_1, x_2, x_3, \ldots\}$. We want to avoid a ρ such that $x_j = \rho^n x_i$, that is, a rotation such that a point in Ω is transformed to a point in Ω again by repeated applications of the rotation. While there are at most countably many bad choices for the rotation because the set Ω is countable and the number of multiplications is countable, there are uncountably many rotations (recall that a countable union of countable sets is countable). So there is a ρ such that $S^2 \setminus \Omega$ is divided into the two sets $D_1$, $D_2$ above.

$\rho^{-1} D_2 = \Omega \cup \rho\Omega \cup \rho^2\Omega \cup \cdots$

Hence, $S^2 = D_1 \cup \rho^{-1} D_2$.

The Hausdorff Paradox is the prototype of the Banach-Tarski Paradox. The Hausdorff Paradox states that $S^2 \setminus \Omega$ is $SO_3$-paradoxical. Informally speaking, a sphere is decomposed into a finite number of pieces and reassembled by rigid motions to form two copies of almost the same size as the original. I’m using the word “almost” in a slightly different way from that in which it is used measure-theoretically. In the theory of measure, “almost everywhere” means “except on a set of measure zero,” while here “almost” means “except on a countable subset.” On the other hand, the Banach-Tarski Paradox states that $S^2$ is $SO_3$-paradoxical. A sphere is decomposed into a finite number of pieces and reassembled by rigid motions to form two copies of exactly the same size as the original. This is a Weak Form of the Banach-Tarski Paradox (Two Spheres from One Version). There is another Weak Form of the Banach-Tarski Paradox (The Pea and the Sun Version): Any solid ball is $M_3$-paradoxical ($M_3$ is the group of all isometries of $R^3$). Informally, a ball the size of a pea is decomposed into a finite number of pieces and reassembled by rigid motions to form a ball the size of the sun. The Strong Form of the Banach-Tarski Paradox says that if A and B are any two bounded subsets of $R^3$ with non-empty interior, then A and B are equidecomposable.


Figure 12: The Weak Form of the Banach-Tarski Paradox (Two Spheres from One Version).

Figure 13: The Weak Form of the Banach-Tarski Paradox (The Pea and the Sun Version).


We shall introduce a new relation symbol, A ≺ B. This notation can be seen as the abbreviation of $A \sim_G B' \subseteq B$. In words, A is G-equidecomposable to a subset B' of B. The relation ≺ is reflexive and transitive.24 The Banach-Schröder-Bernstein Theorem claims that if A ≺ B & B ≺ A, then $A \sim_G B$. So, in order to show that $A \sim_{M_3} B$, we shall show A ≺ B & B ≺ A. Actually we have only to show A ≺ B because B ≺ A follows by the same argument. Since A is bounded and B has non-empty interior, there exists a ball K containing A and a ball L contained in B. Without loss of generality we may assume K is larger than L. Then K is covered by n balls $L_1', L_2', \ldots, L_n'$ of the same size as L. Note that $L_1', L_2', \ldots, L_n'$ are allowed to partially overlap each other.

Let S be the union of pairwise disjoint balls $L_1, L_2, \ldots, L_n$ of the same size as L. If we keep the parts of $L_1', L_2', \ldots, L_n'$ that do not overlap and assign each overlapping part to just one of them, K is $M_3$-equidecomposable to a subset S' of S. That is, K is decomposed into n pieces and reassembled only by the identity and translations to form S'. So, K ≺ S. Then, in order to show S ≺ L, we need to prove that L is $M_3$-paradoxical (using the Axiom of Choice) and $S \sim_{M_3} L$. From here on, the proof proceeds as follows. Again, we omit the details and show the schema of the proof.

1. K ≺ S
2. S ≺ L
3. K ≺ S & S ≺ L → K ≺ L (because ≺ is transitive)
4. Therefore, K ≺ L.

Thus, $A \subseteq K \prec L \subseteq B$. Therefore, A ≺ B.

3.5 What is a Paradox?

In order to consider what a paradox is, we shall start off with the definition of the word “paradox” from the Oxford English Dictionary. The following is a part deemed to be relevant here.

1.a A statement or tenet contrary to received opinion or belief; often with the implication that it

24 It’s not trivial. The proof is necessary, although we leave it out here.


is marvelous or incredible; sometimes with unfavorable connotation, as being discordant with what is held to be established truth, and hence absurd or fantastic; sometimes with favorable connotation, as a correction of vulgar error.
2.a A statement or proposition which on the face of it seems self-contradictory, absurd, or at variance with common sense, though, on investigation or when explained, it may prove to be well-founded.
b Often applied to a proposition or statement that is actually self-contradictory, or contradictory to reason or ascertained truth, and so, essentially absurd and false.
c Logic. A statement or proposition which, from an acceptable premise and despite sound reasoning, leads to a conclusion that is against sense, logically unacceptable, or self-contradictory.

Except for a paradox in logic, what all the definitions seem to have in common is that a statement called a paradox is self-contradictory or contrary to common sense. But the definition is divided into three parts as to whether a statement or common sense is true. 1.a doesn’t mention whether a statement or common sense is true. In 2.a, a statement is true, so common sense is false, while in 2.b, a statement is false, so common sense is true. But a paradox in the sense of 1.a could be eventually distributed into either 2.a or 2.b; so I would distinguish two senses of the word “paradox,” distinct from a paradox in logic. The following indicates how I believe the word “paradox” should be classified.

(I) A logical contradiction from Logic and Mathematics
(Logic, Mathematics) ⇒ A & ~A
(ex.) Russell’s Paradox
(II) Logic and Mathematics conflict with our intuition
(Logic, Mathematics) ⇒ A & (Our Intuition) ⇒ ~A
(i) (Logic, Mathematics) A is false ⇒ fallacious reasoning
(ex.) Zeno’s Paradox
(ii) (Our Intuition) ~A is false ⇒ counterintuitive theorem
(ex.) The Hausdorff Paradox, The Banach-Tarski Paradox, The Skolem Paradox

A paradox in sense (I) is a logical contradiction.
A case in point is Russell’s Paradox. Is the set of all sets that are not members of themselves a member of itself? If it is, then it isn’t. If it isn’t, then it is. Russell attempted to resolve this paradox by appealing to the simple theory of types. The simple theory of types distinguishes among the individuals (type 0), the properties of these individuals (type 1), the properties of these properties (type 2), and so on, and for any property restricts its application to the next lower type. But Russell believed that the simple theory of types is not sufficient to construct mathematics from logic and must be ramified for the following two closely related reasons. Firstly, there are some paradoxes which cannot be resolved by the simple theory of types. Secondly, it is necessary to eliminate a vicious circle in definition. Take Grelling’s Paradox as an example of the first reason. Some adjectives have the property that they denote, e.g., the adjective “English” is English, and others do not, e.g., the adjective “German.” If we call the adjectives of the second kind heterological, it is easy to see that the adjective “heterological” is heterological if and only if it is not heterological. Take the concept “inductive number” as an example of the second reason. A number is called “inductive” if it possesses all the hereditary properties of zero. But since the property “inductive” itself is a hereditary property, this definition is seen as an impredicative definition, i.e., a definition of the part by the whole to which it itself belongs, and thus viciously circular. The ramified theory of types subdivides the properties of type 1 into the properties in whose definition no reference to all properties occurs (order 0), the properties in whose definition all properties of the first order occur (order 1), the properties in whose definition all properties of the second order occur (order 2) and so on. When we define a certain property by reference to all properties, all these properties that occur in the definition of the property of order n have to be restricted to those of the order n-1. But the ramified theory of types stumbles on the definition of the real numbers and thus leads to the destruction of real analysis. According to the ramified theory of types, we cannot use the expression “for all real numbers” without reference to a determinate order. So we have to say that all real numbers that occur in the definition of a real number of order n are restricted to those of order n-1. Whether practical or not, this would certainly be extremely inconvenient and probably intolerable. So Russell endeavored to resolve this difficulty by devising the Axiom of Reducibility: Any high-order sentence is reducible to an order-0 sentence which is its equivalent in extension.


Ramsey criticized the Axiom of Reducibility on the ground that it is too artificial. Ramsey divided all paradoxes into two kinds: logical paradoxes, such as Cantor’s, Burali-Forti’s and Russell’s Paradox, on the one hand and semantical paradoxes, such as the liar’s, Richard’s and Grelling’s Paradox, on the other. A paradox of the first kind is already eliminated by the simple theory of types. A paradox of the second kind is due to the defect of our ordinary language and we need not take it into account in the construction of mathematics from logic. Therefore for Ramsey the ramified theory of types and the Axiom of Reducibility were redundant for the logicist program. According to Ramsey, impredicative definition is admissible insofar as it does not create a new entity and just defines an entity which already exists. For instance, it is innocuous to define a person by the description “the tallest man in the room.”

A paradox so called in the sense (II)(i) conceals a fallacious reasoning in itself. An example in point is Zeno’s Paradox. Zeno’s Paradox is the one that Zeno presented as an argument against motion. Aristotle in Physics introduces four arguments of Zeno’s and rejects them as fallacious.25 The first argument is that you cannot reach the end of the stadium because you must pass midpoints beforehand. The second argument is that Achilles cannot overtake the tortoise. Suppose that the tortoise started off ahead of Achilles. Although Achilles runs faster than the tortoise, the tortoise is still ahead when Achilles has reached the point where the tortoise started. When Achilles has reached the point where the tortoise was at that time, the tortoise is again ahead. The third argument is that the flying arrow is at rest. The fourth argument is that half of a given time is double the time. Suppose that there are three rows A, B and C, each of which consists of the same four members. A is at rest, centered at the middle of the stadium. B extends from the beginning to the middle of the stadium. C extends from the end to the middle of the stadium. Then B and C move in opposite directions at the same speed. Eventually A, B and C lie centered at the middle

25 Aristotle, Physics, 239b5ff. For commentaries on Zeno’s Paradox, see McKirahan, Philosophy before Socrates, p. 310ff. Also, Russell discusses Zeno’s Paradox in Our Knowledge of the External World, p. 134ff.

of the stadium at the same time. This argument assumes that members of the rows pass each other in succession. So, the time it takes for a member of a row to pass members of another row is the number of the members. But the assumption simultaneously forbids a situation in which a member of B passes a member of C at some point between two successive moments. So, the time it takes for a member of B and C moving in opposite directions to pass members of A at rest is twice the number of the members. Therefore, a given time is twice the time. On the other hand, the number of members of A that the rightmost B passed is half the number of members of C that it passed. Therefore, a given time is half the time. A, B and C start and finish the movement at the same time. Hence half a given time is twice the time.

Zeno’s Paradox is based on the fallacious reasoning of seeing time as the analogue of the sequence of natural numbers. As Aristotle suggests, however, time is not composed of “nows” but is continuous. If time is seen as a continuum like the set of real numbers, the Paradox can be solved. Just as it does not make sense to speak of the very next real number in the order of magnitude (because there are infinitely many real numbers between any two), so it does not make sense to speak of the very next moment in the order of events. Also, just like a mathematical point, a moment has no size. If time were like the sequence of natural numbers, the end of the stadium or the point where Achilles overtakes the tortoise would be recognized as a limit of a convergent sequence. So it would take an infinite time to reach the limit. But since time is a continuum, any spatial point can be put in one-one correspondence with a moment in a finite time. So you can reach the end of the stadium and Achilles can overtake the tortoise in a finite time. Also, we can admit of some point between two moments. So the fourth paradox can be solved. According to the theory of measure, an uncountable number of points with measure zero yield a finite measure. So also uncountably many spatial points as moments in a finite time yield a finite space. So the flying arrow moves a finite distance.

A paradox belonging to the two senses mentioned above should be avoided and dismissed. The Banach-Tarski Paradox is not a paradox in the sense that Russell’s Paradox is a paradox or Zeno’s Paradox is a paradox. Rather, it is a counterintuitive

theorem; it is a paradox only in the sense of being contrary to our intuition. The situation is similar to that in which the one-to-one correspondence between natural numbers and even numbers had been considered a paradox of the infinite before Cantor gave the definition of the infinite using that very fact. The rest of this dissertation is devoted to the study of paradoxes belonging to this category.

3.6 What cannot Happen?

We shall notice the following two things:
(1) The Banach-Tarski Paradox holds in $R^n$ ($n \geq 3$). In other words, the analogue of the Banach-Tarski Paradox breaks down in $R^1$ and $R^2$.
(2) $S^2$ is $SO_3$-paradoxical using four pieces, the minimum number possible. Any solid ball is $M_3$-paradoxical using at least five pieces.
These show that what happens in mathematics is still regulated by rigorous reasoning. A paradox is not something that tells us “anything goes.” The proof reveals not only what can happen but also what cannot happen. The reason why the analogue of the Banach-Tarski Paradox doesn’t exist in $R^1$ and $R^2$ is that an isometry group does not have a free subgroup of rank 2 in $R^1$ and $R^2$. But there is another way to see that the analogue of the Banach-Tarski Paradox breaks down in $R^2$. Actually, we have the Bolyai-Gerwien Theorem: Two polygons are congruent by dissection if and only if they have the same area. The claim that the Banach-Tarski Paradox does not have the analogue in $R^2$ amounts to this: If two polygons are congruent by dissection, they can’t have different areas, i.e., they have to have the same area. This is intuitively acceptable. So it is crucial to show that if two polygons have the same area, they are congruent by dissection. But this is difficult to prove. Considering that it was not shown until the nineteenth century, we realize the recalcitrant nature of problems of this sort. Then it is natural to ask whether this theorem has the analogue in $R^3$: Two polyhedra are congruent by dissection if and only if they have the same volume. Specifically, Hilbert’s third problem was whether or not we can cube a regular tetrahedron by dissection into polyhedra. Dehn had already shown that a regular tetrahedron is not congruent by dissection with any cube. Therefore, the three dimensional analogue of the


Bolyai-Gerwien Theorem is simultaneously rejected. This result seems to contradict the Banach-Tarski Paradox that should hold in $R^3$. For the Banach-Tarski Paradox implies that a regular tetrahedron is equidecomposable with a cube of the same volume. But we have to note that in this analogue a paradoxical decomposition is restricted to a dissection into polyhedra. So if we are more generous in the nature of pieces used in a decomposition, the Banach-Tarski Paradox does hold.

To see that the minimum four pieces are necessary in a paradoxical decomposition, let X be a set and G be a free group of rank 2 (i.e., generated by α and β) that acts on X. Suppose that some relation R divides X into equivalence classes. The Axiom of Choice guarantees the existence of a set T containing exactly one element from each equivalence class. Then X is decomposed into four pieces, depending on the last word multiplied from T:

$S_1 = \{x \mid \exists t \in T;\ x = \alpha \cdots (t)\}$

$S_2 = \{x \mid \exists t \in T;\ x = \alpha^{-1} \cdots (t)\}$

$S_3 = \{x \mid \exists t \in T;\ x = \beta \cdots (t)\}$

$S_4 = \{x \mid \exists t \in T;\ x = \beta^{-1} \cdots (t)\}$

It follows from this that

$X = S_1 \cup S_2 \cup S_3 \cup S_4$

$= S_1 \cup \alpha S_2$

$= S_3 \cup \beta S_4$

Thus we can see that X is G-paradoxical using four pieces and four pieces cannot be improved.

3.7 A Paradox without the Axiom of Choice?

In Section 3.5, we have seen that the Banach-Tarski Paradox is a paradox in the sense that it is a counterintuitive theorem, so it is not a logical paradox and does not contain fallacious reasoning. But the critics of the Axiom of Choice would say that even if the Banach-Tarski Paradox is a counterintuitive theorem, the non-constructive nature of the Axiom of Choice leads to the existence of non-Lebesgue measurable sets, which in turn yields the Banach-Tarski Paradox that is surprising and counterintuitive. Therefore, the


Axiom of Choice should be rejected. Rather than that, this is a feature that distinguishes mathematics from natural science. The fact that we can get a counterintuitive theorem as a result of logical and mathematical reasoning is not its weakness but its strength. But we have to note that a paradox (or a counterintuitive theorem) can be derived even without the Axiom of Choice. A good example of this is the Sierpinski-Mazurkiewicz Paradox: There is a subset of $R^2$ such that it is decomposed into two pieces and reassembled by rigid motions to form two copies of itself. Let τ be a translation by +1 and ρ be a counterclockwise rotation by one radian about the origin O. Actually the set of points obtained by multiplications of τ and ρ from the origin O, which is a subset of $R^2$, is paradoxical using two pieces. Call this set E. Note that the set of combinations of τ and ρ is not the group generated by τ and ρ because it does not contain the inverses $\tau^{-1}$ and $\rho^{-1}$. This is the reason why E is paradoxical using just two pieces. E is decomposed into two sets, depending on the last word multiplied from the origin O. To see this,

$E_1 = \{x \mid x = \tau \cdots (O)\}$

$E_2 = \{x \mid x = \rho \cdots (O)\}$

Now,

$E = E_1 \cup E_2$

$= \tau^{-1} E_1$

$= \rho^{-1} E_2$

The fact that there is a paradox (in the sense of a counterintuitive theorem) without invoking the Axiom of Choice is in favor of the defenders of the Axiom of Choice. For even if the Axiom of Choice yields a paradox like the Banach-Tarski Paradox, it does not constitute a good reason why we should reject the Axiom of Choice. But it is worth mentioning that there is a version of the Banach-Tarski Paradox that does not use the Axiom of Choice. For although it is good news for the Platonists that the Axiom of Choice is not the culprit for yielding the Banach-Tarski Paradox, at the same time this means that the Paradox does not necessarily express the Platonic (non-constructive) truth. Actually, in one version we need a countable number of pieces in a paradoxical

decomposition rather than a finite number of (the minimum five, for that matter) pieces. More surprisingly, however, Dougherty and Foreman found another version in which we only use a finite number of pieces. This is a quite significant result because the Platonists can no longer claim that we can get the Banach-Tarski Paradox only in a non-constructive way. The Banach-Tarski Paradox can be derived in a constructive way as well. So I need to review the proof very carefully, especially in connection with the property of Baire. But at this point I would just say that this result does not necessarily deal a death blow to the Platonists. For the Platonists don’t have to change their minds about the truth of the Banach-Tarski Paradox. This result is disconcerting to the Constructivists as well. For according to their methodological principle, they are forced to accept the Banach-Tarski Paradox that they have been refusing to accept. In any case, even if the Banach-Tarski Paradox can be reformulated without the Axiom of Choice, the Platonists may ask the Constructivists the following questions before making a concession to them.
(1) Doesn’t the proof have a lower capability than the proof with the Axiom of Choice (as to the minimum number of pieces, for instance)?
(2) Doesn’t the proof contain some non-constructive principle other than the Axiom of Choice?
(3) Doesn’t the proof depend on extremely complex or ad hoc principles, compared with the proof with the Axiom of Choice?
If any of these questions is answered in the negative, momentum is on the side of the Platonists. If by assuming the Axiom of Choice the Banach-Tarski Paradox can be proved in a simpler, more systematic and more unified way, I believe the proof with the Axiom of Choice reflects reality. We should make a judgment on the ontological status of the Banach-Tarski Paradox comprehensively, not just based on the result but also including the process of the proof, considering that either proof fits in well with the essence of the matter. The Platonists are still not losing ground in this respect.

3.8 Some Philosophical Implications

We have discussed the Hausdorff Paradox, especially focusing on how the sphere is decomposed into pieces and these pieces are reassembled. We have seen that the pieces of

66 a paradoxical decomposition take over non-measurablity from a non-Lebesgue measurable set derived using the Axiom of Choice, and non-measurablity of these pieces causes the paradoxical nature of decomposition. The Banach-Tarski Paradox represents a characteristic feature of mathematics as distinct from natural science. The fact that we can get a counter-intuitive theorem as a result of logical and mathematical reasoning is not its weakness but its strength. The difficulty with the Banach-Tarski Paradox is due to the fact that not only is it not intuitive but it is also not effectively carried out. There are a lot of things going on out there that are true but counter-intuitive. But we are convinced of their truth if they are actually realized. So the idiosyncratic nature of the Banach-Tarski Paradox lies in that we are forced to change our philosophy in order to accept it. In other words, we accept the Banach-Tarski Paradox only on the basis of the Platonic assumption, that is, by assuming the Platonic ideal world over and above the spatio-temporal world. Also, the Banach-Tarski Paradox casts light on an epistemological question about Platonism: How can we get access to mathematical entities that are supposed to be non-spatio-temporal and thus causally inert? I suspect that the pathologies concerning non-Lebesgue measurable sets or the Banach-Tarski Paradox deal a blow to epistemology based on a mathematical intuition such as Gödel‘s. Gödel claims that just as we have a sensible intuition in physical sciences, so we have a mathematical intuition in mathematical sciences. Indeed we could say that we have a mathematical intuition about lower mathematical objects such as natural numbers. But we could hardly say that we have an intuition about non-Lebesgue measurable sets or the Banach-Tarski Paradox. Since Gödel‘s mathematical intuition develops from Husserlian intuition of essences, phenomenological epistemology would have to be brought into question in all. 
Conclusion The conflict between Platonism and Constructivism forms the watershed in the philosophy of mathematics. The Platonists posit mathematical entities as super-spatio-temporal ones. By contrast, the Constructivists restrict them to those that are legitimately constructible in space and time. The dispute over the use of the Axiom of Choice is the most symbolic of the opposition between Platonism and Constructivism. The Platonists accept the Axiom of Choice and allow for the existence of a set of members formed by infinitely many

arbitrary choices, while the Constructivists reject the Axiom of Choice and allow only for a set of members selected by a specific rule. One of the reasons the Constructivists reject the Axiom of Choice is that the existence of non-Lebesgue measurable sets, which is counterintuitive and surprising, can be proved using the Axiom of Choice. In this way Lebesgue's theory of measure is very closely related to the use of the Axiom of Choice. We have seen the distinctive feature of Lebesgue measure in the notion of σ-additivity. The Lebesgue integral based upon the Lebesgue measure greatly improved on the convergence properties of the Riemann integral based upon Jordan's content. Lebesgue's theory of measure is applied to the theory of large cardinals. The use of the Axiom of Choice yields the Banach-Tarski Paradox. But this is not a good reason to reject the Axiom of Choice. For the Banach-Tarski Paradox is a paradox only in the sense that it is a counterintuitive theorem. In this sense, the Banach-Tarski Paradox casts doubt on Gödel's mathematical intuition and Husserl's intuition of essences. The Banach-Tarski Paradox is not just a mathematical figment. It reflects reality. But since the Banach-Tarski Paradox cannot effectively be carried out, what kind of reality is it? A contribution philosophy can make to the Banach-Tarski Paradox is to provide a solid foundation for the Paradox by claiming that it reflects the reality of the Platonic world over and above the natural world.


CHAPTER 4

GÖDEL’S INCOMPLETENESS THEOREMS

Introduction

The aim of this chapter is to clarify the nature of arithmetical truths based on Gödel's First and Second Incompleteness Theorems. We approach Gödel's First Incompleteness Theorem from two different perspectives: Gödel's original paper and the theory of computability. We begin with the former (Section 1) and then move on to the latter. First of all, we make a distinction between the two senses of undecidability. An intuitive way to see if there is a decision method is to check whether or not a Turing machine halts. (Section 2) Church's thesis claims that the set of Turing computable functions coincides with the set of partial recursive functions. By these functions we define recursively enumerable sets and recursive sets, preparing for the different nature of the set of true sentences and the set of theorems (provable sentences) in arithmetic. (Section 3) We show that the Halting problem is undecidable. (Section 4) Then, based on the undecidability of the Halting problem, we show that first-order logic and arithmetic are each undecidable. (Section 5) From the undecidability of arithmetic, Gödel's First Incompleteness Theorem is immediate. (Section 6) Finally, we shall see Gödel's Second Incompleteness Theorem, to the effect that the consistency of arithmetic is unprovable within arithmetic itself. (Section 7) As a consequence, the consistency proof is not absolute but relative. (Section 8) I conclude that Gödel's Incompleteness Theorems, which imply that there are arithmetical truths we cannot get access to in an effective way, strengthen my claim that mathematical truths are of a non-constructive nature.

4.1 Gödel’s First Incompleteness Theorem

Before we go into Gödel's First Incompleteness Theorem, we shall take a quick look at the characteristic features of modern logic. Modern logic consists of propositional logic and predicate logic. The latter is further divided into first-order and second-order (or higher-order).
In first-order logic, the (universal and existential) quantifiers range only over individuals, but in second-order logic they range over predicates as well. Propositional logic and first-order logic behave in a similar way. Both have Completeness, Soundness and Compactness Theorems. These theorems are based on the distinction between syntax (theory of proof) and semantics (theory of truth). But first-order and second-order logic behave in a very different manner. The central mathematical notions, such as finitude and countability, can be defined in second-order logic, not in first-order logic. Also, the Completeness, Compactness and (upward / downward) Löwenheim-Skolem Theorems don't hold in second-order logic. They hold in first-order logic owing to the weakness of its expressive power. For this reason they are called "limitative theorems." The introduction of (universal and existential) quantifiers was an epoch-making achievement of Fregean logic.26 For instance, the sentence "Socrates is mortal" and the sentence "All human beings are mortal" share the same grammatical form by virtue of having the subject-predicate structure. But Frege shows that the former can be analyzed into M(S) whereas the latter can be analyzed into ∀x(H(x) → M(x)). That is, universally quantified sentences are interpreted as conditionals. So in some cases two sentences share the same grammatical form but their logical forms are different. We could say that the distinction between grammatical and logical form is one of the fruitful results of Fregean logic. The aim of Frege's "Begriffsschrift" ("Concept writing" or "Concept notation") is to create an ideal language rigorous enough to express mathematics by eliminating the ambiguity of ordinary language. Now we shall see Gödel's First Incompleteness Theorem in his original approach. Gödel's First Incompleteness Theorem tells us that:

Assume that a first-order formal system S is strong enough to express arithmetic and consistent. Then, S is incomplete, that is, there are undecidable sentences in the sense that they are neither provable nor disprovable in S.
26 Kant in Critique of Pure Reason says "until now it has also been unable to take a single step forward, and therefore seems to all appearance to be finished and complete." (B viii) But I don't want to rush to the conclusion that Kant couldn't anticipate the revolution to come in logic. For some would say that Kant's distinction between intuition and concept resembles Frege's distinction between argument and function. (For this, see e.g. Redding, Analytic Philosophy and the Return of Hegelian Thought, p. 93.)

In other words, Gödel's First Incompleteness Theorem says that a formal system that has enough strength to express arithmetic is either inconsistent or incomplete. In this connection, we have to note that the First Incompleteness Theorem does not say that every formal system is either inconsistent or incomplete. There are many formal systems that are both consistent and complete. The First Incompleteness Theorem only applies to formal systems in which arithmetic can be carried out. More specifically, Gödel's First Incompleteness Theorem tells us that:

Let the assumptions on a formal system S be the same as above. Then, there are true but unprovable sentences in S.

Although we have already distinguished syntax (theory of proof) from semantics (theory of truth), Gödel's First Incompleteness Theorem shows that there actually is a gap between them. Using the method of Gödel numbering, which assigns a Gödel number to each arithmetical sentence, Gödel formulated a true but unprovable sentence called the Gödel sentence G in Peano Arithmetic PA. The Gödel sentence G is self-referential, saying in effect "I am not provable in PA." In symbols, G: PA ⊬ G. Pr(x, y) expresses that x is a proof of the formula with Gödel number y. Let z be the Gödel number of a formula that has exactly one free variable, and let sub(z, z) be the formula obtained by substituting the numeral for z for the variable of the formula with Gödel number z.27 Then, Pr(x, sub(z, z)) means that x is a proof of the formula sub(z, z), that is, of the formula obtained by substituting the numeral for z for the single free variable of the formula with Gödel number z. Take ~∃xPr(x, sub(z, z)). Since this is also an arithmetical formula, it has a Gödel number, say, g. Now, consider what ~∃xPr(x, sub(g, g)) exactly means. It means that there is no proof of the formula (i.e., ~∃xPr(x, sub(g, g)) itself!) obtained by substituting the numeral for g for the variable (i.e., z) of the formula with Gödel number g (i.e., ~∃xPr(x, sub(z, z))).
Therefore, ~∃xPr(x, sub(g, g)) is an arithmetical translation of the sentence "I am not provable in PA." Now we shall prove that, assuming the ω-consistency of Peano Arithmetic PA, the Gödel sentence G is a true but unprovable sentence in PA.28 We have an important theorem as follows:

If a relation R(x1, x2, …, xn) is recursive, then it is representable in PA. That is, there is a formula $\overline{R}(x_1, x_2, \ldots, x_n)$ of PA such that, writing $\overline{a}$ for the numeral of the number a,

If R(a1, a2, …, an) does hold, then $\vdash_{PA} \overline{R}(\overline{a_1}, \overline{a_2}, \ldots, \overline{a_n})$. And,

If R(a1, a2, …, an) does not hold, then $\vdash_{PA} {\sim}\overline{R}(\overline{a_1}, \overline{a_2}, \ldots, \overline{a_n})$.

($\vdash_{PA} \overline{R}(a_1, a_2, \ldots, a_n)$ doesn't make sense because $\overline{R}$ is a formula of PA but a1, a2, …, an are numbers, not numerals.) The relation Pr(x, y) is recursive, so representable in PA. Therefore there is a formula $\overline{Pr}(x, y)$ such that

If Pr(m, n) does hold, then $\vdash_{PA} \overline{Pr}(\overline{m}, \overline{n})$. And,

If Pr(m, n) does not hold, then $\vdash_{PA} {\sim}\overline{Pr}(\overline{m}, \overline{n})$.

In what follows we write Pr for the representing formula as well. Now we show that G is neither provable nor disprovable.

(i) G is not provable in PA. Suppose that G were provable in PA. Then some number m is a proof of G, so ∃xPr(x, sub(g, g)) does hold.

So, ⊢PA ∃xPr(x, sub(g, g)) by the theorem above.

But ⊢PA G, and G is the sentence ~∃xPr(x, sub(g, g)).

So, ⊢PA ~∃xPr(x, sub(g, g)). A contradiction with the consistency of PA. Therefore, G is not provable in PA.

(ii) ~G is not provable in PA. We now know that G is not provable in PA by (i). Thus, ∃xPr(x, sub(g, g)) does not hold; that is, no number m is a proof of G.

So, ⊢PA ~Pr(m, sub(g, g)) for every number m, by the theorem above.29

So, PA ⊬ ∃xPr(x, sub(g, g)) by ω-consistency. Since ~G is equivalent to ∃xPr(x, sub(g, g)), ~G is not provable in PA.

27 Note that x is not simply a proof of the formula with Gödel number z.

28 In order to prove that the Gödel sentence G is true but unprovable in PA, Gödel actually used an assumption stronger than consistency, namely ω-consistency. ω-consistency means that for no formula A(x) do we have both ⊢ A(0), ⊢ A(1), …, and ⊢ ∃x~A(x). ω-inconsistency means that for some formula A(x) we have both ⊢ A(0), ⊢ A(1), …, and ⊢ ∃x~A(x). If a formal system is ω-consistent, then it is consistent. But the converse does not hold.

29 Or in what follows, suppose that ~G were provable in PA. So, ⊢PA ~G. So, ⊢PA ∃xPr(x, sub(g, g)). This contradicts ω-consistency.

But undecidable sentences are not confined to self-referential sentences such as the Gödel sentence G. Even though a self-referential sentence can be translated into the language of arithmetic, the sentence itself is of no mathematical interest. This does not reduce the importance of the First Incompleteness Theorem because, as we shall see soon, the Gödel sentence G plays an essential role in the Second Incompleteness Theorem. More significantly, however, undecidable sentences in Gödel's First Incompleteness Theorem could be not only self-referential but also of mathematical interest. Actually, the Four-Color Conjecture, which is now confirmed, was once suspected to be such an undecidable sentence in the sense of Gödel's First Incompleteness Theorem. It is indeed not the case that all undecidable sentences are true, but it is important that an undecidable sentence could be true. An undecidable sentence is undecidable not in an absolute sense but in a relative sense. An undecidable sentence in some formal system is undecidable because the formal system is too weak to prove that sentence. So an undecidable sentence in some formal system is always decidable in a system stronger than the system concerned. In particular, the Gödel sentence G in Peano Arithmetic PA is decidable in the formal system PA+G, in which G is added to PA as a new axiom. But in the new formal system there arises another undecidable sentence G' claiming its own unprovability in PA+G. In symbols, G': PA+G ⊬ G'.

4.2 Turing Machines

In what follows, we approach Gödel's First Incompleteness Theorem from the theory of computability. But in order to do that, we have to show that arithmetic is undecidable. It is worth noting that there are two distinct senses of undecidability. We have seen above in Gödel's First Incompleteness Theorem that a sentence is said to be undecidable if it is neither provable nor disprovable. But there is another sense of undecidability in which a set A is said to be undecidable if there is no mechanical method to decide whether or not x is a member of A. This is exactly what we mean by saying that first-order logic or arithmetic is undecidable. As we shall see later, these two senses are indeed connected in some way, but we should be careful not to confuse one with the other. An intuitive way to understand what constitutes a mechanical method to decide whether or not x is a member of A is to check whether or not Turing machines halt. Roughly speaking, there is a mechanical method to decide whether or not x is a member of A if and only if there is a computer program by which a Turing machine halts if x is a member of A and there is another computer program by which a Turing machine halts if x is a member of A’ (the complement of A). Now we shall see how Turing machines work. A Turing machine can be seen as a black box into which we input a tape which is divided into equal squares and infinite in both directions. The simplest Turing machines have two tape symbols, 0 and 1. When a

Turing machine is in the state qy and scanning the symbol xn, the instruction to the Turing machine is given by a quadruple Ii = ⟨qy, xn, d, qy'⟩:
(1) The current state
(2) The scanned symbol
(3) The action taken
(4) The next state
There are only three actions that can be taken:
(a) Print a symbol (first erase the symbol in the square being scanned and then write the other symbol in the same square).
(b) Move one square to the right.
(c) Move one square to the left.

The combination of qy and xn has to be unique, because otherwise the next instruction cannot be specified. The set of instructions {I0, I1, I2, …, Ik-1} constitutes a computer program of a Turing machine. A Turing machine halts if there is no instruction Ik to carry out next. For instance, the set of instructions

⟨q0, 1, R, q1⟩, ⟨q1, 1, R, q0⟩, ⟨q1, 0, R, q2⟩, ⟨q2, 0, R, q1⟩

is an example of a computer program to decide whether or not x is even. That is, if x is even, the Turing machine halts, and if x is odd, it doesn't halt. Another set of instructions

⟨q0, 1, R, q1⟩, ⟨q1, 1, R, q0⟩, ⟨q0, 0, R, q2⟩, ⟨q2, 0, R, q0⟩

is an example of a computer program to decide whether or not x is odd. That is, if x is odd, the Turing machine halts, and if x is even, it doesn't halt. I shall also give an example of a computer program to compute x+y. Since there is a one-one correspondence between the set of computer programs and the set of natural numbers, we can enumerate the computer programs of Turing machines as T0, T1, T2, …. Also, when we input n into a Turing machine Tm, we denote it by Tm(n).
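The two parity programs above can be tried out with a small simulator. The following is only a sketch under stated assumptions: the input x is written on the tape as a block of x consecutive 1s with the head starting on the leftmost square of the block in state q0 (the text does not fix an input convention), and the names `run`, `EVEN`, `ODD` and the step budget `max_steps` are illustrative, not taken from the text. Because no simulation can observe "never halts," a return value of False only means that the machine had not halted within the budget.

```python
from collections import defaultdict


def run(program, ones, max_steps=10_000):
    """Simulate a quadruple Turing machine on a block of `ones` strokes.

    `program` maps (state, scanned_symbol) to (action, next_state), where the
    action is 'R', 'L', or a symbol (0 or 1) to print. Returns True if the
    machine halts within `max_steps`, False if the step budget runs out.
    """
    tape = defaultdict(int)           # blank squares hold 0; unbounded both ways
    for i in range(ones):
        tape[i] = 1                   # assumed input convention: x strokes
    state, head = "q0", 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in program:        # no applicable instruction: the machine halts
            return True
        action, state = program[key]
        if action == "R":
            head += 1
        elif action == "L":
            head -= 1
        else:
            tape[head] = action       # print a symbol in the scanned square
    return False


# The two instruction sets from the text, written as lookup tables.
EVEN = {("q0", 1): ("R", "q1"), ("q1", 1): ("R", "q0"),
        ("q1", 0): ("R", "q2"), ("q2", 0): ("R", "q1")}
ODD = {("q0", 1): ("R", "q1"), ("q1", 1): ("R", "q0"),
       ("q0", 0): ("R", "q2"), ("q2", 0): ("R", "q0")}
```

Under these assumptions, `run(EVEN, 4)` halts because the machine ends in q0 scanning a blank square, for which EVEN has no instruction, whereas `run(EVEN, 3)` keeps moving right between q1 and q2 until the budget is exhausted.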

Figure 14: Example of a computer program to decide whether x is even.

Figure 15: Example of a computer program to decide whether x is odd.

Figure 16: Example of a computer program to compute x+y.

4.3 Recursive Functions and Recursive Sets

Church's thesis claims that the set of Turing computable functions coincides with the set of partial recursive functions. We say that f(x) is undefined if x ∉ Dom(f). A function is said to be partial if it is not defined for every natural number, and total otherwise. We shall define primitive recursive and recursive functions. To that aim, we begin with the definition of the initial functions and the three rules. The initial functions are:
(1) The zero function: z(n) = 0
(2) The successor function: s(n) = n+1
(3) The projection function: $p_i^j(x_1, x_2, \ldots, x_j) = x_i$. In particular, $p_1^1(x_1) = x_1$, so this is the identity function.
The three rules are:
(i) Composition: f(x) = g(h(x))
(ii) Recursion: f(0) = k, f(1) = h(1, f(0)), f(2) = h(2, f(1)), …, f(n) = h(n, f(n-1))
(iii) Minimization: f(x) = μy[g(x, y) = 0] (f(x) is the least number y such that g(x, y) = 0.)
A function is said to be recursive if it is obtained by applying the three rules to the initial functions. A function is said to be primitive recursive if it is obtained by applying only the rules (i) and (ii) to the initial functions. We now move from recursive functions to recursive sets. Informally, the set S is said to be recursively enumerable if there is a computer program that halts whenever a member of S is input. The set S is said to be recursive (decidable) if not only is there a computer program that halts whenever a member of S is input, but there is also a complementary computer program that halts whenever a member of the complement S’ of S is input. So, we have to note that for a recursively enumerable set there is not always a complementary computer program that halts on the inputs for which the first computer program doesn't halt. In what follows, we shall give a few formal definitions of recursively enumerable and recursive sets.
(1) The set S is recursively enumerable if
(i) S is the domain of a partial recursive function.
(ii) The partial characteristic function χS(x) = 1 if x ∈ S, undefined if x ∉ S, is Turing-computable.


Figure 17: Recursively enumerable (r.e.).

(2) The set S is recursive (decidable) if
(i) S is recursively enumerable and S' (the complement of S) is also recursively enumerable.30
(ii) The characteristic function χS(x) = 1 if x ∈ S, 0 if x ∉ S, is Turing-computable.

30 It is shown that the set S' is recursively enumerable if and only if x ∈ S' is expressed by ∃yφ(x, y) using a decidable predicate φ. So x ∈ S is expressed by ∀y~φ(x, y). Then S is called co-recursively enumerable. As a result, the set S is recursive if and only if S is recursively enumerable and S is co-recursively enumerable.

The upshot of these definitions is this: In a recursive (decidable) set, there is a computer program that halts for every member in the set and there is a complementary computer program that halts for every member in the complement of the set. So, by simultaneously running the complementary computer program, we can decide whether, for an input on which a computer program is still running, the computer program will never halt or will halt at some time in the future. In a non-recursive (undecidable) set, there is no computer program that halts for every member in the set and/or there is no computer program that halts for every member in the complement of the set. So, we cannot decide whether, for an input on which a computer program is still running, the computer program will never halt or will halt at some time in the future.
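The idea of running a program and its complementary program side by side can be sketched as follows. This is an illustration under assumptions, not a construction from the text: each "program" is modelled as a Python generator that yields once per computation step and finishes exactly when it halts, and the names `semi_even`, `semi_odd` and `decide` are hypothetical. The even/odd pair is deliberately trivial; the point is only the interleaving.

```python
def semi_even(x):
    """Toy semi-decider: finishes (halts) if and only if x is even."""
    while x % 2 != 0:
        yield                    # one more step of a run that never ends


def semi_odd(x):
    """Toy semi-decider for the complement: halts if and only if x is odd."""
    while x % 2 == 0:
        yield


def decide(x, accept, reject):
    """Alternate single steps of both programs until one of them halts.

    Terminates on every x provided exactly one of the two programs halts on x,
    which is the situation of a set whose complement is also recursively
    enumerable.
    """
    a, r = accept(x), reject(x)
    while True:
        try:
            next(a)              # one step of the program for the set
        except StopIteration:
            return True          # it halted: x is a member
        try:
            next(r)              # one step of the complementary program
        except StopIteration:
            return False         # it halted: x is in the complement
```

With only `semi_even` available, an odd input would leave us waiting indefinitely; it is the second program that turns waiting into a decision.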

Figure 18: Decidable (recursive).

Figure 19: Undecidable.

For instance, the set S of even numbers is recursively enumerable. The set S' (the complement of S, i.e., the set of odd numbers) is also recursively enumerable. So the set S of even numbers is recursive. The same applies to the set S' of odd numbers. The crucial result is that all recursive sets are recursively enumerable, but not vice versa. Actually, there are sets that are recursively enumerable but not recursive. As we shall see later, two examples are the set K of the Halting problem and the set of theorems (provable sentences) in arithmetic.
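Returning to the definitions at the start of this section, the initial functions and the three rules can be written down directly. The sketch below is illustrative only: the helper names (`proj`, `compose`, `primitive_recursion`, `minimization`) are not from the text, the recursion scheme is the one-argument form f(0) = k, f(n) = h(n, f(n-1)) given above, and the test function inside `isqrt` uses ordinary Python arithmetic for brevity rather than being built up from the initial functions.

```python
def zero(n):
    return 0                      # z(n) = 0


def succ(n):
    return n + 1                  # s(n) = n + 1


def proj(i, j):
    """Projection p_i^j: pick the i-th of j arguments (1-indexed)."""
    def p(*xs):
        assert len(xs) == j
        return xs[i - 1]
    return p


def compose(g, h):
    """Rule (i): f(x) = g(h(x))."""
    return lambda *xs: g(h(*xs))


def primitive_recursion(k, h):
    """Rule (ii): f(0) = k and f(n) = h(n, f(n - 1))."""
    def f(n):
        value = k
        for i in range(1, n + 1):
            value = h(i, value)
        return value
    return f


def minimization(g):
    """Rule (iii): f(x) = least y with g(x, y) == 0.

    If no such y exists the loop never ends, which is how a partial
    (not everywhere defined) function arises.
    """
    def f(x):
        y = 0
        while g(x, y) != 0:
            y += 1
        return y
    return f


# double(n) = 2n, by recursion from k = 0 with h(n, prev) = s(s(prev)).
double = primitive_recursion(0, lambda n, prev: succ(succ(prev)))

# Integer square root as a minimization: least y with (y + 1)^2 > x.
isqrt = minimization(lambda x, y: 0 if (y + 1) * (y + 1) > x else 1)
```

Only `minimization` can fail to terminate, which reflects the difference between primitive recursive functions (always total) and recursive functions in general.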


Figure 20: Example of a recursive set.

Figure 21: Example of a recursively enumerable but not recursive set.

4.4 The Halting Problem

We shall begin this section by giving some fundamental decidability and undecidability results:
(1) Propositional logic is decidable.
(2) Pure monadic first-order logic is decidable.
(3) The Halting problem is undecidable.
(4) First-order logic is undecidable.
(5) Arithmetic is undecidable.
The Halting problem is important because its undecidability provides the foundation for other undecidability proofs: it is a logician's stock in trade to establish the latter by a reduction involving the former. So we have to state what the Halting problem is and why the Halting problem is undecidable. In order to prove that the Halting problem is undecidable, we appeal to a diagonal argument. The proof is analogous to that of Cantor's theorem to the effect that the cardinality of the power set of a set is strictly greater than the cardinality of the original set.31 We write Tx(x)↓ if Tx halts on input x and Tx(x)↑ if it does not. We show that K = {n | Tn(n)↓} is undecidable. Suppose, for reductio ad absurdum, that K were decidable. By definition, the characteristic function

χK(x) = 1 if Tx(x)↓
χK(x) = 0 if Tx(x)↑

is Turing-computable. Consider a function g such that

x ∈ Dom(g) ⇔ Tx(x)↑ (i)

Let g be

g(x) = 0 if Tx(x)↑
g(x) undefined if Tx(x)↓

Since χK is Turing-computable, g is also Turing-computable. So there is a computer program Tn that computes g, so

Tn(n)↓ ⇔ n ∈ Dom(g) (ii)

If Tn(n)↓, then by (i), n ∉ Dom(g), but by (ii), n ∈ Dom(g). A contradiction.

If Tn(n)↑, then by (i), n ∈ Dom(g), but by (ii), n ∉ Dom(g). A contradiction.

Therefore, K is undecidable.
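The diagonal step can be mimicked in code. The following is a sketch, not a proof: programs are modelled as Python functions, a purported halting test is any function `halts(program, argument)` returning True or False, and "running forever" is simulated by raising a sentinel exception so that the example itself terminates. The names `Diverges`, `make_diagonal` and `refutes` are hypothetical.

```python
class Diverges(Exception):
    """Stands in for a computation that never halts."""


def make_diagonal(halts):
    """Build g from a purported halting test, following (i) in the proof."""
    def g(program):
        if halts(program, program):
            raise Diverges()     # g is undefined where T_x(x) is said to halt
        return 0                 # g(x) = 0 where T_x(x) is said not to halt
    return g


def refutes(halts):
    """Return True if `halts` gives the wrong answer about g applied to g."""
    g = make_diagonal(halts)
    claimed = halts(g, g)
    try:
        g(g)
        actual = True            # g(g) returned a value, i.e. it halted
    except Diverges:
        actual = False           # g(g) "ran forever"
    return claimed != actual
```

Whatever total test is supplied, `refutes` comes out True: if the test says g(g) halts, g(g) diverges, and if it says g(g) diverges, g(g) returns 0. This mirrors the two contradictions derived from (i) and (ii).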

31 If we denote the set of inputs y on which a Turing machine Tx halts by Wx = {y | Tx(y)↓}, then K = {x | x ∈ Wx}.


Figure 22: The analogy between the Halting Problem and Cantor‘s Theorem.


We can go further than that. K is obviously recursively enumerable, since we can simply run Tn on input n and wait for it to halt. So K' (the complement of K) is not recursively enumerable, for otherwise K would be recursive. K is called a creative set and K' is called a productive set.

4.5 The Undecidability of First-order Logic and Arithmetic

Based on this result, we shall show the undecidability of first-order logic. Let I1, I2, …, Ik-1 (i, j ≤ k) be a set of instructions in a Turing machine. So each instruction consists of a quadruple: Ii = ⟨qy, xn, d, qy'⟩.

(i) If Ii = "print a symbol," let φi be the sentence

∀x1…∀xu∀y[R(x1, …, y, xn, …, xu; i) → R(x1, …, y', xn', …, xu; j)]

where xn' = 1 if xn = 0, and xn' = 0 if xn = 1.

(ii) If Ii = "move one square to the right," let φi be the sentence

∀x1…∀xu∀y[R(x1, …, y, xn, …, xu; i) → R(x1, …, y', xn+1, …, xu; j)]

(iii) If Ii = "move one square to the left," let φi be the sentence

∀x1…∀xu∀y[R(x1, …, y, xn, …, xu; i) → R(x1, …, y', xn-1, …, xu; j)]

Now let φ be the sentence

φ: φ1 & φ2 & … & φk-1 & R(0, 0, …, 0; 1)

That is, φ expresses the set of instructions and the initial condition of the tape. Then let ψ be the sentence

ψ: φ → ∃x1…∃xu∃y[R(x1, …, y, …, xu; k)]

Assuming a set Γ consisting of a part of the Peano axioms, we have the following equivalence:

Γ ⊨ ψ ⇔ T(0)↓

In order to see this, if T(0)↓, then under the set of instructions and the initial condition of the tape there is a sentence R(a1, …, b, …, au; k) that holds, so ∃x1…∃xu∃y[R(x1, …, y, …, xu; k)] does hold. Conversely, if under the set of instructions and the initial condition of the tape there is a sentence R(a1, …, b, …, au; k) that holds, so that ∃x1…∃xu∃y[R(x1, …, y, …, xu; k)] does hold, then T(0)↓. Thus we actually formulate a sentence of the first-order language that is true exactly when T(0)↓. It remains to show that T(0)↓ is undecidable. Once we get this, by the equivalence above Γ ⊨ ψ is also undecidable. We complete the proof by showing that the Halting problem can be reduced to T(0)↓.
(i) First, we input a blank tape into a Turing machine.
(ii) Then, we construct n on the tape by the following computer program.

1: ⟨q0, 0, 1, q0⟩, ⟨q0, 1, R, q1⟩
2: ⟨q1, 0, 1, q1⟩, ⟨q1, 1, R, q2⟩
…
n: ⟨qn-1, 0, 1, qn-1⟩

(iii) Finally, we set up a computer program that proceeds in the same way that Tn processes n.

So there is some computer program Tm that combines both programs, so that Tn(n)↓ ⇔ Tm(0)↓.

We have already seen that the Halting problem Tn(n)↓ is undecidable. Therefore, T(0)↓ is also undecidable.


Figure 23: T(0)↓ is undecidable.

Indeed we are actually assuming a set Γ consisting of a part of the Peano axioms, but there is a theorem to the effect that if a system S' is a finite extension of S and S' is undecidable, then S is also undecidable. Here Γ is a finite extension of first-order logic, so it is shown that first-order logic is undecidable. We can show the undecidability of arithmetic using the same technique.

4.6 Undecidability and Incompleteness

Once we have the undecidability of arithmetic, Gödel's First Incompleteness Theorem is immediate as follows. The set of true sentences is not recursively enumerable. Suppose, for reductio ad absurdum, that the set of true sentences in arithmetic were recursively enumerable. Then the set of false sentences in arithmetic would also be recursively enumerable. In order to see this, we have only to point out that for a false sentence ψ its negation ~ψ is true. In symbols,

F = {ψ | ⊭ ψ} = {ψ | ⊨ ~ψ}

By definition, a set S is recursive if and only if S is recursively enumerable and S' (the complement of S) is also recursively enumerable. So the set of true sentences would be recursive, and arithmetic would be decidable. This contradicts the undecidability of arithmetic. So, the set of true sentences in arithmetic is not recursively enumerable. But the set of theorems (provable sentences) of PA is recursively enumerable. Note that in any axiomatizable theory the set of theorems is recursively enumerable and PA is an axiomatizable theory. So it is easy to see that there is a sentence that is true but unprovable in PA, so there is some arithmetical truth we cannot get access to in an effective way. This is the approach to Gödel's First Incompleteness Theorem from the theory of computability. This approach shows the non-constructive nature of arithmetical truths more clearly than Gödel's original approach. PA is axiomatizable (so, the set of theorems of PA is recursively enumerable), whereas arithmetic, taken as the set of true arithmetical sentences, is not axiomatizable (so, this set is not recursively enumerable). But PA is not complete, whereas arithmetic is complete. Here we can see that there is a tension between axiomatizability and completeness. The incompleteness proof doesn't show that the system is undecidable. So, Gödel's First Incompleteness Theorem doesn't constitute a proof that arithmetic is undecidable. Consider propositional logic. There is a sentence that is neither provable nor disprovable. But propositional logic is decidable. But a system in which some sets are undecidable is incomplete. In order to see the relation between decidability and completeness, the following theorem is useful:32

Any axiomatizable complete theory is decidable.

A theory is complete in the sense that for every sentence in the theory either it or its negation can be proved. First-order logic is axiomatizable but incomplete in this sense.
So it comes as no surprise that first-order logic is undecidable. Also, every complete decidable theory is axiomatizable. It is trivially true. But it is false that every decidable axiomatizable theory is complete.
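The reasoning behind the theorem that any axiomatizable complete theory is decidable can be sketched as follows: enumerate the theorems one by one and stop as soon as either the sentence or its negation turns up; completeness guarantees that one of them eventually does. The toy theory below is an assumption made purely for illustration. Its only sentences are E(n) and ~E(n) for natural numbers n, and its theorem list is generated by a hypothetical enumerator `theorems`; nothing here models PA.

```python
from itertools import count


def theorems():
    """Enumerate the theorems of a toy complete theory about evenness.

    For each n exactly one of E(n), ~E(n) is listed, so the toy theory is
    complete with respect to sentences of these two forms.
    """
    for n in count():
        yield f"E({n})" if n % 2 == 0 else f"~E({n})"


def negate(sentence):
    """Return the negation, cancelling a leading '~' instead of doubling it."""
    return sentence[1:] if sentence.startswith("~") else "~" + sentence


def is_theorem(sentence):
    """Decide theoremhood by searching the enumeration for a proof or refutation.

    Terminates only for sentences of the toy language (E(n) or ~E(n)); for
    anything else neither the sentence nor its negation is ever listed.
    """
    refutation = negate(sentence)
    for t in theorems():
        if t == sentence:
            return True
        if t == refutation:
            return False
```

If the theory were axiomatizable but incomplete, the same search would run forever on any sentence that is neither provable nor refutable, which is why completeness is needed in the theorem.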

32 For this theorem, see Boolos and Jeffrey, Computability and Logic (1974), pp. 177-8, and Cohen, Computability and Logic, p. 204.


Figure 24: Approach to Gödel's First Incompleteness Theorem from the theory of computability.

In propositional logic we have a decision procedure based on the truth table. In propositional logic, no matter how complicated the sentence is, we can mechanically decide whether or not the sentence is a tautology. That is, we assign truth-values to every sentence letter of that sentence and pay attention to the last column of the truth table. If the entries are all true, the sentence is logically valid; otherwise it is not logically valid. We have already seen that there is a kind of analogy between propositional logic and first-order logic. In this vein, it is natural to conjecture that first-order logic might be decidable. But Church's undecidability theorem proves the opposite. We wish there were a mechanical method to decide whether or not some sentence is provable, especially when the sentence is very complicated and seems to be very difficult to prove. For, if there were such a method, we could know whether or not the sentence has a proof, even if we can't get the concrete proof. An undecidability result dashes that hope. It tells us that there is no such decision method, so we have to come up with the proof case by case or wait until a Turing machine halts, with the suspicion that the sentence might not be a theorem. It is often said that Gödel's First Incompleteness Theorem shows that the human mind is not a machine. But we have to make a distinction between the effectiveness of proofs (formal methods) and the effectiveness of decision (mechanical methods). In order to be able to decide by a mechanical procedure whether or not there is a proof, both the set of theorems and its complement must be recursively enumerable. But if only the set of theorems is recursively enumerable, the proofs are still formal. So mechanical effectiveness is stricter than formal effectiveness. Church's Undecidability Theorem shows the limitations of mechanical methods, whereas Gödel's First Incompleteness Theorem shows those of formal methods.

4.7 Gödel’s Second Incompleteness Theorem

Gödel's Second Incompleteness Theorem tells us that the consistency of a formal system that is strong enough to express arithmetic is unprovable within the system itself. In particular, the consistency of Peano Arithmetic PA is unprovable within PA itself. Suppose that the consistency of PA were provable in PA. In symbols,

⊢PA Con(PA).

Gödel's First Incompleteness Theorem implies that we can prove in PA that if arithmetic is consistent, then the Gödel sentence G holds. In symbols, ⊢PA (Con(PA) → G).

So by modus ponens we get ⊢PA G, which contradicts the fact that, assuming the consistency of PA, G is unprovable in PA (Gödel's First Incompleteness Theorem). Thus the consistency of PA is unprovable in PA (Gödel's Second Incompleteness Theorem). But we have to note that Gödel's Second Incompleteness Theorem does not say that the consistency of arithmetic is unprovable by any means whatsoever. The Second Incompleteness Theorem just rules out a consistency proof of Peano Arithmetic within PA itself. Actually, Gentzen later proved the consistency of arithmetic, no longer finitistically but using a stronger method: transfinite induction. Just by replacing "natural numbers" in complete (ordinary) induction with "transfinite ordinals," we obtain transfinite induction.33 It is worth noting that Gentzen claims that a method stronger than the axiomatic methods dictated by Hilbert's programme is indeed required to prove the consistency of arithmetic, but that it is still in harmony with the constructivist interpretation of infinity. But it seems to me that even if the consistency of a formal system is proved by a stronger system, the consistency of the stronger system must in turn be proved by a still stronger system. Therefore, there will be a need to secure its consistency outside the formal system in one way or another at some point. For could the consistency of a formal system be proved by a stronger one that is more likely to be inconsistent? As a consequence of Gödel's Second Incompleteness Theorem, we have learned that there is no such thing as an absolute consistency proof: the consistency proof of the system within the system itself. The appeal to the existence of a model outside the formal system enables a relative consistency proof: the consistency proof of one formal system on the assumption of the consistency of another formal system. Gödel's Incompleteness Theorems show that in order to secure mathematical truths we have to appeal to the existence of a model outside the formal system. Gödel believed that the Incompleteness Theorems support Platonic realism in this sense.

4.8 Relative Consistency Proofs

A model M of a first-order system S is an interpretation that makes every theorem of S true. There are multiple models (interpretations) that satisfy a formal system S. The idea of a relative consistency proof runs as follows. Our goal is to prove the consistency of a formal system S+A assuming the consistency of the formal system S, that is,

Con(S)ωon(S+A) All we have to do is to construct a model M of S that makes A true. If we can find such a model M, M satisfies S and M also satisfies A. And we may assume Con(S). Then, suppose ~Con(S+A). Since S is consistent, S proves ~A. Since M is a model that

33 The transfinite ordinals are the extension of the sequence of natural numbers (n)μ 1, β, …, , +1, +β, …, β, …, 2, …, , …, …, ….

91 satisfies S, M is a model that also satisfies ~A. But we have already seen that M satisfies A. A contradiction. Therefore, we get Con(S+A). We shall discuss the reason why a relative consistency proof is so important. There is a tension between completeness and consistency. For if a formal system is strengthened for the sake of its completeness, then it will be more likely to be inconsistent. To the contrary, if a formal system is weakened for the sake of its consistency, then it will be more likely to be incomplete. This is the reason why there is need to adjust the axioms of a formal system. The more axioms are added to a formal system, the more chance there is that the system concerned is inconsistent. So, a formal system S+A is more likely to be inconsistent than a formal system S. This likelihood seems to constitute a good reason why we should adopt a formal system S instead of a formal system S+A. But if it can be shown that: Con(S)ωon(S+A) then such a justification loses its force. Rather, it would be more rational to say that if once we accept a formal system S, then we should accept a formal system S+A as well. For instance, ZF set theory is more likely to be inconsistent than ZF set theory minus the Axiom of Foundation (ZF-). So, at this point ZF is more dubious than ZF-. But suppose that we can prove that: Con(ZF-)→Con(ZF) This is a relative consistency proof because we prove the consistency of one formal system (i.e., ZF) assuming the consistency of another formal system (i.e., ZF-). Then, if we accept ZF-, there will be no good reason why we reject ZF. This line of argument directly leads to the consistency proof of the Axiom of Choice with ZF set theory. The consistency proof of the Axiom of Choice with ZF set theory is a relative consistency proof in the sense that we prove the consistency of ZF+AC (ZFC) assuming that ZF is consistent. That is, we prove that: (1) ωon(ZF)ωon(ZFω) Actually Gödel showed a little stronger result. 
Gödel used constructible sets in order to construct a model of the axioms of ZF set theory that satisfies the Axiom of Choice. The Axiom of Constructibility says that every set is constructible. In symbols, V=L. Gödel proved that:

(1’) Con(ZF)→Con(ZF+V=L)

This is also a relative consistency proof, of ZF+V=L on the assumption that ZF is consistent. Now V=L implies the Axiom of Choice. So, as a corollary of (1’), we get (1) Con(ZF)→Con(ZF+AC). In a later chapter we shall see that many mathematicians reject the Axiom of Constructibility. For Dana Scott showed that if we assume the Axiom of Constructibility, there exist no measurable cardinals. Since we know that measurable cardinals play a large role in the theory of large cardinals, we don’t want to reject them. Gödel himself was skeptical about the Axiom of Constructibility.

Here is a worry. Gödel did not show that the Axiom of Choice is consistent with the axioms of ZF set theory irrespective of V=L. In other words, Gödel’s relative consistency proof of the Axiom of Choice with the axioms of ZF set theory depends on the assumption V=L. It is a separate issue whether or not the Axiom of Choice is consistent with the axioms of ZF set theory if V≠L. But this doesn’t mean that if we are seeking the consistency of the Axiom of Choice, we have to adopt the V=L model. What Gödel did show is that the V=L model satisfies the Axiom of Choice and the axioms of ZF set theory, so that the Axiom of Choice is consistent with the axioms of ZF set theory, not that the V=L model is the only model that can satisfy them. So a relative consistency proof of ZFC+V≠L with respect to ZFC becomes a matter of urgency, that is,

(3) Con(ZFC)→Con(ZFC+V≠L)

To that aim, we have only to show that there is a model that denies V=L and satisfies the Axiom of Choice and the axioms of ZF set theory. But I cannot pursue this here because, in order to do so, we need Cohen’s method.

In general, if A is an undecidable sentence in a formal system S, we can consider two new formal systems extending S: the formal system S+A and the formal system S+~A. If both of the new systems are consistent whenever the old system is, A is independent of S. Cohen proved that the negation of the Axiom of Choice is consistent with the axioms of ZF set theory, that is,

(2) Con(ZF)→Con(ZF+~AC)

In order to show this, Cohen used the method of forcing and generic sets. Putting both (1) and (2) together, the Axiom of Choice was proved to be independent of ZF set theory.

Conclusion

We have seen Gödel’s First Incompleteness Theorem in two different approaches: Gödel’s original paper and the theory of computability. In his original paper, Gödel constructed the Gödel sentence “I am not provable in PA” in the first-order language and showed that it is neither provable nor disprovable in PA. Since this sentence claims its own unprovability in PA, it is actually true but unprovable. Gödel’s First Incompleteness Theorem can be shown from the theory of computability as well. The set of true sentences in arithmetic is not recursively enumerable. If it were recursively enumerable, the set of false sentences would also be recursively enumerable. But this contradicts the undecidability of arithmetic. On the other hand, the set of theorems (provable sentences) in arithmetic is recursively enumerable. So there is a sentence in arithmetic that is neither provable nor disprovable. Also, we have seen, as a consequence of the First Incompleteness Theorem, Gödel’s Second Incompleteness Theorem to the effect that the consistency of arithmetic is unprovable within arithmetic itself. As a result, there is no such thing as an absolute consistency proof, and a consistency proof is necessarily relative. A relative consistency proof is possible only by appeal to the existence of a model outside the formal system.
Gödel’s Incompleteness Theorems imply that there are arithmetical truths we cannot get access to in an effective way, and strengthen my claim that mathematical truths are of a non-constructive nature.
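
The pattern of a relative consistency proof described in Section 4.8 can be illustrated in a toy propositional setting. The following is only a sketch by analogy, not Gödel’s or Cohen’s construction: the theory S, the sentence A and all function names are hypothetical, and consistency can be checked here by brute-force search for a model only because the language has finitely many atoms.

```python
from itertools import product

# Toy analogy only: a "sentence" is a Python function from a truth
# assignment (a dict atom -> bool) to bool. All names are hypothetical.
ATOMS = ("p", "q", "r")

def models(theory):
    """Return every truth assignment that satisfies all sentences of `theory`."""
    found = []
    for values in product((False, True), repeat=len(ATOMS)):
        assignment = dict(zip(ATOMS, values))
        if all(sentence(assignment) for sentence in theory):
            found.append(assignment)
    return found

def consistent(theory):
    """In propositional logic a theory is consistent iff it has a model."""
    return bool(models(theory))

# S consists of two sentences: p -> q and q -> r.
S = [lambda v: (not v["p"]) or v["q"],
     lambda v: (not v["q"]) or v["r"]]

# A is undecided by S: some models of S make it true, others make it false.
A = lambda v: v["p"]
not_A = lambda v: not v["p"]

# One model of S in which A holds plays the role of Con(S) -> Con(S+A);
# one model of S in which A fails plays the role of Con(S) -> Con(S+~A).
model_with_A = next(m for m in models(S) if A(m))
model_without_A = next(m for m in models(S) if not A(m))

print(consistent(S + [A]), consistent(S + [not_A]))
```

Both extensions come out consistent, which is the propositional counterpart of saying that A is independent of S. The analogy is limited: for ZF no such exhaustive search is available, which is why the models have to be constructed by other means.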


CHAPTER 5

MODEL-THEORETIC ARGUMENTS

Introduction

In the last chapter, by drawing upon Gödel’s Incompleteness Theorems, we have seen that formal methods have limitations. That is, Gödel’s First Incompleteness Theorem tells us that there is a sentence in a first-order formal system that is true but unprovable. And, as a corollary of that, Gödel’s Second Incompleteness Theorem tells us that the consistency of a formal system cannot be proved within that system itself. Also, we have seen that Platonists don’t have to be bothered by either limitation because they claim that we have to go outside of the formal system and resort to informal methods at some point. In this chapter, by drawing upon the Löwenheim-Skolem Theorem we shall see another limitation inherent in formal methods. The Löwenheim-Skolem Theorem shows that if a formal system has a model, then it has multiple models. And, as a consequence of that, the Skolem Paradox shows that some mathematically important notions are relative to a model. This seems to pose a threat to Platonists. For since these results imply that a reference relation depends on a model, one is inclined to hold that there is no unique complete description of the world independently of our conceptual schemes. But again I claim that this is a defect for formalists, not for Platonists.

To that aim, first of all, we state what a model is and consider the need to assign a model to a formal system (Section 1). Then, we discuss in detail the Löwenheim-Skolem Theorem and the Skolem Paradox (Section 2). Moreover, we shall argue that when reconsidered in the light of the Löwenheim-Skolem results, both Quine’s thesis of the indeterminacy of translation and Putnam’s model-theoretic arguments against metaphysical realism come to take on a clear meaning. Actually, we can regard the former as a precursor of the latter (Sections 3 and 4). Finally, we take note of the positive and negative lessons from the model-theoretic arguments (Section 5).
In spite of some important insights of the model-theoretic arguments, as a Platonist I conclude that there are still some informal methods left for us to overcome the challenges from the model-theoretic arguments.

5.1 What is a Model?

A formal system consists of a set of sentences whose symbols, except for the logical constants, are empty and stripped of meaning. It is an interpretation that assigns meaning to those sentences. We shall see in detail what it means to assign an interpretation to a formal system. Let x, y, z be variables, P be a two-place predicate symbol and f be a two-place function symbol. Consider the formula:

∀x∀y∃zP(f(x, z), y)

First, let the domain of the interpretation be the natural numbers, let P stand for equality and let f stand for multiplication. So an interpretation of the formula is:

∀x∀y∃z(x·z = y)

This is false, because the required z need not be a natural number (a counterexample: for x = 3 and y = 2 we would need z = 2/3 ∉ N). But if we extend the domain of the interpretation to the set of nonzero rational numbers, leaving the rest of the interpretation the same as above, then it is true, because the quotient of two nonzero rational numbers is again a nonzero rational number. (Zero has to be excluded, since for x = 0 and y = 1 there is no such z.) In general, the truth or falsehood of a formula depends on the interpretation given to the symbols in the formula. But there is a class of formulas that are true no matter what interpretation is given to the symbols in them. These formulas, which are true in virtue of their logical structure alone, are called logically valid. According to Gödel’s Completeness Theorem, if a formula of first-order logic is logically valid, it can be proved in a formal system of first-order logic. A logically valid formula of first-order logic corresponds to a tautology of propositional logic. To take the simplest example, let x be a variable and P be a one-place predicate symbol. Consider the sentence:

∀xP(x)→∃xP(x)

Assume that we interpret the sentence by taking the predicate P(x) to be “x is a philosopher” and taking the universe of discourse to be the set {Socrates, Plato, Kant}.


Then since all individuals in this domain are philosophers, and there is an individual who is a philosopher in this domain, this interpretation makes the sentence true. Even if we take the universe of discourse to be the set {Socrates, Plato, George Washington}, since the antecedent is false, the sentence turns out to be true regardless of whether the consequent is true or false. Both interpretations make the sentence true. This comes as no surprise because the sentence is logically valid. Now consider the converse:

∃xP(x)→∀xP(x)

Assume that we give this sentence the same interpretations as above. If we take the universe of discourse to be the set {Socrates, Plato, Kant}, then there is an individual who is a philosopher in this domain, and every member of this domain is a philosopher. So this interpretation makes the sentence true. But assume that we interpret the same sentence by taking the universe of discourse to be the set {Socrates, Plato, George Washington}. Obviously the sentence is false under this interpretation, since the antecedent is true and the consequent is false.

One could hope that since mathematical truths hold indiscriminately with no regard to material objects, all the mathematical truths can be proved using the rules of inference in a finite number of steps together with some appropriate axioms. But the hope was dashed by Gödel’s First Incompleteness Theorem, and it became clear that there is a gap between the truth of arithmetical sentences (semantics) and their provability (syntax). Therefore, it becomes significant to assign an interpretation or model to a formal system. A formal system is said to have a model if there is an interpretation that satisfies every sentence in that system. The problem is that a first-order formal system cannot uniquely fix the model.
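
The dependence of truth on the interpretation can be checked mechanically when the universe of discourse is finite. The following is a minimal sketch of the philosopher example above; the set PHILOSOPHERS and the function names are illustrative assumptions, not part of any standard library.

```python
# Minimal sketch: evaluate the two sentences from the text in finite domains.
# The set PHILOSOPHERS and all function names are illustrative assumptions.
PHILOSOPHERS = {"Socrates", "Plato", "Kant"}

def P(x):
    """Interpretation of the predicate symbol P: 'x is a philosopher'."""
    return x in PHILOSOPHERS

def implies(a, b):
    """Material conditional."""
    return (not a) or b

def forall_implies_exists(domain):
    """Truth value of  ∀xP(x) → ∃xP(x)  in the given non-empty domain."""
    return implies(all(P(x) for x in domain), any(P(x) for x in domain))

def exists_implies_forall(domain):
    """Truth value of the converse  ∃xP(x) → ∀xP(x)  in the given domain."""
    return implies(any(P(x) for x in domain), all(P(x) for x in domain))

domain_1 = {"Socrates", "Plato", "Kant"}
domain_2 = {"Socrates", "Plato", "George Washington"}

for domain in (domain_1, domain_2):
    print(sorted(domain),
          forall_implies_exists(domain),
          exists_implies_forall(domain))
```

The first sentence comes out true in both domains and the converse fails in the second. Note that checking two interpretations does not by itself establish logical validity, which quantifies over all interpretations with a non-empty domain.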
5.2 The Löwenheim-Skolem Theorem and the Skolem Paradox

We have already seen that a first-order formal system is incomplete in the sense shown by Gödel’s First Incompleteness Theorem: there is a sentence in a first-order formal system that is neither provable nor disprovable. More specifically, there is a sentence in a first-order formal system that is true but unprovable. In this section, we shall see another sense in which a first-order formal system is incomplete by drawing upon the Löwenheim-Skolem Theorem. The incompleteness of a formal system in the latter sense is no less important than that in the former sense for shedding light on the limitations of formal methods.

As a corollary of Gödel’s Completeness Theorem, we get Gödel’s Model Existence Lemma: If a first-order formal system is consistent, then it has a model.

Then, we shall state the Löwenheim-Skolem Theorem. This consists of the Downward and Upward Löwenheim-Skolem Theorems.

The Downward Löwenheim-Skolem Theorem: If a first-order formal system has a model, then it has a countable model, i.e., a model whose universe of discourse is countable.

The Upward Löwenheim-Skolem Theorem: If a first-order formal system has an infinite model, then it has models of arbitrarily large infinite cardinality.

We shall not get into the technical proof of the Löwenheim-Skolem Theorem. I just mention that we need the Axiom of Choice in order to prove the Downward Löwenheim-Skolem Theorem. Now, the Löwenheim-Skolem Theorem, combined with Gödel’s Model Existence Lemma, says that if a first-order formal system is consistent, then it has multiple models. This means that a first-order formal system cannot uniquely determine the model. But despite appearances multiple models might be essentially the same and differ only in notation. Mathematically speaking, two models are notational variants of each other if one is isomorphic to the other. A function f is said to be an isomorphism if f is bijective and structure-preserving (homomorphic). Since mathematics is interested in more than just notational variants, it would be enough if a first-order formal system had only one model up to isomorphism. Actually, a formal system is said to be categorical if it has only one model up to isomorphism. Since the Löwenheim-Skolem Theorem claims that there exist multiple models with different cardinalities, however, a fortiori they are not isomorphic to each other. Therefore the Löwenheim-Skolem Theorem shows that a first-order formal system is not categorical.

Here arises another sense of incompleteness, distinct from that in Gödel’s First Incompleteness Theorem. It seems that one way to resolve the incompleteness is to construct a new formal system by adding new axioms. But this cannot help because the Löwenheim-Skolem Theorem tells us that the new formal system in turn has multiple models. So the Löwenheim-Skolem Theorem claims that the incompleteness is inherent in a first-order formal system. The problem of whether or not there is a complete formal system was already noticed in the early twentieth century. Veblen was the first to realize the problem and coined the word “categorical” in this context.

This is a serious problem for formalism. For the Löwenheim-Skolem Theorem shows that a first-order formal system has a variety of unintended or non-standard models as by-products along with the intended or standard model, so by formal methods alone we cannot fix the model uniquely. For instance, assume that we formalize arithmetic in a first-order formal system. The intended model of arithmetic has to be countable. By the Löwenheim-Skolem Theorem, however, we have an uncountable model as well. This means that a first-order formal system has various unintended or non-standard models, so we cannot pinpoint the intended or standard model of arithmetic.

Now, in order to avoid confusion, we need to distinguish at least three senses of “completeness”:
(1) A first-order formal system is complete in the sense that there is a complete set of rules of inference to derive all logically valid sentences (Gödel’s Completeness Theorem).
(2) A first-order formal system is not complete (negation-complete) in the sense that for every sentence in the system, either it or its negation can be proved (Gödel’s First Incompleteness Theorem).
(3) A first-order formal system is not complete (categorical) in the sense that a formal system has only one model up to isomorphism (the Löwenheim-Skolem Theorem).

The Skolem Paradox is based on the apparent conflict between the Löwenheim-Skolem Theorem and the Cantor Theorem. The Downward Löwenheim-Skolem Theorem tells us that if a first-order formal system has a model, then it has a countable model (i.e., a model in which the universe of discourse is countable). The Power Set Axiom says that there exists the power set P(ω) of ω, which consists of all the subsets of ω, and the Cantor Theorem says that it is uncountable. But how is it possible that a countable model makes true a sentence that claims the existence of an uncountable set?³⁴ This is the Skolem Paradox.

In order to resolve the Skolem Paradox, we have to deepen our understanding of what the Cantor Theorem really states. There are two things to note here: one is concerned with the definition of an uncountable set and the other is concerned with the Power Set Axiom. If a set S is uncountable, then there is no one-to-one correspondence between the members of S and ω. But note that this means that there exists no one-to-one correspondence in the model M. A one-to-one correspondence is formally expressed by an enumerating set of ordered pairs. Recall that we define an ordered pair in set-theoretical terms: ⟨a, b⟩ = {{a}, {a, b}}. So, more specifically, there being no one-to-one correspondence in the model M means that there exists no enumerating set of ordered pairs in the model M. But there may be an enumerating set of ordered pairs outside M, so if we add such ordered pairs to M, there is a one-to-one correspondence between the members of S and ω outside M. Therefore it is perfectly possible that a certain set, which is uncountable from inside M, is countable from outside M.

In regard to the Power Set Axiom, it says that for any set S, there is a set P(S) which consists of all the subsets of S. In symbols, P(S) = {x | x ⊆ S}. As we shall see in the next chapter, however, what is considered to be the power set P(ω) of ω may differ in accordance with the model concerned. This is exactly the reason why different models give us different answers to the Continuum Hypothesis. That is, according to Gödel’s V=L model |P(ω)| = ℵ₁, but according to Cohen’s generic extension model |P(ω)| = ℵₙ (n ≥ 2).

34 To see this more clearly, let M be a countable transitive model. A model M is transitive if for any x, y, if x ∈ y ∈ M, then x ∈ M. If by the Power Set Axiom there exists the power set P(ω) of ω in M, by transitivity M contains all the subsets of ω in M. By the Cantor Theorem they are uncountable. But how is it possible that a countable model contains uncountably many sets?


So precisely speaking, the Power Set Axiom states that given any set S, there is a set P(S) which consists of all the subsets of S in the model M. We can resolve the Skolem Paradox by restating the Cantor Theorem based on these two facts. The Cantor Theorem just states that there is no enumerating set of ordered pairs in M between all the subsets of ω in M and ω. So, P(ω) is uncountable from inside the model M (the Cantor Theorem) but countable from outside M (the Löwenheim-Skolem Theorem). The argument goes as follows. There is no one-to-one correspondence in M between the power set P(ω) of ω (i.e., all the subsets of ω in the model M) and ω because no enumerating set of ordered pairs exists between them in M. But since there may exist an enumerating set of ordered pairs outside M, if we add such ordered pairs to M, then there is a one-to-one correspondence between them outside M. Thus the apparent tension between the Löwenheim-Skolem Theorem and the Cantor Theorem is overcome.

Figure 25: The Skolem Paradox.


The lesson we can learn from the Skolem Paradox is that cardinality (countability, uncountability), one of the most important mathematical concepts, is a notion relative to a model. That is to say, there is a set which is uncountable from inside the model but countable from outside the model. So, technically speaking, an uncountable cardinal in the model M could “collapse” into a countable cardinal in the extension model M’. Since we usually assume that a mathematical concept has an absolute meaning regardless of the background in which it is placed, it is quite surprising to learn that such an important mathematical notion as cardinality is relative to a model. But the problem is whether or not we can press this argument by claiming that every uncountable set is only relatively uncountable and is countable outside the model. Can’t we bite the bullet and claim that there is a set which is absolutely uncountable?

Some people say that the Skolem Paradox is not a genuine paradox. For the Skolem Paradox is prima facie a paradox but not in the sense that Russell’s paradox is a paradox. In this respect it may be similar to the Banach-Tarski Paradox, which is called a paradox though it should have been named a theorem. But I believe that the situation is more serious than that. Even if we advocate relativism at one point, there is still a possibility that absolutism raises its head again. Consider the set of real numbers. If we follow the relativist line of thought, the set of real numbers is relatively uncountable, so we can reconstruct the model proper to real numbers in a first-order formal system which has a countable model. But we can take Cantor’s diagonal argument as showing not just that there is no enumerating set of ordered pairs in M between the set of real numbers in M and ω, but that there is no such thing anywhere.
If, as the Skolemite actually did, one claims that the cardinality of any set is relative to the model, then he or she at least has to show that there is a model in which the set of real numbers is countable, though it may not be a standard model. Otherwise, the set of real numbers is uncountable in whatever model, so we cannot construct the model proper to real numbers in a first-order formal system which has a countable model.
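
The diagonal argument appealed to above can be sketched as a construction on any proposed enumeration of subsets of ω. The following is a hedged illustration, not a proof: a subset of ω is represented by its membership test, the particular enumeration is a hypothetical example chosen only for concreteness, and a program can inspect only finitely many cases.

```python
# Sketch of Cantor's diagonal construction. A subset of ω is represented by
# its membership test (int -> bool). `enumeration` is a hypothetical attempt
# to list subsets of ω, chosen here only for illustration.
def enumeration(n):
    """The n-th listed subset: the multiples of n + 1."""
    return lambda k: k % (n + 1) == 0

def diagonal(enum):
    """The set D = {n : n is not in enum(n)}, which no enum(n) can equal."""
    return lambda n: not enum(n)(n)

D = diagonal(enumeration)

# D disagrees with the n-th listed set about the number n itself.
witnesses = [n for n in range(1000) if D(n) != enumeration(n)(n)]
print(len(witnesses))
```

Whatever enumeration is supplied, D differs from the n-th listed set at n, so D is missing from the list. The sketch is silent on the point at issue in the text, namely whether the enumeration is required to exist inside the model M or is allowed to exist anywhere.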


5.3 Quine’s Thesis of the Indeterminacy of Translation

Quine in Word and Object doesn’t mention the Löwenheim-Skolem Theorem and the Skolem Paradox when discussing the thesis of the indeterminacy of translation.³⁵ But I believe that Quine’s thesis appears afresh in this perspective. Actually, we can regard the thesis of the indeterminacy of translation as a precursor of Putnam’s model-theoretic arguments against metaphysical realism.

First of all, by drawing upon Word and Object we shall see what Quine means by the indeterminacy of translation. Quine supposes a somewhat unusual situation, which he calls “radical translation,” in which we translate the language of people with whom we have never previously made contact. Not only is our language so different from theirs syntactically and semantically, but our culture is also so different from theirs that we have few clues as to what they are talking about. So, for instance, stimulus meaning cannot decide whether the native’s ‘gavagai’ refers to rabbit, rabbit stage, undetached rabbit part, rabbit fusion, or rabbithood. Actually, we have multiple translation manuals or “analytical hypotheses,” as he calls them, which are all internally consistent with the totality of speech behavior and the totality of dispositions to speech but incompatible with each other.³⁶ So, it is perfectly possible to assume that an analytical hypothesis which interprets the native’s ‘gavagai’ as rabbit is no less internally consistent than one which interprets it as rabbit stage, though the two hypotheses contradict each other. This is what the indeterminacy of translation is all about.

The reason why I put stress on Quine’s thesis of the indeterminacy of translation is that his argument goes deeper than the ‘gavagai’ example. As Quine himself says in “On the Reasons for Indeterminacy of Translation,” his real intention in the indeterminacy of translation does not lie in the ‘gavagai’ example. Actually, Quine in Word and Object considers the indeterminacy of translation by analogy with the underdetermination of scientific theory, and his point lay in the latter.³⁷ Quine believes that we know external things only through impacts on our nerve endings. But our surface irritations cannot completely determine the behaviors of invisible physical particles such as electrons, protons, neutrons and neutrinos. In fact, there are multiple scientific hypotheses about their behaviors, which are all internally consistent with the totality of data available to us but incompatible with each other. So it is perfectly possible to assume that one hypothesis which claims that, for instance, neutrinos have mass is no less internally consistent than another hypothesis which claims that neutrinos lack mass, though the two hypotheses contradict each other.

Figure 26: The comparison of the indeterminacy of translation and the underdetermination of scientific theory.

35 As we shall see later, Quine mentions the Löwenheim-Skolem Theorem elsewhere.
36 When Quine says that translation manuals or analytical hypotheses are internally consistent with the totality of speech behavior and the totality of dispositions to speech, he means that just as we have our speech behavior or dispositions to speech in the sense that we use a brief general term for rabbits but no brief general term for rabbit stages or parts, so, too, the natives have their own.
37 In an informal discussion with Dr. Dancy, I was informed that Quine thinks that the indeterminacy of translation is worse off than the underdetermination of scientific theory because in the former, unlike in the latter, there is no fact of the matter, so Quine is pleased to accept the possible revisability of scientific theory. For this, see Quine, “Indeterminacy of Translation Again,” pp. 9-10. Indeed we now know that the neutrino is a massive particle as a result of the late 1990s neutrino revolution, but for Quine it does not hurt the underdetermination of scientific theory.


But we should note that for Quine consistency is not the only criterion scientific hypotheses should satisfy. Quine lists the following three criteria which he believes we should adopt in formulating scientific hypotheses.
(1) Simplicity: Simplicity is a guide when scientists generalize from sample data to laws.
(2) Familiarity of principle: A new theory conserves the truths of the older theory. In other words, we favor minimum revision.
(3) Sufficient reason: A sufficient reason for positing invisible physical particles is that a theory which contains these particles can explain physical phenomena more simply than others which do not.
Quine emphasizes simplicity above all. According to Quine, whenever simplicity and familiarity of principle conflict, the verdict should be on the side of simplicity, and a sufficient reason may be subsumed under simplicity.

Though, as I said earlier, Quine doesn’t mention the Löwenheim-Skolem results when discussing the indeterminacy of translation, the situation is more precise in mathematics and logic. The Löwenheim-Skolem Theorem, combined with Gödel’s model existence theorem, tells us that if a first-order formal system is consistent, then it has multiple models. So it is possible to interpret the symbols of the first-order formal system in so many ways that we cannot uniquely fix their references. This fact is all the more important since Quine believes in the priority of the first-order formal language over ordinary language. Quine says, “we can see that paraphrasing into logical symbols is after all not unlike what we all do every day in paraphrasing sentences to avoid ambiguity.”³⁸ But Quine’s point is of course that the indeterminacy of translation cannot be completely resolved even by paraphrasing into logical symbols because it is inherent in our language itself. A sentence has a meaning only relative to the frame of reference in which it is placed.
Based on this argument, Quine is very skeptical of the possibility of the unique true scientific theory. Indeed, as we shall see later, Putnam’s model-theoretic arguments against metaphysical realism push this skepticism further. But it is interesting to see that Quine says, “Vagueness, ambiguity, fugacity of reference, are traits of verbal forms and do not extend to the objects referred to.”³⁹ Also, Quine admits numbers as objects because of their efficacy in organizing and expediting the sciences. We can see here Quine’s indispensability argument for mathematical objects. Hence I believe that Quine’s point is not so much the impossibility of the unique true scientific theory as the inexpressibility of the unique true scientific theory, if there is one at all, even in terms of our formal language.

If there were (contrary to what we just concluded) an unknown but unique best total systematization θ of science conformable to the past, present, and future nerve-hits of mankind, so that we might define the whole truth as that unknown θ, still we should not thereby have defined truth for actual single sentences. We could not say, derivatively, that any single sentence S is true if it or a translation belongs to θ, for there is in general no sense in equating a sentence of a theory θ with a sentence S given apart from θ. Unless pretty firmly and directly conditioned to sensory stimulation, a sentence S is meaningless except relative to its own theory; meaningless intertheoretically.⁴⁰

Quine actually mentions the Löwenheim-Skolem Theorem in “Ontological Relativity” and “Ontological Reduction and the World of Numbers,” even though in a backhanded way. His point is that there is no tension between the Löwenheim-Skolem Theorem and the thesis of the indeterminacy of translation. The Downward Löwenheim-Skolem Theorem says that if a first-order formal system has a model, it has a countable model. So, at first sight, it seems that the Löwenheim-Skolem Theorem gives a false impression, as though an uncountable ontology were reducible to a countable ontology. But Quine argues that this really does not follow from the Löwenheim-Skolem Theorem. Carnap successfully reduced impure numbers of temperature to pure numbers. Zermelo and von Neumann succeeded in reducing natural numbers to sets.
What is the difference between the failure and the success of reduction? Quine claims that the reason for the success of reduction is the existence of a proxy function that designates a permutation between the reducing and the reduced domain. So if there is a proxy function between two models, they are just notational variants of each other. As we have seen earlier, however, the Löwenheim-Skolem Theorem tells us not just that there are multiple models, but that there are non-isomorphic multiple models.

38 Word and Object, p. 159.
39 Word and Object, p. 193.
40 Word and Object, pp. 23-4.


Two models with different cardinalities are non-isomorphic, so there is no proxy function between them. Hence Quine holds that the reduction of an uncountable ontology to Pythagorean ontology is doomed to fail. Since the Löwenheim-Skolem Theorem does not claim that one ontology is reducible to the other ontology, it is compatible with the thesis of the indeterminacy of translation. The thesis of the indeterminacy of translation is closely related to the thesis of the underdetermination of scientific theory. In ―τn the Reason for Indeterminacy of Translation‖ Quine himself says that when he is talking about the former, his real intention is in the latter. In ―On Empirically Equivalent Systems of the →orld‖ Quine explains the thesis of underdetermination of scientific theory in detail. The thesis of the underdetermination of scientific theory says that there is an empirically equivalent but logically incompatible theory. I use the term ―logically incompatible‖ in a strong sense. There is a weaker sense of logical incompatibility that is just apparent and rendered logically equivalent by a reinterpretation of predicates. This is parallel to the case in which one ontology is reducible to the other ontology by a proxy function. ψut Quine‘s point is rather that just as an uncountable model cannot be reduced to a countable model for the lack of a proxy function, so one theory cannot be rendered logically equivalent to another by a reinterpretation of predicates. This is exactly what the thesis of the underdetermination of scientific theory means and Quine goes in a direction where he claims that we can talk about the truth or falsity of sentences only relative to the background theory. 5.4 Putnam’s Model-Theoretic Arguments against Metaphysical Realism It is well-known that the model-theoretic arguments based on the Löwenheim-Skolem Theorem caused Putnam to change his mind from metaphysical realism to ―internal‖ realism. 
In this section, we shall see exactly how that happened by drawing upon his Realism and Reason. As we have already seen, the Löwenheim-Skolem Theorem tells us that if a first-order formal system has a model, then it has multiple models. This means that theoretical and operational constraints cannot fix the unique "intended" model for a first-order formal system. Moreover, Putnam claims that there is nothing other than theoretical and operational constraints with which to fix the intended model. To extend the formal

system by adding new axioms doesn't work either, because the Löwenheim-Skolem Theorem says that the new formal system in turn has multiple models. It's just adding more theory. So the impossibility of fixing the intended model is believed to be inherent in first-order formal systems.

According to Putnam, the model-theoretic arguments deal a fatal blow to metaphysical realism. Metaphysical realism is the doctrine that there is a uniquely true and complete description of the world apart from our conceptual imposition. If we have two theories which are true and complete descriptions of the world, they are not different in substance but are just notational variants of each other. Putnam plays devil's advocate and shows that there are two ways to save metaphysical realism from the model-theoretic arguments against it. One is to assume, along the lines of Gödel or Kripke, that we have a non-natural mental power like an intellektuelle Anschauung with which to decide which of the empirically equivalent but incompatible theories is true. The other, which is called natural metaphysics, is to get the job done by the "scientific method" instead of an intellektuelle Anschauung. But natural metaphysics goes against a scientific spirit that pursues its own activities within the confines of empirically significant claims. So, according to Putnam, the revival of metaphysics is far more likely to be along the lines of those who believe that we have an intellektuelle Anschauung than along the lines of natural metaphysics, but it seems highly unlikely that metaphysics will succeed in its revival along either line.41

Against metaphysical realism, Putnam bills his own view as internal realism. In this view, there is no such thing as the uniquely true and complete description of the world regardless of our conceptual schemes. A statement has a meaning only relative to the interpretation or model in which it is placed.
According to Putnam, "the world does not pick models or interpret languages. We interpret our languages or nothing does."42 The world "is not a furnished room."43 This is the reason why Putnam believes that there is no ready-made world. Our understanding of the meaning of a sentence consists not in knowing its truth conditions but in mastering its verification procedures. Thus Putnam

41 Realism and Reason, p. 228 42 Realism and Reason, p. 24.

identifies truth with our idealized justification. Here "idealized" justification, as opposed to tensed justification or justification-on-present-evidence, indicates that some of the statements which are now justified may turn out not to be true.44 Putnam holds that explanation is epistemic or intentional because it is interest-relative and context-sensitive.45 Even though a statement has a meaning only relative to the model in which it is placed, however, we have to note that Putnam doesn't believe that the truth or falsity of a statement is subjective. He says, "Urging this relativism is not advocating unbridled relativism."46

Putnam also claims that reason can't be naturalized. We can't eliminate the normative, such as the notions of rightness and wrongness, because if we did, we would stop being thinkers and our statements would be nothing but noise-makings. He goes as far as to say that the elimination of the normative is attempted mental suicide. Putnam quotes Nelson Goodman's remark that "relativity of rightness and the admissibility of conflicting right renderings in no way precludes rigorous standards for distinguishing right from wrong."47 In that vein, though Putnam agrees with Quine that even logical or mathematical statements are revisable, he tries to differentiate himself from Quine by saying that our notion of rationality cannot be as flexible as Quine believes.

In this respect, it is interesting to see Putnam's view on the laws of classical logic (tautologies or logically valid sentences). Frege and Russell believe that the laws of logic are truths which hold in the actual world, albeit of its more general and abstract aspects. But according to possible world theorists, the laws of logic are the truths which hold in every possible world. Indeed, it seems to me that there was a great advance from the former to the latter in regard to the understanding of the ontological status of logical laws.
Putnam follows in Quine's footsteps and shows that even logical laws are revisable by pointing out that there are some laws of classical logic which do not hold in quantum logic:48 (1) The law of conjunction introduction $p, q \rightarrow p \,\&\, q$

43 Realism and Reason, p. 23. 44 Realism and Reason, p. 85. 45 Realism and Reason, p. 297. 46 Realism and Reason, p. 10. 47 Realism and Reason, p. 169.

has to be restricted to pairs of compatible propositions p, q. (2) The distributive law $p \,\&\, (q \lor r) \equiv (p \,\&\, q) \lor (p \,\&\, r)$ has to be restricted to the case in which all three propositions p, q, r are 'totally compatible.' But his emphasis is on the fact that every statement is revisable, but not in every way. Putnam says that we must not make a hasty judgment "from the fact that it is dangerous to claim that any statement is absolutely a priori to the absolute claim that there are no a priori truths."49 It could be that not all, but some of the laws of classical logic are a priori. For instance, Putnam doesn't take the law of contradiction $\neg(p \,\&\, \neg p)$ as an a priori truth, for there is room left for the possibility that $p \,\&\, \neg p$ does hold. Nevertheless, there is indeed a statement p such that $\neg(p \,\&\, \neg p)$ does hold a priori. Therefore, according to Putnam, there exists at least one a priori truth: "Not every statement is both true and false." Putnam accepts the law of contradiction only in such a weak form.50

Putnam claims that, insofar as empirically equivalent but incompatible theories make different claims, some facts are "soft" in the sense that they depend for their truth value on the speaker, the circumstances of utterance, etc.51 According to Putnam, whether or not $V = L$ is a "soft fact."52 So is whether the Axiom of Choice is accepted or not. Following Putnam, as a thought experiment, let's assume that there are intelligent extraterrestrials who reject the Axiom of Choice due to some of its counter-intuitive consequences, like the Banach-Tarski Paradox.53 Keep in mind that most mathematicians and logicians here on Earth accept the Axiom of Choice due to its pleasant consequences. Then could we say that we are right and they are wrong? Of course, our acceptance of the Axiom of Choice is not arbitrary. But the fruitful results from the Axiom of Choice alone are not good enough to say that acceptance of the Axiom of Choice is so rational that rejection of it is irrational.
The Axiom of Choice depends for its truth-value

48 Realism and Reason, p. 48, p. 96, p.100. 49 Realism and Reason, p. 114. 50 Realism and Reason, p. 100, p. 112, p. 129, p. 131. 51 Realism and Reason, p. 19. 52 Realism and Reason, p. 23. 53 Realism and Reason, p. 14.

upon the model in which it is placed. This seems to me to be a foregone conclusion from the model-theoretic arguments. In the next section, we shall consider what positive and negative lessons can be learned from the model-theoretic arguments.

5.5 The Lessons from the Model-Theoretic Arguments

We have learned from the model-theoretic arguments that there are multiple internally consistent but mutually inconsistent models. The truth-value of some mathematical statements depends on the model we have in mind. Therefore, in some cases different models give us different answers to the same question. When we are confronted with multiple conflicting models, we should not think that only one of them is true and the others are not, if they are all internally consistent. We must be sufficiently tolerant to admit the possibility that each of them is true and accepted. Indeed the Skolem Paradox suggests that the cardinality of a set is relative to a model. But does it follow from this that all truths depend on a model? Let me explain what I'm trying to say here. I grant the following two points from the model-theoretic arguments:

(1) A consistent first-order formal system has multiple models. (The Löwenheim-Skolem Theorem)
(2) There are some truths that depend on a model. (The Skolem Paradox)

But it doesn't necessarily follow from this that:

(3) All the truths depend on a model.

It seems to me that we cannot move from (1) and (2) to (3) without a surreptitious assumption that we have to treat every model in an egalitarian manner insofar as it is consistent. Even if there are multiple models, however, if one of the models is preferred to the others in light of other criteria (if there are any), then we couldn't say that all the truths depend on a model. In fact sometimes none of the models is preferred to the others.
Even so, it is one thing to say that, although one of the models is actually true and the others are not, we cannot decide which one is true due to a lack of evidence at this point, and another thing to say that, since the models are all equally true, we cannot tell which one is true. In the former, the models should be competing with each other for the truth and

one of them is actually true. Only in the latter case could we say that the models should coexist peacefully and that the truth depends on a model. Even though there are multiple reductions of the natural numbers to sets, we have to pay attention to the fact that von Neumann's system is entrenched in set theory, for von Neumann's system has several advantages compared with Zermelo's. One of them is that, since in von Neumann's system each natural number is the set of all smaller natural numbers, the larger-than relation can be replaced by the membership relation.

Indeed the Skolem Paradox suggests that the cardinality of a set is relative to a model. But is the cardinality of every set relative to a model? Is there a model in which the set of real numbers is countable? Also, regarding mathematically important statements such as the Axiom of Choice and the Continuum Hypothesis, could we say that their validity also depends on a model? The affirmative answer to this question is a consequence of the model-theoretic arguments. That is, the Axiom of Choice and the Continuum Hypothesis are only true or false relative to a model, and they are neither true nor false by themselves. More specifically, in Gödel's $V = L$ model both the Axiom of Choice and the Continuum Hypothesis are true, but by Cohen's generic extension we can obtain one model in which the Axiom of Choice is false and another model in which the Axiom of Choice is true but the Continuum Hypothesis is false. Note that in order for the Continuum Hypothesis to make sense, the Axiom of Choice has to be accepted.

Recall Putnam's thought experiment on the Axiom of Choice. His point is that the pleasant consequences of the Axiom of Choice alone (e.g., the well-ordering of the set of real numbers) are not good enough to say that acceptance of it is so rational that rejection of it is irrational. So Putnam's identification of truth with idealized justification does not give us reason enough to justify the Axiom of Choice.
Also, it seems to me that the criterion of rational acceptability differs from person to person. This is because for me the fruitful results from the Axiom of Choice make it fully rational to accept the Axiom of Choice. What makes the Löwenheim-Skolem Theorem possible is the limited expressive power of first-order logic. When it comes to second-order logic, the Löwenheim-Skolem Theorem no longer holds. So my concern is whether or not


Putnam's model-theoretic argument, based on the Löwenheim-Skolem Theorem, still holds in second-order logic. One could argue that first-order logic is the only real logic, whereas second-order logic is not. Actually, Quine is a case in point. But since central mathematical notions, such as finitude and countability, can be defined only in second-order logic, not in first-order logic, we cannot ignore the significance of second-order logic.

But even if we limit ourselves to first-order logic, we can find a way to save Platonism. There are multiple models of a first-order formal system. The Skolem Paradox shows that a mathematically important notion is relative to a model. So, one is inclined to say that the Axiom of Choice and the Continuum Hypothesis depend on a model. But we should not give up the investigation there. As a Platonist, I propose to examine the interrelationships among models as follows. An apparent contradiction may be resolved if one makes a transition from a lower to a higher level. Let me explain what I mean by this by giving examples from mathematics/logic and science.

(1) Narrowing down the models. We arrive at the intended or standard model by eliminating unintended or non-standard models.
Mathematics/Logic: We can get the model of arithmetic by eliminating uncountable models because the model of arithmetic is countable. Also, most mathematicians reject Gödel's $V = L$ in favor of the existence of measurable cardinals.
Science: The geocentric model of the universe was replaced by the heliocentric model. Moreover, both the phlogiston hypothesis and the ether hypothesis were abandoned in the course of scientific development.

(2) Extending the models. There are some cases in which one model is an extension of the other.
Mathematics/Logic: Gödel's $V = L$ is an extension of ZF set theory. Cohen's generic extension is another one.
Science: We can explain the relation between Newtonian mechanics and Einstein's theory of relativity by a correspondence principle.
That is, the former is a limiting case of the latter for a velocity much less than the speed of light.


(3) Overlapping the models. This case is the most intractable because it is difficult for us to see how the models are interrelated.
Mathematics/Logic: The whole picture of the Skolem Paradox becomes clearer when viewed from both inside and outside the model simultaneously.
Science: It is well known that light has wave-particle duality. Also, the behavior of a photon in the double-slit experiment can be explained by a superposition of states.

Epistemologically speaking, then, how can we make comparisons among models? There are at least two major theories of truth: the correspondence theory of truth and the coherence theory of truth. The pith of the model-theoretic arguments is that insofar as we hold the coherence theory of truth, since there are numerous coherent models out there, we are haunted by the indeterminacy of reference. So Platonists can no longer claim, in accordance with the coherence theory of truth, that we can get access to mathematical objects implicitly through the consistency of the model. Hence, Platonists are forced to claim, in accordance with the correspondence theory of truth, that we can get access to them explicitly by a causal connection. Since Platonists claim that mathematical objects are super-spatio-temporal entities, however, we have no choice but to assume that we have a special epistemic faculty such as a mathematical intuition. In an earlier chapter, however, I denied that we have a mathematical intuition by invoking the Banach-Tarski Paradox. So, it seems, Platonism without a mathematical intuition is endangered.

But can we really not be Platonists if we reject a mathematical intuition? One way out is to claim that the problem of the indeterminacy of reference poses no threat to Platonism. Are there any theoretical terms whose references are fixed once and for all? To see this, it is enough to recall that there were multiple physical models of subatomic structure at the dawn of modern physics and chemistry.
Just as the indeterminacy of reference of a theoretical term is compatible with scientific realism, so the indeterminacy of reference of a mathematical entity is compatible with Platonism. But I think we should take the problem of the indeterminacy of reference raised by the model-theoretic arguments seriously, because it doesn't make sense even to say that we are talking about something unless we can reidentify what we are talking about.


Thus we are caught on the two horns of a dilemma: on the one hand we have multiple internally consistent models, and on the other hand we cannot rely on our mathematical intuition to fix the model. I believe, however, that there are still good candidates which serve as criteria to measure the excellence of the model, such as simplicity, comprehensiveness and maximality.

Conclusion

Combined with the results from the last chapter, it becomes clear that there are limitations inherent in formal methods in the following three respects:

(1) As Gödel's First Incompleteness Theorem shows, there is a sentence in a first-order formal system that is neither provable nor disprovable.
(2) As Gödel's Second Incompleteness Theorem shows, the consistency of a first-order formal system cannot be proven within that system.
(3) As the Löwenheim-Skolem Theorem shows, a first-order formal system cannot uniquely fix the model.

All these are defects for formalists who rely only on formal methods, not for Platonists who claim that we need to appeal to informal methods by going outside of a formal system at some point.

In light of the Löwenheim-Skolem results, we have seen both Quine's thesis of the indeterminacy of translation and Putnam's model-theoretic arguments against metaphysical realism. We learn from the model-theoretic arguments the dependence of truth on a model. Indeed the Skolem Paradox suggests that the cardinality of a set is relative to a model. Nevertheless I cannot accept that the truth or falsity of the Axiom of Choice or the Continuum Hypothesis just depends on a model. We should not give up the investigation there. As a Platonist, I propose to examine the interrelationships among models. As I argued in an earlier chapter by invoking the Banach-Tarski Paradox, it is implausible that we have a special epistemic faculty such as a mathematical intuition.
But I believe that we still have the principles which serve as criteria to measure the excellence of the model.


CHAPTER 6

WHAT IS MATHEMATICAL EXISTENCE?

Introduction

The aim of this chapter is to meet Benacerraf's challenges to Platonism by claiming that in mathematics essence amounts to existence. In Section 1, we shall see Benacerraf's epistemological and ontological challenges to Platonism. In Section 2, in spite of Benacerraf's challenges, I defend Platonic realism in mathematics by showing the deductive power and stability of the Axiom of Choice through a variety of its applications in many branches of mathematics. In Sections 3 and 4, we shall examine how mathematical models are developed in the actual practice of mathematics. As a result, I claim that true mathematical theories belong to the maximally consistent theory that describes mathematical reality. The theoretical ground for my claim is that in mathematics essence amounts to existence. In Sections 5 and 6, I seek the philosophical foundation for my view by drawing upon Anselm's argument for the existence of God and Locke's doctrine that nominal and real essences coincide in mathematics. In Section 7, I shall show that there are two conditions to be satisfied in order for us to derive existence from essence alone. Given the fact that in the natural sciences existence cannot be derived from essence alone, this shows the unique nature of mathematical existence.

6.1 Benacerraf's Challenges to Platonism

Benacerraf's challenges to Platonism are twofold: one is epistemological and the other ontological. First of all, we shall examine Benacerraf's epistemological challenge. Benacerraf, in "Mathematical Truth," a paper with an immense influence on the philosophy of mathematics in recent decades, rejects Platonists' "standard" view of mathematical truth. The account Benacerraf calls "the standard view" treats a sentence like (1) and a sentence like (2) as straightforward instances of the logical form of (3).

(1) There are at least three large cities older than New York.
(2) There are at least three perfect numbers greater than 17.


(3) There are at least three FG's that bear R to a.

This account attempts to draw a parallel between mathematics and the natural sciences. According to Benacerraf, we can find such an account only in Tarski's theory of truth, whose characteristic feature is to define truth in terms of reference, denotation, or satisfaction. On this account, there has to be some causal connection between ourselves and mathematical objects. On Platonists' account, however, mathematical objects are supposed to be super-spatio-temporal and thus causally inert. So if Platonism is true, then the causal theory of knowledge is false. And if the causal theory of knowledge is true, then Platonism is false. Hence, Platonism stands in irreconcilable tension with the causal theory of knowledge. To meet this challenge, Platonists have to explain how we can get access to mathematical objects that are supposed to be super-spatio-temporal and thus causally inert. Let me call this problem "the Access Problem."

To meet Benacerraf's challenge, one could simply stick with the correspondence theory of truth and claim that in mathematics we have a mathematical intuition just as we have an empirical intuition in the natural sciences. As already explained, however, the Banach-Tarski Paradox raises a doubt that we have a special epistemic faculty such as a mathematical intuition.

Next, we shall consider Benacerraf's ontological challenge to Platonism in detail. Benacerraf, in "What Numbers Could Not Be," spells out the conditions that a correct account of numbers must satisfy. That is, it is necessary (1) to give definitions of "1," "number," and "successor," and "+," "×," and so forth, on the basis of which the laws of arithmetic could be derived; and (2) to explain the "extramathematical" uses of numbers, the principal one being counting, thereby introducing the concept of cardinality and cardinal number. If numbers are sets, then we must be able to know which sets numbers are.
But it is well known that there are several different reductions of numbers to sets. Frege defines the number 3 as the class of all classes consisting of triples, that is, the extension of the concept "equivalent with some 3-membered set" in his terminology. Also, in 1908 Zermelo proposed to use


$0 = \emptyset$, $1 = \{\emptyset\}$, $2 = \{\{\emptyset\}\}$, $3 = \{\{\{\emptyset\}\}\}$, …

Later von Neumann proposed an alternative:

$0 = \emptyset$, $1 = \{0\} = \{\emptyset\}$, $2 = \{0, 1\} = \{\emptyset, \{\emptyset\}\}$, $3 = \{0, 1, 2\} = \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$, …

Therefore, for Zermelo, $3 \notin 17$, whereas for von Neumann, $3 \in 17$. Their cardinality relations are also different. On the former, every number other than 0 is single-membered, whereas on the latter, the set representing the number n has n members. Therefore, on the former 17 has only one member, while on the latter it has 17 members. Although there are differences between the two systems, the fact remains that both satisfy the conditions for a correct account of numbers as stated above.54 So we have no principled way of deciding between these reductions. But if a number is a set, then this set has to be unique. So Benacerraf concludes that numbers could not be sets at all. Let me call this problem "the Multiple Reduction Problem." This shows that any set carries some superfluous features irrelevant to arithmetic, so a number is too weak to exist as a set.

Extending his argument, he argues that numbers could not be objects of any sort. For there is no more reason to identify any individual number with any one particular object than with any other (not already known to be a number). To be the number 3 is no more and no less than to be preceded by 2, 1, and possibly 0, and to be followed by 4, 5, and so forth. Any object can play the role of 3; that is, any object can be the third element in some progression. Hence, according to Benacerraf, the essence of numbers exhausts itself in the relative positions they occupy in the overall structure. Any object carries some superfluous features irrelevant to arithmetic, so a number is not such as can be given self-subsistent existence. Arithmetic is therefore the science that

54 We have already seen that even though there are multiple reductions of the natural numbers to sets, von Neumann's system is entrenched in set theory because it has several advantages compared with Zermelo's. See p. 107.

elaborates the abstract structure that all progressions have in common merely in virtue of being progressions. Arithmetic is not a science concerned with particular objects, the numbers.

It might be objected to Benacerraf's ontological challenge that even if it is granted that numbers are not objects, it does not follow from this that numbers are not anything unique. Also, it might be objected that whether or not numbers are objects depends on what it is to be an object. Indeed, as Benacerraf himself recognizes, his opponent may simply bite the metaphysical bullet, affirming the possibility of objects that lack any intrinsic nature and whose essence consists entirely in relations to other objects. In any case, it seems to me that Benacerraf's point is that we don't have to be committed to the existence of numbers in order for arithmetical propositions to be true.

6.2 Some Applications of the Axiom of Choice

In spite of Benacerraf's challenges, I believe that Platonic realism fits in well with the actual practice of mathematicians. Also, we can actually gain a more fruitful picture of mathematics by adopting working hypotheses based on Platonic realism. Despite the early criticisms, not only did the Axiom of Choice play a major role in the systematization of Cantor's set theory, but it also had a tremendous impact on many branches of mathematics outside set theory. This means that the Axiom of Choice is not a freak ad hoc principle formed in the development of mathematics, but a stable principle which is widely applicable in many branches of mathematics. The deductive power of the Axiom of Choice is borne out by the fact that even some of the opponents of the Axiom of Choice used it implicitly.

Zorn's Lemma is the key to the various applications of the Axiom of Choice. Actually, we can prove numerous theorems in mathematics by using Zorn's Lemma. For instance, using Zorn's Lemma, we can prove in linear algebra that every vector space V has a basis.
A basis is a maximal linearly independent subset of V. In particular, when V is a space of n-tuples, the subset $\{(1, 0, \dots, 0, 0), (0, 1, \dots, 0, 0), \dots, (0, 0, \dots, 1, 0), (0, 0, \dots, 0, 1)\}$ of V is called the standard basis. So the theorem amounts to the claim that every vector space V has a maximal independent subset. Let $S = \{X \mid X \text{ is an independent subset of } V\}$. Partially order S by inclusion. Let C be any chain of S. Let $C = \{X_i : i \in I\}$. Consider the set $U = \bigcup_{i \in I} X_i$.


U is a subset of V. U is an upper bound of C because, for every $X_i \in C$, $X_i \subseteq U$. In order to apply Zorn's Lemma, we now show that $U \in S$. For reductio, suppose that $U \notin S$. That is, U is dependent. So there is some element of U that is expressed as a linear combination of finitely many other elements of U. Since U is just the union of the independent subsets of V in C, each of these finitely many elements belongs to some member of C, and since C is a chain, the largest of those members is a single set X in C that contains all of these linearly dependent elements. By assumption, however, X is an independent subset of V. A contradiction. Therefore $U \in S$. Applying Zorn's Lemma, S has a maximal element. So we conclude that V has a maximal independent subset, i.e., a basis.

This seems to be an intuitive consequence of Zorn's Lemma. In a two-dimensional vector space, $\{(1, 0), (0, 1)\}$ is the standard basis. In a three-dimensional vector space, it is $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$. Extending this argument, the claim that every vector space V has a basis is not so surprising, though it is harder to prove when it comes to infinite-dimensional vector spaces. Also, in topology Zorn's Lemma is useful for proving the important Tychonoff Product Theorem for compact spaces: if all the factors are compact, then the product is also compact.

6.3 The Axiom of Determinacy

In the next two sections, we shall examine how mathematical models have been developed in the actual practice of mathematics. Two Polish mathematicians, Mycielski and Steinhaus, introduced an axiom that contradicts the Axiom of Choice: the Axiom of Determinacy. Imagine a game in which two players alternate in choosing natural numbers.

Player I starts and chooses $a_0$, then Player II chooses $a_1$, then Player I chooses $a_2$, and so forth. After $\omega$ moves, the players have constructed an infinite sequence,

$a = (a_0, a_1, a_2, \dots)$

Let $\omega^\omega$ be the set of all infinite sequences of natural numbers and let $A \subseteq \omega^\omega$. Player I wins the game $G_A$ associated with A if $a \in A$, and Player II wins if $a \notin A$. We say that A is determined if one of the two players has a winning strategy in the associated game $G_A$. The Axiom of Determinacy tells us that for every $A \subseteq \omega^\omega$, the game $G_A$ is determined. Here is a set of two competing theories without inner contradictions (Theory-Choice I):


(1) ZF + the Axiom of Choice
(2) ZF + the Axiom of Determinacy

Most mathematicians reject the Axiom of Determinacy and accept the Axiom of Choice. For the Axiom of Determinacy implies that every set of real numbers is measurable and has the Baire property, and also that the set of real numbers cannot be well-ordered, whereas the Axiom of Choice implies the existence of non-Lebesgue measurable sets and the well-ordering of the real numbers, which provide us with a more fruitful picture of mathematics. This fact shows that since there are some cases in which we could have multiple logically consistent competing theories, it is not sufficient to say that every logically consistent theory is a mathematical theory. More precisely, the theories mathematicians are actually working on are parts of the maximally logically consistent theory that describes mathematical reality.

6.4 The Axiom of Constructibility

In 1960 Dana Scott proved a very simple result that the existence of a measurable cardinal implies $V \neq L$; equivalently, if $V = L$, there are no measurable cardinals. Before we assess this result, however, we shall see what the Axiom of Constructibility is. In ZFC the universe V of all sets is divided into a hierarchy of sets $V_\alpha$ by transfinite induction on $\alpha$. At successor ordinals we take the power set of the previous stage, and at limit ordinals the union of the preceding stages. Each $V_\alpha$ is a set, and $V_\alpha \subseteq V_{\alpha+1}$.

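$V_0 = \emptyset$

$V_{\alpha+1} = P(V_\alpha)$

$V_\lambda = \bigcup_{\alpha < \lambda} V_\alpha$ for limit ordinals $\lambda$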
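The finite stages of the hierarchy just defined can be computed directly, which may help make the successor clause concrete. The following is only a minimal sketch under stated assumptions: the names `power_set` and `stage` are hypothetical helpers introduced here, sets are modelled as Python frozensets, and only the stages $V_0, \dots, V_4$ are built (limit stages and the function Def are not modelled). As a side check, it confirms that both Zermelo's and von Neumann's set for the number 3, as given in Section 6.1, already appear at stage $V_4$.

```python
from itertools import chain, combinations


def power_set(s):
    """Return the set of all subsets of the finite set s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def stage(n):
    """Return V_n: start from V_0 = empty set and apply the power set n times."""
    v = frozenset()
    for _ in range(n):
        v = power_set(v)
    return v


# Sizes of V_0, ..., V_4 (V_5 already has 2**16 elements).
sizes = [len(stage(n)) for n in range(5)]

# The stages are cumulative: V_n is a subset of V_{n+1}.
cumulative = all(stage(n) <= stage(n + 1) for n in range(4))

# The number 3 under the two reductions discussed in Section 6.1.
zero = frozenset()
one = frozenset({zero})
two_vn = frozenset({zero, one})
three_vn = frozenset({zero, one, two_vn})        # von Neumann: {0, 1, 2}
three_z = frozenset({frozenset({one})})          # Zermelo: {{{0}}}

both_in_v4 = three_vn in stage(4) and three_z in stage(4)
```

The sketch uses frozensets because members of a set must themselves be hashable; an ordinary Python `set` could not contain other sets.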

By the Axiom of Foundation, every set in the universe V is a member of some $V_\alpha$. So,

$V = \bigcup_{\alpha \in ON} V_\alpha$

We shall modify the above definition by using the function $S \mapsto \mathrm{Def}(S)$ instead of the function $S \mapsto P(S)$.55

55 A set is definable over S if there is a formula $\varphi(x)$ with parameters $a_1, a_2, \dots, a_n$ in S such that the members of the set are exactly the elements of S that satisfy $\varphi(x)$. Def(S) is the set of sets which are definable over S. A set is ordinal-definable if $a_1, a_2, \dots, a_n$ are ordinals.


$L_0 = \emptyset$

$L_{\alpha+1} = \mathrm{Def}(L_\alpha)$

$L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha$ for limit ordinals $\lambda$

Now we define

$L = \bigcup_{\alpha \in ON} L_\alpha$

The Axiom of Constructibility tells us that every set is constructible, i.e., $V = L$. In short, the Axiom of Constructibility does not admit the full universe of sets, but only a restricted universe of the constructible sets. In other words, the Axiom of Constructibility rejects the existence of a non-constructible set. But we have to note that the term "constructible" here is used in too broad a sense for Constructivists to accept. The Axiom of Constructibility is so strong a hypothesis that it implies the Axiom of Global Choice, which is the strongest form of the Axiom of Choice, and the Generalized Continuum Hypothesis. Also, the Axiom of Ordinal-definability is a consequence of the Axiom of Constructibility.

Now, we shall define the Borel sets:

$\Sigma^0_1$ = the open sets
$\Pi^0_1$ = the closed sets
$\Sigma^0_2$ = countable unions of closed sets
$\Pi^0_2$ = complements of $\Sigma^0_2$ sets
…
$\Sigma^0_{\alpha+1}$ = countable unions of $\Pi^0_\alpha$ sets
$\Pi^0_{\alpha+1}$ = complements of $\Sigma^0_{\alpha+1}$ sets
$\Delta^0_\alpha$ = sets that are both $\Sigma^0_\alpha$ and $\Pi^0_\alpha$
Borel sets = the union of the $\Sigma^0_\alpha$'s

The Borel sets are constructed from "below," as it were, and are well-behaved. By contrast, the non-measurable sets are built from "above" and are pathological. The Axiom of Choice ensures that there are non-measurable sets of real numbers, while the Borel sets of real numbers are Lebesgue measurable.
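A small worked illustration of the first two levels may be useful here. It is a standard textbook example rather than one drawn from the sources cited in this chapter, and it assumes the real line with its usual topology. Every singleton $\{q\}$ is closed and there are only countably many rationals, so the rationals form a countable union of closed sets and the irrationals are the complement of such a union:

```latex
\mathbb{Q} \;=\; \bigcup_{q \in \mathbb{Q}} \{q\} \;\in\; \Sigma^0_2,
\qquad
\mathbb{R} \setminus \mathbb{Q} \;=\; \bigcap_{q \in \mathbb{Q}} \bigl(\mathbb{R} \setminus \{q\}\bigr) \;\in\; \Pi^0_2 .
```

Both sets are therefore Borel, and hence Lebesgue measurable, even though neither is open or closed.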


Then, we shall define the projective sets:56

$\Sigma^1_0$ = the open sets
$\Pi^1_0$ = the complements of $\Sigma^1_0$ sets
$\Sigma^1_1$ = the projections of $\Pi^1_0$ sets
$\Pi^1_1$ = the complements of $\Sigma^1_1$ sets
…
$\Sigma^1_{n+1}$ = the projections of $\Pi^1_n$ sets
$\Pi^1_{n+1}$ = the complements of $\Sigma^1_{n+1}$ sets
…
$\Delta^1_n$ = sets that are both $\Sigma^1_n$ and $\Pi^1_n$
Projective sets = the union of the $\Sigma^1_n$

$\Delta^0_1$, $\Sigma^0_1$, $\Pi^0_1$, … are defined by formulas with only first-order quantifiers, and formulas with only first-order quantifiers are called arithmetical. So the hierarchy of Borel sets is a counterpart of the arithmetical hierarchy. We have also seen that a $\Sigma^0_1$ set is recursively enumerable and a $\Pi^0_1$ set is co-recursively enumerable.57 Now $\Delta^0_1 = \Sigma^0_1 \cap \Pi^0_1$. So a $\Delta^0_1$ set is recursive. $\Delta^1_1$, $\Sigma^1_1$, $\Pi^1_1$, … are defined by formulas with second-order quantifiers as well, and formulas with first-order and second-order quantifiers are called analytical. So the hierarchy of projective sets is a counterpart of the analytical hierarchy.

Here is another pair of competing theories without inner contradictions (Theory-Choice II):

(1) ZFC + "there exists a measurable cardinal"
(2) ZFC + the Axiom of Constructibility ($V = L$)

Many mathematicians reject the Axiom of Constructibility and claim $V \neq L$. For, though the Axiom of Constructibility implies that there are no measurable cardinals, the assumption of the existence of a measurable cardinal opens up more possibilities of fruitful

56 For the Borel sets and the projective sets, Maddy gives an informal presentation in Realism in Mathematics (p. 112, 113). For formal details, see e.g. Hinman, Recursion-Theoretic Hierarchies, p. 84; Jech, Set Theory, p. 140, 144; Soare, Recursively Enumerable Sets and Degrees, p. 70.
57 See p. 74.

mathematics. We have to note that the two theories give opposite answers to the question about the measurability of $\Sigma^1_2$ sets, which ZFC alone cannot answer. On the one hand, the Axiom of Constructibility implies that there exist $\Sigma^1_2$ sets which are non-measurable and do not have the Baire property. On the other hand, as Solovay showed in 1967, the existence of measurable cardinals implies that every $\Sigma^1_2$ set of reals is measurable, has the Baire property, and has the perfect set property. We could say that contemporary mathematics moves in a direction that opens up the possibility of more fruitful mathematics guaranteed by the Platonic assumption, insofar as it is internally consistent. From all the above, I shall meet Benacerraf's challenge by claiming that true mathematical theories belong to the maximally consistent theory that describes mathematical reality.
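The two theory choices discussed in this chapter can be set side by side. The display below only collects consequences already stated in the text; the abbreviations AC and AD, and the label "Theory-Choice I" for the first pair (by analogy with "Theory-Choice II"), are assumptions made here for compactness.

```latex
\[
\begin{array}{ll}
\textbf{Theory-Choice I} & \\
\quad \mathrm{ZF} + \mathrm{AC}
  & \Rightarrow\ \text{a non-Lebesgue measurable set of reals exists; } \mathbb{R} \text{ can be well-ordered}\\
\quad \mathrm{ZF} + \mathrm{AD}
  & \Rightarrow\ \text{every set of reals is measurable and has the Baire property; }
    \mathbb{R} \text{ cannot be well-ordered}\\[6pt]
\textbf{Theory-Choice II} & \\
\quad \mathrm{ZFC} + (V = L)
  & \Rightarrow\ \text{some } \Sigma^1_2 \text{ set is non-measurable and lacks the Baire property; }
    \text{no measurable cardinal exists}\\
\quad \mathrm{ZFC} + \exists\ \text{measurable cardinal}
  & \Rightarrow\ \text{every } \Sigma^1_2 \text{ set is measurable and has the Baire and perfect set properties; }
    V \neq L
\end{array}
\]
```

In each pair the two extensions are individually taken to be consistent but contradict one another, which is why consistency alone cannot settle the choice.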

Figure 27: Example of actual practice of mathematics.

6.5 Anselm's Argument for the Existence of God

The theoretical ground for my claim is provided by the idea that in mathematics essence amounts to existence. In general, we cannot derive existence from essence alone. For instance, although Sherlock Holmes has the properties of being the character created by Conan Doyle and of living in Baker Street, he is a fictional character. As far as mathematical objects are concerned, however, essence and existence coincide. We can find the prototype of this argument in Anselm's proof of the existence of God. My aim here is not to discuss whether or not Anselm's argument for the existence of God is sound. Rather, my point is that this argument applies to the existence of mathematical objects mutatis mutandis.

We can see the core of Anselm's argument for the existence of God in his Proslogion, Chapters II and III. God is supposed to be that-than-which-a-greater-cannot-be-thought. Anselm says even the Fool agrees that that-than-which-a-greater-cannot-be-thought exists in the mind at least. For otherwise it would not make sense for the Fool to deny that that-than-which-a-greater-cannot-be-thought actually exists. But it may be objected that some things exist only in the mind and do not actually exist. To forestall an objection like this, Anselm takes as an example the picture in the mind of a painter. In this example, the painter's having the picture in the mind is not sufficient for saying that the picture actually exists. We can say that the picture actually exists only after the painter has executed the painting. But this is not the case with the existence of God.

Now we shall show that that-than-which-a-greater-cannot-be-thought also exists in reality. Assume for reductio ad absurdum that that-than-which-a-greater-cannot-be-thought exists in the mind alone. Even so, that-than-which-a-greater-cannot-be-thought can be thought to exist in reality. The latter would be greater than the former. So that-than-which-a-greater-cannot-be-thought is that-than-which-a-greater-can-be-thought. A contradiction. Therefore, that-than-which-a-greater-cannot-be-thought also exists in reality. We have to note that the matter at stake here is that-than-which-a-greater-cannot-be-thought, not that-which-is-greater-than-everything.

Also, the existence of God is of a unique nature such that God cannot be thought not to exist. Again, assume for reductio ad absurdum that that-than-which-a-greater-cannot-be-thought can be thought not to exist. Something-that-cannot-be-thought-not-to-exist is greater than something-that-can-be-thought-not-to-exist. So that-than-which-a-greater-cannot-be-thought is that-than-which-a-greater-can-be-thought. A contradiction. Therefore, that-than-which-a-greater-cannot-be-thought cannot be thought not to exist. This means that that-than-which-a-greater-cannot-be-thought necessarily exists. The upshot of Anselm's argument is that in God essence and existence coincide.


This means that we can derive the existence of God from the concept alone insofar as it has no inner contradictions. My claim is that in mathematics essence and existence coincide. I shall argue that we can derive the existence of mathematical objects from the concept alone insofar as it has no inner contradictions. Although the object of his intention was different from mine, I pay attention to Anselm's proof only as a primary source of the argument that essence amounts to existence. It would be an overreaction to reject the argument flatly just because it concerns the proof of the existence of God. The success or failure of Anselm's proof depends more on what Anselm means by God than on the argument itself. We have seen that by God Anselm means that-than-which-a-greater-cannot-be-thought. It seems to me that this notion of God is quite different from what we normally call "God." But it would be too great a digression if I discussed here whether or not Anselm's notion of God should be taken to be "God." I am not totally committed to Anselm's proof of the existence of God, still less to the existence of God by his proof. I am committed to Anselm's proof only in the sense that his argument gives us a clue about how we get access to an object to be known when there is no causal connection between ourselves and the object. So Anselm's proof contains an important insight into the existence of mathematical objects as the paradigmatic example of this, although he might not have intended it by his proof.

6.6 Locke on Essences

In this connection, it is worth mentioning Locke's philosophy of mathematics. Although Locke is often referred to as the father of British Empiricism, it is worth noting that he has a unique philosophy of mathematics. Given that Locke gives mathematics a special status, it would be rash to criticize his philosophy of mathematics for having the defects with which empiricist philosophers of mathematics (e.g., J. S. Mill) are often charged, that is, not being able to explain higher-order mathematical objects, such as sets, large numbers, and the infinite divisibility of space. What makes Locke's philosophy of mathematics unique is the distinction between nominal and real essences and his claim that in mathematics nominal and real essences coincide.58 By contrast Locke claims that in

58 Locke also claims that nominal and real essences coincide in morality. But I want to focus on mathematics alone in this paper.

substances we have no idea of the real essence at all. The significance of Locke's theory of essences can be seen more clearly when considered in light of the Aristotelian doctrine of "substantial forms." I grant that it is an oversimplification to identify the proto-Aristotelian notion of a substantial form with the notion of a substantial form which modern philosophers reject as Aristotelian. The notion of a substantial form took on new life among scholastic Aristotelians, and was developed in ways that Aristotle himself never suggested. But the notion of a substantial form has its roots in Aristotle's physical conception of form as one of the four causes, along with his metaphysical condition that form, above all else, is substance in the primary sense. So it seems to me that there is good reason for modern philosophers to ascribe the doctrine of substantial forms to Aristotle.

Aristotle rejected the atomic theory of matter and saw matter as extending in a continuum throughout the universe, leaving no void. Aristotle thinks that everything under the moon is compounded from the four elements which he borrowed from Empedocles: earth, water, air, and fire. But they are all composed of the same prime matter, and are differentiated by their simple qualities, one from each of the pairs of opposites hot/cold and wet/dry. Here it is worth noting that after the quantitative, mathematical theories of the atomists and Plato, Aristotle appeals to a qualitative doctrine. Thus earth is cold and dry, water wet and cold, air hot and wet, fire hot and dry.59 When the four elements are combined to form more complex substances, the qualities which are innate in the matter composing the elements are also combined, producing the properties of the mixture. Aristotle makes an absolute distinction between "up" and "down." This leads him to distinguish "heavy" bodies, which naturally tend to move "down," from "light" bodies, which tend to move "up," away from the center. The heaviest elements tend to be gathered together nearest the center, the lightest to be furthest from it. Each element thus has its "natural place," that of water being immediately above earth, that of air next, and that of fire furthest from the center, and nearest to the regions

59 The Greek terms ὑγρόν and ξηρόν are wider than 'wet' and 'dry' in English, for ὑγρόν refers to both liquids and gases, and ξηρόν especially, but not exclusively, to solids. (See Lloyd, Early Greek Science: Thales to Aristotle, p. 107)

occupied by celestial matter. This line of thought bears fruit as the doctrine of "substantial forms" and "real qualities." That is, heat, color and a host of other physical properties are thought to be real, innate and intrinsic in bodies.

Locke assumes that things are particular, whereas most words are general, and abstraction is suggested to be what makes ideas general. The point is that we get abstract ideas by separating particular ideas from all other existences such as space and time. For example, we can get the abstract idea of man by leaving out that which is peculiar to Peter and James, Mary and Jane, and retaining only what is common to them all. This procedure can be repeated to yield the still more abstract idea of animal. According to Locke, however, the categorization or classification of natural kinds is "the workmanship of the understanding," and in this sense the essences of natural kinds are nominal essences. The real essence on which the sensible properties depend is unknown to us. Take "gold" as an example. We have the complex idea of gold, e.g., yellowness, weight, fusibility, malleableness and solubility in aqua regia, but it is just the nominal essence of gold. Since we have no idea at all of the real essence of gold, it is impossible for us to know with certainty whether or not these properties are universally affirmed of gold and how they are connected with each other. Their connection can be ascribed to an arbitrary will of God, so we have only experimental knowledge of it. But, Locke says, there are some cases in which the ideas contain natural connections among themselves, and in these cases alone we have certain and universal knowledge. Here Locke has in mind the ideas which are the nominal as well as the real essences: "Three angles of a triangle are equal to two right ones." Locke claims that in mathematics nominal and real essences coincide. According to Locke, the ignorance of mathematical truths is not due to any imperfection of our faculties or uncertainty in the things themselves, but to a failure in acquiring, examining and comparing our own ideas. (E, IV, iii, 30)

When Locke says that in mathematics nominal and real essences coincide, I do not take it that Locke claims that mathematical truths are fictional or analytic. According to Locke, mathematical truths are not only certain but real knowledge, and not the bare empty

vision of vain, insignificant chimeras of the brain. (E, IV, iv, 6)60 Also, Locke actually classifies truth or knowledge into two kinds: verbal/trifling and real/instructive. Indeed Locke acknowledges that there are general propositions which are true but do not increase our knowledge. For example, propositions such as "the whole is equal to all its parts," or "if you take equals from equals, the remainder will be equal," do not help us increase our knowledge. When it comes to mathematics, however, Locke claims that the mathematical truth "the external angle of all triangles is bigger than either of the opposite internal angles" is not contained in the complex idea "triangle." He goes on to say that "this is a real truth and conveys with it instructive real knowledge." (E, IV, viii, 8)

Although Locke argues that we cannot discover the real essence of natural kinds, this does not necessarily mean that Locke flatly denies the existence of real essences. The following two are quite different claims:

(1) There are no real essences of natural kinds.
(2) We do not even know whether or not there are real essences of natural kinds. If there are any, they are unknown to us.

Locke just argues against taking our ideas of natural kinds for real essences, but he does not deny the existence of real essences of natural kinds. In this respect, I agree with Bolton's claim that Locke's anti-essentialist doctrine of nominal essences holds without denying the existence of real essences.61 Although Locke's theory of essences is often construed as anti-essentialism, this should have to do only with nominal essences. To put it dramatically, Locke's anti-essentialist view of nominal essences is compatible with essentialist metaphysics.

Although there is no doubt that Locke's doctrine of the coincidence of nominal and real essences in mathematics is significant, I believe that it falls short of the Platonic view of mathematical existence. Since for Locke all there is in mathematics are the ideas inside our minds, we do not have to care about external existence corresponding to them (E,

60 Locke's definition of certain and real knowledge is as follows: "Wherever we perceive the agreement or disagreement of any of our ideas, there is certain knowledge: and wherever we are sure those ideas agree with the reality of things, there is certain real knowledge." (E, IV, iv, 18)
61 Bolton, "The Relevance of Locke's Theory of Ideas to His Doctrine of Nominal Essence and Anti-Essentialist Semantic Theory," in Chappell, Locke, p. 214-225.


IV, iv, 8). But since Platonists literally believe in mathematical objects outside our minds, some explanation to link essence with existence in mathematics is still needed for them.

6.7 Essence and Existence in Mathematics

Even though the main interest of the contemporary debate on essentialism lies not in mathematical or logical essence but rather in natural kinds, when the latter is seen from the perspective of the former, we can clearly see the characteristic features of natural kinds. We should make a distinction between two cases in which a truth holds:

(1) Since it is impossible that there does not exist an object which the truth is about, the truth holds in every possible world.
(2) Since it is possible that there does not exist an object which the truth is about, the truth holds only in every possible world where the object exists.

The former is concerned with mathematical objects and the latter with natural kinds (if any). This distinction also coincides with that between the necessary a priori and the necessary a posteriori. In this light, we can clearly see why in mathematics nominal and real essences coincide, whereas in the empirical sciences they diverge. The world has a hierarchical structure organized by how nominal and real essences are interwoven.

Now, with the aid of the arguments above I shall claim that there are two conditions to be satisfied in order for us to derive existence from essence alone.

(1) Insofar as we think, we can think existence without contradiction. In other words, except when we do not think, we cannot think non-existence.
(2) Existence contains no empirical content. This applies only to mathematical objects.

I mean that, by (1), actual mathematical theories belong to the maximally consistent theory, and by (2), mathematical truths are a priori.

Even in the natural sciences we can find some examples in which existence was successfully derived from essence. In 1846 Leverrier and Adams predicted the existence of Neptune on the ground that its gravitational force would explain the observed anomalies in the motion of Uranus. Based on this conjecture, Galle actually discovered the planet. In this case, we could say that there were indirect causal connections with the planet. So the discovery of germanium would be a more appropriate example here. Before Winkler discovered

germanium in 1886, Mendeleev knew many properties of the metal due to the "gap" in the periodic table. In this case, there were no causal connections at all with the metal. From the history of science, however, we also know there were some failed attempts to derive existence from essence. The ether and phlogiston hypotheses are two typical examples. In the natural sciences, on the one hand, even if a theory which predicts the existence of an object is internally consistent, the theory is not confirmed until we can find the real existence of the object. On the other hand, even if the existence of an object is empirically verified, we need to corroborate the existence by formulating a theory that explains it.

6.8 What is a Maximally Consistent Theory?

If consistency is the only criterion that mathematical theories must satisfy, there are multiple internally consistent but mutually inconsistent theories. By arguing that in mathematics essence amounts to existence, I mean that mathematical theories should satisfy maximal consistency. In the last chapter, as a Platonist, I proposed to examine the interrelationships among the models. Actually, the independence proofs give us clues as to how the models are interrelated. Gödel's inner model ($V = L$) implies the Axiom of Choice and the Continuum Hypothesis. $L$ is the smallest proper class that is a first-order universe. But the inner model method does not work in proving the independence of the Continuum Hypothesis. Let $M$ be a countable transitive model for ZFC. Cohen's method of forcing shows that in the generic extension of $M$ (i.e., $M[G]$), the Axiom of Choice is true but the Continuum Hypothesis is false. But the forcing method can also be used to construct a model in which the Axiom of Choice does not hold. This means that there is a submodel $N$ such that $M \subseteq N \subseteq M[G]$. From this perspective, I am inclined to think that the Axiom of Choice is true, but the Continuum Hypothesis is false.
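The relationships among the models just described can be collected in one display. This is only a summary of the paragraph above, assuming that $M$ is a countable transitive model of ZFC and that $G$ is a suitable generic filter over $M$; the satisfaction symbol $\models$ and the abbreviations CH and AC are notation introduced here.

```latex
\begin{align*}
L &\models \mathrm{ZFC} + \mathrm{CH}
  && \text{(G\"odel's inner model)}\\
M[G] &\models \mathrm{ZFC} + \neg\mathrm{CH}
  && \text{(Cohen's generic extension of } M\text{)}\\
M \subseteq N \subseteq M[G], \quad N &\models \mathrm{ZF} + \neg\mathrm{AC}
  && \text{(an intermediate submodel)}
\end{align*}
```

The forcing notions used to violate CH and to violate AC differ, so the $G$ in the second and third lines should not be read as the same filter.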


Figure 28: Example of the interrelationships among the models.

Since two models with different cardinalities cannot be isomorphic, all we can hope for is that all models with the same cardinality are isomorphic. A first-order formal system $S$ is $\kappa$-categorical if all models of $S$ of infinite cardinality $\kappa$ are isomorphic. Vaught's test tells us an interesting fact about the relation between "$\kappa$-categorical" and "complete": If all models of $S$ are infinite and $S$ is $\kappa$-categorical for some infinite cardinal $\kappa$, then $S$ is complete. Contrapositively, if $S$ is incomplete, then $S$ is not $\kappa$-categorical for any infinite $\kappa$. This means that there are two non-isomorphic models with the same cardinality. Which model should we choose in this case? Since the two models are of the same cardinality, whether or not there is a one-one correspondence between them is not good enough to compare the size of the models. For instance, there is a one-one correspondence between the set $N$ of natural numbers and the set $Q$ of rational numbers. Should we choose the set of natural numbers, following Kronecker's dictum: "God made the natural numbers; all else is the work of man"?
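Vaught's test, as stated above, can be illustrated by a standard example. The sketch below assumes a countable first-order language, which the usual formulation of the test requires; the name DLO for the theory of dense linear orders without endpoints is introduced here and does not appear in the text.

```latex
\textbf{Vaught's test.} Let $S$ be a first-order theory in a countable language
all of whose models are infinite. If $S$ is $\kappa$-categorical for some
infinite cardinal $\kappa$, then $S$ is complete.

\textbf{Example.} Let $\mathrm{DLO}$ be the theory of dense linear orders
without endpoints.
\begin{enumerate}
  \item Every model of $\mathrm{DLO}$ is infinite, since density yields a new
        element between any two distinct elements.
  \item By Cantor's back-and-forth argument, every countable model of
        $\mathrm{DLO}$ is isomorphic to $(\mathbb{Q}, <)$, so $\mathrm{DLO}$ is
        $\aleph_0$-categorical.
  \item Hence, by Vaught's test, $\mathrm{DLO}$ is complete. In particular
        $(\mathbb{Q}, <)$ and $(\mathbb{R}, <)$ satisfy the same first-order
        sentences, although they are not isomorphic because their
        cardinalities differ.
\end{enumerate}
```

The example also shows the limit of the test: completeness fixes the first-order truths, but it does not single out one model.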


A good epistemology must be able to answer the question of which theory we should choose when facing multiple internally consistent but mutually inconsistent theories. In my view, the problem with an appeal to a non-natural mental faculty, such as mathematical intuition, is that it cannot give an objective criterion for answering this question. Here maybe we should be reminded of the three criteria, other than consistency, that Quine believes we should adopt in formulating scientific hypotheses: simplicity, familiarity, and sufficient reason. In the above example, however, the set of natural numbers is a subset of the set of rational numbers. So we should favor the set of rational numbers over the set of natural numbers. So in the case of two models with the same cardinality as well, we can maintain the criterion of maximality. Indeed there are cases in which we cannot tell for sure which theory maximizes a realm of mathematical objects. But considering that mathematical theories are not finished but still developing, some theories are more fruitful than others in the sense that they lead to further research by the method of trial and error.

We cannot say that physical objects actually exist on the ground that we can think of them. As Kant says, "in my financial condition there is more with a hundred actual dollars than with the mere concept of them."62 In regard to mathematical objects, however, if they were to exist only in the mind and not actually exist, it would not even make sense to say that we can think of them. In mathematics as well as in the natural sciences we are thinking not about the objects inside our minds but about the objects outside our minds. But physical objects need empirical materials to be realized, whereas mathematical objects contain no empirical content. So in mathematics alone, there is no problem of whether or not there is an object corresponding to a concept, but a concept turns into an object by default. For instance, a unicorn does not actually exist because it needs a horn, head, nose, etc., whereas a measurable cardinal actually exists because it does not need such empirical materials.

It seems to me that the relation between essence and existence has some implications for the contemporary debates on possible worlds. My concern is to what degree an analogy holds between the existence of possible worlds and that of mathematical

62 Kant, Critique of Pure Reason (A599/B627).

objects. I shall draw upon Lewis's possibilism. On the one hand, Lewis attempts to justify the possible worlds in parallel with the mathematical objects in that the existence of both can be accepted even without a causal connection between ourselves and the objects. On the other hand, he claims that the mathematical objects are different from the possible worlds in that the former are abstract objects, while the latter are concrete objects. Considering the events outside the light cone, I grant that there are many concrete objects which we cannot get access to. But they are concrete objects, which contain empirical content. Since the existence of possible worlds does not satisfy the condition (2) stated at the end of the last section, essence alone cannot guarantee existence. In the case of mathematics, since the existence of mathematical objects does not contain empirical content, essence alone can guarantee existence. But my view is not that there are no possible worlds, but rather that we just cannot show whether or not there exist possible worlds.

Conclusion

We have seen that most mathematicians prefer the Axiom of Choice to the Axiom of Determinacy in favor of the existence of non-Lebesgue measurable sets and the well-ordering of the reals. We have also seen that most mathematicians reject the Axiom of Constructibility in favor of the existence of a measurable cardinal. In both cases, working mathematicians are driven by Platonic realism rather than Constructivism. Not only does Platonic realism fit in well with the actual practice of mathematicians; more importantly, we can actually gain a more fruitful picture of mathematics by working with hypotheses based on Platonic realism. The fact that there are some cases in which we could have multiple logically consistent competing theories shows that it is not appropriate to say that every logically consistent theory is a mathematical theory. More precisely, we should say that the actual mathematical theories are parts of the maximally logically consistent theory that describes mathematical reality.

Platonists put emphasis on the difference in existential nature between concrete and abstract objects. Also, as far as abstract objects are concerned, Platonists see a close connection between essence and existence, or possibilities and actualities. The problem with this argument is that it is somewhat dogmatic and

heavily depends on the rationalist tradition. But this argument is destined to be of a self-foundational nature. We should of course seek to explain the established world as efficiently as possible. But we also have to investigate the new world. If mathematical objects are to be restricted to those that are indispensable to the empirical sciences, mathematical objects such as measurable cardinals cannot be accepted. If we admit the significance of mathematical activities themselves, however, we can believe in such mathematical objects. What working mathematicians are doing is not blindly increasing the number of mathematical entities, but rather introducing new potential objects into the mathematical universe.


CONCLUSION

The Axiom of Choice has been a paradigm case for the debate between Platonic realism and Anti-Platonism since its appearance early in the last century. Despite the early criticisms, not only did the Axiom of Choice play a major role in the systematization of Cantor's set theory, but it also had a tremendous impact on many branches of mathematics outside set theory. A variety of applications show the deductive power and stability of the Axiom of Choice. This also suggests that there are many things we can do in the presence of the Axiom of Choice that we could not do otherwise.

Against the actual practice of mathematics, the trend of the philosophy of mathematics in the last few centuries has moved in a direction that eliminates Platonic entities from mathematics. This leads to the rejection of the Kantian doctrine that mathematical truths are synthetic a priori. Coffa suggests that the most fruitful anti-Kantian line was what he calls the "semantic tradition," culminating in Logical Positivism. The main insight was to locate the source of necessity and a priori knowledge in the use of language. A priori knowledge is truth by definition. Dummett calls the approach the linguistic turn in philosophy.

Fictionalism is of the same ilk. According to fictionalism, all of mathematics is simply false. The usefulness of mathematics in the natural sciences can be explained by means of the conservation theorem. This means that the mathematical theory preserves the truth of the scientific theory, but facilitates deductions that could otherwise be made only at greater length and with greater difficulty.

A more plausible view is the indispensability argument. The indispensabilists do not believe that such a fictionalist attempt to nominalize mathematics in its entirety would be successful. According to this view, we may accept the existence of mathematical entities insofar as they are indispensable in explaining the natural sciences. Considering that mathematics since the nineteenth century has separated from the natural sciences and develops autonomously and independently of them, however, the drawback of the indispensability argument is that we have to pay a high price: the sacrifice of many fruitful results of contemporary mathematics. Maybe the Axiom of Choice is one of them. I doubt that the Axiom of Choice is indispensable to natural sciences dealing with the finite domain of the universe. In my view, however, the development of contemporary mathematics is too well organized to be "mathematical recreation," as Quine calls it. We should believe that it reflects mathematical reality of some kind.

Also, Russell refuses to posit the Axiom of Choice as an axiom. The Axiom of Choice is formulated as an existential sentence. And, according to Russell, mathematics is simply a development of logic. But logic is not committed to the existence of any object whatsoever. Therefore mathematics is also not committed to existence as such. Russell claims that mathematics is only concerned with conditional statements with regard to existence. That is to say, mathematics claims nothing more than this: If the Axiom of Choice is true, then such and such a statement follows. Thus Russell interprets the Axiom of Choice as investigated in an "if-then" spirit, and claims that conditional statements are the laws of logic.

A lesson from the model-theoretic arguments is that there are multiple internally consistent but mutually inconsistent theories. So Platonists can no longer claim that every consistent theory belongs to the true mathematical theory. In fact, model-theorists argue that some mathematically important problems depend for their truth-value upon the model in which they are placed. In my view, however, although it is doubtful that we have a mathematical intuition to fix the intended model, we still have principles which serve as criteria to measure the excellence of a model. We should say that the true mathematical theories are those that belong to the maximally consistent theory that describes mathematical reality. As far as mathematical objects are concerned, essence and existence coincide. By contrast, in the natural sciences we cannot derive existence from essence alone.
Therefore, the coincidence of essence and existence shows the unique nature of mathematical truths.

APPENDIX

SOME MATHEMATICAL PROOFS

(I) The Axiom of Choice is equivalent to the Well-Ordering Theorem.

Proof. (i) The Axiom of Choice $\Rightarrow$ the Well-Ordering Theorem: Let $S$ be a set. In order to find a well-ordering of $S$, it suffices to find an ordinal $\alpha$ and a one-to-one $\alpha$-sequence $a_0, a_1, \ldots, a_\xi, \ldots$ ($\xi < \alpha$) which enumerates $S$. By the Axiom of Choice, there is a choice function $f$ on the family of nonempty subsets of $S$. Now we construct the sequence by transfinite recursion:

$a_0 = f(S)$
$a_\xi = f(S - \{a_\eta : \eta < \xi\})$

The construction stops as soon as we exhaust all members of $S$.

(ii) The Well-Ordering Theorem $\Rightarrow$ the Axiom of Choice: Let $F$ be a family of nonempty sets $S$. By the Well-Ordering Theorem, there is a well-ordering of $\bigcup F$. Then we can define a choice function $f$ on $F$:

$f(S)$ = the least element of $S$. ■
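Direction (ii) can be seen at work in a case where a well-ordering is already available. The following sketch takes the natural numbers with their usual order as the well-ordered set; the family $F$ and the sample sets are chosen here purely for illustration.

```latex
Let $F$ be any family of nonempty subsets of $\mathbb{N}$, and let $<$ be the
usual well-ordering of $\mathbb{N}$. Define
\[
  f(S) = \min\nolimits_{<} S \qquad (S \in F).
\]
Since every nonempty $S \subseteq \mathbb{N}$ has a least element,
$f(S) \in S$ for every $S \in F$, so $f$ is a choice function on $F$.
For instance,
\[
  f(\{3, 5, 8\}) = 3, \qquad
  f(\{n \in \mathbb{N} : n \text{ is even}\}) = 0, \qquad
  f(\{n \in \mathbb{N} : n > 100\}) = 101 .
\]
```

No appeal to the Axiom of Choice is needed here because the rule "take the least element" is explicitly definable; the axiom matters exactly when no such uniform rule is at hand.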

(II) The Principle of Dependent Choices implies the Denumerable Axiom of Choice.

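Proof. Let $\{S_n\}$ be a family of denumerably many nonempty sets. Assuming the Principle of Dependent Choices, we shall find a choice function f on $\{S_n\}$. Let A be the set of all finite sequences $\langle x_0, x_1, \ldots, x_n \rangle$ such that $x_i \in S_i$ for each $i \le n$. Let R be the relation on A that holds between $z_n = \langle x_0, x_1, \ldots, x_n \rangle$ and $z_{n+1}$ just in case $z_{n+1} = \langle x_0, x_1, \ldots, x_n, x_{n+1} \rangle$ for some $x_{n+1} \in S_{n+1}$. Such a $z_{n+1}$ always exists because $S_{n+1}$ is nonempty.

By the Principle of Dependent Choices, since R is a relation on A such that for every $z_n \in A$ there exists $z_{n+1} \in A$ with $z_n R z_{n+1}$, there is a sequence $z_0, z_1, z_2, \ldots$ of members of A such that

$z_0 R z_1, \; z_1 R z_2, \; \ldots, \; z_n R z_{n+1}, \; \ldots$

Therefore we can define a choice function f on $\{S_n\}$: $f = \{\langle S_n, z_n(n) \rangle : n \in \omega\}$, where $z_n(n) \in S_n$ is the term of $z_n$ with index n. ■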
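How the choice function is read off the chain $z_0 R z_1, z_1 R z_2, \ldots$ can be sketched on finite data. In the Python sketch below, family(n) stands for $S_n$ and pick stands for a selection step; both are assumptions made for illustration, and with finite lists no choice principle is actually needed. Each chain element extends the previous one by a single term, and the value assigned to $S_n$ is the term $z_n(n)$.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Seq = Tuple[int, ...]


def successors(z: Seq, family: Callable[[int], Sequence[int]]) -> List[Seq]:
    """All R-successors of z: the one-term extensions of z.

    z = (x_0, ..., x_(n-1)) has length n, so the new term is taken from S_n.
    """
    n = len(z)
    return [z + (x,) for x in family(n)]


def chain(family, pick, steps: int) -> List[Seq]:
    """Build z_0 R z_1 R ... R z_steps, starting from a one-term sequence."""
    z: Seq = (pick(family(0)),)
    zs = [z]
    for _ in range(steps):
        z = pick(successors(z, family))  # nonempty because S_n is nonempty
        zs.append(z)
    return zs


def choice_values(zs: List[Seq]) -> Dict[int, int]:
    """Map each index n to z_n(n), the term chosen from S_n."""
    return {n: z[n] for n, z in enumerate(zs)}


# Assumptions for the example only: S_n = [n, n+1, n+2] and pick = min.
zs = chain(lambda n: [n, n + 1, n + 2], min, steps=2)
print(zs)                 # [(0,), (0, 1), (0, 1, 2)]
print(choice_values(zs))  # {0: 0, 1: 1, 2: 2}
```

The sketch computes only a finite initial segment of the chain. The Principle of Dependent Choices is what guarantees the whole infinite sequence at once, which no finite computation can supply.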

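(III) The union of countably many countable sets is countable (The Countable Union Axiom).

Proof. Let $\{S_n\}$ be a family of countably many countable sets. So $|S_n| \le \aleph_0$ for each n, and the set of indices n also has cardinality at most $\aleph_0$ (see footnote 63).

By the Denumerable Axiom of Choice, there exists a choice function f on $\{S_n\}$. Therefore

$|\bigcup S_n| \le \aleph_0 \times \aleph_0 = \aleph_0$. ■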
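The bound $\aleph_0 \times \aleph_0 = \aleph_0$ rests on listing all pairs (n, k) of natural numbers in a single sequence. The Python sketch below walks the pairs diagonal by diagonal and uses them to list the union; enum(n, k), standing for the k-th element of $S_n$, is a hypothetical function supplied by the caller. In the code the enumerations are given explicitly, whereas in the standard argument the simultaneous choice of an enumeration for each $S_n$ is where the Denumerable Axiom of Choice enters.

```python
from itertools import count, islice
from typing import Callable, Iterator, Tuple


def diagonal_pairs() -> Iterator[Tuple[int, int]]:
    """Yield every pair (n, k) of natural numbers exactly once."""
    for total in count(0):          # total = n + k indexes the diagonal
        for n in range(total + 1):
            yield n, total - n


def enumerate_union(enum: Callable[[int, int], int]) -> Iterator[int]:
    """List the union of the sets S_n, where enum(n, k) is the k-th element of S_n.

    The generator is infinite; take a finite prefix with islice.
    """
    seen = set()
    for n, k in diagonal_pairs():
        x = enum(n, k)
        if x not in seen:           # skip elements already listed
            seen.add(x)
            yield x


# Assumption for the example only: S_n is the set of positive multiples of n + 1.
print(list(islice(diagonal_pairs(), 6)))
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
print(list(islice(enumerate_union(lambda n, k: (n + 1) * (k + 1)), 5)))
# [1, 2, 3, 4, 6]
```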

(IV) Every infinite set has a denumerable subset.

Proof. Let T be an infinite set. Since T is infinite, there are subsets Sn of T such that n∈ω .

By the Denumerable Axiom of Choice, there exists a choice function f on the set of Sn.

Therefore there exists a set $C = \{x_n : f(S_n) = x_n \in S_n\}$. This set C is a denumerable subset of T. ■


BIBLIOGRAPHY

Anselm, St. (1979 [1077-78]), St. Anselm's Proslogion, with A Reply on Behalf of the Fool by Gaunilo and The Author's Reply to Gaunilo, translated by M. J. Charlesworth (Notre Dame: University of Notre Dame Press).

Awodey, S. and Carus, A. W. (2001), “Carnap, Completeness, and Categoricity: The Gabelbarkeitssatz of 1928,” Erkenntnis 54: 145-172.

Balaguer, M. (1998), Platonism and Anti-Platonism in Mathematics (Oxford: Oxford University Press).

Ballard, D. (1994), Foundational Aspects of “Non” Standard Mathematics (American Mathematical Society).

Baumslag, B. & Chandler, B. (1968), Theory and Problems of Group Theory (New York: McGraw-Hill).

Bays, T. (2001), “On Putnam and His Models,” The Journal of Philosophy 98: 331-350.

— (2006), “The Mathematics of Skolem's Paradox,” in Dale Jacquette (ed.), Philosophy of Logic: An anthology, 615-648.

Bell, J. L. (1985), Boolean-Valued Models and Independence Proofs in Set Theory (Oxford: Clarendon Press).

Benacerraf, P. (1965), “What Numbers Could Not Be,” Philosophical Review, 74: 47-73; in Benacerraf, P. and Putnam, H. (eds.), 272-294.

— (1973), “Mathematical Truth,” Journal of Philosophy, 70: 661-79; in Jacquette, D. (ed.), 99-109.

— and Putnam, H. (eds.) (1983), Philosophy of Mathematics, second edition (Cambridge: Cambridge University Press).

Bolton, M. B. (1994), “The Relevance of Locke's Theory of Ideas to His Doctrine of Nominal Essence and Anti-Essentialist Semantic Theory,” in Chappell (1998): 214–225.

Boolos, G. S. and Jeffrey, R. C. (1974), Computability and Logic (Cambridge: Cambridge University Press).

Boolos, G. S., Burgess, J. P. and Jeffrey, R. C. (2002), Computability and Logic, 4th ed. (Cambridge: Cambridge University Press).

63 We shall denote the cardinality of x by |x|.

Brouwer, L. E. J. (1999), “Intuitionism and Formalism,” Bulletin (New Series) of the American Mathematical Society 37, 1: 55-64; in Jacquette, D. (ed.), 269-276.

Brueckner, A. (1984), “Putnam's Model-Theoretic Argument Against Metaphysical Realism,” Analysis 44: 134-140.

Burkill, J.C. (1951), The Lebesgue Integral (Cambridge: Cambridge University Press).

Cameron, P. (1998), Sets, Logic and Categories (Berlin: Springer).

Capinski, M., Kopp, P. E. (2004), Measure, Integral and Probability (Berlin: Springer).

Chappell, V. (ed.) (1998), Locke (Oxford: Oxford University Press).

Coffa, A. (1991), The Semantic Tradition From Kant to Carnap (Cambridge: Cambridge University Press).

Cohen, P. J. (1966), Set Theory and the Continuum Hypothesis (Benjamin, New York).

Cohen, D. E. (1987), Computability and Logic (J. Wiley & Sons).

Colyvan, M. (2001), The Indispensability of Mathematics (Oxford: Oxford University Press).

Corcoran, J. (1980), “Categoricity,” History and Philosophy of Logic, 1: 187-207.

Crossley, J. N. et al. (1972), What is Mathematical Logic? (Oxford: Oxford University Press).

Cutland, N. J. (1980), Computability—An introduction to recursive function theory (Cambridge: Cambridge University Press).

Dalen, V. D., Ebbinghaus, H-D. (2000), “Zermelo and the Skolem Paradox,” The Bulletin of Symbolic Logic, 6: 145-161.

Dales, H. G. and Oliveri, G. (eds.) (1998), Truth in Mathematics (Oxford: Clarendon Press).

Dancy, R. M. (2004), Plato's Introduction of Forms (Cambridge: Cambridge University Press).

Davidson D. and Hintikka, J. (eds.) (1969), Words and Objections: Essays on the work of W. V. Quine (Dordrecht: Reidel).


Demopoulos, W. (1994), “Frege, Hilbert, and the Conceptual Structure of Model Theory,” History and Philosophy of Logic, 15: 211-225.

DeVidi, D. (2004), “Choice Principles & Constructive Logics,” Philosophia Mathematica, 12.

Devlin, K. (1979), The Joy of Sets (Berlin: Springer).

— (1988), Mathematics: The New Golden Age (New York: Columbia University Press).

Drake, F. (1974), Set Theory: An Introduction to Large Cardinals (Amsterdam: North-Holland).

Ebbinghaus, H-D. (2000), “Zermelo: Definiteness and the universe of definable sets,” History and Philosophy of Logic, 24: 197-219.

—, Flum, J. and Thomas, W. (1994), Mathematical Logic (Berlin: Springer).

Enderton, H. (1972), Elements of Set Theory (New York: Academic Press).

Feferman, S. (2005), “Predicativity,” in Stewart Shapiro (ed.), The Oxford Handbook of Philosophy of Mathematics and Logic (Oxford: Oxford University Press), pp. 590-624.

Field, H. (1989), Realism, Mathematics and Modality (Oxford: Blackwell).

Fraenkel, A. A., Bar-Hillel, Y. and Levy, A. (1973), Foundations of Set Theory, 2nd rev. edn. (Amsterdam: North-Holland).

Franzén, T. (2005), Gödel’s Theorem: An Incomplete Guide to Its Use and Abuse (A K Peters, Ltd).

Gentzen, G. (1969), The Collected Papers of Gerhard Gentzen (Amsterdam: North-Holland).

George, A. (1985), “Skolem and the Löwenheim-Skolem Theorems,” History and Philosophy of Logic, 6: 75–89.

Gödel, K. (1989), Collected Works, Volume II, Publications 1938-1974 (Oxford: Oxford University Press).

Goldring, N. (1995), “Measures: Back and Forth between Point sets and Large Sets,” The Bulletin of Symbolic Logic, Vol. 1, No. 2 (Jun.), pp. 170-188.

Goldstein, R. (2005), Incompleteness: The Proof and Paradox of Kurt Gödel (W. W. Norton & Company).


Halmos, P. (1960), Naïve Set Theory (Berlin: Springer).

Hamilton, A. (1988), Logic for Mathematicians (Cambridge: Cambridge University Press).

Hawkins, T. (1975), Lebesgue’s Theory of Integration. Its origins and Development. Second edition. (New York: Chelsea).

Hinman, P. G. (1978), Recursion-Theoretic Hierarchies (Berlin: Springer-Verlag).

Hintikka, J. (1991), “Is Truth Ineffable?” in K. Lehrer & E. Sosa (eds.) (1991), The Opened Curtain (Boulder: Westview).

— (1995), “The Standard vs. Nonstandard Distinction: A Watershed in the Foundations of Mathematics,” in Hintikka, J. (ed.), From Dedekind to Gödel (Dordrecht: Kluwer Academic), 21-44.

— and Sandu, G. (1992), “The Skeleton in Frege's Cupboard: The Standard Versus Nonstandard Distinction,” Journal of Philosophy, 89: 290-315.

Hodes, H. T. (1984), “Logicism and the Ontological Commitments of Arithmetic,” Journal of Philosophy, 81: 123-149.

— (1990), “Ontological commitment, thick and thin,” in G. Boolos (ed.), Meaning and method: essays in honor of Hilary Putnam (Cambridge: Cambridge University Press), 235-260.

— (1991), “Where Do Sets Come From?” The Journal of Symbolic Logic, 56: 150-75; in Jacquette, D. (ed.), 377-395.

Howie, J. (2001), Real Analysis (Berlin: Springer).

Jacquette, D. (ed.) (2002), Philosophy of Mathematics: An Anthology (Blackwell Publishers).

— (2006), Philosophy of Logic: An anthology (Oxford: Blackwell).

Jaggar, A. (1973), “On One of the Reasons for the Indeterminacy of Translation,” Philosophy and Phenomenological Research, 34: 257-265.

Jech, T. (1973), The Axiom of Choice (Amsterdam: North-Holland).

— (1978), Set Theory (New York: Academic Press).

Jensen, R. (1995), “Inner Models and Large Cardinals,” The Bulletin of Symbolic Logic, Vol. 1, No. 4 (Dec.), pp. 393-407.

Jones, F. (2001), Lebesgue Integration on Euclidean Space (Sudbury, Mass.: Jones and Bartlett Publishers).

Judah, H., Just, W. and Woodin, H. (eds.) (1992), Set Theory of the Continuum (New York: Springer-Verlag).

Kant, I. (1998 [1781/87]), Critique of Pure Reason, translated by P. Guyer and A. Wood (Cambridge: Cambridge University Press).

Kitcher, P. (1983), The Nature of Mathematical Knowledge (Oxford: Oxford University Press).

Kleene, S. C. (1952), Introduction to Metamathematics (Princeton, N.J.: Van Nostrand).

Kunen, K. (1980), Set Theory: An Introduction to Independence Proofs (Amsterdam: North-Holland).

Kuratowski, K. & Mostowski, A. (1968), Set Theory (Amsterdam: North-Holland).

Levy, A. (1979), Basic Set Theory. Perspectives in Mathematical Logic (Berlin: Springer).

Lewis, D. (1986), On the Plurality of Worlds (Oxford: Blackwell).

Lloyd, G. E. R. (1970), Early Greek Science: Thales to Aristotle (W. W. Norton & Company).

Locke, J. (1995 [1690]), An Essay Concerning Human Understanding (Amherst, NY: Prometheus Books).

Maddy, P. (1990), Realism in Mathematics (Oxford: Oxford University Press).

— (1997), Naturalism in Mathematics (Oxford: Oxford University Press).

Manin, Y. I. (1977), A Course in Mathematical Logic (Graduate Texts in Mathematics), (New York: Springer-Verlag).

Marion, M. (1998), Wittgenstein, Finitism, and Mathematics (Oxford: Clarendon Press).

McKirahan, R. (1994), Philosophy before Socrates (Hackett).

Mendelson, E. (1997), Introduction to Mathematical Logic 4th ed. (Chapman & Hall, London).

Moore, A. W. (1985), “Set Theory, Skolem's Paradox and the Tractatus,” Analysis, 45: 13-20.


Moore, G. H. (1982), Zermelo’s Axiom of Choice (New York: Springer).

Myhill, J. (1953), Symposium on the Ontological Significance of the Löwenheim-Skolem Theorem, Academic Freedom, Logic, and Religion. Philadelphia, PA: Amer. Philos. Soc., pp. 57-70.

Nagel, E. and Newman, J. R. (1958), Gödel’s Proof (New York University Press).

Putnam, H. (1979), Mathematics, Matter and Method (Cambridge: Cambridge University Press).

— (1983), Realism and Reason (Cambridge: Cambridge University Press).

Quine, W. V. O. (1960), Word and Object (Cambridge: Cambridge University Press).

— (1964), “Ontological Reduction and the World of Numbers,” The Journal of Philosophy 61: 209-16.

— (1969), Ontological Relativity and Other Essays (New York: Columbia University Press).

— (1970), “On the Reason for Indeterminacy of Translation,” The Journal of Philosophy, 67: 178-183.

— (1975), “On Empirically Equivalent Systems of the World,” Erkenntnis 9: 313-328.

— (1987), “Indeterminacy of Translation Again,” The Journal of Philosophy, 84: 5-10.

Read, S. (1997), “Completeness and Categoricity: Frege, Gödel and Model Theory,” History and Philosophy of Logic, 18: 79–93.

Redding, P. (2007), Analytic Philosophy and the Return of Hegelian Thought (Cambridge: Cambridge University Press).

Resnik, M. D. (1966), “On Skolem's Paradox,” The Journal of Philosophy, 63: 425-438.

— (1969), “More on Skolem's Paradox,” Noûs, 3: 185-196.

Robinson, R. M. (1947), “On the decomposition of spheres,” Fund. Math. 34, 246-260.

Rogers, H. (1987), Theory of Recursive Functions and Effective Computability (MIT Press).

Rosenlicht, M. (1968), Introduction to Analysis (New York: Dover Publications, Inc).

Russell, B. (1996 [1903]), Principles of Mathematics (New York: W. W. Norton & Company).

— (1919), Introduction to Mathematical Philosophy (London: Allen & Unwin).

— (1960), Our Knowledge of the External World (New York: Mentor Books).

— and Whitehead, A. N. (1910), Principia Mathematica (Cambridge: Cambridge University Press).

Shapiro, S. (1997), Philosophy of Mathematics: Structure and Ontology (New York: Oxford University Press).

— (2000), Thinking About Mathematics (Oxford: Oxford University Press).

Shoenfield, J. R. (1967), Mathematical Logic (Addison-Wesley Publishing Co).

Smullyan, R. M. and Fitting, M. (1996), Set Theory and the Continuum Problem (Oxford: Clarendon Press).

Soare, R. I. (1987), Recursively Enumerable Sets and Degrees (Berlin: Springer).

Stromberg, K. (1979), “The Banach-Tarski paradox,” Amer. Math. Monthly 86, 151-61.

Suppes, P. (1960), Axiomatic Set Theory (Van Nostrand).

Tait, W. (1994), “The law of excluded middle and the axiom of choice,” in Alexander George (ed.), Mathematics and Mind (Oxford: Oxford University Press), pp. 45-70.

Temple, G. (1971), The Structure of Lebesgue Integration Theory (Oxford: Clarendon Press).

Thomas, W. (1968), “Platonism and the Skolem Paradox,” Analysis, 28: 193–196.

Tiles, M. (1989), The Philosophy of Set Theory: An Historical Introduction to Cantor's Paradise (Oxford: Basil Blackwell).

van Heijenoort, J. (ed.) (1967), From Frege to Gödel (Cambridge, Mass.: Harvard University Press).

Wagon, S. (1985), The Banach-Tarski Paradox (Cambridge: Cambridge University Press).

Wapner, L. (2005), The Pea and the Sun (A K Peters, Ltd.).

Weir, A. (1973), Lebesgue Integration & Measure (Cambridge: Cambridge University Press).

Wilcox, H. and Myers, D. (1978), An Introduction to Lebesgue Integration and Fourier Series (New York: Dover Publications, Inc).


Willard, S. (1970), General Topology (New York: Dover Publications, Inc).

Zermelo, E. (1904), “Proof that every set can be well-ordered,” repr. in van Heijenoort (ed.), 139-41.

— (1908), “A new proof of the possibility of a well-ordering,” repr. in van Heijenoort (ed.), 183-98.


BIOGRAPHICAL SKETCH

Wataru Asanuma entered the Ph.D. program in philosophy at Florida State University in 2001. Originally from Japan, he had received his B.A. and M.A. in the same subject from Kyoto University. His main research interests lie in logic, the philosophy of mathematics, and the philosophy of science in general.
