Paradox and Foundation

Zach Weber

Submitted in total fulfilment of the requirements of the degree of Doctor of Philosophy

May 2009

School of Philosophy, Anthropology and Social Inquiry
The University of Melbourne

This is to certify that
- the thesis comprises only my original work towards the PhD,
- due acknowledgement has been made in the text to all other material used,
- the thesis is less than 100,000 words in length.

Preface

Dialethic paraconsistency is an approach to formal and philosophical theories in which some but not all contradictions are true. Advancing that program, this thesis is about paradox and the foundations of mathematics, and is divided accordingly into two main parts.

The first part concerns the history and philosophy of set theory from Cantor through the independence proofs, focusing on the set concept. A set is any collection of objects that is itself an object, with identity completely determined by membership. The set concept is called naive because it is inconsistent. I argue that the set concept is inherently and rightly paradoxical, because sets are both intensional and extensional objects: Sets are predicates in extension. All consistent characterizations of sets are either not explanatory or not coherent. To understand sets, we need to reason about them with an appropriate logic; paraconsistent naive set theory is situated as a continuation of the original foundational project.

The second part produces a set theory deduced from an unrestricted comprehension principle using the weak relevant logic DLQ, dialethic logic with quantifiers. I discuss some of the problems involved with embedding set theory in DLQ, especially related to identity and substitution. Then I outline how the basic toolkit of standard set theory may be developed in this logic from the naive set concept alone, up through the theories of ordinal and cardinal numbers, including Cantor’s theorem. The infamous paradoxes are just theorems, as is the existence of objects like a universal set, unrestricted complements, and the full set of ordinals. This furnishes the start of a purely paraconsistent foundation for mathematics, providing recapture of classical theorems. It is further demonstrated that paraconsistent set theory is able to establish strong results such as the existence of large cardinals, the axiom of choice, and a decision on the generalized continuum hypothesis. The foundational reduction to sets, then, is both philosophically illuminating and technically rewarding.

To conclude, some mathematical results are used to answer outstanding questions about infinity and the transconsistent—demarcating the line between transfinite and absolute infinity, in the same way that Dedekind demarcated the line between finite and infinite: by turning a paradox into a definition.


The main original contribution of this piece is an operational artifact of paraconsistent machinery. The elemental chapter 5 capturing the theories of ordinal and cardinal numbers is the central moment. A reader already convinced of the intrinsic interest of a robustly inconsistent paraconsistent set theory may wish to begin there.

∗ ∗ ∗

A note on names. To token that some contradictions are true, Routley and Meyer had been using variations of dialectic, until 1981 when Routley and Priest co-coined the neologism dialeth(e)ism, or two-way truth, inspired by a remark of Wittgenstein’s. (DL, then, originally stood for dialectical logic.) According to first-hand accounts, Priest and Routley then forgot to agree how to spell the ism, and it has appeared since both with (in Priest) and without (in Routley) the extra ‘e’. I use the e-free form uniformly, both for aesthetic preference and in honor of Routley, who inaugurated dialethic set theory in his [Rou77]. Further, in the mid-eighties Routley changed his surname to Sylvan, and both names can be found intermittently distributed in the literature. As all the works I cite by him were published as Routley, that epithet is the only one used here.

∗ ∗ ∗

Thanks are due foremost to my supervisor, Graham Priest, whose own work and patient hours of discussion have been invaluable. Greg Restall co-supervised and offered constructive discussion and comments. Audiences at the Melbourne Logic Group and the University of Melbourne Postgraduate Colloquia have listened and contributed over the years. Similar thanks are due the Melbourne-Adelaide Logic Axis, the Melbourne Postgraduate Logic Conference 2006, the APPC 2004, and the AAP in Australia and New Zealand 2004–8. Ross Brady has gone above and beyond in checking details and suggesting strategies for improving the proofs. Anonymous referees for [Webng], which is incorporated here into chapter 4, also provided a great deal of help. My office mates, in particular Conrad Asmus, have been constant sources of consultation and challenge, as have been many of the postgrads here. And Vicki Macknight has been loving, listening, and patiently encouraging throughout—and read the manuscript, twice.

This work was funded by an International Postgraduate Research Scholarship and a Melbourne International Research Scholarship. My thanks go to the Australian government and its taxpayers, who I am sure are eagerly awaiting the rise of inconsistent set theory.

Melbourne, May 2009

Contents

Preface

Introduction
1. The Set Concept
2. Definitions and Diagonals
3. Dialethism and Paraconsistency
4. A Place to Stand

Chapter 0. Into the Dialethic Fields
1. Paraconsistent Set Theory in C₁⁼
2. Dialectics
3. Inconsistent Sets

Part One — Paradox

Chapter 1. First Foundations—The Nineteenth Century
1. Introduction
2. Cantor: From Transfinite Numbers to Sets
3. Dedekind
4. Frege: From Logic to Number
5. Paradox and Prospect

Chapter 2. The Absolute
1. Introduction
2. Equivalents of the
3. The View From Below
4. The View from Above
5. Conclusion

Chapter 3. Foundations in the Twentieth Century
1. Introduction
2. On Method
3. Paradise Lost, Paradise Regained
4. Hierarchy
5. Of Comprehension
6. Conclusion

Part Two—Foundation

Chapter 4. On A Logic for Naive Set Theory
1. Introduction
2. of Formal Inconsistency
3. A
4. Relevant Identity
5. Restricted Quantification
6. An Iatrogenic Disorder

Chapter 5. Elements
1. Introduction
2. Logic
3.
4. Basics
5. ZF
6. Ordinals
7. Global Choice
8. Cardinals

Chapter 6. Reflection and Large Cardinals
1. Introduction
2. Reflection Theorems
3. Ultrafilters
4. Axioms of Infinity
5. The End of the Ordinals
6. Reflecting the Universe
7. Conclusion

Chapter 7. On the Transconsistent
1. Characterizing the Absolute
2. Characterizing the Transconsistent
3. Characterizing the Mathematics

Conclusion

Bibliography

Introduction

1. The Set Concept

The concept of a set is simple to state: A set is any collection of objects that is itself an object, with its identity completely determined by its members. Many set theory textbooks open by claiming that ‘set’ cannot be formally defined, because it is too primitive, but this isn’t so; we’ve just had a fine definition. This is the naive set concept, and it can be completely characterized in the language of first order logic by the principles of abstraction and extensionality,

    x ∈ {z : Φ(z)} ↔ Φ(x),
    x = y ↔ (∀z)(z ∈ x ↔ z ∈ y),

respectively. These clauses fix the meanings of ∈ and =, the only non-logical parts of the vocabulary of set theory; they look very much like analytic definitions of predication and identity. From abstraction immediately follows the principle of comprehension,

    (∃y)(∀x)(x ∈ y ↔ Φ(x)).

(Later we will state these officially, specifying a language.) The clauses underpin broader mathematical needs in a natural way: The comprehension principle provides for existence, and the extensionality principle governs uniqueness, of objects. A set is the unique extension of an arbitrary predicate.

The reason textbooks claim that there can be no definition of set is not that the concept is somehow opaque or ambiguous. The issue is that the set concept is formally inconsistent. The concept is not indeterminate, or underdetermined; it is overdetermined. I will argue that the inconsistency—a paradox, since it is a inside a —is not accidental, but inherent to the concept. A dialethia is a contradiction following from true premises by valid inference rules, a true inconsistency. Dialethism is the thesis that some theories comprised of inconsistent sentences are still true; some of the sentences about sets are inconsistent, and they are still true. Insofar as sets are among the most basic, primitive objects that can be described at all, the presence of contradiction in them is transfixing.
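The formal inconsistency of the set concept can be made explicit with the familiar Russell-style derivation. What follows is a minimal sketch in classical-style natural deduction; the name r for the witness and the step numbering are introduced here for illustration only, and no claim is made about which of these steps the logic of part two accepts in this form.

```latex
\begin{align*}
&(1)\quad (\exists y)(\forall x)(x \in y \leftrightarrow x \notin x)
  && \text{comprehension, with } \Phi(x) := x \notin x \\
&(2)\quad (\forall x)(x \in r \leftrightarrow x \notin x)
  && \text{from (1), naming the witness } r \\
&(3)\quad r \in r \leftrightarrow r \notin r
  && \text{from (2), instantiating } x := r \\
&(4)\quad r \in r \lor r \notin r
  && \text{excluded middle} \\
&(5)\quad r \in r \land r \notin r
  && \text{from (3) and (4), by cases}
\end{align*}
```

Each disjunct of (4) yields the other by way of (3), so both hold together.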


Roughly, a set is a multiplicity that also has unity, or, in more antique terms, a many that is also a one. How can a many also be a one? It sounds inconsistent for something to be both singular and plural—and it is. But this way of putting the paradox has a mysterious ring to it; happily, modern logic allows us to apply sophisticated tools to old problems, to make the meaning of the phrase, and so its deep tension, precise.

Generally, a system is extensional if coincidence of members, or of some other objects (e.g. truth values) obtained from a list, is sufficient to establish an identity. Logicians then often call a system intensional just if it is not extensional. But an intension, and so an intensional system, has an independent meaning: Identity and other relations are intensionally determined by properties or predicates, by definitions. A basic assertion here is that extensional systems and intensional systems are not exclusive categories; intensionality does not rule out extensionality.

Set theory is considered the theory of extensions par excellence, and rightly so: Sets are extensions. On the full, naive set conception, sets are further naturally thought of as predicates in extension, suggested at least by Peano’s choice of the ε sign to denote predication, from the Greek verb ἐστίν, ‘to be’. In this conception, the ‘is’ of predication becomes a matter of membership; in usual semantics, predicates are still interpreted as sets. Without qualification, this is a simple explanation of what it means to be a set, a collection encompassed by a predicate. And this means we will face a collision of extensionality with intensionality. Such was Cantor’s original definition in 1883 (from e.g. [Hal84](33)):

By a ‘manifold’ or ‘set’ I understand generally any multiplicity which can be thought of as one (jedes Viele, welches sich als Eines denken lässt), that is to say, any totality of definite elements which can be bound up into a whole by means of a law.

Similarly, Frege’s system construed sets as the extensions of predication. In his Grundgesetze, he stated the set concept in a single axiom, his fifth basic law, the infamous equivalence

{x : Φ(x)} = {x : Ψ(x)} ↔ (∀x)(Φ(x) ↔ Ψ(x)).

Frege’s axiom looks like a tautology: The set of Φs is identical to the set of Ψs exactly when the Φs are all and only the Ψs. That is obvious to the point of banality. Sets and concepts, or properties, or predicates-in-extension, are all much the same thing. The key to the appeal of the comprehension scheme is that it illuminates collections in a way that extensionality alone cannot. There is, in a word, more than what Forster [For82](2) calls “the most tough-minded expression” of extensionality, that “the only thing a set theorist can know about = is that it is a congruence relation with respect to ∈.” We can know that = unites identical definitions.

Here then is the nexus of the paradox. Sets are bound up into a whole by a law, a concept, a definition; the comprehension principle induces sets that are intensional. Meanwhile the extensionality principle governs all sets. Since these together characterize sets, sets are both intensional and extensional. This is unstable, as witnessed by the ensuing paradoxes; this is unavoidable because it is analytic. This is the reason that the set concept is dialethic.

The naive position is tenable for two reasons. First, there are logics available in which the naive set concept may be studied in full without falling into incoherence. Any such logic is called paraconsistent; so paraconsistency ensures that the naive position, even if surprising in places, is still sensible. Second, and most emphasized here, the axiomatized set concept can be used to provide the same resources as standard axiomatizations, as well as answer questions that no other system has. The set concept is axiomatic, in the old sense of being self-evident. The concept just affirms that sets are the last word on collections. If the concept were not so intuitive, even inalienable, then its inconsistent consequences would be reason to reject it.
Like the naive concept of truth, it “seem[s] forced upon us in such a way,” writes Slaney, “that we should in all intellectual honesty take [it] seriously.”[Sla89](472) As a consequence of being axiomatic, the set concept is not just inconsistent, but dialethic—a paradox forced upon us. And this account completely explains the set theoretic antinomies in an immediate, satisfying way: The paradoxes are theorems. Our task becomes not to explain the contradictions away, but rather, as has been the case with past surprises in science and mathematics, to come to grips with the phenomenon, in the service of trying to understand.

In the presence of classical logic, the comprehension principle is outright incoherent. This has been cause of much surprise and consternation. The most prevalent response has been to maintain classical logic, and to adopt Zermelo’s 1908 selection of some instances of the comprehension principle, and abandon the rest. This is not the only possible response. Given a theoretical surprise, a paradox in the informal, etymological sense of thwarting δόξα, we have a choice. We can treat the result as a mistake and revise the assumptions that brought the paradox about. Or, if the force of the assumptions was strong enough, and more if the rejection of the assumptions is at least as paradoxical as their consequences, we can recast the surprise as a new truth, and instead go about revising the way we think about these . Set theory has been moulded exclusively in the past century by the first strategy; this has resulted in much fine mathematics, philosophical controversy, and the persistence of seemingly intractable questions about the nature of sets themselves. Classical logic makes comprehension absurd; the consternation, then, is due to the deep sense that comprehension is not absurd. So we may instead maintain comprehension, and adopt, say, Routley and Meyer’s 1976 selection of logical axioms with which to reason about it.
The dialethic paraconsistent approach recasts how we think about old phenomena. Forster, introducing his own unorthodox work on Quine’s set theory NF, writes

In the ZF [Zermelo-Fraenkel set theory] world, ... the paradoxes are viewed as large holes in the ground that one might fall into. ... However, it is always a mistake to think of anything in mathematics as a mere pathology, for there are no such things in mathematics. ... One should think of the paradoxes as supernatural creatures, oracles, minor demons—on whom one should keep a weather eye in case they make prophecies or by some other means divulge information from another world not normally obtainable otherwise. One should approach them as closely as is safe, and from as many different angles as possible.[For95](11)

The question turns on just how close is still a safe distance. In the first foundations of the nineteenth century, we will find paradoxes, see how and why they form, and what they mean. Then we will build a paraconsistent foundation, giving some precision to Wittgenstein’s imagery:

Why should Russell’s contradiction not be conceived as something über-propositional, something that towers above the propositions and looks in both directions like a Janus-head? . . . Might one begin logic with this contradiction? And as it were descend from it to propositions. The proposition that contradicts itself would stand like a monument (with a Janus-head) over the logical propositions.[Wit56](III.59)

The whole can be summarized by paraphrasing Frege’s 1902 reply to Russell: With the loss of a full comprehension principle, the sole possible foundation seems to vanish. It still ought to be possible to reason coherently with such a principle.

Ultimately this is a work on set theory; since it is the case that set theory is inconsistent, this is a work on dialethic set theory; since not everything is true, this is a work on paraconsistent set theory. Dialethic paraconsistency is a motive and a method for the rational study of theories containing contradictions. Dialethism is the motive: Because contradiction is embroiled in truth, it is incumbent upon us to try to understand contradiction. Paraconsistency is the method, insulating truth from explosive inferences, from ex falso quodlibet. This is to speak of potentials, and this much is already known. What is not known is how much work dialethic paraconsistency can accomplish. The aim of the present work is to begin to find out.

2. Definitions and Diagonals

An otherwise disparate collection is encircled by a predicate, a property, a definition. Birds become a flock, stairs a flight, people a crowd (a Menge). The most general instance: Objects become a set. The naive concept captures, in slogan form, sets as the extensions of definitions.

Because he emphasized intelligibility, Socrates spoke of definitions. Definitions are the key to, perhaps synonymous with, intelligibility; and the Socratic hope is that better definitions will lead to more intelligibility, to the better life. From the Euthyphro 6e [SMC00],

    Tell me what is the nature of this idea, and then I shall have a standard to which I may look, and by which I may measure actions, whether yours or those of any one else, and then I shall be able to say that such and such an action is just, such another not just.

Large abstraction schemas are what is wanted, of the form

    x is Φ if and only if Ψ,

whereby it can be further known that x is not Φ if and only if not Ψ. The Socratic call for intelligibility leads inexorably to such general abstraction schemes. Such general abstraction schemes lead inexorably to inconsistency, by diagonalization. Consider the definition

    x is Λ if and only if x is not x,

and its diagonal:

    Λ is Λ if and only if Λ is not Λ.

Then by excluding the middle, Λ is Λ and Λ is not Λ, a contradiction. Truth and sets both lead to such unsettling self-referential statements, to outright contradictions.

Diagonals have been and continue to be a very fertile source of information. Diagonal arguments find the limits of definitions. It is suitably sweeping to say that “the history of Western thought can be regarded as a history of our relationship to the diagonal .”[Kin93] A very early appearance of diagonalization is Euclid’s proof that there is an infinitude of primes: Take any finite set of primes {k_0, ..., k_n} and consider ∏_{i=0}^{n} k_i plus one; this is either a prime not on the list, or implies the existence of a new prime by the prime factorization theorem.
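Euclid's construction can be written out as a short worked derivation. The symbol N for the constructed number, the name p for its prime factor, and the two numerical instances are chosen here for illustration; they are not taken from the text.

```latex
\begin{align*}
N &= 1 + \prod_{i=0}^{n} k_i, \\
N &\equiv 1 \pmod{k_i} \quad \text{for each } i, \text{ so no } k_i \text{ divides } N, \\
N &> 1 \implies N \text{ has a prime factor } p, \text{ and } p \notin \{k_0, \ldots, k_n\}.
\end{align*}
% Worked instances (illustrative):
%   {2, 3, 5}:             N = 31, which is itself prime
%   {2, 3, 5, 7, 11, 13}:  N = 30031 = 59 * 509, two primes not on the list
```

The second instance shows why both cases in the proof are needed: N itself need not be prime, but its prime factors always fall outside the given list.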
In this instance, diagonalizing leads only, harmlessly, to more objects. When diagonals meet totality, though, then fixed points result (cf. §4 below). These fixed points can do, and have done, a lot of philosophical work; for example, Descartes famously used a doubt-operator, and came upon his cogito¹:

    I doubt(⌜I am⌝) iff I am.

In 1874 Cantor proved that the concept of countable infinity has a diagonal, and so showed the existence of uncountable cardinal numbers. The most important appearance of a diagonal was as the hinge of Gödel’s 1931 incompleteness proof, in the form

    ⌜Ξ⌝ is not provable iff Ξ.

With “the development of mathematics toward greater precision,” as Gödel wrote, he proves that no consistent system capable of expressing anything as interesting as arithmetic can go on to establish all truths. The impact of inconsistency in such otherwise innocuous notions has left a crater in the Socratic/Cartesian rational project. Inconsistency was not the projected or expected outcome, and after many decades of our getting used to incompleteness, it is important for us to recall just how bad Gödel’s news was. It was to be in mathematics that proof and truth perfectly coincide; there was to be no ignorabimus. Responding appropriately to Gödel’s discovery remains an outstanding task for human thought.

¹ Barwise and Moss similarly gloss the cogito as a circular phenomenon [BM96]. Treatment of the cogito as a full diagonal, not just left to right, is [Boo83](285).

One response has been that these discoveries show our intuitions are bankrupt. On this despondent reaction, there is no formal or precise concept of set, proof, or truth with which to work [Woo03](159). We can only work scientifically with stripped down fragments of these original ideas. This attitude takes contradiction to be the worst thing that can happen—worse than abandoning hope of a precise theory of sets, proof and truth. Paraconsistency can be taken as the doctrine that a contradiction is not the worst thing that can happen. Since contradiction does seem to be the sort of thing that happens, it is rather unhelpful to panic or accept rational chaos when it does. Consider the folly of an explosive inference (discussed by Weir in [PBAG04](406)): “War is wrong. Therefore, if we go to war, it is okay to bomb civilians.” At the least, this is not the way we want our leaders to reason. This is not the way important philosophical concepts like truth should be reasoned about.

The response to Gödel has not been good. Received wisdom is now that the world is too complex, too difficult, too big to understand as a whole. There is no such thing as everything. We will be studying a strain of scientific optimism as it flourished at the conclusion of the nineteenth century, and how it collapsed in the twentieth. In broad terms, the old project of fully and rationally understanding the world has been decided a failure. Certainty and completeness are lost.

But the last century’s was only the first response to the emergence of dialethias. Beginning with Descartes, the Enlightenment project was to fix a total theory of a uniquely quantifiable world, and, I will maintain, its proponents were verging on such a theory—complete with the strange and wonderful surprise that a total theory will be paradoxical. Tarski states a precise schematic definition of truth,

    T⌜Φ⌝ ↔ Φ.

And in the presence of sentences like

    The first self-referring sentence of [Web09] is not true,

the ancient liar paradox follows as a theorem. When a system is complete, it will contain paradox. In 1944, Tarski wrote of the liar paradox:

    In my judgment, it would be quite wrong and dangerous from the standpoint of scientific progress to depreciate the importance of this and other antinomies, and to treat them as jokes or sophistries. It is a fact that we are here in the presence of an absurdity, that we have been compelled to assert a false statement....[Tar44]

And in 1990, Vann McGee accords:

    There are scarcely any philosophical problems of greater urgency than the liar paradox, for there are scarcely any concepts more central to our philosophical understanding than the concept of truth. ... Quite unmistakably, our present way of thinking about truth and reference is inconsistent.[McG90](vii)

The liar is only the easiest to state of the paradoxes. That these are serious paradoxes is not in doubt; they call for a response.

3. Dialethism and Paraconsistency

Some propositions of the form Φ ∧ ¬Φ are true; some true propositions have true negations. Classical logic in turn is wrong for the most basic reason: Its consequence relation is not truth preserving. A paraconsistent logic remains appropriate to the nature of truth, by using some modern technology. A logic is a pair ⟨L, ⊢⟩, where L is a set of formulae and ⊢ is a relation between subsets of L and members of L. Distinguishing between logics is a matter of the relation ⊢, the consequence relation. In classical logic, the consequence relation ⊢_C is explosive: Where Φ, Ψ are formulae,

    Φ, ¬Φ ⊢_C Ψ,

a rule also called ex contradictione quodlibet. Since Ψ can be anything at all, dialethism in classical logic is nonsensical. Any logic is paraconsistent when explosion fails over the consequence relation.

An immediate consequence of dialethism is arrestingly simple. Dialethism is itself false, because all contradictions are false. So dialethism itself is a dialethia [Pri79](238); see chapter 0 below. And Tarski was clear that the problem of contradiction is not ex falso, but rather falsity:

I do not think that our attitude toward an inconsistent theory would change even if we decided for some reason to weaken our system of logic so as to deprive ourselves of the possibility of deriving every sentence from any two contradictory sentences. It seems to me the real reason for our attitude is a different one: We know (if only intuitively) that an inconsistent theory must contain false sentences; and we are not inclined to regard as acceptable any theory which has been shown to contain such sentences.[Tar44](368)
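The derivation Tarski alludes to, of an arbitrary sentence from two contradictory ones, is usually presented in four steps. This is the standard textbook argument, sketched here for reference and not taken from the thesis. A paraconsistent consequence relation has to reject at least one of its steps; relevant logics typically reject the last.

```latex
\begin{align*}
&(1)\quad \Phi && \text{premise} \\
&(2)\quad \neg\Phi && \text{premise} \\
&(3)\quad \Phi \lor \Psi && \text{from (1), disjunction introduction} \\
&(4)\quad \Psi && \text{from (2) and (3), disjunctive syllogism}
\end{align*}
```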

Any theory which includes contradictions will include falsities. The question then turns on which way we incline—toward completeness or consistency. Priest agrees with Tarski:

Truth and falsity come inextricably intermingled, like a constant boiling mixture. One cannot, therefore, accept all truths and reject all falsehoods... [Pri06b](100)

In the “concluding self-referential postscript” to his 1979 paper, Priest notes that “the subject of paradoxical assertions is one full of surprises. However, that it should be so is not particularly surprising.”[Pri79](240) To wit, by maintaining full abstraction schemes, dialethic paraconsistency recasts the force of counterexamples. Both ∀xΦ(x) and ∃x¬Φ(x) can hold at the same time, in the same way (and then by the usual rules, ¬∀xΦ(x) and ¬∃x¬Φ(x) would both hold, too). The Cartesian method is relaxed: While that which can be doubted is not certain, it may also still be certain. Some certainties are dubitable, just as some truths are also false. Here in dialethic paraconsistency, paradox is still playing its traditional role of inspiring new developments. Here is also the departure and so a way forward.

The thesis of dialethism remains contentious, but I generally assume it without protracted argument, for two reasons. The first and main reason is that the initial battles have already been fought in the canonical works, e.g. [Pri06b]. To advance from here does not require a rehearsal of abstract polemics about the law of non-contradiction. Necessarily along the way, some of these debates will surface; but whoever is yet unmoved is unlikely to be swayed by repetition, while whoever does have some sympathy will be ready for something fresh—for progress. While there are no vertebrate arguments against dialethism, to interrupt the inertia of classicism requires some very strong compulsion to dialethism, lest we appear to be paradox-mongering. So as my second reason for letting the assumption stand, what is vitally important now is to show that dialethic paraconsistency forms a fertile ongoing research field. A project emerging from mathematics and logic, after all, can only be vindicated when it produces some good, true mathematics and logic. Then, these should nourish some good philosophy.
Most new ideas require their fitting into pre-existing conceptual categories. Some ideas are novel enough to require new categories altogether. Dialethic paraconsistency is not like either of these. Dialethism is like gravity, in that it shapes the very arena in which it acts. Dialethism changes the configuration of our concepts; it changes the way we think, and tries to leave the concepts themselves untouched. By analogy, in non-euclidean, elliptic geometry, we are to imagine that any two straight lines, even two lines both perpendicular to a third, have a point of intersection. This is not what is usually understood from straight lines, right angles, and so forth; but an assumption, the parallel postulate, has been dropped, and the concepts realign coherently. Elliptic geometry can be studied fruitfully once one makes some adjustments, not to the notion of straight line, but to the way one thinks about straight lines. So here we recast the law of non-contradiction, we recast the idea that implication is material or that ⊢ is classical, and begin to see in a new way. Contrary to common perception, dialethism tampers neither with truth, nor , nor even with contradiction; it shows only how these concepts are to fit if the world is to be intelligible. There are very few certainties to be had, so it would seem incumbent upon us that we learn to think about them well.

Paraconsistent naive set theory shows why there are paradoxes, and how a foundation can be had from them. A set is both an extensional entity, determined completely by membership, and also an intensional object, a unity out of multiplicity bound by a law. Respecting closure on both these aspects, the universe of sets has a dialethic boundary.
I will therefore indicate that the infamous paradoxes of naive set theory are all due to a single, powerful truth: An inconsistent fixed point stands where none can be, at the ineluctable, impossible meeting of extension and intension, at what Cantor called absolute infinity. The aim is to make philosophical and mathematical sense of the fixed points, and moreover, to put them to work.

4. A Place to Stand

Total provability operators, total truth predicates, total collectivizing properties, cannot be free of paradox. So teaches Gödel: No consistent scope is wide enough to see everything. Put broadly, to understand the world completely would be to see it from the outside—a vantage which is impossible to occupy. “In order to draw a limit to thought,” writes Wittgenstein, “we should have to think both sides of this limit (wir müssten also denken können, was sich nicht denken lässt).” We find the other side both thinkable and not. On the dialethic view, incompleteness phenomena show only that, because we can conceive of the world as a whole, the intelligibility we find when chasing Socratic definitions will harbor paradox. One can only ask that a theory account for what the data offers. The data is inconsistent; so will the theory be. In this way, dialethism provides a solution to the outstanding antinomies in the foundations of mathematics, the contradictions of set theory, the contradictions at the limits of thought in general.

As Russell suggested, and as is shown by Priest [Pri02a], all the paradoxes of self-reference—Russell’s, König’s, Grelling’s, Berry’s, Mirimanoff’s, Burali-Forti’s, not to mention the liar and its cognates—are instances of a single schema, the inclosure: There is a set, W = {x : Φ(x)}; and Ψ(W). There is a diagonal function δ, such that if y ⊆ W and Ψ(y), then δ(y) ∉ y, called transcendence, and δ(y) ∈ W, called closure. The contradiction comes because the whole is a part, W ⊆ W, entailing that δ(W) ∈ W and also δ(W) ∉ W. All the dialethias we will be concerned with in this work are inclosure paradoxes. (It is my suspicion, as yet unsubstantiated, that the only true contradictions are inclosure paradoxes.) The central motif is that paradoxes are the limit, the horizon of the world; they are both inside and outside, defining the world’s edges; and paradoxes are accessible—part of naive truth, concepts, and language.
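As a worked instance, Russell's paradox can be fitted to the inclosure schema. The sketch below follows the usual presentation of this case, with W taken to be the universal set; the trivially satisfied choice of Ψ is an assumption made here for illustration, and the reductio used for transcendence is the classical-style argument, not a claim about the logic of part two.

```latex
\begin{align*}
W &= \{x : x = x\}, \qquad \Psi(y) :\equiv y = y, \qquad
\delta(y) = \{x \in y : x \notin x\}. \\[4pt]
\text{Transcendence:}\quad & \text{if } \delta(y) \in y, \text{ then }
  \delta(y) \in \delta(y) \leftrightarrow \delta(y) \notin \delta(y);
  \text{ so } \delta(y) \notin y. \\
\text{Closure:}\quad & \delta(y) \in W, \text{ since } \delta(y) = \delta(y). \\
\text{At the limit } y = W:\quad & \delta(W) \in W \ \text{ and } \ \delta(W) \notin W.
\end{align*}
```

Here δ(W) is the Russell set itself, so the limit case restates the familiar contradiction in the schema's terms.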
Paradoxes, then, are a possible way to see the world from the impossible exterior vantage. Euclid’s [Euc56] thirteenth definition tells us that a figure is that which is given by limits.

    A limit is that which is an extremity of anything. A figure (σχῆμα) is that which is contained by any limit or limits.

Without the limit of a figure, there is no figure at all. Intelligibility issues from the edges. And as with geometry, so it is with the world: To know the world as a whole, we need to know its extremity. If paradoxes, true contradictions, demarcate the limits of truth, then we have by Euclidean lights a figure, an overall shape, of truth. The motto is from Archimedes:

δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν.
“Give me a place to stand, and I move the world.”

My hypothesis is that paradox offers an Archimedean point, and I intend to investigate this hypothesis through the foundational fragment of set theoretic mathematics, using the modern rendering of Socrates’ scheme, a full comprehension principle, inconsistent fixpoints and all. A call for definitions provokes the question, and, as I aim to show, its very schematic form provides the answer.

CHAPTER 0

Into the Dialethic Fields

De omnibus dubitandum. (Descartes)

The origin and testing ground of dialethic paraconsistency is in mathematics and set theory. Throughout this work we will be focused on set theory, and its naive, paraconsistent rendering; so a concrete example and discussion of methodology is important to have up front. In this short inaugural excursion, I quote a theorem and question whether, in addition to being true, it is also false. Since the possibility of overlap between truth and falsity is the defining feature of dialethism, I use the opportunity to meet some standard criticisms, all of which hold that dialethism is defective in some way or another. The aim is to air questions about the force of dialethic paraconsistent truths, coherence and absurdity, and reasoning about inconsistent theorems.

For most of this work we will be in the paraconsistent tradition of Routley, Priest, Brady, et al. (the Australian tradition). Another tradition stems from the work by Newton C.A. da Costa and his school in Brazil; in those papers we have some of the more direct mathematical engagement with inconsistent structure, summarized in [dCKB04]. Here we will use some of that Brazilian mathematics as a case study.

1. Paraconsistent Set Theory in C₁⁼

I take as a starting point Newton da Costa’s set theoretic investigations [dC00] using his extensional paraconsistent logic C₁⁼. (The results here were first obtained by Arruda and Batens [AB82], in the context of a much weaker logic P, which is a fragment of almost any other logic, including Routley’s DK; see ch.4.) To flag that this system is distinct from the theory of part two, I will follow da Costa’s use of the signs ⊃ and ≡ here for the conditional and biconditional. For da Costa the negation operator is divergent [PR89b](163, 175) too, but to avoid needless confusion in the discussion to follow, the standard ¬ is used.

With ◦ read as ‘behaves consistently,’ the sentential fragment of C₁⁼ has the following axioms; see [dC74].

Φ ∧ Ψ ⊃ Φ.
Φ ∧ Ψ ⊃ Ψ.
Φ ⊃ Φ ∨ Ψ.
Ψ ⊃ Φ ∨ Ψ.
Φ ⊃ (Ψ ⊃ Φ).
(Φ ⊃ Ψ) ⊃ [(Φ ⊃ (Ψ ⊃ Υ)) ⊃ (Φ ⊃ Υ)].
Φ ⊃ (Ψ ⊃ Φ ∧ Ψ).
(Φ ⊃ Υ) ⊃ [(Ψ ⊃ Υ) ⊃ (Φ ∨ Ψ ⊃ Υ)].
Ψ◦ ⊃ [(Φ ⊃ Ψ) ⊃ ((Φ ⊃ ¬Ψ) ⊃ ¬Φ)].
Φ◦ ∧ Ψ◦ ⊃ (Φ ⊃ Ψ)◦ ∧ (Φ ∧ Ψ)◦ ∧ (Φ ∨ Ψ)◦.
¬¬Φ ⊃ Φ.
Φ ∨ ¬Φ.

There is one rule, modus ponens: Φ, Φ ⊃ Ψ ⊢ Ψ. For identity postulates, we have

x = x, x = y ⊃ [Φ(x) ≡ Φ(y)].

Also assumed are the usual ZF axioms, including extensionality,

x ⊆ y ∧ y ⊆ x ⊃ x = y,

with x ⊆ y defined as (∀z)(z ∈ x ⊃ z ∈ y). While da Costa’s student Arruda found that his principal systems trivialize full comprehension [Arr89], by taking some of the facts about naive set theory as axiomatic, the system can be used for fruitful non-trivial research. Da Costa’s work is conducted in Church’s version of ZF with added existence assumptions about Russell’s set, R, and a universe, U; he postulates that

(∃R)R = {x : x ∉ x}, (∃U)U = {x : x = x}.
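A brute-force check suggests why R has to be postulated rather than found in any classical structure. The sketch below is an illustration under classical, two-valued assumptions and is not da Costa’s system: it enumerates every membership relation on a three-element domain and looks for an element that behaves like R or like U. The function names and the choice of domain are hypothetical.

```python
from itertools import product


def relations(domain):
    """Yield every membership relation on `domain` as a set of pairs (x, y) meaning x ∈ y."""
    pairs = [(x, y) for x in domain for y in domain]
    for bits in product((False, True), repeat=len(pairs)):
        yield {pair for pair, bit in zip(pairs, bits) if bit}


def has_russell_element(domain, member):
    """True if some r satisfies: for all x, x ∈ r iff x ∉ x."""
    return any(
        all(((x, r) in member) == ((x, x) not in member) for x in domain)
        for r in domain
    )


def has_universal_element(domain, member):
    """True if some u satisfies: for all x, x ∈ u."""
    return any(all((x, u) in member for x in domain) for u in domain)


DOMAIN = (0, 1, 2)  # assumption: any small finite domain serves the illustration

russell_models = sum(has_russell_element(DOMAIN, m) for m in relations(DOMAIN))
universal_models = sum(has_universal_element(DOMAIN, m) for m in relations(DOMAIN))

print("relations with a Russell element:", russell_models)
print("relations with a universal element:", universal_models)
```

No classical relation contains a Russell element, since the defining condition fails at x = r, whereas a universal element is easy to come by. On this reading, the two existence postulates are of quite different strength: only the first demands a logic that tolerates R ∈ R together with R ∉ R.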

To gain a sense of the system’s properties, we first follow da Costa in proving that the Russell set is closed under some set-forming operations. Given singletons {x}, note that by definition

y ∈ {x} ≡ y = x.

To prove that R contains all the singletons of its members, let x ∈ R. Either {x} ∈ {x} or {x} ∉ {x}. In the second case, {x} ∈ R, and we are done. But the first case gives {x} = x.¹ If {x} = x then by substitution of equals for equals,

x ∈ R ⊃ {x} ∈ R.

¹ In Aczel’s set theory [Acz88], the set x such that x = {x} is unique; forgoing his anti-foundation axiom, the identity or difference between a = {a} and b = {b} here remains undecided.

Given any two elements x, y of R, the pair {x, y} is in R, too. For, by the pairing axiom, from x, y we have {x, y}, and if this pair is not self-membered, it is in R. If the pair is self-membered, then either x = {x, y} or y = {x, y}, and again by substitution we conclude that

x ∈ R ∧ y ∈ R ⊃ {x, y} ∈ R.

The argument generalizes: If x, y, z, ... ∈ R then {x, y, z, ...} ∈ R, too. Now for a lemma leading to the main theorem: For any x,

{{x, R}} ∈ R.

The proof again considers the case where {{x, R}} ∈ {{x, R}}, since given non-self-membership the result is immediate. So {x, R} = {{x, R}}, and therefore x = R = {x, R}. And R ∈ R, so by substitution the lemma holds.

Finally, Arruda and Batens make the following observation: Every x is in the union of the Russell set,

⋃R = U,

where ⋃R = {x : (∃y)(x ∈ y ∧ y ∉ y)}, the union of R. Again by pairing, we can always consider {x, R} for any x; and again, if {x, R} ∉ {x, R}, then the pair is in the Russell set and so x ∈ ⋃R. If {x, R} is self-membered, then either R = {x, R} or x = {x, R}. In the first case, since R ∈ R, by substitution {x, R} ∈ R and then x ∈ ⋃R. In the second case, {x} = {{x, R}}. By the lemma above, {{x, R}} ∈ R. Substituting, {x} ∈ R, and therefore x ∈ ⋃R. This shows (∀x)(∃y)(x ∈ y ∧ y ∈ R), that is, (∀x)(x ∈ ⋃R). Not being a relevant logic, C₁⁼ includes the axiom that Φ ⊃ (Ψ ⊃ Φ). From this it follows that (∀x)(x ⊆ ⋃R) and (∀x)(x ⊆ U). This is sufficient for ⋃R = U as desired. Let us call this the Russell-union theorem, or RU for short.

2. Dialectics

Now let us stand back and enquire about the nature of this mathematics.

2.1. First we make a general observation about negation. The discussion is on the sentential level so we use variables p, q, r, etc. Paraconsistent logic is often described in terms of negation. Generally, the operator ¬ can be characterized by the basic semantic clauses:

p is true iff ¬p is false,
p is false iff ¬p is true.

Classical logic views ¬p as indistinct from p ⊃ ⊥, where ⊥ is, say, the absurd conjunction of all propositions. (See ch.4 for an exact formulation.) If we are to study non-trivial inconsistent theories in the presence of a detachable conditional →, this equivocation must be incorrect.

The notions of denial and rejection are teased apart. Following Frege, denial can be construed as the assertion of negation:² ¬p expresses the denial of p. Meanwhile, p → ⊥ can be construed as the rejection of p; Curry called this absurdity negation and we will write −p. To accept p is to walk into incoherence, so p is to be rejected. This is a strong notion, far stronger than asserting ¬p. Dialethism claims that it may happen that both p and ¬p are true (and so both p and ¬p are false (and so p is both true and false (and ¬p is both true and false))). By relaxing the explosion rule, we separate denial from rejection, ¬p from −p. The former only leads to the latter via ex contradictione quodlibet. The other direction, from −p to ¬p, is (literally) trivial. Classical mathematics marks no difference between denial and rejection. As our concern is paraconsistent mathematics, novel questions emerge.

In a paraconsistent logic with contraposition, denying a proposition p by reductio is unproblematic. But even with contraposition, rejecting a claim by reductio ad contradictione is not possible. Supposing that p entails q ∧ ¬q, still it may be that p is true. Even if p entails ¬p, the possibility of p is not ruled out; witness the Russell set, where R ∈ R → R ∉ R. Contradiction is not grounds for rejecting the assumptions at the start of a line of reasoning. So the question becomes: What, if anything, is? A further question is on the force of a proof: Suppose p to be proved. That is, we are contented to agree that p is true. If there is no way to ensure that p is not also false, then what have we really proved?
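The separation of denial from rejection can be illustrated with a small truth-table check. The sketch below is a hedged illustration, not the logic of this thesis: it swaps in the familiar three-valued glut semantics usually called LP, where a proposition may be true only, false only, or both, and it is neither C₁⁼ nor DLQ. The names `neg`, `conj` and `entails` are hypothetical.

```python
from itertools import product

T, F = "T", "F"
TRUE_ONLY, FALSE_ONLY, BOTH = frozenset({T}), frozenset({F}), frozenset({T, F})
VALUES = (TRUE_ONLY, FALSE_ONLY, BOTH)


def neg(p):
    """¬p is true iff p is false; ¬p is false iff p is true."""
    out = set()
    if F in p:
        out.add(T)
    if T in p:
        out.add(F)
    return frozenset(out)


def conj(p, q):
    """p ∧ q is true iff both conjuncts are true, false iff at least one is false."""
    out = set()
    if T in p and T in q:
        out.add(T)
    if F in p or F in q:
        out.add(F)
    return frozenset(out)


def is_true(p):
    """A value is designated when it is at least true."""
    return T in p


def entails(premises, conclusion):
    """Check truth preservation over every assignment to the two variables p, q."""
    return all(
        is_true(conclusion(p, q))
        for p, q in product(VALUES, repeat=2)
        if all(is_true(prem(p, q)) for prem in premises)
    )


# Ex contradictione quodlibet: p ∧ ¬p ⊢ q.
explosion_valid = entails([lambda p, q: conj(p, neg(p))], lambda p, q: q)

# Simplification, for comparison: p ∧ q ⊢ p.
simplification_valid = entails([lambda p, q: conj(p, q)], lambda p, q: p)

# Assignments where the contradiction is true but q is not.
counterexamples = [
    (p, q)
    for p, q in product(VALUES, repeat=2)
    if is_true(conj(p, neg(p))) and not is_true(q)
]

print("explosion valid:", explosion_valid)
print("simplification valid:", simplification_valid)
```

Under these assumptions the only counterexample to explosion assigns p the value both and q the value false only. Asserting ¬p (denial) is thereby compatible with p being true, which is why denial alone cannot do the work of rejection.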

2.2. Apropos: How do we know that the Russell-union theorem is true only? Glut logics, the sort appropriate for dialethic paraconsistency, tell us that some propositions are true; some are false; and some are both. Any interesting theory should have propositions in each category; and presumably, as suggested by Brady’s non-triviality result for weak relevant logics [Bra89], naive set theory has some true (and not false) theorems. But we as yet lack any mechanism for deciding, upon proof of p, whether ¬p is also provable, or whether p is true and true only.

As the axiom list shows, da Costa’s systems have a unary operator that denotes consistency at the propositional level. He writes p⁰ for ¬(p ∧ ¬p). Then a stronger negation can be defined, ¬⁰p for ¬p ∧ p⁰. These devices are meant to capture our ability to insist that a proposition is true and true only.

² I am diverging in terminology and theory from [Pri06a](103-115).

There is much to say on this question, which Berto calls the “exclusion problem” [Ber07], the question of expressing consistency. Rather than remain in the tortured abstract, though, let us see first if in fact we can determine the consistency of the theorem in question, perhaps in order to utilize ◦. For, in the case of RU, we have a simple argument using elementary logic. But there is little reason prima facie to take the theorem as true only; in fact, since it makes a claim about a notoriously inconsistent object, the Russell set, there is a strong likelihood that the theorem is also false.

I should emphasize that I do not dispute the proof of RU, at least in principle. Assuming the lemmas above really can be delivered in da Costa’s logic, I think that RU is a theorem. I am concerned, rather, with the force of saying that it is a theorem. Let us test the negation of RU, and see whether any more light is shed. If RU is also false, then ⋃R ≠ U is true. By extensionality, the sets differ with respect to membership:

x ≠ y ≡ (∃z)(z ∈ x ∧ z ∉ y) ∨ (∃z)(z ∉ x ∧ z ∈ y).

Something must be in one that is not in the other. We will first ask if ⋃R is lacking: Are there any objects in the universe which are not in the Russell union?³ There would have to be some x such that for no y ∈ R is x ∈ y. Using this as a collection criterion for a set, we have the complement of the Russell union,

∁⋃R = {x : (∀y)(x ∈ y ⊃ y ∉ R)}.

(That ∁⋃R = ⋂R is, as it will turn out, an undesirable feature peculiar to some systems and absent from our theory.) Now, ∁⋃R cannot be empty to be a counterexample; so we christen a member: Suppose a ∈ ∁⋃R. Fact one: (∀y)(a ∈ y ⊃ y ∈ y), because a ∈ {x : (∀y)(x ∈ y ⊃ y ∉ R)}. From this fact it follows that, since a ∈ ∁⋃R, by modus ponens ∁⋃R ∈ ∁⋃R. Fact two: Because a ∈ {a}, by fact one, a = {a}.

Now let b be an arbitrary set, and consider the pair {b, a}. Since a ∈ {b, a}, by fact one it follows that {b, a} ∈ {b, a}. Thus either b = {b, a} or a = {b, a}. From either of these disjuncts, unpleasantness follows, as we now see. If b = {b, a}, then either b ∈ b (since b ∈ {b, a} = b) or b = a. Since b is arbitrary, then everything is either self-membered or identical to a. To trivialize this we define ∅ = {x : ⊥} and pick it for b. Instantiating, ∅ ∈ ∅ ∨ ∅ = a. In the first case, ⊥. In the second case, since a = {a}, by transitivity of identity follows ∅ = {∅}; but since ∅ ∈ {∅}, substitution implies ∅ ∈ ∅, and ⊥ again. Ergo, ⊥, triviality.

³ The following was developed in discussions with Conrad Asmus.

Alternatively, if a = {b, a}, then also {a} = {b, a}, since a = {a}. Since b ∈ {b, a}, by substitution of equals for equals, b ∈ {a}. Then b = a, and ⊥ again. Alternatively, if there were some x ∈ ⋃R but x ∉ U, then by substitution, still some x ∉ ⋃R, and we are back to ⊥.

All lines lead to ⊥. And so this seems sufficient cause to declare RU as true and true only. (If the system we are working in is not trivial, then RU must be exclusively true.) But very much is called into question by dialethic paraconsistency: enough, at least, to require analysis of some of the dialectical norms in this chain of reasoning. Since some of the usual principles of mathematical practice have been suspended, we need a principled way to interpret these consequences.

2.3. Since the start of dialethism, amongst both its adherents and serious critics, there has been a nagging concern. Routley flags it in [Rou80]. More recently, several authors in the landmark collection [PBAG04] bring it up: How, dialectically, is rejection possible? The dialethist asserts p; but this does not entail a rejection of ¬p, since a true proposition can have a true negation. Thus someone wishing to disagree with the dialethist’s assertion of p may feel unable to register disagreement by imploring ¬p. Does this not constitute an infringement of the rules of rational discourse, by frustrating any attempt at refutation?

The issue takes prototypical form when the object of discussion is dialethism itself. An opponent of dialethism wishes to say that, no, no contradictions are true. The dialethist will agree. No contradictions are true (only), at least not according to the logics we will be using. The law of non-contradiction is itself a dialethia. Shapiro explains,

It is difficult for the critic to know what to say. One can refute other opponents by showing that their views lead to contradiction, especially if the inconsistency comes by the opponent’s own lights ... This cuts no ice against the dialethist. [PBAG04](337)

By its own admission, the dialethic thesis is false. What more can be done to force retreat, the critic asks, than to prove the falsity of a claim?

There is a constellation of arguments around the idea that dialethism is a non-starter, because by its very nature it flouts the norms of dialectics. In 1999, David Lewis sent this note to the editors of [PBAG04]:

I’m sorry; I decline to contribute to your proposed book about the ‘debate’ over the law of non-contradiction. My feeling is that since this debate instantly reaches deadlock, there’s nothing much to say about it.

Nearly identical recoiling is widespread: “A common response to those who question the law of non-contradiction is that it is impossible to debate such a fundamental law of logic,” note Otávio Bueno and Mark Colyvan, because “there is a minimal set of logical resources without which rational debate is impossible.” [PBAG04](156) Common ground is essential. Let us see what sort of common ground is about.
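The remark that the law of non-contradiction is itself a dialethia can be checked by tabulating ¬(p ∧ ¬p) over glut values. As before, this is a sketch under a swapped-in assumption, the three-valued LP-style semantics with values true only, false only, and both, rather than any system used in this thesis; the names `lnc` and `lnc_table` are hypothetical.

```python
T, F = "T", "F"
TRUE_ONLY, FALSE_ONLY, BOTH = frozenset({T}), frozenset({F}), frozenset({T, F})
VALUES = {"true only": TRUE_ONLY, "false only": FALSE_ONLY, "both": BOTH}


def neg(p):
    """Swap truth and falsity: ¬p is true iff p is false, and false iff p is true."""
    return frozenset(({T} if F in p else set()) | ({F} if T in p else set()))


def conj(p, q):
    """True iff both conjuncts are true; false iff at least one conjunct is false."""
    true_part = {T} if (T in p and T in q) else set()
    false_part = {F} if (F in p or F in q) else set()
    return frozenset(true_part | false_part)


def lnc(p):
    """The law of non-contradiction instantiated at p: ¬(p ∧ ¬p)."""
    return neg(conj(p, neg(p)))


# Value of ¬(p ∧ ¬p) for each possible value of p.
lnc_table = {name: lnc(value) for name, value in VALUES.items()}

for name, value in lnc_table.items():
    print(f"p is {name}: LNC is {''.join(sorted(value))}")
```

On these tables ¬(p ∧ ¬p) is true at every value of p, so the critic and the dialethist can both assert it. When p is a glut, though, the same instance is also false, which is the sense in which proving the law does not settle the disagreement.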

2.4. The dialectical worries are essentially tied to the question of what mechanism decides the true contradictions from the false (only) ones. An exact answer is provided in chapter 7 using the full mathematical apparatus. But by addressing this question now, we will delimit a clearer map of the dialethic landscape, what dialethism shares with the wider community, and thereby assess the work done on the Russell-union theorem.

While it would seem to make a radical claim about truth, in fact dialethism derives from very commonsense commitments. A few facts are indubitable, such as ⌜p⌝ is true iff p, and what validly follows from truth is also true. Negation, conjunction, and truth all are as before. Further, in most paraconsistent logics one has the inference of modus tollens,

p → q, ¬q ⊢ ¬p.

Along with the theorem ¬(p ∧ ¬p), which these logics also carry, then, the dialethist can ascertain the truth of a negation by reductio ad contradictione, like anyone else. So some propositions, which happen to have the form p ∧ ¬p, are true, in just the same way as any other true proposition is true. And p ∧ ¬p is false in the same way as any other proposition is false, too. Thus the short answer as to how true contradictions are discerned from untrue contradictions is instructive, if a bit dissatisfying. Philosophers have yet to devise a failsafe device to sort truth from falsehood. (By classical lights, the construction of such a device even for relatively simple systems would solve the decision problem, and is provably impossible.) Nevertheless, we have imperfect means; and whatever these means are, they are basically the same means the dialethist uses.⁴

⁴ Priest’s standing answer is similar. “I often find myself being asked the following question: ‘Since you believe some contradictions, but not all, you must have a criterion for deciding between those that are true and those that are not. What is it?’ In reply I usually point out that the questioner [believes] some things are true, but not all, and ask them what criterion they have ... The answer, I think, is the same in both cases. Nice as it would be to have a criterion of truth, to expect one would seem utopian. One has to treat each case on its merits, whether the proposition concerned is a contradiction or some other thing.”[Pri06a](56)

Less quickly, a useful appeal is made to absurdity. An analogy, that consistency is to contradiction as coherence is to absurdity, is helpful. The inference of reductio ad absurdum would be stronger than reduction to a contradiction; this is a means to reject. And certain notions are, indeed, absolutely absurd. Positions which are committed to absurdities are to be rejected. The conjunction of all propositions ⊥ is an absurdity. (What else might count as absurd is an open philosophical question.) So a mechanism is provided for rejection: If p entails ⊥, then reject p. Being concerned with truth, like everyone else dialethism does not admit absurdities; this is why so much work must be poured into finding appropriate paraconsistent logics. If anything, I am compelled to dialethism exactly because abandoning some principles would deviate so far from common sense as to be an absurdity.

Some (e.g. [Sor01](77)) claim that dialethism changes the meaning of ‘contradiction’: Since dialethists don’t think contradictions are absurd, they can’t be talking about contradictions. This relies on the bald assertion that the role a contradiction plays in reductio ad absurdum proofs is constitutive of the meaning of ‘contradiction’, as an “absolute stopping point”. This is to equivocate. A contradiction is a proposition conjoined to its own negation. It can be the stopping point in a modus tollens inference; absurdity is grounds for total rejection.
Insisting that contradiction is synonymous with, or entails absurdity, is either a change in terminology or begging the question.

The only clogging point for our rejection device via reductio ad absurdum would be the so-called thesis of trivialism, ‘everything is true.’ If trivialism is true, it would seem, then ⊥ is not absurd and there is no route to reductio. “The question,” Priest states, “of why one should not accept everything is one an honest dialetheist will ask themself.”[Pri06a](56) Frederick Kroon pushes more in [PBAG04], asking, “What makes the claim that everything is both true and false incoherent?” and insisting that dialethists owe an answer. For both Priest and Kroon, I take the reasoning to be that, previously, contradiction announced a point of no logical return; but the dialethic paraconsistentist ventured farther; so some new line must be drawn.

Trivialism is a red herring. It serves only as a useful point of comparison, as an example of an absolutely untenable and defective thesis. Dialethism is unlike trivialism, and a dialethist is under no more obligation to redress it than any other philosopher. Dialethism is independently motivated by natural paradoxes and is, on the whole, conservative. The same considerations that lead to acceptance and rejection of more pedestrian theses lead also to dialethism, and likewise could lead to its rejection. For example, if a compelling and consistent solution to the paradoxes of self-reference were produced, and concomitantly a theory of unrestricted quantification, then I would very seriously reconsider the thesis. By contrast, nothing can challenge trivialism (or defend it, for that matter), a priori. Trivialism entails, since it entails everything, that trivialism is by far the most heinously incoherent thesis that can ever be formulated. Dunn calls it “that than which nothing sillier can be conceived,”[PRN89](476) and a moment’s thought will show this is not an exaggeration: Trivialism says all this of itself, adding that there can never be any cogent engagement with trivialism. “Surely,” writes Weir, “one is under no more obligation to reason with those who reject rational argumentation than to reason with a rabid dog.”[PBAG04](386)

Contradiction is not the reason for rejecting trivialism; formal inconsistency, I think, is not the reason sensible people reject most of the things they reject. Rather, like the vast majority of grammatical strings, what trivialism asserts is just nonsense (quite literally: If everything is true, then it is true that BA!!!BA is a well-formed sentence, and a true well-formed sentence at that; so trivialism asserts certifiable gibberish (and asserts that it does so)). A few, perhaps even a lot, of contradictions being true, is categorically unlike the thesis that everything is true. The latter is only a bizarre form of madness.

Trivialism provides a rejection device. And the general form of the following proof by cases, which the trivialist would consider valid, will be our last word on that topic. Above, we found that the consequences of negating RU led to triviality. So it seems right to conclude that ⋃R = U is true, and true only, at least as far as C₁⁼ goes. The possibility of it being false is excluded, because either da Costa’s set theory is trivial, in which case it proves that ⋃R is universal only, or else the set theory is not trivial, in which case ⋃R is just universal.

2.5. How to argue against dialethism? Logically, reduce it to absurdity. Rhetorically, produce consistent solutions to the logical and set theoretic paradoxes. Then, like any philosophical position (and unlike the dialectically defective trivialism), dialethism would be rebuked. Aristotle’s arguments in Metaphysics Γ for the law of non-contradiction fail [Pri06a](7-42), which suggests that one cannot be so quick to dismiss dialethism; on the other hand, because dialethists follow dialectical rules, the failure in Γ is not enough to establish the truth of dialethism, because rational agents bear burdens of proof. Dialethism, again, follows standard dialectical rules, importantly rejecting incoherence and absurdity. The purpose of this whole thesis is an indirect argument, beyond defense against challenges, for dialethism.

All up, the dialethic dialectic is more difficult, but far from impossible. It does not do simply to prove the negation of a proposition p in order to rule out p. Nor does it do to demonstrate an arbitrary contradiction. But it has always been the case that such inferential steps are expressions of faith, not constructive demonstrations; this is detailed in chapter 3. The dialethist requires serious engagement with the components of a theory, a thoroughgoing understanding of the underlying principles, and articulation of what is absurd or unacceptable. These are the makings of far stronger grounds for debate than unreflective acceptance that the world is like a sudoku puzzle.

This discussion has aimed to show, then, that there is rational space for cogent dialethic investigations, in particular inconsistent mathematics. I have not given an account of the meaning of inconsistent theorems, save to say that the force of a proof is that its conclusion is to be regarded as a theorem. As with the excursion here, better to go out and do some real mathematical work, rather than try to answer such questions a priori. The question of truth, and of the bearing our theorems have on the rest of mathematics, will be taken up at the end of this thesis, chapter 7, once we have some theorems to examine.

3. Inconsistent Sets

There are difficult and unanswered questions in the field of paraconsistent set theory. Some questions are philosophical and methodological, and these themes have been touched upon. There are also straightforwardly mathematical questions, directed at new theorems on the structure of inconsistent sets, of their own intrinsic merit. Indeed, it is most urgent to get to these, since otherwise there is no such thing as paraconsistent set theory. And this I believe deserves investigation for its own sake.

It would be as interesting to study the inconsistent systems as, for instance, the non-Euclidean geometries: we would obtain a better idea of the nature of certain paradoxes, could have a better insight on the connections amongst the various logical principles necessary to obtain determinate results, etc. [dC74](498)

To highlight the basic richness and novelty of the field, this opening excursion ends by observing a strange topology on the Russell set.

3.1. Even though everything is self-identical, since R both is and is not a member of itself, R ≠ R. (This is proved in detail in chapter 5, prop 5.7.) By the contrapositive of the extensionality axiom,

R ≠ R ⊃ ¬(R ⊆ R ∧ R ⊆ R) ⊃ R ⊈ R ∨ R ⊈ R ⊃ R ⊈ R.

Let P (y) = {x : x ⊆ y}, where

x ⊂ y ↔ x ⊆ y ∧ (∃z)(z ∈ y ∧ z ∉ x).

We use the eccentricities of R to prove

(∀n)[Pⁿ⁺¹(R) ⊂ Pⁿ(R)],

where, by finite recursion, P⁰ = P and Pⁿ⁺¹ = PPⁿ. That is, each successive iteration of the powerset is a proper subset of the previous. First, Arruda in [dC00] has found that

... PPP(R) ⊆ PP(R) ⊆ P(R) ⊆ R.

To see that P(R) ⊆ R, suppose x ∈ P(R). If x ∉ x, then x ∈ R. If x ∈ x, then x ∈ R because x is a subset of R. So P(R) ⊆ R. Now suppose x ∈ PP(R). Then x ⊆ P(R), so x ⊆ R by transitivity. Therefore x ∈ P(R), and thus PP(R) ⊆ P(R). The argument can be repeated. This is Arruda’s finding. To strengthen it, we employ contraposition at each arrow. For P(R) ⊂ R, recall that R ∈ R and R ∉ R; this implies R ⊈ R, and all up, (∃x)(x ∈ R ∧ x ⊈ R). And R ⊈ R ∧ R ∈ R gives P(R) ⊂ R. For PP(R) ⊂ R, R ⊆ R but R ∉ P(R) so R ∉ PP(R). Again the argument may be continued:

... PPP(R) ⊂ PP(R) ⊂ P(R) ⊂ R.

As it happens, since C₁⁼ does not have contraposition on inconsistent propositions, the above argument probably cannot be carried out there. But a version of it would be formalizable in the logic DLQ of part two. Could one go on to define by recursion on the ordinals the ultimate ‘inner model’:

R_0 = R,

R_{α+1} = P(R_α),

R_λ = ⋂_{κ∈λ} R_κ,

where λ is a limit ordinal? The many combinatorial questions, and metaphysical questions, too, that can be asked about a structure like this, will be raised in chapter 6.

3.2. What follows will present a view of sets, a proposal about the nature of the world in so far as it is comprised of collections. I claim that sets are inconsistent because they are both extensional and intensional; this claim follows from the axioms of extensionality and the naive comprehension principle, respectively. For the sake of honesty, it is important to reiterate up front that naive comprehension is axiomatic. Indirectly I argue for comprehension by showing that no other assumption is adequate for a foundation, and that comprehension is a good description of pervasive mathematical practice. Still, this amounts to assertion that sets are one way and not another.

The claim that a set is intensional, for example, might pique some questions. If a collection is intensionally bound, who is binding it? Neither Cantor nor Frege, we are about to see, wanted the binding to be a personal, psychological matter; Cantor nominated God, and Frege adopted a brute realism. In practice these come to the same thing: a primitive observation about the nature of structured objects, that they are structured. This will be our view, too. There are some facts, namely about mathematical structure, and we are interested in their nature. Specifically, we want to know which theory records the facts about sets correctly. I opened with Socrates’ call for intelligibility, which presupposes an organized world, prone to intellection. A brute fact about reality is that it is structured, bound by some appropriate intension about which not much more can be said. If this is not so, then there is no reason to investigate.

There will not be, then, a full plunge into the philosophy of mathematics, in particular the sea trenches of metaphysical debate over , nominalism, or logical . The hope is to keep the technical results as philosophically neutral as possible, the arguments simply and recognizably valid.
Now, a neo-meinongianism of the sort found in Routley’s jungle [Rou80] or Priest’s re-exploration thereof [Pri05] dispenses with the more confusing aspects of realism and leaves a commonsense view in place. The idea is to quell undue anxiety about existence per se, to see ∃ as without existential weight. I imagine that this is not the only means of interpreting the results. (Compare Tarski’s great set theoretic skill, despite avowedly not believing in sets, to Gödel’s, who avowedly did.) How one reads the ∃ claims will probably strongly influence how much one believes. As far as possible, then, the mathematics is presented on its own terms. This is the stuff of axioms (brute facts), about which there cannot be more justification.

Dialethic paraconsistency, again, will not sink or swim on the basis of academic dissection of Aristotelian liturgy. The program will carry on if its practice is adopted, for the sociological reasons programs are adopted: intrinsic interest and beauty; explanatory and elegant advantage over competing programs; and, most elusively, a ring of truth.

The original purpose of the exercise with Arruda and Batens’ theorem was to see if any indication of ‘truth-only’ could be had by negating the theorem and following out its results. Although our paraconsistent logic provides shelter from explosion, it is clear that such investigations tread at a precipice, without safety netting. Truth is dangerous; at least it is inconsistent. I am moved by this surprising and ineluctable fact and now endeavor to make a study of this unexplored tract, the transconsistent.

Part One: Paradox

Set theory as foundation of mathematics must satisfy at least two criteria.

• Formally, it must provide and describe an ontology for mathematics.
• Philosophically, it must be a clear and natural explanation of mathematical truth.

A foundation is a means and an end. Classical drives at foundation have faltered on both counts, because at the heart of set theory is a paradox, the absolutely infinite, the impossible point of convergence between intension and extension. Aristotle was correct: The infinite always was and always will be contradictory. The infinite is also indispensable to quantitative thought. In 1786, the Berlin Academy of Sciences announced a competition:

The usefulness of mathematics, the esteem in which it is held, and the honourable name of ‘exact science par excellence’ rightly given it are all due to the clarity of its principles, the rigour of its proofs and the precision of its theorems. In order to ensure the perpetuation of these precious merits in so beautiful a part of our knowledge, we seek a clear and precise theory of what is called Infinite in mathematics.

The need to understand the infinite and thereby secure an unequivocal foundation has met severe obstacles and is still outstanding. As Erd¨oshas quipped, though, “Problems worthy of attack prove their worth by fighting back.” Advances in future research as much as past missteps depend crucially on “the definitive clarification of the nature of the infinite,” as Hilbert proclaimed in 1925 [Hil67]—“not merely for the interests of the individual sciences, but rather for the honour of the human understanding itself.” I will argue by tracing paradox’s history from Cantorian roots to modern ax- iomatization, showing that the problem is everywhere acknowledged but nowhere solved. The absolute was brought from the nineteenth to the twentieth century as the doctrine of limit on size, a brace on the iterative conception of sets and shelter for the contrivance of proper classes. Once we show that the absolute presents an

intractable and unpurged contradiction, then its derivatives cannot be classically maintained.

The first chapter covers basic themes in foundational history. The second chapter gives sustained attention to Cantor’s idea of the absolute and how it plays into shaping the transfinite. The third chapter concerns the muddy effect of paradoxes on the mathematics of last century. The central and persistent claim is that the naive set concept just is the set concept—the analytic constitution of the meaning of the word. When this intuition is resisted, confusion and vengeful paradox result. The path of least resistance puts bends in a stream.

∗ ∗ ∗

And life itself confided this secret to me: “Behold,” it said. “I am that which must always overcome itself.”
— Nietzsche, Zarathustra

CHAPTER 1

First Foundations—The Nineteenth Century

After introducing the idea of set theoretic reduction and its foundational merits, we examine the works of Cantor, Dedekind, and Frege on this basis. By isolating the set concept of nineteenth century foundations, the potentials and problems of the reduction are drawn out.

1. Introduction

Set theory demonstrably provides a language, a toolkit, for mathematics. This much is lexicography. Since its inception, set theory has also contended as an ontology for mathematics, a foundation. If this ontology were provided, then all of mathematics could be reduced to set theory, showing the objects of mathematical study are all just sets, and, even more presumptuously, these just part of logic. As always, this grandiose possibility is rife with both hope and hindrance.

One hindrance to a reduction is conceptual clarity. Mathematical objects such as numbers, functions, and metric spaces are reasonably well understood. To legitimate set theoretic reduction, sets must be conceptually clearer, better understood objects. Otherwise there is no point—indeed, serious loss of perspicacity—in rephrasing mathematics in terms of sets [Hal84](299). Frege’s critique of past foundations for arithmetic may have been scathing in his 1884 Grundlagen, but would have been no more than intellectual vandalism without the development of his own elegant analytic system. Hausdorff in 1914 warned against definitions of obscurum per obscurius, defining the obscure by the still more obscure [Hau57](11). A foundation as reduction requires both formal and philosophical vision.

A second raft of troubles suggests that the reduction is not even possible. Frege’s logicist program sought to build up arithmetic from logic alone, but his own system exploded under ex falso quodlibet. And a weightier second attempt at this project, Russell and Whitehead’s Principia Mathematica of 1910-13, was forced to assume rather than prove important facts, such as the axiom of infinity. (On this, “One might well think that something more like utter exasperation with Russell’s procedure is called for,” writes Boolos [Boo98](256).) The axiom of reducibility, too, has been generally condemned; so Frege’s project was inconsistent,

and Russell’s repair was not analytic, leading to widespread scepticism about logicism’s potential. More, if reduction means, as it now usually does, reduction to the iterative conception of sets, then there is mathematics (e.g. category theory, not to mention the programs sketched in [Rou80] or [Mor95]) that falls outside the scope of the cumulative hierarchy. And to reiterate the first worry, which we will be substantiating over this and the next chapters, the iterative conception of sets is not able to answer basic questions about the nature of sets, making the reduction one to confusion rather than clarity.

These obstacles stand or fall depending on our conception of set. And here is the hope. A set is a multiplicity that forms a unity, a many that is also a one; but this is not a definition; this is a mystery. Here are birds. Here is a flock. How is a many also a one? If we can clear the fog by giving a firm characterization of what a set is, and what makes its unity hold, then we can simultaneously make an intellectually satisfying reduction of mathematics, and rework the reductive project so that it does not founder on classical limitations. The unrestricted understanding of sets endorsed here goes back to Bernard Bolzano’s 1837 Wissenschaftslehre,

I permit myself, then, to call any group you please, in which the nature of the connection among the parts is to be regarded as an indifferent matter, a set [Inbegriff ]....[Bol73](128)

A natural, intuitive, and complete explanation of the set concept—of sets as the unique extensions of arbitrary predicates—can vindicate reduction and facilitate reconstruction. We will in part seek this explanation in the narrative of its unfolding. This chapter considers the conceptual roots and early developments of the subject, working through the history of the set concept in the nineteenth century, the romantic period of Cantor, Dedekind and Frege. I give considerable attention to Cantor and his tangled, beautiful first sketch of the full set theoretic universe, by following his efforts to justify his transfinite numbers and naming the underlying assumptions that are still alive in modern set theory. I will also briefly describe Frege’s system, which in many ways is the clearer, and so more clearly contradictory, edifice. A few comments on Dedekind’s excellent 1888 monograph are also made. None is intended as primary exposition, which is ably done elsewhere; nor is this a history. Discussion is guided to areas of dialethic gravity.

The nineteenth century set concept is not uniformly stated, or even particularly clear in all points of detail. But by the end we will have found the seeds of a naive approach, of intension and extension together, waiting in the fertile soil.

2. Cantor: From Transfinite Numbers to Sets

Georg Cantor (1845-1918) had many interests, but one passion: to foster his transfinite numbers into the intellectual world. While he sought clarification in the foundations of analysis; while he developed a conception of number that drew on Euclidean themes; and while he discussed the nature of mathematical truth, above all was an uncompromising and urgent need to explain the transfinite and make its arithmetic a part of accepted mathematics.

The transfinite is capable of manifold formations, specifications, and individuations. In particular there are transfinite cardinal numbers and transfinite ordinal types which, just as much as the finite numbers and forms, possess a definite mathematical uniformity, discoverable by men. [Hal84](21)

Cantor stated in a variety of correspondence that he regarded his discovery as no less than the revealed word of God.1 Establishing the transfinite’s legitimacy was a matter of missionary fervour.

Of his many writings, there are a few milestones. In 1874 Cantor published proof that the real numbers are uncountable, signalling the start of a science of the infinite. The six papers titled “Über unendliche, lineare Punktmannigfaltigkeiten” (exposited in [Dau79]) culminated in his Grundlagen einer allgemeinen Mannigfaltigkeitslehre of 1883, which presented and defended transfinite numbers on both mathematical and philosophical grounds, discussed in [Tai00]. His “Principien einer Theorie der Ordnungstypen”, written in 1884 but first published almost a century later, clearly articulates set theoretic reductive ideas. Crucially, the definitive Beiträge zur Begründung der transfiniten Mengenlehre, published in two parts in 1895 [Can95] and 1897 [Can97], purged almost entirely of philosophical comment, is the only work popularly available in English [Can15] and is the mature expression of Cantor’s theory.
During his most active periods, Cantor suffered psychologically and emotionally from lack of recognition, and from what he perceived to be hostility from the mathematical powers in Berlin—Weierstrass, Kummer, and especially Leopold Kronecker. (The biographical sketch by E.T. Bell in his influential and entertaining Men of Mathematics is largely embellished or outright fabricated; but the more scholarly sources, as well as Cantor’s own voluminous epistles, confirm in broad outline the story of a passionate, troubled man in real strife against the academic establishment.) Remaining isolated in Halle for decades, Cantor ascended to fame only in the last years of his active life, when it was in many ways too late. So the

transfinite and its place in mathematics were a strikingly personal matter. In this section we will follow Cantor’s attempts to mount campaigns in service of his crusade, each successively clearer and more successful than the last, which culminate in set theoretic reductionism.

1 “I entertain no doubts about the truth of the transfinite, which with God’s help I have recognized and studied in its diversity, multiformity, and unity more than twenty years.” [Hal84](11)

2.1. Actual Infinity. The first obstacle for Cantor to overcome was the longstanding resistance to actual infinities. An actual infinity is a complete, extant, and stable quantity; a potential infinity, by contrast, is a modal notion of a process that can go on indefinitely, and is not a fixed quantity. Polemical, Cantor (from [Ruc82](46)) writes

The fear of infinity is a form of myopia that destroys the possibility of seeing the actual infinite, even though in its highest form it has created and sustains us, and in its second transfinite form occurs all around us and even inhabits our minds.

On this, though he realized that he was placing himself “in a certain opposition to views widely held concerning the mathematical infinite and to opinions frequently defended on the nature of numbers,” [Dau79](96) Cantor held strong convictions, which he did not attempt to disguise.2 The Grundlagen involves, in the words of a reviewer at the time, large sections which “are exclusively philosophical and are intended to obviate the quarrel with philosophers who have denied transfinite numbers since Aristotle.” [Dau79](95) The story of the Cantorian triumph is often told; but it is important to recall the extreme tension inherent in Cantor’s initiative, and the legacy that remains thereof. The resistance he faced was well entrenched and well motivated by a suspicion: that the infinite is inconsistent.

In passing I compare Cantor’s situation to the dialethic paraconsistent tradition, particularly in having to argue against a prejudice dating from Aristotle, and particularly in arguing that the Aristotelian dogma is simply question-begging [Pri06a]. The finite stands to the transfinite in much the same way that the consistent stands to the paraconsistent, a thought behind Priest’s suggestive (but thereafter forgotten) coinage, the transconsistent, detailed in my chapter 7.
We move to a new genus of entities, be they transfinite or transconsistent, and maintain cogency by adapting the arithmetic and logic we bring to these novelties.

All the so-called proofs against the possibility of the actual infinite are faulty....From the outset they expect or even impose all the properties of finite numbers upon the numbers in question, while on the other hand, the infinite numbers if they are to be

2 “I am so in favour of the actual infinite that instead of admitting that Nature abhors it, as is commonly said, I hold that Nature makes frequent use of it everywhere, in order to show more effectively the perfections of its Author. Thus I believe that there is no part of matter which is not—I do not say divisible—but actually divided....” [Dau79](124)

considered in any form at all, must (in their contrast with the finite numbers) constitute an entirely new kind of number, whose nature is entirely dependent upon the nature of things and is an object of research, but not of our arbitrariness or prejudices. [Dau79](125)

I lay aside these comparisons for now, to recall a few pre-Cantorian theories of infinity. The poet and philosopher Parmenides offered sophisticated thoughts on the behaviour of negation, and more importantly, used these insights to support his monism [SMC00]. Parmenidean arguments to the unity of the many are a first articulation of the problem in the set concept: What is the relation between a many which is also a one? Centrally for us, Parmenides’ student Zeno of Elea put forward devious arguments in the form of his various paradoxes: On one reading, Zeno demonstrates that the space between any two points is infinitely divisible, and concludes that there are infinitely many points between any two. It being impossible, this gloss of the argument goes, to cross an infinite number of points in any finite amount of time, motion is impossible.

Zeno’s arguments are both very clever and obviously empirically wrong. With the syllogistic technology available at the time, the manifest falsity of Zeno’s conclusion provoked a rejection of one of his premises: Aristotle fixed on the claim that there is an actual infinity of points between any two. While the process of division can go on indefinitely, reasoned Aristotle, it does not complete at some infinite stage—indeed, assuming some such infinite stage is just to smuggle the conclusion into the premises. So there is no resultant totality of points numbering beyond the finite, Aristotle thus decreed, no problem with potential infinities, and likewise no actual infinities. Conciliatory, in the Physics III:7, Aristotle:

Our account does not rob the mathematicians of their science, by disproving the actual existence of the infinity ... In point of fact they do not need the infinite and do not use it. They postulate only that straight lines may be produced as far as they wish.

This last would become Euclid’s second postulate [Moo82]. Cantorian philosophy notwithstanding, the actual infinite’s eventual resurgence was not due to any sound defeat of Aristotelian dogma, nor even a putative solution of Zeno’s puzzle in the concept of convergence. The mathematicians, rather, really did not need the infinite and did not use it, until roughly Newton’s and Leibniz’s advances in analysis, at which time they then did need and use it. Clarification of ideas underlying trigonometric series and real functions was exactly Cantor’s point of entry to transfinite research and the background social legitimation for talk about actually infinite sets. It is in the details, and not the dogma, that the serious points of contention lie.

Foreshadowing, there is a steady relocation of troubles in the progression of the calculus: from infinitesimals to reals, from reals to sets of reals, from sets of reals to transfinite sets. This puts the trouble—contradiction—at enough distance for working mathematicians to work without inhibitions; but it will be the aim of especially the next two chapters to show that the latent contradictions are not in any satisfactory way, because they cannot be, eliminated.

The most obvious, and most serious, concern about quantitative treatment of the infinite is that it disobeys arithmetic rules. Using the lemniscate ∞, it seems fairly clear that

∞ + 1 = ∞

is true; but with the usual cancellation laws, this leads to absurdities, since subtraction of ∞ from both sides yields 0 = 1. Confusion and the threat of inconsistency is enough to keep ideas in abeyance for a period (however unscientific and self-defeating it is not to investigate a topic due to lack of knowledge). By the Enlightenment, though, the ban on actual infinities was growing stale, and, himself no friend to Aristotelian orthodoxy, Galileo began to think seriously about infinite sets. In 1638 he noted in his Two New Sciences that, on the one hand, there are more natural numbers than even numbers, but on the other hand there seem to be just as many, at least because both sets are infinite, and more because the two sets can be put into a one-one correspondence.

At this idea, then, we glimpse the modern technology required to make transfinite numbers rigorous, and the means to overcome the problems Aristotle could not solve. In a turn that is a dialectical motif, the paradox identified by Galileo becomes the modern definition of infinity: A set is (Dedekind) infinite if and only if it can be put into a one-one correspondence with a proper part of itself. And a set is (Dedekind) finite iff there is no such correspondence. (In the presence of the axiom of choice the ‘Dedekind’ qualification can be omitted.) This redirects wonder that infinity is not arithmetically well behaved to wonder at the arithmetic behaviour of the infinite.

With the development of a coherent theory of the infinite, the Aristotelian prohibitions went slack. Bolzano in his [Bol51] made important initial observations, including use of one-one correspondences, but his work did not receive attention until later, from Husserl and Kazimierz Twardowski, and also Frege. It is Cantor who has been credited with an epochal advance. In Russell’s idiom,

The Infinitesimal Calculus, though it cannot wholly dispense with infinity, has as few dealings with it as possible, and contrives

to hide it away before facing the world. Cantor has abandoned this cowardly policy, and has brought the skeleton out of its cupboard. He has been emboldened in this course by denying that it is a skeleton. Indeed, like many skeletons, it was wholly dependent on its cupboard, and vanished in the light of day. [Rus37](304)

Russell’s approval at the time of writing was not yet a consensus view. Cantor’s ideas remained controversial for decades after his initial observations of 1874 that there are differing sizes of infinite cardinal. Some attention to the fundamental concept of his theory is due.
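The Dedekind definition of infinity given above can be made concrete with Galileo's own pairing. The following is a minimal sketch, not part of the original argument: the function names `double`, `halve`, and `check_pairing` are hypothetical, and a program can only inspect a finite initial segment, so the check illustrates the rule rather than proving anything about the whole of the naturals.

```python
from itertools import count, islice


def double(n: int) -> int:
    """Galileo's pairing: send each natural number n to the even number 2n."""
    return 2 * n


def halve(m: int) -> int:
    """Inverse pairing: send each even number m back to m // 2."""
    if m % 2 != 0:
        raise ValueError("halve is only defined on even numbers")
    return m // 2


def check_pairing(bound: int) -> bool:
    """Check, for the naturals below `bound` only, that the pairing is
    one-one, invertible, and lands inside the evens (a proper part of the
    naturals, since every odd number is missed)."""
    naturals = range(bound)
    images = [double(n) for n in naturals]
    one_one = len(set(images)) == len(images)
    invertible = all(halve(double(n)) == n for n in naturals)
    only_evens = all(m % 2 == 0 for m in images)
    return one_one and invertible and only_evens


if __name__ == "__main__":
    print(check_pairing(1000))
    # The first few pairs (n, 2n), read off lazily from an unbounded stream.
    print(list(islice(((n, double(n)) for n in count()), 5)))
```

The pairing never runs out of partners on either side, which is exactly the feature that the Dedekind definition elevates from paradox to criterion.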

2.2. Wellorder and Counting: The one-one correspondence. According to Dauben, few theories in any science so clearly bear the mark of the personality of their creator as does Cantor’s Mengenlehre. While the idea of a one-one correspondence describes something ubiquitous in mathematics, nevertheless Cantor’s emphasis on it was distinctive as much for its prominence as its audacity. And, as we will be seeing, this feature of Cantor’s thinking is swollen with his metaphysics: Many in the pure sciences have talked with varying degrees of literalness about understanding the mind of God; Cantor was absolutely serious about this [Dau79](147) and so, without irony or scorn, I think that Cantor believed that the mind of God is organized into perfectly structured, inductive lines. As the focal notions in set theory, wellorder and one-one correspondence carry the full weight of both the interest and problems.

Cantor’s theory fixates on the idea that any two sets can be compared in size using pairing functions. This requires not only that all sizes are comparable, but that all sets have some size; and therefore that all sets can be primed for one-one mappings. While there are preludes in Bolzano [Bol51], the seminal, idiosyncratic moment in Cantor’s thought is in 1883, when he asserts that any set at all can be arranged so as to allow all its members to be counted off. From [Hal84](155),

The concept of well-ordered set is fundamental for the whole theory of manifolds. It is a basic law of thought [Denkgesetz], rich in consequences and particularly remarkable for its general validity, that it is always possible to bring any well-defined set into the form of a well-ordered set. I will return to this law in a later memoir.

(A proof of Cantor’s law of thought without use of the axiom of choice is found in my chapter 5, theorem 5.50.) A wellorder presents an inductive structure: When a set is wellordered, every nonempty subset has a least element; and all the elements stand in a linear connection. Cantor’s original presentation, which states a base case and an induction step, makes the priming obvious:

We call a simply ordered set F ‘well-ordered’ if its elements f

ascend in a definite succession from a lowest f₁ in such a way that

I. There is in F an element f₁ which is lowest in rank.
II. If F′ is any part of F and if F has one or many elements of higher rank than all elements of F′ then there is an element f′ of F which follows immediately after the totality F′, so that no elements in rank between f′ and F′ occur. [Can15](137)

Any two inductive structures can be naturally put into a one-one correspondence. Helping ourselves to the existence of functions which pair off uniquely and completely all the members of sets of any size solves Galileo’s paradox for a start, and accordingly leads to the idea that collections which cannot be wellordered have no size at all, and so are not even sets.

Cantor was keenly aware that this is all very presumptuous and accordingly directed the majority of his intellectual power at a dense thicket of problems: the wellorder theorem, the aleph theorem, and the continuum hypothesis. These are, respectively, that: every set can be wellordered; every set can be assigned a distinct cardinal power; and no cardinals fall between the powers of the naturals and the reals. The former problems, most histories agree, were to be solved in service of the latter; failure to secure theorems that every set has a cardinal, and therefore that every cardinal is tied to an aleph, would likewise cast into extreme doubt whether the continuum hypothesis can be answered at all. Now, Cantor did not himself fear for the integrity of his system.3 But he did hope to persuade his colleagues of the legitimacy of his system, and further was deeply perplexed about the power of the continuum. A clear elucidation of wellorders was urgent.

While the motivations concern cardinals and specifically the cardinal of the continuum, then, the route to cardinals is in wellorder: Cantor descended on his theory of ordinals.
These prototypes of wellorders can be described from a variety of angles—as equivalence classes, as ordertypes of wellordered sets, as von Neumann ordinals, the recursive notion that an ordinal is just the set of all preceding ordinals. What is important is that the ordinal line is a series of discernable cairns leading from zero to absolute infinity, or to take up another metaphor, that the ordinals act as the central load-bearing column of the set theoretic architectonic (explicitly

discussed in the opening chapter of [Hal84]). The ultimate success or failure of the Cantorian initiative and set theory in general rests on the extent and consistency of the ordinals—which, we will see, turn out to be very much the same thing.

The question that comes out of wellordering goes to some basic concerns about actual infinity. For in some literal sense, which we have today in von Neumann ordinals coupled with the axiom of choice, Cantor thought that infinite sets can be counted, and he gave expression to this thought in the wellordering principle. This is not to use ‘countable’ in the specific sense of ‘denumerable,’ meaning ω-size, but rather in the primitive sense of quantitative discernment. Zeno put pressure on the idea of traversing an infinity of points, a pressure largely relieved by advances in calculus. But—even without Aristotelian reservations—why would anyone presume that an infinite set, even points on an infinite line, can be completely counted? The repeated locution ‘to put in a one-one correspondence’ is highly constructive in its phrasing, even while it is beyond contention that no human can or ever will be able to put infinite sets, even small ones, into any correspondences at all (Zeno-esque ‘hurry-up’ processes notwithstanding). We will take up the issue of metaphors in mathematics in chapter three; for now I insist that Cantor did not have any metaphor in mind. Counting indicates a process of discerning discrete units in succession. God is the counter.

3 “My theory stands as firm as a rock; every arrow directed against it will return quickly to its archer. How do I know this? Because I have studied it from all sides for many years; because I have examined all objections which have ever been made against the infinite numbers; and above all because I have followed its roots, so to speak, to the first infallible cause of all created things.” In [Dau79](289)
To purge the idea of its constructive and theological phrasing, the notion of ‘putting into correspondence’ can be understood, not as a process like counting, but stating a rule. For example, N is put into a one-one correspondence with the squares by the rule f(n) = n². Without God, we have baldly stated the existence of a function. What is the justification for such an existence statement? One natural thought is ruled out: Existence is not a definition by recursion on the ordinals. It would be viciously circular to invoke any such principles at this stage, as Poincaré in 1908 rightly saw: The development of set theory and one-one correspondences is itself intended to justify the complete induction. Yet some justification, some underlying principle, is warranting the invocation of one-one correspondences and other mappings. The needed explanation is that we are invoking a broad comprehension principle. A law or property binds extensions of arbitrary size. No idealized, infinite being is required. The required function is a set that exists by virtue of a law (cf. chapter 5, theorem 5.47). To make more sense of this we grapple closely with the assumptions behind Cantor’s mathematics, his medieval metaphysics.
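The contrast between counting as a process and correspondence as a rule can be illustrated in code. This is only a sketch under stated assumptions: `square_rule` and `is_one_one_on` are hypothetical names introduced here, and a finite sample can illustrate the one-one character of the rule but cannot establish it for all of N.

```python
def square_rule(n: int) -> int:
    """The rule f(n) = n^2, stated once and applicable to any natural number."""
    if n < 0:
        raise ValueError("the rule is stated for natural numbers only")
    return n * n


def is_one_one_on(rule, sample) -> bool:
    """Check that `rule` sends distinct members of a finite sample to
    distinct values. A finite check, not a proof about all of N."""
    values = [rule(n) for n in sample]
    return len(values) == len(set(values))


if __name__ == "__main__":
    # The rule answers for a very large argument directly; nothing has to be
    # counted off beforehand, which is the point of stating a law rather
    # than running a process.
    print(square_rule(10**50) == 10**100)
    print(is_one_one_on(square_rule, range(10_000)))
```

Nothing in the sketch traverses the naturals in succession: the pairing is fixed by the law alone, and any particular pair is read off from it on demand.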

2.3. The Domain Principle and the Set Concept. Having cleared conceptual space for the possibility of actual infinities, Cantor needed to present reasons why there are transfinite quantities, and how their existence and manipulation is to be understood. Cantor’s justification for the transfinite is a far-flung ontological optimism, which is vividly suggested in the metaphor of a road:

Apart from the journey which strives to be carried out in the imagination [Phantasie] or in dreams, I say that a solid ground and base as well as a smooth path are absolutely necessary for secure travelling or wandering, a path which never breaks off, but one which must be and remains passable wherever the journey leads. Thus every potential infinity (the wandering limit) leads to a Transfinitum (the sure path for all wandering), and cannot be thought of without the latter. [Dau79](127)

From image we pass to ideas. The domain principle, so named by Hallett, is Cantor’s master argument to the existence of the transfinite. The principle states that quantification presupposes a stable domain of quantification; modulo the transfinite, that “every potential infinity presupposes an actual infinity.” In 1886 [Hal84](25),

In order for there to be a variable quantity in some mathematical study, the domain of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable since otherwise each fixed support for the study would collapse. Thus, this domain is a definite, actually infinite set of values.

This is essentially a comprehension principle—in the contentious direction: If x is Ψ, then there is a set of all the Ψ objects, Ψ(x) → (∃y)x ∈ y; and from this follows the standard first order comprehension schema:

(∃y)(Ψ(x) → x ∈ y).

(The other direction, that given a set all its members have some property in common, is tautological.) To put it again but the other way around, if there were no appropriate domain of quantification, there would be no meaning to Ψ(x), because there would be no way to ground the variable. And

There is no doubt that we cannot do without variable quantities in the sense of the potential infinite; and from this can be demonstrated the necessity of the actual infinite. . . . Each potential infinite, if it is rigorously applicable mathematically, presupposes an actual infinite. [Hal84](25)

Priest calls the domain principle “a formulation of the Kantian insight that totalization is conceptually unavoidable.” [Pri02a](124) Jané goes further, remarking that “the domain principle is more than an existence principle—it is a principle of intelligibility as well.” [Jan95](385) Putting the principle in the contrapositive, he reads Cantor as saying that any sequence is unacceptable in mathematics unless the terms of the sequence have been determined, unintelligible if the totality has not been previously defined.(386) Cartwright has re-identified the domain principle as the all-in-one principle: Any constitutes a set, or set-like object [Car94]. These are elucidating, and essentially repetitious assertions of the same idea. The general axiom, that there must be sets to serve the needs of quantification, simply stands: as axioms do.

Behind Cantor’s formulation of the domain principle lies a thorny theological metaphysic, which may be summarized in a nod: The transfinite exists because it is known to God. Cantor in 1886 states,

That an infinite creation must be assumed to exist can be proved in many ways. ... One proof stems from the concept of God. Since God is of the highest perfection one can conclude that it is possible for him to create a transfinitum ordinatum. Therefore, in virtue of his pure goodness and majesty we can conclude that there actually is a created transfinitum.[Hal84](23)

To structure his conception, Cantor distinguishes between immanent and transient reality, which are enjoyed by objects when they are clearly defined in our thoughts, and are a reflection of mind-external reality, respectively. This emphasizes the connection Cantor saw between mind and ontology. But without any further argument, these are little more than unapologetic existence claims—in modern discourse, unabashed mathematical platonism, seasoned with the Gödelian conviction that we can apprehend platonic objects. The details of Cantor’s thought are not of as much concern to us as the ideas underlying them.

In the domain principle and its correlate metaphysics is Cantor’s answer to our initial question, as to how collections of diverse objects are joined into a unity. Given a multitude of objects, God perceives them all together; the unity has abstract existence, Erkenntnis, in the mind of God, and so they fall under a single compass. A plurality becomes singular when its members are under a purview; a common property, which is at the least the property of being thought about, provides the relations absent in sheer multiplicity. (Again, in a bare realist setting, concepts do not require a conceiver; they just are.) Be it numbers or sets, abstract objects both, Cantor rests his theory on an ontology of logical objects.

Drake, albeit while trying to discredit the notion, neatly links the domain principle to excluded middle.

The notion that every abstraction term should give a set seems to arise if we think of a complete universe given, and a set as being any partition of the universe (so that a set is only something about which we can say, of anything in the universe, either that it belongs to the set or that it does not). [Dra74](14)

To complete the naive characterization of sets in Cantor, compare the following to the Socratic schema given in the introduction:

An aggregate of elements belonging to any sphere of thought is said to be ‘well defined’ when, in consequence of its definition and the logical principle of the excluded middle, it must be intrinsically determined whether any object belonging to this sphere belongs to the aggregate or not. [Can15](46)

If the identity of a set a is defined materially, a = a iff (∀z)(z ∈ a ≡ z ∈ a), then this last is classically equivalent to (∀z)(z ∈ a ∨ z ∉ a); that is, Cantor’s connection between an exhaustive universe and a set being ‘well defined’ can be seen to endorse the axiom of extensionality.

As with any robust naive principles, the Cantorian set concept soon leads to contradictions. In the next chapter we will look at the most important of these: If a multiplicity becomes a unity because God thinks of it or sees it to be so, then the multiplicity of all unities must itself be a unity—

In the absolute mind the entire sequence is always in actual consciousness, without any possibility of increase in the knowledge or contemplation of a new member of the sequence. [Dau79](143)

—triggering the paradoxes of the absolute.
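The classical equivalence between the material identity condition and excluded middle, stated above, can be unpacked in a few steps. The following is a sketch only, for a fixed set a and an arbitrary z; the symbol ⊃ for the material conditional is notation assumed here for illustration and is not taken from the text.

```latex
\begin{align*}
z \in a \equiv z \in a
  &\iff (z \in a \supset z \in a) \land (z \in a \supset z \in a)
     && \text{(definition of } \equiv \text{)} \\
  &\iff z \in a \supset z \in a
     && \text{(idempotence of } \land \text{)} \\
  &\iff z \notin a \lor z \in a
     && \text{(material conditional: } p \supset q \text{ is } \neg p \lor q \text{)}
\end{align*}
```

Generalizing on z gives the equivalence between the two quantified statements. Each step leans on a classical reading of the connectives, in particular on treating the conditional as material, which is why the equivalence is described as holding classically.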

2.4. Reduction and Abstraction. Until 1883, Cantor seemed to take his transfinite numbers as sui generis, and argued for them as such. As the foundations of his theory became cemented, however, he saw that the numbers were in fact closely tied to sets. The unpublished Principien of 1884 can now be regarded as a turning point in his thought. A cardinal power, realized Cantor, is a power of something, and that something is, in a term picked up from Bolzano’s earlier pioneering work, a Menge [Dau79]. (Also following Bolzano, and perhaps more tellingly etymologically, Cantor alternately used the word Inbegriff.) The reduction of mathematics to sets had begun. Indeed the move to sets was Cantor’s best hope for the acceptance of transfinite numbers. For if these numbers are just properties of sets (or even, as it would later become, nothing more than sets), then belief in the transfinite could be reduced to belief in sets. And this was an easier battle to fight, sets being at least a prima facie primitive mathematical notion. By the end of his productive mathematical career, Cantor had a sufficiently precise and palatable exposition of his theory to ensure, in a way he previously had been unable, the continued interest of and widespread acceptance by the mathematical community. Over twenty years of sustained attention culminated in the two-part Beiträge, a near-textbook of one of the most original pieces of mathematics in two millennia. The Beiträge articles of 1895 and 1897 contain the most oft-quoted Cantorian fragments: his definitions of set, cardinal, and ordinal. Common to each of these is an appeal to the mental faculty of abstraction, by which we are to arrive at the desired concept.
Unter einer ‘Menge’ verstehen wir jede Zusammenfassung M von bestimmten wohlunterschiedenen Objecten m unsrer Anschauung oder unseres Denkens (welche die ‘Elemente’ von M genannt werden) zu einem Ganzen. [Can95](31)

By a set we are to understand any collection into a whole M of definite and separate objects m of our intuition or our thought. These objects are called the elements of M. [Can15]

Cantor earlier, in 1883, had given a stronger form, one which should leave no doubt of a naive comprehension principle; we have seen it already in the introduction:

By a ‘manifold’ or ‘set’ I understand generally any multiplicity which can be thought of as one, that is to say, any totality of definite elements which can be bound up into a whole by means of a law. [Hal84](33)

And in stark rebuke of later revisionist reconstructions that would have Cantor advocating a highly modern ‘combinatorial’ view of sets (see chapter 3), he appends to this definition the remark: “By this I believe I have defined something related to the Platonic εἶδος or ἰδέα.” The binding laws are left rather indeterminate, but the ones he has in mind will instantiate transfinite numbers. From 1895,

We will call by the name ‘power’ or ‘cardinal number’ of M the general concept which, by means of our active faculty of thought, arises from the aggregate M when we make abstraction of the nature of its various elements m and of the order in which they are given.

Similarly, an ordinal type is “the general concept which results from M if we only abstract from the nature of the elements m and retain the order of precedence among them.”

Abstraction wedded the transfinite to sets, and incorporated reduction to sets as a permanent fixture in Cantorian mathematics. The abstraction operation as stated is ultimately underdetermined and attracted criticism; Cantor and Frege’s debate on the matter can be followed from Frege’s more caustic remarks on abstraction—“If for example one finds a property of things upsetting, one abstracts it away...Possessing such magical powers, one is not very far from omnipotence”—in [Dau79](222), to Cantor’s sometimes agreeing [Hal84](121), sometimes acerbic [Hal84](127) comments on Frege. Further discussion is in [Dum91](49). Glossed as the intensional application of some explicit predicate Φ, abstraction is just a comprehension process. Abstraction would hold as a respectable explanation of mathematical objects long enough to carry the actual infinite into the next century.

3. Dedekind

The approach of abstraction is not unique to Cantor. In 1888, Julius Wilhelm Richard Dedekind (1831-1916) published his short masterpiece, Was sind und was sollen die Zahlen? [Ded88]. In what is now reconstructed as a second-order theory, he too endorsed the general use of abstraction as an operation for arriving at numbers. Dedekind:

If, in considering a simply infinite system N, ordered by a mapping s, we entirely disregard the particular nature of its elements, retaining only their discriminability from each other, and having regard only to the relations to one another imposed by the mapping s which orders them, then these elements are called natural numbers...

He supposed that a number, or any mathematical object, is a “free creation of the human mind.” This can be cast in retrospect as a species of early structuralism [Sha97]. Dedekind is often situated between Cantor and Frege. Like Cantor, his concerns were allied with pure mathematics; his philosophy was ardent but inarticulate; his proofs were in natural language rather than Frege’s unimpeachable formalism; and his ontology was considerably richer than Frege’s. For example, in his fantastically influential essay on continuity also in [Ded88], Dedekind uses actually infinite sets in his famous method of cuts to produce the reals. Dedekind even infamously proves the existence of an infinite set using a direct ‘mental construction’: “Proof of theorem 66: The totality of my thoughts is infinite....” (See McCarty’s “The Mysteries of Richard Dedekind,” in [Hin95].) We will give this theorem a precise form in part two. Dedekind’s set concept is perhaps closer to what is now called mereology, according to Potter, as evinced by his discarding of an empty set, and elision between subset and membership [Pot04](23). Peano also, according to Potter, exhibited some confusion about the difference between a set and its , although he is usually credited for introducing distinct symbolism for parts and members in 1889.
Regardless, Dedekind went much further than Frege in giving a characterization of the natural numbers [Pot00]. In 1889, Peano stated axioms for arithmetic [vH67], but these were based on Dedekind’s monograph of the previous year, and can be stated as follows:

0 is a number, and is not the successor of any number.
The successor of every number is a number, too, and is unique.
These are all the numbers (mathematical induction).

Within his naive theory, Dedekind proves the recursion theorem (theorem 126), that for any function g from sets M to M, with some m ∈ M, there is a unique function f from the numbers N to M such that

f(0) = m
f(n + 1) = g(f(n))

for every n ∈ N. This provides for recursive definition; but more, it paves the way to showing that any system satisfying the Peano axioms is unique up to isomorphism. That is, if the numbers N are no more than an infinite system under an appropriate successor function s, then the existence and uniqueness of the recursive f show that second-order arithmetic is categorical: “Theorem 132. All simply infinite systems are similar to N...” This is a much more robust provision for arithmetic foundation than Frege’s, in that the intended model is pinned down; insofar as it makes sense to say so, these really are the numbers and what they are for. To my mind, Dedekind’s 1888 monograph is the most appealing product of the romantic period in naive set theory. The work exemplifies exactly what was so unusual about nineteenth century science: intuitive yet sophisticated, concise yet wide-ranging, surprising and yet obviously, luminously true. As rope-and-pulley stagecraft at the time was at its zenith, soon to be overtaken by electronics, so is Dedekind 1888 the height of pure mathematics just before the onset of the uncertainty and complexity that characterize the digital age. Wedging Dedekind between Cantor and Frege—myself here partaking in the guilt—does a disservice to his impressively clear and directed, and highly successful, answer to his eponymous question. A good foundation for mathematics would look much like Dedekind’s, drawing on the full resources of our intuitions with evident rigour. This said, we forge ahead.
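To make the content of the recursion theorem (theorem 126) concrete, here is a minimal Lean 4 sketch. It is an illustration under stated assumptions, not Dedekind’s proof and not a derivation in DLQ: M is modelled as a type rather than a naive set, Lean’s built-in structural recursion on Nat is taken for granted, and the names iter and iter_unique are hypothetical. The first definition gives the existence half (a function f with f(0) = m and f(n + 1) = g(f(n))); the theorem gives the uniqueness half.

```lean
-- Hypothetical names: `iter`, `iter_unique`. Plain Lean 4, no Mathlib assumed.

/-- Existence: the function determined by `f 0 = m` and `f (n + 1) = g (f n)`. -/
def iter {M : Type} (m : M) (g : M → M) : Nat → M
  | 0     => m
  | n + 1 => g (iter m g n)

/-- Uniqueness: any `f` satisfying the same two equations agrees with `iter`. -/
theorem iter_unique {M : Type} (m : M) (g : M → M) (f : Nat → M)
    (h0 : f 0 = m) (hs : ∀ n, f (n + 1) = g (f n)) :
    ∀ n, f n = iter m g n := by
  intro n
  induction n with
  | zero => exact h0
  | succ k ih => exact (hs k).trans (congrArg g ih)
```

The uniqueness argument uses only the third Peano postulate (induction), which is the step that carries over to the categoricity claim of theorem 132.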

4. Frege: From Logic to Number

Friedrich Ludwig Gottlob Frege (1848-1925) in almost a single stroke inaugurated the foundational program in mathematics, and the analytic movement in philosophy. Frege had an idea: to ground the proofs, and thereby the truths, of arithmetic in the apodictic certainties of pure logic. He called for the deduction, or construction, of arithmetic from logic alone, showing that arithmetic is a part of logic, and therefore is just as certain. Logicism is the thesis that mathematical truth is logical truth. If successful, the logicist project would, in Shapiro’s idiom, make mathematics “maximally immune to rational doubt.” [Sha91] Frege wrote two central books, the Grundlagen (1884) and the Grundgesetze (1893/1902), the first in prose and the second a formal counterpart in the notation of his Begriffsschrift (1879). Both aimed to derive arithmetic a priori from first principles and, against Kant, prove it to be analytic. As a test for the success of logicism we have the axioms for arithmetic given by Peano. Whatever Frege’s approach—and to this day the same goes for any foundationalist—it must show at the end that the Peano postulates are true. The first question for Frege is simple: What is arithmetic about? The answer is simple, too: numbers. But now follows an opaque question: What is a number? By analogy, astronomy is about stars, and a star is a mass of burning gas. A number by comparison is much harder to fix; nevertheless, Frege committed to giving an answer of the same form: “The number 3 is x.” And x is to be replaced by some object. A Grund for arithmetic, then, would be a precisely defined bunch of objects which behave according to Peano’s postulates. And Frege was clear that these objects are not psychological artifacts,4 echoing platonism and foreshadowing Gödel’s unabashed realist imperatives.
A judgeable content is not the result of an inner process or the product of some human being’s mental operation, but something objective, which means something that is exactly the same for all rational beings, for all capable of grasping it, just as the Sun, say, is something objective. [Dum91](49) Frege’s bold realism was to meet an adequacy criterion: It is impossible to ascribe to each person his own number one; for it would then have to be investigated how far the properties of these ones coincided. And if one person said one is one, and

4Wittgenstein once told Peter Geach, “The last time I saw Frege, as we were waiting at the station for my train, I said to him, ‘Don’t you ever find any difficulty in your theory that numbers are objects?’ He replied, ‘Sometimes I seem to see a difficulty—but then again I don’t see it.’” [Wri83]

another one is two, we could only register the difference and say: your one has that property, mine has this. (ibid)

To give his solution, his foundation, Frege introduces concepts and extensions. A concept is a property or predicate, under which every, some, or no things fall; an extension is the collection of all those things. Aside from some idiosyncratic terminology, this is very ordinary, little more than scaffolding for subject/predicate discourse. That is, this is just logic, the logic of predication. What is most impressive is that this simple distinction leads to a ground for mathematics. In a way Cantor never explicitly did, Frege drove back to a primal level of thought, which has less to do with the particulars of mathematics so much as universals or attributes (called ‘classes’ in the Middle Ages), and found the numbers there.

The use of sets here is also more natural than in Cantor’s abstractionist reduction. If numbers are to be objects, then we need objects to found the numbers. And sets are very general sorts of objects. Basic propositional logic is often said to be ontologically neutral, but since the stuff of maths must come from somewhere, since Frege is demanding a logical ontology for arithmetic, sets are a good choice; sets are primitive; Frege closes the space between concepts and extensions—sets—to nearly flush.

What is pleasing indeed about the Fregean analysis is that, in explicating numbers, much more general and natural technology has been provided. We have, in effect, naive set theory. Frege invokes a basic connection in his fifth law, which he took to be a logical truth; in modern notation (with the material biconditional—a foible to be remedied in my chapter five),

{x : Φ(x)} = {x : Ψ(x)} ≡ (∀x)(Φ(x) ≡ Ψ(x)).

Read: Two extensions are identical iff exactly the same objects fall under the extensions’ respective concepts. (“It sounds obvious,” Boolos remarks, “doesn’t it?” [Boo98]) This assumption, Basic Law V, includes both the axioms of comprehension and extensionality.

Most importantly, Basic Law V provides Frege’s answer to how sets arise out of multitudes. Concepts, or properties, or predicates, depending on one’s predilection, cast the requisite net over diverse objects that warrants their being regarded together. Frege’s intensional view of collections, like Dedekind’s, does differ from Cantor’s. But it is the broad points of agreement in this period of development that concern us. On the basis of Basic Law V, the membership relation ∈ is (second-order) definable:

x ∈ y ≡ (∃Φ)(y = {z : Φ(z)} ∧ Φ(x)).

As always with precise notation, all the assumptions, for better and worse, are on display for inspection.
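The claim that Basic Law V includes both comprehension and extensionality can be checked step by step. The following Lean 4 sketch does so under loudly stated assumptions: the extension operator extn, the hypothesis blv and the names mem, comprehension and extensionality are all hypothetical labels introduced here; the biconditional is Lean’s ordinary one, not the DLQ connective of part two; and blv is treated only as a hypothesis. Since extn sends every concept on α to an object of α, that hypothesis cannot be satisfied consistently in this setting, which is exactly the situation section 5 goes on to discuss.

```lean
-- Hypothetical names throughout: `extn`, `mem`, `blv`. Plain Lean 4, no Mathlib assumed.
variable {α : Type} (extn : (α → Prop) → α)

/-- Membership as defined in the text: x ∈ y ≡ (∃Φ)(y = {z : Φ(z)} ∧ Φ(x)). -/
def mem (x y : α) : Prop := ∃ Φ : α → Prop, y = extn Φ ∧ Φ x

/-- Comprehension: x ∈ {z : Φ(z)} ≡ Φ(x), assuming Basic Law V. -/
theorem comprehension
    (blv : ∀ Φ Ψ : α → Prop, extn Φ = extn Ψ ↔ ∀ x, Φ x ↔ Ψ x)
    (Φ : α → Prop) (x : α) :
    mem extn x (extn Φ) ↔ Φ x := by
  unfold mem
  constructor
  · intro ⟨Ψ, hEq, hΨ⟩
    -- {z : Φ(z)} = {z : Ψ(z)} gives Φ(x) ≡ Ψ(x) by Basic Law V
    exact ((blv Φ Ψ).mp hEq x).mpr hΨ
  · intro h
    exact ⟨Φ, rfl, h⟩

/-- Extensionality: extensions with the same members are identical. -/
theorem extensionality
    (blv : ∀ Φ Ψ : α → Prop, extn Φ = extn Ψ ↔ ∀ x, Φ x ↔ Ψ x)
    (Φ Ψ : α → Prop)
    (h : ∀ x, mem extn x (extn Φ) ↔ mem extn x (extn Ψ)) :
    extn Φ = extn Ψ := by
  apply (blv Φ Ψ).mpr
  intro x
  exact ((comprehension extn blv Φ x).symm.trans (h x)).trans
    (comprehension extn blv Ψ x)
```

Read this way, the right-to-left half of Basic Law V does the work of extensionality, while the left-to-right half together with the definition of ∈ yields comprehension.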

Frege avoids the mists of Cantor’s theology and abstractionism by bald platonism. He posits (in some unfortunate terminology) a dritte Reich of abstract objects and that is the end of it. Perhaps this is no more philosophical help than Cantor’s Übermind, but it is at least more honest in dispensing with any distractions, and, depending on how the ∃ quantifier is understood, unproblematic. Intensionality plays a vexed role here, since ‘the number 2 I am thinking about’ should be the same number no matter who is speaking. As Cantor echoed Socrates, so Frege defends an extensional objectivity.

Are we to suppose that the law of excluded middle does not hold for classes? ... In [that] case we should find ourselves obliged to deny that classes are objects in the full sense; for if classes were objects, then the law of excluded middle would have to hold for them. [Fre03](128)

With such scientific zeal, Frege’s call for an object to play the part of a number is intended exactly to remove any psychological or subjective doubt. “If [numbers] are to be anything at all,” Russell writes approvingly in 1903, “they must be intrinsically something; they must differ from other entities as points from instants, or colors from sounds...” [Dum91] Concepts and extensions, linked by Frege’s abstraction principle, are just this intrinsic something, a fixed and objective horizon. Frege’s provision was simple and sufficient: Sets are comprehended extensions. Frege can now proceed with the logicist reduction.

Here is a nice example from the Fregean system. Making use of one-one correspondences, two sets M, N can be put in a 1-1 correspondence iff M is equinumerous to N, writing M ∼ N. Now let |M| denote the cardinal of the Ms, the set of all the sets which are the same size as M:

x ∈ |M| ≡ x ∼ M

Then we can derive the principle of equinumerosity, now called Hume’s principle (because Boolos refers to Frege’s reference to a passage from Hume). That is, we prove |M| = |N| ≡ M ∼ N. First let |M| = |N|. Then x ∈ |M| ≡ x ∈ |N|. Since M ∼ M, we have M ∈ |M|, and so by hypothesis M ∈ |N|. And therefore M ∼ N. Now let M ∼ N. By definition, x ∈ |M| ≡ x ∼ M; so suppose K ∼ M for an arbitrary K. Then by hypothesis and transitivity K ∼ N, so again by definition, K ∈ |N|. Repeating mutatis mutandis for K ∼ N and universally generalizing, we have x ∈ |M| ≡ x ∈ |N|, as desired.
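The informal argument just given uses only that ∼ is reflexive, symmetric and transitive. A minimal Lean 4 sketch makes each appeal explicit. The assumptions should be read carefully: r stands in for ∼ on an arbitrary type α, card r M stands in for |M| as a predicate (so identity of cardinals is rendered as having the same members, which presupposes extensionality), and the names card, hume, hrefl, hsymm and htrans are hypothetical. The logic is Lean’s, so this is the classical-style argument and not the paraconsistent theorem of part two.

```lean
-- Hypothetical names: `card`, `hume`. Plain Lean 4, no Mathlib assumed.
variable {α : Type} (r : α → α → Prop)

/-- `card r M x` reads: x ∈ |M|, defined as x ∼ M. -/
def card (M x : α) : Prop := r x M

theorem hume
    (hrefl : ∀ a, r a a)
    (hsymm : ∀ a b, r a b → r b a)
    (htrans : ∀ a b c, r a b → r b c → r a c)
    (M N : α) :
    (∀ x, card r M x ↔ card r N x) ↔ r M N := by
  unfold card
  constructor
  · -- |M| = |N| gives M ∼ N, since M ∈ |M| by reflexivity
    intro h
    exact (h M).mp (hrefl M)
  · -- M ∼ N gives x ∈ |M| ≡ x ∈ |N| for arbitrary x
    intro hMN x
    constructor
    · intro hx
      exact htrans x M N hx hMN
    · intro hx
      exact htrans x N M hx (hsymm M N hMN)
```

Symmetry is needed only in the last step, the direction the prose abbreviates as “mutatis mutandis”.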

In part two of this work we will produce Hume’s principle from within naive set theory. (The proof just stated is too informal and reliant on extensional, material inferences to establish a paraconsistent theorem.) It is worth showing the naturalness of the original argument here, however, in case the idea that Frege’s original system is essentially absurd or uninhabitable (as is suggested by Dummett when he uses imprecations like ‘archaic’ at the outset of [Dum91]) needs further discrediting. That it is contingently absurd in contexts like classical logic is not at issue. From Hume’s principle Frege goes on to derive second-order Peano arithmetic. Following Wright [Wri83] neo-logicism observes that this fragment of Frege’s system is consistent, and calls this fact Frege’s theorem. The question qua old logicism is whether or not Hume’s principle is analytic [HW01]. Since we will be providing proof of Hume’s principle in naive set theory, though, entering into this debate would be a pointless diversion; it is analytic if comprehension is. Suffice it to say that Frege’s original system boasts many of the features of a good foundation: His account of is conceptually clearer than the original notion, vindicating the reduction; the system is formally and philosophically rich. No such system, however, can be embedded in classical logic and survive.

5. Paradox and Prospect

In his taxonomy of reactions to paradox, Woods titles the most severe Frege’s Sorrow. For Frege, upon receipt of an otherwise laudatory note on “just one difficulty” from one Mr. Russell, most memorably wrote:

Hardly anything more unwelcome can befall a scientific writer than that one of the foundations of his edifice be shaken after the work is finished. ... And even now I do not see how arithmetic can be scientifically founded, how numbers can be conceived as logical objects and brought under study, unless we are allowed—at least conditionally—the transition from a concept to its extension. ... Solatium miseris, socios habuisse malorum. I too have this solace, if solace it is; for everyone who in his proofs has made use of extensions of concepts, classes, sets, is in the same position. It is not just a matter of my particular method of laying the foundations, but of whether a logical foundation for arithmetic is possible at all. [Fre03](127)

This echoes his reply to Russell: “With the loss of my Rule V, not only the foundations of my arithmetic, but also the sole possible foundations of arithmetic, seem to vanish.” Woods notes that “it may strike us as an extreme response, a trifle on the hysterical side.” [Woo03](11) However, since he took his Begriffsschrift to be the correct description of logic, and that description is classical, Frege is correct. His edifice is destroyed.

Russell had hit upon his paradox, not by studying Frege’s system in depth, but rather while pondering Cantorian arithmetic. His correspondence to Frege was years after he noticed the paradox, an act prompted mainly by his own inability to solve it [Lav94](61). Russell was explicitly committed to a first-order comprehension axiom, and reasoned as follows. Since there are cardinal numbers, there is a set of all the cardinals, C. This set has some cardinal.
Cantor’s eponymous theorem, though, shows that all the subsets of any set M together are cardinally greater than M. So there must, and yet cannot be, some cardinal beyond the cardinal of C. Moore fills in the historical details of Russell’s thinking in his essay in [Hin95]. This is Cantor’s paradox, known to Cantor himself probably a decade before Frege—a cardinal form of the paradox of absolute infinity, to be studied next chapter. The same contradiction applies to Frege’s system, for deep reasons. It is built into Frege’s assumptions. Concepts are the same when, and only when, they have the same extension. The tie between concepts and extensions leads to unexpected consequences, in the form of paradoxes. On the one hand, there is a concept for every extension, and an extension for every concept (even if for most concepts the extension is just empty). Basic Law V states that the relation between extensions and concepts is functional and one-one. On the other hand, there are more concepts than extensions. The general comprehension principle inside Frege’s axiom requires there be strictly more concepts than extensions, since there are concepts like ‘extension’. Formally, this is to consider the totality of either the Cantorian or Fregean universe, and to see simultaneously that the universe is one-one with all its subsets, and yet that there are more subsets than members. Zalta explains the point at length in his [Zal07]. Any attempt to work with concepts in extension, then, will face this surprising, powerful and beautifully paradoxical fact: that there is an inconsistent fixed point where concept and extension inevitably, impossibly meet. Dummett [Dum91] asks how the serpent got into the garden: Why is Frege’s set theory inconsistent?
A technical and anachronistic answer, near but not quite Dummett’s, is that Frege took concepts to be second-order, and ranged over them with corresponding second order variables, while taking the extensions of concepts to be first order. This restates Zalta’s diagnosis that there both needs to be and cannot be the same number of extensions as concepts. Naive comprehension is in fact stronger than positing equipollence of extensions and concepts: It is a statement of equivalence between extension and concept. For instance, an (apparently) second order principle such as mathematical induction

(∀F)[F(0) ∧ (∀n)(F(n) → F(n + 1)) → (∀n)F(n)]

in the presence of the set F = {x : F(x)} becomes just a first-order statement about sets:

(∀F)[0 ∈ F ∧ (∀n)(n ∈ F → n + 1 ∈ F) → (∀n)(n ∈ F)]
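The passage between the two displays can be made explicit. In the Lean 4 sketch below, which is an illustration under assumptions and not part of the thesis’s formal system, S is a hypothetical type of sets, mem a hypothetical membership relation between numbers and sets, and collect a hypothetical operator sending a predicate F to the set {x : F(x)}. The hypothesis comp is naive comprehension restricted to predicates on the numbers, and setInd is the set-form induction displayed above. From these two hypotheses alone the predicate-form induction follows, with no use of Lean’s own induction tactic.

```lean
-- Hypothetical names: `S`, `mem`, `collect`, `comp`, `setInd`. Plain Lean 4, no Mathlib assumed.
variable {S : Type} (mem : Nat → S → Prop) (collect : (Nat → Prop) → S)

theorem pred_induction_of_set_induction
    -- naive comprehension for predicates on the numbers (assumed)
    (comp : ∀ (F : Nat → Prop) (n : Nat), mem n (collect F) ↔ F n)
    -- induction stated as a first-order claim about sets (assumed)
    (setInd : ∀ s : S, mem 0 s → (∀ n, mem n s → mem (n + 1) s) → ∀ n, mem n s)
    (F : Nat → Prop) (h0 : F 0) (hs : ∀ n, F n → F (n + 1)) :
    ∀ n, F n := by
  intro n
  -- pass from the predicate F to the set {x : F(x)}
  apply (comp F n).mp
  apply setInd (collect F)
  · exact (comp F 0).mpr h0
  · intro k hk
    exact (comp F (k + 1)).mpr (hs k ((comp F k).mp hk))
```

The converse direction is immediate: instantiating the predicate-form principle at the predicate “n ∈ s” gives the set-form principle for any set s.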

Much has been said about the superiority of second-order theories, e.g. [Sha91], and it is widely rumored that mathematical truths are second order; Potter repeatedly emphasizes that first-order set theory is guided by and gains legitimacy from the second-order variant, and cf. Tait in [DO98] on reflection theorems. Endorsing naive comprehension is an act of agreement with these notions, but by importing higher-order, intensional entities into a first-order extensional theory. It takes properties to be what they are: objects. (This is the core thesis of Routley’s [Rou80] opus: Everything is an object.) Quantifiers ranging over, say, sets of sets, are still ranging over just sets. The problem returns us to the question asked at the start: How does a many become a one? We have seen that, covertly for Cantor and overtly for Frege, the answer is through law-like grasping. On the other hand, overt for both men is that sets are fully extensional, identified by no more than the determination of their members. Frege’s Basic Law V does us the service of perspicuously saying both at the same time. Cantor in effect does the same by giving an abstractionist, and so intensional, account of all his most important objects, and also asserting the solely extensional nature of sets, as in [Hal84](301). Dedekind’s ‘systems’ are somewhere between the two. As remarked from the outset, I cannot claim that the set concept I will be using, as encoded in the first-order axioms of comprehension and extensionality, is unequivocally the same as the concept used by the first founders. There is too much confusion, both internal to their respective theories and externally between them, to expect a static notion to emerge. Dedekind and Peano may have been thinking about mereological fusions. Frege’s own signature metaphysics have been noted, and Cantor’s ideas are wild. Such is always the case in human endeavors, particularly in early days.
I do, however, claim to have extracted the common threads of extensional members intensionally bound, as predicates in extension, and do not think it too contentious to say that this is, retrospectively, an explanation or rational reconstruction, with minimal mutilation, of the nineteenth century approach. The set concept, then, is inconsistent. I started by saying that our ability to explain the set concept directly affects the point or possibility of reducing mathematics to sets. We would be straying too far into the twentieth century to begin discussion of the Zermelodic axioms here, but it is worth pointing out now their philosophical import: total abandonment of conceptual foundation, and so abandonment of a true foundation. Benardete (in [Woo03](334)) understands that

what we want to know is precisely what it is about sets as such, sets qua sets, that precludes there being [a Russell set]. Insight being above all what we seek, Zermelo would seem to be the least promising source to consult; for he was positively ostentatious in insisting that this was no more than a home remedy, suited to the needs of the working mathematician on the most pedestrian level.

This view is emphatically asserted by Hallett:

Axiomatization went hand in hand with the divorce from any attempt to understand what sets are or what conceptual role they play. We cannot say with any kind of conviction what sort of things sets are, so we attempt a type of ostensive definition of them through axiomatization or ‘listings’. [Hal84](303)

At least at the end of the nineteenth century, and as I will argue in the next chapters, over the twentieth century’s cumulative hierarchy, the set concept was abandoned by the mathematicians into the hands of metaphysicians. Hallett’s is no exaggeration; von Neumann says: “One understands by ‘set’ nothing but an object of which one knows no more and wants to know no more than what follows about it from the postulates.” [vN67] Potter characterizes this approach to axiomatics somewhat glibly, as intent on retaining “as many as possible of the naive set theoretic arguments which we remember with nostalgia from our days in Cantor’s paradise.” [Pot04](34)

Zermelo’s ad hoc method alludes to the true dilemma of set theory: The naive version is both irreplaceable and classically untenable. In a footnote to the comments just made, von Neumann admits

There is, to be sure, a certain justification for the axioms in that they go into evident propositions of naive set theory if in them we take the word ‘set’, which has no meaning in the axiomatization, in the sense of Cantor. But what is omitted from naive set theory—and to circumvent the antinomies some omission is essential—is absolutely arbitrary. [vN67](396)

What we see is a general persistent sense that set theory, the naive version, is still in essence correct, its main ideas easily preserved, but rather mysteriously attached to contradictions. “It seemed unworthy of a grown man to spend his time on such trivialities, but what was I to do?” Russell (in [Lav94](61)) wonders.

“There was something wrong, since such contradictions were unavoidable on ordinary premises.” By the outset of his 1903 landmark advance of logicism, Russell is utterly distracted by the import of his contradiction: that the concept of set requires clarification.5 Because the reduction is not conceptually satisfying, the omission of an analytic concept of set was and remains an important contributing reason for the collapse of the logicist program. What is initially reckoned to be a minor omission has blossomed into a major problem, a blatantly irresolvable dilemma: How can we both practice set theory as it is intuitively understood, and be perfectly consistent?

Sets, as the meeting of properties and objects, are inconsistent. War is violent, love is blind, and to be explicable the set concept involves contradiction: a multiplicity which is also a unity. The three chief founders of set theory, Cantor, Dedekind and Frege, as well as precursors like Bolzano, each essentially employed naive sets. Holding the naive view holds out the possibility of extending and completing the maps of the conceptual universe in a way that any consistent theory cannot. “Philosophers have said so often that mathematics is extensional that many mathematicians have come to believe it,” Meyer and Routley remark, reemphasizing a theme from Meyer’s [Mey71]. “With respect to any reasonable construal of extension, we cannot think of any doctrine more evidently false. Mathematics deals with such stuff as dreams are made on...” [MR77](368)

A dialethic paraconsistent theory picks up reduction where Frege left off. For the objections to logicism are these [PRN89](524): The reduction is not conceptually illuminating; the reduction is impossible; a complete reduction is necessarily inconsistent. This last is undeniable and the motive for going paraconsistent. The first is assuaged by returning to the naive idea of a set.
As for the possibility of the reduction, this difficult task is carried through in part two.

That the paradoxes of infinity intertwine with the tension between intensions and extensions is surprising and important. The underlying structural link is displayed in the inclosure schema, and will be discussed at length in chapter 7. Well in advance of identifying the inclosure, in 1979, Priest writes:

...Our familiar world disappears and we find ourselves in strange new surroundings. The new terrain clearly needs to be explored. Where it will lead is not yet clear. Yet one consequence for the history of mathematics already stands out. The discovery by

5“In the case of classes, I must confess, I have failed to perceive any concept fulfilling the conditions requisite for the notion of class. And the contradiction discussed in Chapter X proves that something is amiss, but what this is I have hitherto failed to discover. For publishing a work containing so many unsolved difficulties, my apology is, that investigation revealed no near prospect of adequately resolving the contradiction or of acquiring a better insight into the nature of classes.” [Rus37](preface).

Russell of a set which was both a member of itself and not a member of itself, is the greatest mathematical discovery since √2. [Pri79](240)

In contrast to Russellian disappointment, Zermelodic formality, or especially Frege’s sorrow, “the view that there is no concept of set,” Woods identifies the dialethic reaction as Routley’s serenity [Woo03](159): The axioms prove contradictions, and these are worthy of study. Now that the foundational set concept and reactions to it are reasonably clear, the next chapter draws out the paradoxical aspects of the concept at length, arguing that these properties are not mere spandrels of an otherwise consistent notion, but formative and essential.

CHAPTER 2

The Absolute

This chapter picks up the crucial tension remaining from early set theory, the inconsistent absolute, and shows in some detail its place in the transfinite. Rather than simply being a forerunner to the iterative conception of set as a limit on size, Cantor’s doctrine of the absolute captures the rich and paradoxical nature of the set concept. On the naive view, the absolute is an inconsistent set. After expositing inherent properties of the absolute as it bears on wellorder and choice, we examine the absolute from two perspectives—as viewed from below (as the set of all sets), and as viewed from above (as reflective).

1. Introduction

For over a century, Cantor’s principle of absolute infinity has remained an important guide to set theoretic investigations. Insofar as set theory charts the universe of sets at the limits of consistency, set theory just is the study of Cantor’s absolute. The absolute is a boundary condition and as such it shapes the universe; it is also correct to characterize the absolute as the universe itself. To say that a boundary, what it incloses, and the far side of the ultimate bound are all in some sense identical, is nearly to have articulated the paradox. The absolute is a fixed point where there can be none, a paradigmatic and powerful dialethia. Cantor was among the first to see how infinite sets can be dealt with coherently. He was also among the first to see precisely a problem with overlarge collections. This is in good keeping with his general philosophical attitude: The transfinite is that which is extendible, while the absolute is that which is not. Cantor (in [Ruc82](10)):

The actual infinite occurs in three contexts: first when it is realized in the most complete form, in a fully independent otherworldly being, in Deo, where I call it the absolute infinite or simply absolute; second when it occurs in the contingent, created world; third when the mind grasps it in abstracto as a mathematical magnitude, number, or order-type. I wish to make a sharp contrast between the absolute and what I call the transfinite, that is, the actual infinities of the last two sorts, which are


clearly limited, subject to further increase, and thus related to the finite.

Pre-theoretically, these aspects are all the same—infinity is infinity is infinity, the child knows. Cantor’s great accomplishment was to isolate the properties of the infinite that may be studied meaningfully, to find “an intricate and beautiful form in an area that had hitherto been thought formless.”[Pri02a](113) Along with the core property of being incapable of further increase, Cantor characterizes the absolute by turns as inconsistent, incomprehensible, and ineffable. He also associates the absolute with God. Understanding these features is the aim of this chapter. Cantor thought that what can be counted and contained is transfinite, and what is mathematically ineffable is the absolute, both the last outpost and far side of intelligibility. Cantor’s legacy is a continuing tension between the transfinite and absolute aspects of the infinite—the problem of finding a precise demarcation of a limit and a far side amidst their perpetual intertwining. A metaphor suggesting what our philosophy and mathematics will find is that of a glacier, cutting across the landscape and coming to rest at the top of the world, leaving transfinite moraines and sediment in its linear wake: The transfinite is the residue of the absolute.

Looking ahead, the naive view has it that all collections are sets. The naive dialethic approach, then, is to regard all infinite sets as transfinite, and some transfinite sets as absolute, too. This will take on more detail once it is seen that overlarge or absolute sets are extendible, too. All this is rather paradoxical in itself, as we will see here and reappraise in chapter 7. What it means for now is that we will be more interested in the absolute in itself than in what exactly Cantor said about it; if the absolute really is dialethic, but Cantor was attempting a consistent explanation of it, then his remarks will inevitably mislead in places. 
In §2 below we face the most challenging aspects of the doctrine, concerning God and the axiom of choice. In §3 and §4 below we then review the two main approaches to the absolute’s indirect study. A limit has a near and a far side—a view from below, and a view from above. As an approach to understanding set theory, the more common is from the bottom up, using generalizing generating principles under a limitation on size; more recently the top down method has gained prevalence, through reflection principles.

In his erudite study [Jan95], Jané argues that there are two distinct periods in Cantor’s thinking about the absolute. (Lavine [Lav94] argues for a similar transition in Cantor’s thinking overall.) From the 1883 Grundlagen until 1896, Cantor believed in the actual existence of the absolutely infinite. In his late work, on the other hand (with the 1895-97 Beiträge silent on the matter), Cantor took the absolute to be “irreducibly potential”—that is, something close to the Aristotelian notion of infinity, the ἄπειρον. The two directions of study in this chapter are linked to Jané’s distinction: The view from above is tied to the early view of the actual absolute, where “the totality of all ordinals is assumed to be given...not really created through the application of the generating principles—generation being only a suggestive way of describing the matter.”(383) The view from below is an expression that the absolute can only be approached through generating principles, “in such a way as the generation process is absolutely incompletable.” In sum,

Actually conceived, the absolute bounds the entire ordinal sequence, whereas potentially conceived, the ordinal sequence is absolutely unbounded.[Jan95](383)

There is a good deal of tense energy in this formulation; it shows that there are two ways of understanding the absolute. Both ways are correct. Cantor held both at different times. 
Which perspective Cantor occupied at which time is, I think, a matter of Cantor biography, in the same way as which aspect of the Necker cube one sees at any given time is just a matter of empirical reporting. Both aspects are real. The work here is ultimately aimed at understanding and accepting the meeting of oppositions, at occupying the point of collision that is the dialethic absolute, and carrying on study directly, with neither mysticism nor superstitious angst.

2. Equivalents of the Axiom of Choice

A set M is infinite iff M does not map to any finite ordinal, and finite otherwise. The axiom of choice makes infinity equivalent to Dedekind infinity (defined at prop. 5.59). The axiom of choice makes precise the boundary between the finite and the infinite.

“Someone once said to me,” Forster writes, “‘Doing set theory with a universal set is like believing in God.’ This arresting simile expresses a sound intuition....” [For95](11) Some have said that set theory is no more than precise theology, echoing comments of the editor of Mathematische Annalen to Hilbert, “Das ist keine Mathematik, das ist Theologie!”1 To make a very imprecise comment, in the works of some set theorists, past and present, are the tones of unabashed divine thinking.

Now, the naive set concept is of an extensional intension, or perhaps an intensional extension, but it is not theological. There is no need of that hypothesis. Belief in a set of all sets does not induce questions of theodicy; the ordinals did not create the universe; and offering up a prayer to {x : x = x} is obviously absurd. The truth in Forster’s intuition is due to the fact that some of the more potent aspects of the naive set concept, e.g. infinite magnitude, have also been attributed to God.

1 Recorded in Max Noether’s obituary of P. Gordan in Mathematische Annalen 75, 1914, page 18.

Nevertheless, moderns tend to dampen the extent to which metaphysics influenced Cantor’s thought, wishing to see in the absolute no more than a forerunner of modern iterative set theory based on the limit on size. This is to ignore entire tectonic plates of what Cantor says. Before we go on, then, and in light of Cantor’s theological proclivities—“...the transfinite have existed from eternity as ideas in the Divine intellect”[Hal84](21)—relations between the absolute and theology must be cleared up. We consider the absolute in itself, before describing its relation to the transfinite numbers. In particular we follow threads from the last chapter on the axiom of choice and the wellordering principle, and their relationship to deific heuristics. We will see that the axiom of choice has inherited the metaphysical ghosts of Cantorian theory; that the ghosts were noticed, and not entirely exorcised; but we will allude to the fact that paraconsistent set theory simply shows the axiom of choice to be a consequence of naive comprehension (theorem 5.56 of chapter 5), and therefore a consequence of the (god-free) naive set concept.

A basic mode in which Cantor’s doctrine is given is through inconsistency; in his well-known 1899 letter to Dedekind [vH67](114), Cantor spells out the principle as a fact about collections:

For a multiplicity (Vielheit) can be such that the assumption that all of its elements ‘are together’ leads to a contradiction, so that it is impossible to conceive of the multiplicity as a unity, as ‘one finished thing.’ Such multiplicities I call absolutely infinite or inconsistent multiplicities. As we can readily see, the ‘totality of everything thinkable,’ for example, is such a multiplicity. . . If on the other hand the totality of elements of a multiplicity can be thought of without contradiction as ‘being together,’ so that they can be gathered together into ‘one thing,’ I call it a consistent multiplicity or a ‘set’.

The 1899 letter indicates Cantor’s awareness of the paradoxes underlying his fledgling set theory; he had thought about absolute paradoxes at least since 1895 [Lav94](55). The absolute is, foremost, a name for insoluble paradox.

In 1897, Peano’s student Cesare Burali-Forti published the first of the set theoretic antinomies. While he was working with a mistaken definition of wellordered set, nevertheless Burali-Forti provided enough for the mathematical community, preeminently Russell, to see that naive sets point into the inconsistent. Burali-Forti considers the set of all ordinals, On, and finds that this set is wellordered. On must therefore have an ordertype, and this type is by definition within On—except that, by wellorder, the ordertype must be greater than any member of On. Burali-Forti believed himself to have disproved the trichotomy of ordinals [vH67]. I believe he proved that understanding the ordinals as a whole is beyond the resources of classical logic.

Very notably, in the 1899 letter Cantor uses the inconsistency of On to provide an argument that every cardinal is an aleph. The alephs

ℵ_0, ℵ_1, ..., ℵ_ω, ℵ_{ω+1}, ..., ℵ, ...

are linearly ordered transfinite cardinals, indexed by the ordinals. Assuming the wellordering principle, Cantor argues that any set with cardinal not an aleph must be large enough to contain all the ordinals—that is, be inconsistent—that is, absolutely infinite—and so has no cardinal at all. Of this letter, which Zermelo wrote “operates with ‘inconsistent’ multitudes, indeed possibly self-contradictory concepts,” Moore [Moo82](53) remarks that

Cantor exhibited no alarm over the state of set theory in his letter [to Dedekind]—in sharp contrast to Gottlob Frege’s dismay ... Cantor did not treat these apparent difficulties as paradoxes or contradictions, but as tools with which to fashion new mathematical discoveries.

One (anachronistic) reconstruction of Cantor’s proof is as paraconsistent (ch. 5, thm. 5.72). The proof of the aleph theorem in the 1899 letter, aside from using overlarge sets, assumes that the ordinals can be “projected” into a set. How can the entirety of the ordinals be projected into a set? It is not a coincidence that the absolute as a name for inconsistency guides a proof utilizing a massive wellorder.

We saw in the last chapter Cantor’s fixation on the notion of one-one mappings, and that his faith that all sets have comparable powers derived from the certainty that all sets can be wellordered. For Cantor, the wellordering principle was a law of thought. This is to say that even very large collections can be, more or less literally, counted. And we saw that the justification for this belief resides in God; or, what is largely the same, that its truth was an article of faith. An important component of this faith is that it puts no pressure on, and so calls no attention to, the distinction between the quasi-constructive image of putting a set into a wellorder, and the pure existential notion of a set being wellordered. Or again, for Cantor, whether a set can be dynamically counted, or is statically quantified, is immaterial; the effect, and the certainty that it has been effected, is the same.[Hal84](154) Whether this improves the of the constructive metaphor or reduces the plausibility of the existential statement, I shall return to next chapter. In a very explicit way, Cantor invests the absolute with the full retinue of metaphysical and theological mystery.

What surpasses all that is finite and transfinite is not a “Genus [supremum]”; it is the single, completely individual unity in which everything is included, which includes the “Absolute,” incomprehensible to the human understanding. This is the “Actus Purissimus” which by many is called “God.” [Dau79](290)

Zermelo saw that Cantor’s projection procedure without further argument failed the standards of proof. Zermelo made explicit an assumption that has now been revealed as ubiquitous in mathematics, making such a projection possible without, apparently, any overt theology. It is equally baroque and interesting, then, that to the Rubins’ compendium [RR63] can be added another entry: Robert Meyer and Richard Routley separately assert that the axiom of choice is equivalent to the existence of God. Routley presents the claim in his [Rou80](133); Meyer’s is found in “God Exists!” in [Mey87]. Here is the proof. Suppose there to be causes and effects, these naturally arranged into partially ordered chains. And, Meyer argues, there is some reason, related to Aquinas’ cosmological argument, to think that every chain has a lower bound. Now suppose that the axiom of choice is true. Then (a downward version of) Zorn’s lemma is true, whereby the whole nexus of lower-bounded causes and effects must have a minimal element—a first cause. And this all men call God. Conversely, assume God exists; an argument to the axiom of global choice is left as an exercise. The same authors elsewhere say that “set theory is, at best, an exercise in philosophical and mathematical imagination anyway.”[MRD78]

In Cantor’s wake, Zermelo sought to place transfinite set theory on a rigorous basis and simultaneously to answer in a mathematically respectable way the outstanding open problem of Cantor’s career. To this end he published a proof in 1904 and then a proof with background axioms in 1908 (both in [vH67]), which reduces the problem of wellorder to our acceptance or rejection of the axiom of choice. 
“The method by which a wellordering necessarily results,” Hausdorff writes, “is basically very simple, although it places something of a burden on the abstract thinking of the reader.”[Hau57](66) In retrospect, Zermelo’s is a simple algorithm for ordering a set, M, by transfinite induction with a choice function:

m_β = a choice from M − {m_α : α < β}, if there is one; stop, otherwise.

Telling enough are the initial reactions to Zermelo’s postulate, as detailed in chapter two of Moore’s fine history [Moo82]. For these, as I’ll briefly recall, call key properties of the absolute into question: existence, coherence, and comprehensibility, among others.
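The ordering procedure just described can be written out as a transfinite recursion. What follows is a minimal classical sketch, not Zermelo's own notation: the symbol f for a choice function on the nonempty subsets of M, and the name R_β for what remains of M at stage β, are assumptions introduced here for illustration.

```latex
% Assumed for illustration: f is a choice function on the nonempty subsets of M,
% so that f(X) \in X whenever \emptyset \neq X \subseteq M.
\[
  R_\beta = M \setminus \{\, m_\alpha : \alpha < \beta \,\}
  \qquad \text{(what remains of $M$ at stage $\beta$)}
\]
\[
  m_\beta =
  \begin{cases}
    f(R_\beta)  & \text{if } R_\beta \neq \emptyset, \\
    \text{stop} & \text{otherwise.}
  \end{cases}
\]
% Induced ordering on M: elements are compared by the stage at which they are chosen.
\[
  m_\alpha \prec m_\beta \iff \alpha < \beta
\]
```

On this reading all of the non-constructive content sits in f. Classically the recursion must halt at some ordinal, since otherwise the ordinals would be mapped one-to-one into M; that is the point at which worries about On get their grip on the procedure.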

The interested French—Baire, Borel, Lebesgue, Hadamard—generally voiced constructivist or empiricist concerns, registering the now usual complaint that the axiom offers no rule for making a univocal choice, and so does not provide enough information about the choice process to be mathematically respectable. This is to outright deny the existence, if not the coherence, of a general choice function. A recurrent theme in this reaction to set theory is a doubt about whether intuitions grown on finite collections, e.g. of thinking of sets like a “bag of marbles,” can be sensibly extended to the transfinite.

Elsewhere, German Cantorians did not doubt the choice principle but were exercised that the Burali-Forti paradox might afflict the consistency of Zermelo’s wellorders. Bernstein, for example, saw no safeguard against a set of chosen elements being identical to the set of all ordinals On, and therefore carrying the known paradoxes of that overlarge collection. Burali-Forti’s original proof, after all, had been based on the wellorder of On; Zermelo proves the wellorder of any set at all. The Germans thus questioned the classical mathematical comprehensibility of unspecified choice sets—or more precisely, the consistency of such sets. That this was the salient concern for Bernstein, Schoenflies, Hausdorff et al., who nevertheless made use of the axiom, shows an early recognition of the problematic relationship and indiscernible boundary between the transfinite and absolute.

For their part, the Cambridge mathematicians Russell and P.E. Jourdain2, in conversation with G.H. Hardy, simply set about trying to prove the axiom from more primitive principles. The impression was that shared by many since: that the axiom is true, and should be a theorem of any natural set theory; yet is also elusive, beyond the ken of established mathematics.

These reactions all show the incipient troubles with respect to choice, troubles closely allied with its metaphysical origins. 
Further anomalies, such as Vitali’s non-Lebesgue measurable set or the Hausdorff-Tarski-Banach decomposition of a sphere, only upset the scene further. A century of controversy has judged in favour of Zermelo’s method, largely for pragmatic reasons. “No doubt the axiom of choice will always be desirable,” writes Dana Scott in 1974. “If only it could be deduced from some more primitive principle!” Moore concludes his study in echo: “The axiom of choice is surely necessary, but if only there were some way to make it self-evident as well.”[Moo82](310) The philosophical questions about choice, owing to an impoverished set concept, have never been answered.

The axiom of choice will be provable from the naive comprehension principle. This in some ways vindicates Zermelo’s axiomatic reconstruction. Our proof is of the wellordering theorem, from which choice follows, so it is Cantor’s suspicion of a Denkgesetz that is most emphatic for the naive theory. The existence of God, hopefully, is not provable for us, recalling Euler’s rebuff of Diderot in the court of Tsarina Catherine: “Monsieur, (a + b^n)/n = x, therefore God exists!”

The absolute is no pariah in Cantorian thought, but a leitmotif. This case is strengthened now as we turn to more concrete matters in the ordinal ascension.

2 On a sad biographical note, Jourdain met with frustrated and persistent failure on this, until his eventual decline. Less than a week before his death in 1919, Jourdain’s wife wrote to Russell in despair: “He [Jourdain] is now quite unable to talk or see anyone, but just lies in a semi-unconscious state. Why didn’t you make an effort to come a little sooner? You have made him so unhappy by your inability to see his well-ordering.”[Moo82] Indeed, the last nine entries in Jourdain’s bibliography all bear the same impossible title: “A proof that every aggregate can be wellordered.” The last was published posthumously, with a disclaimer from a sympathetic Mittag-Leffler, that Acta Mathematica “will not to any further extent be at the disposal for papers of the same kind.”

3. The View From Below

Here we approach the absolute from beneath, the most common method of introducing and explaining the transfinite. We mean to establish the transcendence condition of the inclosure schema.

3.1. Generalization. Transfinite counting is effected by two generating principles, found in the 1883 Grundlagen. In modern terms, these are the two cases needed to define a transfinite recursion, at successor and limit ordinals:

If α is a number, then α + 1 is a number.

For any sequence of numbers, there is a least number greater than all of them.

As stated, these are bald existence claims. To justify this continued climb, twentieth century ZF requires interplay of the infinity axiom and Fraenkel’s replacement scheme, which begins with a set and a function, and posits the existence of a range. The broad thought is of generalization, harkening back to Cantor’s method of abstraction. On this view, the natural numbers suggest the existence, and then are just an instance, of a form, a (small) initial segment of a self-similar chain. 0 is a limit ordinal over which ω stands qualitatively higher. And ω is a limit ordinal, too; so there should be something that stands in the same transcendent relation to it. That is, once the first ω-sequence is in place, we generalize; else ω would be the only one of its kind, the only number that cannot be reached by finite iteration. Generalization posits that ω is but one among many.

The higher and higher cardinals are further explicable in terms of generalization, too. Consider the heuristic notion that most of the finite natural numbers are very, very large. Pick an n ∈ N; no matter how large n be, almost every other number is bigger. The vast tracts of higher set theory capture this same sense of space. In Devlin’s modern text we meet just this argument:

The cardinal ℵ_0, being both regular and a limit cardinal, is very much larger than any of its predecessors. Neither the replacement axiom nor the cardinal successor function gets up to ℵ_0 from below. But our set theoretic universe should surely possess a uniform character! The cardinal ℵ_0 should not be so unusual. There ought to be a proper class of such cardinals. [Dev79](118)

Whether or not one suspects a middle ground between an entity being ‘unusual,’ and there being a proper class of like entities, the lure of upward generalization is strong. It is the clearest sense in which we extend, step-by-step, our intuitions about “bags of marbles” into the infinite. How long does the generation go on?
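The two generating principles of this subsection admit a compact schematic rendering. The following is a sketch in modern notation rather than Cantor's own; the predicate name Ord and the labels (G1) and (G2) are assumptions made here for illustration.

```latex
% Requires amsmath for \xrightarrow.
% (G1) successor principle: every number has an immediate successor.
\[
  \text{(G1)}\qquad
  \forall \alpha\, \bigl( \mathrm{Ord}(\alpha) \rightarrow \mathrm{Ord}(\alpha + 1) \bigr)
\]
% (G2) limit principle: every sequence S of numbers has a least strict upper bound.
\[
  \text{(G2)}\qquad
  \forall S\, \Bigl( \forall \alpha \in S\; \mathrm{Ord}(\alpha) \rightarrow
    \exists \gamma\, \bigl( \mathrm{Ord}(\gamma)
      \wedge \forall \alpha \in S\; (\alpha < \gamma)
      \wedge \forall \delta\, \bigl( \mathrm{Ord}(\delta) \wedge \forall \alpha \in S\; (\alpha < \delta)
        \rightarrow \gamma \leq \delta \bigr) \bigr) \Bigr)
\]
% Worked instance: (G2) applied to the sequence 0, 1, 2, ... yields \omega;
% (G1) applied to \omega then yields \omega + 1.
\[
  0,\ 1,\ 2,\ \ldots
  \ \xrightarrow{\ \text{(G2)}\ }\ \omega
  \ \xrightarrow{\ \text{(G1)}\ }\ \omega + 1
\]
```

Neither schema says where the sequences S come from, which is why the question of how long the generation goes on is left open by the principles themselves.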

3.2. Paradox. Generalization gives us perhaps the most trodden entry to the transfinite ordinals, as recapitulations of the structure of the natural numbers. Generalization also brings us to an explicit contradiction, which was alluded to last chapter. Cantor is committed by the domain principle to the existence of a set of all transfinite numbers. This collection is the absolute itself. (Or: Cantor’s justification for how a set arises from a multitude is found in the mind of God: “All of these particular modes of the transfinite have existed from eternity as ideas in the Divine intellect.”[Hal84](21) The multitude of all transfinite numbers is surely known to the Divine intellect.) Stating the domain principle with respect to ordinal generation,

The transfinite with its plenitude of formations and forms necessarily indicates an Absolute, a ‘true infinite’ whose magnitude is capable of no increase or diminution, and is therefore to be looked upon quantitatively as an absolute maximum. [Hal84](44)

The beauty and danger lie in the fact that, like the natural numbers, there are always more ordinals; they are always increasable. The unstoppable diagonal drive is set in motion, and with it we can anticipate the waiting inclosure paradox. For the existence of upwardly generalized numbers is philosophically granted by the domain principle. We quantify over the transfinite, for example in stating the generating principles, which is predicated upon a stable domain of quantification. The absolute is just this stable domain. The structure of the generating principles that breaks through every barrier then shows its Janus-head. The domain principle prompts us to consider On as a totality: If α is an ordinal, then there is a set On of all ordinals. But the generating principle that triggers the domain principle powers on, insisting that there must be more ordinals, a least greater than all of them. In broad point of principle, and specifically as witnessed by the Burali-Forti argument, this is a contradiction. This may be what is at issue when Boolos expresses discomfort with the limiting points that arise, cardinals like ℵ_λ = λ, a “teensy” cardinal by comparison to even the least , in his “Must we believe set theory?” reprinted in [Boo98]. Cantor’s doctrines involve an outright contradiction—all the ordinals may be apprehended because they inhabit a stable domain, and the ordinals all together may not be apprehended because their domain is overlarge [Hal84](47). The absolute manifests quantitatively, impossibly, as the final limit of the iterable.

For Cantor, the transfinite numbers were to be treated as far as possible like the finite numbers, a tendency Hallett dubs ‘finitism’ [Hal84](7). Before Cantor, mathematicians ventured to the edge of the infinite, but no further; Cantor called this edge merely ω, and showed that there are many more gulfs yet to bridge. But Cantor does recognize a closure, and more, that the closure is unstable. It is in this vein that Cantor proscribes the absolute from being a set, insisting that it is beyond all mathematical or rational approach:

The Absolute can only be recognized (anerkannt), but never apprehended (erkannt), even approximately. . . . The state of things is like that described by Albrecht von Haller: “ich zieh sie ab und Du liegst ganz vor mir [I count them off and they lie completely before me].” The absolutely infinite sequence of numbers seems to me to be, in a certain sense, a suitable symbol of the Absolute. [Can15](62ff)

Yet the absolute is recognized. It follows from the most basic principles of the theory. It is recognizably inconsistent. The process of succession ends at the absolute, and does not. In the straightforward sense that ∞ = ∞ + 1, the absolute is a steadfast fixed point. 
Having posited grounds for a continued climb, Cantor insists that there is an end of the process—an end beyond the end, what Jaspers in [Kau75] calls the encompassing,

not a horizon within which every determinate mode of Being and truth emerges for us, but rather that within which every particular horizon is enclosed as in something absolutely comprehensive which is no longer visible as a horizon at all.

Because all quantity falls into three categories—finite, transfinite, and absolute—Cantor requires there to be a demarcation at which the transfinite fades and only the absolute remains. That point of demarcation is necessary, but paradoxical. And this brings us to the main attempt to drive out the contradiction.
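The contradiction traced in this subsection can be laid out step by step. This is a classical sketch of the Burali-Forti reasoning, assuming that On is a set wellordered by <; the symbol γ for the ordertype of On is an assumption introduced here, chosen so as not to prejudge the notation for the absolute.

```latex
% Requires amsmath for align*.
\begin{align*}
  &(1)\quad \gamma = \mathrm{ordertype}(On, <)
      && \text{$On$ is wellordered, so it has an ordertype} \\
  &(2)\quad \gamma \in On
      && \text{$\gamma$ is an ordinal, and $On$ contains every ordinal} \\
  &(3)\quad \forall \alpha \in On\; (\alpha < \gamma)
      && \text{each $\alpha$ is the type of a proper initial segment of $On$} \\
  &(4)\quad \gamma < \gamma
      && \text{from (2) and (3)} \\
  &(5)\quad \neg(\gamma < \gamma)
      && \text{$<$ is irreflexive on the ordinals}
\end{align*}
```

Classically, lines (4) and (5) are taken to refute the assumption that On is a set. On the dialethic reading pursued in this thesis, the suggestion is instead that both lines are retained.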

3.3. Limit On Size. In terminology that arose soon after the discovery of the antinomies, the doctrine of limit on size is an attempt to control the absolute. Sometimes called the central heuristic overseeing modern set theory, limit on size is an interpretation of Cantor’s absolute, though it took insight on Russell’s part, in his 1905 presentation to the London Mathematical Society, to draw it out, and von Neumann to baptise it as a law. Axiomatically, von Neumann states that no set may contain as many objects as the universe. (In NGB set theory, that On is a ‘proper class’ is equivalent to the axiom of choice.) In a previously untranslated but otherwise transparent passage,

Aber sie sind nur dann als Elemente bei der Bildung anderer Mengen verwendbar, wenn sie nicht “zu gross” sind. . . . Eine genaue Definition: Eine Menge ist dann und nur dann “nicht zu gross,” wenn sie von kleinerer Mächtigkeit ist als die Menge aller Dinge überhaupt (d.h. wenn sie nicht so abgebildet werden kann, dass diese vollkommen überdeckt wird).

A nice definition: A set is “not too big” if and only if it is distinctly smaller than the cardinal of the set of all things.[vN76](494)

The intuition, now made eine genaue Definition, is that problematic sets are too big; ergo, we forbid any set which is too big, zu gross, and the problem ostensibly vanishes along with the problematic sets. 
By the time of the Tarski Symposium in 1971, limit on size was so entrenched that Church, in his paper “Set Theory with a Universal Set,” went as far as to claim that only two heuristic principles govern set theoretic progress: comprehension, and limit on size: “We must avoid axioms leading to sets which are too large, especially sets having a one-to-one or many-to-one relation with the universal class.”[Hen71](296) While the other set theorists of his day saw the comprehension principle as inviolable and therefore moved to make ad hoc restrictions on the set of all ordinals (see Moore chapter three and my next chapter), Zermelo moved to weaken the capacity to form sets, by giving his separation axiom (Aussonderung),

(∀x)(∃y)(∀z)(z ∈ y ↔ z ∈ x ∧ Φ(z)).

(See Zermelo 1908b in [vH67].) Aussonderung is sometimes considered to be a limit on size in itself: New sets can be extracted only from previously given sets, and if a is not overlarge, then any subset b of a will not be overlarge. This being much like an induction clause, the limit doctrine has an iterative character. If any overlarge set did exist, though, separation would not be an adequate block against contradiction [Hal84](251).

Limit on size is called up by the otherwise overtly inconsistent drive of the generating principles. On a prevalent view, limit on size and generalization together are a consistent extraction from Cantor’s theory. Challenging this view is the task of the next chapter, but we can say now that without any interpreting, fixating on the view from below seems just to identify the contradiction that the generating principles engender, especially in conjunction with the domain principle. Since the absolute is the set of all sets, a limit on the absolute is just the absolute again. Having stated the transcending iterative conditions, then, we view the absolute from the other direction.
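The remark that separation would not block contradiction if an overlarge set existed can be illustrated with the standard Russell argument. This is a classical sketch; the symbols V for a hypothetical universal set and R for the separated subset are assumptions introduced here.

```latex
% Requires amsmath for align*.
% Suppose, for illustration, that a universal set V exists: \forall z\, (z \in V).
% Instantiate the separation schema with x := V and \Phi(z) := z \notin z.
\begin{align*}
  &\exists R\, \forall z\, \bigl( z \in R \leftrightarrow z \in V \wedge z \notin z \bigr)
      && \text{separation applied to $V$} \\
  &\forall z\, \bigl( z \in R \leftrightarrow z \notin z \bigr)
      && \text{since every $z$ is in $V$} \\
  &R \in R \leftrightarrow R \notin R
      && \text{taking $z := R$}
\end{align*}
```

Separation alone therefore does no limiting work; classically, the work is done by the background assumption that no set like V is ever given.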

4. The View from Above

While the absolute confirms that consistent counting, or even just conceptualizing, does enjoin paradox at some point, the absolute does not show how to mark off a clear boundary—at least, not before the point of inconsistency is already reached. The absolute has so far only served as a name for a problem, and not a solution. We do not yet have enough information about the absolute for it to serve Cantor’s intended purpose, of demarcating the bounds of the consistent universe. Having only gazed up from below, we are missing much of the picture; so in this part we observe the panorama from above, using reflection principles: Any properties of the absolute are also properties of non-absolute sets. That is, the absolute, which Cantor wrote as Ω, reflects. In this mode the absolute represents not the inconsistent but the qualitatively inaccessible, or even ineffable, aspects of infinity. This is already inconsistent with respect to expressing the inexpressible; see [Pri02a] chapter 8.

Cantor urges that absolute infinity is incomprehensible. “In a certain sense it transcends the human power of comprehension, and in particular is beyond mathematical determination.”[Hal84](13) The reflection principle states that the absolute is beyond reckoning, because any attempt to comprehend it will inevitably result in only capturing a smaller subset. Formally, with Φ a predicate,

Φ(Ω) → (∃X)[X ⊂ Ω ∧ Φ(X)],

where ⊂ denotes proper parthood. Reflection theorems were developed formally and independently by Azriel Levy [Lev60] and Richard Montague in 1960, and at least since then have provided a fertile source of active research in set theory; see W.W. Tait in [DO98].

Reflection offers a downward justification of the existence of transfinite numbers. Consider some claims about the absolutely infinite, which I will here follow Cantor in writing as Ω; let us see how they reflect, following a suggestion in [Ruc82]. The mechanism is: If some property holds of the absolute, it holds of some lesser set. We can first generate the lowest of the transfinite ordinals, since

Ω is greater than every finite number.

By reflection some lesser object satisfies this property; and then by wellorder, there is a least such object. Call ω the least such object greater than any finite; reflection thus implies the existence of ω. We can restate the upward generalization notion now in terms of reflection:

The successor operation cannot reach Ω.

This gives the reflected limit ordinals. So the familiar low hanging fruits of set theory obtain quite quickly. To see how the large cardinals are justified, consider

For all κ < Ω, also 2^κ < Ω.

This yields (weakly) inaccessible cardinals, the first large cardinals. Investigating large cardinals was inaugurated in 1908 by Hausdorff, who considered cardinals of the form α = ℵ_α. So similarly, the existence of an inaccessible could be reflected by a theorem (thm. 5.72) to the effect that Ω = ℵ_Ω.

The reason such cardinals are called inaccessible is also a reflection issue. We have seen that the absolute is in fact the universal domain of quantification. For an axiomatic system like ZF, a domain of some least rank θ can serve as a model of ZF. And θ, as a model of the classical universe of sets, reflects the absolute. (The details of this are spelled out in my chapter 6.) Large cardinal enthusiasts, in particular Akihiro Kanamori in his wonderful The Higher Infinite, bring reflection to bear heavily on helping us understand the structure of the universe. Reflection is evidently very powerful.

Here then we have a characterization of the Cantorian program as picking out those aspects of infinity amenable to rigorous theorizing, reflections of the absolute. If this characterization is correct, then there is little wonder that some set theoretic paradoxes are irremediable. The edifice is extracted from an explicitly inconsistent object. As with the axiom of choice, naive theory can make sense of what is otherwise a dark doctrine. 
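The instances of reflection used in this passage all share one form. The sketch below instantiates the schema Φ(Ω) → (∃X)[X ⊂ Ω ∧ Φ(X)] with the three properties mentioned in the text. The labels Φ_1, Φ_2 and Φ_3 are assumed here for illustration, and the sketch treats X ⊂ Ω and X < Ω interchangeably for ordinals, which is a further assumption.

```latex
% Requires amsmath for align*.
% Reflection schema, as stated in the text:
%   \Phi(\Omega) \rightarrow \exists X\, [\, X \subset \Omega \wedge \Phi(X) \,]
\begin{align*}
  \Phi_1(X) &:\equiv \forall n\, ( n \text{ finite} \rightarrow n < X )
      && \Phi_1(\Omega) \text{ reflects; the least such } X \text{ is } \omega \\
  \Phi_2(X) &:\equiv \forall \alpha\, ( \alpha < X \rightarrow \alpha + 1 < X )
      && \Phi_2(\Omega) \text{ reflects to the limit ordinals} \\
  \Phi_3(X) &:\equiv \forall \kappa\, ( \kappa < X \rightarrow 2^{\kappa} < X )
      && \Phi_3(\Omega) \text{ reflects to the cardinals the text calls inaccessible}
\end{align*}
```

In each case the schema only delivers some witness X; picking out the least one, as with ω, draws on wellorder in addition to reflection.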
The view from below is just counting up from the finite, and has an uncontentious grounding, even if an overzealous extension. Descending from above, on the other hand, requires heavier metaphysics if it is to make any sense at all. A great advantage of the naive view will be to prove the reflection theorem (thm 6.2), from nothing but the basic notion of set. In turn, this will prove all the large cardinal axioms, and so divest higher set theory of some dubious conceptual abysses. As with the view from below, so is the view from above not without paradoxes. Indeed, notions like ‘incomprehensible’ or ‘ineffable’ have been recognized at least since the Hindu’s Brahman as sources of inconsistency [Pri02a]. These sorts of paradoxes are indigenous to inclosure. With respect to reflection, the proof is simple. Consider the second order property Ω(X), “Can only be predicated of

Ω.” When predicated of Ω, the predicate is satisfied. But, analytically, no other object will do—violating the reflection requirement that some lesser set also satisfy it. Reinhardt, in [Jec74], gives an honest and informal discussion of the absolute and reflection principles. After stating the “naive reflection principle,” he notes that the property “x = Ω” can only be satisfied by Ω itself; we might call such a property irreflective. Thus “something more subtle is required.” Rather than give a tightened parsing, though, Reinhardt goes on to say

According to Cantor, Ω is unlimited. (Cantor shared with classical Greece a certain distaste for the ἄπειρον (unbounded, indeterminate), under which he classed the potential infinite.) To the extent that our thinking is limited, then, it should be compatible with what we think or understand (about Ω, for example) that the same could be thought or understood of some κ < Ω. Thus we do not understand (in the requisite sense) the property P(x) ↔ x = Ω. [Jec74](191)

The argument here reasserts that reflection is meant to capture an aspect of Cantor’s doctrine concerning incomprehensibility. Because we only understand the absolute in a limited way, what can be said precisely of it can be said of some lesser set; and what can only be said of the absolute, cannot be said at all. Plainly, though, we have just said it. And we can all understand exactly what it means for Ω to be Ω. If not, then the whole reflection enterprise is unintelligible. The statement of the reflection principle—‘anything that can be said of Ω can be said of something less’—itself would be nonsense. In part two we will see a proof of Kunen’s, from [Kun71], that Reinhardt’s investigations are classically inconsistent. Rather than stage a retreat, though, consider the potential for self-reference here: Ω reflects. Then there are lesser reflecting sets, too:

If Ω reflects, then (∃X)(X ⊂ Ω ∧ X reflects);

if X reflects, then (∃X0)(X0 ⊂ X ∧ X0 reflects);

if X0 reflects, then (∃X1)(X1 ⊂ X0 ∧ X1 reflects)

...

This provides, again, a fertile source of large cardinal theorems for the paraconsistentist. And it explains the non-self-identity of Ω we will see in chapter five; for though there are lower reflecting sets, they may all still be just Ω. Like the Russell set in the opening incursion, since Ω properly reflects itself,

...Ω ⊂ Ω ⊂ ... ⊂ Ω...

This is the key mechanism to the wellordering theorem. The transfinite, when viewed from above, is a part of the absolute.
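The irreflective property and the self-reflection of Ω can be written out as a short derivation. This is a sketch in LaTeX (assuming the amsmath package), using only the reflection schema and the property x = Ω discussed above. The step numbers are added here for illustration, and step (4) assumes substitution of identicals, which a weak relevant logic may not license without further argument.

```latex
\begin{align*}
  &\text{(1)}\quad \Phi(\Omega) \rightarrow (\exists X)[X \subset \Omega \wedge \Phi(X)]
    && \text{reflection schema} \\
  &\text{(2)}\quad \Omega = \Omega
    && \text{the instance } \Phi(X) :\equiv X = \Omega \text{, applied to } \Omega \\
  &\text{(3)}\quad (\exists X)[X \subset \Omega \wedge X = \Omega]
    && \text{from (1) and (2)} \\
  &\text{(4)}\quad \Omega \subset \Omega
    && \text{from (3), substituting identicals (assumed)}
\end{align*}
```

Read classically, with proper parthood irreflexive, line (4) is why the property x = Ω has to be excluded from reflection. On the naive reading pursued here, it is the first link of the chain displayed above.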

5. Conclusion

Formerly there were only two taxonomical categories, finite and infinite. Cantor opened a space between the two. The transfinites, as far as possible, are treated like finite numbers; though they are without end, they are still graspable objects of intelligible and precise manipulation. Such is the view from below, which inevitably invites a horizon of its gaze, and this leads to the absolute as iterative limit. Viewed from above, the absolute has all the incomprehensible aspects of infinity, which can be understood through reflection. Laying down criteria to judge whether an infinity is transfinite or absolute, and mutatis mutandis consistent or inconsistent, almost by necessity fails, because as a matter of paraconsistent logic, ¬(Φ ∧ ¬Φ) is a theorem—so no sets are inconsistent. And similarly, the naive approach is exactly to treat the universal set or the set of all ordinals as sets like any other, so absolutely infinite sets are also transfinite; no sets are absolute. In a sense, this just is the paradox—that the absolute must be somehow apart from the set theoretic inclosure, and it is not. Totalities necessarily involve limits; without a boundary there is no totality. When true totalities—wholes which are not contained in any greater collection—are considered, paradox results. The absolute is a fixed point where there can be none. In 1930, Zermelo presents a summary of the situation [Zer30]:

The “ultrafinite antinomies of set theory” that scientific reactionaries and anti-mathematicians refer to so assiduously and lovingly in their campaign against set theory, these seeming “contradictions,” are due to a confusion of set theory itself, which is non-categorically determined by its axioms, with particular representing models: What appears in one model as an “ultrafinite non or meta set” is in the next higher one already a fully valid set with cardinal number and order type, and is itself the foundation stone for the construction of the new domain. The unlimited series of Cantor’s ordinal numbers is matched by just as infinite a double series of essentially different set-theoretic models, the whole classical theory being manifested in each of them. The two diametrically opposed tendencies of the thinking spirit, the idea of creative progress and comprehensive completion, which also lie at the root of the Kantian “antinomies,” find their symbolic representation and symbolic reconciliation in the transfinite series of numbers based on the concept of well-ordering. This series in its boundless progression does not have a true conclusion, only relative stopping points, namely those limit numbers which separate the higher from the lower model types. And thus also, the set theoretic “antinomies” lead, if properly understood, not to a restriction or mutilation but rather to a presently unsurveyable unfolding and enrichment, of mathematical science.

What remains is to take this knowledge, which is everywhere expressed by set theorists, and to admit that it is an integral part of set theory itself. The finite is pressed directly against the limit, against the absolute. The transfinite is fractal structure in the cracks. And structure demands study. At first, the gap between finite and infinite enjoined contradictions. Then the contradictions became definitions: Infinite sets are equipollent with proper parts of themselves; finite sets are not. Now there is a need for explanation of the gap between transfinite and absolute, and there are contradictions, too. The confusion can be handled in much the same way. As our dialethic mathematics will reveal in part two, the transfinite is little more than what can be coherently discerned of the absolute. The task of this chapter has been to till the ground in preparation.

CHAPTER 3

Foundations in the Twentieth Century

Nothing is easier than to devise expressions or notations.... But if we remove the veil and look underneath, if, laying aside the expressions, we set ourselves attentively to consider the things themselves which are supposed to be expressed or marked thereby, we shall discover much emptiness, darkness, and confusion; nay, if I mistake not, direct impossibilities and contradictions.
- Berkeley, The Analyst, 1734

In the last two chapters we have positioned some conceptual elements of nineteenth century set theory. That original theory was paradoxical in both the formal and informal sense—for it was clearly correct mathematics, and just as clearly, it was logically impossible mathematics. In this chapter we will consider in full the next generations’ reactions, the twentieth century doctrines encasing the mathematics of the infinite, indicating how the early notions have been transferred into the modern apparatus, contradictions and all.

1. Introduction

A natural thought in the contemporary climate: Isn’t the iterative concept of set a consistent and adequate set concept? In their authoritative and influential 1958 book on the foundations of set theory, Fraenkel, Bar-Hillel and Levy state, as uncontroversial fact, that set theory and therefore all of mathematics is in an ongoing crisis [AF58](14). What has changed in the intervening years? The sections below are my partial answer to this question—that because the crisis has not been resolved, instead the statements of it have been discredited. The assumption that the iterative notion of set is successful has been widely challenged in many places for many years—by Routley and Priest [PRN89], but also questioned by Boolos [Boo98], rebuked at length by Lavine [Lav94] and by Weir [Wei98], as we will be seeing. Some of these charges are gathered together in [Woo03](155-163) as “the dialethic mini-history of set theory,” and the lot is synthesized in chapter three of Berto’s [Ber07]. Since I began drafting this chapter, even, a swath of wider criticisms have been collected in essays in [RU06], there referring to doubts already aired in [ST00], on the needs of quantification and the concomitant shortcomings of the iterative concept. So the thoughts here are not unusual. Nevertheless, defaulting to the iterative concept is still basic for most philosophers. The purpose of the chapter, then, is to cast further doubt on this assumption.

Cries of a foundational crisis arise when the most basic elements of a rigorous subject are not clearly defined, and the behavior and properties of these elements are not adequately understood. In mathematics, the first crisis arose when Pythagoras discovered irrational numbers but deemed them incommensurable; another crisis came when Newton and Leibniz discovered the calculus but deemed it infinitesimal. The crisis that occupies us here is in the foundations of mathematics itself, in set theory—the first self-proclaimed foundation. Again the crisis concerns the incommensurable, now returned as absolute infinity. In some circles, it is now considered unsophisticated to talk of paradox having ever caused a crisis in the foundations of mathematics. Kanamori [Kan94](481) recounts a parable of spiders building epic cobwebs in the newly excavated vaults beneath a cathedral, foolishly believing their intricate threaded structure to support the stone spires and buttresses above. When one day a wind tears a gap in the webbing, the spiders are thrown into apocalyptic panic, while “of course, the craftsmen above hardly raised an eyebrow.” One response to the suggestion that there is a problem here at all, let alone a crisis, is dismissal. Jones registers this sentiment in his “A credo of sorts”:

The physicist’s reaction on being told that his proof was riddled with holes would be essentially to say, “Oh that’s just something for the mathematicians to worry about.” So it is for the mathematician with respect to the logician. If ever the day comes when the logicians find some inconsistency in arithmetic, our reaction will surely be “Oh that’s just a trick of the logicians; let them worry about it.” And one can almost hear the inconsistency coming—perhaps there will be a proof of the existence of a contradiction, but that contradiction is too long to even contemplate, so we may quite happily behave as if it did not exist. [DO98](204)

Shapiro [Sha97] tracks a related notion with the name “philosophy-last-if-at-all.” Has the iterative set concept solved the crisis? No. The iterative set conception is either inconsistent or inadequate. Routley paints the scene in dystopian hues.

There are whole mathematical cities that have been closed off and partially abandoned because of the outbreak of isolated contradictions. They have become like modern restorations of ancient cities, mostly just patched up ruins visited by tourists. [Rou80](927)

The problem inherent in the work of Frege, Dedekind, and Cantor—the problem of describing an intensional universe by an extensional theory—remains. The problem of absolute infinity, which expresses and engenders this problem, remains. The basic fact is contradiction, and has been clearly identified. I will show how and why the problem persists. Following Cohen’s independence proof in 1963 and Kunen’s inconsistency result in 1971 (see chapter 6), the most interesting of the subject’s questions, the cardinality of the continuum and the axiom of choice, have been left unanswered—and permanently unanswered in the framework of ZFC. This is at least mathematically unsatisfactory, and there is growing dissatisfaction among mathematicians about ZFC, “the generally accepted axioms for set theory,” writes Woodin—“but I would call these the twentieth-century choice.”[Woo01](567) The problem, I will argue, is rooted in the tension between the descriptive fact that the naive set concept is still in use, and the proscription that it cannot be, given classical logic.

First I will enter into a reprise of recent history, and identify a methodological issue which is quite general: a myopia—a near-sightedness about history. The methodological idea arises out of a sociological observation made most prominently by Lakatos, that science aspires to be a pinnacle, always the last best hope, of a unified human project. As Shapiro puts it, “The historical assertions that conform to one’s views are to be praised as clear, while those that are incompatible are dubbed ‘confused’.” [Sha91](184) One can observe in the literature the founding and maintenance of a mythology, a revision of history that reifies current trends and drives out alternatives.

Then I will show, largely through remarks of working set theorists, that the concepts underlying contemporary set theory are in need of clarification. The diagnosis is simple. Because the situation is inherently inconsistent, but the theory is embedded in classical consistency, the result is incoherent. A main claim is just this tautology: If we insist a priori upon consistency, then it is a priori ruled out that set theory is inconsistent. By the same token, if genuine set theory is intractably paradoxical in places, then it is equally a priori impossible that a purely consistent theory will suffice.

2. On Method

The abstract sciences differ from others in that they “cannot, and have no need to, negotiate the empirical check.” [Woo03](xi) Pure mathematics and logic are paradigm examples of abstract science, and arbitrating conflict in these—or even characterizing the conflict by fixing on what the conflict is about (method? content? ? aesthetics?)—is an imposing challenge. All the more so for any philosophy in the vicinity of an abstract science. “How,” Frank Ramsey asked in 1931, “can a philosophical enquiry be conducted, without a perpetual petitio principii?” While abstract sciences and the philosophy thereof may bypass the empirical check, there is still concrete data: the state and practice of mathematics. Shapiro quite rightly makes a programmatic identification:

Because mathematics is a dignified and vitally important endeavor, one ought to try to take mathematical assertions literally, ‘at face value.’ This is just to hypothesize that mathematicians probably know what they are talking about, at least most of the time, and that they mean what they say. [Sha97](3)

I mean to extend this principle to include not just mathematical assertions, but also assertions made by mathematicians about their subject. This can subtend some straightforward observations about set theory: of outright conflict internal even to the consensus doctrine, of ideas, motives, and methods that depart from a purely iterative concept of set. Let us clarify method as the business of taking “the full range of contemporary mathematics seriously. It is a basic datum that the bulk of mathematics is legitimate ... something to be interpreted and explained, not explained away.” [Sha97](51) In agreement with this dictum, we will try to disentangle what talk from experts is only after a façon de parler, and which parts really do not fit the orthodox view of set theory occurring in an iterative, non-constructive, and ultimately uncomprehended wellorder. Burgess [Bur05] invokes the difference between descriptive and prescriptive foundational mathematics. Frege saw his own work as descriptive, Burgess points out, capturing the inferences and assumptions used in informal proofs and thereby showing that the standing edifice of mathematics, “dignified and vital,” is secure. “I did not wish to represent an abstract logic by formulas,” writes Frege in 1883 [Sha91](175) “but express a content ... in more exact and clear fashion....” The avowedly descriptive character of Frege’s program is what makes logicism a legitimate foundational, rather than interpretive, program. One can go so far as to make description an adequacy condition for any foundation of mathematics. The tool for disentangling the expert’s talk is to figure which expressions are descriptions of mathematical structure, and which are expressions of prescription about how that structure should be studied. It is with the goal of isolating the descriptive fragment, albeit not a description that would win immediate universal endorsement, that we proceed.
In following Shapiro and taking what mathematicians say seriously, the most difficult interpretive and explanatory task is how to respond when mathematicians speak in contradictions, all the while claiming consistency.

3. Paradise Lost, Paradise Regained

One received mythology tells us

3.1. A Story. On 7 December 1873, Cantor discovered that the reals are uncountable and thereby invented set theory; and that for a time, for the romantic period, all was halcyon in the garden of infinity, as we nibbled cherubic on the ordinal fruits of the transfinite, revelling in the place Hilbert called a paradise. In those days, we thought that any collection at all could be a set. After all, set theory arose out of concerns with arbitrary real-valued functions (a paradigm example: Dirichlet’s “shotgun function,”

D(x) = \begin{cases} 1 & \text{if } x \text{ is rational;} \\ 0 & \text{if } x \text{ is irrational,} \end{cases}

everywhere discontinuous), functions that do not arise naturally but can nevertheless be defined and so needed to be understood, to keep analysis above alchemy. From this motivation, forming sets from arbitrary collecting predicates is appropriate, intuitive, even tautological. Against this background, set theory was intended to clarify the foundations. In those early days, a small band worked to anchor basic conceptual processes in precise formal methods. There was no marked boundary between first and second-order logic; there were no inhibitions about including hefty metaphysics in the machinery. Yet, alas, there was a serpent in the garden. Frege was caught unaware, but by 1895 Cantor certainly knew of the inconsistencies in his system. Russell too knew of Cantor’s paradox and from it abstracted his own paradox at the dawn of the 20th century (as Moore describes in [Hin95]). The ship was rocked amidst a squall of related paradoxes, and we realized that we cannot trust our most basic of intuitions. “It is not for nothing, after all, that set theorists resort to the axiomatic method,” writes Quine. “Intuition here is bankrupt...”[Qui69](x) In time, Zermelo and von Neumann salvaged what they could from the wreck of Cantor’s system. Frege’s work is now archaic. The gates of Cantor’s paradise, of which we may still speak, but which can never regain its glory, closed.

Or so the old story goes. The thought has emerged and grown up in the literature that the old story is inaccurate. This culminates in a recent and influential book by Lavine [Lav94], who goes so far as to say of the old story that “not one word of it is true,” and mocks the Russell/Quine diagnosis that “we can never rely on our intuitions again.” Instead there is

3.2. A New Story. “Cantor’s original theory was neither naive nor subject to paradoxes.”[Lav94](3) His paradise is today as it always was, an unshaken ladder from zero to beyond. Cantor’s theory had no problems, and still does not. “The paradoxes posed no problem for Cantor’s theory of sets—transfinite objects that can be counted.”[Lav94](76) Cantor knew all about the so-called contradictions, having written to Dedekind well before Russell wrote to Frege, along with another short note to Philip Jourdain asserting that the collection of all sets cannot be a set [Hal84](286). Cantor’s followers, notably Hausdorff and, on this account, Zermelo, were unruffled by the so-called crisis in the foundations of mathematics, and with good reason: There was no crisis. Cantorian set theory was, is, and will remain good, classical mathematics, and therefore consistent and true. Lavine’s [Lav94] story of the set concept in precis is as follows. Cantor’s theory, until the 1890s, was of transfinite objects that can be counted (i.e. wellordered), called sets. Cantor did not assume the domain principle, which is a naive comprehension principle, but instead expected to be able to prove as an instance that the real numbers are a set (92). He expected to prove that the reals are a transfinite object that can be counted. Only late in his career did Cantor begin to see that the axiom might lead to the desired proof that the continuum is 2^ℵ_0. Only here did the troubles begin. Because “Cantor needed to allow for the existence of a set that he did not know how to introduce explicitly via a counting,”(95) now his “theory was in trouble, but it was not trouble caused by the paradoxes. It was trouble caused by trying to fit the power set axiom into a theory that took well-orderings to be primary.”(97) Ergo the Cantorians, with their consistent notion of wellorder, charged ahead: Zermelo’s 1908 axioms, often assumed to be in part a response to paradoxes, are just the necessary ingredients for proving the wellordering theorem, and a fortiori a demonstration that the wellordering principle is free of soft notions or theological metaphysics. Similarly, Felix Hausdorff, with wholesale disdain for foundational arguments, “the only productive set theorist who is so little irritated by the paradoxes” (according to Hessenberg) in the immediate wake of what he deemed an attack with “medieval weapons” [Hau05](182) published work on ordertypes and showed the first glimmers of large cardinal research. Though under the name Paul Mongré he wrote metaphysical poetry and politically critical novels and plays

[Czy94], (and in another sad bit of biography, discovered his high academic position would not save him and committed suicide along with his family when the Nazis conquered Germany), when writing as Hausdorff he approached the credo of Jones: In 1908,

An investigation such as this, which endeavors to add ... to the positive inventory of the still so new theory of sets in the spirit of its creator, cannot prae limine spend any time to enter into a discussion ... where presently a somewhat misplaced degree of ingenuity is squandered....1

And unchanged decades later: “In our opinion,” Hausdorff wrote again in 1937, “it does not detract from the merit of Cantor’s ideas that some antinomies that arise from allowing excessively limitless constructions of sets still await complete elucidation and removal.” [Hau57](11) A first tiller over the topology of Cantor’s earth, Hausdorff decried the intellectual efforts wasted on hand-wringing about paradox when these same energies could be put toward fruitful research—“the pursuit of structure for its own sake.”[Kan94](xxi) Wherefore, then, all the mythologizing about paradox? That, the new story proposes, was all part of a different narrative. Here we have not Cantor, but Frege, toiling away at an altogether different, baroque task of deducing arithmetic from logic. Frege’s logic was higher-order and his Basic Law V an unrestricted comprehension principle, so the confusion with set theory is forgivable, if regrettable. Russell’s discovery of the set {x : x ∉ x}, which after all he mailed not to Cantor but Frege, was devastating only to this latter logicist’s program. It was Russell who saw a contradiction in Burali-Forti’s paper; it was Russell who went before the London Mathematical Society to speak about the dire threat of the paradoxes [Rus05], and it was Russell again whose reconstruction in the Principia ultimately implicated anything Cantorian. In short, the paradoxes belong to the logicians—we might as well be honest and call them philosophers. Basic Law V conflates properties with sets. Isn’t this like conflating astrology with astronomy? Contradiction is inevitable. Thus, in a distinction running from Gödel through Wang, to Maddy and Shapiro and now Lavine, good collections are mathematical sets, and bad collections are metaphysical, ‘logical’ sets. Or so goes the new story.

1Hausdorff’s comments continue: “To an observer, who in the face of skepticism is not wanting in skepticism, the ‘finitistic’ objections against set theory might roughly fall into three categories: those that reveal a serious need for, perhaps an axiomatic, sharpening of the set concept; into another fall those that would affect the whole of mathematics together with set theory; finally, there are those that fall into the simple absurdities of a scholasticism that clings onto words and letters. One shall be able to come to an understanding with the first group sooner or later; one may safely let the second rest; the third deserves the sharpest and clearest disapproval.” [Hau05](198)

3.3. Which story to believe? Like all stories, both contain elements of reality and fancy. Michael Hallett leaves the issue complicated and unresolved. Dauben largely supports the old story; Moore does, too. Potter is skeptical about importing anything too modern into the past, observing that while some try “to make the history of the subject read more like an inevitable convergence on one true religion,” really the iterative idea was not clearly articulated until Gödel in 1944 or [Gö64], or even Boolos in 1971, and warns us to be wary of the Panglossian view that “the paradoxes are not really so paradoxical if we only think about them in the right way,” since this idea “is hard to find in print before Gödel.”[Pot04](26, 37). (Potter has probably overlooked Zermelo’s 1930 paper [Zer30], but this would still not impinge on the old story.) In the next sections we can go into some detail about the origins of the reconstructed set theory and its history. But I think that the old story and the new story both have something to teach, not least because of their telling; to show this I’ll utilize the familiar, helpful description/prescription distinction. The old story, for all its shortcomings as apocrypha, is roughly descriptive, and ends with a prescription: Here are paradoxes, and here is how we should avoid them. It states a problematic is and offers a programmatic ought. The new story, on the other hand, is prescriptive. It boasts finer distinctions within the reactions of the mathematical community, and suggests important insights about the causes of paradox. But the new story also claims to be descriptive: This is what we do, or else there would be paradoxes—but, of course, there are no paradoxes. The new story makes an is of an ought. The old story is nostalgic. The new story is myopic. At least, it betrays a characteristic of mathematical practise: a preference for perfection, for clearing away rough edges, leaving only the beautifully syncopated call-and-response of theorem, proof, theorem, proof. Example: The luminary C.F. Gauss published little, tersely and without supporting motivation; of him Abel said approvingly, “He is like the fox, who effaces his tracks in the sand with his tail.” Gauss himself said that “no self-respecting architect leaves the scaffolding in place after completing the building,” or again, “a cathedral is not a cathedral until all the scaffolding is removed.” Mathematics is the science of certainty; and I merely point out that what Lakatos has called the Euclidian tendency [Lak76](preface), and what Nietzsche in allusion to mummification called ‘Egyptianism’,2 is pervasive not only in mathematics, but mathematical history.

2“You ask me about the idiosyncrasies of philosophers? ... There is their lack of historical sense, their hatred of even the idea of becoming, their Egyptianism. They think they are doing a thing honor when they dehistoricize it, sub specie aeterni—when they make a mummy of it.” Twilight of the Idols, “‘Reason’ in Philosophy” I.

Who makes claim to history, it must be said, has little bearing on truth. The high tradition of medieval medicine ought attract no practitioners now. It is probably distracting and specious to trade blows over precedent. Nevertheless, for what it is worth, the old story is the one I have selectively and interpretively told, because I believe that some version of the old story is true—at least by the principle that the closer witness is more likely to be correct. The old story is found in Russell’s pre-Principia writings, Hilbert’s epochal 1925 speech “On the Infinite,” and is then largely propagated by a first hand participant, , in his own popular texts [Fra53], [AF58], and historical introduction to Bernays’ [Ber68]. There was a romantic period, there were surprising paradoxes, and in time, there was some consensus reached on a solution. If this is right then paraconsistency can offer a natural continuation of the process. But, as the remainder of this chapter must show, the current paradox-avoiding strategy was built into the , canonized, dogmatised; the messy, organic formation process was swept away, hidden like scaffolding to glorify the cathedral. What was born as ought is promoted to is, and the hermeneutic circle tightens with perfect and timeless mathematical precision. The iterative hierarchy makes claim to being descriptive, of illuminating the basic ontology of practice; it makes this claim because the claim is needed for legitimacy, or at least to stave off suspicions of ad hoc-ery. But this claim is false. The iterative hierarchy represents only a fragment of the mathematical universe. Current problems in our understanding of set theory are explained by seeing that the iterative reconstruction of set theory always was prescriptive. The point is made explicitly and overtly not least by Zermelo himself, who leaves no room for historical revision (although c.f. [Moo82]) at the outset of his 1908 axiomatization:

At present, the very existence of the discipline [of set theory] seems to be threatened by the existence of certain contradictions or ‘antinomies’ that can be derived from its principles—principles necessarily governing our thinking, it seems—and to which no entirely satisfactory solution has yet been found. ... Cantor’s original definition of a set (1895) ... requires some restriction; it has not however successfully been replaced by one that is just as simple and does not give rise to such reservations. [Zer67](200)

And von Neumann, at the close of his own sophisticated axiomatization in 1925, sighs that despite much work, still he must “entertain certain reservations about set theory” because “for the time being no way of rehabilitating this theory is known.”[vH67](413)

We now turn to support the claim that set theory is in ideological confusion, using our literalist principle to guide a closer look at the twentieth century technical responses to contradiction: the iterative concept of set, proper classes, and limit on size.

4. Hierarchy

What follows are discussions of the three main turrets of classical set theoretic ideology (iteration, proper classes, and limit on size) and some statements mathematicians make about them. A literal reading of these statements together betrays confusion: a collision of intuition and iteration.

4.1. Iteration. Mathematics displays remarkable invariance. In the first half of the twentieth century, disparate research independently converged on the image of an infinite hierarchy. This is a means to escape the paradoxes of naive set theory. One might even say that a certain attitude toward paradox, as a trigger for Dummettian ‘indefinite extensibility’, naturally leads to hierarchical structures. Priest identifies such structures as “latterday Kantian attempts to retain a certain control over conceptual production.”[Pri06b](48) Cast in terms of the inclosure schema, this view of paradox emphasizes the transcendence conditions, and denies closure. And this is the shortcoming of the iterative sets: not that they are not sets, or bad sets, but only that they are not all the sets. By denying closure, the iterative sets are neither enough sets, nor are they adequately explained.

Russell and Whitehead in Principia, Zermelo in 1930 (though emphatically not in 1908), and then Gödel in a popular lecture in 1947, all present what in retrospect may be called variants of iterative hierarchies. The idea came to the full popular attention of philosophers only in 1971, with George Boolos’ “The iterative conception of set” article, reprinted in [Boo98]. This immediately follows a main moment for set theory in the second half of the twentieth century, the axiomatic set theory conference of 1967 [Sco68], [Jec74]. Just a few years after Cohen’s results put the continuum hypothesis out of immediate reach, at that conference the community rallied and reached de facto consensus on the foundations of the discipline, fixing the tenets one now finds repeated in the literature.
“By all accounts this was one of those rare, highly exhilarating conferences that both summarized the progress and focused the energy of a new field opening up.”[Kan94](115) A leading voice in the area is Dana Scott, who in his seminal address there proved that a prior notion of ‘stages’ can be employed to derive the iterative notion of set. He declared:

Our original intuition of a set is based on the idea of having already fixed objects. The suggestion of considering all-inclusive collections only came in later....The suggestion proved to be unfortunate, and so we must return to the primary intuitions. [Jec74](207)

These intuitions are summed in a standard recursive scheme,

\[ V_0 = \emptyset, \]

\[ V_{\alpha+1} = \{x : x \subseteq V_\alpha\}, \]

\[ V_\gamma = \bigcup_{\beta<\gamma} V_\beta \quad \text{if } \gamma \text{ is a limit.} \]

Often, the closure condition is also included, with On the set of all ordinals:

\[ V = \bigcup_{\alpha \in \mathrm{On}} V_\alpha \]

This is an anomaly we return to in discussion of overlarge sets.

The iterative concept draws on finite intuitions and deeply implicates a constructive metaphor: An iterative set y can only be formed out of pre-existing objects. Therefore it cannot be that y ∈ y, and most of the antinomies stall. If the iterative hierarchy is born as an inductive structure, then the axiom of foundation acts as a maximality condition on its universe, asserting that the induction is exhaustive, asserting ‘and that’s all.’ It is worth noting that Zermelo in 1930 was ambivalent about ruling out non-well-founded sets, ceding only that the foundation axiom “introduces for the moment no essential restriction of the theory.”(in [vA86], emphasis added) In his highly influential paper, Boolos, having first admitted that the naive idea of sets “might occur to us quite naturally,” and indeed that it has “great force,” nevertheless modulates his rhetoric a few paragraphs later:

It is important to realize how odd the idea of something’s containing itself is. Of course a set can and must include itself (as a subset). But contain itself? Whatever tenuous hold on the concepts of set and member were given one by Cantor’s definitions is altogether lost if one is to suppose some sets are members of themselves. The idea is paradoxical not in the sense that it is contradictory, but that if one understands ‘∈’ as meaning ‘is a member of’ it is very, very peculiar to suppose it true. [Boo98](18)

Boolos is aware that this is a bald assertion. “There does not seem to be any argument,” he admits, “to persuade someone who really does not see the peculiarity of a set’s belonging to itself.” Assuming nevertheless that friends of self-membership are sufficiently shamed, Boolos carries on to explain that a set is something that lives on the cumulative hierarchy, and he proves it with a reworking of Scott’s stage construction (discussed below). At that 1967 conference, a perspicuous model of ZF out for viewing signalled an end to the embarrassment of antinomies. Reinhardt:

It has been pointed out that the paradoxes of Russell, Burali-Forti etc, never really caused a crisis in mathematics (where one deals only with unproblematic examples of sets) but rather in logic (and ), where one tries to provide a general and universal frame for mathematics and in particular for arbitrary sets. We now consider such a frame to have been provided for set theory by the clarification of the intuitive idea of the cumulative hierarchy.... [Jec74](190)

Reinhardt distinguishes between mathematics and set theory, accepting the sacrifice of a foundation for mathematics and ‘general’ set theory if a foundation can be provided for set theory simpliciter. This is a more subtle analysis, with modest aspiration. Indeed, part of the proposed repair of set theory involves recasting it as one branch of mathematics among many, and not a foundation at all. This is a theme to which we shall return, but it is worth noticing immediately then that such a retreat makes any explanatory work on the part of set theory impossible. The expected payoff is that all mathematical paradoxes evaporate if we only consider the iterative sets:

It should not be forgotten that the paradoxes never applied to any type structure, and in this sense they are not .[Dra74](14)

Gödel had already intimated such forceful thoughts in 1947: that the iterative conception, unlike “something obtained by dividing the totality of existing things into two categories, has never led to any antinomy whatsoever.”[Gö64] The humble distinction is easily elided for bolder claims. The hierarchical claim is not just that iterative sets are the only sets, but that iteration explains what sets are. They are offered as a foundation, because not only are all mathematical objects reckoned to be iterative sets; also this supplies a raw and primal set concept, and so the reduction to sets is illuminating, worthwhile and sound. This is Boolos’ explicit purpose, to show that

without prior knowledge or experience of sets, we can or do readily acquire the conception, easily understand it when it is explained to us, and find it plausible or at least conceivably true. [Boo98](89)

Given these are the aims, many problems emerge for the hierarchy, most of them fatal. The first is the foremost: The reduction fails because it presupposes naive set theory, including the notion of ordinality. Therefore the iterative concept is not the foundational concept and is not explanatory. It is viciously circular, as is now explained.

There is an obvious problem in explaining set theory in terms of iteration. As demonstrated by Scott, Boolos, van Aken [vA86], and most recently Potter in his informative textbook [Pot04], one can deduce the axioms of ZF from some ‘stage axioms’, postulates about levels of the hierarchy, or some equivalent ordering notion like ‘presupposes’. The level of technical detail in presentation varies, but the idea is simple enough. We imagine a first level, or stage, comprising all the previous stages, which is meant to be the empty stage. (Note that the inference, if there are no sets then all sets are in ∅, is just a form of explosion: p ⊢ ¬p ⊃ q. So the ex nihilo construction in this form is very classical.) At the first and each successive level we collect up and take the accumulation to be the next stage. Some postulates are stated to formalize notions like ‘earlier than’ and ‘is a stage,’ and the process is indexed to keep order. At the very least, then, stage axioms require:

- the existence of stages;
- one or more order relations;
- and an index set.

Usually, these are exactly the sorts of things developed from within set theory. The last, an index set, is the most egregious. As this is laid out in a contemporary text, Devlin notes that the description of the set theory depends on the construction

of the cumulative hierarchy of sets, the V_α hierarchy, and this, in turn, depends upon the ordinal number system. We are thus assuming a considerable amount of “set theory” in order to define our set theory.[Dev79](49)

This is an instance where taking mathematicians at their word is challenging; whatever the difference between “set theory” and set theory is, it is good indication that some foundational conceptual matters are in need of clarification. The unlikely insertion that “there is, of course, no real dilemma here” notwithstanding, a stage theory involving the first, second, nth stages, has overtly and persistently presupposed the ordinals. Priest and Routley:

The construction of sets presupposes a prior construction of ordinals. However this raises all sorts of problems about ‘how far’ the construction can be continued, about sizes of infinities, etc. Indeed it is just these kinds of problems that the theory of sets was supposed to solve. We do not deny that once one has a notion of set one can non-circularly produce ... the cumulative hierarchy. But to suppose one can use the notion of an ordinal to produce a non-question-begging definition of ‘set’ is moonshine.[PRN89](500)

When in Devlin we meet the iterative hierarchy, we are told

Before we can build sets of objects, we must have the set of objects out of which to build these sets. The crucial word here, of course, is ‘build’. Naturally we are not thinking of actually building sets in any sense, but our set theory should reflect this idea. [Dev79](43)

From passages like this, there is some question of what the iterative concept is supposed to mean. If nothing else, this is not at the level of clarity or rigor one expects. The scare quotes scattered through Devlin’s discussion indicate that the constructive metaphor can be eliminated. Rather than forming or building sets, which would become unfeasible even before the infinite stage, we can paraphrase with the statements: there is a first stage, there is an ω stage, ... (See Potter [Pot04](36-40) for discussion of construction metaphors, idealized constructors, etc.) This is now mathematically cogent, but it eliminates the naturalness or intuitiveness of the iterative view, in Boolos’ sense, and makes all the tortured building or presupposition talk pointless. Whatever naturalness the iterative concept has is lost when we fall back on bare existence claims. If the proper way of speaking, the phrasing without scare-quotes, is not constructive at all, why not a set containing itself? As Weyl aptly observes,

No one can describe an infinite set other than by indicating properties which are characteristic of the elements of the set. ... The notion that an infinite set is a ‘gathering’ brought together by infinitely many individual arbitrary acts of selection...is nonsensical. [Wey19](23)

Finally, the iterative story does not anyway justify all the ZF axioms. Boolos is dubious about power set; van Aken cannot justify replacement; and Lavine finds these and more wanting. Zermelo’s axioms are not, and were never meant to be, an axiomatic description of the iterative hierarchy. And so the iterative story is no replacement for, and is indeed parasitic on, the naive set concept. I take these objections against the iterative concept itself to be, in Routley’s idiom, “central, deep, and disabling.” Let us turn to some of the other features of the classical solution, some already familiar from the last chapter.
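As a concrete coda to this subsection, the finite levels of the recursive scheme for the stages can be computed directly. The following Python sketch is illustrative only and is not drawn from any of the stage theories cited: the function names `powerset` and `stages` are assumptions made here, and only finitely many levels are built. It does make one point of the circularity objection vivid, since the loop must be handed its index n in advance; the counting is supplied from outside the construction rather than produced by it.

```python
from itertools import combinations


def powerset(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


def stages(n):
    """Return the list [V_0, ..., V_n] for a natural number n given in advance."""
    levels = [frozenset()]  # V_0 is the empty set
    for _ in range(n):  # the index 0..n is presupposed here, not constructed
        levels.append(powerset(levels[-1]))  # V_{a+1} = {x : x is a subset of V_a}
    return levels


levels = stages(4)
sizes = [len(v) for v in levels]

# Each stage is included in the next (the hierarchy is cumulative).
cumulative = all(a <= b for a, b in zip(levels, levels[1:]))

# No member of any stage is a member of itself.
well_founded = all(x not in x for v in levels for x in v)

print(sizes, cumulative, well_founded)
```

For n = 4 the stages have sizes 0, 1, 2, 4, 16, each stage is included in the next, and no member of any stage is a member of itself. Nothing in the sketch reaches an infinite stage, let alone the union over On; that is exactly where the prior ordinals are needed.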

4.2. Proper Classes.

Sooner or later, in any course on set theory, the instructor finds it necessary to introduce the notion of a proper class. And this moment he usually dreads. For with virtual certainty, one of two things will occur. Either his students will suddenly gain the uneasy feeling that a sleight-of-hand method of introducing some very dubious notion is being employed. Or else the student will accept the new notion but fail totally to understand what exactly is being done. For us, this moment has come. [Dev79](57)

One way of introducing proper classes is as the dual of an urelement. For any y, if ¬(∃x)x ∈ y, then y is an individual; if ¬(∃x)y ∈ x then y is a proper class. Every set is a class, but proper classes are not sets. The key behavioral property of proper classes is not to be a member of anything. If proper classes can’t be members, then R ∉ R but no more, and no Russell’s paradox. What to say about Cantor’s paradox is less obvious (V is not a member of P(V), perhaps?) but the general issue is meant to be resolved because his theorem only applies to sets. One tradition in set theory, e.g. in Kelley’s appendix to [Kel55] or Takeuti and Zaring’s extremely precise [TZ71], is to have a predicate for ‘is a set,’ which does the work of tagging objects that do not seem to trigger contradictions. Proper classes are the repository for blame when an inconsistency arises. One could forgive a non-initiate for thinking that to call a class ‘proper’ is just to say that it is an inconsistent set.

The core argument I will make is this. Either proper classes are not real mathematical objects, in which case talk of them is problematic; or else proper classes are real, in which case they are really problematic. Weir’s “Proper classes are proper Charlies,” in his [Wei98](773), makes the same point rather bluntly.

Presenting proper classes as the dual of an atom at least has some symmetry and aesthetic sense. It is not the usual way. Proper classes are more often introduced via limit on size, that y is a proper class iff y is as big as V.
The most common presentation is as in Jech’s standard [Jec78], which blankly inserts proper classes as devices for maintaining the comprehension principle: “We do this for practical reasons: It is easier to manipulate classes than formulas.” The same holds of Levy’s, Drake’s, Devlin’s, and Kunen’s books, where classes are taken as a notational substitute for predicates; that is, where for every open sentence there is a class such that the comprehension principle applies. On this line, then, it is not exactly the comprehension principle that is to blame for paradoxes; rather it is us mistaking some comprehended classes for sets.
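The earlier remark that, once proper classes cannot be members, we get ‘R ∉ R but no more’ can be spelled out in a few lines. The derivation below is a sketch under assumptions not stated in this form in the texts cited: comprehension is taken to collect only sets, Set(x) is used here as an abbreviation for (∃z) x ∈ z, and the reasoning is classical.

```latex
\begin{align*}
R &:= \{x : \mathrm{Set}(x) \wedge x \notin x\}
  && \text{comprehension restricted to sets} \\
R \in R &\leftrightarrow \mathrm{Set}(R) \wedge R \notin R
  && \text{instantiating } x := R \\
R \in R &\rightarrow R \notin R
  && \text{left to right, second conjunct} \\
& R \notin R
  && \text{classical reductio} \\
& \neg\,\mathrm{Set}(R)
  && \text{otherwise right to left gives } R \in R
\end{align*}
```

On this reading the contradiction is converted into the verdict that R is a member of nothing, which is just the dual-of-an-urelement definition of a proper class given at the start of this subsection.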

Proper classes are, in a word, intensional. “. . . Sets are sharply delimited and determined by their extensions, whereas classes in the sense of properties have an essentially intensional quality.” [Kan94](313) They retain the intensional character of the naive set concept. And since they are taken as a paradox solvent, freeing sets to be perfectly extensional and consistent, that intensionality is considered something of a stain from the old, romantic :

If we call such things properties then it is clear that the paradoxes are a real problem to be dealt with before a thoroughgoing theory of properties can be developed. (The consensus of opinion seems to be that in fact it just cannot be developed, but some system of levels must be resorted to: that we cannot regard properties as being members of the universe to which they apply.)[Dra74](14)

The dangerous operation of “dividing the universe in half”, alluded to by Gödel, is the province of property theory, attribute theory: proper classes. Martin (in [Mad83]) gives the standard gloss:

...sets are generated by an iterative construction process. Classes are given all at once, by the properties that determine which objects are members of them. ...‘set’ is a mathematical concept and ‘class’ is a logical concept.

In this attitude, proper classes are volatile, and metaphysically heavy.

Given the insistence on iteration as the only legitimate picture of sets, it is important to recognize why proper classes are still a part of the classical apparatus. Despite persistent insistence that “the universe does not exist” (as Halmos’ introductory text [Hal74](7) announces for maximum shock value), to work in the discipline of set theory does require the cumulative hierarchy to be a finished domain. Proper classes offer closure. Proper classes are appealed to often, for both combinatorial and conceptual purposes. This goes to the main problem: Sets were supposed to be collections. Some collections are inconsistent.
Invoking proper classes is an attempt to save the sets, but only because sets are no longer the last word on collections. Nevertheless, we must have some workable notion of collection.

To cite a telling assortment of examples, key results like Scott’s proof that the existence of measurable large cardinals contradicts Gödel’s axiom that the universe is constructible, V = L, make apparently indispensable use of V itself [Kan94]. More generally, the big class On enjoys constant publicity, since any definition by recursion on the ordinals involves a function that explicitly takes On as its domain; and even if it is recursion only up to some rank α, where is α? Most tellingly, the set V is the gauge of truth; Shelah [She00] asserts that ‘true’ means ‘true in V’. To say that the structure ⟨V, ∈⟩ satisfies a set theoretic formula ϕ is to say that ϕ holds absolutely. Absoluteness is an important property for studying models of set theory, as in [Kun80]; but without V, there is no sensible way to articulate it.

Now, if proper classes do properly exist, as they do in von Neumann-Gödel-Bernays set theory NGB, then at least some of the psychological discomfort and disingenuousness is alleviated. But still a proper class y is a member of no z.

Thus although we can talk in NGB of the whole iterative hierarchy V_On, this class cannot itself have a model or belong to any interpretation function [Wei98]. So these classes are about as ineffable as any façon de parler. Maddy identifies and investigates this tension in her 1988 paper, which develops a putatively consistent theory of proper classes based on Kripke’s construction in An Outline of a Theory of Truth. Her core observation is that proper classes are integral to mathematical practice, yet seem to be neither mathematically nor metaphysically understood.

Much of the talk [of proper classes] is casual, in the sense that it can be translated away, but in some cases it seems that the translated version would never have been reached without the heuristic detour through proper classes... On the other hand, some of the talk is serious and much in need of foundational clarification. [Mad83](117)

She goes on to suggest that the notion of set needs explaining, and that by improving our understanding here, even problems like Benacerraf’s (which characterization of ordinals, Zermelo’s or von Neumann’s, is correct?) may be solved. This is true; if a stronger theory could pin down an intended model, we would have a firmer idea of the invariant properties of mathematical objects.

First Maddy suggests that the set/class distinction is like Aristotle’s separation of potential and actual infinity: The boundaries of the realm of sets are never completed, while proper classes make claim to being a finished, yet indefinite, space. Reinhardt takes up the same thought, suggesting that “Cantor shared with the ancient Greeks a disdain for the apeiron,” and that this is modelled in the set/class distinction. Maddy makes the distinction we noted in the comparison of histories, between sets that are essentially combinatorial, and classes as given by properties. It is this distinction we have been considering all along. There is room for terminological confusion here; for “it is not easy to say what makes a result ‘combinatorial’... It seems to be used for results which are more intrinsic to the sets themselves rather than relying on additional structure, and which in some way involve cardinality.” [Dra74](70) To these remarks, Kunen adds, “By infinitary combinatorics we mean the field that used to be called set theory before there were independence proofs.” [Kun80](47). Rather than fiddle with these terms, which are not terribly rigorous, we settle on the basic of extensional and intensional collections. Maddy’s suggestion keeps with her Quinean naturalism, echoing the policy of keeping the universe-dividing properties cloistered into proper classes, while the innocent extensions of sets are undisturbed.

The picture raises an immediate and irremediable problem. Suppose the iterative hierarchy is the last word on sets.
Where are the proper classes? While qualitatively distinct from sets, the proper classes still seem to be members of the mathematical universe (“much of the talk is serious”), so the question is not misplaced. As Reinhardt notes, “Our idea of sets comes from the cumulative hierarchy, so if you are going to add a layer at the top, it looks like you just forgot to finish the hierarchy....” In trying to answer his own worry Reinhardt offers:

A proper class P may however be distinguished from a set x in the following way (if the reader will indulge another counterfactual conditional): If there were more ordinals, x would have exactly the same members, whereas P has more to it than that. (We could say that P contains the famous three dots or ‘etc.’ of mathematics in an essential way.) [Rei74](196)

Whatever sense one can make of this, it goes no distance in answering our query of location, save perhaps an oblique suggestion that proper classes are ‘at the top’: “in class theory the ‘universe’ of sets is a completed whole (so one looks downward at the universe of sets from a high vantage point).”[Dev79](59) In a followup article developing a Kripke-construction of sets and classes further, Maddy sums the issue up completely. “‘Proper classes’ are either indistinguishable from extra layers of sets or mysterious entities in some perpetual, atemporal process of becoming.”[ST00](299)

Approaching from above, it becomes clear that there is no way to locate proper classes in the official mathematical domain, at least not in a way consonant with classicality. This is the reason that, despite the irresistibility of talking about collections as sets, most wish to continue regarding proper classes as meaningless abbreviations. But the purpose of proper classes is to provide closure, an ineliminable conceptual resource; and this brings us to a very serious problem for the prevailing theory.

The main theme of [Pri02a] is the sheer hopelessness of both outrunning paradox and reaching any kind of broad intelligibility; one can see these two demands as transcendence and closure respectively, of unstoppable forces colliding with immovable objects. Putnam, at one phase (in 1990, quoted in [Wei98](768)), makes the point succinctly:

The paradoxical aspect of Tarski’s theory, indeed of any hierarchical theory, is that one has to stand outside the whole hierarchy even to formulate the statement that the hierarchy exists. ... The paradoxes themselves are hardly more paradoxical than the solutions to which the logical community has been driven.

To support a hierarchy either heuristically or formally, one must violate the strictures of the hierarchy.

The phenomenon is witnessed by Russell’s self-refuting dictum: ‘No proposition may quantify over all propositions.’ In effect, the iterative conception of a mathematical foundation recapitulates Russell’s vicious circle principle, taking self-inclusion to be the culprit in contradiction. This is demonstrably a misdiagnosis, as is aptly shown by Aczel’s non-well-founded set theory, which is consistent but allows self-membered solutions to equations. And this is anyway a diversion. Asserting that closure is ultimately impossible is evidently self-defeating. Priest’s [Pri02a] is a book-length argument on the persistence of paradox over hierarchies.

Iteration fixates on transcendence and denies closure; the iterative view is congenitally unfit to take ultimate stock of itself, and so is incapable of providing a satisfactory basis for set theory. This is what proper classes are for: Iteratively, there is no way for set theory to express the existence of its own domain. Weir writes, “An infinite regress at which there is no explanation at all at level α unless the explanation at α + 1 works is indeed vicious, for the usual reason: explanation must come to an end.”[Wei98](780) To bring about conceptual closure would entreat paradox all over again. Any completing stage will feature the inconsistency found at the start. Ergo we have an incompleteness, and not a foundation.

Since inclosure is dilemmatic, a remaining possibility is to opt for outrunning contradiction and to abandon hopes of intelligibility: of our practice, of mathematics, of truth. This is to give up foundations as ineffable, as seems to be the outcome of the appendix to Kanamori [Kan94]. Claims about ineffability may be answered in kind.
Writing on the current state of foundational set theory, and reflecting on their own past work, Shapiro and Wright sum up neatly: “Invoking proper classes is an attempt to do the very thing we are intuitively barred from doing. ... Set is supposed to encompass the maximally general category of entities of the relevant kind.”[SW06](272) As Boolos says, “you can’t get out of this paradox merely by substituting one word for another.” Proper classes and the essential conceptual resources they provide show that the cumulative hierarchy is not a descriptive foundation for mathematics. And if, as Maddy thinks, the proper classes really are required, and if calling a set ‘proper’ is to label it with a warning of nascent contradiction, I again ask whether or not the proper classes do anything more than signal inconsistency.

4.3. Overlarge Sets. In the last chapter we met the doctrine of limit on size as a natural enough formalization of Cantor’s absolute: A collection is overlarge if it is the same size as the universe, V. As early as 1904, Russell had proposed a limitation on size as a solution to the antinomies. This idea, always presented with scare-quotes, states that some collections are ‘too big’ to be sets. Von Neumann agreed and offered the more precise criterion: Any collection containing as many objects as the universe is not a set.

The original and still forceful objection to a limit on size is: Any limit, without further explanation, is capricious and arbitrary. To paraphrase an insight of David Lewis, if there are constraints on a mathematical object, the mathematics must tell us what those constraints are. Upon proposing it, in fact, Russell admits: “A great difficulty of the theory is that it does not tell us how far up the series of ordinals it is legitimate to go.”[Rus05] The large cardinal work studied in my chapter 6 shows indeed that the extent of the ordinals is still an open classical question, that there is still a lot to learn about when and where inconsistencies in the ordinal line arise. The limit doctrine isn’t doing any work.

Even supposing the limit on size is clear, we encounter the same demand for resources that plagued the doctrine of proper classes. Regarding the sheer size of Cantor’s universe, we find direct ascriptions of plenitude that simply outstrip the iterative construction and show that comprehension, and with it overlarge classes, remains the unspoken assumption. The general Cantorian initiative is: Whatever further ordinals there could be, there are [Pot04](210). The mathematical universe is utterly maximal; the principle of plenitude is a closure condition.
Reinhardt: “Cantor through his doctrine of the Absolute had intended the universe of sets to comprehend all possibilities.” And this is correct. The domain principle is a clear restatement: Potentiality presupposes actuality. Such plenitude is appealed to in mathematical practice, as a justification for assumptions. For example, Gaifman explains that “All embeddings of the form j : V → M are assumed here to be definable in V. This is a natural assumption; since V is Cantor’s universe, everything should be there already.” [Jec74](39) Here is a stark commitment to the largeness of the universe, in particular the maxim that all the ordinals are already there. Tarski would observe that “those who share this attitude are always ready to accept new ‘construction principles’, new axioms securing the existence of new classes of ‘large’ cardinals ... but are not prepared to accept any axioms precluding the existence of such cardinals ....”

Perhaps the maxim is meant only heuristically, or as some kind of figure of speech. How might such overlarge thoughts be defined away, especially if the paraphrasing must be in terms of iteration? When offered statements of this kind by experts like Gaifman, the clearest course is to take them at their word, rather than attempt some radical translation. A literal reading of set theoretic intuitions requires a mathematical universe that outstrips the conceptual resources available from the cumulative hierarchy. The discussion of proper classes has already touched on this issue. Two further comments are in order.

First, from the naive viewpoint, limit on size makes V look, ironically, like a set. If there is no universe, then what, exactly, is its size? On the other hand, if V is a real object in the range of a one-one correspondence with some M, we need an explanation as to why V and M are not sets.
Otherwise these would be collections deemed largely unintelligible or somehow conceptually defective, because they are well-ordered and in one-one correspondence. If Lavine is right that sets are transfinite objects that can be counted, then they are sets. As I’ve alluded to, I think the cause of this strange doctrine is an attempt to articulate a limit of iteration or expression qua Priest. But without resituating into a dialethic context, this attempts to say something where, by consistent lights, nothing can be said. It is indicative of the is-ought, description/prescription confusion.

This makes way for the second comment germane to the doctrine. Consistency cannot be guaranteed by assertion of consistency; “one thing that is clear is that adding premises cannot possibly reduce threat.” [ABD92](503) To take an analogy, murder is a crime. It is illegal to commit murder. This does not, however, ensure that murder never takes place. The very fact of murder is what makes the laws urgent. Now, inconsistency and size are closely linked, as we will prove in part two. If limit on size is recommended as mere legislation, then so be it; one may try to regard it as a protective barrier, as something imposed by fiat. This is not the current attitude towards limit on size, which reverses the roles of prescription and description. The standing argument, rather, accepts the premises but draws wayward conclusions. With the murder analogy, consider: If the President murders people, then he has committed a crime; but the President is the President, and cannot commit crimes; therefore, by a limit on size style reductio, he cannot have murdered anyone (or even more baroquely, if he did, then the President does not exist). With respect to set theory, the ordering of premises is similarly tortured. We prove some set M to be V-sized. Then we are to reject that M has an intelligible size, or even that there is such an M.

Now, if this were a cogent method, then there is of course no worry that set theory is, was, or ever could be inconsistent. Any inconsistent element of the theory is rejected from the theory, ruled out a priori. But plainly, as with murder, one cannot simply eliminate a problem by defining it away. Having found a collection to be overlarge, there is no time to invoke limit on size; it is already too late. Once a class is recognized to be a proper class, this means that we have proved a set is inconsistent. Faith, and faith alone, intimates that it was not a set to begin with. Faith alone is propping up limit on size, which on a literal reading is the simple statement that set theory (albeit not necessarily ZF) is inconsistent.

The same problem was incipient in Cantor’s theory. A collection is a set when intensionally bound, and in this breath, as Russell recognized, all collections should be sets. In particular the collection of all sets is intensionally bound, by this very sentence, and so should itself be a set. Cantor flatly denies this with the name ‘absolute’. As we’ve seen, the absolute simultaneously is just the universe of sets and a prohibition against saying so. Cantor did not provide sufficient criteria for demarcating the difference between transfinite and absolute, and a fortiori provided motivating arguments that immolate the difference altogether. Exactly as much work in the service of consistency is achieved by the words ‘proper’ and ‘absolute’; when viewed without a classical logic they are names of dialethias. Cantor’s distinction between consistent and absolutely infinite multiplicities is not a solution to the problem of the quantitative boundary of consistency. The stigmatizing of overlarge sets is similarly but an illustration of the basic paradox of set theory.

5. Of Comprehension

The naive comprehension principle is an axiom in the old sense: simple and self-evident. My case comes down to the inalienable truth of comprehension, and the fact that an honest assessment of mathematical statements shows that it has not been alienated. In this last phase of the argument, we observe that in fact classical mathematics has not abandoned the naive view because, as the discussion so far has already suggested, it cannot.

Zermelo’s 1908 axioms met with “intense criticism,” not only over the axiom of choice but because of his decision to molest the comprehension principle. Schoenflies, Bernstein, and Poincaré all rejected this possibility out of hand [Moo82](111, 117). In his famous 1914 textbook, Hausdorff too expressed doubt about Zermelo’s system: “At present, these extremely ingenious investigations cannot be regarded as completed, and introducing a beginner by this [axiomatic] approach would cause great difficulties. Thus we wish to permit the use of naive set theory here...” [Hau57]

George Boolos, as we’ve seen, was an important spruiker for the iterative set concept as a philosophically satisfying replacement for the naive view. The work, if such work could be done, is necessary, because the naive view is otherwise the default position. Boolos’ discussion is very honest. On Frege’s fifth law (“it sounds utterly obvious, doesn’t it?”) and the Russell paradox, he writes

It should be noted that, although logic forces us to accept that there isn’t any such set, it’s highly paradoxical that there isn’t. ... Let me invite you to fix your attention on the sets that don’t contain themselves, the set of evens, the set of human beings, etc. Doesn’t there HAVE to be such a thing as the collection or totality of things you are thinking about, the sets that don’t contain themselves? And isn’t a collection or totality just the same thing as a set? How COULD there NOT be a set containing all and only the sets that don’t contain themselves? [Boo98](148)

And he goes on to chide the renaming of certain collections as sets and others as classes as a hopeless diversion.

The aporic situation is apparent in the opening pages of most set theory books. Devlin opens as many do with exposition of basic Cantorian theory. Endorsing (and crediting to Paul Halmos³) the epithet of ‘naive’ set theory, Devlin explains the discipline to be

a rigorous theory, based on a precise set of axioms. ... [But] the axioms can only be fully understood after the theory has been investigated sufficiently. This state of affairs is to be expected. The concept of ‘set of objects’ is a very intuitive one, and, with care, considerable, sound progress may be made on the basis of this intuition alone. Then, by analysing the nature of the ‘set’ concept, the axioms may be ‘discovered’ in a perfectly natural manner. [Dev79](1)

The set concept Devlin intends to use initially is naive comprehension.
“In set theory, there is really only one fundamental notion: The ability to regard any collection of objects as a single entity (i.e. a set).” On the basis of our method of taking people at their word, comprehension is still the guiding notion of set; the ZF axioms are supposed not even intelligible without naive acquaintance with sets. This is not a controversial claim. “Since the concept of set is so intuitive (is it?), one might expect the axioms to form a neat collection of about three statements,”

³ Who at the close of his charming autobiography I Want to Be a Mathematician credits to himself the abbreviation ‘iff’ and the little square ‘tombstone’ marker at the end of proofs.

Devlin writes. “In fact we shall obtain what can only be called a motley collection of some nine statements.” [Dev79](49) Cf. van Aken: “A survey of the [ZF] axioms does not suffice to reveal the source of their attraction.” [vA86](992)

The explanation from Takeuti and Zaring, in their dense and elegant text [TZ71], is paradigmatic of the way heightened precision also exposes the gape of a conceptual gap. They make respect for comprehension an adequacy condition. Russell’s paradox notwithstanding, “classes, or arbitrary collections, are however so useful and our intuitive feelings about classes are so strong that we dare not abandon them. A satisfactory theory of sets must provide a way of speaking safely about classes.” And again in the midst of technical development, “The idea of the collection of all objects having a specified property is so basic that we could hardly abandon it.” [TZ71](3, 9)

Scott, the principal architect of stage axioms, at some points in his influential 1967 symposium talk belittles the naive view, insisting on the myopic idea that intensional set collection was never a part of Cantorian practice. Such rhetoric is highly local, however. Speaking of the Zermelodic separation scheme, Scott admits that “it is a great temptation to erase the condition x ∈ z [from (∀z)(∃y)(∀x)(x ∈ y ↔ x ∈ z ∧ Φ(x))], thus simplifying the [axiom]; but we all know what happens.” The localization of the rhetoric against the ‘logical’ notion of set suggests it is not entirely sincere.

In general, and in parallel to the historical revisions, a ritual has evolved of bringing out the formal comprehension axiom and discrediting it. This officially done in public, set theory then reverts to what it always was, a study of comprehended collections. As Church bluntly states in [Jec74], “We assume all of naive set theory except the comprehension axiom, and then assume as many special cases of the comprehension axiom as we dare.” De jure, comprehension is in disrepute.
De facto, the full comprehension axiom is still in force. After all, “simply saying that we ought never to have expected any property whatever to be collectivizing, even if true, leaves us well short of an account which will settle which properties are...” [Pot04](27)

In light of what has been argued, the revisionism in the new history of set theory §3.2, with Cantor a forerunner of modern researchers, is telling. Lavine’s story is nuanced, detailed, thoughtful, and ends in exactly the contradiction it meant to avoid: the existence of collections that are obviously sets, but that outstrip the resources of any set concept save the full naive one. Lavine himself rebukes the iterative concept for his own reconstruction of Cantor’s theory; but the difference for us is immaterial. A theory taking e.g. wellorders as primary will be even less independent, presupposing even more naive set theory, than the iterative account. At whatever point an inclosure paradox is admitted, it is the same, inevitable paradox.

6. Conclusion

The paradoxes show that the naive view is inconsistent. The naive view is analytic. Persisting with classical logic, then, is an act of outright incoherence, in the formal sense of being trivialized by explosion. The twentieth century crisis was a conceptual battle of the need for and admission of explanatory closure, against the assumption of consistency. The broad situation is captured by Dummett.

The beginner can be persuaded that it makes sense, after all, to talk of the number of natural numbers. Once his initial prejudice is overcome, the next stage is to convince the beginner that there are distinct transfinite cardinal numbers.... When he has become accustomed to this idea, he is extremely likely to ask: ‘How many transfinite cardinals are there?’ How should he be answered? He is very likely to be answered by being told, ‘You must not ask that question.’ But why should he not? If it was, after all, all right to ask ... in the sense in which ‘number’ meant ‘finite cardinal’, how can it be wrong to ask the same question when ‘number’ means ‘finite or transfinite cardinal’? A mere prohibition leaves the matter a mystery. ... And merely to say, ‘If you persist in talking about the number of all cardinal numbers, you will run into a contradiction’, is to wield the big stick, but not to offer an explanation. [Dum91](315)

The bottom line of the explanation we get is this. There is of course no worry about paradox. Paradoxical objects don’t exist, or else they would be paradoxical.

Now, a friend of the mainstream theory could still think that all the apparently ‘bad’ talk, the appeals to various notions in scare quotes, can be paraphrased or defined or explained away, leaving a consistent concept in place. Enough care and thought have gone into isolating the consistent fragment of set theory, the thought goes, that it is now safe to show a bit of naïveté; the full consistent machinery can be invoked if any trouble arises. Kunen and Drake, for example, devote considerable energies to showing how class terms can always be reduced to basic, legitimate sets; Levy has a full appendix to prove what he calls the “conservation and eliminability theorems.” The starkest and most pervasive example of this is to have the wellordered class V on hand, but always to work at a suitably high rank V_α and in doing so avoid risk of antinomies. ‘Scott’s trick’ in his ultrapower construction is a good example of this idea being stretched to the limit; see [Kan94](47).

Thus, my literal reading of mathematical talk could be accused of being simple minded. When we use heuristics, metaphors, aren’t these just means of gaining traction on difficult and abstract ideas? In principle, this can all be replaced with utterly precise and clear propositions, in the same way that in principle a proof in mathematical English can be recast as a fragment of Principia Mathematica. Russell himself said in 1901 that “A book should have intelligibility or correctness; to combine the two is impossible.” In selecting statements from set theory texts, perhaps I am being unfair. This is a serious concern. The reply is that mathematics is serious, and we should not mistake appeasement for actual ideas.
To take an analogy, Kant’s Critique of Pure Reason states explicitly that noumena are that about which nothing can be said. He then writes in excess of 500 pages about the noumena. If one wished to defend Kant against charges of inconsistency, something like a use-mention distinction could be introduced and an algorithm offered to reduce any apparent talk about noumena to simple mention of ‘noumena’. Perhaps this could be done. Similarly for set theoretic foundations and the paradoxes of infinity, perhaps mathematicians, being extraordinarily clever people, can and have found technical devices to silence complaints from logicians and philosophers.

The question is whether this in-principle solution makes contact with the content of mathematical propositions. Is it really descriptive? The thought that large portions of set theory books are written in a kind of eliminable slang runs against the idea of taking mathematics seriously; no vital and dignified discipline is conducted in scare-quotes. Before Cantor, but after Newton and Leibniz, it was clear that talk about infinity was required, yet terribly murky. Gauss famously excused this cloudy exposition as a mere façon de parler. The transfinite shows Gauss’ evasion is unnecessary, and frees up analysis for honest practice. There may be workable translation algorithms, but they are in the service of prescribing safe behavior rather than describing mathematical structure. We now have an opportunity to trade in prescriptive dogma that has overstayed its need, for explanations pointing to a way ahead.

To conclude, let’s bring together the main themes of part one, the paradox of absolute infinity at the meeting of intension and extension. Cantor overtly and repeatedly explained that his extensional set theory is ultimately intensionally bound.
Similarly for Frege’s system, it is an extensional theory mortared by intensionality, a notion that, if not for contradiction, was never in question and even now remains. Zermelo and von Neumann found a neutral way to phrase Cantor’s doctrine, but, as we’ve seen, did so by neutralizing the meaning of their axioms, or more charitably, by allowing that there is no explanation for transfinite set theory. Plainly unhappy with making Russell’s quip about the vacuity of pure mathematics (as “the subject in which we never know what we are talking about, nor whether what we are saying is true”) a reality, a program was launched to replace the intensional universe with an extensional story about iterative set formation. We have seen that the outcome of this program is inadequate. All that I’ve said can be and has been repeated mutatis mutandis with respect to naive truth theory and Tarski’s hierarchy, and there the data is not sequestered to expert mathematics but extends to the working praxis of language. The approach we will now take up, that of dialethic paraconsistent set theory, is designed to overcome as many of these difficulties as possible.

Part Two—Foundation

“One of the most compelling motivations for the construction of paraconsistent logic,” said Newton da Costa in 1998, is “the possibilities it opens up in the foundations of set theory.” The ratio of programmatic statements about paraconsistent set theory, though, to the proven theorems of paraconsistent set theory, is imbalanced. A developed set theory based on full comprehension is long overdue, and is begun in this part. First I fix the logic for the theory, DLQ, and discuss some of its properties. Then, in the most important chapter, I work up from the axioms through the transfinite numbers, recapturing classical results such as Cantor’s theorem, and also deriving novel theorems, such as the ‘axioms’ of infinity and choice, Burali-Forti’s contradiction, and some indications of distinctive transcendence phenomena. Then, in chapter 6, the transcendence results prove the existence of all the major large cardinals, showing serious advances in proving power over competitors. All this shows that, to a large extent, infinitary mathematics is just a reflection of the paradox at the limit of ordinality, discussed in chapter 7. At the least, this part proves that a robust, independent paraconsistent set theory exists, and a fortiori that a lot of set theoretic mathematics can be had without need or assumption of consistency.

∗ ∗ ∗

Logistic is no longer barren. It engenders antinomies.
(Poincaré)


CHAPTER 4

On A Logic for Naive Set Theory

This chapter studies a logic appropriate for naive set theory, focusing mainly on the technical problems that arise for weak relevant logics in inconsistent settings.

1. Introduction

The goal of this work is to embed a viable naive set theory. Paraconsistency is therefore a requirement; but there are many systems of logic in which explosion fails. Which logic to use? Once committed to a robust abstraction principle like full comprehension or Tarski’s schema, the field narrows quickly. In fact, for reasons shortly to be given, my current suspicion is that there is (basically) only one going option, the DLQ of Routley and Meyer [RM76]. I will be calling it dialethic logic, in keeping with the change in nomenclature since 1976. This chapter is not an exhaustive survey of the many non-classical, non-explosive systems available (for this see [PRN89], [Pri01], [DR02], [Pri02b]), but rather scene-setting considerations and solutions to preliminary formal woes.

This chapter is heavily concerned with identity as it occurs in mathematical practice. I use a technical result on relevant restricted quantification in §5 to motivate a new approach to thinking about identity and naive set theory in general. The technical result shows that a relevant conditional cannot, despite a recent suggestion [BBH+06], be made to mimic the material conditional without lapsing into absurdity. The main subsequent claim is: Some sets, contrary to first impressions, are not coextensive. While it is standing wisdom that set theory can be either extensional, or have unrestricted set-formation principles, but not both, it is here redressed what is meant by extensional; and this done, the standing wisdom is undercut. The meaning of extensionality in a relevant naive set theory is different and novel. My point is a straightforward one: Identities are only ‘lost’ if we start from the assumption that they are had. We need not assume so; in the set theory to be considered below, they are not.
And a fortiori, to develop the internal mathematics of paraconsistent set theory, as a robustly independent and dignified discipline, we ought to take the paraconsistent view as independent and dignified.

Naive set theory in relevant logic is not extensional when evaluated from the standpoint of classical set theory. It is perfectly extensional when evaluated on its own terms. The methodological import of this is that metatheory and study of models of set theory with respect to naive set theory is best conducted in an appropriate non-classical logic. Some preliminary notes on that project are found at the end of this thesis. The criteria we bring to bear on a theory have radical implications for our attitudes toward and the status of that theory; crucially, if classical logic continues to be the logic in which we prove meta-theorems about non-classical systems, classicality will be the standard by which competing theories are judged. Trivially, if any other set theory diverges from classical set theory but is judged by the latter, then the former will always be a deviant. It is no longer au courant to regard non-classical logics as ‘deviant’, and it should not be so for set theory, either.

2. Logics of Formal Inconsistency

2.1. A Map of the Paraconsistent. Łukasiewicz opened the way to non-classical logic in its modern form in 1917 in Poland, shortly after the introduction of classical logic in its modern form. Paraconsistent formal logic appeared with the work of Jaśkowski, himself a student of Łukasiewicz, on non-adjunctive logics in 1947. Not long thereafter, Newton C. A. da Costa in Brazil began a direct investigation of paraconsistent logics [dC74], [dCKB04] and in 1976 F. Miró Quesada of Peru coined the neologism, paraconsistent. South America is prominent on the paraconsistent landscape, even if da Costa was not its sole creator as is sometimes claimed there; recently we have [Mar05], from whom the title of this section is borrowed.

As the programmatic leader in Brazil, da Costa early on stated four methodological principles (see [Urb89]):
- From Φ and ¬Φ should not follow Ψ for arbitrary Ψ;
- The fragment of classical logic that does not cause explosion should be maintained;
- The schema ¬(Φ ∧ ¬Φ) should not be derivable;
- The full system must be at least first order (with equality).
The second principle may sound simple enough, but working out just what fragment it alludes to is a substantial and controversial topic.

The third principle is a good indication of da Costa’s thinking. Other paraconsistentists [PR89b] see a difference between holding the law of non-contradiction as a normative principle, and having the formula ¬(Φ ∧ ¬Φ) as a theorem. The former needs to be overcome; the latter is, arguably, an analytic feature of negation any logic should capture. In da Costa’s systems the matter remains undifferentiated. Ultimately da Costa founds a hierarchy of logics, the C_n systems, 0 ≤ n ≤ ω, which includes an operator Φ^n, read as saying that Φ behaves consistently in C_{n−1}. Most often this is presented in C_1, where the classicality operator is Φ°, since C_0 is just classical logic. The opening set theory exercise showed C_1 allows for some natural arguments, but we cannot develop naive sets in da Costa’s systems, because its conditional is wrong: Arruda found early in her investigations that naive comprehension is too strong for the C systems, deriving the “unpleasant” result that a = b for every a and b; see [AB82], [Arr89](112). In particular the conditional in any C_n supports contraction and therefore falls afoul of Curry’s paradoxes, all of which is discussed in detail below.

A paraconsistent school in Belgium is thriving. In Ghent, Batens and his school study dynamic dialectical logics [PRN89](ch6). From Brussels come innovative models in the works of Hinnion [Hin94] and Libert [Lib04]. Hinnion’s work grew out of another alternative set theory, Quine’s NF set theory, to which he has contributed plentifully [For95]. Focusing heavily on topological models for the comprehension axiom, their approach pays equal attention to paracomplete (truth-gappy) theories, beginning from Gilmore’s work on partial set theory [Gil74].

The Australian school of paraconsistency is closely allied with the relevant logic program there [RPMB82], and is most attributable to Routley (later Sylvan), Meyer, Brady, and Priest. In 1976 Routley and Meyer questioned the consistency of the world [RM76], in a Marxist context. In 1979, Priest published his landmark [Pri79], the “Logic of Paradox.” This paper brought the burgeoning idea of dialethism to a wider audience than it had found previously, and paraconsistency in Australia is now often conflated with dialethism, if not with Priest himself.

2.2. A Point of Principle. As we go, there is a methodological question, which we will return to in the conclusion of this chapter. On the one hand some logic must be in place to guide steps in proofs. To know what is logically valid on the other hand is by most accounts to know what is valid in every model; models being set theoretic constructs, logic relies on knowing what sets there are. Roughly speaking, the number of logical truths is inversely proportional to the number of sets, since a smaller universe makes it more likely that a proposition is ‘universally’ valid. (See Jesús Mosterín’s article in [Wei94].) Determining either logic or set theory in complete isolation is an impossible task.

A proposed approach so far has been to bootstrap. Priest uses the collapsing lemma, that a consistent model can be extended by adding truth values to obtain an inconsistent model [Pri97]. The proof is by induction, a non-trivial mathematical tool. The collapsing lemma can then be applied to show that LP is just an extension of classical logic; but all this is done under the umbrella of classicism. And if we are concerned that the classical umbrella will not hold up against what we suspect is a real deluge of contradiction, then this whole line is in serious danger.

My approach is to treat the logic, at least initially, as a formalist: to give a Hilbert-style axiom list, composed of uninterpreted symbols which may be shuffled by a few given rules. Talk of truth, meaning, or interpretation will have to wait for the set theory. At the end of the chapter we can return to the question of classicality in the metatheory. Newton da Costa explains that “at least on philosophical grounds, it is needed to have a paraconsistent set theory already articulated if one intends to have a reasonable semantics for paraconsistent logic (given that the semantics shall be constructed within set theory).” [Mar05](xlii) The methodological maxim underlying this chapter is that, because semantics are presented in terms of set theory, there can be “no semantics for a paraconsistent logic without a paraconsistent set theory.”

2.3. Which Logic to Use? The two concerns in setting up a paraconsistent set theory are: the formulation of the non-logical assumptions and basic definitions; and the choice of the background logic itself. Systems have shown themselves to be remarkably sensitive to small changes in either. The chapter is not a survey of other attempts. Rather we focus in on a version that seems to work, by fixing the (sometimes competing) goals of finding a system
- strong enough to deliver real mathematical theorems, both classical and non-classical;
- weak enough to be non-trivial, else the enterprise is absurd;
- and as far as possible still respecting our intuitions about the meaning of core concepts, such as predication, set membership, identity, and order.
Achieving the first two goals is more or less a matter of technical balance, approachable by trial and error. The third is far more difficult, since intuitions seem to diverge or, more contentiously, are overdetermined. A rule of thumb is to avoid excessively prolix or abstruse requirements that would lead to the systems being inflexible and unattractive to users.

Priest’s work has enjoyed continuing importance in part because Priest ties his philosophical argument to a natural enough looking logic, LP. This “logic of paradox” is Kleene’s strong three valued logic K3, with the middle value designated, and the intended interpretation of that value to be overdeterminate rather than indeterminate. (A similar change of designation to K4 yields Dunn’s first degree entailment.) Because LP is truth-functional, the following initially alluring result follows: Every classical tautology is an LP tautology. The picture presented is one of a straightforward conservative extension of our reasoning apparatus, likened to the addition of irrational numbers to the rationals for the completion of a structure. A naive set theory in LP was briefly investigated by Restall in [Res92], with solid model-theoretic results such as non-triviality, as evinced by the failure of the foundation axiom. Non-trivial models for inconsistent arithmetic in LP have been completely characterized in [Pri97] and [Pri00a]. Often dialethic paraconsistency and LP are treated as interchangeable: LP is assumed to be the correct logic for a dialethist.

As a logic for set theory, though, LP has some insuperable difficulties. Preeminently, LP has no conditional; Φ ⊃ Ψ is defined as usual as ¬Φ ∨ Ψ, and so, because disjunctive syllogism must fail, is non-detachable:

Φ, Φ ⊃ Ψ ⊬_LP Ψ.
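The failure of detachment can be checked by brute force over the three LP truth values. The following is only a minimal sketch, assuming the usual numerical presentation of the LP tables (0 for false only, 0.5 for both, 1 for true only, with 0.5 and 1 designated); the names VALUES, hook, first_counterexample and lp_valid are hypothetical and introduced here for illustration.

```python
from itertools import product

# LP truth values: 0 = false only, 0.5 = both true and false, 1 = true only.
VALUES = (0, 0.5, 1)
DESIGNATED = {0.5, 1}  # the middle value counts as (at least) true


def neg(a):
    return 1 - a


def conj(a, b):
    return min(a, b)


def disj(a, b):
    return max(a, b)


def hook(a, b):
    """Material conditional: Φ ⊃ Ψ is defined as ¬Φ ∨ Ψ."""
    return disj(neg(a), b)


def first_counterexample(premises, conclusion):
    """Return the first valuation (p, q) that designates every premise but
    not the conclusion, or None if there is no such valuation."""
    for p, q in product(VALUES, repeat=2):
        if all(f(p, q) in DESIGNATED for f in premises):
            if conclusion(p, q) not in DESIGNATED:
                return (p, q)
    return None


def lp_valid(premises, conclusion):
    return first_counterexample(premises, conclusion) is None


# Detachment for the hook: Φ, Φ ⊃ Ψ, therefore Ψ.
detachment = ([lambda p, q: p, lambda p, q: hook(p, q)], lambda p, q: q)
# The matching tautology Φ ∧ (Φ ⊃ Ψ) ⊃ Ψ, checked with no premises.
internal_mp = ([], lambda p, q: hook(conj(p, hook(p, q)), q))
# Explosion: Φ, ¬Φ, therefore Ψ.
explosion = ([lambda p, q: p, lambda p, q: neg(p)], lambda p, q: q)

if __name__ == "__main__":
    print("detachment valid:", lp_valid(*detachment))
    print("counterexample:", first_counterexample(*detachment))
    print("internalized modus ponens valid:", lp_valid(*internal_mp))
    print("explosion valid:", lp_valid(*explosion))
```

On these tables the valuation that makes Φ both true and false and Ψ false only designates both premises of detachment without designating the conclusion, while the single formula that internalizes modus ponens is designated everywhere. The same valuation is what blocks explosion, which is why the two failures go together.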

Indeed, LP is just a de Morgan lattice with excluded middle, so it does recapture classical tautologies, e.g.

⊢_LP Φ ∧ (Φ ⊃ Ψ) ⊃ Ψ,

but exactly because it is too much like classical logic, these tautologies sit inert. No logic, let alone any serious mathematics, can proceed without a conditional that respects modus ponens. There are several more restrictions on the logic, discussed in §3 below. These considered, then amongst the well-understood paraconsistent logics, this leaves only the intensional relevant systems of the Australian school. Again, much of this work has been motivated by Anderson and Belnap’s relevance program initiated in [AB75], and carried on by Dunn, Slaney, and more recently, Mares [Mar04]. The best investigated relevant logic is R, but this is too strong for naive set theory by several orders, and as a home for arithmetic has been investigated into a cul-de-sac by Friedman and Meyer [FM92]. Mathematics cast in both a relevant and a dialethic light is the subject of Mortensen’s [Mor95].

When I began to explore naive set theory, my principle was pragmatic: Use the strongest logic possible without triviality. I did not fix on any one logic, but rather guessed and checked, employing many natural inference principles, seeing where they led, much in the spirit of the initial excursion of this thesis, chapter 0. They led, as it turned out, to several explosions, and one proof that a ≠ b for any a, b. Some real time was required to break from intuitions honed on classical logic. Aside from an absolute insistence on modus ponens, it became clear that some principles, especially negation principles, are indispensable for a study of collections. Particularly, it is important to keep contraposition,

(Φ → Ψ) → (¬Ψ → ¬Φ),

and a counterexample principle, at least in rule form:

Φ ∧ ¬Ψ ⊢ ¬(Φ → Ψ).

Contraposition is essential because the naive intuition is that set membership and predication are basically the same. Without contraposition, much of this conceptual link is severed; if n is not a number, it is not in the set of numbers. Socrates’ schema asks not only when an action is Φ, but also when it is not. Contraposition, though, demands adherence to a relevant logic in dialethic context; in particular, we cannot have weakening, Φ ⊢ Ψ → Φ. With weakening and contraposition, we could argue as follows, with Λ a true contradiction:

1  Λ          premise
2  ¬Ψ → Λ     1, weakening
3  ¬Λ → Ψ     2, contraposition (and double negation)
4  ¬Λ         premise
5  Ψ          3, 4, modus ponens

But Ψ is arbitrary. This step to relevant logic does not really increase our philosophical commitments, though, since in the presence of contraposition, the inference in question here is just Φ ⊢ ¬Φ → Ψ, a form of explosion.

Similarly, the counterexample principle is needed to respect the nature of collections, specifically the subset relation a ⊆ b. If x ∈ a but x ∉ b, it must be that a is not a part of b; else there would be no way of distinguishing sets.

Both these principles are about negation. Paraconsistent negation is a nuanced issue; Dunn, Priest, Restall, and many others, have written extensively on it. I am only stating my intuitions on the matter, combined with some experience trying to prove theorems. Negation, while it cannot be entirely boolean, nor intuitionistic (see section 5 below), still has most of its other properties: contraposition, introduction and elimination. With contraposition, the law of excluded middle actually just follows from axiomatic counterexample,

(Φ → Ψ) → ¬Φ ∨ Ψ,

with Φ for Ψ: the instance (Φ → Φ) → ¬Φ ∨ Φ, together with the theorem Φ → Φ, gives ¬Φ ∨ Φ by modus ponens. The claims about extensionality in this chapter hinge on a robust negation. The axiom of extensionality should tell us not only when sets are identical, but when sets are not identical. This is as much a part of the set concept as comprehension: to tell us, in Socrates’ idiom, which actions are just and also which actions are unjust. Some would jettison excluded middle (and eo ipso counterexample) to preserve consistency, e.g. Brady [Bra06]. For without excluding the middle, instances of comprehension like x ∈ R ↔ x ∉ x, the Russell condition, go no further.

But a great many (most?) proofs in e.g. the theory of ordinals make appeal to an exhaustive set theoretic universe. I can see no way at all to argue for theorems without it, and this leads me to suspect that exhaustion is a deep property of the set concept; my suspicions are borne out since Brady in [Bra06] postulates, rather than proves, all the desired facts about order, ordinals, and cardinals. From a perspective that does not assume that classical set theory is the antecedently correct set theory, the axiomatic postulation method makes little sense. Anyway, blocking the derivation of the set theoretic paradoxes, given a setting in which they will not lead to absurdity, deprives us of knowing what sort of novel information the paradoxes contain. It constitutes a decision, as it were, not to study the full naive theory; and this is not in the paraconsistent spirit. There is a widely noted duality between paraconsistent and paracomplete systems, and there has been a good deal of investigation into mathematics that is consistent but with gaps; at the very least, it is of some scientific interest to consider dual systems without gaps that are inconsistent.

The logic DLQ was first proposed by Routley and Meyer [RM76].
There it was explicitly a dialethic (at the time called dialectical) logic, with p ∧ ¬p included as an axiom, to ensure inconsistency (!). In his highly influential article [Rou77], where he argued for and sketched the beginnings of a naive set theoretic recapture, Routley used the logic DKQ, which is slightly weaker than DL but still has excluded middle, Φ ∨ ¬Φ. Routley called the theory obtained from joining DKQ with the axioms of comprehension and extensionality DST, dialectical set theory. Brady [Bra89], [Bra06] proves that DST has a model and is non-trivial. In the last decades Brady weakened the logic further, dropping excluded middle to obtain DJQ, which leads to a consistent set theory.

A weak member of the relevance family, DLQ nevertheless proves more appropriate than Cω or LP. On the one hand, every classical tautology (since they involve only ∧, ∨, and ¬) is a theorem of DLQ [Rea88](71), just like LP. But the logic is not cast in a classical philosophical paradigm. As it turns out, dropping ex falso quodlibet is a tall order; the majority has not been wrong to think that classical logic is deeply inured in, if not based upon, consistency. Abandoning consistency requires a qualitatively distinctive approach to formal reasoning, and this is the raison d’être of DL. A drawback for DLQ is that there is as yet no non-triviality proof for it. Brady’s construction is not easily adapted to encompass DLQ [Bra06](244).

3. A Paraconsistent Logic

Let us have the language of first order set theory: with primitives ∧, ¬, →, ∀, =, ∈, and {· : −}; variables x, y, z, ...; names a, b, c, ...; and formulae Φ, Ψ, Υ, ..., built up by standard formation rules. The usual shorthand is used: Φ ∨ Ψ for ¬(¬Φ ∧ ¬Ψ); Φ ↔ Ψ for (Φ → Ψ) ∧ (Ψ → Φ); ∃ is ¬∀¬. All instances of the following schemata are theorems:

Φ → Φ,
Φ ∧ Ψ → Φ,
Φ ∧ Ψ → Ψ,
Φ ∧ (Ψ ∨ Υ) → (Φ ∧ Ψ) ∨ (Φ ∧ Υ),
(Φ → Ψ) ∧ (Ψ → Υ) → (Φ → Υ) (conjunctive syllogism),
(Φ → Ψ) ∧ (Φ → Υ) → (Φ → Ψ ∧ Υ),
(Φ → ¬Ψ) → (Ψ → ¬Φ),
¬¬Ψ → Ψ,
(Φ → Ψ) → ¬(Φ ∧ ¬Ψ) (counterexample),
(∀x)Φ → Φ(a/x),
(∀x)(Φ → Ψ) → (Φ → (∀x)Ψ),
(∀x)(Φ ∨ Ψ) → Φ ∨ (∀x)Ψ.

The last two axioms have the caveat that x does not appear free in Φ. For good measure, there is no danger [Bra06](200) in adding the pair,

(Φ → Ψ) → [(Ψ → Υ) → (Φ → Υ)],
(Φ → Ψ) → [(Υ → Φ) → (Υ → Ψ)],

suffixing and prefixing, respectively, though this takes us beyond DLQ, close to what is called TLQ in [Bra06](242). The following rules are valid:

Φ, Ψ ⊢ Φ ∧ Ψ (adjunction)
Φ, Φ → Ψ ⊢ Ψ (modus ponens)
Φ → Ψ, Υ → ∆ ⊢ (Ψ → Υ) → (Φ → ∆)
Φ ⊢ (∀x)Φ

Including adjunction is in contrast to the earliest paraconsistent systems of Jaśkowski. Contraposed, counterexample shows the reason for its name:

Φ ∧ ¬Ψ → ¬(Φ → Ψ).

This preserves essential features about ∈ and =, namely, that sets satisfying different predicates are members of different sets, and that sets different with respect to membership are different. This is captured in a simple theorem (ch5, thm5.7), that sets differing with respect to membership are not identical. In particular, the Russell set R = {x : x ∉ x} exists and R ≠ R. The well-known inconsistency of the Russell set thus has a fairly natural characterization in naive set theory: R is not identical to itself.

As is suggested by the brevity of the above list, very many common inferences are not appropriate for the study of inconsistent structures. We review notable absences and their attendant proofs.
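As a rough sanity check on the propositional schemata listed above, the following sketch evaluates them in the three-valued matrix RM3. This matrix is an assumption brought in purely for illustration; it is not the semantics of DLQ, and validity in RM3 does not establish theoremhood in DLQ. The names `imp`, `valid`, `AXIOMS`, `WEAKENING`, `EXPLOSION` and the numeric encoding of truth values are hypothetical. What the check suggests is that the listed schemata hold together in at least one small paraconsistent structure in which weakening and the axiom form of explosion both fail.

```python
from itertools import product

# RM3 truth values: -1 = false only, 0 = both true and false, 1 = true only.
VALUES = (-1, 0, 1)
DESIGNATED = {0, 1}  # values that count as "holding"


def neg(a):
    return -a


def conj(a, b):
    return min(a, b)


def disj(a, b):
    return max(a, b)


def imp(a, b):
    # RM3 (Sugihara) conditional; its value is designated exactly when a <= b.
    return max(-a, b) if a <= b else min(-a, b)


def valid(schema):
    """Return True if the schema is designated under every assignment."""
    arity = schema.__code__.co_argcount
    return all(schema(*vs) in DESIGNATED
               for vs in product(VALUES, repeat=arity))


# Propositional schemata from the list above (quantifier axioms are omitted).
AXIOMS = {
    "identity": lambda p: imp(p, p),
    "simplification (left)": lambda p, q: imp(conj(p, q), p),
    "simplification (right)": lambda p, q: imp(conj(p, q), q),
    "distribution": lambda p, q, r: imp(conj(p, disj(q, r)),
                                        disj(conj(p, q), conj(p, r))),
    "conjunctive syllogism": lambda p, q, r: imp(conj(imp(p, q), imp(q, r)),
                                                 imp(p, r)),
    "conjunction in consequent": lambda p, q, r: imp(conj(imp(p, q), imp(p, r)),
                                                     imp(p, conj(q, r))),
    "contraposition": lambda p, q: imp(imp(p, neg(q)), imp(q, neg(p))),
    "double negation": lambda q: imp(neg(neg(q)), q),
    "counterexample": lambda p, q: imp(imp(p, q), neg(conj(p, neg(q)))),
}

# Schemata the text rejects.
WEAKENING = lambda p, q: imp(p, imp(q, p))
EXPLOSION = lambda p, q: imp(conj(p, neg(p)), q)

if __name__ == "__main__":
    for name, schema in AXIOMS.items():
        print(f"{name}: {valid(schema)}")
    print(f"weakening: {valid(WEAKENING)}")
    print(f"explosion (axiom form): {valid(EXPLOSION)}")
```

Weakening fails in this matrix when Φ is both true and false and Ψ is true only; explosion fails when Φ is both true and false and Ψ is false only.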

Central in the relevance debate of the last decades is disjunctive syllogism,

Φ, ¬Φ ∨ Ψ ⊢ Ψ.

This is due to C.I. Lewis’ famous argument in [LL59](250), showing that disjunctive syllogism, or material detachment, is explosive. Suppose both Φ and ¬Φ. From Φ follows Φ ∨ Ψ. But given also ¬Φ, Ψ follows by material detachment, where Ψ is arbitrary. The other major casualty is contraction,

Φ → (Φ → Ψ) ⊢ Φ → Ψ

due to Curry [Cur42]. By comprehension, consider the set

x ∈ C ↔ (x ∈ x → Ψ)

which we will call Curry’s set. Then

1 C ∈ C ↔ (C ∈ C → Ψ) comprehension
2 C ∈ C → Ψ 1, contraction
3 C ∈ C 1, 2, modus ponens
4 Ψ 2, 3, modus ponens

where Ψ is arbitrary. Closely related to Curry’s problem is axiom (or pseudo) modus ponens

Φ ∧ (Φ → Ψ) → Ψ

as found in [MRD78]. Using Curry’s set, if pseudo-modus ponens were a valid scheme then we could instantiate thus:

C ∈ C ∧ (C ∈ C → Ψ) → Ψ

But C ∈ C is equivalent to C ∈ C → Ψ, so substituting we have

C ∈ C ∧ C ∈ C → Ψ,

which reduces to C ∈ C → Ψ. Again, though, this is equivalent to C ∈ C; so by modus ponens, Ψ, where Ψ is arbitrary.

Upon this discovery, a venerable vanguard of the relevance/naive set tradition expressed despair; the loss of modus ponens in axiom form was shocking enough to drive the tough-skinned authors of [MRD78] to write:

Our examination of Curry’s paradox is discouraging in the extreme for the hopeful naive set theorist.... We were willing to give up the usual aversion to contradiction. We faced with equanimity the sacrifice of the deduction theorem. To continue with the project, a minimal, decent respect even for modus ponens must be given up as well. ... A naive set theory is untenable.

However, history showed this to be overhasty. Routley conjectured that the rule form of modus ponens could be salvaged; and indeed Brady proved that this rule is not explosive even under unrestricted comprehension. The present approach is of a kind with Routley’s: Grim pronouncements about extensionality are unnecessarily pessimistic, because they are unnecessarily steeped in classical thinking. As Priest counters,

this way of putting it prejudices the issue by claiming that the modus ponens axiom is something we have ‘got’ which we will have to lose if we want to adopt naive set theory. Possession, as everyone knows, is nine tenths of the logical law. ... We cannot hold both the abstraction scheme and the modus ponens axiom. However, both are candidates for rejection. It is not true to say that we already ‘have’ one which we have to give up. We have both in exactly the same sense, i.e. we appear to start out with a belief in both. [Pri80](430)

Classically, axiom and rule modus ponens are indistinguishable; the finer lens of paraconsistency detects important variations. What is persistently needed is to see underlying connections: to see that there are only a few major surprises, manifesting in different guises. For example, the loss of material detachment when blocking explosion is very surprising. But it can be explained, by putting the premise (Φ ∨ Ψ) ∧ ¬Φ into an equivalent form, (Φ ∧ ¬Φ) ∨ (¬Φ ∧ Ψ). From this form, one ought to be able to argue by cases to Ψ. Now, from ¬Φ ∧ Ψ follows Ψ immediately; but Φ ∧ ¬Φ only yields Ψ by explosion [Sha05](742). This shows that disjunctive syllogism is not a proof for but a restatement of explosion. We have rejected explosion; not as a further consequence but eo ipso we reject disjunctive syllogism. Similarly, we have rejected contraction, and therefore pseudo-modus ponens too.

The extensional picture for naive sets is similar. A small class of novel phenomena trail paraconsistent identity, inconsistency most saliently, but no more than these. What appears to be a problem with extensionality is in fact a basic fact about inconsistency, and if the latter is dealt with carefully, there will be no more cause for concern about the former.

There is also a trouble with permutation,

Φ → (Ψ → Υ) ⊢ Ψ → (Φ → Υ)

due to Slaney’s argument in [Sla89], concerning the related logic RWX. In fact the argument shows that excluding the middle and permutation are not jointly tenable. The cause is, again, a close relative of Curry’s paradox. Let us have a constant f that obeys the two-way rule

¬Φ ⊣⊢ Φ → f.

Slaney shows that (Φ → ¬Φ) → ¬Φ. Expanding this out, we have an instance of contraction:

[Φ → (Φ → f)] → (Φ → f)

Then, using permutation, Slaney argues to triviality [Sla89](477). In any case, no one has ever produced a non-triviality result on naive set theories with permutation. Slaney’s argument, aside from seriously weakening an already emaciated apparatus, calls attention again to the difference between negation, or denial, and rejection.

Looking ahead to possibilities in paraconsistent metamathematics, the elimination of contraction also blocks Löb’s theorem, which uses a provability predicate Prov and ⊢ as a sign of theoremhood:

If ⊢ Prov(⌜Ψ⌝) → Ψ, then ⊢ Ψ.

Löb’s is itself a sinister version of Gödel’s theorem. Since the diagonal lemma and other topics from recursion theory are involved, this point is not as elementary as the others being rehearsed in this section, and is for now omitted, but it is important to see some of the wider avenues opened up by moving to weaker logics, namely a regaining of expressive power and thwarting of limitive theorems. Let Prov be a provability predicate and ⌜·⌝ be a name-forming operator. We assume that theorems are provable and that provability distributes over implication. Löb’s theorem states that if Prov(⌜Ψ⌝) → Ψ is a theorem, then so is Ψ. Depending on formulation, the argument uses either contraction or permutation, or both. See [BJ89](chapter 15).

In a paraconsistent logic, ¬-contradictions are tolerated. A recurring villain of the piece, however, is absurdity, which serves much the same purposes that inconsistency serves for classical mathematics: fixing boundaries of what is possible. Take ⊥ to be the formula (∀x)(∀y)x ∈ y. This is an absurdity, as Slaney [Sla89](476) puts it, “that than which nothing sillier can be conceived.” To show ⊥ is so aptly described it suffices to observe that

⊥ → (∀y)a ∈ y → a ∈ {z : Φ} → Φ.

So ⊥ → Φ for every formula Φ. A ⊥ formula allows a kind of negation, Φ → ⊥, what Curry called absurdity negation. As Slaney shows, this negation is not paraconsistent even in relevant logics close to the one we are using, because one can obtain an instance of contraction. More generally, any contradiction with this negation explodes by modus ponens:

Φ, Φ → ⊥ ⊢ ⊥

To repeat the analogy with contradiction in classical logic, the appearance of ⊥ is a clear signal that something is awry; the key, and the key to preserving extensionality, is isolating just what it is that is to be rejected.
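The casualties reviewed in this section can be spot-checked at the level of rules in a small model. The sketch below uses the three-valued matrix RM3 as a stand-in; this is an assumption made for illustration and not the semantics of DLQ, and the names `counterexample`, `RULES` and the numeric encoding are hypothetical. A rule is taken to hold when no assignment makes every premise designated while leaving the conclusion undesignated.

```python
from itertools import product

# RM3 truth values: -1 = false only, 0 = both true and false, 1 = true only.
VALUES = (-1, 0, 1)


def designated(a):
    return a >= 0


def neg(a):
    return -a


def disj(a, b):
    return max(a, b)


def imp(a, b):
    # RM3 (Sugihara) conditional; its value is designated exactly when a <= b.
    return max(-a, b) if a <= b else min(-a, b)


def counterexample(premises, conclusion):
    """Return the first assignment (p, q) with all premises designated and
    the conclusion undesignated, or None if the rule holds in the matrix."""
    for p, q in product(VALUES, repeat=2):
        if all(designated(f(p, q)) for f in premises) and \
                not designated(conclusion(p, q)):
            return (p, q)
    return None


# Each rule is a pair: (list of premises, conclusion), all in variables p, q.
RULES = {
    "modus ponens": ([lambda p, q: p, lambda p, q: imp(p, q)],
                     lambda p, q: q),
    "disjunctive syllogism": ([lambda p, q: p, lambda p, q: disj(neg(p), q)],
                              lambda p, q: q),
    "explosion": ([lambda p, q: p, lambda p, q: neg(p)],
                  lambda p, q: q),
    "contraction": ([lambda p, q: imp(p, imp(p, q))],
                    lambda p, q: imp(p, q)),
}

if __name__ == "__main__":
    for name, (premises, conclusion) in RULES.items():
        print(f"{name}: counterexample = {counterexample(premises, conclusion)}")
```

In this matrix disjunctive syllogism and explosion fail on the same assignment (Φ both true and false, Ψ false only), which fits the remark that the one is a restatement of the other. Contraction, by contrast, holds here, so a matrix of this kind cannot exhibit Curry’s problem; that only arises once comprehension supplies Curry’s set.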

4. Relevant Identity

The language of set theory is distinctive for its connectives ∈ and =. Next chapter we will study membership. More general considerations go into identity, which is the focus now.

4.1. Extensionality. A good deal of concern has been expressed about if or how a set theory with full comprehension can be extensional. A fully comprehensive extensionality means:
- every predicate determines a set;
- identity is completely determined by membership; and
- identical objects are intersubstitutable.
Put directly, though, there is sometimes joint incompatibility between a = b, Φ(a), and Φ(b), for some sentences Φ. The motivating cases are, prima facie, problems with the intersubstitution of sets a, b when they seem to have the same members. This prompts two questions: When is it appropriate to say that a has the same members as b, and when can a be intersubstituted with b? To take an example, here are two sets:

U = {x : x = x}, V = {x : (∃y)x ∈ y}.

(The appeal of a naive theory is that these really are sets, not proper classes or some other contrivance.) Classically, both are universal, in that they have all objects as members, (∀x)x ∈ U and (∀x)x ∈ V. But absent a material conditional, there is no way to prove that V is a subset of U. In the logics we will be considering, in fact, the sets are distinct, U ≠ V. And further, were it the case that U and V satisfy all the same sentences, we could prove everything: triviality, as below (§ 5). This is not paraconsistent. Thus, as has been widely observed at least since White [Whi79], but seemingly also earlier in Chang [Cha63], some extensional properties of sets are ‘lost’ when one moves to a . The point is also independently made by Brady [Bra71] and Gilmore [Gil74]. Working on set theory in a logic without contraction,

Uwe Petersen writes that “...the failure of weak extensionality can be sharpened to the point of destroying all illusions about possibilities of finding a realm of objects which is ‘well behaved’...” [Pet00](384). Libert in recent work declares that “indeed, it is hopeless to combine extensionality and abstraction [terms] with the equality relation in formulae defining sets” [Lib04](31).

As stated in the introduction, I think there is a way to understand naive sets such that they are still comprehensively extensional. There are certainly issues to face, such as the co-extensionality of sets V and U, but it is worth remembering that classically, neither V nor U is even a set at all. How should we approach identity, both in terms of its definition and the properties we expect of it? Inalienable properties are clustered around the needs of substitution and equivalence relations; we will consider these in turn. Then we will examine a few options open in the literature, which advocate sacrificing some expressive resources in order to be as liberal with identity as possible. What forms of identity get enough set theory right, while not proving everything?

4.2. Equivalence. Mares [Mar92] discusses various formulations of identity in a neutral setting, and then Kremer [Kre99] goes farther, asking not just which formulations of identity are possible, but “which axioms to accept, or at least ... the philosophical consequences of accepting one set of axioms rather than another.” (200) Any candidate for identity at least ought to be reflexive, symmetric and transitive. The principle of extensionality, given any relevant logic as strong as B (which DL contains), immediately delivers the properties:

x = x, x = y → y = x.

Transitivity has two non-equivalent formulations:

x = y ∧ y = z → x = z,
x = y → (y = z → x = z),

which we will follow Kremer in calling conjoined transitivity and nested transitivity, respectively. Logics that include conjunctive syllogism validate the conjunctive form. Logics that include the hypothetical syllogism pair of prefixing and suffixing validate nested transitivity. Either form of transitivity, then, will be validated by the extensional definition of =. On the other hand, conjunctive and nested forms cannot be equivalent; or at least, both directions of

Φ ∧ Ψ → Υ ⊣⊢ Φ → (Ψ → Υ)

need to fail. Otherwise, from left to right, we can have Φ ∧ Ψ → Φ lead to Φ → (Ψ → Φ), which is weakening; or from right to left, (Φ → Ψ) → (Φ → Ψ), which is an instance of an axiom, yields Φ ∧ (Φ → Ψ) → Ψ, which is pseudo-modus ponens.

4.3. Substitution. Identical sets should be intersubstitutable in a fairly natural way. If x = y, then x and y satisfy all the same formulae. In the context of naive set theory’s comprehension principle, this is equivalent to saying that x and y are members of all the same sets. In [BR89], the sign ‘=’ is simply defined in terms of membership coincidence: (∀z)(z ∈ x ↔ z ∈ y); Routley and Brady use the name ‘extensionality’ for what is more commonly called Leibniz’s law or the substitution of identicals. As discussed at least since [Rou77], though, there are different and definitely non-equivalent forms of substitution to consider.

Modeling predication in relevant logic in general is studied by Dunn [Dun87], to distinguish those properties an object has essentially rather than accidentally. In that study identity is integral: Any a is said to be relevantly Φ iff (∀x)(x = a → Φ(x)). Dunn’s analysis makes substitution, therefore, a core part of the meaning of genuine predication; and when substitution is governed by →, it is also a non-trivial operation. The most obvious formulation of a substitution principle,

(1) x = y → (Φ(x) → Φ(y)),

is shown [Dun87] to be unpalatable on relevance grounds. For one can quickly derive the statement x = y → (p → p) for some (irrelevant) proposition p. Similarly,

(2) Φ(x) → (x = y → Φ(y))

has as an instance p → (x = y → p) for arbitrary p. Even for those not primarily concerned with relevance, both of these schemes are basically untenable in a naive set theory. Since we expect the theory to be inconsistent, we reason as follows. Let p, ¬p be theorems. Choose 0 for both x and y. Contraposing principle 1, we get ¬(p → p) → 0 ≠ 0. But by counterexample, p ∧ ¬p → ¬(p → p), so two applications of modus ponens yield 0 ≠ 0. Similarly, version 2 implies p → (¬p → 0 ≠ 0). Two applications of modus ponens again reduce to 0 ≠ 0. This is an instance of a more general fact, that given any contradiction, either of principles 1 or 2 leads to ∀x(x ≠ x).

Now, because we are in a paraconsistent setting, it is not obvious whether n ≠ n for all numbers n being provable is grounds for rejecting the substitution principles involved. Models of relevant arithmetic due to Meyer and Mortensen in fact allow just this (that every number be non-self-identical) without triviality

[Mor95]. Trivial or no, though, since n = n is a theorem for every n, global non-self-identity does introduce a lot more inconsistency than perhaps even a steely-nerved paraconsistent set theorist was expecting; and there does not seem to be any philosophical reason to expect or explain inconsistency in the vicinity of 0 (unlike, say, inconsistency in some transfinite numbers [Can67](in [vH67]), or even large finite numbers [Pri94]). What counts as unacceptable short of full triviality is a delicate issue [Pri89], [Pri06a], [Ber08]. At this point, if we can isolate substitution principles that do not render all small finite numbers inconsistent, that would be worth knowing. For the duration, principles 1 and 2 are no longer under consideration.

This leaves a few options. A conjoined formulation of substitution is

(3) x = y ∧ Φ(x) → Φ(y),

but is very close to an unacceptable form of Curry’s paradox, Φ ∧ (Φ ↔ Ψ) → Ψ. This leaves the principle

(4) x = y ⊢ Φ(x) → Φ(y),

which has the deductively equivalent formulation

x = y ⊢ (∀w)(x ∈ w → y ∈ w).

Both (4) and its deductive equivalent follow from the related formulation

x = y ∧ w = w → x ∈ w → y ∈ w.

All of these are effectively non-contraposable, since given Φ(x), ¬Φ(y) there is no inference to x ≠ y. Principle 4 is the one to endorse, as with [Rou77], [Bra89]. A consequence of the principle is

x = y, Φ(x) ⊢ Φ(y),

which is like the (acceptable) rule form of modus ponens.

4.4. Expressive Restrictions. A widespread response to issues around extensional identity is to restrict formulae defining sets in various ways. In [BR89] the non-triviality of what is there called extensional dialectical set theory is proved, where ‘extensional’ highlights the restriction on comprehension that no occurrences of → are allowed in formulas defining sets. We saw Mares make this same suggestion in §4.3 above. Other suggested restrictions have been, variously: not to involve abstraction terms [Lib04], or negation (‘positive’ set theory) [Gil74], [FH89], [HL08], or any combination of these. An important form of Curry’s paradox runs through these considerations, beginning with Gilmore.

When recommendations on expressive restriction are followed out, one obtains different systems with different properties. Whether or not = is defined or primitive, and what connectives and operators can appear in formulae defining sets, can allow for example embedding into the logic RM3Q, as in [Lib04], which is otherwise too strong because it includes contraction. An interesting mathematical item is to know in which resulting models the axiom of choice holds, which Esser has investigated. Understanding these differences goes back to the core decisions in mounting a set theory: the definitions, formulations, and assumptions; and the chosen underlying logic. The original motive for maintaining naive comprehension is that every open sentence of the language determines a set. If possible, an arrangement that does not necessitate restrictions on the formulae themselves is desirable. We are already paying a hefty sum of inference rules for the right to practice naive set theory; if we must also give up some predicates as set defining then this is becoming unaffordable. Set theory without restriction on the comprehension formulae is proven non-trivial in [Bra89], indeed, without even the restriction that the set itself not appear in its defining condition. The goal is to maintain expressive power, and set theories in which this is done are the main object of study here.

5. Restricted Quantification

This section looks forward; some of the theorems claimed here are officially proved next chapter. Perhaps the most difficult challenge to set theory in relevant logic is formalizing restricted quantification; for there do seem to be many cases in which we want to say that all Φs are Ψs, without meaning or being able to prove that ∀x(Φ(x) → Ψ(x)). An entailment of this form may well be stronger than what is meant by, e.g. “all the beer is warm.” See [Mar04](198-200).

The most recent proposal for restricted quantification is by Beall, Brady, Hazen, Priest and Restall [BBH+06]. Using a semantic characterization, they produce a conditional, the “restricted arrow,” with a number of desirable properties. Let R be a three-place relation on setups x, y, z.

Φ → Ψ is true at x iff, for all y, z such that R(xyz), Ψ is true at z if Φ is true at y.

Various constraints on the relation R produce different conditionals, and so different logics [Pri08]. The innovation of [BBH+06] is to control implication by the addition of a heredity constraint. Take x ≺ z to mean that everything true at x is also true at z. Define a new three-place relation,

R′(xyz) := R(xyz) ∧ x ≺ z.

This gives the clause for restricted arrow,

Φ ↦ Ψ is true at x iff, for all y, z such that R′(xyz), Ψ is true at z if Φ is true at y.

This seems like a promising way to engineer a subset relation.¹ For us the important fact about this new conditional is

∀x(x ∈ X) ⊢ ∀x(x ∈ Y ↦ x ∈ X).

This is a weakening that, roughly, suggests an indifference to context. The indifference is what will allow recapture of some material implications and extensions, presuming that extensions are meant to be indifferent to context. At least since Routley’s [Rou77], the basic idea has been to outfit → with a similar merging device; in a more sophisticated version, this is still the idea of [BBH+06].

To show the generality of our main result, further detours through semantics may be avoided using a syntactic constant. The constant t is added conservatively [DR02](11) through a two-way rule:

Φ ⊣⊢ t → Φ

Then the enthymematic conditional is defined:

Φ ↦ Ψ := Φ ∧ t → Ψ.

Whether or not using t agrees in other points of detail with the semantic definition of the arrow makes no difference for what follows (as the authors of [BBH+06](footnote 16) point out; cf. [Pri06b](ch18)).

The main fact about ↦ is that it obtains if an →-entailment does, but, importantly, ↦ does not contrapose. Since Φ ⊢ Ψ ↦ Φ, contraposition would further give us Φ ⊢ ¬Φ ↦ ¬Ψ. If, though, both Φ and ¬Φ hold, then arbitrary ¬Ψ follows, which is absurd. The authors of [BBH+06](589) note that contraposition must fail for restricted arrow given their desiderata.²

Like DLQ’s counterexample axiom, ↦ takes extensional information from intensional information. The arrow is intended for use in set theory. For example, if U is the universe of our example in §4, then it seems that every set is also a subset of U, just because everything whatsoever is self-identical. And so it is tempting to define subsets as

a ⊑ b := ∀x(x ∈ a ↦ x ∈ b).

We can go back to the formulation of the extensionality axiom and write:

[Ext↦] x ⊑ y ∧ y ⊑ x ↔ x ≡ y.
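The behaviour claimed for the enthymematic conditional before this axiom can be spot-checked in a small model. The sketch below is an illustration under stated assumptions, not the semantics of [BBH+06]: it uses the three-valued matrix RM3 and interprets the constant t as the middle value, a choice made here only because it validates the two-way rule for t in this matrix. The names `enth`, `two_way_rule_holds` and the other helpers are hypothetical.

```python
from itertools import product

# RM3 truth values: -1 = false only, 0 = both true and false, 1 = true only.
VALUES = (-1, 0, 1)
T = 0  # assumed interpretation of the constant t in this matrix


def designated(a):
    return a >= 0


def neg(a):
    return -a


def conj(a, b):
    return min(a, b)


def imp(a, b):
    # RM3 (Sugihara) conditional; its value is designated exactly when a <= b.
    return max(-a, b) if a <= b else min(-a, b)


def enth(a, b):
    # Enthymematic conditional: (a and t) -> b.
    return imp(conj(a, T), b)


def two_way_rule_holds():
    """Phi holds exactly when t -> Phi holds."""
    return all(designated(a) == designated(imp(T, a)) for a in VALUES)


def arrow_implies_enth():
    """Whenever a -> b holds, the enthymematic conditional holds too."""
    return all(designated(enth(a, b))
               for a, b in product(VALUES, repeat=2)
               if designated(imp(a, b)))


def weakening_for_enth():
    """If Phi holds, then Psi enthymematically implies Phi, for any Psi."""
    return all(designated(enth(b, a))
               for a, b in product(VALUES, repeat=2)
               if designated(a))


def contraposition_counterexamples():
    """Pairs (a, b) where a |-> b holds but not-b |-> not-a does not."""
    return [(a, b) for a, b in product(VALUES, repeat=2)
            if designated(enth(a, b)) and not designated(enth(neg(b), neg(a)))]


if __name__ == "__main__":
    print("two-way rule for t:", two_way_rule_holds())
    print("-> implies |->:", arrow_implies_enth())
    print("Phi |- Psi |-> Phi:", weakening_for_enth())
    print("contraposition fails at:", contraposition_counterexamples())
```

Under these assumptions contraposition fails when the antecedent is true only and the consequent is both true and false, which matches the informal argument that a true contradiction would otherwise yield an arbitrary ¬Ψ.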

¹ Asmus has already worked out proof systems and a method for obtaining completeness results for this conditional [Asm09].
² Brady has also recently pointed out (in communication) that a Curry paradox is in the vicinity: x ∈ C ↔ t ∧ (x ∈ x → Ψ) leads to Ψ.

Ext↦ seems useful. We have already met the doppelgänger problem, which in generality involves at least two sets,

{x : Φ(x)}, {x : Φ(x) ∧ p},

where p is a tautology. Now the doppelgängers collapse: Φ(x) ↦ Φ(x) ∧ p holds when p does. By Ext↦, {x : Φ(x)} ≡ {x : Φ(x) ∧ p}, after all.

Recalling U = {x : x = x} and V = {x : ∃y(x ∈ y)}, we put the idea to work. From the logic of = and ∃, and the stronger fact that ∀x∀y(y ∈ x → y ∈ V), we have ∀x(x ∈ U), ∀x(x ∈ V), and ∀x(x ⊑ V).

Theorem 4.1. ∀y(y ⊑ U).

Proof. Consider a ∈ U.

a = a ⊢ t → a = a.

Then a = a ⊢ a ∈ y ↦ a = a. Then ∀x(x = x) ⊢ ∀x(x ∈ y ↦ x = x). By modus ponens, ∀x(x ∈ y ↦ x = x). Ergo ∀y(y ⊑ U) by abstraction, as desired. ∎

Theorem 4.2. V ≡ U.

Proof. By Ext↦ and the last two theorems. ∎

Now define P(x) := {y : y ⊑ x}.

Theorem 4.3. V ≡ P(V).

Proof. Because ∀x(x ⊑ V), also ∀x(x ∈ P(V)). Since

∀x(x ∈ P(V)) ⊢ ∀x(x ∈ a ↦ x ∈ P(V))

it follows just as before that ∀y(y ⊑ P(V)), so in particular, V ⊑ P(V). Therefore

V ⊑ P(V) ∧ P(V) ⊑ V,

showing, by Ext↦, that V ≡ P(V). ∎

While this is the anomaly that undid Frege’s system, it should be expected, possibly even welcome, that such prestigious paradoxes are theorems here.

Obviously x = y ⊢ x ≡ y. The question, roughly, is the converse. Less roughly, we know that restricted arrow does not contrapose, as the arrow of DLQ does. Perhaps, though, there is a way to integrate the two identity strengths, so that extensional behaviors of identity and subset are preserved, but where the important negation facts about sets (facts needed to know what sets are, and for proving theorems about sets) are preserved too. Clearly → cannot be interchangeable with ↦, but we would like to know how far they can go together. Maybe = and ≡ are a special case of implication where we can have it all. This is the hypothesis under investigation.³

Our main result for the chapter concerns a substitution principle for Ext↦,

(5) x ≡ y ⊢ Φ(x) ↦ Φ(y).

If ≡ is serving as identity, and is extensional, it must satisfy this principle. Here are two empty sets,

{x : x ≠ x}, {x : ⊥}.

Given contraposition and double negation properties, these are the complements of our universes (cf. chapter 5, prop. 5.15),

z ∈ V ↔ z ∉ {x : ⊥}, z ∈ U ↔ z ∉ {x : x ≠ x}.

Even with Ext↦, we do not get a unique empty set, since ↦ does not contrapose [Pri06b](255). That is good news, as we are about to see. Notice in particular that x ∉ V ↔ ⊥, that V is exhaustive, and {x : ⊥} is void, on pain of absurdity.

Theorem 4.4. Principle 5 is trivializing.

Proof. Ext↦ implies that V ≡ U. However, R ≠ R by theorem 5.7. Therefore R ∉ U. Then from principle 5,

U ≡ V ⊢ R ∉ U ↦ R ∉ V.

By two applications of modus ponens, R ∉ V. Therefore ⊥. ∎

To see what is going on from a slightly different angle, consider Russell’s witnesses, ν(R) = {x : R ∈ R}. Since R ∈ R from theorem 5.7, it follows that ∀x(x ⊑ ν(R)), as in theorem 5.2. Now we’ve established that ν(R) ⊑ V since everything is, and V ⊑ ν(R) for the same reason; so ν(R) ≡ V.

³ This initially developed while studying Libert’s limitive extensionality results in [Lib04].

But since R ∉ R it follows that ∀x(x ∉ ν(R)). In particular, R ∉ ν(R). Now by substitution principle 5 and modus ponens twice, ⊥. Similarly and a fortiori, even though ¬∃x(x ∈ ν(R)), any proposals for extensionality that yielded ν(R) ≡ {x : ⊥} would be disastrous.

The result seems most directly related to Dunn’s finding [Dun88] in [Aus88] that, if a logic meets some basic conditions and Ext↦-style extensionality is enforced, then collapse back to classical logic (triviality, for the naive set theorist) is inevitable. Our result also relates, then, to the ‘slingshot’ argument, as Dunn points out. I think this shows that ⊑ and ≡ are regressive notions of subset and identity, as I now urge.

Unrestricted use of restricted quantification is therefore unacceptable. The extensionality axiom in terms of ⊑ cannot be maintained. Any heavy use of t will reintroduce exactly the problems relevant machinery is meant to solve with respect to set theory. This result can be viewed in two ways. On the one hand, we might say that certain classical inferences are explosive, regardless of whether we welcome them explicitly or covertly through devices like t; and this suggests that there is nothing to be gained in being half-hearted about paraconsistency. On the other hand, we may say that there are simply certain facts about the full set theoretic universe, understood as the domain of a non-trivial but inconsistent theory; and a non-trivial but paraconsistent logic will bring out those facts, no matter how we try and subvert them with classicality-preserving arrangements.

Defining subsets with t reproduces nice classical behavior that we traditionally expect from subsets. But the argument to explosion shows limits on what can be inferred, namely, that ≡ is not identity. While there are multiple subset-like relations to work with, some are stronger than others. At the precipice of absurdity, serendipitously, it is difficult to prove that two sets are truly identical.
As I will argue in the concluding section, these are not, even extensionally, the same sets.

Relevant logics, and paraconsistent logic in general, are designed to handle paradoxical objects by abstaining from making formerly innocuous logical inferences. The introduction of restricted quantification via t is intended to relieve separation anxiety. But the relations ↦ and ≡ bring back moves that were problematic to begin with. It is to be expected that the same problems resurface when these relations are reintroduced.

6. An Iatrogenic Disorder

A medical condition is iatrogenic when it is induced by misdiagnosis or mistreatment. Some diagnose doppelgängers and all the other attendant problems as a failure of extensionality. In light of the triviality argument in the last section, this perceived failure is chronic. Weakening the constraints of identity using t is to no avail. Nor is it possible to alter these facts by honing the criteria for identity further, saying that x = y iff they share an extension and an anti-extension, i.e.

x = y ↔ (∀z)[(z ∈ x ↔ z ∈ y) ∧ (z ∉ x ↔ z ∉ y)]

The suggestion is well intentioned, but inert. In effect, since the logic DLQ includes an axiom for contraposition, we already have this.

The time has come to question whether the objects in question really are doppelgängers after all. Yes, both V and U are universal, but they are not really the same set, even extensionally. To give the proof requires appeal to the counterexample rule. Some set, e.g. Russell’s set R, is in V but not in U; that is, R ∈ V ∧ R ∉ U, which shows that ¬(∀x)(x ∈ V → x ∈ U), and ergo that

V ≠ U.

Sets like V and U both have all objects as their members, and yet are not identical. They have different members. Some objects are demonstrably not in U, whereas anything not in V implies triviality. The situation is even more dramatic with ν(R), of which both (∀x)x ∈ ν(R) and (∀x)x ∉ ν(R) hold. Both {x : x ≠ x} and {x : ⊥} have no members, whereas e.g. ν(R) is a member of the first but not the second. This is a signature of inconsistency: Surprises are hidden even where we have already looked for them. In the general case, Φ(x) and Φ(x) ∧ p would seem to pick out all the same objects if p is just true, but there remains the possibility that ¬p is also true, and hence that the latter set excludes as many objects as it embraces.

Since Frege we have known that naive set theory is inconsistent. The phenomenon we are witnessing is explicable through inconsistency. But nothing indicates that inconsistent set theory is not extensional; rather, these show what extensionality means in an inconsistent setting. The logic does not demand that V and U are the same set because they are not the same set. They are not the same extension.

Any reported failure of extensionality, then, amounts to this. What one can prove using an insensitive operator like the material conditional, working under truncated axioms like those of ZFC, diverges from what can be proven elsewhere. More specifically, sets that appear to be identical when viewed through a classical lens already incapable of perceiving many (most?) of the objects in the wider set theoretic universe are no longer identical when viewed using a relevant paraconsistent lens. The reported failure, in short, amounts to observing that paraconsistent logic is not classical; the report amounts to not very much.

Future discussions of naive set theory can be recast by taking a thoroughgoing paraconsistent approach.
Much brilliant work notwithstanding, the unstable mismatch of classical metatheory to inconsistent object theory causes misdiagnosis and, subsequently, mismanagement. Even in situations reckoned to be consistent (notably, the full theory of ordinals and cardinals not being one of these), the apparatus of classical set theory and logic is inappropriate, for two reasons. The first is just as the authors of Entailment warn:

One might think as follows. The point of [paraconsistency] is to take seriously the threat of contradiction. But there is in this vicinity no such threat. ... That sounds OK; but is it? After all, we supposed that ‘here there is no threat of contradiction’ is to be construed as an added premise ... bound up with the threat of contradiction, and one thing that is clear is that adding premises cannot possibly reduce threat. [ABD92](503)

The second reason goes beyond concern about inconsistency. It is concern with the basic integrity of a robustly non-classical project. The point is emphasized in an honest complaint from John P. Burgess:

How far can a logician who professes to hold that [paraconsistency] is the correct criterion of a valid argument, but who freely accepts and offers standard mathematical proofs, in particular for theorems about [paraconsistent] logic itself, be regarded as sincere or serious in objecting to classical logic? [Sha05](740)

Once we take a coherent naive theory seriously, no longer judging it by intuitions fed on classicality, the force of results, in particular the status of paraconsistent extensionality, changes.

Brady’s non-triviality result constructs a transfinite model for set theory, using already a considerable amount of classical set theory and classical semantic modeling assumptions. The fine works of Hinnion and Libert [HL08] rest on highly developed topology. This is not to suggest that their work is any less valuable, only that certain perspectives inevitably taint results, and especially the interpretation of results. If one is only interested in conventional metatheorems about unconventional structures, then paraconsistent information about the structures will be distorted, lost or ignored.
Many of the difficulties that an embedding of naive set theory has faced are iatrogenic, because logicians of classical schools are simply not trained to treat the altogether different pathology of dialethias. If one takes the features of the classical universe to be the gold standard against which all else is judged, then something in naive theory (or at least a naive theory in which every formula is taken to be set-defining) is terribly non-extensional. One has also essentially ruled out the possibility of alternative criteria: Differences will always be deviants, abominations. Here we still know that
- sets which can be proven to have the same members are identical; and
- identical sets can be intersubstituted.

Sets that are not identified by =, on the other hand, are simply not co-extensive. There is no ‘failure’ of extensionality over a relation like ≡ any more than there is a failure of explosion over a paraconsistent consequence relation. Far from being a disorder, it seems very much a part of the intensional set concept that the properties Φ and Φ ∧ p should not be even apparently coextensive. Non-classical behavior is not the same as non-extensional behavior. Under the analytic axioms of naive set theory, this is what it means for sets to behave extensionally. We now have some sense of the logic DLQ and the initial definitions that will suit naive pursuits, especially with respect to identity and the universal set. We are now ready to begin.

CHAPTER 5

Elements

This chapter develops the basic elements of naive set theory—the consequences of the full comprehension principle—in a paraconsistent logic. Results divide into two sorts. There is classical recapture, where the main theorems about ordinals and cardinals are proved. Then there are major extensions, including proofs of the axiom of choice and a counterexample to the continuum hypothesis.

1. Introduction

A set is any collection of objects, and is itself an object. The members of a set determine the identity of that set. The first clause is the principle of comprehension, and the second is the principle of extensionality. These statements define the set concept, and on the basis of these principles, the mathematics of set theory can be developed, as this chapter demonstrates. The resulting theory has several important features. It contains as theorems all the basic theorems of standard ZFC set theory, initially proving the ZF axioms and building to the theory of ordinal and cardinal numbers. This has been called the classical recapture, and has proved startlingly elusive. The demands placed on the underlying logic are simply very stringent; speaking to this point, the usually optimistic authors of [MRD78] conclude in dramatic fashion that “naive set theory is untenable.”(128) Priest summarizes the situation:

Since the early days of paraconsistent logic, it has been clear that the rejection of ex contradictione [quodlibet] is not possible without the rejection of other things which appear to be much more integral to classical reasoning.... Several logicians (including Brady, Meyer, Mortensen, Priest and Routley) have attempted to reconstruct various fragments of classical reasoning.... While the results are not definitive, they are not terribly encouraging. Though ... a highly important—indeed, essential—enterprise, it would now appear that the aim of reconstructing sensible classical reasoning in this way may not be realizable. [Pri06b](222)

This chapter begins to show that the goal is realizable.


Recapture of higher theories, most strikingly arithmetic, is not included here beyond showing that the Peano axioms hold. Not all classical consequences of these theorems are paraconsistent consequences (at least not automatically), so the issue of what can be proved from here is open. Generally there is much more to be done; the best days of such a set theory lie ahead. Paraconsistent set theory can prove more than its classical counterpart, because it is using the full set concept. We will be able to prove the axiom of choice and provide a counterexample to the generalized continuum hypothesis. And the theory begins to address new mathematical questions that arise in coherent inconsistent settings. Inconsistent objects are characterized and an approach to counting them is developed, thereby proposing an answer as to how big is e.g. the singleton {a} when a differs from itself with respect to membership. The work here, then, in addition to providing a paraconsistent foundation for mathematics, also provides a foundation for and new directions in paraconsistent mathematics.

The intent of the chapter is not to interpret or philosophically defend at any length the results; this is for chapter 7. The issues involved, for example the identity of theorems when moving from one language to another, are subtle and philosophically controversial. Whether or not the entities being studied are really sets, what it means for identity = or membership ∈ to behave inconsistently, and therefore in what sense this work is a classical recapture, are all interesting questions. But mathematics, controversial or otherwise, must first stand on its own; Zermelo had to give his wellordering proof before we could argue about the axiom of choice. Before any useful discussion of the theory can take place, we must know what the theory is.
The ultimate goal here is to provide a new, paraconsistent proof of Cantor’s theorem that there are orders of cardinal powers, by pinning down a rigorous theory of ordinals. Whether in a paraconsistent background this has the same force as it did over a century ago when Cantor discovered it, is a discussion I hope will occur elsewhere.

2. Logic

Last chapter we presented DLQ. Here we will officially add the hypothetical syllogism pair.1 I briefly recall support for this logic. Comprehension is an extremely powerful principle and can only be channeled by weak logics. Still, we want a logic that respects some natural behaviors of sets. For example, given that two sets with all and only the same members are identical, it should also be the case that sets with different members are not identical. Or again, when y = {z : Φ(z)} and ¬Φ(x), then it should be that x ∉ y. These are fairly strong requirements on negation. When formalized (e.g. with a counterexample axiom VIII below), these demands mean that set negation excludes the middle, tertium non datur: Φ ∨ ¬Φ for any sentence Φ. This is to be expected. An exhaustive universe is explicitly postulated by Cantor and Frege; it is the trigger of inconsistency in comprehension to begin with.

1 The addition officially makes the logic TLQ.

The language of first order set theory has primitives ∧, ¬, →, ∀, =, ∈, and {:}; variables x, y, z, ...; names a, b, c, ...; and formulae Φ, Ψ, Υ, ..., built up by standard formation rules. Shorthand is used: Φ ∨ Ψ for ¬(¬Φ ∧ ¬Ψ); Φ ↔ Ψ for (Φ → Ψ) ∧ (Ψ → Φ); ∃ is ¬∀¬. (Taking these as definitions means that e.g. Φ ∨ Ψ → ¬(¬Φ ∧ ¬Ψ) is no more than an instance of axiom I below.)

2.1. Axioms. All instances of the following schemata are theorems:

I      Φ → Φ
IIa    Φ ∧ Ψ → Φ
IIb    Φ ∧ Ψ → Ψ
III    Φ ∧ (Ψ ∨ Υ) → (Φ ∧ Ψ) ∨ (Φ ∧ Υ)            (distribution)
IV     (Φ → Ψ) ∧ (Ψ → Υ) → (Φ → Υ)                (conjunctive syllogism)
V      (Φ → Ψ) ∧ (Φ → Υ) → (Φ → Ψ ∧ Υ)
VI     (Φ → ¬Ψ) → (Ψ → ¬Φ)                        (contraposition)
VII    ¬¬Ψ → Ψ                                    (double negation elimination)
VIII   (Φ → Ψ) → ¬(Φ ∧ ¬Ψ)                        (counterexample)
IXa    (Φ → Ψ) → [(Ψ → Υ) → (Φ → Υ)]
IXb    (Φ → Ψ) → [(Υ → Φ) → (Υ → Ψ)]              (hypothetical syllogisms)
X      (∀x)Φ → Φ(y/x)
XI     (∀x)(Φ → Ψ) → (Φ → (∀x)Ψ)
XII    (∀x)(Φ ∨ Ψ) → Φ ∨ (∀x)Ψ

In axiom X, y is free for x in Φ. In axioms XI and XII, x is not free in Φ.
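As a sanity check on the propositional axioms I through IX, they can be evaluated by brute force in a small truth-table model. The sketch below uses the three-valued matrix usually called RM3 (values false, both, true, with the last two designated). This matrix is my assumption for illustration only: it is not the semantics of DLQ, it validates more than DLQ proves, and the quantifier axioms X through XII are not covered. What it does show is that the listed schemata hold together with excluded middle in a setting where ex contradictione quodlibet fails.

```python
from itertools import product

# RM3 truth values: an assumption of this sketch, not the thesis's semantics.
F, B, T = 0, 1, 2            # false only, both true and false, true only
DESIGNATED = {B, T}          # values that count as "holding"

def neg(a):
    return 2 - a

def conj(a, b):
    return min(a, b)

def disj(a, b):
    # Disjunction as defined in the text: not (not-a and not-b).
    return neg(conj(neg(a), neg(b)))

def imp(a, b):
    # RM3 arrow; its value is designated exactly when a <= b.
    return max(neg(a), b) if a <= b else min(neg(a), b)

# Propositional axiom schemata I-IX, with p, q, r standing in for the
# schematic letters of the text.
AXIOMS = {
    "I":    lambda p, q, r: imp(p, p),
    "IIa":  lambda p, q, r: imp(conj(p, q), p),
    "IIb":  lambda p, q, r: imp(conj(p, q), q),
    "III":  lambda p, q, r: imp(conj(p, disj(q, r)),
                                disj(conj(p, q), conj(p, r))),
    "IV":   lambda p, q, r: imp(conj(imp(p, q), imp(q, r)), imp(p, r)),
    "V":    lambda p, q, r: imp(conj(imp(p, q), imp(p, r)),
                                imp(p, conj(q, r))),
    "VI":   lambda p, q, r: imp(imp(p, neg(q)), imp(q, neg(p))),
    "VII":  lambda p, q, r: imp(neg(neg(q)), q),
    "VIII": lambda p, q, r: imp(imp(p, q), neg(conj(p, neg(q)))),
    "IXa":  lambda p, q, r: imp(imp(p, q), imp(imp(q, r), imp(p, r))),
    "IXb":  lambda p, q, r: imp(imp(p, q), imp(imp(r, p), imp(r, q))),
}

def is_valid(formula):
    """True when the formula is designated under every assignment."""
    return all(formula(p, q, r) in DESIGNATED
               for p, q, r in product((F, B, T), repeat=3))

def explosion(p, q, r):
    # Ex contradictione quodlibet: (p and not-p) -> q.
    return imp(conj(p, neg(p)), q)

def excluded_middle(p, q, r):
    return disj(p, neg(p))

if __name__ == "__main__":
    for name, axiom in AXIOMS.items():
        print(name, is_valid(axiom))
    print("excluded middle", is_valid(excluded_middle))
    print("explosion", is_valid(explosion))
```

The failing assignment for explosion is p = B, q = F: the antecedent is designated while the consequent is not.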

2.2. Rules. The following rules are valid:

I      Φ, Ψ ⊢ Φ ∧ Ψ                               (adjunction)
II     Φ, Φ → Ψ ⊢ Ψ                               (modus ponens)
III    Φ → Ψ, Υ → ∆ ⊢ (Ψ → Υ) → (Φ → ∆)
IV     Φ ⊢ (∀x)Φ
V      x = y ⊢ Φ(x) → Φ(y)                        (substitution)

Modus ponens is also valid in the single premise form: Φ ∧ (Φ → Ψ) ⊢ Ψ, not to be mistaken for the illegitimate axiom form. Substitution, similarly, is only valid in rule form.

The following derived facts will be most helpful. Double negation introduction follows from axioms I and VI, and modus ponens: (¬Φ → ¬Φ) → (Φ → ¬¬Φ). Then the contraposition axiom VI can be rearranged to (Φ → Ψ) → (¬Ψ → ¬Φ). Counterexample gets its name from the contraposed form Φ ∧ ¬Ψ → ¬(Φ → Ψ). From the instance of counterexample (Φ → Φ) → ¬(Φ ∧ ¬Φ), by axiom I and modus ponens we have the law of non-contradiction: ¬(Φ ∧ ¬Φ). Then by the definition of disjunction follows the law of excluded middle, Φ ∨ ¬Φ. Contraposition on axiom V gives a schema for ∨-elimination: (Φ → Ψ) ∧ (Υ → Ψ) → (Φ ∨ Υ → Ψ). Then as theorems are the pair (Φ → ¬Φ) → ¬Φ and (¬Φ → Φ) → Φ, reductio and consequentia mirabilis. From axiom XI we have the scheme for existential instantiation, (∀x)(Φ → Ψ) → ((∃x)Φ → Ψ). A useful derived rule is

Φ → (Ψ → Υ), Υ → ∆ ⊢ Φ → (Ψ → ∆).

The rule holds because Υ → ∆ implies [Φ → (Ψ → Υ)] → [Φ → (Ψ → ∆)] by two applications of prefixing; and then the conclusion follows by modus ponens. Similarly, an extra application of prefixing gives the extended

Φ → (Ψ → (Υ → Γ)), Γ → ∆ ⊢ Φ → (Ψ → (Υ → ∆)),

and so forth. These will be called derived prefixing rules.

As with other relevant logics in its area, the deduction theorem fails for DLQ, exactly because a demonstration that Ψ is provable from Φ is not sufficient to show that Φ relevantly implies Ψ; often a good deal more information than Φ alone is needed to get Ψ. So the difference between deductions marked by ⊢ and implications marked by → is stark. While the move from Φ → Ψ to Φ ⊢ Ψ is modus ponens, the other direction does not follow. In general, → carries the burden of relevance and theorems containing it are harder to obtain. To this end we have two more rules:

2.3. Meta-rule. If Φ ⊢ Ψ, then Φ ∨ Υ ⊢ Ψ ∨ Υ.

2.4. Quantified Meta-Rule. If Φ ⊢ Ψ, then (∃x)Φ ⊢ (∃x)Ψ.

The idea of meta-rules is discussed in [Bra06](6). Importantly, meta-rules soften the loss of the deduction theorem; in particular, the following argument, a derived meta-rule, is validated: if Φ ⊢ Υ and Ψ ⊢ Υ, then Φ ∨ Ψ ⊢ Υ. This is useful for disjunction eliminations when reasoning under hypothesis, as in theorem 5.41. We also obtain the useful substitution form, a = b, Φ(a) ⊢ Φ(b). Another simple derived rule is: Φ ∧ (Φ → Ψ) ⊢ Ψ, and its quantificational counterpart, Φ(a) ∧ (∀x)(Φ(x) → Ψ(x)) ⊢ Ψ(a). So although pseudo-modus ponens is not included, using the meta-rules we can still reason fairly naturally even under hypothesis.
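The route from the counterexample axiom to non-contradiction and excluded middle, stated among the derived facts of section 2.2, can be laid out line by line. This is only a restatement of steps given in the text; the line numbering and the layout are mine.

```latex
\begin{align*}
1.\quad & \Phi \to \Phi
        && \text{Ax. I}\\
2.\quad & (\Phi \to \Phi) \to \neg(\Phi \wedge \neg\Phi)
        && \text{Ax. VIII, with } \Psi := \Phi\\
3.\quad & \neg(\Phi \wedge \neg\Phi)
        && \text{1, 2, Rule II}\\
4.\quad & \neg(\neg\Phi \wedge \neg\neg\Phi)
        && \text{line 3 as a schema, with } \neg\Phi \text{ for } \Phi\\
5.\quad & \Phi \vee \neg\Phi
        && \text{4, since } \Phi \vee \Psi \text{ abbreviates } \neg(\neg\Phi \wedge \neg\Psi)
\end{align*}
```

No step uses the deduction theorem: lines 1 and 2 are axiom instances and line 3 is a single application of modus ponens in rule form.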

A fusion operator ◦ cannot be added to this logic for naive set theory on pain of triviality. Such an operator obeys the two way residuation rule

Φ ◦ Ψ → Υ ⊣⊢ Φ → (Ψ → Υ),

(see [DR02](12)) but allows derivation of Curry-like paradoxes [Bra06](35). Again, we have the connective t, the two-way rule Φ ⊣⊢ t → Φ, and the enthymematic conditional Φ ↦ Ψ, defined as Φ ∧ t → Ψ. In general for DLQ there is no way to convert a proof into an entailment. Now, it would not be unreasonable to expect ↦ to do this, e.g. if Φ ⊢ Ψ then ⊢ Φ ↦ Ψ. But this would in fact be a disaster. For instance, we know that Φ, Φ → Ψ ⊢ Ψ. Yet ‘t-pseudo modus ponens’, ⊢ Φ ∧ (Φ → Ψ) ↦ Ψ, admits of Curry paradoxes just as its t-free mate does.2 Thus ↦ is not to be used in this way.

In [Bra03](327), there is a promising approach. For proofs to yield formulae by restricted quantification, we could add more axioms to govern quantifiers of the form (∀xΦ)Ψ. The desired move from proofs to formulae is brought about by an extra primitive rule: ∃xΦ(x) ⊢ (∀xΦ)Φ(x). This gives us some nice flexibility, due to the derived equivalence

∃xΦ(x) ⊢ (∀xΦ)Ψ(x) iff Φ(x) ⊢ Ψ(x).

That is, a proof of Ψ from Φ, along with the supposition that there are Φs, licenses a restricted quantification. Brady shows that, with the minimal assumption that the restricting predicate is non-empty, a rule can safely be converted to a formula (and vice versa, assuming some further postulates).

Here, the set theory being developed is not couched in terms of quantifiers beyond ∀, ∃, because the general theme of this thesis is to investigate the relationship between intension and extension via sets. Running proofs with restricted quantifiers to derive formulae is a promising approach, but in effect writes → out of the story. For example, using restricted quantifiers of this sort, the natural way to formulate a subset relation between X and Y would be (∀x x ∈ X)x ∈ Y, and then the extensionality axiom should say [(∀x x ∈ X)x ∈ Y ∧ (∀x x ∈ Y)x ∈ X] → X = Y. Given the philosophical study of intensional sets, this is not appropriate; more exactly, given our choice of logic, this is not an option (see last chapter). We will instead see how far →, with some help from the t constant, can take us.

2 This is an unpublished result due to Brady.

3. Axioms

The formal language of DLQ is augmented with a variable binding term forming operator {· : −}. It remains an open question how to add term-forming symbols conservatively in relevance contexts [Bra06](177), and is not a problem we wish to address here. The set concept is now characterized by two axioms.

Axiom 1 (Abstraction). x ∈ {z : Φ(z, u)} ↔ Φ(x, u).

Axiom 2 (Extensionality). (∀z)(z ∈ x ↔ z ∈ y) ↔ x = y.

By existential generalization on abstraction follows immediately the principle:

Theorem 5.1 (Comprehension). (∃y)(∀x)(x ∈ y ↔ Φ(x, u)).

The terminological distinction between abstraction and comprehension I take from [Lib04]. Abstraction and extensionality can be reconnected, as in Frege’s axiom:

Theorem 5.2 (Basic Law V). {x : Φ} = {x : Ψ} ↔ (∀x)(Φ ↔ Ψ).

Under abstraction, the substitution rule is recast:

Theorem 5.3. x = y ⊢ (∀z)(x ∈ z → y ∈ z).

Peano chose the ∈ sign to denote predication, from the Greek verb ἐστίν, ‘to be’. Since arbitrary predicates determine sets, then, in the comprehension principle the occurrence of y in the predicate Φ is not ruled out. Following Routley, this is completely unrestricted comprehension; without this, some sets would not be obtainable, e.g. the limiting case of diagonal sets, Z = {x : x ∉ Z}, a subset of which is used to prove Cantor’s theorem. The abstraction axiom keeps track of these cases viz. x ∈ {z : Φ(z, u)} ↔ Φ(x, {z : Φ(z, u)}). Originally, tracking occurrences of set terms appearing in their own collectivizing predicate was accomplished with an extra ‘reflection’ axiom [BR89]. Here we follow instead the simpler approach in [Bra06](177) and take Φ(x, {z : Φ(z, u)}) as a simultaneous substitution on Φ(z, u).

The full comprehension scheme is studied for three reasons. The first has already been stated, namely, the philosophical conviction that all predicates, even baroque predicates, determine sets.3 Second is a pragmatic motive: Since DLQ is so terribly weak, the proving power for theorems must come from somewhere. Unrestricted comprehension will allow easier construction of functions, diagonal sets, and other useful objects. The third motive, then, is to model easily some venerable fixed-point phenomena in set theory; see e.g. our recursive characterization of the natural numbers below. Non-well-founded sets like x = {x} have been studied by Aczel and are easily reproduced (albeit without the condition that such singletons are unique). Full comprehension axiomatizes important limiting, circular, and contradictory cases.

3 Priest and Routley: “The naive notion of set is that of the extension of an arbitrary predicate.... This is as tight an account as can be expected from any fundamental notion. It was thought to be problematical only because it was assumed (under the ideology of consistency) that ‘arbitrary’ could not mean arbitrary. However, it does.” [PRN89](499)

It is frequently convenient to use names for sets, e.g. for a set a there is a set of all the subsets of a, called the powerset of a. The symbol P(a) is used to denote this set in the same way that ∅ is used to denote a particular set below; it is a name. Similar comments apply to complementation ā, intersection, union, etc. This notation can be thought of, temporarily, as governed by instances of comprehension:

⟨x0, ..., xn⟩ ∈ y ↔ Φ(x0, ..., xn), with ordered pairs ⟨x, y⟩ primitive. In later sections we will officially develop ordered pairs (prop. 5.24). Already without this we have

Proposition 5.4. y = {z : Φ(z)} ↔ (∀x)(x ∈ y ↔ Φ(x)).

Proof. By extensionality, y = {z : Φ(z)} ↔ (∀x)(x ∈ y ↔ x ∈ {z : Φ(z)}). By abstraction, (∀x)(x ∈ {z : Φ(z)} ↔ Φ(x)). Then by transitivity, (∀x)(x ∈ y ↔ Φ(x)). For the converse, we again invoke the abstraction scheme, where (∀x)(Φ(x) ↔ x ∈ {z : Φ(z)}), so by transitivity (∀x)(x ∈ y ↔ x ∈ {z : Φ(z)}). And this with the extensionality axiom completes the proof. □

4. Basics

Proposition 5.5. Identity is an equivalence relation:

x = x, x = y → y = x, x = y ∧ y = z → x = z.

Proof. These follow directly from the properties of → and ∧ (axioms I, II, IV and V), and extensionality. □

As with proposition 5.11 below, identity also obeys an alternate form of transitivity due to the hypothetical syllogism, namely

x = y → (y = z → x = z).

Proposition 5.6. (∀x)(∀y)(x ∈ y ∨ x ∉ y).

Proof. With axiom I, a ∈ y → a ∈ y. Then by axiom VIII, a ∉ y ∨ a ∈ y. Rule IV generalizes to the result. □

Proposition 5.7. Sets that differ with respect to membership are not identical. In particular, (∃x)(x ∈ a ∧ x ∉ a) → a ≠ a.

Proof. We prove the contrapositive.

1   a = a → (∀z)(z ∈ a ↔ z ∈ a)                    Ext.
2   (∀z)(z ∈ a ↔ z ∈ a) → (b ∈ a ↔ b ∈ a)          1, Ax.X
3   (b ∈ a ↔ b ∈ a) → b ∉ a ∨ b ∈ a                Ax.II, VIII
4   b ∉ a ∨ b ∈ a → ¬(b ∈ a ∧ b ∉ a)               3, Ax.I
5   a = a → ¬(b ∈ a ∧ b ∉ a)                       1−4, Ax.IV
6   (b ∈ a ∧ b ∉ a) → a ≠ a                        5, Ax.VI

Existential generalization completes the result. □

When a set a is such that its membership is inconsistent, some b ∈ a and b ∉ a, then a is inconsistent. (The same is not said of b, since e.g. every set is both in and not in Routley’s set Z = {x : x ∉ Z}, but we don’t think that all sets are inconsistent.) There are inconsistent sets in naive set theory and they have a neat characterization:

Proposition 5.8. (∃x)x ≠ x.

Proof. By comprehension we have Russell’s set, R = {x : x ∉ x}.

1   (∀x)(x ∈ R ↔ x ∉ x)                            Comp.
2   R ∈ R ↔ R ∉ R                                  1, Ax.X
3   R ∈ R → R ∉ R                                  2, Ax.II
4   R ∉ R ∨ R ∉ R                                  3, Ax.VIII
5   R ∉ R                                          4, Ax.V
6   R ∈ R                                          2, 5, Rule II
7   R ∈ R ∧ R ∉ R                                  5, 6, Rule I

Since R differs from itself with respect to membership, by proposition 5.7, R ≠ R. □

From this novelty, Restall [Res92](427) infers by generalization the

Corollary 5.9 (Restall). There are at least two objects, (∃x)(∃y)(x ≠ y).

These few facts already show us that the present set theory will have as theorems some propositions not contained by classical theory, and that classical theorems may be recaptured by distinctively non-classical means.
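The Russell instance R ∈ R ↔ R ∉ R from the proof of proposition 5.8 can be given a small semantic illustration. The sketch below again borrows the three-valued RM3 tables, which are my assumption and not the thesis's own apparatus (the thesis argues proof-theoretically in DLQ). It searches for the values of the single atom R ∈ R under which the biconditional is designated, and checks that the resulting contradiction holds without an unrelated false sentence becoming true.

```python
# RM3 truth values: an assumption of this sketch, not the thesis's semantics.
F, B, T = 0, 1, 2            # false only, both true and false, true only
DESIGNATED = {B, T}

def neg(a):
    return 2 - a

def conj(a, b):
    return min(a, b)

def imp(a, b):
    # RM3 arrow; its value is designated exactly when a <= b.
    return max(neg(a), b) if a <= b else min(neg(a), b)

def iff(a, b):
    return conj(imp(a, b), imp(b, a))

def russell_instance(v):
    """Value of (R in R) <-> (R not in R) when the atom 'R in R' has value v."""
    return iff(v, neg(v))

# Values of 'R in R' that make the comprehension instance hold.
solutions = [v for v in (F, B, T) if russell_instance(v) in DESIGNATED]

# With that value, the contradiction 'R in R and R not in R' holds ...
contradiction_holds = all(conj(v, neg(v)) in DESIGNATED for v in solutions)

# ... while an unrelated atom valued F stays undesignated: no triviality.
unrelated = F
trivial = unrelated in DESIGNATED

if __name__ == "__main__":
    print(solutions, contradiction_holds, trivial)
```

Only the glut value satisfies the instance, which matches the proof's conclusion that R is both in and not in itself.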

Definition 5.10 (Subsets). x ⊆ y is (∀z)(z ∈ x → z ∈ y).

This leads to a natural partial order; the converse of anti-symmetry actually holds, too, since this is just the axiom of extensionality rewritten.

Proposition 5.11. Subsets are reflexive and anti-symmetric,

x ⊆ x, x ⊆ y ∧ y ⊆ x → x = y.

They are also transitive:

x ⊆ y ∧ y ⊆ z → x ⊆ z, x ⊆ y → (y ⊆ z → x ⊆ z), and x ⊆ y → (z ⊆ x → z ⊆ y).

Proof. Reflexivity and antisymmetry come from extensionality, axiom I and the commutativity of conjunction. The forms of transitivity are direct results of conjunctive syllogism and the hypothetical syllogism pair, respectively. □

Proposition 5.12. Every set has an unrestricted complement, x̄ = {y : y ∉ x}, and x̿ = x.

Proof. Existence is by comprehension; its behavior is checked with double negation introduction and elimination: Since y ∈ x̄ ↔ y ∉ x, it follows that y ∈ x̿ ↔ y ∈ x. □

Moving from these simple but useful facts, we fix a top and bottom of the universe of sets, an emptyset and a universal set, using von Neumann’s definition of set as an object that is a member of something. These will exhibit most of the expected structure of a boolean lattice, except that x ∩ x̄ = ∅ and x ∪ x̄ = V do not generally hold. Note that the characterizations we are about to use are not those given by Russell and Whitehead, which have since become standard: {x : x = x} for the universe and its complement, {x : x ≠ x} for the emptyset. We have already seen in R ≠ R that there are objects not in, and in, these sets respectively. Nor, due to the relevance of the logic DL, would it be the case that y ⊆ {x : x = x} or {x : x ≠ x} ⊂ y for every y. Our universe and emptyset, by contrast, are well behaved and have a very natural reading if sets are taken to be models of properties.

Proposition 5.13. The universe,

V = {x : (∃y)x ∈ y},

exists, and (∀x)x ∈ V. The emptyset,

∅ = {x : (∀y)x ∈ y},

exists, too, and is empty: (∀x)x ∉ ∅.

Proof. Existence of both sets is by comprehension. To prove that (∀x)x ∈ V,

1   (∀x)(x ∈ V ∨ x ∉ V)                            prop.5.6
2   (∀x)(x ∉ V → x ∈ V̄)                            Abstraction, prop.5.12
3   (∀x)(x ∈ V̄ → (∃y)x ∈ y)                        Ax.X, Ax.VI
4   (∀x)((∃y)x ∈ y → x ∈ V)                        Abs.
5   (∀x)(x ∉ V → x ∈ V)                            2, 3, 4

The conclusion follows by ∨-elimination on line 1. To show that the emptyset is empty, similarly, if x ∈ ∅ then (∀y)x ∈ y; then x ∈ ∅̄ and therefore x ∉ ∅. □

Proposition 5.14. (∀x)(x ⊆ V) and (∀x)(∅ ⊆ x).

Proof. By existential generalization, z ∈ x → (∃y)z ∈ y. Then z ∈ x → z ∈ V by abstraction, and therefore x ⊆ V. Similarly, z ∈ ∅ → (∀y)z ∈ y, and then z ∈ x by ∀-elimination. □

What these theorems do not show (and indeed what probably cannot be shown) is the uniqueness of a universal or empty set. Indeed, one can show that (∀x)x ∈ {z : z = z}, since any x is self-identical by proposition 5.5 above. And we also have {z : z = z} ⊆ V, since everything is. But it is formally irrelevant that (∃y)x ∈ y → x = x, and unlikely that the present system will produce this as a theorem. (Proving that this is not a theorem is a metatheoretic task.) Ergo it will not follow that {z : z = z} = V. Similar arguments apply to ∅. The extensionality of naive set theory in light of this phenomenon was discussed last chapter.

Proposition 5.15. V̄ = ∅ and ∅̄ = V.

Proof. One direction of each is already proved. For V ⊆ ∅̄ we argue the contrapositive (using only arrows): x ∈ ∅ → (∀y)x ∈ y, and (∀y)x ∈ y → x ∈ V̄. Then x ∉ V. For V̄ ⊆ ∅, x ∉ V → ¬(∃y)x ∈ y, so (∀y)x ∉ y; in particular, x ∉ ∅̄. Double negation elimination completes the proof. □

Note that ∅ is explosive: If anything is a member of it, triviality follows.

5. ZF

Here we retrieve all the axioms of Zermelo-Fraenkel set theory, except Fundierung. Since general comprehension induces sets that are not well founded, e.g. V ∈ V, the foundation axiom is not a part of the present theory, as is expected from Restall’s results in [Res92]. That the other axioms are forthcoming is not especially surprising, since Zermelo in 1908 [Zer67] saw them as a consistent fragment of the naive theory. The axiom of infinity will be provable, too, showing that the naive theory does not fail a reductive program in the same way as Russell and Whitehead’s system did. In each proposition, the step of universally generalizing on free a, b is omitted.

Proposition 5.16 (Aussonderung). (∃y)(∀x)(x ∈ y ↔ x ∈ a ∧ Φ(x)).

Proposition 5.17 (Powerset). (∃y)(∀x)(x ∈ y ↔ x ⊆ a).

For any a, we use the name P (a) = {x : x ⊆ a}.

Proposition 5.18 (Pairing). (∃y)(∀x)(x ∈ y ↔ x = a ∨ x = b).

For any a, b we use the name {a, b} = {x : x = a ∨ x = b}. A special case is the singleton {a} = {x : x = a}. For relevance purposes, sometimes singletons are relativized to a particular set, e.g. {x : x = a ∧ x ∈ b}, written {a}_b, so that {a}_b ⊆ b when a ∈ b.

Proposition 5.19 (Union). (∃y)(∀x)(x ∈ y ↔ (∃z)(z ∈ a ∧ x ∈ z)).

Proposition 5.20 (Intersection). (∃y)(∀x)(x ∈ y ↔ (∀z)(z ∈ a → x ∈ z)).

The standard names are adopted: ⋃a = {x : (∃z)(z ∈ a ∧ x ∈ z)} and ⋂a = {x : (∀z)(z ∈ a → x ∈ z)} for the union and intersection of a, respectively. Note that, because the conditional in DLQ is not material, these are not necessarily duals. On the other hand, a ∪ b = {x : x ∈ a ∨ x ∈ b} and a ∩ b = {x : x ∈ a ∧ x ∈ b} obey their usual algebra, save for the explosive a ∩ ā ⊆ b. The complement of b in a is now easily construed as a − b = a ∩ b̄.

The next axiom is the axiom of infinity. Note that V is already a set containing ∅ and its successors (since every set whatever is in V), and that V will count as Dedekind-infinite once that notion is defined. We will also prove that the natural numbers exist later. So the following is only for curiosity.

Proposition 5.21 (Infinity). There is a non-empty set i such that when x ∈ i, also {x}_i ∈ i.

Proof. Consider i = {x : x ⊆ i}, a set that is its own powerset. Both i ∈ i and ∅ ∈ i, so i is not empty. For x ∈ i, also {x}_i ⊆ i, and so {x}_i ∈ i. □

In the next definition and beyond, we take to writing some sets in a commonplace way, as e.g. {⟨x, y⟩ : x ∈ a ∧ y ∈ b}. As noted above, this is to be controlled with instances of abstraction, in this case ⟨x, y⟩ ∈ z ↔ x ∈ a ∧ y ∈ b for some z, which we will name a × b.

Definition 5.22. An ordered pair is ⟨a, b⟩ = {{a}, {a, b}}. A cartesian product is a × b = {⟨x, y⟩ : x ∈ a ∧ y ∈ b}. A relation is r ⊆ a × b, and its inverse is r⁻¹ = {⟨x, y⟩ : ⟨y, x⟩ ∈ r}. A function is a relation f : a ⟶ b with domain dom(f) = a and range rng(f) = b, such that

⟨x, u⟩ ∈ f ∧ ⟨x, v⟩ ∈ f ↦ u = v.

The composition of two functions f, g is f ◦ g = {⟨x, z⟩ : (∃y)(⟨x, y⟩ ∈ g ∧ ⟨y, z⟩ ∈ f)}. The image of a under f is f″a = {f(x) : x ∈ a}. The restriction of f to x ⊆ a is f|x = f ∩ (x × a).
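The constructions of definition 5.22 can be tried out on small finite relations. The following is a minimal classical sketch, assuming relations are represented as Python sets of 2-tuples and that all identities are ordinary, consistent equality; the function names are hypothetical, and nothing here models the relevant conditional or the enthymematic ↦. The restriction is implemented by filtering on the first coordinate.

```python
# Finite, classical stand-ins for the notions of definition 5.22.
# Relations are sets of (x, y) tuples; all names here are illustrative.

def inverse(r):
    """r^-1 = {<x, y> : <y, x> in r}."""
    return {(y, x) for (x, y) in r}

def is_function(f):
    """<x, u> in f and <x, v> in f implies u = v (classical reading)."""
    return all(u == v for (x, u) in f for (y, v) in f if x == y)

def compose(f, g):
    """f o g = {<x, z> : exists y with <x, y> in g and <y, z> in f}."""
    return {(x, z) for (x, y) in g for (y2, z) in f if y == y2}

def image(f, a):
    """The image of a under f: the values f(x) for x in a."""
    return {y for (x, y) in f if x in a}

def restrict(f, xs):
    """Restriction of f to arguments drawn from xs."""
    return {(x, y) for (x, y) in f if x in xs}

if __name__ == "__main__":
    g = {(1, "a"), (2, "b")}
    f = {("a", 10), ("b", 20)}
    print(compose(f, g))
    print(image(g, {1}))
    print(inverse(g))
    print(is_function(f), is_function({(1, 2), (1, 3)}))
    print(restrict(g, {2}))
```

As in the definition, `compose(f, g)` applies g first and then f.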

Proposition 5.23 (Replacement). Let f be a function with domain a. Then f″a exists.

When ⟨x, y⟩ ∈ f and f is a function, then we want to write f(x) = y. For this, we can think of functional claims defined by two-place predicates Φ in the usual way: (∀x)(∃y)[Φ(x, y) ∧ (∀z)(Φ(x, z) → y = z)], though the details need checking. The next theorem shows that with some care similar details can be (tediously) worked through.

Proposition 5.24 (Ordered Pairs). ⟨a, b⟩ = ⟨c, d⟩ ⊣⊢ a = c ∧ b = d.

Proof. Right to left is substitution, a = c ⊢ a ∈ ⟨a, b⟩ → c ∈ ⟨a, b⟩, and similarly for b = d. We prove left to right. For a = c,

1   ⟨a, b⟩ = ⟨c, d⟩
2   (∀x)(x ∈ {{a}, {a, b}} ↔ x ∈ {{c}, {c, d}})                    def.5.22
3   (∀x)(x = {a} ∨ x = {a, b} ↔ x = {c} ∨ x = {c, d})              2, Abs.
4   {a} = {a} → {a} = {c} ∨ {a} = {c, d}                           3, Ax.X
5   {a} = {c} ∨ {a} = {c, d} →
      (∀x)(x ∈ {a} ↔ x ∈ {c}) ∨ (∀x)(x ∈ {a} ↔ x ∈ {c, d})         4, Ext.
6   (∀x)(x = a ↔ x = c) ∨ (∀x)(x = a ↔ x = c ∨ x = d)              5, Abs.
7   (∀x)(x = a ↔ x = c) → (a = a → a = c)
      ∧ (∀x)(x = a ↔ x = c ∨ x = d) → (c = c ∨ c = d → c = a)      6, Ax.V, Ax.IX
8   a = a ∨ c = c ∨ c = d → a = c                                  7, Ax.V
9   a = c                                                          8, prop.5.5, Rule II

To prove that b = d, first we prove that b = d ∨ b = a. Then we prove that b = d ∨ d = a. Once proved, these can be used in substitutions, and together will give the result. Again from the definition of ordered pairs,

1   {a, b} = {a, b} → {a, b} = {c} ∨ {a, b} = {c, d}
2   (∀x)(x = a ∨ x = b ↔ x = c)
      ∨ (∀x)(x = a ∨ x = b ↔ x = c ∨ x = d)                        1, Abs.
3   (b = a ∨ b = b ↔ b = c)
      ∨ (b = a ∨ b = b ↔ b = c ∨ b = d)                            2, Ax.X
4   b = c → b = c ∨ b = d                                          Ax.II
5   b = a ∨ b = b → b = c ∨ b = d                                  3, 4, Ax.IV, V
6   b = b                                                          prop.5.5
7   b = c ∨ b = d                                                  5, Rule II
8   a = c                                                          prop.5.24
9   a = c ∧ (b = c ∨ b = d)                                        7, 8, Rule I
10  (a = c ∧ b = c) ∨ (a = c ∧ b = d)                              9, Ax.III
11  a = c ∧ b = c → a = b                                          prop.5.5
12  a = c ∧ b = d → b = d                                          Ax.II
13  b = a ∨ b = d                                                  10, 11, 12, Ax.V

completing the first step of the argument. Now

1   a = c                                                          prop.5.24
2   ⟨a, b⟩ = ⟨c, d⟩
3   ⟨a, b⟩ = ⟨a, d⟩                                                1, 2, Rule V
4   (∀x)(x = {a} ∨ x = {a, b} ↔ x = {a} ∨ x = {a, d})              3, def.5.22
5   {a} = {a} → {a} = {a, d} ∨ {a, d} = {a, b}                     4, Ax.X
6   {a} = {a, d} ∨ {a, d} = {a, b} →
      (∀x)(x ∈ {a} ↔ x ∈ {a, d}) ∨ (∀x)(x ∈ {a, d} ↔ x ∈ {a, b})   5, Ext.
7   d = d → d = a ∨ d = b                                          6, Ax.X, V
8   d = a ∨ d = b                                                  prop.5.5, 7, Rule II

completing the second step of the argument. To conclude, we adjoin the two facts (b = a ∨ b = d) ∧ (d = a ∨ d = b); distributing possibilities,

(b = a ∧ a = d) ∨ (b = a ∧ d = b) ∨ (b = d ∧ d = a) ∨ (b = d ∧ d = b),

each of which implies that b = d, as required. □
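The statement of proposition 5.24 has a familiar classical shadow that can be checked exhaustively on a small domain. The sketch below is such a check, assuming hashable atoms and Python's `frozenset` as a stand-in for consistent, extensional finite sets; it is not a model of the DLQ proof above, where identity may behave inconsistently and each step must be justified by the listed axioms and rules.

```python
from itertools import product

def pair(a, b):
    """Ordered pair <a, b> = {{a}, {a, b}}, as in definition 5.22."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def pairs_characterized(domain):
    """Check <a, b> = <c, d> iff (a = c and b = d) over a finite domain."""
    return all(
        (pair(a, b) == pair(c, d)) == (a == c and b == d)
        for a, b, c, d in product(domain, repeat=4)
    )

if __name__ == "__main__":
    print(pairs_characterized(range(4)))
    # The degenerate case a = b collapses to {{a}}, which is why the
    # proof has to treat the disjuncts b = a and d = a separately.
    print(pair(1, 1) == frozenset({frozenset({1})}))
```

The degenerate case printed last is the classical counterpart of the case split in the second and third tables of the proof.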

Most set theories include the axiom of choice; proving choice would recapture the axioms of ZFC. We will see that choice is a consequence of a deeper Cantorian principle: that every set can be put into a wellorder. To show this, though, we must first do a lot of work on the notion of order itself.

6. Ordinals

Although the theory of cardinals is our final destination, this section, in which the ordinals are defined and their properties developed, is logically speaking both the most important and the most difficult. Cantor abstracted the notion of an ordinal from wellordered sets, as the prototypical inductive structure; the ordinals form the central pillar of set theory. Deriving a fixed metric is the key to a set theoretic foundation for mathematics, and the heretofore unfulfilled task of the paraconsistent project. While the definitions and theorems are, for the most part, standard, some proofs are not. Again, much of the novelty here is to show that the classical theorems can be captured, where until now it had appeared that this could not be done. This section also goes beyond the classical, and proves that the set of all ordinals On is itself an ordinal, confirming both that On is wellordered and that the Burali-Forti paradox holds. The most important property of ordinals is self-similarity, displayed in the recursive proposition that an ordinal is the set of all preceding ordinals, or more subtly when ordinals are defined to be transitive both inside and out. The present approach confirms this fractal property even at the level of absolute comprehensiveness: It is only right that On itself be an ordinal. This shown, the Peano axioms are justified, and proofs of transfinite induction and recursion given.

The theorems could be arranged to first study wellfoundedness, up to a proof of transfinite induction on any wellfounded set. That is, we could avoid discussing linearity or transitivity and still have a good chunk of structure, as many of the standard texts do. Useful classical references are [Dra74], [Lev79], and [Kun80].

Definition 5.25. A set a with respect to ∈ is:

strictly ordered iff

x, y, z ∈ a → x ∉ x ∧ (x ∈ y ∧ x ∉ x → y ∉ x) ∧ (y ∈ z → (x ∈ y → x ∈ z));

totally or linearly ordered by ⊆ iff a is strictly ordered and

x ∈ a ↦ (y ∈ a ↦ x ⊆ y ∨ y ⊆ x),

that is, ⊆-trichotomy holds;

wellfounded, Wf(a), iff

y ⊆ a ∧ (∃z)z ∈ y ↦ (∃z)(z ∈ y ∧ ¬(∃x)(x ∈ z ∧ x ∈ y));

wellordered, Wo(a), iff totally ordered and wellfounded;

transitive, Tr(a), iff x ∈ a → x ⊆ a;

an ordinal iff a is a wellordered and transitive set of ordinals, ⊆-connected to all other ordinals.
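For orientation only, several clauses of definition 5.25 can be checked mechanically on hereditarily finite sets read classically. The Python sketch below is an illustration under that assumption: it does not model DLQ, the difference between → and ↦, or inconsistent sets, and the function names are hypothetical.

```python
from itertools import combinations

def is_transitive(a):
    # Tr(a): every member of a is also a subset of a.
    return all(x <= a for x in a)

def is_subset_linear(a):
    # Any two members of a are comparable under ⊆.
    return all(x <= y or y <= x for x in a for y in a)

def is_wellfounded(a):
    # Every non-empty y ⊆ a has a member z with no member of z in y.
    members = list(a)
    for size in range(1, len(members) + 1):
        for combo in combinations(members, size):
            y = frozenset(combo)
            if not any(z.isdisjoint(y) for z in y):
                return False
    return True

# The first three finite von Neumann ordinals as nested frozensets.
empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})

print(is_transitive(two), is_subset_linear(two), is_wellfounded(two))
```

In this classical finite setting the set {{∅}} fails transitivity, since its member {∅} is not a subset of it; this is the kind of case the transitivity clause rules out.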

In summary, by (full) comprehension

Proposition 5.26. There is a set of all ordinals, On = {x : x is an ordinal }, such that

x ∈ On ↔ Wo(x) ∧ y ∈ x → y ⊆ x ∧ x ⊆ On ∧ y ∈ On ↦ (x ⊆ y ∨ y ⊆ x).

The definition of ordinal is adapted from von Neumann. We have added impredicative clauses, to capture the recursive idea that the ordinals are the same inside and out. The ordinals are an analysis of the concept of induction, generalized to the transfinite and reified. The hard work in the theory of ordinals is in locating the right definition. Then the properties of the ordinals, culminating in the Burali-Forti theorem, should all follow, as it were, by logic alone. On this point, in his Calculus on Manifolds, Spivak memorably isolates “three important attributes [of] many fully evolved major theorems: 1. It is trivial. 2. It is trivial because the terms appearing in it have been properly defined. 3. It has significant consequences.” I hope that, like the classical definition of ordinals, the definition here comes close to delivering in this way.

A few more detailed comments. Our extra clause in the anti-symmetry condition for strict order, x ∉ x, is due to the relevance constraint on implication, as we will see in the proofs below. In the wellfounding clause, the rendering of a set having a least member is material; the intensional alternative would have been that for any non-empty y ⊆ a, there is a z ∈ y such that (∀x)(x ∈ z → x ∉ y). Using this definition, however, makes it too difficult to prove that anything is wellfounded. Similarly, to gloss z as a least member if z ∩ y = ∅ would be almost impossible to confirm, since one would have to show that x ∈ z ∩ y not only fails, but is absurd. It may be contradictory, but it is not absurd. For linear order, we are using the ⊆ relation instead of the ∈ relation, and have built an added clause into the definition of On based on this choice. There is, as far as I can tell, no way to prove that On is linearly ordered by ∈.

On to the mathematics. Ordinals are written with lowercase Greek letters.

Proposition 5.27. ∅ ∈ On.

Proof. This is because ∅ is explosive. First, ∅ ⊆ On, by prop. 5.14. Similarly, x ∈ ∅ → x ⊆ ∅, transitivity, and x ∈ ∅ → (y ∈ ∅ → x ⊆ y ∨ y ⊆ x), a linear order. To show that ∅ is ⊆-connected to all ordinals it again suffices that ∅ ⊆ x for any x at all. Finally, to show wellfoundedness, we have

a ∈ y ∧ (x ∈ y → x ∈ ∅) → a ∈ y ∧ (x ∈ ∅ ∨ x ∉ y),

by counterexample. Again since ∅ is explosive we get

a ∈ y ∧ (x ∈ y → x ∈ ∅) → a ∈ y ∧ (x ∉ a ∨ x ∉ y).

Generalization completes the proof. 

Proposition 5.28. Subsets of a wellordered set are wellordered:

Wo(α), β ⊆ α ⊢ Wo(β).

Proof. Let α be wellordered and β ⊆ α. By proposition 5.11, β ⊆ α → (y ⊆ β → y ⊆ α). And (y ⊆ β → y ⊆ α) means that members of β behave just as members of α, giving wellorder. For example, linearity is proved using hypothetical syllogism and (the rarely seen) rule III:

1  x ∈ β → x ∈ α
2  y ∈ β → y ∈ α
3  (y ∈ α → x ⊆ y ∨ y ⊆ x) → (y ∈ β → x ⊆ y ∨ y ⊆ x)    2, hyp. syl.
4  (x ∈ α → (y ∈ α → x ⊆ y ∨ y ⊆ x)) → (x ∈ β → (y ∈ β → x ⊆ y ∨ y ⊆ x))    1, 3, Rule III
5  x ∈ β ↦ (y ∈ β ↦ x ⊆ y ∨ y ⊆ x)    4, m.p.

For wellfoundedness, notice that (∗) Φ → Ψ ⊢ Φ ∧ Υ → Ψ ∧ Υ; this allows the following derivation.

1  β ⊆ α.
2  β ⊆ α → (y ⊆ β → y ⊆ α).    prop. 5.11
3  (y ⊆ β → y ⊆ α).    1, 2, m.p.
4  (y ⊆ β ∧ ∃z z ∈ y → y ⊆ α ∧ ∃z z ∈ y).    3, (∗)
5  y ⊆ α ∧ ∃z z ∈ y ↦ ∃z(z ∈ y ∧ ∀x(x ∉ z ∨ x ∉ y))    def. 5.25
6  y ⊆ β ∧ ∃z z ∈ y ↦ ∃z(z ∈ y ∧ ∀x(x ∉ z ∨ x ∉ y))    4, 5, Ax. IV

as required.

Proposition 5.28 extends immediately to ordinals, as these are wellordered sets.

Proposition 5.29. On is transitive, α ∈ On → α ⊆ On.

Proof. This is a clause in the definition of being an ordinal. 

Proposition 5.30. An ordinal α is the set of all preceding ordinals,

α = {x : x ∈ On ∧ x ∈ α}.

Proof. Let α ∈ On. The previous theorem and the axiomatic instance β ∈ α → β ∈ α show that β ∈ α → β ∈ On ∧ β ∈ α. The other direction is immediate. Therefore α = α ∩ On.

Proposition 5.31. α ∈ On → α ∉ α.

Proof. The idea is that, were α ∈ α, then still α ∉ α.

1  α ∈ On → (∀x)(x ∈ α → x ∉ x).    def. 5.25
2  (∀x)(x ∈ α → x ∉ x) → (α ∈ α → α ∉ α).    Ax. X
3  (α ∈ α → α ∉ α) → α ∉ α ∨ α ∉ α.    Ax. VIII
4  α ∉ α ∨ α ∉ α → α ∉ α.    Ax. V

Conjunctive syllogism completes the proof.

Proposition 5.32. α ∈ β ∧ α ∉ α → β ∉ α.

Proof. α ∈ β ∧ α ∉ α → β ⊈ α by counterexample, and β ⊈ α → β ∉ α.

Proposition 5.33. β ∈ γ → (α ∈ β → α ∈ γ).

Proof. By the definition of transitivity on γ.

The last few propositions have shown, by virtue of self-similarity, that the ordinals are strictly ordered. Now we need total order and wellordering.

Theorem 5.34 (Trichotomy of Ordinals). Any two ordinals are ⊆-connected,

α ∈ On → (β ∈ On ↦ α ⊆ β ∨ β ⊆ α).

Proof. This is a clause in the definition of ordinal.

This delivers, for a start, some miscellany like the following.

Proposition 5.35. α ∩ β ∈ On, and α ∪ β ∈ On.

Proof. We can show α ∩ β is wellordered and transitive independently of trichotomy. Let x ∈ α ∩ β. Then x ∈ α ∧ x ∈ β; then x ⊆ α ∧ x ⊆ β; then x ⊆ α ∩ β, showing transitivity. Meanwhile, α ∩ β ⊆ α; since α is wellordered, α ∩ β is wellordered, too. Easier still is to notice that if α ⊆ β, then α ∩ β = α; and if β ⊆ α then α ∩ β = β, ordinals both. The case for α ∪ β is just like this.

More weakly, we see that certain intervals of On must be empty. Since α ∩ β ∈ On, we have α ∩ β ∉ α ∩ β. Therefore either α ∩ β ∉ α or α ∩ β ∉ β. Therefore there cannot be ordinals intervening between both α and α ∩ β, and β and α ∩ β. Observations like these are the spur for the classical proof that On is linearly ordered.

Proposition 5.36. The ordinals are wellfounded.

Proof. We have to show that a non-empty θ ⊆ On has a least member. Let β ∈ θ. The idea is that either β is ∈-least in θ, or else the least member of β is the least member of θ. Note that β ∩ θ ⊆ β; since β is wellordered, β ∩ θ is wellordered (by prop. 5.28). Either β ∩ θ is empty, or not.

Suppose (∀y)(y ∉ β ∩ θ). We infer by t-introduction that

θ ⊆ On ∧ β ∈ θ ↦ β ∈ θ ∧ (∀y)(y ∉ β ∨ y ∉ θ).

This says that β is a least member of θ, and generalizes:

θ ⊆ On ∧ ∃z z ∈ θ ↦ ∃z(z ∈ θ ∧ (∀y)(y ∉ z ∨ y ∉ θ)).

So far this shows (∀y)(y ∉ β ∩ θ) ⊢ Wf(On). On the other hand, suppose (∃y)(y ∈ β ∩ θ). Since β ∩ θ is wellfounded, trading ∃ for γ we have

γ ∈ β ∧ γ ∈ θ ∧ (∀y)(y ∉ γ ∨ y ∉ β ∨ y ∉ θ).

As γ ⊆ β (since γ ∈ β), by contraposition y ∉ β → y ∉ γ. Therefore the above formula reduces to γ ∈ θ ∧ (∀y)(y ∉ γ ∨ y ∉ θ), which says that γ is ∈-minimal in θ. Again with t-introduction this generalizes to

θ ⊆ On ∧ ∃z z ∈ θ ↦ ∃z(z ∈ θ ∧ (∀y)(y ∉ z ∨ y ∉ θ)).

This shows ¬(∀y)(y ∉ β ∩ θ) ⊢ Wf(On). Both directions of an excluded middle lead to Wf(On). An instance of a derived meta-rule, if Φ ⊢ Υ and ¬Φ ⊢ Υ, then Φ ∨ ¬Φ ⊢ Υ, is now used, and we have a theorem. Any arbitrary non-empty subset of ordinals has a least member.

Proposition 5.37. Any transitive set of ordinals, connected to all other ordinals by ⊆, is an ordinal.

Proof. Any set of ordinals is wellordered, by the previous theorems. Then the definition of being an ordinal is satisfied. 

Theorem 5.38. [Burali-Forti 1897] On ∈ On.

Proof. By the transitivity of On, we have y ∈ On → y ⊆ On. By ∨-introduction, y ⊆ On → y ⊆ On ∨ On ⊆ y. With conjunctive syllogism, y ∈ On ↦ y ⊆ On ∨ On ⊆ y as required. On is a wellordered, transitive set of ordinals connected to all other ordinals.

Corollary 5.39. On ∉ On, and then On ≠ On.

Proof. From prop. 5.31, and then prop. 5.7.

At this point the limitations of our logic leave us with an unsolved problem of basic relevant set theory. To see that On is closed under some expected operations, and that certain objects are ordinals, we need the following lemma:

Conjecture. Let α, β ∈ On, let θ be a wellordered, transitive set of ordinals, and let α ⊆ θ ⊆ β. Then θ ∈ On.

To prove this, we need to show under the assumptions of the conjecture that (∀y)(y ∈ On ↦ y ⊆ θ ∨ θ ⊆ y); then all the conditions for θ being an ordinal are satisfied. I do not have a proof that is adequate from a paraconsistent perspective. (Solatium miseris, socios habuisse malorum.) Any set of ordinals θ is such that ∅ ⊆ θ ⊆ On; so if this conjecture were proved, then for any θ that is transitive and wellordered, θ is an ordinal. We will assume the conjecture for the duration.

Definition 5.40. α⁺ = α ∪ {α} is the successor of α. Ordinals α with no predecessor, ¬(∃β)(β ∈ On ∧ β⁺ = α), are called limits.

Proposition 5.41. α ∈ On ⊢ α⁺ ∈ On.

Proof. All the members of α⁺ are ordinals, therefore α⁺ is wellordered. For transitivity,

1  x ∈ α⁺ → x ∈ α ∨ x ⊆ α.
2  x ∈ α → x ⊆ α.
3  x ⊆ α → x ⊆ α ∪ {α}.
4  x ∈ α⁺ → x ⊆ α⁺.

To show that α⁺ is connected to all other ordinals, it suffices to notice that α ⊆ α⁺ ⊆ On.

Proposition 5.42. On is a successor, and On is a limit.

Proof. To show successor: Since On is an ordinal, On⁺ is an ordinal. So On⁺ ∈ On, and by transitivity, On⁺ ⊆ On. And On ∈ On⁺ like all other ordinals, so On ⊆ On⁺ by transitivity again. Therefore On = On⁺. To show limit, we need β⁺ = On ⊢ β ∉ On for all β. We can then apply this rule disjunctively to excluded middle, to obtain (∀β)(β ∉ On ∨ β⁺ ≠ On), which is equivalent to On being a limit. We prove β⁺ = On ⊢ β = On from which β ∉ On follows, since On ∉ On by cor. 5.39. So let β⁺ = On; then β ⊆ On. Conversely, since On ∈ On by thm 5.38, also On ∈ β⁺ by def. 5.40. Then On ∈ β ∨ On = β and in either case, On ⊆ β by the transitivity of β. So β = On as required.

What Burali-Forti’s contradiction shows, in effect, is that the entire structure of the ordinals is recapitulated at the top. This fact provides the core of the other main arguments to follow: the wellordering principle and thus the solution to the cardinal assignment problem, as well as the existence of distinct and inaccessible transfinite cardinals, and a counterexample to the GCH. We get, then, quite a lot of work out of On; this is appropriate for two reasons. The first is just that it is not at all an easy task to prove On ∈ On in a non-trivial setting, so the fact that it is powerful shows that the logical machinery is in some sense ‘fuel efficient’. The second, perhaps deeper reason is that since set theory was clarified as a legitimate mathematics in Cantor’s Beiträge, there has been intense concern over the status of On, because it was recognized that there is a very serious paradox here; for a century it has been thought so serious that were On ∈ On then not only does a contradiction follow, but triviality, too. What we draw out of the paradox is more informative and more nuanced.

Now we sketch the development of Peano arithmetic, followed by more general ordinal arithmetic. Full comprehension induces

Proposition 5.43. There is an ω = {x ∈ On : [x = ∅ ∨ (∃y)(y⁺ = x)] ∧ x ⊆ ω}.

Members n, m, ... of ω are called natural numbers, or just numbers. By definition, ω ⊆ On; so it is a wellordered set.

Proposition 5.44. ω ∈ On.

Proof. Let n ∈ ω. Then by definition, n ⊆ ω, showing transitivity. As a subset of On, ω is wellordered. And n ⊆ ω ⊆ On, showing ⊆-connection to all other ordinals. So ω is an ordinal. 

Theorem 5.45 (Peano’s Postulates). The following hold: 0 ∈ ω; n⁺ ≠ 0; n ∈ ω ⊢ n⁺ ∈ ω; n⁺ = m⁺ ⊢ n = m.

Proof. Since 0 ∈ On ∧ 0 = 0 ∧ (∀y)(y ∈ 0 → y ∈ ω), zero is a number. Since n ∈ n⁺ but no n ∈ 0, zero is not the successor of any number. If n ∈ ω, then n⁺ meets all the requirements to be a number: n⁺ ∈ On, and has a predecessor, and all its members are either in n, in which case they are numbers, or else n, a number again. And if n⁺ = m⁺, then ∀z(z ∈ n ∨ z = n ↔ z ∈ m ∨ z = m); then picking n and m for z respectively, (n ∈ m ∨ n = m) ∧ (m ∈ n ∨ m = n), which distributes to

(n ∈ m ∧ m ∈ n) ∨ (n ∈ m ∧ m = n) ∨ (n = m ∧ m ∈ n) ∨ (n = m ∧ m = n),

each of which implies that n = m. So the successor of every number is unique.

Names of the first few natural numbers are

∅ = 0
∅⁺ = 1
∅⁺⁺ = 2
⋮
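As a small classical illustration (not a model of the DLQ development), the successor operation of definition 5.40 and the first few numerals can be built from nested frozensets in Python. The names `succ` and `numeral` are hypothetical helpers introduced here.

```python
def succ(a):
    # Successor: a ∪ {a}.
    return a | frozenset({a})

def numeral(n):
    # Apply the successor n times to the empty set.
    x = frozenset()
    for _ in range(n):
        x = succ(x)
    return x

zero, one, two, three = (numeral(k) for k in range(4))

# Each finite numeral is the set of its predecessors.
print(two == frozenset({zero, one}))
# Membership and inclusion agree on these finite cases.
print(two in three, two <= three)
```

On these finite cases the third and fourth Peano postulates of theorem 5.45 can be spot-checked directly, since `succ(n) == succ(m)` only when `n == m`.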

The fifth postulate is induction, proved in the general, transfinite case over On.

Theorem 5.46 (Transfinite Induction). Let θ ⊆ On. Suppose

(∀β)(β ∉ α ∨ β ∈ θ) → α ∈ θ.

Then ¬(∃α)(α ∈ On ∧ α ∉ θ).

Proof. Suppose (∃α)(α ∈ On ∧ α ∉ θ). Then there is a least such,

(∃α)[α ∉ θ ∧ (∀β)(β ∉ α ∨ β ∈ θ)].

But the hypothesis implies (∃β)(β ∈ α ∧ β ∉ θ) ∨ α ∈ θ, for every α, which negates this claim. Therefore there is no least, and so no ordinal not in θ, as required.

Transfinite induction will hold for any wellordered set, including ω. The base case, 0, is covered automatically by the induction hypothesis: If ¬Φ(0) → (∃x)(x ∈ 0 ∧ ¬Φ(x)), then Φ(0) by the explosiveness of ∅.

A close mate of induction is definition by recursion. There is something appropriate in proving the recursion theorem, as we are about to, with a set containing itself in its defining condition.

Theorem 5.47 (Transfinite Recursion). Let h be a function from V to V . There is a function f from On to V such that

f(α) = h(f|α).

Proof. Take the set

⟨x, y⟩ ∈ f ↔ y = h(f|x).

Existence is immediate from full comprehension. This is a function because h is; if ⟨x, y⟩, ⟨x, z⟩ ∈ f then y = h(f|x) = z.
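The shape of the scheme f(α) = h(f|α) can be seen on a finite initial segment of ω. The Python sketch below is a classical, finite illustration only: it computes f step by step rather than obtaining it from comprehension, and `recurse` and the sample `h` are hypothetical names chosen for the example.

```python
def recurse(h, n):
    # Build f on {0, ..., n-1} so that f(k) = h(f restricted to k).
    f = {}
    for k in range(n):
        restriction = dict(f)  # f|k: the values of f on all predecessors of k
        f[k] = h(restriction)
    return f

def h(restriction):
    # Sample h: one more than the sum of all earlier values.
    return 1 + sum(restriction.values())

print(recurse(h, 5))
```

With this sample h each value is one more than the sum of the earlier ones, so the computed values are 1, 2, 4, 8, 16.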

The recursion scheme is used to define ordinal arithmetic. To deal with limit ordinals, we look to least upper bounds.

Definition 5.48. Let X be a set of ordinals. The supremum of X, sup(X), is the least ordinal δ such that every x ∈ X is either in δ or identical to δ.

The usual identification of sup(α) with ⋃α does not seem to follow in the logic DLQ, or at least not with ordinals as they are here defined, because the existential quantifier resists a proof that ⋃α is an ordinal. Nevertheless, ordinals including every member of X certainly exist (On, for example), and so by wellfoundedness a least exists, too. If there is any question about the uniqueness of a supremum, a choice function (developed in the next section independently of ordinal arithmetic) can be applied to make a functional selection. Addition, taking h to be h(α) = α⁺, is

α + 0 = α
α + (β + 1) = (α + β) + 1
α + β = sup{α + γ : γ ∈ β} for limit β;

multiplication, taking h to be h_β(α) = α + β, is

α · 0 = 0
α · (β + 1) = α · β + α
α · β = sup{α · γ : γ ∈ β} for limit β;

exponentiation, with h_β(α) = α · β, is

α^0 = 1
α^(β+1) = α^β · α
α^β = sup{α^γ : γ ∈ β} for limit β.
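Restricted to finite ordinals, the zero and successor clauses above already determine the operations. The following Python sketch transcribes those two clauses on ordinary integers; it is a finite illustration only, the limit clauses are omitted because no limit ordinal is represented, and the function names are hypothetical.

```python
def add(a, b):
    # α + 0 = α ;  α + (β + 1) = (α + β) + 1
    return a if b == 0 else add(a, b - 1) + 1

def mul(a, b):
    # α · 0 = 0 ;  α · (β + 1) = α · β + α
    return 0 if b == 0 else add(mul(a, b - 1), a)

def power(a, b):
    # α^0 = 1 ;  α^(β+1) = α^β · α
    return 1 if b == 0 else mul(power(a, b - 1), a)

print(add(2, 3), mul(2, 3), power(2, 3))
```

Each operation is defined by recursion on its second argument and calls only the previous one, mirroring the layering of addition, multiplication and exponentiation in the text.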

From here we will work a bit faster, and assume that the ordinals are linearly ordered by membership as well as subsets, although α ⊆ β ∨ β ⊆ α ⊢ α ∈ β ∨ β ∈ α ∨ α = β is unproved.

7. Global Choice

In 1977, Routley produced an argument for the axiom of global choice from full comprehension [Rou77]. Since then the claim that choice is a theorem of naive set theory has become part of paraconsistent folklore. However, on close examination, Routley’s proof relies on a faulty definition of function. On the one hand, he does show the existence of a global choice, but on the other hand, the choice is not functional. A second formulation, in [PRN89](374), accounts better for functionality, but nevertheless falls afoul, at least, of Curry’s paradox. A significant claim of naive set theory in paraconsistent logic, then, has been erroneous.

Routley’s details were awry, but his forecast was not. Answering a fundamental question about the set concept, we here surpass classical theory and derive a global choice theorem without need of any further assumptions. In fact what is proved is a weak version of Cantor’s wellordering principle, which he took to be a Denkgesetz (law of thought), “basic and consequential and especially remarkable because of its general validity.” It was in service of proving this principle, essential to the theory of transfinite cardinals, that Zermelo formulated his choice principle in 1904.

The wellordering here is produced by injecting V into a particular subset of On, thereby giving a wellorder for every subset of V, that is, every set. The wellordering then provides the minimal resources for proving Hausdorff’s maximal principle; this latter a structure for Zorn’s lemma; and only then does a global choice theorem obtain. In the next section on cardinals, the choice principle is used to strengthen wellorderings.

Definition 5.49. A function f : a −→ b is:

injective, or one-one, iff (∀x)(∀y)(x ≠ y ↦ f(x) ≠ f(y));

surjective, or onto, iff (∀y)[y ∈ b ↦ (∃x)(x ∈ a ∧ f(x) = y)];

bijective, or a one-to-one correspondence, iff f is injective and surjective.

a ≤ b iff there is an injection from a to b. a < b iff a ≤ b and (∀f)[(∃y)(y ∈ b ∧ ¬(∃x)(x ∈ a ∧ f(x) = y))].
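For finite functions read classically, the three properties in definition 5.49 can be tested directly. In the Python sketch below a function is represented, by assumption, as a dict from its domain to values in a codomain b; the helper names are hypothetical and nothing here reflects the behaviour of ≠ in DLQ.

```python
def is_injective(f):
    # Distinct arguments receive distinct values.
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, b):
    # Every member of the codomain b is the value of some argument.
    return set(f.values()) == set(b)

def is_bijective(f, b):
    return is_injective(f) and is_surjective(f, b)

f = {1: "a", 2: "b"}
print(is_injective(f), is_surjective(f, {"a", "b", "c"}), is_bijective(f, {"a", "b"}))
```

Here `f` injects {1, 2} into {"a", "b", "c"} without being onto, which is the classical finite shape of the strict relation a < b.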

Theorem 5.50. The universe can be wellordered.

Proof. An injection f : V −→ On is required. Consider

⟨x, y⟩ ∈ f ↔ x ∈ V ∧ y ∈ On ∧ y = On.

That is, for each x ∈ V , f(x) = On. This is clearly a function. Now,

{α ∈ On : α = On} ⊆ On,

showing that the range of f is a segment of the ordinals and therefore wellordered. Intuitively, the Burali-Forti paradox indicates that the members of the range of f are discrete (corollary 5.39), of the form

... ∈ On ∈ On ∈ ...,

so {On} may be injected into by arbitrarily large sets, inducing a wellorder on them. Formally, because On ≠ On, (∀x)(∀y)(x ≠ y ↦ On ≠ On), so (∀x)(∀y)(x ≠ y ↦ f(x) ≠ f(y)). Therefore f is an injection. Thus

{x_f(x) : f(x) ∈ On}

is a wellorder on V.

The proof is clearly not constructive; given the ordering {a_On, b_On, c_On, ...} on V with each On distinct, it is not said how to determine a first member. This is exactly the case with Zermelo’s choice principle, which is a pure existence claim.

A proof of a Cantorian “law of thought” will inevitably be by demonstration of a bare existence of an ordering; and so here, it has been established that for any wellordered set there is a least member, and {On} is wellordered. The difference between Zermelo’s proof and our own is that no extra assumptions are needed to produce the existence claim. It comes directly from the set concept. Since subsets of a wellordered set are wellordered,

Corollary 5.51 (Zermelo 1904). Every set can be wellordered.

Definition 5.52. A ⊆-chain in a is a c ⊆ a such that, for x, y ∈ c, either x ⊆ y or y ⊆ x. An upper bound x ∈ a is such that for all y ∈ a, y ⊆ x. A maximal element x ∈ a is such that for any y ∈ a, if x ⊆ y then x = y. A maximal chain b ⊆ a is such that if any c is both a chain in a and b ⊆ c, then b = c.

Below, a ⊆-chain is called just a chain.
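The notions in definition 5.52 have a direct finite, classical reading. The Python sketch below checks them for a family of frozensets; it is an illustration under that assumption only, with hypothetical function names.

```python
def is_chain(c):
    # Any two members of c are comparable under ⊆.
    return all(x <= y or y <= x for x in c for y in c)

def is_upper_bound(x, a):
    # Every member of a is a subset of x.
    return all(y <= x for y in a)

def is_maximal_element(x, a):
    # No member of a properly extends x.
    return all(x == y for y in a if x <= y)

a = [frozenset({1}), frozenset({2}), frozenset({1, 2})]
print(is_chain(a), is_chain([a[0], a[2]]))
print(is_upper_bound(a[2], a), is_maximal_element(a[0], a))
```

In this family {1} and {2} are incomparable, so the whole family is not a chain, while {1} ⊆ {1, 2} is one.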

Proposition 5.53 (Hausdorff’s Maximal Principle). Every set has a maximal chain.

Proof. By the wellordering theorem, let {x_ξ : ξ ∈ α} be a wellordering of a. From this ordering is built a maximal chain c ⊆ a. Define f by transfinite recursion:

f(x_ζ) = {x_ζ} if {x_ζ} ∪ ⋃_{ξ∈ζ} f(x_ξ) is a chain;
f(x_ζ) = ∅ otherwise.

For relevance purposes, {x} here is {x}_a = {y ∈ a : y = x}, so that {x} ⊆ a when x ∈ a. Then it is to be shown that ⋃_{x∈a} f(x) = c is a maximal chain in a. Note that, by the law of excluded middle, c is not empty. To show c is a chain, consider any u, v ∈ c. Either u precedes v or v precedes u in the wellorder on a, or else u = v. So either u = ∅ or {u_ζ} ∪ c is a chain, and either v = ∅ or {v_ζ} ∪ c is a chain. Since (∀x)(∅ ⊆ x), either u = v, or u ⊆ v, or v ⊆ u. Therefore c is a chain. To show c is maximal, let b be a chain in a such that c ⊆ b. Since b is a chain, for u ∈ c and v ∈ b we have either u ⊆ v ∨ v ⊆ u. Then {v} ∪ c is a chain, so v ∈ ⋃_{x∈a} f(x) by definition of f. Then v ∈ c and b ⊆ c, giving b = c as sought.

Lemma 5.54 (Zorn). If every chain of some non-empty a has an upper bound, then a has a maximal element.

Proof. From Hausdorff’s principle, let c ⊆ a be a maximal chain in a, and ex hypothesi let x be the upper bound of c. Suppose y ∈ a ∧ x ⊆ y. Then c ∪ {y} is a chain in a with upper bound y. But c ⊆ c ∪ {y}; so c = c ∪ {y} by the maximality of c. So, both being upper bounds of c on assumption, x = y. This proves that x is maximal in a.
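The recursion used for Hausdorff's principle has a simple finite analogue: walk through a fixed enumeration of the family and keep each member that is comparable with everything kept so far. The Python sketch below illustrates that classical, finite case only. The list order stands in for the wellordering, and `greedy_maximal_chain` is a hypothetical name.

```python
def greedy_maximal_chain(family):
    # family: a list of frozensets; its order plays the role of the wellorder.
    chain = []
    for x in family:
        # Keep x exactly when adding it still leaves a ⊆-chain.
        if all(x <= y or y <= x for y in chain):
            chain.append(x)
    return chain

family = [frozenset({1}), frozenset({2}), frozenset({1, 2}), frozenset({1, 2, 3})]
chain = greedy_maximal_chain(family)
print(chain)
```

Every rejected member is incomparable with something already kept, and kept members are never removed, so the result cannot be extended to a longer chain in the family. A different enumeration may yield a different maximal chain.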

Deviating in notation, in the next proof σ and η denote not ordinals but functions, because they are rather special functions. A function σ is called a choice function on a iff σ(x) ∈ x when x ∈ a and (∃z)z ∈ x.

Theorem 5.55 (Choice). There is a choice function on every non-empty set.

Proof. Let c be the set of all choice functions on subsets of a. Let u ⊆ c be all the choice functions totally ordered by inclusion: For any f, g ∈ u, either f ⊆ g ∨ g ⊆ f. Let ⋃u = η. For each ⟨x, y⟩ ∈ η, η(x) = f(x) for some f ∈ u, so η is a choice function on subsets of a, showing η ∈ c. For f ∈ u, by definition f ⊆ η, showing η is an upper bound of u. Then by Zorn’s lemma c must have a maximal element, σ, whence from f ∈ c and σ ⊆ f follows f = σ. We only need now that the domain of σ is a. Let x ∈ a be non-empty, and consider σ′ = σ ∪ {⟨x, y⟩} with y ∈ x. Then σ′ ∈ c and σ ⊆ σ′, so by maximality σ′ = σ. Thus σ is a choice on a as desired.

Corollary 5.56 (Global Choice). There is a choice function on V. A fortiori, for every non-empty a there is a choice function σ such that σ(a) ∈ a.
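In the finite case a choice function can simply be written down, which makes the defining condition σ(x) ∈ x easy to inspect. The Python sketch below is only such an illustration: it assumes the members of each set can be compared so that `min` is available, and `choice_function` is a hypothetical name. Nothing in it reflects the non-constructive argument above.

```python
def choice_function(family):
    # Map each non-empty member x of the family to one of its elements.
    # The empty set, if present, receives no value.
    return {x: min(x) for x in family if x}

family = [frozenset({3, 1}), frozenset({7}), frozenset()]
sigma = choice_function(family)
print(all(sigma[x] in x for x in family if x))
```

Any other rule that returns a member of each non-empty set would serve equally well; taking the minimum is just one functional selection.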

We can complete the familiar circle and re-prove Zermelo’s theorem:

Proposition 5.57. (∀x)(∃α)(x ∼ α).

Proof. By recursion:

f(y_ζ) = σ(x − {y_ξ : ξ ∈ ζ}) if there is one;
f(y_ζ) = ∅ otherwise,

again leads to a wellorder on x, a least ordinal θ such that f(y_θ) = ∅.

Aside from settling long-standing questions, a global choice theorem provides a solution to a pressing problem of uniqueness for mathematics in intensional logics like DLQ. The wellordering on V, and indeed on On, so far developed in theorem 5.50 does not guarantee that there is only one least member of a given wellordered set: the least member. Instead, we have an equivalence class of the minimal members of a. A functional choice σ(x) from the equivalence set x, though, can act as the desired unique member, as the next section explores.

8. Cardinals

Here we define cardinal numbers and study their most basic properties, working from the fundamental

Definition 5.58. a is equipollent to b, a ∼ b, iff there is a bijection between a and b.

Chiefly we must solve the cardinal assignment problem posed, in essence, by Cantor [Can95](482), of defining an operation | · | which satisfies (at least):

x ∼ |x|, |x| = |y| when x ∼ y.

As is standard, we will use von Neumann’s solution, taking specific ordinals for cardinals, and requiring the axiom of choice. The end goal is to prove Cantor’s theorem, that there are orders of transfinite powers. This will show that the transfinite still has rich structure in a paraconsistent setting, a point that has been in some doubt [Pri06b](253). To start, here is a classic theorem of naive set theory, an adaptation of theorem 66 from [Ded88] proving the existence of an infinite set. Any set a is dedekind infinite iff (∃b)(b ⊂ a ∧ b ∼ a). Then

Proposition 5.59 (Dedekind 1888). Some sets are dedekind infinite.

Proof. Consider f = {⟨x, {x}⟩ : x ∈ V}. This is an injection of V into a proper part of itself. Therefore V is dedekind infinite.

Proposition 5.60. Equipollence is an equivalence relation.

a ∼ a, a ∼ b → b ∼ a, a ∼ b ∧ b ∼ c → a ∼ c.

The following forms of transitivity hold, too.

a ∼ b → (b ∼ c → a ∼ c), a ∼ b → (c ∼ a → c ∼ b).

Proof. For reflexivity, the function is identity. For symmetry, if f is a bijection from a to b, then f⁻¹ is a bijection from b to a. For (conjunctive) transitivity, if f is a bijection from a to b and g is a bijection from b to c then f ◦ g is a bijection from a to c. As in proposition 5.5, too, the implicational phrasings of transitivity⁴ are from hypothetical syllogism.

⁴ If there is any doubt about this form of transitivity, note that it and prop. 5.61 below which it supports are disposable, since the proof of the more important prop. 5.62 requires only conjoined transitivity.

For any a, therefore, we have the ∼-equivalence class {x : x ∼ a}. This is the Frege cardinal of a. Frege cardinals are a plausible of the concept of size, especially given the naive tie-up between set membership and predication: The concept of a-sized is just the extension of a-sized objects. The notion can be tightened by intersecting Frege cardinals with the ordinals. Let C be the set of all ∼-least ordinals:

x ∈ C ↔ x ∈ On ∧ ¬(∃z)(z ∈ x ∧ z ∼ x).

This immediately delivers a weak form of an important counting principle:

Proposition 5.61. For every α ∈ C there is a set C_α = {x : x ∈ C ∧ x ∼ α}, such that, for β ∈ C,

C_α = C_β ⊣⊢ α ∼ β.

Proof. Left to right, the premise is x ∈ C ∧ x ∼ α ↔ x ∈ C ∧ x ∼ β. Then α ∼ α implies that α ∼ β. Right to left, let f : α −→ β be a bijection. Then if g : x −→ α is a bijection, f ◦ g : x −→ β is a bijection: hence x ∼ α → x ∼ β. Now suppose x ∈ C_α. Then x ∈ C ∧ x ∼ α; then x ∈ C ∧ x ∼ β, and therefore x ∈ C_β. Ergo C_α ⊆ C_β. Mutatis mutandis to show C_β ⊆ C_α.

Is C_α the cardinality of α? This assignment does not meet the requirement that α ∼ C_α, the useful criterion that x ∼ |x|. For this we need to be even more specific about the object called a cardinal. A set x is a soft cardinal iff there is some ∼-equivalent part of C having x as a member:

C = {x : (∃α)(x ∈ C_α)}.

By the wellordering theorem, for every set a there is an ordinal α such that a ∼ α. So for every a there is an x ∈ C such that a ∼ x and x is least of its size; write C(a) for such an x. In a sense, this is a sufficiently robust notion of cardinality: It solves the cardinal assignment problem, yielding easily that a ∼ b when C(a) = C(b). On the other hand, the paraconsistent logic used here is not like classical logic, and the members of C are not demonstrably unique. A fixed point property does not obtain: If x ∈ C, we cannot show (although we don’t have a disproof) that x = C(x). Nor does C(a) = C(b) follow from a ∼ b, which is awkward. To be even more exacting, then, we can assign a specific ordinal to be the cardinal of a set. Let σ be a choice function. The set of the cardinals is:

C = {x : (∃α)[x = σ(C_α)]}.

Obviously C ⊆ C, and in many cases there is no cause to distinguish the two. The (proper) cardinals, however, do deliver a function: The cardinality of a is a function | · | : V −→ C,

⟨x, κ⟩ ∈ | · | ↔ κ ∈ C ∧ x ∼ κ.

To show that cardinality is functional, it suffices to check that |x| = κ ↔ κ ∈ C ∧ x ∼ κ, which follows from the functionality of the choice σ. It is similarly basic to check that cardinality is a fixed point: κ ∈ C ∧ κ ∼ κ ↔ |κ| = κ. Theorem 5.50 means that for every a, there is a κ ∈ C such that |a| = κ.

Proposition 5.62. |a| = |b| ⊣⊢ a ∼ b.

Proof. Suppose |a| = |b|. Let |a| = κ = |b|; then by definition a ∼ κ ∼ b. Then by transitivity, a ∼ b. In the other direction, suppose that a ∼ b. If |a| = κ then a ∼ κ, and by transitivity, b ∼ κ too, showing that |b| = κ. Then |a| = κ = |b|, implying |a| = |b| as required. 

Proposition 5.63 (Hartogs). |a| ≤ |b| ∨ |b| ≤ |a|.

Proof. Let |a| = κ, |b| = λ. Since κ, λ are ordinals, trichotomy under ∈ holds. If κ ∈ λ, then κ ⊆ λ, and then f(x) = x is an injection from κ to λ. Mutatis mutandis for λ ∈ κ and κ = λ. 

Proposition 5.64. a ⊆ c ⊆ b, |a| = |c| ⊢ |a| = |c| = |b|.

Proof. Usually the axiom of choice makes this very easy; but even without choice (or any attendant easiness), a proof by standard back-and-forth construction goes through without paraconsistent difficulties. Define by recursion c_0 = c, c_{n+1} = f″c, and b_0 = b, b_{n+1} = f″b. Let d_n = c_n ∩ b_n for every n. Then define g from c to b as

g(x) = f(x) if x ∈ ⋃_{n∈ω} d_n;
g(x) = x otherwise,

and we have that g is a bijection, showing |c| = |b|, and then by transitivity, the result.

Proposition 5.65 (Cantor-Schröder-Bernstein). |a| ≤ |b| ∧ |b| ≤ |a| ⊢ |a| = |b|

Proof. Let f : a −→ b and g : b −→ a be injections. By definition, g″(f″a) ⊆ g″b ⊆ a, and g⁻¹ : g″b −→ b is a bijection. Now, g ◦ f : a −→ g″f″a is an injection. By prop. 5.64, therefore, there is a bijection h : a −→ g″b. And then g⁻¹ ◦ h is a bijection between a and b.

The most spectacular feature of set theory is the existence of distinct transfinite cardinals, as Cantor discovered in 1873. His argument is recast at the end of this chapter. Interestingly, one can prove the substance of the theorem with a very direct reflection argument. The following general lemma, in fact, suggests that inconsistency is as closely tied to infinity as Aristotle originally suspected. Distinctive of some paraconsistent mathematics, here is how to count inconsistent sets.

Lemma 5.66. b ∈ a ∧ b ∉ a ⊢ (∀z)|z| < |a|.

Proof. Let b ∈ a ∧ b 6∈ a. The idea is that b is in a but outside the range of any mapping into a, because b is not in a. Let f : z −→ a be a function,

f ⊆ {⟨u, v⟩ : u ∈ z ∧ v ∈ a}.

For each u ∈ z, we can at least have ⟨u, b⟩ ∈ f be a functional injection, as in prop. 5.50; this shows |z| ≤ |a|. Because b ∉ a, also ¬(u ∈ z ∧ b ∈ a), for any u. Then

(∀u)⟨u, b⟩ ∉ f.

Therefore b ∈ a ∧ ¬(∃u)(u ∈ z ∧ ⟨u, b⟩ ∈ f). Then by generalization, (∀f)(∃y)(y ∈ a ∧ ¬(∃x)(x ∈ z ∧ f(x) = y)). Therefore a is strictly larger than any set.

The first application is to prove that the ordinals are a cardinal, which makes sense, given that On is a limit ordinal, to say the least. The last ordinal is the greatest size:

Proposition 5.67. On ∈ C.

Proof. There are two main conditions to satisfy: being an ordinal, and being the least ordinal of its size, ¬(∃α)(α ∈ On ∧ α ∼ On). Well, On ∈ On. Since On ∉ On, also (∀u)⟨u, On⟩ ∉ f, for any u, as in lemma 5.66. If (∀z)|z| < On, then there is no surjection from z to On, and so no bijection, showing (∀z)(z ≁ On). Therefore (∀x)(x ∉ On ∨ x ≁ On). It is further desirable that On be a cardinal, the representative of its size-class. Now, it seems plausible that if any least ordinal of its size be unique, then On would be. Is On = σ(C_On)? Well, unlike any other ordinal, we know that On is maximal: that On ∈ α ↦ On = α, for every α, including any α ∈ C_On. This is a unique property for On, so here we can make the choice function specific and stipulate that σ′(C_On) = x ↔ (∀α)(α ∈ C_On ↦ (x ∈ α ↦ x = α)). This makes On a cardinal.

Corollary 5.68. For every cardinal, there is a strictly greater cardinal.

Proof. Because (∀x)|x| ≤ On, by generalization (∃y)(∀x)|x| ≤ |y|. Then (∀x)(∃y)|x| ≤ |y| as required. 

A statement and proof of the more well-known form of Cantor’s theorem is below, theorem 5.73. We have shown that inconsistent sets are absolutely infinite—vindicating the limit on size intuition, that all inconsistent sets are overlarge:

Proposition 5.69. If (∀z)|z| < |a|, then |a| = On.

Proof. Let (∀z)|z| < |a|. Then (∀z)(|z| ≤ |a|). Meanwhile, |a| ≤ On. Since On is a cardinal, the Cantor-Schröder-Bernstein antisymmetry prop. 5.65 applies.

Definition 5.70. Let κ ∈ C. Cardinal successor is a function | · |′ : C −→ C satisfying κ′ = λ ↔ κ < λ ∧ ¬(∃y)(y ∈ λ ∧ κ < y). By recursion, the alephs are

ℵ_0 = ω
ℵ_{α+1} = ℵ_α′
ℵ_λ = sup{ℵ_κ : κ < λ} for λ a limit ordinal.

Cardinal arithmetic works as follows. If a and b are disjoint sets, ¬(∃x)(x ∈ a ∩ b), then cardinal sum and product are

|a| + |b| = |a ∪ b|

|a| · |b| = |a × b|

In general, let {κ_ξ : ξ ∈ α} be a set of cardinals; then sum and product are, respectively,

Σ_{ξ∈α} κ_ξ = |sup{κ_ξ : ξ ∈ α} × {ξ}|,

Π_{ξ∈α} κ_ξ = |Π_{ξ∈α} κ_ξ|.

To define exponentiation, let a^b = {f : f is a function from b to a}. Then

|a|^{|b|} = |a^b|.
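For finite sets the three clauses can be checked by direct counting. The following is a minimal Python sketch under purely classical assumptions; it is not part of the DLQ development, and the helper name function_space and the sample sets a and b are illustrative choices only.

```python
from itertools import product

def function_space(a, b):
    """Return all functions from b to a, each encoded as a frozenset of (input, output) pairs."""
    domain = sorted(b)
    return {frozenset(zip(domain, outputs))
            for outputs in product(sorted(a), repeat=len(domain))}

a = {"p", "q", "r"}
b = {0, 1}
assert a.isdisjoint(b)  # the sum clause assumes a and b share no members

card_sum = len(a | b)                     # |a| + |b| = |a ∪ b|
card_product = len(set(product(a, b)))    # |a| · |b| = |a × b|
card_power = len(function_space(a, b))    # |a|^|b| = |a^b|

print(card_sum, card_product, card_power)  # 5 6 9
```

Encoding each function as a set of ordered pairs keeps the example close to the set-theoretic reading of a^b used in the text.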

Lemma 5.71. ℵ_On = On.

Proof. Since On is a limit (prop. 5.42),

ℵ_On = sup{ℵ_α : α ∈ On}.

As On ∈ On, it follows by the definition of supremum that ℵ_On ∈ sup{ℵ_α : α ∈ On}. So by substitution of identicals, ℵ_On ∈ ℵ_On. These are ordinals, so ℵ_On ∉ ℵ_On, too. Then by prop. 5.66, ⟨On, ℵ_On⟩ ∉ f for all f, showing On ≤ ℵ_On. Since ℵ_On ∈ On, also ℵ_On ≤ On, and the result follows.

Theorem 5.72 (Cantor 1899). Every cardinal is an aleph.

Proof. This is an adaptation of an argument from Cantor’s 1899 letter to Dedekind [Can67] that makes use of inconsistent sets and so shows some affinity of technique with our own. Let |M| be a power not an aleph. By wellorder, then, the whole of the alephs are injectable into |M|; in particular, On ≤ |M|. But also, like everything else, |M| ≤ On, showing that |M| = On = ℵ_On, an aleph after all.

Theorem 5.73 (Cantor 1892). |a| < |P(a)|.

Proof. The function taking each z ∈ a to {x ∈ a : x = z} establishes an injection, |a| ≤ |P(a)|. Cantor’s original argument for strictness employs properties of implication outside DLQ. Nevertheless, ours shares with Cantor the construction of a diagonal set that pushes its powerset beyond the reach of any one-one map with a. Consider r = {x ∈ a : x ∉ r}. Some properties of r are forthcoming. Since (x ∈ r → x ∉ r) → x ∉ r, we have (∀x)x ∉ r. Further, by definition of r, (∀x)(x ∈ a ∧ x ∉ r → x ∈ r). Ergo every member of a both is and is not in r.

The argument considers an arbitrary f from a to P(a) and finds that it is not a surjection, establishing that all such functions are not onto and hence that there is no bijection, proving strictness. Suppose that there is a one-one correspondence f between a and P(a). Then for any y ⊆ a there is a z ∈ a such that f(z) = y. Either (∃x)x ∈ a ∨ ¬(∃x)x ∈ a. The meta-rule allows us to eliminate cases over consequence. In the second case, for every y ⊆ a there is no z ∈ a such that f(z) = y, so there is no bijection and the theorem holds. So now we focus only on non-empty a. Then for some z ∈ a, f(z) = r, which means by extensionality that

(∀x)(x ∈ f(z) ↔ x ∈ r).

We show this equivalence fails and then so too does the map f. For every x, either x ∈ f(z) ∨ x ∉ f(z). Let z ∈ a. If the former, then since always z ∉ r, we have

z ∈ f(z) ∧ z ∉ r,

so by counterexample, (∃x)¬(x ∈ f(z) → x ∈ r), so f(z) ≠ r after all. If instead z ∉ f(z), since z ∈ a we know that z ∈ r,

z ∈ r ∧ z ∉ f(z)

which by counterexample again gives ¬(∀x)(x ∈ r → x ∈ f(z)), and again shows f(z) ≠ r. Thus f(z) ≠ r for z ∈ a, proving there is no bijective f, quod erat demonstrandum.

This leads directly to Cantor’s paradox, recorded in Corollary 5.74.
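For comparison with Theorem 5.73, the classical diagonal argument can be verified exhaustively on a small finite set. The sketch below is an illustration only and rests on two assumptions: it uses Cantor's classical diagonal set {x ∈ a : x ∉ f(x)}, not the self-referential set r of the DLQ proof, and it reasons with ordinary two-valued logic. The names powerset, diagonal and all_maps are hypothetical helpers.

```python
from itertools import chain, combinations, product

def powerset(a):
    """Return every subset of a as a frozenset."""
    items = sorted(a)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def diagonal(f, a):
    """Cantor's classical diagonal set {x in a : x not in f(x)} for a map f : a -> P(a)."""
    return frozenset(x for x in a if x not in f[x])

a = [0, 1, 2]
subsets_of_a = powerset(a)

# Every function from a to P(a), encoded as a dict: 8**3 = 512 maps in total.
all_maps = [dict(zip(a, images)) for images in product(subsets_of_a, repeat=len(a))]

# For each map, the diagonal set is a subset of a that the map fails to hit.
never_onto = all(diagonal(f, a) not in f.values() for f in all_maps)
print(len(all_maps), never_onto)  # 512 True
```

Because no map from a to P(a) reaches its own diagonal set, none is a surjection, which is the classical content of strictness in the finite case.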

Corollary 5.74. |V| < |P(V)|, but also |V| = |P(V)|.

Proof. The first is an instance of Cantor’s theorem. The second follows from proposition 5.65.

The generalized continuum hypothesis GCH conjectures for all cardinals κ, λ that ¬(κ < λ < |P(κ)|).

Theorem 5.75. The GCH fails at On, and an instance of GCH holds at On.

Proof. For a failure of the GCH: The cardinal On provides a counterexample. From Cantor’s theorem, On < |P(On)|. But On < On < |P(On)|. For an instance of GCH, let λ be a cardinal On < λ < |P(On)|. Being a cardinal, ¬(On < λ). Thus ¬(On < λ < |P(On)|). In fact, because 2^On = On (by the maximality of On and Cantor’s theorem), and On = ℵ_On, and On = On + 1, then by a few substitutions and existential generalization follows

(∃α)(2^{ℵ_α} = ℵ_{α+1}).

CHAPTER 6

Reflection and Large Cardinals

Is it you, or is that a fault-line on the horizon

a second sky?

Matthew Francis, Ocean

This chapter proves the existence of some large cardinals, based on a reflection theorem. The ordinals were developed combinatorially in chapter 5. Now, using what in standard terminology would be called proper class arguments, they are enriched to produce a clean description of some features unique to Cantor’s absolute and to diagonal inclosures in general. Connections are also drawn with more recent mainstream mathematics, particularly Kunen’s result on Reinhardt cardinals.

1. Introduction

Definitions by recursion on the ordinals are well-understood from the bottom up. We define a base case and a method for succession. What about from the top down? As Barwise and Moss point out [BM96](216), expressions like

f(α) = ⋂_{β∈α} f(β)

are not well-understood, because to get the recursion going would require a greatest, as opposed to a least, fixed point in the ordinals. The issue arises in non-well-founded set theory, where least members are not in general guaranteed; and some interesting work is done there on corecursion, defining downward from the greatest fixed points of operators. Here we have a straightforward and natural starting point: The fixpoint On itself.

As discussed in chapter 2, the ordinals may be approached from below and from above. By building upward from below, the existence of further and further ordinals, and their limiting points, the cardinals, has been a matter of postulation. From above, especially from a well-defined starting point like On, the process of approaching these objects is significantly streamlined. Working out the details here will give some precision to the analysis that “actually conceived, the absolute bounds the entire ordinal sequence, whereas potentially conceived, the ordinal sequence is absolutely unbounded.”[Jan95](383)

Since at least 1897 it has been clear that there is something special about the totality of the ordinals. This is evinced by the conflicted treatment that the ordinals receive. On the one hand, it is a truism that the ordinals are the central organizational index for set theory, that ordinals as the bearers of wellorder are the essence of the discipline. “When we look at the general pattern of questions discussed in set theory,” writes Levy, “we see that one main theme is the study of the structure of well ordered classes, or, what amounts to the same thing, the study of the structure of the class of the ordinals.”[Lev79](292) To this end most twentieth century research occurred at the limiting case of this attitude, spent trying to reduce the entire set theoretic universe to no more than the wellfounded universe. On the other hand, most research in the twentieth century was preoccupied with blocking and denying the resilient inconsistency of On, because it was believed that to admit On ∈ On would be trivializing, proving everything. The totality of ordinals is important and inconsistent.

The driving idea in the present chapter is to find a point of moderation between these extreme tendencies of “angst and awe”, the tendencies of both privileging and scorning the class of ordinals: to learn more than classical theory by following the ordinals all the way up, but with the conviction that in doing so we infer less than ex falso quodlibet suggests. Naive set theory can prove quite a lot without proving everything. This is new mathematics, very much related to old. The ideas are connected with some more recent mainstream research in higher set theory. To show this, I will quote some classical theorems which, though they may not be part of naive theory, make for important points of comparison and corroboration with the picture of sets being developed here, namely inconsistent, interesting activity at the end of the ordinal line.

1.1. Axioms of Infinity. The deep question about transfinite ordinals, and so of this chapter, is their extent, a theme Levy identifies as a main question of set theory [Lev79](289). The question therefore finds focus at the limiting regions, the place at which the question can be answered. And insofar as ordinals are meant to extend as far as possible, the question is broader, about the totality of all sets V. Reflection principles are for isolating particularly distant ordinals, called large cardinals, in order to shed light on the question: How far does the universe go on? A natural, perhaps naive, strategy for answering this question is simply to try and find the extremity; this will supply, by direct inspection, a concrete answer.

Paul Cohen [Coh66] showed that ZF is incomplete. For want of more theorems, classical set theorists must posit more axioms: in Gödel’s idiom, axioms of infinity. In [Gö64](264), Gödel explains that “the axioms of set theory by no means form a system closed in itself, but, quite on the contrary, the very concept of set on which they are based suggests their extension by new axioms .... These axioms can be formulated also as propositions asserting the existence of very great cardinal numbers (i.e. of sets having these cardinal numbers). The simplest of these strong ‘axioms of infinity’ asserts the existence of inaccessible numbers (in the weaker or stronger sense) > ℵ_0.”

Similarly, “the belief in the existence of inaccessible cardinals,” Tarski [Tar62] writes, “seems to be a natural consequence of basic intuitions underlying the ‘naïve’ set theory and referring to what can be called ‘Cantor’s absolute’. ... What is regarded by many as one of the main aims of research in the foundations of set theory [is] the axiomatization of increasingly large segments of ‘Cantor’s absolute’.”

The axioms assert that there are more and more cardinals, each prescribing its own transcendence over all the cardinals below [Kan94](xi). These successively stronger assumptions are linearly ordered by consistency strength, and have been shown by Kunen to come to an abrupt, contradictory end at Reinhardt cardinals [Kun71]. The aim of these posits has been to settle outstanding questions of set theory, e.g. the generalized continuum hypothesis. To date, the posits have not been successful in this regard.

For classical set theory, large cardinal axioms are an attempt to regain the conceptual closure that is lost in the iterative hierarchy. They also encode the fact that any closure can be transcended. So large cardinals can be explained as an inclosure phenomenon; the part of the diagonal function is played by the various definitions of larger and larger cardinals. Classically, the result at the ultimate closure level is a brute contradiction. In naive set theory this conflict is put to direct use in fueling new proofs.

Because the last century has been dominated by the cumulative concept of set, the wellfounded sets, V_0 = ∅, V_{α+1} = P(V_α), V_λ = ⋃_{κ<λ} V_κ with λ a limit, has been taken as the answer, as the universe simpliciter.

“In such arguments ‘universe’ is sometimes replaced by ‘class of all ordinals’. Such replacement is quite natural given the tie up between ordinals and the universe through the cumulative hierarchy. Moreover, this also fits with Cantor’s view that the ordinals are natural ‘expression’ or ‘representation’ of the Absolute.” [Hal84](116)
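The first finite stages of the cumulative hierarchy, V_0 = ∅ and V_{α+1} = P(V_α), can be computed explicitly. The following Python sketch is a classical illustration only; the function names powerset and level are hypothetical, and nothing here reaches limit stages or the naive universe V.

```python
from itertools import chain, combinations

def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    return frozenset(frozenset(c) for c in subsets)

def level(n):
    """V_n for finite n: V_0 is empty and V_{k+1} = P(V_k)."""
    v = frozenset()
    for _ in range(n):
        v = powerset(v)
    return v

sizes = [len(level(n)) for n in range(5)]
cumulative = all(level(n) <= level(n + 1) for n in range(4))   # each stage contains the last
no_self_members = all(x not in x for x in level(4))            # wellfounded stages have no x with x ∈ x

print(sizes, cumulative, no_self_members)  # [0, 1, 2, 4, 16] True True
```

The last check reflects the point made next in the text: a set such as a = {a} never appears at any stage built this way.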

Now, we can indeed stratify V and index the levels with ordinals, so talk of V_α and even V_On is entirely definite. And in adapting the large cardinal work of classical theory, it will make most sense to deal in ordinals and the fragment V_On, rather than, as is claimed in e.g. [Kan94], the universe V itself. But the iterative sets alone are, we know, not the end of the story. The full universe V is properly more vast even than the closure level V_On; this is clear since on the naive view the axiom of foundation is false, since one can prove the existence of e.g. a set a such that a = {a}.

So the wellfounded idea that V = V_On certainly fails; a fortiori, if this identity were provable, it would probably be explosive, since V ∉ V is. On the other hand, there are a great many ordinals, and it is hard to see how

V_On = {x_α : α ∈ On}

could have left anything out. Indeed,

Proposition 6.1. |V| = On.

Proof. Like everything else, On ⊆ V; so On injects into V, showing |On| ≤ |V|; and |V| ∈ On, so |V| ⊆ On, and so |V| ≤ On. Thus they are identical, by proposition 5.65.

There is then some temptation to elide between the overlarge sets V_On and On. But a great deal of care is needed here, since V is fragile. Reflection as we described it in chapter 2, where a set M reflects Φ if some proper subset of M is Φ, is too crude. Suppose V reflects in this way; then if V = V, there is some proper subset M = V. But from V = M ⊂ V, it follows by the definition of ⊂ that some x is not in V, whence triviality. A very precise formulation of reflection is needed to retain cogency while also providing the information we seek about extent. In the section of the chapter on Kunen’s inconsistency theorem, we will be in position to ask whether he has identified the end of the classical universe, or the end simpliciter. Whatever the details turn out to be, our derived ‘axioms’ of infinity ought to answer the question on how far the ordinals go on in the most obvious way: To the end.

2. Reflection Theorems

An explicit reflection principle is Russell and Whitehead’s axiom of reducibility in their Principia. This was a device introduced to augment expressive power, compensating for the heavy losses incurred from the vicious circle principle and theory of types. The axiom asserts that any property occurring at a higher level of the type hierarchy already occurs at the first level [RW13](55). In the end, it was reducibility that most overtly exposed the Principia’s failing of logicism, because such an axiom is not self-evident, as Russell knew.

Reflection principles have become mainstays of modern set theory and are used as heuristic justifications for cardinal postulates beyond ZFC: Reflection is the means by which the truncated and intuitively shorn classical theory hopes to reconnect with more expansive truths. We have already met the intuitive form of reflection in chapter two, where I suggested its connection to Cantor’s absolute. Kanamori claims that, as reflection principles manifest in large cardinals, they are the “rightful heirs” of early set theory, the “trustees of an older tradition.” Still, Russell’s trouble remains the trouble: Reflection is a subterfuge, a nostalgia for the knowledge abandoned after the trauma of the antinomies; and indeed, as we will see, all agree that a reflection principle in its most basic form is multiply inconsistent.

The classical reflection theorem proved by Levy in 1960 deals in the model-theoretic satisfaction relation ⊨, and reads

(∀α)(∃β > α)[V ⊨ Φ ↔ V_β ⊨ Φ].

To see how Levy’s principle works, here is a proof sketch. Beginning with the lowest segment V_α that models Φ, define a function f on α and use the axioms of replacement and infinity to prove the existence of a higher set V_{f(α)} that models Φ. Upward reflection is classically equivalent to the axioms of replacement and infinity, both of which are theorems here. For full detail see [Dra74]. The theorem bears some resemblance to the δ-ε arguments in calculus, used to eliminate talk of infinitesimals; upward reflection can go ‘arbitrarily’ high. As Weierstrass developed that technique to sterilize the conceptual foundations of his subject, so too here the idea is to approximate V without straying into paradoxes. Tait has a good discussion of second-order reflection in [DO98]. Reflection is a device to speak about totalities indirectly. Naive set theory can stare into these totalities directly; and so we can use reflection as a powerful device to produce not hypotheses but theorems.

The next appearance of reflection will provide us with the most useful formulation, a dialethic reflection theorem suggested by a proof sketch in [Dra74] of Gödel’s second incompleteness theorem. This establishes that, were ZF able to prove the existence of its own model (equivalently, by Gödel’s completeness theorem, to prove itself consistent), then a contradiction follows. Classically, everything follows. If ZF proves that (∃M)M ⊨ ZF, then ZF proves that (∃M′)[M′ < M ∧ M′ ⊨ ZF]. Suppose ZF proves that (∃M)M ⊨ ZF. Then M satisfies this fact, so ∃M′ ∈ M such that ZF proves M′ is a model of ZF. These wellfounded models are indexed by the ordinals, though, and given any stretch of ordinals, there is always a least. So let M be the least provable model of ZF. Then there would be a lesser, some M′ ∈ M that witnesses ZF proving the existence of its own models, which contradicts the minimality of M.

Classical reasoning demands that the premise of the argument be rejected, that ZF cannot prove the existence of its own models. Since the contradiction arrived at is not necessarily explosive, though (in fact, it is quite likely to be a property of naive model theory), Gödel’s modus tollens may be our modus ponens. Gödel’s theorem shows that closure leads to infinite descent. Since naive comprehension principles lead to closure, here we investigate the possibility that the subsequent descent is rich and illuminating. The limitation imposed by Gödel’s second theorem reemerges as a powerful mechanism.

For now we work just with the ordinals, leaving V_On for later. Let |On| = Ω. Since On = |On|, this is just another name for the set of all ordinals, used to mark only a difference in emphasis. We use the conditional Φ ↦ Ψ, defined Φ ∧ t → Ψ, to state

Theorem 6.2 (Reflection). (∀X)[Ω ∈ X ↦ (∃Θ)(Θ < Ω ∧ Θ ∈ X)].

The notion of reflection all along has been meant to capture absoluteness; all the hard work of founding the ordinals in a paraconsistent setting done, our mathematical characterization is now, appropriately, delivering reflection almost for free.

Proof. Since Ω < Ω (lemma 5.66), then Ω ∈ X ∧ t → Ω < Ω ∧ Ω ∈ X. And Ω < Ω ∧ Ω ∈ X → (∃Θ)(Θ < Ω ∧ Θ ∈ X). Transitivity of the conditional completes the proof. 

In the absence of a worked out model theory, we are taking advantage of the tie up between ∈ and predication to state the theorem; X is playing a second-order role. Since naive comprehension makes membership a synonym for bearing a property, the reflection theorem for Ω can be abbreviated without loss: If X(Ω), then, since Ω < Ω, there is some ordinal Θ < Ω such that X(Θ). (A capital letter suggests that Θ is within conceptual striking range of Ω.) Downward reflecting could just as easily be directed upward; again because Ω < Ω, there is an Ω > Ω such that if X(Ω) then X(Θ), too. Downward reflection, the idea that any property enjoyed by Ω is enjoyed by some lesser set, is an old idea that received its clearest articulation from Reinhardt [Rei74], who calls it the naive reflection principle, because in this simple form simple contradictions can be derived. (Take any property satisfiable by nothing but Ω.) Since inconsistency occurs naturally in Ω, though, this is less than reason to turn from the basic reflection principle. Rather, this is incentive.

Reflection has attracted much attention in the last decades, for both its intuitive justification and expansive proving power. Classically, large cardinals mark blind spots: If V_α ⊨ ZF then α is inaccessible in the sense of Gödel’s incompleteness theorem [Lev60]. Symmetry would suggest, and our theorems will confirm, that these blind spots, these gaps, are really gluts under naive set theory. The time has now come to build some high-impact machines, and use them to find some overlarge sets.

3. Ultrafilters

Ultrafilters, ultraproducts, and ultrapowers are basic tools of higher set theory, and model theory, too; [BS71] is an elegant introduction. For the first part of this section, we will give an appropriate definition of these objects, and make some adapted applications of Zorn’s lemma to prove their fundamental properties.

The theory of ultrafilters was founded at the height of classical logic. Fundamental theorems are proven by use of the disjunctive syllogism. More, the definition of filter as usually given (with a material conditional) fails relevance considerations, as the following example shows. A filter F is closed under intersection: x ∩ y ∈ F for x, y ∈ F. How is this to be formalized? While x ∩ y ∈ V for any x, y, the entailment x, y ∈ V → x ∩ y ∈ V is irrelevant. But V − {∅} is a very natural candidate for being a filter. To work around these limitations, as with the ordinal numbers, much hinges on our choice of definitions; using the t connective avoids much of the difficulty. Conversely, naive set theory can handle important portions of these concepts very naturally, because ultrafilters in general and ultraproducts in particular pick out large sets.

Definition 6.3. Let M be a set, P(M) its subsets. F ⊆ P(M) is a filter on M, Flt(F, M), iff

∅ ∉ F,

x ∈ F ∧ y ∈ F ↦ x ∩ y ∈ F,

x ∈ F ∧ x ⊆ y ⊆ M ↦ y ∈ F.

F is maximal in M just in case: F ⊆ G ∧ Flt(G, M) ↦ F = G. F is an ultrafilter on M iff F is maximal. F is a principal filter on M iff F = {x ⊆ M : y ⊆ x} for some non-empty y ∈ M. A chain is a set totally connected by ⊆.
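On a finite carrier the three filter clauses can be tested mechanically. The sketch below is a classical reading of Definition 6.3, with the conditional ↦ treated as ordinary implication, which is an assumption and not the DLQ semantics. The names subsets, is_filter and principal_filter are illustrative, and the generator of the principal filter is taken to be a subset of M.

```python
from itertools import chain, combinations

def subsets(m):
    """Return every subset of m as a frozenset."""
    items = list(m)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def is_filter(f, m):
    """Check the clauses of Definition 6.3, read classically, for a family f of subsets of m."""
    m = frozenset(m)
    return (all(x <= m for x in f)                                    # f is a family of subsets of m
            and frozenset() not in f                                  # the empty set is excluded
            and all((x & y) in f for x in f for y in f)               # closed under intersection
            and all(y in f for x in f for y in subsets(m) if x <= y)) # closed upward within m

def principal_filter(y, m):
    """The family {x ⊆ m : y ⊆ x} generated by a non-empty y."""
    y = frozenset(y)
    return {x for x in subsets(m) if y <= x}

M = {0, 1, 2}
F = principal_filter({0}, M)
not_a_filter = {frozenset({0}), frozenset({1})}   # the intersection of its members is empty

print(is_filter(F, M), is_filter(not_a_filter, M))  # True False
```

The second family fails because its two members intersect in ∅, which the first clause excludes.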

Proposition 6.4. Let C be a non-empty chain of filters over M. Then ⋃C is a filter over M.

Proof. Let

⋃C = {x : (∃F)(Flt(F, M) ∧ F ∈ C ∧ x ∈ F)}.

There are three conditions to satisfy. One: If ∅ ∈ ⋃C then (∃F)(∅ ∈ F ∧ F ∈ C). But each such F ∈ C is a filter, whereby ∅ ∉ F, so ∅ ∉ ⋃C by contraposition. Two: If x, y ∈ ⋃C then there are filters F, G ∈ C with x ∈ F and y ∈ G. Because C is a chain, either F ⊆ G or G ⊆ F; therefore either x, y ∈ F or x, y ∈ G. Therefore either x ∩ y ∈ F or x ∩ y ∈ G, so it follows that x ∩ y ∈ ⋃C. And three: If x ∈ F, and x ⊆ z ⊆ ⋃C, then z ∈ F because F is a filter and therefore z ∈ ⋃C.

Theorem 6.5 (Tarski). Every filter is contained in an ultrafilter.

Proof. Let F be a filter in M. Let

ℱ = {G : Flt(G, M) ∧ F ⊆ G}.

Note that F ∈ ℱ, so ℱ is not empty. If C is a non-empty chain in ℱ then ⋃C is a filter by the previous proposition. Also, F ⊆ ⋃C because F ∈ C. Therefore ⋃C ∈ ℱ. And since G ∈ C implies G ⊆ ⋃C, ⋃C is an upper bound of C in ℱ. By Zorn’s lemma there is a maximal element U ∈ ℱ. Because U is maximal, U is an ultrafilter containing F.

This is all the development needed for now. There remain mostly unanswered questions about ultrafilters U in naive set theory. For example, whether they are exhaustive: For any x ⊆ M, does it hold that either x ∈ U ∨ M − x ∈ U? Classically this is equivalent to U being maximal. Delicacies of definitions and the behavior of → however, make the matter non-trivial here. Nevertheless, we can proceed without deciding.
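In the finite classical case both Theorem 6.5 and the exhaustiveness question can be settled by brute force, which gives a point of comparison for the open question above. The following Python sketch enumerates every filter on a three-element set; it assumes classical logic and, as an added assumption not stated in Definition 6.3, counts only non-empty families as filters. All names are hypothetical helpers.

```python
from itertools import chain, combinations

def subsets(items):
    """Return every subset of items as a frozenset."""
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

M = frozenset({0, 1, 2})
POWER = subsets(M)  # the 8 subsets of M

def is_filter(f):
    """Classical reading of the filter clauses, for a non-empty family f of subsets of M."""
    return (len(f) > 0
            and frozenset() not in f
            and all((x & y) in f for x in f for y in f)
            and all(y in f for x in f for y in POWER if x <= y))

# Enumerate all 2**8 families of subsets and keep the filters.
filters = [f for f in subsets(POWER) if is_filter(f)]

# An ultrafilter is a filter properly contained in no other filter.
ultrafilters = [u for u in filters if not any(u < g for g in filters)]

every_filter_extends = all(any(f <= u for u in ultrafilters) for f in filters)
exhaustive = all(x in u or (M - x) in u for u in ultrafilters for x in POWER)

print(len(filters), len(ultrafilters), every_filter_extends, exhaustive)  # 7 3 True True
```

Here the maximal filters are exactly the three principal filters generated by singletons, and each is exhaustive; whether the analogous equivalence survives in DLQ is the question the text leaves undecided.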

4. Axioms of Infinity

“Mathematicians and other children often play the following game: We take turns naming numbers, and see who can name the largest one.”[Bar77](396) So Kunen introduces the climb we are about to undertake. We will have an advantage over the classical strategy in this game, though, since we are starting from the top and descending, using our downward reflection theorem to establish features of the higher infinite. Generally, we study the properties of Ω and ascribe some of these properties to lower sets, proving the existence of various large cardinals. Comparing against the classical postulation method,

“We look for justification for these axioms from the point of view of the cumulative type structure, where we want to say that the collection of levels, which is indexed by the ordinals, is a very rich structure with no conceivable end. ... It always seems a plausible step, in view of the reflection principle, to take a property of the whole universe and postulate that it already holds at some level V_α.”[Dra74](124),

working naive sets directly extends such “justification” so that the postulates become theorems. The arguments are streamlined reflection proofs, achieved just by existential generalization; the proofs end by appealing to reflection, theorem 6.2, as a heuristic. Since higher varieties of cardinals are of the lower varieties, too, e.g. if κ is inaccessible and κ < λ then λ is inaccessible, the whole of this part can be summed up: On is the largest cardinal, and exists; so every type of large cardinal exists.

The ordering of the higher infinite, at least at its initial segments, corresponds to the historical sequence in which it was discovered. First we consider some ideas of Hausdorff from 1906-8, and then of Mahlo from 1911-1913. For excellent introductory exposition of much of this material, see [Dra74]. More advanced is [Kan94].

Definition 6.6. Let α, β be ordinals and α < β. If α is unbounded in β, in the sense that (∀x)[x ∈ β ↦ (∃y)(y ∈ α ∧ x ≤ y)], then α is cofinal with β, cf(α, β). The cofinality of β, cf(β), is the least such cofinal α, cf(β) = α ↔ cf(α, β) ∧ ¬(∃y)(cf(y, β) ∧ y < α). A cardinal β is singular when cf(β) < β. When cf(β) = β, β is a regular cardinal. When β < cf(β), β is defined as vanishing.

The finite numbers, for example, are all singular cardinals. If there is more than one least α cofinal with β, use choice to obtain a unique cofinality. So we are assured that cf is a function. Cofinality is a fixed point,

cf(β) = cf(cf(β)),

since α is the least cofinal set iff α is the least least cofinal set. The cofinality of β is always dominated by β, cf(β) ≤ β. All this is classical. A point of novelty, however, is to lift regular cardinals from definition to theorem without any additional axioms.

Proposition 6.7. There are regular cardinals.

Proof. Let Θ = cf(Ω). Then for every ordinal x, some y ∈ Θ is such that x ≤ y. Since Ω is an ordinal, (∃y)(y ∈ Θ ∧ Ω ≤ y). Since y is an ordinal, y ≤ Ω. So y = Ω; so Ω ∈ Θ by substitution. But the only ordinal with Ω as a member is Ω itself, because ordinals α are transitive:

(α ∈ Ω → α ⊆ Ω) ∧ (Ω ∈ α → Ω ⊆ α) → (α ∈ Ω ∧ Ω ∈ α → α = Ω).

Therefore Θ = Ω. Ergo Ω is regular. By reflection, there is some κ below Ω such that κ = cf(κ) as desired. 

Novelly, we have the additional and decidedly non-classical case:

Proposition 6.8. There are vanishing cardinals.

Proof. Since Ω = cf(Ω), if Ω < Ω, which it is, then by substitution of identicals Ω < cf(Ω). Reflecting down, some κ < cf(κ), as required.

Appropriately enough, though, these cardinals extinguish themselves:

Proposition 6.9. There are no vanishing cardinals.

Proof. By contraposing the definition of wellorder (def. 5.25), if a set of ordinals has no least member, then it has no members at all. If there were any vanishing cardinals, there would be a least; call it ε. So ε < cf(ε), and for any δ < ε, δ ≮ cf(δ) by the minimality of ε. Except since

ε < cf(ε) ≤ ε

it follows that ε < ε. So ε is not the least vanishing cardinal. Then there is no least, and so none at all.

Regular cardinals > ω are inaccessible because such a cardinal would be sufficiently large to model ZFC. If we can prove such cardinals exist, then it is very likely that naive set theory can model ZFC. “In general,” Hausdorff observes, “whether there are regular initial numbers with limit indices is very problematic; in any case, the smallest among them is already of such an exorbitant magnitude that all sets considered up till now and all sets still to be taken into consideration are probably exceeded.”[Hau05](179)

Definition 6.10. Call λ inaccessible iff κ < λ ↦ 2^κ < λ. κ is 0-hyperinaccessible iff κ is inaccessible; κ is (α + 1)-hyperinaccessible iff there are κ α-hyperinaccessible cardinals below κ.

Proposition 6.11. There exist inaccessible cardinals.

Proof. All ordinals precede Ω; in particular for any κ < Ω, also 2^κ < Ω. So

2^κ < Ω ⊢ κ < Ω ↦ 2^κ < Ω.

Modus ponens and reflection on Ω give the result. 

In the spirit of alpinism the climb can be continued, by establishing the existence of, say, an Ω-inaccessible. This is straightforward. Rather than focus on these, we go on to show the existence of the next qualitatively bigger powers, using the idea of a normal function.

Definition 6.12. A function f : On ⟶ On is normal iff f is monotonic

x < y ↦ f(x) < f(y),

and continuous

f(κ) = sup{f(x) : x < κ}, for κ a limit.

κ is a Mahlo cardinal iff every normal function on κ has an inaccessible fixed point.

Proposition 6.13. The identity f(x) = x over the ordinals is a normal function.

It is expected that there are other normal functions. Cantor [Can15] observed that there is an isomorphism (defined in section 5 below) between the set of all infinite cardinals and the set of all ordinals; see [Lev79](90). This is a monotonic function by the definition of isomorphism. The continuity condition is met by the structure of the ordinals. The proof at [Lev79](117) gives the idea, though it needs some modification to go through in DLQ.

Proposition 6.14. There are Mahlo cardinals.

Proof. By a reflection argument. Let f be any normal function on Ω. By the continuity of f,

f(Ω) = sup_{α<Ω} f(α).

Note that f(Ω) ∈ On since normal functions are on On. Because Ω ∈ Ω, therefore f(Ω) ∈ f(Ω) by monotonicity and modus ponens; but being an ordinal, f(Ω) ∉ f(Ω), showing f(Ω) ≠ f(Ω). Therefore f(Ω) is absolutely large (lemma 5.66), |f(Ω)| = Ω. Since we have already shown that Ω is inaccessible, we have the needed fixed point.

The next cardinals to consider arise out of the Lebesgue-measure problem.

Definition 6.15. An ultrafilter U over M is α-complete iff

(∀ξ)(X_ξ ∈ U ∧ ξ < α ↦ ⋂{X_ξ : ξ < α} ∈ U).

A cardinal κ is measurable iff κ > ℵ_0 and there is a non-principal κ-complete ultrafilter over κ.

Proposition 6.16. Ω is a measurable cardinal.

Proof. Let U_Ω be the set of all subsets of the ordinals that have Ω as a member,

U_Ω = {X : X ⊆ Ω ∧ Ω ∈ X}.

Intuitively, the members of U_Ω are exactly all the ordinal properties borne by Ω. So this is a direct reflection argument: that U_Ω is the requisite ultrafilter. The most important facts to observe are that Ω ∈ U_Ω, since Ω ⊆ Ω ∧ Ω ∈ Ω, and similarly that {x : x ∈ Ω ∧ x = Ω} ∈ U_Ω.

First it is to be seen that U_Ω is a filter. ∅ ∉ U_Ω because Ω ∉ ∅. If X, Y ∈ U_Ω, then X ∩ Y ⊆ Ω ∧ Ω ∈ X ∩ Y, so X ∩ Y ∈ U_Ω. If X ∈ U_Ω and X ⊆ Y ⊆ Ω, then Ω ∈ Y and therefore Y ∈ U_Ω. Thus, a filter.

To show it is an ultrafilter, suppose U_Ω ⊆ G and G is a filter over Ω. Then every member of G is a subset of the ordinals. And since {Ω} ∈ U_Ω ⊆ G on assumption, X ∩ {x ∈ Ω : x = Ω} ∈ G for any X ∈ G because G is a filter; so Ω ∈ X, and therefore X ∈ U_Ω, showing G = U_Ω as required.

For Ω-completeness, consider the wellordering of our ultrafilter, {X_ξ : ξ ∈ On}; the intersection

{Y : (∀X)(X_ξ ∈ U_Ω ∧ ξ ∈ On → Y ∈ X_ξ)}

has Ω as a member and is a subset of the ordinals, and is therefore in U_Ω.

Finally, we need that U_Ω is non-principal, which follows if ¬(∃Y)(Y ∉ Ω ∧ U_Ω = {X : X ⊆ Ω ∧ Y ⊆ X}). Well, suppose

Y ∈ Ω ∧ (∀X)(X ∈ U_Ω ↔ X ⊆ Ω ∧ Y ∈ X).

Since Ω ≠ Ω, also Ω ⊈ Ω; so Ω ∈ U_Ω ∧ Ω ∉ U_Ω, proving U_Ω ≠ U_Ω, regardless of Y. This completes the proof.

If the foregoing seems a bit strange, one should remember that this is not an average ultrafilter, but one generated by a notoriously inconsistent object. “The subject of paradoxical assertions is one full of surprises. However that it should be so is not particularly surprising.”¹

Measurable cardinals are due in part to Ulam, to whom the hydrogen bomb is also due in part. The existence of measurable cardinals is very suggestive; these are the most important of the large cardinals and resolution of their details would settle outstanding questions. For example, Vitali’s use of the axiom of choice to produce a non-Lebesgue-measurable set of reals has been a point of vexation. Another open avenue is Scott’s demonstration that there are measurable cardinals exactly where there are elementary embeddings of initial segments of the well-founded universe, which in turn impacts Gödel’s hypothesis that the universe is constructible.

¹ “After all ... we all normally assume that we are not reasoning about a paradoxical situation: when we meet a contradiction we take it as a sign that something has gone wrong and refuse to go further. ... It is precisely when we do go further that our familiar world disappears and we find ourselves in strange new surroundings. The new terrain clearly needs to be explored.”[Pri79](240)

5. The End of the Ordinals

The last of the known cardinals are extendible cardinals, including a virulent strain due to Reinhardt called (if inconsistent objects are called anything, classically) Reinhardt cardinals. Like most other large cardinals, Reinhardt’s arise by looking for closure properties; but his ideas go substantially beyond, unabashedly taking On as a given totality capable of supersession [Jec74]. That the full articulation of this idea ends in inconsistency, then, should not be a surprise: Reinhardt is working with the full set concept, looking for the ultimate sky of Cantor’s universe.[Kan94](311)

This section does not prove any theorems of naive set theory. Instead I describe some mainstream results, involving the ⊨ relation, with the aim of suggesting that ineluctable invariance that independent mathematics tends to show. The same activity at the end of the ordinal line that we recorded last chapter has been independently observed by classical set theorists working at the extremity of their subject. For this and the next section, we study not On but the wellfounded universe,

$V_\Omega = \bigcup_{\alpha \in On} V_\alpha.$

To discuss all this we need some notions from model theory. Here we introduce the idea, following closely [BS71](73 - 75). Let 𝒜 = ⟨A, r⟩, ℬ = ⟨B, s⟩ be two structures, with A, B sets and r, s relations on the respective sets. A mapping j : A → B is a homomorphism iff, for a0, ..., an ∈ A,

⟨a0, ..., an⟩ ∈ r ↔ ⟨j(a0), ..., j(an)⟩ ∈ s.

If j maps A onto B, then j is an isomorphism. When there is such a j, the structures are isomorphic, written 𝒜 ≅ ℬ. 𝒜 is a substructure of ℬ, written 𝒜 ⊆ ℬ, iff A ⊆ B and r = s ∩ A, the restriction of s to A. An equivalent definition could have been: 𝒜 ⊆ ℬ iff A ⊆ B and the injection i : A → B, i(a) = a, is a homomorphism. Two structures are elementarily equivalent, 𝒜 ≡ ℬ, iff for all sentences Φ,

𝒜 ⊨ Φ ↔ ℬ ⊨ Φ.

𝒜 is an elementary submodel of ℬ, the latter called an elementary extension of the former, iff 𝒜 ⊆ ℬ and 𝒜 ⊨ Φ ↔ ℬ ⊨ Φ for any formula Φ. An elementary embedding is an isomorphism j from A to B such that for any formula Φ(x0, ..., xn),

𝒜 ⊨ Φ(a0, ..., an) ↔ ℬ ⊨ Φ(j(a0), ..., j(an)),

with a0, ..., an ∈ A. As some notation, j : X ≺ Y means that j is an elementary embedding from X to Y, and if there is such a j, that X is elementarily embeddable into Y. An uncountable cardinal κ is extendable iff there is a β and a non-trivial (i.e. not the identity) elementary embedding j : Vκ+α → Vβ for α < κ, with κ the least ordinal moved by j. Reinhardt’s conjecture, stated at the end of his dissertation and described at the symposium on axiomatic set theory in 1967 [Jec74], is that there is an elementary embedding:

(∃j)j : VΩ ≺ VΩ.

(Because Reinhardt reckons V = VOn, he omits the subscript.) There would be no further closure property on wellfoundedness than this. The conjecture states, as far as I can see, that Ω itself is extendable.

“At times in the past,” Kunen writes in reference to his own 1971 proof, “plausible large cardinal assumptions have turned out to be inconsistent with ZFC. . . . But maybe ZFC is inconsistent.”[Bar77](399) Kunen proved Reinhardt’s conjecture is unstable. Kunen’s theorem, which will be spelled out in more detail in a moment, says that no j can embed the wellfounded universe into itself: If j : VΩ ≺ VΘ, then there is an x such that if x ∈ VΘ then a contradiction follows; therefore Kunen reasons that x ∉ VΘ, and consequently that VΘ is not universal. This was interpreted to mean Θ ≠ Ω.

Realizing Kunen’s proof in naive set theory is beyond current capabilities. An outline of the classical proof serves more philosophical purposes: Kunen’s result will be that, if there exists a sufficiently large cardinal, then there is an extension j which both does and does not extend. Structurally, this is a much more sophisticated explanation of what happens in the Burali-Forti construction, and, interpreted from naive set theory, an anticipation from mainstream mathematics of some of the more unusual behaviors of Ω studied here.

Recall that $f''(x) = \{f(y) : y \in x\}$ is the image of x under f. Let $[x]^\alpha$ be the set of all y ⊆ x such that y ≅ α. The following lemma is due to Erdős and Hajnal.

For any ordinal ξ, there is a function $f : [\xi]^\omega \to \xi$ where for all y ⊆ ξ, if |y| = |ξ| then $f''[y]^\omega = \xi$.

Any f meeting these requirements is called ω-Jónsson. Here is a sketch of the proof: For $x, y \in [\xi]^\omega$, we find an equivalence relation ≡ such that

$x \equiv y \leftrightarrow x - \alpha = y - \alpha$

for some $\alpha < \bigcup x$. By choice, we get a distinguished member $z_\equiv$ from each equivalence class E(x) = {y : y ≡ x}. For $x \in [\xi]^\omega$ and $z_\equiv \in E(x)$, take the minimal β such that $x - (\beta + 1) = x_\equiv - (\beta + 1)$; the desired g is g(x) = β. Now show, by reductio, that there is a $\kappa \in [\xi]^\xi$ where for any $\lambda \in [\kappa]^\xi$, $\kappa \subseteq g''[\lambda]^\xi$.

So the lemma makes use of the axiom of choice, in order to fix a function g from which the requisite function f may be derived. To prove Kunen’s theorem only a special case of the above scenario is needed, namely when ξ is a strong limit cardinal cofinal with ω. Kunen’s answer to Reinhardt’s question, which appeared in a short paper [Kun71], launches from the Erdős-Hajnal result.

If j : VΩ ≺ VΘ is any elementary embedding except identity, then Ω ≠ Θ.

A few different proofs have been given, each lending different insights [Kan94](320). An outline of Kunen’s original argument will show the kind of structural affinity between classical (?) higher set theory and naive theory I’ve suggested.

Suppose j : VΩ ≺ VΘ. Let

$j^0(x) = x$

$j^{n+1}(x) = j(j^n(x))$

for n finite. By induction, j is increasing, x < j(x). Further, let x0 be the first point moved by j; similarly, letting

$\bigcup \{j^n(x_0) : n \in \omega\} = \lambda,$

then λ is the least fixed point for j above x0. Then it is to be shown that $j''(\lambda) \notin V_\Theta$, meaning that Θ is below Ω. Let f be ω-Jónsson with domain the set of functions from ω to λ and range λ. Since $j''(\lambda) = \{j(\xi) : \xi < \lambda\}$, if $j''(\lambda) \in V_\Theta$ then some g is a function from ω to $j''(\lambda) \cap V_\Theta$, with (j(f))(g) = x0. A subtle combinatorial argument [Kun71](408) then establishes that, given such a g, x0 is in the range of j. Except that j is increasing, so x0 is also not in the range of j, a contradiction completing Kunen’s proof. In a rather dramatic turn of phrase, and with a sensitivity to the role of choice in the proof, Kanamori remarks:

Kunen’s result can best be viewed as an ultimate limitation imposed by the Axiom of Choice on the extent of reflection possible in the universe. ZFC rallies at last to force a veritable Götterdämmerung for large cardinals! (324)

What is shown is that if there is an increasing embedding from VΩ then it is also not increasing. Embeddings become inert when large enough ordinals are involved.
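The iteration used in the outline of Kunen's argument can be unpacked in a short classical display. This is only a sketch under the usual ZFC assumptions, not a result of the naive theory; the names $\kappa_n$ for the iterates $j^n(x_0)$ are introduced here for illustration and do not appear in the text.

```latex
% Sketch only: \kappa_n is a hypothetical name for j^n(x_0).
\begin{align*}
\kappa_0 &= x_0, \qquad \kappa_{n+1} = j(\kappa_n) = j^{n+1}(x_0), \\
\kappa_0 &< \kappa_1 < \kappa_2 < \cdots
  && \text{(since $x_0 < j(x_0)$ and $j$ preserves $<$)} \\
\lambda &= \bigcup \{\kappa_n : n \in \omega\}, \\
j''(\lambda) &= \{ j(\xi) : \xi < \lambda \}
  \supseteq \{\kappa_{n+1} : n \in \omega\}, \\
\xi < \kappa_n &\;\Rightarrow\; j(\xi) < \kappa_{n+1} < \lambda
  && \text{(so $j''(\lambda) \subseteq \lambda$)}
\end{align*}
```

On this reading every element of $j''(\lambda)$ lies below λ, and x0 itself is not of the form j(ξ): ordinals below x0 are fixed, and ordinals at or above x0 are sent strictly above it. That is the "increasing" half of the contradiction described in the outline.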

At VΩ, there is simply nowhere left to go—unsurprisingly, since this fragment of V is so large. As our dialethic investigations have been showing, though, Ω is indeed not the end of the ordinals. At the very least, it is a successor, Ω = Ω + 1, where Ω < Ω. And in the next section indications that there are limit ordinals above Ω will be given, too. So Kunen’s observation that a non-trivial embedding must be both increasing and not, without further reductio, seems quite correct from the perspective of naive set theory. Kunen’s is a sophisticated classical confirmation of the more bluntly stated naive propositions. On the other hand, the universe V really is comprehensive; nothing can fall outside its province on pain of triviality. In light of there being a wellordering on V , and the fact that Ω is its ordertype, one might think that Kunen’s result is stronger, that there is no (non-trivial) embedding j : V ≺ V , again on pain of triviality. (Or worse, there is such a j and naive set theory is trivial.) Again,

V ≠ VΩ. But Kunen’s result does give pause. What more can be said on this is an open question.

6. Reflecting the Universe

Cantor’s theory brought new meaning to the infinite, and his followers have explored the terrain. Naive theory offers a most precise expression of endlessness. Downward reflection is inconsistent because it implies a bottomless chain of ordinals, as predicted by Gödel’s second theorem. Now we go further by considering self-reference. If everything about Ω is reflected, then this very property should itself be reflected at a lower level. Reflection reflects.

Aside from its intrinsic interest, the construction we give follows up on Reinhardt’s extendable cardinals, which demarcate the end of the consistent segment of the ordinals. These come about through the upward extension of elementary embeddings; is there a clearer picture of where these embeddings continue after consistent theory is exhausted?

First, a reflection theorem for VΩ itself.

Proposition 6.17. When VΩ ∈ X, also (∃Θ)(Θ < Ω ∧ VΘ ∈ X).

Proof. Again because Ω < Ω.

Let R be the set of all sets that reflect,

R = {M : (∀X)[M ∈ X ↦ (∃M′)(M′ < M ∧ M′ ∈ X)]}.

More generally, now, we can see that any structure satisfying reflection must itself reflect.

Proposition 6.18. M ∈ R ⊢ (∃M′)(M′ < M ∧ M′ ∈ R).

Proof. Since Ω reflects, R is not empty; let M ∈ R. Then [M ∈ R ↦ (∃M′)(M′ < M ∧ M′ ∈ R)]. (Note that this is not contraction, since M really is in R.)

Repetition of this argument is iterated reflection.

Corollary 6.19. There is a countable chain of sets that satisfy reflection; if

M ∈ R then countably many other M0, M1, ..., Mω are in R, too. Note that full comprehension provides a sharp modeling of these facts with a set $\hat{R} = \{M : (\exists M')(M' < M \wedge M' \in \hat{R})\}$. A question arises as to the size of R. While it is easy to repeat the basic reflection iteration up to ω, further checking is needed to confirm that the procession can continue.

Proposition 6.20. |R| = Ω.

Proof. Ω ∈ R. But Ω = Ω ∧ ¬(∃Θ)(Θ < Ω ∧ Θ = Ω). So by counterexample, Ω ∉ R. Ergo R ≠ R and is absolutely large, as required.

Because there is an unlimited pool of reflecting ordinals,

Corollary 6.21. Iterated reflection can be continued up to any ordinal. Letting

RΩ = {Θ ∈ On : (∀X)[X(Ω) → X(Θ)]},

then |RΩ| = Ω.

Structure emerges. Let σ be a choice function. Define by induction

$M_0 = \Omega$

$M_\alpha = \sigma(R - \{M_\beta : \beta < \alpha\})$

$M_\lambda = \bigcup_{\kappa < \lambda} M_\kappa$

with λ a limit; and to complete,

$M^\Omega = \bigcup_\alpha M_\alpha.$

Since the principal mechanism involved is Ω < Ω, we now have a direct point of comparison to Reinhardt’s initiative. Exactly as before, since Ω < Ω, we can see that Ω is extendible, (∃Θ)Ω < Θ. Here there can be embeddings j : VΩ ≺ VΩ, attendant extendable cardinals, and almost certainly many, many other types of cardinal for investigation.

Having trod this close to an edge, it seems appropriate to recall Nietzsche’s enthusiasm in “The Wanderer” in Zarathustra [Nie76]:

Thus must you mount even above yourself, up, upwards, until you have even your stars under you! Yea! To look down upon myself, and even upon my stars: That only would I call my summit. That hath remained for me as my last summit!

7. Conclusion

The purpose of this chapter has been to begin applying the technical apparatus of naive set theory to areas for which it is particularly fit, and in doing so to corroborate the philosophical theses put forward in part one. This continues next chapter in more generality, towards which we make an outgoing remark.

There is a non-empty subset of the ordinals RΩ such that

(∀y)[y ∈ RΩ ↦ (∃z)(z ∈ y ∧ z ∈ RΩ)].

There are sets of ordinals with no least member. This turns out to be an extremely fecund fact. This is also contradicted, via the counterexample axiom, by a theorem of chapter 5: For any u ⊆ On such that (∃x)(x ∈ u),

(∃y)(y ∈ u ∧ ¬(∃z)(z ∈ y ∧ z ∈ u)).

So the proposition that the ordinals are wellordered is a dialethia; this has been proved several times over, since confirming Burali-Forti’s paradox last chapter. Here I want to conjecture, in accord with the limit on size doctrine (but now for clearer reasons than were available in part one), that such subsets occur only at the top of the structure, at On. And if that is right, then various existence theorems in this chapter, as well as the cumulative M hierarchy defined above, the large cardinals and perhaps even all the transfinite cardinals whatever, are all over Ω and nothing else. The rich reflection phenomenon is just this, and no less rich for it: Ω reflects itself. To understand this and the overall meaning of these mathematics is the task of the next and final chapter.

CHAPTER 7

On the Transconsistent

At the end, some outstanding philosophical debts need repaying through a cohesive interpretation of the preceding work. The mathematical material is new enough that I do not pretend to have penetrated beyond some observations. There are, with respect to the set theory here, a few clear points of possible concern or confusion as to the meaning and integrity of the mathematics. These are to be met by returning to the philosophical issues of part one and producing a unified account, which is broadly as follows.

The paradoxes of naive theory are all due to a single, powerful truth: An inconsistent fixed point stands where none can be, at the meeting of extension and intension, at absolute infinity. The fixed point arises quantitatively when we consider ordinals and cardinals. The fixed point also arises analytically when we consider the comprehensive extensions of predicates. For both Frege and Cantor, the paradox occurs at the junction of the universe and its powerset. Gödel and Tarski identified the fixed point at the limits of proof and truth, respectively, and more recently, Priest generalizes to show that the fixed point arises conceptually when we consider totalities. The fixed point arises even when we try to deny, as we must, that there is any such vertex here: That is exactly to affirm the paradox itself.

Using the mathematics of the last chapters, this view of the inconsistent infinite will be given a precise characterization. The same technique is then used to broach logical questions asked in chapter 0, questions about how to isolate true contradictions, as opposed to simply false ones. I show that these are intimately related questions, and use some simple paraconsistent logic to provide an answer: Cardinal maximality is necessary for both. We will see, though, that this is a new kind of paradox analysis; and so whilst the cardinal answer is a good start, it carries with it paradoxes of its own.
Because I am forging a precise connection between higher set theory and logic, I will resurrect a term from the end of Priest’s [Pri06b] for the non-trivially inconsistent: the transconsistent. An objective of the following is to precisely demarcate the transfinite from the overlarge or absolutely infinite, and to show that this also helps to demarcate the transconsistent from the absurd or absolutely inconsistent. Dialethic paraconsistency provides a fuller and more exact characterization of the infinite than any non-paraconsistent mathematics could, and this mathematics in turn provides a map of where the limits of the transconsistent lie.

Then a discussion of the mathematics is presented, arguing that the basic driving mechanism behind the transfinite is contradiction; classical logic casts this as a prolix structure, whereas paraconsistent logic makes plain the paradox of infinity, out of which the familiar transfinite can be discerned, to be sure, but now recontextualized, as a basic consequence of a dialethia. In his work on paradoxes, Bolzano opened by quoting de Morgan, himself quoting Aaron Hill; and we repeat:

Tender-hearted stroke a nettle,
And it stings you for your pains;
Grasp it like a man of mettle,
And it soft as silk remains.

If there be in mathematics a nettle danger out of which has been plucked the flower safety, it is speculation on 0 and ∞.

The philosophical concerns of chapter 2 are amenable to our inconsistent mathematics, which can demarcate the absolute from the transfinite.

1. Characterizing the Absolute

Cantor distinguished three categories: the finite; the transfinite; and the absolute. Dedekind produced a precise means of distinguishing the finite from the transfinite. A still-outstanding problem in set theory, however, is finding similar demarcation between transfinite and absolute; without such a method for drawing a clear line, the doctrines of limit on size and proper classes are underdetermined. I present a way to characterize the absolutely infinite, a criterion to differentiate the absolute from the transfinite, as maximal.

Paradoxes are fertile sites for new discoveries. In the nineteenth century, the very same phenomena blocking the way to a science of the infinite (various counterintuitive, inconsistent, and even ineffable properties of the infinite) were converted to definitions and theorems by Dedekind and Cantor. Dedekind’s inversion of Galileo’s paradox is the motivating example: Dedekind stipulated that a set is infinite iff it bijects with a proper part of itself; and correspondingly, that a set is finite iff there is no such bijection. In symbols, sets M bearing the property

(∃X)(X ⊂ M ∧ |M| = |X|)

are dedekind infinite. Every radial through two concentric circles cuts each circumference at exactly one point, setting up a bijection between the two circumferences; but isn’t the circumference of the inscribing circle longer than the inscribed? Yes; its points are infinite. So Galileo’s paradox, rather than discrediting, characterizes the infinite.
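A standard worked instance of Dedekind's criterion may help fix the definition. It is added here purely as an illustration: the particular map $s$ is an assumption of this sketch and is not taken from the text. The successor map sends ω one-to-one onto a proper part of itself.

```latex
% Illustration only: s is a hypothetical name for the successor map on \omega.
\begin{align*}
s &: \omega \to \omega \setminus \{0\}, \qquad s(n) = n + 1, \\
s(m) = s(n) &\;\Rightarrow\; m = n
  && \text{(injective)} \\
k \in \omega \setminus \{0\} &\;\Rightarrow\; k = s(k - 1)
  && \text{(onto the proper part)} \\
\omega \setminus \{0\} \subset \omega &\;\wedge\; |\omega| = |\omega \setminus \{0\}|
  && \text{(so $\omega$ is dedekind infinite)}
\end{align*}
```

By contrast, for a finite set such as {0, 1, 2}, every injection of the set into itself is already onto, so no proper part has the same cardinality as the whole.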

Similarly here, can we take a heretofore poorly understood phenomenon, that of overlarge or absolutely infinite sets, and interpret their strange features as a precise characterization? With the paradoxes of the absolutely infinite openly on display and a cogent mathematics to work from, we can straightforwardly read off some maximal properties and show that they serve precisely and provably to demarcate absolute from transfinite in the appropriate way. This would be to incite an update from Russell’s [Rus37] comments, to show that while “[set theory], though it cannot wholly dispense with [absolute] infinity, has as few dealings with it as possible, and contrives to hide it away before facing the world,” that in fact “like many skeletons, it was wholly dependent on its cupboard, and vanished in the light of day.” The axiom of choice makes the definition of infinity precise; the axiom of comprehension makes the definition of the incomprehensible possible.

The matter of stating necessary and sufficient conditions that a set be absolutely infinite, or absolute, is not simple. Whereas in our motivating case, Galileo’s contradiction is simply resolved by making it a condition for infinitude, here there can be no hope of such resolution. The reason is that the absolute is inherently contradictory; this is one of its defining characteristics. And the contradiction is a self-perpetuating one: something like

X is absolute iff X cannot be characterized.

This makes it remarkably difficult a priori to state conditions that, on the one hand, do serve as cogent criteria for absoluteness, and on the other hand respect the nature of the object under study, namely that the object under study cannot be captured. By comparison: When a set is dedekind infinite, it may still offend against certain intuitions (e.g. that wholes simply cannot be the same size as their proper parts) and so remains paradoxical in the etymological sense of the word. But as a matter of pure logic or mathematics, there is no such problem. The particular rules of transfinite arithmetic exert a calm control of the dedekind infinite, with surprises but without contradiction, and that is the end of the problem. The absolute, on the other hand, is essentially paradoxical. Any appropriate definition of the absolute would still by necessity incorporate some of its mystery; being intractably mysterious is in part constitutive of being absolute.

We will satisfy this in part, because the naive view of sets is unitary: Any collection at all is a set. The naive approach is exactly to treat the universal set or the set of all ordinals as sets like any other, so absolutely infinite sets are also transfinite; no sets are absolute. And in a sense, this just is the paradox. The absolute is the domain of all objects, the infamous universal set of all sets; and in line with our naive unitary conception, it is a set. But the absolute is also “aliquid quo nihil maius cogitari possit,” in Anselm’s idiom, and is not amenable to mathematical operations or extensions, pulling it apart from all the other sets in its province. This fact plays out in two directions in the Cantorian universe, as we saw in chapter 2, in generalizations from below and reflection arguments from above; the former highlights problems of extendability, the latter problems of incomprehensibility.

If a set is inconsistent (i.e. it has members that are also non-members) then it is absolute, as demonstrated in lemma 5.66; but not conversely: V is absolutely infinite, yet is consistent, on pain of triviality. So inconsistency is not necessary for absoluteness. One cannot even pick out peculiar, even necessary properties of the absolute in order to isolate it; for example, by the conjecture that a set X is absolute iff it reflects: If Φ(X) then ∃X0 such that X0 ⊂ X and Φ(X0). This fails as a definition as we saw at prop. 6.1: The universe V is absolute, and V = V, yet there is no set X ⊂ X = V, on pain of triviality; so absoluteness does not imply this kind of reflection. Assuming the universe is absolute (and if not V, then what?), something else is required.

Let Abs(X) mean that X is absolutely infinite. We have seen that Abs(X) implies that X reflects; but reflection cannot characterize absoluteness. A characteristic relating to, but not synonymous with, reflection and inconsistency is needed. I propose the following.

Definition 7.1. A set X is absolute iff X is maximal with respect to cardinality,

Abs(X) ↔ (∀Y)(|X| ≤ |Y| ↦ |X| = |Y|).

And X is transfinite iff it is infinite and extendable,

Tn(X) ↔ ℵ0 ≤ |X| ∧ (∃Y)(|X| ≤ |Y| ∧ |X| ≠ |Y|).

Here ↦ is the preferred conditional because the consequent may not follow relevantly from the antecedent. Let us show that this is the right definition, by proving that it delivers the expected properties of absoluteness.

Proposition 7.2. Some sets are absolute; in particular, Abs(Ω), and Abs(V ).

Proof. Since the property of absoluteness was inspired by Ω, it is no surprise to see that if Ω ≤ |X|, then since |X| ≤ Ω for any X at all, it follows that Ω = |X|. And since |V | = Ω, by substitution the universe is absolute, too.  Since all inconsistent sets are Ω-sized (lemma 5.66), this gives the immediate

Corollary 7.3. If X is inconsistent, X is absolute.

Proposition 7.4. All absolute sets are transfinite.

Proof. Let Abs(M). Then, by ∀-elimination, |M| ≤ Ω ↦ |M| = Ω. Since every cardinal is less than Ω, by modus ponens follows |M| = Ω. By substitution, since Ω ≠ Ω, it follows again by modus ponens that |M| ≠ |M|. Now, |M| ≤ |M|, so adjoining we have |M| ≤ |M| ∧ |M| ≠ |M|, from which follows by ∃-introduction that Tn(M).
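The proof of Proposition 7.4 can be laid out line by line. The following is a hedged reconstruction in the notation of Definition 7.1; the step numbering is added here, and the clause $\aleph_0 \le |M|$ of Tn is left implicit, as it is in the proof itself.

```latex
% Reconstruction only; line numbers and justifications are added for illustration.
\begin{align*}
&1.\ \mathrm{Abs}(M) && \text{assumption} \\
&2.\ |M| \le \Omega \mapsto |M| = \Omega && \text{1, Def.\ 7.1, $\forall$-elimination} \\
&3.\ |M| \le \Omega && \text{every cardinal is below $\Omega$} \\
&4.\ |M| = \Omega && \text{2, 3, modus ponens} \\
&5.\ \Omega \ne \Omega && \text{established earlier for $\Omega$} \\
&6.\ |M| \ne |M| && \text{4, 5, substitution} \\
&7.\ |M| \le |M| && \text{reflexivity of $\le$} \\
&8.\ |M| \le |M| \wedge |M| \ne |M| && \text{6, 7, adjunction} \\
&9.\ (\exists Y)(|M| \le |Y| \wedge |M| \ne |Y|) && \text{8, $\exists$-introduction}
\end{align*}
```

Line 9 is the extendability clause of Tn(M), witnessed by M itself; the inconsistency of Ω at line 5 is what lets an absolute set count as extendable.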

Proposition 7.5. Transfinite sets are not absolute.

Proof. If X is transfinite, then it is extendible: X is less than, and not equal to, some Y . Then by counterexample, it is not absolute. 

Proposition 7.6. Every infinite set is transfinite. No set is absolute.

Proof. For every X, we have |X| ≤ Ω ∧ |X| ≠ Ω. Generalization gives the first result. Then, transfinitude implies non-absoluteness, and all sets are transfinite, proving the second claim.

2. Characterizing the Transconsistent

Sets and logic have a long history together, and their junction is a notorious locus for paradox. Classical logic and classical set theory arose in the late nineteenth century largely as a single foundational enterprise, and the fact that cardinality and consistency are deeply intertwined has been obvious to most, even if not well articulated by many. It is the connection between cardinality and consistency, studied for the first time at their limiting fixpoint, that is exploited here.

In the opening set theoretic excursion of chapter 0, we cited a distinction between rejection and denial, and found that, whilst some propositions are dialethic, and so a proof of p does not necessarily rule out ¬p, nevertheless implication of absurdity was enough to assure the consistent behavior of some theorems. If we prove p, and ¬p → ⊥, then p is consistent; recalling da Costa’s consistency operator, this can be marked p◦.

Discerning consistent from inconsistent information would be very useful. Inferences that are generally invalid, like disjunctive syllogism, could be revived, presumably in the form:

p ∨ q, ¬p, p◦ ⊢ q

Certain theorems which are not forthcoming in DLQ might then be provable. For example, letting α, β be ordinals, we have not yet established the basic theorem

α ⊂ β ⊢ α ∈ β,

as the standard argument makes indispensable use of the disjunctive syllogism. (Suppose α ⊂ β. Then there is a least γ ∈ β − α. To show that γ ⊆ α ∧ α ⊆ γ, for α = γ and thereby α ∈ β, one ends up considering x ∉ β ∨ x ∈ α, wanting to assert x ∈ α on the basis that x ∈ β. But this inference is invalid.) If (x ∈ β)◦, we could use Priest’s notion of quasi-validity, as discussed in [Pri89] and

[Pri06b], and could allow paraconsistent reasoning to wax classical in consistent situations.

As usual, though, matters become more subtle when we move from the propositional level to the predicate calculus. In particular for naive set theory, one would need to know when ∈ behaved consistently; and obtaining this knowledge is in general not possible. Now, ∅ and V are consistent on pain of triviality, but what about some ordinal β, as in the example above? If β = On, then we know β is inconsistent and not suitable for quasi-inferences; but if β ≠ On, we still don’t know much, since On ≠ On so β may yet just be On. And similarly, we have a proof that if (∃z)(z ∈ β ∩ β) then |β| = On, and then β = On, but again since |On| ≠ On, this limit on size theorem cannot serve to reliably demarcate consistent ordinals.

According to the logic DLQ, all propositions are either true or false. Some propositions are both true and false. According to DLQ, all propositions are consistent. For any Ψ, that is, it is not the case that Ψ ∧ ¬Ψ, so the consistent propositions are all the propositions and dialethism is itself a dialethia. This is analogous to the unitary view we took with the absolute. There, no sets are absolute, so the transfinite exhausts all the infinite sets. Again, then, where is a working criterion that at once picks out all and only the dialethic propositions? According to the logic, which has non-contradiction as a theorem, all propositions, all sets, all ordinals are consistent! Identifying the true contradictions, as opposed to the simply false ones, is highly analogous to, and just as difficult as, identifying the absolutely infinite sets.

Inconsistency, except in special cases like ∅, cannot be ruled out: the exclusion problem [Ber07](ch14). We have obtained some techniques for demarcating inconsistent sets, namely, that the transfinite has lapsed into the absolute by showing that cardinality has hit Ω.
Still, there is no algorithmic check for consistency or inconsistency, even with criteria for knowing when a set is absolute, since the absolute itself is transfinite, too. This paradox, then, is harder than Galileo’s to appropriate for the foundation of a new science, because one horn of the contradiction is that the subject, the dialethic absolute, does not even exist.

Here is the start of a mathematical gauge of inconsistency; we tread into metatheory, which has been outside the interests of the present work, but which can be used to sketch a technique. Let Φ be any formula. Let ⌜·⌝ be a name-forming operator, e.g. a suitable gödel coding that could be provided with some basic arithmetic and recursion theory. By comprehension, ν(Φ) = {x : Φ}, the set of witnesses for Φ, can be used to define a truth predicate:

ν(Φ) ∈ ν(Φ) is T(⌜Φ⌝).

It is straightforward to check that

T(⌜Φ⌝) ↔ Φ,

showing that the truth predicate satisfies the T-schema. This will be our point of entry into the connection between cardinality and consistency.

The more vexing results of have been in the vicinity of truth and provability predicates, to prove limitative theorems. Of metatheory, Minsky warns that “while the logic is easy to follow, one feels that the whole thing is a sort of extended joke or pun.” [Min72](169) The punchline of Minsky’s joke is obtaining the diagonal lemma, which captures the fixed point phenomenon that we have been studying throughout. For every formula Φ(x) of one free variable, there is a sentence Λ such that Φ(⌜Λ⌝) ↔ Λ. A proof (which will eventually be able to be carried out within naive set theory itself) is as follows; see [BJ89](173), [Pri06b](49).

Let Φ(x) be any formula with only x free. The diagonalization of Φ(x) is Φ(⌜Φ(x)⌝). Let k be the gödel code of Φ(x), ⌜Φ(x)⌝ = k, so the diagonalization of Φ(x) is Φ(k). There is a function δ such that

δ(k) = ⌜Φ(k)⌝.

Consider Φ(δ(x)). Let ⌜Φ(δ(x))⌝ = n. The diagonalization of Φ(δ(x)) is Φ(δ(n)); let ⌜Φ(δ(n))⌝ = m. The Leibniz identity principle is

δ(n) = m ⊢ Φ(δ(n)) ↔ Φ(m).

And δ(n) = m. Therefore Φ(δ(n)) is what was sought,

Φ(⌜Φ(δ(n))⌝) ↔ Φ(δ(n)).

From the diagonal lemma, it is then easy to show that truth is inconsistent. Consider the formula of one free variable, ¬T(x). By the diagonal lemma, there is a proposition

Λ ↔ ¬T(⌜Λ⌝).

Either Λ or ¬Λ. Then Λ → T(⌜Λ⌝), and this by contraposition on the T-scheme brings ¬Λ; so ¬Λ by the laws of dialethic logic. Yet ¬Λ → T(⌜Λ⌝) by contraposition on Λ itself, which by the T-scheme implies Λ; so Λ. Therefore Λ ∧ ¬Λ, and therefore T(⌜Λ⌝) ∧ ¬T(⌜Λ⌝). And from this, if our coding device is working, it follows that truth is absolutely infinite,

|{⌜Φ⌝ : Φ}| = On,

by lemma 5.66. This insight is now applicable to the question at hand.
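The liar derivation just given is compressed, so one way of reading it is set out below. This is a sketch rather than the author's own presentation: the line numbering is added here, and it assumes that contraposition, double negation, transitivity of →, and argument by cases from excluded middle are available in the logic, as the surrounding discussion suggests.

```latex
% One possible reading of the liar derivation; requires amsmath and amssymb.
\begin{align*}
&1.\ \Lambda \leftrightarrow \neg T(\ulcorner\Lambda\urcorner) && \text{diagonal lemma applied to $\neg T(x)$} \\
&2.\ T(\ulcorner\Lambda\urcorner) \leftrightarrow \Lambda && \text{T-scheme} \\
&3.\ \Lambda \to T(\ulcorner\Lambda\urcorner) && \text{2} \\
&4.\ T(\ulcorner\Lambda\urcorner) \to \neg\Lambda && \text{1, contraposition, double negation} \\
&5.\ \Lambda \to \neg\Lambda && \text{3, 4, transitivity} \\
&6.\ \neg\Lambda && \text{5, cases on $\Lambda \vee \neg\Lambda$} \\
&7.\ \neg\Lambda \to T(\ulcorner\Lambda\urcorner) && \text{1, contraposition, double negation} \\
&8.\ \Lambda && \text{6, 7, 2} \\
&9.\ \Lambda \wedge \neg\Lambda && \text{6, 8, adjunction} \\
&10.\ T(\ulcorner\Lambda\urcorner) \wedge \neg T(\ulcorner\Lambda\urcorner) && \text{8 with 3; 8 with 1}
\end{align*}
```

No step uses disjunctive syllogism or ex falso, which is why the contradiction at lines 9 and 10 can be accepted without trivializing the theory.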

If Φ is true, then (∀x)x ∈ {y : Φ}, and if ¬Φ is true, then (∀x)x ∉ {y : Φ}. And in the dialethic case,

T⌜Λ ∧ ¬Λ⌝ ↔ (∀x)(x ∈ ν(Λ) ∧ x ∉ ν(Λ)).

This shows that membership in ν(Λ) for inconsistent Λ is inconsistent, and therefore that ν(Λ) is absolute, that is, |ν(Λ)| = |On|. Thus we have the promised connection: All true contradictions make set theoretic contact with the absolutely infinite. What is still missing is a sufficiency condition; having an overlarge witness set is only a necessary condition for dialethias. But this, like our account of the absolute, gets some essential properties correct. For a start, contrapositively, if |ν(Λ)| is not absolute, then it is not the case that T⌜Λ ∧ ¬Λ⌝. But nothing is absolute, as we saw in the last section; so no contradictions are true. This agrees with our paraconsistent logic, and the dialethic nature of dialethism.1

3. Characterizing the Mathematics

When the idea of changing the logic, rather than the set concept, was raised, I wonder if even its enthusiastic proponents knew exactly what they were proposing. The naive view of sets was expected to be: the classical view of sets, with some additions; an extra layer at the top of the cumulative hierarchy was the way Reinhardt put it. It was to be a closure on an otherwise good theory. In general the expectations of paraconsistent inconsistent mathematics were essentially not revisionary, not even particularly weighty. Though his views have always been treated as heretical extremism, in fact Priest from the start has been conservative, maintaining mostly orthodox structures and interpretations, affixing dialethias at the edges; his logic LP is a straightforward example: In the cases of classical truth values, LP is just classical logic.

Like those before, I do not want or expect that dialethic paraconsistency will lead to the rejection of any core mathematics. As remarked from the outset, dialethism is born out of and obeys the same basic rules as any other philosophy; DLQ has all classical tautologies as theorems, and if → is interpreted materially, then DLQ is a proper fragment of classical logic. But the force of that mathematics, that philosophy, that logic, is different. Changes in context make changes in meaning. If there were living microbes discovered on Mars, this would not in the slightest change any biological structures here on Earth. The discovery would, though, irrevocably alter the meaning of biological structure on Earth. So too with mathematics, philosophy and logic, it should be expected that the introduction and acceptance of entities as singularly exotic as dialethias will irrevocably alter the context and so the meaning of mathematics, philosophy and logic. Nothing will change the fact that Gödel proved the diagonal lemma; what follows from that proof, though, we have yet to see in full.

Some of this is, of course, forecast by and even made a feature of the paraconsistent program. The extent of the epoch-shift, though, is as yet unknown. My claim has been that, in the face of paradox, paraconsistency is a means to advance. In part two a naive set theory was developed, so at the most basic level, we know that there can be some paraconsistent contribution to foundational mathematics. How, though, does the naive set theory of this volume measure up to our adequacy criteria, of being explanatory? Even a cursory reading will show that it is unprecedented and unusual in some respects, and is therefore not easily judged. “The realm beyond the consistent,” after all, “is a continent on whose shore we have just alighted.”[Pri06b](209)

1As formulated in DLQ. In [Pri06b], chapter 4, Priest formulates the T-scheme with a non-contraposable conditional, so that T⌜¬(Λ ∧ ¬Λ)⌝ holds without ¬T⌜Λ ∧ ¬Λ⌝. Given our embedding of the truth predicate in set theory in DLQ, which does contrapose, Priest’s analysis does not fit here.

3.1. To make a reduction of mathematics to set theory worthwhile, recall, it is necessary that sets be better understood than more complex objects like numbers, else we fall afoul of Hausdorff’s obscurum per obscurius. Is naive set theory descriptive and explanatory? Naive set theory is based on the idea of sets being predicates in extension; but there is little hope of arguing in isolation that the naive concept is natural. Extended apologies that p is ‘obvious’ are necessarily rather self-defeating. For what it is worth: I’ve argued that the set concept, as much as it can be discerned, is inherited from Cantor, Dedekind and Frege, who were only a few of those who took the concept to be obvious; that no other concept has provided an adequate replacement; and that in fact the natural set concept is still the one implemented in mathematical practice. Routley and Priest write,

No one has ever explained what is wrong with instances of abstraction which fail in the cumulative hierarchy. If it could be shown that the cumulative hierarchy was the essential core of the theory of sets ... the solution would still not be adequate. For, why we should ever have thought that every condition defines a set, remains a complete mystery. It doesn’t even look like a plausible claim. How could such a mistake be made? No: the genuine conception of set is that given by the unrestricted abstraction scheme.... [PR89a](506)

Adding a more constructive note to this point, I’ve endeavored to show that there is much to be learned from the inconsistent aspects of sets. Yet, even if it is now demonstrated that naive set theory provides the start of an ontology for mathematics, whether it is the correct, authentic ontology of mathematics is still undecided. Uncontroversially, naive set theory is the description of something; let us try to understand what it describes. There are many theorems in this work, and I think they are true. Some of the theorems are also inconsistent, and as pointed out in the last section, there is no way to fix unequivocally, sub specie aeternitatis, that they are not all false. This is the question of the force of dialethic truth, and a point at which it seems to depart from the more widely accepted notion of truth. This is the exclusion problem recast: A consistent truth p has a small witness set, |ν(p)| strongly less than Ω; every size is less than Ω; there is no way to pick the sentences that are true only. Chapter 0 goes to some lengths to reconnect dialethism with the wider community, but there is a basic stopping point here: A mainstream mathematician will not feel that an inconsistent theorem (which all theorems here at least possibly are) is true. I think the theorems of naive set theory are true. To justify this sense, and in part to begin to address the exclusion problem discussed above, we offer a more expansive explanation of mathematical truth, following up on Priest and Routley: “The cumulative hierarchy is exactly what it appears to be with a little historical perspective: a consistent substructure of the inconsistent universe of sets, masquerading as the whole thing.”(506) I do not claim that mathematicians, as a matter of practice, believe in dialethism.
With some notable and mystifying exceptions (for instance, see Woodin’s disconcerting “Tower of Hanoi” in [DO98]), mathematicians do not seem to entertain the possibility that, upon encountering a contradiction, they have made anything more than a mistake. So naive set theory is not a cultural study of mathematicians, and, on this conception, I respect the point put to me by one skeptic (in conversation): “Inconsistent mathematics is not mathematics.” But this is only so much anthropology and does not rule out that naive set theory is explanatory of actual mathematical truth, nor does it deprive naive set theory of descriptive status. Let me explain. The technical details suggest, and I wish to argue further, that classical mathematics is an undisturbed fragment of the wider edifice, which by its nature must be paraconsistently dialethic. In this sense naive set theory is explanatorily descriptive, by presenting a well-motivated whole of which the familiar is a part. This is not a particularly new claim—Priest has been making it about paraconsistent logic for years, e.g.

Going paraconsistent is a gain. ... A paraconsistent logician may analyze every situation that a classical logician may analyze in exactly the same way. But they can also analyze more. Paraconsistent logic therefore has excess power. ... Paraconsistent logic is therefore preferable to classical logic for exactly the same reason that classical logic was preferable to traditional logic. It is a more powerful and flexible inference engine. Why prevent yourself from exploring what is beyond the consistent, when you lose nothing by this? Why shackle yourself with the necessity of consistency, when you don’t have to? ... Logicians of the world unite! You have nothing to lose but your chains. [Pri00b](232)

The point can only be pushed so far, but it is worth asking what this supersession idea comes to. There is a realist thought here, an optimistic induction. To take a well-worn example as the base, the pythagoreans believed the world is explicable in terms of rational numbers. One can learn quite a lot about pythagorean triangles, right triangles with only natural number length sides (Sierpiński has a nice little book on the topic), but these are not all the triangles. Irrational numbers are a part of the larger, more useful structure that is incommensurable with a rational-number worldview. Importantly, that the pythagoreans did not believe in irrational numbers, is completely irrelevant to the existence, meaningfulness, cogency or applicability of irrationals. Similar comments could be made about negative, or much later, complex numbers. So real analysis does not, per se, describe pythagorean practise. But because mathematics is not a revisionary science, neither does analysis impugn the positive aspect of the pythagorean belief set; progress adopts what has already been discovered, and then says more. The pattern has the emergence of an incommensurable magnitude or other insolubilia, its integration into a wider system, and then a retrospective explanation and description of what has come before. The main failure of the cumulative hierarchy is that it retains insolubilia; there is still too much it cannot explain. Now, by turns, we can ask if naive set theory really is a reworking of the nineteenth century systems, and then what points of contact naive set theory has with the other systems of the twentieth century. At the start of his [She00] Shelah asks of his own complex research, “Could Cantor have read it?” and so too here.
Again, I do not think Cantor was a dialethist; but I wonder whether he would have regarded naive set theory as later explorations of the same universe he discovered, or whether it would be regarded as a different subject altogether. A case study may help bring out what is at issue: the intertranslatability of theorems.

3.2. Routley states that

In order to sustain the ultramodal challenge to classical logic it will have to be shown that even though leading features of classical logic and theories have been rejected, ... by going ultramodal one does not lose great chunks of the modern mathematical megalopolis. ... The strong ultramodal claim—not so far vindicated—is the expectedly brash one: we can do everything you can do, only better, and we can do more. [Rou80](927)

In calling for the recapture of classical results with paraconsistent logic, Routley seems to have assumed that theorems are somewhat logic-independent—that a classical theorem proved paraconsistently is still the same theorem. For example, Routley’s 1977 proof with DKQ of the axiom of choice does indeed validate the assertion:

There is a function f such that, if x is a non-empty set, then f(x) ∈ x.

But the definition of ‘function’ in [Rou77] is a relation that is either univocal or empty. Now, this latter, as Routley notes, is the case classically by virtue of a paradox of material implication: If f = ∅, then from ⟨x, y⟩ ∈ f follows anything at all, including functionality. So Routley brings to the classical arrangement a non-classical logic, and argues that the axiom of choice is true, essentially (though the details differ) by presenting the set

f = {⟨x, y⟩ : R ∈ R}, where R is the Russell set. Since R ∉ R, f is empty, so it is a ‘function.’ Since R ∈ R, any ordered pair at all is in f; so of course f(x) ∈ x for non-empty x. But this is not, by any stretch of the imagination, what Zermelo’s axiom of choice asserts. Unreflective assumptions about the faithful translation of mathematical concepts, then, pose real danger. I have been taking myself to be studying, not just naive paraconsistent set theory, but set theory simpliciter. Naive paraconsistent set theory just is set theory. What comes out in these examples, though, is that the same syntactic strings, e.g. ‘there is a measurable cardinal,’ may mean differently in different contexts. I have proved Cantor’s theorem, but not using the same diagonal set; the set used here exists only by full comprehension. I have proved Peano’s postulates, but using a Möbius-like representation of ω, with ω appearing in its own defining predicate. One may worry that, whilst all agree that 1 + 1 = 2, my plus is really a wittgensteinian quus—that these are not the same mathematics. More, some of the ideas and techniques in naive set theory are novel, e.g. at theorem 5.50, and may seem too different from the familiar canon. The core mechanism of the results here obtained in naive set theory with respect to infinite numbers has been the non-self-identity of On, the set of all ordinals. More than any of the inclosure paradoxes (Russell’s set, for example, has played no more than a nominal role), the Burali-Forti contradiction takes centre stage—from a demonstration of the wellordering theorem, to proof of limitation on size to the existence of all transfinite cardinals, to breaking the generalized continuum hypothesis. This is quite a lot. And the burden of the remainder of this section is to show that this is as it should be.
The paradox of all the ordinals was always regarded as pathological; but “it is always a mistake to think of anything in mathematics as mere pathology, for there are no such things in mathematics.”[For95](11) I submit the following explanation.

3.3. The paradox of absolute infinity, exemplified at the top of the ordinals, is the core subject matter of Cantor’s mathematics. Set theory is the study of reflection properties of On. This is disguised by consistent reasoning and made plain by the paraconsistent proofs. Generally, consistency preserving theories take the shape of hierarchies; and this is what Cantor’s transfinite looks like. Here we have the hierarchy, but also a unity, which is what the naive view entails. Put differently, set theory trades in ordinals (to paraphrase one of Forster’s metaphors, that the currency in which ZF trades information is cardinality), and we have all the ordinals. It should not be surprising what an influx of such wealth does to the set theoretic economy. In their excellent essay on unrestricted quantification, Shapiro and Wright admit a certain frustration:

There is just no denying that Ω is the set of all ordinals—all possible order-types. Not all the ordinals except those that come after, the ‘proper’ ordinals, or higher-ordinals, or whatever. All of them. There may be a consistent formal theory, but it does not sustain the intended interpretation except at the cost of informal paradox—the same old Burali-Forti paradox—and for good measure, some additional variations on the theme—that was there all along. Let us stop running in circles and step back.[SW06](292)

The authors then outline what they take to be the five possible responses to the situation, each with an associated cost. Each cost is reckoned to be too high. The last option is to proceed with the natural, intuitive use of quantifiers and predicates, and “just accept that there are ordinals that come later than all the ordinals. Cost: none—unless one demurs from the acceptance of contradiction.”(293) Here we have not demurred and now count the benefits.
The point will be detailed by recalling some main characteristics of Cantorian theory: a wellordered hierarchy of transfinite powers pegged to the ordinals, expositing maximal information from the universe of sets by naming maximal ordinals. The properties provided by On square nicely with these features and suggest that direct study of On makes for more direct, but still essentially the same, set theory. Consider the set

θ = {α ∈ On : |α| ≤ ℵ₀}, the collection of all finite or countable ordinals. This is a transitive set of ordinals, so θ ∈ On. Therefore θ ∉ θ. Therefore (the informal argument goes), θ is uncountable. Taking closure, by recognizing the set of countable cardinals, gives us higher cardinals. And this is the same structure we get at the end of On itself, except at that point the collectivising properties have tapered to a point and there is nowhere left to go. In fact in chapter 6 we saw that an inclosure is just what Kunen found of Reinhardt cardinals—that for sufficiently large values of κ any extension will both be an extension and not. That Kunen concludes there must be no such κ, though, is only the most incredulous reading of data. In mathematics progress is made by asking the right questions; this is how Cantor came to the transfinite. More, just by observing this structure at On, an inclosure structure, we are able to infer the lower, more mainstream cardinal powers, too. When in e.g. [Kan94] large cardinals are studied, there is the expectation that these are discrete, independent, and meaningfully different entities, e.g. that a measurable cardinal qualitatively transcends inaccessible cardinals. In naive set theory, measurables qualitatively transcend, too. And an inaccessible cardinal may well be a measurable cardinal, since all these are initially shown to exist by virtue of the same cardinal, Ω. This makes sense, because large cardinals are, classically, expressions of faith in consistency. ZF trades information in cardinality; ergo Gödel’s proposed new axioms aimed at resolving outstanding questions do not posit new properties of sets, but more limit ordinals. In naive set theory we have reported the same facts, now theorems, by taking the closure of the ordinals and finding repetition in an array of reflection phenomena. Seen in this way, the proof of Zermelo’s wellordering theorem by appeal to a function into {On} is wholly appropriate.
From Burali-Forti’s contradiction we know that On itself is wellordered. What the paradox says, when taken seriously, is that the entire structure of the ordinals is recapitulated at the top. Like all ordinals, On is itself the set of all preceding ordinals; but unlike other ordinals, which in their very naming insist on further extensions, On can only, and does, extend into itself. The fact that

On = On ∪ {On}, which is not a property of every self-membered set in DLQ but is instead a result of On being an ordinal (proposition 5.39), makes plain that an injection into {On} is an injection into a rich structure. The structure is wellordered because it is a subset of the ordinals. Cantor took wellorder to be a basic correlate of God’s omnipotence. The mathematics of naive set theory, in absentia deo, finds that the only basic assumption one needs is the set concept, and this alone. The intensional grasp, the Begriff, of the infinite induces wellorder. Aside: Not all inconsistent sets are wellordered. Take T = {⌜Φ⌝ : Φ}, which can be seen as the set of all truths. Due to the liar and other dialethias, T ≠ T. But the subset {T} is not well founded, since it is not the case that (∀y)(y ≠ T ∨ y ∉ T). The unifying idea has been to observe the absolutely infinite, replete with its contradictions, and to write down what can be discerned. Having shown that On is a cardinal, it is a matter of elementary logic—little more than asking the question—to show that there are orders of infinite cardinals: Just existentially generalize on the discrete sequence ...On < On < ... < On... Cantor’s diagonal argument, where an inconsistency-insistent subset is named and used to foil any surjection from a to P(a), follows the same principle: On the basis of inconsistency, discern non-identity: |a| ≠ |P(a)|. From the fact that their identity is contradictory, take the two sizes to be distinct. In this way Cantor was able to see a significant aspect of the structure of infinite cardinals. Nothing proven in naive set theory disturbs his theorem, made by reductio; only naive set theory using the improved paraconsistent apparatus sees more, and says what it sees: The tracts beyond the end of the number line are infinite because they are inconsistent.
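For comparison, the familiar classical form of the diagonal argument can be set out in a few lines. This is only a sketch of the textbook reductio, not the construction used in this thesis, whose diagonal set exists only by full comprehension; the names f, D and d are illustrative and do not come from the text.

```latex
\begin{align*}
&\text{Suppose, for reductio, that } f : a \to P(a) \text{ is a surjection.} \\
&\text{Let } D = \{x \in a : x \notin f(x)\}. \\
&\text{Then } D \in P(a), \text{ so } D = f(d) \text{ for some } d \in a. \\
&\text{But } d \in D \leftrightarrow d \notin f(d) \leftrightarrow d \notin D. \\
&\text{Classically, no such } f \text{ exists, and so } |a| \neq |P(a)|.
\end{align*}
```

The fourth line is the contradiction from which the non-identity of the two sizes is read off.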

3.4. Infinity is dialethic; mathematics has heretofore been insistent on consistency; therefore closure on infinity is denied and transcendence delivers a tower of cardinals. Consistent reasoning reads hierarchical structure off an inherently inconsistent edifice. And naive set theory is also consistent—or at least, all propositions of the form Φ ∧ ¬Φ are (at least) false. Therefore, naive set theory should be expected to and has produced the same transfinite tower. Cantor’s theorem holds in naive set theory. It may yet turn out that all transfinite cardinals are identical. If there were an inconsistency in finite arithmetic, for example, as some have speculated based on Gödel’s theorem, then in the way we have set them out here, all numbers above the least inconsistent would both collapse and not. (If n ≠ n for n ∈ ω then n = Ω and so too does every α > n. Models for a collapse in the finite case, at some n = n + 1, are considered in [Pri97], and more generally in [Pri00a]; or in [Pri06b], chapter 17.) The implications for such an inconsistency are vast, beginning with the countability of all sets [Pri94]. If sets really are just predicates in extension, though, then since there are only countably many predicates, this begins to make some cohesive sense; a conjecture: There is a one-to-one function between ω and V. Throughout we have not proven, because we cannot, that all the transfinite cardinals do not collapse back into a single, ineffably infinite singularity. Even though |a| < |Pa|, and therefore |a| ≠ |Pa|, still it may be that |a| = |Pa|. Above ω there are different cardinals, but these cardinals may yet collapse back into one. Scott foresees this day:

I still feel that it ought to be possible to have strong axioms, which would generate ... models as submodels of the universe, but where the universe could be thought of as something absolute. Perhaps we could be pushed in the end to say that all sets are countable (and that the continuum is not even a set [i.e. is inconsistent]) when at last all cardinals are absolutely destroyed.[Bel05](xv)

The generalized continuum hypothesis is, by the Gödel-Cohen independence result [Coh66], for all intents and purposes a truth-value gap in the classical theory. Here in naive set theory we have a straightforward refutation, or at least a counterexample to GCH, at On itself. There we also have an instance where GCH holds. And were it to transpire that all infinite cardinals are identical, then the GCH would eo ipso be true, too. Based on some well-observed dualities between the paraconsistent and the paracomplete, between gluts and gaps [Mor95](chapter 13), this might be what we should expect. The GCH is an exciting question for consistency-based set theory; its answer would still be informative for paraconsistent set theory; but once we can see the inconsistent whole apart from its consistent parts, as it were, the emphasis and meaning of the questions change.

3.5. In classical set theory we grow accustomed to forced models in which the axiom of choice is true, and forced models in which it is not; and so we learn to be resigned that there is no real answer here. On the question of the size of the linear continuum—Hilbert’s first problem, on the real points in a familiar interval—we learn from [Coh66] not to press too hard. All of this is simply gone in naive theory—or, if not gone, then shifted in interest: Now we wonder, if the GCH has a counterexample at On, whether or not it continues to hold. Because of classical training it is hard to think of advanced set theory, the independence proofs that are the centre of most textbooks e.g. [DS96], as not centrally concerned with what are in fact very idiosyncratic bits of classicality. In naive set theory we have back, in full, the set concept, and have shown that the ordinals and cardinals, the core of mathematics, can be built on it. A century of anxiety to the contrary can fade away. Now there are new worries, about what to prove next, and how the proof will go. “In many affairs,” Russell reminds us, “it’s a healthy thing now and then to hang a question mark on the things you have long taken for granted.” To the point here, the transfinite: Didn’t Cantor see a rich generalization of the natural number line and demonstrate its discrete, consistent extension into the absolute? Unequivocally, yes—with the caveat that claiming consistency is an overreach. Cantor’s theory was clearly coherent, and the tie-up between coherence and consistency led many to assert the latter, even in the face of paradoxes. Now under the paraconsistent context shift, the meaning of the transfinite changes, exactly because its structure remains the same. A mark of truth is being both surprising and obvious.
A paradigm example is Tarski’s account of truth itself, which on the one hand seems too banal to be worth stating, but on the other was never stated so clearly before, and has had indelible impact since. The brilliance of Tarski’s scheme is not in the answer it provides, so much as the fact that Tarski thought to ask the question at all. Similarly, Cantor simply took the actual infinite as a legitimate mathematical object; it is the fact that there is any mathematics there at all, and not so much the mechanics of, say, cardinal arithmetic, that transfixes us. In both Tarski’s and Cantor’s cases, many formerly perplexing issues are discredited, e.g. the liar sentence is unequivocally grammatical, Galileo’s paradox is not a contradiction. Many new and more perplexing issues follow. In the dialethic case, we simply take absolute sets and treat them as mathematical objects, and as a result lay to rest perplexities about the full picture of set theory. And many new and more perplexing issues follow.

Conclusion

Let us conclude by looking at Pandora’s Box, and asking a question.

No sentence in the box is true.

Such a box (conceived of by van Fraassen in 1986 and discussed in [HP00](14)) contains a sentence both true and false. Now, any language that can express every proposition can express the Pandora proposition. By classical reckoning, under the law of non-contradiction, it follows that, since Pandora is a contradiction, there can be no such ultimate language. Here we have gone as far as embarking on a closed theory of sets. Pandora’s box is now open; and as with the eponymous box in the myth, hope remains.

Cantor started us on an intellectual journey. One can peel off at any point, but one should not make a virtue of doing so. ... Pandora’s box is indeed open: Under what conditions should we admit the extension of a property of transfinite numbers to be a set—or, equivalently, what transfinite numbers are there? No answer is final, in the sense that, given any criterion for what counts as a set of numbers, ... there would be no grounds for denying [it] is a set. ... We go on. In the foundations of set theory, Plato’s dialectician, searching for the first principles, will never go out of business.[Tai00](283 - 4)

The set concept, channelled by the axioms of comprehension and extensionality, is volatile. How far can we go? There are many open avenues for further mathematical research. In terms of pure set theory, I have only put some initial stakes in the ground. Rather than list open questions in set theory proper, though, I want to look farther ahead, to the metamathematical questions of consistency, categoricity, and completeness. The pressing need is to devise an operational paraconsistent metatheory; Shapiro points out that

An important advantage—perhaps the major advantage—of the dialethic program is the possibility of a single, uniform semantics. There is no need for a separate meta-language, since the envisioned language is semantically closed. The language we use to talk about the (object) language is just a part of the very (object) language we are talking about. ... Just as we do not need to keep running through richer and richer meta-languages in order to chase our semantic tails, we also need not keep running through stronger and stronger theories in order to chase Gödel sentences. We embrace some contradictions in the semantics, and get it all from the start. Or so says Priest. [Sha02](818)

It is most important for the dialethic program, now that the set theory toolkit is open, to turn this conjecture into a suite of theorems. For example, the theorems of chapter 6 are motivated by and easiest to grasp through the satisfaction relation of model theory, |=. Classical model theory grew up in a climate of paradoxes and is inured in various idiosyncrasies, such as the object language/metalanguage distinction, which fall afoul of semantic closure and are not parts of the dialethic program. To keep every theorem about |= a theorem of naive set theory (NST) requires completely recasting model theory so it can be represented in NST itself—which involves Gödel coding, faithful representation of recursive functions with NST-proofs that the representation really is faithful (as in [BJ89], [Dav82], or [Sho67]), and a working definition of satisfaction. To do this with DLQ is a significant task; and while I have made progress on it, there is still more to do; at time of writing it is too rough to include here. See [Pri06b], chapter 9, for a sketch of what is involved in constructing a semantically closed language, there stating an absolute as opposed to model-relative definition of truth with the logic LPQ. Among the first tasks one might carry out is a non-triviality (absolute consistency) proof for NST, within NST itself.
One method would be to emulate Brady’s magisterial construction [Bra06]; once model theory is recast to paraconsistent standards, we might reproduce a similar, but non-classical, proof in DLQ. There would be, however, no need. Another, simpler approach would be to define and check the properties of provability and truth predicates, prove soundness within DLQ, and then argue in one line: Since 0 ≠ 1, it is not true that 0 = 1, so by soundness, it is not provable that 0 = 1. And thereby the system shows itself to be non-trivial. The novel, autobiographical flavor of this argument perhaps indicates more general features of metatheory within NST. Once DLQ can reproduce its own non-triviality results, what would such a theorem mean? A question tied to the philosophical issues we’ve been considering, which would also rely on some model theory, is this. A rigorous way to approach the question of whether NST is the genuine theory of sets would be to determine if NST is categorical: Does the theory have only one model up to isomorphism? With the move to classical first-order model theory came unintended models. Without placing too much stock in homophony, we have put the ‘intent’ back into an intensional set theory. It is reasonable to speculate that, as Dedekind proved arithmetic categorical using naive set theory, so our theory could be categorical too. Almost certainly the structure N = ⟨V, ∈⟩ is a model for naive set theory—it is the intended universe of study. A simple way to get categoricity would be to show that any other model of NST is isomorphic to N, and then by the transitivity of ≅, any two models are isomorphic. Indeed, classical theory has, for all the wrong reasons, espoused that any inconsistent theory is categorical. (The fallacious twofold reasoning is, first, that inconsistent theories have no models. And then, by the bad principle that ¬Φ ⊢ Φ → Ψ, it follows that all inconsistent models are isomorphic.)
But the classical prediction will be immediately false here: NST is not categorical, because an isomorphism is a one-one map preserving structure; sets of different sizes cannot be in a one-one map; two structures of different cardinality cannot be isomorphic. ⟨V, ∈⟩ is a model of NST; but

|V| ≠ |V|, because |V| = |On| and |On| ≠ |On|. Therefore the canonical model of NST is not even automorphic (isomorphic to itself), although of course it also is. Is this just a dialethic trick? Is there still some means to a categoricity result? To a negative answer, there is stronger evidence to be had. An intriguing function developed to some extent by Quine, which Whitehead suggested be termed an essence (Forster calls these Boffa atoms), is:

E(a) = {x : a ∈ x}.

With sethood understood as predication, the essence of a is the collection of all properties borne by a. Almost immediately follows an observation of Boffa—“the most arresting miscellaneous combinatorial fact in NF ...which has the status not so much of folklore as legend.”[For95](35). Because a ∈ b ↔ b ∈ E(a) ↔ E(a) ∈ E(b),

a ∈ b ↔ E(a) ∈ E(b).
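The chain behind Boffa’s observation can be written out step by step. The following is a sketch only: it assumes that abstraction may be applied to E(a) and E(b) in both directions and that the biconditional chains transitively in the logic at hand; both are assumptions of the illustration, not claims taken from the text.

```latex
\begin{align*}
E(a) &= \{x : a \in x\}
  && \text{(definition of essence)} \\
a \in b &\leftrightarrow b \in E(a)
  && \text{(abstraction on } E(a)\text{)} \\
b \in E(a) &\leftrightarrow E(a) \in E(b)
  && \text{(abstraction on } E(b) = \{x : b \in x\}\text{)} \\
a \in b &\leftrightarrow E(a) \in E(b)
  && \text{(chaining the two biconditionals)}
\end{align*}
```

The third line is where membership is reversed a second time, which is why the map E preserves, rather than inverts, the membership relation.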

Let E denote the set of all essences. From Boffa’s observation, it would seem that ⟨E, ∈⟩ = E is a model of NST. For along with membership, identity is also modeled: If a = b then E(a) = E(b) by substitution, and E(a) = E(b) → a = b. (Suppose a ∈ x ↔ b ∈ x. Since a ∈ {a}, so too b ∈ {a}. So a = b.) Therefore the two atomic term-forming operators of set theory, ∈ and =, are modeled. This makes even dialethic categoricity implausible, for the following reasons.

Since E = E″V = {E(x) : x ∈ V}, there is no ⊂-minimal model of NST. For if M = ⟨M, e⟩ is a model of NST, then E″M is, too. And because x ∈ E″M → x ∈ E and x ∈ M → x ∈ M, then E″M ⊆ M. But E is a proper substructure of M.

As a corollary, it would follow that carrier sets for models of NST need not be universal. Since E is a carrier for a model of NST, suppose for reductio ad absurdum that (∀x)x ∈ E. Then V ∈ E, so (∃U)E(U) = V. But ∅ ∈ V, implying ∅ ∈ E(U), and then by definition, U ∈ ∅, which is absurd. (So this (vaguely disturbing) result is not only true, but true only, on pain of triviality.) Ergo E is a non-universal model. So much for the intended model, which was the universe. Say that NST is Ω-categorical iff |M| = Ω, where M is the carrier set for any model of NST. Open question: Is NST Ω-categorical?

These would answer questions of non-triviality (‘absolute consistency’) and categoricity. The third and perhaps most exciting possibility is completeness. Given a working T-scheme, we will have that either Φ is true or ¬Φ is true, since excluded middle is part of the logic DLQ. What about provability: for any sentence Φ in the language of NST, is either Φ provable or ¬Φ provable? Classically, a theory is complete iff any two of its models make all and only the same sentences true (elementary equivalence). Because there will be no object language/metalanguage distinction, there is some reason to suspect that, even without categoricity, any two models of NST are indeed elementarily equivalent. On the other hand, completeness is still a ways off; given sets like

R = {x : x ∈ x}

then the question of whether or not R ∈ R suggests no means of deciding [BR89](468).

Hilbert in his landmark address said:

Is this axiom of the solvability of every problem a peculiarity characteristic of mathematical thought alone, or is it possibly a general law inherent in the nature of mind, that all questions which it asks must be answerable? ... This conviction of the solvability of every mathematical problem is a powerful incentive to the worker. We hear within us the perpetual call: There is the problem. Seek the solution. You can find it by pure reason, for in mathematics there is no ignorabimus...

In investigating Socrates’ schema, the hypothesis was that dialethic paraconsistency, particularly as it supports the inclosure schema, offers some natural Archimedean points. And indeed, by evaluating otherwise familiar structures such as the ordinals from external fixpoints, such as On itself, many new and powerful theorems have emerged. The unnameable, “whereof we cannot speak,” has been named and spoken of.

Yet it has also been seen that absolutely comprehensive sets like V cannot be inclosure-carriers: anything not in V implies ⊥. Anything inconsistent is Ω-sized, V-sized. There is, interestingly enough, an ultimate extent to transcendence, even for dialethic paraconsistency. There is a trivializing limit, absurdity negation −p = p → ⊥, which cannot be dialethic. The world truly is all that is the case: There is a final outpost of reason beyond which is only chaos, the absurd.

Where is the place to stand, to move the world? We have met universal sets, like U = {x : x = x}, which can be and are transcended by diagonalization. And we have done some work exploring the reflection properties of VΩ, discerning structure. So it is reasonable to speculate about a penumbra at the limit, as desired all along: A place to stand that is outside, and so Archimedean, but also inside and so available. The point, structurally, lies somewhere between U and V, approached and extended by On. The work done here, the founding of a richer and stronger mathematics of the metaphysical, is a start at understanding this structure. Here is a means to study and stand at this fixed point, where none can be, and is.

Bibliography

[AB75] Alan Ross Anderson and Nuel D. Belnap. Entailment: The Logic of Relevance and Necessity, volume 1. Princeton University Press, Princeton, 1975.
[AB82] A. I. Arruda and D. Batens. Russell’s set and the universal set in paraconsistent set theory. Logique et Analyse, 98, 1982.
[ABD92] Alan Ross Anderson, Nuel D. Belnap, and J. Michael Dunn. Entailment: The Logic of Relevance and Necessity, volume 2. Princeton University Press, Princeton, 1992.
[Acz88] Peter Aczel. Non-Well-Founded Sets. Number 14 in CSLI Lecture Notes. CSLI Publications, Stanford, 1988.
[AF58] A. Fraenkel, Y. Bar-Hillel, and A. Levy. Foundations of Set Theory. North Holland, 1958.
[Arr89] A. I. Arruda. Aspects of the historical development of paraconsistent logic. In Priest et al. [PRN89], pages 99–130.
[Asm09] Conrad Asmus. Restricted arrow. Journal of Philosophical Logic, 2009. Forthcoming.
[Aus88] David F. Austin, editor. Philosophical Analysis. Kluwer, 1988.
[Bar77] Jon Barwise, editor. Handbook of Mathematical Logic. North-Holland, 1977.
[BBH+06] JC Beall, Ross T. Brady, A.P. Hazen, Graham Priest, and Greg Restall. Relevant restricted quantification. Journal of Philosophical Logic, 35:587–598, 2006.
[Bel05] John L. Bell. Set Theory: Boolean-Valued Models and Independence Proofs. Oxford, 2005.
[Ber68] Paul Bernays. Axiomatic Set Theory. North Holland, 1968. Historical Introduction by Abraham Fraenkel.
[Ber07] Francesco Berto. How to Sell a Contradiction. Studies in Logic v. 6. College Publications, 2007.
[Ber08] Francesco Berto. Adunaton and material exclusion. Australasian Journal of Philosophy, 86(2):165–190, 2008.
[BJ89] George Boolos and Richard Jeffrey. Computability and Logic. Oxford University Press, third edition, 1989.
[BM96] Jon Barwise and Lawrence Moss. Vicious Circles. CSLI Publications, 1996.
[BMPvB00] D. Batens, C. Mortensen, G. Priest, and J-P. van Bendegem, editors. Frontiers of Paraconsistency. Kluwer Academic Publishers, 2000.
[Bol51] B. Bolzano. Paradoxes of the Infinite. Routledge and Kegan Paul, London, 1950 [1851]. Translated by F. Prihonsky.
[Bol73] Bernard Bolzano. Theory of Science. D. Reidel Publishing, 1973. Edited, with an introduction, by Jan Berg. Translated by Burnham Terrell.
[Boo83] William Boos. A self-referential ‘cogito’. Philosophical Studies, 44(2):269–290, 1983.
[Boo98] George Boolos. Logic, Logic and Logic. Harvard University Press, 1998.
[BR89] Ross T. Brady and Richard Routley. The non-triviality of extensional dialectical set theory. In Priest et al. [PRN89], pages 415–436.


[Bra71] Ross Brady. The consistency of the axioms of abstraction and extensionality in a three valued logic. Notre Dame Journal of Formal Logic, 12:447–453, 1971.
[Bra89] Ross T. Brady. The non-triviality of dialectical set theory. In Priest et al. [PRN89], pages 437–470.
[Bra03] Ross Brady, editor. Relevant Logics and their Rivals, Volume II: A continuation of the work of Richard Sylvan, Robert Meyer, Val Plumwood and Ross Brady. Ashgate, 2003. With contributions by: Martin Bunder, Andre Fuhrmann, Andrea Loparic, Edwin Mares, Chris Mortensen, and Alasdair Urquhart.
[Bra06] Ross Brady. Universal Logic. CSLI, 2006.
[BS71] J.L. Bell and A.B. Slomson. Models and Ultraproducts: An Introduction. North Holland, 1971.
[Bur05] John P. Burgess. Fixing Frege. Princeton, 2005.
[Can95] Georg Cantor. Beiträge zur Begründung der transfiniten Mengenlehre (erster Artikel). Mathematische Annalen, 46:481–512, 1895.
[Can97] Georg Cantor. Beiträge zur Begründung der transfiniten Mengenlehre (zweiter Artikel). Mathematische Annalen, 49:207–246, 1897.
[Can15] Georg Cantor. Contributions to the Founding of the Theory of Transfinite Numbers. Dover, 1915. Edited, Translated, and Introduced by P.E.B. Jourdain, from [Can95], [Can97].
[Can67] Georg Cantor. Letter to Dedekind. In van Heijenoort [vH67].
[Car94] Richard Cartwright. Speaking of everything. Noûs, 28:1–20, 1994.
[Cha63] C.C. Chang. The axiom of comprehension in infinite valued logic. Math. Scand., 13:9–30, 1963.
[Coh66] Paul J. Cohen. Set Theory and the Continuum Hypothesis. New York: W.A. Benjamin, 1966.
[Cur42] Haskell B. Curry. The inconsistency of certain formal logics. Journal of Symbolic Logic, 7:115–117, 1942.
[Czy94] Janusz Czyz. Paradoxes of measures and dimensions originating in Felix Hausdorff’s ideas. World Scientific, 1994.
[Dau79] Joseph Warren Dauben. Georg Cantor: His Mathematics and Philosophy of the Infinite. Princeton, 1979.
[Dav82] Martin Davis. Computability and Unsolvability. Dover Publications, Inc., 1982.
[dC74] N. C. A. da Costa. On the theory of inconsistent formal systems. Notre Dame Journal of Formal Logic, 15:497–510, 1974.
[dC00] Newton da Costa. Paraconsistent mathematics. In Batens et al. [BMPvB00], pages 165–180.
[dCKB04] Newton C. A. da Costa, Decio Krause, and Otavio Bueno. Paraconsistent logics and paraconsistency: Technical and philosophical developments, 2004.
[Ded88] Richard Dedekind. Essays on the Theory of Numbers. Dover, 1963 [1888].
[Dev79] Keith J. Devlin. Fundamentals of Contemporary Set Theory. Springer, 1979. Second edition 1993, retitled “The Joy of Sets.”
[DO98] H.G. Dales and G. Oliveri, editors. Truth in mathematics. Oxford: Clarendon, 1998.
[DR02] JM Dunn and Greg Restall. Relevance logic. In Dov M. Gabbay and Franz Günthner, editors, Handbook of Philosophical Logic, 2nd Edition, volume 6, pages 1–128. Kluwer, 2002.

[Dra74] Frank Drake. Set Theory: An Introduction to Large Cardinals. North Holland, 1974.
[DS96] F. R. Drake and D. Singh. Intermediate Set Theory. John Wiley and Sons, 1996.
[Dum91] Michael Dummett. Frege: Philosophy of Mathematics. Duckworth, 1991.
[Dun87] JM Dunn. Relevant predication 1: The formal theory. Journal of Philosophical Logic, 16:347–381, 1987.
[Dun88] J. Michael Dunn. The impossibility of certain higher-order non-classical logics with extensionality. In Austin [Aus88], pages 261–280.
[Euc56] Euclid. The Thirteen Books of the Elements. Dover, 1956. Edited, translated, and annotated by Sir Thomas L. Heath.
[FH89] M. Forti and R. Hinnion. The consistency problem for positive comprehension principles. Journal of Symbolic Logic, 54:1401–1418, 1989.
[FM92] H. Friedman and R.K. Meyer. Whither relevant arithmetic? Journal of Symbolic Logic, 57:824–31, 1992.
[For82] Thomas Forster. Axiomatising set theory with a universal set, 1982. Typeset 1997.
[For95] Thomas Forster. Set Theory with a Universal Set. Clarendon Press, Oxford, 1995.
[Fra53] Abraham Fraenkel. Abstract Set Theory. Amsterdam, North-Holland, 1953.
[Fre03] Gottlob Frege. Grundgesetze der Arithmetik, Begriffsschriftlich abgeleitet. Verlag Hermann Pohle, Jena, 1893–1903.
[Gö64] Kurt Gödel. What is Cantor’s continuum problem? In P. Benacerraf and H. Putnam, editors, Philosophy of Mathematics, pages 258–273. Cambridge, 1964.
[Gil74] Paul C. Gilmore. The consistency of partial set theory without extensionality. In Jech [Jec74], pages 147–153.
[Hal74] Paul Halmos. Naive Set Theory. Springer, 1974.
[Hal84] Michael Hallett. Cantorian Set Theory and Limitation of Size. Oxford Logic Guides, 1984.
[Hau57] Felix Hausdorff. Set Theory (Third Edition). Chelsea Publishing Co., New York, 1957. First edition 1914.
[Hau05] Felix Hausdorff. Hausdorff on Ordered Sets. American Mathematical Society, 2005. Edited, with notes, by J.M. Plotkin.
[Hen71] Leon Henkin, editor. Tarski Symposium. Proceedings. AMS, 1971.
[Hil67] David Hilbert. On the infinite. In van Heijenoort [vH67].
[Hin94] Roland Hinnion. Naive set theory with extensionality in partial logic and paradoxical logic. Notre Dame Journal of Formal Logic, 35:15–40, 1994.
[Hin95] Jaakko Hintikka, editor. From Dedekind to Gödel. Kluwer: Dordrecht, 1995.
[HL08] Roland Hinnion and Thierry Libert. Topological models for extensional partial set theory. Notre Dame Journal of Formal Logic, 49(1), 2008.
[HP00] Dominic Hyde and Graham Priest, editors. Sociative Logics and their Applications: Essays by the Late Richard Sylvan. Ashgate, 2000.
[HW01] Bob Hale and Crispin Wright. The Reason’s Proper Study: Towards a Neo-Fregean Philosophy of Mathematics. Oxford: Clarendon, 2001.
[Jan95] Ignacio Jané. The role of the absolute infinite in Cantor’s conception of set. Erkenntnis, 42:375–402, 1995.
[Jec74] Thomas Jech, editor. Axiomatic Set Theory. American Mathematical Society, 1974.
[Jec78] Thomas Jech. Set Theory. Academic Press, 1978.
[Kan94] Akihiro Kanamori. The Higher Infinite: Large Cardinals in Set Theory from Their Beginnings. Springer Verlag, 1994.

[Kau75] Walter Kaufmann, editor. Existentialism from Dostoevsky to Sartre. New American Library, 1975. With an introduction, prefaces, and new translations.
[Kel55] John L. Kelley. General Topology. Springer-Verlag, 1955.
[Kin93] David King. From Godel to Derrida: undecidability, indeterminacy, and infinity. PhD thesis, Murdoch University, Western Australia, 1993. Cited by Benacerraf in “What Mathematical Truth Could Not Be - I”.
[Kre99] Philip Kremer. Relevant identity. Journal of Philosophical Logic, 28:199–222, 1999.
[Kun71] Kenneth Kunen. Elementary embeddings and infinitary combinatorics. Journal of Symbolic Logic, 36(3):407–413, 1971.
[Kun80] Kenneth Kunen. Set Theory: An Introduction to Independence Proofs. North Holland, 1980.
[Lak76] Imre Lakatos. Proofs and Refutations: The Logic of Mathematical Discovery. Cambridge University Press, 1976.
[Lav94] Shaughan Lavine. Understanding the Infinite. Harvard University Press, 1994.
[Lev60] Azriel Levy. Axiom schemata of strong infinity in axiomatic set theory. Pacific Journal of Mathematics, 10:223–238, 1960.
[Lev79] Azriel Levy. Basic Set Theory. Springer Verlag, 1979.
[Lib04] Thierry Libert. Models for paraconsistent set theory. Journal of Applied Logic, 3, 2004.
[LL59] CI Lewis and CH Langford. Symbolic Logic. Dover, 1959.
[Mad83] Penelope Maddy. Proper classes. Journal of Symbolic Logic, 48:113–139, 1983.
[Mar92] E. Mares. Semantics for relevant logic with identity. Studia Logica, 51:1–20, 1992.
[Mar04] Edwin Mares. Relevant Logic. Cambridge, 2004.
[Mar05] Joao Marcos. Logics of Formal Inconsistency. Fundação Biblioteca Nacional, Brazil, 2005.
[McG90] Van McGee. Truth, Vagueness, and Paradox. Hackett, 1990.
[Mey71] Robert K. Meyer. Entailment. Journal of Philosophy, 68(21):808–818, 1971.
[Mey87] Robert K. Meyer. God exists! Noûs, 21(3):345–361, 1987.
[Min72] Marvin Minsky. Computation: Finite and Infinite Machines. Prentice Hall International, 1972.
[Moo82] Gregory H. Moore. Zermelo’s Axiom of Choice. Springer Verlag, 1982.
[Mor95] Chris Mortensen. Inconsistent Mathematics. Kluwer Academic Publishers, 1995.
[MR77] Robert K. Meyer and Richard Routley. Extensional reduction (I). The Monist, 60:355–369, 1977.
[MRD78] Robert K. Meyer, Richard Routley, and J. Michael Dunn. Curry’s paradox. Analysis, 39:124–128, 1978. Rumored to have been written only by Meyer.
[Nie76] Friedrich Nietzsche. The Portable Nietzsche. Penguin, 1976. Translated, edited, and with an introduction by Walter Kaufmann.
[PBAG04] G. Priest, J.C. Beall, and B. Armour-Garb, editors. The Law of Non-Contradiction. Oxford: Clarendon, 2004.
[Pet00] Uwe Petersen. Logic without contraction as based on inclusion and unrestricted abstraction. Studia Logica, 64:365–403, 2000.
[Pot00] Michael Potter. Reason’s Nearest Kin. Oxford, 2000.
[Pot04] Michael Potter. Set Theory and Its Philosophy. Oxford: Clarendon, 2004.
[PR89a] Graham Priest and Richard Routley. The philosophical significance and inevitability of paraconsistency. In Priest et al. [PRN89], pages 483–537.

[PR89b] Graham Priest and Richard Routley. Systems of paraconsistent logic. In Priest et al. [PRN89], pages 151–186.
[Pri79] Graham Priest. The logic of paradox. Journal of Philosophical Logic, 8:219–241, 1979.
[Pri80] Graham Priest. Sense, entailment and Modus Ponens. Journal of Philosophical Logic, 9:415–435, 1980.
[Pri87] Graham Priest. In Contradiction: A Study of the Transconsistent. Martinus Nijhoff, The Hague, 1987.
[Pri89] Graham Priest. Reductio ad absurdum et modus tollendo ponens. In Priest et al. [PRN89], pages 613–626.
[Pri94] Graham Priest. Is arithmetic consistent? Mind, 103, 1994.
[Pri97] Graham Priest. Inconsistent models of arithmetic, part I: Finite models. Journal of Philosophical Logic, 26:223–35, 1997.
[Pri00a] Graham Priest. Inconsistent models of arithmetic, II: The general case. Journal of Symbolic Logic, 65:1519–29, 2000.
[Pri00b] Graham Priest. Motivations for paraconsistency: The slippery slope from classical logic to dialetheism. In Batens et al. [BMPvB00], pages 223–232.
[Pri01] Graham Priest. An Introduction to Non-Classical Logic. Cambridge, 2001.
[Pri02a] Graham Priest. Beyond the Limits of Thought. Cambridge University Press, Cambridge, 2002.
[Pri02b] Graham Priest. Paraconsistent logic. In Dov M. Gabbay and Franz Günthner, editors, Handbook of Philosophical Logic, 2nd Edition, volume 6, pages 287–394. Kluwer, 2002.
[Pri05] Graham Priest. Towards Non-Being. Oxford, 2005.
[Pri06a] Graham Priest. Doubt Truth Be A Liar. Oxford, 2006.
[Pri06b] Graham Priest. In Contradiction: A Study of the Transconsistent. Oxford, 2006. Second expanded edition of [Pri87].
[Pri08] Graham Priest. An Introduction to Non-Classical Logic. Cambridge, 2008. Second Edition.
[PRN89] Graham Priest, Richard Routley, and Jean Norman, editors. Paraconsistent Logic: Essays on the Inconsistent. Philosophia Verlag, 1989.
[Qui69] W.V.O. Quine. Set Theory and its Logic. Harvard University Press, 1969.
[Rea88] Stephen Read. Relevant Logic. Basil Blackwell, Oxford, 1988.
[Rei74] W. N. Reinhardt. Remarks on reflection principles, large cardinals and elementary embeddings. In Jech [Jec74], pages 189–205.
[Res92] Greg Restall. A note on naïve set theory in LP. Notre Dame Journal of Formal Logic, 33:422–432, 1992.
[RM76] Richard Routley and Robert K. Meyer. Dialectical logic, classical logic and the consistency of the world. Studies in Soviet Thought, 16:1–25, 1976.
[Rou77] Richard Routley. Ultralogic as universal? Relevance Logic Newsletter, 2:51–89, 1977. Reprinted in [Rou80].
[Rou80] Richard Routley. Exploring Meinong’s Jungle and Beyond. Philosophy Department, RSSS, Australian National University, 1980. Interim Edition, Departmental Monograph number 3.
[RPMB82] Richard Routley, Val Plumwood, Robert K. Meyer, and Ross T. Brady. Relevant Logics and their Rivals. Ridgeview, 1982.

[RR63] Herman Rubin and Jean E. Rubin. Equivalents of the Axiom of Choice. North Holland, 1985 [1963].
[RU06] Agustin Rayo and Gabriel Uzquiano, editors. Absolute Generality. Oxford University Press, 2006.
[Ruc82] Rudy Rucker. Infinity and the Mind. Brighton, 1982.
[Rus05] Bertrand Russell. On some difficulties in the theory of transfinite numbers and order types. Proceedings of the London Mathematical Society, 4:29–53, 1905.
[Rus37] Bertrand Russell. The Principles of Mathematics. George Allen & Unwin, second edition, 1937.
[RW13] Bertrand Russell and Alfred North Whitehead. Principia Mathematica. Cambridge University Press, 1910–1913. In three volumes.
[Sco68] Dana Scott, editor. Axiomatic Set Theory. American Mathematical Society, 1968.
[Sha91] Stewart Shapiro. Foundations without Foundationalism: A case for second-order logic. Oxford University Press, 1991.
[Sha97] Stewart Shapiro. Philosophy of Mathematics: Structure and Ontology. Oxford University Press, 1997.
[Sha02] Stewart Shapiro. Incompleteness and inconsistency. Mind, 111:817–832, 2002.
[Sha05] Stewart Shapiro, editor. The Oxford Handbook of Philosophy of Mathematics and Logic. Oxford University Press, 2005.
[She00] Saharon Shelah. Cardinal Arithmetic. Oxford Logic Guides, 2000.
[Sho67] Joseph R. Shoenfield. Mathematical Logic. Addison-Wesley, 1967.
[Sla89] J. K. Slaney. RWX is not Curry-paraconsistent. In Priest et al. [PRN89], pages 472–480.
[SMC00] S. Marc Cohen, Patricia Curd, and C.D.C. Reeve, editors. Readings in ancient Greek philosophy: from Thales to Aristotle. Hackett, 2000.
[Sor01] Roy Sorensen. Vagueness and Contradiction. Clarendon: Oxford, 2001.
[ST00] G. Sher and R. Tieszen, editors. Between Logic and Intuition: Essays in Honor of Charles Parsons. Cambridge, 2000.
[SW06] Stewart Shapiro and Crispin Wright. All things indefinitely extensible. In Rayo and Uzquiano [RU06], pages 255–304.
[Tai00] W.W. Tait. Cantor’s Grundlagen and the foundations of set theory. In Sher and Tieszen [ST00], pages 269–90.
[Tar44] Alfred Tarski. The semantic conception of truth and the foundations of semantics. Philosophy and Phenomenological Research, 4:341–376, 1944.
[Tar62] Alfred Tarski. Some problems and results relevant to the foundations of set theory. In Logic, Methodology and Philosophy of Science. Proceedings of the 1960 International Congress, pages 125–135. Stanford University Press, 1962.
[TZ71] G. Takeuti and W. M. Zaring. Introduction to Axiomatic Set Theory. Springer-Verlag, 1971.
[Urb89] Igor Urbas. Paraconsistency and the c-systems of da Costa. Notre Dame Journal of Formal Logic, 30:583–597, 1989.
[vA86] James van Aken. Axioms for the set theoretic hierarchy. Journal of Symbolic Logic, 51(4):992–1004, 1986.
[vH67] Jean van Heijenoort, editor. From Frege to Gödel: a source book in mathematical logic, 1879–1931. Harvard University Press, Cambridge, Mass., 1967.

[vN67] John von Neumann. Axioms for set theory. In van Heijenoort [vH67], pages 346–354.
[vN76] John von Neumann. Collected works, Vol I; logic, theory of sets and quantum mechanics. Pergamon, Oxford, 1976. Edited by A. H. Taub et al.
[Web09] Zach Weber. Paradox and Foundation. PhD thesis, The University of Melbourne, 2009.
[Webng] Zach Weber. Extensionality and restriction in naive set theory. Studia Logica, forthcoming.
[Wei94] Paul Weingartner, editor. Alternative Logics: Do Sciences Need Them? Springer, 1994.
[Wei98] Alan Weir. Naive set theory is innocent! Mind, 107:763–98, 1998.
[Wey19] Hermann Weyl. The Continuum. Dover, 1919.
[Whi79] Richard White. The consistency of the axiom of comprehension in the infinite valued predicate logic of Łukasiewicz. Journal of Philosophical Logic, 8:503–534, 1979.
[Wit56] Ludwig Wittgenstein. Remarks on the Foundations of Mathematics. MIT Press, 1956. Edited by G. H. von Wright, R. Rhees and G. E. M. Anscombe.
[Woo01] W. Hugh Woodin. The continuum hypothesis, part I. Notices of the AMS, 48(6):567–576, 2001.
[Woo03] John Woods. Paradox and Paraconsistency. Cambridge, 2003.
[Wri83] Crispin Wright. Frege’s Conception of Numbers as Objects. Aberdeen University Press, 1983.
[Zal07] Ed Zalta. Frege’s theorem. Stanford Encyclopedia of Philosophy, 2007.
[Zer30] Ernst Zermelo. Über Grenzzahlen und Mengenbereiche: Neue Untersuchungen über die Grundlagen der Mengenlehre. Fundamenta Mathematicae, 16:29–47, 1930. Reprinted in From Kant to Hilbert volume 2, ed. William Ewald, Oxford 1996.
[Zer67] Ernst Zermelo. Investigations in the foundations of set theory. In van Heijenoort [vH67], pages 200–15.

Minerva Access is the Institutional Repository of The University of Melbourne

Author/s: WEBER, ZACH

Title: Paradox and foundation

Date: 2009

Citation: Weber, Z. (2009). Paradox and foundation. PhD thesis, School of Philosophy, Anthropology and Social Inquiry, The University of Melbourne.

Publication Status: Unpublished

Persistent Link: http://hdl.handle.net/11343/35184

File Description: Paradox and foundation

Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.