
From the Outside Looking In: Can mathematical certainty be secured without being mathematically certain that it has been?

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

Matthew Souba, MSc, MLitt

Graduate Program in Philosophy

The Ohio State University 2019

Dissertation Committee:
Neil Tennant, Advisor
Stewart Shapiro
Christopher Pincock

Copyright by Matthew Souba 2019

Abstract

The primary aim of this dissertation is to discuss the epistemological fallout of Gödel's Incompleteness Theorems on Hilbert's Program. In particular our focus will be on the philosophical upshot of certain proof-theoretic results in the literature. We begin by sketching the historical development up to, and including, Hilbert's mature program, discussing Hilbert's views in both their mathematical and their philosophical guises. Gödel's Incompleteness Theorems are standardly taken as showing that Hilbert's Program, as intended, fails. Michael Detlefsen maintains that they do not. Detlefsen's arguments are the focus of chapter 3. The argument from the first incompleteness theorem, as presented by Detlefsen, takes the form of a dilemma to the effect that either the infinitistic theory is incomplete with respect to a certain subclass of real sentences or it is not a conservative extension of the finitistic theory. He contends that Hilbert need not be committed to either of these horns, and, as such, the argument from the first incompleteness theorem does no damage to Hilbert's program. His argument against the second incompleteness theorem as refuting Hilbert's Program, what he calls the stability problem, concerns the particular formalization of the statement shown unprovable by Gödel's theorem, and endorses what are called Rosser systems. The success of Detlefsen's arguments critically depends upon the precise characterization of what exactly Hilbert's program is. It is our contention that, despite Detlefsen's attempts, both of the arguments (from the first and second incompleteness theorems) are devastating to Hilbert. The view that Detlefsen puts forth is better understood as a modified version of Hilbert's general program cast as a particularly

strict form of instrumentalism. We end by analyzing the coherence of Detlefsen's proposal, independently of the historical Hilbert. In response to Gödel's Incompleteness Theorems several modified or partial Hilbert programs have been pursued. In chapter 4 we consider one such version, due to Gentzen, that enlarges the methods to be admitted in consistency proofs. By giving up the stress on strictly finitary reasoning and liberalizing what counts as epistemically acceptable, Gentzen was able to prove that PA is consistent by appeal to the principle of quantifier-free transfinite induction up to the ordinal ε₀. Gentzen's method proceeds by means of ordinal assignments to, and reduction procedures for, possible proofs of contradiction, showing such proofs to be impossible. We first present Gentzen's method in order to provide a systematic overview of the structure of his proof and its philosophical motivations. We then consider the modern version of the proof as presented by Gaisi Takeuti. Central to Takeuti's proof is the demonstration of the claim that whenever a concrete method of constructing decreasing sequences of ordinals is given, any such decreasing sequence must be finite. Takeuti takes the constructive demonstration of this result as being of particular philosophical and epistemic value. The central theme that comes out of the philosophical discussion is that such a result can only be understood "from the outside" of the system. Our discussion of Takeuti shows how this theme generalizes, and will be of central importance in the rest of the dissertation. Predicativity is similar in spirit to the work of Gentzen discussed above in the sense of liberalizing what counts as an epistemically safe starting point. The basic idea behind Predicativism is the acceptance of the natural numbers as basic to human understanding and then to see just how much of mathematics can be shown on the basis of this starting point.
The discussion of Predicativity is important for two reasons. First, it signals a shift in how one understands foundational work in mathematics. In particular it signals a shift from a focus on security and justification to that of determining the limits of particular philosophical views. It serves as a lynchpin, so to speak, between the early foundational work discussed in the first half of the dissertation and the more contemporary foundational work discussed

in the second part. To see this clearly we begin chapter 5 by sketching some of the early historical developments of Predicativity, focusing on Russell and Weyl. We then look at more recent technical developments of Predicativity, culminating with the limitative result that the bound of Predicativity is Γ₀. The second reason why Predicativity is important is due to a relevant feature it shares with Finitism. It provides another lucid example of the internal/external divide that concerned the philosophical portion of the discussion of Takeuti mentioned above. This theme emerges as the central point upon which the philosophical value and significance of the epistemic aspect of such results hinges. We end chapter 5 by highlighting this connection to Finitism, setting the stage for some philosophical work to be done in chapter 6.

The proof-theoretical reduction of the system WKL0 to PRA is taken as perhaps the clearest example of a partial realization of Hilbert’s Program. The concept of reductive proof theory more generally is taken as being foundationally informative in the sense that certain reductions and conservativity results are taken as revealing just how much mathematics is justified on the basis of other, more elementary frameworks. Implicit in this is the notion of epistemic security. We consider as case studies the reductions of the systems IΣ1 and

WKL0 to PRA in order to get clear on exactly what such justification and security amount to. We argue that properly understanding what is meant by these terms requires a closer look at the radicalization of the axiomatic method and the shift to formalism that underlies Hilbert's Program. After looking at Hilbert's famous disagreement with Frege we suggest that central to the notions of justification and security is the meaning of what is expressed in formal languages, a meaning that is eschewed by the shift to formalism. To appreciate this meaning, and hence achieve the level of justification and security that is claimed by proof-theoretical reductions, requires one to be on the outside of the system looking in. The notion of a foundation for mathematics is vague. One can distinguish between what might be called Hilbertian Foundationalism and Naturalistic Holism. In light of Gödel's results, Foundationalism has largely gone out of style. The lesson from Gödel is commonly

taken to be that the goal of the Hilbert Program is simply an outdated and unattainable ideal. In its place has emerged the holistic picture whereby a foundation for mathematics is to be understood as a way to bring the seemingly disparate branches of mathematics together in a unifying way. The sharp contrast between Naturalistic Holism and Hilbertian Foundationalism connects chapter 7 to the rest of the dissertation by illustrating how, to many, the philosophical steam has left the Hilbertian machine. We look at one such brand of Naturalistic Holism, due to Penelope Maddy, centered on set theory as the foundational arena. Maddy's aim is to understand the proper grounds for the introduction of sets and set-theoretic axioms, as well as a justification of set-theoretic practice. After considering Maddy's account of how such an understanding proceeds, we provide critical commentary. Central to the discussion is the notion of mathematical depth. We end with some remarks on mathematical methodology followed by a brief discussion of the a priori and its role in bridging the gap between mathematics and philosophy.

Dedication

Dedicated to my wife, Giulia, who has given up so much, and to my son, Jack, for whom I would give up everything.

Acknowledgments

I would like to thank all of my past teachers for their time and dedication to providing me an education. Having now taught others myself, I understand the sometimes thankless nature of the job. Special thanks to those who noticed something in me and encouraged me to further study. In particular, I thank Glenn Ross, my teacher at Franklin & Marshall College, for first introducing me to the beauty of logic and sparking my interest in the field. I would like to thank my committee members for their generosity with their time. Chris Pincock provided numerous insightful comments and thought-provoking questions on early versions of this work that made me re-think certain things I had taken for granted. I first met Stewart Shapiro while a student at St. Andrews University in Scotland. His willingness to meet with me and discuss my Master's thesis, and the level of attention that he showed both it and me, was the primary reason why I chose to attend The Ohio State University. His help in the subsequent years has been invaluable. To my advisor, Neil Tennant, who opened the door to my interest in proof theory, specifically with respect to foundational issues. The time that you have dedicated to me, whether talking through philosophical issues, providing suggestions and encouragement, or imparting general life advice, has really meant a lot to me. Your devotion as a scholar, teacher, and mentor is truly inspiring. I would like to thank my family and friends for being an outlet from academia. To my mother-in-law, Gabriella. Without your help these past few weeks watching Jack, I never would have been able to complete the final touches on this dissertation on time. To my best friend, Tim Oeschger. You are one of the few people with whom I can truly be myself

without fear of judgment. To my sister, Julia, for being a sounding board during moments of frustration and for providing a fresh and encouraging outlook. To my mother, Lynne. Your unending love and support during this process in particular, and throughout my life in general, have been enormous in my personal and academic development. To my son, Jack. You are the light of my life and have given me a new perspective of what is truly important. You are my favorite student and yet you already have taught me so much. To my wife, Giulia. The level of sacrifice that you have made so that I may pursue my dreams, often at the expense of your own, has not gone unnoticed or unappreciated. I love you. Finally, special thanks to my father, Wiley Souba. My love of mathematics began with you from a very young age. The importance that you have always placed on education has certainly played a large role in my desire to pursue academia. More importantly, though, you have consistently been with me throughout my existential struggles and have encouraged me to stay on the path. The countless conversations we have had throughout the years have no doubt shaped who I am today. You have always been my biggest fan and are my best friend and confidant.

Vita

Education B.A. in Philosophy Franklin & Marshall College, 2008

M.Sc. in Philosophy The London School of Economics, 2010

M.Litt. in Philosophy University of St. Andrews, 2011

Graduate Teaching Associate Department of Philosophy The Ohio State University, 2012-2019

Fields of Study

Major Field: Philosophy

Table of Contents

Abstract ii

Dedication vi

Acknowledgments vii

Vita ix

1 Introduction 1
1.1 Pre-Gödelian Hilbert's Program ...... 7
1.1.1 Early Foundations ...... 8
1.1.2 Later Foundations - "The" Hilbert program ...... 15
1.2 Gödel ...... 21
1.3 Post-Gödelian Hilbert Programs ...... 25
1.3.1 Subsystems of Second-Order Arithmetic ...... 25
1.3.2 Gentzen-style HP ...... 27
1.3.3 Predicativity ...... 30
1.3.4 Reductive Proof Theory ...... 30
1.4 Post-Gödelian Holism ...... 33

2 Recurring Technical Concepts 39
2.1 PRA ...... 39
2.2 Subsystems of Second-Order Arithmetic ...... 41
2.2.1 RCA0 ...... 42
2.2.2 WKL0 ...... 43
2.2.3 ACA0 ...... 43
2.2.4 ATR0 ...... 44
2.2.5 Π¹₁-CA ...... 45
2.3 Gödel's Incompleteness Theorems ...... 45
2.3.1 Gödel's First Incompleteness Theorem ...... 46
2.3.2 Gödel's Second Incompleteness Theorem ...... 46
2.4 Proof-Theoretical Reduction ...... 48

3 Detlefsen 50
3.1 Introduction ...... 50
3.2 The argument from the first incompleteness theorem ...... 53

3.2.1 Detlefsen's Reply ...... 54
3.3 The argument from the Second Incompleteness Theorem ...... 59
3.3.1 The Standard Argument and The Stability Problem ...... 60
3.4 Historically Hilbertian? ...... 66
3.5 Detlefsen's Strict Instrumentalism ...... 69

4 Finitistic Transfinite Justification? 77
4.1 Introduction ...... 77
4.2 Some preliminaries ...... 83
4.3 Transfinite induction and its justification: Gentzen ...... 87
4.3.1 The Proof ...... 87
4.3.2 Transfinite justification: Gentzen and Gödel ...... 94
4.4 Takeuti's justification ...... 97
4.4.1 Eliminators ...... 99
4.4.2 I-VII - Finitude up to ω^ω ...... 100
4.4.3 VIII-XIV - The general theory of α-sequences and (α, n)-eliminators ...... 104
4.5 Assessment ...... 105
4.5.1 Tait ...... 108
4.5.2 Burgess ...... 112
4.6 Appendix ...... 116

5 Predicativity 122
5.1 Predicativity I - Historical Developments ...... 122
5.1.1 Russell and Poincaré ...... 123
5.1.2 Weyl ...... 126
5.2 Predicativity II - Technical Developments ...... 127
5.2.1 ...... 128
5.2.2 Predicative Definability ...... 129
5.2.3 Predicative Provability ...... 139
5.3 Philosophical 'Ramifications' ...... 144
5.3.1 Independent characterizations ...... 144

6 On the Philosophical Significance of Proof-Theoretic Reductions 152
6.1 Introduction ...... 152
6.2 Proof-theoretic reduction ...... 156
6.3 Two Case Studies ...... 159
6.3.1 IΣ1 ≤ PRA ...... 162
6.3.2 WKL0 ≤ PRA ...... 163
6.4 Philosophical Upshot ...... 165
6.4.1 Internal limitations ...... 165
6.4.2 Justification and epistemic security ...... 170
6.4.3 External value I ...... 175
6.4.4 Dedekind ...... 178
6.4.5 Frege-Hilbert ...... 182
6.4.6 External value II ...... 192

6.5 Appendix ...... 195

7 On Penelope Maddy's Defending the Axioms: A critical study 201
7.1 Introduction ...... 201
7.2 The fall of Hilbertian Foundationalism and the rise of Maddyan Naturalism ...... 205
7.2.1 Thin Realism vs. Arealism ...... 208
7.2.2 New Axioms ...... 216
7.3 Reply ...... 218
7.3.1 The uniqueness of V ...... 220
7.3.2 Depth ...... 221
7.3.3 Thin Realism vs. Robust Realism ...... 226
7.3.4 The choice (or lack thereof) between Thin Realism and Arealism; and a moral for methodology ...... 230
7.3.5 More on methodology, and some consequences ...... 231
7.3.6 New Axioms? ...... 233
7.3.7 Concluding remarks ...... 236

Bibliography 239

Chapter 1

Introduction

The primary aim of this dissertation is to discuss the epistemological fallout of Gödel's Incompleteness Theorems on Hilbert's Program. In particular our focus will be on the philosophical upshot of certain proof-theoretic results in the literature. We are aware of extensive set-theoretic results that have bearing on our question and we will at times refer to some of them; they will not, however, be the primary focus in what follows. Hilbert's Program (HP) was one of the foremost foundational projects in the early twentieth century, particularly in the 1920s. In an age characterized by a shift in mathematical method, from a focus on calculation to that of abstract reasoning, Hilbert aimed to synthesize two opposing views: the constructive "finitism" of Kronecker with the rich development of Cantorian set theory and the development of the axiomatic method. The work of Cauchy and Weierstrass had helped to secure the standing of the calculus. And the account of the infinite by Cantor was marked by increased clarity and generality. Mathematics had reached an age of great intellectual freedom and power of method. The problem though was that these notions of infinity, if unchecked, led to contradiction. And given that mathematics is taken to be the most secure and precise of all disciplines, the mathematical world seemed to be in crisis. The responses to these difficulties were two-fold. Some viewed the antinomies as revealing the contradictory and unfounded nature of these methods, and viewed the appropriate response as one of rejection. Others replied by looking at the great success that the unrestricted classical methods had had and responded by saying that one need not reject the methods outright, but rather need only to be more careful about how they are used and what commitments are made; more precision than a strictly naive conception is required. Hilbert took it upon himself (indeed it seems he felt it was his duty as the preeminent mathematician of his day) to save mathematics from these threats. He attempted to show that the non-constructive, infinitistic methods used in, say, analysis and set theory were safe from objection by proving their consistency within the part of mathematics that was taken to be secure and unobjectionable. The first part of this introduction will be devoted to sketching the historical development up to, and including, Hilbert's mature program. Doing so will give us the opportunity to explore the historical landscape, and to illuminate various important themes found throughout the dissertation. Among these will be the shift toward rigorous, abstract, foundational thinking, and the emergence of axioms as implicit definitions. We discuss Hilbert's views in both their mathematical and their philosophical guises. The mathematical part of the discussion will provide a modern characterization of Hilbert's attempt to provide a direct consistency proof for analysis. Hilbert's work on foundations is typically divided into two periods – that from 1900-1905, and that from 1922 to 1931, after which the program was brought to a stop by Gödel.1 Though Hilbert's views changed over several decades, the "program" reached its maturity in the 1920s and was to proceed roughly as follows.2 First, separate completely the finitistic portion of mathematics.
This involves getting rid of any mention of infinite totalities as well as certain logical moves with unrestricted quantification. Though the exact notion of the “finitistic” portion is vague, the idea is that

1 Sieg has forcefully argued that this ignores all the work that was being done already in Zurich in 1917. So to say that the Hilbert program emerged from 1921/22 is, strictly speaking, mistaken; the period from 1917 was very fruitful. For more see Mancosu [1998] and Sieg [2013].
2 The following three-step rough sketch of Hilbert's program loosely follows that found in Simpson [1988].

we ought to be able to handle basic arithmetic, and reasoning about manipulations of finite strings of symbols. Second, reconstrue infinitistic mathematics as a formal system whose domain of discourse is replete with infinite sets, and which admits unrestricted logical reasoning, functions, and so on. The formulas of this system, however, will be able to be manipulated in a finitistic manner. Some care is required here, as this could be understood in two different ways. One way would be to say that Hilbert wanted to formalize all of mathematics in one single formal system. The other reading is that he instead desired to formalize certain branches of mathematics, e.g. analysis, set theory, etc., and show that these formalizations were consistent. As we will see later, which reading one adopts is critically important for assessing the success of the program. Finally, prove the consistency of the entire system using only the finitistic part. This is

achieved by giving a finitistic proof of the Π⁰₁-conservativity of the infinitistic system(s) over the finitary one. Doing so would ensure that any Π⁰₁ sentence provable within the infinitistic system would be finitistically justified, and our foundation would be secured. Hilbert's (later) insight was that by characterizing these systems formally, one could then study them mathematically. This would allow one to prove things about the formal systems themselves. This study, it was hoped, would enable one to prove the consistency of the formal systems, and thus show that they are reliable. Moreover, the metamathematical means by which these proofs should proceed ought to be finitary, since finitary mathematics was itself seen as reliable (or at least as reliable as one could ever hope it to be3). A very natural reading of HP then, and one that is fairly general, is that it was ultimately intended to secure the use of mathematical methods by providing an epistemic justification for them. Doing so would "endow mathematical method with...definitive reliability."4 As Kreisel puts the point, Hilbert's aim was to provide

...a final solution of all foundational questions by purely mathematical means, specifically, by a general method for deciding whether or not any given arbitrary formal system is consistent.

3 See Tait [1981].
4 Hilbert [1925], p. 370.

He was convinced that the notion of formal system was sharp enough for (as was later verified by Turing). Such a method would provide a final solution, at least 'in principle': we should know, here and now, how to decide the consistency of any formal rules we encounter, whatever their source...5
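The conservativity demand in the third step of the sketch above can be put schematically. Writing I for an infinitistic system and F for the finitistic one (these labels are ours, purely for illustration), the requirement is:

```latex
% Pi^0_1-conservativity of I over F, itself to be proved by finitistic means:
\text{for every } \Pi^0_1 \text{ sentence } \varphi, \qquad
  I \vdash \varphi \;\Longrightarrow\; F \vdash \varphi .
```

Since a canonical consistency statement is itself a Π⁰₁ sentence, a finitistic proof of this conservativity claim would in particular deliver a finitistic consistency proof for I.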

This highlights the philosophical part of the program, the aim being to secure the infinite in terms acceptable to the constructivist. Doing so would allay epistemic concerns regarding the infinite by grounding it in the finite. By formalizing mathematics in different systems and then proving, by finitist means, the consistency of the systems, Hilbert thought he would thereby dispose, once and for all, of all the foundational worries about mathematics. In addition to answering these critics, the more general hope was that the consistency proofs would provide the various branches of mathematics with an ultimately secure foundation, thereby guaranteeing the epistemic security of these branches. Had the program been successful it would have been, in the words of Gödel, "without a doubt of enormous epistemological value...[For m]athematics would have been reduced to a very small part of itself...[a]nd reduced to a concrete basis, on which everyone must be able to agree."6 This brief sketch highlights two different ways to understand Hilbert's epistemic goals. On the first reading, Hilbert's goal of a consistency proof was to allay the doubts of those like Brouwer and Weyl about the use of logical principles in doing mathematics, the most important one being the use of the law of excluded middle in application to infinite totalities. A consistency proof was to provide assurance about the new commitment to the "completed" infinite totalities. There is no doubt that appeal to certain ideal forms of reasoning aids in the proofs of real statements, where, roughly, the ideal corresponds to the infinite, and the real corresponds to the finite.7 The issue raised by the critics is whether these ideal forms of inference and methods of reasoning are epistemically secure.
On this first, modest, reading of Hilbert, the goal of a consistency proof was to quell any doubts on the part of those with intuitionistic (constructive) leanings, by proving the questionable inference rules acceptable

5 Kreisel [1976], pp. 111-12, as quoted in Detlefsen [1990].
6 Gödel [1938], p. 113.
7 We will see specific examples of these concepts in §1.1.2.

and doing so by means of inferences that were acceptable to the intuitionists. This view is purely instrumentalist in spirit. On the stronger reading, the consistency proof for Hilbert was to provide mathematics with an ultimately secure foundation. It was to guarantee the epistemic security of mathematics. This stronger reading is supported by Hilbert's emphasis that mathematics is the king of all the sciences and is objective and true independently of all human thinking. He stresses that mathematics is to play the ultimate role underlying any scientific activity. Mathematics is supposed to be clear of all doubts. Indeed, if it is to underlie all thinking and be the most secure discipline, then it must be beyond all doubt. A consistency proof (if one exists) is taken by Hilbert to provide this level of security by showing, despite appearances to the contrary as raised by the antinomies, that mathematics is free of contradictions. Indeed he says in 'On the Infinite' that we need to clarify the nature of the infinite for the very purpose of human understanding itself. So while Hilbert surely wanted to justify the appeal to ideal methods of thought and thereby refute the intuitionist's worries, the stronger view claims more, namely that he viewed it as of the utmost importance to secure the standing of mathematics as king by removing all doubts about its consistency. It is well known that Gödel's Incompleteness Theorems present certain insurmountable difficulties for Hilbert's Program, as originally intended, and all but destroyed his foundational goals. Roughly, the first incompleteness theorem states that given a consistent formal system S capable of capturing enough arithmetic, there will be statements formalizable in the language of the system that are true, yet unprovable in S. The second incompleteness theorem says that among these sentences will be a statement asserting the consistency of S.
Though more recently there have been arguments against Hilbert’s Program stemming from the first incompleteness theorem, traditionally the Gödelian argument appeals to the second theorem. The most standard argument against Hilbert’s Program is to note that given the results of the second incompleteness theorem, Hilbert’s goal of proving the consistency of (various branches of) mathematics (within those branches) is impossible.
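In the now-standard notation, writing Con_S for a canonical arithmetized consistency statement for S, the two theorems just sketched can be displayed as follows (the first is given in Gödel's original form, which assumes ω-consistency; Rosser's refinement weakens this hypothesis to plain consistency):

```latex
% G1: for S recursively axiomatized, omega-consistent, and containing
% enough arithmetic, there is a sentence G_S (true in the standard model) with
S \nvdash G_S \qquad\text{and}\qquad S \nvdash \neg G_S .
% G2: under the same hypotheses (plain consistency suffices, given the
% Hilbert-Bernays derivability conditions), consistency itself is unprovable:
S \nvdash \mathrm{Con}_S .
```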

The second part of this chapter discusses various post-Gödelian attempts to "overcome" incompleteness and salvage, in some sense, Hilbert's aims. Work in this area is standardly divided among two strategies. The first was to focus on constructive rather than finitistic methods. These are epistemically secure, though not strictly finite. Gentzen, for example, gave a constructive proof that Peano Arithmetic was consistent by appeal to ordinals and transfinite induction. The other route, pursued notably by the work of Harvey Friedman and Stephen Simpson on reverse mathematics, concerns different subsystems of second-order arithmetic and their relative strengths. These subsystems are determined by the different comprehension and induction schemes that they allow. What their research has shown is that, given certain appropriate characterizations of finitist and infinitistic mathematics à la Hilbert within these subsystems, substantial portions of infinitistic mathematics can be reduced to finitism (in a sense to be made precise later), thereby giving us certain significant partial realizations of Hilbert's program. Or so it is claimed. Central to these various attempts at partial realization is the project of characterizing different theories by their different consistency strengths. There are several other "natural" or "canonical" ways of strengthening a theory to "overcome" incompleteness. Moreover, in virtue of being "natural" or "canonical", these extensions (at least) seem epistemically acceptable. Some examples include adding ConPA to Peano arithmetic, postulating different large cardinals to extend ZFC, stratifying the language into types, adding a new truth predicate, and going up within reverse mathematics (which does not always increase the consistency strength) to the next strongest theory.
We might even think that a particular axiom becomes clearly true after enough reflection and familiarity with a branch of mathematics, or we might consider some of the consequences of adding an axiom and adopt it on the basis of considerations of overall utility to our theory.8 In all of these examples we are adding further sentences as axioms to our theory. The natural questions are: First, do we need new axioms? Secondly, given our focus on epistemology, which methods actually are epistemically acceptable, and why? Discussion of these issues points to a shifting conception of what a foundation of mathematics is, or ought to be.

8 Gödel [1944].
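For orientation, some of the strengthenings just listed can be arranged by consistency strength. The following is a small illustrative sample of the standard ordering, where T < T' abbreviates "T' proves Con_T":

```latex
\mathrm{PA} \;<\; \mathrm{PA}+\mathrm{Con}_{\mathrm{PA}}
 \;<\; \mathrm{ZFC}
 \;<\; \mathrm{ZFC}+\text{``there is an inaccessible cardinal''}
% By contrast, moving up within reverse mathematics need not add strength:
% e.g. WKL_0 is equiconsistent with RCA_0 (and indeed with PRA).
```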

1.1 Pre-Gödelian Hilbert’s Program

Cantor’s rich development of set theory showed us the possibilities of new generality, clarity, and rigor. And despite the great advances in the rigorization of mathematics through the work of Weierstrass, et al., and the power of the infinite in reasoning, certain set-theoretic paradoxes (e.g. Russell’s and the Burali-Forti paradox) put the nature of the infinite in question. The paradoxes were appealed to in support of the intuitionist’s critique of the classicist’s appeal to completed infinite totalities and unrestricted use of the law of excluded middle. Among these critics were Brouwer, and Hilbert’s converted student Weyl. Their diagnosis was that the notion of infinity supposed by the set theorists had been shown to be flawed by the paradoxes. It became Hilbert’s goal to justify the use of these abstract methods and to secure the role of the infinite in mathematics. The development of his proof theory was aimed to accomplish this task. Moreover, Hilbert wanted to beat these constructivists at their own game and respond by establishing the consistency of mathematics by the finitist means deemed acceptable by them. By formalizing mathematics in different axiom systems and then proving, by finitistic means, the consistency of the systems, Hilbert arguably thought he would thereby dispose, once and for all, of all the foundational worries about mathematics. Hilbert thus aimed to provide a justification for the methods and concepts used in mathematics. He aimed to show that they were epistemically acceptable, that all doubt was removed. As he famously said, “no one shall drive us from the paradise that Cantor created.” Part of the development of included an emphasis on rigor through ax- iomatization. The shift toward axiomatization was a hallmark of rigor because the axioms, along with precise logical rules, would allow (in theory) for a clear and precise characteri-

7 zation of the subject, as well as the ability to compare different structures and study their relations. The shift toward an emphasis on rigor through axiomatization was marked by the use of abstract concepts. The concepts themselves are characterized by implicit defini- tions. This allows one to investigate the structural relationships among the concepts. This axiomatic method allows one to investigate branches of mathematics rigorously, without appeal to intuition. Central to Hilbert’s view on the foundations of mathematics was his adoption of the ax- iomatic method. The axiomatic method was important for Hilbert because it allowed one to divorce rigorous mathematics from any exercise of intuition. It allows one to abstract away from particular meanings of the basic concepts taken to underlie the theory and to investi- gate their structural relationships. This stress first began with Hilbert’s axiomatization of in 1899. With the advent of different , the study of “geometry” became more of a mathematical one than strictly a physical study. To best understand these different geometries, mathematicians were led to the use of formalism and axiomatization. Hilbert’s axiomatization is a watershed in this axiomatic development that has shaped mathematics to its current form.

1.1.1 Early Foundations

As has been stressed by Wilfried Sieg, there was a significant change in Hilbert's approach to foundational issues over the years.9 Hilbert's early view of axiomatics has been called existential or structural axiomatics, whereas his later view can be aptly described as formal axiomatics. The existential axiomatic approach, heavily influenced by Dedekind, gets its name because it provides axioms for a system of objects assumed to exist and satisfy the axioms.10

9 See, e.g., Sieg [2013]. Sieg also emphasizes that the later proof-theoretic program and its methodology is often the means by which Hilbert's earlier views have been understood in the literature; such is a mistake. See also Ferreirós [2009].
10 My understanding of early Hilbert and the influence of Dedekind is heavily indebted to Ferreirós [2009] and Sieg [2013]. For a comparison of the logicisms of Dedekind and Frege see Demopoulos and Clark [2005].

This axiomatic approach of Hilbert first took its shape in his Grundlagen der Geometrie. There Hilbert's goal was to "establish for geometry a complete, and as simple as possible, set of axioms and to deduce from them the most important geometric theorems in such a way that the meaning of the various groups of axioms, as well as the significance of the conclusions that can be drawn from the individual axioms, come to light."11 Assuming the existence of a collection of objects, namely points, lines, and planes, the axioms serve to define these objects by detailing their structural relationships. The complete description of this relational structure is done by means of five groups or collections of axioms: incidence, order, congruence, parallels, and continuity.

Here one customarily begins by assuming the existence of all the elements, i.e. one postulates at the outset three systems of things (namely, the points, lines, and planes), and then–essentially on the pattern of Euclid–brings these elements into relationship with one another by means of certain axioms–namely, the axioms of linking, of ordering, of congruence, and of continuity. The necessary task then arises of showing the consistency and the completeness of these axioms, i.e. it must be proved that the application of the given axioms can never lead to contradictions, and, further, that the system of axioms is adequate to prove all geometric propositions. We shall call this procedure of investigation the axiomatic method.12

On this view, the axioms serve to define the notions of betweenness, congruence, etc., and these in turn implicitly define the notions of point, line, and plane – points, lines, and planes just are whatever satisfy the axioms. Which axioms were included or excluded from the different groups would result in different notions of point, line, and plane, and hence in different geometrical statements being true or false (with respect to the axioms). As he says,

Consider three distinct sets of objects. Let the objects of the first set be called points and be denoted by A, B, C, ...; let the objects of the second set be called lines and be denoted by a, b, c, ...; let the objects of the third set be called planes and be denoted by α, β, γ, ... The points, lines, and planes are considered to have certain mutual relations and these relations are denoted by words like "lie," "between," "congruent." The precise and mathematically complete description of these relations follows from the axioms of geometry.13

For Hilbert at this time, heavily influenced by the structuralism and logicism of Dedekind, the notion of a line, e.g., is simply to be a thing of our system. The consistency proof was to answer the question of whether the system of things actually exists. The defining

11Hilbert [1971], p. 2.
12Hilbert [1900a], p. 1092-1093.
13Hilbert [1971], p. 3.

characteristic of this approach, and the reason why Hilbert and Bernays would dub it existential axiomatics, results from the assumption that such objects as points, lines, and planes exist and satisfy the axioms. It is then the job of a consistency proof to discharge such an assumption. This attitude was adopted with respect to all mathematics. That is, axioms, once shown to be consistent, will serve to "define" a system. Hilbert took the axiomatic method as the way to organize a subject by implicitly defining its underlying concepts. This would serve to identify the structure that the theory is intended to describe. It was also the main way to explore metamathematical questions. Hilbert thought that if we could give an axiomatization of the reals and prove its consistency, then this would ensure that the reals existed as a completed totality.

In the case before us, where we are concerned with the axioms of real numbers in arithmetic, the proof of the consistency of the axioms is at the same time the proof of the mathematical existence of the complete system of real numbers or of the continuum. Indeed, when the proof for the consistency of the axioms shall be fully accomplished, the doubts, which have been expressed occasionally as to the existence of the complete system of real numbers, will become totally groundless.14

The doubts that Hilbert mentions here are certainly those of Brouwer, Kronecker, etc. By demonstrating the consistency of a set of axioms, the underlying system of objects satisfying the axioms was then taken to exist as a mathematical entity. Further metaphysical questions could be done away with. The underlying mantra of this early methodology might be phrased as “consistency entails existence”, or

Con ⇒ Ex for short. It is unclear, however, whether a consistency proof would have been sufficient to quiet such skeptics, despite the fact that Hilbert seems to have thought that it would. There is a question as to what exactly Hilbert meant by a "system" and by "existence" in the above quote. If he meant coherent mathematical concept or piece of thought, i.e. something that can be contemplated by mathematicians, then this seems quite reasonable.

14Hilbert [1900b], p. 1105.

Perhaps more questionable is the issue of Platonic realism. It is conceptions of this sort – the existence issues – that play a central role in Hilbert's disagreement with Frege and tie into the questions raised there regarding what the status of the axioms is/ought to be. Frege was a notorious realist, concerning both objects and truth-values. As such he placed great emphasis on the necessity of the truth of the axioms. Hilbert on the other hand eschews such a requirement and is concerned only with the consistency of the axioms. The idea is that the axioms themselves are meaningless, or at least can be. It is often claimed that Hilbert once remarked that in place of points, lines, and planes, we could just as well talk of tables, chairs, and beer mugs. We can see the Dedekindian structuralist influence clearly here. A further issue concerns what exactly Hilbert means by "complete". It is unclear whether Hilbert means complete in the sense that all the truths that follow from the axioms are provable, or whether he means complete in the sense of those truths accepted in ordinary mathematics.15 Hilbert's consistency and independence results for his axiomatization of Euclidean geometry proceeded by means of the now common method of reinterpretation. His reinterpretation of 'point', 'line', and 'plane' in terms of real numbers reduced the consistency and independence of the geometrical axioms to that of analysis.16 The consistency proof for geometry is achieved by constructing a set of objects from the real numbers that satisfy the axioms. That is, he reduced the consistency of geometry to that of analysis by giving a model of the geometric axioms within R³. The consistency of geometry was thus, in this way, to be reduced to that of Arithmetik. As such, the consistency (and independence) proofs are relative consistency proofs because the consistency of geometry is relative to that of Arithmetik.
If the geometrical axioms were inconsistent then the resulting reinterpreted sentences, using real numbers, would themselves be inconsistent. Thus, by showing that analysis is consistent, geometry would be shown to be consistent as well. The ultimate question, then, for Hilbert, was whether analysis is consistent. This was listed as the second of his famous problems for mathematics as presented in Paris in 1900:
15See Sieg [2013], p. 87 and Mancosu [1998], p. 151.
16In what follows I will often use the German Arithmetik to indicate mathematics including analysis.

I wish to designate the following as the most important among the numerous questions which can be asked with regard to the axioms: To prove that they are not contradictory, that is, that a finite number of logical steps based upon them can never lead to contradictory results.17

It thus became Hilbert's goal, one that characterized his foundational concerns from the turn of the century until the discovery of Gödel's Incompleteness Theorems, to establish the consistency of Arithmetik. Hilbert stressed in his 1904 Heidelberg address, "Über die Grundlagen der Logik und der Arithmetik", that proceeding in the same way as in the case of geometry, i.e. providing a relative consistency proof of the real-number system, would not be sufficient. Instead one needed to provide a direct consistency proof. Given that Russell's paradox was known, providing such a proof was becoming increasingly important. The 1904 address, published in 1905, was his first attempt to do so. That paper begins by considering various attempts by other thinkers to provide solid foundations for mathematics. Hilbert concludes, in each case, that they suffer from serious difficulties. These difficulties could, however, be overcome by appealing to the axiomatic method. This would provide a satisfactory foundation for our notion of number. Initially Hilbert thought that by modifying the work of Dedekind he could provide such a model-theoretic consistency proof, by means of the genetic method whereby one generates a system of things satisfying the axioms. It was only later that he rejected such a strategy, employing instead the completeness assumption and simply postulating the existence of an object satisfying the axioms. Hilbert's later genius was to consider mathematical proofs themselves as finite objects and show that no proof will have as its last line a contradiction. His focus was thus to provide a syntactic proof, as opposed to the standard model-theoretic direction of his predecessors. Important for subsequent developments was Hilbert's stress that arithmetic is often considered to be part of logic, and, when trying to give a satisfactory account of arithmetic, we do use logic. However, arithmetic concepts are used in logical reasoning. Thus, we need to

17Hilbert [1900b], p. 1104.

develop the two disciplines simultaneously so as to avoid the paradoxes. In an oft-quoted passage, Hilbert says that,

Arithmetic is often considered to be a part of logic, and the traditional fundamental logical notions are usually presupposed when it is a question of establishing a foundation for arithmetic. If we observe attentively, however, we realize that in the traditional exposition of the laws of logic certain fundamental arithmetic notions are already used, for example, the notion of set and, to some extent, also that of number. Thus we find ourselves turning in a circle, and that is why a partly simultaneous development of the laws of logic and of arithmetic is required if paradoxes are to be avoided.18

The 1904 address is a brief account of how the simultaneous development would go. This attempt to bring mathematics and logic together is important to fully understanding Hilbert's program. Following Avigad and Reck [2001], one can understand "mathematical logic" in two ways, depending on how "mathematical" is understood–either as subject matter or as method. The first is the study of the reasoning involved in mathematics. The second is the study of certain disciplines using mathematics. The intersection of these is then the study of the reasoning of mathematics using mathematics. It was this synthesis of mathematics with logic and their coeval development that was first brought out by Hilbert. Stressing this simultaneous development of mathematics and logic is important because it allows one to understand Hilbert's Program more fully, especially in its mature form to be discussed in the next section. In particular, too much focus has been placed on characterizing Hilbert's Program as a response to the foundational "crisis" posed by the antinomies, and not enough on understanding Hilbert's program as responding to, and attempting to unify, two different conceptions of mathematics: "general conceptual reasoning about abstractly characterized mathematical structures, on the one hand, and computationally explicit reasoning about symbolically represented objects, on the other...[O]ne of the strengths of Hilbert's program lies in its ability to reconcile these two aspects..."19 The moves toward axiomatics discussed above, and in particular the stress on abstraction by appeal to implicit definition, were at some remove from the emphasis on calculation that so heavily influenced the work of, say, Kronecker. Characteristic of this shift, mathematicians

18Hilbert [1905], p. 131.
19Avigad and Reck [2001], p. 4.

were becoming increasingly comfortable working with infinities. The work of Weierstrass, Cantor, and Dedekind shows the move from dealing with only potential infinities to now taking infinity "at face value" and accepting completed infinities. This led to an investigation into the interrelatedness of the different fields of mathematics and toward the desire for a general mathematical framework in which to view the entire discipline. Notably, this move to abstraction came with much philosophical reflection as to what exactly mathematics is about. What is it that we are doing, or perhaps should be doing, when we do mathematics? Historically, views were divided. And not surprisingly, these philosophical positions were informed by the divergent conceptions of mathematics as stated in the quote from the last paragraph. This difference in opinion has obvious consequences for how one approaches the infinite. On the calculation conception, the objects must be intuitable, and infinities can be dealt with only when they can be represented algorithmically. What exactly this means will become clear shortly. On the alternative conception, the infinities are taken at face value and constrained only by consistency and the like. While the real differences between the two conceptions of mathematics only come out in the infinite, these differences obviously inform which methods are allowed. As has been hinted at above, Hilbert's emerging insight was to examine by purely syntactic methods various formal systems containing abstract forms of reasoning. As will become clear later, the formalist position is thus to use the finitistic methods, externally, or metatheoretically, in order to examine mathematics itself. By making the distinction between mathematics and metamathematics, one can fruitfully use formalism to model mathematics by means of formal deductive systems and study it.
By doing so, Hilbert hoped to get the best of both worlds; we can appreciate the demand both for symbolic representation and calcu- lation (Kronecker), and for conceptual reasoning about structures, characterized abstractly (Dedekind, Cantor).

1.1.2 Later Foundations - "The" Hilbert program

We’ve seen that one of Hilbert’s insights was to rigorize mathematics by axiomatizing it and then investigating the relations among the axioms. He realized that for these foundational questions to even be addressable, and for us to perform this investigation of the axiomatic method, we need a proper logical formalism.20 Doing so would enable us to precisely specify such notions as consistency. And it was the proof of consistency that he took to ensure existence and justify use. In addition, he realized the central importance of the integers for this endeavor. This central importance he shared with Kronecker. But unlike Kronecker, he thought that legitimate mathematics could go beyond them. Moreover, as we saw above, Hilbert wanted to give a direct consistency proof. Unlike various proposals to define, for example, the rational numbers by reducing them to the naturals, he realized that reductions must ultimately end somewhere, and he desired to set the foundations on a solid, non-reducible base. To quell the critics and secure mathematics, this foundation was to be what Hilbert referred to as “contentual number theory”. This consists of sequences of marks, or signs, that are “completely surveyable” (recall Kronecker) and represented in intuition (à la Kant). The strokes themselves lack meaning but can be concatenated and manipulated by finitary means. The finitist point of view is concerned, then, with what sort of reasoning about, and operations on, these symbols can be achieved without any appeal to various abstract concepts and in particular, “completed” infinities. It is natural to ask what the subject matter of finitary arithmetic is. The obvious answer is the natural numbers. But what are these? Otherwise put, what exactly are the objects of finitism? The quick answer is that they are the finite sequences of strokes. But are they types or tokens? There is some disagreement here, and it is unclear exactly what Hilbert meant. 
These issues are not important for our discussion, so we will omit them.21

20For a while Hilbert thought that the formal system of Principia would be the one to do so, and he even went so far as to think that a logicist reduction would work. However, concerns with the axiom of reducibility eventually led him to reject this.
21See Tait [1981], Mancosu [1998], and Sieg [2013] for more details.

What is important is that such signs, or strokes, and finite sequences thereof, are the basic entities, and, importantly, do not enter into logical relations. They are defined recursively. As stressed by Zach [2005], it is crucial to "the conception that the numerals are sequences of one kind of sign, and that they are somehow dependent on being grasped as such a sequence, that they do not exist independently of our intuition of them." This is not to say that they are created by thought, so to speak, but that their existence is not independent of any intuitive construction. Such numerals are given in our representation of them as strokes, but they are not the strokes themselves. As Hilbert wrote,

As a further precondition for using logical deduction and carrying out logical operations, something must be given in conception, viz., certain extralogical concrete objects which are intuited as directly experienced prior to all thinking. For logical deduction to be certain, we must be able to see every aspect of these objects, and their properties, differences, sequences, and contiguities must be given, together with the objects themselves, as something which cannot be reduced to something else and which requires no reduction. This is the basic philosophy which I find necessary, not just for mathematics, but for all scientific thinking, understanding, and communicating. The subject matter of mathematics is, in accordance with this theory, the concrete symbols themselves whose structure is immediately clear and recognizable.22

For Hilbert, then, science consists of a certain type of reasoning and this reasoning is captured by finitist operations. And it is no different for the truths of mathematics. If we can thus reduce all of mathematical reasoning to finitistic reasoning then we have a solution for foundations. Such thinking eventually led Hilbert to the revolutionary suggestion that we should distinguish between mathematics and metamathematics.
Whereas the subject matter of our number theory is signs, metamathematics concerns formulas and proofs. The mathematics in which we normally engage involves quantifiers, and functions, and sets, and so on, as well as logical reasoning, including induction and the law of excluded middle. We then capture all this axiomatically and aim to prove the consistency of the axiomatization.

But what exactly is finitism? Hilbert is notoriously imprecise when it comes to characterizing what he means by this. He does say that it is the type of reasoning involved in scientific thought and that it needs at least to be able to capture basic number theory and the manipulation of finite strings of symbols. It is at least clear that unquantified statements about particular numbers count as finitary. The reason is that these are all effectively decidable. Even more, statements involving bounded quantification count as finitary, for these are also effectively decidable; there are only a finite number of cases to consider. The problematic statements, those that go beyond finitary means, are those that involve unbounded quantification.

This is a notoriously tricky area since Hilbert himself never explicitly designated the bounds of finitary (real) mathematics. However, there are certain things that can be said to get a clearer understanding of what a reasonable characterization is. First of all, the real is supposed to be meaningful and contentful. The most basic elements of finitary arithmetic are equations involving number terms. So, for example, the famous Kantian equation 7 + 5 = 12 is finitary. So too are various predications such as '(508 × 2) + 6 is prime'. Basic logical combinations of such sentences are also finitary, e.g. '7 + 5 = 12 or (508 × 2) + 6 is prime'. These are all decidable and clearly finitary. Given this theme of decidability, one also allows statements including only bounded quantifiers as finitary. However, not all unbounded quantification is problematic, so long as it is understood in a particular way. Affirmative general statements, i.e. universal generalizations, are non-problematic, so long as every instance can be computationally checked.23 These Π⁰₁ sentences (to use modern parlance) are considered meaningful to the finitist because they are taken as hypothetical statements about each individual case, rather than as an infinite conjunction concerning all cases.24 In this way the actual infinite does not play a role. To quote Gentzen, "∀xF(x) may be significantly asserted if F(x) represents a significant and true proposition for arbitrary successive replacements of x by numbers."25 Take, for example, the statement 'for all numerals n, n + 1 = 1 + n'. This is, according to Hilbert, meaningful. Each instance is finitary and so the statement itself is considered finitary, even though, he stresses, it is not to be taken as an infinite conjunction (infinite conjunctions being non-finitary), but rather as a hypothetical judgment that asserts something once a numeral is given.

22Hilbert [1925], p. 192.
23Actually, Hilbert did not use quantification, instead representing these universal generalizations with free variables.
24Hilbert [1925]. The distinction is reminiscent of that between any and all in Russell [1908].
25Gentzen [1936], p. 163.
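The effective decidability at issue here can be made vivid with a small sketch in modern computational terms (an illustrative recasting, not Hilbert's own apparatus; all function names are ours): unquantified numerical statements and bounded quantifications are settled by finite computation, and any single instance of a universal generalization can likewise be checked, even when the unbounded generalization itself outruns finitary means.

```python
# Illustrative sketch: finitary statements are effectively decidable.

def is_prime(n: int) -> bool:
    """Decide primality of a particular number by finite trial division."""
    return n >= 2 and all(n % d != 0 for d in range(2, n))

# An unquantified statement about particular numbers: decided outright.
kantian = (7 + 5 == 12)

# A bounded generalization, "for all n < 100, n + 1 = 1 + n":
# only finitely many cases, so a finite loop settles it.
bounded = all(n + 1 == 1 + n for n in range(100))

# The unbounded generalization cannot be checked by exhausting its
# instances, but each individual instance is computationally verifiable,
# which is why Hilbert treats it as a hypothetical judgment that asserts
# something once a numeral is given.
def instance(n: int) -> bool:
    return n + 1 == 1 + n
```

The negated generalization, by contrast, would assert that some instance fails, and no finite search can certify that no witness exists; this is the asymmetry the next paragraph turns to.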

But what happens if you try to negate a general statement? Negation shows that one needs to be careful when dealing with universal generalizations. The reason is that, according to Hilbert, universal generalizations are incapable of being negated since the negations are not finitary. For if they were, then we would have (again using modern parlance) a Σ⁰₁ sentence asserting that there exists an instance where the generalization is false. And this, according to Hilbert, involves unbounded quantification of the problematic kind. Negations, which correspond to existential generalizations, take us into the transfinite and are not considered meaningful. So along with the real/ideal distinction there is the further distinction within the real statements between the problematic and the unproblematic, the problematic corresponding to those whose admission of full classical reasoning sends us into the ideal. The finitary generalizations are merely a part of what are taken to be problematic. And, it should be noted, not all problematic statements are problematic for the same reason.

Now while there is certainly debate as to what exactly finitistic mathematics is, it has been influential, following Tait [1981], to identify finitary mathematics with Primitive Recursive Arithmetic and finitary statements with Π⁰₁ formulae.26 We will look more closely at Tait's characterization in chapter 4. For now we quote him:

I attempted to answer the first, conceptual, question by taking seriously the notion of an arbitrary or generic object X of a given finitist type, where a finitist type of the first kind is a product N × ··· × N, N being the type of the natural numbers, and a finitist type of the second kind is a product whose factors are numerical equations m = n. An object of the latter type, if there is any, consists of a proof of each of the factors m = n from the axiom 0 = 0 using the inference a = b ⇒ a′ = b′. My argument is that one can understand the idea of an arbitrary object of a given finitist type independently of that of the totality of objects of that type; and on its basis, we may proceed to construct objects of possibly other finitist types, which may depend on X. Thus, when X is of finitist type of the first kind, we may construct from it other objects f(X) of types of the first kind. I claimed that, when we identify just what means of construction from X are implicit in the idea of such an arbitrary object, they turn out to be precisely those by means of which we define the primitive recursive functions. Likewise, when we identify, for given finitist functions f(X) and g(X), what constructions of a proof of f(X) = g(X) are implicit in the idea of an arbitrary X, they turn out to yield proofs of exactly those equations deducible in PRA.
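The primitive recursive means of construction that Tait has in mind can be illustrated with a minimal sketch (ours, and merely a gloss: Python recursion here stands in for the official schema of primitive recursion): each function is fixed by a base equation at 0 and a recursion equation at the successor.

```python
# A sketch of primitive recursion in the spirit of PRA (illustrative only).
# Numerals are generated from 0 by the successor operation; addition and
# multiplication are then fixed by the recursion equations
#   add(m, 0) = m          add(m, n + 1) = succ(add(m, n))
#   mul(m, 0) = 0          mul(m, n + 1) = add(mul(m, n), m)

def succ(n: int) -> int:
    return n + 1

def add(m: int, n: int) -> int:
    return m if n == 0 else succ(add(m, n - 1))

def mul(m: int, n: int) -> int:
    return 0 if n == 0 else add(mul(m, n - 1), m)
```

Every particular equation such as add(7, 5) = 12 is then verifiable by a finite unwinding of these defining equations, which is exactly the sense in which PRA-style reasoning stays within the finitary.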

Beyond the finitary we get what Hilbert called ideal mathematics. This includes Π¹₁ sentences,27 i.e. claims about sets of numbers (the reals), and so on. It is ideal mathematics

26More recently, however, it has been thought to extend beyond this so as to account for double nested recursions, à la the Ackermann function. See Sieg [2013] and Zach [2005].
27Presumably the dividing line between the ideal and the real is Π⁰ₖ for k > 1.

that aids in our study of finitary mathematics. What we need to show is that ideal methods do not lead us to false statements of finitary mathematics. That is, we need the ideal part of mathematics to be a conservative extension of the finitary part of mathematics. He says,

[T]he infinite in the sense of an infinite totality, where we still find it used in deductive methods, is an illusion...[D]eductive methods based on the infinite must be replaced by finite procedures which yield exactly the same results; i.e. which make possible the same chains of proofs and the same methods of getting formulas and theorems.28

As beautiful as analysis is and as invaluable as the infinite is for it, it is Cantor and set theory that really illuminate the infinite. Hilbert even went so far as to describe Cantor's insights as "...the finest product of mathematical genius..." In analysis we are concerned with the infinitesimally small and infinitely large but only as "limiting concepts", as potential infinities. It is set theory that gives us the actual infinity conceived of as a completed totality by moving beyond into the transfinite. But as noted above, the problem was that contradictions started to appear owing to lack of care, resulting in strong reactions questioning this notion of the infinite, the definitions, and the deductive methods involved. Nevertheless, "No one shall drive us out of the paradise that Cantor has created for us!" What to do? Hilbert took it that elementary number theory is on solid grounds. The idea was to extend this certitude throughout mathematics. And this involves a clear elucidation of the infinite. He also took it to be the case that our logic was fine. It is not the logic that is to be blamed for the paradoxes but rather the abstractness of the concepts involved in the infinite. The subject matter of mathematics, i.e. "the concrete symbols themselves whose structure is immediately clear and recognizable", is given to us separately from logic. Mathematics cannot be reduced to logic in the way that Frege and Dedekind wanted. We must be sure that the subject matter to which we apply our logic is on solid ground. We rely on those concrete objects (signs) that present themselves to us directly, prior to experience. The solution, Hilbert says, is to save ourselves by introducing ideal elements, in much the same way that, for example, i = √−1 was introduced to save the rules of algebra. We must, therefore, "supplement the finitary statements with ideal elements." Symbols such as

28Hilbert [1925], p. 184.

a, b, +, =, and a + b = b + a mean nothing in themselves but we use them to derive formulas that do have meaning, namely the particular finitary statements. The parameters are used to communicate the meaningful finitary statements.

Mathematics [is thus] a stock of two kinds of formulas: first, those to which the meaningful communications of finitary statements correspond; and, secondly, other formulas which signify nothing and which are the ideal structures of our theory.29

We need to formalize the logical operations and the mathematical proofs. The logical symbols do not mean anything. They, too, are ideal. It is our use of symbols that allows us to represent mathematics. While these symbols are operated on by certain rules, they do not, strictly speaking, mean anything or have “content” themselves. Ideal mathematics, then, takes us beyond the finitary into the transfinite and includes full classical reasoning as well as appeal to “completed” infinite totalities.30 Unlike the finitary statements, ideal statements are meaningless (they do not express content) and merely play a role in deriving statements that do.31 And, depending on what branch of mathematics one is trying to formalize, the ideal theory needing justification will be different (e.g. set theory, or analysis, etc.). Regardless, though, the strategy will be the same. von Neumann [1930] has summarized Hilbert’s Program very nicely:

1. To enumerate all the symbols used in mathematics and logic...

2. To characterize unambiguously all the combinations of these symbols which represent statements classified as "meaningful" in classical mathematics. These combinations are called "formulas."...

3. To supply a construction procedure which enables us to construct successively all the formulas which correspond to the "provable" statements of classical mathematics. This procedure, accordingly, is called "proving."

4. To show (in a finitary combinatorial way) that those formulas which correspond to statements of classical mathematics which can be checked by finitary arithmetical methods can be proved (i.e., constructed) by the process described in (3) if and only if the check of the corresponding statement shows it to be true.

To accomplish tasks 1-4 would be to establish the validity of classical mathematics as a shortcut method for validating arithmetical statements whose elementary validation would be much too tedious. But since this is in fact the way we use mathematics, we would at the same time sufficiently establish the empirical validity of classical mathematics.

29Hilbert [1925], p. 196.
30For more on the divisions between the real and the ideal see Hilbert [1925], Sieg [2013], Mancosu [1998], and Detlefsen [1986], [1990].
31Or at least one can treat them as being meaningless. See chapter 6 for more discussion.

To prove 4 it suffices to show that 0 = 1 is not in the set generated in 3. This would show that classical mathematics is consistent. The tricky part is to do this in a finitary way.
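von Neumann's points (3) and (4) can be pictured with a toy calculus (entirely our invention, and far simpler than any system Hilbert considered): one axiom, 0=0, and one rule that passes from a=b to a′=b′. "Proving" is then a purely syntactic generation procedure, and consistency is the combinatorial observation that a formula playing the role of 0=1 is never generated.

```python
# Toy formal system (invented for illustration): axiom "0=0"; rule:
# from "a=b" infer "a'=b'". Theorems are generated purely syntactically,
# with no appeal to what the formulas mean.

def generate_theorems(steps: int) -> set[str]:
    """All formulas derivable in at most `steps` rule applications."""
    theorems = {"0=0"}          # the sole axiom
    current = "0=0"
    for _ in range(steps):
        left, right = current.split("=")
        current = left + "'=" + right + "'"   # append a stroke to each side
        theorems.add(current)
    return theorems

thms = generate_theorems(10)
# "0=0'" plays the role of 0=1 here; its absence from the generated set
# is the (finitely checkable) consistency of this toy calculus.
```

In this toy case the consistency argument is itself finitary: every derivable formula visibly has the same number of strokes on each side, so 0=0′ can never appear. Hilbert's problem was to do something of this combinatorial kind for systems strong enough to formalize classical mathematics.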

1.2 Gödel

Unfortunately for Hilbert, his optimism regarding absolute reduction was short-lived. In 1931 the young logician Kurt Gödel shocked the world with his incompleteness theorems. Gödel's incompleteness theorems are standardly taken to show that Hilbert's program, as originally intended, fails. Let T be a formal system containing a certain amount of arithmetic. What Gödel showed is that given an appropriate characterization of T, there will be a sentence G such that, if T is consistent then G is not provable in T, and if T is ω-consistent, then the negation of G is not provable in T. Based on the construction of these canonical Gödel sentences, G essentially says of itself that it is unprovable.32 And, since it is unprovable, it is therefore true. Note that if Hilbert's Program was to be conceived of as one whereby he wanted to find a single formal system that proved all finitary truths, then given that G is taken to be a finitary formula (it is Π⁰₁), the first incompleteness theorem would immediately destroy those hopes. A more reasonable construal of Hilbert's Program though is one where he wanted to formalize various branches of mathematics and show them consistent. See the remark on page 3 as well as the Kreisel quote that follows it. By formalizing the reasoning involved in the proof of the first incompleteness theorem and by placing certain restrictions (the so-called derivability conditions) on the notion of provability within T, Gödel then showed that if one could prove a statement formalizing the consistency of T, then one would be able to prove G. Given T's failure to prove G, it follows that T is also unable to prove the statement expressing T's consistency. Put another way, Gödel's first incompleteness theorem says that for any formal system, F, of sufficient strength, there are statements in the language of F that are true, and yet unprovable in

32See Milne [2007] for more on Gödel sentences and what they say.

F, assuming the consistency of F. Gödel's second incompleteness theorem says that among

these true yet unprovable statements is the sentence of Goldbach type, i.e. Π^0_1, expressing the consistency of F, call it ConF. Roughly, ConF expresses that ∀n, n is not the Gödel number of a proof of 0 = 1. As concerns Hilbert's program, assuming that the real part of mathematics can be appropriately captured by a formal system, the second incompleteness theorem says that the statement asserting its consistency is unprovable within the system. And because such a statement is provable in various infinitistic theories, the desired conservativity result is impossible. How do these two results relate to Hilbert's Program and what impact do they have on it? The standard argument against HP proceeds from Gödel's second incompleteness theorem: Hilbert wanted to justify various branches of mathematics by proving them consistent within the epistemically secure finitist base theory. The second incompleteness theorem shows that any branch T of mathematics containing enough arithmetic will, if consistent, not prove its own consistency. And, if this branch of infinitary mathematics contains finitary mathematics, then the secure subsystem will not prove the consistency of T either. Thus, despite Hilbert's desire for a finitist consistency proof, the argument from the second incompleteness theorem, it is maintained, destroys this hope. More recently though, it has been claimed by some that Gödel's first incompleteness theorem is devastating to Hilbert's program.33 The primary motivation for these arguments stems from the claim that Hilbert's program ought to be seen as one where the infinitary system is a conservative extension of the finitary system, where, generally, a theory T2 is conservative over a theory T1 iff for some set F of formulae, any formula φ in F that is provable in T2 is also provable in T1. Conservativity is widely taken as a philosophically satisfying way to establish an indirect justification for a theory.
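The reasoning behind that last claim can be written out in a couple of lines. This is a reconstruction of the standard argument, with 0 = 1 standing in for an arbitrary absurdity assumed to lie in the class F:

```latex
% Consistency transfers upward via conservativity:
% if T_2 is conservative over T_1 for a class F containing 0 = 1,
% and T_1 is consistent, then T_2 is consistent.
\begin{align*}
  &\text{Suppose, for reductio, that } T_2 \vdash 0 = 1.\\
  &\text{Since } 0 = 1 \in F \text{ and } T_2 \text{ is conservative over } T_1
   \text{ with respect to } F,\ \text{we get } T_1 \vdash 0 = 1.\\
  &\text{This contradicts the consistency of } T_1.\\
  &\text{Hence } T_2 \nvdash 0 = 1,\ \text{i.e., } T_2 \text{ is consistent.}
\end{align*}
```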
If T1 is consistent and T2 conservative over T1, then T2 is also consistent. The first incompleteness theorem demonstrates, it is claimed, that

33See, for example, Smorynski [1985], Kreisel [1976], Simpson [1988].

such a conservation result is not possible; it is thus sufficient to block Hilbert's Program. One might also wonder if we cannot provide an argument against the Gödel phenomena by noting that the Gödel sentence shown undecidable is artificially "cooked up". As such it is not an arithmetic statement of genuine interest. So while there may be sentences that are independent, they are not mathematically interesting sentences (whatever that means) and so we ought not to be fazed by them. That is, they are not the sort of sentences that we ought to care about. Perhaps such an argument could have gained traction. Relatively recent results, though, have undermined such claims. In particular, Paris and Harrington provided a modified Ramsey sentence of a purely combinatorial nature that was shown to be

independent of PA. Also, Harvey Friedman has a general method for generating Π^0_1 sentences that are independent of ZFC. Despite these arguments, Michael Detlefsen ([1986], [1990]) maintains that the argument from the first incompleteness theorem (G1) is not sufficient to block Hilbert's program, and that the more promising line of attack against HP is from the second theorem (G2). This is not to say that Detlefsen thinks that arguments from the second incompleteness theorem do undermine Hilbert's program, for he has gone to great lengths to argue that they do not. His point, instead, is

to affirm the pivotal status of the G2-based argument and thus to focus attention on it. We are thus making a proposal concerning what we take to be the proper focus of work aimed at determining the effect of Gödel’s work on HP.34

Given his claims that the proper argument against HP proceeds from G2, he has further argued that such arguments are not applicable to Hilbert's position of finitism. Detlefsen presents the argument from G1 as a dilemma to the effect that either the infinitistic theory is incomplete with respect to a certain subclass of real sentences or it is not a conservative extension over the finitistic theory. He then argues that Hilbert had reason to reject both of these horns, and, as such, the argument from the first incompleteness theorem does no damage to Hilbert's program. His argument against G2 as refuting HP concerns the

34Detlefsen [1990], p. 344.

particular formalization of the consistency statement shown unprovable by Gödel's theorem, and endorses what are called Rosser systems. Detlefsen's arguments are the focus of chapter 3. As we will see, their success will critically depend upon the precise characterization of what exactly Hilbert's Program is. It is our contention that despite Detlefsen's attempts, both of the arguments (from the first and second incompleteness theorems) are devastating to Hilbert's Program. This is not to say that Detlefsen's proposal fails per se, for as we will see the view Detlefsen puts forth is not a faithful characterization of Hilbert's views; it is, rather, a modified version of Hilbert's general program cast as a particularly strict form of instrumentalism. It then remains to analyze the coherence of Detlefsen's views independently of the historical Hilbert. It should be stressed, however, that Hilbert's program was monumental despite its "failure". As von Neumann notes, "although the content of a classical mathematical sentence cannot always (i.e. generally) be finitely verified, the formal way in which we arrive at the sentence can be."35 It was Hilbert's ambitions, and the distinctions, discussed above, between mathematics and metamathematics, and semantics and syntax, that truly led to the appreciation of the study of proofs, and led Gentzen ultimately to found the area of mathematics we commonly refer to as Proof Theory. Moreover, even though Hilbert's plan for establishing "security" did not work out, his focus on finitism in his investigation of the infinite was of the utmost importance. For as Tait [1981] has claimed,

[N]o absolute conception of security is realized by finitism or any other kind of mathematical reasoning. Rather, the special role of finitism consists in the circumstance that it is a minimal kind of reasoning presupposed by all nontrivial mathematical reasoning about numbers. And for this reason it is indubitable in a Cartesian sense that there is no preferred or even equally preferable ground on which to stand and criticize it. Thus finitism is fundamental to mathematics even if it is not a foundation in the sense Hilbert wished.

We can arguably take Tait’s claim even further. For it seems reasonable to think that Hilbert adopted a position similar to Frege, who took logic to be universally applicable. On this view, failure to grasp numbers means failure to think at all.36

35von Neumann [1930], p. 62. 36Thanks to Stewart Shapiro for stressing this extension.

What the Gödel phenomena show is that we can only get relative consistency results. And this is a block to the complete epistemic security that, as I have claimed in the introduction, Hilbert wanted. We get it only in a conditional form. If theory T is consistent, then so is S, where T proves S's consistency.37 So the Gödelian phenomena show that we are limited in our ability to have complete mathematical assurance that our systems are acceptable or secure in any robust sense.

1.3 Post-Gödelian Hilbert Programs

In light of Gödel’s theorems, one might wonder whether Hilbert’s program is still viable in any way. There are two primary avenues of thought toward such modified Hilbert programs, depending on whether one restricts its scope or enlarges the methods of proof to be admit- ted. The former, specifically the work of Friedman [1975, 1976] and Simpson [1988, 2009], abandons the project of providing a foundation for all of mathematics on the basis of finitary reasoning, and sees just how much we can actually get on the basis of finitary means. That is, assuming we can formalize finitary arithmetic, we need to determine which theories are finitarily provably conservative over that formalization. The latter task, notably pursed by Gentzen [1936, 1938] and Gödel [1958], gives up the stress on strictly finitary reasoning and liberalizes what counts as epistemically acceptable.

1.3.1 Subsystems of Second-Order Arithmetic

The first route mentioned above looks to see how much infinitary mathematics we can indirectly justify on the basis of finitary mathematics. Executing this program requires that we invoke more mathematical precision than Hilbert did. Following remarks by Hilbert and Bernays in their Grundlagen der Mathematik, Simpson [1981, 2009] takes the infinitistic

37Interestingly, these conditionals are proved very ‘low down’ in the hierarchy of consistency strengths, usually in PRA.

system to be that of second order arithmetic (Z2).38 He follows Tait [1981] in taking the finitistic system to be that of primitive recursive arithmetic (PRA).

There seems to be a certain naturalness about PRA which supports Tait’s conclusion. PRA is certainly finitistic and “logic-free” yet sufficiently powerful to accommodate all elementary reasoning about natural numbers and manipulations of finite strings of symbols. PRA seems to embody just that part of mathematics which remains if we excise all infinitistic concepts and modes of reasoning.39

Clarified in these terms, the aim of the pre-Gödelian Hilbert program was to prove the consistency of Z2 within PRA. We could achieve this aim by showing (within PRA) that Z2

is a conservative extension of PRA with respect to the Π^0_1 sentences. If we could do this, then it would show that all of these sentences provable in Z2 are likewise provable in PRA. Thus, appeal to the infinite would be nothing more than a form of convenience. This aim cannot be achieved, however, because Gödel's incompleteness theorems show that there are

certain Π^0_1 sentences provable in Z2 that are not provable in PRA. Hence, no conservative extension result is possible, regardless of how strong a theory one uses. Hilbert's program, thus stated, is thwarted. It is not that Z2 or PRA are inadequate characterizations of the infinitistic and finitistic, but rather that there can be no reduction of the former to the latter.

Despite this failure, we might wonder just how much of Z2 is conservative over PRA. It turns out that the answer is quite a bit. Whereas it is customary in mathematics to see what theorems can be proved from a certain set of axioms, the program of Reverse Mathematics tries to determine, given a particular theorem, the logically weakest ‘natural’ set of axioms that suffice to prove it. As it turns out, almost all of ‘ordinary’ mathematics can be stated in the language of Z2 (via copious methods of coding) and proven in some subsystem of it, where ‘ordinary’ mathematics is that subset of mathematics that is independent of abstract set- theoretic concepts. The general strategy for determining the weakest subsystem in which the theorem τ can be proven is to show that the set existence axioms characterizing the subsystem are equivalent to τ, where the equivalence is provable in a weaker system in

38 We provide details regarding Z2 in chapter 2. 39Simpson [1988], p. 352.

which τ is not provable.

Of particular importance and interest is the subsystem of Z2 that Friedman has called

Weak Konig’s Lemma, denoted WKL0. WKL0 has the same language as Z2 and employs full

classical logic with LEM. Unlike Z2, WKL0 employs induction restricted to Σ^0_1 sentences. In addition, WKL0 contains PRA but adds the nonconstructive axiom known as Weak König's Lemma. This states that any infinite tree composed of finite sequences of 0's and 1's will have an infinite branch.

Friedman has shown that WKL0 is conservative over PRA with respect to Π^0_2 sentences, and thus is also conservative over the weaker Π^0_1 sentences. This is taken as showing that PRA counts as a finitistic reduction of WKL0 in the sense that Hilbert desired. Moreover, this reduction is provable in PRA. All of this would be for naught, though, if WKL0 were as weak as PRA with respect to infinitistic mathematics. But, as shown by Friedman and Simpson, it is strictly stronger.40 Simpson has recently strengthened this result and shown that an extension of WKL0, which he calls WKL0+, is properly contained in ACA0 and is also conservative over PRA with respect to Π^0_2 sentences.
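To make the statement of the lemma concrete, here is a small sketch of my own (the tree and the helper names are invented for illustration). It extracts an infinite branch from a decidable infinite 0-1 tree, but only because, for this toy tree, the question "does this node head an infinite subtree?" is itself decidable; in general, that is exactly the information WKL supplies nonconstructively.

```python
def in_tree(s: str) -> bool:
    # Membership in a concrete infinite 0-1 tree: binary strings
    # containing no two consecutive 1's. (A toy example tree.)
    return "11" not in s

def subtree_infinite(s: str) -> bool:
    # For THIS tree the oracle is trivial: any node in the tree can be
    # extended forever by appending 0's, so every node in the tree
    # heads an infinite subtree. In general no such decision procedure exists.
    return in_tree(s)

def branch_prefix(n: int) -> str:
    # First n steps of the leftmost infinite branch: at each node, step
    # to a child whose subtree is infinite. When the whole tree is
    # infinite, Weak König's Lemma guarantees such a child always exists.
    s = ""
    for _ in range(n):
        s += "0" if subtree_infinite(s + "0") else "1"
    return s
```

Running `branch_prefix(8)` returns `"00000000"`, the start of the leftmost infinite branch of this tree. The epistemic point survives the toy example: the construction leans entirely on the `subtree_infinite` oracle, which finitary means cannot in general provide.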

1.3.2 Gentzen-style HP

Rather than restrict the scope of Hilbert’s Program, one can also liberalize the methods by which a consistency proof may be deemed acceptable. As we’ve seen, a theory is consistent just in case proofs in that theory do not lead to contradictions. The proofs themselves then become objects of study. The first step is to formalize mathematical proofs, then prove that none of the possible proofs leads to contradiction. Now in proving this, we are already using certain inferences and concepts, so we must presume from the beginning that these are themselves consistent and sound. Thus, there can be no absolute consistency proof. So we need to proceed from inferences, etc. that we deem more secure than those we are trying to prove consistent. The need to re-set our goal in this way is brought out particularly clearly

40 A list of theorems provable in WKL0 can be found in Simpson [1988, 2009].

27 by Gödel’s second incompleteness theorem. Gödel himself believed that though axiomatic set theory provided the appropriate axioms for a foundations for mathematics, such a foundation presupposes a form of Platonism. And though set theory is most likely to be consistent, showing as much needs to avoid the objec- tionable epistemic concerns that plague Platonism. In virtue of the second incompleteness theorem, the finitist methods of Hilbert will not suffice. The move then was to extend the acceptable reasoning to that of constructive proof. Though not finitist it is still epistemically more acceptable than full classicism. In 1933 Gödel showed how to translate PA into Heyting arithmetic (HA). This reduction of classical predicate logic to intuitionistic predicate logic proceeds by means of the so-called double-negation translation, and is commonly referred to as the Gödel-Gentzen translation. Following Avigad and Feferman [1998], p. 342, it is defined as follows:

1. ϕ^N = ¬¬ϕ, for ϕ atomic
2. (ϕ ∧ ψ)^N = ϕ^N ∧ ψ^N
3. (ϕ ∨ ψ)^N = ¬(¬ϕ^N ∧ ¬ψ^N)
4. (ϕ → ψ)^N = ϕ^N → ψ^N
5. (∀x ϕ(x))^N = ∀x ϕ(x)^N
6. (∃x ϕ(x))^N = ¬∀x ¬ϕ(x)^N

Classically, every formula is equivalent to its N-interpretation.

Importantly, (ϕ ∨ ψ)^N ↔ ¬¬(ϕ^N ∨ ψ^N) and (∃xϕ)^N ↔ ¬¬∃xϕ^N are provable intuitionistically. Moreover, one can show

Theorem 1.3.1 Suppose a set of axioms S proves a formula ϕ using classical logic. Then S^N proves ϕ^N using intuitionistic logic.

Corollary 1.3.1.1 Suppose PA proves a formula ϕ. Then HA proves ϕ^N.
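The six clauses are mechanical enough to run on formula trees. The sketch below is my own illustration, not part of Avigad and Feferman's presentation: the tuple encoding and the extra clause for an explicit negation connective are assumptions made for the example.

```python
# Formulas as nested tuples (a hypothetical encoding):
# ("atom", name), ("not", phi), ("and", phi, psi), ("or", phi, psi),
# ("imp", phi, psi), ("all", x, phi), ("ex", x, phi).

def neg(phi):
    return ("not", phi)

def N(phi):
    """Gödel-Gentzen double-negation translation, clause by clause."""
    op = phi[0]
    if op == "atom":                                   # clause 1
        return neg(neg(phi))
    if op == "not":                                    # (¬ϕ)^N = ¬ϕ^N (added clause)
        return neg(N(phi[1]))
    if op == "and":                                    # clause 2
        return ("and", N(phi[1]), N(phi[2]))
    if op == "or":                                     # clause 3
        return neg(("and", neg(N(phi[1])), neg(N(phi[2]))))
    if op == "imp":                                    # clause 4
        return ("imp", N(phi[1]), N(phi[2]))
    if op == "all":                                    # clause 5
        return ("all", phi[1], N(phi[2]))
    if op == "ex":                                     # clause 6
        return neg(("all", phi[1], neg(N(phi[2]))))
    raise ValueError("unknown connective: " + op)
```

For instance, translating ∃x P produces ¬∀x ¬¬¬P: classically equivalent to the original, as the text notes, and expressed entirely in the ∃-free, ∨-free fragment on which classical and intuitionistic provability agree.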

This is a step in the right direction for the Hilbert program, but HA still appeals to first order quantification theory and so is not finitary in the sense required by Hilbert. The translation result did, however, make clear that finitism does not accord with intuitionistic views as it was taken to do by Hilbert and others. So the move to make was to widen the conception of Hilbert’s program. Rather than the strict finitism of Hilbert we ought perhaps

to extend the acceptable means to those of a constructive character. It was conceptions of this sort that were the first steps toward a relativized form of Hilbert's program. In 1936 Gentzen provided a consistency proof for PA by appeal to transfinite induction

up to the ordinal ε0. As he makes clear, the goals of consistency proofs are to "justify the disputable forms of inference...on the basis of indisputable inferences" [Gentzen, 1936, 158]. To do this we obviously need to figure out what the disputable and what the correct inferences are. Our goal is to show that the last line of a derivation cannot be a contradiction. The first step in showing this is to translate PA to HA. The next step is to reduce deductions by appropriate substitutions so that they contain only ∀, ∧, and ¬. One then transforms every derivation into its cut-free version. Finally one performs reduction procedures to show that each derivation whose concluding sequent has neither a false atomic in the antecedent nor a true atomic in the succedent can be reduced to such a form. Moreover this reduction can be achieved in a finite number of steps. A proof of → 0 = 1 can never be reduced in such a way. Each derivation is assigned a unique ordinal. We use transfinite induction to show that every derivation can be reduced accordingly.41,42,43 The modern exposition of Gentzen's proof is due to Takeuti and is the focus of chapter 4. Central to Takeuti's proof is the demonstration of the claim that whenever a concrete method of constructing decreasing sequences of ordinals is given, any such decreasing sequence must be finite. Takeuti takes the constructive demonstration of this result as being of particular philosophical and epistemic value. The central theme that comes out of the philosophical discussion of chapter 3 is that such a result can only be understood "on the outside" of the system. The idea of being on the outside looking in was stressed by Burgess [2010] in his discussion of the philosophical significance of

the reduction of WKL0 over PRA mentioned at the end of §1.3.1. Our discussion of Takeuti in chapter 3 shows how this theme generalizes, and will be of central importance in the rest

41See Sieg [2013] for a very illuminating account of the development of Gentzen’s results in light of his involvement with the Hilbert school and the influence of Gödel. 42Since Gentzen, proof theorists have mimicked this “” for theories stronger than Peano Arithmetic. See Rathjen [2006]. 43For another example of this constructive reduction see Gödel [1958].

of the dissertation.
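Takeuti's point about concretely given decreasing sequences of ordinals can be made tangible. The following sketch is my own illustration, not Takeuti's formalism: it codes ordinals below ε0 in Cantor normal form and checks that a hand-built descent from ω^ω is strictly decreasing.

```python
# An ordinal below epsilon_0 in Cantor normal form
# omega^e1 * c1 + omega^e2 * c2 + ... is represented as a list of
# (exponent, coefficient) pairs, exponents strictly decreasing and
# themselves such lists. [] denotes the ordinal 0.

def less(a, b):
    # Lexicographic comparison of Cantor normal forms: is a < b?
    for (ea, ca), (eb, cb) in zip(a, b):
        if less(ea, eb):
            return True
        if less(eb, ea):
            return False
        if ca != cb:
            return ca < cb
    return len(a) < len(b)

def nat(n):
    # The finite ordinal n, i.e. omega^0 * n (with 0 itself as []).
    return [([], n)] if n > 0 else []

OMEGA = [(nat(1), 1)]        # omega
OMEGA_OMEGA = [(OMEGA, 1)]   # omega^omega

# A concretely given strictly decreasing sequence:
# omega^omega > omega^3 > omega^2*5 > omega*7 > 42 > 0
seq = [OMEGA_OMEGA, [(nat(3), 1)], [(nat(2), 5)], [(nat(1), 7)], nat(42), nat(0)]
```

Nothing in the code proves termination; it merely exhibits one descent and the well-ordering on notations. That every such concretely presented descent is finite is exactly what transfinite induction up to ε0 asserts, and, as Takeuti stresses, the constructive demonstration of that claim lives outside any single formal system.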

1.3.3 Predicativity

Chapter 5 considers in detail work by Solomon Feferman on Predicativity. In particular Feferman was able to demonstrate the limits of Predicativity as coinciding with the ordinal

Γ0. The discussion of Predicativity is in a similar spirit to that of Gentzen's work, in the sense of liberalizing what counts as an epistemically safe starting point. The basic idea behind Predicativism is the acceptance of the natural numbers as basic to human understanding and then to see just how much of mathematics can be shown on the basis of this starting point. The discussion of predicativity is important for two reasons. First, it signals a shift in how one understands foundational work in mathematics. In particular it signals a shift from a focus on security and justification to that of determining the limits of particular philosophical views. It serves as a lynchpin, so to speak, between the early foundational work discussed in the first half of the dissertation and the more contemporary foundational work discussed in the second part. The second reason why it is important is that it provides another lucid example of the internal/external divide that concerns the philosophical portion of the discussion of Takeuti. As mentioned above, this theme emerges as the central point upon which the philosophical value and significance of the epistemic aspect of proof theory hinges. This becomes particularly clear in the discussion of proof-theoretical reductions generally, and is the subject of chapter 6.

1.3.4 Reductive Proof Theory

The above examples are all illustrative of reductive proof theory more generally. Feferman has argued that proof theory can play an important role in bridging the gap between mathematics and philosophy. This is part of his overall view that logic casts light on mathematics. Much of proof theory is concerned with various reductions. Our concern will be with foundational

reductions, specifically the idea that mathematical reductions can serve to justify certain philosophical positions. This position is nicely summarized by Feferman:

A body of mathematics M is represented directly in a formalized system T1 which is justified by a foundational framework F1. T1 is reduced proof-theoretically to a system T2 which is justified by another, more elementary such framework F2.

In Hilbert’s original proposal, F1 is infinitary mathematics and F2 is finitary mathematics.

T1 is then (roughly) Z2, and T2 is (something like) PRA. Work in contemporary reductive proof theory shows that one need not specifically focus on Hilbert’s program. Indeed, one can use proof theory to give a hierarchy of mathematical dependencies. Except for some of the more traditional proof-theorists, establishing consistency of formal systems has receded as the main goal, to be replaced by a more general reductive program. This still takes foundational aims to be primary, but gives up hopes for any supposedly absolute foundation in favor of reductions of various systems to others recognizably more basic in terms of concepts and/or principles.44

In this way we get clearer on, to borrow the phrase from Feferman, what rests on what. The point is related to the distinction discussed above regarding mathematical practice. Are we doing mathematics for purely calculational reasons, or are we trying to discover the structure of mathematics? Theories have different degrees of justification and different ways of being justified. Philosophically there are several ways to provide indirect justification of a theory. One way is

to translate sentences in a language L1 of a theory T1 to those of language L2 of a theory

T2. If T2 is justified, then the translation provides indirect justification for T1. An example of a translation is that of PA into HA that we saw above. Another way of giving indirect justification of a theory is to show that it is a conservative extension of a theory taken to be justified. Again, as we saw above, conservative extension is a philosophically satisfying way to indirectly justify a theory. This is because if T2 is a conservative extension of T1, and

T1 is consistent, then so too is T2. Reductive proof theory thus yields relative consistency proofs, which tell us that if the theory being reduced to (T1) is consistent, then so is the reduced theory (T2).

44Feferman [1993b], p. 223.

Now one might expect that we could have theories T1 and T2 that are not comparable in terms of consistency strength. And indeed one can come up with trivial examples. However, what is perhaps quite shocking is that all the mathematically interesting theories form a linear hierarchy – there is a natural linear chain of strengthenings, and every theory of mathematical interest can be identified with some level in the hierarchy. What proof-theorists have done in the wake of Gödel is to show that different theories are of equal strength by showing them both ‘equal’ (two-way reducible) to one another or to one of the fundamental theories in the linear hierarchy. The above sketch of reverse mathematics gave us a taste of how this works. The fundamental theories of the second-order arithmetical hierarchy were those of

RCA0, WKL0, ACA0, ATR0, Π^1_1-CA, and Z2 mentioned above. Some of the fundamental theories in the hierarchy of first-order arithmetic are Q, EFA, PRA, and PA. Finally, some of the fundamental theories in the first-order set-theoretic hierarchy are Z, ZF, ZFC, ZFC plus measurable cardinals, and so on for increasingly ‘large’ cardinals.45 Three important questions now come to the surface.46

1. Which formal theories are adequate (and necessary) for which parts of classical math- ematical practice?

2. To which constructive theories can classical theories be reduced?

3. What contribution to our understanding of (the nature of) mathematics do reductions make?

In light of our discussion above regarding reductive proof theory and some of the technical results, we can recast question three as follows: What good does it do to know what rests on what? One answer was given above when we said that a theory can be indirectly justified. In the beginning of this introduction I noted two ways of conceiving of Hilbert’s program. The first was that the consistency result would serve an instrumental purpose. While this seems true to some degree, it seems unlikely that this was the only goal of Hilbert, since there are certainly other reasons for investigating the ideal parts of mathematics over and above simply

45For some good overviews see Burgess [2005], Sieg [2013], and Feferman [1977, 1988, 1993b, 2000]. For more encyclopedic treatments see Simpson [2009], Hajek and Pudlak [1998], and Kanamori [1997]. 46See Sieg [2013], p. 290.

being a means to arrive at finitary statements. Recall that the stronger reading of Hilbert's goal was one of epistemic security. The idea was that if a base theory is epistemically secure and conservativity is achieved, then that transmits epistemic security to the extended theory; a relative consistency result will give us an assurance that what we are doing is in order, and indeed the things we are proving are true. It has been claimed, perhaps most notably by Feferman, that one of the take-home points of the proof-theoretic reductions lies in informing one's philosophy of mathematics, on the basis of one's foundational outlook. By carving up mathematics, proof-theoretic investigations tell us which formal systems line up with which philosophies of mathematics. For example, the five basic standard theories of reverse mathematics line up with various philosophical positions concerning foundations: RCA0 formalizes the constructive mathematics

of Bishop, WKL0 the finitistic reductionism of Hilbert, ACA0 the predicativism of Weyl and

Feferman, ATR0 the predicative reductionism investigated by Friedman and Simpson, and

Π^1_1-CA captures the impredicativity developed by Buchholz, Feferman, Pohlers, and Sieg.47 These results also help to answer questions 1 and 2 above. The conservativity results are taken as showing us just how much mathematics is justified on the basis of our prior philosophical viewpoints. Whether they actually succeed in showing this is the focus of chapter 6. The discussion ties together the theme of being inside of a system versus being outside that comes to the fore in the philosophical discussions of chapters 4 and 5.

1.4 Post-Gödelian Holism

We mentioned that the work on predicativity signaled a shift from security to surveyance, and that such surveyance is central to much work in contemporary reductive proof theory. The shift also highlights an ambiguity surrounding what is meant by a foundation for mathematics. Fundamental to getting clear on this is to distinguish between two different views of what a foundation for mathematics is/ought to be. The two main threads of thought

47See Simpson [2009], p. 43 for more.

are what could be called Hilbertian Foundationalism and Holistic Naturalism. Hilbertian Foundationalism is what has been discussed above in the work of Hilbert and (properly understood) Feferman's work on reductive proof theory. The rough idea is that what is meant by a ‘foundation for (a branch of) mathematics’ involves a sense of clarity and definiteness. The holistic picture is one whereby a foundation is to be understood as a way to bring the seemingly disparate branches of mathematics together in a unifying way. This is largely in response to both the emergence of mathematical practice as one involving an indirectness of method and an interconnectedness of the various branches of mathematics, as well as a failure, due to Gödel's Incompleteness Theorems, of the more traditional Foundationalist conception. (Naturalistic) Holism has most recently been defended explicitly by Penelope Maddy, under the influence of John Burgess, though the philosophical consequences of their views are different. In light of Gödel's results, Foundationalism has largely gone out of style. The lesson from Gödel is commonly taken to be that the goal of the Hilbert program is simply an outdated and unattainable ideal. Shapiro [1991], for example, denies the need for an absolutely secure foundation: “[J]ust as we have learned to live with uncertainty in virtually every special subject, we can live with uncertainty in logic and foundations of mathematics, and we can live well...mathematics does not have an absolutely secure foundation...The practicing mathematician and the practicing logician do not stand in need of a secure basis for their work...There are important roles for foundations of mathematics, but providing maximal security is not among them”48

Of course this is not to say that foundational study itself ought to cease. It is just to say that foundational investigations and methodology will be of a different sort. Rather than conceiving of a foundation for mathematics as being an epistemic bedrock of definiteness and clarity, it has become standard to use set theory as the “arena” in which to investigate and compare various mathematical structures. What set theory does is provide a generous, unified arena to which all local questions of coherence and proof can be referred. In this way, set theory furnishes us with a single tool that can give explicit meaning to questions of existence and coherence; make previously unclear concepts and structures precise; identify perfectly general fundamental assumptions that play out in many different guises in different fields; facilitate interconnections between disparate

48Shapiro [1991], pgs. 25-26.

branches of mathematics now all uniformly represented; formulate and answer questions of provability and refutability; open the door to new strong hypotheses to settle old open questions; and so on. In this philosophically modest but mathematically rich sense, set theory can be said to found contemporary mathematics.49

Hilbert’s relative consistency proofs illustrate two characteristics of the emerging mathe- matics that had crucial implications for the ideals of rigorization: indirectness of method and interconnectedness of various branches. Interconnectedness implies that “it will no longer be sufficient to put each individual branch of mathematics separately on a rigorous basis.”50 If you’re going to use one branch of mathematics to investigate another, you had better not lose rigor along the way. Putting them both on the same rigorous foundations ensures that this does not happen.

To guarantee that rigor is not compromised in the process of transferring material from one branch of mathematics to another, it is essential that the starting points of the branches being connected should at least be compatible...The only obvious way to ensure compatibility of the starting points of different branches is ultimately to derive all branches from a common, unified starting point. The material unity of mathematics, constituted by the interaction of its various branches at their higher levels, virtually imposes a requirement of formal unity, of development within the framework of a common list of primitives and postulates, if the rigorization project is to be carried to completion.51

Indirectness of method implies that

the common, unified starting point will have to be such as to make provision for all the types of constructions by which new, auxiliary spaces or number systems or whatever are manufactured out of old, traditional ones...To complete the project of rigorization, a framework not only common but commodious would be called for, one accommodating (1) all traditional branches [though traditional number theory is enough]; and (2) all methods for constructing new spaces or number systems or whatever from old, with all their actually existing and all their potential future applications.52

Mathematicians deduce and define new notions and results from previous notions and results, and ultimately from first principles, primitives, and postulates. A genuine deduction of a conclusion from premises shows that conclusion to be a consequence of those premises in the sense that the logical form of the premises alone guarantees that their truth implies the truth of the conclusion. The fact that form alone guarantees truth preservation means that any argument of the same form guarantees truth preservation, regardless of what the

49Maddy [2011], p. 34. 50Burgess [2015], p. 60. 51Burgess [2015], pp. 61-62. 52Burgess [2015], p. 62.

premises and conclusion are actually about. That is, various unintended interpretations will also be such that if their premises are true, then so too will be the conclusion. Recall what Hilbert said about barstools and beer mugs. What this does, though, is impose what John Burgess calls the paradox of rigor, i.e., the observation that a treatment of a given subject matter that is genuinely rigorous will ipso facto cease to be a treatment of that subject matter (alone). It will always be equally a treatment of any other subject matter where conditions alike in logical form to the postulates in question are satisfied.53

Following Tennant [2000b], define monomathematics as the mathematics of a unique structure. Familiar examples include the theory of the natural numbers, the theory of the rationals, and the theory of the real-number line. In each case, the mathematician has a unique structure in mind (often called the intended structure), and is trying to articulate as comprehensively as he can the interesting truths about it. He wants to exhibit them all as true statements concerning the intended structure.54

Define polymathematics as the mathematics of structures enjoying some definable structural affinity. Familiar examples include groups, rings, and topological spaces. In each case, the mathematician has a variety of structures in mind, which all have one crucial thing in common. They all satisfy a particular collection of axioms pinning down certain structural features. . . In each case here, the mathematician is interested not only in the logical consequences of the (underspecified) set of axioms characteristic of these structures, but also in various embeddings of any one structure into others within the same [family] of structures, and in the invariants of such embeddings. The mathematics here is, as it were, more a brand of [model theory] for the axiom set in question.55

With these characterizations in mind, rigorization, and the giving up of the requirement that postulates be self-evident, means that “foundationalism”, in the monomathematical sense, had to go. What is meant by foundations now is something like “starting point”, where axioms are often justified by something like their verifiable consequences, not by self-evidence. According to Burgess, the foundationalism characterized by Fregean self-evidence almost inevitably had to go, if one thinks mathematics needs a unified framework for all branches. For to accommodate all branches of current and future mathematics will surely require a framework that is less evident than the basic principles of the traditional branches. Burgess calls this insight the paradox of foundations, namely,

53Burgess [2015], p. 65. 54Tennant [2000b], p. 259. 55Tennant [2000b], p. 259.

the observation that anything that is sufficient as a foundation for all of mathematics (including all of its newest branches), will ipso facto fail to be a foundation for some of mathematics (including some of its oldest branches).56

Foundations here, though, is being used equivocally between something like ‘framework’ and something like ‘support’. The important point is this:

What the paradox points to is the fact that there is almost bound to be some kind of trade-off between power or comprehensiveness, on the one hand, and intrinsic evidence or certitude, on the other hand.57

It is this tension that, I take it, underscores much of the disagreement between Frege and Hilbert, and, perhaps more importantly, places Hilbert at the crossroads between a more traditional Fregean conception of what mathematics is/is about and the mainstream contemporary views, and foreshadows the ultimate failure of Hilbert’s mature program. Set theory eventually emerged as a “foundation” in the sense of Maddy’s arena, quoted above, because of its ability to adequately rigorize mathematics by providing a commodious framework that (1) is ontologically ‘monotypical’ in that sets are things of the same abstract kind; (2) is conceptually fertile by providing definitions of all mathematical concepts; (3) accommodates traditional number theory; and (4) accommodates all methods for constructing new spaces (or number systems or whatever) from old ones, while maintaining all their actually existing and all their potential future applications to the various branches of mathematics.

[T]he value of formal unification within an agreed framework [lies in] facilitating material unification of mathematics, by permitting the mathematician working in one branch to draw on results from the widest range of other branches, should they seem relevant, without having to worry that those results may ultimately be based on different and incompatible principles.58

Roughly, for a mathematical proof to be a genuine proof is for it to be possible in principle to flesh it out into a formal proof within ZFC.

In addition to indirectness of method and interconnectedness of branches, Naturalistic Holism emerged from the failure, due to the Incompleteness Theorems, of the more traditional Foundationalist picture. Recall the quote from Shapiro on page 34. But while the

56Burgess [2015], pp. 66-67. 57Burgess [2015], p. 67. 58Burgess [2015], p. 117.

incompleteness theorems give a particular example undercutting the Hilbertian aspiration, they also shed light on the more general phenomenon of independence from axiom systems. Perhaps most notable is the independence of the Continuum Hypothesis (CH) from the axioms of ZFC. Given the independence phenomenon, there has been much work done attempting to find new axioms for set theory. But what counts as an acceptable reason for adopting a new axiom? Under which circumstances is it considered epistemically acceptable, or even responsible, to adopt a new axiom? Following Gödel we can distinguish between intrinsic reasons and extrinsic reasons. Intrinsic reasons say that we can extend the current axioms with others that are natural, non-arbitrary continuations of them. Extrinsic reasons go beyond this. Gödel’s program of searching for large cardinals to settle the continuum hypothesis is a good example of searching for extrinsic reasons. Given Gödel’s Platonism, he thought that the CH had a definite truth value. In an attempt to determine it, Gödel’s plan was to search for new axioms “based on hitherto unknown principles...which a more profound understanding of the concepts underlying logic and mathematics would enable us to recognize as implied by these concepts.”59 In addition to this, one might, as mentioned above, take abundant verifiable consequences that ought to be accepted as reason to accept a new axiom, akin to the practice of science. These are both examples of extrinsic justification.

Penelope Maddy has come up with her own naturalized philosophy of mathematics aimed at validating extrinsic justification. The focus of chapter 7 is Maddy’s version of Naturalism. The chapter opens by discussing Quinean naturalism and then shifts gears to Maddy’s modification. The rest of the chapter is devoted to providing a critique of Maddy.
Central to the critique is the distinction mentioned above between two different ways of conceiving of what exactly a foundation for mathematics is/ought to be. We end the dissertation with a brief discussion of the a priori and its role in bridging the gap between mathematics and philosophy.

59Gödel [1947/64], p. 182.

Chapter 2

Recurring Technical Concepts

In this chapter we present some technical concepts that will recur throughout the dissertation.

2.1 PRA

The following definition of PRA, primitive recursive arithmetic, is from Simpson [2009].

The language of PRA is a first-order language with equality. In addition to the 2-place predicate symbol =, it contains a constant symbol 0, number variables x_0, x_1, ..., x_n, ... (n < ω), 1-place operation symbols Z and S, k-place operation symbols P^k_i for each i and k with 1 ≤ i ≤ k < ω, and additional operation symbols, which are introduced as follows. If g is an m-place operation symbol and h_1, ..., h_m are k-place operation symbols, then f = C(g, h_1, ..., h_m) is a k-place operation symbol. If g is a k-place operation symbol and h is a (k+2)-place operation symbol, then f = R(g, h) is a (k+1)-place operation symbol. The operation symbols of the language of PRA are called primitive recursive function symbols.

The intended model of PRA consists of the nonnegative integers, ω = {0, 1, 2, ...}, together with the primitive recursive functions. In detail, the number variables range over ω and we interpret = as equality on ω, 0 as 0, Z as the constant zero function Z defined by Z(x) = 0, S as the successor function S defined by S(x) = x + 1, P^k_i as the projection function P^k_i defined by

P^k_i(x_1, ..., x_k) = x_i,

C(g, h_1, ..., h_m) as the function f defined by composition as

f(x_1, ..., x_k) = g(h_1(x_1, ..., x_k), ..., h_m(x_1, ..., x_k)),

and R(g, h) as the function f defined by primitive recursion as

f(0, x_1, ..., x_k) = g(x_1, ..., x_k),
f(y + 1, x_1, ..., x_k) = h(y, f(y, x_1, ..., x_k), x_1, ..., x_k).

The axioms of PRA are as follows. We have the usual axioms for equality. We have the usual axioms for 0 and the successor function:

Z(x) = 0,  S(x) = S(y) → x = y,  x ≠ 0 ↔ ∃y (S(y) = x).

We have defining axioms for the projection functions:

P^k_i(x_1, ..., x_k) = x_i.

For each function f = C(g, h_1, ..., h_m) given by composition, we have a defining axiom

f(x_1, ..., x_k) = g(h_1(x_1, ..., x_k), ..., h_m(x_1, ..., x_k)).

For each function f = R(g, h) given by primitive recursion, we have defining axioms

f(0, x_1, ..., x_k) = g(x_1, ..., x_k),
f(S(y), x_1, ..., x_k) = h(y, f(y, x_1, ..., x_k), x_1, ..., x_k).

Finally we have the schema of primitive recursive induction:

(θ(0) ∧ ∀x (θ(x) → θ(S(x)))) → ∀x θ(x),

where θ(x) is any quantifier-free formula in the language of PRA with a distinguished free number variable x. We define PRA, primitive recursive arithmetic, to be the formal system with the above axioms.
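The generating clauses above translate directly into code. The following sketch (all names are illustrative, not drawn from any library) renders Z, S, the projections, composition C, and primitive recursion R as Python combinators, and then defines addition by primitive recursion in exactly the shape of the defining axioms.

```python
# The function symbols of PRA as Python combinators: Z (zero),
# S (successor), P(k, i) (the projection P^k_i), C (composition)
# and R (primitive recursion).  Illustrative names only.

def Z(x):
    """The constant zero function: Z(x) = 0."""
    return 0

def S(x):
    """The successor function: S(x) = x + 1."""
    return x + 1

def P(k, i):
    """The projection P^k_i(x1, ..., xk) = xi (1-indexed, as above)."""
    return lambda *xs: xs[i - 1]

def C(g, *hs):
    """Composition: f(x1..xk) = g(h1(x1..xk), ..., hm(x1..xk))."""
    return lambda *xs: g(*(h(*xs) for h in hs))

def R(g, h):
    """Primitive recursion:
         f(0, x1..xk)     = g(x1..xk)
         f(y + 1, x1..xk) = h(y, f(y, x1..xk), x1..xk)."""
    def f(y, *xs):
        acc = g(*xs)
        for i in range(y):
            acc = h(i, acc, *xs)
        return acc
    return f

# Addition by primitive recursion: add(0, x) = x; add(y + 1, x) = S(add(y, x)).
add = R(P(1, 1), C(S, P(3, 2)))
```

Here `add(3, 4)` unwinds the recursion equations three times, applying S at each step, just as the defining axioms dictate.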

2.2 Subsystems of Second-Order Arithmetic

The following definitions of Z2, Second-Order Arithmetic, and its subsystems are from Simp- son [2009].

The language of Z2 (L2) has two types of variables: number variables and set variables. These are, respectively, those ranging over the natural numbers (ω = {0, 1, 2, ...}) and those ranging over subsets of the natural numbers (P(ω)). The numerical terms consist of number variables, the constant symbols 0 and 1, and t1 + t2 and t1 · t2, where t1 and t2 are numerical terms. As usual, + and · are the binary operations for addition and multiplication. Atomic formulas are t1 = t2, t1 < t2 and t1 ∈ X, where t1 and t2 are numerical terms and X is a set variable. As usual, =, <, and ∈ are the binary predicates for equals, less than, and is a member of. Formulas are built up from atomic formulas by means of the usual logical connectives (¬, ∧, ∨, →, ↔), number quantifiers (∀n, ∃n) and set quantifiers (∀X, ∃X). The formation rules for the connectives and quantifiers are as usual. A sentence is a formula with no free variables.

The intended model for L2 is

(ω, P(ω), +, ·, 0, 1, <),

where ω is the set of natural numbers, P(ω) the set of all subsets of ω, and +, ·, 0, 1, < are as usual. The axioms of Z2 consist of the universal closures of the following formulas of the second-order language:

1. Basic axioms:

n + 1 ≠ 0
m + 1 = n + 1 → m = n
m + 0 = m
m + (n + 1) = (m + n) + 1
m · 0 = 0
m · (n + 1) = (m · n) + m
¬(m < 0)
m < n + 1 ↔ (m < n ∨ m = n)

2. Induction axiom:

(0 ∈ X ∧ ∀n (n ∈ X → n + 1 ∈ X)) → ∀n (n ∈ X)

3. Comprehension scheme:

∃X ∀n (n ∈ X ↔ φ(n)),

where φ(n) is any formula of L2 in which X does not occur freely.

Second Order Arithmetic is the formal system in the language L2 consisting of the axioms of second order arithmetic, together with all the formulas of L2 which are deducible from those axioms by means of the usual logical axioms and rules of inference.

The standard subsystems of Z2 (the “Big Five”) are, in increasing order of strength:

RCA0, WKL0, ACA0, ATR0, Π^1_1-CA0.

2.2.1 RCA0

The acronym RCA stands for recursive comprehension axiom. This is because RCA0 contains axioms asserting the existence of any set A which is recursive in given sets B_1, ..., B_k (i.e., such that the characteristic function of A is computable assuming oracles for the characteristic functions of B_1, ..., B_k). The subscript 0 denotes restricted induction. The Σ^0_1 induction scheme is the restriction of the second-order induction scheme to L2-formulas φ(n) which are Σ^0_1, i.e., it is the universal closure of

(φ(0) ∧ ∀n (φ(n) → φ(n + 1))) → ∀n φ(n),

where φ(n) is any Σ^0_1 formula of L2. The ∆^0_1 comprehension scheme consists of (the universal closures of) all formulas of the form

∀n (φ(n) ↔ ψ(n)) → ∃X ∀n (n ∈ X ↔ φ(n)),

where φ(n) is any Σ^0_1 formula, ψ(n) is any Π^0_1 formula, n is any number variable, and X is a set variable which does not occur freely in φ(n). RCA0 is the subsystem of Z2 consisting of the basic axioms, the Σ^0_1 induction scheme, and the ∆^0_1 comprehension scheme. We note that ∆^0_1 formulae define recursive sets and Σ^0_1 formulae define recursively enumerable sets. Also of importance is that the first-order part of RCA0 is PA with induction restricted to Σ^0_1 formulae.
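The closing remark, that Σ^0_1 formulae define recursively enumerable sets while ∆^0_1 formulae define recursive ones, can be illustrated computationally. In the sketch below the matrix θ is an illustrative stand-in (membership of n in the set of perfect squares); nothing beyond the standard library is assumed.

```python
# A Sigma^0_1 set {n : exists k. theta(n, k)} with decidable matrix
# theta is semi-decidable: search for a witness k.  When the complement
# is also Sigma^0_1 (so the set is Delta^0_1), dovetailing the two
# witness searches yields a genuine decision procedure.

def theta(n, k):          # a witness k that n IS in the set
    return k * k == n

def theta_comp(n, k):     # a witness k that n is NOT in the set
    return k * k > n      # (sound here because decide() tries theta first)

def semi_decide(n, bound=10**6):
    """Sigma^0_1 membership: search for a witness.  In general this may
    run forever on non-members; here we cut off at an artificial bound."""
    for k in range(bound):
        if theta(n, k):
            return True
    return None           # no verdict: a semi-decision only

def decide(n):
    """Delta^0_1 membership: run both witness searches in parallel;
    exactly one of them must eventually succeed."""
    k = 0
    while True:
        if theta(n, k):
            return True
        if theta_comp(n, k):
            return False
        k += 1
```

The point of the contrast: `semi_decide` can only ever say "yes", while the parallel search in `decide` is guaranteed to halt with a verdict either way.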

2.2.2 WKL0

Let {0, 1}^<N (2^<N) denote the full binary tree, i.e., the set of (codes for) finite sequences of 0’s and 1’s. Weak König’s Lemma says that every infinite subtree of 2^<N has an infinite path. WKL0 is defined to be the subsystem of Z2 consisting of RCA0 plus weak König’s Lemma. WKL0 is much stronger than RCA0 with respect to both ω-models and mathematical practice. It is, however, of the same strength as RCA0 in a proof-theoretic sense: the first-order part of WKL0 is the same as that of RCA0, viz. PA with induction restricted to Σ^0_1 formulae. Another important result is that WKL0 is conservative over PRA with respect to Π^0_2 sentences. In particular, given a Σ^0_1 formula φ(m, n) and a proof of ∀m ∃n φ(m, n) in WKL0, one can find a primitive recursive function f : ω → ω such that φ(m, f(m)) holds for all m ∈ ω. We look at this result closely in chapter 6.
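The path-construction idea behind weak König’s Lemma can be sketched in code: starting from the root, repeatedly step to a child whose subtree is still infinite. The catch, and the reason WKL0 outruns RCA0 with respect to ω-models, is that the "subtree is infinite" oracle is not computable in general. The sketch below (illustrative names throughout) uses a decidable example tree, the 0/1 strings with no two consecutive 1s, for which that oracle happens to be trivial.

```python
# Sketch of the path construction behind weak Koenig's Lemma.  Given an
# infinite subtree of 2^<N, an infinite path is obtained by always
# moving to a child whose subtree is still infinite.  The oracle
# `subtree_infinite` is NOT computable for arbitrary trees; for this
# example tree a node is extendible iff it is in the tree, since we
# may always append a 0.

def in_tree(node):
    """Membership in the example tree: no two consecutive 1s."""
    return all(not (a == 1 and b == 1) for a, b in zip(node, node[1:]))

def subtree_infinite(node):
    """Oracle: does `node` have arbitrarily long extensions in the tree?
    Trivial here; non-computable in general."""
    return in_tree(node)

def path_prefix(n):
    """The first n bits of an infinite path through the tree."""
    node = []
    for _ in range(n):
        for bit in (0, 1):              # prefer the leftmost viable child
            if subtree_infinite(node + [bit]):
                node = node + [bit]
                break
    return node
```

For this particular tree the greedy construction always takes the 0-branch; with a different (in general non-computable) oracle, the same loop would trace a path through any infinite subtree.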

2.2.3 ACA0

The acronym ACA stands for arithmetical comprehension axiom. This is because ACA0 contains axioms asserting the existence of any set which is arithmetically definable from given sets. The subscript 0 denotes restricted induction. The arithmetical comprehension scheme is the restriction of the comprehension scheme to arithmetical formulas φ(n), where a formula of L2 is said to be arithmetical if it contains no set quantifiers. Thus we have the universal closure of

∃X ∀n (n ∈ X ↔ φ(n)),

where φ(n) is a formula of L2 which is arithmetical and in which X does not occur freely. ACA0 is the subsystem of Z2 whose axioms are the arithmetical comprehension scheme, the induction axiom, and the basic axioms. The induction axiom and the arithmetical comprehension scheme imply the arithmetical induction scheme

(φ(0) ∧ ∀n (φ(n) → φ(n + 1))) → ∀n φ(n)

for all L2-formulas φ(n) which are arithmetical. It is also worth noting that ACA0 is a conservative extension of first-order arithmetic (equivalently, PA is the first-order part of ACA0), in particular sharing with PA the proof-theoretic ordinal ε0.

2.2.4 ATR0

The acronym ATR stands for arithmetical transfinite recursion. Informally, arithmetical transfinite recursion can be described as the assertion that the Turing jump operator can be iterated along any countable well-ordering, starting at any set. More formally, consider an arithmetical formula θ(n, X) with a free number variable n and a free set variable X. Note that θ(n, X) may also contain parameters, i.e., additional free number and set variables. Fixing these parameters, we may view θ as an “arithmetical operator” Θ : P(N) → P(N), defined by

Θ(X) = {n ∈ N : θ(n, X)}.

Now let A, <_A be any countable well-ordering, and consider the set Y obtained by transfinitely iterating the operator Θ along A, <_A. This set Y is defined by the following conditions: Y ⊆ N × A and, for each a ∈ A, Y_a = Θ(Y^a), where Y_a = {m : (m, a) ∈ Y} and Y^a = {(n, b) : n ∈ Y_b ∧ b <_A a}. Thus, for each a ∈ A, Y^a is the result of iterating Θ along the initial segment of A, <_A up to but not including a, and Y_a is the result of applying Θ one more time. Arithmetical transfinite recursion is the axiom scheme asserting that such a set Y exists, for every arithmetical operator Θ and every countable well-ordering A, <_A.

ATR0 is the subsystem of Z2 consisting of ACA0 plus the scheme of arithmetical transfinite recursion.
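A finite toy can make the stage-by-stage definition concrete: iterate an operator Θ along a (here: finite, hence trivially well-ordered) order, taking each stage Y_a to be Θ applied to the earlier stages. The operator below is an illustrative stand-in, not an arithmetical operator in the official sense, and real ATR concerns arbitrary countable well-orderings.

```python
# A finite toy of arithmetical transfinite recursion: build Y_a =
# Theta(Y^a) stage by stage along an order.  In the official definition
# Y^a keeps the stage tags (pairs (n, b)); we flatten to a plain union,
# which is harmless for this particular operator.

N = range(10)  # a finite "universe" standing in for the naturals

def theta(X):
    """Example operator: 0, together with the successor of anything in X."""
    return {n for n in N if n == 0 or (n - 1) in X}

def iterate_along(order):
    """Return {a: Y_a} with Y_a = theta(union of the earlier stages)."""
    stages = {}
    for a in order:
        earlier = set().union(*stages.values())
        stages[a] = theta(earlier)
    return stages

Y = iterate_along([0, 1, 2])   # stages {0}, {0, 1}, {0, 1, 2}
```

Each pass through the loop plays the role of "applying Θ one more time" at the next point of the well-ordering.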

2.2.5 Π^1_1-CA0

A formula φ is said to be Π^1_1 if it is of the form ∀X θ, where X is a set variable and θ is an arithmetical formula. A formula is said to be Σ^1_1 if it is of the form ∃X θ, where X is a set variable and θ is an arithmetical formula. Π^1_1-CA0 is the subsystem of Z2 whose axioms are the basic axioms, the induction axiom, and the comprehension scheme restricted to L2-formulas φ(n) which are Π^1_1. Thus we have the universal closure of

∃X ∀n (n ∈ X ↔ φ(n))

for all Π^1_1 formulas φ(n) in which X does not occur freely.

2.3 Gödel’s Incompleteness Theorems

The following presentation of Gödel’s Incompleteness Theorems follows Smith [XXXX].

2.3.1 Gödel’s First Incompleteness Theorem

Call a theory T adequately strong if whenever m ≠ n, ⊢_T m ≠ n; T adequately represents all recursive relations and functions; and T contains first-order logic.1 All the theories considered in this dissertation are of adequate strength.

First Incompleteness Theorem. Suppose T is adequately strong and ⊢_T G ↔ ¬Prov_T(⌜G⌝). Then (i) if T is consistent, ⊬_T G, and (ii) if T is ω-consistent, then ⊬_T ¬G.

Proof. (i) Suppose for reductio that ⊢_T G. Then for some n, Prf_T(n, ⌜G⌝), and because Prf is represented in T, ⊢_T Prf_T(n, ⌜G⌝). So ⊢_T ∃x Prf_T(x, ⌜G⌝), i.e. ⊢_T Prov_T(⌜G⌝). But if ⊢_T G, then by the diagonal lemma ⊢_T ¬Prov_T(⌜G⌝), contradicting consistency. So ⊬_T G. (ii) Suppose for reductio that ⊢_T ¬G. Because T is consistent, there is no proof of G, i.e. for all n it is not the case that Prf_T(n, ⌜G⌝). And because Prf is representable, ⊢_T ¬Prf_T(n, ⌜G⌝) for all n. But by the diagonal lemma, ⊢_T Prov_T(⌜G⌝), i.e. ⊢_T ∃x Prf_T(x, ⌜G⌝). So T is ω-inconsistent. So if T is ω-consistent, ⊬_T ¬G. □
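The self-reference engineered by the diagonal lemma, a sentence G that talks about its own code ⌜G⌝, has a familiar programming analogue: a quine, a program whose output is its own source text, built by the very same trick of applying a self-describing template to itself. A minimal Python illustration (not part of Smith's presentation):

```python
# A quine via diagonalization.  `template` describes a two-line program
# of the form "template = <repr of template>; print the template filled
# in with itself".  Filling the template with itself yields the program
# `src`, and running `src` prints exactly `src` -- just as the diagonal
# lemma feeds a self-describing formula its own code.

import contextlib
import io

template = 'template = {!r}\nprint(template.format(template))'
src = template.format(template)

# Check the fixed-point property: executing src reproduces src.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    exec(src)
assert buf.getvalue().rstrip('\n') == src
```

The final assertion is the programming counterpart of the fixed-point equivalence G ↔ ¬Prov_T(⌜G⌝): the object produced (the program's output, the sentence's content) coincides with the object described.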

2.3.2 Gödel’s Second Incompleteness Theorem

To prove the Second Incompleteness Theorem we need the following Derivability Conditions:

DC1. If ⊢_T φ then ⊢_T Prov_T(⌜φ⌝).

DC2. ⊢_T Prov_T(⌜φ → ψ⌝) → (Prov_T(⌜φ⌝) → Prov_T(⌜ψ⌝)).

DC3. ⊢_T Prov_T(⌜φ⌝) → Prov_T(⌜Prov_T(⌜φ⌝)⌝).

Also needed is the Diagonal Lemma: ⊢_T G ↔ ¬Prov_T(⌜G⌝).

The idea for proving the second incompleteness theorem is this: Any theory of the sort T that we are interested in will prove 0 ≠ 1. Thus, if a theory is consistent then it will not prove 0 = 1. A natural consistency statement for the theory is then ¬∃x Prf_T(x, ⌜0 = 1⌝). This is representable by ¬Prov_T(⌜0 = 1⌝). So let Con_T abbreviate ¬Prov_T(⌜0 = 1⌝). Now the first incompleteness theorem tells us that if T is consistent then T does not prove G.

1For more details see, e.g., Boolos et al. [2007], or Enderton [2001].

And given that Con_T is true just in case T actually is consistent, we ought to be able to formalize a proof of Con_T → ¬Prov_T(⌜G⌝) in T. Moreover, we might expect to be able to actually prove this in T, i.e.

T ⊢ Con_T → ¬Prov_T(⌜G⌝).

By the diagonal lemma we know that T ⊢ G ↔ ¬Prov_T(⌜G⌝), and hence T ⊢ Con_T → G. And we know that if T is consistent then T ⊬ G. So, assuming T is adequately strong, as we did earlier, if T is consistent then T ⊬ Con_T. So far so good. What the derivability conditions allow us to do is prove all this in T, assuming T is adequately strong.

Second Incompleteness Theorem. Suppose T is consistent, Prov_T is a provability predicate satisfying the derivability conditions, G a fixed point for ¬Prov_T, and Con_T =def ¬∃x Prf_T(x, ⌜0 = 1⌝), i.e. ¬Prov_T(⌜0 = 1⌝). It follows that T ⊢ Con_T → ¬Prov_T(⌜G⌝), and hence that T ⊬ Con_T.

Proof.

1. ⊢_T G → ¬Prov_T(⌜G⌝)  [Given]
2. ⊢_T Prov_T(⌜G → ¬Prov_T(⌜G⌝)⌝)  [1, DC1]
3. ⊢_T Prov_T(⌜G⌝) → Prov_T(⌜¬Prov_T(⌜G⌝)⌝)  [2, DC2]
4. ⊢_T ¬φ ↔ (φ → 0 = 1)  [Given]
5. ⊢_T Prov_T(⌜¬φ⌝) ↔ Prov_T(⌜φ → 0 = 1⌝)  [4, DC1, DC2]
6. ⊢_T Prov_T(⌜¬Prov_T(⌜G⌝)⌝) → Prov_T(⌜Prov_T(⌜G⌝) → 0 = 1⌝)  [Instance of 5]
7. ⊢_T Prov_T(⌜G⌝) → Prov_T(⌜Prov_T(⌜G⌝) → 0 = 1⌝)  [3, 6]
8. ⊢_T Prov_T(⌜G⌝) → (Prov_T(⌜Prov_T(⌜G⌝)⌝) → Prov_T(⌜0 = 1⌝))  [7, DC2]
9. ⊢_T Prov_T(⌜G⌝) → Prov_T(⌜Prov_T(⌜G⌝)⌝)  [Instance of DC3]
10. ⊢_T Prov_T(⌜G⌝) → Prov_T(⌜0 = 1⌝)  [8, 9]
11. ⊢_T ¬Prov_T(⌜0 = 1⌝) → ¬Prov_T(⌜G⌝)  [10, contraposition]
12. ⊢_T Con_T → ¬Prov_T(⌜G⌝)  [11, definition of Con_T]

By appealing to the reasoning immediately before the theorem, it follows that T ⊬ Con_T. □

2.4 Proof-Theoretical Reduction

Following Feferman [1988a], [1993b], let T1 and T2 be formal axiomatic theories, and let L1 and L2 be the languages of T1 and T2, respectively. Then L1 ∩ L2 is the common part of the two languages. We take PRA to be included in all the theories considered, so L_PRA ⊆ L1 ∩ L2 and PRA ⊆ T1 ∩ T2. The basic idea of a proof-theoretical reduction of T1 to T2, written T1 ≤ T2, is that we have an effective method f that transforms any proof p in T1 of a closed equation into a proof f(p) in T2 of the same formula. Moreover, this can itself be proven in T2. More generally, we can say that T1 is proof-theoretically reducible to T2 for Φ, written f : T1 ≤ T2[Φ], if f is a partial recursive function such that

(1) whenever Pr_T1(x, y) and x is (the code of) a formula φ ∈ Φ, then f(y) is defined and Pr_T2(x, f(y)), and

(2) the formalization of (1) is provable in T2.

We say that T1 is proof-theoretically equivalent to T2, written T1 ≡ T2, when T1 ≤ T2 and T2 ≤ T1. We say that T1 is conservative over T2 for Φ just in case

for all φ ∈ Φ, T1 ⊢ φ ⇒ T2 ⊢ φ.

Clearly, then, if T1 ≤ T2[Φ] then T1 is conservative over T2 for Φ. It also follows immediately that

T1 ≤ T2 ⇒ T2 ⊢ (Con_T2 → Con_T1).

One can also establish a result attributed to Kreisel such that

T ⊢ φ ⇒ PRA + Con_T ⊢ φ,

for φ ∈ Π^0_1 and T some theory. And since in practice the f in our definition of f : T1 ≤ T2 is primitive recursive, it can be established by means of Kreisel’s result that relative consistency can be established within PRA, i.e.2

PRA ⊢ Con_T2 → Con_T1.

It should be noted that conservativity and proof-theoretic reduction are different notions. While a proof-theoretic reduction of T1 to T2 guarantees that we have a conservative extension, the converse is not guaranteed. The important difference concerns whether the conservation is provable in T2. It so happens that typically having conservation allows one also to establish reducibility, but as a matter of methodology there is an epistemic and foundational advantage to having a reduction.3

2For details about these last results see Smorynski [1977], pp. 858-859 and Feferman [1988a], pp. 368-369. 3Feferman [1988a], p. 369.

Chapter 3

Detlefsen

Gödel’s Incompleteness Theorems are standardly taken as showing that Hilbert’s Program, as intended, fails. Michael Detlefsen maintains that they do not. The argument from the first incompleteness theorem, as presented by Detlefsen, takes the form of a dilemma to the effect that either the infinitistic theory is incomplete with respect to a certain subclass of real sentences or it is not a conservative extension over the finitistic theory. He contends that Hilbert need not be committed to either of these horns and, as such, the argument from the first incompleteness theorem does no damage to Hilbert’s program. His argument against the second incompleteness theorem as refuting Hilbert’s Program, what he calls the stability problem, concerns the particular formalization of the consistency statement shown unprovable by Gödel’s theorem, and endorses what are called Rosser systems. The success of Detlefsen’s arguments critically depends upon the precise characterization of what exactly Hilbert’s program is. It is our contention that despite Detlefsen’s attempts, both of the arguments (from the first and second incompleteness theorems) are devastating to Hilbert. The view that Detlefsen puts forth is better understood as a modified version of Hilbert’s general program cast as a particularly strict form of instrumentalism. We end by analyzing the coherence of Detlefsen’s proposal, independently of the historical Hilbert.

3.1 Introduction

In this chapter we look at two arguments put forward by Michael Detlefsen to the effect that neither one of Gödel’s two Incompleteness Theorems is sufficient to block Hilbert’s Program. Both of Detlefsen’s arguments appeal to the conception of Hilbert as an instrumentalist. On this view, the infinite is merely an instrument to be used to reason about the finite. The whole goal of Hilbert’s consistency proof was to justify the use of such infinitistic methods as instruments. The general picture of Instrumentalism (with respect to a theory) is as follows: The “epistemic potency”

of a theory S can be accounted for without taking the elements of the theory literally. The idea is that a theory can be valuable and epistemically useful, i.e. help us arrive at justified beliefs and knowledge, without requiring us to have the same epistemic attitudes toward the theory itself. The sentences and proofs that enable us to arrive at such epistemic conclusions are what Detlefsen calls “inference tickets”. These “inference tickets” allow us to arrive at an epistemic attitude toward a given proposition without us having any epistemic attitude toward the inference ticket itself. According to Detlefsen, Hilbert was interested in “computational” instruments. Computational instruments differ from “genuine” proofs, the latter being characterized by means of the truth of premises and the truth-preserving nature of the inferences. The former enable us to have a belief in the last line of the computation by means of syntactic transformations that are in some sense reliable, without having the same sort of belief about the computation itself; we only care about the terminal formula being true. As far as this relates to Hilbert’s Program, the differences between what Detlefsen calls genuine proofs and computational instruments roughly correspond to Hilbert’s notions of real proof and ideal proof, respectively. The epistemic value of the real statements derives from their evidentness, while that of the ideal statements derives from the role that they play in calculations. The ideal statements and ideal forms of reasoning are thus instrumental in furnishing proofs of the real statements. They are the inference tickets by whose means we can arrive at meaningful and contentful real statements without having any epistemic attitude toward the ideals themselves. It is important that the ideal methods do not prove any real statements refutable by the real theory (i.e. they are real-sound in Detlefsen’s terminology), and that, in order to be truly instrumental, they be more efficient than their real counterparts. On the instrumentalist view, the ideal methods are merely convenient devices that simplify and unify proofs of real statements. They increase our epistemic acquisition of long and complex real proofs of real statements by means that we are able to actually comprehend. This, for Detlefsen, is their epistemic value.

[I]t is its ability to simplify which is the source of the epistemic efficiency of the ideal methods. By simplifying our thought, the use of ideal elements unifies it. And in unifying thought, more of our cognitive or epistemic concerns are brought under the purview of a manageably complex method of epistemic acquisition. In a nutshell, the use of ideal elements widens the scope of our methods of epistemic acquisition without complicating those methods to such an extent as would make them humanly unfeasible. Similar extension of scope using purely contentual methods would be impossible since it would require methods too complicated to be humanly practicable.1

The supplementation of the real by the ideal is what is in need of justification, and this is the job of the consistency proof. All this might be taken as reason to think that Hilbert endorsed a form of general instrumentalism. In “On the Infinite”, Hilbert writes

For there is a condition, a single but absolutely necessary one, to which the use of the method of ideal elements is subject, and that is the proof of consistency; for, extension by the addition of ideals is legitimate only if no contradiction is thereby brought about in the old, narrower domain, that is, if the relations that result for the old objects whenever the ideal objects are eliminated are valid in the old domain.2

Moreover, Hilbert clearly thought that mathematics was the discipline that lay at the foundation of all human thought. It was to be a tool used to characterize all branches of knowledge. In particular, Hilbert was interested in “the vexed question about the share which thought, on the one hand, and experience, on the other, have in our knowledge.” It was mathematics, he maintained, that is “the instrument that mediates between theory and practice, between thought and observation” and that it [mathematics] “builds the connective bridges and makes them even sounder.” He goes on to say that “...our entire modern culture, in so far as it rests on the penetration and utilization of nature, has its foundation in mathematics.”3 So it at least seems reasonable that some sort of instrumentalism would be accepted by Hilbert. It is at least true that ideal proofs help to simplify and unify finitary thought. And, on a modest conception of instrumentalism, this much is taken as true. It therefore seems reasonable to think that Hilbert would have at least endorsed a version of modest instrumentalism. With this background in place we are now ready to consider Detlefsen’s arguments.

1Detlefsen [1986], p. 8. 2Hilbert [1925], p. 383. 3Hilbert [1930], p. 1163, as quoted in Sieg [2015], p. 3.

3.2 The argument from the first incompleteness theorem

The version of the argument from the first incompleteness theorem against Hilbert’s Program that Detlefsen [1990] engages is one due to Craig Smorynski. Smorynski’s argument is quick, and requires some unpacking for proper analysis. Here is the argument:

Hilbert’s Programme can be described thus: There are two systems, nowadays called formal theories, S and T of mathematics. S consists of the finite, meaningful statements and methods of proof and T the transfinite, idealized such statements and methods. The goal is to show that, for any meaningful assertion G, if T $ G then S $ G. Moreover, this is to be shown in the system S. Gödel destroyed Hilbert’s Programme with his First Incompleteness Theorem by which he produced a sentence satisfying a sufficiently narrow criterion of meaningfulness and which, though readily recognized as true – hence a theorem of the transfinite system T, was unprovable in S. In short, he produced a direct counterexample to Hilbert’s desired conservation result.4

In the rest of §3.2 we will refer to the real (or finitist) theory as ‘F’, and to the ideal (or transfinite) theory as ‘T’. One might wonder why the fact that the Gödel sentence is readily recognizable as true ought to entail that it is a theorem of the transfinite system T. What kind of a requirement on T is being assumed here? According to Detlefsen, Smorynski is thinking of T as the kind of theory that we should hope for, given Hilbert’s claims to foundational certitude. It is supposed to act as a norm for ideal theorizing whereby it proves those real statements that it ought to prove, and the sentences it ought to prove at least include those ‘readily recognized as true’.5 Though it is unclear precisely what is meant by ‘readily recognized as true’, whatever it means seems to apply to the (canonical) Gödel sentence, because it “says of itself that it is unprovable” and is indeed unprovable (in the system). It is thus easily recognized as true and so ought to be provable in the ideal system. Smorynski is thus taking as given a certain completeness constraint on the ideal theory with respect to this certain class of real sentences. Following Detlefsen, call a theory Smorynski-complete if it is complete with respect to the real sentences readily recognizable as true. Assuming that the ideal theory is complete in this sense, the claim is straightforward.

4Smorynski [1985], pp. 3–4, as quoted in Detlefsen [1990], p. 357. Detlefsen’s quote is not verbatim, having changed Smorynski’s ‘φ’ to ‘G’, but obviously nothing hinges on this difference.
5Detlefsen [1990], p. 357.

Because T is Smorynski-complete, T ⊢ G. But by G1, F ⊬ G. So T is not conservative over F, which it is supposed to be according to Hilbert’s Program. The program is thus said to fail. The argument, then, is that, given Gödel’s theorem, if the ideal theory is complete with respect to these recognizably true real sentences, then it fails to be a conservative extension of the real theory. Said otherwise, either T is incomplete with respect to this class of real sentences, or it fails to be a conservative extension of F. And both horns of this dilemma are, according to Smorynski, problematic for Hilbert’s program. Let’s call this dilemma Hilbert’s Dilemma.

Hilbert’s Dilemma: Either T is incomplete with respect to the class of readily recognizable true real sentences, or it fails to be a conservative extension of F.
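The reasoning behind the dilemma can be set out schematically (my reconstruction, not Smorynski’s own notation; G is the canonical Gödel sentence for F):

```latex
% Reconstruction of the argument for Hilbert's Dilemma:
%   If T is Smorynski-complete, then since G is readily recognized
%   as true, T proves G; but by G1 (assuming F consistent), F does not.
\begin{align*}
&\text{(i)}   && T \vdash G  && \text{(Smorynski-completeness)}\\
&\text{(ii)}  && F \nvdash G && \text{(G\"odel's First Theorem)}\\
&\text{(iii)} && T \text{ is not conservative over } F && \text{(from i, ii)}
\end{align*}
```

Rejecting (i) embraces the first horn of the dilemma; accepting (iii) embraces the second.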

There are certain things that Hilbert says that seem to support the view that he would have endorsed both of these horns. Taken this way, the argument is, it seems, clearly devastating. To argue against Smorynski, then, one needs to suggest a coherent story whereby Hilbert would have rejected at least one of these requirements. This is exactly what Detlefsen tries to do, though he claims that Hilbert could deny both horns. And as we shall see, the resulting theory that Detlefsen’s instrumental understanding amounts to (so far as I properly understand it) is one that is neither Smorynski-complete, nor conservative.

3.2.1 Detlefsen’s Reply

Do we need Smorynski-completeness?

The instrumentalist only requires efficiency and real-soundness. The point of appealing to the ideal theory is that it is instrumental. It need only prove real sentences more efficiently than the real system, and this by itself does not entail that the ideal theory should prove more sentences than the real theory. So on this view, T need not prove a formula φ unless F proves φ.

One thing to note about the completeness criterion is that the notion of truth being assumed is the classical rather than the constructive notion. This is because the argument concerns the Gödel sentence, and this is classically true but not constructively true; “what makes G true is the truth of its instances, not its provability by finitary means.”6 And this, Detlefsen claims, leads one to conclude that

...T’s ‘failure’ to prove G can only be counted as a ‘failure’ to prove all classically true real sentences formulable in its language; which means that Smorynski-completeness must be seen as the requirement that an ideal theory T prove all classically true real sentences formulable in its language. The significance of this fact for our argument is that it clarifies the possible defenses for Smorynski-completeness as a constraint on the ideal theorizing. In particular, it shows us that it cannot merely be seen as an attempt to enforce a simple strength requirement on T to the effect that T be powerful enough to codify the whole of finitary reasoning. This is clear from the fact that the theorems of finitary reasoning are not at all the same as the classical true real sentences formulable in L(T), since the latter clearly include sentences that do not belong to the former. What we must now consider is whether there is some other justification for it.7

Detlefsen thinks that there is not any other justification and defends the rejection of the completeness requirement by appealing to the above interpretation of Hilbert as an instrumentalist. Whereas on a realist conception a theory ought to prove all truths, on an instrumentalist view this need not be the case. The less efficient proofs (in this case) are to be replaced by more efficient ones. There is no requirement to prove anything more than what is provable in F. But note that Detlefsen has changed the game by strengthening Smorynski’s claim. Since T is assumed to be consistent and axiomatized, it is true that T fails to prove all classically true sentences. But Smorynski’s original argument only required the recognizably true real sentences be provable, not all the true real sentences, which Detlefsen’s formulation now involves. To make the further claim that Smorynski-completeness does require all classically true real sentences to be provable requires a further step. And this step seems plausible only if one takes Hilbert’s Program as requiring a single formal system to capture all of mathematics.
6Detlefsen [1990], p. 359.
7Detlefsen [1990], pp. 359–360.

Indeed, it could be argued that whereas Hilbert may not have required outright completeness, he would have at least expected completeness with respect to this restricted class of readily recognizable true real formulas, given that the program was supposed to model our minds as mathematicians. So there is reason to think that we need completeness in that respect. And this does fail by Gödel’s arguments. Viewed this way, T should still go beyond F, and so Smorynski’s argument still goes through. So simply arguing that one should understand Hilbert as an instrumentalist generally is not enough to provide Hilbert a way out of the dilemma, at least on this horn. For if it is reasonable to think that T be complete with respect to the readily recognizable classical real truths, even if not complete with respect to all classical real truths, then one is still confronted with a failure of conservativity appropriately understood.

What about conservativity?

The conservativity requirement, then, is where the real action is. Can Detlefsen provide a suitable interpretation of Hilbert’s program whereby one need not require conservativity? On the instrumentalist conception that we have been considering above, we stressed the primacy of efficiency and reliability. These motivations by themselves do not imply that the ideal theory be conservative over the real theory. To see this, first note that the real theory F is supposed to be the arbiter with respect to the truth of real sentences. F, then, is taken to be sound. Moreover, we require that if F decides a sentence φ, then if T also decides φ, T and F must decide it in the same way. That is, T, to be a reliable theory, needs to be real-sound with respect to the real formulae decided by F. We require that T not prove any formula refutable by F. This by itself, Detlefsen claims, does not in any way commit one to the conservativity of T over F with respect to real sentences. Were T to prove a formula ψ not decidable in F, T would indeed not be conservative over F. (If T proves G, as it was claimed above might be a reasonable requirement on T, this would be an example of a failure of conservativity.) And yet, in this way, failure to be conservative does not constitute a violation of the real-soundness requirement. And this is all we really require, according to the instrumentalist. Put concisely, soundness is different from conservativity, and the instrumentalist is only committed to the soundness requirement, not the conservativity requirement. As Detlefsen puts it,

It is one thing to say that any real proposition ρ which [F] decides must be decided in the same way by T, if T decides it, and quite another to say that [F] must prove every real proposition ρ provable in T.8

And, he goes on,

[t]he reason for this divergence is, of course, that [F] may not decide ρ at all; in which case, respect for the epistemic authority of [F] provides no reason to demand that ρ be provable in T only if it is also provable in [F]. To put it another way, when [F] does not decide ρ, T cannot transgress against the authority of [F] by deciding ρ. Thus, under such circumstances it is freed from the obligation to adhere to the conservation condition.9

Given all this, Detlefsen proposes replacing the usual conservation requirement with a weaker one:

Weak Conservation: For every real sentence r of L(T) such that r is decided by F, if r is provable in T, then r is provable in F.

Moreover, this is supposed to be provable in F. The upshot, then, is that if F is supposed to capture all finitary reasoning, and one allows that there are real sentences that are not finitarily decidable (e.g. G), then it is plausible that T will not be conservative over F with respect to the real sentences. But, according to the instrumentalist, Hilbert need only be committed to the weak conservation principle. And the weaker version does not entail the stronger version. So while G violates the stronger version, it does not violate the weaker version. Thus, according to Detlefsen, the argument fails to disturb the instrumentalist project. Is there, however, any reason for thinking that the instrumentalist, though committed to the weaker form of conservation, might also be committed to the stronger form? Detlefsen presents such a reason, and it stems from the close parallels that Hilbert draws between the real/ideal distinction and the observational/theoretical distinction in physical theorizing.10
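The contrast between the two conservation conditions can be displayed schematically (my formalization, not Detlefsen’s; r ranges over the real sentences of L(T)):

```latex
% Standard (strong) conservation vs. Detlefsen's weak conservation:
\begin{align*}
\text{Strong:} \quad & \forall r \; \bigl( T \vdash r \;\Rightarrow\; F \vdash r \bigr)\\[2pt]
\text{Weak:}   \quad & \forall r \; \bigl( (F \vdash r \;\lor\; F \vdash \neg r)
                       \;\Rightarrow\; ( T \vdash r \Rightarrow F \vdash r ) \bigr)
\end{align*}
% The Godel sentence witnesses the gap: F decides neither G nor its
% negation, so T proving G violates Strong but is compatible with Weak.
```
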

8Detlefsen [1990], p. 361.
9Detlefsen [1990], p. 361.
10See Detlefsen [1990], pp. 362–363.

The argument is that because the two cases are analogous, if one can claim that the theoretical is conservative over the observational, then it would also be natural to think that the ideal ought to be conservative over the real. The theoretical is conservative over the observational. So the ideal ought to be conservative over the real. Given that the parallels between the mathematical and physical cases are taken to support instrumentalism, the instrumentalist who claims that conservativity holds in the physical case should also maintain this in the mathematical case. The reason to think that the theoretical is conservative over the observational stems from the fact that the theoretical theory, it is assumed, needs to be real-sound (in the sense defined above), and that the observational theory is decidable with respect to real sentences. Given these two facts, it does follow that the theoretical, if true, will be conservative over the observational sentences. But the same argument fails in the mathematical case, given the first incompleteness theorem. Unlike the observational statements, which are plausibly decidable, the real theory is known to be undecidable. Without this further piece, the parallel argument does not go through. Now it might be objected that Hilbert required, not merely expected, that the real theory would be decidable. Were this so, then, regardless of the conservation requirement, the first incompleteness theorem would kill Hilbert’s Program in that way. But assume for the moment that Hilbert need not require that the real theory be decidable.11 Is this enough to show that he would also not require conservativity? It seems to me that it is not. For while decidability is certainly sufficient to commit one to conservativity, it need not be necessary.
Indeed, Hilbert does seem to have been committed to conservativity:

[T]he modes of inference employing the infinite must be replaced generally by finite processes that have precisely the same results.... That, then, is the purpose of my theory. Its aim is to endow mathematical method with the definitive reliability.12

Indeed, the whole point of Hilbert’s epsilon substitution method was to provide a procedure that, given any ideal proof of a real formula, would convert it into a real proof.
11Detlefsen [1990], pp. 363–365 argues that there is certain textual and conceptual support for this assumption. We will not engage with his discussion here.
12Hilbert [1925], p. 370.

The point of conservation is to ensure that the ideal theory not prove anything false. Given the stress on epistemic justification, without the conservativity requirement, any real sentence that one might prove by ideal means but that is not decidable by real means ought to lower one’s credence in the reliability of the ideal theory. For what good is the instrument if you are not confident that there will be a corresponding real proof, given that the real proofs are supposed to be the epistemic authority? So even if Detlefsen’s instrumentalist need not require conservativity, it seems to me that Hilbert himself was committed to it, and so is not invulnerable to the arguments from the first incompleteness theorem. As such, the arguments above against G1 that appeal to instrumentalism are insufficient, as they stand, to save Hilbert’s Program.

3.3 The argument from the Second Incompleteness Theorem

Despite arguments against Hilbert’s Program stemming from the first incompleteness theorem (G1), traditionally the Gödelian argument appeals to the second theorem (G2). Detlefsen [1986] has provided an unusually detailed account of one such argument. Following Detlefsen, we call this the Standard Argument (SA). Detlefsen has argued that SA has three serious weaknesses that cannot be overcome. As a result, he disputes the claim that G2 undermines Hilbert’s program. The first weakness, which Detlefsen calls the Stability Problem, and the one that will be our focus, is the claim that while the Gödelian sentence, call it Con_G(T), that expresses the consistency of an adequately strong consistent theory T is shown unprovable in T by G2, it does not follow that there will not be some other sentence that expresses the same proposition as Con_G(T), i.e. the consistency of T, that is provable in T. After laying out the Standard Argument we examine the Stability Problem closely and consider Detlefsen’s claims that it is a problem that cannot be overcome.

3.3.1 The Standard Argument and The Stability Problem

The Standard Argument, as presented by Detlefsen [1986], pp. 78–79, is the following:

1. If T is consistent, then Con_G(T) is not provable in T

2. Con_G(T) “expresses” the consistency of T

3. Every finitary truth (in particular those of the finitary metamathematics of T) can be “expressed” as a theorem of T

4. If Con_G(T) is not provable in T, then Con_G(T) does not express a theorem of the finitary metamathematics of T (3)

5. If Con_G(T) is not provable in T, then the consistency of T is not provable in the finitary metamathematics of T (2,4)

6. If T is consistent, then the consistency of T is not a theorem of the finitary metamathematics of T (1,5)

7. If T is inconsistent, then the consistency of T is not a theorem of the finitary metamathematics of T

8. The consistency of T is not provable in the finitary metamathematics of T (6,7)

9. If the consistency of T is not finitarily provable, then neither is the real-soundness of T

10. The real-soundness of T is not finitarily provable (8,9)

11. There is no formal theory containing elementary number theory whose real-soundness is finitarily provable (10)

12. If there is no formal theory containing elementary number theory whose real-soundness is finitarily provable, then Hilbert’s Program cannot be carried out for any appreciable body of ideal mathematics

13. Hilbert’s Program cannot be carried out for any appreciable body of ideal mathematics (11,12)

The Stability Problem challenges this argument by pointing out that the inference from 3 to 4 is invalid. The reason that the inference is invalid is due to the possibility that the particular features found in the particular sentence Con_G(T) that Gödel constructed that keep it from being provable are not those that are required for it to express consistency. Were this the case, the unprovability of Con_G(T) would be merely coincident with its expression of consistency. For the inference from 3 to 4 to actually go through, one needs to somehow “insure that any formula expressing the same proposition as Con_G(T) will be provable in T only if Con_G(T) is.”13 The only way that Detlefsen can see that one might insure this would be to show that “every set of properties sufficient to make a formula of T a fit expression of T’s consistency is also sufficient to make that formula unprovable in T (if T is consistent).”14 So we need a set C of conditions on formulae of T such that every formula expressing consistency satisfies C, and no formula that satisfies C can be proven in T, assuming, of course, that T is consistent. Only then can one generalize the claim that Con_G(T) is unprovable in T to the claim that if T is consistent and Con(T) is a formula satisfying C and expressing consistency, then Con(T) is not provable in T. Note that Detlefsen is not making the same claim that Gödel himself made in 1931 about Hilbert’s program with respect to G2:

I wish to note expressly that [G2 does] not contradict Hilbert’s formalistic viewpoint. For this viewpoint presupposes only the existence of a consistency proof in which nothing but finitary means of proof is used, and it is conceivable that there exist finitary proofs that cannot be expressed in the formalism of [our basic system].15

Indeed, Detlefsen is assuming that every finitary truth is expressible as a theorem of T. Gödel himself eventually adopted this view shortly thereafter. What, then, might one take to be a natural choice for C? The first to provide general requirements for deriving G2 were Hilbert and Bernays [1939] in their Grundlagen der Mathematik, Vol. II. The modern characterization of these derivability conditions is due to Löb. They are as follows:

DC1. If ⊢_T φ, then ⊢_T Prov_T(⌜φ⌝)

DC2. ⊢_T Prov_T(⌜φ → ψ⌝) → (Prov_T(⌜φ⌝) → Prov_T(⌜ψ⌝))

DC3. ⊢_T Prov_T(⌜φ⌝) → Prov_T(⌜Prov_T(⌜φ⌝)⌝)
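
For orientation, here is the standard textbook route from these conditions to G2 (a sketch in my own notation, writing Con_T for ¬Prov_T(⌜⊥⌝) and G for the Gödel sentence of T):

```latex
% Sketch: DC1-DC3 yield the unprovability of Con_T (if T is consistent).
\begin{align*}
&(1)\; \vdash_T G \leftrightarrow \neg\mathrm{Prov}_T(\ulcorner G\urcorner)
   && \text{(diagonal lemma)}\\
&(2)\; \vdash_T \mathrm{Prov}_T(\ulcorner G\urcorner) \rightarrow
   \mathrm{Prov}_T(\ulcorner \mathrm{Prov}_T(\ulcorner G\urcorner)\urcorner)
   && \text{(DC3)}\\
&(3)\; \vdash_T \mathrm{Prov}_T(\ulcorner G\urcorner) \rightarrow
   \mathrm{Prov}_T(\ulcorner \neg\mathrm{Prov}_T(\ulcorner G\urcorner)\urcorner)
   && \text{(1, DC1, DC2)}\\
&(4)\; \vdash_T \mathrm{Prov}_T(\ulcorner G\urcorner) \rightarrow
   \mathrm{Prov}_T(\ulcorner \bot\urcorner)
   && \text{(2, 3, DC1, DC2)}\\
&(5)\; \vdash_T \mathrm{Con}_T \rightarrow
   \neg\mathrm{Prov}_T(\ulcorner G\urcorner)
   && \text{(4, contraposition)}\\
&(6)\; \vdash_T \mathrm{Con}_T \rightarrow G
   && \text{(1, 5)}
\end{align*}
% So if T proved Con_T it would prove G, contradicting G1.
```
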

An important thing to ask at this point is: Where exactly did these conditions come from? Sure, they might suffice for a proof of the second incompleteness theorem, but what was the motivation for arriving at them in the first place? Well, the quick answer is that upon inspection of the proof of G1, the derivability conditions are essentially formalized versions of the reasoning involved in that proof.

13Detlefsen [1986], p. 81.
14Detlefsen [1986], p. 81.
15Gödel [1931], p. 615.

The reasoning of the first incompleteness theorem started from ⊢_T G ↔ ¬Prov_T(⌜G⌝). We then showed that if ⊢_T G then ⊢_T Prov_T(⌜G⌝). If T is consistent, then these give us that ⊬_T G. Given that this is the first half of G1, it is reasonable to think that such reasoning ought to be captured within T itself. That is, given ⊢_T G ↔ ¬Prov_T(⌜G⌝), T should know this, i.e., ⊢_T Prov_T(⌜G ↔ ¬Prov_T(⌜G⌝)⌝). But this is just an instance of DC1. Furthermore, given that if ⊢_T G then ⊢_T Prov_T(⌜G⌝), T should know this, i.e., ⊢_T Prov_T(⌜G⌝) → Prov_T(⌜Prov_T(⌜G⌝)⌝). But this is just an instance of DC3. Given all this, and given modus ponens, we get the first theorem. That T knows that modus ponens is a legitimate inference is captured by DC2.

The derivability conditions are then a very natural set of conditions that show the unprovability of a very natural consistency statement within the theory itself. So far so good. The questions at hand, though, and the ones for Detlefsen, are the following: Which provability predicates satisfy the derivability conditions? Why think we need the derivability conditions? And is there a proof predicate that violates the derivability conditions that is acceptable? We will not answer all these questions in detail. Indeed, the first question involves technical details we need not get into here. As for the second question, we will look at one reason for thinking we need the derivability conditions in a moment. And as for the third question, it is true that there are provability predicates that violate the derivability conditions. Whether or not they are acceptable requires us to ask another question: What are the constraints on a consistency statement Con_T? Well, in answer to this, one natural thing to think is that we want it to be such that it is true just when T is actually consistent. What the two incompleteness theorems above show is that given any provability predicate one can diagonalize and construct an undecidable sentence. However, as we’ll see, it is not always true that given any proof predicate Prov*, ¬Prov*(⌜1 = 0⌝) is an unprovable consistency statement. And this, it seems, is Detlefsen’s point. You need to have constructed the right sort of consistency statement to ensure that it is unprovable in T. It remains to argue that only those that satisfy the derivability conditions are of the right sort. To summarize: If one’s proof predicate satisfies the derivability conditions, then one will get the generalized form of unprovability of a consistency statement within the theory at hand. So one way to argue against Hilbert’s program is to try to argue for the derivability conditions. Detlefsen’s aim is to show that (a) no current proposal suffices to justify them and that (b) there can be no satisfactory justification. More specifically, Detlefsen’s claim is that there is no analysis of what is required of a formula to adequately express the consistency of T that involves in any necessary way the derivability conditions listed above. Given that Detlefsen wants to defend the Hilbertian, he needs to explain why the derivability conditions need not be satisfied despite their plausibility and provide a justification of a situation in which they fail. For this latter part to be truly convincing, a defense of a proof predicate that does not satisfy them, and is such that it captures the notion of proof in an acceptable way, is required. One reason that one might argue in favor of the derivability conditions is that they are required by, or are entailed by, any adequate account of the way that one practices informal mathematics.16 In particular, the derivability conditions are required to adequately formalize the mathematics needed for Hilbert’s Program. Hilbert was not trying to formalize just any theory, but rather substantial mathematics. And at the very least, according to Kreisel and Takeuti, substantial mathematics involves elementary mathematical truths, and elementary

mathematical reasoning, namely, Σ⁰₁-completeness and closure under cut, respectively.

...the usual condition on systems [the Derivability Conditions]...are necessary if a formalization of mathematical reasoning is to be adequate for Hilbert’s programme... Let us spell out the two adequacy conditions on a system F: (a) Demonstrable completeness w.r.t. Σ⁰₁ formulae is needed to assure us that elementary mathematics (with a constructive existential quantifier) can be reproduced in F at all... (b) Demonstrable closure under cut (and in the quantifier free case also under substitution) is also needed because cut is constantly used in mathematics. Realistically speaking, a (meta)mathematical proof of such closure is needed and not a case study of mathematical

16Kreisel and Takeuti [1974].

texts because cut – like most logical inferences – is often used without being mentioned; in contrast, for example, to the use of mathematical axioms.17

How do these requirements relate to the derivability conditions? Well,

⊢_T Prov_T(⌜φ⌝) → Prov_T(⌜Prov_T(⌜φ⌝)⌝)

and

⊢_T Prov_T(⌜φ → ψ⌝) → (Prov_T(⌜φ⌝) → Prov_T(⌜ψ⌝))

are formalized versions of particular instances of Σ⁰₁-completeness and closure under cut, respectively. If mathematical practice requires Σ⁰₁-completeness, then in particular if Prov_T(⌜φ⌝) is true then ⊢_T Prov_T(⌜Prov_T(⌜φ⌝)⌝) will hold, given that Prov_T(⌜φ⌝) is itself Σ⁰₁. Similarly, if a mathematical theory has closure under cut, then it will have modus ponens. The idea is that we need a proof, in T, of these properties, in order to be sure that their content, and informal character, is adequately captured by T. Detlefsen challenges the Kreisel-Takeuti claim on grounds of a particular view of “theoretical revision” that he finds plausible for the instrumentalist. Assume that a given theory, say T, is shown inconsistent. Standardly, an inconsistency requires a modification of an axiom or a rule of inference. This is because one is aimed at truth. And if one’s axioms or rules of inference lead one to an inconsistency, then something must have gone wrong and one’s theory cannot be true. If, however, one is an instrumentalist, as Detlefsen takes Hilbert to be, an inconsistency need not require a change of one’s axiom(s) or rule(s) of inference, but instead only a change in the “range of its applicability” (“it” being, I presume, the axiom or rule of inference in question). As he says,

[U]pon deriving a contradiction from a set of axioms or rules of inference, his [the instrumen- talist’s] response need not be one of changing some specific axiom(s) or rule(s) of inference, but might rather consist simply in ruling out certain combinations of axioms and rules of inference as forming acceptable (i.e. reliable) proofs. This is a consequence of the fact that he is not committed to calling his axioms true and his rules of inference truth-preserving. He demands instead that a system be an efficient and reliable generator of real truths, where this does not imply that the system be constructed from true axioms and truth-preserving rules of inference.

17Kreisel and Takeuti [1974], as quoted by Detlefsen [1986], p. 114; brackets mine.

Indeed, for the instrumentalist, the categories of truth and truth-preservingness do not apply to the evaluation of an ideal system.18

Now sometimes the axioms and rules of inference might lead us to a real truth. And other times they might lead us to a contradiction. The task is to separate the two. Assuming that the instrumentalist has a rank order of ideal proofs in terms of their “revisability”, start with the set of axioms and, using the rules of inference, begin generating the set of potential proofs. Given a criterion of “revisability”, these proofs will be ordered in a particular way, presumably so that the least revisable one is first. Now go through the rank ordering of the possible proofs. We determine what counts as a genuine proof as follows: First, the axioms themselves count as genuine. If a possible proof does not conflict with the conclusion of any genuine proof earlier in the ordering, then count it as a “genuine” proof. Otherwise discard it. Only the genuine proofs are accepted as proofs. The end result will be a set of proofs that are consistent and of instrumental value. Such a method of proof ordering and such a criterion for proof acceptance is directly akin to a modified notion of a formal system based on Rosser [1936]. These Rosser systems, as we’ll call them, are very similar. Given a formal system T, and some ordering on the proofs, count a sequence of formulae as a genuine proof if (a) the sequence of formulae constitutes a proof in the intuitive sense that we are familiar with and (b) the conclusion of the proof does not contradict the conclusion of any other proof earlier in the ordering. What is particularly important about Rosser systems is that they are known not to satisfy the derivability condition for Σ⁰₁-completeness, i.e. ⊢_T Prov_T(⌜φ⌝) → Prov_T(⌜Prov_T(⌜φ⌝)⌝). Indeed, they also violate closure under cut, i.e. ⊢_T Prov_T(⌜φ → ψ⌝) → (Prov_T(⌜φ⌝) → Prov_T(⌜ψ⌝)), for it is entirely plausible that one might prove φ → ψ and φ and yet not prove ψ because one has an earlier proof of ¬ψ. So if these Rosser systems are considered acceptable, and they converge with Hilbertian instrumentalist ideal proof revision, the Hilbertian instrumentalist need not respect the derivability conditions. And if this is so, then the stability problem remains, and the second incompleteness theorem does not undermine Hilbert’s Program.

18Detlefsen [1986], pp. 120-121.
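The proof-selection procedure just described can be illustrated with a toy model (entirely hypothetical: formulae are bare strings, ‘~’ marks negation, proofs are identified with their conclusions, and the fixed ordering is simply list order; nothing here models genuine formal provability):

```python
def negate(formula: str) -> str:
    """Return the contradictory of a toy formula: strip or add '~'."""
    return formula[1:] if formula.startswith("~") else "~" + formula

def rosser_filter(ordered_conclusions):
    """Detlefsen-style proof selection in miniature: walk the proofs in
    their fixed ordering and accept one only if its conclusion does not
    contradict the conclusion of any previously accepted ("genuine") proof."""
    accepted = []
    for conclusion in ordered_conclusions:
        if negate(conclusion) not in accepted:
            accepted.append(conclusion)
    return accepted

# Both "p" and "~p" are derivable (the underlying system is inconsistent
# in the naive sense), but whichever comes later is discarded as non-genuine.
print(rosser_filter(["p", "q", "~p", "r", "~q"]))  # prints ['p', 'q', 'r']
```

The surviving set is consistent by construction, which is exactly why a Rosser-style predicate can sidestep the derivability conditions: what counts as a “proof” depends on the ordering, not merely on the axioms and rules.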

Moreover, though Detlefsen does not point this out, if one adopts a Rosser system, then one actually can prove a corresponding consistency statement. More on this later. So it might seem that Detlefsen has not only provided reason to think the Standard Argument is undermined, but he has actually succeeded in doing what Hilbert hoped for, i.e. being able to prove consistency. But appearances can be deceiving and things are not always as they seem. We must therefore evaluate the suggestion to see if it has any merit. There are, I think, at least three questions that need to be addressed. First, how close does Detlefsen’s Hilbertian line up with the historical Hilbert? Second, regardless of how we answer the first question, is the position coherent and well motivated? Finally, assuming we have an affirmative answer to the second question, is it a reasonable view to hold? We take them in turn.

3.4 Historically Hilbertian?

At the beginning of this chapter we briefly sketched a very generic picture of standard instrumentalism and noted that some have thought that Hilbert was an instrumentalist because of his distinctions between the finitary/non-finitary, contentual/non-contentual, meaningful/meaningless, and real/ideal (where these can be roughly used synonymously). Detlefsen in particular takes the distinction between the contentual and non-contentual as being primarily one of differing epistemic status. The “epistemic utility” of the contentual (real) statements is genuine meaningfulness and evidentness, whereas that of the non-contentual (ideal) statements is “a meta-theoretic function of its purely formal or “algebraic” [combinatorial or computational] properties.”19 He goes on to say that “an instrumentalist’s ultimate justification must be its usefulness in helping us to perform tasks whose performance we value.”20 The ultimate justification of ideal proofs and propositions is then their usefulness in helping us provide/construct proofs of (true) real propositions. The reason why these

19Detlefsen [1986], p. 4.
20Detlefsen [1986], p. 6.

ideal proofs are valuable is primarily, according to the instrumentalist, because of their efficiency. Because each ideal proof can be transformed into a real proof, the ideal proofs are merely “abbreviations” of the contentual reasoning involved in a real proof. Along with this simplification, adding ideal statements is useful because it unifies our thought. All this might be taken as reason to think that Hilbert endorsed a form of general instrumentalism, whereby, strictly speaking, ideal statements are meaningless. But as noted by Richard Zach, this at best shows a “methodological instrumentalism”: “A successful execution of the proof-theoretic program would show that one could pretend as if mathematics was meaningless.”21 Whether Hilbert actually thought that the ideal statements were meaningless is unclear. But by showing consistency we can (supposedly) side-step the need to even engage in the philosophical questions regarding the meaning of ideal statements, the actual existence of ideal objects, etc. It is at least true that ideal proofs help to simplify and unify finitary thought. And, on a modest conception of instrumentalism, this much is taken as true. So it at least seems reasonable that some sort of instrumentalism would be accepted by Hilbert. The question, though, is whether Hilbert would have endorsed the restricted form of instrumentalism put forth by Detlefsen. This I find much less plausible. In §3.2, I presented Detlefsen’s arguments against Hilbert’s Dilemma and claimed that regardless of their merit, they are not historically Hilbertian; Detlefsen’s instrumentalism is different from Hilbert’s views. The instrumentalist mantra against completeness was that since G is not provable by real means, there is no reason why an ideal theory – whose aim is to improve upon the efficiency of real methods of proof in a reliable way – should be required to prove G either.
I argued that this is not enough to save the Hilbertian since she has other motivations that clash with this. The instrumental mantra against conservativity was that conservativity ought to be distinguished from real-soundness, and it is only the latter that the instrumentalist is committed to. As with the completeness case, I argued that

21Zach [2005], p. 429. Cf. Sieg [2013].

the Hilbertian has motivations stemming from the fundamentally epistemic character of the program that conflict with this. Either of these is enough, then, to discredit the claim that such a characterization would be enough for Hilbert. Moreover, I take Hilbert to have taken the consistency proof, if there be one, as doing more than merely supporting the view of ideal elements as instruments. Instead, I think he took it to be providing a foundation for mathematics generally. One reason for thinking this is that Hilbert emphasizes at length how mathematics is the king of all sciences and is objective and true independently of all human thinking. In an oft-quoted reply to the paradoxes of set theory, Hilbert declares that “no one shall drive us from the paradise Cantor created!” Such a restricted instrumentalism seems to me to partially drive us out of Cantor’s paradise. Furthermore, Hilbert was clear that he wanted to justify the use of the laws of Aristotelian logic even in the infinite and to use proof theory as a means of describing how our understanding works by modeling how we actually think mathematically.22 As a result it seems highly unlikely that Hilbert would have found such a gerrymandered notion of proof as that of §3.3 acceptable.
As Raatikainen puts the point, “Hilbert’s purpose was to justify just the ordinary laws of logic when applied to the infinite, not to devise some ad hoc logic (notion of provability) allowing an apparent consistency proof for the axioms of, say, analysis.”23 And, perhaps even clearer, consider the following quote from an address to the Swiss Mathematical Society entitled “Axiomatic Thought” in 1917: “The chief requirement of the theory of axioms must...show that within every field of knowledge contradictions based on the underlying axiom-system are absolutely impossible.”24 This seems to me to be in direct opposition to Detlefsen’s characterization of the instrumentalist’s response to deriving a contradiction as quoted on page 65. Thus, it seems highly unlikely to me that Hilbert would have been satisfied with Detlefsen’s restricted instrumentalism.

22See, for example, Hilbert [1925], p. 379. 23Raatikainen [2003], p. 165. 24Hilbert [1918].

That said, the bare instrumental theory that Detlefsen provides does escape the argument from the first incompleteness theorem by avoiding Hilbert’s dilemma. And, as we noted at the end of §3.3, it also undermines the stability argument underlying the second incompleteness theorem, as well as allowing the proof of a certain modified consistency statement. So whether it is what Hilbert himself would have been comfortable with becomes somewhat irrelevant. The important question to ask now is: Does Detlefsen’s proposal make sense as an attractive philosophy of mathematics? And here I think the proposal comes up short on several fronts. To see why I think it is unsatisfactory it will first be useful to summarize what I take to be the resulting view, based on certain things that Detlefsen has said. It will end up being slightly different from the instrumental characterization discussed above. What follows is my construction of the resulting instrumental theory when taking into account various commitments on Detlefsen’s part; he does not anywhere (as far as I know) discuss what the resulting theory will actually look like.

3.5 Detlefsen’s Strict Instrumentalism

I have claimed that the characterization of Hilbert that Detlefsen offers cannot be correct. However, Detlefsen’s Program, as we will call it, can better be viewed as a modified Hilbert program – a restricted instrumentalism in Hilbertian spirit. On this restricted view, only a part of infinitary mathematics needs to be justified by a consistency proof. Our question now is whether such a restricted instrumentalist position makes sense in itself as a modified version of Hilbert’s program, or even as a coherent philosophy of mathematics. To get clearer on this, let us start by taking a closer look at the motivation for the restriction. There needs to be an independent reason for the restriction, over and above a desire to prove the consistency statement. Were this the only reason, the suggestion would seem ad hoc. Detlefsen needs to give us a reason for thinking that not all ideal sentences are instrumental, and show how this should lead us to endorse Rosser consistency.

The first thing to point out is that we want to make sure that the end results of the ideal methods are indeed genuine real truths and that the epistemic attitudes towards these end results are legitimate. This was essentially the complaint of Frege when he wondered how it is that an ideal proof can give us epistemic warrant in its conclusion. The epistemic problem resides with the status of any ideal axioms that the proof may involve. Proofs, for Frege, start from true assumptions and proceed by gapless truth-preserving inferences to arrive at true conclusions. Consequently, there is to be no doubt that we have a positive epistemic attitude toward the conclusion of a genuine proof. The truth of the premisses and the truth-preserving nature of the inferences give us every reason to believe in the truth of the conclusion. How, though, can we be sure of the truth of the conclusion of an ideal proof, given that the propositions involved are “meaningless” and hence lack truth value? Hilbert’s answer to this question, and one of his great points of genius, involves distinguishing between mathematics and metamathematics. By evaluating these ideal proofs metamathematically, one can hope to show that any excursion through the ideal is benign. We broach these issues in more detail in chapter 6. Assuming then that a satisfactory answer to Frege can be given, and that we can have epistemic warrant in the conclusions of ideal proofs, it follows that our ideal proofs give us what Detlefsen calls “quantitative noetic gain”. The idea is that because certain finitary proofs of real sentences will be long and/or highly complex, we can increase the stock of truths that we can arrive at in an epistemically fruitful way. However, Detlefsen wants to ensure that we do not suffer any “qualitative noetic loss” in the process. Suppose, for example, that there is an extremely short and simple real mathematical proof, shorter and simpler than its ideal counterpart.
Then, according to Detlefsen, it would make no sense to replace the real with the ideal; doing so would result in a qualitative loss because the real proof has more epistemic utility than the ideal proof. This leads to a general principle for what counts as instrumentally acceptable: First, those ideal proofs of real statements for which there is no real proof that is shorter or less complex. Second, those

ideal proofs that are not too long or complex to be humanly graspable. This view takes seriously the finite limitations of human thought (with respect to time, complexity, energy, focus, material, etc.). Detlefsen is thus clearly not advocating any idealized assumptions on the part of the mathematical mind, as is done, for example, in computability theory. These limitations then pose a serious restriction on the limits of our epistemic acquisition, and, in particular, on what proofs actually count as instrumental.

[T]he very limitations on human epistemic resources which attract the Hilbertian to the ideal method also pose clear limits to its utility. Thus, only ideal proofs falling below a certain level of length and/or complexity will be of any human utility. Ideal proofs exceeding that level will be of no value as devices of epistemic acquisition. Hence, the question of their reliability is of no concern to the Hilbertian instrumentalist.25

Considerations of these sorts, coupled with an appropriate measure of length/complexity of proof, commit Detlefsen to Strict Instrumentalism. The Thesis of Strict Instrumentalism: Of the infinitely many ideal proofs constructible in a given system T of ideal mathematics, only finitely many of them are of any value as instruments of human epistemic acquisition. And presumably there is a corresponding restriction on which real proofs count as “valuable instruments of human epistemic acquisition.” It follows then that the system of real mathematics will be finite too. Now, given that both are finite, clearly neither one is complete with respect to real statements. Nor need the ideal be conservative over the real. For there might be a case where there is an ideal proof of a real formula that is of low enough complexity to count as cognitively graspable while any corresponding real proof of the same formula is of high enough complexity not to count as cognitively graspable. Detlefsen’s instrumentalism thus avoids Hilbert’s Dilemma of §3.2. Moreover, given the stress on human feasibility there should be real statements whose real proofs and ideal proofs will both be beyond human comprehension. As a result there will be real truths that admit of no feasible proof. Given Hilbert’s stress on there being no ignorabimus, it seems reasonable to think that he would have expected an adequate mathematical theory to be complete. This is another reason to think that he would not have accepted Detlefsen’s

25Detlefsen [1986], p. 83.

restricted instrumentalism. I’m unmoved by such restrictions. Indeed, it seems to me that some sort of idealizing assumptions on behalf of the mathematical mind should be made, in the same way that is done when we learn . Not only does idealizing appropriately seem to be more in line with capturing mathematical practice more generally, but if we do not idealize then we will be met with issues of vagueness regarding what counts as humanly feasible – the standard Sorites issues will arise. It also strikes me as very plausible, and even true, that ideal sentences have value in themselves, regardless of their use in arriving at real statements. What’s more, though, I am puzzled as to what motivates the jump from this view of restricted instrumentalism to an endorsement of Rosser systems. My guess is that it is something like the following: Because the instrumentalist ignores proofs that are not instrumental, she can ignore proofs that lead to contradiction because they themselves are not instrumental. I find this highly unsatisfactory, though. If anything, a proof of a contradiction is instrumental for the very reason that it shows you that your axiom(s) or rule(s) of inference are inconsistent and hence incoherent. Given that the very point of answering Frege’s concern was to ensure epistemic confidence in the conclusions of ideal proofs, finding a proof of an inconsistency would, I would think, lead me to conclude that something has gone wrong. I take it that Detlefsen’s reply will involve this notion of a revisability ordering. Because ideal proofs come with different degrees of human feasibility, given an appropriate ordering, a contradiction should not lead us to the conclusion of the last paragraph but should, he might claim, actually give one reason to believe that something was wrong with that particular proof, not with the system as a whole. I take it this is what he is getting at in the block quote from page 65.
And if this is so, then perhaps eliminating the proof as not being instrumental would work. But how exactly is this ordering supposed to go? Unless there is a unique ordering, it seems likely that the same axioms will, given a different ordering of revisability, yield different theories. And it is not clear to me why this should be acceptable,

even to the restricted instrumentalist. Assuming then that there is a unique ordering, how is it determined? Is it strictly a matter of complexity or length of proof? Should we simply use an ordering based on Gödel numbering? Regardless, though, assuming we do have such an ordering, what keeps the resulting theory from proving false statements? If the proof of a false statement is earlier on the list than its negation then it stays and the true sentence goes. But then how do we square this with the desire to answer Frege above and ensure that what we are proving are truths? To see this a bit more clearly let’s consider what the resulting instrumental theory might look like. I take it that it will be a combination of real proofs and ideal proofs with no conclusions in common. The question is how do we generate this instrumental theory from these two systems? How it is done depends on what one requires. My contention, though, is that no matter how one attempts to do it, one will fail to meet at least one of the requirements of instrumentalism, either efficiency or reliability. The first requirement is that we want efficiency and instrumentality. So the shortest proof will always trump. That is, if the shortest proof of a real statement is a real proof, then the real proof is more “instrumental” than the ideal proof. Similarly, if the shortest proof of a real statement is an ideal proof, then the ideal proof is more instrumental. All other proofs of the same statement are then discarded. So far so good. Now, Detlefsen claims that he just wants real-soundness. This requirement, remember, was that the ideal theory not prove anything refutable by the real theory. Given these requirements, then, one could establish the new theory as follows. Take your two finite sets of proofs, the real and the ideal. Call the set of reals R and the set of ideals I. For simplicity assume that each of these is individually ordered by complexity.
Take these and combine them into one list, L, ordered by complexity. L will then consist of R and I ordered by complexity. Now create another list, INST, a subset of L, that will count as the truly instrumental proofs. The list will be instrumental in that it will consist of only the shortest proofs of real statements. And it will be real-sound by construction. Here

is how we create it: Each stage n will involve considering whether to add the nth proof in our enumeration of L to INST. For an arbitrary stage n, assume we are considering a proof π of the formula ϕ. First, check if the proof π is ideal or real. If π is real, check all the members of INST: if there is no real proof of ϕ in INST, add π to INST, and if there is an ideal proof of ϕ in INST, remove that ideal proof; then progress to stage n + 1. If there is a real proof of ϕ already in INST, do not do anything and progress to stage n + 1. If the nth proof is an ideal proof and there is no proof in INST of ϕ or its contrary, add π to INST and progress to stage n + 1. Otherwise do nothing and progress to stage n + 1. Generating the instrumental theory INST in this way will result in a consistent, real-sound, efficient collection of real statements. It seems to me that there are, however, two serious defects. First of all, given the finitude of the two theories, merely requiring real-soundness as defined above is simply not enough. For imagine the case where the ideal theory proves a real sentence that would be refutable were the real theory extended beyond what is currently humanly feasible. This would, strictly speaking, be real-sound. And yet, as noted above, what good is the ideal theory as an instrument if it is actually proving falsehoods? Second, how instrumental can the theory really be if, in order to generate it, you have to keep track of everything already proved, continually checking to make sure that there are no repeats and contradictions? Before we can determine whether the nth proof actually is “instrumental”, we have to do all this bookkeeping. Not only does this seem highly inefficient, but also completely at odds with how mathematicians actually operate. That said, let’s consider again the first worry. It seems to me that we require more than real-soundness. Indeed we ought to require conservativity.
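One way of reading the stage-by-stage rules described above can be sketched in code. This is a toy model of my own, not anything Detlefsen gives: proofs are tuples (kind, formula, complexity), negation is marked by a leading "~", and all names are illustrative.

```python
# Toy sketch of the INST construction (my own reading of the rules above;
# the representation of proofs and all helper names are illustrative only).

def contrary(phi):
    """The simple contradictory of a formula, marking negation with '~'."""
    return phi[1:] if phi.startswith("~") else "~" + phi

def build_inst(R, I):
    """Merge the finite proof sets R (real) and I (ideal), ordered by
    complexity, keeping only the 'instrumental' proofs stage by stage."""
    L = sorted(R + I, key=lambda p: p[2])   # the combined list L
    inst = []
    for kind, phi, c in L:                  # stage n considers the nth proof
        proved = {p[1] for p in inst}
        if kind == "real":
            # A real proof is taken at face value: add it unless a real proof
            # of the same formula is already in INST, and let it displace any
            # ideal proof of that formula.
            if not any(p[0] == "real" and p[1] == phi for p in inst):
                inst = [p for p in inst
                        if not (p[0] == "ideal" and p[1] == phi)]
                inst.append((kind, phi, c))
        else:
            # An ideal proof enters only if neither the formula nor its
            # contrary has already been proved; this is what makes INST
            # consistent and real-sound by construction.
            if phi not in proved and contrary(phi) not in proved:
                inst.append((kind, phi, c))
    return inst
```

For instance, with R = [("real", "A", 5)] and I = [("ideal", "A", 2), ("ideal", "~B", 3)], the ideal proof of A is added at stage 1 but displaced at stage 3 by the longer real proof. Note that every stage must inspect everything proved so far, which is exactly the bookkeeping complaint raised above.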
That is, in order for an ideal proof of ϕ to end up in INST, it must be such that there is a real proof of ϕ in R. This would avoid the worry given in the last paragraph. Moreover, checking for this would be straightforward and our above procedure easily modified to account for it: whenever you are considering adding an ideal statement to INST, first check to see that there is a proof of

ϕ in R. But what good would the ideal theory be if, in order for an ideal proof to count as instrumental, we would have to first check all the real proofs? We might as well just stick to the real theory. And unless Detlefsen can provide another way of securing the instrumental theory INST that does not succumb to these issues, it seems to me that Detlefsen’s proposed instrumental theory does not provide a satisfactory alternative to Hilbert’s original program. It is unclear to me, then, that there is good enough reason for thinking that we ought to restrict ourselves instrumentally. And even if one finds that view attractive, I do not see any non-ad hoc reason for appealing to Rosser-style proof predicates. That said, I would like to end by briefly examining these Rosser systems a bit more closely to see whether, were there some plausible motivation for considering them, actually endorsing them would be satisfactory. There is no doubt that they are of mathematical interest. What I do find doubtful, though, is whether they are particularly philosophically satisfying as a way to skirt the second incompleteness theorem. We noted in passing above that one can generate a consistency statement for the Rosser system that is actually provable. We sketch how this works, following Smith [XXXX], §3.

Define the relation Contr(m, n) as holding just in case m and n are the Gödel numbers of a pair of simply contradictory formulas. This relation is primitive recursive. Now define another relation

Prf*(x, y) =def Prf(x, y) & ¬(∃u ≤ x)(∃w ≤ u)(Prf(u, w) & Contr(y, w)).

This is primitive recursive. So we can represent Prf* as

Prf(x, y) & ¬(∃u ≤ x)(∃w ≤ u)(Prf(u, w) & Contr(y, w)).

Now define the modified provability predicate Prov*(y) =def ∃x Prf*(x, y) and a modified

consistency statement for a theory T as

Con*T =def ¬∃x∃y(Prov*T(x) & Prov*T(y) & Contr(x, y)).

As it turns out, for any theory T at least as strong as IΣ1,26 T ⊢ Con*T. So it might seem then that if Detlefsen can provide satisfactory motivation for appealing to Rosser systems then he might really be onto something. For given the appropriate provability predicate and a theory at least as strong as IΣ1, T proves its own consistency statement (and Hilbert was certainly interested in theories at least as strong as IΣ1). However, any system can be proven consistent on this view. And so it seems that something has gone wrong. These provable consistency statements do not really tell us anything helpful. These Rosser systems allow the set of theorems provable from a set of axioms to come apart drastically from the set of its logical consequences.27 The theorems will be consistent regardless, based simply on how they were constructed, but the consequences might not be. And this seems to be seriously problematic.
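To make vivid how a Rosser-style predicate yields consistency "by construction", consider a toy model (my own illustration, not Smith's formalism): formulas are strings, "~A" is the simple contradictory of "A", and a proof enumeration is just a list. Even when the enumeration contains proofs of both a formula and its contradictory, at most one of the pair comes out Rosser-provable, since the later one is blocked by the earlier contradictory proof:

```python
# Toy Rosser provability: a formula counts as proved only if no proof of a
# contradictory formula occurs at an earlier-or-equal position (cf. Prf*).

def contr(a, b):
    """True if a and b are simply contradictory (one negates the other)."""
    return a == "~" + b or b == "~" + a

def rosser_provable(phi, proofs):
    """proofs[i] is the formula proved by the i-th proof in the enumeration."""
    for i, psi in enumerate(proofs):
        if psi == phi and not any(contr(phi, proofs[j]) for j in range(i + 1)):
            return True
    return False

# An enumeration that "proves" both A and ~A:
proofs = ["B", "A", "~A"]

print(rosser_provable("A", proofs))   # True: proved at index 1, no earlier ~A
print(rosser_provable("~A", proofs))  # False: blocked by the earlier proof of A
```

The set of Rosser-provable formulas here is consistent no matter what the underlying enumeration yields, which is exactly the worry above: consistency is secured by the shape of the predicate, not by the theory's logical consequences.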

26IΣ1 is Q with induction for Σ⁰₁ formulas. 27Thanks to Stewart Shapiro for raising this point.

Chapter 4

Finitistic Transfinite Justification?

In response to Gödel’s Incompleteness theorems, several modified or partial Hilbert’s programs have been pursued. We consider one such version due to Gentzen that enlarges the methods to be admitted in consistency proofs. By giving up the stress on strictly finitary reasoning and liberalizing what counts as epistemically acceptable, Gentzen was able to prove that PA is consistent by appeal to transfinite induction up to the ordinal ε₀. Gentzen’s method proceeds by means of ordinal assignments to, and reduction procedures for, possible proofs of contradiction, showing such proofs to be impossible. We first present Gentzen’s method in order to provide a systematic overview of the structure of his proof and its philosophical motivations. We then consider the modern version of the proof as presented by Gaisi Takeuti. Central to Takeuti’s proof is the demonstration of the claim that whenever a concrete method of constructing decreasing sequences of ordinals is given, any such decreasing sequence must be finite. Takeuti takes the constructive demonstration of this result as being of particular philosophical and epistemic value. The central theme that comes out of the philosophical discussion is that such a result can only be understood “from the outside” of the system. Our discussion of Takeuti shows how this theme generalizes, and will be of central importance in the rest of the dissertation.

4.1 Introduction

Gödel’s second incompleteness theorem places obvious limitations on establishing consistency proofs for formal systems of sufficient strength. After all, increased epistemic confidence is supposed to be gained by having the consistency proof itself involve weaker techniques than those contained in the theory whose consistency is to be established. Yet if the system itself cannot prove its own consistency, neither will the epistemically privileged subtheory. It is for this reason that, historically, Hilbert’s program as intended is considered a failure. There

is nonetheless a certain genius in Hilbert’s general strategy. And it is not inconceivable that such a consistency proof might be obtained by appealing to methods that are themselves considered constructive (and hence epistemically privileged), and yet still transcend in some way (i.e., in consistency strength) the framework of elementary number theory. Gödel expressed such optimism in the prospect of a modified Hilbert program as early as 1933, when he said that there remains the hope that in future one may find other and more satisfactory methods of construction beyond the limits of a system A [capturing finitist methods], which may enable us to found classical arithmetic and analysis upon them.1

And later, in 1936, Gentzen puts the point by saying that [i]t remains conceivable that the consistency of elementary number theory can in fact be verified by means of techniques which, in part, no longer belong to elementary number theory, but which can nevertheless be considered to be more reliable than the doubtful components of elementary number theory itself.2

Of course the methods being proved consistent cannot be used in the consistency proof itself. That is, we must use an unobjectionable base theory that avoids non-constructive existence proofs and non-predicative definitions, since these are what we are trying to justify. And the particular forms of inference used in the consistency proof itself (the base theory) must themselves be presupposed correct, else we could never get off the ground. To borrow a phrase from Gentzen, there can be no ‘absolute consistency proof’.3 A natural candidate for this base theory is constructive mathematics. But as stressed by Gödel, it’s not entirely clear what exactly counts as constructive mathematics because our notion of constructivity comes in different ‘layers’. And as one ascends the different layers, arriving closer and closer to ordinary non-constructive (classical) mathematics, the methods of proof and construction become less satisfactory and less convincing.4 The strictest constructivity requirements that Gödel mentions in 1933 are as follows: Applications of “all” or “any” are restricted to those infinite totalities for which we can give a finite procedure for generating all of their elements; negation must not be applied to propositions stating

1Gödel [1933], p. 53. 2Gentzen [1936], p. 139. 3Gentzen [1936], p. 138. 4Gödel [1933], p. 51.

that something holds for all elements because this would give existence propositions; we are allowed the use of complete induction in the generating process of our elements; and our notions and functions must be decidable. Since these can always be defined by complete induction, our system is based “exclusively” on the method of complete induction in its definitions and proofs.5 Given the apparent obviousness of these restrictions and the corresponding evidence for the rule of complete induction, it would be “the most desirable thing” if the proof of the consistency of non-constructive mathematics were provable by methods allowable in such a system.6 Moreover, it is clear that the theory that Gödel has in mind as characterized by these requirements is what is now known as primitive recursive arithmetic, PRA. It is taken as the most fundamentally basic of the systems of constructive reasoning, and is important because it aligns with perhaps the most widely accepted analysis of finitist mathematics, due to Tait [1981], as well as being the base system for Gentzen’s proof of the consistency of classical arithmetic. Despite this desirability, given the impact that Gödel’s theorem had for Hilbert’s program, the new questions become: What extensions of finitist methods would yield a consistency proof? And what epistemological value would such a proof have? Two important early results in this direction were the relative consistency result of classical first-order arithmetic to that of intuitionistic arithmetic – the so-called double negation translation – and Gentzen’s

consistency proof based on transfinite induction up to ε₀. The former shows that intuitionistic methods go beyond finitistic methods. While both the finitistic and the intuitionistic understanding of arithmetic are standardly taken as constructive, in the sense of maintaining an exhibition requirement for existence, they are distinguished by the types of elements to which they are allowed to appeal. Bernays [1935] has emphasized this fact, claiming that the difference between intuitionistic and finitistic reasoning concerns the

5These would later be slightly modified in Gödel [1938] and [1941]. See Sieg and Parsons [1995] for a nice discussion of the changes and their relation to Gödel’s analysis in 1958 of the difference between (strictly) finitist and intuitionistic considerations. 6Gödel [1933], p. 51.

intuitionistic appeal to “abstract elements”, which the finitist eschews. The idea, picked up on by Gödel, is that what distinguishes intuitionistic reasoning from finitary reasoning is not a difference in the requirements of exhibition, but rather the nature of the objects exhibited. Whereas finitary reasoning is restricted to spatio-temporal ‘concrete’ objects, intuitionism goes beyond this by considering such abstracta as the mental constructions which constitute proofs.7 As Gödel put it in 1958, P. Bernays has pointed out on several occasions that, since the consistency of a system cannot be proved using means of proof weaker than those of the system itself, it is necessary to go beyond the framework of what is, in Hilbert’s sense, finitary mathematics if one wants to prove the consistency of classical mathematics, or even that of classical number theory. Consequently, since finitary mathematics is defined as mathematics in which evidence rests on what is intuitive, certain abstract notions are required for the proof of the consistency of number theory. . . . Here, by abstract (or nonintuitive) notions we must understand those that are essentially of second or higher order, that is, notions that do not involve properties or relations of concrete objects (for example, of combinations of signs), but that relate to mental constructs (for example, proofs, meaningful statements, and so on); and in the proofs we make use of insights, into these mental constructs, that spring not from the combinatorial (spatiotemporal) properties of the sign combinations representing the proofs, but only from their meaning.8 The reason why this is important is that it signals why the double negation translation result is not itself the end of the story. Given that Intuitionistic mathematics is constructive, one might consider simply resting content with the double negation result.
After all, what the double negation translation result shows is that for every classical proof, there is a corresponding intuitionistically acceptable proof involving double negations. And thus, if Intuitionistic number theory is consistent, so too is Classical number theory. Moreover, all of this is provable intuitionistically, and hence is constructive. Why then should not this count as a satisfactory reduction, and hence itself establish the consistency of Peano arithmetic? To see why, recall the different levels of constructivity mentioned by Gödel above. By virtue of the fact that for intuitionism the “substrate on which constructions are carried out are proofs” and the axioms quantify over “any proof” shows that intuitionism (as formalized in Heyting Arithmetic) violates the restriction placed on the universal quantifier in the

7Though see Detlefsen [2003] who argues that “a more illuminating treatment of the differences between finitism and intuitionism reveals that it is the epistemological conception of the nature of exhibition itself which most fundamentally distinguishes the two” (p. 308). We omit discussion of Detlefsen’s argument. 8Gödel [1958], p. 241.

conception of strict finitism above. So, intuitionistic mathematics lies higher on the scale of constructivity than strict finitism, and its methods of proof and construction are less satisfactory and less convincing. This vague nature of the intuitionistic conception of the totality of proofs means, for Gödel, that such a foundation is of “doubtful value”.9 The hope then is that one can find a consistency proof for transfinite mathematics relative to constructive mathematics that is of greater epistemological significance. And this is exactly what Gentzen claims for the principle of transfinite induction applied to certain transfinite ordinals that is central to his result. But, of course, therein lies the rub. Does it count as epistemically ‘good enough’? While one often speaks of Gentzen’s consistency proof, he actually produced four different versions of the proof.10 Our focus will be on the third (Gentzen [1938]). Gentzen’s proof proceeds by providing a reduction procedure for derivations of the empty sequent. The proof shows that for each derivation of the empty sequent, there exists another, less complex, reduced version that has a lower ordinal. Since no such simplest derivation of the empty sequent can exist, there can be no derivation of the empty sequent – and hence of a contradiction – at all, and the system is consistent. The methods involved in this third proof are representable in PRA. And if one takes seriously the identification of finitary reasoning with PRA, then Gentzen’s results show what further principle is required for Hilbert’s Program, namely transfinite

induction up to ε₀ restricted to quantifier-free formulas. Moreover, Gentzen showed in his

fourth proof (Gentzen [1943]) that transfinite induction up to ε₀ is the weakest induction principle not provable in Peano arithmetic. It is worth mentioning one important, though often overlooked, technical aspect of Gentzen’s proof, namely that primitive recursive arithmetic plus quantifier-free transfinite induction is incomparable to Peano arithmetic – it is not weaker than PA since it proves the consistency of Peano arithmetic; but neither is it stronger since it does not prove the principle of ordinary mathematical induction. It is commonly assumed that proving the

9Gödel [1933], p. 53. 10For a nice overview of the differences see Siders [2012].

consistency of a theory requires appealing to another theory that is of strictly greater logical

strength. After all, one can have T1 ⊈ T2 and T2 ⊈ T1, and yet T2 of greater consistency strength than T1. When trying to present mathematical results to a wide or general audience one is immediately caught in a balancing act between mathematical precision and overall clarity. Gentzen’s presentation of his account has the benefit of being largely explained in prose. Doing so allows the overall structure and motivations of the proof to shine through, as well as hinting at its philosophical significance. The tradeoff, though, is that he is sketchy in certain areas and many of the details are left to the reader. He even admits this, writing that

the main emphasis will be on developing the fundamental ideas and on making every single step of the proof as lucid as possible. For this purpose I shall in places dispense with the explicit exposition of certain details, viz., in those places where this is unimportant for the understanding of the context as a whole and where it can furthermore be supplied by the reader himself without much difficulty.11

The modern version of the Gentzen-style consistency proof is due to Takeuti [1987]. There he presents things in a much more mathematically precise way. The tradeoff here is that the proof becomes quite dense, and non-mathematicians can easily lose the forest for the trees, so to speak. While the mathematical presentation allows one to see an overview of the big pieces of the proof, there is often too little explanation of the rationale behind the individual pieces that make up those big pieces. We believe that while much is to be gained philosophically from understanding the broad picture, the broad picture hangs together through the details, and so much is also to be gained philosophically from understanding them. The devil is, after all, often in the details. The goal for the rest of this chapter is to strike this balance. As such, the following relies heavily on Gentzen's original [1938] paper as well as Takeuti's modernized presentation in [1987]. We begin by following Gentzen, providing the big-picture motivations behind the details. We then present (part of) Takeuti's version of the proof in mathematical detail, providing general commentary along the way in order to instill a deep understanding of

11Gentzen [1938], p. 252.

just what the philosophical significance of such a consistency proof is. First, though, some preliminaries.

4.2 Some preliminaries

A sequent is an expression of the form

A₁, A₂, ..., Aₙ → B₁, B₂, ..., Bₘ,

where the A₁, A₂, ..., Aₙ, B₁, B₂, ..., Bₘ may be arbitrary formulae. The A's are called antecedent formulae and the B's succedent formulae. A sequent with multiple succedents is taken to mean that at least one of the succedents holds. One of the notable differences between Gentzen's earlier proofs and his 1938 proof is that in the latter he allows multiple succedents. He does so for the notable technical advantages that follow from symmetrizing the sequents. Nonetheless, such a change signals a departure from the way that mathematicians normally operate. As Gentzen himself concedes,

It must be admitted that this new concept of a sequent in general already constitutes a departure from the 'natural' and that its introduction is primarily justified by the considerable formal advantages exhibited by the representation of the forms of inference following below which this concept makes possible.12

A sequent with no antecedent formulas is taken to mean that at least one of the succedents is true, independently of any assumptions. A sequent with empty succedents is taken to mean that the assumptions lead to contradiction. That is, given the assumptions, no possibility

holds. And an empty sequent → is taken as meaning that under no assumptions, no possibility holds, i.e. the system itself is inconsistent. It is easy to see, then, that the goal of the consistency proof is to show, proof-theoretically, that deriving an empty endsequent is impossible. An inference figure consists of a line of inference, an upper sequent (above the line of inference), and a lower sequent (below the line of inference). The upper sequent represents the

12Gentzen [1938], p. 255.

premises and the lower sequent the conclusion of the inference. The schemata for inference figures are as follows: structural rules (thinning, contraction, interchange, and cut); logical rules for ¬, ∧, ∨, ⊃, ∀, and ∃; and the formal counterpart of complete induction.

Thinning:

  left:    Γ → Δ                 right:   Γ → Δ
          ──────────                     ──────────
          D, Γ → Δ                       Γ → Δ, D

Contraction:

  left:   D, D, Γ → Δ            right:   Γ → Δ, D, D
          ─────────────                  ─────────────
           D, Γ → Δ                       Γ → Δ, D

Exchange:

  left:   Γ, C, D, Π → Δ         right:   Γ → Δ, C, D, Λ
          ───────────────                ────────────────
          Γ, D, C, Π → Δ                  Γ → Δ, D, C, Λ

Cut:

   Γ → Δ, D     D, Π → Λ
  ───────────────────────
       Γ, Π → Δ, Λ

Negation:

  left:   Γ → Δ, D               right:   D, Γ → Δ
          ──────────                     ──────────
          ¬D, Γ → Δ                      Γ → Δ, ¬D

Conjunction:

  ∧ left:   C, Γ → Δ        and     D, Γ → Δ
           ──────────────         ──────────────
           C ∧ D, Γ → Δ           C ∧ D, Γ → Δ

  ∧ right:   Γ → Δ, C     Γ → Δ, D
            ──────────────────────
                Γ → Δ, C ∧ D

Disjunction:

  ∨ left:   C, Γ → Δ     D, Γ → Δ
           ──────────────────────
               C ∨ D, Γ → Δ

  ∨ right:   Γ → Δ, C       and     Γ → Δ, D
            ──────────────         ──────────────
            Γ → Δ, C ∨ D           Γ → Δ, C ∨ D

Conditional:

  ⊃ left:   Γ → Δ, C     D, Π → Λ         ⊃ right:   C, Γ → Δ, D
           ───────────────────────                  ──────────────
             C ⊃ D, Γ, Π → Δ, Λ                     Γ → Δ, C ⊃ D

Universal:

  ∀ left:   F(t), Γ → Δ           ∀ right:   Γ → Δ, F(a)
           ───────────────                  ────────────────
           ∀xF(x), Γ → Δ                    Γ → Δ, ∀xF(x)

where t is an arbitrary term and a does not occur in the lower sequent. a is called the eigenvariable.

Existential:

  ∃ left:   F(a), Γ → Δ           ∃ right:   Γ → Δ, F(t)
           ───────────────                  ────────────────
           ∃xF(x), Γ → Δ                    Γ → Δ, ∃xF(x)

where t is an arbitrary term and a does not occur in the lower sequent.

Induction:

   F(a), Γ → Δ, F(a′)
  ────────────────────
   F(1), Γ → Δ, F(t)
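To fix ideas, here is a small sample derivation assembled from these schemata. It is our own illustration, not one of Gentzen's, typeset with the LaTeX bussproofs package: two ∧-left inferences on basic logical sequents, followed by one ∧-right inference.

```latex
\documentclass{article}
\usepackage{bussproofs}
\begin{document}
% A derivation of A ∧ B → B ∧ A from the basic logical sequents B → B and A → A.
\begin{prooftree}
  \AxiomC{$B \rightarrow B$}
  \RightLabel{$\wedge$ left}
  \UnaryInfC{$A \wedge B \rightarrow B$}
  \AxiomC{$A \rightarrow A$}
  \RightLabel{$\wedge$ left}
  \UnaryInfC{$A \wedge B \rightarrow A$}
  \RightLabel{$\wedge$ right}
  \BinaryInfC{$A \wedge B \rightarrow B \wedge A$}
\end{prooftree}
\end{document}
```

Note that both premises of the ∧-right inference share the same antecedent A ∧ B, as the schema requires.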

The formula in the schema of the logical rules containing the connective is called the principal formula. The principal formula is either 'eliminated', and found among the antecedent formulae (on the left), or 'introduced', and found among the succedent formulae (on the right). Note that principal formulae are always located in the lower sequent. Arranging inferences in this way plays an important role in the consistency proof itself, since the complexity of the inferences is always increasing. The degree of a formula is the total number of connectives in it. The degree of a cut is the degree of its cut formulae. The degree of an induction inference is the degree of the formula in the scheme designated by F(a). The level of a sequent is the maximum of the degrees of the cuts and inductions which occur below it. A prime formula is a formula that contains no logical connectives, i.e. is of degree 0. A cut is called essential if its cut formula is not prime. Basic sequents are classified into two kinds: basic logical sequents and basic mathematical

sequents. A basic logical sequent is a sequent of the form A → A. A basic mathematical

sequent is a sequent containing only prime formulae that become 'true' upon substituting arbitrary numerical terms for any occurrences of free variables. Because all predicates are decidable, prime formulae without free variables are themselves immediately decidable. A derivation is a tree-like structure consisting of a number of sequents, the lowermost of which is called the endsequent, the uppermost of which are called initial sequents, such that each initial sequent is connected to the endsequent by inference figures. Note that while there may be multiple initial sequents there will only be one endsequent. A path in a derivation is a sequence of sequents beginning from a given initial sequent, passing from an upper sequent of an inference figure to the lower sequent of that inference figure, and terminating with the endsequent. The ending of a derivation consists of all sequents in the derivation found by beginning with the endsequent, traveling up their respective paths, and stopping when one reaches the lower sequent of a logical inference. This lower sequent is included in the ending. The ending thus obviously contains only induction inferences or structural inferences. Call identical formulae in the upper and lower sequents of a structural inference clustered. And call the totality of all formulae in the ending clustered with any particular formula a cluster of formulae. Importantly, with every cluster is associated a cut such that its cut formula belongs to the cluster. This is so since every formula in the ending is clustered with a formula in the sequent immediately below it, except when it is a cut formula. And since the endsequent is empty, every formula in the ending must at some point be cut away. Moreover the cut associated with the cluster is uniquely determined. Starting with this cut, trace the cluster upward, both on the left of the cluster and the right, until you reach the uppermost formula of the cluster.
The left and right sides together constitute the whole cluster.
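The degree of a formula, defined above as the total number of connectives it contains, admits a direct recursive computation. Here is a minimal sketch; the nested-tuple representation of formulas is our own illustration, not notation from Gentzen or Takeuti:

```python
# Formulas represented as nested tuples (our own encoding):
#   ("atom", "P")            a prime formula
#   ("not", f)               negation           ("and", f, g)      conjunction
#   ("or", f, g)             disjunction        ("implies", f, g)  conditional
#   ("forall", "x", f)       universal          ("exists", "x", f) existential

def degree(formula):
    """The degree of a formula: the total number of connectives in it."""
    if formula[0] == "atom":
        return 0
    # count this connective plus those of the immediate subformulas
    return 1 + sum(degree(part) for part in formula[1:] if isinstance(part, tuple))

def is_prime(formula):
    """A prime formula contains no logical connectives, i.e. has degree 0."""
    return degree(formula) == 0
```

For example, ¬(P ∧ Q) has degree 2 on this count, and every atom is prime.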

4.3 Transfinite induction and its justification: Gentzen

4.3.1 The Proof

Above we noted that a proof whose endsequent is the empty sequent is a proof of inconsistency. So to show that every derivation fails to establish inconsistency, one proves that no derivation has an empty endsequent. The consistency proof proceeds inductively by first proving the consistency of simple derivations, and then of more complex derivations by appeal to the consistency of the simple derivations. And because each stage of complexity involves an infinite sequence of derivations, the induction used is transfinite induction. To sum up, Gentzen breaks the proof down into three stages:

1. The consistency of an arbitrary derivation is reduced to the consistency of all 'simpler' derivations. This is done by defining an - unambiguous - reduction step for arbitrary 'contradictive derivations', i.e. derivations with the empty sequent as endsequent; this step transforms such a derivation into a 'simpler' derivation with the same endsequent...

2. Then a transfinite ordinal number is correlated with every derivation and it is shown that in a reduction step the contradictive derivation concerned is turned into a derivation with a smaller ordinal number. In this way the so far only loosely determined concept of 'simplicity' receives its precise sense: the larger the ordinal number of a derivation, the greater its 'complexity' in the context of this consistency proof...

3. From this the consistency of all derivations then obviously follows by 'transfinite induction'. The inference of transfinite induction which, at first, is a rather 'disputable' inference, may not be presupposed in the consistency proof nor proved as in set theory. This inference requires rather a separate justification by means of indisputable 'constructive' forms of inference...13

We consider them in turn.

Reducing consistency

Suppose one has a derivation of the empty endsequent. One gives a reduction step that transforms this derivation into a ‘simpler’ derivation of the same empty endsequent. The idea behind proceeding this way, and for thinking that such a reduction in complexity is possible at all, has to do with the complexity of the proposition expressing inconsistency, e.g. 0=1. If such a ‘simple’ proposition as 0=1 can be proven, it must be because of certain

13Gentzen [1938], p. 261.

operational inferences in the proof. After all, given the simplicity of initial basic sequents, their decidability, and the forms of the various structural rules, such a derivation would, without appeal to operational rules, be impossible. It is also clear that somewhere in the proof there must be a proposition of maximal complexity (degree). Such a proposition would have to be introduced at some point in the derivation and then subsequently eliminated, since the basic initial sequents and the contradictory endsequent are simple. In the same spirit as his Hauptsatz, Gentzen provides reduction procedures for the operational rules showing that there is a more direct passage to the conclusions of derivations involving said rules. Of course operational rules are not the only way to increase the complexity of the derivation; there might also be an introduction of increased complexity by means of a complete induction. Having 'prepped' the derivation by replacing all free variables with a numerical term and determining that the ending contains at least one induction inference, select an induction inference that occurs above no other induction inference. That is, select an induction inference such that there exists no induction inference on the path between the one chosen and the endsequent; it is the lowermost induction inference on its path. In this case, as per above, the induction inference has the form:

   F(a), Γ → Δ, F(a′)
  ────────────────────
   F(1), Γ → Δ, F(n)

where n is a numerical term. Note that, as indicated above, in the general schema for induction inferences n could be a general term t. But the case where t is a variable cannot arise here, due to our preparation of the derivation. Indeed no free variable can occur on the derivational path from here to the endsequent. Moreover, that derivational path will contain only structural inferences. One then replaces the original induction inference with the following reduction:

F(1), Γ → Δ, F(1′)    F(1′), Γ → Δ, F(1′′)
─────────────────────────────────────────── cut
        F(1), Γ, Γ → Δ, Δ, F(1′′)
        ═════════════════════════
           F(1), Γ → Δ, F(1′′)             F(1′′), Γ → Δ, F(1′′′)
           ────────────────────────────────────────────────────── cut
                    F(1), Γ, Γ → Δ, Δ, F(1′′′)
                    ══════════════════════════
                       F(1), Γ → Δ, F(1′′′)
                                ⋮
                       F(1), Γ → Δ, F(n)

where the double line represents the possibility of several interchanges and contractions, and the dots represent the continuation of the same process. If there is no induction inference in the ending then, after 'preparing' the ending by eliminating all the occurrences of thinnings and basic logical sequents, one performs an operational reduction. Importantly, there will exist at least one cluster of formulae in the ending of the derivation with at least one uppermost formula on the left and one on the right that is the principal formula of an operational inference. This is the formal analogue of the informal idea of removing peaks of complexity sketched above. A principal formula that is an uppermost formula on the right corresponds to an introduction. And a principal formula that is an uppermost formula on the left corresponds to an elimination. So a cluster of a principal formula where that formula is both uppermost on the right and on the left, followed by a cut, is the formal analogue of introducing that connective and then subsequently eliminating it. Such a detour is unnecessary, and the elimination of the cut eliminates the detour, thereby lowering the complexity of the entire derivation. To see that there always will exist at least one cluster of formulae in the ending of the derivation with at least one uppermost formula on the left and right that is the principal formula of an operational inference, first note that there has to be at least one operational inference. Were there not, then a 'false' sequent would have been derivable from basic mathematical sequents alone by means of purely structural inferences. And as mentioned above this is clearly impossible given the simplicity, decidability, and 'truth' of the basic mathematical sequents.
The result would be an inference with true upper sequents and a false lower sequent, which is impossible. So given the ending, consider each path whose uppermost

sequent is the lower sequent of an operational inference. Follow these paths down and make note of any formulae that are either principal formulae themselves, or are clustered with a principal formula directly above them. If at any point the property of being clustered is not transferred to the lower sequent, this must be because of the presence of a cut. Such a cut is called a suitable cut. The idea is that the other upper formula of the cut will itself be the principal formula of an operational inference, and also be the upper formula of a branch in the ending. Once one has identified a suitable cut, one may then proceed to perform an operational reduction. The cases for the connectives are all analogous, and Gentzen provides an illustration of how this works for the case in which the terminal connective of the clustered formulae is a ∀. While we will not reproduce the derivation and corresponding reduction, the rough idea involves duplicating the original inference, then introducing the original cut formulae not as the principal formulae of operational inferences, but rather by thinnings. Doing so leaves the correctness of the reasoning involved intact, but reduces the complexity of the cut by reducing the complexity of the cut formulae. This is so because of the particular details of the ordinal assignments to be discussed below. Continuing in this way one has an effective method for reducing the complexity of any supposed derivation of an inconsistency, since for any contradictive derivation there will be a place at which one can perform some kind of reduction. We say "some kind" because of complications such as not being able to perform an induction reduction, or the formula of highest degree not being amenable to a reduction. In the latter case one can always find a 'relative extremum'. The idea is that were there a proof of a contradiction there would also be a proof of said contradiction from a set of basic sequents.
But there can be no such proof from the basic sequents (for the reasons mentioned above). In summary, given an arbitrary proof of a contradiction, one must in some sense 'prepare' the derivation (step 1). This is done by replacing all free variables in the derivation (with the exception of eigenvariables) with an arbitrary numeral, say, 1,

for convenience. (After all, they are redundant.) Now consult the ending of the derivation. Recall that the ending contains only induction inferences or structural rules. If there is an induction inference, then one performs the appropriate ind-reduction (step 2) alluded to above. Otherwise one performs an operational reduction (step 5), after first eliminating any thinnings (step 3) and basic logical sequents (step 4) that may occur in the ending. The informal idea behind eliminating thinnings is that a thinning is itself simply a weakening of the logical strength of a sequent. And if a contradiction can be derived from the weakened sequent, then it can be derived without it. As for basic logical sequents, since they are merely tautologies they play no role in the mere structural transformations found in the ending. In each case, the reductions eliminate various essential cuts and induction inferences from the ending, thereby reducing the proof's complexity.
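The ind-reduction of step 2 replaces the induction inference by a chain of cuts, one for each successor step up to the numeral n. The following sketch generates the conclusions of that chain; it is entirely our own illustration, with sequents represented as plain strings, so nothing proof-theoretic is actually checked:

```python
# A symbolic sketch of Gentzen's ind-reduction.  Numerals are written
# 1, 1', 1'', ...; all names here are our own, not Gentzen's.

def numeral(k):
    """The numeral 1 followed by k successor strokes."""
    return "1" + "'" * k

def unfold_induction(k, gamma="Γ", delta="Δ"):
    """List the conclusions of the chain of cuts (each followed by the
    interchanges and contractions that restore a single Γ and Δ) which
    replaces an induction inference whose conclusion is
    F(1), Γ → Δ, F(numeral(k))."""
    steps = []
    for i in range(2, k + 1):
        # cut the sequent obtained so far against the premise
        # F(numeral(i-1)), Γ → Δ, F(numeral(i)), then contract Γ, Γ and Δ, Δ
        steps.append(f"F(1), {gamma} → {delta}, F({numeral(i)})")
    return steps
```

Unfolding up to the numeral 1''' takes two cuts; unfolding up to 1' takes none, which corresponds to the special case n = 1′ where the conclusion is already an instance of the premise schema.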

Ordinal assignments

We are now in a position to consider ordinal assignments to proofs. The transfinite ordinals that are required form a segment of the second number class but are generated entirely constructively. Starting with the sequence of natural numbers 1, 2, 3,... one then introduces

the number ω. ω is then followed by ω + 1, ω + 2, ω + 3, and so on. Continuing in this manner, behind all numbers of the form ω + n follow ω·2, ω·2 + 1, ω·2 + 2, and so on, then ω·3, ω·3 + 1, etc. The procedure is repeated until we get numbers of the form ω², ω³, ..., ω^ω, and so on. The limit required for the purposes of Gentzen's consistency proof is the limit of the sequence ω, ω^ω, ω^(ω^ω), ..., namely ε₀. Note that each ordinal α constructed that is not 0 can be represented uniquely as

ω^(α₁) + ω^(α₂) + ··· + ω^(αₙ),

where α₁, ..., αₙ are ordinals and α₁ ≥ ··· ≥ αₙ. This is known as the normal form of α. Derivations are assigned ordinal numbers by traveling down from initial sequents to the endsequent. Each sequent in the derivation as well as each inference line is assigned an ordinal. The ordinal number of the endsequent is the ordinal number of the entire derivation. Each uppermost sequent gets ordinal 1. Inference lines are assigned as follows: structural

inferences do not change the ordinal assignment of the upper sequents. The ordinal of a cut inference is the natural sum of the ordinal numbers of the two upper sequents. Operational inferences add 1. Induction inferences where the ordinal of the upper sequent is ω^(α₁) + ··· + ω^(αᵥ) (v ≥ 1) are assigned the ordinal ω^(α₁+1). If the ordinal of the inference line is α, the ordinal of the lower sequent below the inference is calculated as follows: if the level of the lower sequent is the same as that of the upper sequent, then the ordinal number of the lower sequent is equal to α. If its level is lower by 1, the ordinal number of the lower sequent is ω^α. If lower by 2, the ordinal number of the lower sequent is ω^(ω^α). And so on. These definitions and proofs are entirely 'finitist'. What is of particular importance is that whatever ordinal number is assigned to an inconsistent derivation, the ordinal numbers of the various reductions will themselves be less than that of the derivation being reduced. Because of its importance for appealing to transfinite induction, and because we illustrated its reduction procedure above, we explain how this works in the case of complete induction.

So suppose that the ordinal of the upper sequent of the induction inference is ω^(α₁) + ··· + ω^(αᵥ). Then, following the custom laid out above, the ordinal of the inference line will be ω^(α₁+1). Since the cuts associated with the clusters to which F(1) and F(n) belong must occur further down in the derivation and have the same degree as the induction inference figure, the ordinal of the lower sequent of the inference is also ω^(α₁+1). In the new derivation, each of the uppermost sequents receives the same ordinal number as the original inference figure, namely ω^(α₁) + ··· + ω^(αᵥ). And since all sequents in the reduced figure have the same level as the two sequents of the original induction inference, the ordinal number of the lowest sequent of the figure is equal to the natural sum of all of these numbers ω^(α₁) + ··· + ω^(αᵥ). As such it begins with ω^(α₁) + ..., which is clearly smaller than ω^(α₁+1). The ordinal number of the entire new derivation has thus also been decreased, since everything below the new derivation is preserved and consists of only structural inference figures. If in the induction inference n equals 1, then the new derivation receives the ordinal number 1. And since its ordinal in the old derivation was at least ω, this is obviously a decrease.
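The ordinal apparatus just described (normal forms below ε₀, comparison, the natural sum, and the assignment ω^(α₁+1) to induction inferences) can be made computationally concrete. The following is a minimal sketch; the nested-tuple encoding of normal forms is entirely our own illustration, not Gentzen's notation:

```python
from functools import cmp_to_key

# An ordinal < ε₀ in normal form ω^(α₁) + ... + ω^(αₙ) with α₁ ≥ ... ≥ αₙ is
# encoded as the tuple of its exponents, each exponent itself such a tuple:
#   ()         is 0          ((),)       is ω^0 = 1
#   ((), ())   is 2          (((),),)    is ω^1 = ω
ZERO, ONE = (), ((),)

def cmp_ord(a, b):
    """Compare two normal-form ordinals: -1, 0, or 1."""
    for x, y in zip(a, b):
        c = cmp_ord(x, y)
        if c != 0:
            return c
    return (len(a) > len(b)) - (len(a) < len(b))

def natural_sum(a, b):
    """Hessenberg (natural) sum: merge the exponents, keeping them in
    non-increasing order."""
    return tuple(sorted(a + b, key=cmp_to_key(cmp_ord), reverse=True))

def omega_power(a):
    """The ordinal ω^a."""
    return (a,)

def induction_line_ordinal(upper):
    """Gentzen's assignment ω^(α₁+1) for an induction inference whose upper
    sequent has ordinal ω^(α₁) + ... + ω^(αᵥ), v ≥ 1."""
    return omega_power(natural_sum(upper[0], ONE))
```

For an upper sequent of ordinal 2 = ω⁰ + ω⁰ the inference line receives ω⁰⁺¹ = ω, and the natural sum of any finite number of copies of 2 stays strictly below ω, exactly as the induction reduction requires.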

Transfinite induction

The reason why Gentzen appeals to transfinite induction in the consistency proof has to do with the structure of proofs and the complexity of the inferences used therein. After all, a proof is just a collection of subproofs, these being fundamentally individual inference steps. The potentially infinite complexity of any particular inference step is what results in the need to go into the transfinite. For suppose that in the proof a proposition is proved for all natural numbers by means of a complete induction. On the constructive conception this depends on the correctness of an infinite number of proofs, one for each particular natural number. Moreover, we need to consider all conceivable number-theoretic proofs. Each proof is then assigned a certain transfinite ordinal number.

[W]e need the transfinite ordinal numbers for the following reason: It may happen that the correctness of a proof depends on the correctness of infinitely many simpler proofs. An example: Suppose that in the proof a proposition is proved for all natural numbers by complete induction. In that case the correctness of the proof obviously depends on the correctness of every single one of the infinitely many individual proofs obtained by specializing to a particular natural number. Here a natural number is insufficient as an ordinal number for the proof, since each natural number is preceded by only finitely many other numbers in the natural ordering. We therefore need the transfinite ordinal numbers in order to represent the natural ordering of the proofs according to their complexity.14

If induction inferences were eliminated then it would suffice to limit our ordinal assignments to the natural numbers. So the reason why one needs to appeal to transfinite induction is that induction inferences are involved. Every induction inference pushes the ordinal assignment up by a factor of ω.
That is, the reason why the ordinal ω^(α₁+1) is chosen for induction inferences is that we need to account for the limit of all n-fold multiples of the ordinal number of the upper sequent, namely ω^(α₁)·ω. Moreover, this highlights the connection between the magnitude of the ordinal number of a derivation and the highest degree of the formulae occurring in the derivation. The ordinal number of derivations in which only formulae of degree 0 occur will be smaller than ω^ω. If the degree of the highest formulae equals 1, then the derivation will have ordinal number less than ω^(ω^ω). If the degree of the highest formulae equals 2, then

14Gentzen [1936-37], p. 232.

the ordinal number is smaller than ω^(ω^(ω^ω)). And so on.15 The reason why one must appeal to transfinite induction up to ε₀ has to do with Gödel's incompleteness theorems. For transfinite induction up to a number below ε₀ is provable in our formalism. So, assuming we retain representability, were there a consistency proof carried out by means of this induction, we would contradict Gödel's theorem.16

4.3.2 Transfinite justification: Gentzen and Gödel

The reduction procedures are effective and every reduction reduces the complexity of the derivation. Moreover, they are acceptable to the finitist. That is, the technical aspects of the proof are all unproblematic. Of course what matters is whether this is all philosophically satisfying. For the whole point of the consistency proof was to proceed by means that are epistemically more secure. But is it? That is, do we have a justification of transfinite induction up to ε₀? After all, as Gentzen stressed in stage 3 quoted above,

[t]he inference of transfinite induction which, at first, is a rather 'disputable' inference, may not be presupposed in the consistency proof nor proved as in set theory. This inference requires rather a separate justification by means of indisputable 'constructive' forms of inference...

For Gentzen, what is important for the construction of ordinals, and in particular for its epistemic safety, is that given that a number already exists, one can form its successor, and that given an infinite sequence one can form a new ranking behind the entire sequence. The natural thing to wonder is whether this violates the constructivity requirement. After all, it seems that in some sense one appeals to the actualist conception; is not the completed sequence of natural numbers presupposed in the formulation of ω? Gentzen’s answer is “no,” the idea being that the concept of infinity involved here can still be understood in purely potential terms.

Regardless of how far we may go in constructively forming new natural numbers, the number ω stands in the order relation n < ω to any such natural number n. And the infinite sequences that arise in the formation of the other ordinal numbers should be interpreted in precisely the same way.17

15Gentzen [1938], p. 285. 16Gentzen [1938], p. 286. 17Gentzen [1936-37], p. 231.

Transfinite induction is taken to be a natural extension of the rule of complete induction to the transfinite ordinal numbers, a rule of whose correctness Gentzen thinks one can easily convince oneself: beginning with 1, we know that the proposition holds for 2, and hence for 3, and hence for any natural number n. We can thereby conclude that it holds of all natural numbers, and so also holds of ω. Continuing as before we can conclude that it holds for

ω + 1, ω + 2, and so on, eventually convincing ourselves that it holds of ω·2, etc. In this way we may thereby convince ourselves of the truth of the principle of transfinite induction

(up to ε₀). Despite the fact that Gentzen seems to think that the jump to limit ordinals is obvious, this strikes me as by no means trivial. This idea of being able to safely "travel up through" the ordinals is sometimes called accessibility. It is important that accessibility not presuppose any completed infinite totalities, considering them instead as simply potential. Moreover, there is no general sense of accessibility beyond the individual cases, and it is supposed to be in line with the constructive rejection of impredicativity.

The concept of ‘accessibility’ in the ‘theorem of transfinite induction’ is of a very special kind. It is certainly not decidable in advance whether it is going to apply to an arbitrary given number;...this concept therefore has no immediate sense, since an ‘actualist sense’ has after all been rejected. It acquires a sense merely by being predicated of a definite number for which its validity is simultaneously proved. It is quite permissible to introduce concepts in this way; the same situation arises, after all, in the case of all transfinite propositions if a finitist sense is ascribed to them... With the statement that ‘if all numbers smaller than β have already been recognized as accessible, then β is also accessible’, the definition of the ‘accessibility’ is already formulated in conformity with this interpretation. No circularity is involved in this formulation; the definition is, on the contrary, entirely constructive; for β is counted as accessible only when all numbers smaller than β have previously been recognized as accessible. The ‘all’ occurring here is of course to be interpreted finitistically...; in each case we are dealing with a totality for which a constructive rule for generating all elements is given.18

Gödel, though conceding that Gentzen's result is not of trivial epistemic value, does not consider the appeal to transfinite induction to be finitistically satisfiable. He notes that it is possible to define an ordering of type ε₀ within the strictly finitistic system (PRA), but that any attempted proof of its well-foundedness will violate the predicative requirement of constructivity, since the property the induction is applied to is that of being an

18Gentzen [1936], p. 195.

ordinal.19 After all, the very notion of a well-ordering is impredicative. Nonetheless, though impredicative, this general method of defining an ordinal by induction on ordinals has "a high degree of intuitiveness". It can, after all, be considered a generalization of ordinary induction, "and in this respect the deviation from [the induction requirement] is perhaps not such a drastic one."20 So he seems to agree that there is a certain epistemological significance to Gentzen's proof: "One will not be able to deny of Gentzen's proof that it reduces operating with the transfinite to something more evident (the first ε-number)."21 Of course it is less satisfying than would be a reduction to the basic finitary system PRA alone. Nonetheless, given that that cannot itself be achieved, Gentzen's proof stands as a step in the right direction. The challenge for Gentzen's proof, and one that Gödel does not think can be overcome, lies in showing that the ordinals < ε₀ as defined by the recursion clauses are well-ordered "without using set-theoretical methods of proof."22 Gödel expounds on these ideas in [1958] and [1972], claiming that Gentzen's method provides strong evidence to support Bernays' claim that certain abstract notions are required in the consistency proof. The bulk of Gödel's argument lies in the claim that once one passes a certain level, the accessibility of the ordinals is no longer strictly concrete.

Recursion for ε₀ could be proved finitarily if the consistency of number theory could. On the other hand the validity of this recursion can certainly not be made immediately evident, as is possible for example in the case of ω². That is to say, one cannot grasp at one glance the various structural possibilities which exist for decreasing sequences, and there exists, therefore, no immediate concrete knowledge of the termination of every such sequence. But furthermore such concrete knowledge (in Hilbert's sense) cannot be realized either by a stepwise transition from smaller to larger ordinal numbers, because the concretely evident steps, such as α → α², are so small that they would have to be repeated ε₀ times in order to reach ε₀. . . . What can be accomplished is only an abstract knowledge based on concepts of higher level, e.g., on "accessibility". This concept can be defined by the fact that the validity of induction is constructively demonstrable for the ordinal in question.23

As for this notion of accessibility, Gödel says, Accessibility. . . create[s] the deceptive impression of being based on a concrete intuition of certain infinite procedures, such as “counting beyond ω” or “running through” the ordinals smaller than an ordinal α. We do have such an intuition, but it does not reach very far in the

19Gödel [1938], p. 107. 20Gödel [1938], p. 107. 21Gödel [1938], p. 113. 22Gödel [1941], p. 194. 23Gödel [1972], p. 273-274.

series of ordinals, certainly no farther than finitism. In order to make the concept of accessibility fruitful, abstract conceptions are always necessary, e.g., insights about infinitely many possible insights in Gentzen's original definition. . . 24

Compare this to the Gentzen quote above. The real insight of Bernays' observations, according to Gödel, is that they show us that we must make a distinction between two different "component parts" of finitary mathematics: the constructivistic element embodied by the exhibition requirement, and the specifically finitistic element which requires that mathematical objects and facts be given in concrete mathematical intuition, i.e. finite space-time configurations.25 The upshot, as far as consistency proofs go, is that this second requirement must be dropped, i.e. we must make some appeal to abstract concepts.

4.4 Takeuti’s justification

Though Gentzen’s proof is relatively accessible, the standard modern presentation of it is due to Takeuti [1987]. While the details of Takeuti’s proof are slightly different, the overview of the proof and its basic structure, as Takeuti presents it, is in accord with Gentzen’s:

1) We present a uniform method such that, if a proof-figure P is concretely given, then the method enables us to concretely construct another proof-figure P′; furthermore, the end-sequent of P′ is the same as that of P if the end-sequent of P does not contain quantifiers. The process of constructing P′ from P is called the "reduction" (of P) and may be denoted r. Thus P′ = r(P).

2) There is a uniform method by which every proof-figure is assigned an ordinal < ε₀. The ordinal assigned to P (the ordinal of P) may be denoted by o(P).

3) o and r satisfy: whenever a proof-figure P contains an application of ind or cut, then o(P) > ω and o(r(P)) < o(P), and if P does not contain any such application, then o(P) < ω.26

Though the particular details do not concern us here, central to his proof is establishing the following Lemma:

Lemma 4.4.1 If P is a proof of → (the empty sequent), then there is another proof P′ of → such that o(P′) < o(P).27

24 Gödel [1972], p. 272, fn. c.
25 Gödel [1972], p. 274.
26 Takeuti [1987], p. 92.
27 This is Takeuti's Lemma 12.8.

The steps to establishing the Lemma are in line with those of Gentzen described above, and can be summarized as follows:

1. Replace free variables in the ending not used as eigenvariables with 0. This does not change o(P). Go to 2.
2a. If there is an induction inference in the ending, then perform an induction reduction on the lowest one. The ordinal of the reduced subproof will be lower, i.e. o(P′) < o(P). Go to step 1.
2b. If there is no induction inference in the ending, go to step 3.
3a. Eliminate all logical initial sequents D → D from the ending. This will also result in o(P′) < o(P). Go to step 1.
3b. If there are no logical initial sequents in the ending, go to step 4.
4a. Eliminate all weakenings. Two subcases: (i) if no contraction is present then o(P′) < o(P); (ii) if contraction(s) are present then o(P′) = o(P). Return to step 1.
4b. If there are no weakenings in the ending, go to step 5.
5. Since P will contain a suitable cut (recall from the discussion above), perform the appropriate logical reduction. The ordinal of the reduced subproof will be lower, i.e. o(P′) < o(P). Return to step 1.

Call a proof simple if it contains no free variables and only mathematical initial sequents, weak inferences, and inessential cuts. If we could then show that the reduction procedure involved in the proof of Lemma 4.4.1 must eventually terminate, it would follow that we'd be left with a simple proof of →. But a quick semantic argument shows that there can be no such simple proof.28 And since there can be no such proof P of →, it would follow that PA is consistent. As Takeuti puts it,

Suppose we have concretely shown that any strictly decreasing sequence of natural numbers is finite, and that whenever a concrete method of constructing decreasing sequences of ordinals < ε₀ is given it can be recognized that any decreasing sequence constructed this way is finite (or such a sequence terminates). (By "decreasing sequence" we will always mean strictly decreasing sequence.) We can then conclude, in the light of 1)-3) above, that, for any given proof-figure P whose end-sequent does not contain quantifiers, there is a concrete method of transforming it into a proof-figure with the same end-sequent and containing no applications of the rules of cut and ind. It can be easily seen, on the other hand, that no proof-figure without applications of a cut or ind can be a proof of the empty sequent. Thus we can claim that the consistency of a system has been proved.29
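The shape of the argument in 1)-3) can be illustrated with a toy reduction (entirely my own construction, not Takeuti's reduction on proof-figures): states carry an ordinal measure below ω², encoded as a lexicographically ordered pair, and every reduction step strictly lowers the measure, so every run halts, even though the second coordinate can jump arbitrarily high whenever the first drops, so no single natural-number measure would do.

```python
# A toy reduction r on states with an ordinal measure o below w^2,
# encoded as a pair (a, b) standing for w*a + b and compared
# lexicographically. Every step strictly lowers o, so every run halts.
def o(state):
    pending, small = state
    return (pending, small)  # lexicographic order mirrors w*pending + small

def r(state):
    pending, small = state
    if small > 0:
        return (pending, small - 1)    # drop in the w^0 part
    return (pending - 1, 2 * pending)  # drop in the w^1 part; b may grow

state, steps = (3, 1), 0
while state != (0, 0):
    nxt = r(state)
    assert o(nxt) < o(state)           # the measure strictly decreases
    state, steps = nxt, steps + 1
```

The point of the toy is exactly Takeuti's 3): ordinary induction on a single natural number cannot track the process, but a decreasing ordinal measure can.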

The crucial point is thus to demonstrate:

28 Recall Gentzen's remarks above. See also Lemma 12.3 of Takeuti [1987], p. 102. For a proof-theoretic argument that there is no simple proof of → see Siders [2015].
29 Takeuti [1987], p. 92.

(∗) Whenever a concrete method of constructing decreasing sequences of ordinals is given, any such decreasing sequence must be finite.

Call the property (∗) accessibility. It is important to note that accessibility is actually different from transfinite induction, and while Gentzen's proof is often presented as requiring

transfinite induction up to ε₀, what the above shows is that it is really accessibility up to ε₀ that is needed.30 Compare this to the quotes from Gödel [1972] above. So where Gentzen largely provides evidence, as opposed to a strict "proof",31 for the principle of transfinite induction, Takeuti endeavors to prove the well-ordering in a way that is as finitistically acceptable as possible, and, though he never mentions Gödel, avoids the features described at the end of the last section to which Gödel claims any proof of consistency will have to appeal. This is the analogue of Gentzen's stage 3 on page 87 above. In particular, Takeuti claims that his standpoint operates only on concretely given figures, avoids any appeal to an "infinite mind", and (almost completely) avoids appeals to abstract concepts. In order to determine whether Takeuti succeeds, we will, in what follows, present his proof

of (∗).

4.4.1 Eliminators

Takeuti’s presentation proceeds in fourteen steps. Steps I-VII provide an illustration of p˚q by means of example. This is useful in giving the basic idea behind the more general characterization found in steps VIII-XIV. Takeuti’s presentation is quite dense, so part of the idea here is to explain the steps of the proof in a way that is accessible to a non- mathematician. This is important because the philosophical importance of the consistency proof lies in the details, namely in showing that such a demonstration is indeed satisfactory to the finitist. Before looking at the details of each section, we will do well to provide a general overview of what is done at each step, so that the reader may see the overall structure of the proof;

30 See Takeuti [1974], p. 366.
31 Cf. Gödel [1938], p. 107.

after all, given the density of Takeuti's presentation, it is easy to lose the forest for the trees, so to speak.

Steps I-VII show, by means of 1-eliminators and n-eliminators, that any sequence a_0 > a_1 > ... , where a_0 < ω^ω, is finite. This is done by induction. First, one shows it holds for ω^1. This is obvious. One then defines a 1-eliminator and shows how it produces a sequence b_0 > b_1 > ... such that if b_0 > b_1 > ... is finite, then so too is a_0 > a_1 > ... , where a_0 < ω^2. One then defines an n-eliminator and uses it, coupled with the induction hypothesis that if b_0 > b_1 > ... is finite, then so too is a_0 > a_1 > ... , where a_0 < ω^n, to show that the claim also holds for ω^(n+1). One thus concludes that it holds for all n, and so holds for any decreasing sequence such that a_0 < ω^ω. This explains the general strategy.

Steps VIII-XIV show how this works in a more abstract setting, and prove that the claim holds for all sequences of ordinals a_0 > a_1 > ... , where a_0 < ε₀. The structure of the argument is similar: having shown that it works for ω^ω, show that it holds for ω_n, where ω_n is a tower of n ω's, i.e. ω^(ω^(···^ω)). Things get a bit tricky, though, when Takeuti introduces α-sequences and (α, n)-eliminators. In §4.4.2 we present the proof of I-VII in detail because it is straightforward enough to follow and illustrates the basic idea of the general case. In §4.4.3 we only sketch the proof of VIII-XIV in order to give the idea of how it proceeds, providing some commentary along the way. The interested reader can find the details of VIII-XIV in the appendix to this chapter.

4.4.2 I-VII - Finitude up to ω^ω

(I) First, suppose that a_0 > a_1 > ... is a decreasing sequence of ordinals concretely given. If a_0 < ω, i.e. a_0 is a natural number, then clearly a_0 > a_1 > ... is finite; indeed, if a_0 = n, it is of length at most n + 1.
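Step (I) is the base of the whole construction and can be checked mechanically; the helper names below are my own, not Takeuti's:

```python
def is_strictly_decreasing(seq):
    """True if seq is strictly decreasing at every adjacent pair."""
    return all(x > y for x, y in zip(seq, seq[1:]))

def max_length_from(a0):
    """A strictly decreasing sequence of naturals starting at a0 has at
    most a0 + 1 terms (the extreme case is a0 > a0-1 > ... > 1 > 0)."""
    return a0 + 1

seq = [5, 3, 2, 0]
assert is_strictly_decreasing(seq)
assert len(seq) <= max_length_from(seq[0])
```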

(II) Suppose then that a_0 > a_1 > ... is a decreasing sequence of ordinals such that a_0 is not a natural number, and suppose that each a_i is written in canonical form, i.e. has the form

ω^{μ^i_1} + ω^{μ^i_2} + ··· + ω^{μ^i_{n_i}} + k_i,

where each μ^i_j > 0 and k_i is a natural number (and including the case where k_i does not appear). Following Takeuti, call ω^{μ^i_1} + ω^{μ^i_2} + ··· + ω^{μ^i_{n_i}} in a_i the 1-major part of a_i, and call any sequence such that each member of the sequence is in 1-major form a 1-sequence. That is, a 1-sequence is one where k_i does not appear for any a_i. We now give a concrete method (M1) such that for any descending sequence a_0 > a_1 > ... where each a_i is written in canonical form, M1 produces a (decreasing) 1-sequence b_0 > b_1 > ... satisfying the condition

(C1) b_0 is the 1-major part of a_0, and we can concretely show that if b_0 > b_1 > ... is a finite sequence, then so is a_0 > a_1 > ... .

Define the method M1, which we call a 1-eliminator, as follows. Put a_i = a'_i + k_i, where a'_i is the 1-major part of a_i. We can thus express a_0 > a_1 > a_2 > ... as a'_0 + k_0 > a'_1 + k_1 > a'_2 + k_2 > ... . Now to construct the 1-sequence b_0 > b_1 > ... . First, set b_0 = a'_0. Suppose b_0 > b_1 > ··· > b_m has been constructed, where b_m is a'_j for some j. Then either a'_j = a'_{j+1} = ··· = a'_{j+p} for some p and a_{j+p} is the last term in the sequence, or a'_j = a'_{j+1} = ··· = a'_{j+p} > a'_{j+p+1}. This is because a'_j = a'_{j+1} = ··· = a'_{j+p} = ... implies that k_j > k_{j+1} > ··· > k_{j+p} > ... , and since such a sequence must be finite (by (I)), it must terminate. So either the whole sequence stops, or a'_{j+p} > a'_{j+p+1} for some p. If the former, stop. If the latter, set b_{m+1} = a'_{j+p+1}. Clearly b_0 > b_1 > ··· > b_m > ... . Suppose further that the sequence is finite, say b_0 > b_1 > ··· > b_m. Then, according to the way of constructing b_{m+1}, the original sequence must itself be finite. Thus b_0 > b_1 > ... satisfies (C1). This completes the definition of M1.
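As a concrete rendering of M1, one can model an ordinal a_i in canonical form as a pair (major, k), where major is a comparable encoding of the 1-major part (here, hypothetically, a weakly decreasing tuple of natural-number exponents, so that Python's lexicographic tuple order agrees with ordinal order) and k is the trailing natural number. The 1-eliminator then simply records each strictly smaller 1-major part as it appears; the function name and encoding are my own illustration, not Takeuti's notation:

```python
def one_eliminator(seq):
    """Takeuti's M1, sketched: from a strictly decreasing sequence of
    (major, k) pairs, extract the decreasing 1-sequence b0 > b1 > ...
    of distinct 1-major parts."""
    b = []
    for major, _k in seq:
        if not b or major < b[-1]:  # the 1-major part strictly dropped
            b.append(major)
    return b

# w*2 + 5 > w*2 + 1 > w + 7 > w > 3 > 1, with 1-major parts encoded as
# exponent tuples: w*2 = w^1 + w^1 -> (1, 1); w -> (1,); finite -> ().
seq = [((1, 1), 5), ((1, 1), 1), ((1,), 7), ((1,), 0), ((), 3), ((), 1)]
assert one_eliminator(seq) == [(1, 1), (1,), ()]
```

If the b-sequence is finite, the k's between consecutive drops form strictly decreasing runs of naturals, each finite by (I), which is exactly the argument for (C1).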

(III) Suppose we are given a decreasing sequence a_0 > a_1 > ... such that a_0 < ω^2. Using the 1-eliminator M1 applied to this sequence we can construct a 1-sequence b_0 > b_1 > ... where b_0 ≤ a_0. We can then rewrite the 1-sequence b_0 > b_1 > ... as ω·k_0 > ω·k_1 > ... . This implies that k_0 > k_1 > ... , which, by (I), implies that k_0 > k_1 > ... is finite. This in turn implies that b_0 > b_1 > ... and a_0 > a_1 > ... are finite. [The last follows because the finitude of k_0 > k_1 > ... implies the finitude of b_0 > b_1 > ... . And this, together with (C1), implies that a_0 > a_1 > ... is finite.]

(IV) Let a_0 > a_1 > ... be a descending sequence written in the form a'_0 + c_0 > a'_1 + c_1 > ... , where if a_i = a'_i + c_i then each monomial in a'_i is ≥ ω^n and each monomial in c_i is < ω^n. Analogously to (II), a'_i is called the n-major part of a_i. Such a sequence is called an n-sequence if every c_i is empty.

Now assuming for induction that any descending sequence d_0 > d_1 > ... , with d_0 < ω^n, is finite, we define a concrete method Mn (an n-eliminator), analogous to M1, such that given a descending sequence a_0 > a_1 > ... , Mn concretely produces an n-sequence b_0 > b_1 > ... satisfying the condition

(Cn) b_0 is the n-major part of a_0, and if b_0 > b_1 > ... is finite, then we can concretely show that a_0 > a_1 > ... is also finite.

To construct the n-sequence, proceed as in the case of the 1-sequence. First, write each a_i as a'_i + c_i where a'_i is the n-major part of a_i. Next, put b_0 = a'_0. Suppose b_0 > b_1 > ··· > b_m has been constructed and b_m is a'_j. If a'_j = a'_{j+1} = ··· = a'_{j+p} and a'_{j+p} is the last term in the sequence, then stop. Otherwise a'_j = a'_{j+1} = ··· = a'_{j+p} > a'_{j+p+1}, for some p. This is because a'_j = a'_{j+1} = ··· = a'_{j+p} implies that c_j > c_{j+1} > ··· > c_{j+p}. And since by definition c_j < ω^n, c_j > c_{j+1} > ··· > c_{j+p} is finite by the induction hypothesis.32 Now define b_{m+1} = a'_{j+p+1}. The sequence b_0 > b_1 > ... satisfies (Cn). This defines Mn.

32 Takeuti justifies a'_{j+p} > a'_{j+p+1} by saying, "hence for some p, c_{j+p+1} ≥ c_{j+p}, which implies a'_{j+p} > a'_{j+p+1}", but I admit that I do not see why c_{j+p+1} ≥ c_{j+p} must be true. Either way it is clear that if a'_{j+p} is not the last term in the sequence then a'_{j+p} > a'_{j+p+1} for some p.

(V) Now using this n-eliminator Mn, we can show that a decreasing sequence a_0 > a_1 > ... is finite, where a_0 < ω^(n+1). First, apply Mn to a_0 > a_1 > ... to concretely construct an n-sequence b_0 > b_1 > ... , where b_0 ≤ a_0. Moreover, each b_i can be written as ω^n · k_i, for some natural number k_i, i.e. b_0 > b_1 > ... can be rewritten as ω^n · k_0 > ω^n · k_1 > ... . This implies that k_0 > k_1 > ... , which by (I) is a finite sequence. So ω^n · k_0 > ω^n · k_1 > ... is finite, i.e. b_0 > b_1 > ... is finite. And because b_0 > b_1 > ... satisfies (Cn) by virtue of being constructed by Mn, it follows that the original sequence a_0 > a_1 > ... is also finite.

(VI) It thus follows from (III) and (V) [and hence (IV)] by induction that, given (concretely) any natural number n, we can concretely demonstrate that any decreasing sequence a_0 > a_1 > ... with a_0 < ω^n is finite.

(VII) And this shows that any decreasing sequence a_0 > a_1 > ... is finite if a_0 < ω^ω, since if a_0 < ω^ω then clearly a_0 < ω^n for some n and so (VI) applies. This completes the proof. □

Some commentary is in order. Of central importance to the proof is the work that the induction hypothesis is doing, and the legitimacy of its assumption if this is to count as a purely finitary proof. Takeuti never explicitly discusses this, presumably because he takes the following to be obvious to the mathematician, but it is important to make it explicit.

First note that the Mn eliminator will, as seen in the case above, generate an n-sequence b_0 > b_1 > ... that is finite. But it is possible that this sequence will only exhaust an initial segment of the original sequence a_0 > a_1 > ... , since it is plausible that only an initial segment of the original sequence will be greater than or equal to ω^n. The rest of the original sequence will be less than ω^n. So the induction hypothesis is playing two roles. Not only is it ensuring the finitude of the possible cases of equality (a'_j = a'_{j+1} = ··· = a'_{j+p} in (IV)), but it is also responsible for ensuring that the remainder of the original sequence is finite. After all, simply knowing that the n-sequence is finite is worthless unless you also know that all the (n−k)-sequences (for k = 1, . . . , n−1) are also finite. And of course we do, given the way that the eliminators are constructed. One first starts with the 1-eliminator in order to construct a 1-sequence, the 2-eliminator in order to construct a 2-sequence, the 3-eliminator to construct a 3-sequence, and so on up to any arbitrary n-eliminator. But such an n-eliminator has to have in fact been constructed by appealing to earlier stages and in line with the constructive standpoint as stressed by Gentzen. Simply assuming that all (n−k)-sequences are finite would violate such a requirement.

4.4.3 VIII-XIV - The general theory of α-sequences and (α, n)-eliminators

(VIII) The rest of the proof provides the general theory of α-sequences and (α, n)-eliminators, where α ranges over ordinals < ε₀ and n ranges over natural numbers > 0. In line with the above, a descending sequence d_0 > d_1 > ... is called an α-sequence if all monomials in each d_i are ≥ ω^α. If a = a' + c, where each monomial in a' is ≥ ω^α and each monomial in c is < ω^α, then we call a' the α-major part of a. An α-eliminator is such that given a concrete descending sequence a_0 > a_1 > ... it concretely produces an α-sequence b_0 > b_1 > ... such that

(i) b_0 is the α-major part of a_0;

(ii) if b_0 > b_1 > ... is finite, then we can concretely demonstrate that a_0 > a_1 > ... is also finite.

The idea is that if we have an α-eliminator for every α < ε₀, then it can be shown that any (strictly) decreasing sequence (whose first member is less than ε₀) is finite. For given a_0 > a_1 > ... , with a_0 < ε₀, there exists some α such that a_0 < ω^(α+1). Using our α-eliminator one can construct an α-sequence b_0 > b_1 > ... such that (i) and (ii) hold. But each b_i can be written in the form ω^α · k_i, which means ω^α · k_0 > ω^α · k_1 > ... . This implies that k_0 > k_1 > ... , which, by (I), is finite. So b_0 > b_1 > ... is finite, and hence a_0 > a_1 > ... is also finite (see (ii) above). So if we can construct α-eliminators for all α < ε₀, we're done.

(IX) First, rename an α-eliminator to be an (α, 1)-eliminator. Suppose (α, n)-eliminators have been defined. A (β, n+1)-eliminator is a concrete method for constructing an (α · ω^β, n)-eliminator from any given (α, n)-eliminator.

(X) Suppose {μ_m}_{m<ω} is an increasing sequence of ordinals whose limit is μ (where there is a concrete method for obtaining μ_m for each m). Suppose that g_m is a μ_m-eliminator. Then we can construct a μ-eliminator g.

(XI) Suppose {μ_m}_{m<ω} is a sequence of ordinals whose limit is μ, and suppose for each m a (μ_m, n+1)-eliminator is given. Then we can define a (μ, n+1)-eliminator g.

(XII) Suppose g is a (μ, n+1)-eliminator. Then we can construct a (μ · ω, n+1)-eliminator.

(XIII) We can now construct a (1, m+1)-eliminator for every m ≥ 0.

(XIV) An (α, n)-eliminator can be constructed for every α of the form ω_m, i.e., a tower ω^(ω^(···^ω)) of m ω's. □

4.5 Assessment

So what exactly is the philosophical significance of Takeuti's proof? Well, one thing that seems clear is that there is supposed to be an epistemic component, namely an increase in confidence in the consistency of Peano Arithmetic.

Our standpoint does not assume the absolute world as set theory does, which we can think of as being based on the notion of an “infinite mind”. It is obvious that, on the contrary, it tries to avoid the absolute world of an “infinite mind” as much as possible. It is true that in the study of number theory, which does not involve the notion of sets, the absolute world of numbers 0, 1, 2,... is not such a complicated notion; to an infinite mind it would be quite clear and transparent. Nevertheless, our minds being finite, it is, after all, an imaginary world to us, no matter how clear and transparent it may appear. Therefore we need reassurance of such a world in one way or another.33

That said, it’s also clear that what is not being claimed is that the proof is purely finitist, since he seems to accept the identification of finitism with primitive recursive arithmetic and is certainly aware of the limitation placed upon such a proof by Gödel’s incompleteness

33Takeuti [1987], p. 100.

theorems.

[I]t seems quite reasonable to characterize Hilbert's finitist standpoint as that which can be formalized in primitive recursive arithmetic. . . It is therefore of paramount importance to clarify where a consistency proof exceeds this formalism [and on what basis it can be justified].34

Nonetheless he often speaks as if the proof is finitary.

Our standpoint, which has been discussed above, is like Hilbert's [and Gentzen's] in the sense that both standpoints involve "Gedankenexperimente" only on clearly defined operations applied to some concretely given figures and on some clearly defined inferences concerning thess [sic] operations. An α-eliminator is a concrete operation which operates on concretely given figures. A (β, 2)-eliminator is a concrete method which enables one to exercise a Gedankenexperiment in constructing an α · ω^β-eliminator from any concretely given α-eliminator. So if an ordinal, say ω_k, is given, then we have a method for concretely constructing an ω_k-eliminator.35

And later,

Our standpoint avoids abstract notions as much as possible, except those which are eventually reduced to concrete operations of Gedankenexperimente on concretely given sequences. Of course we also have to deal with operations on operations, etc. However, such operations, too, can be thought of as Gedankenexperimente on (concrete) operations.36

The key is the "as much as possible"; this seems a clear concession that he is appealing to abstract concepts somewhere, but pinpointing exactly where is not clear. Where precisely does one exceed finitary reasoning? One important thing to note is that the proof takes as given the standard well-ordering of type ε₀ and the generation of ordinals based on O1-O3:

O1 0 is an ordinal.
O2 Let μ and μ_1, μ_2, . . . , μ_n be ordinals. Then μ_1 + μ_2 + ··· + μ_n and ω^μ are ordinals.
O3 Only those objects obtained by O1 and O2 are ordinals.37
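The notation system generated by O1-O2 can be modelled directly: an ordinal below ε₀ is a finite, weakly decreasing list of exponents, each itself such a list, with [] playing the role of 0 and [e1, ..., en] standing for ω^e1 + ··· + ω^en. Comparison of notations in this normal form is then the usual recursive lexicographic comparison of Cantor normal forms; this is a sketch under my own encoding, not Takeuti's own notation:

```python
def lt(a, b):
    """CNF comparison: a < b for ordinal notations encoded as nested,
    weakly decreasing lists of exponents."""
    for x, y in zip(a, b):
        if lt(x, y):
            return True
        if lt(y, x):
            return False
    return len(a) < len(b)  # a is a proper initial segment of b

ZERO = []            # 0
ONE = [ZERO]         # w^0 = 1
OMEGA = [ONE]        # w^1 = w
assert lt(ZERO, ONE) and lt(ONE, OMEGA)
assert lt(OMEGA, [ONE, ZERO])        # w < w + 1
assert lt([ONE, ZERO], [OMEGA])      # w + 1 < w^w
```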

The connection between the recursive generation of ordinals O1-O2, and the subsequent induction on that recursive generation is crucial here, and reveals the importance of the fact that Takeuti’s (and Gentzen’s) proof exploits the specific ordinal notations. As Takeuti himself claims,

An important point to note is this. Our proof of the accessibility of ε₀. . . depends essentially on the fact that we are using a standard well-ordering of type ε₀, for which the successive steps in the argument are evident. Of course this is not so for an arbitrary well-ordering of type ε₀, nor for the general notion of well-ordering or ordinal.38

34 Takeuti [1987], p. 90.
35 Takeuti [1987], p. 97.
36 Takeuti [1987], pp. 100-101.
37 Takeuti [1987], p. 90.
38 Takeuti [1987], p. 100.

But it seems to me that by focusing on specific notations, Takeuti can claim, in some sense, that he has avoided any appeal to abstract reasoning because he has remained silent on the particular meanings of the symbols. And what's more, he could, ironically, defend this by quoting Gödel himself:

In our proof for freedom from contradiction we do not have to worry about the meaning of the symbols of our system because the rules of inference never refer to the meaning, and so the whole matter becomes merely a combinatorial question about the handling of symbols according to different rules.39

That is, one could maintain the line that no appeal is made to abstract concepts anywhere in the proof, that every step is concrete, and that the mathematical presentation is not only constructive but, by virtue of this strictly formalist attitude, also finitist. We have simply defined α-eliminators formally, without recourse to some independent conception of ordinal, etc. And, as mentioned above, this is sometimes how things appear.

We can finitely operate on a concrete sequence of concrete figures, given before us, and infer a general statement as a Gedanken experiment and . . . [f]urthermore we can finitely operate on concrete operations, concrete operations on concrete operations etc. and infer a general statement about them, as a Gedanken experiment.40

But there is a further sense in which, for the proof to really be illuminating and satisfy its primary goal of increasing our credence in the consistency of arithmetic beyond standard conviction, the details of the proof need to mirror the structure of the constructed ordinals. And the crux of this conviction lies in actually grasping Takeuti's indexing of eliminators.

That is, what exactly is an (α, n)-eliminator, beyond a convenient way of tracking notation? Well, for one, though Takeuti is not explicit about this, it seems to be motivated by Gentzen's discussion (from page 93 above) with regard to the presence of induction inferences. There we quoted Gentzen as writing that

the reason why one needs to appeal to transfinite induction is that induction inferences are involved. Every induction inference increases the ordinal assignment by omega. That is, the reason why the ordinal ω^{α_1+1} is chosen for induction inferences is that we need to account for the limit of all n-fold multiples of the ordinal number of the upper sequent, namely ω^{α_1} · ω.

But recalling that fact seems to confirm the idea that to really satisfy our epistemic concerns we need to have a strong intuitive grasp of the operation and the effects that it has on the

39 Gödel [1933], pp. 50-51.
40 Takeuti [1974], pp. 366, 368.

hierarchy of the ordinals. That is, we need to be in the right sort of epistemic position to perform the appropriate Gedankenexperimente. And here it seems that Gödel's concerns rear their head. It is not clear that one has the kind of intuitive grasp of the ordinals needed in order to have firm epistemic confidence. But even if one can claim that abstract concepts are avoided at each stage of the proof, there is a further issue that confronts the proof. In order to fully appreciate this, however, we need to look more closely at the claim that finitism is to be identified with primitive recursive arithmetic.

4.5.1 Tait

The dominant position amongst logicians and philosophers of mathematics is to identify finitism with primitive recursive arithmetic. And the classic argument establishing the identification is that presented in Tait's seminal article "Finitism". Tait's starting point is the following question: In what sense can we prove propositions about the natural numbers, such as ∀xy(x + y = y + x), without assuming the infinitude of numbers or some other infinite totality? After all, for finite mathematics to be nontrivial, one must be able to prove such propositions, and, moreover, the consistency statement is itself a Π^0_1 statement. In answering this question, Tait provides an explication of finitism. It comes out of this explication that finitism is essentially PRA.

When trying to analyze the notion of a finitist proof f of ∀xF(x) (i.e. f : ∀xF(x)), where x is a variable of some type A and F(x) is an equation between some terms of type B, two things seem immediately clear. First, a proof of a general proposition is a schema for proving its instances: if f is such a proof and a an object of type A (a : A), then f should give us a proof fa of F(a), i.e., fa : F(a). Conversely, if for each a : A, f associates a proof of F(a), then f should count as a proof of ∀xF(x). There are, however, two problems with this converse from the finitist point of view. First, the consistency of, say, ZF is expressed as such a general proposition. Suppose that ZF is consistent. Then for each n, there is a proof of F(n). We thus have, in some sense, defined a "proof" of the consistency of ZF, but certainly not one that is finitist. And moreover, in our analysis the proof f is a function defined for all a : A and so is a transfinite object, and hence unavailable to the finitist.

So, writing f : A → B to mean that f is a function from A to B, i.e. that f assigns a unique fa : B to each a : A, the real question is: What does it mean to say that f is a finitist function? This question does not have any finitist meaning, since the notion of function does not have any finitist meaning. Finitists cannot understand f : A → B as being a function from A to B. Indeed, Tait claims that there is no conception of functions as finitist objects that the finitist can accept. "What we are trying to do is characterize the finitist functions and finitist proofs from a nonfinitist standpoint."41 Since the way to understand finitist proofs follows immediately from the way to understand finitist functions, and since the discussion of finitist functions is sufficient for our purposes, we restrict our attention to them. Tait first considers the proposal that f is finitist if it is given by a rule of computation or an algorithm, but dismisses it for two reasons. Saying that f is finitist if it is given by a rule of computation (or algorithm) will not work because algorithms, strictly speaking, are not finitist objects, since they are not merely syntactic objects. But perhaps more than that, for f to be finitist, a finitist must see that the algorithm works, i.e. that it yields a unique B for each A. Part of this requirement can be expressed by ∀x∃yG(x, y), where G(x, y) means that y is a computation of a value from x. This, understood finitistically, means that we have a function g such that ∀xG(x, gx). But this is circular: our analysis of function as algorithm presupposes that we know what a function is. Tait takes this circularity to have several morals. First, the notion of function is primitive: we cannot define the notion of constructive function by appeal to computation. Second, the distinction between constructive and nonconstructive concerns not the kind of objects, but how we reason about the objects. Finally, "computability of constructive functions is a theorem rather than a definition: computability is a property that a function has in virtue of having been introduced constructively."42

41 Tait [1981], p. 527.

Tait concludes that the only way a finitist can understand f : A → B is as "recording the fact that he [the finitist] has given a specific procedure for defining a B from arbitrary A or, as we shall say, of constructing a B from arbitrary A."43 And to avoid the infinite regress of a demand for proof,

there must be constructions of which it makes no sense to ask for a proof, beyond the construction itself. . . That there are such constructions (that there is no infinite regress and proof has a starting point) rests on the fact that to be a C is to be constructed in a certain way.44

The goal is to show that there are no transfinite presuppositions involved in this general notion of construction, C, or the idea of number. Understanding Number as the form of finite sequences, and understanding its relationship to particular numbers, explains how it is that one can consider an arbitrary number without reference to infinite totalities.

[T]he relation of Number to the numbers is not that of a concept to the objects falling under it but that of a form of structure to its least specific subforms. We do not understand Number via the concept of number, i.e., of being a number. Rather it is the other way around. We understand the numbers as the specific determinations of Number. . . It is in this sense that we can consider an arbitrary number without reference to infinite totalities.45

The idea is that when presented with a sequence of strokes we first understand it as a sequence of strokes. It is only after that that we count the strokes and determine which number “they are”. We need not assume the infinity of numbers in order to understand the generic form of a finite sequence. And because we can do it here we can also do it for any

objects of finitist type: an arbitrary A ∧ B consists of an arbitrary A and an arbitrary B. Assuming we have constructions, Tait shows that the composition, identity, constant, pairing, and projection functions suffice to define all functions A → B that "must exist", in the sense of not depending on knowing the nature of the types involved. And while

his argument is not itself finitist, "it does show us that any f : A → B that the finitist might introduce without reference to the concept of number can be obtained by the above

42 Tait [1981], p. 528.
43 Tait [1981], p. 528.
44 Tait [1981], p. 528.
45 Tait [1981], p. 530.

constructions." To see which other constructions are permissible on the basis of the concept Number, one notes that finite sequences of numbers are obtained from 0 by iterating the taking of successors. Iteration is thus implicit in Number: "If we do not understand Number then we do not understand iteration either." Ignoring iteration for a second, this construal of finite sequences of numbers yields each constant function and the successor function. If we now add the concept of iteration, we can construct functions defined by primitive recursion. The functions built up from all of these are the primitive recursive functions. It is important to note, as Tait stresses, that the recursion equations have no a priori meaning for the finitist.46 Classically (from the outside) they can be seen as implicitly defining a function. But since the finitist rejects the notion of a function, for her they are merely shorthand descriptions of how fn is constructed.
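Tait's point that iteration is implicit in Number can be illustrated by building primitive recursive functions out of an iteration combinator; the names below (prim_rec, and the derived add and mul) are my own illustration, not Tait's:

```python
def prim_rec(base, step):
    """Return f with f(0, ys) = base(ys) and
    f(n + 1, ys) = step(n, f(n, ys), ys): primitive recursion read as
    concretely iterated application, in line with Tait's discussion."""
    def f(n, *ys):
        acc = base(*ys)
        for i in range(n):
            acc = step(i, acc, *ys)
        return acc
    return f

add = prim_rec(lambda y: y, lambda i, acc, y: acc + 1)      # x + y
mul = prim_rec(lambda y: 0, lambda i, acc, y: add(y, acc))  # x * y
assert add(3, 4) == 7
assert mul(3, 4) == 12
```

Read this way, the recursion equations are not an implicit definition of a transfinite object but a recipe for constructing each value fn in finitely many concrete steps.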

The above shows that every primitive recursive function F : A → B is finitist in the sense that the finitist can accept it as a construction of a B from an arbitrary A. If Tait can show the converse then he can claim Thesis 1: The finitist functions are precisely the primitive recursive functions. Tait points out that this thesis cannot be understood by the finitist because the notion of a finitist function is not finitist. Moreover, even if we step outside of finitism we cannot prove the thesis rigorously since it's not yet clear what finitism means. In this way the situation is analogous to Church's thesis:

We must argue that every plausible attempt to construct a finitist function that is not primitive recursive either fails to be finitist according to our specifications or else turns out to be primitive recursive after all.47

Tait considers several examples. We might try defining f by an explicit definition fa ≡ t, for t a term of type B built up from a and previously introduced functions. Since the primitive recursive functions are closed under such explicit definitions this does not violate the thesis. One might also try to construct fa in terms of a, certain given functions, and terms ft, for t a term built up from a and previously introduced functions as well as other terms fs. But this is really just a form of primitive recursion and so it too will not violate

46Tait [1981], p. 532. 47Tait [1981], p. 533.

the thesis. This idea is present in all the cases that Tait considers:

[T]he analysis in all cases is essentially the same: the construction is justified only if we possess a function k : A → N that yields the length ka of the chain of reductions by which we obtain the value fa. And in that case f is primitive recursive in k and the given functions.48

The point of this excursion into the details of Tait's explication of finitism was to point out that the explication proceeds externally. This internal/external phenomenon is put to good use by Burgess [2010], who urges caution upon foundationalists who claim that certain metatheorems show that various results in reverse mathematics, and proof-theoretic reductions more generally, count as relativized Hilbert's programs. For our purposes the relevant context is as follows.

4.5.2 Burgess

Burgess’ explicit goal in his paper is to discuss some of the philosophical implications of Gödel’s incompleteness theorems, in particular to stress a significant connection between Hilbert’s program and the Lucas-Penrose arguments concerning mechanism and the mind. The former, of course, was to convince the finitist of the instrumental value of classical mathematics by providing a finitist proof of the consistency of classical mathematics. If a Π⁰₁ statement is classically provable, then so too are all of its numerical instances. If such a statement is false, then at least one of its numerical instances is false. But a false numerical instance would be disprovable by simply performing the relevant computation. So if an untrue Π⁰₁ statement were provable, there would be an inconsistency. So a finitist consistency proof amounts to giving a finitist proof that all classically provable Π⁰₁ statements are true (or finitistically provable, i.e. to proving a conservative extension result). Gödel’s theorems show that this is not possible, assuming the analysis of Tait. Conservative extension results are often considered partial realizations of Hilbert’s Program. In particular, if one accepts Tait’s analysis of finitism, the metatheorem stating that

WKL0 is conservative over PRA with respect to Π⁰₁ formulas is supposed to show that any

48Tait [1981], p. 534.

Π⁰₁ statement provable in WKL0 is provable finitistically. Moreover,

[G]iven a finitist proof of a metatheorem to the effect that any Π⁰₁ statement having a WKL0 proof has a finitist proof, obtaining a WKL0 proof of a Π⁰₁ statement P would give a finitist proof of the existence of a finitist proof of P, and (tacitly assuming that a finitist proof of the existence of a finitist proof is as good as a finitist proof) would allow a finitist to infer P. In this way, WKL0 would be instrumental and, hence, a restricted form of Hilbert’s idea.49
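The role computation plays here can be illustrated with a toy Python example (my own, not from Burgess): the matrix of a Π⁰₁ statement "for all n, P(n)" is decidable, so each numerical instance can be settled by a finite computation, and a false such statement is refuted by a single failed instance check.

```python
# Toy illustration (mine) of checking numerical instances of Pi^0_1 statements.

def P(n):
    """Sample decidable matrix: the n-th triangular number is n*(n+1)/2."""
    return sum(range(n + 1)) == n * (n + 1) // 2

def Q(n):
    """A deliberately false universal claim: every n is less than 100."""
    return n < 100

def check_instance(matrix, n):
    """Settle one numerical instance by direct computation."""
    return matrix(n)

# Every instance of P we try checks out; Q is refuted at a single instance.
# A classical proof of the universal Q, together with this computation,
# would yield an inconsistency -- the pattern used in the text.
assert all(check_instance(P, n) for n in range(500))
assert check_instance(Q, 100) is False
```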

Details of the Lucas-Penrose dialectic need not concern us. The dominant take-away from that debate is the distinction between something being true and its being known to be true. The idea is that the fact that the output of some machine coincides with human provability does not entail that this fact is itself humanly provable. The connection between Hilbert’s Program and the Lucas-Penrose debate is that Tait’s analysis of finitism discussed above provides a model of the situation considered in the Lucas-Penrose discussion: PRA may in fact coincide with finitism, and we who are “on the outside” can see this, but those “on the inside”–the finitists–cannot. As we saw above, while finitists are in a position to see that each individual axiom of PRA is finitistically acceptable, they cannot establish the universal generalization that all axioms of PRA are finitistically acceptable. Doing so requires going beyond finitism. The Gödelian discussion of the constructive spectrum at the beginning of this chapter comes into play here. One must appeal in some way to abstracta that pass beyond finitism. This highlights a potential epistemic fallacy: even if it is true that finitist provability coincides with PRA-provability, it does not follow that a finitist proof of the PRA-provability of some result serves as a finitist proof of the finitist provability of that result. In order for this to follow the finitist would have to know that PRA-provability and finitist provability coincide, something that the finitist cannot know. Analogous points can be made with regard to consistency proofs: the conservativity results establish finitistically the relative consistency of WKL0 over PRA, but this does not amount to a finitist proof of the consistency of WKL0, because the finitist does not have a finitist proof of the consistency of PRA, for Gödelian reasons.

49Burgess [2010], p. 131.

It is not especially easy to project oneself imaginatively into the mind-set of someone who is really, truly, and sincerely a finitist; but if one tries to do so, one should be able to see that a finitist presented with a finitist proof of the metatheorem (1) and a specific proof p of a specific Π⁰₁ result P can conclude that f(p) would be a PRA proof of P, but cannot straightaway conclude that f(p) would be a finitist proof of P. If this point is accepted, then the metatheorem can no longer be viewed as giving the finitist a once-and-for-all guarantee that any WKL0 proof could be backed up by (or expanded into) a finitist proof. The finitist can at best apply the theorem on a case-by-case basis. We who are not finitists can see that if the finitist could and did actually carry out a computation of f(p) and a check of each PRA axiom used in the resulting proof, a finitist proof would be obtained. But the finitist cannot see this in advance of actually carrying out the computation and check.50

Setting aside issues of practical feasibility, the point remains that “we are left a very long way indeed from Hilbert’s original ambition of getting rid once and for all of foundational issues with a single metatheoretic result.”51 It should be clear how all of this relates to Takeuti’s proof. The point is that even if it were true that the steps in Takeuti’s proof were all finitist, the finitist will not be in a position to see that accessibility holds of all ordinals. That is, for any particular ordinal he can be in a position to see that any descending sequence with it as its first member will terminate. But one is not in a position to take the final step and tie everything together.

[W]hile finitist provability and PRA provability do in actual fact coincide, and while we who are not subject to the intellectual limitations of the finitist can see this “from the outside,” still the finitists themselves cannot see it “from the inside.” Considering the axioms of PRA one by one – or two by two, since the axioms in question are generally the defining equations for primitive recursive functions, and come in pairs – finitists can successively see that each is finitistically acceptable, that the new symbols these introduce define legitimate ways of operating on natural numbers. The finitist community (if there were one) could even maintain a central register listing at any given time the largest n such that all axioms of PRA up to the n-th level have been verified. But what finitists, individually or as a community, cannot establish is the single universal generalization that all axioms of PRA are finitistically acceptable. To do that, according to Tait, requires passing beyond the realm of finitism into that of the Gödelian “extension of the finitist standpoint,” where the whole infinite course-of-values of the addition or multiplication or exponentiation or whatever function is considered a single “completed” object, to which functionals of higher type, composition and primitive recursion, may be applied.52

So it seems to me that this is really where the abstracta rear their heads. One can maintain that the finitist, presented with Takeuti’s proof, can understand each individual

50Burgess [2010], p. 137. 51Burgess [2010], p. 139. 52Burgess [2010], p. 133-34.

step. What they cannot do is draw the final conclusion from the accessibility of any particular ordinal below ε₀ to the accessibility of all ordinals below ε₀. Said another way, even if the proof itself is finitist, the finitist will not be able to appreciate the proof as a whole, and so whatever justification such a proof bestows on the consistency of arithmetic, it will be external. But then we must ask what work such a “justification” is doing. At the beginning of §4.3.2 we said that while Gentzen-style consistency proofs are technically unproblematic, what really matters is whether they are philosophically satisfying. But this, of course, depends on what we want. For Hilbert, the appeal of finitism was its security, and for this reason finitism was supposed to be the ground upon which to found mathematics. The security of finitism lay, for Hilbert, in the fact that numbers are representable in intuition. This is opposed to functions, sets, and other transfinite objects, which are ideas of pure reason. One of Tait’s conclusions is that finitism does not enjoy this special sense of security, though it is true that finitism is the minimal kind of reasoning presupposed by all nontrivial mathematical reasoning about numbers – it is indubitable. “[F]initism is fundamental to mathematics even if it is not a foundation in the sense Hilbert wished.”53 Nonetheless, one might rest content with this fundamentality of finitist thinking as giving it epistemic privilege, and thus in some sense answering Hilbert’s call. But what the discussion of the internal/external divide shows is that even if one could maintain such a position, appreciation of any form of justification will be external. What we’ll see in the next two chapters is that this phenomenon generalizes. In chapter 5 we focus on the case of predicativity given the natural numbers before looking more closely, in chapter 6, at the systems WKL0 and PRA.
The purpose of these case studies is to investigate the philosophical significance of this internal/external divide as relates to relative consistency proofs and proof-theoretic reductions more generally. In particular we’ll focus on how such results limit what we can have, and, as a result, how they ought to shape what we want.

53Tait [1981], p. 525.

4.6 Appendix

VIII-XIV - The general theory of α-sequences and (α, n)-eliminators

(VIII) The rest of the proof provides the general theory of α-sequences and (α, n)-eliminators,

where α ranges over ordinals < ε₀ and n ranges over natural numbers > 0. In line with above,

a descending sequence d0 > d1 > ... is called an α-sequence if all monomials in each di are ≥ ω^α. If a = a′ + c, where each monomial in a′ is ≥ ω^α and each monomial in c is < ω^α, then we call a′ the α-major part of a. An α-eliminator is such that given a concrete descending sequence a0 > a1 > ... it concretely produces an α-sequence b0 > b1 > ... such that

(i) b0 is the α-major part of a0

(ii) If b0 > b1 > ... is finite then we can concretely demonstrate that a0 > a1 > ... is also finite.

The idea is that if we have an α-eliminator for every α < ε₀, then it can be shown that any (strictly) decreasing sequence (whose first member is less than ε₀) is finite. For given

a0 > a1 > ..., with a0 < ε₀, there exists some α such that a0 < ω^(α+1). Using our α-eliminator one can construct an α-sequence b0 > b1 > ... such that (i) and (ii) hold. But each bi can

be written in the form ω^α · ki, which means ω^α · k0 > ω^α · k1 > ... . This implies that k0 > k1 > ..., which, by (I), is finite. So b0 > b1 > ... is finite, and hence a0 > a1 > ... is also finite (see (ii) above). So if we can construct α-eliminators for all α < ε₀, we’re done.

(IX) First we rename an α-eliminator to be an (α, 1)-eliminator. Suppose (α, n)-eliminators have been defined. A (β, n+1)-eliminator is a concrete method for constructing an (α·ω^β, n)-eliminator from any given (α, n)-eliminator.
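To make the bookkeeping concrete, here is a small Python sketch (my own, restricted for simplicity to ordinals below ω^ω with natural-number exponents; Takeuti's proof of course concerns all ordinals below ε₀) of Cantor normal forms and the α-major-part decomposition a = a′ + c defined in (VIII).

```python
# Sketch (mine): an ordinal below omega^omega in Cantor normal form is
# written as a descending list of (exponent, coefficient) pairs.
# major_part splits a into its alpha-major part a' (monomials >= omega^alpha)
# and the remainder c (monomials < omega^alpha).

def major_part(a, alpha):
    a_major = [(e, k) for (e, k) in a if e >= alpha]
    rest = [(e, k) for (e, k) in a if e < alpha]
    return a_major, rest

# a = omega^3 * 2 + omega * 1 + 4:
a = [(3, 2), (1, 1), (0, 4)]
a_major, c = major_part(a, 1)
# a_major == [(3, 2), (1, 1)]  -- the 1-major part (monomials >= omega^1)
# c == [(0, 4)]                -- the remainder (monomials < omega^1)
```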

(X) Suppose {µm}m<ω is an increasing sequence of ordinals whose limit is µ (where there’s a concrete method for obtaining µm for each m). Suppose that gm is a µm-eliminator. Then we can construct a µ-eliminator g.

More succinctly, given (µm, 1)-eliminators we can produce (µ, 1)-eliminators. The main idea here is to generate a µ-sequence through iterated application of the various µm-eliminators

at our disposal. Each application of the various µm-eliminators results in a µm-sequence such that if it is finite, then the sequence from which it was generated will also be finite. At each stage the new sequence is generated by exploiting the µ-major representation (e.g.

a0 = a0′ + c0, where a0′ is the µ-major part of a0) of a particular one of its members in order to

determine which µm-eliminator to use to generate a new sequence based on that member’s

corresponding µm-eliminator. The process eventually bottoms out, since each iteration is such that if the sequence that it generates is finite, so too is the previous sequence. Backtracking in this way allows us to see that the original sequence is finite. We essentially go

through the original sequence a0 > a1 > ... “member” by member, beginning with a0, and generate new sequences at each stage.54 Each stage i gives us a new sequence whose leading

member is ai′, i.e. the µ-major part of ai. This allows us to collect the ai′ at each stage and simultaneously create the needed µ-sequence suitably satisfying (i) and (ii) above for the process to count as the required µ-eliminator.

To see how this works in more detail, suppose a0 > a1 > ... is a concretely given

sequence. If we write a0 as a0′ + c0, where a0′ is the µ-major part of a0, then there exists some

m such that c0 < ω^µm. Now apply gm to a0 > a1 > ... to produce a µm-sequence

b1,0 > b1,1 > b1,2 > ...   (4.1)

that satisfies (i) and (ii) above, i.e. b1,0 is the µm-major part of a0, and if b1,0 > b1,1 > ... is

finite then we can concretely demonstrate that a0 > a1 > ... is also finite. Note also that

b1,0 = a0′ and so is also the µ-major part of a0. Now write b1,0 = b0.

Now consider the sequence b1,1 > b1,2 > ... and suppose that b1,1 ≥ ω^µ. Now write

b1,1 = b1,1′ + c1,1, where b1,1′ is the µ-major part of b1,1. Then there will be an m1 such

that c1,1 < ω^µm1. Apply the µm1-eliminator gm1 to the sequence b1,1 > b1,2 > ... to get a

54 I put “member” in quotes because they may not literally equal ai. They will at least have the same µ-major part. The form of ci importantly determines which µm-eliminator is used.

µm1-sequence

b2,1 > b2,2 > b2,3 > ...

that also satisfies (i) and (ii) above (suitably modified), i.e. b2,1 is the µm1-major part of b1,1 (and hence a1), and if b2,1 > b2,2 > ... is finite then we can concretely demonstrate that b1,0 > b1,1 > b1,2 > ... is also finite (and hence, by the above, that a0 > a1 > ... is also

finite). Note also that b2,1 = a1′, so put b1 = b2,1.55

Now consider the sequence b2,2 > b2,3 > ... and suppose b2,2 ≥ ω^µ. By repeating the above procedure we can obtain a sequence

b3,2 > b3,3 > b3,4 > ...,

and put b2 = b3,2. Continuing in this way we obtain a µ-sequence

b0 > b1 > b2 > ....

If this sequence is finite, with last term bl = bl+1,l, then it follows that in the sequence

bl+1,l > bl+1,l+1 > bl+1,l+2 > ...   (4.2)

bl+1,l+1 < ω^µ. So bl+1,l+1 < ω^µm1 for some m1. Since we are assuming we have a µm1-eliminator gm1, apply gm1 to (4.2) to obtain a µm1-sequence with the only term 0. Since this

55Takeuti’s version of this paragraph is:

Now consider the sequence b11 > b12 > ... . Suppose b11 ≥ ω^µ. Then repeat the above procedure: i.e., for the sequence [(4.1)], write b10 = b10′ + c10, where b10′ is the µ-major part of b10. Then there exists an m1 such that c10 < ω^µm1. So apply gm1 to the sequence b11 > b12 > b13 > ... to obtain the sequence

b21 > b22 > b23 > ...

satisfying (i) and (ii) (with µm1 in place of α), with b21 the µ-major part of b10. Put b1 = b21. (Takeuti [1987], p. 95, italics mine) Setting aside the fact that Takeuti’s failure to use commas in his subscripts leads to confusion between the different sequences, the italicized parts must be a typo, since b1,0 is the µ-major part of itself, i.e. c1,0 = 0. gm1 needs to be determined by c1,1, not by c1,0 (it’s not even clear how one would specify gm1 based on c1,0). Moreover, b2,1 is the µ-major part of b1,1, not b1,0. Compare this to the main text.

µm1-sequence is finite, by definition of a µm1-eliminator, (4.2) is also finite. And since (4.2) is finite, the sequence bl,l−1 > bl,l > ... is also finite. Continuing backwards like this it can

easily be seen that a0 > a1 > ... is also finite.56

(XI) Suppose {µm}m<ω is a sequence of ordinals whose limit is µ and suppose for each m, a (µm, n+1)-eliminator is given. Then we can define a (µ, n+1)-eliminator g. The definition of the (µ, n+1)-eliminator g proceeds by induction on n, with the base case appealing to (X). That is, for n = 0, i.e. n+1 = 1, given the (µm, 1)-eliminators, we can, by means of (X), define a (µ, 1)-eliminator g. So suppose (XI) holds for n, i.e. suppose that given (µm, n)-eliminators, we can define a (µ, n)-eliminator. That means that there is an

operation kn such that for any sequence {γm}m<ω with limit γ and (γm, n)-eliminators gm′, kn

applied to the gm′ concretely produces a (γ, n)-eliminator. So far so good. Now for the induction step suppose that {βm}m<ω is a sequence with limit β and that we have an (α, n)-eliminator

p. Since gm is a (βm, n+1)-eliminator it concretely produces an (α·ω^βm, n)-eliminator gm(p)

from p. Taking α·ω^βm for γm, gm(p) for gm′, and α·ω^β for γ, we can apply the induction

hypothesis and define an (α·ω^β, n)-eliminator q by applying kn to {gm(p)}. Since this process of defining q from p is concrete, it serves as a (β, n+1)-eliminator (recall the definition from (IX) above) and we’re done.

(XII) Suppose g is a (µ, n+1)-eliminator. Then we can construct a (µ·ω, n+1)-eliminator. We appeal to (XI) and show that we can construct a (µ·m, n+1)-eliminator from g, for every m < ω, appealing to the fact that

α · ω^(µ·m) = α · ω^µ · ω^µ · ... · ω^µ  (m factors of ω^µ).

So suppose we have an (α, n)-eliminator f. Since g is a (µ, n+1)-eliminator, g concretely constructs an (α·ω^µ, n)-eliminator from f. Call this g(f). Now apply g to g(f) to concretely construct an (α·ω^µ·ω^µ, n)-eliminator g(g(f)). By repeating this procedure m times we obtain

56 Takeuti writes a0 > a2 > ..., but this is certainly also a typo.

an (α·ω^(µ·m), n)-eliminator g(g(... g(f) ...)). But this is, by definition (again, recall (IX)), a (µ·m, n+1)-eliminator, and so, by (XI), we obtain the required (µ·ω, n+1)-eliminator.

(XIII) We can now construct a (1, m+1)-eliminator for every m ≥ 0.

Again, the construction is by induction on m. For a (1, 1)-eliminator we can use M1 from (II) above. For m = 1, Takeuti says that “the construction of a (1, 2)-eliminator is reduced to the construction of an (α+α)-eliminator from an α-eliminator.”57 To see how this

works, assume we are given a sequence a0 > a1 > ... and apply an α-eliminator to obtain b0 > b1 > ..., satisfying criteria (i) and (ii) from (VIII). That is, {bi} is an α-sequence, b0 is

the α-major part of a0, and if {bi} is finite, then so is {ai}. Now each bi can be written in the

form ω^α · ci, where {ci} is itself decreasing and, if {ci} is finite, then so too is {bi}. Moreover,

a0 = b0 + e0, where e0 < ω^α. Now apply an α-eliminator to {ci} to obtain an α-sequence d0 > d1 > ... with d0 the α-major part of c0 and such that if {di} is finite then so too is {ci}.

It follows that {ω^α · di} is a decreasing (α+α)-sequence. Moreover, if {ω^α · di} is finite, then so too are {di}, {ci}, {bi}, and {ai} successively, and

ω^α · d0 = ω^α · (the α-major part of c0)
         = the (α+α)-major part of b0
         = the (α+α)-major part of a0.

So {ω^α · di} is the (α+α)-sequence which was desired for {ai}. For m > 1, suppose f is an (α, m)-eliminator. By (XII) (with n+1 = m), we can concretely construct an (α·ω, m)-eliminator from f. We thus have a (1, m+1)-eliminator as required to complete the induction.

57Takeuti [1987], p. 96. While the procedure of constructing an (α+α)-eliminator from an α-eliminator is straightforward enough, I admit that I do not quite understand how such a reduction suffices. After all, given any (α, 1)-eliminator, a (1, 2)-eliminator should produce an (α·ω, 1)-eliminator. Since α will be of the form ω^α1 + ··· + ω^αn, this, together with our rules for multiplying ordinals, means that α·ω will be ω^(α1+1). So I do not quite see how an (ω^(α1+1), 1)-eliminator can be reduced to an (α+α, 1)-eliminator.

(XIV) An (α, n)-eliminator can be constructed for every α of the form ωm, i.e.,

ωm = ω^ω^···^ω (a tower of m ω’s).

We proceed by induction on m. For m = 0, define α to be 1 = ω0. We then appeal to (XIII) to define a (1, n)-eliminator for every n. Now suppose that f is a (1, n)-eliminator and that we have defined an (α, n+1)-eliminator g. Then g operates on f to produce the required (1·ω^α, n) = (ω^α, n)-eliminator.
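The tower notation ωm used in (XIV) can be pinned down by the following recursion (the display is mine; it matches the base case 1 = ω0 and the induction step from α to ω^α just given):

```latex
\omega_0 = 1, \qquad \omega_{m+1} = \omega^{\omega_m},
\qquad\text{so that}\qquad
\omega_m = \underbrace{\omega^{\omega^{\cdot^{\cdot^{\omega}}}}}_{m \text{ occurrences of } \omega}.
```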

Chapter 5

Predicativity

Predicativity is similar in spirit to the work of Gentzen discussed in the last chapter in the sense of liberalizing what counts as an epistemically safe starting point. The basic idea behind Predicativism is to accept the natural numbers as basic to human understanding and then to see just how much of mathematics can be recovered on the basis of this starting point. The discussion of Predicativity is important for two reasons. First, it signals a shift in how one understands foundational work in mathematics. In particular it signals a shift from a focus on security and justification to that of determining the limits of particular philosophical views. It serves as a lynchpin, so to speak, between the early foundational work discussed in the first half of the dissertation and the more contemporary foundational work discussed in the second part. To see this clearly we begin by sketching some of the early historical developments of Predicativity, focusing on Russell and Weyl. We then look at more recent technical developments of Predicativity, culminating with the limitative result that the bound of Predicativity is Γ0. The second reason why Predicativity is important is due to a relevant feature it shares with Finitism. It provides another lucid example of the internal/external divide that concerned the philosophical portion of the discussion of Takeuti from the last chapter. As mentioned above, this theme emerges as the central point upon which the philosophical value and significance of the epistemic aspect of proof theory hinges. We end by highlighting this connection to Finitism, setting the stage for some philosophical work to be done in chapter 6.1

5.1 Predicativity I - Historical Developments

The origins of Predicativity date back to the turn of the last century, when the various paradoxes began to appear in mathematics. Perhaps the two best known paradoxes are Russell’s paradox and the Liar Paradox. It is now commonplace to distinguish between logical paradoxes and semantic paradoxes, Russell’s paradox being an example of the former, and the Liar

1For several good introductions to the brand of Predicativism that we will be discussing here see Feferman [2005], Crosilla [2017], Koellner [20XX], and the references therein. The standard presentation, to which much of the following is owed (especially the technical work on predicative provability), is Feferman [1964].

Paradox being an example of the latter. Since many of the paradoxes result from the naïve assumption that classes are defined by arbitrary propositional functions, it became imperative to determine just which predicates do have corresponding extensions, those predicates being called, by Russell, predicative. Of course the challenge was to provide a satisfactory way of characterizing predicativity (and hence impredicativity).

5.1.1 Russell and Poincaré

Poincaré, impressed by Richard’s solution to the paradox of definability, diagnosed the problem of paradoxes generally as being the result of a vicious circle. A vicious circle is encountered when defining an entity by referencing a totality to which that entity belongs. Impredicative definitions are thus those that quantify over totalities that include the entity being defined. This led Russell to adopt the famous Vicious Circle Principle:

[W]hatever in any way concerns all or any or some of a class must not be itself one of the members of a class.2

In determining just which definitions of classes x̂ϕ(x) (i.e. {x | ϕ(x)}) are predicatively admissible, Russell diagnosed the violation of the vicious circle principle in the unrestricted use of bound variables. These he called apparent variables. Restricting our attention to propositional functions of one argument, there are two potential occurrences of apparent variables in x̂ϕ(x): in the variable ‘x’ itself, but also in any quantifiers that occur in ϕ.3 This led to an alternate formulation of the VCP:

[W]hatever contains an apparent variable must not be a possible value of that variable.4

Russell’s formalization of these restrictions came to be known as the Ramified Theory of Types.5 Russell’s first move was to claim that each propositional function has a range of significance, i.e. a particular range of arguments that the function may meaningfully apply to. When a particular (monadic) propositional function takes as argument a member of the

2Russell [1973], p. 198. 3Feferman [2000b], p. 8. 4Russell [1908], p. 163. 5For nice introductions to type theory, in both its simple and ramified forms, see Copi [1971] and Coquand [2018].

collection that is its range of significance, the propositional function is either true or false. Taking an argument outside its range is meaningless. Such a move was articulated as early as 1903.

Every propositional function ϕ(x) – so it is contended – has, in addition to its range of truth, a range of significance, i.e. a range within which x must lie if ϕ(x) is to be a proposition at all, whether true or false.6

These ranges of significance are organized into different levels. At the base level 0 is the type of individuals. The type of level 1 contains classes of individuals, the type of level 2

contains classes of classes of individuals, and so on. In general, the type of level n+1 consists of all classes of entities of type n.

[These] ranges of significance form types, i.e. if x belongs to the range of significance of ϕ(x), then there is a class of objects, the type of x, all of which must also belong to the range of significance of ϕ(x), however ϕ may be varied. . . 7

In addition to this Russell adopted the stipulation that propositional functions are significant if and only if their arguments are of type one less. So functions of type 1 take as arguments objects of type 0, i.e. individuals. Functions of type 2 take as arguments functions of type 1, etc. This Simple Theory of Types handles the cases of impredicativity found in the logical paradoxes. For example, such restrictions, together with the stipulation that t ∈ u is meaningful if and only if the type of u is 1 higher than the type of t, make x ∈ x in Russell’s paradox ill-formed.8 Nonetheless the Simple Theory of Types does not handle all the paradoxes. In order to handle semantic paradoxes like the liar, Russell further ramified propositional functions into hierarchies of orders (degrees).9 For each type greater than 0 (i.e. the propositional functions), Russell divided functions of the same type into different orders. This is required by the vicious circle principle to handle cases of quantification over propositional functions.

Suppose we have a function ϕ(x) of type n that takes arguments of type n − 1. The formula

6Russell [1903], p. 523, as quoted in Copi [1971], p. 22. 7Russell [1903], p. 523, as quoted in Copi [1971], p. 22. 8Russell’s account of the Simple Theory of Types was actually more complicated than this since he was also concerned with relations and had certain further particular beliefs regarding the nature of types. See Copi [1971], pp. 24, 26. For clarity and brevity we restrict ourselves to this simplified version as presented above. 9See Russell [1908].

∀ϕ ϕ(x) is also a function of type n and is definable only in terms of the totality of functions of type n. So the totality of all functions of type n, assuming there is one, would contain members definable only in terms of that totality. But this is banned by the vicious circle principle. To overcome this, Russell divided all functions of type n into hierarchies. First-order propositional functions of type n are those definable without any reference to any totality of functions of type n. Second-order propositional functions of type n are those definable by reference to the totality of first-order propositional functions, or whose expressions contain bound variables ranging over first-order propositional functions. And so on. In general, an

mth order function is one definable by reference to the totality of (m − 1)th order functions but no totality of functions of order greater than m − 1. This ramification does rule out all the semantic paradoxes, and, by virtue of subsuming the Simple Theory of Types, handles all cases of impredicativity. The problem, though, is that this kind of ramification makes the practice of doing mathematics clumsy and seriously restricts the reconstruction of classical mathematics. For example, one no longer has one type of variable for type 1. Instead, one has a multitude of variables Xk of type 1 for each degree k. This leaves us with real numbers relative to degrees and makes talk of arbitrary reals impossible.10 In response to these issues, Russell introduced the axiom of Reducibility. Roughly, this states that for any function of type k of order m there is a coextensive function of type k of order 1.

∀X^m ∃X^1 ∀x (x ∈ X^m ↔ x ∈ X^1)

Reducibility thereby effectively eliminates the ramification and collapses the system back into the Simple Theory of Types. It was Ramsey who first noted that the simple theory of types was sufficient to block the set theoretic paradoxes. Moreover, while assuming the axiom of Reducibility does allow one to once again carry out classical mathematics, it has

10Feferman [1964], p. 6.

been dismissed as ad hoc and unnatural.

5.1.2 Weyl

Drawn by the appeal of eliminating impredicativity, but repelled by Reducibility, Weyl took a different tack than Russell. Weyl’s goal was to reconstruct analysis in a predicatively satisfactory way. Central to Weyl’s project11 is the idea of constructing predicative sets step by step from some base domain and acceptable operations. The predicative sets are those objects possessing certain primitive or derived properties. Like Poincaré, Weyl did not accept the logicist program of Russell and its subsequent need to start with a definition of natural number. Instead, he took the natural numbers and full mathematical induction as given, believing “the idea of iteration, i.e. of the sequence of natural numbers, is an ultimate foundation of mathematical thought, which can not be further reduced.”12 So Weyl’s approach, like Poincaré’s, was a form of definitionism modulo the natural numbers.13 That is, the mathematical objects with which one deals are to be defined, with the structure of the natural numbers being already granted. And since Weyl is taking the infinite totality of the natural numbers as given, i.e. as a definite concept, he assumes classical logic.

So, Weyl was a predicativist in the sense that he was only going to deal with things that were introduced by definition, but not an absolute predicativist in the sense that everything had to reduce to purely logical principles—rather, a predicativist given the natural numbers. . . [O]ne could think of it as a kind of a relative stance: if one understands or grants certain concepts, then what is predicative given those concepts is that which is obtainable by successive definition from them.14

Since Weyl wants to avoid the appeal to ramification, he (basically) restricts himself to order 0 sets definable in terms of N. Using modern terminology, Weyl is working within the subsystem of second-order arithmetic known as ACA0. The striking thing about Weyl’s work is just how much of classical mathematics can be carried out in such a restricted system. In particular, he was able to secure all those parts

11. Our focus will be on Weyl’s seminal Das Kontinuum.
12. Weyl [1918], as quoted in Crosilla [2017], p. 433. Note also that this thought is now built into the ‘big five’ of reverse mathematics.
13. Feferman [2000b], p. 5.
14. Feferman [2000b], p. 11.

of real analysis that suffice for traditional scientific theorizing, such as Newton’s theory of motion and gravity.15 It is this sense of predicativity given the natural numbers that we will be concerned with in the rest of this chapter. Feferman [1964] summarizes the view thus:

[O]nly the natural numbers can be regarded as “given” to us. . . [S]ets are created by man to act as convenient abstractions (façon de parler) from particular conditions or definitions. In order, for example, to predicatively introduce a set S of natural numbers x we must have before us a condition F(x), in terms of which we define S by

∀x [x ∈ S ↔ F(x)].

However, before we can assert the existence of such S, it should already have been realized that the defining condition F(x) has a well-determined meaning which is independent of whether or not there exists a set S satisfying [∀x (x ∈ S ↔ F(x))] (but which can depend on what sets have been previously realized to exist). In particular, to determine what members S has, we should not be led via F(x) into a vicious-circle which would return us to the very question we started with. Conditions F(x) which do so are said to be impredicative; it should be expected that most conditions F(x) involving quantifiers ranging over “arbitrary” sets are of this nature. Finally, we can never speak sensibly (in the predicativist conception) of the “totality” of all sets as a “completed totality” but only as a potential totality whose full content is never fully grasped but only realized in stages.16
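In modern terms, the predicative introduction of sets that Feferman describes corresponds to instances of arithmetical comprehension, the scheme characteristic of the system ACA0 mentioned above. A schematic rendering (the particular instance chosen is our own illustration):

```latex
% Arithmetical comprehension: for each formula \varphi(x) of second-order
% arithmetic containing no bound set variables (set parameters allowed),
\exists X \,\forall x \,\bigl(x \in X \leftrightarrow \varphi(x)\bigr).
% Illustrative instance: \varphi(x) \equiv \exists y\,(x = 2 \cdot y)
% introduces the set of even numbers; here \varphi quantifies only over
% numbers, so no vicious circle arises.
```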

5.2 Predicativity II - Technical Developments

Predicativity was sidelined until the 1950s, due in large part to the success of axiomatic set theory in handling the paradoxes. Interest revived in the 1950s after certain developments in recursion theory. But the new logical analyses of predicativity were less concerned with foundational issues, and more concerned with just how far one can extend the conception of the natural numbers. In this way the study of predicativity went beyond Weyl’s. The subsequent logical and philosophical investigations were, as Crosilla [2017] puts it,

the beginning of a study of predicativity which, although of relevance for the philosophical debates on the foundations of mathematics, is carried out independently of predicativism; its principal aims are no more to secure the ultimate justification of (a portion of) mathematics, but to draw a clearer demarcation of the boundary between predicative and impredicative mathematics. We may distinguish between two main objectives: (1) the determination of a

15. For more details regarding Weyl’s work see Weyl [1918] and Feferman [1988b], [2000b].
16. Feferman [1964], pp. 1–2.

theoretical limit of predicativity; and (2) the clarification of the extent of predicative mathematics.17

5.2.1 Set theory

It will be helpful to characterize these developments by first illustrating predicativity in relation to Zermelo-Fraenkel (ZF) set theory and Platonism about sets. In ZF, the set-theoretic paradoxes are handled by restricting comprehension in the form of the Separation Axiom schema:

∀z ∃y ∀x (x ∈ y ↔ (x ∈ z ∧ ϕ(x))),

for every formula ϕ(x) with only x free. To quote Zermelo [1908], p. 202:

By giving us a large measure of freedom in defining new sets, [the Axiom of Separation] in a sense furnishes a substitute for the general definition of set [as ‘a collection, gathered into a whole, of certain well-distinguished objects of our perception or our thought’] that was. . . rejected as untenable. It differs from that definition in that it contains the following restrictions. In the first place, sets may never be independently defined by means of this axiom but must always be separated as subsets from sets already given; thus contradictory notions such as “the set of all sets” or “the set of all ordinal numbers”. . . are excluded.

But Separation, coupled with the Axiom of Infinity and the Power Set Axiom, also results in impredicativity, giving rise, for example, to the real number system and ultimately to the l.u.b. principle. And what’s more, these axioms arguably assume a very Platonistic conception of sets, whereby set existence is independent of human definition and construction. This independence supports the notion of “arbitrary (sub)set” and gives rise to the standard interpretation of ZF: the cumulative hierarchy ⟨V_α⟩_{α∈On}. The cumulative hierarchy is defined by transfinite recursion on the class of ordinals On by transfinite iteration of the power set

17. Crosilla [2017], p. 434.

operation, beginning with the empty set:

V_0 = ∅

V_{α+1} = ℘(V_α)

V_α = ⋃_{β<α} V_β for limit α.

It is perhaps of no surprise that mathematicians are generally undisturbed by impredicative definitions, given the utility, and hence ubiquity, of impredicative definitions in mathematics, coupled with the prevalence of this Platonistic outlook. If one modifies the successor step in the recursive definition of the cumulative hierarchy by replacing the power set operation with the definable power set operation ℘_Def(X), consisting of all the sets definable by formulas ϕ with quantification restricted to X and with all parameters of ϕ elements of X ({x : x ∈ X ∧ ϕ^X(x)}), then one ends up with Gödel’s constructible hierarchy:

L_0 = ∅

L_{α+1} = ℘_Def(L_α)

L_α = ⋃_{β<α} L_β for limit α.

The constructible hierarchy may be considered predicative “in a modified sense”:

[T]he constructible hierarchy may be considered to be entirely predicative except perhaps in its free use of arbitrary ordinals. Since ordinals are the order types of well-ordered sets and those are defined impredicatively by the condition that any nonempty subset has a least element, the constructible hierarchy is not on the face of it predicative. But it may be considered to be predicative in a modified sense, relative to the notion of arbitrary well-ordering or ordinal number.18

5.2.2 Predicative Definability

Definable Autonomy

As noted above, Weyl’s work on predicativity concerned arithmetical definability, taking the natural numbers as given. One of the key insights of those like Feferman was to extend Weyl’s

18. Feferman [2005], p. 603.

strategy into the transfinite. The idea is that not only are the arithmetically definable sets considered predicatively justified, but by quantifying over these arithmetically definable sets of natural numbers, we can define new sets of natural numbers. These statements involving quantification over the arithmetically definable sets of natural numbers are also taken to be clear, given our conception of the natural numbers. This process can be iterated, and is clearly in line with the vicious circle principle and highlights the potentiality of the totality of all sets realized in stages. Iterating in this way gives rise to the ramified analytic hierarchy:

R_0 = N

R_{α+1} = (R_α)*

R_α = ⋃_{β<α} R_β for limit α,

where X* is the collection of all sets definable by some formula ϕ(x) of second-order arithmetic with all quantifiers in ϕ(x) relativized to range over X. Note the similarity to Gödel’s constructible hierarchy above, with the exception that in this case we are dealing at each stage with collections of subsets of N.19 It was claimed at the end of the last section that the constructible hierarchy can be considered predicative “in a modified sense”. The qualification is due to the fact that this procedure relies on the notion of “arbitrary ordinal”, and, as such, is meaningless on the predicative conception. This led Kreisel to propose the following proviso of autonomy: we consider only those limit ordinals α for which there is some γ < α and some set S of ordered pairs ⟨x, y⟩ such that:

(i) S ∈ R_γ; and
(ii) S is a well-ordering of its field, of order type α.20

The proviso suggests introducing collections associated with well-ordering relations. Otherwise put,

if α is a predicative ordinal and W is a well-ordering relation in R_α and if β is the order type of W, then β is predicative.21

19. From now on, unless otherwise specified, our concern will be with sets and relations of natural numbers.
20. Feferman [1964], p. 9.
21. Feferman [2005], p. 605.

This autonomy condition played an important role in Kreisel’s proposal to identify the predicative sets with the hyperarithmetic sets. To explain this proposal, we need to make a brief excursion into recursion theory. The plan for the rest of §5.2.2 is as follows: We first show that the arithmetic sets are those sets that are Turing reducible to finite iterates of the Turing jump on the empty set. We then show that the hyperarithmetic sets are those sets that are reducible to transfinite iterates of the jump operator on the empty set. Equivalently, the hyperarithmetical sets are those that are recursive in Δ^1_1 sets in the Kleene hierarchy.22 This sets the stage to illustrate a significant connection between the predicative hierarchy and the hyperarithmetical hierarchy.

Relative Recursiveness and Reducibility

Given a set X, imagine a computation that, given input n, has the ability to answer the question “is n in X?” Such a computation is said to have an oracle for X. A (partial) function that can be computed by means of an X-oracle is said to be recursive (computable) relative to X. A set or relation is computable relative to X, or X-computable, if its characteristic function, c_X, is computable relative to X. Similarly, a set or relation is recursively enumerable (computably enumerable, c.e.) relative to X if its semi-characteristic function is a partial function computable relative to X. We use “computable relative to”, “recursive relative to”, and “recursive in” interchangeably. That is, we say that A is recursive in B if c_A is a B-recursive function.
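As a minimal executable sketch of these notions (the sets A and B below are toy examples of our own choosing, not from the text): an oracle for B is represented by its characteristic function, and membership in A is decided using only queries to that oracle.

```python
# Sketch: deciding A = {multiples of 4} relative to an oracle for
# B = {even numbers}. The decision procedure for A consults only the
# oracle (the characteristic function c_B), never A itself.

def c_B(n):
    """Characteristic function of B = the even numbers (the oracle)."""
    return 1 if n % 2 == 0 else 0

def c_A_relative(n, oracle):
    """Decide membership in A = {multiples of 4} via oracle queries:
    n is a multiple of 4 iff n is even and n // 2 is even."""
    return 1 if oracle(n) == 1 and oracle(n // 2) == 1 else 0

print([n for n in range(12) if c_A_relative(n, c_B) == 1])  # prints [0, 4, 8]
```

So A is B-computable; here B happens itself to be recursive, but the procedure would work unchanged for a non-recursive oracle.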

Theorem 5.2.1 X is recursive in Y ⇔ X and its complement are both recursively enumerable in Y.

Theorem 5.2.2 X is recursive in Y ⇔ c_X is recursive in Y ⇔ c_X is recursive in c_Y ⇔ X is recursive in c_Y.

Definition 5.2.1

(a) A is one-one reducible to B, written A ≤_1 B, if there exists a one-one recursive function f such that ∀x (x ∈ A ↔ f(x) ∈ B).

22. These technical results, along with their proofs, can be found in the classic Rogers [1987]. See also Ash and Knight [2000] for a nice introduction to these ideas. I have preserved, for the most part, the presentation found in these texts.

(b) A is many-one reducible to B, written A ≤_m B, if there exists a recursive function f such that ∀x (x ∈ A ↔ f(x) ∈ B).
(c) A is Turing reducible to B, written A ≤_T B, if A is recursive in B, i.e. when c_A is computable given an oracle for c_B. We can thus use the terminology “A ≤_T B” and “A is recursive in B” interchangeably.

Note that ≤_1, ≤_m, and ≤_T are reflexive and transitive, and A ≤_1 B ⇒ A ≤_m B ⇒ A ≤_T B.
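A toy illustration of the strongest of these reducibilities (our own example, not from the text): f(x) = x + 1 is a total, injective, recursive function witnessing that the odd numbers are one-one reducible to the even numbers, so the implication chain above applies.

```python
# Sketch: A <=_1 B for A = odd numbers, B = even numbers, via the
# one-one recursive function f(x) = x + 1: x in A  <=>  f(x) in B.

def f(x):
    return x + 1

def in_A(x):
    return x % 2 == 1

def in_B(x):
    return x % 2 == 0

# The reduction turns a decision procedure for B into one for A:
def decide_A_via_B(x):
    return in_B(f(x))

print(all(decide_A_via_B(x) == in_A(x) for x in range(100)))  # prints True
```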

The Jump Operator

Following Ash and Knight [2000], let ϕ_e denote the partial computable function with index e. We can think of ϕ_e as the e-th Turing machine. Let W_e denote the domain of ϕ_e. Since a relation R is r.e. if and only if it is the domain of a partial computable function, the list

W_0, W_1, W_2, ...

consists of exactly the r.e. sets (with repetitions). Let

K = {x : x ∈ W_x}.

K is called the halting set.

Theorem 5.2.3 K is recursively enumerable but not recursive.
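The proof that K is not recursive is by diagonalization, and the diagonal idea can be sketched in code (the finite list of functions below is purely illustrative): flipping the diagonal value produces a function differing from every function on the list; applied to a purported decision procedure for K, the same flip yields a contradiction.

```python
# Sketch of diagonalization: given any list g_0, ..., g_{k-1} of 0/1-valued
# functions, d(x) = 1 - g_x(x) disagrees with each g_e at input e, so d is
# not on the list.

gs = [
    lambda x: 0,           # g_0: constantly 0
    lambda x: 1,           # g_1: constantly 1
    lambda x: x % 2,       # g_2: parity
    lambda x: 1 - x % 2,   # g_3: flipped parity
]

def d(x):
    return 1 - gs[x](x)

# d differs from every g_e on the diagonal:
print([(d(e), gs[e](e)) for e in range(len(gs))])  # prints [(1, 0), (0, 1), (1, 0), (1, 0)]
```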

Definition 5.2.2 A set X is said to be complete for a class C if X ∈ C and all sets in C are Turing reducible to X, i.e., if X ∈ C and (∀Y)(Y ∈ C → Y ≤_T X).

Theorem 5.2.4 For every r.e. set Y, Y ≤_T K.

Corollary 5.2.4.1 K is a complete r.e. set.

Let ϕ_e^X be the e-th Turing machine equipped with an oracle for the set X. Relativizing the definition of the halting set to an arbitrary X, define the Turing jump of X as

X′ = {x : x ∈ W_x^X}.

Define X^(n) for all n ∈ ω, where

X^(0) = X, and X^(n+1) = (X^(n))′.

So the Turing jump of a set X is the complete recursively enumerable set relative to X. Theorem 5.2.5 lists some basic facts about the jump operation:

Theorem 5.2.5
(a) B ≤_T A ⇔ both B and its complement are recursively enumerable in A.
(b) A′ is recursively enumerable in A.
(c) A′ ≰_T A.
(d) B is recursively enumerable in A ⇔ B ≤_1 A′.
(e) A ≤_T B ⇔ A′ ≤_1 B′.

The Arithmetic Hierarchy

We now illustrate the connection between the jump operator and the arithmetical hierarchy. The arithmetical hierarchy is a hierarchy of sets based on the complexity of the formulas that define them. A relation is Σ^0_1 if it can be expressed in the form

∃y R(x, y), where R(x, y) is computable.

A relation S(x) is Π^0_1 if the complementary relation, ¬S(x), is Σ^0_1. More generally, a relation R is in the arithmetical hierarchy if and only if it is recursive or, for some m, can be expressed as

{⟨x_1, ..., x_n⟩ | (Q_1 y_1) ... (Q_m y_m) S(x_1, ..., x_n, y_1, ..., y_m)},

where Q_i is either ∀ or ∃ for 1 ≤ i ≤ m, and S is an (n+m)-ary recursive relation. A relation is Δ^0_n if it is both Σ^0_n and Π^0_n. A relation is arithmetical if it is definable by some formula in the arithmetical hierarchy. By quantifier manipulation, if a relation is either Σ^0_n or Π^0_n then it is both Σ^0_{n+1} and Π^0_{n+1}. Thus a relation is arithmetical if and only if it is Δ^0_n, for some n. And, of course, the above holds similarly for sets.
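The Σ^0_1 case can be made concrete: a set {x : ∃y R(x, y)} with R computable is recursively enumerable, since membership can be semi-decided by an unbounded search for a witness. A sketch (our choice of R, the perfect-square test, is purely illustrative):

```python
# Sketch: semi-deciding the Sigma^0_1 set S = {x : exists y (y * y = x)},
# i.e. the perfect squares. The search halts (returning a witness) exactly
# when x is in S; on non-members it runs forever, which is why this is
# only a semi-decision procedure.

def R(x, y):
    return y * y == x

def semi_decide(x):
    """Return a witness y with R(x, y) if one exists; diverge otherwise."""
    y = 0
    while True:
        if R(x, y):
            return y
        y += 1

print([semi_decide(x) for x in (0, 1, 4, 9, 16)])  # prints [0, 1, 2, 3, 4]
```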

Letting A^(n) be the result of applying the jump operator n times to A, and B ∈ Σ^A_n mean that B is Σ_n-definable relative to A, we have the following:

Theorem 5.2.6 (Kleene, Post) Let any set A be given. Then for all n and any set B:

(a) B ∈ Σ^A_{n+1} ⇔ B is recursively enumerable in A^(n);
(b) B ∈ Σ^A_{n+1} ∩ Π^A_{n+1} ⇔ B ≤_T A^(n).

(b) is sometimes called Post’s Theorem and follows from (a) by Theorem 5.2.1. The unrelativized form of Post’s Theorem gives us:

B ∈ Σ^0_{n+1} ∩ Π^0_{n+1} ⇔ B ≤_T ∅^(n).

So the arithmetic sets are those sets that are Turing reducible to finite iterates of the Turing jump on the empty set. Kleene took this further and defined the hyperarithmetic sets as those sets that are reducible to transfinite iterates of the jump operator on the empty set. Equivalently, the hyperarithmetical sets are the sets that are recursive in Δ^1_1 sets in the analytical hierarchy. The analytical hierarchy extends the arithmetical hierarchy by including Π^1_n and Σ^1_n formulas. It is sometimes called the Kleene hierarchy. Before we get to these results, though, we need to review some basics of ordinal notations and the constructive ordinals.

Ordinal Notations and Constructive Ordinals

Roughly, a system of notation S is a mapping ν_S from D_S ⊆ N onto a segment of the ordinal numbers On. We sometimes write |a|_S = α to denote that a is the notation for α in S.

Definition 5.2.3 A system of notation S is

(a) univalent if ν_S is one-one;
(b) recursive if D_S is recursive;
(c) recursively related if

R_S = {⟨x, y⟩ | x ∈ D_S & y ∈ D_S & ν_S(x) ≤ ν_S(y)}

is recursive.

Note that since D_S = {x | ⟨x, x⟩ ∈ R_S}, a recursively related system must be recursive.

Definition 5.2.4 α is a constructive ordinal if there is a system of notation which assigns at least one notation to α.

Clearly every system of notation assigns notations to at most a countable segment of the ordinals. So the constructive ordinals are all countable. The converse fails. That is, there exist denumerable ordinals which are not constructive. The least non-constructive ordinal is called the Church/Kleene ordinal and is denoted ω_1^CK.

Definition 5.2.5 A system of notation S is maximal if S gives a notation to every constructive ordinal.

Definition 5.2.6 A system of notation S is universal if for any system S′, there is a partial recursive function ϕ, mapping D_{S′} into D_S, such that x ∈ D_{S′} ⇒ ν_{S′}(x) ≤ ν_S(ϕ(x)).

So a universal system is also maximal. It turns out that while there is no maximal recursively related system of notations, there are universal (and hence maximal) systems. Kleene’s O is one such system.

Definition 5.2.7 The system O is defined as follows. We define both ν_O and a partial ordering <_O on D_O.

0 receives notation 1.

Assume all ordinals < γ have received their notations, and assume that <_O has been defined on these notations.

(i) If γ = β + 1, then for each x such that β has x as a notation, γ receives 2^x as a notation; and the ordered pairs ⟨z, 2^x⟩ are added to the relation <_O for all z for which either z = x or ⟨z, x⟩ is already in <_O.

(ii) If γ is a limit, then for each y such that {ϕ_y(n)}_{n=0}^∞ are notations for an increasing sequence of ordinals with limit γ and such that

(∀i)(∀j)[i < j ⇒ ⟨ϕ_y(i), ϕ_y(j)⟩ is already in <_O],

γ receives 3·5^y as a notation; and the ordered pairs ⟨z, 3·5^y⟩ are added to the relation <_O for all z for which

(∃n)[⟨z, ϕ_y(n)⟩ is already in <_O].
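The successor clause above can be sketched concretely (successor clause only; the limit clause requires indices of computable functions and is omitted). Under it, the finite ordinal n receives a notation by iterating x ↦ 2^x starting from 1:

```python
# Sketch: notations of finite ordinals in Kleene's O (successor clause).
# The ordinal 0 receives notation 1; if beta has notation x, then beta + 1
# receives notation 2**x, and the pairs <z, 2**x> with z = x or z <_O x
# are added to <_O.

def notation(n):
    """Notation of the finite ordinal n."""
    a = 1  # 0 receives notation 1
    for _ in range(n):
        a = 2 ** a
    return a

nots = [notation(n) for n in range(5)]
less_O = set()
for x, two_x in zip(nots, nots[1:]):
    # everything already below x, and x itself, goes below 2**x
    less_O |= {(z, two_x) for (z, w) in less_O if w == x} | {(x, two_x)}

print(nots)  # prints [1, 2, 4, 16, 65536]
```

As expected, <_O restricted to these notations mirrors the usual order on the ordinals 0, ..., 4.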

Theorem 5.2.7 (Kleene) O is a universal system of notation.

Corollary 5.2.7.1 The system O is maximal; i.e., it associates a notation with every constructive ordinal.

Corollary 5.2.7.2 Given any univalent system S, there is a partial recursive ϕ mapping D_S into O such that

(i) x ∈ D_S → ν_S(x) = ν_O(ϕ(x));
(ii) x, y ∈ D_S → [x < y ↔ ϕ(x) <_O ϕ(y)].

From these one can also prove that

Theorem 5.2.8 For every constructive ordinal, there is a recursively related, univalent system assigning a notation to that ordinal.

Definition 5.2.8 α is a recursive ordinal if there exists a relation R such that: (i) R is a well-ordering (of some set of integers); (ii) R is recursive; and (iii) the well-ordering given by R is order-isomorphic to α.

This, coupled with Theorem 5.2.8, shows that

Corollary 5.2.8.1 Every constructive ordinal is recursive.

Moreover,

Theorem 5.2.9 (Markwald, Spector) Every recursive ordinal is constructive.

And so an ordinal is recursive if and only if it is constructive. ω_1^CK is thus also the least nonrecursive ordinal.
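As a concrete instance of Definition 5.2.8 (our own illustrative example): the ordinal ω + ω is recursive, witnessed by the recursive relation below, which well-orders N by placing the even numbers, in their usual order, before all the odd numbers.

```python
# Sketch: a recursive well-ordering of N of order type omega + omega.
# R(a, b) holds iff a strictly precedes b: evens come first (in the usual
# order), then odds (in the usual order).

def R(a, b):
    if a % 2 != b % 2:
        return a % 2 == 0   # every even precedes every odd
    return a < b            # within each block, the usual order

# The R-order on 0..9 starts with the evens and then the odds:
print(sorted(range(10), key=lambda n: (n % 2, n)))  # prints [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]
```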

The hyperarithmetic sets

Using O and <_O, Kleene [1955] defined a hierarchy of sets of natural numbers, the hyperarithmetic hierarchy, by iterating the jump operator on ∅ into the transfinite:

H(1) = ∅;
H(2^x) = (H(x))′, for x ∈ O;
H(3·5^y) = {⟨u, v⟩ | v <_O 3·5^y & u ∈ H(v)}, for 3·5^y ∈ O.

So a set is hyperarithmetic if it is Turing reducible to H(a) for some a ∈ O. That is, the hyperarithmetic sets, HYP, are equal to {X ∈ P(ω) : ∃α < ω_1^CK (X ≤_T ∅^(α))}, where α ranges over the recursive ordinals.
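As a small worked example (using the successor coding of Definition 5.2.7, under which the ordinals 0, 1, 2 receive the notations 1, 2^1 = 2, 2^2 = 4):

```latex
H(1) = \varnothing, \qquad
H(2) = H(2^{1}) = \bigl(H(1)\bigr)' = \varnothing', \qquad
H(4) = H(2^{2}) = \bigl(H(2)\bigr)' = \varnothing''.
% By the unrelativized Post's Theorem (5.2.6), the sets Turing reducible
% to H(4) = \varnothing'' are exactly the \Delta^0_3 sets.
```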

Theorem 5.2.10 (Spector)

[x ∈ O & y ∈ O & ν_O(x) = ν_O(y)] ⇒ H(x) ≤_T H(y),

uniformly in x and y. (That is to say, if ν_O(x) = ν_O(y), then an index for c_{H(x)} as an H(y)-recursive function can be found uniformly in x and y.)

“Uniformly” here is essentially meant constructively, in the sense of having an effective procedure.

Corollary 5.2.10.1 [x ∈ O & y ∈ O & ν_O(x) = ν_O(y)] ⇒ H(x) ≡_T H(y), uniformly in x and y.

Theorem 5.2.11 (Kleene) y ∈ O ⇒ H(y) ∈ Δ^1_1, uniformly in y.

Theorem 5.2.12 (Kleene) B ∈ Δ^1_1 ⇒ (∃y)[y ∈ O & B ≤_1 H(y)].

Theorems 5.2.11 and 5.2.12 show that HYP = Δ^1_1.

As concerns the ramified analytic hierarchy defined above, Kleene [1959] proved that HYP = R_{ω_1^CK}.23 It was Kreisel who then tentatively proposed the hyperarithmetical sets as explicating the notion of predicative definability.

Kreisel’s Proposal

The ramified analytic hierarchy is defined by means of the platonistic conception of the ordinals and set theory, and hence is impredicative. So the goal is to identify the initial segment of the ramified analytic hierarchy consisting of the predicatively definable sets of numbers. That is, we must restrict ourselves to appealing only to the predicative ordinals. The problem is that

23. It should be noted, however, that results of Gandy [1960] and Spector [1960] showed that HYP does not exhaust the ramified hierarchy. See Feferman [2005], p. 605.

it would seem that to define the notion of a predicative ordinal we would need to already have an understanding of the predicative sets of natural numbers, and to define the notion of a predicative set of natural numbers it would seem we have to already have an understanding of the predicative ordinals. So we appear to be caught in a circle. The solution is to define the two in tandem. More precisely, a set of natural numbers should be predicative if it belongs to some R_α for a predicative ordinal α; and an ordinal should be predicative if it is the order type of a well-ordering of natural numbers that appears in some predicative R_α. The predicatively definable sets of natural numbers, P, should be the first level of the ramified analytic hierarchy that is obtained by closing under these two conditions.24 If the solution sounds familiar, it should. It is Kreisel’s definable autonomy condition. As Feferman puts it,

Call an ordinal α predicative(ly definable) if it is the order type of a predicatively definable well-ordering W of the natural numbers; then a set should be considered to be predicatively definable if it belongs to R_α for some predicative α.25

And recall that α is a recursive ordinal if there exists a relation R such that: (i) R is a well-ordering (of some set of integers); (ii) R is recursive; and (iii) the well-ordering given by R is order-isomorphic to α.

So all the recursive ordinals are predicative. The result of Kleene then shows that HYP is predicative. And based on a result of Spector [1955] that every hyperarithmetic well-ordering has order type less than ω_1^CK, the predicative sets do not exceed HYP.26 Hence, the identification of the predicatively definable sets with HYP.

In our definition of O, N is taken to be a completed totality. This of course is in line with the predicativist conception. But O also tacitly assumes the concepts of ordinal number and well-ordering. And these are impredicative.27

Though the considerations leading to the identification of the predicative ordinals (resp. sets of natural numbers) with the recursive ordinals (resp. hyperarithmetical sets) have a certain plausibility, they ignored one crucial point if predicativity is only to take the natural numbers for granted as a completed totality: namely, that they involve in an essential way. . . the impredicative notion of being a well-ordering relation.28

The notion of well-foundedness is prima facie impredicative, if it is considered to be defined (as usually done) in terms of prior concepts, since for this we need quantification over all functions. . . Thus to qualify such notions or means of definition as predicative, we would seem to need some more direct understanding of their meaning implicit in our understanding of N.29

24. Koellner [20XX], p. 16.
25. Feferman [2005], p. 605.
26. Feferman [2005], p. 605.
27. Indeed, such assumptions seem to involve in some way a Platonistic outlook. But Predicativism, as we’ve seen above, is motivated in many ways by the rejection of such a set-theoretical platonistic ontology.
28. Feferman [2005], p. 606.
29. Feferman [1979a], p. 90.

This led to a shift of focus from investigating predicative definability to investigating predicative provability, and importantly to a determination of the bounds of predicativity.

The crucial new point. . . is that the predicative ordinals not only are those that can be defined by (what happen to be) well-ordering relations in the given systems, but also must previously be proved to be such relations.30

The idea, also initiated by Kreisel, was that rather than considering the various collections R_α, one deals instead with a transfinite progression of formal systems of ramified analysis RA_α. That is, rather than look at the domains R_α of ramified analysis, one looks instead at the formal systems RA_a that axiomatize those domains. Given W, a binary recursive relation on the natural numbers, let WO^β(W) express that W is a linear ordering such that every nonempty subset X^β of the field of W has a W-least element. Then if one proves WO^0(W), it follows that one can “lift” the proof to establish WO^β(W) for each β that comes to be accepted. In this sense, the proof of the predicatively meaningful statement WO^0(W) can ensure all predicatively meaningful consequences WO^β(W) of the impredicative statement of well-ordering WO(W); indeed, from the outside it ensures that W is a well-ordering. Now Kreisel’s proposal (1958, 1960) can be formulated as follows. The predicative(ly provable) ordinals are generated from 0 by the bootstrap or autonomy condition: if α is predicative and RA_α proves WO^0(W) for a given recursive W, and β is the order type of W, then β is predicative. The least nonpredicatively provable ordinal is then proposed to be the least ordinal which cannot be obtained in that way.31

Through independent work by Feferman and Schütte, it was determined that the least non-predicatively provable ordinal is Γ_0. In Feferman’s case, he appeals to the Veblen hierarchy ⟨χ_α⟩ of critical functions of ordinals.

5.2.3 Predicative Provability

Γ_0

To get a flavor of Γ_0, we follow Smorynski [1982]. We discussed in the last chapter how to extend ordinals into the transfinite up to the ordinal ε_0. We extend that idea even further here.

For all ordinals α, define χ_0(α) = ω^α. The sequence

χ_0(0), χ_0(1), χ_0(2), ...

30. Feferman [2005], p. 606.
31. Feferman [2005], p. 607.

thus corresponds to the transfinite sequence

1, ω, ω^2, ...; ω^ω, ω^{ω+1}, ...; ω^{ω^2}; ...; ω^{ω^ω}; ...

We saw in the last chapter that ε_0 is the limit of the sequence

ω, ω^ω, ω^{ω^ω}, ..., i.e. of the towers of ω of height n, where n ∈ ω.

That is, ε_0 is the least ordinal α such that α = ω^α. Put otherwise, χ_0(ε_0) = ε_0.

Now, given χ_0, we can define the function χ_1 that enumerates all the fixed points of χ_0:

χ_1(0) = ε_0, χ_1(1) = ε_1, ..., χ_1(ε_0) = ε_{ε_0}, ...

Now χ_1 itself has uncountably many fixed points. Let χ_2 enumerate those. χ_2(0) is called the first critical epsilon number and is larger than all of

ε_0, ε_1, ..., ε_ω, ..., ε_{ε_0}, ..., ε_{ε_1}, ....

So, generally, given a function χ_α, we can obtain a new function χ_{α+1} that enumerates all the fixed points of χ_α. If α is a limit ordinal, then χ_α enumerates all the common fixed points of χ_β for all β < α. The functions χ_α are all total, i.e. χ_α(β) is defined for all countable α, β. Each χ_α is also strictly increasing. Now define

γ_0 = χ_0(0)

γ_{n+1} = χ_{γ_n}(0)

and let Γ_0 = sup{γ_n : n ∈ ω}.32 Then

Γ_0 = χ_{Γ_0}(0).

32. More generally, let Γ_α be the α-th fixed point ξ of the equation χ_ξ(0) = ξ.
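Unwinding the definition, the first terms of the sequence are (a worked sketch, using the fact that χ_1(0) is the least fixed point of χ_0, i.e. ε_0):

```latex
\gamma_0 = \chi_0(0) = \omega^0 = 1, \qquad
\gamma_1 = \chi_{\gamma_0}(0) = \chi_1(0) = \varepsilon_0, \qquad
\gamma_2 = \chi_{\gamma_1}(0) = \chi_{\varepsilon_0}(0), \qquad \ldots
% \Gamma_0 = \sup\{\gamma_n : n \in \omega\} is then the least \xi with
% \chi_\xi(0) = \xi.
```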

What Feferman (and Schütte) showed is that Γ_0 is the least non-predicatively provable ordinal. To see how this works we need to shift our focus to transfinite progressions of formal systems of ramified analysis RA_α.

RA_α

The condition of provable autonomy outlined in the quote at the end of §5.2.2 shows that we are interested in countable ordinals which are the order types of recursive well-orderings of natural numbers. Feferman starts with a restricted version of O and <_O:

Definition 5.2.9 a < b is a recursively enumerable relation such that, for all a, b:

(i) a ≮ 0;
(ii) a < 2^b ↔ a < b ∨ a = b;
(iii) a < 3·5^b ↔ ∃n (a < [b]_n),

where [b] is the b-th primitive recursive function in a standard enumeration and [b]_n means [b](n).

Definition 5.2.10 O is the smallest set containing 0 and closed under (i)–(iii). O is partially ordered by <.

So, for example, the notations 0, 1, 2, 2^2, 2^{2^2}, 2^{2^{2^2}}, ... represent the ordinals 0, 1, 2, 3, 4, 5, .... At limit ordinals the notations are the numbers 3·5^b, for ϕ_b a total computable function such that ϕ_b(0) <_O ϕ_b(1) <_O ϕ_b(2) <_O ..., and α the least upper bound of the sequence of ordinals α_n with notation ϕ_b(n). The theories that Feferman [1964] considers are formalized in second-order number theory. The basic axioms are as follows:

(Ax1) Pure classical logic for types 0, 1.
(Ax2) The axioms for identity.
(Ax3) The standard recursion equations for each primitive recursive function.
(Ax4) Complete induction.

Let S be any consistent system satisfying at least Ax1–Ax4. To facilitate the discussion of proofs of well-orderings, Feferman introduces the following abbreviations. For any formula F(x):

(i) Prog_x F(x) = ∀x[x ⪯ x ∧ ∀y < x F(y) → F(x)];
(ii) I_x(F(x); z) = Lin_<(z) ∧ [Prog_x F(x) → ∀x < z F(x)];
(iii) I(z) = ∀X I_x(x ∈ X; z),

where Lin_<(y) is the conjunction of sentences expressing that < is a linear ordering of those x such that x ⪯ y (here “x ⪯ x” expresses that x is in the field of the ordering). Given N as a model for objects of type 0, Feferman takes it that we ought to accept the ω-rule:

from F(0), F(1), F(2), ..., F(n), ... infer ∀x F(x).

Let S be as above and let S_a be a particular r.e. set of axioms. Let Pr(a, u) mean that the formula U with Gödel number u is provable from S_a by means of S. We can then formalize the ω-rule for S_a, consisting of all instances of

∀x Pr(a, ⌜F(x̄)⌝) → ∀x F(x),

where ⌜F(x̄)⌝ denotes the Gödel number of the result of substituting the numeral for x in F.

Let S_{a⊕1} denote the new recursively enumerable set of axioms that results from adjoining all instances of the formalized ω-rule to S_a. One can iterate this procedure and define a recursive progression of theories S_a based on the formalized ω-rule:

(i) S_0 consists simply of the axioms of (S);
(ii) for any a, S_{a⊕1} consists of S_a together with all sentences of the form

∀x Pr(a, ⌜F(x̄)⌝) → ∀x F(x);

(iii) for any a, S_{3·5^a} = ⋃_{n<ω} S_{[a]_n}.

The question then is to determine which S_a are predicatively admissible. The idea, in line with Kreisel’s condition, is that if “(S) is correct under a certain interpretation in which the variables of type 0 range over the natural numbers, then so is S_a for each a ∈ O.”33 So our concern now is to determine which a are in O. The appeal to oracles, as involved in the Turing jump, is insufficient given our (epistemic) predicative concerns. The solution is to apply Kreisel’s autonomy condition. Roughly, the proviso says that we accept only the S_a for which there is some b <_O a such that S_b is accepted and S_b ⊢ a ∈ O.

33. Feferman [1964], p. 20.

We now consider the ramified progressions RA_a. We introduce, in addition to first-order variables ranging over natural numbers, variables S^a, T^a, ..., X^a, Y^a, Z^a of degree a, for each a. The idea is that X^a ranges over the sets corresponding to the a-th level of the ramified hierarchy.

Definition 5.2.11 A graded formula is a formula that has a definite degree associated with every set variable it contains.

Definition 5.2.12 Given a graded formula F, let d(F) be the maximum of all a ⊕ 1 such that a variable X^a occurs bound in F and of all b such that a variable Y^b occurs free in F. d(F) is called the real degree of F. Let d*(F) be the maximum of all a such that a variable X^a occurs (free or bound) in F. d*(F) is called the apparent degree of F.

Definition 5.2.13 (RC_a) (Ramified comprehension axioms). For each a ⪯ c and each graded formula F with d(F) ⪯ a we take the axiom

∃X^a ∀x[x ∈ X^a ↔ F(x)]

(X^a not free in F).

Definition 5.2.14 (LG_a) (Limit generalization rule). For any graded formula F(X^a) with just X^a free, a any limit notation ⪯ c, and d any notation with d ⊕ 1 ⪯ c:

∀z < a Pr(d, ⌜∀X^z F(X^z)⌝) → ∀X^a F(X^a).

The system RA_c is defined as follows: The formulas of RA_c are all F with d*(F) ⪯ c. The axioms of RA_c are:

(i) the basic logical axioms Ax1–Ax4;
(ii) all instances of induction for formulas F with d*(F) ⪯ c;
(iii) the ramified comprehension axioms (RC_a) for all a ⪯ c;
(iv) all instances of the formalized ω-rule for each a ⊕ 1 ⪯ c, i.e. all instances of ∀x Pr(a, ⌜F(x̄)⌝) → ∀x F(x), where d*(F) ⪯ a;
(v) the limit generalization rule (LG_a).

Definition 5.2.15 We call c autonomous with respect to (the progression based on) (RA) if it belongs to the smallest set A such that:

(i) 0 P A;

(ii) if d P A and RAd $ Ibpaq (so b ĺ d) then a P A, where Ibpzq “ @Xb Ixpx P Xb; zq.

Proposal A graded formula (i.e., formula of ramified analysis) F is predicatively provable if RAc $ F for some c autonomous with respect to the progression of theories RAa.

Definition 5.2.16 We denote by AutpRAq the least α such that for all c autonomous with respect to (RA) we have |c| ă α.

Feferman was able to show that AutpRAq ď Γ0 and Γ0 ď AutpRAq, and hence AutpRAq “ Γ0. Γ0 is thus said to be the limit of predicativity.

5.3 Philosophical ‘Ramifications’

5.3.1 Independent characterizations

One might wonder what justifies Feferman’s use of ramification when it was considered the bane of Russell’s system. After all, the actual development of mathematics becomes very difficult with ramification and leads, in Russell’s case, to the adoption of the seemingly ad hoc axiom of reducibility. On the other hand, the idea of ramification does seem to nicely capture the general philosophical picture of avoiding impredicative definitions by means of proceeding in stages. The strategy is to come up with independent unramified systems and justify them by means of proof-theoretic reductions. The redevelopment of mathematics under predicative strictures in practice is better represented logically in terms of unramified formal systems which are shown to be predicatively justified by their proof-theoretical reduction to the autonomous progression of ramified systems.34

The idea is that the unramified systems are better suited for mathematical development, and they are justified by being proof-theoretically reducible to the ramified system. The unramified systems are said to be predicatively reducible, and hence, in a sense, predicatively justified—though, as we’ll see, just what this justification amounts to requires some unpacking.

34Feferman [2004], p. 315.

If T has the same proof-theoretic strength as that progression, then its proof-theoretic ordinal is Γ0. In that case, though the system T as a whole may not be justifiable predicatively, each theorem ϕ of T rests on predicative grounds, at least indirectly. In practice, more can be said: T is conservative over the autonomous ramified progression for arithmetic sentences (i.e., if ϕ is arithmetical and provable in T, then it is provable in that progression). For second-order T this can often be strengthened to conservativity for Π11 sentences (i.e., for ϕ of the form p@XqA, where A is arithmetical), which on the ramified side is taken to be p@X0qA. In particular, in that case, any provable well-ordering of T is also predicatively provable.35

Moreover, providing independent characterizations would also seemingly bolster the claim that RAΓ0 captures predicativity.36

HCăΓ0

Already in the 1964 paper, Feferman considered transfinite progressions of unramified systems, HCα, based on the Hyperarithmetic p∆11q Comprehension Rule (HCR):

From p@xqrP pxq Ø Qpxqs, infer pDXqp@xqrx P X Ø P pxqs. (∆11-CR)

where P pxq is any Π11 formula and Qpxq is any Σ11 formula (parameters allowed). Feferman showed that Γ0 is the least autonomous ordinal for these systems, and also showed that HCăΓ0 ” RAΓ0 , and that HCăΓ0 is conservative over RAΓ0 for Π11 statements.

IR and ATR0

Feferman [1964] also considered the single second-order system IR, axiomatized by HCR plus the full schema of transfinite induction on a recursive ordering W, provided one has established WO(W), and a corresponding inference for a schema of transfinite recursion. The “I” in “IR” is for Induction, while the “R” is for Recursion.37 Following Simpson [2002], IR consists of the following inference rules:

1. The ∆11 Comprehension Rule:

35Feferman [2005], p. 607-608. 36Though see Weaver [2009] for the charge that Feferman has cooked up these independent constructions in order to establish the reductions. 37Feferman [2005], p. 609.

@nppDXαpn, Xqq Ø p@Y βpn, Y qqq , DZ@npn P Z Ø DXαpn, Xqq

where α and β are arithmetical formulas.

2. The Hierarchy Rule:

WOpăeq , @XDY Hpăe,X,Y q

where WOpZq is a Π11 formula expressing that Z is a well-ordering of the integers, ăe is a primitive recursive linear ordering of the integers, and HpZ, X, Y q is an arithmetical formula expressing that Y is a Turing jump hierarchy along Z starting at X.

3. The Transfinite Induction Rule:

WO(ăe) , TI(ăe, γ)

where TIpZ, γq expresses transfinite induction along Z with respect to γ, and WO(Z) and ăe are as above.

Feferman was able to show that IR proves the same theorems as ŤαăΓ0 HCα. This established that IR is predicatively reducible to ŤαăΓ0 HCα, and that one has conservativity for Π11 formulas.38 The above rules may also be written as axioms:

1. The ∆11 Comprehension Axiom:

@nppDXαpn, Xqq Ø p@Y βpn, Y qqq Ñ DZ@npn P Z Ø DXαpn, Xqq

2. The Hierarchy Axiom:

@ZpWOpZq Ñ @XDY HpZ, X, Y qq

3. The Transfinite Induction Axiom:

@ZpWOpZq Ñ TIpZ, γqq

38Feferman [2005], p. 609.

The system ATR0 is obtained by adjoining the Hierarchy Axiom to the system ACA0. (Recall that ATR stands for Arithmetical Transfinite Recursion.) ATR0 also includes the ∆11 Comprehension Axiom, but since it is a system with restricted induction (as indicated by the subscript 0), it does not include the Transfinite Induction Axiom.39

It was shown by Friedman et al. [1982] that IR and ATR0 are proof-theoretically similar. In particular, they both have proof-theoretic ordinal Γ0; they both prove the same Π11 sentences; and they are both of the same proof-theoretic strength. They are, however, importantly different. According to Simpson [2002], IR explicates predicative provability, while ATR0 explicates predicative reducibility. Moreover, ATR0 is much stronger than IR, both model-theoretically and mathematically.

Unfolding

Despite the mathematical interest that the above results have, one might still worry about their philosophical significance. The problem is that the autonomy constraint implicit in these axioms and inference rules still appeals in some sense to the notion of a well-ordering when discussing “provable well-orderings”, even if not as explicitly as in the case of predicative definability discussed above. After all, “If the predicativist cannot say what it is to be a well-ordering, then how can she prove that some relation is a well-ordering? How can she say what it is that she has proved?”40

The principal autonomous condition considered is that one already has a proof of a formal statement which expresses, under an impredicative interpretation, that the ordering is a well- ordering. The orderings satisfying this condition will be called ‘predicative well-orderings,’ though strictly speaking we have given no independent meaning to ‘well-ordering’ in predicative terms.41

The idea is that the notion of well-foundedness, though perhaps epistemically accessible, is not, strictly speaking, contained in the concept of natural number, and so not predicative.

the notion of well-foundedness is prima-facie impredicative, if it is considered to be defined (as usually done) in terms of prior concepts, since for this we need quantification over all functions.

39Simpson [2002], p. 131. 40Linnebo and Shapiro [20XX], p. 23. 41Feferman [1979a], p. 85, as quoted in Koellner [20XX], p. 21.

. . . Thus to qualify such notions or means of definition as predicative, we would seem to need some more direct understanding of their meaning implicit in our understanding of N.42

Considerations of this sort led Feferman to give a characterization of Predicativity that does not depend at all on the concept of a well-ordering. This led to a shift to consider, following Kreisel [1970], what results from “reflecting” on a theory T. Early ideas centered around the schematic nature of theories like PA and ZFC, schematic because of the induction schema, comprehension schema, and collection schema. One modifies the language of such schematic theories by adding an untyped truth predicate and the corresponding Kripke-Feferman axioms for type-free truth. The additions represent the “reflective closure” of the schematic theories. More generally, Feferman considered frameworks that capture what it is to “unfold the content of a theory”. The more general program of unfolding asks, in addition to the truth predicate and corresponding axioms, which operations and predicates one should accept given one has already accepted some theory.

The question which the notion of unfolding is supposed to address is: given a schematic system S, which operations and predicates—and which principles concerning them—ought to be accepted if one has accepted S? The answer for operations is straightforward: any operation from and to individuals is accepted which is determined explicitly or implicitly from the basic operations of S. Moreover, the principles which are added concerning these operations are just those which are derived from the way that they are introduced. . . . The question concerning predicates in the unfolding of S is treated in operational terms as well; that is, which operations on and to predicates—and which principles concerning them—ought to be accepted if one has accepted S? For this, it is necessary to tell at the outset which logical operations on predicates are taken for granted in S.43

It turns out that the unfolding of non-finitist arithmetic, U(NFA)—essentially PA in this setting—is equivalent to RAăΓ0 , i.e.

U(NFA) ” RAăΓ0 ,

and that one also has conservativity for arithmetical statements.

42Feferman [1979a], p. 91, as quoted in Koellner [20XX], p. 21. 43Feferman [2005], pp. 616-617.

Limitations

Among the questions that foundational and philosophical positions within the philosophy of mathematics concern themselves with are: How much of mathematics is entirely clear and definite? Can we determine a limit of such reasoning? Can we calibrate these limits with a philosophical stance? What is the foundational significance, be it ontological or epistemological, of these limitative results? That is, can we bring together the philosophical with the mathematical? A further question is whether this limit really does exhaust what is clear and definite, or whether once we know the limits of these things we can reflect in some way and understand further things.

In the case at hand, even if it is correct that Γ0 exhausts what is predicatively knowable, and even if the Predicativist is correct that predicativist reasoning is epistemically privileged, the predicativist will never be able to know the extent of his own reasoning.

For, whenever he can recognize that all theorems of a certain sort are correct, he can also recognize that the statement of consistency of this set is correct. However, the consistency of the set of all predicatively provable statements is not itself predicatively provable.44

Similarly, were one to suppose that ‘predicatively definable ordinal’ is itself a predicative notion, such a condition should give rise to a form of predicative comprehension and hence provide a predicative construction of Γ0, which, according to the logical work exposited above, is the least impredicative ordinal. And such argumentation extends to various other ontological and epistemological theses that the predicativist might put forward. Alternatively, beginning with the idea that there should be only countably many predicative real numbers, one can easily diagonalize out in the standard—and predicatively acceptable—way.45 So one can never be in a position to demarcate the limits of predicatively meaningful mathematics in a way acceptable to the predicativist. As Hellman puts it,

[T]he very effort to articulate such [limitative] theses, gven [sic] the precision available to us from all the logical work. . . reveals them to be self-defeating: the predicativist implicitly tran- scends predicativity him/herself in the very formulation of the limitative theses! 46

44Feferman [1964], p. 4. While this quote addresses the particular case of characterizing predicativity in terms of transfinite progressions, the central point applies to any of the characterizations. 45Hellman [2004], pp. 299-300. 46Hellman [2004], p. 299.
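The diagonalization alluded to above is completely elementary. A minimal sketch, where reals are represented by their 0/1 digit functions and the enumeration used in testing is only an illustrative stand-in:

```python
def diagonalize(enum):
    """Given an enumeration enum : n -> (k -> k-th binary digit of the n-th
    real), return the digit function of a real that differs from the n-th
    enumerated real at its own n-th digit.

    This is Cantor's standard construction: whatever countable list is
    offered, the diagonal real is not on it.
    """
    return lambda n: 1 - enum(n)(n)
```

The point in the text is that the construction uses only means the predicativist accepts, so no predicatively acceptable enumeration can exhaust the predicative reals.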

We’ve already seen how this works in the case of the finitist in our discussion of the internal/external distinction at the end of the last chapter. The problem, pithily captured in the title of Burgess [2010], is that of being on the outside looking in. Thus, to quote Gandy [1967],

The role played by Γ0 for predicative systems is closely analogous to that played by ε0 for finitist systems. Γ0 is not a predicatively definable ordinal, but he who understands Γ0 understands the consistency, the potentialities and the limitations of predicative proof.47

And the point generalizes. Even if there is a limit to one’s understanding, the determination of that limit will essentially involve methods that go beyond what is epistemically acceptable to the one philosophically aligned with that limit.

This is a general difficulty that tends to arise whenever one advances a precisely demarcated limitative view concerning the extent of meaningful and acceptable statements of mathematics. For example, it also arises for the strict finitist and the finitist. The trouble is that if one gives a precise characterization of a domain of mathematics and says “This includes all and only the meaningful and acceptable mathematical statements” then, provided the characterization is itself meaningful and acceptable, one will generally, by reflection, be able to arrive at a meaningful and acceptable statement that lies outside of the limitative view. The only recourse in such a case seems to be to accept that one has, by one’s lights, run up against the limits of articulation and, instead of attempting to say the unsayable, rather gesture at it.48

ε0 and Γ0 are similar in the sense that they both determine the “bounds” of philosophically informed (or based) reasoning. And like the finitist, the best the (ideal) predicativist can do is to recognize any particular proposition as being predicatively meaningful and correct. But perhaps this is not a problem. In the case of Predicativism, why, one might ask, should the ideal predicativist care about being able to succinctly demarcate the limits of her view? Perhaps she can be content that at each stage, given any particular proposition that is predicative, she is safe. Indeed, perhaps the lesson to be drawn from all of this is that the philosophical import should not be framed in terms of any limitative theses regarding the extent of Predicativity (or any foundational thesis for that matter) as a philosophical position, since the Predicativist is not in a position to appreciate those. Instead, perhaps the extent of any foundational or philosophical significance should focus on the question of “What rests on what?”, or on the force of various indispensability arguments. Both of these avenues have been pursued by Feferman and others. In the next chapter we will

47As quoted in Crosilla [2017]. 48Koellner [20XX], p. 27.

investigate this question of “What rests on what?” in order to get clear on the foundational and philosophical significance of proof-theoretic reductions. And though I shall not address indispensability arguments directly, what I have to say has obvious import for that line of reasoning.

Chapter 6

On the Philosophical Significance of Proof-Theoretic Reductions

The proof-theoretical reduction of the system WKL0 to PRA is taken as perhaps the clearest example of a partial realization of Hilbert’s Program. The concept of reductive proof theory more generally is taken as being foundationally informative in the sense that certain reductions and conservativity results are taken as revealing just how much mathematics is justified on the basis of other, more elementary frameworks. Implicit in this is the notion of epistemic security. We consider as case studies the reductions of the systems IΣ1 and WKL0 to PRA in order to get clear on exactly what such justification and security amounts to. We argue that to properly understand what is meant by these terms requires a closer look at the radicalization of the axiomatic method and the shift to formalism that underlies Hilbert’s Program. After looking at Hilbert’s famous disagreement with Frege we suggest that central to the notions of justification and security is the meaning of what is expressed in formal languages that is eschewed by the shift to formalism. To appreciate this meaning and hence achieve the level of justification and security that is claimed by proof-theoretical reductions requires one to be on the outside of the formal system looking in.

6.1 Introduction

As we have seen, Hilbert’s aim, broadly speaking, was to address the methodological issue of using so-called abstract methods in proofs of concrete number-theoretic statements. The goal was to justify the epistemologically and ontologically suspect “actual” infinite, and to do so by unproblematic finitary means. One standard understanding of this vague characterization is instrumentalist in spirit. In essence, Hilbert sought to convince the finitist that the methods

of infinitistic, “meaningless” classical mathematics could safely be used as an instrument for proving facts about the meaningful, finitary domain. A finitist proof of the consistency of classical mathematics would provide such a justification of the instrumental use of classical mathematics. The rough idea is captured in Smorynski [1977], p. 824:

Let R, I denote formal systems encoding real statements with their finitistic proofs and ideal systems with their abstract reasoning, respectively. Let ϕ be a real statement @xpfx “ gxq. Now, if I $ ϕ, then there is a derivation, d, of ϕ from I. But, derivations are concrete objects and, for some real formula P px, yq encoding derivations in I,

R $ P pd, xϕyq,

where xϕy is some code for ϕ. Now, if ϕ were false, one would have fa ‰ ga for some a and hence, R $ P pc, x¬ϕyq for some c. In fact, one would have the stronger assertion

R $ fx ‰ gx Ñ P pcx, x¬ϕyq,

for some cx depending on x. But, if R proves [the] consistency of I, we see

R $ ¬pP pd, xϕyq ^ P pc, x¬ϕyqq,

whence R $ fx “ gx, with free variable x, i.e. R $ @xpfx “ gxq.

So, assuming R $ ConI, I $ ϕ ñ R $ ϕ, for ϕ P Π01. Burgess puts the situation nicely:

If [a Π01] statement is classically provable, so are all its numerical instances, and if it is untrue, so is at least one of its numerical instances; but an untrue numerical instance would also be disprovable, simply by exhibiting the relevant computations; hence if an untrue such statement were classically provable, there would be an inconsistency in classical mathematics; hence a finitist proof of the consistency of classical mathematics amounts to a finitist proof that every classically provable such statement is true.1

This argument highlights an important concept in proof theory: reflection principles. Reflection principles are schematic assertions of soundness. They essentially say that anything provable (in some system) is true. Letting P be a formal theory for classical mathematics, and F the formal theory representing finitist mathematics, the consistency of P is equivalent in F to the reflection principle for P

p@xqpPrP px, ‘ϕ’q ñ ϕq,

where PrP is the proof predicate for P, ϕ a finitist statement, and ‘ϕ’ the corresponding

1Burgess [2010], p. 129.

formula in the language of P. Because of this equivalence, a finitist consistency proof would allow one to transform any proof of ‘ϕ’ in P into a finitist proof of ϕ. In this way one can eliminate ideal elements from proofs of real statements, thus showing that the formal theory P can be considered an instrument for proving finitist statements.2 The well-known difficulties that Gödel’s theorems posed for Hilbert’s Program, as originally intended, led some, notably Bernays, to the conclusion that the methods countenanced as epistemically (and ontologically) privileged must be expanded beyond the mere finitary to the constructive, a notion that itself comes in degrees. This is the first step toward a generalization of Hilbert’s Program and it is epitomized by Gentzen’s consistency proof of

PA by appeal to transfinite induction up to the ordinal ε0 for primitive recursive predicates. Work in this tradition notably continued in the Takeuti and Schütte schools. The general idea of such programs is to provide a finitistic description of the ordering relation of a particular ordinal notation for ordinals up to some ordinal α, a finitistic proof that the principle of transfinite induction up to α implies the consistency of the formal system under consideration, and a constructive proof of said principle of transfinite induction.3 While mathematically interesting, articulating the philosophical value of the generalized program is less clear. We broached some of these issues, specifically with respect to Takeuti’s work, in chapter 4. Feferman captures the general spirit:

The problem with such extended forms of H.P. is that there are many different styles of con- structivity, and the concept of constructivity in general is much less clear than even finitism. Moreover, as the systems of ordinal notation used for consistency proofs of stronger and stronger theories become more and more complicated, the significance to noncognoscenti of what is thereby accomplished decreases in inverse proportion. Thus, on the one hand, to say that one has obtained a constructive consistency proof of a theory T —without saying anything more—is too general to be informative; and, on the other hand, to say that the proof has been carried out by transfinite induction on a certain complicated recursive ordering for some very large ordinal tells us nothing about what constructive principles are involved in the proof of its well-ordering.4

Partly owing to considerations of these sorts, Kreisel further expanded Bernays’s thought,

2This informative characterization of Hilbert’s Program in terms of the generalized reflection principle is due to Sieg [1990], p. 309. See also Sieg [1991]. 3Feferman [2000a], p. 70. 4Feferman [1988a], p. 367, italics mine. See also Feferman [1993a], p. 191, and Feferman [2000a], pp. 70-71.

promoting a “hierarchy of Hilbert programs”. The mature Hilbert’s Program connected the consistency problem with understanding the concept of infinity. According to Kreisel, the elimination of transfinite symbols from proofs of formulae not containing those symbols amounts to an understanding of the use of such transfinite machinery, from the finitist point of view. The idea behind Kreisel’s proposal is that

instead of having a single kind of elementary reasoning whereby we understand the use of transfinite symbols, there will now be methods of reasoning involving a hierarchy of conceptions such as, e.g. more and more abstract conceptions of a «construction», and when we have a hierarchy of Hilbert programmes of discovering the appropriate complex of such methods which is needed for understanding the use of transfinite symbols in given systems (modified Hilbert programme).5

The goal of Kreisel’s modification of Hilbert’s program is thus:

To determine the constructive (recursive) content or the constructive equivalent of the non-constructive concepts and theorems used in mathematics, particularly arithmetic and analysis. . . . The (or a) constructive content of non-recursive formulae will be expressed by means of constructive ones; the purpose of the so-called finitist or constructive consistency proofs of a system consists, for us, not in the alleged greater “evidence” or “reliability” of constructive proofs compared with non-constructive ones, but in this: they help keep track of the constructive (recursive) content of the steps in the (non-constructive) proofs of the system considered.6

What is important for our purposes is not so much the extent of Kreisel’s unwinding program and its various specific applications.7 Rather, our concern is with a central ingredient of Kreisel’s Program, namely his notion of an interpretation.8 We define the notion here and revisit it later; it will emerge as one of the crucial aspects to understanding the philosophical upshot of the proof-theoretical results discussed in the sequel.

Definition 6.1.1 (Recursive Interpretation) A recursive interpretation of a system F in some subsystem F1 consists of two recursive functions, αpA, nq and πpP q, such that

(i) α maps each formula A in the language of F to some formula An of F1, and π maps each proof P in F to a proof Pn in F1 of some formula An; (ii) A can be proved from each An in F; (iii) if A is Π02, i.e. of the form @xDyApx, yq, then An is Apx, φnpxqq, where the φn are the provably recursive functions of F.

5Kreisel [1958a], p. 349. 6Kreisel [1958b], pp. 155, 156. 7See Kreisel [1951/52], [1958b] and Feferman [1996] for more on Kreisel’s Unwinding Program. 8Kreisel [1958b], p. 160.

Herbrand’s theorem and Kreisel’s no-counterexample interpretation are notable examples of recursive interpretations.9 The rest of the chapter proceeds as follows. In §6.2 we highlight the connection between proof-theoretic reductions (as defined in chapter 2) and Kreisel’s hierarchy. In doing so we

sketch several important weak subsystems, notably IΣ1 and WKL0, in order to establish in §6.3 a proof, due to Wilfried Sieg, that WKL0 is proof-theoretically reducible to PRA. Such a result is important because it is taken as perhaps the clearest example of a reduction in the original spirit of Hilbert, and hence as the paradigm example of a partial realization of Hilbert’s Program. This prima facie foundational significance, together with the elementary nature of these systems, makes them ideal case studies for evaluating the philosophical significance of proof-theoretic reductions generally. Such significance is discussed in §6.4.

6.2 Proof-theoretic reduction

Kreisel’s hierarchy of Hilbert’s programs can be fruitfully understood within the general framework of a proof-theoretic reduction. Following Sieg,

The crucial tasks of this general reductive program are: (i) find an appropriate formal theory P˚ for a significant part of classical mathematical practice, (ii) formulate a “corresponding” constructive theory F˚, and (iii) prove in F˚ the partial reflection principle for P˚, i.e.

Pr˚pd, ‘s’q ñ s10

for each P˚-derivation d. Pr˚ is here the proof-predicate of P˚ and s an element of some class F of formulas. The provability of the partial-reflection principle implies the consistency of P˚ relative to F˚.11

The notions of reducibility and relative consistency give rise to hierarchies of reducibility and consistency strength, which, due to unpublished results of Friedman, coincide in most cases.12 We say that a theory T1 is of no greater consistency strength than T2 if one can

9See Kreisel [1958b]. 10Sieg writes Pr˚pd, ‘s’ ñ sq, but presumably this is a typo. 11Sieg [1990], p. 310. 12For a fantastic overview of this hierarchy and its relations to Frege-inspired reconstructions of mathe- matics, see Burgess [2005]. As Burgess notes, the “authoritative compendium[s]” of the lower, middle, and upper part of this scale are Hájek and Pudlak [1998], Simpson [2009], and Kanamori [2009], respectively.

prove the consistency of T1 relative to T2. If in addition one can prove the consistency of T2

relative to T1, then T1 and T2 are said to be of equal consistency strength. It is a remarkable fact that virtually all theories of natural foundational interest can be shown to be of equal consistency strength with one of a handful of fundamental theories. These theories form the “spine”, so to speak, of a hierarchy of increasing strength. What is quite remarkable is that in many cases two theories of prima facie different logical and expressive strength are shown to be of equal consistency strength by virtue of both being equivalent to one of the theories located on the spine. And this has prima facie foundational and philosophical import:

The fundamental series provides a scale by which the degree of success of a foundational program may be measured. If the program produces a theory T to which theories fairly high up on the fundamental series can be reduced, then it is correspondingly fairly successful, at least by one important measure of success.13

We shall consider the status of this measure of success below. At the very bottom of the scale we have the system Q, known as Robinson arithmetic,

consisting of the usual axioms for successor p1q, plus p`q, and times p¨q. It is well known that Q is very weak mathematically and that its intended purpose was to determine the scope of Gödel’s first incompleteness theorem. Now there are two primary ways that one can extend a theory. One way is to enlarge the class of formulas over which one can perform induction. If we take Q and add the full scheme of mathematical induction, we get Peano Arithmetic. If instead we add induction only for bounded formulas, then we end up with the system

sometimes called I∆0.14 Extending induction to Σ01 formulas results in the system IΣ1, known as Parsons arithmetic. And one can continue in this fashion to arrive at the systems IΣ2, IΣ3, IΣ4, . . . . The union of all of these theories results in the extension to induction for all formulas, and is a form of Peano Arithmetic. Another way to extend Q is to add particular operations. Following Sieg [1991], we introduce a characterization of the Grzegorczyk hierarchy due to Ritchie [1965]. Consider the following sequence of number-theoretic functions:

13Burgess [2005], p. 54. 14A formula is bounded, or ∆0, if its quantifiers are all bounded (where bounded quantification is defined by letting @x ă tpϕq mean @xpx ă t Ñ ϕq and Dx ă tpϕq mean Dxpx ă t ^ ϕq).
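Bounded quantifiers, as defined in footnote 14, range over only finitely many values, which is why ∆0 properties are effectively decidable. A small illustrative sketch (the helper names are ours, purely hypothetical):

```python
def forall_below(t, phi):
    """∀x < t ϕ(x): a bounded universal quantifier, decided by finite search."""
    return all(phi(x) for x in range(t))

def exists_below(t, phi):
    """∃x < t ϕ(x): a bounded existential quantifier, decided by finite search."""
    return any(phi(x) for x in range(t))

# A sample ∆0 property: "n has a divisor d with 1 < d < n", i.e. n is composite.
def is_composite(n):
    return exists_below(n, lambda d: d > 1 and n % d == 0)
```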

A0px, yq “ y1,
An`1px, 0q “ x if n “ 0; 0 if n “ 1; 1 if 2 ď n,
An`1px, y ` 1q “ Anpx, An`1px, yqq.

The reader can confirm that A1 defines addition, A2 multiplication, A3 exponentiation, A4 superexponentiation, and so on. The Grzegorczyk hierarchy is obtained by letting En be

the smallest class of number-theoretic functions that contains λx.0, λx.x1, and Am for m ď n, and that is closed under explicit definition and bounded recursion. E3 famously corresponds to the Kalmár elementary functions. The union of all the levels, Eω, coincides with the primitive recursive functions.
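The recursion equations for the An can be checked concretely. A minimal sketch (the function name is ours, not Ritchie's):

```python
def A(n, x, y):
    """Ritchie-style generating functions for the Grzegorczyk hierarchy:
    A_0 is the successor (in y), and each A_{n+1} iterates A_n."""
    if n == 0:
        return y + 1                    # A_0(x, y) = y'
    if y == 0:
        # A_{n+1}(x, 0) = x if n = 0, 0 if n = 1, 1 if 2 <= n
        return {1: x, 2: 0}.get(n, 1)
    # A_{n+1}(x, y + 1) = A_n(x, A_{n+1}(x, y))
    return A(n - 1, x, A(n, x, y - 1))
```

One can confirm, for instance, that A(1, x, y) computes x + y, A(2, x, y) computes x · y, and A(3, x, y) computes x^y, as the text asserts.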

The relationship between Eω and IΣ1 is taken as providing one of the earliest and most basic forms of a modified Hilbert program. In particular, it can be shown that IΣ1 proves the existence and uniqueness assumptions for all of the primitive recursive functions, and hence that they are among the provably total functions of IΣ1. This establishes that Eω is interpretable

in IΣ1. Conversely, IΣ1 is reducible to Eω for Π02 formulas, and hence to PRA. Moreover, this is the strongest result one can get, in the sense that IΣ2 is not proof-theoretically reducible to

PRA since IΣ2 $ ConIΣ1 . The interpretability and reduction results show that the provably total functions of IΣ1 are exactly the primitive recursive functions. In the next section we’ll look at this in more detail by considering a proof due to Wilfried Sieg [1991].15

Proceeding to full second-order arithmetic (Z2) results in analysis. Here one adds axioms for full impredicative comprehension, ∃X∀n(n ∈ X ↔ ϕ(n)), and full induction, (0 ∈ X ∧ ∀n(n ∈ X → n + 1 ∈ X)) → ∀n(n ∈ X), to the basic axioms for 1, +, ·, and <. As we saw in chapter 2, important subsystems of Z2 (RCA0, WKL0, ACA0, ATR0, Π¹₁-CA0) are obtained by restricting comprehension and induction.16

15Sieg also established the result in Sieg [1985], but the [1991] proof is a bit more perspicuous, and so our focus will be on that. The result was first established by Parsons [1970], using a modification of Gödel's functional interpretation. Alternative (and independent) proofs were given by Mints [1973] and Takeuti [1987]. The former uses the no-counterexample interpretation, while the latter uses a Gentzen-style assignment of ordinals to proofs. Buss [1998] also presents a proof, appealing to witnessing functions. We plan to consider these proofs in future work as case studies in order to compare the philosophical value of these standard proof-theoretic techniques. For model-theoretic proofs see, for example, Avigad [2002] and Simpson [2009]. See Ferreira [2005] for more on these references.

Notable for our purposes is the subsystem WKL0. WKL0 consists of RCA0 plus the non-constructive set existence principle known as Weak König's Lemma (WKL), asserting that any infinite tree of finite binary sequences has an infinite path. It was first shown by Friedman that WKL0 is conservative over PRA for Π⁰₂ sentences. Friedman's proof is model-theoretic. It was Sieg [1985], [1991] who established the result proof-theoretically, by exhibiting a primitive recursive proof transformation. This establishes that the reducibility of WKL0 to PRA is itself provable in PRA, i.e., PRA ⊢ WKL0 ≤ PRA, and is taken as showing that whatever nonconstructive methods are used in the WKL0 proof of any Π⁰₂ sentence can be "eliminated", in the sense that there exists a corresponding "elementary" proof in PRA. It is to these results that we now turn.
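Before doing so, it is worth unpacking the reducibility claim itself (our gloss, writing Prf_T for a standard proof predicate of T; f is the primitive recursive proof transformation):

```latex
% The reduction statement PRA ⊢ WKL₀ ≤ PRA is itself a Π⁰₂ claim
% verified inside PRA: f sends any WKL₀-proof of a Π⁰₂ sentence φ
% to a PRA-proof of φ.
\[
\mathrm{PRA} \vdash\;
\mathrm{Prf}_{\mathrm{WKL}_0}\!\bigl(p, \ulcorner \varphi \urcorner\bigr)
\;\rightarrow\;
\mathrm{Prf}_{\mathrm{PRA}}\!\bigl(f(p), \ulcorner \varphi \urcorner\bigr)
\qquad \bigl(\varphi \in \Pi^0_2,\ f \text{ primitive recursive}\bigr).
\]
```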

6.3 Two Case Studies

In this section we present an argument due to Wilfried Sieg [1991] that establishes IΣ1 ≤ PRA and WKL0 ≤ PRA. The key to establishing these results centers on the fact that the Π⁰₂ theorems exhibit a form of normalizability that in turn provides bounds on the logical complexity of formulas in a derivation. The language being used is that of elementary arithmetic, together with possible function symbols for classes of primitive recursive functions. Sieg uses the sequent calculus because of its convenience, but takes it in the form due to Tait, whereby sequents are understood as

16We omit details concerning some of the reductions that have been achieved for these theories. We also omit mention of reductive results in full analysis, higher types, or set theory. For details see, for example, Feferman [1988a], [1993a], [2000a], or Simpson [2009], and the references and results therein.

disjunctions.

LA: Γ, ϕ, ¬ϕ (ϕ atomic)

∧: from Γ, ψ and Γ, ϕ, infer Γ, ψ ∧ ϕ

∨: from Γ, ϕ, infer Γ, ψ ∨ ϕ; from Γ, ψ, infer Γ, ϕ ∨ ψ

Cut: from Γ, ϕ and Γ, ¬ϕ, infer Γ

∀: from Γ, ϕa (a not in Γ), infer Γ, ∀x ϕx

∃: from Γ, ϕt, infer Γ, ∃x ϕx

It is well known, by results originally due to Gentzen, that restricting oneself to the sequent calculus guarantees full normalizability, i.e. any derivation can be reduced to a normal, or cut-free, derivation with cut rank 0:

Theorem 6.3.1 (Normalization Theorem, 1.1.2) If D is a derivation of Γ with ρ(D) = n, then there is a derivation E of Γ with ρ(E) = 0.17

Indeed, E can be determined effectively from D. One can also add axioms for equality:

Γ, t = t

Γ, s ≠ t, ¬ϕs, ϕt (for atomic ϕ)

and axioms for arithmetic:

Γ, 0 ≠ s′

Γ, s′ ≠ t′, s = t

Γ, s + 0 = s        Γ, s + t′ = (s + t)′

Γ, s · 0 = 0        Γ, s · t′ = (s · t) + s

Γ, ¬(s < 0)        Γ, ¬(s < t′), s < t, s = t        Γ, ¬(s < t), s < t′        Γ, s ≠ t, s < t′

17For ease of reference, numbers inside parentheses indicate Sieg's numberings.

Let T(F) be the theory that results from the Tait calculus together with the axioms for equality and arithmetic, plus additional axioms for all the defining equations of the primitive recursive functions. T(F) has the property of quasi-normalizability, i.e. each derivation can be converted into a derivation with cut rank equal to 1:

Theorem 6.3.2 (T-Normalization, 1.1.3) Let D be a T(F)-derivation of Γ; then there is a quasi-normal T(F)-derivation E of Γ.

One can actually achieve full normalizability if one allows closure under cut, i.e. taking as axioms all sequents that are obtainable from the axioms of T(F) by means of cuts with one of the principal formulas in the axioms as cut formula. Since what is important for our results is the bounding of formulas, rather than preservation of the subformula property, we, following Sieg, do not require closure under cut and content ourselves with quasi-normalizability. Finally, if one adds the induction rule Φ-IR

from Γ, ϕ0 and Γ, ¬ϕa, ϕa′, infer Γ, ϕt

for the class of formulae Φ, one gets:

Theorem 6.3.3 (I-Normalization, 1.1.5) If D is a derivation of Γ in (Φ(F)-IA), then there is an I-normal derivation E of Γ in (Φ(F)-IA), where a derivation is called I-normal iff all of its cuts are either I-cuts or have atomic cut-formulas, and a cut with cut-formula ϕ is called an I-cut iff one of its premises is the conclusion of the induction rule with principal formula ϕ or ¬ϕ.

The crucial tools for establishing the reductions are what are called ∀-inversion and ∃-inversion. The former follows straightforwardly from the form of the ∀ rule.

Lemma 6.3.1 (∀-inversion, 1.1.1) If D is a derivation of Γ, (∀x)ϕx, then there is a derivation E of Γ, ϕc with |E| ≤ |D| and ρ(E) ≤ ρ(D).

∃-inversion is a form of Herbrand's theorem. Understanding it requires understanding what Sieg calls Herbrand theories:

Definition 6.3.1 (Herbrand Theory) A theory T(F) of the form QF(F)-IA is an Herbrand theory whenever (i) F is provably closed under explicit definition and definition by cases; and (ii) F is closed under bounded search, i.e. for any φ in QF(F) there is an h such that T(F) ⊢ (∃y ≤ x)φ ↔ φ(h(x)).18

Lemma 6.3.2 (∃-Inversion, 1.2.3) Let T(F) be an Herbrand theory, let Γ contain only purely existential formulas, and let ψ be quantifier-free; if D is a T(F)-derivation of Γ, (∃x)ψx, then there is a term t* and a(n I-normal) T(F)-derivation D* of Γ, ψt*.

Proof. See Appendix.

The following theorem follows immediately from @-inversion and D-inversion.

Theorem 6.3.4 (Term extraction, 1.2.4) Let T(F) be an Herbrand theory and let φ be quantifier-free; if T(F) proves (∀x)(∃y)φxy, then there is a term t[a] in L(F) such that T(F) proves (∀x)φx t[x]. λx.t[x] denotes a function in F.
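As a toy illustration (our own, not Sieg's): suppose T(F) proves the doubling claim below. Term extraction then produces an explicit witnessing term, obtained by the inversions just stated.

```latex
% ∀- and ∃-inversion applied to a T(F)-proof of a Π⁰₂ sentence
% yield a term t[x] witnessing the existential quantifier.
\[
T(\mathcal{F}) \vdash (\forall x)(\exists y)\,(y = x + x)
\;\Longrightarrow\;
T(\mathcal{F}) \vdash (\forall x)\,(x + x = x + x),
\qquad t[x] \equiv x + x,\ \ \lambda x.\,t[x] \in \mathcal{F}.
\]
```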

We are now in a position to understand how to establish IΣ1 ≤ PRA.

6.3.1 IΣ1 ≤ PRA

To show IΣ1 ≤ PRA, Sieg actually shows that the system Σ⁰₁(PR)-IA is conservative over QF(PR)-IA for Π⁰₂ sentences, in the sense that if Σ⁰₁(PR)-IA proves ∀x∃y Rxy with R quantifier-free, then there exists a primitive recursive function f such that QF(PR)-IA proves Ra f(a).19 Σ⁰₁(PR)-IA is a first-order extension of primitive recursive arithmetic (PR is the class of primitive recursive functions) and is a definitional extension of IΣ1. QF(PR)-IA is PRA expanded by first-order logic and is equivalent to ⋃{∆₀(Aₙ)-IA : n ∈ ℕ}, where the languages of the formal theories ∆₀(Aₙ)-IA (3 ≤ n) extend that of elementary arithmetic by adding function symbols for the elements of Aₙ := {Aₘ : 3 ≤ m ≤ n} from above.

To establish the conservativity of Σ⁰₁(PR)-IA over QF(PR)-IA it suffices to prove the following lemma:

18Sieg [1991], p. 416.
19Note the relation to Kreisel's notion of an interpretation mentioned above.

Lemma 6.3.3 (2.1.2) Let Γ contain only Σ⁰₁-formulas; if D is an I-normal derivation of Γ in (Σ⁰₁(PR)-IA), then there is an I-normal derivation of Γ in QF(PR)-IA.

Proof. See Appendix. The central moves involve applications of ∀-inversion and ∃-inversion.

Even though our focus is on the reductions, it is worth noting that with this result one can also establish that

Theorem 6.3.5 (2.1.1) The provably total functions of (Σ⁰₁-IA) are exactly the primitive recursive functions.

Proof. See Appendix.

6.3.2 WKL0 ≤ PRA

The basic idea behind establishing WKL0 ≤ PRA is to establish the reduction WKL0 ≤ IΣ1, and couple this with the result from the last subsection, namely IΣ1 ≤ PRA.

To see how this works in Sieg's hands, we first introduce theories ETn (n > 2). These are second-order versions of QF(En)-IA with defining axioms for function(al)s, all instances of an axiom schema for explicit definition of function(al)s in the form (λy.t[a, y])(b) = t[a, b] (call this ED), and an induction schema containing possible function parameters. ⋃ₙ ETn is called BT. ETn is conservative over QF(En)-IA, and BT is conservative over QF(PR)-IA. Following Sieg, we strengthen BT by adding a function existence principle in the form of the choice schema

∀x∃y φxy → ∃f∀x φ(x, f(x))

for Σ⁰₁ formulas (Σ⁰₁-AC0), and a version of König's Lemma for binary trees, abbreviated WKL for Weak König's Lemma. Following Troelstra [1974] (as cited in Sieg [1985]), Sieg formalizes the abstract version of Weak König's Lemma as

∀f [T(f) ∧ ∀x∃y(lh(y) = x ∧ f(y) = 1) → ∃g∀x f(ḡ(x)) = 1],

where lh(x) is a length function, ḡ(x) a course-of-values function, and T(f) abbreviates that the set of sequence numbers given by its characteristic function f forms a binary tree, i.e.

∀x, y (f(x ∗ y) = 1 → f(x) = 1) ∧ ∀x, y (f(x ∗ ⟨y⟩) = 1 → y ≤ 1).

Let F := BT + Σ⁰₁-AC0 + Σ⁰₁-IA + WKL. F is equivalent to the system WKL0. Given this equivalence, Sieg establishes:

Theorem 6.3.6 (Friedman, 2.2.4) The theory (F) is conservative with respect to Π⁰₂-formulas over PRA.

And this is proved from a slight extension of Lemma 6.3.3 above, together with Lemma 6.3.4. We repeat Lemma 6.3.3 for perspicuity:

Lemma 6.3.3 (2.1.2) Let Γ contain only Σ⁰₁-formulas; if D is an I-normal derivation of Γ in (Σ⁰₁(PR)-IA), then there is an I-normal derivation of Γ in QF(PR)-IA.

Lemma 6.3.4 (2.2.5) Let ∆ contain only Σ⁰₁-formulas and let φ be quantifier-free; if D is an I-normal derivation of ∆[QF-AC0, WKL], (∃y)φya in (BT + Σ⁰₁-IA), then there is an I-normal derivation E of ∆, (∃y)φya in (BT + Σ⁰₁-IA).20

Lemma 6.3.4 is itself a slight extension of the case of (ETn + Σ⁰₁-AC0 + WKL), keeping in mind that Σ⁰₁-AC0 is equivalent to QF-AC0:

Lemma 6.3.5 (Elimination Lemma, 2.2.2) Let D be an I-normal ETn-derivation of ∆[SCHEMA], where SCHEMA stands for QF-AC0 or WKL; if ∆ contains only existential formulas, then there is an I-normal ETn-derivation E of ∆.

Proof. See Appendix. Importantly, the proof illustrates the appeal to the central conceptual tool of Σ-inversion.

The important take-away from Sieg's proof is that the strategy is uniform:

1. Embed the formal theory into a suitable sequent calculus with additional principles (e.g. Σ⁰₁-AC0 and WKL);
2. Show the normalizability of derivations;
3. Eliminate the additional principles from normal derivations for a class of sequents.

20∆[QF-AC0, WKL] stands for ∆ with a finite number of negated instances of QF-AC0 and WKL.

The idea behind the eliminations is that proving ∆ from the assumption, say, WKL, is to prove WKL → ∆. And this is the same as proving ¬WKL ∨ ∆, which in the Tait calculus is ¬WKL, ∆. To prove the conservation theorem, one shows that one can transform a proof of ¬WKL, ∆ into a proof of ∆; hence the "elimination" of WKL. In each case, the elimination involves a series of inversions until one arrives at subproofs that can be cut away.
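The pattern behind these eliminations can be put in a single line (our gloss; the squiggly arrows indicate the successive reformulations and the final proof transformation):

```latex
% Proving ∆ from WKL is proving the Tait sequent ¬WKL, ∆;
% the conservation theorem transforms such a derivation into one of ∆ alone.
\[
\mathrm{WKL} \rightarrow \Delta
\;\;\equiv\;\;
\neg\mathrm{WKL} \vee \Delta
\;\;\leadsto\;\;
\text{Tait sequent } \neg\mathrm{WKL},\, \Delta
\;\;\leadsto\;\;
\Delta.
\]
```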

6.4 Philosophical Upshot

6.4.1 Internal limitations

One of the reasons we chose to focus on the reduction of WKL0 to PRA is that it is commonly taken as one of the purest forms of a partial realization of Hilbert's Program. This position is perhaps most clearly espoused by Simpson [1988]. For Simpson, because WKL0 is notably stronger than PRA with respect to infinite mathematics,21 the reduction is foundationally significant since, assuming Tait's analysis of finitism as PRA, it shows that a large and significant part of mathematics is finitistically reducible in the sense that any Π⁰₂ consequence of WKL0 is finitistically true, and hence true in the real world. This truth in the real world is what justifies WKL0 and all of its (infinitistic) consequences. This for Simpson counts as a significant partial realization of Hilbert's Program. Burgess nicely summarizes the idea:

Given a finitist proof of a metatheorem to the effect that any Π⁰₁ statement having a WKL0 proof has a finitist proof, obtaining a WKL0 proof of a Π⁰₁ statement P would give a finitist proof of the existence of a finitist proof of P, and (tacitly assuming that a finitist proof of the existence of a finitist proof is as good as a finitist proof) would allow a finitist to infer P. So though the instrumental use of the whole of classical mathematics would not be finitistically justified, still the instrumental use of as much as can be formalized in WKL0 would be, and to this extent something like Hilbert's ambitions would be partially realized. How substantial a partial realization this amounts to depends on how much of classical mathematics is formalizable in WKL0 or a system of a similar status, and this is one of the main questions addressed in subsequent work in reverse mathematics.22

21For a list of mathematical theorems provable in WKL0 see Simpson [1988] and [2009]. They include, notably, the Boolean Prime Ideal Theorem, the Completeness Theorem for first-order logic, the Heine/Borel Theorem, Brouwer's fixed point theorem, and the Hahn/Banach theorem for separable Banach spaces.
22Burgess [2010], p. 131.

Some caution is required here. The claim "Every WKL0-provable Π⁰₁ statement is PRA-provable" is Π⁰₂, and as such is, as we've seen, finitistically meaningful only as a partial communication of:

x is a proof in WKL0 of y → f(x) is a proof in PRA of y,

for f primitive recursive. It is a result of this form which is finitistically provable. Now, as Burgess [2010] points out, Tait [1981] is explicit that his analysis of finitist provability as PRA-provability is external. As such, assuming Tait’s analysis of finitism is correct,

while finitist provability and PRA provability do in actual fact coincide, and while we who are not subject to the intellectual limitations of the finitist can see this ‘from the outside’, still the finitists themselves cannot see it ‘from the inside’.23

Of course the finitist can see that each individual axiom of PRA is finitistically acceptable, but she cannot establish the universal generalization with regard to the finitistic acceptability of all of the axioms of PRA. And so because of this, even if it is true that finitistic provability coincides with PRA- provability, it does not follow that a finitist proof of the PRA-provability of a result is the same as a finitist proof of the finitist provability of said result. Similarly, a finitist proof of the relative consistency of PRA and WKL0 does not amount to a finitist proof of the consistency of WKL0, since there is not a finitist proof of the consistency of PRA for Gödelian reasons (assuming Tait’s analysis). Burgess’s point is thus that appeals to the metatheorem above as establishing a finitist guarantee “once-and-for-all” to the finitist that any WKL0 proof can be “backed up by” a finitist proof are mistaken. The finitist presented with a finitist proof

of the metatheorem is only able to see that a specific proof p of a Π⁰₁ result P also amounts to a PRA proof of P, and not, contrary to the claims of some authors, a finitist proof of P. Similar remarks apply to various other proof-theoretical reductions cited in the literature.

To take a familiar example, the conservativeness of ATR0 over IR is of this sort. For as we saw in the last chapter, while ATR0 is conservative over IR for Π⁰₂ statements, the predicativist is not in a position to appreciate this result generally, as the limit of predicativity (Γ0) was

23Burgess [2010], p. 133.

similarly established externally. In particular, the predicativist is not in a position to entirely appreciate the autonomy constraint. At best she can understand it piecemeal as a rule.

This is not to say that there are no genuine partial realizations of the neo-Hilbertian instrumentalist program, and Burgess mentions a few. Various WKL-like extensions of bounded fragments of PRA do count as partial realizations because, unlike with full PRA, a finitist can know that everything provable in a bounded fragment of PRA is finitistically provable. This is because "bounded fragments do not exhaust finitistic provability as (according to the Tait analysis) PRA provability does."24 And there are other results of the same character. ACA0 is a conservative extension (for a large class of formulas) of Heyting arithmetic HA, and this does allow the constructivist to conclude that certain ACA0 proofs are constructivistically provable, for analogous reasons: the constructivist can know that everything HA-provable is constructivistically provable because HA does not exhaust constructivistic provability. The important criterion is that the reducing theory be below the externally defined "foundational" limit.

Cashing out the philosophical significance of proof-theoretic reductions in terms of foundational reductions is also found in Feferman's work, and embodies the idea of a hierarchy of Hilbert Programs à la Kreisel. In Feferman's hands the philosophical significance of such reductions follows a general pattern:

A part of mathematics M is represented in a formal system T1 which is justified by a foundational or conceptual framework F1. T1 is reduced proof-theoretically to a system T2 which is justified by another, more elementary such framework F2.25

Supposing T1 is justified directly by F1 and T2 by F2, a proof-theoretical reduction of T1 to T2 provides a partial foundational reduction of F1 to F2. Among the foundational concepts that have been reduced are the countable infinitary to the finitary (IΣ1 ≤ PRA); the uncountably infinitary to the finitary (RCA0 ≤ PRA, WKL0 ≤ PRA); the uncountably infinitary to the countably infinitary (ACA0 ≤ PA, ∆¹₁-CA0 ≤ PA); the impredicative to the predicative (∆¹₁-CA ≤ ACA_{<ε₀}); and the nonconstructive to the constructive (PA ≤ HA, ∆¹₂-CA ≤ (Π¹₁-CA)^(i)_{<ε₀}).26 A catalogue of such reductions illustrates the hierarchy Kreisel had in mind and, according to Feferman, allows one to survey the proof-theoretical landscape in order to have a better idea of the connections between formal mathematical machinery and philosophically motivated foundational perspectives.

But note that the general point Burgess was alluding to seemingly also applies here. To see this, first note that central to the scheme of reduction above is the passage of formalization from a body of mathematics M to a formal theory T. Following Feferman [1993a], define T as being an adequate formalization of M if every concept, argument, and result of M may be represented by a (basic or defined) concept, proof, and theorem, respectively, of T.

24Burgess [2010], p. 139.
25Feferman [1988a], p. 364. See also Feferman [1993a], p. 189 for an almost verbatim characterization.

Similarly, T is in accordance with (or faithful to) M if every basic concept of T corresponds to a basic concept of M and every axiom and rule of T corresponds to, or is implicit in, the assumptions and reasoning followed in M (in other words, if T does not go beyond M conceptually or in principle). Such definitions make clear the idea of T being directly adequate to and directly in accordance with M. T is said to be indirectly adequate to M if a theory directly adequate to M is (proof-theoretically) reducible to T in an elementary way, or if one can reformulate the concepts, proofs, and theorems of M informally in such a way that the resulting M′ can be directly formalized in T. Similarly, T is indirectly in accordance with M if T is reducible to a theory directly in accordance with M.27 Assuming that T is an adequate formalization of, and faithful to, M, Feferman claims that the philosophical value offered by reductive proof theory is that it

provides technical notions and results which–when successful–serve to give a more global kind of answer to [the question “What rests on what?”], in terms of a reduction of one such system to another; moreover, these results provide a technical bridge from mathematics to philosophy.28

As he puts the point in a later work:

The alternative offered by reductive proof theory is to formalize various parts of mathematics in subsystems of set theory and see which of these can be reduced to systems justified by one or another of these frameworks F. This way of proceeding is non-committal to which such F is to be preferred, and leads more quickly to a survey of what parts of mathematics can be

26See Feferman [1988a], [1993a], or [2000a] for details.
27Feferman [1993a], pp. 203-204.
28Feferman [1993a], p. 187.

reconstructed, at least in principle, on the grounds of F. . . . Namely, as ever in the , one wants to see, dispassionately: what rests on what?29

Proof-theoretical investigations, on this picture, allow one to see just what systems are reducible to other systems. And this survey of the logical landscape is supposed to be informative for understanding foundationally what rests on what. Conceiving of things this way illustrates the shift highlighted in the quote from Crosilla from last chapter: modern proof theory (at least on this construal) is not foundational in the same sense that it was at the turn of the 20th century. Because of this, though, one must be careful about the language used when formulating what is achieved by partial realizations of Hilbert's program. More on this shortly.

The relation to the Burgess point should be clear: in order to appreciate the foundational reduction for what it is, one has to know, or at least grant, that the formalization is adequate. And in order to do that, one must be on the outside of the system, looking in. And so, one might wonder just how much epistemological insight one actually attains by standing on the sideline. That is, it's hard to see how one could understand what the reductions are supposed to be showing philosophically, not simply as mathematical results, if one is unable to grasp the limits of the theories. The philosophical value of reductive results as cataloguing what rests on what can only be adequately understood from a bird's-eye view; one has to at least accept and understand both of the theories involved in order to appreciate the reduction in this foundationally schematic way.

I raise this issue not to accuse Feferman of overlooking it. After all, he seems to concede as much when he continues,

[T]he reductive proof-theorist may face. . . criticism from the committed advocate of F, who will say that it is only what can be explicitly worked out under the principles of F that is of interest to him or her. Well, you cannot satisfy everybody.30

Rather, the point is to make explicit that such a viewpoint requires that one be in some sense non-committal with respect to epistemic privilege when considering the overall picture of what rests on what. And put this way, it is slightly curious, at first glance anyway, that

29Feferman [2000a], pp. 81-82. 30Feferman [2000a], pp. 81-82.

Feferman also maintains that such a bird's-eye view can be used to inform an epistemically informed philosophy of mathematics. After all, various reductive results

serve to sharpen what is to be said in favor of, or in opposition to, the various philosophies of mathematics such as finitism, predicativism, constructivism, and set-theoretical realism. Whether or not one takes one or another of these philosophies seriously for ontological and/or epistemological reasons, it is important to know which parts of mathematics are in the end justifiable on the basis of the respective philosophies and which are not.

In particular, Feferman thinks that the reductive results “undermine the case for set-theoretic realism”. Continuing the quote above,

The uninformed common view—that adopting one of the nonplatonistic positions means pretty much giving up mathematics as we know it—needs to be drastically corrected, and that should also no longer serve as the last-ditch stand of set-theoretical realism. On the other hand, would-be nonplatonists must recognize the now clearly marked sacrifices required by such a commitment and should have well thought-out reasons for making them. Though I personally believe that the kind of results described here on the whole strengthen the case for a nonpla- tonistic philosophy of mathematics and further undermine the case for set-theoretical realism, they do not speak for themselves to that extent, and it is at that point that well-informed critical philosophical discussion must take over.31

But if one has to be, in some sense, on the outside in order to recognize the epistemic privilege of one of the systems/frameworks under consideration, that would seem to undermine one's confidence in the concluded privilege. And if this is correct, it's not clear just what the philosophical value of the catalogue is. Let me be clear. I am not claiming that the results are not of mathematical value. Clearly they are. The point that I'm making concerns the philosophical upshot of such results when cashed out in terms as above. On the face of it, the investigations can only undermine set-theoretical realism if one has already in some sense adopted (or at least entertained) the platonist picture to begin with; pithily put: I have to climb the ladder in order to kick it away.

6.4.2 Justification and epistemic security

In light of the discussion of the last section, let's imagine a situation where one accepts, foundationally, some weak base theory as resting on some epistemically privileged framework. In our example, one accepts that PRA is (roughly) equated with finitism and one accepts the

31Feferman [1993a] pp. 207-208.

epistemic privilege of finitistic reasoning. In fact we need not even invoke any identification with a particular foundational position. For simplicity, then, assume only that one accepts the base theory as epistemically privileged. Suppose one is then presented with the proofs above from §6.3. What exactly is achieved by such reductions? What is the philosophical value of the results? Such reductions are supposed to, as we've seen, give a constructive foundation to analysis. But then what is philosophically important about constructive relative consistency results? What does "a constructive foundation for mathematical analysis" even mean? A natural characterization is that it means that mathematical analysis is somehow justified constructively. Implicit here is a notion of epistemic security. Hence you read that a proof-theoretical reduction ensures that "the body of mathematics M that can be formalized in T1 is justified or secured on the grounds of the framework F2."32 The general idea is that proof-theoretic reductions provide some sort of security for the theories being reduced on the basis of the more evident ones to which they are being reduced. After all, the results were

obtained in the pursuit of a reductive program that provides a coherent scheme for metamathematical work and is best interpreted as a far-reaching generalization of Hilbert's program. For philosophers these definite mathematical results (should) present a profound challenge. To take it on means to explicate the reductionist point of constructive relative consistency proofs; the latter are to secure, after all, classical theories on the basis of more elementary, more evident ones.33

We take on the challenge. In what follows we try to explicate the reductionist point of constructive relative consistency proofs by getting clear about just what the above justification/security rests in. I argue that a proper understanding of what is meant by justification, and hence security, requires a closer look at the radicalization of the axiomatic method and the shift to formalization that underlies Hilbert's Program. Doing so highlights the vagueness present in characterizations like the above, and can help us to get clear on just how to understand them.

32Feferman [2004], p. 318.
33Sieg [1990], p. 300.

Hilbert refined34

Hilbert’s program can be understood as a re-conception of mathematics as a formula game capturing the entire content of mathematics. The formulas of this game are stratified into statements that are meaningful and those that are not. The entire system must be proved consistent, and proved so by means of the philosophically unproblematic, meaningful part. Moreover, because of their equivalence, a consistency proof would amount to a proof of the corresponding reflection principle, and hence to the instrumental justification of the strong theory.35 The idea was to delegate philosophical and epistemological issues to the domain of the mathematical. To repeat an oft-cited quote from Bernays,

The great advantage of Hilbert's method is precisely this: the problems and difficulties that present themselves in the foundations of mathematics can be transferred from the epistemological-philosophical to the properly mathematical domain.36

Examination of the reductions discussed in §6.3 and exposited in the appendix illustrates this transfer clearly, and highlights what Sieg [1990], p. 301 remarks are the two central features of proof-theoretic reductions: they focus on the deductive apparatus of a theory; and they are carried out within theories that are somehow epistemically privileged. Central to this re-conception are three motivating ideas: (i) the radicalization of the axiomatic method; (ii) the instrumentalist view of (strong mathematical) theories; and (iii) the strict formalization of logic.37 Our characterization above of Hilbert's Program stressed (ii). A few remarks about (i) are now in order. The central idea behind the radicalization is the separation of the linguistic representation of a theory from its content/intended interpretation. To quote Bernays,

We only have to realize that the [syntactic] formalism of statements and proofs we use to represent our conceptions does not coincide with the [mathematical] formalism of the structure we intend in our thinking. The [syntactic] formalism suffices to formulate our ideas of infinite manifolds and to draw the logical consequences from them, but in general it cannot combinatorially generate the manifold as it were out of itself.38

34This section is heavily indebted to Sieg [1990].
35Sieg [1990], pp. 308-309.
36Bernays [1922], p. 19, as quoted in Sieg [1990], p. 309.
37Sieg [1990], p. 308.
38Bernays [1930], p. 59, as quoted in Sieg [1990], pp. 315-316.

While not identical to the contentual theory, the formalization is considered close enough to the intended structure to be exploited in the reduction. In particular, one projects the formalization of the intended structure into finitistic mathematics, where consistency is to be established.

In taking the deductive structure of a formalized theory. . . as an object of investigation the [contentual] theory is projected as it were into the number theoretic domain. The number theoretic structure thus obtained is in general essentially different from the structure intended by the [contentual] theory. But it [the number theoretic structure] can serve to recognize the consistency of the theory from a standpoint that is more elementary than the assumption of the intended structure.39

In this way, one can avoid the philosophical difficulties of relating to a particular contentual domain, and instead relocate the issue to the mathematical formalism.

Formal axiomatics, too, requires for the checking of deductions and the proof of consistency in any case certain evidences, but with the crucial difference [when compared to contentual axiomatics] that this evidence does not rest on a special epistemological relation to the particular domain, but rather is one and the same for any axiomatics; this evidence is the primitive manner of recognizing truths that is a prerequisite of any theoretical investigation whatsoever.40

The program can be understood as seeking uniform structural reductions.41

The focus on formal axiomatics is not, however, to discount the importance of the contentual component. Contentual axiomatics is to supplement formal axiomatics; after all, it guides the formalism in the first place and aids in determining appropriate application. Moreover, the guiding idea behind the relocation of the philosophical into the mathematical was "the basic conviction. . . 
that contentual axiomatic theories are fully formalizable."42 Unlike the Kroneckerians of the world, Hilbert trusted in the correctness of classical mathematics. This conviction, together with the belief that its formalization was complete, explains the shift to formal axiomatics.

The assumed completeness and the ensuing harmony of provability and truth help understand how Hilbert could take his radical formalist position, in order to simply bypass the epistemological problems associated with the classical infinite structures.43

And this explains the importance placed on consistency proofs: they were to be "the last desideratum in justifying the existential supposition of infinite structures made by modern

39Bernays [1970], p. 186, as quoted in Sieg [1990], p. 316.
40Hilbert and Bernays [1934], p. 2, as quoted in Sieg [1990], p. 317.
41Sieg [1990], p. 316.
42Sieg [1990], p. 315.
43Sieg [1990], p. 312.

axiomatic theories.”44 Last, but not only. Compare Kreisel: “I was repelled by Hilbert’s exaggerated claim for consistency as a sufficient condition for mathematical validity or some kind of existence.”45 Hilbert’s program can be seen as a way of trying to mediate between two approaches to justifying certain “transcendental assumptions” involved in providing a structural foundation for Arithmetik. On the one hand are the logicists who attempt to prove consistency logically; on the other are the constructivists who opt instead to simply do without the assumptions altogether. Since Hilbert believed strongly in the correctness of these transcendental assumptions, consistency proofs were supposed to justify the principles of classical mathematics by establishing the existence of the corresponding structures by means acceptable to the skeptics. Of course the incompleteness results show this cannot be done. In particular they illustrate that one cannot eschew contentual axiomatics, and that formal theories can be used “at most as vehicles for partial structural reductions to strengthenings of the finitist basis.”46 So the epistemological basis must be changed or extended. Nonetheless, the basic philosophical position towards the transcendental assumptions remains, namely “to try, whether it is not possible to give a foundation to these transcendental assumptions in such a way that only primitive intuitive knowledge (primitive anschauliche Erkenntnisse) is used.”47 With the characterization of Hilbert’s thought at hand we are now in a position to return to questions concerning how to understand this “foundation” in light of varying reductions. Reformulated, “(i) what is the nature and the role of the reduced structures? and (ii) what is the special character of the theories to which they are reduced?”48 Sieg [1990] spends some time answering the latter.49 In what follows I will say a few things about the former.

44Sieg [1990], p. 314. 45Kreisel [1987], p. 395. 46Sieg [1990], p. 318. 47Bernays as quoted in Sieg [1990], p. 318. 48Sieg [1990], p. 318. 49For Sieg, proof-theoretical reductions should be viewed as structural reductions, where “[t]he philosoph- ical significance of relative consistency proofs is viewed in terms of the objective underpinnings of theories to which the reductions are (to be) achieved” (Sieg [1990], p. 300). These underpinnings are provided by

6.4.3 External value I

Recall the reductions from §6.3. Sieg’s strategy is uniform:

1. Embed the formal theory into a suitable sequent calculus with additional principles (e.g. Σ⁰₁-AC₀ and WKL);
2. Show the normalizability of derivations;
3. Eliminate the additional principles from normal derivations for a class of sequents.

Recall that we said that the idea behind the eliminations is that proving ∆ from the assumption, say, WKL, is to prove WKL → ∆. And this is the same as proving ¬WKL ∨ ∆, which in the Tait calculus is the sequent ¬WKL, ∆. To prove the conservation theorem, one shows that one can transform a proof of ¬WKL, ∆ into a proof of ∆; hence the “elimination” of WKL. In each case, the elimination involves a series of inversions until one arrives at subproofs that can be cut away. This highlights the syntactic aspect of the proof, and points to the discussion above concerning the radicalization of the axiomatic method and the corresponding distinction between the language used to represent the theory and the content itself. This distinction, I claim, can help us to get clear on just what is achieved by the above reductions by informing how we are to understand what is meant by “justification” and “security.”
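The shape of the elimination argument can be displayed schematically; this is my restatement in standard notation, not a quotation from Sieg:

```latex
% Proving \Delta from the assumption WKL corresponds, step by step, to proving
%   \vdash \mathrm{WKL} \rightarrow \Delta, i.e. \vdash \neg\mathrm{WKL} \vee \Delta,
% which in the Tait calculus is the sequent \vdash \neg\mathrm{WKL},\, \Delta.
% The conservation theorem is then a proof transformation
\vdash \neg\mathrm{WKL},\, \Delta
\quad\rightsquigarrow\quad
\vdash \Delta
% obtained by a series of inversions on the inferences governing \neg\mathrm{WKL},
% after which the subproofs involving it can be cut away.
```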

Central to the reductions are applications of ∀-inversion, and especially Σ-inversion. The reductions are thus not of syntactically identical formulae, but of interpreted formulae in the sense of Kreisel. Sieg [1991], for example, opens by saying,

Statements ψ of the form (∃y)φy express a functional dependence of the quantified variable y on the parameters occurring in ψ.50

the elements of accessible domains, which are built up uniquely via certain basic operations from distinguished objects. Such a generation of objects reflects the epistemic significance of accessible domains; given an understanding of the build-up of the domain, the theories that are generated formulate principles that are taken as evident (Sieg [1990], p. 301). The central feature of this epistemic privilege lies in accessibility. Focusing on accessibility rather than traditional “constructivity” broadens the range of theories that are considered privileged in some sense. The task of constructive consistency proofs is thus to “relate two aspects of mathematical experience; namely the impression that mathematics has to do with abstract objects arranged in structures that are independent of us, and the conviction that the principles used for some structures are evident, because we can grasp the build-up of their elements” (Sieg [1990], p. 314). As such one can “understand the role of abstract structures in mathematical practice and the function of (restricted) accessibility notions in ‘foundational’ theories’. . . ” (Sieg [1990], p. 301). 50Sieg [1991], p. 409.

And though he never mentions Kreisel, this appears to line up with the thought that the conditions on an interpretation are intended to express what one would reasonably expect of an understanding of transfinite symbols by, or of a reduction to, finitist (constructive) means.51 Whether they do, and whether a constructivist could actually understand this, are different questions. There are two related issues here. The first is whether the constructivist can understand, generally speaking, that a Π⁰₂ formula is taken as expressing a functional dependence and that this is how to understand, constructively, the presence of transfinite symbols. The second is understanding this in application, i.e. that the constructivist can understand what is meant by particular transfinite symbols – say the formalization of WKL above – under the appropriate conversion. Our discussion above supports the idea that such an interpretation itself must be understood from the outside. After all, how could the constructivist herself ever understand the sufficiency of the interpretation if she is not in a position to understand the transfinite symbols being interpreted? And this seems to be what Kreisel is after when he says, “the «reduction» of primitive notions is not a matter of principle. A reduction does not eliminate them since merely to see that a proposed reduction is correct one has to start with the primitive notion considered anyway.”52 The idea is that from the constructivist point of view the transfinite assumptions are “meaningless” and the reduction is being presented by the classicist in order to convince one of the instrumental safety of appealing to the stronger theory in proofs of constructive statements. The classicist’s position is that the reduction should be unproblematic since one can understand each step involved in the (primitive recursive) proof transformation. Except that if one thinks that “ideal” statements are meaningless, then presumably one is not in a position to really understand the motivation behind the intended meaning of Π⁰₂ statements as expressing a functional dependence generally, nor that the particular formula Raf(a) is what is constructively meant by ∀x∃yRxy. And in that case it is not clear how one can 51Kreisel [1958a], p. 363. 52Kreisel [1958a], p. 364.

understand, from the constructive point of view, how a reduction is supposed to justify, or secure, or “give a foundation to these transcendental assumptions in such a way that only primitive intuitive knowledge is used.”53 It seems that one must stand outside, adopting a privileged external standpoint, in order to grasp what the philosophical motivation of such reductions is supposed to achieve. And without any such grasp, it is not clear how one can be said to be understanding the transfinite symbols by constructive means. This raises an interesting question regarding what is required for understanding. Is simply being able to formulate the theories enough? Or is something more involved? What about the other point of view? Suppose one is a classicist and one’s goal is not so much to convince the constructivist of the legitimacy of one’s theory but rather to justify its use to oneself. From here it depends on what one thinks the role of infinitary mathematics is. If one thinks infinitary methods are simply instruments to be used in proofs of finitary formulae, then perhaps one feels validated. After all, we have seen that the reductions do establish directly that the Π⁰₂ formulae provable in WKL₀ are finitistically justified in the sense that their interpretations are PRA provable, and hence true. And this establishes that not only does WKL₀ not prove any false Π⁰₁ (finitary) statements, but it also does not prove any extra Π⁰₁ (finitary) statements. The flipside of this is to say that PRA proves (the interpretations of) all Π⁰₂ statements provable in WKL₀. To borrow a phrase from Feferman, the reduction establishes that a little bit goes a long way. But what about the rest? It seems to me that answering this question is really what’s at issue. After all, there seems to be more behind Hilbert’s conception of infinitary theories than simply potential instrumental use. Mathematicians do not tend to think of their work in infinitary theories as meaningless. In this case one does appreciate the philosophical motivation behind the reductions and one can appreciate (at least to some degree) the content behind the axioms; one has a grasp of what the formal theory means.
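The conservation result just invoked can be recorded schematically; this is my formulation in standard notation, with f a witnessing function symbol, not a quotation from the sources discussed:

```latex
% If WKL_0 proves a \Pi^0_2 (i.e. \forall\exists) sentence, then PRA proves its
% interpretation, in which a primitive recursive function f witnesses the
% functional dependence of y on x:
\mathrm{WKL}_0 \vdash \forall x\, \exists y\, R(x,y)
\quad\Longrightarrow\quad
\mathrm{PRA} \vdash R(a, f(a))
% Since \Pi^0_1 sentences are a special case, WKL_0 proves no false -- and no
% extra -- finitary \Pi^0_1 statements.
```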

Some logicians call themselves formalists and think that there is nothing more to mathematics than what is pictured as the production of derivations within formal systems. But those

53Bernays, as quoted in Sieg [2002], p. 362, italics mine.

who take seriously the platonistic, constructivist, or other such views (for example, finitist, predicativist, etc.) also concern themselves with the meaning of what is expressed in formal languages. There is then the question as to which choices of axioms and rules are legitimate and—in case the systems are incomplete—how they might be legitimately extended. This leads one into controversial areas of the foundations of mathematics.54

And this, ironically, seems to be what Frege was trying to point out in his famous disagreement with Hilbert. First, though, we take a brief detour through the work of Dedekind. As we will see, early Hilbert endorsed Dedekind’s structuralist/logicist positions, an endorsement that is at the heart of the maxim that Con ⇒ Ex and central to the disagreement with Frege.

6.4.4 Dedekind

Wilfried Sieg [2013] and Ferreirós [2009] have argued convincingly that early Hilbert was heavily influenced by the work of Richard Dedekind. Though they differ over some of the subtleties, they are in general agreement. Importantly, Hilbert was endorsing the logicism of Dedekind before publishing his presentation of the axiomatization of geometry. Though modern thinking usually equates logicism with Frege, Frege’s views remained largely unknown until later, gaining currency largely through Russell. Early logicism was instead associated mainly with Dedekind, and it was through Dedekind that Hilbert came to logicism. The central tenets of logicism for Dedekind were that the basic concepts of mathematics can be defined by logical concepts alone and that basic mathematical principles can be derived from logical principles. All of pure mathematics is proven as an outgrowth of the pure laws of thought.55 Important for our purposes is to note that the concepts of set and

54Feferman [1979b], p. 177. This distinction between the content of a theory and its formal representation is stressed by Brouwer and Zermelo in arguments against the Hilbertian radicalization of axiomatization (Sieg [1990], pp. 312, 313). As Brouwer put the point: “These mental mathematical proofs that in general contain infinitely many terms must not be confused with their linguistic accompaniments, which are finite and necessarily inadequate, hence do not belong to mathematics” (Brouwer [1927], p. 460, fn. 8, as quoted in Sieg [1990], p. 312). And to quote Zermelo: “Complexes of signs are not, as some assume, the true subject matter of mathematics, but rather conceptually ideal relations between elements of a conceptually posited infinite manifold. And our systems of signs are only imperfect and auxiliary means of our finite mind, changing from case to case, in order to master at least in stepwise approximation the infinite, that we cannot survey directly and intuitively” (Zermelo [1931], p. 85, as quoted in Sieg [1990], p. 313). 55For how exactly this works see, e.g., Sieg and Schlimm [2005].

mapping are taken as purely logical. Dedekind’s logicism was based on the theory of classes and the theory of relations. Logic deals with the relations between concepts, and sets are extensions of concepts. Moreover he accepted the (unrestricted) principle of comprehension as well as the axiom of extensionality. Given the centrality of the concept of a mapping for doing mathematics (e.g. functions, isomorphisms, etc.), Dedekind wanted to show it was essential to all thought. For “[t]he ability of the mind to relate things to things, to make a thing correspond to another, or to represent a thing by another” is taken as fundamental.56 Hilbert followed Dedekind in thinking that Arithmetik is based on logic and pure thought, including the notions of set and mapping. The science of number is thus built up from purely logical processes. Unlike Arithmetik, though, geometry is not a purely mathematical science but is rather a natural science. Taking Arithmetik and logic as fundamental, the question for Hilbert then is: what is needed in addition to this in order to arrive at the truths of geometry? The result is that the foundation of geometry is built on axioms that reflect the “truths” of the natural science of geometry, taking as given the laws of logic and Arithmetik. The nonconstructive character of much of the mathematics endorsed by many mathematicians of the day brings with it certain difficult questions with regard to the meaning of existence claims. In particular, what does it mean for a mathematical object to exist? More specifically, given the indispensability of the infinite for the smooth practice of mathematics, how do we establish its mathematical existence, even if it does not exist in reality? The goal initially was to justify it by means of logic (as opposed to Hilbert’s later proof-theoretic program). Thought can make a non-continuous entity continuous by adding new entities. And this, for Dedekind, can be done logically. 
The real numbers, for example, exist in thought even if not in physical reality. We can build up the reals from pure thought, i.e. by means of logic alone. Existence, then, is a logical notion for Dedekind, not an ontological one as it was for Frege; mathematical creation is, for Dedekind, legitimated and constrained by pure

56Dedekind in Was sind und was sollen die Zahlen? as quoted from Ferreirós [2009], p. 38.

logic alone. Since pure mathematics is based on pure logic, logical impossibility is the only constraint on mathematical existence. Put otherwise, the only constraint on mathematical existence is conceptual coherence. But we cannot simply assume the existence of a mathematical object. Rather, its existence must be proven. So to show the existence of, say, a simply infinite set, we need to show the consistency of the axioms that reflect, or capture, or are satisfied by, the system, where a system, for Dedekind, is a well-defined collection of mathematical objects. The real numbers, for example, form a system. For Dedekind this consisted in showing that infinite sets exist as thought objects, and that every such infinite set contains a simply infinite one. A system is infinite just in case it can be put in one-to-one correspondence with a proper subset of itself. A simply infinite system is any system satisfying the Peano-Dedekind axioms. Proposition 72 of Was sind und was sollen die Zahlen? states that every infinite system contains a simply infinite subsystem. Assuming the existence of an infinite system, then, there exists a simply infinite system satisfying the Peano-Dedekind axioms. Given this existence we then abstract away from the particular nature of the elements of the system, characterizing it generally. As he says, we

entirely neglect the special character of [its] elements, simply retaining their distinguishability and taking into account only the relations to one another in which they are placed.... [T]hese elements [are] called natural numbers or ordinal numbers or simply numbers.... With reference to this freeing [of] the elements from every other content (abstraction) we are justified in calling numbers a free creation of the human mind.57
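The two definitions at work in the preceding paragraph can be put compactly in modern notation; this is my formalization, not Dedekind’s own symbolism:

```latex
% A system S is infinite just in case it can be put in one-to-one
% correspondence with a proper subset of itself:
S \text{ is infinite} \;:\Longleftrightarrow\;
\exists f\, \bigl( f : S \to S \text{ injective} \;\wedge\; f[S] \subsetneq S \bigr)
% A simply infinite system is a system N with a distinguished element 1 and an
% injective map \varphi : N \to N such that 1 \notin \varphi[N] and N is the
% closure of \{1\} under \varphi (the Peano-Dedekind axioms).
```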

Obviously, though, for this to work we must secure the existence of an infinite system. Faithful to the conception of existence at the time, Dedekind strove to show the existence of an infinite system by providing a specific example. His categoricity theorem would then allow one to abstract away from the particular system and secure the generality of the construction. Existence, then, is completely free of metaphysical considerations. The existence of the infinite system (Proposition 66) was to be generated from considering the totality of one’s thoughts, i.e. the totality of all things that can be objects of one’s thinking. This logical

57Dedekind in Was sind und was sollen die Zahlen? as quoted in Demopoulos and Clark [2005], p. 153. Note the striking similarity to Hilbert.

existence entails the consistency of the axioms because it provides an object that satisfies the axioms. And this is how establishing consistency was typically thought of. Unfortunately for Dedekind, the set theoretic paradoxes undermine his conception of the collection of all of one’s thoughts as a coherent set. If every set can be an object of thought, then the totality of all thoughts is essentially the same as the set of all sets. This was shown by Cantor to be an inconsistent notion. Rather than abandon Dedekind’s project completely, Hilbert attempted a rehabilitation

that would lead him to his Con ⇒ Ex doctrine. Hilbert’s views on truth and existence remained rooted in the logicistic understanding of set theory. Ferreirós [2009] claims that it was the contradictory axiom of comprehension (the ultimate source of the trouble for Dedekind’s view) and Hilbert’s attempts to overcome the inconsistency that led him to reverse the direction of the relation between consistency and existence. By adopting unrestricted comprehension, Dedekind thought that all well-defined concepts brought with them coherent sets. Cantor’s paradox shows this is not the case; even though it is a well-defined concept, it does not follow that Dedekind’s collection of all thoughts is a set. Hilbert’s solution was to eschew the full axiom of comprehension. Well-definedness of a concept is not enough to guarantee that it comes with a consistent system. As he says,

As I see it, the most important gap in the traditional structure of logic is the assumption made by all logicians and mathematicians up to now that a concept is already there if one can state of any object whether or not it falls under it. This does not seem adequate to me. What is decisive is the recognition that the axioms that define the concept are free from contradiction.58

Being able to tell whether any object falls under a concept is the characteristic mark of a concept being well defined. And this, Cantor’s paradox shows, is simply not enough to guarantee the existence of the corresponding set. Instead, we must first make sure that the concept itself is consistent. Only then can we appeal to comprehension to give us the

existence of the corresponding set. Hence the maxim that Con ⇒ Ex.
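The inconsistency of unrestricted comprehension, which drove this reversal, can be exhibited in a few lines; this is the standard Russell-style derivation, stated in my notation:

```latex
% Unrestricted comprehension: every well-defined concept \varphi yields a set.
\exists y\, \forall x\, \bigl( x \in y \leftrightarrow \varphi(x) \bigr)
% Taking \varphi(x) :\equiv x \notin x gives a set r with
\forall x\, \bigl( x \in r \leftrightarrow x \notin x \bigr)
% and instantiating x := r yields the contradiction
r \in r \;\leftrightarrow\; r \notin r
% So well-definedness alone (being able to decide, of each x, whether it falls
% under the concept) does not guarantee a consistent system; consistency must
% be established first.
```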

58Frege [1980], pp. 51-52.

6.4.5 Frege-Hilbert

Some Agreement

It is worth noting that Frege and Hilbert had some general views in common. They seem to agree with respect to the formalization of mathematical ideas in terms of symbols rather than words. Assuming that the formalization expresses the same thought as the words, symbols are briefer and more perspicuous. Indeed, according to Frege, “the advantages of perspicuity and precision are so great that many investigations could not ever have been made without a mathematical sign language.”59 However, Frege warns that we must be careful in proceeding by means of symbols. In particular,

the use of symbols must not be equated with a thoughtless, mechanical procedure.... A mere mechanical operation with formulas is dangerous (1) for the truth of the results and (2) for the fruitfulness of the science.60

Later, he writes that

The natural way in which one arrives at a symbolism seems to me to be this: in conducting an investigation in words, one feels the broad, imperspicuous and imprecise character of word language to be an obstacle, and to remedy this, one creates a sign language in which the investigation can be conducted in a more perspicuous way and with more precision. Thus the need comes first and then the satisfaction.61

Hilbert agrees with this, saying that

I believe your view of the nature and purpose of symbolism in mathematics is exactly right. I agree especially that the symbolism must come later and in response to a need, from which it follows, of course, that whoever wants to create or develop a symbolism must first study those needs.62

More importantly though, they were both in general agreement that the basic concepts of pure mathematics can be defined by logical concepts alone and that basic mathematical principles can be derived from logical principles. That is, they were both logicists with respect to Arithmetik. They also both thought that geometry is fundamentally different from Arithmetik in not being a purely mathematical/logical science. Yet they fundamentally disagreed about the epistemic basis and security for geometry. For Frege this is intuition in

59Frege [1980], p. 33. 60Frege [1980], p. 33. 61Frege [1980], p. 33, emphasis mine. 62Frege [1980], p. 34.

the Kantian sense. Hilbert, on the other hand, seems to think that though we might get insight into the axioms of geometry through intuition,63 the notions of set and especially mapping are logical; and because one can effectively map points to pairs of real numbers, etc., and numbers are secured by logic, the logical relations that hold between the points, lines, and planes will themselves be effectively secured and rooted in logic. For it is through the logical principles that “a foundation is provided for the reliability and completeness of proofs.”64 This is why the relative consistency proof works for him.

The Disagreement

In his second letter to Hilbert, Frege asks for clarification regarding Hilbert’s usage of certain terms. For one thing, while he takes Hilbert to have been using ‘explanation’ and ‘definition’ to mean different things, he has trouble seeing what that difference is.65 For in some cases Hilbert seems to be using them in the same way. He is also unclear as to exactly what Hilbert means by points, lines, and planes. In one instance Hilbert seems to be thinking of them in the traditional Euclidean sense. But then later he conceives of them as (in the example of a point) a pair of real numbers. Equally confusing for Frege is Hilbert’s failure to distinguish between definitions and axioms. In particular, Frege cannot see how axioms can provide a “precise and complete description of relations” nor how the concept of ‘between’ can be defined by the set of axioms. For Frege, definitions ought to be distinguished from all other mathematical propositions. Definitions indicate the meaning of hitherto meaningless terms. They neither extend knowledge nor require proof or intuition for truth. They are merely “a means for collecting a manifold content into a brief word or sign, thereby making it easier for us to handle.”66

63Presumably Hilbert does not mean the precise Kantian sense of intuition that Frege does. 64Dedekind says this in a letter to Keferstein in 1890. The quote was taken from Ferreirós [2009], p. 38. Hilbert was probably unaware of the letter but the quote is apt because I take it that Hilbert would have endorsed such a view given his deference to Dedekind. 65As far as I can tell, ‘explanation’ is actually translated as ‘definition’ in the English version of Grundlagen der Geometrie. It is also worth noting that the sections that Frege refers to do not match up with those of the English translation. Both are confusing, to say the least, when first trying to understand their exchange. 66Frege, et al. [1971], p. 24.

Axioms, on the other hand, are true statements not in need of any proof. Their epistemic status is different from that of logic. Whereas the truth of the axioms of arithmetic was to be derived from pure thought, that of the axioms of geometry was established by means of spatial intuition. As he says to Hilbert, It seems to me that you want to detach geometry entirely from spatial intuition and turn it into a purely logical science like arithmetic. The axioms which are usually taken to be guaranteed by spatial intuition and placed at the base of the whole structure are, if I understand you correctly, to be carried along in every theorem as its conditions–not of course in a fully articulated form, but included in the words ‘point’, ‘line’, etc.67

And, because they are true by nature, of course they are consistent; no further proof of their consistency is thus required, pace Hilbert.68 In reply to Frege’s point that the axioms are themselves guaranteed to be consistent, Hilbert responds by reiterating his ‘consistency entails existence’ maxim: [F]or as long as I have been thinking, writing, lecturing on these things, I have been saying the exact reverse: If the arbitrarily given axioms do not contradict one another, then they are true, and the things defined by the axioms exist. This for me is the criterion of truth and existence...69

Frege does not see how such a position is tenable. To illustrate his concern he presents the following simple example:70

1. EXPLANATION: We conceive of things which we call gods.
2. Axiom 1: Every god is omnipotent.
3. Axiom 2: There is at least one god.

Assume that 1-3 are consistent in that their consequences do not contradict one another. Does it then follow that there exists such an intelligent, omnipresent, and omnipotent being? Frege does not see how it could. He takes this as extending into the realm of mathematics. It’s not clear to me that such a complaint is really fair. It does not seem unreasonable to me to think that coherence of thought is enough to secure it as an object or theory of mathematics. Nevertheless, the example is illustrative because it gets to the heart of Frege’s rejection of Hilbert’s Con ⇒ Ex. We return to this later. The distinction between definitions and other mathematical propositions is, for Frege, 67Frege [1980], p. 43. 68Frege [1980], p. 37. 69Frege [1980], p. 42. 70Frege, et al. [1971], p. 33.

very essential for the strictness of mathematical investigations. . . . The other propositions (axioms, fundamental propositions, theorems) must not contain a word or sign whose sense and meanings or whose contribution to the expression of a thought, was not already completely laid down, so that there is no doubt about the sense of the proposition and the thought it expresses. The only question can be whether this thought is true and what its truth rests on. Thus axioms and theorems can never try to lay down the meaning of a sign or word that occurs in them, but it must already be laid down.71

For Frege, then, all notions found in axioms are to be completely defined. So, for example, whenever ‘between’ is used in an axiom, the notion of between must already be understood. And because of this, the axioms themselves cannot give a more precise definition of ‘between’. Thus, according to Frege, the “alternative” definition that Hilbert does give cannot make sense. For it to make sense, the notion of between must not yet have a definite meaning in the axiom. But then, according to Frege, it does not express a thought and so can express neither a truth nor a fact of intuition. Moreover, axioms cannot be definitions, “if only because there is more than one of them, and furthermore, because they contain expressions whose meaning seem not to yet have been laid down.”72 So axioms express truths and definitions give meaning to particular terms. As such, axioms ought not themselves contain terms that have no prior indicated meaning. In response to Frege, Hilbert first notes that his explanations could be considered a definition, if properly rephrased. One could simply say, for example, that “‘Between’ is a relation that holds for the points on a line and which has the following characteristic marks: II/1 ... II/5."73 And later, Hilbert suggests that rather than call his axioms ‘axioms’, Frege should feel free to call them ‘characteristic marks’ instead. Terminological disputes aside, the fundamental point of disagreement concerns Hilbert’s use of implicit definitions of point, line, and plane. In his first letter Frege said that “The explanations in sect. 1 are apparently of a very different kind [from a definition], but are assumed to be known in advance.” Hilbert responds by pointing out that he “does not want to assume anything as known in advance; I regard my explanation in sect. 1 as the definition of the

71Frege [1980], p. 36. 72Frege [1980], p. 37. 73Frege [1980], p. 39.

concept point, line, plane – if one adds again all the axioms of groups I to V as characteristic marks.”74 But this, as we saw above, is lost on Frege. How can these be the definitions of the concepts if they are incomplete? Hilbert replies by saying that [T]he definition of the concept point is not complete till the structure of the system of axioms is complete. For every axiom contributes something to the definition, and hence every new axiom changes the concept. A point in Euclidean, non-Euclidean, Archimedean and non-Archimedean geometry is something different in each case...75 For Hilbert, the axioms are the component parts of the definition of a concept. So, for example, in the case of “betweenness”, axioms II.1 through II.5 serve to define the concept. Hilbert is willing to concede to Frege that axioms might also be called characteristics and is happy to appease him with such a renaming. Nonetheless, Frege is still confused as to how axioms, if definitions, can express basic facts of intuition. For one thing, if they do, they are assertive. But then the expressions contained in them would have to be understood. But this cannot be so if axioms are components of definitions. The axioms thus cannot be thought of independently, given their definitional role. This is because, for Frege, a proposition gets the sense it does in virtue of the thought it contains. As such, if the proposition defining betweenness is taken to be the result of the various component parts, then the definition of the concept must be taken holistically. As a result it does not make sense to think of the axioms independently of one another, since the resulting concept of betweenness will change in the absence of one of them. Frege thus seems to be against the notion of independence for the reason that different thoughts will be expressed. For the axioms and the things they are defining get their meaning when taken together. Removing one changes the notions involved. 
Setting this aside, though, and taking the axioms as expressing different characteristics of the concepts being defined, Frege takes issue with Hilbert’s conception of the explanations composed out of the hitherto undefined notions of point, line, and plane, comparing them to a system of equations with unknowns. Having definitions that include unknowns does not let one know what it is that is being talked about, according to Frege. As he says, 74Frege [1980], p. 39. 75Frege [1980], p. 42.

“Your system of definitions is like a system of equations with several unknowns, where there remains a doubt whether the equations are soluble, and, especially, whether the unknown quantities are uniquely determined...Given your definitions, I do not know how to decide the question whether my pocket watch is a point.”76 According to Hilbert, it is only when the totality of the axioms is taken together that a determination of the unknowns (the points, lines, and planes) is possible. Frege’s complaint is that even so we cannot be sure that they will be solvable at all, or that if they are, they will be unique. It would seem, according to Frege, that we would thereby already have to know what a point is. But why exactly does Frege think that we need to be able to specify just what the objects of our theory actually are, over and above simply proclaiming that they are whatever satisfies the axioms? Why does Frege think that we need to reach for the “objective” nature of the intended objects of the theory and its structure? Why do we have to pinpoint the unique, specific subject matter? Hilbert clearly thinks we do not need to. As he says, almost in seeming frustration,

But it is surely obvious that every theory is only a scaffolding (schema) of concepts together with their necessary connections, and that the basic elements can be thought of in any way one likes. E.g., instead of points, think of a system of love, law, chimney-sweep...which satisfied all axioms; then Pythagoras’ theorem also applies to these things. Any theory can always be applied to infinitely many systems of basic elements. For one only needs to apply a reversible one-one transformation and then lay it down that the axioms shall be correspondingly the same for the transformed things...77

The answer, presumably, has to do with Frege’s general conception of a theory. Geometry, in particular, is about space, or the relations that hold between points, lines, and planes, where by ‘points’, ‘lines’, and ‘planes’ we really mean the things of intuition. Reinterpreting the geometric propositions as about real numbers changes the subject matter. It is the geometric thoughts about actual points, lines, and planes that we are interested in, according to Frege, not simply their structure. Since for Frege theories are collections of thoughts, and since the things that the sentences are about will be different in different branches of mathematics, they must express different thoughts. They are thus

76Frege [1980], p. 45. 77Frege [1980], p. 45.

different theories and so using one to demonstrate the consistency or independence properties of the other is a mistake. Patricia Blanchette78 has argued this point forcefully. For Frege, theories are ultimately about Thoughts, axioms being special propositions expressing true thoughts not in need of proof. It is the thoughts, not the sentences expressing the thoughts, that stand in logical relations to one another. And because of this, Frege warns, a particular presentation of some truths does not necessarily expose their ultimate structure. For while sentences express thoughts, the expressions themselves may not reveal enough of the logical structure; two sentences can express the same thought in different ways, and one can display the relationships more clearly than another. As Blanchette puts it, “superficial grammatical structure is no reason to suppose that the thoughts they express share logical properties.”79 As we will see, this is central to understanding why Frege does not think that the relative consistency proofs of Hilbert will work. This leads to an important difference between Frege and Hilbert: there is a gap between logical implication and deducibility for Frege. While deducibility (in a good system of course) entails implication, for Frege, the converse is not true. For it is quite conceivable that logical structure has not been completely and adequately uncovered. It is this disparity, she claims, that (for Frege) lies at the heart of the disagreement with Hilbert.80 Frege’s main objection to Hilbert’s proofs is that they are not about the axioms of geometry. The sentences that express the axioms are mere tools for expressing the things that we really care about, namely the thoughts that make up Euclidean geometry. So if the sentences/axioms do not express genuine thoughts (which they will not, since they contain uninterpreted terms and hence lack sense) then they are not the sorts of things that stand in logical relationships.
Recall that Hilbert’s method was to start with intuitive axioms for geometry presented

78See, e.g., Blanchette [1996] and [2012]. 79Blanchette [2012], p. 108. 80See Blanchette [2012].

in a way that abstracts from the meanings of the individual terms in the axioms. From here he reinterprets the terms as about real numbers. Providing interpretations for these otherwise uninterpreted axioms ends up giving us, according to Frege, two different sets of thoughts. On the one hand, when the terms are interpreted as being points, lines, and planes, we get thoughts about the domain of geometry. On the other hand, when the terms are interpreted as being relations of real numbers, we get thoughts about the domain of Arithmetik. Hilbert’s strategy then was to show that the consistency of the axioms expressing truths about the real numbers can be used to infer the consistency of the axioms expressing truths about geometry. It is this inference that Frege finds untenable. To be sure, assuming the consistency of Arithmetik, interpreting the axioms as expressing truths about the real numbers will entail the consistency of those interpreted axioms. And this, in turn, will entail the consistency of the set of uninterpreted axioms, treated as implicitly existential generalizations of the arithmetically interpreted axioms. But it is a mistake to conclude from this that the axioms expressing thoughts in the domain of geometry will also be consistent. Because for Frege it is the thoughts themselves that enter into logical relationships, and because thoughts are crucially dependent on the meanings of words, the consistency of thoughts involves the meanings of the non-logical terms in the sentences expressing the thoughts. To see logical relations among thoughts, appropriate conceptual analysis is required. To show that a thought follows logically from a set of other thoughts, one first analyzes the thoughts, formalizing them appropriately, and then proves (in, say, a Begriffsschrift) the sentence ϕ from the set Γ. This process is central to Frege’s arithmetical proofs in the Grundlagen.
He first gives extensive conceptual analysis of, e.g., zero and successor, and then proves certain logical relations between the concepts. So while a derivation of a sentence ϕ from a set of sentences Γ will ensure that the thought expressed by ϕ follows logically from the thoughts expressed by Γ, the converse is not true. Failure to provide a derivation does not necessarily mean that the thoughts of Γ do not entail the thought of

ϕ. The point holds equally for consistency. Failure to prove that a set of sentences entails a contradiction does not mean that the thoughts are actually consistent. To use a tired example, let Bj express the thought that John is a bachelor and Mj express the thought that John is married. As it stands, these two expressions are consistent. Using Hilbert’s methods, one can easily give a reinterpretation that shows them obviously consistent. For Frege, however, the two thoughts are inconsistent because part of what it means to be a bachelor is to be unmarried. The point is to be taken generally, and, because it is possible that an inconsistency could lie hidden beneath the surface of conceptual analysis, reinterpretation strategies in general do not guarantee consistency of thoughts. This is why Frege took issue with Hilbert’s strategy. Analogous to the case above, while a reinterpretation shows the structural properties of the sentences expressing the thoughts to not be in contradiction, this does not show that the thoughts in the intended interpretation are not inconsistent. In our case at hand, reinterpreting the axioms expressing thoughts of geometry as those concerning arithmetic, and showing these consistent, does not mean that the original geometric thoughts are consistent. Recall what is important for Hilbert: unlike Frege, for whom the point was to be sure we are actually talking about these intuitive points, lines, planes, etc., Hilbert’s point was to focus on the logical relations among these concepts. Independence and consistency questions are about the logical structure of the axioms rather than about any particular thoughts expressed by them. As we saw Hilbert say above, surely a theory is only a scaffolding or schema. This focus on the logical structure of the axioms, independently of some outside meaning or intuition we attribute to it, shifts the focus toward syntax.
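The contrast between the two strategies can be made concrete with a small sketch. This is an illustrative toy of my own, not anything from the text: treating B (“bachelor”) and M (“married”) as uninterpreted unary predicates over the one-element domain {john}, a brute-force search over interpretations finds one satisfying both B(j) and M(j); but once the meaning postulate supplied by Frege-style conceptual analysis (bachelors are unmarried) is added as a further constraint, every interpretation is ruled out.

```python
from itertools import product

def satisfiable(constraints):
    """Search all interpretations of the predicates B and M over the
    one-element domain {john}: each predicate is just a truth value."""
    for B, M in product([False, True], repeat=2):
        if all(c(B, M) for c in constraints):
            return True
    return False

# Hilbert-style: B and M are uninterpreted, so only the sentences
# B(john) and M(john) constrain the interpretation.
uninterpreted = [lambda B, M: B, lambda B, M: M]

# Frege-style: conceptual analysis adds the meaning postulate
# "all bachelors are unmarried", i.e. B(john) -> not M(john).
analyzed = uninterpreted + [lambda B, M: (not B) or (not M)]

print(satisfiable(uninterpreted))  # True: a reinterpretation exists
print(satisfiable(analyzed))       # False: the analyzed thoughts clash
```

The search stands in for Hilbert’s reinterpretation strategy: consistency of the bare sentences is witnessed by some interpretation or other, whereas Frege’s inconsistency claim only appears once the analyzed content is added as a further constraint.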
We need to make sure that our rules do not allow us to derive a contradiction. Given this notion of consistency, we can treat the terms as whatever we like. Treating the terms as meaningless involves no loss of generality: whatever the terms name (the objects of the theory) just is whatever satisfies the sentences. Frege’s complaints about the illegitimate shift from talk of the arithmetically interpreted axioms to the geometrically interpreted axioms do

not make sense to Hilbert, given that for him the theory is about the abstract structure, the scaffolding. Despite Frege’s concerns with various parts of Hilbert’s terminology, as seen above, as well as his concerns with Hilbert’s reinterpretation strategy as discussed in the previous paragraph, Frege does concede that Hilbert has succeeded in defining a structure. Frege acknowledges that Hilbert’s reinterpretation guarantees the consistency of the existentially generalized axioms. The consistency of the generalized axioms shows that Hilbert has provided a definition of an abstract structure, a second-level concept, under which the two interpreted representations (as first-level concepts) fall. Frege diagnoses Hilbert’s mistake as a failure to distinguish between first- and second-level concepts.81 Frege notoriously distinguished concepts from objects, and further stratified concepts into different types. This stratification of concepts is supposed to hold also in the case of the characteristics. First-level concepts can only have first-level characteristics and second-level concepts can only have second-level characteristics. So a characteristic of a first-level concept is a property an object must have in order to fall under the concept. And similarly a characteristic of a second-level concept is a property a concept must have in order to fall under it. And so on. Frege’s complaint is that Hilbert violates the type restrictions on characteristics. The problem that this brings out in the example above on page 184 is that Axiom 1 has characteristics that are first-level (‘is omnipotent’), whereas Axiom 2 has characteristics that are second-level (something like ‘the concept god is realized’). How does this relate to Hilbert? Well, assuming that points are objects, “point” is a first-level concept (similarly for line and plane). So, since the axioms are to define what a point is, all of its characteristics must be first-level.
But upon examination of Hilbert’s axioms, the characteristics presented are second-level. So the things that they are defining are second-level concepts, not the first-level concepts that they are supposed to be defining. Understood this way, Euclidean

81See, e.g., Frege, et al. [1971], p. 77, and Frege [1980], p. 46.

geometry presents itself as merely one of many geometries. The relations that Hilbert characterizes are second-level relations between the first-level concepts of “point”, etc. As Susan Sterrett has pointed out82, Hilbert’s assistant, in a letter to Liebmann, seems to have accepted the point, conceding that rather than defining the concepts of, e.g., “point” and “incidence”, Hilbert’s axioms of geometry define the second-level concept of a three-dimensional Euclidean space. Such is indeed fitting, for this is exactly what Dedekind did when he provided a definition of a simply infinite system in Was sind und was sollen die Zahlen?83 Nonetheless, such a failure to define a point fails to provide the fixity of reference84 that Frege demanded a characterization of Euclidean geometry should provide. Moreover, because Hilbert’s axioms contain undefined words they fail to express thoughts, and, as such, are not the sorts of things that Frege takes to be subject to consistency (and independence) considerations.

6.4.6 External value II

We thus see the emergence of two different types of justification: focusing on the deductive apparatus can be said to provide a foundation in the sense of being provable in a particular framework. But that’s it. It does not give one any access to a different type of justification of the coherence of the concepts involved, i.e., what the formalization is taken to mean. Nor can it, given the paradox of rigor. The goal, it seems, is to provide some further epistemic support for the actual methods, and thereby for the correctness, of the infinitary systems. Here one is trying to justify (in some sense) the truth of the transfinite assumptions. The idea is that the constructive consistency proof justifies the stronger theory in the sense of giving credence to its truth. Given our discussion above, this seems in line with Hilbert’s overall idea and his confidence in the transcendental assumptions. Understood this way, claims of the form “WKL0 is justified

82Sterrett [1994], fn. 25. 83See Sieg [2014], pp. 143-144. 84For more on the importance of fixity of reference for Frege see Hallett [2010].

or secured on the grounds of PRA” make it sound as though the proof-theoretic reduction is providing justification for the entire system WKL0 in the sense of establishing that the axioms and theorems of WKL0 are true. But this is not what is established, strictly speaking. Only those statements for which conservativity holds should be considered epistemically privileged. That is, it should not be concluded from this that whatever is provable in T2 simpliciter is justified by the foundational framework of T1, at least not directly. Whether such results provide indirect support for this is another matter. More on this in a moment.

Of course, given conservativity for Π⁰₂ statements, one also has conservativity for Π⁰₁ statements, and hence one has established the consistency of the rest. But again, consistency of a syntactic formalism does not give one access to the correctness of the content of that theory. The lesson learned from the discussion above is that there is something fundamentally different about formal and contentual axiomatics. And this is important for getting clear on what exactly the reduction achieves. It only shows that the principles being “eliminated” are consistent. It says nothing more about their actual truth. Insight into the (relative) consistency of some formalization does not give one insight into the correctness of the (totality of) infinitistic proofs. Simpson himself seems to grant this much:

I grant that the reduction of infinitistic proofs to finitistic ones does not increase confidence in the formal correctness of infinitistic proofs. What such a reduction does accomplish is to show that finitistically meaningful end-formulas of infinitistic proofs are true in the real world. Hence formulas which occur in infinitistic proofs become more reliable in that they are seen to correspond with reality.85

Presumably the formulas that Simpson has in mind in the last sentence of this quote are those which are finitistically meaningful. It’s hard for me to see how the sort of justification that is being appealed to provides us any direct confidence in the rest of the system; unless by ‘correspond’ here he means simply ‘is consistent with’. So while it is true that having a proof-theoretic reduction will guarantee relative consistency, one should not conclude from this the truth of everything provable in WKL0, only

the truth of the (interpretations of) Π⁰₂ formulae. And this is only because PRA is taken as epistemically privileged: everything it proves is true. In the general case we do not even get

85Simpson [1988], p. 359.

this. One can only establish from a reduction that we have relative consistency. This much will be true, of course, given the result of Kreisel above. But on this more general picture we are not even guaranteed that the sentences proven to be conservative are true, only that they are provable in the weaker system without appeal to concepts from the stronger system. However, proof-theoretic reductions can be understood as increasing one’s epistemic confidence that the principle is true. The idea is that since it is consistent (assuming the constructive theory is consistent), and it is conservative (properly understood) for a wide class of formulae, this does increase one’s confidence in not only its formal, but also its contentual correctness. Understood this way, establishing the (relative) consistency of some strong mathematical theory could increase one’s confidence that the principles involved are true. This is something akin to extrinsic justification of new axioms in the sense of Gödel, and serves as an appropriate segue to the next chapter. Before we segue, though, I would like to make a brief remark about the role of consistency proofs, generally. Why think that simply showing a set of axioms is consistent is enough to give one epistemic security? After all, Kreisel famously railed against Hilbert on precisely this point when he said, “I was repelled by Hilbert’s exaggerated claim for consistency as a sufficient condition for mathematical validity or some kind of existence.”86 We noted above that (at least early in his career) Hilbert took the consistency of the axioms for real analysis to show that the structure they are taken to describe (that of the reals) actually exists as a completed totality. We called this maxim Con ⇒ Ex. But we might wonder why a consistency result should give us any such credence in the ontological status of the purported subject-matter of the theory.
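For reference, the two formal claims in play can be displayed schematically. This is my own rendering, not the dissertation’s notation; the conservativity statement is the standard one (due to Friedman, with a proof-theoretic treatment in Sieg [1991]):

```latex
% Hilbert's maxim, as introduced earlier:
% Con(T)  =>  the structure described by T exists.        (Con => Ex)

% The conservativity fact the following argument trades on:
% for every \Pi^0_2 sentence \pi,
\mathrm{WKL}_0 \vdash \pi \;\Longrightarrow\; \mathrm{PRA} \vdash \pi .

% Since 0 = 1 is (trivially) \Pi^0_2, conservativity yields relative consistency:
\mathrm{WKL}_0 \vdash 0{=}1 \;\Longrightarrow\; \mathrm{PRA} \vdash 0{=}1,
\qquad \text{i.e.} \qquad
\mathrm{Con}(\mathrm{PRA}) \;\Longrightarrow\; \mathrm{Con}(\mathrm{WKL}_0).
```

What the schema does not license, as argued above, is the further step from WKL0 ⊢ σ, for arbitrary σ, to the PRA-justification of σ.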

One reason for thinking it might is the following: WKL0 is conservative over PRA with respect to Π⁰₂ sentences. Moreover, this conservativity result, being Π⁰₂, is provable within PRA. Not only does this show that WKL0 is of no stronger consistency strength than PRA (a result that would be established by showing that WKL0 was conservative over PRA with respect to just the Π⁰₁ sentences) but also that WKL0 is consistent relative to PRA. Now, to be sure, PRA is subject to Gödel’s theorems so it cannot prove its own consistency. But, given its epistemic security and the fact that it does not lead to any false statements, we can be epistemically sure that it is consistent. So, given that WKL0 is conservative relative to PRA, this means that Hilbert would have considered WKL0 epistemically secure. And, importantly, not merely secure with respect to the Π⁰₁ sentences. For if it were secure only with respect to those, there would be no reason to appeal to WKL0 in proving results in PRA. Rather, it is secure with respect to all statements of WKL0. So far so good. Now, WKL0 proves Gödel’s completeness theorem, namely that every countable consistent set X of sentences has a model. This seems to support the consistency-entails-existence maxim of Hilbert, Con ⇒ Ex.87 What the above discussion hopefully makes clear is that the inference to the effect that all statements of WKL0 are PRA-justified is fallacious. Hence, we do not have any direct confidence from the point of view of the PRA theorist that the completeness theorem is true, and so, on that basis, the argument for the truth of Con ⇒ Ex falls short. Of course the remark from the last paragraph shows that one might have increased confidence in the truth of the maxim. And of course if one is already outside the system then such an argument does support one’s view. This lines up with the discussion of Frege above. Hilbert succeeded in defining a structure, but only understood as a second-level concept. And one can only do so if one is on the outside looking in, so to speak. In closing, then, I echo Burgess [2010],

All I am asking for, ultimately, is just that some fraction of the subtlety and sophistication that has been devoted to technical results should be devoted to the working out of their philosophical implications, in imitation of the model of Gödel, who was equally careful in both domains.88

86Kreisel [1987], p. 395.

6.5 Appendix

Lemma 6.5.1 (∃-Inversion, 1.2.3) Let T(F) be an Herbrand theory, let Γ contain only purely existential formulas, and let ψ be quantifier-free; if D is a T(F)-derivation of Γ, (∃x)ψx, then there is a term t* and a(n I-normal) T(F)-derivation D* of Γ, ψt*.

87Such an argument was first suggested to me by Neil Tennant. I vaguely recall having since seen something similar put forth by Stephen Simpson in the FOM archive, but have been unable to locate said post. 88Burgess [2010], p. 140.

Proof. The proof proceeds by induction on I-normal T(F)-derivations. Focusing on the central step concerned with the induction rule, D must end with an inference of the form

    Γ, φ0, ∃xψx     Γ, φa, φa′, ∃xψx
    ________________________________
             Γ, φt, ∃xψx

The induction hypothesis applied to the left premise yields a term r and a derivation of Γ, φ0, ψr. The induction hypothesis applied to the right premise yields a term s[a] and a derivation of Γ, φa, φa′, ψs[a], where s[a] means that a is a parameter occurring in s. Moreover, since T(F) proves φ0, φt, (∃x ≤ t)(φx ∧ φx′), T(F) also proves both φ0, φt, φh(t) and φ0, φt, φh(t)′ (by condition (ii) in the definition of a Herbrand theory). Replacing the parameter a with h(t) gives us Γ, φh(t), φh(t)′, ψs[h(t)]. So, graphically, T(F) proves

    Γ, φ0, ψr     Γ, φh(t), φh(t)′, ψs[h(t)]     φ0, φt, φh(t)     φ0, φt, φh(t)′

A series of cuts then yields a derivation of Γ, φt, ψr, ψs[h(t)]. And since T(F) is a Herbrand theory one can define (by condition (i)) an f in F such that T(F) proves Γ, φt, ψf(t). ∎

Lemma 6.5.2 (2.1.2) Let Γ contain only Σ⁰₁-formulas; if D is an I-normal derivation of Γ in (Σ⁰₁(PR)-IA), then there is an I-normal derivation of Γ in QF(PR)-IA.

Proof. The proof proceeds by induction on the number of applications of the Σ⁰₁-induction rule in D. Since the claim is trivial in the case where D contains no applications of the Σ⁰₁-induction rule, assume that D contains at least one application and consider one such application for which there are no other applications above it. The subderivation in question will contain the following inference (with ψ quantifier-free).

    ∆, ∃xψx0     ∆, ∃xψxa, ∃xψxa′
    ______________________________
             ∆, ∃xψxt

∃-inversion allows the extraction of a term σ[0] and a QF(PR)-IA derivation of ∆, ψσ[0]0 from the left premise. From the right premise, ∀-inversion followed by Σ-inversion yields a new parameter c, a term τ[a, c], and a derivation of ∆, ψca, ψτ[a, c]a′. Now define a function f by primitive recursion such that f(0) = σ[0] and f(a′) = τ[a, f(a)]. One can then prove by means of quantifier-free induction, together with the proofs of ∆, ψσ[0]0 and ∆, ψca, ψτ[a, c]a′, that there is a QF(PR)-IA derivation of ∆, ψf(a)a, and hence of ∆, ∃xψxt. By replacing the original subderivation above with this new proof, one can invoke the induction hypothesis and infer the lemma. ∎
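The function f introduced in this proof is an instance of the primitive-recursion schema f(0) = σ[0], f(a′) = τ[a, f(a)]. As a minimal illustration (my own sketch; the concrete base and step functions below are stand-ins, not the σ and τ of the proof), the schema can be rendered as a higher-order operator:

```python
def primrec(base, step):
    """Return the function f with f(0) = base and f(n + 1) = step(n, f(n)),
    mirroring the schema f(0) = sigma[0], f(a') = tau[a, f(a)]."""
    def f(n):
        value = base
        for a in range(n):          # unfold the recursion n times
            value = step(a, value)
        return value
    return f

# Example instances, both primitive recursive:
add3 = primrec(3, lambda a, prev: prev + 1)        # n |-> n + 3
fact = primrec(1, lambda a, prev: (a + 1) * prev)  # n |-> n!

print(add3(4))  # 7
print(fact(5))  # 120
```

The point of the operator form is that each application of the Σ⁰₁-induction rule is traded for one new primitive recursive function symbol, which is exactly how the lemma pushes derivations down into QF(PR)-IA.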

Theorem 6.5.1 (2.1.1) The provably total functions of (Σ⁰₁-IA) are exactly the primitive recursive functions.

Proof. This result follows from the following facts:

(i) (Σ⁰₁(PR)-IA) is a definitional (and hence conservative) extension of (Σ⁰₁-IA);
(ii) (Σ⁰₁(PR)-IA) is conservative over QF(PR)-IA for Π⁰₂ formulas; and
(iii) PR is the class of provably total functions of QF(PR)-IA.
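The three facts chain together as follows (my schematic summary, exploiting that a totality claim ∀x∃y ψ(x, y), with ψ quantifier-free, is Π⁰₂):

```latex
(\Sigma^0_1\text{-IA}) \vdash \forall x\,\exists y\,\psi(x,y)
\;\overset{(i)}{\Longrightarrow}\;
(\Sigma^0_1(\mathrm{PR})\text{-IA}) \vdash \forall x\,\exists y\,\psi(x,y)
\;\overset{(ii)}{\Longrightarrow}\;
\mathrm{QF(PR)\text{-}IA} \vdash \forall x\,\exists y\,\psi(x,y),
% and by (iii) the function witnessing y from x is primitive recursive.
```

Conversely, every primitive recursive function is already provably total in (Σ⁰₁-IA), which gives the “exactly” in the theorem.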

(iii) follows from the following theorem:

Theorem 6.5.2 (1.2.6) The provably total functions of (∆0(An)-IA), 3 ≤ n, are exactly the elements of En.

Proof. First note that all the functions of En can be introduced in ∆0(An)-IA, 3 ≤ n, and so ∆0(En)-IA is a definitional extension of ∆0(An)-IA. Moreover, QF(En)-IA is equivalent to ∆0(En)-IA. The basic idea is that since QF(En)-IA and ∆0(An)-IA are equivalent, it follows that they have the same class of provably total functions. Moreover, since QF(En)-IA is an Herbrand theory, by the term extraction lemma we have that its class of provably total functions is En. So the provably total functions of ∆0(An)-IA, 3 ≤ n, are exactly the elements of En. From this it follows that because ⋃{En : n ∈ ℕ} is the class of primitive recursive functions, and (QF(PR)-IA) is equivalent to ⋃{∆0(An)-IA : n ∈ ℕ}, the provably total functions of (QF(PR)-IA) (and PRA) are the primitive recursive functions. ∎

As Sieg notes, this highlights a general schema for determining the class of provably

total functions of some theory T(F). Namely, instead of considering T(F) directly, one first extends T(F) definitionally and then reduces the extension to a Herbrand theory. In the case at hand, T(F) is (∆0(An)-IA), the definitional extension is (∆0(En)-IA), and the Herbrand theory is QF(En)-IA.

Lemma 6.5.3 (Elimination Lemma, 2.2.2) Let D be an I-normal ETn-derivation of ∆[SCHEMA], where SCHEMA stands for QF-AC0 or WKL; if ∆ contains only existential formulas, then there is an I-normal ETn-derivation E of ∆.

Proof. The proof proceeds in two parts: first for QF-AC0, then for WKL. In both cases one argues by induction on the length of the derivation D, distinguishing cases of the last inference of D. Focusing first on the central case of an introduction of QF-AC0, D must end with an inference of the form

    ∆[QF-AC0], ∀x∃yφxy     ∆[QF-AC0], ∃f∀xφxf(x)
    ______________________________________________
        ∆[QF-AC0], ∀x∃yφxy ∧ ∃f∀xφxf(x)

Considering the derivations of the left premise, one can, after applying ∀-inversion, apply the induction hypothesis to arrive at a derivation of ∆, ∃yφcy. ∃-inversion together with ED then yields a function g such that ∆, ∀xφxg(x) is ETn-provable. Applying ∀-inversion on the right premise yields a derivation of ∆, ∀xφxf(x). Replacing the parameter f in the derivation by g yields an ETn-derivation of ∆, ∀xφxg(x). An application of cut, after I-normalizing, completes the case. Graphically,

    ∆, ∀xφxg(x)     ∆, ∀xφxg(x)
    _______________________________
                 ∆

    T(f) ∧ ∀x∃y(lh(y) = x ∧ f(y) = 1) ∧ ∃g∀x f(ḡ(x)) = 1.

There are thus I-normal ETn-derivations of length shorter than D of the following sequents:

    ∆[WKL], T(f)     ∆[WKL], ∀x∃y(lh(y) = x ∧ f(y) = 1)     ∆[WKL], ∃g∀x f(ḡ(x)) = 1

∀-inversion and the induction hypothesis yield I-normal ETn-derivations of:

    ∆, T(f)     ∆, ∃y(lh(y) = c ∧ f(y) = 1)     ∆, ∃x f(ū(x)) ≠ 1

with new parameters c and u. ∃-inversion yields new numerical terms t and s and I-normal

ETn-derivations of

    ∆, T(f)     ∆, lh(t[c]) = c ∧ f(t[c]) = 1     ∆, f(ū(s[u])) ≠ 1

Note that s and t may contain further parameters but u does not occur in t. Now t describes sequences of arbitrary length, all of which are in f and not necessarily forming a branch, and f(ū(s[u])) ≠ 1 expresses the well-foundedness of f. So f is a binary tree that is also well-founded, containing sequences of arbitrary length. Appealing to a recursion-theoretic result due to Howard, s can be majorized by a numerical term s* in the language of En not containing u.89

Letting t[s*] be the binary sequence t_0, . . . , t_{s*−1} and defining u* as

    u*(n) = t_n   if n < s*
    u*(n) = 0     otherwise,

we have that ū*(s*) = t[s*]. Since f is provably a tree and s* is a bound for s, it follows that there is a derivation of

    ∆, f(ū(s*)) ≠ 1.

Since ū*(s*) = t[s*], substituting u* for u in the aforementioned derivation yields an I-normal

ETn-derivation of

    ∆, f(t[s*]) ≠ 1.

Substituting s* for c in the derivation of ∆, lh(t[c]) = c ∧ f(t[c]) = 1 above yields a proof of

    ∆, f(t[s*]) = 1.

89See Sieg [1991], p. 429.

An application of cut yields the desired I-normal derivation of ∆:

    ∆, f(t[s*]) = 1     ∆, f(t[s*]) ≠ 1
    ____________________________________
                   ∆                   ∎

Chapter 7

On Penelope Maddy’s Defending the Axioms: A critical study

The notion of a foundation for mathematics is vague. One can distinguish between what might be called Hilbertian Foundationalism and Naturalistic Holism. In light of Gödel’s results, Foundationalism has largely gone out of style. The lesson from Gödel is commonly taken to be that the goal of the Hilbert Program is simply an outdated and unattainable ideal. In its place has emerged the holistic picture whereby a foundation for mathematics is to be understood as a way to bring the seemingly disparate branches of mathematics together in a unifying way. The sharp contrast between Naturalistic Holism and Hilbertian Foundationalism connects this chapter to the rest of the dissertation by illustrating how, to many, the philosophical steam has left the Hilbertian machine. We look at one such brand of Naturalistic Holism due to Penelope Maddy centered on set theory as the foundational arena. Maddy’s aim is to understand the proper grounds for the introduction of sets and set-theoretic axioms, as well as a justification of set-theoretic practice. After considering Maddy’s account of how such an understanding proceeds, we provide critical commentary. Central to the discussion is the notion of mathematical depth. We end with some remarks on mathematical methodology followed by a brief discussion of the a priori and its role in bridging the gap between mathematics and philosophy.

7.1 Introduction

In thinking about mathematics several questions are bound to arise. What are mathematicians doing? How are they managing to do it? What is the appropriate methodology? What does it mean to introduce a mathematical concept? Under what conditions is it appropriate to introduce a mathematical concept? Do mathematical objects exist? If so, how? Are

mathematical statements true? What is objectivity in mathematics? Of course, answers to these questions depend largely on who is asking them. The practicing mathematician will presumably answer this differently than the mathematical logician, who will presumably answer it slightly differently than the philosopher of mathematics. The question then becomes: Who, if any, are right? Or is there even a right answer? A first place one might look is to the historical development of mathematics. Penelope Maddy, for example, contends that in order to answer these questions one must begin with an historical investigation of how pure mathematics rose out of natural science. Maddy takes the historical development of mathematics to show a ‘reversal of philosophical fortune’: initially mathematics was taken to be a paradigm of knowledge and science mere opinion; on the later view, science itself is the best knowledge that we have and mathematics is in need of justification. The view that scientific method is the best arbiter of matters metaphysical and epistemological is loosely called Naturalism. Naturalism is perhaps most notably identified with Quine. Roughly, for Quine, since the goal of science is to describe reality, science is the best arbiter for matters metaphysical and epistemological. As he puts it, naturalism is “the recognition that it is within science itself, and not in some prior philosophy, that reality is to be identified and described.”1 Science, though not infallible, is justified by means of observation and the hypothetico-deductive method. As concerns philosophical inquiry, the idea, roughly, is that philosophy is to answer to science, not the other way around.2 This is not to say that philosophy is totally divorced from science; indeed it is largely part of the scientific enterprise.
But the point is that there is “no extrascientific method of justification [that] could be more convincing than the methods of science."3 The scientific method is itself the best method that we have for understanding the world. As such, why think that our epistemic endeavors ought to fall outside the scope of science? There is no better way for us to inquire into knowledge and belief than science

1Quine [1981], p. 21. 2Quine [1975], p. 72. 3Maddy [2005], p. 438.

itself. Moreover, it applies to the justification of its own methodology – science itself justifies science. [W]hat better justification could we have to believe in the most well-confirmed posits of our best scientific theory than the fact that they are the most well-confirmed posits of our best scientific theory?4 As for how belief revision ought to proceed, Quine famously uses the image of Neurath’s ship. Just as the rebuilding of Neurath’s boat is to take place while he is still afloat in it, the philosopher and scientist must also rebuild the framework while still immersed in it. Such a picture leads to a form of holism, Quine’s famous “web of belief.” One of the maxims of Quinean belief revision is to modify the web at the edges, disturbing the whole as little as possible. The idea is similar to common scientific practice. Given a set of mathematical axioms one is committed to various mathematical theorems, e.g. different solutions to differential equations. This, coupled with different initial and boundary conditions, along with a set of scientific hypotheses, will lead the scientist to make various predictions about the observable (measurable) magnitudes. These are then compared with the evidence that results from the scientist’s observations and other various auxiliary assumptions. If the predictions and the evidence lead to contradiction, then something has to give. Rather than give up the entire scientific enterprise, the scientist ends up rejecting one or more of the hypotheses.5 For Quinean Naturalism, questions of metaphysics and epistemology are to be answered within science itself. Metaphysics naturalized tells us that science is our guide to what exists and what it is like. Scientifically, for Quine, this commits us to the existence of atoms, as well as medium-sized objects, etc. Mathematically, it tells us that numbers and sets and functions exist. 
This is largely because they are taken as indispensable for the scientific enterprise; “We are committed to the existence of mathematical objects because they are indispensable to our best theory of the world and we accept that theory.”6 Notice, though, that given that science is the ultimate arbiter of mathematics, and given that only part of mathematics is 4Maddy [1997], p. 11. 5For an interesting account of the logic and science of belief revision, see Tennant [2012]. 6Maddy [1997], p. 30.

indispensable for science, only part of mathematics is justified by science. This immediately raises three questions for the Quinean naturalist: What entities are indispensable? What are they grounded on? What about the rest?7 The rise of pure mathematics signaled a shift in focus, one away from trying to describe nature, to one where mathematics is free to pursue whatever concepts and structures are considered to be of mathematical interest. As Cantor put it, “The essence of mathematics lies in its freedom.”8 Given its freedom, then, and given the shift away from a grounding in applications, it is not clear that science is the proper guide to mathematical practice. And if science is not the proper guide to mathematical practice then what is? That is,

[h]ow can we properly determine if a new sort of entity is acceptable or a new method of proof reliable? What constrains our methodological choices?9

Inspired by the Quinean picture but sensitive to these concerns, Maddy has developed a form of naturalism focused on mathematics, instead of science.10 On this picture, mathematics is to be measured on its own terms, and is justified by means of actual mathematical practice; there is no need for an extra-mathematical foundation. For Maddy the proper methodology with which to approach the questions mentioned at the beginning of this chapter is what she calls Second Philosophy. Second Philosophy is the idea that our investigations about the world and the typically “philosophical” questions that arise about mathematics ought to proceed by roughly mathematical methods. That is, we ought to approach such questions as an “active participant” from within the vantage point of mathematics. Mathematics and philosophy ought to proceed hand-in-hand. This naturalistic view aligns most closely with the practicing mathematician. This is in contrast to what she calls ‘First Philosophy’, whereby philosophical inquiries are undertaken in complete isolation from mathematical practice.

7For an interesting answer to the first two of these questions see Feferman [1993c]. 8Cantor, as quoted in Maddy [2011]. 9Maddy [2011], p. 31. 10See, e.g., Maddy [1990], [1997], [2007], [2011] and Feferman, et al. [2000]. Since her views on certain metaphysical and epistemic issues have changed over time, the view presented here will be the one consistent with her most recent case in [2011].

The rest of this chapter is broken down as follows. In §7.2, I lay out the particular form of Naturalism espoused by Maddy in her most recent work on the subject in Maddy [2011]. In §7.3, I revisit her proposal, offering critical remarks. It is my contention that while there are various novel features of Maddy’s proposal, as a whole it fails to be convincing. Many of my complaints center around her appeal to an unclear and unhelpful account of mathematical objectivity, as well as a disagreement at the level of methodology and on what is required of a foundation for mathematics. I end by sketching a modified proposal that attempts to get clear on the epistemology of mathematics in light of my arguments at the end of the last chapter together with my criticisms of naturalism in mathematics and the emergence of set theory as an overarching foundation.

7.2 The fall of Hilbertian Foundationalism and the rise

of Maddyan Naturalism

Part of the development of pure mathematics included an emphasis on rigor through axiomatization, notably in the work of Dedekind and Hilbert. The new picture of mathematics was, to use a phrase from Wittgenstein, a motley of different structures built upon an axiomatic foundation. The shift toward axiomatization was a hallmark of rigor because the axioms, along with precise logical rules, would allow (in theory) for a clear and precise characterization of the subject, as well as the ability to compare different structures and study their relations. When thinking about and justifying a branch of mathematics one eventually comes to basic statements that are beyond further justification and that are taken to characterize the theory in question. These axioms are the bedrock upon which the theory is founded. The resulting theory, then, is the result of the consequences of these fundamental statements.

Two structures that are of particular interest for our purposes are N and V , the natural numbers and the cumulative hierarchy of sets, respectively. Their canonical axiom systems

are Peano Arithmetic (PA) and Zermelo-Fraenkel Set Theory with the Axiom of Choice (ZFC), respectively. The traditional view of the axioms is that they are self-evident and constitutive of the concepts of the theory. Initially there was great optimism that in virtue of these defining characteristics of the axioms all of their consequences would be derivable within the theory. As we will see, this optimism was misplaced, as there are natural statements of the theory that are independent of the axiom systems set up. Following Shapiro [1991], call foundationalism the view that “it is possible and desirable to reconstruct mathematics on a secure basis, one maximally immune to rational doubt.”11 On this view, one seeks to secure mathematics on unshakeable principles. Mathematics, the idea goes, is a paradigm of certainty. As we’ve seen, this sort of foundationalism was central to one of the most ambitious philosophical projects of the twentieth century: Hilbert’s Program. Our concern is what will provide the methodological guidelines and justifications for mathematics. Given the foundational role of set theory, one might think to focus our investigation there. Maddy does just this. Of course it needs to be asked whether the investigation carries over into the more synthetic areas of mathematics. By ‘synthetic’ I do not mean its use in the analytic/synthetic distinction. I mean it to describe taking the various branches as they are when standardly investigated by mathematicians working in those branches, taking the axioms for the theories as primitive and taken at face value, and taking the objects of study as sui generis, not in terms of their set-theoretic counterparts. Unless one has reason to think that the investigation carries over into these more synthetic areas, answering the questions for set theory will not suffice. Maddy claims that what she says can be extended to all of mathematics. We address whether it really can below. 
For now, we follow Maddy for ease of understanding her position. To reiterate, the Second Philosopher is attempting to understand the proper grounds for the introduction of sets and set theoretic axioms, as well as a justification of set-theoretic

11Shapiro [1991], p. 25. It should be noted that Shapiro rejects foundationalism.

practice. Given the mantra of the Second Philosopher, the natural place to look, for Maddy, is to set theory itself. As for the introduction of sets, the recurring theme is that in the historical development of set theory, sets were introduced as effective means toward various explicit and concrete mathematical goals.12 Justifying the addition of new axioms typically proceeds by appealing to either intrinsic or extrinsic reasons. Intrinsic reasons appeal to what is intrinsic in the concept set. Extrinsic reasons arise from examining the consequences of adopting a given axiom. Given the independence results mentioned above, the current strategy for justifying the choice of new axioms has been largely extrinsic. Much of Maddy’s project is to justify this appeal to extrinsic reasons. Given that for her the goal of set theory is to provide a foundation in the sense mentioned above, as well as to develop the most powerful theory it can, the adoption of many of these axioms and the extrinsic reasons for doing so count as suitably effective means for achieving the desired mathematical goal. So her answer to what the proper set-theoretic methodology is, is simply to look to what set-theorists actually do. On this picture, sets are introduced to satisfy certain set-theoretic goals, and axioms are adopted for a series of intrinsic and extrinsic reasons in line with these goals. Maddy gives several historical examples to back this claim. She notes Cantor’s introduction of the notion of set as an effective means of extending understanding of trigonometric representations, how Dedekind utilized sets to achieve representation-free definitions in non-constructive , how Zermelo justified the Axiom of Choice because of its utility in analysis, , algebra, etc., and how modern set-theorists appeal to the Axiom of Projective Determinacy in attempts to answer problems in analysis and set theory.13 Nonetheless, one can still ask: what exactly are set theorists doing? What is the status of the subject matter of set theory? Do sets actually exist? Are set theorists delivering a body of truths? Answering these questions becomes particularly pressing when we consider the various independence phenomena found throughout mathematics and logic. Given that

12Maddy [2011], pp. 42-45. 13See Maddy [2011], §2.2.

the Continuum Hypothesis, for example, cannot be decided by the current axioms of ZFC extended by any large-cardinal axioms, it is natural to wonder whether it has a determinate truth value. And while assuming the existence of various large cardinals does settle some of the other questions, is that enough to warrant their postulation?
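For readers who want the independence claim just mentioned made precise, it can be recorded as follows. (This is the standard formulation from the set-theoretic literature, not Maddy’s own; the Levy–Solovay observation is supplied here only for context.)

```latex
% The Continuum Hypothesis:
%   CH: 2^{\aleph_0} = \aleph_1.
% Gödel (1940, via the constructible universe L) and Cohen (1963, via forcing)
% jointly yield:
\mathrm{Con}(\mathsf{ZFC}) \rightarrow \mathrm{Con}(\mathsf{ZFC} + \mathrm{CH})
\qquad \text{and} \qquad
\mathrm{Con}(\mathsf{ZFC}) \rightarrow \mathrm{Con}(\mathsf{ZFC} + \neg\mathrm{CH}).
% Moreover, by Levy--Solovay (1967), small forcing preserves the standard
% large-cardinal axioms, so ZFC plus such axioms likewise decides neither CH
% nor its negation.
```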

7.2.1 Thin Realism vs. Arealism

The traditional line on such questions has been a retreat to a Platonistic metaphysics. In fact, Solomon Feferman thinks set theory requires a Platonic realism:

Philosophically, set theory – even in its “moderate” form given by Zermelo’s axioms – requires for its justification a strong form of Platonic realism. This is not without its defenders, most notably Gödel (1944 and 1947/1964) (cf. also (Maddy 1990)). For its critics, however, the following are highly problematic features of this philosophy:

(i) abstract entities are assumed to exist independently of any means of human definition or construction;
(ii) classical reasoning (leading to non-constructive existence results) is admitted, since the statements of set theory are supposed to be about such an independently existing reality and thus have a determinate truth value (true or false);
(iii) completed infinite totalities and, in particular, the totality of all subsets of any infinite set are assumed to exist;
(iv) in consequence of (iii) and the Axiom of Separation, impredicative definitions of sets are routinely admitted;
(v) the Axiom of Choice is assumed in order to carry through the Cantorian theory of transfinite cardinals.14

It’s clear that (ii)-(v) (as far as Feferman is concerned) rest on (i), so let’s focus on that. The idea is that some objective reality is the intended subject matter of set theory and it is the job of set-theorists to describe this Platonic realm of sets. Moreover, because the structure exists independently of the set-theorist, there is a fact of the matter as to whether certain statements about the structure, e.g. the Continuum Hypothesis, are true or false.15 The first part of this view adopts a realism in ontology, whereas the second adopts a realism in truth value. To borrow Maddy’s terminology, call this view Robust Realism. I take it that

14Feferman [1993c], pp. 287-288. 15Some, though, such as Neil Tennant, might think that this is a fallacious move.

the idea is that the realism in ontology is what secures the realism in truth-value. Indeed, typically these two realisms are taken as going hand in hand.16 The standard complaint about realism in ontology was most famously raised by Benacerraf. Roughly the idea is that if (i) is true, then a story is owed as to how it is that we have epistemic access to the abstracta, and it’s hard to see how any such naturalistic story could be satisfactory. There is another problem, though, specifically for the naturalist. Recall that the naturalist contended that to answer set-existence questions one ought to assent to set-theoretic methodology. Given this reliance on set-theoretic methodology, postulating the existence of Platonic abstracta is problematic because it could be the case that despite what set-theorists think, reality is simply uncooperative. Given these issues, Maddy is motivated to question whether such an account is even required. As she says,

the Second Philosopher’s inclination is to think that. . . if Robust Realism questions the cogency of apparently sound mathematical reasoning, her guess is that the fault lies with Robust Realism, not the tried-and-true ways of set theory.17

Of course now the burden is on Maddy to explain (1) what exactly sets are like and (2) why it is that set-theoretic practice is itself reliable. In answer to question (1), Maddy’s Second Philosopher is

naturally inclined to entertain the simplest hypothesis that accounts for the data: sets just are the sort of thing set theory describes; this is all there is to them; for questions about sets, set theory is the only relevant authority.18

This seemingly insubstantial view of sets Maddy calls Thin Realism. Maddy says that much of her thinking about Thin Realism was inspired by various remarks of John Burgess and John Steel. For example, she quotes Steel as saying,

Realism in set theory is simply the doctrine that there are sets. . . Virtually everything mathematicians say professionally implies there are sets. . . As a philosophical framework, Realism is right but not all that interesting.

And shortly after,

Both proponents and opponents [of realism] sometimes try to present it as something more intriguing than it is, say by speaking of an ‘objective world of sets’.19

16Though see Tennant [1997] for a nice discussion of the four possible combinations. 17Maddy [2011], p. 59. 18Maddy [2011], p. 61. 19Maddy [2011], p. 60.

The latter quote is supposed to distinguish her view from the Robust Realism characterized above. The idea presumably is that instead of beginning with some robustly existing ‘objective world of sets’ and then proceeding from there, we need to start with

[a] more sophisticated realism, one accompanied by some self-conscious, metamathematical considerations related to meaning and evidence in mathematics20

Thin Realism still considers set theory to be describing properties of an objectively existing reality. The primary difference between Thin Realism and Robust Realism is that whereas Robust Realism

requires a non-trivial account of the reliability of set-theoretic methods, an account that goes beyond what set theory tells us; for the Thin Realist, set theory itself gives the whole story; the reliability of its methods is a plain fact about what sets are.21

That is, the Thin Realist

holds [that] the set-theoretic methods are the reliable avenue to the facts about sets, that no external guarantee is necessary or possible.22

It might appear then that sets are constituted by set-theoretic methods. Of course this cannot be the case for the Thin Realist since this would sin against the idea that sets are objective, independent entities. Given that sets are introduced for explicitly mathematical purposes, and given their success in aiding mathematical inquiry, the Thin Realist simply balks at the idea that her methods could be mistaken. As she puts it, “there is no room for a radical epistemological gap between sets and set-theoretic methods.”23 This is not to say that anything goes. There must be something that guides the set-theoretic methods in their tracking of sets. So the naturalist still owes us a story of what it is that grounds our mathematics and makes our set theory reasonable. We want to know what justifies our set-theoretic practice and what it is that gives us good reason to think that we’re tracking truth. That is, we want to know what “underlies the justificatory methods of set theory... [and gives] us reasons to believe what we believe.”24 Part of the answer will rely on intrinsic justification, i.e. those concepts that are contained in the concept set; a large part of the justification

20Steel, as quoted in Maddy [2011], p. 61. 21Maddy [2011], p. 63. 22Maddy [2011], p. 63. 23Maddy [2011], p. 75. 24Maddy [2011], p. 79.

for Zermelo’s axioms rests upon what is taken to be intrinsic in the concept of a set. But given the freedom that characterizes mathematics, and given the independence phenomena, many think this cannot be the whole story. Indeed many of the large cardinal axioms have been proposed because of their consequences, i.e. for explicitly extrinsic reasons. But what is it that guides this? For Maddy the answer is mathematical depth: “[W]hat guides our concept-formation, beyond the logical requirement of consistency, is the way some logically possible concepts track deep mathematical strains that others miss.”25 It is depth that “constrains our set-theoretic methods.”26 Now Maddy is not entirely clear about what exactly depth is. Her use of the term is rather vague, being left up to the reader (or more accurately the mathematician) as something that becomes apparent with mathematical experience. Indeed she even goes so far as to claim that providing a general account of depth is unlikely to even be productive.27 Instead she uses similar phrases to get at the idea: fruitfulness, effectiveness, importance, and productivity. She is clear, however, that depth is objective, so judgments of mathematical depth are not subjective: It also bears repeating that judgments of mathematical depth are not subjective: I might be fond of a certain sort of mathematical theorem, but my idiosyncratic preference does not make some conceptual or axiomatic means toward that goal into deep or fruitful or effective mathematics; for that matter, the entire mathematical community could be blind to the virtues of a certain method or enamored of a merely fashionable pursuit without changing the underlying facts of which is and which is not mathematically important.28

So what constrains mathematical theorizing, and what ensures that mathematics is objective, is depth; sets just are the sorts of things that set theorists are describing and because of the reliability of set-theoretic methods set theorists can be confident that the result of their labor is truth. Maddy nicely sums up the view of her Thin Realist: Any particular extrinsic justification may fail to meet its mark, for reasons ranging from a straightforward error in what follows from what to a deep misconception about the true math- ematical values in play. We can be uncertain whether or not a given set-theoretic posit will pay off, and therefore uncertain about whether or not it exists, but if it does pay off, there’s no longer any room for doubt; we can be uncertain that we’re getting at the deepest and most

25Maddy [2011], p. 79. 26Maddy [2011], p. 81. 27Maddy [2011], p. 81. 28Maddy [2011], p. 81.

fruitful theory of sets, and therefore uncertain about whether or not our axiom candidate is true, but if we are succeeding, there’s no further room to doubt that we’re learning about sets. . . [T]here is a well-documented objective reality underlying Thin Realism, what I’ve been loosely calling the facts of mathematical depth. The fundamental nature of sets (and perhaps all mathematical objects) is to serve as means for tapping into that well; this is simply what they are. And since set-theoretic methods are themselves tuned to detecting these same contours, they’re perfectly suited to telling us about sets; they lie beyond the reach of even the most radical skepticism. This, I suggest, is the core insight of Thin Realism.29

The Thin Realist proclaims that the mathematician is after truths. But is this really so? The Second Philosopher’s journey, remember, started with the scientific enterprise and developed from there. She began with an investigation of the world and what it is like, developed various scientific methods and came to appreciate the indispensable role of mathematics for doing science. And even as mathematics became more pure and separated itself from science in method and application, it was this role in various scientific disciplines such as physics, chemistry, etc., that led the Thin Realist to buy into the existence of sets and the truth of mathematics. After all, these branches of science are after truth and describe things that exist. How then could one explain the applicability of mathematics to these disciplines if it was not itself also after truth and existence? Nonetheless, one might take an alternative view, stressing as uniquely important the difference in method that mathematics has from science. And given this stark difference in method, and given the failure of application in many cases, one might conclude, quite differently than the Thin Realist, that whatever the merits of the methods of pure mathematics, it would be a mistake to claim that pure mathematics is after the truth about sets. That is, given that mathematics is different enough from these branches of science, one might just as well question whether mathematical objects (e.g. complex numbers, vectors, etc.) do in fact exist and whether mathematicians are in the business of tracking truth. This is the route the Arealist takes. The difference, then, between the Thin Realist and the Arealist concerns how seriously one takes the role of mathematics in empirical science. For the Arealist, like the Thin Realist, the methods of set theory are constrained by

29Maddy [2011], p. 83.

set-theoretic methods, and, as history has shown, the development of set theory was for distinctively mathematical purposes. Set theory is successful and worthwhile, yet no questions of existence or truth need enter the picture. Nonetheless, two questions then arise for the Arealist: What exactly are mathematicians doing if they are not after truth/existence? And if sets do not actually exist, then it is up to her to explain the application of mathematics while simultaneously holding that it is not after truth/existence. This second question concerns Frege’s applicability requirement. Namely, can Arealism account for the application of mathematics without regarding it as true? The first part of the Arealist’s answer is to note that the mathematics that is used in science is merely a model of reality. Mathematical abstracta can be used to model physical phenomena without taking the model itself as literally true of reality; it matches it in some respects but not others. It’s not clear that this is enough, though. Michael Liston [2007], for example, thinks that there has to be more that mathematics is doing than simply describing models. Advanced mathematics must also be used in determining which models work and why. And if mathematics is being used to determine which properties of the models are required, then surely the mathematician must also believe the mathematics that is being used.30 The Arealist differs from the Thin Realist on the question of whether sets exist and whether set theory is after a body of truths. But the two views do agree “at the level of method.” That is, they both agree on set-theoretic methodology and on the criteria upon which sets are to be introduced. Maddy takes this methodological agreement to “reflect a deeper metaphysical bond: the objective facts that underlie these two positions are exactly the same, namely, the topography of mathematical depth. . . 
”31 It is the phenomena of mathematical depth that guide the set-theorists in what they are doing; and this remains the same whether sets exist or mathematicians are after truth. In this way set-theorists are responding to the objectivity of depth, and it is this that guides the development of set theory.

30This is a loose recounting of Maddy’s reconstruction of Liston’s claims. See Maddy [2011], pp. 90-92. 31Maddy [2011], p. 100.

The disagreement between the two depends on the philosophical reflection on the emergence of set-theoretic practice. As above, the Thin Realist takes sets to exist and set theory to be after truth because of the connection that mathematics has with empirical science. Mathematics is integral to the overall scientific endeavor of the Second Philosopher. Her scientific endeavors take mathematics to be in the same pursuit of truth and existence as they are.

Thus the divergence between the second-philosophical Arealist and the second-philosophical Thin Realist comes down to this: as the Second Philosopher conducts her inquiry into the way the world is, beginning with her ordinary methods of perception and observation, theory- formation and testing, she’s eventually faced with the effectiveness of pure mathematics and elects to add it to her ever-growing list of investigations; she also recognizes that the appropriate methods are different and that the objects studied are different; the point at issue hinges on what she concludes from this. If the new objects seem a bit odd – non-spatiotemporal, acausal, etc. – but still enough like the old – singular bearers of properties, etc. –, if the new methods seem a bit odd, but still of-a-piece with the old, then she concludes that she’s made a surprising discovery, that the world includes abstracta as well as concreta. If, on the other hand, she regards the new methods and would-be objects as sharply discontinuous with what came before, she has no grounds for thinking pure mathematics is true, so she concludes that this new practice–valuable as it is–is not in the business of developing a body of truths.32

The upshot of all of this is that methodologically, looking at what set-theorists are doing, whether you think sets exist and whether set theorists are really after truth is in some sense irrelevant, since it does not change what the mathematician is licensed to do.

The Arealist does not disagree with what mathematicians say qua mathematicians, but when they branch out into questions of truth and existence external to mathematics proper–what is the nature of human mathematical activity? what is its subject matter and how do we come to know about it? and so on–then she reserves her right to differ.33

Maddy admits that “[i]t’s hard not to think that one must be right and the other wrong, that either sets exist or they do not, that set theory is a body of truth or it is not. . . ”34 In an attempt to claim that this is mistaken she appeals to the work of Mark Wilson [2006] and his notion of tropospheric complacency. The idea is that we tend to think that our concepts

mark fully determinate features or attributes, that there is a determinate fact of the matter as to whether they apply and where they do not, that this is so even for questions we have not

32Maddy [2011], pp. 101-102. Neil Tennant has pointed out to me that this picture really only began with Newton, and was made possible only because the needed (synthetic) mathematics (that of kinematics and differential equations, mainly) was already (and at long last) in place. Historically, Maddy has the cart before the horse. Newton furnished the math; then he did the physics. 33Maddy [2011], p. 103. 34Maddy [2011], p. 105. Note that these could be orthogonal questions (e.g. for Hartry Field).

yet been able to settle one way or the other.35

Wilson’s claim is that this is mistaken. He uses the example of ‘ice’ to illustrate what he means. Ice is frozen water. Under certain circumstances water freezes with a crystalline structure. This is what (most) chemists consider ice. But there are many ways in which water can freeze, and freeze with a different structure. For example, when water is cooled quickly enough, the frozen water lacks the crystalline structure that it normally has. Is this frozen block still ice? For some chemists, the answer is, technically, no. And yet given that it is frozen water, for others it would still be considered ice. So who is right? The moral of all of this is that there is nothing in our language, nothing in the underlying chemical facts, that settles the answer one way or the other. Maddy wants to extend this broadly Wilsonian picture to the case of “truth”, “existence”, etc. When asking whether mathematics ought to count as of-a-piece with physics, chemistry, and so on, whether it is a body of truths, whether mathematical objects exist, these questions have no more determinate answers than the question of whether the non-crystalline solidified water counts as ice.36 All that matters in the ice case is that we know the various ways that water can solidify and the similarities and differences of the various structures that result. The words that we use, our ways of speaking, are irrelevant to the matter. In the mathematics case, so the story goes for Maddy, when understanding the ways in which pure mathematics arose out of science and how it now differs, once we “understand the many ways in which it remains intertwined with those sciences, how its methods work and what they are designed to track–once we know all these things, what else do we need to know? Or better, what else is there to know?”37 The upshot of all of this is that

. . . there is no substantive fact of the matter to which our decision between Thin Realism and Arealism must answer. The application of ‘true’ and ‘exists’ to the case of pure mathematics is not forced upon us–as it would be if Thin Realism were right and Arealism wrong–nor is it forbidden–as it would be if Arealism were right and Thin Realism wrong. Rather, the two idioms are equally well-supported by precisely the same objective reality: those facts of

35Maddy [2011], pp. 105-106. This quote is from Maddy, describing Wilson’s picture, and not a quote from Wilson himself. 36Maddy [2011], pp. 111-112. 37Maddy [2011], p. 112.

mathematical depth. These facts are what matter, what make pure mathematics the distinctive discipline that it is, and that discipline is equally well described as the Thin Realist does or as the Arealist does. Once we see this, we can feel free to employ either mode of expression, as we choose–even to move back and forth between them at will. The proposal, then, comes to this: Thin Realism and Arealism are equally accurate, second-philosophical descriptions of the nature of pure mathematics. They are alternative ways of expressing the very same account of the objective facts that underlie mathematical practice.38

7.2.2 New Axioms

Suppose that Maddy is right that Second Philosophy does not need to decide between Thin Realism and Arealism. What does this tell us about continued mathematical practice, in particular efforts to “overcome” various independence results by adding new axioms? Throughout we have been alluding to the difference between intrinsic justification and extrinsic justification. It was noted that part of Maddy’s goal is to justify the adoption of new axioms based on extrinsic appeal. It is time to make this all precise and look closely at how it affects mathematical methodology, looking at specific examples. What is meant by an intrinsic reason for adopting an axiom? Several characterizations are typically given. One often speaks of the axioms as intuitive, as being self-evident and obviously true, as “forcing themselves upon us.” This is the view of Gödel, on which such justification is the result of the mathematical intuition that we have. One sometimes also talks about the axioms being part of the meaning of the concept under consideration (in the case of set theory as being part of

the concept of set).39 An intrinsic justification for an axiom, then, is that it is implicit in the concept under consideration. Indeed, many of the well known axioms of the theories important for our purposes, namely PA and ZFC, do satisfy these requirements. Given any familiarity with the notion of a natural number, the axioms of PA do seem to simply force themselves upon us and they do so because they are implicit in the concept of natural number. Similarly, the axioms of ZF do seem intuitively obvious for the very reason that they are implicit in the

38Maddy [2011], p. 112. 39Though see someone like Neil Tennant who contends that the concept set is fully captured by the I- and E-rules for the abstraction operator {x | . . . x . . . }. These rules yield Extensionality, but make no ontological commitment.

concept of set. In contrast to adopting axioms based solely on reasons intrinsic to the concepts under investigation, one can also appeal to external reasons. Gödel characterized these reasons as supported by “hitherto unknown principles” that would become evident after gaining deeper insight into “the concepts underlying logic and mathematics” or by the “abundan[ce] of their verifiable consequences.”40 This picture is thus holistic. It leads to new ‘axioms’ because of their consequences, not because of an intrinsically conceptual case that renders them clearly true (i.e. as ‘conclusion’ of some kind of argument). As he writes,

[E]ven disregarding the intrinsic necessity of some new axiom, and even in case it had no intrinsic necessity at all, a decision about its truth is possible also in another way, namely, inductively by studying its ‘success’, that is, its fruitfulness in consequences and in particular ‘verifiable consequences, i.e. consequences demonstrable without the new axiom, whose proofs by means of the new axiom, however, are considerably simpler and easier to discover, and make it possible to condense into one proof many different proofs. The axioms for the system of real numbers, rejected by the intuitionists, have in this sense been verified to some extent owing to the fact that analytical number theory frequently allows us to prove number-theoretical theorems which can subsequently be verified by elementary means. A much higher degree of verification than that, however, is conceivable. There might exist axioms so abundant in their verifiable consequences, shedding so much light upon a whole discipline, and furnishing such powerful methods for solving given problems (and even solving them, as far as that is possible, in a constructive way) that quite irrespective of their intrinsic necessity they would have to be assumed at least in the same sense as any well-established physical theory.41
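One standard way to make ‘verifiable consequences’ precise, in the spirit of the proof-theoretic reductions discussed at the end of the last chapter (the gloss here is mine, not Gödel’s own formulation), is conservativity for a restricted class of sentences:

```latex
T + A \text{ is } \Pi^0_1\text{-conservative over } T
\quad\Longleftrightarrow\quad
\text{for every } \Pi^0_1 \text{ sentence } \varphi:\;
T + A \vdash \varphi \;\Rightarrow\; T \vdash \varphi.
```

A Π01 sentence ∀x θ(x), with θ decidable, is ‘verifiable by elementary means’ in the sense that each numerical instance can be checked by computation. Gödel’s suggestion is then that a new axiom A can earn inductive support by yielding far shorter or more unified proofs of such φ, even when every such φ is, laboriously, provable without A.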

Note the similarity of this to the discussion at the end of the last chapter with regard to the instrumental role of certain abstract principles and how proof-theoretic reductions can be seen as increasing one’s epistemic confidence that such principles are true. Given that for Maddy’s Second Philosopher sets and axioms are introduced for purely mathematical ends, it is no surprise that she thinks that one need not limit oneself to intrinsic reasons that are obviously “natural” continuations of the system as set up so far. For her, “absolute clarity” is not required. We ought not limit ourselves to what is inherent in the concepts under examination. In general, attempts to limit mathematics are a bad idea. To make a bad pun, mathematicians should, in a sense, be able to go wherever their noses lead. Because remember, for Maddy, “the essence of pure mathematics is its freedom.”

40Gödel [1947/64], pp. 182-183. 41Gödel [1947/64], pp. 182-183.

In modern mathematics new axioms are adopted all the time given their usefulness and “interest”. So adopting new axioms in order to get all the new and different mathematically interesting structures should be allowed, regardless of “clarity”, if the adoption helps the mathematician further her goals, assuming, of course (at least) consistency. For Maddy’s Second Philosopher, getting clear on what exactly is and is not part of our existing concept is not the primary goal. Why should the set theorist care if large cardinals are contained in the concept of a set? It should be clear at this point just how sharply Maddy’s view differs from that of the Hilbertian Foundationalist described above. There is no need for absolute clarity and epistemic security. All that guides mathematical practice is utility. Maddy’s understanding of the foundational role of set theory is markedly different from the Hilbertian. Rather than provide an unshakeable foundation, set theory’s job is one of unification and clarification of the interrelationships between the various mathematical sub-disciplines. And given this, whereas the ultimate justification of a theory has been typically via intrinsic appeal, Maddy thinks that the true justificatory force lies in appeal to extrinsic reasons; intrinsic justification is merely instrumental. Pithily put, the philosophical steam has finally left the Hilbertian machine.

7.3 Reply

For Maddy, mathematics is about the logical connections among concepts. But Maddy’s Second Philosopher, paying close attention to the practices of actual mathematicians, notices that not just anything goes. There is something beyond mere logical consistency that guides what mathematicians do. This is the notion of mathematical depth discussed above. The main novelty of Maddy’s position lies in the morals that the Second Philosopher draws from her understanding that Thin Realism and Arealism are different ways of expressing the same underlying objectivity of mathematics, namely depth. That is, if the Second Philosopher

need not decide between Thin Realism and Arealism because they both describe set-theoretic methodology equally well, then she is led to conclude that, unlike on most standard accounts of what guides and constrains mathematical practice, the objectivity of mathematics does not rely on the existence of mathematical objects or on the truth of mathematical statements.

If Thin Realism and Arealism are equally accurate, second-philosophical descriptions of the nature of pure mathematics, just alternative ways of expressing the very same account of the objective facts that underlie mathematical practice, then we have here a form of objectivity in mathematics that does not depend on the existence of mathematical objects or the truth of mathematical statements, or even on the non-existence of mathematical objects or the rejection of mathematical claims.42

Indeed she thinks that this insight tracks better what mathematicians are actually doing than any appeal to ontology, epistemology or semantics ever could. There is much in Maddy’s account that I agree with. In particular I do believe that mathematics is the study of the logical connections among various concepts. I also firmly agree that consistency is essential. And I also agree that what is actually guiding mathematicians in what they do is something like depth. Whether or not this is objective in the sense of being mind-independent I’m not sure. I’ll touch more on this at the end of the chapter. So her switching the focus from ontology and truth to depth is both interesting and important. I do, however, disagree with many of the conclusions that she draws from these facts. Part of my disagreement stems from what I take to be a fallacious jump from what counts as acceptable mathematical practice to the conclusion that said practice is the only legitimate avenue to pursue. She largely gets into trouble, in my eyes, because of her Second Philosophical methodology, her appeal to actual mathematics in answering philosophical questions, and her failure to distinguish set theory as a foundation from set theory as a particular branch of mathematics. These issues are all interrelated and they will accordingly be mentioned throughout the discussion. In what follows I will lay out my complaints.

42Maddy [2011], p. 116.

7.3.1 The uniqueness of V

One of Maddy’s primary contentions is that the universe of sets, V, whatever it is, is unique. Consider, for a moment, the status of the Continuum Hypothesis. For the Second Philosopher, like the Robust Realist, the Continuum Hypothesis has a determinate truth value, regardless of whether we will ever know what it is. The reason for this is that “‘CH or not-CH’ is a theorem, established by her best methods as a fact about V; therefore CH is either true or false there.”43 There is a footnote in that quote which reads as follows: Again, by ‘V’ I mean the universe of sets that set theory is investigating, which we learn, in the course of that investigation, takes the form of stages Vα, one for each ordinal α. Here I’m presupposing that the second-philosophical set theorist is moved by various mathematical considerations, conspicuously the desire for set theory to serve as a foundation. . . , to seek a unified theory of this single universe V.

But why think that V must be unique? Surely this begs the question against the Continuum Hypothesis lacking a determinate truth value. One could plausibly adopt a pluralist account of the universes of set theory. Rather than think there is one unique structure, V, there could be a multitude, each extending ZFC by adopting different axioms.44 Maddy’s reason for thinking that V is unique appeals to her desire for V to count as a foundation for, and a unification of, all the different branches of mathematics. But nearly all of the mathematics that is being unified lives so low in the set-theoretic hierarchy that it can

be taken care of by Vω+ω.45 This seems to undermine the uniqueness requirement. But even this is too strong. Even if it did not live so low, it still lives in the part of V described by ZFC. So any possible extension will still count as a foundation in her sense. This shows that we need to distinguish set theory as a foundation from set theory as a specific mathematical

43Maddy [2011], p. 63. By ‘best methods’ she presumably means assuming LEM. And so one could also raise issue with her taking Classical Logic as the “right” logic. I take it her doing so is simply because the majority of the set-theoretic community uses classical logic and so her Second Philosopher is committed to it. And, as Tennant [2000a] fn.17 claims, the Second Philosopher in some sense cannot really have it any other way; logical revision, at least of the Dummettian anti-realist sort, is decidedly first-philosophical, and so off the table. More recently, however, Tennant has revised this position. In personal correspondence he has claimed that he thinks that even the Second Philosopher, with her stress on scientific method, could be persuaded to use only Core Logic. See §7.3.4, pp. 197-203 of Tennant [2017]. 44For one view along these lines see Hamkins [2012]. 45Friedman [1971].

subdiscipline. Maddy uses the one to try to justify the other, but it does not work because simply serving as a foundation does not require that V be unique.
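The claim that core mathematics ‘lives low’ in the hierarchy can be made precise in terms of the standard cumulative stages (a familiar sketch, included here only for orientation):

```latex
V_0 = \emptyset, \qquad
V_{\alpha+1} = \mathcal{P}(V_\alpha), \qquad
V_\lambda = \bigcup_{\alpha < \lambda} V_\alpha \ \ (\lambda \text{ a limit ordinal}).
```

The stage Vω+ω, reached after only ω + ω iterations, already contains surrogates for the natural numbers, the reals, and the function spaces of classical analysis, and is itself a model of Zermelo set theory with choice (ZFC minus Replacement); this is the force of Friedman’s observation cited in footnote 45.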

7.3.2 Depth

Perhaps my biggest issue with Maddy’s proposal rests at the heart of it, namely depth. What is depth? Maddy nowhere attempts to make the notion clear, appealing only to cases that seem intuitively deep. To support her claim that depth is what is guiding the mathematician, she appeals to the fact that the mathematician will be “brimming with conviction” when presented with something deep. Such an appeal leaves one in the dark unless equipped with the appropriate mathematical expertise. Not to mention that while there may be agreement among mathematicians as to whether a concept, proof, or theorem is deep, it will be much more controversial as to why it is deep. For example, is projective geometry deeper than Euclidean geometry (especially because of duality)? What do the details of depth look like? Are equivalent definitions equally deep? What about equivalent theorems? Do different proofs of the same theorem count as equally deep? Or can there be asymmetries? Is, for example, Dedekind’s definition of the reals as disjoint pairs of sets of rationals just as deep as Cauchy’s definition in terms of equivalence classes of convergent sequences? And what about natural deduction and the sequent calculus? Are those of equal depth? Nowhere does Maddy hint as to how to answer these questions, and yet, given how central the notion of depth is to her proposal, it seems to me that she needs to provide more guidance than she does, else depth starts to look extremely subjective, something that would completely undercut her proposal. Rather than try to explicate depth by means of concrete, “quantifiable” characterizations – such as a reduction to some type of syntactic analysis, or something of the sort – that would be a step toward dispelling this subjectivity worry, she suggests that providing a precise characterization of depth would be not only difficult, but unhelpful, and appeals instead to colorful language to make her point – depth is for her roughly equated with fruitfulness,

effectiveness, importance, and productivity. Not only do these notions themselves mean different things, thereby giving depth a multiplicitous character, but appealing to them only shifts the burden elsewhere; the elusive nature of depth is simply relocated to other unclear concepts. But making such unclear concepts do so much work is bound to lead to difficulties, and difficulties that I think crucially undermine her enterprise. Take, for example, fruitfulness. Not only does equating depth with fruitfulness beg the question in favor of the set-theoretician’s appeal to extrinsic reasons – assuming fruitfulness is something like wealth of consequences – but it also leads one to question what makes one axiom more fruitful than another. To use a particular example, on Maddy’s story, the determinacy theorist develops the theory of projective sets because it’s “mathematically richer” and of greater depth than what would follow were she to accept that V = L. But why think this? Why not think that adopting V = L is “rich” or “fruitful”? After all, it settles many of the outstanding problems of set theory. And it’s not clear that depth is an all-or-nothing affair. Adopting V ≠ L might count as deep in some ways, whereas adopting V = L might count as deep in other ways. To think that there must be a fact of the matter as to which counts as deeper seems, if not controversial, then at least beyond our epistemic grasp given our failure of logical omniscience. Unless one could see all of the consequences of a set of axioms, how can one be sure that the route they are pursuing is indeed the “objectively deeper” one? All of this is not even to question whether our theory even needs to be as rich as possible. What reason is there for thinking that a theory needs to be “generically complete”? Why think that independent questions even need answering?
After all, the Second Philosopher is supposed to only listen to what set theory tells us, and, as it stands, ZFC is silent on these issues. The answer, I take it, rests on the idea that freedom is the essence of mathematics, and it is simply part of that freedom that encourages one to maximize. The maxim to maximize “ensure[s] a plenitude of objects and structures:”46 [S]et theory should not impose any limitations of its own: the set theoretic arena in which

46Tennant [2000a], pg. 325.

mathematics is to be modelled should be as generous as possible; the set theoretic axioms from which mathematical theorems are to be proved should be as powerful and fruitful as possible. Thus, the goal of founding mathematics without encumbering it generates the methodological admonition to MAXIMIZE.47
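For reference, the axiom V = L figuring in this debate says that every set is constructible: Gödel’s hierarchy L is generated like the Vα stages, but taking only first-order definable subsets at each step (a standard sketch):

```latex
L_0 = \emptyset, \qquad
L_{\alpha+1} = \mathrm{Def}(L_\alpha), \qquad
L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha, \qquad
L = \bigcup_{\alpha \in \mathrm{Ord}} L_\alpha.
```

Here Def(X) is the collection of subsets of X definable over (X, ∈) with parameters from X. V = L settles CH (indeed GCH) and yields a definable well-ordering of the reals, but by Scott’s theorem it rules out measurable cardinals; this is why the maximizer regards it as an impermissible limitation of the set-theoretic arena.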

Such a stance seems to be that of John Steel, when he says that we require not only that axioms be true, but also “as strong as possible”.48 It’s not clear to me why we should think that one must always maximize. Nor is it clear to me why freedom ought to encourage maximization. Indeed, given the appeal to freedom, it seems that mathematicians ought to be free to pursue all lines of inquiry, whether maximizing or not. That is, one ought to be free to pursue the consequences of entertaining V = L as well as V ≠ L. Restricting oneself to the investigation of what follows from the latter seems to go against the liberal credo. But even assuming that maximization is, and should be, the ultimate goal, how can one possibly judge whether one axiom is more maximizing than another? Just as in the case with depth, it seems highly likely that two competing axiom candidates might be maximizing in some respects, but not others. And unless one can see all the consequences of one’s choice of axiom, and unless one has some more precise way of explicating maximization that makes it clear when adopting one axiom is more maximizing (or deep) than another, neither of which it seems to me that we have, then the slogan to always maximize is unhelpful. One might try to make maximization more precise by establishing some connection between it and logical strength. Were one able to do this, then we might be one step closer to overcoming the failure of logical omniscience. But such an avenue does not seem open to Maddy, given that, for example, V = L is of tremendous logical strength but not, according to her, maximizing. Given these complications, what exactly is the relationship between depth and various mathematical goals, for example maximization? Maddy says,

[M]athematical fruitfulness [(depth)] is not defined as ‘that which allows us to meet our goals’, irrespective of what these might be; rather, our mathematical goals are only proper insofar as satisfying them furthers our grasp of the underlying strains of mathematical fruitfulness. In other words, the goals are answerable to the facts of mathematical depth, not the other way ’round.49

47Maddy [1997], pp. 210-211. 48Feferman et al. [2000], p. 422. 49Maddy [2011], p. 82.

So according to this story, the reason we want to maximize is because it is answerable to facts of mathematical depth. But if we can be mistaken about depth, which she admits that we can, then it’s quite possible that in practice this story is backward, i.e. that the reason we think that something is deep is because it aligns with certain mathematical goals that are

present. Take the above example. On Maddy’s story, V = L is (supposedly) rejected because it’s in violation of our goal to maximize, which is itself founded on the idea that V = L is not deep (or at least is not as deep as V ≠ L). But if one is mistaken about depth, then it seems highly plausible that the story gets reversed, namely that the set theorist thinks that

V ≠ L is deep because it is more maximizing than the alternative. That is, V = L is rejected because it’s in violation of depth, which is itself founded on the idea that V = L is not maximizing (or at least is not as maximizing as V ≠ L). Recall that for Maddy set theory is the arena in which mathematics plays. As such she wants it to be as encompassing as possible. Remember also that Maddy claims that everything she said with regard to set theory applies equally to each branch of mathematics. It is time to evaluate whether this claim holds good. It is my contention that it does not, and it seems to me that depth undermines it. Part of what is meant by having set theory serve as the arena in which mathematics plays is that for each of our synthetic areas of mathematics, there will be a set-theoretic surrogate for it, and each branch will “live” in some part of the hierarchy of V. So, for example, a function is considered to be a set of ordered pairs, where ‘ordered pair’ itself takes on the set-theoretic definition of Kuratowski. Of course one must first question the idea that ultimate foundational authority ought to be granted to set theory given how different it is from all the other branches of mathematics, not to mention competing foundations such as category theory. Mathematicians, when investigating the structures that comprise their discipline, do not reason in terms of set-theoretic surrogacy. Mathematicians do not really think that a function, for example, is simply a set of ordered pairs. Or that a point is an ordered pair, or a line a set of ordered pairs, and so on. They reason about the objects taken as sui generis. That is, the geometer

reasons about points, lines, and planes qua points, lines, and planes. And the intuitions that guide that mathematical practice are wholly unconnected with set-theoretical thinking. The results that are arrived at in the different branches are so without any set-theoretic thinking in the background. Of course they might use some set-theoretic notions for ease of presentation, but they need not. There would still be a massive amount of core mathematics done had Cantor never come around. So to use set theory as the foundational authority over all of these disparate branches seems too strong. The philosophy of mathematics should really be addressing those areas, not this highly cooked up set-theoretic arena. The point is that the one-size-fits-all attitude to set theory seems to me to fall seriously short when one considers just how different the various branches of mathematics are. Different areas of mathematics have different vocabularies, concepts, and methods. It is highly artificial to say that mathematics is really about V. As noted above in §7.3.1, all of core mathematics

lives in Vω+ω.50 So to claim that higher set theory and its motivating factors carry over to standard, core-mathematical practice seems far-fetched at best. But what does this have to do with depth? Given the drastic differences between different branches of mathematics, what guarantees that depth will survive set-theoretic translation? It does not seem far-fetched to think that what counts as deep in a set-theoretic guise might be taken as shallow in some core branch. And if set theory is taken as the ultimate authority, then given how few mathematicians actually engage in high powered set theory, very few people would actually have access to true depth. But surely this cannot be right. If these complaints are not damning enough, there is a more general problem for the set-theoretically obsessed naturalist’s appeal to depth. The whole point of appealing to mathematicians was that given their expertise, they are the ones who know what they are talking about, more so than the philosopher. For the Thin Realist, the success of set-theoretic methodology convinces her that sets exist and that set-theorists are tracking truth. Radical

50This is actually probably too strong. For while it’s true that the mathematical objects of ordinary mathematics all have surrogates living in Vω+ω, it’s not clear that all facts about them would be settled by any set theory enjoying a model that contains Vω+ω. Thanks to Neil Tennant for pointing this out.

skepticism about these aspects was, as mentioned above, unthinkable for the Thin Realist. Again, though, it’s not as if anything goes–something must be constraining mathematical practice. And so to ensure that set theory remains objective, the Thin Realist appealed to mathematical depth. But given that the mathematician could be mistaken about what counts as deep, and given that set-theoretic methodology in general (and its goals in particular) are driven by mathematical depth, surely set-theoretic methodology in general (and its goals in particular) could be mistaken. So her reversal of tactic, namely relying on practice and then filling in the metaphysics, seems just as problematic as the Robust Realist’s route. Whereas the Robust Realist gets in trouble because of the postulation of independent abstracta and the worries about epistemic and ontological fit, the naturalist finds herself in an analogous situation with respect to depth.

7.3.3 Thin Realism vs. Robust Realism

This makes me question how significantly different Maddy’s version of Thin Realism is from Robust Realism. To be sure there is at least some difference, given their reversals of explanation. However, in an important sense the appeal to depth strikes me as a falling back into the unwanted territory that is Robust Realism. This point is illustrated by looking back at the original issues that Feferman raises for the Gödelian (Robust) Realist. The primary one of concern to us was that abstract entities are assumed to exist independently of any means of human definition or construction. Above we rehearsed the problems that this posed for Robust Realism and how, motivated by these problems, Maddy suggested her alternative of Thin Realism. But even if her metaphysical story of sets can avoid the issues present in (i) – i.e. that abstract entities are assumed to exist independently of any means of human definition or construction – her account of depth cannot. Given that depth is supposed to be an objective feature existing independently of human definition or construction, and given that depth is itself in some sense metaphysical as well as epistemological, it seems to me that the problems for (i) rear their ugly head again, this time against depth. So, despite

her alternative story, it seems to me that Thin Realism sins against Feferman’s complaints regardless, and that the problems with (i) are simply pushed back to the issue of depth. Maddy does seem somewhat sensitive to concerns of these sorts, both as regards ontology and as regards depth. On the ontological side, one could still ask a metaphysical question about the nature of sets themselves (given their objective, independent existence) and whether set theory could simply be off the mark. Of course, the Second Philosopher would never be in a position to answer such questions, given her confinement to operating within her thin epistemology. But the question could nevertheless be asked. The situation is reminiscent of the Hilbertian finitist who, given his confinement to finitist reasoning, could never establish that the single universal generalization of all the axioms of PRA could count as finitist (assuming, of course, that Tait [1981] is right that finitism ought to be equated with PRA). Just as an evil demon could deceive one about the reliability of one’s perceptual beliefs, could such a demon similarly deceive the set-theorist as to what sets are actually like? This was, recall, essentially one of the charges that Maddy made against Robust Realism. Maddy thinks not, arguing that to think that set theory could enjoy all the virtues that it does and yet have sets be either non-existent or radically different from how they seem to be is simply to “misunderstand the nature of set theory and its subject matter.”51 Setting aside the fact that this seems to beg the question, it does not seem unreasonable to think that such a demon capable of deceiving us about the nature and existence of sets could also be sophisticated enough to deceive in a coherent manner. That is, the whole enterprise would be deception, and a good one. There is no reason to think that such a deception would not also appear to be of mathematical virtue.
And, if the argument I’ve given in the last paragraph is right, then the same sorts of worries that are raised for the Robust Realist’s appeal to an objective world of sets are problematic for the Thin Realist’s appeal to depth. Maddy’s response is to say that even though the complaint may be coherent, it is not enough to worry the Second Philosopher. That is, she grants that an Evil Demon could deceive us wholesale about what follows from what or about where the

51Maddy [2011], p. 75.

deepest, most fruitful strains lie. Still, the coherence of the radical skeptical challenge is not enough to revive Benacerraf-style worries: though it’s hard to see why our set-theoretic methods should track the truth about the Robust Realist’s ontology, they’re clearly well-designed (the Demon aside) to track set-theoretic depth.52

I simply fail to see how this answers the problem. Indeed, such drastic issues as demons are not even needed, given the methodological choices and given the fact that mathematicians can actually be wrong with respect to depth; the whole enterprise could simply be mistaken. This I take to be a deep (no pun intended) issue for Maddy, and one that we will revisit again below in the section on Methodology. In the meantime, though, I note that Maddy says that much of her thoughts about Thin Realism were inspired by various remarks of John Burgess and John Steel. For example, she quotes Steel as saying, Realism in set theory is simply the doctrine that there are sets. . . Virtually everything mathematicians say professionally implies there are sets. . . . As a philosophical framework, Realism is right but not all that interesting.53

And shortly after, Both proponents and opponents [of realism] sometimes try to present it as something more intriguing than it is, say by speaking of an ‘objective world of sets’.54

The latter quote is supposed to distinguish her view from the Robust Realism characterized above and often ascribed to Gödel. Presumably the idea is that instead of beginning with some robustly existing ‘objective world of sets’, and then proceeding from there, we need instead to start with [a] more sophisticated realism, one accompanied by some self-conscious, metamathematical considerations related to meaning and evidence in mathematics.55

The appeal to the quote is a bit confusing, to this author, given her joint commitment to objectivity (granted at the level of depth instead of ontology) and the uniqueness of V . Again, I think it points to the fact that Thin Realism is not really that much different from Robust Realism. She still seems to be stuck with a robust ‘objective world of sets.’ The only difference is the way that she thinks one arrives at it. To avoid the Benacerrafian

52Maddy [2011], p. 116. 53Steel, FOM posting 15 January 1998, as quoted in Maddy [2011], p. 60. 54Steel, FOM posting 15 January 1998, as quoted in Maddy [2011], p. 60. 55Steel [2004], p. 2, as quoted in Maddy [2011], p. 61.

epistemological concerns, as well as the failure of ontological fit that plagues the Robust Realist, she appeals to the authority of the set theorist; we have access to the world of sets because of the reliability of set-theoretic practice. But the world of sets still exists uniquely, and robustly, given how robust the nature of depth is. And depth has to be robust, else it is not clear how one could appeal to it to secure set-theoretic methodology. But, as I have been reiterating, both the Benacerrafian problem and the issue of ontological fit are reintroduced at the level of depth. What I think is really going on is that the characterization of Thin Realism that Maddy provides is simply an attempt to avoid the difficult philosophical questions that arise when doing the philosophy of mathematics. By appealing to mathematical practice, she tries to answer philosophical questions mathematically, instead of philosophically. But given the nature of the questions, it is not clear to me that this is appropriate, or particularly effective. For Maddy’s Thin Realist takes set theory to be describing properties of an objectively existing reality. But set theory itself never actually says this, and I take it that most mathematicians are at least agnostic as to whether this is what they are actually doing. So most fundamentally, I think that many of her problems stem from a failure to adequately distinguish between set theory taken as a foundation and set theory taken as a branch of mathematics. This is in some sense ironic given the similarity in spirit to Hilbert’s attempt (on some interpretations) to have mathematics be its own arbiter on philosophical questions, together with Maddy’s rejection of Hilbertian foundationalism. Set theory as a mathematical discipline is not concerned with these philosophical questions, whereas set theory as a foundation is.
But then to appeal to actual mathematical practice to answer questions that are not of its concern seems misplaced.

7.3.4 The choice (or lack thereof) between Thin Realism and Arealism; and a moral for methodology

Despite their differences, recall that what bonds Thin Realism and Arealism is their deference to “the topography of mathematical depth.” If the argument presented in the last section is correct, namely that the Thin Realist faces the same issues with respect to depth that plagued the Robust Realist with respect to epistemic access and ontology, then, given this common deference, the very methodological agreement that binds the Thin Realist and the Arealist undermines Arealism in just the way it undermined Thin Realism. Moreover, this agreement at the level of methodology, coupled with the appeal to Wilson’s notion of tropospheric complacency, was what supported the idea that the Second Philosopher need not decide between Thin Realism and Arealism. But it is interesting that the concepts truth and existence should suffer from tropospheric complacency while the concepts set and depth should not. Why not think that these are similarly indeterminate in use? And if they are, then why think that the structure that set-theorists are after ought to be unique, or even determinate? Moreover, does not her discussion of these issues rest on the philosophy of language and fall outside the purview of the mathematician? As such, should they not count as First-Philosophical? Can the Second Philosopher ever be in the position to realize that she need not make a decision? Not to mention that as a Second Philosopher taking classical logic to be the correct logic, she should take it that concepts do indeed either apply or fail to apply in a determinate manner. It is not clear to me that the division between First and Second Philosophy is very well demarcated. For example, at one point in her book Maddy says,

Notice that if such a philosophical undertaking intends to correct science, or even to justify it in some way, then it is not effectively separated from our inquirer’s sphere of interest: working without any litmus test for ‘science’ or ‘non-science’, she will view it as a potential part of her own project, out to revise or buttress her methods; faced with such a proposal, she will want to know the grounds on which the criticism or confirmation is based and to evaluate these grounds on her own terms. To be truly autonomous, a philosophical enterprise would grant that science is perfectly in order for scientific purposes, but insist that there are other,

extra-scientific purposes for which different methods are appropriate.56

But then later she says,

[T]he Second Philosopher begins with her ordinary perceptual beliefs and refines her methods from there, expanding her reach into all areas of what we might call ‘natural science’ and even ‘social science’: physics, chemistry, astronomy; biology, botany, mineralogy; psychology, linguistics, and the study of human inquiry itself.57

So if the study of human inquiry itself is part of our picture, then why are these external investigations not warranted? The complaints that we raised before the quoted passages make it seem that even Maddy is not clear on the demarcation. In some cases she seems to think that certain extra-mathematical investigations count as Second Philosophical, for example in the case of tropospheric complacency, whereas in others they do not, for example questioning the reliability of mathematical practice and its verdicts for epistemology and metaphysics.

7.3.5 More on methodology, and some consequences

If ‘truth’ on this picture really is just ‘what set theory tells us’, then we have some interesting consequences. First, what happens if the set-theoretic community is divided on what they take the appropriate axioms to be? Whose authority is appealed to? And even if there is no division amongst the actual community, it seems possible that in some other possible world, rather than thinking that V≠L, the community thinks that V=L. If that’s the case, then the objectivity of mathematical truth, or at least the necessity of mathematical truth, suddenly seems contingent. And that is certainly not the sort of thing the naturalist would want. Presumably Maddy would respond to this by appealing to depth, and claim that because of the objectivity of mathematical depth, such a situation could not occur. But then we’re back with the same issues raised above. Much of the appeal to “large” large cardinal axioms owes to their consequences. Harvey Friedman, however, has argued for the adoption of various “large” large cardinals along different lines. Friedman has managed to generate many natural looking finite combinatorial

56Maddy [2011], p. 40. 57Maddy [2011], p. 105, emphasis mine.

statements ϕ whose proofs require very strong axioms of infinity. More recently he has generated statements from his Boolean Relation Theory that he claims will eventually force the mathematical community to accept the various new large cardinal axioms. As far as I know, this has yet to happen. Setting aside the complaint due to Feferman, et al. [2000]58, suppose the statements really do force the issue, and yet the community of set-theorists remains unconvinced. The naturalist appeal to set-theoretic authority then gets it wrong, and sends mathematics in the “wrong” direction. How to square this with their supposed authority is unclear. Or consider a different historical progression. The Löwenheim-Skolem theorem says that if a countable first-order theory has an infinite model, then it has a model of size κ, for every infinite cardinal κ. Suppose that upon reflection this theorem was taken to be so deep that set-theoretic investigation remained at ℵ0. That is, investigation into the higher infinite never occurred. Is the Skolemite simply mistaken? He certainly thinks that what he is doing is deep. So who is right? These examples point to a general problem concerning views that appeal to historical authority. Such an appeal leaves the naturalist hostage to historical fortune. After all, we could simply have a failure of method even if no one realized it; our method could be wrong. This view undercuts the Quinean picture at the level of the scientific method, and it undercuts the Maddyan line at the level of mathematical practice. For Quine, science, though not infallible, is justified by means of observation and the hypothetico-deductive method. Things are, of course, a bit more subtle than this, but the idea, roughly, is that science is justified by virtue of being the best method that we have of investigating the world.
It could immediately be asked: “Well, what exactly is science?” While answering that succinctly is certainly difficult, it is at least an attempt to understand the world and gain knowledge by using experimentation and evidence to provide explanations and predictions. But we should be careful not to make the jump from taking science to be

58On p. 407, Feferman claims that “it is begging the question to claim this shows we need axioms of large cardinals in order to demonstrate the truth of ϕ, since this only shows that we ‘need’ their 1-consistency.”

our best means of learning about the world to simply deferring unconditionally, thinking “Well, science tells me so. So that must be the way things are.” After all, it could be that scientific method is currently mistaken in certain ways. At this point it might be objected that scientific method is such that any mistake could be discovered and corrected. But the point sticks even if scientific method is reliably self-correcting, because at any stage it might not be fully correct. The same applies to Maddy’s deference to the mathematician. Given her reliance on science, the concerns of the last paragraph apply to her view. She is stuck trying to walk a tightrope between mathematics being somehow grounded in science, and yet being its own discipline. But setting this aside, taking pure mathematics as its own enterprise, the Second Philosopher is still stuck with the reliance on history. As the example above about the Skolemite illustrates, things could have turned out very differently. This is particularly problematic if, like the Thin Realist, one thinks that mathematics is after some objective truth and existence. If, however, one is like the Arealist and not after truth, then such historical reliance is not as problematic. In this case, rather than trying to describe some objective reality, one can simply characterize the enterprise as describing what mathematicians are doing. But then one is left with the task of explaining the application of mathematics while simultaneously holding that it is not after truth/existence.

7.3.6 New Axioms?

We have seen that much of Maddy’s project is to defend the appeal to extrinsic justifications in overcoming independence problems. In evaluating this appeal, it will be useful, following Feferman, to distinguish between foundational axioms and structural axioms.59 Structural axioms, like those of group theory, are definitions of the sorts of structures that recur throughout mathematics. They neither concern self-evident propositions, nor are they arbitrary starting points. Foundational axioms, on the other hand, concern the various

59Feferman, et al. [2000], p. 403.

fundamental concepts of, e.g., number, set, and function, that underlie all mathematical concepts. This view of foundational axioms is reminiscent of Zermelo when presenting his axioms for set theory. Set theory is that branch of mathematics whose task is to investigate mathematically the fundamental notions ‘number’, ‘order’ and ‘function’, taking them in their pristine, simple form, and to develop thereby the logical foundations of all arithmetic and analysis; it thus constitutes an indispensable component of the science of mathematics.60 Roughly, Feferman’s point in providing the distinction is to claim that extrinsic justification, while fine for structural axioms, is inappropriate for foundational axioms. And given the foundational role of set theory, its axioms are foundational, and hence beyond the purview of extrinsic justification. Maddy agrees that set-theoretic axioms are aimed at least in some sense at some foundational goal. But her understanding of that goal is very different. The difference traces back to that between Hilbertian foundationalism and the post-Gödelian holistic naturalism espoused by Maddy, among others. Recall Maddy’s position: Set theory seeks to provide a unified arena in which set-theoretic surrogates for all classical mathematical objects can be found and the classical theorems about these objects can be proved. This sort of foundation brings the various structures of mathematics onto one stage, where they can be contrasted and compared; it provides a uniform answer to questions of mathematical existence and proof.
What it does not do, however, is reveal what mathematical entities really are, nor does it provide an epistemic foundation in the sense of allowing one to derive the truths of mathematics by transparent steps from absolutely certain truths, contrary to what earlier thinkers had hoped for.61 However, Maddy’s failure to distinguish between set theory as foundation and set theory as branch of mathematics makes it uncertain how broad the role of set theory is, qua foundation. Is it only supposed to provide a foundation for so-called “core” mathematics? If so, then

the arena need not be very big, relatively speaking: only Vω+ω. If this is the case, then higher-order set-theoretic investigations for the sake of “foundations” are really only structural investigations, namely of a relatively small initial segment of the “unique” V. Presumably this is not what she means, though, given that she sees the effectiveness of an axiom candidate in helping set-theoretic practice reach its foundational goal as a sound extrinsic reason to adopt it as a new axiom!62 60Zermelo [1908], as quoted in Maddy [2011], p. 33. 61Feferman, et al. [2000], p. 418. 62Feferman, et al. [2000], p. 418.

Given how low in the cumulative hierarchy core mathematics actually lives, new axiom candidates would not seem to help set-theoretic practice reach its foundational goal; were “core” mathematics the only mathematics to be founded, then ZFC, as it stands, would already suffice! So she must mean, then, to include set theory itself within the foundational purview of set theory. But this ends up meaning that higher-order set theory is attempting to provide a foundation for itself. Such a move is problematic for the Hilbertian foundationalist, let alone the Gödelian! So it depends on what one thinks the overall goal is. If one conceives of the role of foundations to be some form of epistemic security, then extrinsic reasons will not work, without in some sense changing the subject. If, however, one conceives of set theory as providing the sort of arena that Maddy envisions, then one will be more comfortable with extrinsic justifications. It seems to me that much of Maddy’s acceptance of extrinsic justification derives from the above failure to disambiguate set theory qua foundation from set theory qua mathematical discipline. Taken as a foundation, the axioms are considered foundational; taken as a branch of mathematics, they can be seen as structural. And since Maddy wants to appeal to set theory as a branch of mathematics to answer questions concerning set theory as a foundational enterprise, the distinction between the characterizations of the axioms as structural or as foundational becomes blurred. Personally, I am partial to the neo-Hilbertian picture espoused by the likes of Kreisel, Feferman, and Sieg, discussed in detail in chapter 6. Maddy is right to say that, as practiced, it is not a matter of ‘everything goes’ in mathematics. But I take this to be not a result of some external objectivity of mathematics, but rather a result of the psychology of mathematicians.
Taking the neo-Hilbertian view seriously removes the methodological limitation of only pursuing the most ‘maximizing’ routes. It also does not threaten to curtail mathematical investigation in any way. The point is simply to keep in mind what it is that one is doing. And what one is doing is investigating the logical connections among various mathematical concepts.

7.3.7 Concluding remarks

One of the particularly striking things about Maddy’s book is the total lack of engagement with the possibility that mathematics is knowable a priori. This is particularly striking given the historical importance of the a priori. That said, that she should reject the possibility of a priori knowledge is perhaps not surprising, given her naturalistic leanings. Though she never directly addresses the issue in Defending the Axioms, she does have some engagement in Maddy [2007]. Recall that given the Second Philosopher’s strategy of starting with set-theoretic methods and then proceeding to provide the appropriate ontology, her starting question is something like: What must sets be like in order that we can know about them by means of the methods of set theory?

Maddy notes that this is similar to Kant’s approach: What must the world be like in order that we can know about it partly a priori? Kant’s answer was to say that the world is partly constituted by our modes of cognition. But we saw above that the Second Philosopher cannot maintain this attitude because, were sets constituted by our modes of cognition, it would undermine the objectivity of mathematics. But there is more to the Second Philosopher’s rejection of the quasi-Kantian approach: by appealing to an extra-scientific mode of inquiry, the Kantian approach violates Second Philosophy. Kant distinguished between sensibility and understanding. The sensibility ‘passively’ receives sensory impressions and then, by means of the understanding, actively orders the experience by application of concepts. Space and time, the forms of intuition, are the two a priori elements of sensibility. The pure concepts of understanding, which are, for Kant, distinctive of any discursive intellect, are also a priori. So the sensibility is not wholly passive. It forms the manifold of sensory impressions (applying the spatial and temporal forms of intuition) before the understanding steps in with the application of concepts of individuals and causation. According to some Kant scholars, Kant distinguishes between two different levels of inquiry: empirical and transcendental. It is this division that the Second Philosopher finds

puzzling. Operating at the empirical level is essentially what the Second Philosopher takes herself to have been doing all along. But she becomes confused when presented with the additional level of transcendental inquiry. Given her reliance on scientific investigation, she finds no reason to believe in Kant’s claim that one’s empirical experience is shaped by cognitive architecture. As Maddy puts the point,

If the spatiotemporality [the Second Philosopher] attributes to the world is actually a projection of her cognitive processing, she certainly wants to know this, but as far as her empirical analysis of human sensory and neurological processing goes, she sees no reason to think it’s true.63

The Second Philosopher objects to Kant’s appeal to an additional level of transcendental inquiry by asking how Kant can know, in a way that lies beyond scientific inquiry, that the human intellect is discursive and that space and time are constitutive of our sensibility.64 Given her methodological framework, the Kantian cannot answer the Second Philosopher on her own terms. And so in some sense the two have reached an impasse. The a priori, then, is, at least for Kant, grounded in his appeal to transcendental idealism. Given her naturalistic leanings, the Second Philosopher rejects this additional line of inquiry. In short: if Second Philosophy, then no a priori knowledge. In addition to the complaints made above with regard to Second Philosophical methodology, if one takes seriously the prospect of a priori knowledge, this would provide another argument against Second Philosophy. It is not one the Second Philosopher would be moved by, but one that I suggest we take seriously, pausing to reconsider the merits of Kantian epistemology. Whatever the shortcomings of the Kantian view of mathematics, I think there is something fundamentally true about Kantian transcendental idealism. I shall grant Maddy that our mathematical investigations began at least in application. Presumably certain mathematical concepts were gleaned from our experience of the world. From there we started investigating these concepts and their interrelations with other concepts. Moreover we did this with such precision that eventually new strands developed and pure mathematics arose. Nonetheless, it seems to me that our initial experiences of the

63Maddy [2007], p. 62. 64Maddy [2007], pp. 62, 63.

world, as well as all the subsequent concepts that have been developed over time, are all fundamentally subject to the cognitive structure of our minds. And if this is so, it at least points in the direction of an explanation of the applicability of mathematics. Moreover, this Kantian picture leaves open the possibility of mathematical knowledge being a priori. Supposing the Kantian framework is on the right track, it does not seem far-fetched to me to think that part of our evolutionary history resulted in a cognitive framework that takes as fundamental certain basic concepts of mathematics and logic. And if this is right, then, within a modified version of Kantian epistemology, we too can come up with an adequate account that explains the apparent a priori character of mathematical knowledge. I shall also grant Maddy that there does seem to be something in practice that constrains the logical investigation of mathematical concepts. To put it briefly, not just anything goes. It is thus not unnatural to think that there must be some sort of objectivity involved. But perhaps even that is a result of the way our minds are structured. If this is so, then why think that there is one unique structure V existing independently of our minds? Why not investigate all the different extensions that are reasonable? The point of all this is not to say that set theorists ought to modify their practice. I agree with Maddy that the spirit of mathematics is its freedom. The point is to keep in mind what one is doing: develop west-coast Cabal; but also develop V=L. See what interesting connections there are. Just keep the axiomatic framework in mind.

Bibliography

[1] J. Avigad [2002]. Saturated models of universal theories, Annals of Pure and Applied Logic, 118: 219-34.

[2] J. Avigad and S. Feferman [1998]. Gödel’s Functional Interpretation. In Samuel R. Buss, editor, The Handbook of Proof Theory, pages 337-406. North Holland.

[3] J. Avigad and E.H. Reck [2001]. “Clarifying the nature of the infinite”: the development of metamathematics and proof theory. Carnegie Mellon Technical Report CMU-PHIL-120. Online at http://www.andrew.cmu.edu/use/avigad/Papers/infinite.pdf.

[4] C.J. Ash and J.F. Knight [2000]. Computable Structures and the Hyperarithmetical Hierarchy. Elsevier.

[5] G. Boolos, J. Burgess, and R. Jeffrey [2007]. Computability and Logic. Cambridge University Press.

[6] P. Bernays [1922]. Über Hilbert’s Gedanken zur Grundlegung der Mathematik, Jahresbericht der DMV, 31, pages 10-19. Translated in Mancosu [1998].

[7] P. Bernays [1930]. Die Philosophie der Mathematik und die Hilbertsche Beweistheorie. Reprinted in P. Bernays, Abhandlungen zur Philosophie der Mathematik, Wissenschaftliche Buchgesellschaft, 1976.

[8] P. Bernays [1935] On Platonism in Mathematics. English translation in Paul Benacerraf and Hilary Putnam, editors, Philosophy of Mathematics: Selected Readings, pages 447-69. Cambridge University Press, 1983.

[9] P. Bernays [1970]. Die schematische Korrespondenz und die idealisierten Strukturen. Reprinted in P. Bernays, Abhandlungen zur Philosophie der Mathematik, Wissenschaftliche Buchgesellschaft, 1976.

[10] P. Blanchette [1996]. Frege and Hilbert on Consistency. The Journal of Philosophy 93(7):317-336.

[11] P. Blanchette [2012]. Frege’s Conception of Logic. Oxford University Press.

[12] L.E.J. Brouwer [1927]. Über Definitionsbereiche von Funktionen. English translation in Jean van Heijenoort, editor, From Frege to Gödel. toExcel Press, 1999.

[13] J. Burgess [2005]. Fixing Frege. Princeton University Press.

[14] J. Burgess [2010]. On the outside looking in: a caution about conservativeness. In Solomon Feferman, et al., editors, Kurt Gödel: Essays for his Centennial, pages 128-141. Cambridge University Press.

[15] J. Burgess [2015]. Rigor and Structure. Oxford University Press.

[16] S.R. Buss [1998]. First-order proof theory of arithmetic. In Samuel R. Buss, editor, The Handbook of Proof Theory, pages 79-147. North Holland.

[17] I. Copi [1971]. The Theory of Logical Types. Routledge.

[18] T. Coquand [2018]. Type Theory. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy, URL = https://plato.stanford.edu/archives/fall2018/entries/type-theory

[19] L. Crosilla [2017] Predicativity and Feferman. In Gerhard Jäger and Wilfried Sieg, editors, Feferman on Foundations: Logic, Mathematics, Philosophy, pages 423-447. Springer.

[20] W. Demopoulos and P. Clark [2005]. The Logicism of Frege, Dedekind, and Russell. In Stewart Shapiro (ed.) The Oxford Handbook of Philosophy of Mathematics and Logic. Oxford University Press, 129-165.

[21] M. Detlefsen [1986]. Hilbert’s Program. Reidel, Dordrecht.

[22] M. Detlefsen [1990]. On an Alleged Refutation of Hilbert’s Program Using Gödel’s First Incompleteness Theorem. Journal of Philosophical Logic, 19(4):343-377

[23] M. Detlefsen [2003]. Constructive Existence Claims. In Matthias Schirn, editor, The Philosophy of Mathematics Today, pages 307-335. Oxford University Press.

[24] H. Enderton [2001]. A Mathematical Introduction to Logic. Harcourt/Academic Press, San Diego.

[25] S. Feferman [1964]. Systems of Predicative Analysis. J. Symbolic Logic, vol. 29, pp. 1-30.

[26] S. Feferman [1977]. Theories of finite type related to mathematical practice. In Jon Barwise, editor, Handbook of Mathematical Logic, pages 913-971. North Holland, Amsterdam.

[27] S. Feferman [1979a]. A more perspicuous formal system for predicativity, in Konstruktionen versus Positionen, I, pp. 68-93, Walter de Gruyter, Berlin.

[28] S. Feferman [1979b]. What Does Logic Have to Tell Us about Mathematical Proofs?. In S. Feferman, In the Light of Logic, OUP, 1998, ch. 9.

[29] S. Feferman [1988a]. Hilbert’s Program Relativized: Proof-Theoretical and Foundational Reductions. The Journal of Symbolic Logic, 53(2):364-384.

[30] S. Feferman [1988b]. Weyl vindicated: Das Kontinuum seventy years later. In Solomon Feferman, In the Light of Logic. OUP, 1998, ch. 13.

[31] S. Feferman [1993a]. What Rests on What? The Proof-Theoretic Analysis of Mathematics. In Solomon Feferman, In the Light of Logic, OUP, 1998, ch. 10.

[32] S. Feferman [1993b]. Gödel’s Dialectica Interpretation and Its Two-Way Stretch. In Solomon Feferman, In the Light of Logic, OUP, 1998, ch. 11.

[33] S. Feferman [1993c]. Why a little bit goes a long way: Logical foundations of scientifically applicable mathematics. In S. Feferman, In the Light of Logic, OUP, 1998, ch. 14.

[34] S. Feferman [1996]. Kreisel’s “unwinding” program. In P. Odifreddi, ed., Kreiseliana, pages 247-273. A. K. Peters Ltd., Wellesley.

[35] S. Feferman [2000a]. Does reductive proof theory have a viable rationale? Erkenntnis, 53(1/2):63-96, 2000.

[36] S. Feferman [2000b]. The significance of Weyl’s Das Kontinuum. Online at https://math.stanford.edu/~feferman/papers/DasKontinuum.pdf

[37] S. Feferman [2004]. Comments on “Predicativity as a philosophical position” by G. Hellman. Revue Internationale de Philosophie, Vol. 58, No. 229 (3), Russell en héritage / Le centenaire des Principles (juillet 2004), pp. 313-323.

[38] S. Feferman [2005] Predicativity. In Stewart Shapiro, editor, The Oxford Handbook of Philosophy of Mathematics and Logic, pages 590-624. Oxford University Press, 2005.

[39] S. Feferman, H.M. Friedman, P. Maddy and J.R. Steel [2000]. Does mathematics need new axioms? Bulletin of Symbolic Logic, 6(4):401-46.

[40] F. Ferreira [2005]. A Simple Proof of Parsons’ Theorem. Notre Dame Journal of Formal Logic, Volume 46, Number 1.

[41] J. Ferreirós [2009]. “Hilbert, logicism, and mathematical existence.” Synthese 170:33-70.

[42] G. Frege, D. Hilbert, and A. Korselt [1971]. On the Foundations of Geometry and Formal Theories of Arithmetic, Eike-Henner Kluge, translator. New Haven: Yale University Press.

[43] G. Frege [1980]. Philosophical and Mathematical Correspondence. Chicago: The University of Chicago Press.

[44] H. Friedman [1971]. Higher set theory and mathematical practice. Annals of Mathematical Logic, Vol. 2, No. 3, 325-35.

[45] H. Friedman [1975]. Some systems of second order arithmetic and their use. Proceedings of the 1974 International Congress of Mathematicians, 1:235-242.

[46] H. Friedman [1976]. Systems of second order arithmetic with restricted induction. I. Journal of Symbolic Logic, 41(2):557-8.

[47] H. Friedman, K. McAloon, S. Simpson [1982]. A finite combinatorial principle which is equivalent to the 1-consistency of predicative analysis. In G. Metakides, ed., Patras Logic Symposium, Studies in Logic and the Foundations of Mathematics, North-Holland, pp. 197-230.

[48] R.O. Gandy [1960]. Proof of Mostowski’s conjecture, Bulletin de l’Académie Polonaise des Sciences, série des sciences mathématiques, astronomiques et physiques 8, 571-575, 1960.

[49] R.O. Gandy [1967]. Review of Feferman [1964]. Math. Rev., 1967.

[50] G. Gentzen [1936] Die Widerspruchsfreiheit der reinen Zahlentheorie. In M.E. Szabo, editor, The Collected Papers of Gerhard Gentzen, pages 132-213. North Holland, 1969.

[51] G. Gentzen [1936-37] Der Unendlichkeitsbegriff in der Mathematik. In M.E. Szabo, editor, The Collected Papers of Gerhard Gentzen, pages 223-233. North Holland, 1969.

[52] G. Gentzen [1938]. Neue Fassung des Widerspruchsfreiheitsbeweises für die reine Zahlentheorie. In M.E. Szabo, editor, The Collected Papers of Gerhard Gentzen, pages 252-286. North Holland, 1969.

[53] K. Gödel [1931]. On formally undecidable propositions of Principia Mathematica and related systems I. In Jean van Heijenoort, editor, From Frege to Gödel. toExcel Press, 1999.

[54] K. Gödel [1933]. The present situation in the foundations of mathematics. In Solomon Feferman et al., editor, The Collected Works of Kurt Gödel, volume III. Oxford University Press, 1995.

[55] K. Gödel [1938] Lecture at Zilsel’s. In Solomon Feferman et al., editor, The Collected Works of Kurt Gödel, volume III. Oxford University Press, 1995.

[56] K. Gödel [1941] In what sense is intuitionistic logic constructive? In Solomon Feferman et al., editor, The Collected Works of Kurt Gödel, volume III. Oxford University Press, 1995.

[57] K. Gödel [1944]. Russell’s mathematical logic. In Paul Benacerraf and Hilary Putnam, editors, Philosophy of Mathematics: Selected Readings, pages 447-69. Cambridge Univer- sity Press, 1983.

[58] K. Gödel [1947/64]. What is Cantor’s continuum problem? In Solomon Feferman, et al., editor, The Collected Works of Kurt Gödel, volume II. Oxford University Press, 1990.

[59] K. Gödel [1958]. On a hitherto unutilized extension of the finitary standpoint. In Solomon Feferman, et al., editor, The Collected Works of Kurt Gödel, volume II. Oxford University Press, 1990.

[60] K. Gödel [1972]. On an extension of finitary mathematics which has not yet been used. In Solomon Feferman, et al., editor, The Collected Works of Kurt Gödel, volume II. Oxford University Press, 1990.

[61] P. Hájek and P. Pudlák [1998]. Metamathematics of First-Order Arithmetic. Springer.

[62] M. Hallett [2010]. Frege and Hilbert. In Michael Potter and Thomas Ricketts (eds.), The Cambridge Companion to Frege. Cambridge University Press, 413-464.

[63] J.D. Hamkins [2012]. The set-theoretic multiverse. Review of Symbolic Logic 5:416-449.

[64] G. Hellman [2004]. Predicativism as a Philosophical Position. Revue Internationale de Philosophie, 229 (3), 295-312.

[65] D. Hilbert. [1900a]. Über den Zahlbegriff. Jahresbericht der DMV 8: 180-194. Translated in William Ewald (ed.) From Kant to Hilbert. A Source Book in the Foundations of Mathematics, volume 2. Oxford University Press, 1089-1095, 1996.

[66] D. Hilbert. [1900b]. Mathematische Probleme. Nachrichten der Königlichen Gesellschaft der Wissenschaften zu Göttingen: 253-97. Translated in William Ewald (ed.) From Kant to Hilbert. A Source Book in the Foundations of Mathematics, volume 2. Oxford University Press, 1096-1105, 1996.

[67] D. Hilbert [1905]. On the foundations of logic and arithmetic. In Jean van Heijenoort, editor, From Frege to Gödel. toExcel Press, 1999.

[68] D. Hilbert [1918]. Axiomatisches Denken. Mathematische Annalen, 78:405-415. Lecture given at the Swiss Society of Mathematicians, 11 September 1917. English translation in William Ewald, editor, From Kant to Hilbert. A Source Book in the Foundations of Mathematics, volume 2. Oxford University Press, Oxford, 1996.

[69] D. Hilbert [1925]. On the infinite. In Jean van Heijenoort, editor, From Frege to Gödel. toExcel Press, 1999.

[70] D. Hilbert [1930]. Logic and the Knowledge of Nature. Translated in William Ewald (ed.) From Kant to Hilbert. A Source Book in the Foundations of Mathematics, volume 2. Oxford University Press, 1157-1165, 1996.

[71] D. Hilbert [1971]. The Foundations of Geometry. Chicago: Open Court.

[72] D. Hilbert and P. Bernays [1934] Grundlagen der Mathematik, Vol. I, Springer. Second edition, 1968.

[73] D. Hilbert and P. Bernays [1939] Grundlagen der Mathematik, Vol. II, Springer. Second edition, 1970.

[74] A. Kanamori [2009]. The Higher Infinite: Large Cardinals in Set Theory from Their Beginnings. Springer.

[75] S.C. Kleene [1955]. Hierarchies of number-theoretic predicates, Bulletin of the American Mathematical Society 61, 193-213.

[76] S.C. Kleene [1959]. Quantification of number-theoretic functions, Compositio Mathematica 14, 23-40.

[77] P. Koellner [20XX] Feferman on Set Theory: Infinity up on Trial. Unpublished Manuscript.

[78] G. Kreisel [1951] On the interpretation of non-finitist proofs – Part I. Journal of Symbolic Logic 16, 241-267.

[79] G. Kreisel [1952] On the interpretation of non-finitist proofs – Part II. Interpretation of number theory. Applications. Journal of Symbolic Logic 17, 43-58.

[80] G. Kreisel [1958a]. Hilbert’s program. Dialectica, 12:346-372.

[81] G. Kreisel [1958b]. Mathematical significance of consistency proofs. The Journal of Symbolic Logic, 23:155-182.

[82] G. Kreisel [1976]. What Have We Learnt from Hilbert’s Second Problem? AMS Proceedings of Symposia in Pure Mathematics, 28, 1976.

[83] G. Kreisel [1987]. Proof theory: Some Personal Recollections. In Takeuti [1987], pp. 395-405.

[84] G. Kreisel and G. Takeuti [1974]. Formally Self-Referential Propositions for Cut-Free Classical Analysis and Related Systems. Dissertationes Mathematicae, 118: 4-50.

[85] O. Linnebo and S. Shapiro [20XX]. Predicativism and potential infinity. Unpublished Manuscript.

[86] M. Liston [2007]. Review of Maddy, Notre Dame Philosophy Reviews, 12/9/07, online at http://ndpr.nd.edu/review.cfm?id=11903.

[87] P. Maddy [1990]. Realism in Mathematics, Oxford University Press, Oxford.

[88] P. Maddy [1997]. Naturalism in Mathematics, Oxford University Press, Oxford.

[89] P. Maddy [2005]. Three forms of Naturalism. In Stewart Shapiro (ed.) The Oxford Handbook of Philosophy of Mathematics and Logic, Oxford University Press.

[90] P. Maddy [2007]. Second Philosophy, Oxford University Press.

[91] P. Maddy [2011]. Defending the Axioms, Oxford University Press.

[92] P. Mancosu [1998]. From Brouwer to Hilbert: The debate on the foundations of mathematics in the 1920s. Oxford University Press.

[93] P. Milne [2007]. On Gödel Sentences and What They Say. Philosophia Mathematica, 15(2):193-226.

[94] G. Mints [1973]. Quantifier-free and one-quantifier systems. Journal of Soviet Mathematics, 1:71-84.

[95] C. Parsons [1970]. On a number theoretic choice schema and its relation to induction. In Intuitionism and Proof Theory, pp. 459-473. North-Holland, Amsterdam.

[96] W.V.O. Quine [1975]. Five Milestones of Empiricism, reprinted in Theories and Things, Harvard University Press, Cambridge, 1981.

[97] W.V.O. Quine [1981], Things and Their Place in Theories, in Theories and Things, Harvard University Press, Cambridge, 1981.

[98] P. Raatikainen [2003]. Hilbert’s Program Revisited. Synthese 137: 157-177.

[99] M. Rathjen [2006]. The art of ordinal analysis. Online at http://www1.maths.leeds.ac.uk/~rathjen/ICMend.pdf.

[100] R.W. Ritchie [1965]. Classes of recursive functions based on Ackermann’s function. Pacific Journal of Mathematics, 15(3): 1027-1044.

[101] H. Rogers [1987]. Theory of Recursive Functions and Effective Computability. MIT Press.

[102] J. B. Rosser [1936]. Extensions of Some Theorems of Gödel and Church. Journal of Symbolic Logic, 1: 87-91.

[103] B. Russell [1903]. The Principles of Mathematics. W.W. Norton & Company, 1996.

[104] B. Russell [1908]. Mathematical logic as based on a theory of types. In Jean van Heijenoort, editor, From Frege to Gödel. toExcel Press, 1999.

[105] B. Russell [1973]. Essays in Analysis. Douglas Lackey (ed.), Allen & Unwin.

[106] S. Shapiro [1991]. Foundations without Foundationalism: A case for second-order logic. Oxford University Press, Oxford.

[107] A. Siders [2012]. Gentzen’s Consistency Proofs for Arithmetic. Online at http://www.jaist.ac.jp/~mizuhito/jss12/Siders.pdf

[108] A. Siders [2015]. A Direct Gentzen-Style Consistency Proof for Heyting Arithmetic. In Reinhard Kahle and Michael Rathjen, editors, Gentzen’s Centenary: The Quest for Consistency, pages 177-212. Springer.

[109] W. Sieg [1985]. Fragments of Arithmetic. Annals of Pure and Applied Logic, 28:33-71.

[110] W. Sieg [1990]. Relative consistency and accessible domains. In Sieg [2013].

[111] W. Sieg [1991]. Herbrand analyses. Archive for Mathematical Logic, 30:409-441.

[112] W. Sieg [2002]. Beyond Hilbert’s reach? In Sieg [2013].

[113] W. Sieg [2013]. Hilbert’s Programs and Beyond. Oxford University Press.

[114] W. Sieg [2014]. The Way of Hilbert’s Axiomatics: Structural and Formal. Perspectives on Science 22(1):133-157.

[115] W. Sieg and C. Parsons [1995] Introductory note to 1938. In Solomon Feferman, et al., editor, The Collected Works of Kurt Gödel, volume III. Oxford University Press.

[116] W. Sieg and D. Schlimm [2005]. Dedekind’s analysis of number: systems and axioms. Synthese 147:121-170. Reprinted in Sieg [2013].

[117] S. Simpson [1988]. Partial Realizations of Hilbert’s Program. The Journal of Symbolic Logic, 53(2): 349-363, 1988.

[118] S. Simpson [2002]. Predicativity: The Outer Limits. In Sieg, Sommer, Talcott, eds. Reflections on the Foundations of Mathematics: essays in honor of Solomon Feferman. ASL Lecture Notes in Logic: 15.

[119] S. Simpson [2009]. Subsystems of Second-Order Arithmetic. Springer.

[120] P. Smith [XXXX]. Back to Basics: Revisiting the Incompleteness Theorems. Online at http://www.logicmatters.net/resources/pdfs/Godelbasics.pdf

[121] C. Smorynski [1977]. The Incompleteness Theorems. In Jon Barwise (ed.), Handbook of Mathematical Logic. North Holland.

[122] C. Smorynski [1982]. The Varieties of Arboreal Experience. The Mathematical Intelligencer, 4(4):182-189.

[123] C. Smorynski [1985]. Self-Reference and Modal Logic. Springer-Verlag, New York.

[124] C. Spector [1955]. Recursive well-orderings. Journal of Symbolic Logic 20, 151-163.

[125] C. Spector [1960]. Hyperarithmetical quantifiers, Fundamenta mathematicae 48, 313- 320.

[126] J. Steel [2004]. Generic absoluteness and the continuum problem. Online at https://www.lps.uci.edu/files/conferences/Laguna-Workshops/LagunaBeach2004/laguna1.pdf

[127] S. Sterrett [1994]. Frege and Hilbert on the Foundations of Geometry. Online at http://philsci-archive.pitt.edu/723/1/SterrettFregeHilbert1994.pdf

[128] W.W. Tait [1981] Finitism. Journal of Philosophy, Vol. 78., No. 9, pages 524-546.

[129] G. Takeuti [1974] Consistency proofs and ordinals. Proof Theory Symposium 1974, Springer Lecture Notes in Mathematics 500, pages 365-369. Springer, 1974.

[130] G. Takeuti [1987] Proof Theory. North Holland, 1987.

[131] N. Tennant [1997]. The Taming of the True. Oxford University Press, Oxford.

[132] N. Tennant [2000a] What is Naturalism in Mathematics, Really? Philosophia Mathematica, (3) Vol. 8, 316-338, 2000.

[133] N. Tennant [2000b] Deductive Versus Expressive Power: a pre-Gödelian predicament. The Journal of Philosophy, 97(5), pages 257-277.

[134] N. Tennant [2012]. Changes of Mind. Oxford University Press.

[135] N. Tennant [2017]. Core Logic. Oxford University Press.

[136] A.S. Troelstra [1974]. Note on the fan theorem. Journal of Symbolic Logic, 39: 584-596.

[137] J. von Neumann [1930] The formalist foundations of mathematics. In Paul Benacerraf and Hilary Putnam, editors, Philosophy of Mathematics: Selected Readings, pages 61-65. Cambridge University Press, 1983.

[138] M. Wilson [2006]. Wandering Significance. Oxford University Press.

[139] N. Weaver [2009]. Predicativity Beyond Γ0. Online at https://arxiv.org/pdf/math/0509244.pdf

[140] H. Weyl [1918]. Das Kontinuum. English translation by Pollard and Bole, Lanham, 1987.

[141] R. Zach [2005]. Hilbert’s program then and now. Online at http://arxiv.org/pdf/math/0508572v1.pdf.

[142] E. Zermelo [1908]. Investigations into the foundations of set theory I. In van Heijenoort (ed.) From Frege to Gödel: a Source Book in Mathematical Logic 1879-1931, Harvard University Press, 1967, pp. 199-215.

[143] E. Zermelo [1931]. Über Stufen der Quantifikation und die Logik des Unendlichen, Jahresbericht der DMV 31, 85-88.
