About Logic and Logicians

A palimpsest of essays by Georg Kreisel

Selected and arranged by Piergiorgio Odifreddi

Volume 1: Philosophy

Editorial Board

Fernando Ferreira, Departamento de Matemática, Universidade de Lisboa
Francisco Miraglia, Departamento de Matemática, Universidade de São Paulo
Graham Priest, Department of Philosophy, The City University of New York
Johan van Benthem, Department of Philosophy, Stanford University; Tsinghua University; University of Amsterdam
Matteo Viale, Dipartimento di Matematica “Giuseppe Peano”, Università di Torino

Piergiorgio Odifreddi, About Logic and Logicians, Volume 1. Brasília: Lógica no Avião, 2019.

Series A, Volume 1

I.S.B.N. 978-65-900390-2-6

Publisher prefix 900390

Work published with the support of PPGFIL/UnB.

Editor’s Preface

These books are a first version of Odifreddi’s collection of Kreisel’s expository papers, which together constitute an extensive, scholarly account of the philosophical and mathematical development of many of the most important figures of modern logic; some of those papers are published here for the first time.

Odifreddi and Kreisel worked together on these books for several years, and they are the product of long discussions. They finally decided that they would collect those essays of a more expository nature, such as the biographical memoirs of the fellows of the Royal Society (of which Kreisel himself was a member) and other related works. Also included are lecture notes that Kreisel distributed in his classes, such as the first essay printed here, which is on the philosophy of and geometry.

Kreisel himself wrote all the texts, but Odifreddi has made some substantial editorial interventions, rearranging some of the material, breaking the text into sections and paragraphs, inserting titles, moving or removing some notes, and eliminating some digressions. These interventions were made in order to give the essays some of their original freshness and linearity, qualities that were lost in later versions.

Some other minor modifications were made here and there, consisting basically in the correction of a small number of erroneous references in the original manuscripts.

Kreisel’s expository works are invaluable to logicians, and we hope that the reader may find the present edition to his advantage, even if there is still some editorial work to do.

Rodrigo Freire. Brasília, April 2019.

Contents

I INTRODUCTION

1 INTRODUCTION TO THE PHILOSOPHY OF MATHEMATICS
    Introduction
    Warnings
  1.1 Definitions
    Philosophical perplexities (putting one’s questions into question)
    Heuristic value of perplexities: a source of issues and distinctions
    Examples of distinctions (to be developed later)
    Warnings: what not to expect of our elementary (i.e. general) distinctions
    A new philosophical problem
  1.2 The Circle: Two Aspects of Explicit Definitions
    A first definition of circle
    A second definition of circle
    A philosophical scheme and the choice between the two definitions of the circle
    Popular philosophical literature
    Definitions as an auxiliary means
    Eliminating definitions (from proofs)
    On explicit definitions
  1.3 Area: Two Aspects of Implicit Definitions
    The area of a rectangle
    Equality of area
    Surprises, perplexities and (promising) problems
    Measure of area
    Grand philosophical questions

II ON ARISTOTLE

2 THE LITERARY VALUE OF ARISTOTLE’S THOUGHT
    The value of great philosophers
    Analogies
    Abstractions (also called structures)
    Choice of abstractions
    What we know and how we know
    Disclaimer

3 ARISTOTLE’S LOGIC TODAY
  3.1 Obsolete Ideas
    The infinite
    Logic
  3.2 Less Obsolete Ideas
    Particular proofs and general proofs
    Propositional Logic
  3.3 Eternal Verities
    Exact philosophy

III ON FREGE

4 FREGE’S FOUNDATIONS
  4.1 Background: Some Discoveries
    Functions
    Isomorphism
    The structure of the natural numbers
    Defining functions by equations
    Order
    The structure of the rational numbers
    The structure of the real numbers
  4.2
    Set-theoretic purity
    Supply of sets
    Defining sets by predicates
    Proofs in Set Theory
  4.3 Formal Proofs
    Formal Procedures
    Formal procedures in logic
    Formal procedures in mathematics
  4.4 After Frege’s Foundations
    Newton’s foundations
    Bourbaki’s foundations
    Philosophical perspective
    Some neglected concerns

5 VARIANTS OF FREGE FOUNDATIONS
  5.1 Zermelo’s foundations
  5.2 Hilbert’s foundations
  5.3 Brouwer’s foundations
    Propositions about incompletely defined objects
    Parallels between the theories of sets and choice sequences
    Proofs and logical operations applied to propositions about incompletely defined objects
  5.4 Philosophical Hygiene
  5.5 Bourbaki’s alternative
    Warnings to literal-minded readers

IV ON RUSSELL

6 RUSSELL’S LOGIC
    Logic
    The theory of descriptions
    Russell’s paradox
    The doctrine of types
    Principia Mathematica

7 PRINCIPIA MATHEMATICA: CRITICAL OR SPECULATIVE PHILOSOPHY?
    Mechanization of information
  7.1 Critical Philosophy
    Antinomies
  7.2 Speculative Philosophy
    What we know versus how we know
  7.3 Looking Back. Russell’s Hopes and Disappointments

8
  8.1 Russell’s Life
    Childhood
    Adolescence
    Studies and research
    Wittgenstein
    First world war
    Between the wars
    Return to England
  8.2 and Logical Foundations of Mathematics
    Foundations: loaded terminology
    Background: logical language
    Background: sets and predicates
    Background: definitions of natural and real numbers
    New theories: Russell’s paradox
    True, false, meaningless
    Doctrine of types and use of types
    Principia Mathematica and the axiomatization of mathematical practice
    Principia Mathematica: a parenthesis in the refutation of Kant
  8.3 Philosophy, Pedagogy, Literature
    Scope of philosophy
    Uncertainty and generality
    Critical philosophy and Occam’s razor
    From what we know to how we know
    Pedagogy in the large
    Pedagogy in the small
    Philosophic contemplation
    Russell’s picture of America

V ON WITTGENSTEIN

9 ON SOME CONVERSATIONS WITH WITTGENSTEIN
    Constructive content of proofs
    Family resemblances of concepts
    ‘Deep’ processes
    Warning

10
    Meaning of ‘philosophy’: adaptation to background knowledge
    Style of the chapter
  10.1 Wittgenstein: First and Second Thoughts
    A universal language for definitions
    Other functions of language
    Manifesto
  10.2 Biographical Material
    Ancestry
    The father
    (Homo)sexuality
    The youngest sister
    The rest of the family
    The house in the Kundtmanngasse
    Music
  10.3 Tractatus
    Propositional logic
    Finite Problems
    Infinite problems
    Language: expressive and reasoning capacities
    Anti-psychologism (another clumsy word for timely advice)
    Specific Reminders
    Formal rules for propositional logic
    Discussion
  10.4 Transition to Wittgenstein’s Later Concerns
    Indignation
    Artistic skill
  10.5 Philosophical Investigations
    Partial rules: Frege’s bête noire
    Generalities (going back to Cantor)
    What computers can and cannot do
    Discussion
    Formal rules (for arithmetic)
  10.6 Logical Aspects of Language
    Positive aspects of ‘negative’ results
    Natural language

11 A PARALLEL BETWEEN WITTGENSTEIN’S WAYS AND WORKS
    Warnings or reminders (depending on the reader’s general background)
    Wittgenstein’s ways
    Tractatus
    Einstein’s way
    Philosophical Investigations
    Rules
    Bewitchment: deep thoughts and deep feelings about them
    Alternatives to philosophical brooding about familiar phenomena, including puzzles

12 WITTGENSTEIN ON THE PHILOSOPHY OF MATHEMATICS
  12.1 General Philosophy
    Empiricism
    Theorems are rules of language
    Proofs determine the meaning of theorems
    Calculations are (psychological) experiments
    Wittgenstein’s general conclusions
  12.2 Foundations of Mathematics
    Set Theory
    Constructivity
    Intuitionism and finitism
    Strict finitism
    Wittgenstein’s views
  12.3 Metamathematics
    Incompleteness
    Consistency
    Paradoxes
  12.4 Wittgenstein’s Conversations About Foundations
    Applications
    To find one’s way about
    Le style, c’est l’homme même

13 WITTGENSTEIN ON CONSISTENCY AND INCOMPLETENESS
    Historical background
    Summary
  13.1 Proofs Easy to Take in and Remember
    Autobiographical remarks
  13.2 Hilbert’s Program and Consistency
    Complementary remarks on formalization
    Autobiographical remark
  13.3 From Provability to Proofs
    An anecdote from the forties
    Analyses of theorems and of proofs
  13.4 Wittgenstein’s Expectations

14 WITTGENSTEIN AND BOURBAKI ON TRADITIONAL FOUNDATIONS
    Strategy and tactics
    General complaint: deceptive abstractions
    Principal complaint: explicit definitions
    Specific complaints about some glamor issues
    Wittgenstein’s advice
    The positive side of traditional foundations
    Appendix: Mathematical logic

15 THE DISASTROUS INVASION OF LOGIC INTO MATHEMATICS
    Introduction
  15.1 Proofs and Rules
    Autobiographical remarks
    Explicit definitions
    Decision Procedures and Decisions
    Computer Science
  15.2 Philosophical Understanding: Questions of ‘Principle’

16 ALL PROGRESS LOOKS BIGGER THAN IT IS
    Summary
  16.1 General Features of Traditional Philosophy and Some of Their Implications
    Extending experience
    Doubts about the nature of traditional questions
    Conclusions
  16.2 Successes and Limitations of Definitions
    Definitions in geometry
    Family resemblances of concepts
  16.3 Intimate pedagogy
    The style of Wittgenstein and the Jesuits
    Precise formulations as an (occasional) alternative
    Conclusions
    Acknowledgement
  Appendix. The Place of Logic in the Light of Experience
    General Introduction
    Traditional aims for logic
    Logical consequence
    First order logic
    Formal rules
    General conclusion

17 KRIPKE’S BOOK ON WITTGENSTEIN
    Every course of action can be made to accord with the rule [considered]
    A proof must be easy to take in and to remember
    Private language

Bibliography

Part I

INTRODUCTION

Chapter 1

Introduction to the Philosophy of Mathematics

Readers of this chapter are assumed to be attracted, perplexed or at least intrigued by the kind of thing to be found in popular expositions of philosophy; for example, by grand aims (of what the world is really like, or what we really know) or by specific tools (such as the peculiar stress on generality and internal coherence involved in various paradoxes and puzzles). But no detailed knowledge of any philosophical text is required. Evidently, readers should have some interest in mathematics, since otherwise mathematical problems would not serve them well as an introduction to anything. The use of mathematical experience has been standard in certain branches (traditions) of philosophy since antiquity:

• Plato was so impressed by ancient Greek geometry that he took this style of reasoning as a model.

• Kant regarded philosophy and (his preferred part of) mathematics as so close to each other that he used them for his characterization: philosophy analyzes and mathematics builds up concepts.

• Even quite pragmatic sounding philosophical schemes, like the later Wittgenstein’s (on the use or role of concepts), have their simplest (so to speak, chemically pure) applications within mathematics.¹

Experience in the kind of mathematics emphasized in this chapter is useful for our purposes because both mathematics and philosophy occur within very diverse branches of knowledge: a principal object of the chapter is to give readers some feeling not only for the way this generality allows one branch of knowledge to benefit from experience in another, but also for the obvious danger that what is true in general may be trivial in each particular case (what is true for both cabbages and kings is usually futile in the kitchen and the palace).

For several years I have emphasized that, by the nature of the case, traditional problems and notions date back to a time when one knew very little. And it is a fact of experience that, to this day, these same problems and notions have great appeal to us in our adolescence (when we begin to reflect), and especially before we acquire scientific experience. They should thus serve well also as a warm-up for a collection of essays such as the present one.

⁰ This chapter is based on lecture notes for the class ‘Introduction to the Philosophy of Mathematics’, Spring 1979, Stanford University, and it is published here for the first time.

¹ Specifically, in the case of the concept of natural number, these schemes lead one to study the ‘role’ of numbers within a broader domain; that is, to ‘embed’ them among cardinals and ordinals or in the complex plane (in contrast to the scheme of ‘looking inwards’, at the way the natural numbers are built up).

Introduction

In academic language, this chapter would be described as an introduction to that broad area of philosophy which is concerned with meaning (analysis) and justification. A little more simply, we ask questions:

What is (the object) X?

What are grounds for (the assertion) Y?

Evidently, it will be crucial to get some idea for choosing the terms in which the answers are given. ‘Evidently’ because one can of course always ask for an analysis of any analysis, or justification of any justification. One has to learn where to stop, just as a child has to learn when to stop asking ‘Why’. Learning this is certainly part of philosophy in its literal sense: the love of wisdom.

The questions above occur constantly also outside philosophy, in scientific disciplines as much as in everyday life. Or, a little more to the point, within scientific and within everyday reasoning we find occasionally a good use of a philosophical kind of answer. A good rule of thumb for recognizing this philosophical character is that you feel very foolish not to have thought of the answer in the first place. More pedantically, the answer involves very general grounds.

Superficially, it appears paradoxical that one should find great generality in a subject which, as a matter of historical fact, arose prior to the sciences, and has great attraction to us when we know little. A moment’s (philosophical) reflection removes the paradox. If we look at a thing only very casually, what we can assert as the result of this inspection will have great generality: it will apply to all objects which have the particular properties of the thing we looked at superficially.

Of course, as already stressed above, there is a danger that this general knowledge will be trivial in each particular case. But, as experience shows, it is by no means always trivial to recognize that some given question can be answered trivially (that is, by using only general features of the situation). The chapter will provide a number of examples.

As ‘philosophy’ is understood here (as characterized by a style of reasoning, rather than by subject matter²), not all very general observations will be called ‘philosophical’; for example, not those which are arrived at after long experience and are convincing only in the light of that experience. In other words, not those observations which cannot be recognized even a posteriori as particular instances of a ‘trivial’ generality.

The complementary, so to speak negative, part of philosophy is to learn limitations of the generalities mentioned above. Here it is often necessary to use highly specialized knowledge (for example, if the limitation occurs only in some specialized domain). The best one can expect from pure philosophy is some inkling of such limitations.

One final point by way of introduction (concerning the questions: What is X? or: What are grounds for Y?). By and large, the philosophical questions are asked in science or ordinary life only when the unanalyzed (also called ‘unexamined’) use of X or Y has led to patent errors; this replaces the need for judgment on whether their analysis is rewarding at a particular time. In contrast, in philosophy one anticipates the possibility of such errors, or the possibility of a quite different style of answer to the questions above, before a practical need arises.

Warnings

Sometimes the word ‘philosophy’ is applied to expositions of a scientific discipline, which present important steps in development as instances of philosophical generalities; and this is then called the philosophy of that particular science.

In the terminology used here, such an exposition is regarded as a part of the particular discipline (such expositions will have appeal to a philosopher, just as expositions of physics which stress mathematical aspects have appeal to mathematicians without being included in mathematics).

The choice of terminology is not intended as an arbitrary convention: we regard it as a matter of discovery that the generalities in question have been found to be worth treating separately. Particularly simple examples of useful formal treatments of this kind are given in books on logic (elementary model theory, set theory or recursion theory); but for most purposes in this chapter, it would be inappropriate to fix formally the languages or processes studied, as is done in formal logic.

² But it should be noted that different styles may be appropriate to different kinds of subject matter; for example, 300 years ago only the philosophical style was appropriate to questions concerning the atomic structure of matter.

1.1 Definitions

If asked for the ideal kind of answer to the question:

What is X?

most of us would probably think of a definition of X. For example, in geometry the answer to:

What is a circle?

is the following

Definition 1.1.1 A circle is the locus (set) of all points equidistant from a given point (the center of the circle).

This is an evidently useful, and quite typical definition. A fairly detailed analysis of its ‘evident’ use will be given a little later in the chapter. Readers who are spontaneously interested in knowing more about this particular definition should turn to Section 2, and (possibly) return to the present section later.
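As a small illustration of how Definition 1.1.1 can be put to work as a test, the following sketch checks whether a point lies at a given distance from a chosen center. It assumes, beyond anything stated in the text, that points are pairs of Cartesian coordinates and that distance is Euclidean; the name `on_circle` and the tolerance parameter are hypothetical and serve only this example.

```python
import math

def on_circle(point, center, radius, tol=1e-9):
    """Return True if `point` lies at distance `radius` from `center`.

    Assumption (not in the text): points are (x, y) pairs in the Cartesian
    plane and distance is Euclidean; `tol` absorbs floating-point error.
    """
    distance = math.hypot(point[0] - center[0], point[1] - center[1])
    return math.isclose(distance, radius, abs_tol=tol)

center = (0.0, 0.0)
print(on_circle((3.0, 4.0), center, 5.0))  # the distance is exactly 5
print(on_circle((1.0, 1.0), center, 5.0))  # the distance is sqrt(2), not 5
```

Note that the test itself uses ‘distance’ and ‘point’ without defining them, which is one concrete form of the perplexity about chains of definitions.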

Philosophical perplexities (putting one’s questions into question)

A typical example of a perplexity which stops one from going into details (here: concerning 1.1.1 above) is this:

If we define ‘circle’, why not also ‘equidistant’ or ‘point’?

And, given that we are in the business of definitions,

why not define ‘definition’?

Or even, a little more dramatically:

How can you be sure of anything, unless you have defined all your terms?

A familiar variant of this, with a bit of social concern, is:

How can you avoid being misunderstood, unless you have defined all your terms?

Given this state of perplexity, Aristotle’s answer (that every chain of definitions must stop somewhere) raises - as a matter of empirical fact - the further question:

How do we know when to stop?

The perplexities expressed by the questions above are very general, certainly not confined to the particular definition 1.1.1 (of circle). But there are also, so to speak, complementary perplexities, when we have what is (formally) a perfectly correct definition, and feel cheated:

• An extreme example is the answer: X is X (to: What is X?), which does not tell us much about X. In fact, we use this answer to suggest that the question: What is X? isn’t worth asking (tacitly: at our particular stage of experience, ontogenetically or phylogenetically speaking³).

• Less extreme examples are those definitions which are simply intended as abbreviations, as when we say: let a stand for x · x · x (and we are familiar with - definitions of - multiplication). There is little drama here, unless one lets a stand for something else too.
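The sense in which an abbreviation carries ‘little drama’ can be made concrete: it can be eliminated mechanically, by substituting the defining expression wherever the abbreviated name occurs. The sketch below is only an illustration under simplifying assumptions (expressions are plain strings, names are alphanumeric identifiers, and each name has one defining text); the function `expand` and the dictionary `abbreviations` are hypothetical.

```python
import re

def expand(expression, abbreviations):
    """Replace each abbreviated name in `expression` by its defining text.

    Assumption: `abbreviations` maps a name to exactly one defining
    expression; parentheses are added so that precedence is preserved.
    """
    def substitute(match):
        name = match.group(0)
        if name in abbreviations:
            return "(" + abbreviations[name] + ")"
        return name  # names that are not abbreviations are left alone

    return re.sub(r"[A-Za-z_]\w*", substitute, expression)

abbreviations = {"a": "x * x * x"}  # 'let a stand for x * x * x'
expanded = expand("a + a * y", abbreviations)
print(expanded)  # (x * x * x) + (x * x * x) * y
```

Because a dictionary admits only one entry per name, the drama mentioned above (letting a stand for something else too) cannot arise in this sketch; that restriction is exactly what makes the elimination safe.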

The great philosophers describe the perplexities illustrated above much more vividly.⁴ The (few) novelists who describe similar intellectual perplexities are generally not as well-known as writers on other (so-called personal) perplexities. Both philosophers and novelists can be very good at generating (or, more precisely, at making conscious to us) what might be called feelings about our (incomplete) knowledge, thereby extending, as one says, our sensibility. In contrast, as already mentioned, this chapter assumes a certain measure of sensibility, and looks at the matter (i.e. definitions) in quite a different way:

What can we assert (in other words, which questions can we answer) despite our incomplete knowledge?

We do not assume that our answers will have a permanent interest for any one of us (for example, when we know more). If they don’t, their value is pedagogic for us at our present stage of knowledge. Of course the value is permanent inasmuch as there always will be somebody with similarly incomplete knowledge. Once we have thought through our elementary questions, we can go back to examine our perplexities more closely. But already at this stage the latter have a, so to speak, indirect use.

³ One of the pleasures of the philosophical literature is its grand vocabulary: ‘ontogenetic’ means the development of the individual, ‘phylogenetic’ of the human race as a whole. (In contrast to philo, as in ‘philosophy’, phylo has nothing to do with love.)

⁴ Especially since the Middle Ages, philosophers have added further stirring questions; for example: whether we create objects by means of definitions, or whether definitions serve ‘only’ to specify objects that exist already. Questions which can evidently be put into question too.

Heuristic value of perplexities: a source of issues and distinctions

The general idea can be very well illustrated from the subject of definitions. Realistically speaking, we know quite well what (grammatically or, as one says, syntactically correct) definitions are, including useful and useless ones, intended and unintended ones, and so on (cf. 1.1.3.a). Perplexities arise if we assert (correctly) different properties of particular (memorable) definitions, which are used heavily in reasoning with each of those definitions, and then expect (wrongly) that all definitions have all those properties. And paradoxes or formal contradictions arise when nothing satisfies simultaneously all the properties of the various particular objects (in our case: particular definitions) involved. Such paradoxes are tantalizing if one tries to locate an error since, after all, each property is satisfied by some of the objects considered. So if there is an error, it consists in our having (wrongly) given equal weight to each of those objects. In short, it is not a formal error.⁵

The strategy suggested by the general idea above is this: to use those perplexities (or contradictions) to discover kinds or classes of definitions which are more significant than the general notion of definition. And that significance is, as always, expressed by listing the striking properties of such special kinds of definitions. To say that we have used our perplexities to find a special class implies of course that all definitions in that class must have one of the properties that had struck us in the first place. (Of course, a bit of work will be needed to verify that the special class does have this property.)

In a sense, paradoxes are even better than a mere perplexity. For, if all but one of the assertions involved in a paradox hold for a particular kind of definition, then we know that the remaining assertion is not true. In this way, a paradox has a continued permanent, albeit potential use, whenever we study a new category (of definitions). Somewhat loosely, one speaks of solving a paradox if one has found a memorable category for which, in fact, all but one of the assertions are true.

The heuristic use described above is in accordance with an old saying:

Science gives (only) answers, philosophy poses the questions.

The saying is not quite true. For example, in higher mathematics one usually proceeds without perplexities: one has a striking conclusion about a particular object (say, a particular definition), and then looks for properties (of that object) sufficient to carry through the proof. Those properties determine then a class of objects which we know to be significant inasmuch as the latter satisfy the conclusion with which we started.

⁵ In the recent philosophical literature, which goes back to Wittgenstein, one says that definitions involve a family of concepts. From the point of view described above, the stress is different: we have a family of (significant) properties, not shared by all definitions. Loosely speaking, the general idea of definition is not imprecise, but too general, too superficial.

Exercises 1.1.2 Precision and significance of distinctions: borderline cases.

a) Consider precisely defined ideas; for example, of polynomials with integral coefficients of degree 1, 2, 3,... Was there a stage at which some distinctions were more significant for you than they are now? (Hint: Was there a time when you could solve linear equations but no others? Linear and quadratic equations, but no others? Can you now solve cubic equations? Can you do anything with cubic equations? If you cannot, is the class of quadratic polynomials not more significant than the class of all polynomials? In other words, is the distinction between quadratic polynomials and other things not more important than the separation of all polynomials from other things?)

b) Do you think that the point made in (a) applies to some (precise) distinction made in the philosophical literature which you know? For example, between analytic and synthetic reasoning? And what about the distinction between those propositions which can be established by analytic or synthetic (but not analytic) reasoning? (Hint: Wait for 1.2.10.a-c.)

c) If precise distinctions are sometimes of dubious value, what do you gain by making such distinctions? (Hint: Consider circles and straight lines, where arcs of circles of large radius are practically indistinguishable from segments of straight lines, and ask yourself both what you know of circles and of lines separately, and what you can say that is true of the mixture.)

Examples of distinctions (to be developed later)

Abbreviations are not sources of the dramatic kind of error familiar from a thoughtless use of the definite article. The definition 1.1.1 of circle in geometry is not intended as an abbreviation. But because it has the same form, being a so-called explicit definition, results about abbreviations apply to it too (provided one does not assume anything one knows about circles, but always goes back to 1.1.1, as in Euclid’s ritual). Clearly, the class of definitions for which abbreviations are typical is particularly suitable for avoiding errors (mechanically, just by looking at the form).

As is to be expected, this property is in conflict with another important element of reasoning, namely: information content (in the theoretical sense of deriving new consequences, or in the practical sense of deriving old consequences more quickly). For example, if ‘the even prime number’ is known to be a definition at all (that is, that there is exactly one even prime number) this tells us immediately that 10 is not prime since 2 is prime. For this very reason, one does not expect mechanical general rules for using the definite article which improve its information content.
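The condition under which a phrase such as ‘the even prime number’ is a definition at all (exactly one object satisfies the description) can be sketched as a check. The sketch assumes a finite range of candidates, which is a simplification not made in the text; the names `the` and `is_prime` are hypothetical and introduced only for illustration.

```python
def the(predicate, domain):
    """Return the unique element of `domain` satisfying `predicate`.

    Raises ValueError when no element, or more than one, qualifies, i.e.
    when the description 'the x such that ...' is not a definition at all.
    Assumption: `domain` is finite, so that it can be searched exhaustively.
    """
    matches = [x for x in domain if predicate(x)]
    if len(matches) != 1:
        raise ValueError(f"description is not definite: {len(matches)} candidates")
    return matches[0]

def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

even_prime = the(lambda n: n % 2 == 0 and is_prime(n), range(1, 101))
print(even_prime)  # 2

# 'The square root of 4' among the integers from -10 to 10 is not definite,
# since both 2 and -2 qualify.
try:
    the(lambda y: y * y == 4, range(-10, 11))
except ValueError as err:
    print(err)  # description is not definite: 2 candidates
```

The check is expensive compared with expanding an abbreviation: it searches the whole range, whereas an abbreviation is verified by looking at its form alone. This is a small instance of the conflict between avoiding error and information content described above.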

Exercises 1.1.3 The definite article.

a) Are the expressions ‘the inverse of x’ and ‘the square root of x’ definitions? What background knowledge is assumed here? How do you improve your answers by making them insensitive to background knowledge? (Hint: Remember the meaning of the inverse $x^{-1}$, i.e. $x \cdot x^{-1} = 1$, and the role of zero; and that if $y^2 = x$ then also $(-y)^2 = x$, provided negative numbers are used.)

b) Can you work up a paradox using the expression ‘the greatest integer’? (Hint: Use the fact that the successor of any integer x is greater than x.)

c) Can you work up a paradox using ‘the greatest set $x$ such that all members $y$ of $x$ satisfy $y \notin y$’? (Hint: Note that if all members $y$ of a set $x$ satisfy $y \notin y$, the same is true of $x \cup \{x\}$; and $x \cup \{x\}$ is strictly greater than $x$ because $x \in x \cup \{x\}$ but $x \notin x$.)

d) If in mathematics or in old-fashioned grammar lessons you are taught to use the definite article only if you know that exactly one object is involved, what do you gain by this rule? (Hint: Remember the sometimes conflicting aims of avoiding error easily, and getting results quickly with relatively few errors.)

e) If logic is the science of reasoning, which of the two aims in (d) above should it emphasize? Maybe both?

f) Besides avoiding errors and getting (interesting) results, do you also expect answers to: What is error? and: What is interesting?

g) It is a common idea that there must be simple general rules for the actual use of the definite article (in contrast to the restricted rules of elementary texts on grammar or mathematics, mentioned in (d) above) because small children learn that use quite easily. Find ten or more weaknesses of that idea and, if possible, one plausible reason for it. (Hints: Remember empirical facts about learning languages; for example, that small children learn the actual use of the definite article more easily than grown-ups whose native language does not have an equivalent, like Russian or Latin. What mechanisms of biological, in particular, linguistic development do make that idea plausible? What do you think of those mechanisms? Is the geometric shape of leaves on plants of some given species any less well-determined or less easily recognized than the actual use of the definite article? What simple general rules do you expect in botany? Do you expect general rules to refer to, say, the molecular structure of the nucleus, or to features which have been observed for 2000 years or more?)

Warnings: what not to expect of our elementary (i.e. general) distinctions

Individuals can be identified either by fingerprints or by the DNA sequence in their genetic material. The latter is much more informative (but of course also more difficult to obtain). As one says, the differences between the two descriptions of the same object are much more important than the formal fact that they are both definite. The example illustrates the difference between the kind of generality treated here (the result of looking at different objects superficially), and the generality of molecular biology (which is not on the surface at all).

In the philosophical literature, you find the opposite view as it were of this difference: Before you can even raise the question of identifying a person (by fingerprints or by genetic material) you must start with criteria of identity, expressing which person you mean. Since these criteria are assumed to be prior to discoveries, it is then suggested that the (philosophical) analysis of those criteria should be prior to science.⁶

⁶ Obviously, the same applies mutatis mutandis to molecular definitions of metal used in chemistry and the like.

One oversight in the view just stated is the assumption that we cannot do science (properly) unless we can put those criteria into words, over and above recognizing identity; so to speak:

say what we mean (and not only: mean what we say).

This oversight is similar to the one corrected by Aristotle (cf. p. 5) about chains of definitions. But this is not all.

Far from being more responsible and more sophisticated (as it would seem) than ordinary scientific or every-day reasoning, the view involves some very dubious assumptions. For one thing, it ignores of course the fact that we grow (intellectually), absorb discoveries, and then refine or even change identity criteria: in such cases a detailed formulation is premature. But even when no such changes are to be expected, the view is suspect because of the peculiarly simple-minded conception of our intellectual mechanism which is obviously implicit in it. This becomes particularly vivid if we remember the popular question:

What do we actually mean?

where the answer is to be given by means of a definition; specifically, the definition which reflects the mechanism involved in specifying the object meant. The idea is that if we are not dumb we should be able to make that mechanism conscious to ourselves (or even, that if we are conscientious we should be able to control that mechanism so as to follow the definition). Perhaps so. But at least in the case of familiar objects the idea is more dubious (to put it mildly) than the unanalyzed meaning of the definiendum, the notion that is to be defined. For example, the idea ignores the possibility that there is a whole battery of built-in responses which are used for recognition (having evolved, as one says nowadays, to cope with the world in which we live). A property recognized in this way could be called sui generis (if one likes to revive old terminology).

A new philosophical problem

The warnings above, though sound as far as they go, say nothing about the fact that definitions, like 1.1.1, do have an obvious interest, and not only in mathematics. There is certainly no more reason to doubt the validity of that interest merely because it is unanalyzed, than to doubt the validity of unanalyzed notions as in the previous subsection. But here we can do better, as follows:

If a definition is given for the sake of a simple-minded conception (here: of our intellectual mechanism), it stands a good chance of being useful in a (different) situation to which the conception does apply.

Specifically, whatever our means of recognizing circles may be, the definition 1.1.1 corresponds very well to the process of drawing circles by use of a pair of compasses; in short, the definition suggests a means of simulating the results obtained by means of our intellectual mechanism even when the details of the process are quite different. This fact is sometimes used in Artificial Intelligence. The new philosophical problem is to discover such situations.
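The correspondence between the definition and the compass construction can be illustrated by a small simulation. This is only a sketch under an assumption: it takes definition 1.1.1 in the form restated in Definition 1.2.1 (points at a fixed distance from a center). The function names, the center (1, 2) and the radius 3 are illustrative choices, not part of the text.

```python
import math

def draw_with_compasses(center, radius, steps=360):
    """Simulate a pair of compasses: sweep a point at fixed distance round the center."""
    cx, cy = center
    return [(cx + radius * math.cos(2 * math.pi * k / steps),
             cy + radius * math.sin(2 * math.pi * k / steps))
            for k in range(steps)]

def satisfies_definition(point, center, radius, tol=1e-9):
    """Check the defining condition: the distance from point to center equals radius."""
    return abs(math.dist(point, center) - radius) <= tol

# Every point produced by the simulated compasses meets the defining condition.
drawn = draw_with_compasses((1.0, 2.0), 3.0)
all_on_circle = all(satisfies_definition(p, (1.0, 2.0), 3.0) for p in drawn)
```

The drawing routine and the checking routine are different processes that agree in their results, which is the sense of ‘simulating’ used above; nothing here is claimed about how circles are actually recognized.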

1.2 The Circle: Two Aspects of Explicit Definitions

The two aspects considered in this section are:

1. the use of the definition to find out properties of circles

2. the use of the definition as a help (auxiliary) in finding properties of straight lines.

Evidently, in case 1 the word ‘circle’ is not intended as a mere abbreviation for the longer expression 1.1.1: on the contrary, the identification of circle with 1.1.1 expresses a property of circles. Only the form of that property happens to be similar to the form of an abbreviation.

A first definition of circle

1.1.1 can be rephrased, more fully, as follows.

Definition 1.2.1 The circle with diameter AB and center C is the set of points P such that (the distance) PC is equal to (the distance) AC or, equivalently, BC. So both A and B are such points P.

Exercises 1.2.2 Figures of constant width. The width of a (convex) figure in a given direction is the distance between two parallel support lines (i.e. lines touching the figure in only one point) perpendicular to the given direction.

a) Are circles the only figures which have the same width in all directions? (Hint: Consider an equilateral triangle ABC, and draw the circular arcs AB, BC and AC with center the opposite vertex. For a juicier example, pick any d and extend the edges BA and CA beyond A to $BA_1$ and $CA_2$ by d, i.e. $AA_1 = AA_2 = d$, draw the circular arc $A_1A_2$ with center A, and do the same for the other vertices.)

b) If not, mention at least one property possessed by all circles, but not by all figures of constant width. (Warning: This is a ‘trick’ question which merely tests that you understand the words.)

c) What is the bearing of the exercises above on the matter of choosing definitions?
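The first figure in the hint to (a) can be checked numerically. The sketch below assumes an equilateral triangle of side 1 placed at illustrative coordinates, samples the three circular arcs, and measures the width in a direction as the extent of the figure's projection onto that direction (which equals the distance between the two support lines perpendicular to it). The helper names `arc` and `width` are hypothetical.

```python
import math

def arc(center, radius, start_deg, end_deg, n=601):
    """Sample n points of a circular arc, endpoints included."""
    cx, cy = center
    pts = []
    for k in range(n):
        t = math.radians(start_deg + (end_deg - start_deg) * k / (n - 1))
        pts.append((cx + radius * math.cos(t), cy + radius * math.sin(t)))
    return pts

def width(points, direction_deg):
    """Width in a direction: extent of the projection of the points onto it."""
    t = math.radians(direction_deg)
    proj = [x * math.cos(t) + y * math.sin(t) for x, y in points]
    return max(proj) - min(proj)

# Equilateral triangle with side 1 (coordinates chosen for illustration).
A, B, C = (0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)

# Each arc joins two vertices and is centered at the opposite vertex.
figure = (arc(A, 1.0, 0, 60)         # arc BC, center A
          + arc(B, 1.0, 120, 180)    # arc CA, center B
          + arc(C, 1.0, 240, 300))   # arc AB, center C

widths = [width(figure, d) for d in range(0, 180, 3)]
triangle_widths = [width([A, B, C], d) for d in (0, 90)]
```

Under these assumptions every entry of `widths` agrees with the side length up to sampling error, while the widths of the bare triangle in `triangle_widths` differ from one another.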

Using 1.2.1 we can prove the following

Theorem 1.2.3 Let AB be any segment of a circle with center C, and P be any point of the arc of the circle which is cut off by A and B.

1. If P and C are on the same side of AB, the angle δ subtended by AB at P is half of the angle subtended by AB at C (and therefore constant).

2. If P and C are on opposite sides of AB, the angle subtended by AB at P is $180^\circ - \delta$ (and therefore again constant).

In particular, if AB is a diameter then the angles in 1 and 2 are both $90^\circ$.

Proof. For part 1, suppose first that C is inside the triangle ABP. Since the angles at the base of an isosceles triangle are equal, the notation of the figure is justified (the figure is not reproduced here; $\alpha$ and $\beta$ are the base angles of the isosceles triangles ACP and BCP, and $\gamma$ is the base angle of ACB). Since the sum of the angles of each of the triangles APB and ACB is $180^\circ$, $2(\alpha + \beta) = A\hat{C}B$.⁷ Since $\alpha + \beta$ is the angle subtended by AB at P, the theorem holds for the position of C considered.

Next, suppose that C is outside the triangle ABP. The angle subtended by AB at P is $\beta - \alpha$ (in other words, we have the previous case if $\alpha$ is regarded as negative). The angle subtended at C is $180^\circ - 2\gamma$, but also

$(180^\circ - 2\alpha) - (180^\circ - 2\beta) = 2(\beta - \alpha)$

since the sums of the angles of both ACP and BCP are $180^\circ$. So again the theorem holds.

For part 2, prolong PC to $P'$. Since, by the proof of case 1, the angles $P'\hat{A}P$ and $P'\hat{B}P$ are $90^\circ$, $A\hat{P}B = 180^\circ - A\hat{P}'B$. □
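Theorem 1.2.3 can be spot-checked numerically. The following is a minimal sketch, assuming a unit circle with center C at the origin and a chord AB whose endpoints sit at 200° and 340° on the circle; these values and the helper names are illustrative choices, not part of the text.

```python
import math

def point_on_unit_circle(deg):
    t = math.radians(deg)
    return (math.cos(t), math.sin(t))

def angle_at(vertex, p, q):
    """Angle in degrees subtended by the segment pq at vertex (between 0 and 180)."""
    ax, ay = p[0] - vertex[0], p[1] - vertex[1]
    bx, by = q[0] - vertex[0], q[1] - vertex[1]
    return math.degrees(math.atan2(abs(ax * by - ay * bx), ax * bx + ay * by))

C = (0.0, 0.0)
A, B = point_on_unit_circle(200), point_on_unit_circle(340)

# delta is half the angle subtended by AB at the center C.
delta = angle_at(C, A, B) / 2

# Points P on the arc lying on the same side of AB as C (part 1) ...
same_side = [angle_at(point_on_unit_circle(d), A, B) for d in range(-10, 200, 15)]

# ... and on the arc lying on the opposite side (part 2).
opposite_side = [angle_at(point_on_unit_circle(d), A, B) for d in range(210, 340, 15)]
```

With this chord the central angle is 140°, so the sampled angles in `same_side` should all be close to `delta` and those in `opposite_side` close to `180 - delta`, in line with parts 1 and 2 of the theorem.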

Remark. The theorem as stated is not geometrically obvious: if $P_1$ and $P_2$ are on the circle but on different sides of AB, the difference between the angles subtended by AB at them is $180^\circ - 2\delta$ however close $P_1$ and $P_2$ may be to (say, B and) each other. Small differences in the position of points (near B or A) make big differences in the angles subtended by AB.⁸

⁷ $A\hat{C}B$ indicates the angle subtended by AB at C.

Exercises 1.2.4

a) Does the situation pointed out in the Remark above cast doubt on the definition 1.2.1 of circle? If not, what is in question? (Hint: Consider generally the notion of angle $A\hat{P}B$ when the side PB is much smaller than AP and AB. Then small changes in the position of B, that is, small compared to AP, make a big difference to the angle $A\hat{P}B$. In other words, for P very near to B, the measure of the angle is not well-defined geometrically; for example, with B fixed, the angle is constant only if P lies on the circle itself, and not if P merely lies in some narrow circular strip (annulus) bounded by the circle.)

b) Is there anything about the Theorem to alert one to some kind of peculiar behavior when P is near A or B? (Hint: The theorem says nothing about the case when P = A or P = B.)

c) What do you think of a reformulation of the Theorem when one redefines the ‘angle subtended by AB at P’ to mean the exterior angle (of ABP) at P when P is at the other side of AB from C (so to speak, when P turns its back on AB)? (Hint: For small changes in position, P suddenly turns its back.)

d) Leaving aside the case of points very near to A and B, can you think of a good use of the Remark above? (Hint: The difference in angle provides a convenient way of determining whether or not P is on the same side of AB as C.)

A second definition of circle

Theorem 1.2.3 suggests the following

Definition 1.2.5 The circle with diameter AB is the set of points where the angle subtended by AB is a right angle, together with the points A and B.
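The agreement of Definition 1.2.5 with Definition 1.2.1 can be tried out numerically before it is proved. In the sketch below, points P with a right angle at P are constructed directly (as feet of perpendiculars from B onto lines through A), and their distance from the midpoint C of AB is then compared with AC. The coordinates of A and B and the helper name are illustrative assumptions.

```python
import math

A, B = (-2.0, 1.0), (4.0, 3.0)
C = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)   # midpoint of AB
radius = math.dist(A, C)

def right_angle_point(direction_deg):
    """Foot of the perpendicular from B onto the line through A in the given
    direction; by construction the angle APB is a right angle (or P is A or B)."""
    ux = math.cos(math.radians(direction_deg))
    uy = math.sin(math.radians(direction_deg))
    s = (B[0] - A[0]) * ux + (B[1] - A[1]) * uy
    return (A[0] + s * ux, A[1] + s * uy)

# Points satisfying 1.2.5, and their distances from C as required by 1.2.1.
points = [right_angle_point(d) for d in range(0, 180, 10)]
distances_to_C = [math.dist(P, C) for P in points]
```

A numerical check of this kind covers only the sampled points and is no substitute for the argument in the text.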

By Theorem 1.2.3, every point of the circle with diameter AB (as defined in 1.2.1) does in fact satisfy 1.2.5.

Conversely, suppose P belongs to the set defined in 1.2.5, and C is the midpoint of AB. Then PC = AC = BC. Let CX be parallel to BP and CY parallel to AP (with X on AP and Y on BP). Then AXC and CYB are congruent (equal base, same angles). So

AX = CY.

Also, CXPY is a rectangle. So

XP = CY, and hence AX = XP.

Then AXC and PXC are congruent. So

AC = CP,

and P satisfies 1.2.1.

⁸ Of course, there are many well-known examples, besides angles and distances, where large effects are produced by small changes. For example, for $0 < x < \pi$ the two curves

$y = 1$ and $y = 1 + \varepsilon \sin Nx$

(where $\varepsilon$ is small and N is large) are very close (and thus determine approximately the same area) but differ greatly in their lengths.
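The example of the two curves y = 1 and y = 1 + ε sin Nx (footnote 8) is easy to try out. The sketch below uses a midpoint rule to estimate area and length; the values ε = 0.01 and N = 1000, the number of subdivisions, and the function name are illustrative assumptions only.

```python
import math

def area_and_length(f, df, a, b, n=100_000):
    """Midpoint-rule estimates of the area under y = f(x) and of the curve's length.
    df is the derivative of f, supplied by the caller."""
    h = (b - a) / n
    area = length = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        area += f(x) * h
        length += math.sqrt(1.0 + df(x) ** 2) * h
    return area, length

eps, N = 0.01, 1000   # illustrative values: eps small, N large

flat = area_and_length(lambda x: 1.0, lambda x: 0.0, 0.0, math.pi)
wavy = area_and_length(lambda x: 1.0 + eps * math.sin(N * x),
                       lambda x: eps * N * math.cos(N * x), 0.0, math.pi)
```

Here `flat` and `wavy` are pairs (area, length). The slope of the second curve is of size εN, so its length is governed by the product εN rather than by ε alone, whereas the difference in area is at most of order ε.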

A philosophical scheme and the choice between the two definitions of the circle

Since we talk of two definitions (of the circle with diameter AB), we obviously do not think of a definition as an abbreviation, or of each object as having a privileged definition: in the first case the word ‘circle’ would be an abbreviation for two different expressions, and in the second case only one of these expressions would be privileged. Instead, we regard the definitions as properties of circles expressed in a particular form.⁹

Since the same geometric figures satisfy the two definitions, the differences between the definitions cannot be analyzed in terms of, say, Euclid’s Elements. In other words, a geometric proposition about circles which is true for one definition is also true for the other. Furthermore proofs of such a proposition, or at least their lengths, will not differ much when they start with one definition instead of the other. After all, the passage between the definitions (in fancy language: the proof of their equivalence) is really quite short. So, patently, in terms of truth or length of proof, the answer to the question:

Which of the definitions comes first?

is: Neither (since they are equivalent for propositions expressed in the terms mentioned). On the other hand, some differences strike us by the light of nature; for example, differences concerning the construction and measurement of circles (as opposed to differences of pure geometry):

• The first definition corresponds to the construction by means of compasses (as mentioned on p. 11), and also to a familiar way of checking whether a point P lies on the circle (namely, by measuring the distance of P from the center C of AB).

• The second definition corresponds to a different method which is more useful if, say, C is not visible from P but A and B are.

⁹ In the words of the catechism, terminology (here: ‘definition’) is the outward and visible sign of an inward and invisible grace (here: of having an adequate concept of ‘definition’).

These distinctions seem very special, quite specific to the particular definitions considered. They do not give a recipe for finding differences between other equivalent definitions, let alone for finding a general, ‘fundamental’ order between such things. Can’t we do better?

Exercises 1.2.6 Axiomatic analysis.

a) Do you find the idea of a ‘geometric proposition’ vague? (Hint: Logic provides a very general scheme for specifying all such propositions, and - this is of course crucial - the latter turn out to satisfy the conclusions of the last paragraph. For reference, remember the word ‘extensional’.)

b) Granted the choice of (geometric) propositions, do you find the idea of ‘equivalence’ arbitrary because the choice of methods (axioms) used in the equivalence proof is arbitrary? (Hint: Experiment by generalization. Consider also circles on a sphere, and instead of straight lines PA, PB, PC consider the shortest distance on the sphere, i.e. great circles. In fancy language: use other axioms, and see if they imply the passage from one definition to the other, but not vice versa, thereby making one prior to the other.)

c) Do you regard the suggestion, in (b), of an ‘axiomatic analysis’ as a satisfactory (or even as the fundamental) recipe for establishing an order among equivalent definitions?

More than 2300 years ago Aristotle¹⁰ discussed his ideas on: how to do better. Those ideas make up the ‘philosophical scheme’ referred to in the heading of the present subsection. First of all, he talks generally about an order, or rather two orderings, of things (here: definitions):

Things are ‘prior’ and ‘better known’ in two ways: according to nature, and in relation to us. I call prior and better known in relation to us what is closer to perception, and prior and better known absolutely what is farther from perception. What is most universal is the farthest from the senses, and what is particular is the closest, and these notions are opposite to each other.¹¹

Secondly, he gives his view of the significance of his (preferred) ordering; specifically, what is prior can be treated more exactly:

The most exact of the sciences are those which touch on first principles; the sciences based on few assumptions are the most exact: thus arithmetic is more exact than geometry.¹²

¹⁰ Many people find the ancient philosophers intellectually more congenial than later ones (who are also concerned with first principles, with ideas which occur to us - or, at least, which we can understand - when we have little experience). The early philosophers could write more freely. They did not have to suppress a lot of knowledge as ‘philosophically irrelevant’, simply because there was less knowledge (to suppress).

¹¹ Posterior Analytics I, 2, 71b 33, 72a 5.

¹² Metaphysics A, 2, 982a 25–28.

Elsewhere he says what he regards as most exact: he has in mind something close to logic, except that the latter (following Leibniz) goes the whole hog, and throws in possible being (all possible worlds) for good measure:

There is a science which studies Being-as-Being and the attributes which pertain to it in virtue of its proper nature. This science is not identical with any of the so-called special sciences; for none of the latter concerns itself generally with Being-as-Being, but each of them separates off some part of Being and studies the properties pertaining to this part; this is the case, for example, with the mathematical sciences.¹³

Exercises 1.2.7 Making distinctions which are easy 2300 years later.

a) Give examples of a total and a partial ordering of the natural numbers. (Hint: The ordering by size is total; the ordering by divisibility is partial because for some pairs of numbers n and m neither n divides m nor m divides n.)

b) Give an order in nature which would have thrilled Aristotle. (Hint: The order of substances according to chemical composition, extended to molecular, atomic and nuclear structure.)

c) Is your order in (b) total or partial? (Hint: Consider substances made up of altogether different atoms, or consider different elementary particles.)

d) What are the most universal things in (b)? Are they farthest from sense? (Hint: Elementary particles are not visible to the naked eye.)

e) Using hindsight, how would you elaborate Aristotle’s conviction that the ‘really’ universal things in such an order as (b) should be far from sense? (Hint: Things are included in that order which are very diverse to the naked eye.)

f) Are there other equally universal things which are close to sense? (Hint: Anything that is seen by looking at the things superficially; for example, shape.)

g) Is mathematics universal in the sense of (d) or of (f)? What about logic? (Hint: Are all parts of mathematics equal in that respect?)

h) Do you think that when speaking of exactness (reliability) Aristotle means merely the number of principles, or does he mean that the principles of one science are literally included in the other? For example, geometry is used to state the principles of mechanics, and is literally included in mechanics. (Hint: Did Aristotle not know that one can be as wrong about one unfamiliar thing as about 20 familiar ones?)

i) Using hindsight how would you¹⁴ make Aristotle’s strategy of research plausible, that is the strategy of concentrating on the exactness of principles of theory and of ignoring the matter of errors in their application? (Hint: Time and again, questions of principle turned out to be easy when applications were simply unmanageable; for example, in mechanics, where celestial phenomena like the motion of the planets could be managed long before terrestrial ones. In any case there is more to see of the latter, our standards of detail are automatically more demanding, and hence more is liable to go wrong with a science that deals with terrestrial phenomena.)

¹³ Metaphysics Γ, 1, 1003a 21–26.

¹⁴ Aristotle himself appealed to social approval of the pursuit of knowledge for its own sake, and to social conditions of leisure (cf. Metaphysics A, 1, 980a 21–981a 28): the (wo)man who pursued pure knowledge was regarded as wise and superior to others. Aristotle does not discuss if it is wise to rely on social approval.
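The distinction in exercise (a) between a total and a partial ordering can be tested on an initial segment of the positive integers. The sketch below is only an illustration: the function names and the range 1 to 20 are assumptions, and zero is left out so that divisibility can be tested with the remainder operator.

```python
def is_total(elements, leq):
    """An ordering is total if every two elements are comparable."""
    return all(leq(m, n) or leq(n, m) for m in elements for n in elements)

def by_size(m, n):
    return m <= n

def by_divisibility(m, n):
    return n % m == 0          # m divides n

numbers = range(1, 21)         # an initial segment of the positive integers

# Pairs for which neither number divides the other.
incomparable = [(m, n) for m in numbers for n in numbers
                if m < n and not by_divisibility(m, n) and not by_divisibility(n, m)]

size_is_total = is_total(numbers, by_size)
divisibility_is_total = is_total(numbers, by_divisibility)
```

The list `incomparable` collects exactly the pairs mentioned in the hint, such as 2 and 3.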

The exercises above stress the positive side of Aristotle’s ideas. Specifically, his ideal of some kind of general order of nature has been realized (beyond his wildest dreams), and some mildly disturbing ambiguities in his wording are removed by easy distinctions. But nothing that he says gives a hint for our particular case, for fitting the two definitions of the circle into (anything remotely like) that general order. So far nobody else has developed any such order either.

Principal philosophical corollary (of the negative point). If no hint of such an order is in sight, and if nevertheless questions of priority concerning the two definitions are to be treated, one must so to speak lower one’s expectations; for example, one must accept answers which are quite special to circles, or treat such superficial aspects as in the axiomatic analysis of 1.2.6.

As always, there are several coherent reactions to the corollary. The most obvious one is to ignore altogether (philosophical) questions such as the one above (about a general order of priority among definitions). Another is to go ahead and elaborate the kind of modest answers we can get. The actual development of philosophy into its current branches does the latter. Specifically, analytical and axiomatic (mathematical) philosophy correspond quite well to the two kinds of lowered expectations mentioned in the corollary. They are not wholly unsatisfactory: when nothing at all can be done to approximate Aristotle’s ideal, and something can be done to satisfy the reduced aims, the latter gain in relative interest.

To see this, it is useful (at least, for some readers!) to speculate quite loosely on discoveries that could provide (more) satisfactory answers. Such speculations serve two purposes:

• First of all, if the chance of such discoveries is found to be unrealistic (at present), one will see the reduced aims more clearly, not clouded by vague, unfulfilled hopes.

• But also, the speculations are enough to correct a very widespread and wholly false impression that one cannot even begin to imagine an answer to the question involved.¹⁵ (This false impression produces its own kind of perplexity.)

For these strictly pedagogic reasons, we now digress on a part of philosophy which is not suitable for academic teaching, examinations and the like, but is familiar from the:

Popular philosophical literature

Here one does not solve problems or answer questions, but simply keeps alive such ‘grand’ ideals as Aristotle’s of a general order of nature, usually by means of speculations connected with the most active area of contemporary thought.

Example. At the present time (cf. Section 1) biology and its ‘master plan’ for evolutionary development have fired the popular imagination. Here, an

ordering by frequency (of occurrence)

suggests itself; or, more precisely (as already stressed by Aristotle), two frequencies:

in nature and in (our) knowledge.

But the two frequencies would not be expected to be opposed to each other (provided suitably broad features of knowledge are considered). This kind of harmony is expected, inasmuch as our own development (our intellectual apparatus included) is supposed to be a response to the world in which we live. Thus the general features of that apparatus would reflect the general order of nature; some particular features could be seen to be artifacts, connected with some such drill as in mathematical training. The stability of other features which are quite unperturbed by drill, would be taken to be particular cases of biological specificity (most mutations being lethal). From the point of view just described, one keeps away from close introspection for much the same reasons which Aristotle had for keeping away from things close to sense.¹⁶

¹⁵ Incidentally, this applied also to the question: What is matter? Until some 150 years ago, Galileo and Newton had very witty arguments for the atomic structure of matter, without the remotest chance of developing anything remotely like a conclusive answer.

Exercises 1.2.8 For readers with a speculative turn of mind.

a) Do the facts you remember from your first lessons in geometry fit in with the view just described? (Hint: If at the beginning you could not see why one assertion was regarded as an axiom, another as a theorem: Is the logical order spurious for the order of nature?)

b) Is there evidence for a growth of the intellectual apparatus? (Hint: Some theorems become second nature after their proofs and/or frequent use.)

c) Is there evidence for limits to such growth? (Hint: Remember the need for starting early, but also the theorems which never become second nature.)

d) Does the view resolve the opposition between mentalist and behaviorist ‘accounts’? (Hint: On the view even the combination of mentalist and behaviorist data would be expected to be too close to sense, in accordance with experience in physics, say, of optical and material properties of matter such as hardness of diamonds and glass; one needs atomic structure to connect those properties which are both close to sense.)

e) Would you expect general philosophical considerations or mathematical theorems to be sufficient for establishing or refuting the view sketched above? (Hint: Remember Galileo’s refutation of the hypothesis: $v = cs$ for freely falling bodies, and his evidence for: $v = ct$.)

¹⁶ Reminder. If the two frequencies do differ, good-bye to the grand scheme.

The example above presents speculations, and only speculations. For though it speaks of empirical data, of two sorts of frequencies and the possibility of harmony between the two frequencies, it is not known whether the possibility is realized. This defect is so to speak complementary to another wide-spread assumption that the ‘foundations’ of our knowledge ‘must’ be independent of such possibilities, but it does not spoil the specific pedagogic use (of the example) described above. Readers who still feel like speculating, should invent for themselves other candidates for some fundamental general order of things, and exercises on their candidate in the style of the example above. Other readers can return to earth, and consider:

Definitions as an auxiliary means

We now show how Definition 1.2.1 and Theorem 1.2.3 can be used for learning more not about circles, but about parallel lines.

Theorem 1.2.9 Suppose E, A, B lie on one line, X, $A'$, $B'$ on another, and

1. $AA' \parallel BB'$

2. $EB' \parallel XA$

(where $\parallel$ means: is parallel). Then $EA' \parallel XB$.

Proof. The proof depends on the fact that, for a suitable point C on the line BAE, the three quadrilaterals $CA'AX$, $CB'BX$ and $CB'A'E$ are all concyclic (i.e. their vertices lie on a circle).

• There is a (unique) C such that $CA'AX$ is concyclic. There is of course such a C, namely the intersection of the line BAE and the circle through $A'$, A, and X. There is at most one such point, since there is only one circle through three points $A'$, A, and X, and the line ABE meets that circle in two points, namely C and A.

• $CB'BX$ is concyclic. By Theorem 1.2.3 and the fact that $CAA'X$ is concyclic,

$A\hat{C}X = A\hat{A}'X$.

But

$A\hat{C}X = B\hat{C}X$

because C lies on the line AB, and

$A\hat{A}'X = B\hat{B}'X$

by hypothesis 1. Thus $B\hat{C}X = B\hat{B}'X$, and $CB'BX$ is concyclic by Theorem 1.2.3.

• $CB'A'E$ is concyclic. By Theorem 1.2.3 and the fact that $CAA'X$ is concyclic,

$A'\hat{C}A = A'\hat{X}A$.

But $A'\hat{C}A = A'\hat{C}E$ because C lies on the line AE, and

$A'\hat{X}A = A'\hat{B}'E$

by hypothesis 2. Thus $A'\hat{C}E = A'\hat{B}'E$, and $CB'A'E$ is concyclic by Theorem 1.2.3.

• $EA' \parallel XB$. By Theorem 1.2.3 and the fact that $CB'A'E$ is concyclic,

$E\hat{A}'X = B'\hat{C}E$

because the left-hand side is the exterior angle at $A'$ of $CB'A'E$, and the right-hand side is the angle at the opposite vertex C. Since B lies on the line CE, $B'\hat{C}E = B'\hat{C}B$. By Theorem 1.2.3 and the fact that $CB'BX$ is concyclic,

$B'\hat{C}B = B'\hat{X}B$.

Since $A'$ lies on the line $B'X$,

$B'\hat{X}B = A'\hat{X}B$.

Thus

$E\hat{A}'X = A'\hat{X}B$,

and hence $EA' \parallel XB$. □
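Theorem 1.2.9 can also be tried out with coordinates. The sketch below is a numerical illustration, not a replacement for the proof: it assumes that the two lines meet in a point O taken as origin (the case of two parallel lines is not covered), and the directions `u`, `v` and the parameters `a`, `b`, `a1`, `x` are arbitrary illustrative values. The primed points $A'$, $B'$ are written `A1`, `B1`.

```python
def cross(p, q):
    return p[0] * q[1] - p[1] * q[0]

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])

def scale(t, v):
    return (t * v[0], t * v[1])

def parallel(p1, p2, q1, q2, tol=1e-9):
    """True if the line p1p2 is parallel to the line q1q2."""
    return abs(cross(sub(p2, p1), sub(q2, q1))) <= tol

# Two lines through the origin with directions u and v (illustrative choice).
u, v = (1.0, 0.2), (0.3, 1.0)
a, b, a1, x = 2.0, 5.0, 1.5, 4.0
b1 = a1 * b / a      # chosen so that hypothesis 1 holds: AA' parallel to BB'
e = a * b1 / x       # chosen so that hypothesis 2 holds: EB' parallel to XA

E, A, B = scale(e, u), scale(a, u), scale(b, u)
X, A1, B1 = scale(x, v), scale(a1, v), scale(b1, v)

hypothesis_1 = parallel(A, A1, B, B1)
hypothesis_2 = parallel(E, B1, X, A)
conclusion = parallel(E, A1, X, B)
```

In this coordinate form the conclusion reduces to a one-line identity between products of the parameters, and no circle is mentioned; compare the question of doing without the notion of circle in the proof.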

Exercises 1.2.10

a) Try to give a proof of Theorem 1.2.9 without mentioning the word circle, using only the vocabulary of straight lines and parallel lines. (Hint: Instead of the phrase ‘CUVW is concyclic’ use ‘for some point O, the segments OC, OU, OV, OW are equal’. Or else start afresh.)

b) Prove $2 + 2 + 2 = 3 + 3$ by using facts about product. (Hint: $a \cdot b = b \cdot a$, $2 + 2 + 2 = 2 \cdot 3$ and $3 + 3 = 3 \cdot 2$.)

c) Do you see a parallel between the introduction of circle and product in the proofs of 1.2.9 and (b)? (Hint: Neither circle nor product is used in stating the results proved.)

d) Do you see any difficulty at all in developing the parallel in (c) in general terms? (Hint: Suppose the theorem is about addition, and doubling is mentioned in the proof, but not in the theorem; yet if you know how to add any two numbers, you certainly know how to double one.)

e) Leaving aside the formal question in (d) of formulating the parallel, can you think of a situation where the distinction, between proofs, considered in (d) is significant? (Hint: It is often easier to check proofs mechanically if one knows in advance which objects and, in particular, which properties of the latter will be considered in the proofs.)

f) Does your answer in (e) depend on using just the objects mentioned in the theorem proved? (Hint: Not for the particular answer hinted at in (e), for which it is enough that the objects belong to some reasonably short list.)

g) Granted then some such distinction between proofs as suggested in (d), do you want to go on and distinguish between theorems (according as to whether there is some proof which uses only notions mentioned in the theorem)? (Hint: Not without some care, and especially not when the theorem is easy, and could be used as an axiom itself.)

h) How would you sharpen the distinction proposed in (g)? (Hint: Sharpen ‘use such and such notions’ to ‘use familiar properties - on a given list - of such notions’.)

i) If you cannot think of a wholly satisfactory answer to (h), what other question would you ask? (Hint: The analog to (e) for the distinction between proofs, since an answer to this question helps one discover in which terms the sharpening should be formulated.)

j) Can you continue in the spirit of the exercises above, and do you want to do so? (Hint: Though, of course, you could continue, you would be right to wait for full answers for particular classes of proofs and theorems, for example, in geometry itself or in logic.)

k) What then was the reason for having the exercises at all? (Hint: Not to be taken by surprise by the need for some distinctions, and also to be aware of some specific distinctions which are obvious abstractly, but may be difficult to notice when one is wrapped up in studying particular details.)

Eliminating definitions (from proofs)

At this point it is worthwhile summarizing in general terms some of the points brought up by looking at Theorem 1.2.9; above all, concerning the possibility of ‘doing without’ the definition of circle. For those notions and their combinations of which the case of, respectively:

1. the notion ‘circle (with diameter AB)’

2. the combination ‘P lies on the circle with diameter AB’

is typical, 1.2.9 suggests the following formulation:¹⁷

• The definition gives conditions in some other terms in place of the combinations involving the new notion. For example, in our case, 2 gives conditions on P, A, B in terms of the geometry of straight lines.

• In the proofs considered the (new) notion occurs only in the combinations listed. Thus, in our case, the proofs must not use the expression

P is a point of intersection of the circles with diameters AB and $A'B'$,

but may use the expression

(P lies on the circle with diameter AB) and (P lies on the circle with diameter $A'B'$).

• The properties of the new notion which are used in the proof must also have proofs in terms of the definition.

Then the new notion can be completely eliminated from such proofs simply by replacing its occurrences by means of the definition. The only property of proofs which is used here is that

valid proofs go into valid proofs if equals are replaced by equals,

where ‘equal’ applied to combinations such as 2 above means any condition (on P, A, B) which holds if and only if 2 holds.
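The replacement step described here is mechanical, and a small sketch may make that plain. The encoding below is entirely hypothetical: formulas are nested tuples, the name `on_circle` stands for combination 2, and its definiens follows Definition 1.2.5 (a right angle at P, or P equal to A or B). Only the substitution is illustrated, not the checking of validity.

```python
def eliminate(formula, definitions):
    """Replace every defined combination in a nested-tuple formula by its definiens."""
    if not isinstance(formula, tuple):
        return formula                      # an atom such as a point name
    head, *args = formula
    args = [eliminate(arg, definitions) for arg in args]
    if head in definitions:
        return eliminate(definitions[head](*args), definitions)
    return (head, *args)

def mentions(formula, name):
    """True if the notion called name occurs anywhere in the formula."""
    return isinstance(formula, tuple) and (
        formula[0] == name or any(mentions(part, name) for part in formula[1:]))

# Hypothetical encoding of 'P lies on the circle with diameter AB' as a
# condition on P, A, B which does not mention circles.
definitions = {
    "on_circle": lambda P, A, B: ("or", ("right_angle", A, P, B),
                                  ("or", ("equals", P, A), ("equals", P, B))),
}

# A step of the permitted kind: a conjunction of two instances of combination 2.
step = ("and", ("on_circle", "P", "A", "B"), ("on_circle", "P", "A'", "B'"))
unfolded = eliminate(step, definitions)
```

The expression ‘point of intersection of the circles’ has no entry in `definitions` and would be left untouched by `eliminate`, which is one way of seeing why the conditions above restrict the combinations in which the new notion may occur.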

Exercises 1.2.11

a) State the conclusion about eliminating definitions a little more carefully, and give a reason for your refinement. (Hint: Add ‘so as to yield a valid proof’ after ‘can be completely eliminated from such proofs’. This distinguishes the present elimination from others which, for example, keep the length of proofs constant too.)

b) Is (a) relevant to Aristotle’s distinction between the order of nature and the order in our knowledge? (Hint: Think of validity as concerned with the nature of proofs, and of such things as length as more critical for our knowledge.)

c) Are you satisfied with your answer to (b)? (Hint: Do you need proofs at all if you are only concerned with validity, not with our knowledge of validity?)

d) What need be done to show that many proofs (in particular, about circles and straight lines) satisfy the general conditions for eliminating definitions? (Hint: One has to verify, in our case, that many combinations in which the notion of circle occurs are in fact equal to conditions obtained by combining cases of 2 above for different triples P, A, B.)

e) Does (d) present a real problem today? (Hint: No, Euclid started the job of such a verification.)

f) Why is one satisfied, in (d), with finding many proofs, instead of considering all proofs? (Hint: Theorems about circles do not only connect circles with straight lines, but with lots of other things, for example, numbers like $\pi$; at least some proofs of such theorems will not satisfy the general conditions considered.)

g) If (f) disturbs you, can you think of a recipe for peace of mind? (Hint: Find an appropriate name for the class of proofs which do satisfy those conditions, and restate the result on eliminating definitions.)

h) At the opposite extreme (that is, for readers with a speculative turn of mind, not primarily concerned with peace of mind): Does (f) suggest a new view of Aristotle’s orders? (Hint: If we limit the things considered to some small class, within that class there may be a particularly simple order - in nature - which makes that class manageable. The corresponding order in knowledge must then be enforced by drill, by the kind of formal ritual familiar in mathematics, not evolved from a response to the external world as it presents itself.)

¹⁷ Readers who find the formulation too wordy will like logic, which is well suited for expressing simple generalities in simple language.

On explicit definitions

Finally, a word about the expression ‘explicit definition’ used in the heading of the present section. The intention is to single out certain analogs to the particular definition 1.1.1 (of the circle), to give them a name (as in 1.2.11.f concerning a name for certain proofs), and achieve the following compromise:18

• The analogs should permit the applications made in Theorem 1.2.9 (elimination), and the wider the class of such analogs the better.

• The analogs should be sufficiently restricted to make it easy to recognize them when one sees them.

Within geometry, such analogs are naturally divided into explicit definitions of:

• points a, where a = t and t is an expression for a point in geometric language (for example, the midpoint of two given points), and t does not contain a

• conditions a (which define figures like the circle or the region bounded by the circle), where a is equal (in the sense of p. 22) to a condition t in geometric language.

18This is fairly typical of the compromises needed for choosing categories which are useful in science.

More generally, instead of geometric language, one may take any other terms, provided of course that one restricts oneself to terms that are properly understood (cf. 1.1.3.a-b).

Exercises 1.2.12 Non-explicit definitions. a) Show that explicit definitions, as introduced above, provide only a compromise. (Hint: In arithmetic, 3a = 2a + 1 does not have the right form for an explicit definition, but proofs containing it remain valid if it is replaced by a = 1.) b) Generalize (a). (Hint: Take any property P(x) of numbers, for which one can prove that x = 1 is the only number which satisfies P(x).) c) Why is the generalization in (b) a bad compromise? (Hint: Think of a property P for which you don’t know if 1 can be proved to be the only solution.) d) Show that the forms used in ordinary (mathematical) language must be restricted if the idea of explicit definition is to serve the intended purpose. (Hint: The phrase ‘a is the square root of 4’ is often used. It is correct if - tacitly - positive integers are meant; but not if also negative integers are meant, since then both 2 and -2 are square roots of 4.) e) Does the need for the restriction in (d) shock you? (Hint: It shouldn’t, since there is no science where look-alikes always function in the same way.) f) Returning to (a), isn’t 3a = 2a + 1 circular because ‘a is defined in terms of a’? (Hint: Perhaps so, but the circularity is proved to be harmless.) g) Why are people rightly frightened of circularity? (Hint: There are vicious circularities, for example, if one says ‘let a satisfy a = a + 1’.) h) Would you like to know about circular definitions which tell you more than corresponding explicit ones? (Hint: Read on.)

1.3 Area: Two Aspects of Implicit Definitions

In this section we consider the following complementary questions concerning implicit (or, equivalently, circular) definitions of the area of geometric figures in the plane by means of some given condition C (on areas):

1. Given a collection of geometric figures, does C determine the measure of the area of each figure of the collection?

2. For which collections of not necessarily geometric figures (but other sets of points in the plane) does C determine an area?

The area of a rectangle

A good introduction to the subject is provided by the justification of the familiar (explicit) definition of the area $A_R$ of a rectangle R with integral sides a and b, that is

$A_R = a \cdot b \cdot A_U$ (where U is the unit square).

Though, by p. 3, we must always be prepared to find that the question: Why? is inappropriate, there is an obviously satisfactory answer to the question:

Why is $A_R = a \cdot b \cdot A_U$?

One splits R into a · b unit squares: each of them has the area $A_U$, and so the sum of them has the area $a \cdot b \cdot A_U$. Taken literally (as it stands), this answer has not used (or, at least, has not emphasized) any properties of squares and rectangles, except the following:

• invariance under congruence
  Two congruent figures (namely, unit squares in different locations) have the same area: if F and F′ are congruent, $A_F = A_{F'}$.

• additivity
  The area of the union of two non-overlapping figures is the sum of their areas: if F and F′ do not overlap, then $A_{F \cup F'} = A_F + A_{F'}$.

The two properties above of area (or, more pedantically, of the measure which associates a value $A_F$ to the figure F) determine a unique area $A_R$ for our rectangles R in terms of $A_U$. So, if we confine ourselves to those geometric figures which are integral sided rectangles, we can derive the relation $A_R = a \cdot b \cdot A_U$ from the implicit definition of area. It is implicit, because area occurs on both sides of the equations.
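The derivation can be mimicked in a few lines of Python. This is only an illustrative sketch: the function name `rectangle_area` and the representation of the unit square by its lower-left corner are assumptions made for the example, not part of the text.

```python
from fractions import Fraction


def rectangle_area(a: int, b: int, unit_area: Fraction = Fraction(1)) -> Fraction:
    """Area of an a-by-b rectangle with integral sides, using only the two laws.

    The rectangle is split into a * b unit squares, labelled here by their
    lower-left corners (i, j); this labelling is an assumption of the sketch.
    """
    total = Fraction(0)
    for i in range(a):
        for j in range(b):
            # Invariance under congruence: the square at (i, j) is congruent
            # to U, so it has the area unit_area.
            # Additivity: the squares do not overlap, so their areas add up.
            total += unit_area
    return total


area_3_by_4 = rectangle_area(3, 4)
area_half_unit = rectangle_area(2, 5, Fraction(1, 2))
```

Nothing about squares is used beyond the two laws; the result is always a multiple of whatever value is given to the unit square.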

Exercises 1.3.1 a) Why does one say about those properties that the answer ‘has not used any . . . except’ instead of saying simply that the answer ‘has used only’? (Hint: The answer certainly does not use the properties for arbitrary figures F, F′: it is sufficient to consider rectangles F, F′, namely those formed from our squares.) b) Is there a tacit assumption in the formulation of the properties above? (Hint: Only those figures F, F′ are considered which are to be assigned an area. Specifically, if any figure is assigned an area then any congruent figure is also assigned an area, and if each of two non-overlapping figures is assigned an area, the same applies to their union.) c) Why does one restrict oneself to non-overlapping figures? (Hint: If F and F′ do overlap, e.g. completely, it just isn’t true that $A_{F \cup F'} = A_F + A_{F'}$.) d) Do you have a snappier answer to (c)? (Hint: If F = F′ and so F ∪ F′ = F, we should have $A_F = 2 \cdot A_F$, and hence $A_F = 0$ for all F.) e) Don’t neighbouring squares overlap? (Hint: Make up your mind whether you say that two regions overlap only if there is some square which is included in both of them, or you mean by ‘unit square’ both the figures with and without - some or all - their boundary lines.)

Exercises 1.3.2 Implicit definitions in Number Theory.

a) Do you know implicit definitions of functions in number theory? (Hint: Definitions by recursion or induction. For example, of exponentiation to the base 2, sometimes written $\exp_2$ and sometimes $2^n$:19 $2^0 = 1$, $2^{n+1} = 2 \cdot 2^n$ for all n.) b) Is: $2^{n+1} = 2 \cdot 2^n$ for all n, sufficient for an implicit definition? (Hint: The condition is satisfied by any constant multiple of $2^n$.) c) Is the condition: f(n) = 2 · f(n + 1) for all n, an implicit definition of the function f? (Hint: It is, if it is - tacitly - required that the values should be integral. For $f(n + m) = \frac{f(n)}{2^m}$, and this is a proper fraction if m = f(n) and f(n) ≠ 0. So f(n) must be 0 for all n.) d) Does any difference between the definitions in (a) and (c) strike you? (Hint: The value of $2^m$ is determined by the finite number of conditions $2^0 = 1$, $2^{n+1} = 2 \cdot 2^n$ for all n < m; but no value of f, say f(0), is determined by any finite number of conditions: f(n) = 2 · f(n + 1), say for n < N, since g satisfies those conditions on f if $g(n) = 2^{N-n}$.) e) Try to formulate your answer to (d) in general, memorable terms. (Hint: Whether conditions on a function f determine its value at one place depends on where else the function is supposed to be defined.)
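The contrast drawn in (d) can be tried out directly. The following is a minimal Python sketch; the helper names `exp2`, `satisfies_halving` and `make_g`, and the extra constant factor c, are introduced here for illustration only.

```python
def exp2(m: int) -> int:
    """2^m, obtained from the finitely many conditions
    2^0 = 1 and 2^(n+1) = 2 * 2^n for n < m."""
    value = 1
    for _ in range(m):
        value = 2 * value
    return value


def satisfies_halving(g, N: int) -> bool:
    """Check the finitely many conditions g(n) = 2 * g(n + 1) for n < N."""
    return all(g(n) == 2 * g(n + 1) for n in range(N))


def make_g(N: int, c: int = 1):
    """The function g(n) = c * 2^(N - n), used only for 0 <= n <= N."""
    return lambda n: c * 2 ** (N - n)


g1 = make_g(5)        # g(n) = 2^(5 - n)
g2 = make_g(5, c=3)   # a constant multiple of g1
both_satisfy = satisfies_halving(g1, 5) and satisfies_halving(g2, 5)
values_at_zero = (g1(0), g2(0))
```

Both functions pass the same finite list of conditions, yet they disagree at 0, which is the point of the hint: no finite number of the conditions fixes f(0).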

Equality of area

Before considering whether the two laws on area are sufficient to assign a measure to, say, arbitrary polygona (that is, rectilinear figures), we shall show that they determine whether or not two polygona have the same area.

Since polygona can be split up into triangles, it is sufficient to answer the question for the latter. And instead of triangles ABC one can consider parallelograms ABCD where ABC is congruent to CDA, since then the area of ABCD is twice that of ABC.

Theorem 1.3.3 If two parallelograms ABCC′ and ABDD′ on the base AB have the same height, they have the same area.

First Proof. We apply the two laws (on area) to figures besides the two parallelograms. Then

$A_{ABCC'} + A_{BC'D} = A_{ABDD'} + A_{ACD'}$.

By congruence,

$A_{BC'D} = A_{ACD'}$.

Hence $A_{ABCC'} = A_{ABDD'}$. □

One sometimes says that the two parallelograms ABCC′ and ABDD′ are equal by complementation, because they can be complemented by adding congruent triangles to form one and the same figure, the trapezoid ABCD.
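The equations of the first proof can be checked numerically on one concrete configuration. The sketch below is only a sanity check, not a derivation from the two laws: it uses the coordinate (shoelace) formula for area, and both the coordinates and the order of the vertices are assumptions, read off from the congruences used in the text (C1 and D1 stand for C′ and D′).

```python
from fractions import Fraction


def shoelace(points):
    """Area of a simple polygon whose vertices are listed in order."""
    twice = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice += x1 * y2 - x2 * y1
    return Fraction(abs(twice), 2)


# Assumed coordinates: base AB on the x-axis, common height 3.
A, B = (0, 0), (2, 0)
C, C1 = (1, 3), (3, 3)    # sides AC and BC1 of the first parallelogram
D1, D = (4, 3), (6, 3)    # sides AD1 and BD of the second parallelogram

area_ABCC1 = shoelace([A, B, C1, C])
area_ABDD1 = shoelace([A, B, D, D1])
area_BC1D = shoelace([B, C1, D])     # triangle added to the first parallelogram
area_ACD1 = shoelace([A, C, D1])     # triangle added to the second parallelogram
area_trapezoid = shoelace([A, B, D, C])
```

In this configuration each parallelogram together with its triangle fills the same trapezoid, and the two triangles are translates of one another.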

19More pedantically, $2^n$ is not a symbol for the function, but for its value at the argument n.

Another way of using the two laws on area is to split up each of the parallelograms into figures which are pairwise congruent.

Second Proof. Look at the figure obtained as follows. KL ∥ AB, and KL goes through the point of intersection of AD′ and BC′. The parallelograms ABHL and ABKH are equal by subdivision since ABH is common to them, and the triangles with number 1 are congruent.

Now one fills the parallelogram ABDD′ with copies of ABHL as long as possible: in the figure just described, only one copy has room inside the parallelogram since the distance of D′D from XX′ (the topside of the copy) is less than the height of ABHL. For each such copy one has a copy of ABKH to fill the parallelogram ABCC′. The remaining space consists of two congruent triangles and two congruent trapezoids. □

Exercises 1.3.4 a) What difference between the two proofs strikes you most? (Hint: The second proof starts with ‘Look at the figure’.) b) Is the figure used in the second proof typical of the general solution? (Hint: No, at least in the following respect: in general, not only two copies of the smaller parallelogram will be used, but more or less.) c) What factor determines the number of copies used? (Hint: The ratio of height to base of the given parallelograms.) d) Does any tacit understanding strike you in our use of ‘equal by subdivision’? (Hint: Subdivision into a finite number of parts, not like subdividing (0, 1) into $(0, \frac{1}{2})$, $(\frac{1}{2}, \frac{3}{4})$, $(\frac{3}{4}, \frac{4}{5})$...) e) In terms of your particular answer in (d), which property of the factor in (c) is used in the proof above? (Hint: Whatever the ratio of height to base, only a finite number of copies of the smaller parallelograms have room in the given ones.) f) What is the name of the property used in (e)? (Hint: Archimedes’ axiom.)

Surprises, perplexities and (promising) problems

The two laws on area (on p. 25) have certainly given direction (or, as one says, a structure) to the results on area presented here: one looks for proofs which make use of just those laws (not only for verifying simple elementary results, but also for recondite constructions).

One thing that can be so to speak positively surprising here is the discovery that so little goes so far; that is, the discovery that the area $A_F$ is determined for so many figures F, provided the laws are intelligently applied (in terms of 1.3.2.e: to the appropriate collection of figures on which the function $F \mapsto A_F$ is defined). Another thing that may be negatively surprising is that some distinctions have to be made; for example, in 1.3.1.c (concerning overlapping squares) and especially in 1.3.2.e (applied to the proper choice of $\mathcal{F}$).

Realistically speaking, these surprises do not last very long: scientific experience contains many similar examples, both of the positive and the negative aspects and limitations. In particular, the law of diminishing returns may apply (when a good deal more does not go much further). And, generally, the sequence of necessary distinctions is not endless. But these facts of (scientific) experience can be made perplexing if instead of exploiting them, one asks:

How are they possible?

Though starting with ‘How’, this question needs as much discretion as the ‘Why’ questions mentioned on p. 3. So far, this question has not been heuristically useful along the lines considered in the subsections on p. 10 or 18. In particular, we have no answer in terms of the growth of our intellectual mechanism, and we do not know the particular features of the external world which affect that growth.20 We do not even know to what extent the details of that growth might be affected by our attempts to learn about it (how our thoughts are affected by our thinking about them).

What is more, there is no immediate prospect for promising research on these (perplexing) questions. So philosophers, in their search for wisdom, have developed a machinery for replacing those perplexities by giving them a new, specific twist. This is illustrated by the next two examples.

a) Refinement results

We consider the perplexity:

To which figures F are the two laws intended to apply?

(at the moment when we accept those laws as evident). This is replaced by the question:

For which figures F need the two laws be assumed?

(in order to ensure a particular conclusion; for example, that triangles with the same base and height have the same area).

In the two proofs of 1.3.3 one had parallelograms and trapezoids. The following (easy) modification shows that one need work only with triangles F and F′ such that F ∪ F′ is a triangle too (and not, for example, a quadrilateral as when F = ABD and F′ = ADC).

Third Proof. Consider that the following pairs of triangles are congruent:

ABC and BCC′
BC′D and ACD′
ABD and ADD′.

The following pairs of triangles add up to triangles (as required).

20Replacing geometric questions (intended as questions about physical space) by purely mathematical questions sometimes constitutes a wise lowering of expectations too.

1. HAB + ABC = HBC

2. HAB + ABD = HAD

3. BCC′ + BC′D = BCD

4. ADD′ + ACD′ = ADC

5. HBC + BCD = HCD

6. HAD + ADC = HCD.

From 5 and 6,

$A_{HBC} + A_{BCD} = A_{HAD} + A_{ADC}$.

From 1 and 3, and 2 and 4,

$A_{HAB} + A_{ABC} + A_{BCC'} + A_{BC'D} = A_{HAB} + A_{ABD} + A_{ADD'} + A_{ACD'}$.

From the congruences,

$2 \cdot A_{ABC} = 2 \cdot A_{ABD}$, that is $A_{ABCC'} = A_{ABDD'}$. □

Of course, the construction given is a piece of pure mathematics. But it is highly relevant to the (philosophical) perplexity about the kind of figures F and F′ to which our laws for area are intended to be applied: not to answer the question originally asked, but to show that the answer is not needed for (justifying) our conclusion.

In fact, one can go even further in this direction (of making our conclusion insensitive to the class $\mathcal{F}$ of figures to which the laws are applied) by means of a restatement:

Let $\mathcal{F}$ be any collection of figures which includes all triangles, and let the assignment of area $F \mapsto A_F$ satisfy the two laws for all F, F′ in $\mathcal{F}$. Then, for triangles F and F′ with the same base and height,

$A_F = A_{F'}$.

In other words, this last conclusion is not affected by any perplexity about the choice of $\mathcal{F}$ as long as $\mathcal{F}$ includes all triangles.

This is elementary but typical of a widely used device for drawing conclusions in the face of uncertainty. It is an example of making a chain stronger than its weakest link by tying off that weakest link, as follows. Our chain (of reasoning) includes the laws on p. 25, with the weakness that we have a very foggy idea of the totality of figures F and F′ to which those laws (are intended to) apply. It turns out that in practice many striking conclusions can be shown not to be affected by weaknesses of this kind; provided, of course, one tries to show this! And the heuristic value of philosophical perplexity is that it alerts us to a possibility which is to be excluded (by a scientific investigation).

b) Independence results

Next, consider the question:

Why does one choose the two laws on p. 25?

Evidently, again the question could be treated along the lines considered in the subsections on p. 10 or 18. But a more feasible twist is this:

Why does one choose the two laws (instead of just one of them)?

This has a good answer. For example, the second law alone would not be sufficient to determine whether or not the areas of (say) two triangles are equal. Indeed, the law is also satisfied by the pseudo area determined by the assignment $F \mapsto A^*_F$ defined (for figures F) as follows: for some point P in the unit square (in a fixed position),

$A^*_F = \begin{cases} A_U & \text{if } P \text{ lies in } F \\ 0 & \text{otherwise.} \end{cases}$

Consider all 4 possibilities:

P in F    P in F′    P in F ∪ F′
Yes       Yes        Yes
Yes       No         Yes
No        Yes        Yes
No        No         No

If F and F′ do not overlap, the first possibility does not arise (since then P would lie in both). In all other cases (whether F and F′ overlap or not):

$A^*_{F \cup F'} = A^*_F + A^*_{F'}$,

and thus the second law is satisfied. By giving all figures the same ‘area’, one can similarly show that also the first law alone would not be sufficient.
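Both independence arguments can be checked mechanically on a toy model. In the Python sketch below, figures are modelled as finite sets of labelled grid cells and the point P as one such cell; this modelling, and the names `pseudo_area`, `constant_area` and `is_additive`, are assumptions of the example rather than anything in the text.

```python
from itertools import combinations

A_U = 1
P = (0, 0)  # the fixed point, modelled here as a cell label (an assumption)

cells = [(0, 0), (1, 0), (0, 1), (1, 1)]
# All 'figures' of the toy model: every set of cells, including the empty one.
figures = [frozenset(c) for r in range(len(cells) + 1)
           for c in combinations(cells, r)]


def pseudo_area(figure) -> int:
    """A*_F: the value A_U if P lies in F, and 0 otherwise."""
    return A_U if P in figure else 0


def constant_area(figure) -> int:
    """The same 'area' for every figure."""
    return A_U


def is_additive(measure) -> bool:
    """Second law, checked for every pair of non-overlapping figures."""
    return all(measure(f | g) == measure(f) + measure(g)
               for f in figures for g in figures if not (f & g))


pseudo_is_additive = is_additive(pseudo_area)
constant_is_additive = is_additive(constant_area)
# Two single cells are congruent (translates of each other), yet:
congruent_but_unequal = pseudo_area(frozenset({(0, 0)})) != pseudo_area(frozenset({(1, 0)}))
```

The pseudo area satisfies additivity while assigning different values to congruent cells; the constant assignment is trivially invariant under congruence but fails additivity.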

Exercises 1.3.5 a) Is there any uncertainty about the ideas involved in the definition of $A^*$? (Hint: As in the previous example, the collection $\mathcal{F}$ of figures to which the definition applies; and the ambiguity about ‘overlapping’ mentioned in 1.3.1.e.) b) What has to be achieved by the choice of $\mathcal{F}$ and the resolution of the ambiguity? (Hint: The laws formulated in the table above should be satisfied.) c) Verify that the laws hold for any $\mathcal{F}$ and for both meanings of ‘overlap’. (Hint: The laws hold for any $\mathcal{F}$ if ‘F and F′ overlap’ means that F and F′ have some point in common, hence ‘F and F′ do not overlap’ means that they have no point in common. This is certainly ensured if F and F′ do not overlap in the other sense, of not having a square in common.)

Measure of area

A much more interesting question is this:

For which figures F do the two laws of p. 25 determine a measure of area (in terms of $A_U$ for the unit square)?

and not only equality and inequality of area.

a) Triangles with integral base and height

The two laws determine the area of a triangle with integral base and height as follows. Given ABC of base AB, consider the parallelogram ABCC′ such that CC′ ∥ AB and AC ∥ BC′. Then

$A_{ABC} = \frac{1}{2} \cdot A_{ABCC'}$.

By Theorem 1.3.3, ABCC′ has the same area as the rectangle on the same base and with the same height. By the formula for the area of a rectangle with integral sides,

$A_{ABC} = \frac{1}{2} \cdot \text{base} \cdot \text{height}$.

Exercises 1.3.6 a) Is it obvious that the product ‘base · height’ is the same whichever side of the triangle is regarded as the base? b) How does the drift of (a) compare with the discussion of Theorem 1.2.9? (Hint: They go in the opposite direction, since the implicit definitions of area together with the laws of congruence tell you something new about products.) c) Would a simpler example than (a) make the same point as (b)? (Hint: Consider a rectangle, and turn it through 90°.)

b) Polygona with rational area

For polygona F, the two laws determine the area $A_F$ if and only if the latter is a rational multiple of $A_U$.

Suppose $A_F = \frac{p}{q} A_U$, where p and q are integers. By what was proved above, F is equal by complementation to a rectangle R on the unit segment. Split R into p rectangles and U into q rectangles, all on the unit segment: these smaller rectangles are all congruent, since

$\frac{A_R}{p} = \frac{A_F}{p} = \frac{A_U}{q}$

by hypothesis. So, granted that we recognize congruences and unions when we see them, we can verify $q \cdot A_F = p \cdot A_U$ by applying the laws.

Conversely, suppose $A_F$ is determined by the two laws. Then there must be a set of figures $F_1, \ldots, F_N$, where:

• $F_1 = F$

• $F_N = U$

• $F_i$ is congruent to $F_j$ for certain pairs (i, j)

• $F_k = F_l \cup F_m$ and $F_l \cap F_m$ is empty for certain triples (k, l, m),

such that we determine the area $A_F = A_{F_1}$ by solving the (linear) equations:

• $A_{F_N} = 1$

• $A_{F_i} = A_{F_j}$ for the pairs (i, j) above

• $A_{F_k} = A_{F_l} + A_{F_m}$ for the triples (k, l, m) above.

But sets of linear equations with rational coefficients (as here) have rational solutions, if a solution is determined at all.
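To see the converse argument at work, one can write down such a system for a small example and solve it in exact rational arithmetic. The Python sketch below is only an illustration: the figures (a unit square U cut into three congruent strips S1, S2, S3, and a figure F made of two further congruent strips S4, S5) and the helper names `solve` and `equation` are assumptions chosen for the example.

```python
from fractions import Fraction


def solve(rows, n):
    """Gauss-Jordan elimination over the rationals.

    rows: list of (coefficients, right-hand side). Returns the unique
    solution as a list of Fractions, or None if none is determined.
    """
    m = [[Fraction(c) for c in coeffs] + [Fraction(rhs)] for coeffs, rhs in rows]
    pivot_row = 0
    for col in range(n):
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            return None  # a free unknown: the area is not determined
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
    if any(row[-1] != 0 for row in m[pivot_row:]):
        return None  # contradictory equations
    return [m[i][-1] for i in range(n)]


names = ["U", "S1", "S2", "S3", "S4", "S5", "T", "F"]
index = {name: i for i, name in enumerate(names)}


def equation(terms, rhs=0):
    """One linear equation: sum of coefficient * area = rhs."""
    coeffs = [0] * len(names)
    for name, c in terms.items():
        coeffs[index[name]] += c
    return coeffs, rhs


rows = [
    equation({"U": 1}, 1),                     # the unit square has area 1
    equation({"S1": 1, "S2": -1}),             # congruent strips
    equation({"S2": 1, "S3": -1}),
    equation({"S4": 1, "S1": -1}),
    equation({"S5": 1, "S1": -1}),
    equation({"T": 1, "S2": -1, "S3": -1}),    # T = S2 ∪ S3
    equation({"U": 1, "S1": -1, "T": -1}),     # U = S1 ∪ T
    equation({"F": 1, "S4": -1, "S5": -1}),    # F = S4 ∪ S5
]
areas = dict(zip(names, solve(rows, len(names))))
underdetermined = solve([equation({"U": 1}, 1)], len(names))
```

Since every coefficient is an integer and elimination uses only the four arithmetic operations, any area that the system determines comes out as a fraction.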

Exercises 1.3.7 a) Do only rectilinear figures get an area by the two laws? (Hint: No. For example, consider the unit square with a semicircular bulge B and an equal indentation B′. This has area equal to $A_U$, since B and B′ are congruent.) b) Does (a) suggest a better formulation of the proof above? (Hint: Some of the figures $F_1, \ldots, F_N$ may be curvilinear.) c) Are your conditions in (a) also necessary, as are the two laws in the formulation above? (Hint: This means that if the area $A_F$ of a figure F satisfying your conditions is equal to $\frac{p}{q} A_U$, then there is a sequence of figures $F_1, \ldots, F_N$ which relates F to U as in the proof above.) d) Do your conditions include the figure F (with area $\frac{2}{3} A_U$) defined as follows?

$(x, y) \in F \Leftrightarrow (\exists n > 0)\left[\left(1 - \frac{1}{2^{n-1}} \le x \le 1 - \frac{1}{2^n}\right) \wedge \left(0 \le y \le \frac{1}{2^{n-1}}\right)\right] \vee (x = 1 \wedge y = 0)$.

(Hint: Fit a mirror image of F onto F so as to form a figure consisting of finitely many rectangles.) e) Is the restriction to finite configurations in the proof above parallel to an earlier example? (Hint: Recall 1.3.2.c and 1.3.2.d.) f) Mention one respect in which the present case differs from 1.3.2.c. (Hint: The values of f(n) defined by f(n) = 2 · f(n + 1) are restricted to be integers, but the values of $A_F$ are not restricted.) g) Would you like to see what happens if one does prescribe the kind of values $A_F$ takes? (Hint: Read on, where the case of non-negative real values of $A_F$ is treated.)

c) Figures with real area

For any figure F for which the area $A_F$ is uniquely determined, the latter is (obviously) a (non-negative) real multiple of $A_U$. This condition is liable to extend the collection of figures F for which $A_F$ is determined by the two laws, because of the following property of real numbers, concerning approximations by sequences (for example, of rational numbers). Given the (increasing and, respectively, decreasing) sequences

$r_1^- < r_2^- < r_3^- < \cdots$ and $\cdots < r_3^+ < r_2^+ < r_1^+$

of rational numbers, if:

• they approach each other arbitrarily closely (for example, $r_n^+ - r_n^- < \frac{1}{10^n}$)

• all $r^-$ are less than all $r^+$ (in particular, $0 < r_n^+ - r_n^-$)

then there is a unique real number ρ which lies between the sequences; that is,

$r_n^- < \rho < r_n^+$ for all n.

Suppose now that a figure F is approximated from below and from above by sequences (of figures) $F_n^-$ and $F_n^+$ such that

$F_1^- \subseteq F_2^- \subseteq F_3^- \subseteq \cdots \subseteq F \subseteq \cdots \subseteq F_3^+ \subseteq F_2^+ \subseteq F_1^+$,

and $F_n^-$ and $F_n^+$ have areas $r_n^-$ and $r_n^+$ satisfying the properties above. Then F has the area ρ, if the following additional third law holds:

• monotonicity
  If F ⊆ F′ and both F and F′ are assigned an area, then $A_F \le A_{F'}$.
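A familiar instance of this squeeze is the quarter disc x² + y² ≤ 1 inside the unit square. The Python sketch below is an illustration under assumptions of its own: the choice of figure, the dyadic grids of side 1/2^n, and the function name `bounds` are not taken from the text. The inner figure is the union of the grid squares lying wholly in the quarter disc, the outer figure the union of those meeting it, and both areas are rational multiples of the unit square by the two laws.

```python
from fractions import Fraction
from math import pi


def bounds(n: int):
    """Rational areas (r_n^-, r_n^+) of inner and outer grid figures.

    The grid has N = 2^n squares per side; the square with lower-left
    corner (i/N, j/N) lies wholly in the quarter disc iff its far corner
    does, and meets the quarter disc iff its near corner does.
    """
    N = 2 ** n
    inner = sum(1 for i in range(N) for j in range(N)
                if (i + 1) ** 2 + (j + 1) ** 2 <= N * N)
    outer = sum(1 for i in range(N) for j in range(N)
                if i * i + j * j <= N * N)
    return Fraction(inner, N * N), Fraction(outer, N * N)


approximations = [bounds(n) for n in range(1, 7)]
lower_1, upper_1 = approximations[0]
```

By monotonicity, any area assigned to the quarter disc must lie between each pair of bounds; the real number squeezed between them is π/4, which the two laws alone could never deliver since it is not rational.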

Exercises 1.3.8 a) Using the three laws, what is the area of a figure F consisting of a single point? (Hint: F is contained in arbitrarily small rectangles.) b) Do the three laws require that an area be assigned at all to a single point? If not, how does one formulate the tacit assumption used in (a)? (Hint: If such figures are included in the collection $\mathcal{F}$ of figures considered then ...) c) Using the same tacit assumption, what is the area of the following figure F, contained in the unit square?

$(x, y) \in F \Leftrightarrow (\exists m > 0)\left(x = \frac{1}{m} \wedge 0 \le y \le \frac{1}{2}\right) \vee (\forall m > 0)\left(x \ne \frac{1}{m} \wedge 0 \le y \le 1\right)$.

(Hint: $A_F = 1$. To see this, let $F_n^+$ be the unit square, for all n. And let $F_n^-$ be a figure consisting of the rectangle $\frac{1}{2^{n+1}} \le x \le 1 \wedge 0 \le y \le 1$ with thin rectangular incisions at each $x = \frac{1}{m} > \frac{1}{2^{n+1}}$, such that the total width of these incisions is less than $\frac{1}{2^{n+1}}$.) d) Can the following figure F contained in the unit square be approximated from above and below by finite unions of rectangles?

(x, y) ∈ F ⇔ (x is rational and 0 ≤ y ≤ 1/2) or (x is irrational and 0 ≤ y ≤ 1).

(Hint: No, because any rectangle $F^-$ approximating F from below lies wholly in the rectangle $0 \le x \le 1 \wedge 0 \le y \le \frac{1}{2}$, and so $A_{F^-} \le \frac{1}{2}$, while any rectangle $F^+$ approximating F from above includes the whole unit square, and so $A_{F^+} \ge 1$.)

e) Does (d) show that if F is added to the collection $\mathcal{F}$ of ordinary geometric figures then no assignment $F \mapsto A_F$ can satisfy all three laws? (Hint: Compare the analogous case of the first two laws where a figure cannot be split up into a finite number of rectangles; for example, in 1.2.10.d.) f) What is the general drift of the Exercises above? (Hint: The choice of the collection $\mathcal{F}$ of figures for which area is defined deserves some more attention.)

Grand philosophical questions

Enough has been said to draw some general conclusions about two questions in the philosophical literature. For some given notion (for example, a property like area) one asks:

To what objects does the notion apply?

or, more grandly,

What is the essence of the notion?

As before,21 these old questions can either be interpreted in terms of our actual intellectual mechanism, or in terms of the reduced aims of current academic philosophy.

In our specific case, of the notion of area (of geometric figures), little can be said about the first interpretation. For one thing, here it would be essential to describe the class of figures for which the notion (of area) is defined. And if that class depends on our experience, on the class of figures that present themselves to us, one would wish to know about the ‘master-plan’ which controls this dependence, the plan which so to speak controls the growth of our intellectual mechanism; in particular, one wonders how stable that mechanism is to perturbations of the environment in which we live.

The reduced aims are not only vaguely suggested by (terminology derived from) the first interpretation, but sometimes they give genuine local knowledge of the latter (‘local’ because one learns not about the ‘master plan’ but about some specific property). For example, in the subsection on p. 28 the specific conclusions considered were seen to be independent of the uncertainty in our knowledge (cf. 1.3.9.a for a little more detail how this independence is achieved for different kinds of conclusions).

Though the reduced aims, being restricted to specific conclusions, do not say much about the essence of a notion, they analyze which properties of the notion are essential to a particular conclusion.22 Sometimes one speaks in the contemporary philosophical literature of uses of a notion (instead of its ‘essence’); in mathematics, one concentrates on mathematical uses (that is, theorems). Discoveries of such conditions as the three laws of area provide one way of understanding our uses (cf. 1.3.9.b).

Finally, there is an automatic by-product of work on the reduced aims, which has been called the creation of new notions.

21See the discussion starting on p. 17.

22For obvious reasons, one is particularly interested in properties which are essential to (or, for that matter, sufficient for) many conclusions (cf. 1.3.9.d).

In the particular case of area, the assignment of $A_F$ to decidedly non-geometric sets of points F (which cannot be visualized at all, and so have nothing to do with area as ordinarily understood) is an example of such a new notion (cf. 1.3.8.c and 1.3.8.d).

Quite generally, the scheme for creating new notions (mentioned already in the subsection on p. 6) uses the fact that only certain properties of a notion are used in any particular proof of some specific conclusion about that notion. So that conclusion is valid for the whole family of notions which possess those properties. The idea of such a family is a new notion: in this sense (the - conscious or unconscious - analysis of) a proof may create new notions (even when the primary purpose of a proof is to make sure that some assertion about the old notion is sound). As before, a separate study is needed to see how the new notion depends on the particular proof one has chosen to analyze (cf. 1.3.9.d).

Exercises 1.3.9 a) Explain in general terms how you expect to make a conclusion about a notion insensitive to uncertainties about the range of objects (say, figures) to which the notion applies. (Hint: If we want to be sure that all (intended) figures have a given property, we consider a large class $\mathcal{F}$ (of figures) which certainly includes all intended ones. If we want to be sure that some intended figure has that property, we consider a small class $\mathcal{F}$ which is certainly included among the intended ones.) b) Explain in general terms what you expect to gain from confining yourself to a particular body of uses. (Hint: Generally, more laws are true of a more restricted body of things.) c) Are there limitations to your answer to (b)? (Hint: Some laws require that the body of things considered be sufficiently rounded out. For example, if F and F′ which intersect in a single point have areas then their intersection F ∩ F′ has no area unless points are given an area; cf. 1.3.8.a.) d) Is the choice of essential properties of a notion suspect just because it is relative to a limited class of conclusions? For example, those which we ‘happen’ to have studied? (Hint: It depends how familiar the notion is, and above all how long it has been studied: if the choice was not made consciously, for a limited aim, the class may reflect some feature of the world in which we live.) e) Can you see a virtue different from the one stressed in (d) in using a striking proof for creating a new notion, instead of doing this by simply giving free rein to your imagination? (Hint: At least you know in advance that the new notion has some striking property, namely the conclusion established by the proof.) f) Can you see a virtue in looking for a new notion which shares many properties of a familiar notion? (Hint: Inasmuch as you handle the familiar notion well, the new one will be manageable too.)

Part II

ON ARISTOTLE

Chapter 2

The Literary Value of Aristotle’s Thought

I will speak here of Aristotle as of a great philosopher, without entering into the details of his thought. This choice conforms to the Aristotelian principle that general proofs are more important than particular ones, but it is in conflict with the scholarly tradition of meticulous precision. Not of course, with such innocent pastimes as establishing, say, literal correctness of a text, but with expectations that seem (to us philistines) associated with those scholarly pastimes. For one thing, we simply do not expect to find many (if any) gems in Aristotle to be unearthed, beyond what has become familiar in the last 2,300 years; except that, as always, familiar thoughts may be related to new experience in unexpected ways. Much more significantly, we do expect that perfectly sensible broad ideas will often, if perhaps not always, be spoilt by greater precision (in the sense of refinement, or of pursuing specifics that fascinated Aristotle1).

0 This chapter is based on: ‘Valeur littéraire de la pensée d’Aristote’, in Aristote aujourd’hui, Sinaceur ed., Unesco, 1988, pp. 20–21; and ‘On analogies in contemporary mathematics’, unpublished manuscript, 1989.
1 From personal and other contemporary experience we know how often such specifics are premature and, incidentally, how their pursuit follows the path of least resistance. The alternative is not vagueness, but simply not dwelling unduly on those sensible ideas, which must not be forgotten.

The value of great philosophers

One usually thinks that the essential value of great philosophers has to do with the influence they may have had on the progress of knowledge: not because they have led to this or that particular discovery, but because they have contributed to the intellectual climate on which the dissemination of knowledge depends. In the case of Aristotle, one may mention the negative influence that he has had on the progress of mechanics.2 At any rate, I fear that the influence of philosophers’ opinion on the progress of knowledge has been exaggerated.

It is the same exaggeration that makes us forget another aspect of the oeuvre of great philosophers, namely their literary value. I mean ‘literary’ to the extent that these writings express the feelings, and especially the attachment, that we feel about our primordial knowledge (i.e., the first ideas that come to the spirit whenever we begin to reflect on a new subject). A talented philosopher expresses such sentiments the way a novelist describes all the other emotions. And without doubt we need our spontaneous feelings in the face of knowledge, like any other feelings, to be presented in a vibrant manner.

The permanence of this literary value seems evident to me when I compare the interest we have today in the works of Aristotle with that which we no longer feel for the formulations of Archimedes. Today, 2,300 years after his death, Aristotle’s words and formulations still strike a chord within us, whereas the works of Archimedes are certainly more useful to us in their modern formulations.

Of course, it may certainly happen that the very first ideas that spring to mind on a given subject prove to be scientifically useful: that, in Aristotle’s own terms, these first ideas happen to correspond to the order of nature.3 But (at the risk of repeating myself) the essential value of the great philosophers does not reside in the occasional possibility of these scientific applications, and stressing such applications can only lead to disillusionment.
To illustrate the general points just made, I will use (implicitly or explicitly) some of Aristotle’s memorable formulations of broad ideas as points (or, more precisely, as patches) of reference; of course, not only for comparison, but also for contrast with contemporary knowledge; as norms (in the sense of standards of measurement, not generally of ideals to be followed). The common sense assumption here is that a useful frame of reference should be easy to take in and to remember: in particular, without requiring background knowledge that is not generally available. Well, at Aristotle’s time there just was less of it (available even to the most knowledgeable). Experience will show when this assumption reaches its points of diminishing returns.

2 Galileo himself loved to place himself against the current of the Aristotelian tradition. I know too little to go into detailed historical analysis, but I know that Galileo was fond of disputes (this is why he left Venice, where he was living a peaceful life, for Florence and Rome, where he must have known his life would have been more agitated). Now, all disputes have their dramatic aspects, drama attracts attention, and that contributes to the dissemination of knowledge. For that matter, it is not impossible that the atmosphere of tension produced by a dispute might be more important than its very object.
3 I would not be surprised if, especially in philosophy, first ideas also had a role of a symptom, at least most of the time. Wouldn’t it be amusing to discover, for example, that most of those who are attracted to philosophical skepticism are, in fact, paranoids?

Analogies

As announced, we shall not go into any detail on Aristotle’s favourites. The literature by charlatans selling panaceas pursues some of these details relentlessly. But we do hold on to the broad idea, in Aristotle, that analogies4 are liable to be among the big guns in our arsenal for the acquisition and exposition of knowledge. When some particular kind of analogy proves to be strikingly successful, it may be time to subject that kind to some corresponding academic discipline. (Such disciplines represent organized labour in the commerce of ideas.) Remembering Aristotle’s many warnings against sterile generality,5 for example:

To examine all the views held . . . is superfluous. . . . Every subject has its special problems, . . . and it is well to examine these,6

we do not assume that what is true about all analogies will often (or even ever) be enlightening for any one of them that we encounter. The alternative envisaged here is, of course, not what is false or uncertain, but what is (both true, and) not sterile. The common sense target is to have relatively few kinds of analogies, which are adequate in relatively many situations. Not all kinds can be expected to lend themselves to (rewarding) theoretical elaboration.

Abstractions (also called structures)

They are included in Aristotle’s (too?) general idea of category. They are of course candidates for a proper context in which particular objects are to be viewed; pedantically, in which a bunch of questions about such objects are to be considered. More poetically, as Plato’s translators put it, those abstractions are general forms of which the objects in question partake.

Contemporary mathematical experience with abstractions is obviously relevant to the topic (by not recognizing this, one is left below the threshold of informed discussion). For example, consider Euclid’s geometry, which was not just a, but the focal point of (theoretical) knowledge at the time of Aristotle. For Euclid, the question

What is a point?

seemed, as one says, of the essence.7 More than 2,000 years later this question was discovered to have a rewarding abstract aspect, presented forcefully in Hilbert’s Foundations of Geometry. Specifically, in two ways:

• for (mere validity of) abstract geometry, the question is not only not essential, but demonstrably irrelevant

• for (merely contemplating the possibility of) objects partaking of the general form (here, abstract Euclidean geometry) it is a matter of course that Euclid’s question must be answered.8

4 In mathematics, at one extreme there may be simple drawings (for example, a point and an arrow). At another extreme, the analogies between collections (of quite diverse objects) with the same cardinal, which are so familiar that they are rarely thought of as analogies (but they do preserve a big chunk of knowledge, namely numerical properties).
5 Incidentally, such warnings are in conflict with his own advertisement for metaphysics, as the science of Being-as-Being. Not to speak of Kant’s preoccupation with mere possibility-in-principle, even without the modest constraint of being (actually around).
6 Eudemian Ethics, 1214b, 15–30.

For good or ill, whatever else is left open, the abstract version is not surrounded by the particular air of unreality in Euclid’s own discussion of the question. A corollary that strikes the (mind’s) eye is the impeccable precision of abstract reasoning with terms (such as points) that are not completely (in fact, not at all) defined. More fully, such reasoning is not only possible: formulations in terms of abstract structures paraphrase successfully earlier formulations using incompletely defined terms. This provides at least one answer to Aristotle’s question on how precise reasoning is possible with terms that are not completely defined:

The [need for a] definition arises out of the necessity of stating what we mean.9

Choice of abstractions

Aristotle gives much attention to the matter of choice of abstractions. This indicates a shift of emphasis away from matters of principle (e.g. mere validity of analogies), to a focus (on, of course, valid cases) which provides safeguards against sterility (not merely against straight error). Evidently, such a focus does not require a preview of all valid cases, which in the jargon of the trade would be called ‘conceptual analysis of validity’. In fact, while engaged in such a preview one remains at a distance from any focus. Almost as a corollary, dwelling on the preview is liable to reach a point of diminishing returns (here, for contributing to effective knowledge by use of suitable analogies).

What, by the nature of the case, is not evident from the Greek intellectual experience, and certainly could not be documented, is quantitative: both the degree of difficulty in, and the possibility of, making effective choices. Again, contemporary mathematical experience is relevant. I cannot think of a more compelling example than the one provided by Bourbaki’s work:

relatively few so-called basic structures, which are suitable for relatively many situations encountered10

(an idea(l) already touched upon above).

7 We are not concerned here with the dotting of i’s and crossing of t’s, such as noticing formal oversights by Euclid that needed correction: for example, properties of order used in the proofs, but not listed among the axioms.
8 In the jargon of the trade (of logicians) the objects are (candidates for) models, and the abstractions are described by sets of formulas. For example, if complex numbers are to be candidates, one possibility is to take algebraic numbers as points.
9 Metaphysics Γ, 7, 1012a 22-23.

This evolution of knowledge can be compared to ordinary evolution,11 which involves both adaptation (of the basic structures) and migration (looking for greener pastures among the situations ‘encountered’). It certainly fits Aristotle’s emphasis on politics (here, of knowledge) as the art of the possible. But, on the other hand, it also constitutes a shift of emphasis away from Aristotle’s refrain about plain truth in connection with knowledge.

This shift is reflected in a new, but by now dominant, terminology: generalization. For example, the fact that n² − m² = (n + m)(n − m) holds for the ‘general form’ of commutative rings, and not only for (say) the ring Z of integers, contributes to effective knowledge about Z. The new terminology carries with it both losses and gains; respectively:

• it not only can, but does obscure the fact that the general form contributes to effective knowledge of the original case, even when (at least for the moment) one does not pay any attention to any other new case;

• it serves (in effect, if not in purpose) as a safeguard against the pursuit of aspects of the original case that may (as experience expands) become sterile, and distract from more rewarding aspects.
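As a worked illustration of the identity mentioned above (an addition for the reader, not part of the original argument): in any commutative ring, with n and m arbitrary elements, the factorization follows from distributivity and commutativity alone. The step labels are only a sketch of the usual derivation.

```latex
\begin{align*}
(n + m)(n - m) &= n(n - m) + m(n - m) && \text{(distributivity)} \\
               &= n^2 - nm + mn - m^2 && \text{(distributivity)} \\
               &= n^2 - nm + nm - m^2 && \text{(commutativity: } mn = nm\text{)} \\
               &= n^2 - m^2 .
\end{align*}
```

Nothing specific to Z enters the derivation, which is the sense in which the general form contributes to knowledge of the original case. In a noncommutative ring the third step fails in general.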

Without, of course, giving such a pedestrian account (in the literal sense of balancing gains and losses), Goethe gave a very memorable description of mathematical practice (even) at his time, naturally in terms suitable for his compatriots:

Mathematicians are like the French; if one says anything to them they translate it into their language, and it is immediately something else.12

10 This fact, which (if I may speculate for a moment) would have thrilled Aristotle (even if he would not have wished to go through all the volumes of Bourbaki), is totally obscured by familiar, perfectly true and pious limitations. For example, one can quibble about the measures implicit in ‘few’ and ‘many’, or about the meaning of ‘encountered’: after all, it does not refer to the infinite variety of experiences that presents itself, but to aspects to which attention is paid.
11 At least I, even with insight, can simply accept it as a fact of nature that this evolution has (globally) stable features.
12 Item 1279 in Maximen und Reflexionen: Die Mathematiker sind eine Art Franzosen: redet man zu ihnen, so übersetzen sie es in ihre Sprache und dann ist es alsobald etwas anderes. For the record, I find it satisfaisant pour l’esprit that mathematicians are often so remarkably clumsy at saying simple things simply and memorably; of course, not only by comparison with Goethe (not to speak of Shakespeare). After all, their business is to say some quite complicated things (in particular, what they know of some recondite mathematical proof) relatively simply. Only those (of all ages) who are still in their ‘salad days, . . . green in judgment’ would assume that this business is an adequate preparation against that clumsiness.

What we know and how we know

The distinction between what we know and how we know is one of Aristotle’s favourites. The point I wish to stress, again as a matter of accumulated experience, is how much easier the former question is to answer (to our satisfaction) than the latter. Of the what, whose aspects strike the (mind’s) eye, we have easily accessible practical knowledge (which constitutes the threshold for additions by further theory). We also would like to know more about the how, but the price is high.

The relevant contemporary experience here is artificial intelligence. In current jargon, it is best to associate the what with (artificial) achievements in (hardware or software) engineering of certain functions performed by us, and the how with the particular structures available in our own make up. In other words: the how concerns processes (e.g. in the central nervous system and in the sense organs), the what their (end) results. Artificial realizations are often regarded as some first steps toward the how. Of course, there are parallels between the natural and the artificial sides. But, as matters stand, points of diminishing returns are reached rather quickly, when those parallels cease to contribute and begin to distract. In particular, given how many valid alternative ways of reasoning there are, it would be rash to assume that those that are particularly effective for arriving at correct results artificially, are often also prominent in the nervous system.

The case of artificial locomotion, often by use of wheels, underlines this warning. For a parallel with artificial intelligence, the results are here the end points of the animal’s motion (corresponding to input and output). What we know of the biological processes involved in locomotion (chemical processes in muscles and in nerves of animals or, in the particular case of humans, conscious sensations of a high or of fatigue) simply hasn’t benefitted much from experience in mechanical engineering (for artificial locomotion) or from the geometry of circles (which are perfect wheels, just as Turing machines are perfect computers).

Plato’s favorite motto

Know thyself (γνῶθι σεαυτόν)

would seem to give priority to how we know. By what was said above, it should be tempered by Aristotle’s less high-minded

Nothing in excess (μηδὲν ἄγαν).

This fits Carlyle’s more resonating words:

The folly of this impossible precept: ‘Know thyself’; till it be translated into this partially possible one: ‘Know what thou canst work at’.

Disclaimer

There has been no attempt here at conveying the particular excitement of questions that are central (today). This is not a matter of (false) modesty on my part, but simple common sense. Not even Shakespeare’s King Lear conveys the feelings of a decrepit old man to healthy teenagers, even if his poetry makes them simper. The present (cl)aims are restricted to aspects that are both accessible and of use with more restricted experience. The market for the material provided here consists of those with such limited experience. What is left open is the packaging and advertising proper for that market, which is as demanding a job in the sphere of ideas as in the commerce of material goods and services.

As to the small market of specialists, there is (as always) a chance that a felicitous formulation of a simple general fact can come in handy; by helping to remember such a fact when it happens to be a missing piece in a major puzzle.

Chapter 3

Aristotle’s Logic Today

I would primarily like to emphasize (in Section 3) several ideas that figure in the Aristotelian corpus, and which seem to me to have (pedagogical or heuristic) utility for contemporary logic, mathematics and, especially, philosophy. Nonetheless, it is convenient to begin (in Sections 1 and 2) by mentioning other ideas which, for obvious reasons, would arise only at a stage when science was more primitive than now, as in Aristotle’s time. The discussion will not touch on the (numerous) passages in Aristotle on logic and mathematics which were inadequate even for his own epoch, for example in what concerns the proofs of arithmetical theorems in Euclid’s Elements.

3.1 Obsolete Ideas

The following examples concern well-known passages in the Aristotelian corpus, which depend on the primitive state of knowledge in his time.

The infinite

Aristotle’s idea of the infinite, as he emphasized in (to me) somewhat ambiguous terms, was appropriate only to the mathematics of his time:

The mathematicians do not need the infinite and do not use it: they postulate only that a finite straight line may be extended as far as they wish.1

0 Originally published in Aristote aujourd’hui, Sinaceur ed., Unesco, 1988, pp. 251–262, as ‘De la logique’. Translated by H. Hodes.
1 Physics III, 7, 207b 29-30.

Aristotle’s idea (on so-called potential as opposed to actual infinite) does not apply to the developments of the last 100 years, and even less to the intentions which guided them, even if it is possible to reformulate the results in terms of the potential infinite. For example, the use of the actual infinite has been eliminated from the parts of mathematics for which Hilbert’s program has succeeded.2 Of course here, as elsewhere, several of Aristotle’s marginal remarks have a certain interest even today, in particular:

The infinite is exactly the opposite of what it is said to be: not what has nothing outside, but what has always something outside.3

Though hardly elegant, this quotation does not exclude the actual infinite in the Cantorian sense; in particular, it allows for different infinite cardinalities. In fact, for Cantor himself, only the totality of all infinite cardinals is ‘potential’, that is it cannot be conceived of as a unity. Thus, for example, it is a theorem of the axiomatic set theory formulated by Zermelo that there is no set of all cardinals. In less technical terms, we can say that Aristotle proposed certain concepts that are still applicable, especially to ordinary ideas of the infinite (or as one also says, to its ordinary usage), because most of the developments in mathematics just mentioned have not been sufficiently assimilated to influence everyday thought.

Logic

Remarks analogous to those expressed above also apply here. It is true that Aristotle used quantifiers (for each, there is) to formulate the syllogism. But he didn’t iterate these quantifiers, for example, in the following combination: for each x there is a y such that for each z . . . Furthermore he didn’t hit upon the problems specific to the predicate calculus (he was at most interested in monadic predicate calculus, for which quantifiers can be separated). Briefly, he looked at propositional logic, which only involves expressions without quantifiers, having at most free variables. Now, the principal (and delicate) problem can be posed in modern terms as follows:

Can one interpret the logical operators by truth-functions?

Everything depends on whether the terms appearing in the propositions are well-defined (that is when well-defined terms replace free variables), a question with which Aristotle was clearly preoccupied:

These [new] terms will be something defined . . . The [need for a] definition arises out of the necessity of stating what we mean.4

2 Readers interested in this point will find in Hilbert [1930] comparisons between his ideas in the philosophy of mathematics (for example, the formal character of mathematical deduction) and those of Aristotle.
3 Physics III, 6, 206b 33-35.
4 Metaphysics Γ, 7, 1012a 14 and 1012a 22-23.

We will return to the question of the definitions of terms in Section 2, and will distinguish between: those that are well-defined, and those which are known to be well-defined. Now, surely, for Aristotle the former was essential, since for him the order of things according to nature had priority over the order of things according to our knowledge:

Things are ‘prior’ and ‘better known’ in two ways: according to nature, and in relation to us.5

Aristotle continues - in a manner which is much less convincing with respect to contemporary experience - as follows:

I call prior and better known in relation to us what is closer to per- ception, and prior and better known absolutely what is farther from perception. What is most universal is the farthest from the senses, and what is particular is the closest, and these notions are opposite to each other.

3.2 Less Obsolete Ideas

The examples which follow concern other passages in the Aristotelian corpus, also well-known and connected to the (limited) experience of his time, but in ways that seem to me less obvious than with the examples of Section 1.

Particular proofs and general proofs

We will discuss below the intrinsic value of the following quotation:

What most clearly shows the superiority of universal demonstration is that grasping a proposition we in a sense know as well (potentially) all propositions following from it. For example, if someone knows that the sum of the angles of every triangle is two right angles, he knows it in a sense (potentially) also of the isosceles triangles . . . But one who knows the latter does not know the former in any sense, neither potentially nor actually.6

5 Posterior Analytics I, 2, 71b 33, 72a 5.
6 Posterior Analytics I, 24, 86a 22-28.

It seems to me that Aristotle’s preference for general proofs is perfectly reasonable, if one considers the majority of proofs known in his epoch. To repeat what cannot be repeated too often: this conclusion is convincing even without a detailed analysis of the concepts Aristotle is using, for example without analyzing what constitutes a greater knowledge. But when we consider later developments Aristotle’s preference becomes less convincing, and it becomes necessary to analyze his concepts more closely.7

The best known example of a particular demonstration is given by some proof by cases (Socrates’ bête noire). I know of few examples from ancient texts in which such distinctions of case were pertinent, at least to the given formulation of the theorem being proved. But today there are many examples of this, particularly when the intervening distinction is an undecided statement: for example, one first deduces a proposition P from Riemann’s Hypothesis RH (which is still undecided), then from its negation ¬RH, and finally one asserts P unconditionally. Intuitionistic logic rejects such deductions as invalid, and by so doing hides the following essential question behind its dubious doubts:

What do we lose by the unconditional assertion of P ?

Often P is an existential proposition and its witnesses, that is to say the things whose existence is asserted, have an altogether different form according to whether RH is true or false. Then ‘the elegance’ obtained by categorically asserting P is a little superficial, and it is more profitable to consider the cases (RH → P ) and (¬RH → P ) separately. Literally speaking, we have just been arguing for the advantage (in the long run) of particular propositions, not demonstrations, over general propositions; but Aristotle himself makes this leap in the passage quoted above. This leads directly to the other question which we have already raised:

In what does a greater knowledge consist?

I will make three distinctions that strike me as directly pertinent.

1. We sometimes go to greater generality, in other words to knowledge concerning more things (for example, when a proposition demonstrated of isosceles triangles can be demonstrated of all triangles). In modern mathematics this has a name: axiomatic analysis.

2. Sometimes the improvement in knowledge concerns the things mentioned by the proposition being proved (for example, the witnesses for existential propositions considered above in the discussion of proof by cases). Here is a good formula to remember this aspect of proofs:

What more do we know if, besides the truth of P , we know its proof?

7 The analysis also becomes easier (and, therefore, more convincing) for a very general reason: in the time of Aristotle, the established distinctions were, so to speak, theoretical and not illustrated by concrete experience, and also were not so easy to see. Furthermore, even when one saw them, it was difficult to judge their importance, i.e. the frequency of their (further) relevance.

Intuitionistic logic hides, once again, a perfectly reasonable question behind dubious doubts, this time by rejecting the notion of truth (of P ) in mathematics, and trying to maintain that the sense of P itself must make reference to how it is proved: as if otherwise the interest in proofs were so fragile that any other interest might suppress it.

3. Sometimes the growth of knowledge concerns the relationship between the things mentioned by the proposition and other known things (for example, those introduced in the course of the demonstration). One very good illustration of this sort of supplemental knowledge is provided by the proofs in analytic number theory, where arithmetical properties of natural numbers are related to (geometrical) properties of the complex plane. Such analytic proofs might or might not augment our knowledge in the sense of 2.

Last but not least, the distinction between 2 and 3 resembles, at least superficially, the distinction that Aristotle draws between Being-as-Being and (the properties that pertain to) Being in virtue of other things:

There is a science which studies Being-as-Being and the attributes which pertain to it in virtue of its proper nature . . . Such attributes hold for anything at all, and not only for some particular things, to the exclusion of others.8

Quite evidently, the recognition of even the existence of such truths was what the psychologist Köhler called an Aha-Erlebnis (for Aristotle as well as for us, for phylogeny as well as for ontogeny). It is an entirely different problem to determine whether these truths are also useful, at least from time to time, when they are applied to particular things. Only extensive intellectual experience provides the elements needed for an evaluation of this much more delicate problem.

Propositional Logic

I will here follow the common usage of the expression Aristotelian logic, understood to mean that the logical operations are interpreted as truth functions of two truth values.9 In particular, on this interpretation the principle of excluded middle is valid, i.e.

P or not P for all propositions (under consideration).

8 Metaphysics Γ, 1, 1003a 21-26 and Γ, 3, 1005a 22-23.
9 This usage is historically inaccurate. For example, implication was apparently first considered truth functionally by Philo of Megara, shortly after Aristotle’s death, and there is no evidence that Aristotle did the same.

In fact, in what I have read of Aristotle I have not found many places which emphasize this principle. He actually emphasizes the entirely different principle of contradiction:

One cannot have, at the same time, P and not P .

This principle cannot be asserted ‘in general’ (for example, for propositions P containing terms which are not well-defined), but its field of application is certainly wider than that of: P or not P . The real (and delicate) problem here is to know to what extent the state of knowledge determines the utility of the principle of excluded middle (or of Aristotelian logic). This can be expressed by the following question Q:

Are there classes of propositions wider than that of propositions that are true or false and which, nonetheless, lend themselves to a (profitable) theoretical analysis?

It seems to me that no other class (narrower or wider) had any hope of being profitably analyzed in Aristotle’s epoch (and even long after him).10

Today the discovery of formal systems (for predicate calculus) has suggested a candidate for which the response to Q is positive: any class of propositions P which (are not necessarily true or false, but) have the property that, for any proposed proof π, one can see whether or not π proves P . This is suggested by formal systems in the following sense: formulas, formal proofs and derivability correspond to, respectively, propositions, proofs and truth; and for any proposed formal proof one can mechanically see whether or not it is a proof (of its last formula) according to the rules of the formal system (but in general one cannot mechanically decide whether or not a given formula is derivable).

Naturally, for a given paradigm (such as that concerning formal systems) a very great number of variants come to mind. The following leads naturally to a positive answer to question Q for which the laws of Aristotelian logic are not valid: the class of propositions that bear on choice sequences.11 These sequences are conceived of as incomplete, so that operations (including predicates) defined for them should depend only on partial data: the formal expression of this fact is, as one must expect, incompatible with the principle of excluded middle. This variant is a principal theme of intuitionistic mathematics.

Nobody familiar with this subject could have the slightest doubt concerning the interest of the discovery that a consistent and coherent theory of propositions about choice sequences is possible. In spite of the possibility of making this enlarged class of propositions credible,12 the attraction of this discovery is, as a matter of historical fact, absolutely not universal today, and certainly not comparable to the attraction of the decision, implicit in Aristotle, to restrict attention to propositions that are definitely true or definitely false.

10 Granted, it was always easy to think of other classes, for example the one of the amusing propositions. Whatever uncertainty might surround the extent of this class, it is clear that it differs from the class of P that satisfy: P or not P . The latter is certainly closed under negation ¬, which is to say that if P or not P is true, then ¬P or not ¬P is also true. On the other hand, P could well be amusing without ¬P being so.
11 Or on random sequences, considered as objects or individuals (while the more ‘classical’ and familiar theory avoids random sequences by paraphrases in terms of the theory of sets or of measure).
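The point made above about formal systems, that a proposed formal proof can be checked mechanically even when derivability cannot in general be decided, can be illustrated by a minimal sketch. It assumes a toy system that is not in the original text: formulas are atoms or implications, the only rule is modus ponens, and the names `implies`, `check_proof` and the axioms `p`, `p -> q`, `q -> r` are hypothetical. In so small a system derivability happens to be decidable, so the sketch illustrates only the checking half of the remark.

```python
from typing import List, Set, Tuple, Union

# A formula is an atom (a string) or an implication ("->", antecedent, consequent).
Formula = Union[str, Tuple[str, "Formula", "Formula"]]


def implies(a: Formula, b: Formula) -> Formula:
    """Build the implication a -> b."""
    return ("->", a, b)


def check_proof(axioms: Set[Formula], proof: List[Formula]) -> bool:
    """Mechanically check a proposed proof.

    Every line must be an axiom or follow by modus ponens from two
    earlier lines. The check always terminates: it only inspects the
    finitely many lines of the given proof.
    """
    earlier: List[Formula] = []
    for line in proof:
        by_modus_ponens = any(implies(a, line) in earlier for a in earlier)
        if line not in axioms and not by_modus_ponens:
            return False
        earlier.append(line)
    return bool(proof)  # an empty list proves nothing


# Hypothetical axioms of the toy system: p, p -> q, q -> r.
axioms = {"p", implies("p", "q"), implies("q", "r")}

good = ["p", implies("p", "q"), "q", implies("q", "r"), "r"]
bad = ["p", "r"]  # "r" is asserted without justification

print(check_proof(axioms, good))  # True
print(check_proof(axioms, bad))   # False
```

The checker never searches for a proof; it only verifies the one it is handed, which is the sense in which one can ‘see whether or not π proves P’ without settling whether P is true or false.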

3.3 Eternal Verities

Setting aside certain specific points considered in the examples above (which, sooner or later, will appear obsolete), Aristotle has also left us several general formulations. For example, the distinction between the order of things according to nature and that according to our knowledge (Section 1), or Being-as-Being (Section 2). Aristotle’s formulations seem to have pedagogical value, as aids for memory (e.g., to better remember the previous examples). Having survived 2,300 years they are obviously durable, if not unforgettable. Their value increases if we get used to citing them whenever they are pertinent to other situations, thereby indicating useful parallels. The price we must pay for this (a repetitive vocabulary and, thus, a slightly stereotyped style) is nothing in comparison with the global orientation thus gained; especially if we stick to Aristotle’s original formulations instead of inventing our own. In fact, the frequency with which these generalities noted by Aristotle are rediscovered and reformulated gives an indication of their interest. For example, it is clear that Bertrand Russell reformulated (in [1959]), without himself noticing the relation, Aristotle’s idea on the order of things when he asserts ‘the priority of what we know over how we know’. Similarly (in the introduction to Principia Mathematica) he sees the simplicity of set theory in the fact that it only contains one basic notion: the relation of membership. This corresponds, in an evident way, to the famous passage:

The most exact of the sciences are those which touch on first principles; the sciences based on few assumptions are the most exact.13

12By expressing the dependence on partial data by an appropriate version of continuity, in the usual topological sense.

13Metaphysics A, 2, 982a 25-27.

The parallel is indisputable even if neither Russell nor Aristotle is convincing: after all, one can as easily be mistaken about one (difficult) notion as about twenty (familiar and tested ones). For curiosity, I will add that it seems to me quite easy to get a general idea of what one can expect from Aristotle. In particular, for years I cited the well-known phrase from a sermon by Bishop Butler: ‘Everything is what it is, and not another thing.’ While recently reading some pages of Aristotle, it became clear to me that he had to mention this thought somewhere. After all, it is not much different from the recognition of the fact that all chains of reasons have an end, in:

It is a lack of education not to know what requires demonstration and what does not, since not everything can be demonstrated; otherwise there would be an infinite regress.14

Sure enough, here it is:

The fact that anything is itself is the one and only reason that can be given in answer to such questions as ‘Why is a man a man, or a musician a musician?’15

One should add that, contrary to Bishop Butler and his admirers, Aristotle continues by saying that such responses, and consequently such questions, are sterile. Of course, the formulations in question have limited value. One would be deluded if, forgetting this, one expected them to be not only general, but profound. Or if one expected them to be often useful when applied to concrete and familiar situations. After all, it is already a lot if they are often useful in concrete but unfamiliar situations (i.e. at the start of research), and sometimes useful in familiar situations.

14Metaphysics Γ, 4, 1006a 6-9.

15Metaphysics Z, 17, 1041a 16-18.

Exact philosophy

It seems plausible to me to suppose that the kind of delusion of which we just spoke has contributed to the attraction, above all in Anglo-Saxon countries, for what is called exact philosophy. This philosophy flows from a doctrine about the kind of knowledge that is useful: a doctrine according to which Aristotelian formulations are not useful, at least not unless they are made precise. But, as happens so often to traditions that want to be empiricist, the exact philosophy has not really taken the trouble to make an empirical study of the results and the limits of its own doctrine or (perhaps more importantly) of the people who have tried to follow it. I have the impression, if one must choose, that this doctrine is much more deluding and (despite the apparent paradox) even more pretentiously ambitious than philosophy in the everyday sense, with the interest the latter takes in eternal truths. This is displayed particularly well in its relation to scientific research and especially (in our case) to mathematical research. Popular philosophy aims at a sort of sense of proportion: this is, chronologically, prior to scientific development and becomes, generally, completely marginal after the progress of sciences. Exact philosophy considers itself either separated from science, or even as a sort of vision more advanced and elevated, as shown by slogans like: From science to philosophy.16 Not only is this approach historically false, but it would even deprive us of the pleasure and (other) benefits that philosophy is able to give. For example, this nice piece of authentic wisdom:

It is clear that one cannot adopt the same attitude towards all opponents. Some need persuasion, and others force. For those who hold their position by failure of thought, their ignorance is easy to cure . . . but those who argue for argument’s sake can only be cured by putting them and their words to their place.17

16Naturally there is an infinity of variations, which certainly appear superficially very different. For example, in the particular case of logic, logical philosophy is supposed to furnish a general theory of objects. The disillusion comes from the (simplistic) supposition that what is true of objects in general has a chance of being interesting for any particular objects. Evidently, this applies, mutatis mutandis, to attributes of Being-as-Being (discussed above).

17Metaphysics Γ, 5, 1009a 16-22.

Part III

ON FREGE

Chapter 4

Frege’s Foundations

The word foundations is used for:

1. first steps (in current mathematics)

2. systematic exposition

3. ensuring validity, and other aims besides (e.g. reliability).

The purpose 1 is evidently pretty independent of the others. So are 2 and 3: any systematic method may introduce systematic errors, and reliability is often best achieved by exploiting specific knowledge of (horizontal) cross checks. The first three sections of this chapter introduce the dominant part of the philosophy of mathematics in this century, derived from the logical foundations of Frege (and Russell). The idea is to answer the question:

What is mathematics (about)? by:

Mathematics is (about) logic or, more precisely, about the logical notions mentioned. The stock of the (few) logical notions used consists of:

1. the language of sets and membership, familiar from the New Math.

2. the so-called elementary part of 1, where only elements of some one set are considered: for example, only (elements of the set of all) natural numbers, or (of all) points of the plane.

0This chapter is based on lecture notes for the class ‘20th century philosophy of mathematics’, Winters 1982–82 and 1983–84, Stanford University, and it is published here for the first time.


The main variants of this scheme (associated with Zermelo, Hilbert, and Brouwer and known as set-theoretic, finitist and intuitionist foundations) are regarded in the literature as rivals (of that scheme).1 We shall set them out (in Chapter 5) as simplifications, refinements, and extensions.2

Section 4 describes a view of mathematics and mathematical reasoning which differs significantly from all the variants mentioned above, a view related to Wittgenstein’s (later) ideas in the philosophical literature, and those of (the founders of) Bourbaki, a group of mathematicians, in their manifesto [1948].3 Without necessarily rejecting the logical view (for example, as imprecise), this second view regards the logical notions as too crude: different properties of mathematical objects or arguments are lumped together (as logically equivalent) which differ significantly for an understanding of the architecture of mathematics and our intuitive resonances to it; and, conversely, distinctions are made (between properties that are not logically equivalent) which do not contribute to that understanding.

4.1 Background: Some Discoveries

Before it was even plausible to answer the question: What is mathematics? by: The study of sets, it was of course necessary to ‘define’ familiar mathematical notions in terms of sets. The material presented below gives an idea of how this is done.

Functions

We all think of functions (say, from natural numbers to natural numbers) as rules or methods for passing from arguments to values. For example,

n ↦ n − n and n ↦ 0

are certainly different rules. For a programmer the chief problem is to find an efficient rule.

1Males of the same species strike the biologist as variants, but behave as rivals; they are rivals for attention. The new terminology can be an effective method of conveying the new perspective: for example, changing the context from one species to several. (Most philosophical questions take the form: What is a relevant context?)

2Inasmuch as the so-called finitist and intuitionist schemes have been intended to formulate views of mathematics and mathematical reasoning which are genuinely contradictory to Frege’s, their developments (so far) have been inadequate for that purpose.

3See also Chapter 14.

Dirichlet discovered the following fact: at least for the great bulk of theorems T (f) in pure mathematics about functions f, if the methods f and g assign the same value to the same arguments and T (f) is true, so is T (g). So, as far as these theorems T are concerned, only the set of pairs (n, f(n)), also called the graph4 of the function f, is relevant; not the other aspects of f which strike us.
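Dirichlet's observation can be illustrated with a small sketch. The names `rule_subtract`, `rule_zero`, `graph` and the finite range `SAMPLE` are illustrative assumptions, not part of the text; a finite sample can only suggest, not prove, that two rules have the same graph on all natural numbers.

```python
def rule_subtract(n: int) -> int:
    """The rule n ↦ n − n: a subtraction is carried out."""
    return n - n


def rule_zero(n: int) -> int:
    """The rule n ↦ 0: no computation on the argument at all."""
    return 0


def graph(f, arguments):
    """The part of the graph of f over `arguments`: the set of pairs (n, f(n))."""
    return {(n, f(n)) for n in arguments}


# Assumption: a finite sample stands in for the natural numbers.
SAMPLE = range(1000)

# Two different rules, one and the same graph (on the sample).
same_graph = graph(rule_subtract, SAMPLE) == graph(rule_zero, SAMPLE)
print(same_graph)
```

The two Python functions are distinct objects, just as the two rules are distinct methods; only the comparison of their graphs identifies them.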

Isomorphism

Dirichlet’s discovery of the graph as being the most relevant property of a function (for the bulk of then-current mathematics) is a special case of: selecting relevant equivalence relations, or: selecting relevant features of mathematical objects. When one speaks of isomorphisms (matching up) between such objects, one always means tacitly: with respect to a particular list of such features.5

Consequence of isomorphism. For any theorem in set-theoretic language built up from (symbols for some of) the selected features, we have the Transfer principle: if the theorem is true for one object, it is true for any isomorphic copy (in the sense described). This is quite independent of the way in which the object (that is, the domain of elements and the relations considered) has been defined.

The structure of the natural numbers

A domain of elements together with a (privileged) list of functions and relations (on the domain) is called a structure.

Dedekind discovered that every structure (D, a0, suc) satisfying the follow- ing conditions (called Peano’s axioms) is isomorphic to the familiar structure (N , 0,S) of the natural numbers with least element and successor operation:6

1. (∀x ∈ D)(a0 ≠ suc(x))

2. (∀x, y ∈ D)(suc(x) = suc(y) ⇒ x = y)

3. For every subset X ⊆ D:

a0 ∈ X ∧ (∀x ∈ D)[x ∈ X ⇒ suc(x) ∈ X] ⇒ X = D.

4One speaks of ‘graph’ because, when we think geometrically, we think of the curve, not of any particular way of ‘constructing’ its points (for example, constructing the y-coordinate from the x-coordinate).

5For example, consider the open intervals of rational numbers: (a, b) = {r : a < r < b}. If only the order is relevant and a < b, then (a, b) is isomorphic to (0, 1). If also + and × are (chosen to be) relevant, then this holds only if a = 0 and b = 1.

6The familiar idea of natural numbers leaves open which aspects are to be principal objects of study (e.g. S, or +). Therefore, what comes to be called ‘the’ structure of natural numbers involves selection. Similar remarks will hold for rational and real numbers.

Equivalently, 3 (called the induction principle) can be replaced by the following (called the least number principle):

4. For every subset X ⊆ D:

(∃x ∈ D)(x ∈ X) ⇒ a0 ∈ X ∨ (∃x ∈ D)[x ∉ X ∧ suc(x) ∈ X].
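The equivalence of conditions 3 and 4 can be checked by brute force on small structures. The sketch below is only an illustration under stated assumptions: the function names are invented here, `suc` is represented as a Python dict, and the structures are finite (so they cannot satisfy all of Peano's axioms; only the agreement between 3 and 4 is tested).

```python
from itertools import combinations, product


def subsets(domain):
    """Yield every subset of a finite domain as a Python set."""
    items = list(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def induction_holds(domain, a0, suc):
    """Condition 3: any X containing a0 and closed under suc is all of D."""
    for X in subsets(domain):
        closed = all(suc[x] in X for x in X)
        if a0 in X and closed and X != set(domain):
            return False
    return True


def least_number_holds(domain, a0, suc):
    """Condition 4: any nonempty X contains a0, or suc(x) for some x outside X."""
    for X in subsets(domain):
        if not X:
            continue
        entered_from_outside = any(x not in X and suc[x] in X for x in domain)
        if a0 not in X and not entered_from_outside:
            return False
    return True


def conditions_agree(size):
    """Compare conditions 3 and 4 on every structure (D, a0, suc) with |D| = size."""
    domain = range(size)
    for values in product(domain, repeat=size):
        suc = dict(zip(domain, values))
        for a0 in domain:
            if induction_holds(domain, a0, suc) != least_number_holds(domain, a0, suc):
                return False
    return True
```

The agreement is no accident: condition 4 for a set X is the contrapositive of condition 3 for the complement of X.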

Surprise. Instead of saying that the series D is generated from a0 by applying suc repeatedly (i.e. finitely often), one paraphrases finitely in terms of the idea: for all sets X ⊆ D.

Defining functions by equations

Addition f and multiplication g can be defined over the structure (N, 0, S) by their recursion equations ∆(f, g):

f(x, 0) = x
f(x, S(y)) = S(f(x, y))
g(x, 0) = 0
g(x, S(y)) = f(g(x, y), x).
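Read as a program, the recursion equations compute values directly. A minimal sketch, assuming Python integers as a stand-in for N and `n + 1` as the successor S (both are choices made here, not in the text):

```python
def S(n: int) -> int:
    """Successor on the natural numbers (assumed model: Python ints)."""
    return n + 1


def f(x: int, y: int) -> int:
    """Addition, from f(x, 0) = x and f(x, S(y)) = S(f(x, y))."""
    if y == 0:
        return x
    return S(f(x, y - 1))  # here y = S(y - 1)


def g(x: int, y: int) -> int:
    """Multiplication, from g(x, 0) = 0 and g(x, S(y)) = f(g(x, y), x)."""
    if y == 0:
        return 0
    return f(g(x, y - 1), x)


print(f(3, 4), g(3, 4))
```

Each call unfolds only finitely many numerical instances of the equations, one for every value between 0 and the second argument.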

In elementary mathematics there is no need to distinguish between the following conditions (say, in the case of functions with natural numbers as arguments and values, as in the example of f and g above):

1. There is a unique pair of functions f and g which satisfies ∆(f, g) (since ∆(f, g) contains variables over N , f and g must satisfy ∆(f, g) for all values in N of such variables).

2. For each pair (n, m) of numerical arguments of (say) g, a unique value of g(n, m) can be computed from (or, a little more formally, is determined by a suitable finite number of numerical instances of) the equations in ∆(f, g).7

Condition 2 has the following properties:

• it can be expressed in purely logical (set-theoretic) form, using the definition (up to isomorphism) of N by Peano’s axioms, and Dirichlet’s definition of the graph of a function

• most elementary definitions of functions by equations satisfy it

• it implies condition 1 (for the proof, remember that a unique value can be determined in sense 2 only if there are functions f and g which satisfy ∆(f, g) at all; and if two functions, say f and g, do not satisfy ∆(f, g) then there is a finite set of numerical instances of ∆(f, g) which is not satisfied).

7For example, in the case of addition, f(n, m) is computed from the recursion equations applied to f(x, y) for x = n and 0 ≤ y ≤ m.

On the other hand, there are equations which satisfy 1 but not 2. For example, ∆(d, z) defined as follows:

d(0) = 0
d(S(x)) = S(S(d(x)))
z(x) = d(z(S(x))).

Actually, ∆(d, z) defines d and z in sense 1 (where, as the notation indicates, d is the doubling function and z is the constant function with value zero), and d but not z in sense 2 (since there is a unique function z compatible with all equations z(x) = 2z(x + 1) with x ∈ N, but many functions compatible with any given finite set of such equations).
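The failure of sense 2 for z can be made concrete. In the sketch below, `candidate` and `satisfies_instances` are hypothetical helpers introduced for illustration: for any bound, each choice of seed gives a function satisfying all instances z(x) = d(z(S(x))) with x below the bound, so finitely many instances never pin down a value of z.

```python
def d(n: int) -> int:
    """Doubling, computed from d(0) = 0 and d(S(x)) = S(S(d(x)))."""
    return 0 if n == 0 else d(n - 1) + 2


def candidate(bound: int, seed: int):
    """Return a function z with z(bound) = seed and z(x) = d(z(x + 1)) for x < bound."""
    def z(x: int) -> int:
        if x > bound:
            return 0  # arbitrary choice: instances with x < bound do not look here
        return seed * 2 ** (bound - x)
    return z


def satisfies_instances(z, bound: int) -> bool:
    """Check the finitely many instances z(x) = d(z(x + 1)) for x < bound."""
    return all(z(x) == d(z(x + 1)) for x in range(bound))


z_one = candidate(5, 1)
z_three = candidate(5, 3)
print(satisfies_instances(z_one, 5), satisfies_instances(z_three, 5))
print(z_one(0), z_three(0))
```

Both candidates pass the same finite test yet disagree at 0; adding the instance for x = 5 rules out every seed except 0.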

Order

Order relations come in different varieties: strict and weak (e.g. ‘less than’, and ‘less than or equal to’), total and partial (sometimes we order all elements of a domain, sometimes only ‘comparable’ ones). For example, a (binary) relation R defines a strict total ordering of a domain D if and only if it satisfies the following conditions:

1. (∀x, y ∈ D)[R(x, y) ∨ R(y, x) ∨ x = y]

2. (∀x, y ∈ D)[x = y ⇒ ¬R(x, y) ∧ ¬R(y, x)]

3. (∀x, y, z ∈ D)[R(x, y) ∧ R(y, z) ⇒ R(x, z)].

Any finite ordering is isomorphic to the usual ordering by size of an initial segment of the natural numbers. We now look at interesting infinite orderings.
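On a finite domain the three conditions for a strict total ordering can be checked mechanically. A minimal sketch, in which the function name and the representation of R as a two-argument Python predicate are assumptions made for illustration:

```python
from itertools import product


def is_strict_total_order(domain, R) -> bool:
    """Check conditions 1-3 for the relation R on a finite domain."""
    D = list(domain)
    # 1. Any two elements are related one way or the other, or are equal.
    comparable = all(R(x, y) or R(y, x) or x == y for x, y in product(D, D))
    # 2. Equal elements are never related (for x = y both conjuncts read ¬R(x, x)).
    irreflexive = all(not R(x, x) for x in D)
    # 3. Transitivity.
    transitive = all(
        R(x, z) for x, y, z in product(D, D, D) if R(x, y) and R(y, z)
    )
    return comparable and irreflexive and transitive


print(is_strict_total_order(range(5), lambda x, y: x < y))
print(is_strict_total_order(range(5), lambda x, y: x <= y))
```

The weak ordering ‘less than or equal to’ fails condition 2, and a partial ordering such as proper divisibility fails condition 1.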

The structure of the rational numbers

(Recall that) a set is countable if it is isomorphic to the natural numbers, preserving only = and ≠. Cantor discovered that every strict total ordering R of a countable set D which:

1. has no first or last element, i.e. (∀x ∈ D)(∃y, z ∈ D)[R(y, x) ∧ R(x, z)]

2. is dense, i.e. (∀x, y ∈ D)[R(x, y) ⇒ (∃z ∈ D)(R(x, z) ∧ R(z, y))]

is isomorphic to the familiar structure (Q, <) of the rational numbers with the usual ordering.

The structure of the real numbers

(Recall that) a Dedekind cut of a totally ordered structure (D, R) is a set L ⊆ D (where ‘L’ stands for ‘left’) which is:

• not trivial, i.e. (∃x, y ∈ D)(x ∈ L ∧ y ∉ L)

• downward closed, i.e. (∀x ∈ D)(∀y ∈ L)[R(x, y) ⇒ x ∈ L]

• open to the right, i.e. (∀x ∈ L)(∃y ∈ L)R(x, y).

Dedekind discovered that every strict total ordering R of a set C with a (countable) subset D ⊆ C such that:

1. (D, R_D), where R_D is the restriction of R to D (or, more pedantically, to D × D), is isomorphic to (Q, <)

2. D is dense in C, i.e. (∀x, y ∈ C)[R(x, y) ⇒ (∃z ∈ D)(R(x, z) ∧ R(z, y))]

3. for every Dedekind cut L of D there is x_L ∈ C such that

(∀x ∈ D)[x ∈ L ⇔ R(x, x_L)]

is isomorphic to the familiar structure (R, <) of the real numbers with the usual ordering.
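As an illustration of a Dedekind cut of (Q, <), consider the set L of rationals q with q < 0 or q² < 2, the cut corresponding to the square root of 2. This example and the helper names are not from the text. The witness used for ‘open to the right’ is q′ = (2q + 2)/(q + 2), for which q′ − q = (2 − q²)/(q + 2) > 0 and q′² − 2 = 2(q² − 2)/(q + 2)² < 0 whenever q ≥ 0 and q² < 2.

```python
from fractions import Fraction


def in_cut(q: Fraction) -> bool:
    """Membership in L = {q in Q : q < 0 or q*q < 2}."""
    return q < 0 or q * q < 2


def larger_member(q: Fraction) -> Fraction:
    """For q in L, return an element of L strictly above q (open to the right)."""
    if q < 0:
        return Fraction(0)  # 0 is in L and lies above every negative rational
    return (2 * q + 2) / (q + 2)


# A finite sample of rationals; the quantifiers over all of Q are only sampled.
sample = [Fraction(k, 10) for k in range(-20, 21)]
members = [q for q in sample if in_cut(q)]

print(all(in_cut(larger_member(q)) and larger_member(q) > q for q in members))
```

Exact rational arithmetic is used throughout, so the checks on the sample involve no rounding.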

4.2 Set Theory

The material presented in Section 1 gives a good idea of the part of set-theoretic foundations which characterizes familiar mathematical objects up to isomorphism. The present section provides set-theoretic definitions of (isomorphic) copies of such objects in the language of set theory. This ‘reduces’ familiar mathematics to set theory, in the same sense as geometry is ‘reduced’ to the theory of real numbers in analytic geometry (so-called axiomatic mathematics does not need any reduction, because the definitions are already given in set-theoretic terms).

Set-theoretic purity

The language of set theory uses variables over (suitable) sets, the relation ∈, and logical operations for building up compound propositions. Set-theoretic purity requires not only conditions (like Peano’s axioms) formulated in set theoretic language (as done above), but also set-theoretically defined structures which satisfy those conditions. As an example, we consider one possible set-theoretic definition of (N, 0, S):

1. S(x) = {x}8

2. 0 = ∅

3. N = ⋂{X : ∅ ∈ X ∧ (∀y)(y ∈ X ⇒ {y} ∈ X)}.

Then Peano’s axioms are trivially satisfied; in particular, the induction principle holds because, by definition, N is the smallest set containing 0 and closed under S.
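The set-theoretic copy of the natural numbers can be built explicitly for small numbers. The sketch below models hereditarily finite sets with Python's `frozenset`, which is an assumption of this illustration; `numeral`, `S_union` and `numeral_union` are names introduced here. The second successor, S(x) = x ∪ {x}, is the alternative mentioned in note 8, for which ∈ coincides with <.

```python
EMPTY = frozenset()  # 0 = ∅


def S(x: frozenset) -> frozenset:
    """Successor as in the text: S(x) = {x}."""
    return frozenset({x})


def numeral(n: int) -> frozenset:
    """The set representing n: S applied n times to the empty set."""
    x = EMPTY
    for _ in range(n):
        x = S(x)
    return x


def S_union(x: frozenset) -> frozenset:
    """Alternative successor from note 8: S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral_union(n: int) -> frozenset:
    """The set representing n under the alternative successor."""
    x = EMPTY
    for _ in range(n):
        x = S_union(x)
    return x


print(numeral(2) in numeral(3), numeral(1) in numeral(3))
print(numeral_union(1) in numeral_union(3))
```

With S(x) = {x}, membership only links a number to its immediate successor; with S(x) = x ∪ {x}, every smaller number is an element of every larger one.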

Supply of sets

What sets are we talking about? Today we think of the so-called cumulative hierarchy, built up from the empty set by iterating the so-called power set operation P (P(x) being the set of all subsets of x). The first few iterations are defined as follows:9

V_0 = ∅
V_{2n+2} = V_{2n} ∪ P(V_{2n}) for n ≥ 0
V_1 = ⋃_{n≥0} V_{2n}
V_{2n+1} = V_{2n−1} ∪ P(V_{2n−1}) for n > 0.

The hierarchy is called cumulative since, for α preceding β, Vα ⊂ Vβ; here α precedes β if:

(α is even and β is odd) or (both are of the same parity and α < β).
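The finite levels of the hierarchy (the even indices in the numbering used here) are small enough to compute. A sketch under the assumption that `frozenset` models hereditarily finite sets; `power_set` and `finite_levels` are illustrative names. Only the first five levels are built, since the next one already has 65536 elements.

```python
from itertools import combinations


def power_set(s: frozenset) -> frozenset:
    """P(s): the set of all subsets of s."""
    items = list(s)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )


def finite_levels(count: int) -> list:
    """Return the first `count` finite levels: V_0, V_2, V_4, ... in the text's indexing."""
    levels = [frozenset()]
    while len(levels) < count:
        previous = levels[-1]
        levels.append(previous | power_set(previous))
    return levels


levels = finite_levels(5)
sizes = [len(v) for v in levels]
print(sizes)
```

Each level is a proper subset of the next, and also an element of it, because P(V) always contains V itself.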

When people speak of models for (tacitly, some given axiom system of) set theory they mean thinner hierarchies built up by means of some P− instead of P. For example, Pf : taking all finite subsets; or Pd: taking all subsets defined in a specified language (e.g., the language of set theory in the sense described earlier).

Defining sets by predicates

When defining numbers by equations one must pay attention to the kind of numbers one is talking about: for example, x² = 2 does not define a rational number, x² = −1 not any real number, 2^x = 0 no number (you want to consider in algebra) at all. The same happens when defining sets by predicates, where now one must pay attention to the kind of sets (i.e. the levels of the cumulative hierarchy) one is talking about: for example, (∀x ∈ V_m)(x ∈ y) does not define any set at all on V_m (since, for each m, if y ∈ V_m then y ∉ y), but it does on V_{m+1} (since V_m ∈ V_{m+1}).

8This is the simplest possible definition consistent with the intuition that y follows x if x ∈ y. If we wanted ∈ to be the order relation < on N, then S should be defined as: S(x) = x ∪ {x}, because n < x + 1 ⇔ (n < x) ∨ (n = x).

9If you know about ordinals, 2n takes the place of n, 1 of ω, and 2n + 1 of ω + n. Thus we are defining Vα for α < ω + ω.

Russell and, before him, Frege (mis)understood Cantor to say that every predicate defined a set. Actually, Cantor said more or less the opposite: only those predicates satisfied by a collection of objects that can be grasped as a whole (unity) define a set; formally: for any P and z,

(∃y)(∀x)[x ∈ y ⇔ x ∈ z ∧ P (x)].

The above (mis)understanding is expressed instead by Frege’s axiom: for any P ,

(∃y)(∀x)[x ∈ y ⇔ P (x)].

Fortunately, one gets a contradiction (the so-called Russell’s paradox), e.g. when P (x) is x ∉ x. Contrary to a simpleminded view, this is fortunate since not every error is so blatant that it actually leads to a contradiction.

Proofs in Set Theory

Euclid’s aim was to present geometry by starting with a few properties of the plane, called axioms, from which all other properties expressed in (his) geometric language followed logically. As we know now, modulo a few minor corrections,10 he succeeded. A neglected aspect of Euclid’s presentation is that little was said explicitly about the ‘kinds’ of points in the plane: in modern terms, whether only points with rational, or also algebraic, or even all real coordinates are meant.11

Frege’s (and Russell’s) aim was similar to Euclid’s, including an analog to the neglected aspect: little was said about whether the full operation P or a thinner variant P− is to be used (P ‘corresponding’ to the full plane in the case of Euclid); or, more subtly, how often the operation P (or P−) is iterated. The idea(l) of set-theoretic purity (formulated above) was taken very strictly: in particular, all the axioms were to be stated in set-theoretic language; and all predicates P , for example in

(∃y)(∀x)[x ∈ y ⇔ x ∈ z ∧ P (x)],

were to be defined in elementary set-theoretic language. The reason for this (terminological) choice comes from an additional requirement (beyond mere purity), not present in Euclid and considered in the next section.

10One correction: he forgot to list the relation of order. So, for example, points on a sphere with great circles regarded as lines would satisfy all of Euclid’s own axioms, except for the parallel postulate.

11Of course, from the context one sees that some irrational, but not all algebraic points were intended (the former by Pythagoras’ Theorem, the latter because it was not considered obvious that every angle could be trisected).

4.3 Formal Proofs

Today this idea is very easy to explain because of familiarity with computer programs; in particular, for so-called non-numerical data processing. Specifically, the data which are being ‘processed’ (that is, manipulated) are formulas of the logical languages in question, and sequences (or even trees) of such formulas. The most modest aim is to find a program which has two properties: starting with a list of (necessarily, formal) axioms, it generates

1. only formulas which are logical consequences of the axioms (for example, from A and A → B it generates B)

2. all formulas which are logical consequences of the axioms (so, if C is such a formula, sooner or later C will be generated by the program).
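A toy version of such a program can be sketched as follows. It is deliberately incomplete: the only rule is the one mentioned in property 1 (from A and A → B generate B), so it has property 1 but in general not property 2. The encoding of formulas (strings for letters, tuples `("->", A, B)` for implications) and the function name are assumptions made for this illustration.

```python
def modus_ponens_closure(axioms):
    """Generate everything obtainable from the axioms by repeated modus ponens."""
    derived = set(axioms)
    changed = True
    while changed:
        changed = False
        for formula in list(derived):
            # An implication A -> B is encoded as the tuple ("->", A, B).
            if isinstance(formula, tuple) and formula[0] == "->":
                _, premise, conclusion = formula
                if premise in derived and conclusion not in derived:
                    derived.add(conclusion)
                    changed = True
    return derived


axioms = {"A", ("->", "A", "B"), ("->", "B", "C")}
derived = modus_ponens_closure(axioms)
print("C" in derived, "D" in derived)
```

The loop terminates because every generated formula is a subformula of some axiom. A program with property 2 as well would need a complete set of rules, which this sketch does not attempt.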

One could think of less modest aims in this area. For example, to find a program which, for each logical formula C, decides (that is, prints 1 or 0 according as to) whether C is valid. So far no requirement has been made to the effect that proofs are faithfully represented by the formal derivations generated by our computer programs; no more than most computer programs for numerical calculation imitate our actual methods of computation.

Evidently, the requirement on a computer program to respect set-theoretic purity will, for this very reason, distort our actual proofs of set-theoretic propositions. For example, if we want to know something about the set-theoretic version of N (and ∈ restricted to N or, more pedantically, to N × N), we shall make use of previous experience in arithmetic, and transfer our arithmetic knowledge to the isomorphic set-theoretic copy (by use of the transfer principle on p. 56). For set-theoretic purity this simple transfer would have to be replaced by a kind of padding; of deriving the set-theoretic translation of the arithmetic knowledge involved from the formal axioms for set theory. This requirement of set-theoretic purity is also formulated as a certain doctrine of formal rigor.

Formal Procedures

We use the word ‘formal’ as synonymous with ‘mechanical’. So, a formal procedure can be programmed on a computer. A formal decision method is a formal method for deciding problems (of some given class). For example, the use of determinants for deciding if, for given (say, rational) a_ij and c_i, the equations

Σ_j a_ij x_j = c_i (1 ≤ i ≤ n, 1 ≤ j ≤ m)

have a solution x_j (1 ≤ j ≤ m).

A semi-formal verification method is a formal method for verifying that a problem (of some given class) has a solution (note that the method need not show that a problem does not have a solution). For example, the use of trials and errors to find out if, for a given polynomial P with integral coefficients,

P (x1, . . . , xn) = 0 for natural (or integral or rational) x1, . . . , xn.
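The trial-and-error method can be sketched for natural-number solutions. The function name and the `max_bound` parameter are assumptions of this illustration; without a bound, the search halts exactly when a solution exists, which is what makes it a verification method and not a decision method.

```python
from itertools import count, product


def search_natural_zero(P, arity, max_bound=None):
    """Search tuples of naturals, by increasing largest entry, for a zero of P.

    With max_bound=None the search runs forever if P has no zero.
    """
    bounds = count(0) if max_bound is None else range(max_bound + 1)
    for bound in bounds:
        for candidate in product(range(bound + 1), repeat=arity):
            # Test each tuple once: at the stage equal to its largest entry.
            if max(candidate, default=0) == bound and P(*candidate) == 0:
                return candidate
    return None


print(search_natural_zero(lambda x, y: x * x + y * y - 25, 2))
print(search_natural_zero(lambda x: x * x - 2, 1, max_bound=50))
```

The second call returns `None` only because a bound was imposed; that outcome by itself does not show that x² − 2 has no natural zero.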

Formal procedures in logic

An obvious formal decision method for validity of a propositional formula F built up from letters p1, . . . , pn uses truth tables: try out all 2^n truth distributions on p1, . . . , pn, and evaluate F for them12 (this can easily be mechanized by letting 1 stand for true, 0 for false, 1 − p for ¬p, and p · q for p ∧ q).

For a predicate formula F the notion of validity is equally easy to state (true in all possible worlds), but sounds forbidding. Suppose the formula F is built up from relation symbols R_i with variables (for objects) x_j (1 ≤ j ≤ n_i: in particular, n_i = 1 means that R_i is a predicate, n_i = 0 that R_i is a proposition). Then F can be interpreted in any ‘world’ which consists of a collection of objects U (the universe over which the variables range), by interpreting R_i as a relation R_i^U ⊆ U^{n_i} defined on U. F is valid if its interpretation is true for each (nonempty) U and each interpretation R_i^U.

Although this sounds mind-boggling, there is in fact a semi-formal verification method for validity.13 Its existence is made plausible by the fact that validity in countable worlds ensures full validity. This can be seen as follows:

1. F can be put into (an equivalent) so-called prenex form, in which all the quantifiers are at the beginning. For example, ∀xG ∧ ∀yH goes into ∀x∀y(G ∧ H), and ¬∀xG goes into ∃x¬G.

2. Suppose F is in prenex form, for example

∀x1∃y1 · · · ∀xn∃yn F0(x1, . . . , xn, y1, . . . , yn),

and is true in a universe U. Then there are functions f_i(x1, . . . , x_i) defined on U^i such that

∀x1 · · · ∀xn F0(x1, . . . , xn, f_1(x1), . . . , f_n(x1, . . . , xn))

is true in U.

3. Let U0 be the part of U generated from some element u by applying the f_i repeatedly: F is true in the countable universe U0 too.

12There are more efficient procedures, one of which is described in Chapter 10.

13One such method is modelled on the ‘more efficient’ decision procedure for propositional logic quoted in note 12. On the other hand, it can be proved that no formal decision method exists.

Formal procedures in mathematics

The discovery of a formal presentation of logic suggested a program: to find suitable formal presentations F_B (in terms of elementary logical axioms expressed in a suitable language) for any given branch B of mathematics. This would go beyond mere purity (discussed in Section 2), requiring formal purity.

Even for those branches B for which the program can be carried out, it cannot be expected to increase the reliability of one’s conclusion. For example, purely numerical computations can be carried out formally, but one still wishes to use abstract theorems to check such computations. Even less relevant is formal rigor for explaining mathematical certainty. For example, if a proof π of F is not a derivation in F_B, the existence of a derivation π′ of F in F_B (which we do not have in mind when looking at π) cannot explain the conviction carried by π itself.

The principal philosophical problem is thus to discover the significance of formally rigorous expositions for particular (not necessarily: all) branches B that can be so presented. The best use of a formally rigorous presentation of B will, in general, exploit specific properties of B. The most obvious general significance is, of course, that such a presentation makes B a candidate for mechanization; ‘candidate’ because genuine mechanization involves questions of efficiency.
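The truth-table method for propositional formulas, with the arithmetical mechanization mentioned under ‘Formal procedures in logic’ (1 for true, 0 for false, 1 − p for ¬p, p · q for p ∧ q), can be sketched as follows. The helpers `OR` and `IMPLIES` are additions made here, defined from ¬ and ∧ in the usual way, and formulas are represented as Python functions of 0/1 arguments, which is an assumption of the sketch.

```python
from itertools import product


def NOT(p: int) -> int:
    return 1 - p          # ¬p as 1 − p


def AND(p: int, q: int) -> int:
    return p * q          # p ∧ q as p · q


def OR(p: int, q: int) -> int:
    return NOT(AND(NOT(p), NOT(q)))  # added here: p ∨ q as ¬(¬p ∧ ¬q)


def IMPLIES(p: int, q: int) -> int:
    return OR(NOT(p), q)  # added here: p → q as ¬p ∨ q


def is_valid(F, n: int) -> bool:
    """Evaluate F on all 2**n truth distributions of its n letters."""
    return all(F(*row) == 1 for row in product((0, 1), repeat=n))


print(is_valid(lambda p: OR(p, NOT(p)), 1))
print(is_valid(lambda p, q: IMPLIES(p, q), 2))
```

The number of rows doubles with each additional letter, which is the efficiency concern behind the remark on mechanization.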

4.4 After Frege’s Foundations

The principal ideals of Frege’s foundations have been objected to in the last half century both by philosophers (Wittgenstein, Popper, Lakatos) and mathematicians (Bourbaki), usually without detailed knowledge of the formal development of logical foundations. Their criticisms of the logical view (as being imprecise or circular, or as ignoring the cultural or historical ‘relativity’ of the notions regarded as basic, i.e. their dependence on environment or evolution14) express a malaise about those notions, as drawing attention away from aspects of mathematics and mathematical reasoning which are more relevant to the broad concerns of the philosophy of mathematics.

14This goes well with the great success in biology in the last half century. Naturally, what excites the philosopher (and his readers) is the popular understanding of those successes; not the snags which worry the specialists. As a corollary, and contrary to the traditions of analytic philosophy, it cannot be expected that a precise analysis of the recent (popular) views is rewarding, since the views themselves are weak in detail. But they draw attention to a general scheme which guides the choice of striking specific facts: to illustrate similarities and differences in the evolution of species (by lavish Nature) and the evolution of mathematical knowledge (by tame mathematicians).

The examples we consider here are (the later) Wittgenstein, and Bourbaki’s manifesto [1948] on the architecture of mathematics (which is the most sophisticated exposition of a view that has dominated a certain part of mathematical practice over the last half century). Both of them contrast the ideals of Frege’s foundations with aspects of mathematics which are:

1. so to speak, by the light of nature, obviously essential to mathematics, including the matter of reliability;

2. equally obviously, simply dismissed in the logical schemes.

An example of 2 is the choice of notation in elementary arithmetic; or of abstract (e.g., suitable group-theoretic) properties, defined by explicit definitions (from some specific structure such as the lattice of integral points in the plane). Without exaggeration: such choices, among all valid possibilities, are literally lumped together in the logical categories, which makes the latter demonstrably inadequate w.r.t. 1.

The objections referred to above do not rely on the paradoxes. On the contrary, though the latter are usually presented as an embarrassment to Frege’s foundations, they come as a Godsend for one of his principal aims, of securing the principles of mathematics: without those paradoxes there was not even the appearance of any urgent need for a logical analysis. Bourbaki hardly mentions the paradoxes, ignoring them as a minor aberration; a view which is supported by the fact that Cantor warned Frege back in [1885] about his thoughtless axiom. In contrast, Wittgenstein argued strenuously for some deep (that is, revealing) misconception behind the near-hysterical reaction of logicians to the paradoxes; perhaps, to be compared to Freud’s dramatization of slips of the tongue by arguing as if all of them were profoundly revealing.

Wittgenstein’s and Bourbaki’s objections are complementary also in another respect. The former uses experience from quite elementary mathematics, neglected (sometimes consciously) in logical foundations. The latter uses experience of advanced mathematics, backed by full details of a new systematic exposition (cutting across the logical system). It is easy to get a general idea of the objections from the following parallel with Newton’s foundations.

Newton’s foundations

Here we mean the famous foundations in Newton’s three laws of motion about mass, distance, and time (and the ‘reduction’, or ‘definition’, of force in terms of them). In the parallel, these notions correspond to the basic stock of logical notions in logical foundations.

Clearly, here very different things are lumped together, for example, by having the same mass: objects of very different shape, color or chemical composition (or, simply, of different commercial value). Newton discovered one class of phenomena, the motion of the planets under gravity, for which the mechanical parameters above were adequate. In the parallel, the particular property of gravitational attraction in natural science corresponds to the validity of mathematical propositions after translation into the logical language mentioned.15 The parallel can be pushed one step further: recognizing the correctness of the translation corresponds to recognizing the purely gravitational (as opposed to electrical, magnetic or other) character of the attraction involved.

The misgivings of critics of logical foundations can be related to the following well-known limitations of Newton’s scheme.16 First, Newton’s scheme is effective in the (practically speaking) narrow area of celestial mechanics relevant to the motion of planets (and spaceships), and much less for terrestrial mechanics. Secondly, knowledge of the motion of planets is almost irrelevant to such problems as their origin or composition. And it is fair to say that other scientific work at the time of Newton, for example, early chemistry (distinguishing between different chemical components of bodies of the same mass or even of the same density), has done much more than Newton’s magnificent theory towards answering satisfactorily the question: What is natural science (about)?
The critics of logical foundations attempt to provide, for mathematics, something like an analog of the clever, apparently isolated, observations and arguments in chemistry and other areas which extended the scope of natural science beyond mechanics. Oversight by (most of) the critics of logical foundations: Newton’s foundations remain to this day a powerful tool in all of natural science, for solving simple problems simply; so-called dimensional analysis.

Bourbaki’s foundations

It does not tell us much about a physical phenomenon if we are told that it is the result of the action and reaction of forces.17 It is not only necessary to know what kind of forces are dominant (gravitational, electric, magnetic, etc.), but often also, for example, the chemical composition of the matter involved. To put it crudely: the least interesting thing to say about gravitational and magnetic forces is that, in both cases, forces are involved.

In terms of the above parallel, it does not tell us much about a mathematical structure if we are told that there is an (isomorphic) set-theoretic copy of it. It is much more rewarding to replace the one (basic) notion of set by a relatively small number of (so-called) basic structures, of which groups, fields, topological spaces, etc. are typical. Bourbaki discovered that use of these basic structures made it possible to reinterpret an astonishing amount of old mathematics, and to build up much new mathematics (including solutions to many old problems).18 The standard text that documents the discovery (and assembles the technology based on that discovery) is Bourbaki’s Elements of mathematics.

The following examples illustrate the kind of reward provided by the change of perspective just discussed.

1. About a century ago careful measurements of the spectral lines of chemically pure substances led to the famous composition law of Ritz-Rayleigh. In isolation it was an observational curiosity. Much later it was reinterpreted as constituting evidence for the quantization of energy (in radiation from excited atoms). Here mere mechanistic foundations, in the sense that some forces obeying Newton’s general laws are involved, are (to use the hackneyed phrase above) the least interesting side of the matter.

15 Historical anecdote. Eötvös spent many years of his life checking that the (gravitational) attraction between (electrically uncharged) masses was independent of their color and chemical composition. Modern critics of logical foundations would compare the Principia of Whitehead and Russell to the work of Eötvös.

16 The deeper criticisms of Einstein are not involved here.

17 This is meant broadly. Very simple questions may be answered by knowing that some force is present: for example, if we merely want to know whether or not a particle will move with constant speed in a straight line; in particular, when we do not care how much the motion deviates.

2. Euclid’s algorithm for the greatest common divisor g of two numbers n and p implies that if p is a prime then the set {0, . . . , p − 1}, with addition and multiplication modulo p,19 is a field.20 The consequence just mentioned of Euclid’s algorithm appears as a neat curiosity in any elementary text of number theory. To use this curiosity for establishing the fact that {0, . . . , p − 1} is a field is tantamount to discovering new significance (in the curiosity).

Digression. In terms of a different parallel, Bourbaki’s work can be seen as the 20th century mathematician’s solution of the job done in natural21 language by vague expressions. The job in question is: flexibility in many (often unexpected) situations. The mathematical solution is to have a relatively small number of precise variants (of the vague expression) which serve on relatively many occasions. For example, instead of the vague expression ‘number’ one has: (number) fields, real and real closed fields, etc. The skill needed to pick the relevant expression for a particular occasion replaces the skill needed to use vague expressions sensibly in everyday affairs.

18 Warning. Bourbaki, too, uses the language of sets to answer formally the question: What is a group? But it does not need set theory, that is, recondite properties of sets; and in the great bulk of group theory the formal answer to the question above is irrelevant (the words ‘collection of objects’, ‘operation’, etc. are used without formal explanation in set-theoretic vocabulary).

19 This means that n ⊕ m and n ⊗ m are, respectively, the remainders of n + m and n · m after division by p, where ⊕ and ⊗ denote addition and multiplication in {0, . . . , p − 1}.

20 The surprising fact here is the existence of an inverse, i.e. that for 0 < n < p there is an m < p such that n ⊗ m = 1; in other words, for some a: nm = ap + 1. By Euclid’s algorithm, there are whole numbers a and m such that ap + nm = g, and (a and) m can be so chosen that (a < n) and m < p, since g ≤ min(n, p); if p is prime and 0 < n < p, then g = 1.

21 Tacitly, natural to those who have neither reflected much themselves, nor tried to learn from those with specialized experience.

The advantage of the mathematical solution is that elaborations of precise (families of) notions stand at least a chance of being rewarding, while this is never true of vague notions; even when the latter are adequate for simple situations (cf. the boring, hence unrewarding, measured prose of philosophical disquisitions).
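To return briefly to example 2 above: the step from Euclid’s algorithm to the existence of inverses in {0, . . . , p − 1} can be made concrete by a short computation. The following is only a minimal sketch, under the assumption that the extended form of Euclid’s algorithm is used (the form that also keeps track of the whole numbers a and m of the footnote); the names extended_gcd and inverse_mod are hypothetical, chosen for illustration, and do not come from the text.

```python
def extended_gcd(n, p):
    """Extended Euclid's algorithm (illustrative helper, not from the text).

    Returns (g, m, a) with n*m + p*a == g, where g is the greatest
    common divisor of n and p. The whole numbers m and a may be negative.
    """
    old_r, r = n, p      # remainders
    old_m, m = 1, 0      # coefficients of n
    old_a, a = 0, 1      # coefficients of p
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_a, a = a, old_a - q * a
    return old_r, old_m, old_a


def inverse_mod(n, p):
    """Return the m in {0, ..., p - 1} with (n * m) % p == 1.

    Such an m exists exactly when gcd(n, p) == 1, which is automatic
    when p is prime and 0 < n < p.
    """
    g, m, _ = extended_gcd(n, p)
    if g != 1:
        raise ValueError("n has no inverse modulo p")
    return m % p         # reduce m into the range {0, ..., p - 1}
```

For instance, inverse_mod(3, 7) yields 5, since 3 · 5 = 2 · 7 + 1. In Python 3.8 and later the built-in call pow(3, -1, 7) gives the same value; the explicit version is shown here only because it exhibits the coefficients that the argument relies on.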

Philosophical perspective

We now take a brief critical look at Bourbaki’s alternative to logical foundations.

a) Developments

The following reminders concern the lopsided character of all general views.

First, it is granted that the (modern) atomic view of matter constitutes, in an obvious sense, great progress compared to our untutored view. Far greater than, say, the relevant developments of Newton’s mechanistic foundations in what is known as rational mechanics of solids, liquids and gases (where matter is classified according to its phase state, rather than its chemical composition or atomic structure); or, perhaps, even greater progress than statistical mechanics, where neither classification is relevant. But nevertheless there are plenty of macroscopic phenomena where the microstructure is irrelevant, and so it would be a practical error to drag in the atomic view.

Second, our knowledge of the microstructure and its quantum-theoretical laws has created a certain bias in the phenomena around us.22 Specifically, many more phenomena in the civilized world have atomic character in the sense described above; that is, atomic forces and quantum-theoretical laws are dominant. This applies, obviously, to atomic bombs, but also to computer hardware (which exploits the properties of semiconductors and their quantum-theoretical character). In contrast, in the wilderness we do not see these products of our understanding of microphenomena.23

So it will not come as a surprise that current mathematical literature shows a bias to developments according to the current lopsided view or, for that matter, towards old questions that lend themselves to solutions in line with this view.

22 They are perfectly real phenomena: this is not the issue at all.

23 Warning. In a trivial sense, everything has microaspects, matter being atomic and energy being quantized: but these particular aspects are not always dominant. On the other hand, sometimes they are unexpectedly central: as the source of solar heat (which is due to nuclear fusion), or of volcanoes (which are due to radioactive decay in the interior of the earth).

b) Implications

Even if one corrects for the bias just described, the progress over the logical view is very striking. This applies even to questions about validity and reliability, because the logical emphasis on principles of proof (and their validity) is usually misplaced: the dominant source of error lies in incorrect applications of correct principles, and reinterpretations have provided efficient cross checks.

A general lesson to be learnt from this is that restrictions on principles of definition or proof are not automatically ‘justified’; no more than Cartesian doubts.24 For example, restrictions on principles may lead to more complex proofs, and thus a higher probability of error in applying the principles and a lower reliability. Similarly, for efficiency of solutions.

Since logical foundations were the intended application of contemporary logic25 (i.e. notions, conjectures and problems of the latter were chosen for their relevance to the former) and such an application has turned out to be defective, a shift of emphasis will be needed to salvage the mathematical methods developed along the way.

c) Defects

Broadly speaking, only the rhetoric accompanying logical foundations, not the quite substantial contribution of contemporary logic, is taken into account.

To start with, some elements of logical foundations have in fact changed the face of mathematics. For example, logical notation is an obvious improvement of ‘natural’ mathematical language: an astonishingly simple grammar and vocabulary which (often) replace with advantage the vagaries of ‘any’, and the like, with little loss of expressive power.
Again, far from being logical hygiene (or, perhaps, even mere logical cosmetics) the language of sets permits a principle of reasoning by analogy, refined by first order language (in technical terms: by transfer principles).26

Second, no attempt has been made in the literature to assemble and perhaps even organize those aspects which logical foundations fail to consider or to appreciate.27 Moreover, the potential of contemporary logic for sharpening this has been almost completely neglected, contrary to broad scientific experience. For not only the virtues, but also the defects of a theory or even of a (scientific) strategy are best established by developing or even carrying it out, provided the (necessary) search for tests is not forgotten.28 Contemporary logic has of course been used to sharpen objections within the logical tradition, familiar from the many ‘negative’ metatheorems about incompleteness or undecidability. Here it is used to sharpen objections from without, that is, by reference to the broad philosophical concerns for which logical foundations were intended.

24 Descartes forgot the rule of thought: dubious doubts are no holier than dubious assertions.

25 In our century, in contrast to the past, most new mathematics was developed for application to already well established parts of pure mathematics: contemporary logic is an exception, to the extent to which it continued to be developed for logical foundations.

26 This fact remains though, admittedly, it has been overstressed compared to less advertised but equally useful transfer principles within mathematics, such as Hasse’s (local vs. global).

27 Here it cannot be assumed that the most effective presentation will take the form of metatheorems, with an aesthetic appeal comparable to that of logical ones; ‘aesthetic’ because the effective use of logical metatheorems usually requires a great deal of ingenuity (cf. the number of such uses of, say, Hasse’s principle and logical preservation theorems).

28 Sample. Recent work on so-called reverse mathematics has used (perfectly, according to the canons of the logical tradition) logical categories for classifying proofs and theorems, thereby permitting comparisons with classifications by competing systems, and thus judgment of the adequacy of those canons.

Some neglected concerns

One natural first step to correct the (second) philosophical defect mentioned above is to list a few slogans that draw attention to broad philosophical concerns where logical foundations have failed, for systematic reasons as it were.

• Recording and improving a natural language, including natural trade jargons such as written or (the very different) spoken languages of mathematical tribes throughout the centuries. One improvement is provided, as discussed above, by logical notation.

• How to benefit from mathematical experience, without the fiction of would-be empiricists that mere validity is the only virtue of mathematics (and thus correction of our notion of rigor the only progress).29 One way is to discover the relevant range of ‘possibilities’, e.g. by establishing generalizations (reminder: the possible is prior to the actual).

• How to go beyond informal rigor, without the fiction that there is no such thing. Examples of informal rigor are the definitions of length or convexity in geometry; or of the perfect computer, and thus of mechanical rule.

• Combining formal rigor and flexibility, the latter without the use of fuzzy notions (and fuzzier theory), but by discovering relatively few (precise) notions which are adequate for relatively many (actual) situations.

• How to use both theses and antitheses in appropriate domains, without necessarily synthesizing them.

• When the coarse is better than the refined, and conversely. For example, in the case of a suitable coarse quotient (without fictions about intensions).

• The triple role of generalizations: in the literal sense, for analyzing relevance in the case of theorems, and making proofs easy to take in and remember (by paraphrasing untutored proofs in the language of suitable abstract structures).

• From foundations to technology. For example, if an assumption about mental processes (in Artificial Intelligence) is false for the intended domain of human data processing, it may indeed apply to a simulation of some result obtained by those processes.30

29 Popper, Lakatos and others are obviously disturbed by the fact that logical foundations (seem to) leave no room for experience. It is a mere assumption that the principal (or even only) virtue of experience is to correct a lax notion of rigor; as if logical foundations were defective on logical grounds. 100 years ago the silent majority rejected abstract sets altogether, while today they are used freely (except for Frege’s particular axiom). In short, the arsenal of notions and proofs is expanded.

30 Actually, simple-minded assumptions have, in general, a better chance of being realized by electronic or photonic hardware.

Chapter 5

Variants of Frege’s Foundations

The term Frege’s foundations is here used to mean the study of logical features of mathematics, in the sense illustrated by Euclid’s style of presenting geometric proofs, and broadened to its current meaning (by work on Frege’s foundations). Here the general notions of predicate and proposition (together with operations on them), and problems of validity and definability are particularly prominent. This meaning corresponds quite well to what the interested (half-)educated public understands by the term, not necessarily to what Frege himself wrote, let alone intended, at various times; nor to the meaning(s) that Frege scholars have read into the passages which happen to impress them. If one is interested in Frege’s broad influence our meaning is relevant, and not historical scholarship; simply because the broad perception of Frege’s ideas cannot be computed from the historical facts (however fascinating they may be in themselves). To use a hackneyed phrase: it is best to make a fresh start, and take the so-called influence (that is, its end results) as a fact of our natural history.

This chapter develops two principal points. First, the so-called rivals of (Frege’s and Russell’s) logical foundations (associated with Zermelo, Hilbert, and Brouwer) are here regarded as variants;1 respectively: to (correct and) simplify, refine, and extend Frege’s scheme. Each of the variations is seen as a special case of a familiar strategy in the pursuit of knowledge. In particular, the extension provided by Brouwer’s intuitionistic logic concerns the class of propositions considered: about incompletely defined objects such as choice sequences. In contrast, Frege or, for that matter, Aristotle thought that only propositions about precisely defined terms lent themselves to anything like logical theory. (The scientific value of the extension requires a separate analysis). This view of intuitionistic logic is fairly consistent with the mature Brouwer’s own research practice, but not at all with his rhetoric (about 7 sevens in the expansion of π or about the unreliability of classical logic).

Secondly, the chapter explains the silent majority’s scepticism about all variants of logical foundations, and (in Section 5) sketches a genuine alternative which is almost explicit in the manifesto [1948] (by the Founding Fathers) of Bourbaki with the aim of exploring the architecture of mathematics and our intuitive resonances to it. The alternative sees the weakness of logical foundations not in any logical defects, such as lack of precision, but in its sterility: stress on the logical elements of mathematics and mathematical reasoning draws attention away from discoveries of more significant features; more significant even for validity and reliability, though these matters are less prominent here than in logical foundations. The weakness in question is relevant beyond the present topic of foundations, because this kind of weakness affects the whole analytic tradition in philosophy (which, as its name implies, relies on analysing familiar notions rather than on extending experience, usually by developing technology needed for such extensions).

0 Originally published in The Monist, 67 (1984) 71–91, as ‘Frege’s Foundations and Intuitionistic Logic’.

1 In Kreisel [1980] I attempted to summarize (my) knowledge of foundations and stressed, quite correctly, the parallels (to 3 decimal places!) in the developments of different foundational schemes during the present century. But at that time I failed to see the obvious source of these parallels, namely one of the principal points of this chapter; those schemes are mere variants of a single scheme: Frege’s foundations. Correspondingly (in the tradition of Frege scholarship mentioned earlier) I overstressed differences from his scheme. So whatever weaknesses there may be in the present perspective, they are not due to lack of familiarity with the attractions of the opposite view.

5.1 Zermelo’s foundations

Set-theoretic foundations in the style of Zermelo [1930] are without doubt the best-known variant. Specialists emphasize the difference between:

1. the concept of set, Cantor’s variety that can be grasped as a unity (or, in Gödel’s felicitous terminology: the concept of subset-of)

2. the concept of predicate, of possibly unbounded extension, used by Frege and popularized by Russell.

But in the light of general scientific experience, the switch from 2 to 1 is quite standard, a matter of suppressing unmanageable detail. Specifically, sets are generally presented by means of predicates, for example, a curve (a set of points) by some particular equation or geometric condition (locus). However, both for theory and in everyday life, the totality of data, which happen to strike us about an object, tends to be unmanageable. In technical jargon: one considers a homomorphic image (tacitly: respecting appropriate - crude - relations between different data).2 The switch from 2 to 1 is of the kind just described, the crude relation being that of extensional equivalence between intensional predicates. Experience shows that we often know more about the objects involved than (what we imagine to be) the intensions. If and when matters change, it could be rewarding to look at intensions or, more precisely, some of their relevant aspects.3

A more specific difference is the restriction of the sets considered by Zermelo, built up from some unspecified supply of atoms (in German, Urelemente), to those built up from just one atom (the so-called empty set). Superficially, there is a big difference between Frege’s long-winded discussions whether Julius Caesar should be included among the objects considered, and Zermelo’s elegant solution of not excluding him from the supply of atoms. The irrelevance of the whole issue for (current) mathematics is then expressed elegantly by this: any structure in Zermelo’s hierarchy has an isomorphic copy in the restricted hierarchy, and (for structures in current mathematics) the copy is even definable in the familiar language of set theory.

Closer attention to the available data, that is, to the predicates available for defining sets, becomes necessary when the limitations of currently formulated knowledge of sets are at issue (for example, in independence results).
Each such ‘negative’ result has, of course, a positive side, familiar from all successful uses of Ockham’s razor: greater generality derived from using only partial knowledge about the specific objects originally considered.

5.2 Hilbert’s foundations

Finitist 4 foundations (following Hilbert’s best seller Grundlagen der Geometrie, where he emphasized the 2,000-year-old ideal of purity of method) are by no means in conflict with the logical scheme pursued by Frege and Russell, but a most natural, albeit misguided, attempt at a refinement. The idea is this.

2 Contrary to an almost universal oversight, this switch occurs in the simplest kind of perception, for example, of a pattern of parallel segments; at least in so-called preattentive perception we may note and remember the direction, but neither the length nor, say, the color of the segments. In fact, some evidence on the corresponding loss or gain of information in the data processing in sense organs and the peripheral cortex is now available. In the light of this fact of experience, the analysis by Frege and Russell of the abstract concept of direction as a class of suitable lines or concrete segments appears as an artifact. Of course: not the precision of the distinction between concrete and abstract, but its significance appears dubious.

3 When the role of perspective was discovered in the way 3-dimensional objects present themselves, it was natural to enrich early geometry by taking perspective projections into account.

4 The term formalist foundations, actually coined by Brouwer [1913] for Hilbert’s scheme (since, in mathematics, formalism is the lowest form of life), later fired Hilbert’s imagination. When the scheme stagnated in the twenties, his claims for it became more inflated; of having finally nailed down the laws of thought. They were published in Hilbert [1930].

Granted set-theoretic foundations of, say, arithmetic or analysis, with a definition of the natural number series in set-theoretic terms, Hilbert asked: Can’t we do better? Specifically, when using such definitions for proving purely number-theoretic results, for example, the insolubility of a diophantine equation (about + and ×), which sets are really relevant? Wouldn’t it be satisfying if one needed only sets defined in elementary number-theoretic language? The archetypical example is provided by the restriction of Peano’s axioms, with induction applied to arbitrary predicates to ensure categoricity, to first-order arithmetic where only predicates in the language of elementary number theory are used. For the special cases where only the successor or even only + or × (not both) is used, the corresponding system is indeed complete, that is, all true statements in the elementary language can be proved from the corresponding elementary axioms.

The popularity of this ideal of Methodenreinheit (purity of methods) goes back to the Greeks. If anything, it was made more attractive by Hegel’s threat that everything is connected with everything else: at least ‘in principle’, Hilbert’s scheme would show that one need not look at anything other than the natural numbers, + and × to settle questions formulated in these terms (in the sense of elementary logic). Throughout the first half of this century the ideal caught the imagination of many number theorists, in the specific glamour issue of an elementary proof of the Prime Number Theorem.

It is clear enough what can be gained by Methodenreinheit: extraneous matter is suppressed. What is lost? Well, knowledge of relations between the objects referred to in the theorem stated and other things; in short, the kind of knowledge obtained by looking at those objects from a broader point of view. In the case of the Prime Number Theorem, by looking at the natural numbers as embedded in the complex plane.5

Incidentally, Hilbert’s finitist foundations, although an almost inevitable continuation of the peroration to his Grundlagen der Geometrie on the ideal of Methodenreinheit, came to be peddled as a means of ‘securing’ mathematical reasoning. This sales pitch was extraordinarily popular despite the fact that:

1. even if wholly successful the ideal would have ‘secured’ only a tiny part of mathematics, that is, identities expressed in so-called $\Pi^0_1$ sentences

2. the ‘justification’ of our conviction in ordinary mathematical reasoning would consist in the possibility of replacing it by different, finitist reasoning (for the same conclusion).

5 Thoughtful number theorists are quite explicit about this loss; for example, Hasse [1964] (p. 282) stresses the greater Beziehungsreichtum (wealth of connections) of the analytic proof of the Prime Number Theorem. In fact, after the elementary proof(s) of the latter the principal identity used was very elegantly reinterpreted in terms of the zeta function familiar from Analytic Number Theory; cf. pp. 102–103 of Ellison [1975].

5.3 Brouwer’s foundations

The popular view of the subject of intuitionistic foundations goes back to Brouwer’s rhetoric, starting in the first decade of this century, about the unreliability of (and thus the need for restrictions on) the laws of classical logic. This negative side is often associated with the old constructive tradition in mathematics. But (as documented at length in Kreisel and MacIntyre [1982]) the strategy of logic-free, often delicate algebraization within mathematics is more effective here, with so-called functional interpretations of classical logical systems serving as a crude kind of algebraization adequate for crude results. In any case, Brouwer’s rhetoric about the unreliability of classical logical laws assumes (tacitly) a nonclassical meaning of the logical formalism! Incidentally, logically sensitive mathematicians use different words of the vernacular for those different interpretations, for example, ‘there is’ as opposed to ‘we find’ or ‘we exhibit’.

Remarks on reliability. Realistically speaking, questions about the reliability of laws (or principles) - as opposed to the reliability of their application in, say, complicated proofs - are quite rare singularities in mathematics, and certainly more rewarding in other branches of knowledge. (It is quite remarkable how skillful philosophers have succeeded in making a ‘problem’ out of mathematical reliability or certainty.) The popularity of Russell’s paradox at the beginning of this century made the matter of reliability an irresistible temptation for exponents of logical foundations. But in the long run this was unfortunate; as will be seen below, particularly for Brouwer when he himself came to see the principal interest of his logical contributions in the study of choice sequences (to which he devoted the great bulk of his publications in logic): whatever the virtues of choice sequences may be, the reliability of our untutored ideas about them is not their principal virtue.
We now turn to Brouwer’s mature interest in intuitionistic logic, its positive side as it were.

Propositions about incompletely defined objects

We refer to incompletely defined objects such as various kinds of choice sequences or, for that matter, of random sequences which, though of course different, are also incompletely defined objects. (This is generally forgotten because our knowledge about them is paraphrased - so far, very successfully - in measure-theoretic terms).

To put first things first: both ontogenetically and phylogenetically one’s first idea is that there cannot be (anything like our) precise logical laws for propositions about incompletely or otherwise ill-defined things. This is familiar from the popular superstition that you can’t apply logic to the emotions (as if silly or random behavior were not often more predictable than clever strategies). By good fortune Aristotle is on record on the subject of propositions containing incompletely defined terms (in Metaphysica Γ 7, 1012a, 21-24 and other places); pointing out, briefly, that the law of the excluded middle would not hold. Frege went the whole hog, and required, more heavy-handedly, that concepts be defined for all objects in the universe, banning any trace of vagueness (as some people give a monetary value to everything in sight).

Digression. To get a real feel for the suspicion of propositions about incompletely defined objects, readers should resist the temptation to make fun of Frege’s literalmindedness which, after all, differs only in degree from that of the average professional philosopher. On the one hand there is the (vague?) hope that, on closer analysis, a proposition which strikes us as vague or even senseless would turn out to be simply false.6 On the other hand the parallel to Galileo’s successful strategy in mechanics is inescapable, where he found immensely manageable laws by simply excluding the vagaries of friction and air resistance in suitable experimental set ups, and Newton finding them excluded by Nature in celestial mechanics, the (perfect) motion of the planets. Frege was shortsighted; but, without exaggeration, here less so than most of the time.

Now, work on choice sequences (and intuitionistic logic) has conclusively corrected those first impressions documented for more than 2,000 years. It is a major contribution to the old tradition of providing (at least, contextual) definitions and axiomatizations of notions which are familiar or can be explained informally in familiar terms. This tradition of informally rigorous conceptual analysis includes a good deal of geometry (with von Staudt’s points at infinity and infinitesimals at two extremes), Russell’s business of the definite article, and much more besides.

Remarks for specialists. Obviously, the use of intuitionistic logic for axiomatizing, say, recursive realizability (among other ‘interpretations’ in the literature) also belongs to this tradition: but this was not among Aristotle’s major preoccupations. Secondly, for the topic of incompletely defined terms (in the present context, in contrast to the constructive tradition) the current literature on ‘strong’ systems like ZF with intuitionistic logic is by no means teratological, provided extensions by variables for choice or, say, random sequences are envisaged.

As with all such conceptual analyses, the axiomatization at issue leaves open to what extent the notion considered is suited for describing or otherwise handling the facts for which it is intended. After all, it cannot be the sole aim of any reasonable discipline to perpetuate the defects of current ‘natural’ language. As we shall see below, this matter is in fact a major concern of thoughtful scientists and mathematicians.7 And, granted that a particular informal notion is worthwhile, there is the related concern just how far the rigorous analysis should go. To clinch the matter it is best to push rigor beyond the point of diminishing returns, or get somebody else to do so: we do not merely wish to pursue the ideal of rigor, but test it.

6 Or, at least, could be coherently declared to be false. This is indeed the convention underlying Zermelo’s formulation of set theory, where a ∈ b is false when in the intended (cumulative) hierarchy a ∈ b would be senseless because a and b are not of coherent type. By confining ourselves to the specific sets obtained by iterating the cumulative power set operation only, we can see the implications of making the convention above for atomic formulae a ∈ b and then extending it to compound formulae in the language of set theory in the usual way. The restriction to this language prevents the temptation of trying to extend the convention unreasonably, as can be seen from the ‘paradoxes’ connected with truth definitions. As is well-known, the requirement T (¬p) ⇔ ¬T (p) on the ‘truth predicate’ T cannot be satisfied if one insists that each side must have a truth value. Clearly the device of giving a conventional value to atomic formulae which have no sense per se, cannot be used because: both T (p) and T (¬p) are atomic; and for the familiar diagonal formula, say α_D, which produces a paradox, neither T (α_D) nor T (¬α_D) makes, prima facie, any sense. In short, the device which works for the ∈-relation above, does not work here.

Parallels between the theories of sets and choice sequences

As far as foundational results are concerned, it can now be fairly said that the simplest kind of choice sequence (the so-called lawless ones) plays a role similar to that of the simplest kind of set, namely those built up from the empty set (and not from Zermelo’s original unspecified supply of atoms); more precisely, a similar role for the foundation or ‘modelling’ of the mathematics of incompletely defined objects, and of classical mathematics respectively (with appropriate adequacy conditions on the different kinds of modelling involved). Such a ‘lawless foundation’ for the principal choice sequences studied in the sixties was provided only quite recently.8 Not only the results, but also the developments of the theories of sets and choice sequences show striking parallels (besides those in Kreisel [1980]). For example:

1. The early work of Cantor and Brouwer showed a sure touch, and the silent majority was overcritical of that work.9

2. When others tried to formulate their understanding of what the pioneers had done, blatant errors were made. Russell’s paradox, derived for Frege’s axioms against which Cantor had explicitly warned, is well known. Kleene refuted his own first attempt (of identifying choice sequences and recursive sequences) in [1952] by deriving a contradiction from the fan theorem.10

3. When their subjects were stagnating, the pioneers made silly mistakes. For example, Cantor in his attempted proof of the preservation of dimension by topological mappings, and Brouwer in his attempted generalization of the bar theorem to non-monotone predicates.

4. Later workers overestimated the potentialities of particular aspects of the new notions. For example, in connection with higher types: in the case of set theory, an overestimate of the replacement axiom (even in the absence of the power set axiom), but also in the case of continuous (countable) functions, especially when combined with (the classically contradictory) continuity axioms for choice sequences.

5. Whole-hearted attempts to enrich the two theories by means of delicate data, formulated in extensions of previously current languages for choice sequences and sets, were made only in the sixties and seventies. For example, by myself and Reinhardt with some corrections by Troelstra and Kunen.

7 The same also applies to philosophers like Wittgenstein (in his later writings) although one would not guess this from the literature, for example, by Dummett or even Kripke, who try to make a ‘traditional’ philosopher out of him, intent to argue the hind legs off a donkey. As for mathematicians it is a commonplace that the analysis, for example, of the notion of continuum (real number) has long passed the point of diminishing returns; more often existing knowledge of the continuum is transferred to other less familiar structures by use of set-theoretic isomorphisms or some other kind of equivalence. Above all, it remains a perennial question whether knowledge of the continuum is relevant to (old or new) problems at hand; specifically, whether for a particular number-theoretic problem an embedding of the integers in the (plane) continuum, or in p-adic fields, or some suitable combination of such structures is significant.

8 By Van den Hoeven and Moerdijk [1984]. They show that each proposition A+ about lawless sequences is equivalent to one (say, A−) in a restricted language. But this does not exclude the possibility of using new axioms for lawless sequences to solve problems in ordinary mathematics; after all, the theory of the A is incomplete. More specifically, A+ may be evident for its intended interpretation while A− is not. This is certainly true for some of the A+ asserted as (logical or non-logical) axioms, where A− needs a rather elaborate proof in the restricted language. As far as actual knowledge is concerned, it is pure ideology to dismiss differences in provability. The ideology is tempting because the latter are easier to state in familiar logical terms. Once again, the current concern is whether this slick language is adequate to the ‘epistemological situation’.

By reference to an unfamiliar example of logical hygiene provided by choice sequences, the parallel above clarifies and supports the aperçu in Bourbaki [1948] concerning the language of abstract sets as such hygiene; abstract in contrast to, say, finite-sets-of-integers or perfect-sets-of-points, which need not be, and generally are not, thought of as ‘instances’ of the general notion of abstract set.11

Now, the hygiene supplied by choice sequences is this. Long before any theory was elaborated their principal property, the continuity of (all) operations defined on them, had been used by Poincaré and Brouwer in topology. The logician, regarding success as too arbitrary a criterion, asks: Why the restriction to continuous mappings? Answer: If the points in the spaces considered are determined by choice sequences then all maps are continuous.12

On the parallel we have transfer properties, in the sense of note 7, recognized long before the advent of abstract set theory. Specifically:

1. Dirichlet’s discovery that many theorems of his time about functions, thought of as rules, could be transferred to all other functions with the same graph

2. the discovery of such abstract notions as groups but without a formal answer to the questions: What is a function? or: What is a group?

The language of abstract sets provides a convenient vocabulary, but generally little more. (Reminder of an occasional exception: existence of a free basis for certain uncountable groups.) It remains to be seen if, at some comparable stage of development, the language of choice sequences will find similar uses in some corner of mathematics (with luck, less remote from the center than the subject of uncountable groups).

9 For balance: not critical enough of the rhetoric! Samples: higher cardinals get us closer to the Almighty (Cantor [1932], p. 400); theorems about choice sequences contradict classical analysis (which is about different objects, and uses correspondingly different logical laws).

10 A more plausible mistake would be to identify choice sequences and random sequences since, as already mentioned, the latter are also incompletely defined objects. To get a contradiction, one has to go into the different data: open for lawless, measure-theoretic for random sequences; cf. 5 below.

11 Though Cantor’s attention was drawn to that notion while working on point sets, he stopped thinking about the latter when he concentrated on the former.

Proofs and logical operations applied to propositions about incompletely defined objects

First a reminder about the turn of the century, Brouwer’s formative years. At that time Mach’s so-called verificationism (attention to methods of measurement in natural science) had fired Einstein’s imagination. It was a small step to transfer the idea to mathematics, where verification is done by means of proofs. As one talked of the observer in physics (although a photographic record would have served equally well), so one talked of the subjective element in mathematics. The matter was dramatized by resurrecting the hoary, alleged conflict between subjectivist and objectivist views of the world, thereby drawing attention away (in effect, if not by intention) from the opposite issue as it were: St. Thomas’s adaequatio rei et intellectus (the remarkable adaptation of our intellectual faculties for gaining objective knowledge); cf. also Bourbaki [1948] on ‘structures and our intuitive resonances’.

Be that as it may, in the intellectual climate of the period the only question was when (not whether) proofs would be introduced into the meaning of logical operations. In modern jargon:

enriching (mere validity of) propositions by proofs, and having logical operations act on proofs.13

Even if we prove theorems about ordinary propositions, not involving any incompletely defined terms, the proof will generally use only partial information about the latter. So it will apply to all objects about which this information is available, including incompletely defined objects (if they are considered at all). Much the same idea was behind Hilbert’s stress on the finiteness of proofs, (even) of propositions about infinite objects, later codified in his ε-calculus where each proof names only finitely many objects.

In any case, smart ‘arguments’ aside, enrichments by proofs recommend themselves inasmuch as mathematicians spend most of their time on these things anyway.

When, in the early sixties, I resurrected Brouwer’s preoccupation with proofs (entering into the meaning of propositions), I wrote down a formalism with variables for proofs and (in Bourbaki’s phrase: their least interesting side, namely) for the relation between a proof and the proposition proved. Two points were stressed:

1. the additional ‘elbow room’ for modelling provided by the new dimension of proofs

2. the decidability of the relation above, familiar enough from the special case of proofs and propositions coded in formal systems (and attractive for the logical tradition because of the possibility of a ‘reduction’: the intuitionistic logical operators were ‘reduced’ to logic-free manipulations of decidable propositions; in contrast to Tarski’s ‘non-reductive’ truth definition, the obvious object of comparison).

As to 1, already a quarter of a century ago I had had experience of additional parameters for completeness proofs.14 Today, the same idea is more familiar from Boolean-valued or sheaf-valued models. Here it is natural to think of an additional parameter over elements of a suitable Boolean algebra or sheaf. Why not add proofs (for example, those coded in some manageable system)? Certainly, as shown trivially by Troelstra’s [1981] (pp. 209–211), nothing goes wrong formally.15

12 It is a separate matter whether this particular instance of the (logical) ideal of reducing arbitrariness is scientifically rewarding; cf. the impassioned critique of the ideal in Weil [1974], p. vi. For specialists: because of the central role of axioms for data (open in the case of lawless sequences, analytic for another favorite), theories of choice sequences provide also another kind of logical hygiene; for saying out loud what mathematicians have learnt from experience: the need for close attention to the choice of suitable data (structures), for example, of different topologies on which to operate.

13 Remark on the attraction of adding proofs, over using intermediate truth values: in probability theory we already have something like intermediate truth values, and we know that there the logical operations are not particularly central; for example, the probability of p ∨ q is not generally determined by the probabilities of p and of q.

14 For example, in the case of the excluded middle, the additional lawless parameter α to get ¬∀α[P(α) ∨ ¬P(α)] where P(α) is ∃n[α(n) = 0], while ¬(P ∨ ¬P) is of course contradictory for any P.

15 In the last decade, ad hoc enrichments by formal proofs, for example, of Kleene’s recursive realizability in fp-realizability (in Beeson’s good old days), or Lopez-Escobar’s enrichment of proof figures by additional proofs at the nodes in his [1976] (consolidated in his [1982]) have been quite successful.
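The remark in note 13, that the probability of p ∨ q is not fixed by the probabilities of p and of q, can be checked on a toy case. The following is only an illustrative sketch: the two joint distributions and the helper names `marginals` and `prob_or` are made up here and do not come from the text.

```python
from fractions import Fraction

def marginals(joint):
    """Return (P(p), P(q)) for a joint distribution over truth-value pairs."""
    prob_p = sum(w for (p, q), w in joint.items() if p)
    prob_q = sum(w for (p, q), w in joint.items() if q)
    return prob_p, prob_q

def prob_or(joint):
    """Return P(p or q) for the same kind of joint distribution."""
    return sum(w for (p, q), w in joint.items() if p or q)

half = Fraction(1, 2)

# Hypothetical case A: p and q always have the same truth value.
same = {(True, True): half, (False, False): half}

# Hypothetical case B: p and q always have opposite truth values.
opposite = {(True, False): half, (False, True): half}

print(marginals(same), prob_or(same))
print(marginals(opposite), prob_or(opposite))
```

In both cases p and q each have probability 1/2, yet the disjunction has probability 1/2 in the first case and 1 in the second. The value of p ∨ q depends on how p and q are correlated, which is extra information not carried by the two 'intermediate truth values' alone.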

As to 2, the slogan was: I recognize a proof when I see one, and it isn’t a proof (for me) if I don’t. This idea is perfectly parallel to the convention (already discussed in note 6) where a (dubious or) senseless proposition is declared to be false. Needless to say, if one forgets the convention involved in the slogan one can manufacture paradoxes; just as the convention about sense in note 6 conflicts with the (genuinely!) most obvious laws about the truth predicate T when applied only to propositions that do make sense, for example, (of course, without the convention) T (¬p) ⇔ ¬T (p).

The technical means of enforcing decidability was to enrich the logical operations themselves: instead of merely having functions mapping proofs into proofs, one also adds proofs (of the Π1-assertions) that the functions do what they are supposed to do.16

Nothing rewarding has come of this except to remove the itch of wanting to know if this sort of thing is coherent at all.17 Obviously, more could be done with imagination and determination. But I still believe (as already stated in Kreisel [1971a]) that logical high-jinks are misplaced until one finds a really convincing function of the additional proofs (of Π1-assertions), not mere drivel about validity.

5.4 Philosophical Hygiene

The blind spot is by no means an isolated phenomenon, but typical of a whole philosophical tradition, sometimes called ‘critical’. It assumes that such stop-gap notions as reality (in ontology) and validity (in epistemology) must be fundamental; in the sense of being rewarding subjects for endless refinements. Realistically speaking, those notions serve very well at an early stage when we know too little about the phenomenon involved and about our knowledge of it in order to ask sensible specific questions: we can always ask if it is real or, respectively, valid. Besides, at an early stage, we often do well to ask just that (and so the most popular objections to those notions are not much better than the traditional claims for them: for example, when logical positivists regard them as inherently illegitimate, or as symptoms of some dreadful confusion). The simple fact is that, before long, those so-called fundamental notions and ‘problems’ about them become sterile; for example, there is precious little of interest to be said about all valid proofs. Perhaps worse still: preoccupation with those notions draws attention away from (otherwise quite obvious) genuinely rewarding analyses. For example, from more relevant interpretations of discoveries that have obvious ‘raw’ interest, as in the following two instances.

1. Frege’s rules for logic were presented by him as contributing to some kind of ethereal standard of rigor, required by the Higher (philosophical) Sensibility. Not only today, but even at his time they could serve as evidence for the possibilities of Artificial Intelligence; of mechanically recognizing and manipulating at least logical proofs presented in (suitably artificial) logical form. ‘Even at his time’ since he was quite familiar with Leibniz’s project of a ratiocinator which Frege criticized as mind-boggling (zu riesenhaft; cf. p. 168 of Bachmann [1975]). This criticism would be met by stressing the obvious fact that pure logic in his system is a very small item in mathematical reasoning. Frege’s neglect of this aspect of the matter fits very well his Higher Aims.

2. A more recent example of the blind spot at issue is to be found in the work of Lakatos. He collected a remarkably convincing set of elementary pieces of mathematics to convey a principal lesson of scientific experience: the value of asking new questions as knowledge expands (where of course, in general, a solution to the old question may not be adequate for the new version). Pre-occupied with validity, he presented his collection as evidence for some kind of Hegelian historical relativity of the notion of proof; mixing in the trivial fact that the literature contained some errors or oversights, and cases where the original proofs were perfectly adequate for the old questions, but not for the new, better versions.

Finally, the ethereal view of validity mentioned in 1 above may be contrasted with a realistic view. The former makes the assumption that the dominant source of error is to be found in (logically) dubious principles of proof. But, if such doubts are in fact dubious, then they draw attention away from the genuine problem: the probability of applying correct principles incorrectly. Certainly, in cases of long routine computations mechanization genuinely reduces that probability, and so in this area Frege’s formal rules contribute to rigor in a realistic, not only ethereal sense. But in the bulk of mathematics a more significant safeguard against incorrect application is to require that

proofs be easy to take in and to remember18 (5.1)

over and above the validity of the principles (intended to be) used. And one way of achieving 5.1 for surprisingly large parts of mathematics is to follow a few, remarkably specific guidelines in Bourbaki [1948] (on an admittedly far from literal reading of it, sketched in Section 5).

Corollary for Frege’s so-called anti-psychologism and his stress on validity ‘in principle’. Evidently, 5.1 conflicts with both of these concerns. Trivially, there can be no question of all valid proofs satisfying 5.1, let alone, of all such proofs being built up according to the scheme of Bourbaki [1948]. Less trivially: since that scheme applies primarily mathematical analysis to achieve the aim 5.1 with its literally psychological flavor, it just takes the sting out of Frege’s diatribes against psychologism. There is nothing revolutionary about that! as shown by the precedent of perspective projections in geometry which contribute to understanding some simple features of visual space (cf. note 3); an obvious precedent to which Frege was, apparently, quite blind.

The corollary above is of obvious interest to those of us (today) who are willing and able to develop a balanced view of the subject. It is a quite separate matter whether, given Frege’s temperament, it would have been of much use to him. After all, he had nothing to contribute to 5.1 anyway (and nothing substantial to the possibilities of Artificial Intelligence mentioned earlier). So, on balance, his formal rules and notation may have benefitted from the philosophical pretensions with which he surrounded them, in line with Churchill’s idea that the Truth needed a bodyguard of lies.

16 For example, in the case of implications p → q, and an operation i on hypothetical proofs π of p, also a proof of: ∀π[(π ⊢ p) ⇒ i(π) ⊢ q].

17 For a long time I had suspected that the complaint above would be misinterpreted to mean that a proof need not be valid or need not establish the proposition it purports to prove (and not, as intended, that the notion of validity and the relation between a proof and a theorem proved are too crude for any significant theory); just as complaints about consistency have been taken as advocacy for inconsistent systems (and not merely as being, generally, insufficient for soundness). Amazingly this kind of misunderstanding is now documented in print, in the priceless piece Parikh [1979].

18 In Wittgenstein’s German: überschaubar and einprägsam. Remark on Kripke [1983]: Kripke’s account of this requirement disregards the fact that 5.1 is needed if any one of us is to understand the proof, a notoriously private affair, and not only for public agreement. Amazingly, Kripke does not mention the similar-sounding, quite hackneyed requirement, already used by Hilbert and recently resurrected by Thom, that proofs must be (represented) in spatio-temporal form for public inspection, allegedly a precondition for agreement. Warning. In contrast to Kripke [1983] we treat the aim 5.1 as, so to speak, a fact of our natural history instead of deriving it from dramatic philosophical claims. This is parallel to our treatment of Hilbert’s program in Section 2, by relating it to the age-old taste for purity of method; in contrast to Hilbert’s (later) claims about the formal nature of the laws of thought (cf. note 4). In [1983] Kripke goes back to sceptical ‘arguments’ in the style of Hume (his favorite philosopher when he was 12); cf. Section 4 which touches on similar dramatics and appropriate logical hygiene.

5.5 Bourbaki’s alternative

The scheme of Bourbaki [1948] achieves the aim 5.1 by means of a delicate axiomatic analysis in terms of a few basic structures called structures mères; for example: groups, topological spaces, etc. Here is the general idea. Proofs are broken up into a few lemmas, each being stated abstractly; more specifically, each lemma concerns only very few basic structures. So the whole proof is easy to take in. Furthermore, the lemmas are so chosen that their proofs are built up relatively simply from the defining properties of the basic structures involved. So each lemma, being abstract, is not only ‘general’, but carries in its very statement a kind of code for its proof.20 So proofs are easy to remember.

The present scheme is not to be confused with the universal logical scheme of analysing mathematical proofs in terms of axiomatic set theory (for example, the logical complexity of the axioms used). In fact, the two schemes are at cross purposes.21 Evidently the present scheme is more difficult to apply since it requires selection among several, albeit few, basic structures while set theory has been peddled (for example, by Russell in the introduction to Principia) for its ‘unity’: one basic object (set), relation (membership), propositional operator (for example, Sheffer stroke), quantifier.

Warnings to literal-minded readers

Certainly, Bourbaki [1948] is full of almost embarrassingly thoughtless remarks to which its author may have given as much weight as to the material in it which is stressed here.22 But the fact remains that [1948] constitutes a serious attempt to record some broad reflections which accompanied work on Bourbaki’s treatise (even if they did not guide it). Bourbaki’s several other ‘manifestoes’, with a much more hackneyed, crudely formalist flavor, are ignored here since, in contrast to [1948], they simply conflict with the practice of Bourbaki’s treatise. For example, the latter (contrary to basic formalist doctrine) does not even mention the logical or set-theoretical axioms (tucked away in the introductory volume) where they are used later; in terms of section 4: a (logical) hygiene which is not applied.

In [1948] the particular choice of basic structures is claimed, albeit in passing, to be related to the architecture of mathematics and our intuitive resonances to it. The latter suggests an obvious empirical content concerning our intellectual capacities. However, it has to be admitted that, at least so far, the choice has been primarily used for solving mathematical problems rather than testing for the empirical content at issue.

20 Reminder. A theorem about the particular structure ℝ could require anything for its proof; a theorem about all locally compact spaces can use only local compactness . . .

21 This is beautifully illustrated by an episode from the turn of the century. While Hilbert advertized the equivalence between Pappus’ theorem in projective geometry and commutativity in skew fields in algebra (with a quite disastrous balance of trade for algebra), Poincaré introduced much more fruitful relations between other algebraic and geometric properties, for example, fundamental groups of manifolds. In a similar vein: in connection with the parallel postulate, Poincaré was not satisfied with any old non-euclidean plane (sufficient to settle the logical question of independence), but introduced one which has remained relevant to several significant parts of mathematics. A contemporary view of these matters is documented in Rados [1906], on awarding the Bolyai Prize to Poincaré (rather than Hilbert).

22 For example, such embarrassing grunts as the complaint that logic consists of syllogisms valid for arbitrary (!) premises.

Part IV

ON RUSSELL

Chapter 6

Russell’s Logic

Those who are interested in Russell’s logic at all will surely have looked at his own lucid expositions, for instance his charming Introduction to Mathematical Philosophy. And those interested in the circumstances in which the work was done will have read the first volume of his Autobiography and his earlier essays on his intellectual development. I, for one, certainly cannot improve on Russell’s own accounts.

What can perhaps be done is to try and supplement these accounts from a different point of view. As far as his logic is concerned, the obvious question to ask is how it looks to us, or at least some of us, sixty years later. Russell himself did not write about this matter at all, and as far as I know did not speculate about it. But to judge both from Hardy’s A Mathematician’s Apology and from my own conversations with Russell, he was concerned about it.

As far as atmosphere is concerned, the atmosphere in which his logical work was done, I have of course no direct information. But I can draw attention to some accounts which complement Russell’s own writings. I do not think that the facts or overt actions of the period are in doubt. All this is found in Russell’s writings. But, attractive as they are, with their robust Victorian style, there is something missing for our present way of thinking. Somehow there isn’t much awareness of an unconscious, least of all in himself. I think much more of it is found in the accounts of the period by Russell’s contemporaries or near contemporaries, such as the economist Lord Keynes, the philosopher Broad, the mathematician Hardy, and to some extent the historian Trevelyan. These accounts are more highly charged, more Edwardian. The English Edwardians seem to have discovered a bit of an unconscious, different no doubt from Freud’s version, which fits middle-aged Viennese housewives so well.

I can add one thing that may be useful. It so happens that in my student days at Trinity (Cambridge), Russell’s old college, I had personal contact with the contemporaries mentioned. Now, descriptions of atmosphere require sensitivity and other perfectly objective but simply rare talents. One will attach weight to such descriptions only if one trusts the author. God knows, personal contact doesn’t always increase one’s trust in people’s judgment. But at least retrospectively I find that my personal contact with this group has definitely given me more confidence, and thus helped me to form a more vivid and, I believe, more complete picture of the period of Russell’s work on logic. It is just possible that some of you who may have overlooked this material will share my impression.

0 This chapter was read on March 5, 1970 to the Hume Society, Stanford University, at a symposium on the life and works of Bertrand Russell, and originally published in Bertrand Russell: a collection of critical essays, Pears ed., Doubleday, 1972, pp. 168–174.

Logic

Russell was a pioneer. So his work did not depend on a great deal of previous knowledge. Indeed, it is easy to say what it was about. The question was:

What is mathematics?

The proposed answer was to be given in logical terms, involving such general concepts as object or thing, proposition or property and operations on these concepts, so-called logical operations. It could not reasonably be expected that familiar mathematical experience would become more certain in this way, but more understandable from a theoretical point of view. A hackneyed but good parallel is the atomic theory to answer the question:

What is matter?

Evidently, just as we don’t have detailed knowledge of atoms when we start, so we do not expect to start with too detailed knowledge of general logical ideas. But equally obviously, it would be idle to start if one had nothing definite, no laws at all to build on.

What Russell had to build on were two great and by now well-known contributions of the nineteenth century, now a hundred years old, Frege’s logical language and Cantor’s theory of classes. The logical language

(¬, ∧, ∨, →, ∀, ∃)

is an unexpected discovery: an extraordinarily simple vocabulary to express the logically significant aspects of our thoughts. Not all significant ones, because not all of them are logically significant. It’s something to be compared to the discovery that physically significant quantities can be expressed in terms of mass, length, and time.

As to Cantor’s theory of classes, fortunately we live in the era of the New Maths; so Cantor’s theory of classes or sets need not be explained. Whatever weaknesses the New Maths may have for learning mathematics, it’s an excellent preparation for listening to a popular exposition of Russell’s logic.

When we now look back at these two discoveries, no single application of them can compare in interest to the discoveries themselves. This was very different when Russell entered the scene. To seize the imagination, to get sense and direction, a general scheme needs problems; either problems from outside which are understandable without the notions of the theory and solved by means of the theory, or problems within the theory to pinpoint its weaknesses and to show where it needs attention. Russell provided both sorts of problems.

The theory of descriptions

The first kind of problem concerned a quite modest but perfectly intelligible puzzle. Its solution, by Russell’s theory of descriptions, gives a good idea of the kind of uses one could make of a logical language. The puzzle is this:

Is it true that the King of France is bald?

Obviously one can survive without giving a second thought to the puzzle. There is no King of France, so what are you talking about? One ignores the question, or, if one prefers current jargon, one dubs it as meaningless (and continues to ignore it). In short, in the ordinary sense of the word, it isn’t necessary to consider the question.

The philosophical question is different. Not whether it is necessary, but whether it is possible to make something of the question. This is a luxury, an intellectual luxury. (But then, at least in an Age of Affluence, men do not live by bread alone.) Russell gave an analysis by essential use of logical language. The King of France is bald if and only if

There is a unique object which has two properties: first, of being King of France, second, of being bald.

Evidently this assertion is plainly false, because there is no object which is King of France, let alone a unique object which is both King of France and also bald. Naturally, if today we want to illustrate the use of logical language, we wouldn’t quote a mere puzzle. But given some imagination, one could see in Russell’s analysis a hint of something that works, so to speak, on its own steam.
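In the logical vocabulary listed earlier, the analysis can be written as a single formula. The following is a sketch in LaTeX notation; the predicate letters K (‘is King of France’) and B (‘is bald’) are labels chosen here for illustration, not Russell’s own symbols.

```latex
\[
  \exists x \,\bigl(\, K(x) \;\land\; \forall y\,\bigl( K(y) \rightarrow y = x \bigr) \;\land\; B(x) \,\bigr)
\]
```

The first conjunct says that something is King of France, the second that at most one thing is, and the third that this thing is bald. Since nothing satisfies K, the first conjunct fails for every x, and so the whole formula is false.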

Russell’s paradox

As an example of the second kind of problem, of pinpointing a weakness in the general theory, we have Russell’s famous paradox. (I mean a weakness in the theory, current at his time, of the logical concepts to be used for answering the question: What is mathematics?) Let me say a word about it, not only because of its intrinsic interest, but also because popular expositions are quite different from the way Russell himself looked at it, according to his Autobiography.

I mentioned earlier on the logical vocabulary in terms of which the question: What is mathematics? was to be answered. One of these terms is the relation usually called ∈, where a ∈ b means: the object a has the property b (if b is a property), or the object a belongs to the class b, or simply a is a b. If you do logic, if you want to answer the grand question in logical terms, you are supposed to understand this relation. Frege’s logical language tells you how to build up logically complicated expressions from it. But before we have a theory, we must have laws, principles for forming new objects, in particular new classes from given ones. A very tempting principle was:

Given a property P defined in logical language, form the class of all objects which have the property P .

In the normal course of events the principle is certainly something quite simple, e.g., the class of people in a room under 6′0″. The empty class, I suppose, under 3′0″. But by no stretch of the imagination could one say that the principle above is very clear. We are not talking of concrete properties, but of logical ones. The hallmark of a logical property is its generality. So the objects which might satisfy such a property can be anything under the sun, objects of the past, present, future, etc. It is one thing to be mildly uncomfortable about it. But this is very different from finding a clear-cut error. Russell considered the property of objects:

X ∉ X

and derived a contradiction. Put slightly more positively, he showed:

To any class C satisfying the condition: if X ∈ C then X ∉ X, there is a class bigger than C, namely C ∪ {C}, which also satisfies the condition.
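For finite sets the positive form of the argument can be tried out directly. The following is only a sketch under stated assumptions: classes are modelled as Python frozensets built up from the empty set, and the names `satisfies_condition` and `extend` are introduced here for illustration; nothing of the kind appears in Russell.

```python
def satisfies_condition(c):
    """Return True if every member X of the class c satisfies X ∉ X."""
    return all(x not in x for x in c)


def extend(c):
    """Return C ∪ {C}, the bigger class used in the argument above."""
    return c | frozenset([c])


# Assumption: classes are finite frozensets built up from the empty set.
empty = frozenset()
c = extend(extend(empty))           # {∅, {∅}}
bigger = extend(c)                  # c ∪ {c}

print(satisfies_condition(c))       # does the condition hold for c?
print(satisfies_condition(bigger))  # does it still hold for c ∪ {c}?
print(c in bigger, c in c)          # membership of c in the bigger class and in itself
print(len(c), len(bigger))          # sizes of the two classes
```

In this model no class can be a member of itself, so the condition holds throughout; the sketch is meant to show only the step from C to C ∪ {C}, which can be repeated without ever reaching a biggest class.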

The paradox is superficially similar to the puzzle solved by the theory of descriptions. In the latter, we had the odd phrase ‘The king of France’; here we have ‘The class of all classes that don’t belong to themselves’. There Russell could retranslate the odd phrase in a natural way. Here he needed more than a rephrasing (for his aim of setting up a genuine theory of classes). A very obvious reason for being disturbed by the paradox was not slightly hysterical, but quite serious. Having got accustomed to the principle, people had not considered the possibility of needing others; they were unprepared. They did not look for others (I called the principle tempting, and I suppose temptation often makes one forget alternatives. Sometimes of course it draws attention to new ones). Russell, in contrast to many, was not panicky, and did look for alternatives.

The doctrine of types

The result, the so-called doctrine of types, has dominated the subject for the last sixty years. The idea is quite simple. We don’t mix cabbages and kings. More seriously, the idea was that objects present themselves in a hierarchy, objects which are not classes at all (say of type 0); classes of such objects, say of type 1, and so on. There are refinements of this, concerning also the definitions of these classes. But the basic idea is that possible members of a class have a uniform type. It is fair to say that Russell lost interest in the idea. I do not mean he rejected it, but that he lost interest. And I think it’s intelligible. Surely a universe laid out in such types is more manageable. But as a philosopher he could not be satisfied because our actual mathematical experience does not present itself in this way. At this stage the mathematicians took over, in particular Zermelo, in a way typical of mathematics.
Leaving aside the question of an analysis of actual objects, the mathematicians considered those objects, or rather some of those objects, which are built up in the way Russell postulated. The greater clarity of the structure makes it indeed manageable, satisfying remarkable formal laws. It is in fact nothing else but the current theory of sets, the foundation of the bulk of existing mathematical practice. The reason just given for Russell’s loss of interest is sheer speculation. But it seems more convincing than his own reason. He was, he said in his Autobiography, exhausted and disgusted after the labor of writing Principia. This would hardly explain his permanent loss of interest.

Principia Mathematica

This has brought us to Principia, the work containing the evidence for Russell’s answer to the question: What is mathematics? It does not make exciting reading; in fact, it is not read, and its details are never quoted. I believe its role is parallel to that of other massive studies behind the answers to grand questions such as: What is gravity? Remember the massive experimental work of Eötvös which showed that gravity is universal, independent of the color, shape, or chemical composition of the objects. Nobody looks at the details, but the existence of this work has shaped an important part of our whole view of the physical world. Similarly, it may fairly be said that the existence of the Principia has shaped the modern mathematician’s view of his own subject and, even more, his exposition, what he says about it. For better or for worse, every text begins with a chapter on set theory even if this chapter is not referred to again in the rest of the text. The chapter on sets is there essentially because the Principia says that this is what mathematics is about.

Chapter 7

Principia Mathematica: critical or speculative philosophy?

This chapter will not only be about Principia Mathematica, generally regarded as Russell’s most substantial contribution; but also about his hopes and disappointments concerning this work. These feelings are not only of biographical interest, but also, I believe, useful clues for understanding Principia and related studies.

Principia was to provide the evidence Russell had for his answer to the question: What is mathematics? namely: Mathematics is the theory of classes or predicates. Inasmuch as ‘class’ is a general logical concept, mathematics is therefore a part of logic; one speaks of the logistic reduction. To understand this answer, you must of course understand the logical terms used, in particular ‘class’. Well, you do because of the New Maths, where you learn about classes.¹ It is, I believe, no accident that the New Maths is useful for following a discussion on Russell since, it seems to me, the pedagogues who have been advocating the New Maths were enormously impressed by Russell’s answer to the question: What is mathematics? Because of this answer the pedagogues feel that children ought to know about classes - at least as much on such normative grounds as because of practical uses of such knowledge. The practical value of educational reform is always hard to judge, and hence logical or philosophical norms tend to have great weight here.

⁰ This chapter was read on October 16, 1972 at the University of Leeds, as a Centenary Lecture in honor of Russell, and is published here for the first time.

¹ Actually there is a subtle difference between Russell’s intended concepts and the classes - or sets as they are now called - of the New Maths, but this difference will not be relevant till the end of the chapter.

As to Russell’s feelings about Principia, which he describes vividly and repeatedly in his later writings, he was badly disappointed. Philosophically, because Principia did not provide the particular certainty which he had hoped to find in mathematical knowledge. Mathematically, because the details of Principia were not used, not even of those parts (on relation arithmetic) which he himself liked particularly. I shall not elaborate on his mathematical disappointment beyond observing that most pioneer work in mathematics is completely superseded. Such work rarely gives the best choice of details, so necessary for the intelligibility of proofs - a more important concern for the mathematician than is generally realized.

Russell’s philosophical disappointment is, I think, closely related to the question in the title to this chapter. Especially in his later years, he came to look at Principia as a contribution to critical rather than speculative philosophy. This view of Principia seems wrong; it produced his dissatisfaction, the low ratio of expectation to results. The expectation was wrong, not the results uninteresting. In those circumstances we shall do well to take a view different from Russell’s own! In fact, I shall adopt a view which is often valid for the work of pioneers like Russell who have caught the imagination of the (general) public: Have faith in the objective interests of the work, but use later experience to help analyze this interest. The pioneers did not have that experience at their disposal.

Mechanization of information

Before I even begin to go into its philosophical interest I must consider another aspect of Principia, specifically a weaker proposition than Principia’s own claim, of great general - though not philosophical - interest and importance; namely this:

The great bulk of mathematics and of its current uses can be replaced by the mathematics of Principia. Thus, in particular, the concepts needed can be built up by means of the very simple vocabulary of Principia with its formally precise grammar.

The proposition is clearly weaker than Principia’s claim in that we here speak of the bulk of mathematics, not its essence; and of current, not of all possible uses. In short, we don’t speak of the nature of mathematics at all. The philosophical and mathematical imperfections of Principia referred to earlier, hardly affect the weaker proposition - and not at all its obvious consequence:

A remarkable amount of information can be mechanized, in the sense that it can be built up according to such mechanical rules as the syntax of Principia.

Make no mistake about it. Mechanization is needed not only if we have to communicate masses of information, but whenever we have information for the masses, that is, for a large number of people who have little in common except a common task. This simple truth has been known for a long time, at least since the beginning of organized warfare. Military orders are always mechanical to ensure quick and reliable cooperation between masses (of soldiers). Military orders do not seem to involve the higher mental processes. It could not be assumed that mechanization of information could be carried very far. The mathematics of Principia, in contrast, is of a high level. It was indeed important that Principia carried out some mechanization in full detail; any mathematical defects of the particular choice of detail do not matter here: they do not cast doubt on the possibility of mechanization.

Let me digress for a moment on the recent interest in ‘marxist’ explanations of (dubious) philosophical doctrines. Usually such doctrines are said to serve the interests of some devilishly devious wicked ruling class. The discussion of mechanization above suggests a variant. There is a doctrine in the philosophy of mathematics which has retained its direction despite repeated refutations; so-called formalism. It gives a central place to mechanization; incidentally not only of the information itself, of the language, but also of the information processing, of the proofs. Perhaps somebody will give us a ‘marxist’ explanation showing how this doctrine serves political purposes, but - for once - not the purposes of a wicked ruling class, but then of the ergodic part of society.

More seriously, let me mention a social consequence of Principia which seems to me important and neglected; so to speak, its role in preparing the way for the computer age. The role I have in mind is delicate and, perhaps, difficult to be sure about.
As I mentioned already, the particular details of Principia are not used, least of all in current computer science. But the founders of this science - such as von Neumann - benefitted, I believe, from their knowledge of Principia which had given them the conviction that something like non-numerical computation, like modern data processing, was possible. I find it hard to believe that, without this conviction, one would have known quite so quickly how to use computers when they became technically feasible. Of course it was not the mere existence of Principia that prepared the way, that conveyed the conviction, but Russell’s presentation which, as I said, had caught the imagination of those who later taught the founders of computer science. In this task of preparing the computer age some of the philosophical defects became virtues. For example, the weaker, more accurate proposition has Principia replace or ‘code’ parts of familiar mathematics where Principia claimed to tell us what mathematics is. But this restatement immediately raises doubts: replace for which purpose? code by what principles? By exaggerating, by asserting more, Russell inspired confidence and prevented premature doubts. (Of course he also believed his assertions himself - for which he had to pay with his later disappointments.) Needless to say, the same exaggerations which have been a great help in founding a science, may later hamper its proper development.

As a preparation for the computer age, Principia clearly contributed to social reform, the reforms made possible by democratic technology for which (as for all technology) the mechanization of information is so necessary. Of course this piece of reform was not among Russell’s original hopes for Principia. In fact, I don’t know whether he ever thought of Principia in these terms even in retrospect.

Actually, I don’t know whether he thought - reflected - very much about any questions of social reform! though, as you know, he felt and spoke strongly about such matters, in particular on marriage and morals or on war and politics.²

Using popular language, we have to admit that Russell wasn’t very philosophical by temperament, nor, perhaps, by conviction. For when he spoke of ‘philosophy’ he certainly did not have in mind an attitude (a ‘philosophical’ attitude in the popular sense) but a systematic discipline. He wanted a new science (of philosophy), differing from the successful existing sciences principally in subject matter. Its connection with philosophy in the traditional sense was to be this: the new science would solve a good number, if not all, of the questions which are traditionally called ‘philosophical’. (The rest might be rejected as senseless unless they happen to be solved by another science - in which case we should have been so to speak mistaken about their character.) Clearly, Principia is systematic enough; Russell certainly thought of it as part of the new sciences; and he would have thought the same of the much larger body of current mathematical logic. What is less clear is the contribution of this work to traditional philosophical questions, to their formulation and solution. It is convenient to separate these questions into critical and speculative ones. I shall begin with the former.

7.1 Critical Philosophy

By critical philosophy I mean a very narrow, but quite prominent part of modern Western philosophy; concerned with errors in our naive convictions, in what we regard as obvious. Thus this kind of critical philosophy regards our ordinary knowledge as dubious, in need of critical philosophy for greater certainty. The errors at issue are not thought to be mere oversights, to be corrected by closer attention (a common practice in science and ordinary life). The correction is supposed to need special critical faculties which - according to Kant - we don’t otherwise have occasion to use; or, better still, to need principles for testing the assumptions implicit in our ordinary reasoning.

² Take the business of free love in marriage which ended badly, at least for him; as somebody said: not with a bang, but a whimper. In Volume 2 of his otherwise frank Autobiography (Russell [1968]) he admits, very discreetly, to having discovered that he was not able to live up to his high principles (of free love). Perhaps he should have tried harder, other husbands have. I really don’t know. What strikes me as thoughtless is that he never discussed his principles by reference to his - not altogether original - discovery. Again, in connection with his pacifist stand in World War I, as far as I know he never looked back at the political consequences of his views. Consider the following debating point. After all, if he had prevailed and England had not intervened, we might still have the Prussian, Czarist and Austro-Hungarian Empires and with all these so-called reactionary empires around, we might not have had the massacres that took place between the two wars, let alone those of the Second World War. What disturbs me is that he never even raised the matter.

Let me say straightaway that, as I see it, this critical philosophy has not been rewarding (so far). But it does seem to me rewarding to examine how Russell wanted to go about it. He wanted to apply Ockham’s razor, that is, to eliminate superfluous entities. Entities are regarded as ‘superfluous’ if one can ‘abstain’ from assuming them and yet account for the data. Quite naively, the aim is unconvincing: what about the entities which can be eliminated, but are no more doubtful to us than those we keep? After all, if one wants to know about those ‘eliminable’ entities, it would be more natural to look for new data, to extend experiences (by new experiments as in the sciences) where those entities are likely to be relevant.

Instead of pursuing these generalities, let me consider a concrete example where Ockham’s razor was applied, and see what good it did. A good illustration is provided by High School Algebra, the purely algebraic treatment of polynomials and their roots; without use of geometric, in particular of continuity properties. This treatment goes back to Sturm, about half a century before Russell. Sturm thought that continuity involved dubious infinitesimals, and he wanted to apply Ockham’s razor to these entities. If this were really all, his work would have been superseded when Cauchy (actually at about the same time) eliminated infinitesimals from continuity consideration. But the interest of Sturm’s work remains, only his analysis of its interest has to be corrected. For one thing, the algebraic treatment is more general; amusingly, it does apply to infinitesimals! (or, to use respectable language, to so-called non-archimedean fields) where Cauchy’s treatment is out of place. But more importantly, many ordinary problems (where Cauchy’s treatment is meaningful) become more manageable when looked at algebraically.
This is more important, though it has to be admitted that the idea of finding a ‘manageable’ method is both more subtle and less dramatic than the grand idea of correcting an error! And Russell, as he said himself, had a taste for simplicity in logic and, perhaps, for drama in exposition.

Returning now to Principia, we find several discussions in Russell’s writings of the logistic reduction as an ‘application’ of Ockham’s razor. In fact, at one time he said that it reduced the possibility of error because it reduced, quite literally, the number of concepts considered; to the one concept of class or predicate instead of the great variety of concepts in ordinary practice. This is hardly convincing since one can tell as many lies about one thing, especially an unfamiliar one, as about many (familiar ones). Curiously, Russell overlooked a different and perhaps more reasonable application of Ockham’s razor where one does not eliminate ‘superfluous’ entities, but superfluous assumptions about the entities considered. Practically speaking, one would have to verify that the bulk of current mathematics can be replaced not only by the mathematics of Principia itself, but even of suitable subsystems of Principia. Actually, such a refinement of the logistic reduction is possible; but - as in the case of algebraic proofs mentioned earlier - a convincing analysis of the interest of the refinement is a subtle business. At the present time the principal importance of such a refinement relates, amusingly, to Hilbert’s programme; ‘amusingly’ because his programme was a rival to Russell’s logistic foundations: the programme can be carried out for the subsystems of Principia in question, but not for Principia itself!
Despite this point of detail, the brutal fact remains that neither Principia nor Hilbert’s programme has been rewarding for critical philosophy (of mathematics) simply because there is no evidence of errors in our naive (mathematical) convictions. Of course there are errors in many analyses or ‘theories’ of these convictions, and errors in our speculations about extending our naive concepts to unfamiliar domains. There we have no convictions; in fact, we usually have exaggerated suspicions, for example when Cantor extended the familiar concepts of cardinal and ordinal to infinite sets (and regarded himself as a misunderstood martyr, a victim of his contemporaries’ suspicions). Granted that there have been no errors in our naive convictions so far, there was of course little hope of progress for critical philosophy - at least in the narrow sense used here. Incidentally, I also see little progress in critical philosophy in the following, wider, sense:

We now don’t doubt what is obvious, e.g. the objectivity of the notion of natural number, but look for arguments which refute such doubts satisfactorily.

To avoid misunderstanding, let me stress that I speak only of progress so far, not of the possibilities of critical philosophy (in mathematics, let alone other domains). Perhaps we have simply been unimaginative, especially in connection with the search for refutations just mentioned.³

Antinomies

I have not yet mentioned the work for which Russell is best known, his analysis of such pseudo definitions as ‘the present king of France’ or ‘the class of all classes which do not belong to themselves’ which occurs in his famous paradox. It is true that his analyses are often presented as pieces of critical philosophy; by him as applications of Ockham’s razor, by others as showing errors in our naive convictions (about classes). I think this view of the matter is not only false, but pernicious; it stops us from even beginning to think about genuine issues.

³ After all, in the well-known study (Rokeach [1964]) of the 3 Christs of Ypsilanti (a lunatic asylum in Michigan), two of them argued the third out of his conviction! - This is not meant as a cheap dig at critical philosophy, since whatever may be wrong with paranoids they don’t lack what we call ‘logical power’. (For all I know, some eminent logicians were paranoid.) And if doctors tell us that paranoids cannot be argued out of their convictions we have to object - with all respect due to that profession - that they simply may have missed, so far, the kind of arguments that do succeed with paranoids. Perhaps we stand a better chance of finding such new kinds of arguments to refute abstract philosophical doubts since we have a less ‘personal’ stake in them than in the usual paranoid variety.

Incidentally, as will be elaborated below, his constructive work on a theory of so-called ramified definitions is, probably, of more permanent interest than the popular fine works about pseudo definitions. How do we really view those pseudo definitions naively? We simply reject them as senseless. There is no king of France: so what are you talking about? (And even if there were a king of France, but we didn’t know this fact, we still would not naively use the definite article.) Again, one does not associate even vaguely any definite meaning with that phrase ‘the class of all classes that do not belong to themselves’ - nor, for that matter to ‘the class of all classes which do belong to themselves’ (unless one has decided on closer reflection that no class belongs to itself; in this case we recognize that the last phrase defines the empty class). In short, if the principal issue raised by the antinomies were really the matter of errors in our ordinary reasoning, Wittgenstein and others would be wholly right in ridiculing the whole business. But they are wrong because there is another, less dramatic but perfectly genuine problem connected with these pseudo definitions:

To what extent can meanings be assigned systematically to prima facie senseless expressions? Even more, to what extent are the meanings ‘determined’, not merely consistently, but by simple rules?

These questions are truly problematic, highly speculative, because ordinary language appears so complicated that we have genuine doubts about finding simple laws which yield our ordinary meaning even approximately! We certainly have no naive convictions about the existence of such laws. As I see it, Russell’s own analysis of phrases containing the expression ‘the present king of France’ comes under a very general, natural scheme; ‘natural’ in that it is easy to explain and, equally important, easy to apply. We look at sentences built up according to the usual grammatical rules formulated in predicate logic. If an atomic part, that is, one without logical symbols, is prima facie senseless, we give it the truth value false (and then assign truth values in the usual way to compound sentences). This is natural: if we have to choose between asserting or denying the number 2 is blue, we’d rather deny it than assert it. Incidentally, this scheme is implicit in the systems of set theory which are the principal rivals to Principia. When Principia regards the types of sets X and Y as simply incoherent and rejects X ∈ Y, the rival systems going back to Zermelo regard X ∈ Y as false.⁴ The rival systems are certainly natural and therefore manageable.

⁴ To be precise, for a comparison one has to consider the so-called cumulative, not simple types.

Well, what can go wrong with this device (of applying Ockham’s razor to the concept of senseless expression)? For one thing, it assumes mutual independence of - what we treat as - atomic sentences. But in the particular case of the pseudo definition, say R for ‘Russell’: the class of all classes which do not belong to themselves, we implicitly require for all X

X ∈ R ↔ ¬(X ∈ X).
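The device and the requirement just displayed can be put side by side in a small sketch. It assumes a toy encoding of sentences as nested tuples and a dictionary listing the atomic sentences that do have a sense; the names `evaluate`, `sensible` and `coherent_values` are illustrative inventions, not part of Principia or of Zermelo’s system.

```python
def evaluate(sentence, sensible):
    """Evaluate a sentence; atomic sentences without a sense count as False."""
    op = sentence[0]
    if op == "atom":
        return sensible.get(sentence[1], False)
    if op == "not":
        return not evaluate(sentence[1], sensible)
    if op == "and":
        return evaluate(sentence[1], sensible) and evaluate(sentence[2], sensible)
    if op == "or":
        return evaluate(sentence[1], sensible) or evaluate(sentence[2], sensible)
    raise ValueError(f"unknown connective: {op!r}")


def coherent_values():
    """Truth values v for 'R ∈ R' that satisfy v ↔ ¬v, the case X = R."""
    return [v for v in (True, False) if v == (not v)]


sensible = {"2 is even": True}            # assumed stock of meaningful atoms
blue = ("atom", "2 is blue")              # prima facie senseless
print(evaluate(blue, sensible))           # value the device gives the senseless atom
print(evaluate(("not", blue), sensible))  # value of its negation
print(coherent_values())                  # values of 'R ∈ R' meeting the requirement
```

The first two calls go through because the atoms are treated as independent of one another; the last call is the case where that independence fails.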

Now for each X (and, in particular, for X = R), X ∈ R is prima facie senseless; if X ∈ X is also senseless the device simply does not fix a coherent truth value for X ∈ R (and for X = R this situation arises). Russell’s discovery has been heuristically very useful, even if it doesn’t concern our naive convictions. For example if we write R(C) for

∀X[X ∈ C → ¬(X ∈ X)]

we find R(C) → R(C ∪ {C}) and R(C) → C ≠ C ∪ {C}; in short, there is no biggest class C satisfying R, just as there is no biggest natural number: its successor would be bigger!⁵ The list of applications of verbal pathology similar to that of Russell’s pseudo definition could be extended almost indefinitely.

I repeat: I believe people were wrong to call Russell’s work on pseudo definitions critical (in the narrow sense used here). But it should not be forgotten that there is rhyme and reason to their error, especially - if like Russell himself - they looked for a new science of philosophy which is distinct from the existing ones. What is the distinction? Most people could not be satisfied with a mere illusion of a science ‘prior’ to all sciences. But it is clear that the usual sciences do not question what is obvious, they try to extend the range of our knowledge. So if we had a science of critical philosophy, it would stand clearly apart from the other sciences.

7.2 Speculative Philosophy

In contrast to critical philosophy, the kind of work illustrated by Russell’s analysis of pseudo-definitions, speculative philosophy is not so easily distinguished from the other sciences. More specifically, several questions which were traditionally thought of as philosophical, were in fact solved by mathematicians or physicists by essential use of their specialized knowledge. Russell was so much impressed by this fact that he had lifelong doubts about the legitimacy of philosophy (as an independent subject). Was it not merely a limbo for those topics which are not yet ripe for the paradise of the usual sciences? And in the Introduction to his History of Western Philosophy [1945], evidently meant as a kind of last will and testament, he looked to philosophy for some kind of practical wisdom - how to act decisively in the face of uncertainty (the kind of thing a gambler has to learn). He had reason to be disappointed in Principia, if he had hoped that Principia would provide such wisdom.

⁵ In the best known, though not Russell’s own set-theoretic reduction of arithmetic the successor operation is precisely the operation which associates n ∪ {n} to n.

It seems to me that Russell’s pessimism rests on an oversight; on the tacit assumption that the usual sciences are homogeneous. Each of them may, possibly, be homogeneous in subject matter, but surely not in the kinds of arguments used. For example in mathematics, there is an obvious difference between the propositions we call axioms and those we call theorems. Of course they are, as it were, ‘equally’ true: they don’t differ in degree of certainty. But there is a distinct difference in how we establish them - a difference between our reasons for axioms and the passages from axioms to theorems. The reasons for axioms or, equally importantly, for the choice of axioms often involve an analysis of notions with a distinctly philosophical flavour, while the (formal) proofs of theorems require us to build up notions, in agreement with Kant’s distinction between philosophy and mathematics.

Since the analyses required have this philosophical character and since, in most cases, there is genuine doubt about the possibility of giving a ‘complete’ analysis, what better term is there for this kind of reasoning than speculative philosophy?

With a little experience in the sciences it is easy to give illustrations. In logic itself we need only think of so-called completeness proofs for (the choice of) axiom systems establishing that all true propositions, in a given language, concerning some intuitive notion are derivable from the axioms. Such proofs always involve some analysis of the notion considered. Again, in the natural sciences we have an obvious difference between collecting and analyzing data (as emphasized by Francis Bacon) on the one hand and general qualitative arguments, for example in cosmology, on the other. After all, one would hardly expect to crawl around the universe to follow Bacon’s recipe.

In short, the usual sciences contain philosophy because they contain philosophical arguments, where - on the present view of the matter - the adjective ‘philosophical’ applies primarily to arguments, not to subject matter. Of course, this characterization induces a distinction between topics (or subject matter) too according to whether they lend themselves to philosophical arguments. The example from High School Algebra quoted earlier provides a good illustration how distinctions in kinds of arguments lead to distinctions between objects (or subject matter). In this example we separate out algebraic arguments from other mathematical (in particular, analytic) ones, corresponding to the separation between philosophical and general scientific arguments. As already mentioned Sturm originally emphasized differences in arguments. We now tend to speak of algebraic, as opposed to analytic, concepts or structures, which ‘lend’ themselves to algebraic arguments (and we speak of the algebraic ‘nature’ of certain questions about analytic structures if the questions lend themselves to algebraic treatment).
Indeed, modern logic - by essential use of the completeness theorem referred to earlier - has found a precise sense in which algebraic or, more generally, so-called first order structures ‘lend’ themselves to algebraic arguments: any valid algebraic (first order) proposition about such structures can, in principle, be established by first order proofs. (So far, logic has been less successful in indicating which questions about non-algebraic objects lend themselves to algebraic solution.)

It should not be assumed that Russell would have been satisfied by the characterization of ‘philosophy’ suggested above. Or by any other characterization for that matter! Remember, he wanted philosophy to be a legitimate science, and so had to look for similarities to the other sciences; he wanted it to be a distinct science, and so had to look for differences; and to top it all he had, as he said, a taste for simplicity. In addition, he was devoted to Ockham’s razor which, it seems to me, tends to remove precisely those topics which do lend themselves to philosophical argument.

But it has to be admitted that there are also genuine objections to the present characterization. For one thing, it is difficult to use because one can only rarely see in advance if a subject lends itself to philosophical argument. This objection seems important because our experience (at least in the usual sciences) suggests that it is better to analyze our knowledge in terms of subject matter rather than by reference to the kinds of arguments we use. Russell himself noted this advantage early in his career, and formulated it under the slogan:

What we know versus how we know

It is indeed remarkable, perhaps even surprising, that the most useful physical laws do not refer at all to properties which are decisive for our actual knowledge of the external world: whether the object is visible to the naked eye, its color or its shape (properties which are striking to us, but - as one says - physically insignificant). Again, in the last century mathematics was enormously simplified when people realized the extensional character of the assertions actually established. Specifically, though one always thought of functions as rules, as processes for stepping from an argument to its value, the statements actually established were independent of the processes but involved only the ‘graph’, that is the set of points consisting of arguments and (associated) values.

In retrospect it is easy to find reasons for the advantage noted by Russell. One graph, say the extension of the identity function, is defined by any one of a totally ungraspable variety of different definitions! Or, in connection with assertions in place of functions, the possible ‘values’ are the two truth values true or false, while there are infinitely many ‘processes’, that is proofs, which lead to any one (true) assertion. In these circumstances it is, I think, natural enough to look for characterizations in terms of objects, of what we know, rather than by reference to arguments, to how we know. After all, at least prima facie we have no reason to suppose that arguments or, more generally, processes lend themselves to a satisfactory theoretical analysis at all!

Perhaps ironically, Principia itself provides one of the first steps to a theory of processes, of definitions (and, at least in my view, this contribution of Russell’s is of much greater permanent interest than his better known work on pseudo-definitions).
I mean here the so-called ramified hierarchy of types, which may be described as follows. The kind of sets considered by Cantor (who however did not make this restriction explicit himself) are obtained by starting with a collection $V_0$ of ‘individuals’, that is objects which are not sets in that they have no elements; then adding all subsets of $V_0$ to form $V_1$, then all subsets of $V_1$ and so on. In contrast, in the ramified hierarchy, say $V_0, V_1^r, V_2^r, \ldots$, one does not take (all) subsets, but - in the first place - definitions of subsets of $V_0, V_1^r, \ldots$, definitions in a suitably restricted language depending, respectively, on $V_0, V_1^r, \ldots$ It is called ‘ramified’ because at each stage of this hierarchy (of definitions) there appear new definitions of subsets of any $V^r$, even new definitions of subsets of $V_0$. Of course, in the simple, ‘primitive’ hierarchy $V_0, V_1, \ldots$ of sets, no new subsets of $V_0$ are introduced after stage 1 because $V_1$ contains already all subsets of $V_0$. The difference illustrates the greater simplicity, stressed in the last paragraph, of dealing with objects, the sets themselves, rather than with definitions.6

Russell’s original presentation of the ramified hierarchy (which he restricted to finite stages) was too complicated to be manageable. It was, much later, simplified and extended by Gödel [1940] under the name constructible hierarchy, ‘constructible’ because, at that time, Gödel regarded constructivity as being concerned primarily with definitions, not with proofs (which is the current usage, going back to Brouwer and Hilbert). At the present time we know much more about the (transfinite) constructible hierarchy than about the primitive hierarchy. So, without for one moment doubting the good sense of Cantor’s notion, there is the possibility that, for the actual understanding of functions, the constructible hierarchy will simply turn out to be more useful, because it is more manageable, than the primitive hierarchy.
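The contrast between sets and their definitions can be made concrete in a toy, finite setting. The following Python sketch is only an illustration under stated assumptions: the two individuals placed in V0, the helper names powerset, next_stage and subsets_of_V0, and the three sample definitions are all hypothetical, and the code models only the simple (‘primitive’) hierarchy, not Russell’s ramified one. It checks that no new subsets of V0 appear after stage 1, and that several different definitions share a single extension.

```python
from itertools import combinations


def powerset(xs):
    """Return the set of all subsets of xs, each as a frozenset."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}


def next_stage(stage):
    """One step of the simple hierarchy: keep the stage and add all its subsets."""
    return set(stage) | powerset(stage)


# Assumption: two 'individuals', chosen only for illustration.
V0 = {"a", "b"}
V1 = next_stage(V0)
V2 = next_stage(V1)


def subsets_of_V0(stage):
    """Collect the members of a stage that are subsets of V0."""
    return {x for x in stage if isinstance(x, frozenset) and x <= V0}


# Assumption: three sample 'definitions' (predicates on V0) with the same extension.
definitions = {
    "x == 'a'": lambda x: x == "a",
    "x != 'b'": lambda x: x != "b",
    "not (x == 'b')": lambda x: not (x == "b"),
}
extensions = {
    name: frozenset(x for x in V0 if pred(x)) for name, pred in definitions.items()
}

# Stage 1 already contains every subset of V0; stage 2 adds none.
print(subsets_of_V0(V1) == subsets_of_V0(V2))
# The three definitions determine a single set.
print(len(set(extensions.values())))
```

In this toy run the subsets of V0 found at stage 2 are exactly those already present at stage 1, while the three sample definitions collapse to one and the same set.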
Put differently, since the primitive hierarchy is logically prior, we cannot exclude the possibility that the foundations of set theory are of limited use for the actual understanding of this subject.

In view of what I said earlier about the nature of pioneer work, it is probably not psychologically realistic to try and explain all the defects of Russell’s presentation by reference to his general philosophical views!7 But it is undeniable that some of the defects are consistent with Russell’s views, in particular, his liking for Ockham’s razor. He was led to the theory of definitions, because - in the empiricist tradition - he wanted to eliminate the abstract objects defined, and to confine himself to the definitions on which we operate (or, perhaps, better: predicates, since one can hardly speak of ‘definitions’ when one eliminates the objects defined by them). But since this elimination was his principal purpose, the results actually stated about the definitions were little more than mere circumlocutions about the sets defined! There was nothing of specific definitional interest!

Amusingly, there is a particularly striking omission connected with a topic that Russell himself stressed constantly: the requirement that his foundational scheme should ‘account’ for the empirical uses of the natural numbers, for example, their use for counting. Now here it is absolutely essential that we can operate effectively on the particular definitions chosen; given two definitions A and B, of the numbers a and b, we must be able to find a definition of the sum a + b of these numbers. Of course I do not claim that Principia is in fact inadequate, at least if we take ‘effective’ in its idealized sense of recursive. The omission consists in the fact that Principia doesn’t go into the matter at all!

This is perhaps a suitable point to mention the work of Frege who anticipated some of the more important philosophical contributions of Principia, often in technically more perfect form. Speaking for myself, I have to admit that I find Frege’s perfectionism aesthetically tremendously appealing, and that I prefer his sharp, sophisticated wit to Russell’s playful and really rather gentle malice. But so to speak objectively, other considerations dominate. As to pedagogy, I mentioned already (in accordance with the Nobel Prize Citation, printed in Holmberg [1951]) that Russell caught the public imagination, Frege did not. And even today, if we want to convey a bit of elementary logic to a beginner, we use one of Russell’s cute phrases or metaphors. As to present knowledge, the difference between Frege’s and Russell’s contributions is of course quite small compared to any standard modern text. This too is of course to be expected from what I said, repeatedly, about the diminishing returns on early elaborations of pioneer work.

6 This is the difference, to which I alluded in note 1, between Principia and the New Maths, which treats sets and not their definitions.

7 For example, in Gödel’s pioneer work on constructible sets he speaks of the relative consistency of the quite implausible ‘axiom’: all sets are constructible; where nowadays we’d say quite simply that the constructible sets can be proved, by the usual principles of set theory, to satisfy those principles - a fact which is patently stronger than the relative consistency result.

7.3 Looking Back. Russell’s Hopes and Disappointments

It was natural - objectively and subjectively - to compare the theory of classes to atomic theory; the former as an answer to the question: What is mathematics? the latter to the question: What is matter? The comparison is also natural for the present exposition because the atomic concept of matter is certainly, so far, the most impressive result of speculative philosophy, a perfect example of a conception suggested by general philosophical reflection; not by looking at the familiar world around us. Incidentally, though both questions above are natural, it was by no means obvious that they had a satisfactory answer; specifically, it had to be discovered that the somewhat unmanageable question: What is matter? could be replaced by the more precise question: What is matter (or matter-in-bulk) made of? just because the atomic conception is true.

Let me begin with the positive aspects of the comparison. The atomic conception goes back to the Greeks, some 2000 years ago. It took a very long time before anybody formulated any laws that the atoms were to obey. Frege and Russell were the first (both within the last 100 years) to consider any general conception of mathematics; they introduced the logistic conception and formulated laws which the basic objects - the classes - were to obey. I have no recipe for comparing progress in the last 100 and the last 2000 years. But, naively, Principia seems pretty impressive compared to the first explicit formulations of atomic laws.

The atomic theory does not contribute to the aims of critical philosophy discussed here.
The theory extends the range of our theoretical knowledge; it does not correct naive convictions, since we have no convictions concerning the behaviour of atoms at all! And of course our knowledge of atoms is not more certain but less certain than our knowledge of the familiar objects around us. In short, the present comparison doesn’t even suggest any contributions by Principia to critical philosophy; and certainly doesn’t lead us to expect such contributions.

As far as our actual understanding of different branches of physics, respectively mathematics, is concerned, the roles of the atomic theory and of the logistic reduction are quite similar in that they differ (at least at present) according to the branch considered. Specifically, so far the logistic reduction of arithmetic has taught us little of interest about the natural numbers; actually this reduction was of considerable importance for the development of the theory of classes. Similarly, some branches of ordinary physics, for example mechanics, which were essential for the development of the atomic theory, have so far benefitted little from the latter. Others, like the subject of semi-conductors, owe their existence to atomic theory. Similarly, there are parts of the theory of the continuum which can hardly be imagined, and certainly cannot at present be formulated, without the help of something like the logistic reduction.

Evidently, the facts just mentioned provide the germs for disappointments especially for somebody like Russell who, as he stressed himself, insisted on the finality of his theories; ‘finality’ not only in the sense of being true (as far as they went), but being so to speak the last word on the subject considered! The brutal fact of the matter is that both the atomic theory and the logistic reduction look much less ‘fundamental’ when one remembers how much ad hoc imagination is needed to discover the areas where they really add to our understanding.

Wittgenstein may well have had those fundamental theories (and philosophical questions generally) in mind when he quoted Nestroy: Progress always looks more marvelous than it is.8 As we see from Russell’s own experience, this superficial glamour of progress is bad for progress itself: deception creates disappointment, as the French know very well (déception).

On the other hand, there is certainly also the objective difference between the depths of present fundamental theories in physics, and mathematics. Russell knew both. Neither he - nor we at the present time - could say anything about classes which is of scope and interest comparable to what he himself could tell lay readers in his ABC of relativity or his ABC of atoms. It simply cannot be excluded that the present state of mathematical knowledge is not sufficient for any satisfying philosophical analysis. After all, 300 years ago Galileo and Newton could make decisive contributions to some branches of physics, say mechanics. They certainly speculated about a microstructure, but did not make lasting contributions - and, probably, a case can be made in retrospect that, quite objectively, nothing far-reaching could be inferred about atomic structure from the physical knowledge which was available at that time.

I confess that, to my mind, the speculations of Galileo and Newton are as interesting as their conclusive arguments. I believe that Russell enjoyed interesting arguments - but that he could not accept them as satisfying, perhaps from the same austere sense of duty which his mother showed toward one of his tutors (according to the Amberley papers). Come to think of it, Russell’s austerity, towards our intellectual needs, must have added to his disappointment in the following way.
Traditionally, philosophy is expected to make our existing knowledge intelligible to ourselves, a task which, obviously, is never ending as long as we are (mentally) alive and acquire new knowledge - a never ending activity, like breathing, and as natural, as T. S. Eliot might have said. Furthermore, we have feelings about our knowledge, hopes and disappointments about its intrinsic qualities (and not only about our uses of it, say in our careers). Traditionally, philosophy is expected to make us aware of such feelings and, perhaps, help us come to terms with them. This kind of art is performed by the great novelists; however, usually not for our feelings about knowledge, but about other parts of our mental life, our perceptions or aesthetic sensibilities. Only rarely, for example, in Solzhenitsyn’s novel The First Circle about imprisoned mathematicians and physicists or in Musil’s The man without qualities do we find any concern for the feelings which we, undoubtedly, have about our knowledge. The need (to come to terms with such feelings) exists; if like Russell one does not mention it one is liable to expect - consciously or unconsciously - that it will somehow be satisfied automatically; perhaps even by the science, the systematic discipline by which one wants to replace traditional philosophy! Well, if one really expects Principia to satisfy this need, one will be very very disappointed - and quite understandably disappointed in Principia itself.

8 See Chapter 15.

Chapter 8

Bertrand Russell

Russell left an autobiography in three volumes and two earlier autobiographical essays. These works were widely read. The style is fresh and lucid, perhaps unequalled since Bishop Berkeley or Hume; and as memorable. So the reader may be assumed to know the general outline of Russell’s life and thought. Since the complete bibliography of his writings is said to run to 500 pages there can be no question of attempting a full account here. The selection below, from his life and works, is made on the following principles. The well-known aspects of his life, including his activities as a publicist and reformer, are described only briefly. For balance Section 1 presents in more detail those aspects which, though perhaps equally important and sometimes quite explicit in his writings, have not become widely known. (Since Russell was a controversial figure, the selection made here may also be controversial.) Section 2 goes into ‘his researches concerning the Principles of Mathematics and the Mathematical Treatment of the Logic of Relations’ - to use the wording of the proposal for his election to the Royal Society. Some of Russell’s views on points of general philosophic interest related to his scientific work are sketched in Section 3. Many of his later general writings do not always respect Hooke’s warning (to the Royal Society) against ‘meddling with Divinity, Metaphysics, Moralls, Politicks, Grammar, Rhetorick, or Logick’, in the sense in which Hooke understood those words. To compromise with Hooke’s law, the memoir confines itself to describing Russell’s view of the world, how questions in the forbidden subjects presented themselves to him, without going too closely into the sense of the question or the validity of the answers. This is done at the end of Section 3.

0Originally published in the Biographical Memoirs of Fellows of the Royal Society, 19 (1973) 583–620. Many people have helped, directly or indirectly, in the preparation of this chapter; in particular, Sir Isaiah Berlin and Mr. Kenneth Blackwell of the Bertrand Russell Archives at McMaster University, Hamilton, Ontario in Canada.

8.1 Russell’s Life

Childhood

Russell was born on 18 May 1872 at Trelleck, Monmouthshire, Wales. His parents died before he was four, and he spent his childhood and adolescence at his grandparents’, on his father’s side, at Pembroke Lodge in Richmond Park. His grandfather, the first Earl Russell, had been Prime Minister, and was famous for having used the immense power of the British Empire with discretion. In his writings on education, Russell often referred to his own childhood, hoping to correct what he remembered as unsatisfactory. At least occasionally, he warned against drawing general conclusions from his limited experience; for example (A1 26)1 his diet was atrocious by current views and yet, as he goes on to point out, he never had a day’s illness except for a mild attack of measles - and lived to be nearly 98, most of the time in excellent health and full of vigour.

His psychological diet was, it seems, mildly unusual even for his time and class. Unlike, say, Churchill, he does not seem to have formed strong attachments to any of the people who actually brought him up; his nannies, governesses or tutors. He liked some of them well enough (A1 31), but they do not seem to have stayed on for a long time; possibly, as Russell himself suggests (A1 49), because of the child’s own nature and its effects on the people around him. All this may be relevant to the more painful episodes of Russell’s life. For at least statistically, his kind of childhood goes with long periods of loneliness in later years and, above all, a very schematic understanding of human nature which brings one unpleasant surprises; not only at other people’s conduct (always doubly unpleasant if one likes to have good psychological judgement); but also at one’s own feelings (when one is, so to speak, moved by one’s own emotions rather than by their objects). Be that as it may, Russell’s childhood was certainly not wholly unrewarding.
He obviously had a great deal of affection for his family, in particular, for the remarkable collection of independent and, perhaps, formidable female relatives who must have enjoyed young Russell’s attentions more than they let on (A1 33-34). In any case they did not spoil his gifts for entertaining and scintillating in company. He describes their foibles mockingly, but the mockery is good humoured and gentle - in sharp contrast to his acid indiscretions about the dons at Trinity (A1 89-90); ‘in contrast’ even if one allows for the objective differences between the foibles involved. As Russell was to find out, experience of his relatives had not prepared him adequately for some of the women he encountered later, at home and abroad.

1 The letter Ai refers to volume i of Russell’s Autobiography, and the numbers are page numbers (similarly for other references). To avoid misunderstanding it should be noted that the pagination of A1 - A3 refers to the hardcover edition of Russell [1967], [1968], [1969], and differs substantially from the English edition.

Adolescence

He was educated privately, with plenty of free time to pursue those grand traditional questions which occur to us when - ontogenetically or phylogenetically - we begin to reflect. To some extent this was family business: John Stuart Mill was his godfather.

Almost inevitably in the circumstances, questions about Euclid’s axioms crossed his mind. But, it seems, even at this early stage he showed robust good sense. He did not dismiss the questions - and surely was much more articulate about them than the average schoolboy, as is evident from his diary ([1959] 28-34) - but did not let them cramp his style; he greatly enjoyed doing geometry ([1944] 7).

Studies and research

At Cambridge he studied mainly mathematics and philosophy, and was soon recognized to have exceptional talents. He was awarded a fellowship at Trinity for an essay on the foundations of geometry. It expounds fairly orthodox views, familiar since Kant and the German idealists; naturally without Kant’s gratuitous stress on Euclidean geometry. But it exhibits already one of the most striking qualities of Russell’s later work: a light and sure touch in marshalling an immense amount of erudition, which produces the very pleasant conviction that it is possible to have a broad view; both of the subject and our knowledge of it - a conviction very much in keeping with Hegelian doctrine.

In 1894, between the tripos and his fellowship, he married Alys Pearsall Smith, whom he left in 1910 and from whom he was divorced in 1921. They made two extended visits to Berlin, where he studied German social democrats in action, by attending meetings of their party. One day, in the Tiergarten and very much under the influence of Hegel, he decided to write a series of books organizing, more or less, all knowledge. This decision left a deep impression on him; he referred to it repeatedly, for example, in [1944] 11 and in the reflections on his eightieth birthday (A3 329). Even if the plan was not wholly realistic, it shows the proper spirit for embarking (some years later) on the more limited enterprise of writing the three volumes of Principia; with Whitehead listed as first author, in anti-alphabetic order, presumably because Whitehead was older.

After his return to England, Russell worked on mathematical logic and the foundations of mathematics. He found much of the work on Principia exhausting and depressing, and remained reluctant to return to the subject; of course, he did return to it, for example when preparing its second edition. This important work is the principal subject of Section 2, where also some further relevant biographical material is given.
But two ‘objective’ qualities of Principia must be mentioned here because they may have affected his relations with the academic world.

Principia is not polished enough to be really useful, not even to trained mathematicians. They cannot dip into it and work with it easily, not even with the theory of relation arithmetic which Russell liked particularly ([1959] 101). Professional reputations depend less on the general intrinsic value of ideas than on specific memorable results of more or less immediate use to other professionals; in accordance with the principle that - intrinsic - virtue is its own reward. This accounts for much of the lack of interest on the part of others, not only his own. Of course in some isolated centres such as Warsaw, Principia was studied enthusiastically.

A second problem, which lies beyond Russell’s purely subjective description of his relief at being rid of Principia (A1 234), is this: What were the objective possibilities of improving or developing Principia? There were two ways. One was to review Principia in the light of criticism from ‘without’ ([1959] 112), by so-called formalists and intuitionists. Russell did not do so but chose to ‘repel their attacks’ (which was not hard because the opponents’ dialectics were pitiful). As a matter of fact, as will be seen in Section 2, a great deal of work which has built on Principia made essential use of such criticism from ‘without’. The second kind of criticism, from ‘within’, had had to come from Russell himself: Whitehead’s role in the work, though obviously substantial, consisted in developing and consolidating Russell’s ideas. Especially in the light of a letter by F. H. Bradley (A1 307) this intellectual loneliness, this lack of support by constructive criticism, may have been more exhausting to Russell than the actual writing of Principia.

Wittgenstein

Just about the time when Principia was completed, Wittgenstein came to Cambridge and soon began to criticize some of the views in Principia from ‘within’ - as Russell stresses repeatedly, for example, in [1959] 112. Without doubt he had great hopes of profound help from Wittgenstein. Also, while according to one of his brother’s letters (A2 85) many of the dons were uncongenial to Russell, he found Wittgenstein an impressive human being (A2 140). Indeed, his mockery of Wittgenstein’s eccentricity has something of the same gentle quality noted earlier on in connexion with his own family. In an outburst of anger ([1959] 214, 215) Russell criticizes Wittgenstein for ‘abnegation of his talents’ in later years, comparing him to Pascal and Tolstoy. Even remembering Russell’s dislike of religion, one imagines that he could have found harsher things to say about others. (Besides, in some circles similar criticisms had been made of Russell.)

Actually, Russell introduces his criticism, in [1959], by first speaking of his own envy of Wittgenstein’s reputation. The passage strikes me as quite unconvincing, just as many other references by him to envy and jealousy. Ruthford found them a bit facile (A1 280) and the subject will turn up again in connexion with Russell’s Nobel Prize lecture. Could it be that he was so surprised to discover that he was capable of jealousy at all (A2 33) that he forgot other, less banal motives? - such as the feeling of being let down, intellectually, by Wittgenstein. (In the footnote on pp. 33-34 of A2 he himself has late second thoughts on jealousy.)

Wittgenstein too showed a good deal of affection when he spoke to me (in the forties when I first met him) of Russell. Wittgenstein was quite furious that Russell had involved himself in America with some universities and their administrators who, realistically speaking, simply did not belong to Russell’s world.
Not likely to let a chance of an apt, but malicious observation pass, Wittgenstein proposed the defence: Look at this face! to the charge made against Russell (A2 334): roughly speaking, Russell’s personal appearance at City College of New York would be ‘dangerous to the virtue’ of the tender souls that grow up in, say, Brooklyn.

Both Russell and Wittgenstein, though very different, had splendid faces and great style. They were a little below average height, and delicately boned, but generally quite free from the jumpy nervousness that often goes with this physique. Their gestures were always sure, often graceful, and sometimes beautiful.

First world war

Except for a period in Brixton jail, when Russell wrote Introduction to Mathematical Philosophy, he devoted much of his energy to - and, perhaps, derived it from - politics and the kind of social life he missed during his first marriage. He was strongly opposed to England’s participation in the war. But the theory behind his opposition, especially as he saw it later (A2 288), and his practical politics did not quite match.2 He was not a doctrinaire pacifist, not a conscientious objector. He thought that ceteris paribus peace was better than war - like most people, even those whose wartime experiences constitute the most fulfilling part of their lives. His own activities show that he himself sometimes found it necessary to resist. At the time he was particularly struck by the possibilities of non-violent resistance (of Gandhi’s followers against the British); but according to A2 288, Russell had overlooked that such resistance presupposes ‘certain virtues in those against whom it is employed’. Actually, he later advocated the use, or threat, of force; not only during the second world war ([1944] 17), but also after. In short, Russell objected to a specific war.

2 There will be frequent references to war and military matters in the following. Some of them are trivial, in the sense that they concern Russell’s public activities concerning various wars. But most of the references may appear, naively, to be ‘gratuitous’. In fact, the reason for including them is plain and familiar, and follows established practice (cf. Tolstoy’s War and Peace, or Solzhenitsyn’s August 1914). Quite apart from the subjective importance of war time experiences to the individuals directly involved, by and large wars involve the most complex organization of masses of people, with high stakes. It would be as absurd to ignore our knowledge of these matters if one wants to think about social forces and structures, as to ignore tornados if one wants to think about meteorological forces.

So much for theory. His anti-war propaganda pursued many different lines. In particular he was not jailed (in 1918) for pacifist convictions nor even for advocating non-violent resistance but for ‘disaffecting’ the troops. He had warned that America - reluctant as she was to enter the war at all - would send over troops to break strikes. (It is a moot point which was more far fetched: his warning or the charge against him that - presumably loyal - miners or soldiers were likely to be so easily ‘disaffected’.) In any case, he liked jail quite well, in particular, smuggling love letters in volumes of the Proceedings of the London Mathematical Society (A2 31). Incidentally, Russell was in his mid-forties, not of military age. He most certainly was not a mere coward; and nobody - including himself - had to ask himself if he was.

After a conviction in 1916 under the Defence of the Realm Act, Trinity College deprived him of his lectureship, a shabby act by any standards. So quick to see shabby motives, like envy, in himself, he seemed to think of the dons as moved solely by political ‘passions’ and bigotry; and of himself as a ‘martyr’. The realities of the situation seem different from the public debate which concerned of course the formal, legal merits of the case. There was a clash of temperament between Russell and those dons who opposed him (or: most of them; the matter is statistical since it concerns a vote). For many dons college life provided not so much a place for the pursuit of knowledge, but simply a shelter from practical life where one’s opinions have to be put to the test of experience. Russell, full of dash and vigour, had little respect for the opinions of those who had ‘no knowledge of life’ (A1 240). The dons did not think lightly of their opinions. They happened to constitute a majority, and exercised their vote.

Apart from formal rights and motives, there is a further point - however far it may have been from the minds of the opposition: realistically speaking, the normal duties of a college lecturer were hardly fitting for someone of Russell’s standing at that time. Besides, as mentioned already, Principia was not in a suitable shape for immediate consumption by his academic colleagues. Russell, who had long found the dons uncongenial, cannot have expected them to behave differently, let alone more generously than they actually did. But, like Kierkegaard before him and others since, he may have had the illusion that it ought to be possible to make his smug opponents ‘take notice’ (even if he took little notice of them).

Between the wars

During the twenties Russell engaged in diverse activities; he lectured, wrote books on general philosophy, and pursued his interests in social questions. He travelled in many countries, including the U.S.S.R. and China. He was critical of the Soviet regime, unlike others in his circle, in particular, Dora Winifred Black whom he married in 1921, left in 1932 and by whom he was divorced in 1935. An unsatisfactory interview with Lenin, who showed little interest in Russell’s political views, may have helped Russell see other defects of Lenin’s judgement. As has long been known, prejudice sometimes permits us to see the mote when love blinds us to the beam. In contrast, China enchanted him, particularly the human qualities, of wit and finesse, which he found among Chinese intellectuals.

In a memorable letter to Ottoline Morrell (A2202), he spoke of political and ‘bureaucratic machines [that] cared nothing for human values’. He meant the values of those particular qualities, of the Chinese and Oscar Wilde, which he himself possessed in the highest degree; not those surely much rarer qualities which permit a man to be both successful in public life and humanly impressive.

Russell and his second wife had ‘advanced’ social ideas. They also had the courage of their convictions. They decided, jointly (A2222), to try out some of them; a free school and a swinging marriage. He wrote Marriage and morals, which, though mentioned among his principal publications in the Nobel Prize vita (on p. 129 of Holmberg [1951]), was not stated to be a, let alone the sole, reason for the prize - contrary to Russell’s memory of it (A325). His advanced views became widely known. His change of views, though expressed in the clearest possible terms, is less well known. He discovered that the views were trivially wrong (at least for him) inasmuch as they were refuted by commonplace events. The children were difficult; when told to brush their teeth, they would sometimes say ‘Call this a free school!’ (A2227). As regards marriage, he had apparently completely forgotten to consider the case when the wife turns up with a child fathered by another man (A2228); as he says (A2288), he had been ‘blinded by theory’ - his own, not Hume’s, who goes into this sort of matter in Section 195 on chastity (Hume [1777]). Russell did not try to refine his views; neither by the scientific method of making more actual or imagined (Gedanken) experiments nor by analysing the factors involved in happier historical precedents; for example, it has long been said, for good reasons given on pp. 210-211 of Bernard [1973], that the painter Eugène Delacroix was Talleyrand’s natural son; Eugène had the full blessing of M. Charles Delacroix, Talleyrand’s predecessor (in office).
Actually, Russell hardly ever refined his views. He dropped them, replaced them by new views and usually - both in his younger days and later when he was nearly 90 ([1959] 41) - had ‘an almost unbelievable optimism as to the finality of [his] own theories’. But now, pushing 60, he was less confident. In particular, he remained an agnostic on the subject of marriage (A2228). And especially his last marriage, to Edith Branson Finch in 1952, seemed a good deal more peaceful - without theory and with 80 years behind him. He also wrote no more books like Marriage and morals which, in a sense, could induce a wife to be naughty out of sheer loyalty to her husband’s views (though one would not wish to be too dogmatic in such matters).

During these difficult years, in the early thirties, he wrote several articles surveying his past. He even tried to find out about his genetic make-up, as it were, by assembling the Amberley papers; in collaboration with Patricia Helen Spence, whom he married in 1936, and divorced in 1952 after she had left him in 1949. Perhaps the most critical point that struck him at the time (1931) is contained in the article Christmas at sea (A2229); on the extent to which his life and his view of the world had depended ‘on a superabundant vitality’. Clearly life had been simpler when there was plenty of energy to spare; the satisfying ‘unity between opinion and emotion’ in the first world war, of which he speaks nostalgically later (A2289), goes naturally with vitality and with the conviction that one will solve problems as they come along, a conviction produced by vitality. For this very reason the inconveniences directly due to vitality (such as the clash of temperament with the dons at Trinity) were not too disturbing. The decrease in vitality created different problems; most prosaically, presumably, the very marital problems that had taken him so badly by surprise.
His writings give the impression that the decrease in the level of vitality to which he had been accustomed for nearly 60 years, had taken him by surprise too. If so, this must have created its own additional difficulties. In 1939 he emigrated to America with his third wife and their young child born in 1937. He describes most vividly his experiences there and his reactions to them. They seem to be of general interest and will be taken up in section 3 as a typical example of Russell’s contributions to what might be called literary social philosophy.

Return to England

He was happy to return and happy with his reception. He was now the third Earl Russell, having inherited the title when his older brother Frank died in 1931. In 1949, he was appointed to the Order of Merit, described as ‘this odd miscellaneous order’ in T. S. Eliot’s letter of congratulation (A357). In the same year, Russell was awarded the Nobel Prize for literature. This unexpected honour pleased him too, but, understandably, not the somewhat absurd citation, mentioned already, on pp. 57-59 of Holmberg [1951]. It is not recorded whether Russell’s own Nobel Prize speech was meant to be repayment in kind. In it ([1951] 261) he suggests that the Kaiser wanted the first world war because he was literally jealous of his Grandmamma, Queen Victoria, on account of her Navy.

Russell’s untiring efforts for nuclear disarmament, after both East and West possessed nuclear weapons, are well known. It is perhaps less well known that in the late forties he advocated the threat of the atomic bomb against Stalin’s Russia

(A37−8). Whatever the merits of the proposal, many of the people I knew at the time were taken aback by it - some of us, but not all, much more so than by the proposals of preventive war ascribed to some Hungarian scientists (who had had a dose of communism - albeit the local variety - under Bela Kun). In the early sixties, at the beginning of a conversation partially reported on pp. 129-130 of Crawshay-Williams [1970], I asked Russell about his old proposal; admittedly in general terms. He immediately assumed that I objected to ‘inconsistency’, and answered (his own objection) with disarming logic: He did not want the bomb to be dropped on him; as long as the Russians did not have it, the proper thing was to prevent them from getting it; and when they had it, the proper thing was to persuade them not to use it. Obviously, he was quite fearless, in big things and in small ones. He was no more afraid for his own skin than he was afraid of my going out and telling his answer to some of his more fanatical admirers.

By this time he and his face had become part of our lives; organisations and foundations built up around him. One of them was the Bertrand Russell Peace Foundation. It is no doubt memorable for many things that it has done. But it was memorable even before it started, uniting among its sponsors Nehru of India and Ayub Khan of Pakistan, Albert Schweitzer and Kwame Nkrumah (whose political bible - Nkrumah [1970] - contains a final chapter on set theory).

The touch of irreverence in the last paragraph is not unexpected from one who has been reading Russell a good deal, and writing about his life. But, perhaps inevitably in such circumstances, the irreverence is mixed up with a more personal note. Nobody reading the reflections which Russell wrote on his eightieth birthday, and reprinted 17 years later in A3326-330, can fail to be impressed by his strong sense of failure.
At least as far as his scientific work is concerned, his disappointments are not founded on objective facts, but - as will be clear from Sections 2 and 3 - mainly on a failure of memory and lack of knowledge; mistaken memory of the aims he actually formulated for his scientific work, and lack of knowledge of the remarkable ‘extension of the sphere of reason to new provinces’, as he put it in [1944] 20, by the work of others who built on his scientific ideas and achievements. As far as moral and political problems are concerned, the disappointments are understandable. As he put it (A3328): in regard to those problems he did not ‘pretend that what [he had] done . . . had any great importance’. It is far beyond the scope of this chapter to analyse to what extent this failure was due to the nature of these problems, to Russell’s conceptions of them, and to his particular human qualities. His brother Frank considered these matters back in 1916 in a - for him - remarkably sombre letter

(A285-86). Also Russell’s own letter to Lady Ottoline, quoted on p. 113, may be relevant here.

Russell died on 2 February, 1970, at his home in Penrhyndeudraeth, Merionethshire, Wales. He was survived by his last three wives and by his three children, John Conrad, Katherine Jane (Tait) and Conrad Sebastian Robert.

8.2 Mathematical Logic and Logical Foundations of Mathematics

Some mathematical background will be assumed. Basic issues involved here, such as the meaning of ‘foundations’, are best discussed in the general context of section 3, in connexion with the scope of (scientific) philosophy. But a few general remarks may be useful here to avoid misunderstanding.

Foundations: loaded terminology

The word ‘foundations’ suggests a firm basis for a superstructure, which is to be ‘secured’; here, the superstructure of mathematical practice. Evidently it is not in need of greater ‘security’ or reliability if it is already 100% secure. Besides, ordinary clear exposition may well be the best method to achieve greater reliability where this is possible. Actually the term ‘foundations’ or ‘Grundlagen’ belongs to the familiar doctrines of finitism or formalism (rivals to Russell’s school) which claim that the abstract principles of mathematical practice are not reliable. Thus, according to these doctrines, to be capable of reliable proof, an assertion about an abstract concept has to be reinterpreted in their doctrinaire terms. Since, in point of fact, long formal calculations have to be checked by means of short abstract considerations, clearly some kind of idealized reliability is meant; what ‘should’ be reliable, not what is reliable. And if the idealization is not realistic the doctrines are themselves questionable.

Be that as it may, Russell’s own aims were originally different though he occasionally used the word ‘foundations’, for example in [1919] 2. He did not assume that the basic (logical) notions would be particularly easy to grasp nor, a fortiori, that our assertions about them would be particularly ‘reliable’. According to Principia, p. 12:

‘It will be found that owing to the weakness of the imagination in dealing with simple abstract ideas no very great stress can be laid upon their obviousness. They are obvious to the instructed mind, but then so are many propositions which cannot be quite true, as being disproved by their contradictory consequences. The proof of a logical system is its adequacy and its coherence.’

Even if Russell’s views on the conclusions to be drawn from those contradictory consequences (paradoxes) are questioned as on p. 124, the passage shows that he did not expect the basic logical propositions to be obvious or obviously reliable. In any case his original aim was not to cleanse mathematics of the paradoxes because he started his work before he discovered them.

Russell’s own work in the logical analysis of mathematics, that is, in building up mathematics from a few logical primitives, would seem to be better compared to fundamental science such as the atomic theory of matter. The principal aim of that theory is hardly to ‘secure’ our ordinary physical knowledge - despite dramatic assertions, for example by Eddington who thought that it was correct to think of a table as being like a swarm of flies, but false not to think about any micro-structure at all (which is what we normally do). Atomic theory builds up matter from a few elements and tries to derive the macroscopic laws from simpler laws for these elements. Correspondingly logical foundations ‘build up’ mathematical concepts by defining them from a few logical ones. Since Russell did not mention this comparison in his publications, I once asked him about it (in the conversation mentioned on p. 114); he agreed - whatever weight one may attach to spontaneous agreement during a pleasant conversation.

Views differ on the pedagogic value of the comparison since, to some, it suggests that mathematical objects are physical substances. Here it is only intended to prepare the reader for some peculiar difficulties of foundations which are similar to those of the fundamental sciences, not to those in the bulk of everyday scientific practice; in particular, the special kind of incompatibility between rival schemes, corresponding to strategic and tactical differences. Fundamentally different schemes or different analyses within the same scheme may be compatible with familiar practice to an extremely high degree of approximation; and apparently, that is formally, insignificant differences in the basic schemes have enormous consequences. In short, we have here all the advantages and defects of an all-or-nothing approach to life, a point which Russell stressed ([1945] 643).

The next few sections describe briefly the background to Russell’s logical analysis of mathematics; the tools (logical concepts) used for the analysis, and examples of analyses of familiar mathematical objects.

Background: logical language

Today elementary, also called first order, language (of predicate logic) is familiar. It builds up expressions from given ones by means of

¬ (not), ∧ (and), ∨ (or), → (implies), ∀ (for all), ∃ (there exists).
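As a small illustration of the classical, truth-functional reading of these connectives, the following sketch checks by exhaustive truth table that ∨ and → can be expressed through ¬ and ∧ alone, and that over a finite domain ∃ can be expressed through ¬ and ∀. The function names are illustrative assumptions, not notation from the text, and a finite check of this kind says nothing about ordinary usage.

```python
from itertools import product

def or_via_not_and(p, q):
    # p ∨ q rewritten as ¬[(¬p) ∧ (¬q)]
    return not ((not p) and (not q))

def implies_via_not_and(p, q):
    # p → q rewritten as ¬(p ∧ ¬q)
    return not (p and (not q))

def material_implies(p, q):
    # the usual truth table for →: false only when p is true and q is false
    return (not p) or q

def exists_via_forall(domain, A):
    # ∃x A rewritten as ¬∀x ¬A, evaluated over a finite domain
    return not all(not A(x) for x in domain)

truth_values = [False, True]

# Compare each rewriting with the direct truth table on all four assignments.
or_ok = all(or_via_not_and(p, q) == (p or q)
            for p, q in product(truth_values, repeat=2))
imp_ok = all(implies_via_not_and(p, q) == material_implies(p, q)
             for p, q in product(truth_values, repeat=2))
```

Both `or_ok` and `imp_ok` come out true, which is all that "equivalent for the chosen meaning" amounts to in the propositional case.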

The intended meanings of these operations are not very common in ordinary usage, particularly of ∨, →, ∃. As a matter of discovery, the chosen meanings lend themselves better to theory. Some of the discrepancies are most easily removed by suppressing p∨q, p → q, ∃xA altogether and replacing them by ¬[(¬p)∧(¬q)], ¬(p ∧ ¬q), ¬∀x¬A resp. which are equivalent (for the chosen meaning; actually, in ordinary usage not all meaningful sentences p, q can be ‘sensibly’ combined to, say p ∨ q; for example, if p is: this glass is blue, and q is: this glass is 8 cm high). Frege [1879] contained this language. Although Russell had Frege’s book for many years before the International

Congress of Philosophy in Paris in July 1900 (A191), he could not make out what it meant till he met Peano in person at that unusually useful congress. Peano inspired a great deal of confidence in Russell, enough for him to master Peano’s logical notation; and soon after, also Frege’s. It is often said that Frege’s notation was ‘cumbrous’ and ‘difficult to employ in practice’ ([1903] 501). This may be true - though after all Frege did employ it quite a bit.

The differences in the purposes for which Frege and Peano employed their symbolisms seem more profound, as hinted at by Russell loc. cit. and stated more explicitly in Gödel [1944] 125. Peano established that his simple vocabulary with a perfectly precise grammar had great expressive power throughout all of mathematics. Frege used his for the analysis of (logical) thought and for a particularly detailed derivation of arithmetic from pure logic. Peano’s aim and even the details have a permanent place in mathematical culture. But while the penetrating analysis of the most basic steps in logic was immensely fruitful for Frege, leading (him) to distinctions and notions of permanent value, we should not nowadays follow his own line of exposition; it is much easier and more convincing to explain his logical discoveries by means of examples taken from more ‘advanced’ mathematics where differences are greatly magnified; plain for all to see, not subtle as in elementary logical contexts. Besides, perhaps by mischance, logical analysis has so far been less rewarding for arithmetic than for many other branches of mathematics, even those existing at Frege’s time. (Gauss did quite well in his Disquisitiones without knowing Peano’s axioms.) It was left to Russell to set caution aside and to search for Principles of (the whole of) Mathematics, not only of Arithmetic.

Digression. Analogues to the facts just described are easy to find in the development of the atomic theory of matter. Precisely:

1. We do not see the world very differently after we learn that it is atomic.

2. Many macroscopic phenomena continue to be analyzed by use of purely macroscopical theories. Very occasionally (for example when designing an atomic bomb) in order to produce a macrophenomenon one uses with advantage one’s knowledge of atomic theory. On other occasions one uses macroscopic experience to learn about atomic structure.

Logical and set theoretic foundations of arithmetic have been peculiarly sterile in a quite precise sense (at least so far). For one thing, we have not derived arithmetically interesting results from set theoretic axioms (via the set theoretic definition of natural numbers). But also:

1. The axiomatizations of the set of arithmetic consequences of usual set theories are really quite simple and elegant even if one uses only the usual arithmetic notions (+ and ×); while one of the hopes for ‘fundamental theories’ has always been that the laws for the ‘fundamental objects’, be they sets or atoms, would be simpler and less arbitrary than the laws for compound objects.

2. The axiomatic theory describing arithmetic experience, specifically first order arithmetic, is ‘neutral’ on set theoretic axioms, that is, teaches us little about set theory. In particular, over set theory without the axiom of infinity, adding the replacement axiom or the axiom of choice does not yield new number theoretic theorems.

In contrast, both set theoretic concepts and set theoretic principles have become a standard (and useful!) tool in the geometry of the plane. Even if the best proofs are geometrical or topological, theorems were often first discovered set theoretically. (Analytic Number Theory is not relevant to the present comparison since here the natural numbers are embedded in the complex plane, not (re)defined set theoretically.)

Background: sets and predicates

Cantor, after studying the (mis)behaviour of trigonometric series at certain peculiar sets of points in the plane, went on to develop the properties of such sets more abstractly. Many familiar operations on sets of points or on finite sets (in the theory of combinations and permutations) were seen to be meaningful in a far more general context too. One operation, which turned out to have a particularly rich theory, was the so-called power set operation $P : x \mapsto 2^x$, which associates to any given set x the collection of all its parts (subsets). It provided examples of infinitely many infinite cardinals.

Just because Cantor’s study began with (infinite) sets of points, it cannot be supposed that he relied on his experience with finite sets to develop the general theory. But it is true that many of his results, in particular those concerning the power set operation, are very well illustrated by finite sets; even by so-called hereditarily finite sets (also called the cumulative hierarchy of type ω) which are obtained from the empty set ∅ by iterating the power set operation finitely often. In symbols:

$V_0 = \emptyset, \quad V_{n+1} = P(V_n), \quad V_\omega = \bigcup_n V_n.$
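The first few stages of this hierarchy are small enough to compute outright. The following is a minimal sketch, assuming Python's `frozenset` as a stand-in for finite sets; the names `power_set` and `V` are illustrative and not taken from the text.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of the finite set s, each as a frozenset."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )

def V(n):
    """Stage n of the hierarchy: V(0) is empty and V(n+1) = power_set(V(n))."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage

# Number of elements in V(0), ..., V(4).
sizes = [len(V(n)) for n in range(5)]
print(sizes)  # [0, 1, 2, 4, 16]

# The stages are cumulative: each one is contained in the next.
cumulative = all(V(n) <= V(n + 1) for n in range(4))
```

The sizes grow as iterated powers of 2 (the next stage already has 65536 elements), which is one way of seeing why the power set operation has such a rich theory even on finite sets.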

Only, as Russell put it (Principia, p. vi), the ‘general laws [of $V_\omega$] are most easily proved without any mention of the distinction between finite and infinite’. (To be quite precise, Russell’s remark applies to purely universal laws; for existential statements one has to look at the axioms used in the proof to verify in addition that a finite set realizes the statement.) One might add that the general laws are often better understood, for example by a child who has not yet convinced himself that the sets he knows - such as forests of trees or heaps of sand - can be counted at all.

Cantor gave some very general indications of the kinds of objects for which his assertions hold, for example, in Cantor [1899] or [1932] 282; saying that a set is a variety of objects (Vielheit) which can be grasped or comprehended as a unity (Einheit); but he did not draw many conclusions from this. Indeed, as far as (even today’s) mathematical practice goes, little would be lost if Cantor’s work were applied only to $V_{\omega+\omega}$, that is, to the sets obtained from the collection of hereditarily finite ones by finitely many applications of the power set operation. Only, by Principia p. vi, it would be ‘a defect in logical style to prove for a particular class . . . what might just as well have been proved more generally’.

Frege considered a prima facie much more general notion, much farther removed from mathematical practice (then or now) than segments of the cumulative hierarchy described above. It is the logical notion of predicate, with no a priori restriction on the kind of thing to which the predicate may apply. Such predicates are very common in ordinary life; for example, blue is understood without any clear idea of all the things past, present or future, mental or physical, that may be blue (in contrast to mathematical practice, where one has predicates or sets of something; of numbers, of points etc.).

We shall write aηP for: the predicate P applies to (the object or predicate) a. As usual, a ∈ X will mean: the object a is an element of the set X.

Instead of ‘predicate’ Principia uses ‘propositional function’ which, as Russell says in [1959] 69-70, ‘sounds perhaps unnecessarily formidable. For many purposes one can substitute the word “property” . . . but, except in ultimate analysis, it is perhaps easier to . . . use the word “class”.’³ We use ‘predicate’. To each set X corresponds of course a predicate $P_X$ where

$a \eta P_X$ if and only if $a \in X$.

However, Cantor’s explanation of ‘set’ shows clearly that there is no reason to assume the converse. Nevertheless operations on (Cantor’s) sets often have analogues for predicates. Suppose we start with a variety $P_0$ which is not a set, and form varieties $P_n$ by an analogue of the cumulative hierarchy construction, where

$a \eta P_{n+1}$ if and only if $a \eta P_n$, or $a \subseteq P_n$ and a is a set, that is, a can be grasped as a unity.

Whatever doubts there may be about the notion of predicate, the $P_n$ are predicates if any sense is given to Cantor’s explanation, and there should be a variety $P_0$ which is not a set if Cantor’s distinction is to be of use at all. (The hierarchy will be cumulative if $P_0$ consists only of sets.)

There will be frequent references to various versions of this hierarchy construction. This is not ad hoc, but connected with Cantor’s explanation of ‘set’, if not only the variety in question, but also its elements, their elements, etc. are required to be ‘unities’. The only quite unproblematic way of achieving this is to build them up from the variety of ‘individuals’ which are simply given as unities. This variety takes the place of $P_0$ above (and the sequence $P_1, P_2, \ldots$ may be continued beyond ω).

Actually most formal laws discovered at an early stage of the subject hold equally for sets and predicates (∈ and η), about unions, differences, Cartesian products and the like; and again one ‘might just as well’ prove them generally. But this does not exclude basic differences between the notions; nothing could be simpler than the predicate, say V, which applies to everything; but it is hardly plausible that this variety V can be grasped as a unity.

³ As so often, defects in terminology reflect a defective analysis of the notion considered; cf. p. 128.

Background: definitions of natural and real numbers

We begin with the natural numbers. In Dedekind [1888] there is a definition of the natural numbers in logical terms, that is, in the logical language built up from η or ∈. It does not define specific numbers 0, 1, . . . but, as we should put it now, the class of structures (X,S), where S is a binary relation on X, which are isomorphic to the natural numbers with the successor relation. This is accepted as a definition not because of some arbitrary decision but because of a discovery (made in the last century). Even when we think of the natural numbers as specific objects, the results we prove about them in pure mathematics turn out to be true for all those (X,S) described above. And this empirical observation becomes a theorem if we confine ourselves to results formulated in logical formulae (not necessarily of first order languages) built up from the successor relation.

Russell had two objections ([1919] 10). First, Dedekind’s procedure ‘does not even give the faintest suggestion of any way of discovering whether there are such sets’, specifically, sets X and S satisfying Dedekind’s conditions. One may think that we do not need to discover them, because we already know the familiar natural numbers, and Russell’s demand, for a logical foundation, may be considered a luxury. It becomes a necessity when we pass from arithmetic to problematic notions. His second objection is this: ‘we want our numbers to be . . . used for counting common objects, and this requires [them to] have a definite meaning, not merely . . . certain formal properties’. This matter is more delicate (and in any case, as will be seen below, the analogous requirement is not satisfied by Russell’s own definition of the real numbers). But it is quite evident that not all isomorphic images of the natural numbers can be used for counting. Suppose $a_0, a_1, \ldots$ is an ω-ordering (of say the usual words for numbers), but we do not know the value of say $a_5$. We could not use the isomorphic image of ω above, that is, $a_0, a_1, \ldots$, for counting in the literal sense because we should not know the numerical label for a set of 5 objects. (In current logic this remark is developed by use of recursion theoretic notions.)

Frege [1884], four years before Dedekind’s publication as is pointed out on p. 1 of Frege [1893], does give definitions, in logical terms, for each natural number; for example of 1 as

the class of all classes with a single element in Russell’s formulation. It would be idle to speculate whether we ‘need’ these definitions; we evidently do not, as long as we are concerned with results that hold for all structures defined by Dedekind. The question is rather whether, at least sometimes, we can do better when we do have such definitions. Amusingly, developments of arithmetic in current set theory introduce such definitions: the empty set is taken to be zero, and either of the two functions:

$s_1 : x \mapsto \{x\}, \quad s_2 : x \mapsto x \cup \{x\}$,

for a successor; $s_1$ and $s_2$ generate the structures in which ∈ is the successor and the order relation, respectively. A pedant would say that $s_1$ analyses the notion of natural number, $s_2$ the notion of finite ordinal (tacitly assuming that ∈, evidently the simplest relation of the set theoretic language, is to realize the successor, resp. order relation).

Dedekind [1872] also gave conditions, in logical language, for a structure (X,O) to be isomorphic to the real numbers; more precisely, to the ordering of the real numbers. Russell supplements the definition by treating each specific real number ρ as a set of rationals (< ρ) and verifying that Dedekind’s condition is satisfied if these sets are ordered by inclusion. The questions raised by Russell ([1919] 10) about applications of natural numbers (to counting) have their analogues here. But he does not discuss them. His definition is not particularly useful for applications within mathematics; for example for computing, Cauchy sequences, satisfying

$\forall n\,(\forall m > n)\,(|a_m - a_n| < 2^{-n})$,

are better in the sense that we can effectively find sums, approximate Euler’s constant γ and so forth, while it does not seem to be known if one can effectively decide for any rational r whether r < γ. In short, there has not been much progress with Russell’s (and Frege’s) aim of finding ‘privileged’ structures in Dedekind’s classes; perhaps the most one can hope is that relatively few specific structures will turn out to be well adapted for many uses (a topic for the philosophy of applied mathematics).
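The remark that sums can be found effectively for such Cauchy sequences can be made concrete. The sketch below represents a real number as a function from n to a rational approximation obeying the bound above; the names `e_seq`, `third_seq`, `add` and `satisfies_modulus` are illustrative assumptions, and the two sample reals (e and 1/3) are chosen only because their error bounds are easy to verify by hand.

```python
from fractions import Fraction
from math import factorial

def e_seq(n):
    """n-th approximation to e: the partial sum of 1/k! for k = 0, ..., n+1."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 2))

def third_seq(n):
    """n-th approximation to 1/3: its binary expansion cut off after n+1 digits."""
    scale = 2 ** (n + 1)
    return Fraction(scale // 3, scale)

def add(a, b):
    """Sum of two such sequences; shifting the index by one restores the 2**-n bound."""
    return lambda n: a(n + 1) + b(n + 1)

def satisfies_modulus(a, upto):
    """Finite spot check of |a(m) - a(n)| < 2**-n for all n < m <= upto."""
    return all(
        abs(a(m) - a(n)) < Fraction(1, 2 ** n)
        for n in range(upto)
        for m in range(n + 1, upto + 1)
    )

# The sum e + 1/3, again as a sequence of rationals with the same bound.
s = add(e_seq, third_seq)
```

The index shift in `add` is the whole content of the construction: each summand at index n+1 is within 2^-(n+1) of its later terms, so the two errors together stay below 2^-n. Nothing comparable is available for the cut representation without a decision procedure for r < ρ.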

New theories: Russell’s paradox

By the end of the nineteenth century it was known that then current mathematics could be ‘reduced’ to the natural and real numbers. So definitions of these objects in logical terms made it plausible that the grand old question: What is mathematics? had a satisfactory answer (relative to the knowledge at the time). Russell saw this clearly soon after that exciting congress in 1900. Whatever the formal defects of The Principles of Mathematics, it put the grand old question back on the map when the time was right.

When actually carrying out the work, Russell and Whitehead were naturally led to rethink parts of the mathematics of the day, and to develop a new subject, the Mathematical Treatment of the Logic of Relations. The critical problem was to find significant laws satisfied by the basic logical notions; a minimum requirement being that these laws provided logically defined structures which are isomorphic to the familiar mathematical notions.

Frege [1893] contained already such laws which he himself used specifically in the case of natural numbers. His laws were amazingly simple since there was essentially just one principle. In modern notation, for each formula A of the language considered:

$\exists X\,\forall Y\,(Y \eta X \leftrightarrow A)$,

provided A does not contain the variable X. It is called the comprehension principle, X ‘comprehending’ all objects Y which satisfy A. Frege himself was quite aware that the principle was problematic: ‘Ich halte [das Prinzip] für rein logisch’,⁴ on p. vii of Frege [1893]. He certainly considered the possibility that it might be contradictory (ibid. xxvi). He obviously thought it wasn’t and made the - for him! - weaker prediction: ‘Aber [daraus einen Widerspruch abzuleiten] das wird Keinem gelingen’.⁵ An even more convincing symptom of Frege’s malaise was this ‘evidence’: ‘Es ist unwahrscheinlich, dass ein solcher Bau sich auf einem fehlerhaften Grunde aufführen lassen sollte’;⁶ Frege insisted on the objectivity of logic and mathematics, and knew quite well from physics that there were grand, but false theories.

Cantor’s review of Frege [1884] in Cantor [1885] was extremely critical of the comprehension principle. This is quite natural given his explanation of set (to which, however, he did not refer there explicitly). Within the context of Frege’s language, the only way to talk about varieties was to introduce such formulae as A above, and so according to the principle every variety could be comprehended as a unity! (neglecting the distinction between η and ∈). Cantor’s own objection was quite specific; the extension of a concept, defined by the formula A, is in general quantitatively quite undetermined. But Cantor did not derive a contradiction at that time. When he did, in Cantor [1899], he did not publish it.

Russell derived a simpler contradiction (in 1901, some 15 years after Cantor’s review) and did publish it. He arrived at it by analysing Cantor’s proof of $|2^X| > |X|$ applied to X = V where V is the ‘universal’ class obtained by taking some true formula A in the comprehension axiom, for example A of the form B ∨ ¬B.

⁴ ‘I consider [the Principle] as purely logical’.
⁵ ‘But nobody will succeed [in deriving a contradiction]’.
⁶ ‘It is improbable that such a structure can be erected on a faulty foundation’.

Russell’s analysis reduced the number of applications of the axiom (to derive the existence of V and the various operations involved in Cantor’s proof) to the single case where A is ¬Y ηY.
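That this single instance of the principle cannot be satisfied may be illustrated, though of course not proved in general, by brute force over small finite domains. The sketch below enumerates every binary relation η on a three-element domain and confirms that no element X satisfies ∀Y (Y η X ↔ ¬Y η Y); the function names are illustrative assumptions. The general reason is the one visible in the code: taking Y = X makes the two sides of the biconditional disagree.

```python
from itertools import product

def russell_witness_exists(domain, eta):
    """True if some X in the domain satisfies: for all Y, (Y eta X) <-> not (Y eta Y)."""
    return any(
        all(((y, x) in eta) == ((y, y) not in eta) for y in domain)
        for x in domain
    )

def check_all_relations(size):
    """Return True if no binary relation on a domain of this size has a Russell witness."""
    domain = range(size)
    pairs = list(product(domain, repeat=2))
    # Each choice of True/False per ordered pair is one candidate relation eta.
    for bits in product([False, True], repeat=len(pairs)):
        eta = {pair for pair, bit in zip(pairs, bits) if bit}
        if russell_witness_exists(domain, eta):
            return False  # a witness would contradict the diagonal argument
    return True

# 2**9 = 512 relations on a 3-element domain; none admits a witness.
no_russell_predicate = check_all_relations(3)
```

By contrast, the one-directional condition ∀Y (Y η X → ¬Y η Y) is easily satisfied, for instance by an X to which nothing stands in the relation η.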

One can put the argument a little more positively. Suppose $X_R$ is any predicate satisfying

$\forall Y\,(Y \eta X_R \to \neg Y \eta Y)$. (8.1)

Then the predicate $X_R^+$ determined by

$Y \eta X_R^+$ if and only if $Y \eta X_R$ or $Y = X_R$

also satisfies (8.1); $X_R \subset X_R^+$ but not $X_R^+ \subset X_R$, since $X_R \eta X_R^+$ but not $X_R \eta X_R$. Thus there is no X satisfying $\forall Y\,(Y \eta X \leftrightarrow \neg Y \eta Y)$. (Amusingly this is literally the proof that there is no greatest integer if one takes the successor operation $s_2$ on p. 122.)

Frege wrote to Russell ([1944] 13), ‘die Arithmetik ist ins Schwanken geraten’⁷ (meaning that Frege’s analysis of arithmetic had turned out to be shaky). There was, unquestionably, a problem here; even if Frege’s formulation was a mere oversight, the fact remained that neither he nor anybody else had a theory; just because in his presentation everything had depended on that one comprehension axiom. One had to make a fresh start.

These facts leave no doubt about the objective interest of Russell’s discovery. But, in addition, his and similar paradoxes seem to have considerable psychological interest since the reactions to paradoxes are both strong and diverse; possibly indicating, in a reliable way, striking personality differences. But existing studies, for example, Hermann [1949], are not altogether convincing. Russell’s own reactions, at least as he remembered them, seem very natural from what we know of his personality. ‘It seemed unworthy of a grown man to spend his time on such trivialities, but what was I to do?’ (A1222). ‘It was quite clear to me that I could not get on without solving the contradictions’ (A1228). He did not look for drama perhaps because he was able to get his ‘kicks’ elsewhere.

True, false, meaningless

One of the reasons why paradoxes are so disagreeable is that it is hard to locate an error, to ‘solve’ the paradox by specifying an error. In the familiar paradoxes produced by dividing by zero, a · a⁻¹ = 1 is asserted. There are two alternatives:

7 ‘Arithmetic has begun to be shaky’.

1. to require, as is common in logic, that the functions used must be defined on the whole range of the variables, in particular, 0⁻¹ must have a (numerical) value; then the correction is: a ≠ 0 → a · a⁻¹ = 1;
2. to take 0⁻¹ as undefined and simply assert a · a⁻¹ = 1 for those a for which this formula is significant.

In isolation, the example does not decide convincingly between the alternatives. Just because - in this particular case and many like it - it obviously does not matter which decision is taken, we are liable to be unprepared when it does matter.

There is a further, more specific consideration which is perhaps more persuasive in the case of predicates, such as the formula A in the comprehension principle, than in the case of functions (where, as a matter of historical fact, Frege did use a type distinction from the start, as pointed out by Gödel [1944] 147). Why should we not simply decide that a predicate P does not apply to an argument a if, by intention, a η P is meaningless? As intended, ‘the number 2 is blue’ is meaningless; but one would rather have it false than true. More importantly, this remark shows why current systems of set theory, going back to Zermelo, are formulated without explicit type distinctions, although those systems are intended, in Zermelo [1930], to apply to segments of the cumulative hierarchy of types; cf. Gödel [1944] 140.

In the cumulative hierarchy on p. 119 it is natural to say - and explicit in Russell’s doctrine of types - that x_m ∈ y_n is meaningless, if x_m is introduced at stage m (of the hierarchy) and y_n at stage n and m ≥ n, since x_m is not even a candidate for being an element of y_n. However, in current set theories x_m ∈ y_n is meaningful but false; and assertions of a logically compound structure are interpreted accordingly, for example ¬(x_m ∈ y_n) is put true.

How can this kind of convention conflict with others or with axioms? Most easily if we have an independent assertion relating such simple formulae. Specifically, in the instance of the comprehension principle used in Russell’s paradox, consider the predicate P_R determined by

Y η P_R if and only if ¬Y η Y;

in particular, P_R η P_R if and only if ¬P_R η P_R (predicates and the relation η are used in place of sets and ∈ because, as pointed out in Cantor [1885], the comprehension principle is not even remotely plausible for the latter). Suppose then that, as intended, P_R η P_R is meaningless; then so is ¬P_R η P_R. The convention above requires P_R η P_R to be put false, ¬P_R η P_R to be put true and this conflicts with P_R η P_R ↔ ¬P_R η P_R. As in some other domains of life, once we allow ourselves to parley at all, to entertain the ‘proposition’ P_R η P_R, we are seduced.

A different matter, which Russell himself described as a puzzle in [1905], had drawn his attention to (the logical interest of) meaningless expressions; specifically the sentence ‘The present king of France is bald’. Since there is no such king, the use of the definite article is improper. Here there is a relatively simple convention giving a manageable meaning to all such phrases, roughly speaking this:

If P(x) and Q(y) are formulae not containing unexplained occurrences of the definite article and ιxP(x) stands for: ‘the x which satisfies P’, then

Q[ιxP(x)] means [∃!xP(x)] ∧ ∀x[P(x) → Q(x)],

where ∃! means ‘there is a unique . . . ’.
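The convention above can be tried out on a finite domain. The following Python sketch is only an illustration under that assumption (a finite list of objects, with P and Q given as Boolean functions); the function names and the toy data are invented here and are not part of the original text.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def exists_unique(domain: Iterable[T], p: Callable[[T], bool]) -> bool:
    """∃!x P(x): exactly one element of the domain satisfies p."""
    return sum(1 for x in domain if p(x)) == 1


def holds_of_the(domain: Iterable[T],
                 p: Callable[[T], bool],
                 q: Callable[[T], bool]) -> bool:
    """Q[ιx P(x)], read as [∃!x P(x)] ∧ ∀x[P(x) → Q(x)]."""
    items = list(domain)
    return exists_unique(items, p) and all(q(x) for x in items if p(x))


# Hypothetical toy domain: nobody in it is the present king of France.
people = [
    {"name": "a", "king_of_france": False, "bald": True},
    {"name": "b", "king_of_france": False, "bald": False},
]


def is_king(x): return x["king_of_france"]
def is_bald(x): return x["bald"]
def is_not_bald(x): return not x["bald"]


king_is_bald = holds_of_the(people, is_king, is_bald)
king_is_not_bald = holds_of_the(people, is_king, is_not_bald)

# A proper description: 'the x in 0..9 with x * x = 9' is odd.
root_is_odd = holds_of_the(range(10), lambda x: x * x == 9, lambda x: x % 2 == 1)
```

On this reading both ‘the present king of France is bald’ and ‘the present king of France is not bald’ (with the negation inside Q) come out false, because the uniqueness clause fails in each case; the improper description yields a definite truth value rather than a meaningless expression.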

(At the time it was satisfying to see that such matters could be expressed succinctly in Peano’s language.) This meaning, due to Russell, was manageable in the sense that he found relatively precise and simple formal laws for expressions containing also ι-symbols.⁸ Russell’s convention may be compared to well-determined extensions of such arithmetic functions as n ↦ n! to a wider domain (Γ-function), but, as so often, the mathematical example is more interesting.

There is no guarantee that equally manageable meanings exist for such expressions as P_R η P_R which occur in the paradoxes. If not, this would be consistent with the view of many mathematicians at the time that very general logical ideas do not lend themselves to theory (except perhaps at a very advanced stage). This could be compared to, so to speak, the opposite extreme: in physics, apparently accidental facts. They are quite objective, quite striking, and therefore literally make up the bulk of the world as we see it, but do not have a simple theory; for example, the fact that (the substance having the chemical composition of) glass is transparent.

Be that as it may, the fact remains that without some definite meaning for formulae containing the expressions involved, the adequacy of the marvellous logical language on p. 117 is in doubt. The particular simple interpretation of the logical particles which leads to the familiar laws does not apply. If the logical words are reinterpreted so as to apply to meaningless expressions, the expressive power of the simple vocabulary must be expected to be very much limited. This matter is wide open. At any rate, Russell pursued a different line.

Doctrine of types and use of types

Russell was always very quick to see, and formulate memorably, the ideas which naturally cross one’s mind. He did this in [1906] in connexion with ‘patching up’ formally Frege’s system. Some of his ideas have since been pursued, with varying success, as described by Gödel [1944] 132-133. Russell himself pursued less superficial aims:

8 These laws can be made quite precise relative to given primitive or atomic predicates. Any formal language is given with a list of such predicates, but not unanalysed ordinary language. Is male or (its negation) female primitive?

To describe objects, that is, predicates and contexts involving them, that are clearly meaningful, and to state some of their properties. To give reasons for supposing that those objects and contexts are exhaustive. (If the reasons given are not convincing, one speaks of a doctrine.) The objects described may of course have uses even if they are not exhaustive.

The idea was that the predicates (considered) occur in a hierarchy, and that a context of the form P η P or Q η P is meaningless (excluded) if P is, roughly speaking, ‘involved’ in Q, the so-called vicious circle principle. Though, of course, it was important for Russell’s own research to attempt a general formulation of the idea, it cannot be expected that his wording remains satisfactory after more than half a century. Russell was too much of a pioneer for that. As Gödel explains carefully ([1944] 133-135), Russell’s formulation of the ‘vicious circle’ principle was defective.

Formulation apart, even if there is circularity in some sense, it is not clear that it is vicious. We understand perfectly well many grammatical laws which patently apply to themselves: ‘In basic English the verb follows the noun’. Again, in mathematics, we understand definitions of Dedekind cuts, which are predicates of the rationals, even if the definition contains variables over Dedekind cuts. The same applies, of course, to definitions of integers of the form:

n = 0 if A is true and n = 1 if A is false,

where A contains a variable over the integers. Of course, we have to understand ‘basic English’, ‘predicate of the rationals’ (or ‘set of rationals’) or ‘integer’.

The logical ‘essentials’ of Russell’s idea - which are related to ideas mentioned but not developed by Poincaré - are most easily understood by comparison with the cumulative hierarchy described on p. 119 - and a little difficult to follow without understanding the latter. (So this hierarchy is fundamental.) In the full cumulative hierarchy, at any stage α > 0,

V_α = ⋃_{β<α} [V_β ∪ P(V_β)],

where, as on p. 119, P is the power set operation; in particular

V_{α+1} = V_α ∪ P(V_α).
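For finite stages the successor clause V_{α+1} = V_α ∪ P(V_α) can be computed directly. The Python sketch below is only an illustration, not part of the original text: it assumes V_0 = ∅, covers finite stages only (limit stages are out of reach of such a computation), represents sets as `frozenset` values, and uses the invented helper names `powerset` and `stages`.

```python
from itertools import chain, combinations


def powerset(s: frozenset) -> frozenset:
    """P(s): the set of all subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def stages(n: int) -> list:
    """Return [V_0, ..., V_n] with V_0 = ∅ and V_{k+1} = V_k ∪ P(V_k)."""
    v = [frozenset()]
    for _ in range(n):
        v.append(v[-1] | powerset(v[-1]))
    return v


V = stages(4)
sizes = [len(stage) for stage in V]  # number of sets present at each stage

# For these stages V_k is itself included in P(V_k),
# so the union with V_k adds nothing beyond P(V_k).
included = all(stage <= powerset(stage) for stage in V[:4])
```

The stages grow very quickly (V_5 would already have 65536 elements), so the sketch stops at V_4.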

(If V_0 = ∅, each V_α is included in P(V_α), whence the simpler form on p. 119.) Speaking now of predicates instead of sets, Russell’s idea⁹ is realized if one passes from the predicates accumulated at stage α, say L_α, to the collection of only those predicates P_F which are defined by formulae F with all variables restricted to L_α; here F is in the logical language considered (with additional symbols for the predicates in L_α) and

X η P_F if and only if F(X) is true and X η L_α.

9 The type structure of Principia is not cumulative, and therefore formally quite different from the modern presentation which is used here.

This hierarchy is more difficult to understand than the (full) cumulative hierarchy because, for domains familiar in mathematics such as the set of integers, the idea of arbitrary subset (of the domain) is easier to think about than the idea of subset defined in some particular language. But for Russell it was perhaps natural to avoid the power set operation which led (him) originally to his paradox when he thought of applying this operation brutally to the ‘domain’ V of all things (p. 123); ‘brutally’ since, in terms of Cantor’s explanation (p. 119), every subvariety of a familiar domain (or set, grasped as a unity) is a set, but of course, not every subvariety of V. The explanation of the predicates P_n on p. 120 takes account of this difference; thus if P_0 is the predicate of ‘being a set’, P_n = P_0 for n = 1, 2, ...

It may be difficult, though surely not impossible, to establish conclusively the rôle, in later research, of Russell’s original, rather complicated, formulation. One reason for the difficulty is this: in Principia, the simple idea of the hierarchy is mixed up with the so-called reducibility axiom. This was introduced to derive formally the familiar properties of natural and real numbers as defined in Principia; but the ‘axiom’ is just not true for the hierarchy here described (and Russell’s lack of precision about the notion of ‘predicate’, its linguistic or abstract character mentioned on p. 120, was natural: he was thrashing about for a notion which satisfied the axiom). Gödel discovered a version, described in [1944] 147, which makes sense of this axiom.

He first extended the hierarchy L_α to transfinite α, and then showed, for example, that all subsets of L_ω, definable in any L_α, have already definitions in L_{ω_1} where ω_1 is the first uncountable ordinal. This is just what is asserted by the reducibility axiom applied to subsets of L_ω, if L_{ω_1} is taken to be the next stage of the hierarchy after L_ω. Roughly speaking, the axiom holds for the hierarchy L_α, if not all ordinals, but only cardinals α are considered.

Gödel’s discovery is particularly relevant to a comparison between intensions and extensions, which concerned Russell a great deal ([1959] 87-88). Specifically, new intensions (that is, definitions) of predicates of L_ω appear at arbitrarily high L_α; but, for α ≥ ω_1, no new extensions ⊂ L_ω, that is, no new subsets of L_ω. The ‘totality’ of all such intensions is far bigger than that of the corresponding extensions.

Interestingly enough, Gödel’s formulation of his results refers to axiomatic theories and their consistency properties, not simply to, so to speak, objective properties of the sets definable in the L_α’s. For example, he speaks of consequences C of the ‘axiom’ that all sets are definable in the L_α’s and of its consistency relative to usual axiomatic set theory. But where C is stated to be such a consequence, often a stronger result is proved: C is true for the hierarchy of the L_α even if some sets (of the full cumulative hierarchy) do not occur in any L_α.¹⁰ Gödel’s language comes from Hilbert’s ‘rival’ foundational scheme, from ‘without’, to use Russell’s words quoted on p. 110 - and is necessary if the rival scheme is accepted.

Principia Mathematica and the axiomatization of mathematical practice

As so often at an early stage of research, even the criticisms of Principia were overoptimistic. The defects talked about most can in fact be corrected fairly easily (though, even now, these corrections are not as well known as the defects). The real difficulties were hardly discussed.

The first principal defect, particularly for the mathematician, was the complexity of the language of Principia with its many types. Also the description of the syntax was not formally perfect; less so than Frege’s as was pointed out in Gödel [1944] 126 - but of course, for this very reason, known to be corrigible. Ordinary mathematics provides many lessons for introducing a ‘global’ theory of objects of different types, which are, originally, arranged in stages; for example, of all those points in the Euclidean plane which are obtained from points with rational coordinates by means of n compass constructions. (Incidentally, the relation between this, Pythagorean, plane and the full Euclidean plane illustrates quite well many relations between the hierarchy of Principia and the full cumulative hierarchy.)

The second principal defect, particularly for the logician, was the unsatisfactory status of the reducibility axiom which, together with a few others, took the place of the comprehension principle on p. 123. A better formal correction was proposed by Zermelo at about the time of Principia, but after Russell’s fundamental article [1906], and clearly interpreted in Zermelo [1930] by reference to the cumulative hierarchy. For definite properties P, not meaningless ones as on p. 125,

∀a∃x∀y[y ∈ x ↔ (y ∈ a ∧ y η P)];

the side condition y ∈ a expresses explicitly the difference between properties of ordinary life and in mathematics (p. 120): x is a set of a’s. When one has type distinctions, the condition y ∈ a is not needed because the type τ of the variable y automatically limits the range of y (to the class, say a_τ, of objects of type τ). Enough properties P could be recognized to be definite to develop familiar mathematics from Zermelo’s axioms (and modern extensions).
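The effect of the side condition y ∈ a can be seen on hereditarily finite sets. The Python sketch below is only an illustration and not part of the original text: sets are modelled as `frozenset` values, properties P as Boolean functions, and the names `separate` and `russell` are invented here.

```python
def separate(a: frozenset, p) -> frozenset:
    """The x with: y ∈ x ↔ (y ∈ a ∧ P(y)), for a given set a."""
    return frozenset(y for y in a if p(y))


# A few hereditarily finite sets built up from the empty set.
empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
a = frozenset({empty, one, two})


def russell(y: frozenset) -> bool:
    """The property ¬(y ∈ y) used in Russell's paradox."""
    return y not in y


# Restricted to a, the Russell property just picks out a subset x of a.
x = separate(a, russell)

# No contradiction results; instead x turns out not to be an element of a.
x_in_a = x in a

# Another definite property: 'y has exactly one element'.
singletons = separate(a, lambda y: len(y) == 1)
```

With the restriction to a, the instance that produced the paradox only shows that the separated set x lies outside a. In this toy model no set is an element of itself, so x coincides with a.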

10 For example, C may be the generalized continuum hypothesis; or the negation of Souslin’s hypothesis by Jensen [1972]. In contrast we do not know whether Cantor’s continuum hypothesis or Souslin’s hypothesis is true for the full cumulative hierarchy; nor, equivalently, for V_{ω+ω} on p. 120. In other words our present knowledge of sets is much more efficient when applied to the L_α’s than when applied to the V_α’s (in terms of which the L_α’s are defined).

The real difficulties are closely connected with the view expressed in Principia, p. v: ‘. . . The chief reason in favour of any theory on the principles of mathematics must always be inductive, i.e. it must lie in the fact that the theory in question enables us to deduce ordinary mathematics’. Principia was the result. Instead of trying to analyse what is meant by logical validity and to prove that the rules of the calculus generate all logically valid formulae (in the language considered), Principia deduces a lot of such formulae. Instead of looking for some global features of ordinary mathematical concepts, for example, that any proposition about some specific objects such as the natural numbers is either true or false, and comparing these features with the formal properties of the calculus, Principia deduces a lot of arithmetic propositions. In short, Principia contains no metamathematical theory of its system, no criticism from ‘without’ (p. 110).

Gödel’s completeness theorem for predicate logic and his incompleteness theory, for Principia and ‘related systems’, w.r.t. quite simple arithmetic assertions are perfect examples of successful metamathematics. To be precise, his original formally undecided assertions had simple metamathematical content (consistency); nowadays, we have undecided assertions of relatively simple number theoretic content, of the form: a diophantine equation in 14 variables, with prescribed integral coefficients, has a solution. But the metamathematical assertions were easier to discover.

The incompleteness theorem is by no means in conflict with the inductive evidence! There are plenty of unsolved problems in number theory. But it raises problems which one had hoped to avoid. What is a correct axiomatization or definition of a mathematical concept? As long as one believes that all true propositions (in the language considered) are formally derivable, the question above may be bypassed. Certainly often formally different definitions, say P and Q, are proposed. A minimum requirement is that both P and Q are satisfied by the same objects x. So ∀x(P ↔ Q) must be true. If this equivalence is also formally derivable, every assertion derivable for P is also derivable for Q - and in this sense it does not matter which of the proposed definitions is chosen.

A more sophisticated complication is very familiar from the current mathematical practice of ‘enriching the structure’, which applies here as follows. Suppose we wish to axiomatize the concept of set in one-one correspondence with the integers. Shall we define it as a set X with the side condition that there exists an enumeration? Or shall we take a pair (X, F), where F is a mapping of ω onto X, that is, with the side condition ∀x[x ∈ X ↔ ∃n(Fn = x)], so to speak enumerated sets? Or even triples (X, F, G) satisfying

∀x(x ∈ X ↔ FGx = x),

where G takes values in ω? A moment’s reflection shows that different classes of X, so-called retracts, are obtained without explicit appeal to choice principles.

(The list of examples can be continued indefinitely.) Evidently such complications would be minor if all true propositions could be formally derived, that is, if not all (sound) formal systems were incomplete.

In connexion with completeness, that is, the requirement that all true statements in the language considered should be formally derivable, the introduction of ‘ugly’ types is most natural: ‘global’ systems are obviously incomplete, for example, Zermelo’s (without the addition of the so-called replacement axiom); one can express that there is a set of type ω + ω but cannot derive this assertion since the segment of the cumulative hierarchy below ω + ω satisfies the axioms and of course contains no set of type ω + ω.

All this does not discredit the idea of a correct axiomatization - or, as Russell prefers to put it, analysis - of ordinary mathematical concepts; nor even an appeal to inductive evidence. It only means that the use of such evidence would have to be a great deal subtler than had been thought (by logicians). The same applies, mutatis mutandis, to the choice of fundamental system itself, the most simple minded criterion, of completeness, being demonstrably inapplicable since no formal system is complete (for the usual language of arithmetic). Indeed, the sensitivity mentioned above to details of the axiomatization provides a rational means for finding a - or even the - correct axiomatization.

If one thinks of modern axiomatic mathematics, the hierarchy of Principia - or rather, as always, its modern version L_α mentioned on p. 127 - seems to have potential interest for mathematics, in particular for the study of the full cumulative hierarchy. In modern mathematics one uses knowledge of the rationals and of other formally real fields to study the reals and, possibly, facts suggested by looking at the Pythagorean plane (p. 129) to study the Euclidean plane. So one asks: Do we gain anything by examining those properties (in a suitable language) which are not only true in some long specific segment of the cumulative hierarchy, but also for many L_α, with relatively small (countable) α?; gain either because these L_α have an independent interest or because proofs which are valid for all these L_α provide a better analysis of the nature of the theorem proved. There is a good deal of work in this area, partly under the name of ‘generalized recursion theory’. Though it has not yet provided a conclusive answer to the question above, this study of the L_α’s, the direct descendants of Russell’s hierarchy, still seems the best way of understanding higher set theoretical principles at all properly and so of using them effectively in mathematical practice. However natural the V_α’s may be, we do not know enough about them for practical use; cf. the footnote on p. 129.

Finally and quite generally, as far as the analysis of mathematical practice is concerned, it should be noted that Principia is much more detailed about the analysis of mathematical concepts than of mathematical proofs or, more generally, of processes. For example, there is really no machinery in Principia for formulating relations between proofs expressed by different formal deductions (from given axioms) of, say, the same formula.

Principia Mathematica: a parenthesis in the refutation of Kant

According to [1959] 75, this is how Russell himself thought of Principia initially.
(Considering the space Kant takes up to make his points about mathematics, the parenthesis is not very long.) Specifically Kant was led to connect the validity of mathematical assertions with the properties of our combinatorial, spatio-temporal imagination (Anschauung). His examples came almost wholly from traditional geometry; he neglected the massive work, at his time, on algebra and the calculus, let alone attempts at non-Euclidean geometry. (This neglect annoyed Cantor who, to Russell’s delight in A1335 and again in [1959] 75, called Kant ‘yonder sophistical philistine who knew so little mathematics’.) As always when psychological aspects are stressed, Kant’s view suggested that mathematical experience did not lend itself to theoretical analysis.

Cantor’s set theory made Kant’s view extremely dubious (unless one regarded assertions about infinite sets as a mere façon de parler). Principia and work building on it certainly refuted the suggested implications: put quite conservatively, such work showed firstly, how the logical properties of mathematical concepts could be built up systematically from a few primitives, and secondly, how much of pure mathematics was conserved if one confines oneself to these logical properties. Perhaps it should be added that not only Kant’s personal convictions were refuted, but what most people would have said too; even a century after Kant the expressive power of a few primitives came as a surprise.

There are at least two respects in which, at the present time, Russell’s view of the refutation would be qualified.

The first qualification is minor. We should not stress as much as Russell the logical (rather than mathematical) character of the primitive notions. Put more formally, no very significant assertions about the logical notion of predicate and the relation η (p. 120) are known. Of course, a formalism for this primitive notion can be set up and the property, say C (of predicates P),

P is built up in the cumulative hierarchy from the empty predicate

can be defined in the language (using the device, on p. 120, for associating a predicate to each set). Then we merely assert that the predicates in C satisfy the known facts about the cumulative hierarchy. But this procedure would not introduce any specifically logical properties which distinguish η from ∈. If the assertions mentioned are the only axioms, we can consistently add: (∀P)C(P).

Here, perhaps more than elsewhere, it is slightly easier to formulate the facts if one confines oneself to the hierarchy built up from the empty set, instead of starting with some indefinite variety of objects which are not sets or predicates. (The restriction is not arbitrary: specific mathematical structures have isomorphic copies in that hierarchy, copies which can be defined in the usual set theoretic language.)

The second qualification is more serious though it applies mutatis mutandis to any foundational scheme. It concerns the passage from a familiar concept, for example, of natural number, to its logical (or set-theoretical) analysis. Russell put the matter very clearly in his discussion of definitions on pp. 11-12 of Principia: ‘when what is defined is (as often occurs) something already familiar, . . . the definition contains an analysis of a common idea . . . Cantor’s definition of the continuum illustrates this: . . . what he is defining is the object which has the properties commonly associated with the word continuum.’ Evidently, this passage or analysis is not part of the set theoretic development. The passage may be compared to the analysis of familiar properties such as colours in terms of physical theory.

Views may well differ on what is lost, at the present stage of knowledge, by excluding the passage in question from an analysis of mathematics. It cannot be denied that the passage involves a different kind of reasoning from logical deductions, but after all, it is made by mathematicians. Realistically speaking there is as much certainty (agreement) on such analyses, for example on the definition of the length of a curve, as on the correct result of a long formal computation. More remarkably still, at least to the outsider, there is often agreement on the proper ‘enrichment of structures’ (p. 130).

8.3 Philosophy, Pedagogy, Literature

Scope of philosophy

It is a commonplace that the aims, the proper ‘meaning’, of a study will change as we learn more about the objects studied. Around 1912 when Russell wrote The Problems of Philosophy, he formulated his views on the proper scope of philosophy, distinguishing two broad areas.

One was philosophic contemplation which ‘views the whole impartially’ and thus achieves an ‘enlargement of the Self’ ([1912] 245). In less exalted language this involves taking a broad view, not forgetting other sides of a question, generally giving matters a second thought rather than simply ‘forging ahead’. It should not be assumed, as Musil [1930] warns in §72 of Book II, Part 1, that philosophic contemplation will necessarily help in business, war or science.

The other broad area, which we shall call scientific philosophy, is described as follows ([1912] 239):

‘Philosophy, like all other studies, aims primarily at knowledge . . . the kind of knowledge which gives unity and system to the body of the sciences, and the kind which results from a critical examination of the grounds of our convictions.’

The description is clear enough though it cannot be assumed that the common part of the various sciences (that which gives ‘unity’) will necessarily be particularly useful or interesting. As is to be expected of a science of philosophy, it is not so easy to be precise about the features which distinguish it from ‘ordinary’ sciences. All the more since, as a matter of historical fact (stressed by Russell, [1912] 240), many present sciences were formerly included in philosophy and some were created to answer such typically philosophical questions as: What is matter? Russell’s dicta concerning those characteristic features vary. But the one he stressed most consistently is the uncertainty of philosophy, with different emphasis in different contexts.

Perhaps the most positive formulation occurs in [1945] xiv. Philosophy is to teach us ‘how to live without certainty’. In the context (loc. cit.) Russell is preoccupied with the unfounded certainty of dogmatic theology; in the tradition of Kant, Russell looks to philosophy as a bulwark against the temptations of theology. Those of us who do not feel tempted will naturally interpret Russell’s formulation more broadly and look to philosophy for help in the ‘face of uncertainty’.

There are two extremes corresponding to the two parts in Russell’s description of scientific philosophy. First, the passage from genuine uncertainty to moderate certainty, at the very beginning of scientific research when one really knows nothing about the nature of the objects considered (or the ‘kind’ of answers to be expected); and secondly the passage from theoretical uncertainty, that is, practical certainty, to some ideal certainty, which is the business of so-called critical philosophy; the aim of ‘securing’ mathematical practice, mentioned on p. 116, belongs here.

The next subsection concerns the first passage. The idea that there may be a distinct subject here is not discredited by ordinary experience of scientific work. There is a recognizable difference in the flavour of the kinds of arguments used for starting a science (in the ‘face of uncertainty’) and for developing (or finishing) it. This difference is particularly striking in the case of the so-called fundamental sciences which study the very large or the very small: here one could hardly start by following Francis Bacon’s recipe for collecting data. In short, the arguments used within our ordinary sciences are not altogether homogeneous. Looked at this way, the first branch of scientific philosophy (in Russell’s formulation in [1912] 239) would require us to separate different kinds of argument already used within science, and to see whether the arguments with a philosophical ‘flavour’ lend themselves to a systematic development. The separation of pure mathematics, as a distinct study, from its uses in scientific arguments would be a concrete example of the project considered.

Uncertainty and generality

Even if the laws to be used in the ‘face of uncertainty’ are themselves perfectly certain they would not necessarily provide a panacea. There would be uncertainty about how to apply them in a practical situation and, more importantly, whether the application is of practical interest. So the search for such laws is not absurdly utopian. Perhaps the most naive idea for finding such laws is this:

If the laws are to be used in the ‘face of uncertainty’ they should concern arbitrary objects. The more general the objects considered the better the chance that laws about them can be applied without knowing the nature of the objects.

Evidently, the existence of valid general laws is not in doubt: we do not dispute that p implies p, nor other logical laws. What is needed, in Russell’s words ([1919] 2), is a genuine ‘enlargement of our logical powers’ by means of laws which are (generally valid and) at least sometimes useful. Russell repeatedly discusses how one sets about finding such laws. A favourite of his ([1903] 3) was ‘a precise analysis of . . . the ordinary employment [of] a common word’. But unlike many of his would-be followers, he was less concerned with the so to speak literary aim of a faithful analysis than with scientifically fruitful analyses.11 Russell unquestionably understated the progress made in the branch of sci- entific philosophy which searches for general laws, progress very much based on his own work. Practically speaking he probably simply did not know the details (though he seemed pleased when he heard about developments, for example, in [1959] 101 and also in the conversation mentioned on p. 114). But there was also a theoretical obstacle; his stress on the identification of logic and mathematics. This suggests that there is no ‘qualitative’ difference between ordinary mathe- matics, correctly interpreted, and modern logic. Many mathematicians have a different impression, and distinguish between different parts (in developments) of Russell’s work described in Section 2. Set theoretic structures and operations are seen to be natural extensions of familiar mathematics, less so results involving (logical) languages, in particular, of first order predicate logic. Such results have been applied within mathematics; partly in a routine fashion to formulate ex- plicitly the general character of some ‘easy’ facts, partly with great imagination to solve old conjectures. Nobody questions the validity of these applications of logic. But inasmuch as they are ‘qualitatively’ different from ordinary mathe- matics, they constitute a ‘qualitative’ enlargement of our logical powers. 
They certainly help us sometimes to make a routine start in mathematics: to prove trivial things trivially (a, if not the, most useful result of having sound general conceptions). The search for an enlargement of our logical powers recalls the heroic days of Descartes [1637] or Leibniz, as quoted for example, in [1900] 169−170, 283−284. Both said - and surely believed - that their general rules of thought had led (them) to their remarkable scientific discoveries. These claims may well be true, but they are difficult to judge - as it is difficult to judge the actual role of general principles which (rich) businessmen believe to have led to their fortunes, as in Getty [1963]; or those to which (healthy) centenarians attribute their vigour. The principles of first order predicate logic are easier to use; one need neither ‘sincerely wish to be rich’ nor have the iron constitution required to digest the health food which brings longevity.

The reader of Section 2 will have noticed another, quite typical, heuristic value, for ordinary science (mathematics), of studying (mathematical) concepts with a philosophical ‘flavour’. There is nothing particularly ‘philosophical’ about solving diophantine equations; but the concepts needed for establishing negative results were discovered, by Gödel, in connexion with logical problems (p. 129 ff.).

11The reader of the subsection on p. 129 will recall the hazards of such an (inductive) analysis without safeguards against systematic errors and omissions.

Critical philosophy and Occam’s razor

We now come to the second branch of scientific philosophy (in the sense of p. 133) about which Russell says ([1912] 233):

‘The essential characteristic of philosophy, which makes it a study distinct from science, is criticism . . . it searches out any inconsistencies there may be in [the] principles [used], and it only accepts them when, as the result of a critical enquiry, no reason for rejecting them has appeared.’

The formulation is not perfect. To be consistent with Russell’s description of the scope of philosophy, one would have to assume that this kind of criticism is likely to give ‘unity and system to the body of sciences’. And, in any case, every (respectable) science rejects inconsistent principles. (Contrary to his memory in later years, for example [1959] 11, at the time of Principia, Russell was not preoccupied with this kind of critical philosophy, at least as far as mathematical knowledge is concerned. This is clear from the quotation on p. 116.) The formulation of [1912] 213 is improved by Russell’s description, on p. 13 of [1959], of the principal tool to be used in critical philosophy, Occam’s razor to which he ‘had become devoted’:

‘One was not obliged to deny the existence of the entities with which one dispensed, but one was enabled to abstain from ascertaining it. This had the advantage of diminishing the assumptions required for the interpretation of whatever branch of knowledge was in question.’

As is well known, critical philosophy goes farther in practice: having shown that it was possible to dispense with certain entities for the interpretation of (existing) knowledge, it assumes that it is permanently desirable or even necessary to do so. Incidentally, Russell realized that it was not easy to be precise about meaningful uses of Occam’s razor, about choosing the particular assumptions to which Occam’s razor should be applied, but oversimplified matters in Principia p. 91, 1.5; also earlier ([1903] 15) and later ([1959] 71). He looked for a reduction in the number of assumptions or of undefined terms, as if one could not be as wrong about one unfamiliar or complicated thing, such as set, as about twenty familiar ones such as numbers, points, etc.

Occam’s razor is also applied in ordinary science, but with fundamentally different aims (two of which will be mentioned below). They too concern the analysis of existing knowledge, but they are not negative, not essentially critical. First of all, Occam’s razor provides a tool for action in the ‘face of uncertainty’, complementary to the use of laws about arbitrary entities on p. 135. Instead we see how far we can get without assuming anything about certain entities which we don’t happen to understand. Here Occam’s razor is generally of temporary heuristic use. In any case ordinary scientific knowledge is not static; as a given branch of knowledge develops, the entities in question may become ‘indispensable’. In fact more is true: unless there is independent reason for suspicion of the entities dispensed with, one will try to extend existing knowledge in such a way that additional assumptions are required for its interpretation! just as the decision between (sensible) rival theories is rarely made by use of existing knowledge: one has to invent a novel experimentum crucis.
In other words, in ordinary scientific practice one will often conclude from an application of Occam’s razor that existing knowledge is inadequate for studying the entities in question, and no more.

A second, also positive use of Occam’s razor has proved fruitful so to speak at the opposite extreme when a branch of science is in a very advanced state: the application of the axiomatic method, especially in mathematics. Having established a result about a specific mathematical object, say, the real numbers, one examines which properties of the object are used in the proof and ‘dispenses’ with its other properties which are superfluous for the given result. But the reason for doing this is not that those other properties are any more dubious. (The reasons vary; from genuinely useful generalizations to what might be described as an analysis of the nature of the specific result.) However, it should be remembered that critical philosophy played an important heuristic role for modern axiomatic mathematics, even on a technical level! One of the early results which were most useful for its developments was the algebraic treatment of polynomials in Sturm [1835]. His express purpose was to avoid continuity considerations which he believed to involve infinitesimals, and wanted to dispense with the latter; a perfect example for Russell’s views (p. 136) on Occam’s razor. Sturm’s treatment survives; the aim does not, since Cauchy made continuity considerations independent of infinitesimals (and, besides, Sturm’s work applies to non-archimedean fields too, that is, fields which do have infinitesimal elements). The reader of Section 2 may wish to pursue the following parallel between Sturm’s work and Principia. Inasmuch as Principia served as a parenthesis in the refutation of Kant (p. 132), its axiomatization also fits Russell’s description of Occam’s razor; one ‘dispensed’ with Kant’s acts of intuition.
(And as Russell could have said, one was not obliged to deny that such acts occur in actual mathematical reasoning.) The next step in the parallel is to compare Cauchy’s analysis of continuity to Zermelo’s analysis of the cumulative hierarchy and Gödel’s of the reducibility axiom (described in Section 2). Possible uses of the hierarchy of Principia in generalized recursion theory, mentioned on p. 131, correspond to standard practice of modern axiomatic mathematics. To sum up; so far critical philosophy seems to have had considerable heuristic, but less permanent, value. Particularly as far as ordinary knowledge is concerned, views will differ on the interest of the following result of critical philosophy ([1912] 233, 234): ‘as regards what would be commonly accepted as knowledge . . . we have seldom found reason to reject such knowledge as the result of our criticism’. Paranoids ought to be reassured. Indeed the methods of argument in critical philosophy may even have clinical value since paranoids are notoriously sensitive to logical rigour.12 Less speculatively, critical philosophy may lead to more interesting results at the frontier of knowledge; as suggested by the example of Frege’s analysis of thought on p. 118 and, particularly, by Einstein’s success, on p. 139. The questions of critical - and other traditional - philosophy would not be ‘deep’ if we really wanted to answer them in the context where they occur to us; they are exciting (and difficult to judge) because they draw attention to a new kind of study which is not forced on us by familiar experience.

From what we know to how we know

In the preceding two sections the stress was on objects (and not primarily on our assertions about them); on extending the class of objects considered to get general laws, and dispensing with entities by means of Occam’s razor. This respects Russell’s manifesto in [1959] 16:

‘I reverse the process which has been common in philosophy since Kant, [which was] . . . to begin with how we know and proceed afterwards to what we know. I think this a mistake, because knowing how we know is one small department of knowing what we know.’

This type of mistake is familiar from popular positivist philosophy which proposes to begin with methods of measurement (as means of knowing) while, in point of fact, one has to examine whether proposed methods are correct; whether they measure what we want to know about and not artifacts. Russell’s work in logic, as was stressed at length in Section 2, certainly fits in with his manifesto, and so does the work on the atomic structure of matter with which the logical analysis of mathematics was compared there. It should be added that in these cases one hardly goes on at all to the second stage, of analysing realistically the processes which bring us knowledge. It should not be assumed that this last omission is a mere accident. Observation of the facts of our intellectual experience shows that we can often be more sure of what we know than of how we (come to) know it; however strange this may seem on the so-called empiricist or any similarly simple-minded conception of our intellectual apparatus (wie sich der kleine Moritz das Denken vorstellt). Russell, according to [1959] 13 − 14, did want to proceed to the second stage (how we know) - though, apparently, not in the case of mathematical or logical knowledge - ‘. . . in the years from 1910 to 1914, I became interested not only in what the physical world is, but in how we come to know it’. But, to quote Einstein’s remark in [1944] 290 concerning Russell’s theory of knowledge, the latter was infected by a touch of bad intellectual conscience. This came about because Russell looked for an empiricist analysis in the style of Hume, where our concepts are ‘reduced’ to, that is, obtained inductively from, sense experience. (In the case of mathematical knowledge - more or less sophisticated - versions of this style of empiricist analysis are required by the doctrines of formalism and finitism mentioned on p. 116.) Russell, reluctantly, came to the conclusion that many of the concepts we constantly use could not be so reduced, and remained troubled about them - in contrast to Einstein.
It should perhaps be added that Einstein makes these concepts a little too mysterious, in [1944] 286, by insisting that they are free creations of our minds. The concepts would be less mysterious if, practically speaking, we had no choice: there would be little to distinguish them from what (we believe) is externally given. When looking back ([1959] 12) at his change in interest, from what we know to how we know, Russell connected it with his having ‘done all that [he] intended to do as regarded pure mathematics’. But the change is also in keeping with one of the great events of the first decade of this century, Einstein’s special theory of relativity. Einstein’s own presentation did begin with an analysis - in accordance with [1959] 16 quoted above - of how we know (simultaneity), for the critical purpose of searching out serious inconsistencies in the accepted principles of determining simultaneity; at least, when objects move at high speed. From that analysis one could then proceed to ask what we know (not space, nor time, but space-time). Incidentally, Einstein’s reservations mentioned above about the positivist or empiricist theory of knowledge were not due to ignorance of its - occasional - advantages; on the contrary, Einstein was initially strongly attracted by the positivist views of Mach, but abandoned them after closer reflection. Russell’s later ideas on how we come to know the physical world (as summarized in Chapter II of [1959]), refer quite explicitly to the physiological apparatus of perception - in contrast to Einstein’s exposition of his, perhaps, singular success with critical philosophy. Views differ on whether the properties of this apparatus are essential for Russell’s aim. But if they are, the time is hardly ripe for pursuing this aim.
Judging whether the time is ripe for a problem is of course essential for all research, but especially in philosophy; perhaps not surprisingly because the questions occur to us when we know very little. With wit and literary skill something of interest can no doubt be said about them at any stage; but at a given stage there is often - demonstrably - nothing significant or conclusive to be done. The physicist will here think of such a natural question as: What is matter (made of)? at the time of Galileo or Newton, who both had brilliant arguments for some kind of a micro-structure. The mathematician will think of questions about the need for abstract ideas in mathematics when all we know is numerical arithmetic or even elementary geometry.

12Some are certainly capable of conviction by the right kind of argument: one of the three Christs of Ypsilanti in Rokeach [1964] was convinced by the others that he was mistaken.

Pedagogy in the large

We now turn to two social changes which are, at least formally, connected with Russell’s writings - on logic, not on war and peace. These changes, which literally affect the lives of many of us, the way we spend our time, come from the introduction of the New Maths and from the use of computers, particularly for so-called non-numerical computation. The connexions are clear. Principia is about logic and classes (sets), and claims that they provide the means for the correct analysis of other mathematics. The New Maths teaches children about these things before teaching them sums and what Russell called ‘childhood’s enemy’ ([1903] 90). Principia expresses a great number of sophisticated mathematical concepts in terms of those logical primitives in a formally precise, that is, purely mechanical manner. The effective use of computers patently depends on the possibility of this kind of mechanization.

The actual, causal, role of Russell’s writings in bringing about these social changes is less easy to establish, and perhaps not even important for understanding social history, for example, if good ideas may be expected to be ‘in the air’ (a view which could be connected with rushing into print). Be that as it may, there are some outstanding qualities of Russell’s writings which have surely influenced the details of those social changes. First, as Einstein stressed in [1944] 278, there are few scientific writers and certainly no contemporary logicians who catch the reader’s imagination so vividly and so agreeably as Russell; incidentally, not only by form but also by content. Specifically, though the earlier work of Frege [1884] and [1893] is, in some respects, more satisfactory to the professional logician, the fact remains that Frege wrote about arithmetic, about the question: What is a natural or real number? while Russell asked: What is mathematics? (By p. 118, even formally Frege’s caution was hardly justified, at least on present knowledge.)
The bold sweep of Russell’s presentation may well have impressed some educational reformers who worked for the New Maths, at least in Anglo-Saxon countries. After all, educational reforms can hardly be based wholly on empirical evidence since their effects are difficult to trace, let alone to predict. So it ought to be a comfort to know that one should begin with logic because, as Russell taught, mathematics is logic. As suggested above, this factor influenced details and the style of the reform. But also the principal factor involved is not unrelated to Russell: the success within mathematics (in establishing mathematical facts and mathematical fashion) of the logical and set theoretic notions first treated systematically in Principia. Secondly, in connection with the exploitation of computers, which require an artificial language, Russell’s writings must have helped by establishing confidence in the expressive power of such languages. Before him there was little practical evidence for this power (or, at least, not widely known). Of course there were highly mechanized languages, for example, for military orders, but these were not thought of as a sophisticated business. And there were theoretical arguments against artificial languages, for example, the dispute with Condillac described in Maistre [1821] - with interesting political overtones which will come up again on p. 144. In short, it is suggested that Russell’s writings established a favourable intellectual climate for the use of computers. Naturally, it is more difficult to be sure about effects on any particular scientist such as von Neumann, perhaps the principal pioneer of computer science; his early work concerned a theory related to Russell’s [1906] and, of course, Principia.
Here again it is suggested that Russell’s writings affected the details in the development of computer science; but, in this case, not the principal factor (which is, presumably, the technological advance in electronics). Russell himself does not seem to have written about the two social changes here considered. He certainly would not have liked all aspects of these changes. Perhaps he would have had - to use his own words about The conquest of happiness (A2228) - ‘commonsense advice’ for atomic scientists, divided between the (moral) agony of responsibility and a (wicked) sense of power; of having been responsible for something sufficiently powerful to merit the agony.

Pedagogy in the small

Russell’s writings, particularly on logic, are distinguished also by another, apparently quite intimate, didactic quality. A host of memorable metaphors, aphorisms and analogies relieves the reader of all anxiety about the knowledge he has and about the work to come. Russell formulates the questions that occur to the reader at the moment when they arise; even the right irrelevant ones, confusions. In short, Russell has great understanding of the feelings we have about our scientific knowledge. Descriptions of these, as of other feelings are liable to be tedious and banal. Russell avoids this because he always sees the, so to speak, universal element in these feelings - like such novelists as Musil or Solzhenitsyn. Incidentally, the quality under discussion of Russell’s scientific style is relevant to a principal gap between science and current literature: the characters of most novels go through life without any thoughts or feelings about (their) scientific knowledge, in conflict presumably with the actual experience of those readers who happen to be scientists. (Evidently such a literary treatment of our feelings about knowledge may be satisfying at a stage when, objectively, there is nothing significant to be done; in contrast to the view about scientific philosophy on p. 139.) Russell himself, quite explicitly, attached great importance to the universal element mentioned above. As early as 1900, he devoted §1 of [1900] to ‘reasons why Leibniz never wrote a magnum opus’, concluding that Leibniz did not use sufficiently impersonal, universal arguments because he was too much concerned with persuasion of some particular person; and quite specifically, according to p. vi of the second edition of [1900], of ‘Princes and (even more) Princesses’. Some of us may have doubts about the validity of Russell’s explanation; but we can have no doubt about his view on such matters.
Actually, a good deal of Russell’s pedagogic skill is used on subjects which nowadays, when more facts are known, are easy to present; it is not necessary to spend nearly 20 pages on the notion of order ([1903] 199 − 217). The skill would, presumably, be more essential for a task which Russell often mentioned as a principal job of philosophy: the discussion of indefinables, as he called them in [1903] xv and [1959] 81, or, better, of primitive ideas (Principia, p. 91):

‘primitive ideas are explained by means of descriptions intended to point out to the reader what is meant; but the explanations do not constitute definitions, because they really involve the ideas they explain.’13

More precisely ([1959] 81), the reader’s attention is to be drawn to the entities concerned (as in pointing a finger at an object in front of us). One way of doing this is to state striking formal properties of these entities, so-called axioms which can often be relied upon to indicate the entities (and can be misunderstood, like pointing). Presumably, pedagogic skills like Russell’s can sometimes be used more effectively than those axioms for conveying primitive ideas. The ‘intimate’ pedagogy here described seems to be similar to the other, non-scientific, area of philosophy which was mentioned at the beginning of Section 3. It is perhaps of interest to go briefly into Russell’s views on these matters at different times.

13Of course, ‘description’ is not used in the technical sense of the ι-symbol on p. 66 or p. 172 of Principia and ‘definitions’ is not used in the informal sense of analysis of a familiar notion, top of p. 12.

Philosophic contemplation

On the face of things Russell changed his mind about the efficacy of philosophic contemplation. When he was forty, he was most positive ([1912] 243 − 250); twenty years later (A2230 and again in [1959] 20) he said: ‘the “consolations of philosophy” are not for me’. On closer inspection the conflict disappears. In [1912] he meant by ‘contemplation’: reflection on unfamiliar possibilities ([1912] 243), viewing ‘the whole impartially’ ([1912] 245). Later he meant: gazing at the stars or admiring the universe for its size, and found that these activities did not console him. Views differ on whether one should look for consolation at all or be an activist; also on the extent to which philosophic contemplation presupposes a suitable temperament. But there is no doubt that the later Russell hardly ever engaged in philosophic contemplation in the sense of [1912]. The two examples below are chosen because of their connection with the work described in Section 2.
The corner stones of Russell’s logical theory were: first, his warnings about the ‘irrelevant notion of mind’ ([1903] 4) or an ‘undue admixture of psychology’ ([1903] 53) - although of course, naïvely (whence the need for the warnings), the mental phenomena of logical activity strike us first. Secondly, the famous analogy of ([1919] 2):

‘Just as the easiest bodies to see are those that are neither very near nor very far, neither very small nor very great, so the easiest conceptions to grasp are those that are neither very complex nor very simple.’

In the context ‘simple’ is used in a logical sense; but the remark applies also to space or matter when ‘simple’ is used in a physical sense; evidently it is intended generally as an introduction to ‘high’ theory. Both these points are in sharp contrast to the style of his analyses of social and political problems, for example, in [1951] concerning politically important motives. Certainly, there is no guarantee that these problems lend themselves to high theory at all. Even so, Russell’s analyses do nothing, in conflict with the requirement of philosophic contemplation ([1912] 243), to stop ‘the world [from becoming] definite, finite, obvious’. Besides, like everybody else, Russell was quite ready to apply his ‘logical’ maxims to economic life; to accept that this is dominated by rational self interest, yet knowing perfectly well that, individually, businessmen are as lazy, vain and generally irrational as the rest of us. (Presumably their irrational actions cancel out.)

The second example does not involve delicate moral issues. The subtitle of A History of Western Philosophy reads: [Philosophy] and its Connection with Political and Social Circumstances. True, Russell interprets ‘social’ in a rather broad sense, for example, in his analysis of Nietzsche’s views on etiquette (ibid., p. 767); but in the discussion of the philosophical works of Frege or of himself there is simply no trace at all of any connexion with social circumstances. Yet, in the ordinary sense of ‘philosophic contemplation’ and particularly in the sense of [1912] 244, one should look at oneself from outside. Now it so happens that the matter of artificial languages, a principal topic of Principia, has traditionally been regarded as a political issue, at least in Romantic and nationalist political philosophy. This was mentioned on p. 141 in connexion with the conservative de Maistre and the progressive Condillac; but also, back in the Theaetetus (184c), Plato has Socrates connect formal precision with lack of breeding. Evidently, the conservative view was that there is a universal human capacity for acquiring formal languages, but that only people of the same (proper) background are capable of free and easy intellectual intercourse. In modern jargon, formalization or mechanization would be needed not only for communicating masses of information, but simply for communicating information to the masses. From this point of view formalist philosophy suits the outsiders as, according to some Marxists, idealist philosophy suited the ruling classes. Principia shows how unexpectedly far the - accepted - universal capacity for acquiring formal languages goes, and may be thought to weaken the conservative case. It is of course a fact that the century of democracy is also the century of formalization to which Principia contributed so much. But this fact, and the connexion it may suggest, did not have an important place in Russell’s view of things. In short, whatever its practical merits the view was often - to quote again from [1912] 243 - characteristically uncontemplative, ‘unfamiliar possibilities [being] contemptuously rejected’.

Russell’s picture of America

It remains to mention Russell’s view of ordinary life around him. Probably, his encounter with America is as representative as any other part of his long, active and varied life. On the personal side, he returned there repeatedly in the first half of this century, worked there as already mentioned in section 1, and both his first and last wives were Americans. Also, for anyone interested in the liberal ideas of the last century, America is simply bound to be fascinating. Although it is unfashionable to say so, these ideas - as originally conceived - have been followed there to an incomparably higher degree than elsewhere. Some highly uniform societies have been more democratic, for example, in ancient Greece (if the slaves are not counted) or modern Sweden (where legal discrimination on religious grounds till the middle of this century may have encouraged homogeneity). But, for good or ill, America has provided incomparably more opportunity for the (instant) use of native talents and for social mobility than any other country with a similar population mixture; for groups of widely different background or none at all as it were: chasing opportunities, many practically lost their mother tongue altogether. - It would, perhaps, be too much to expect that this social mobility would benefit only people of exceptional refinement and sensibility or that it was likely to create social conventions (on speech and manners) which work smoothly and naturally. Russell’s America consisted mainly of university society; at both ends of the scale (Harvard and midcentury U.C.L.A.), and figures on its borders such as the self-made millionaire Dr. Barnes, a patron of the arts and humanities, who battled with Russell over Pithergawras (A2338). This society has changed since Russell’s days, particularly after the first Sputnik.
But as far as his times are concerned, the picture created by Russell in his autobiography is certainly quite true. The picture is produced by a series of concrete episodes; several of them are gems, for example (A1325 − 327). The short description of the bullying President of U.C.L.A. (A2333) conveys much of the atmosphere, since made familiar by McCarthy [1952], the famous novel about administrators at a minor American university. The comparison between the reactions to policemen of his young son and of a university professor accused of speeding (A313) is, or should be, a classic. As already mentioned, Russell’s picture of his times in America is true; both perceptive and very amusing. What is striking is that the picture includes very little reflection about the overt facts, little connexion with general views which he expresses so to speak abstractly. Specifically, he said quite explicitly (A3329) that ‘institutions mould character’. His picture of the characters he met in America does not include any reflection on how they were ‘moulded’ by the social institutions of the country. (Yet most of the concrete episodes merely illustrate the surely obvious consequences of the social mobility of American society mentioned already.) Nor does he speak of the inherent difficulties when academics who are accustomed to judge cases become university administrators whose business it is to settle cases. The difficulties were compounded at the end of the thirties by the fact that many distinguished exiles had jobs below their academic standing at American universities. This put the administrators in a false position. Russell mentions that some of these administrators were pompous bullies; but not the temptation - in their position - to stand on their dignity. They did so in the style they had learnt at (American) public schools or, perhaps, from Hollywood films.
Since Russell does not mention these things, his picture also leaves out the remarkable qualities of those characters who lived under the same institutions but did not offend his aesthetic sensibilities. For present conceptions, Russell’s view of America is narrow, especially if - as seems appropriate - one thinks of other great writers with high ideals. Thus, Solzhenitsyn [1968] and [1968a] present public prosecutors and commissars as ‘moulded’ by institutions, caught up in them. This then leaves room for real moral freaks, good or bad, as in the portrait of Stalin in Solzhenitsyn [1968]. Russell’s view doesn’t. On the other hand, present views on social matters are probably narrower than Russell’s - which he called his ‘social vision’ (A3330). He saw ‘in imagination the society . . . where individuals grow freely, and where hate and greed and envy die because there is nothing to nourish them’. There is little in his autobiography to make his vision concrete. He does not seem to have been particularly romantic about humanity-in-the-raw; in A1240, 244 and especially in [1959] 214 where he criticizes Tolstoy for his saintly picture of Russian peasants. Yet however unconvincing it may be (at least for those of us who do not know these peasants), this picture makes Tolstoy’s social vision quite concrete, since the vision is practically realized already. There is much more to convey what Russell called his ‘personal vision’, above all the description of his relation with Joseph Conrad (A1320 − 324), obviously - for Russell - a kind of paradigm for the possibilities of human relationships. There is deep admiring affection and, in particular, there is no trace of contempt for any aspect of the other person. And other episodes and letters in the autobiography suggest that, all through his life, Russell was linked to several people by similar, so to speak, chemical bonds. These bonds seem to have developed freely and naturally without the caution or cunning calculation which are sometimes said to be necessary to protect such relations and which were so alien to Russell.

Part V

ON WITTGENSTEIN

Chapter 9

On Some Conversations with Wittgenstein: Recollections and Reflections

The recollections come primarily from my student days in Cambridge: before and after my military service (in England). Many of them are still exceptionally vivid, though perhaps rose-colored. I was eighteen when I got to know Wittgenstein in early 1942. Since my school days I had had those interests in foundations that force themselves on beginners when they read Euclid’s Elements (which was then still done at school in England), or later when they are introduced to the differential calculus. I spoke with my ‘supervisor’, the mathematician Besicovitch. He sent me to a philosophy tutor in our College (Trinity), John Wisdom, at the time one of the few disciples of Wittgenstein. Wittgenstein was just then giving a seminar on the foundations of mathematics. I attended the meetings, but found the (often described and, for my taste, bad) theatre rather comic. Quite soon Wittgenstein invited me for walks and conversations.1 This was not entirely odd, since in his (and my) eyes I had at least one advantage over the other participants in the seminar: I did not study philosophy. Be that as it may, in his company (à deux) I had what in current jargon is called an especially positive Lebensgefühl. Some facts occur to me in this connection.2 On the one hand, he was often upset by things that did not seem to me in the least tragic. On the other, I soon noticed that he had a great deal of understanding for my own perplexities, which mostly consisted in a malaise about a malaise. In less contrived terms, I was irritated when things disturbed me that one could (as one said in those days) just as well ignore if one pulled oneself together. On the material level: environmental pollution (say by noise); but more often on the mental level: thoughtless obiter dicta, especially by people I otherwise liked. Time and again Wittgenstein found a way of describing the situation - i.e. its relation to other aspects of my (limited) experience - that removed my malaise. In this respect mental pollution is well-known to be less demanding. And it struck me, particularly in his company, that often apparently (but not always literally) superficial corrections get rid of a malaise.3 In short, as far as that positive Lebensgefühl is concerned, I find even today that it goes well with my view touched on above, without wishing to drag in matters of cause and effect here.4

0 Originally published in Wittgenstein: Biographie-Philosophie-Praxis, Wiener Secession, 1989, pp. 131–143, as ‘Zu Einigen Gesprächen mit Wittgenstein: Erinnerungen und Gedanken’. Translated by W.B. Ewald.
1 Incidentally, we usually spoke English, although I (too) had grown up in Austria and had only been living in England for about three years. This reminds me of some other facts, e.g. that at the time I did not know the German mathematical jargon of elementary analysis. (Wittgenstein probably did not either.)
2 Clearly, it would be thoughtless to assume that the above-mentioned Lebensgefühl must be connected with such facts. It might also be, as Wittgenstein said even five years later, that I was exceptionally ‘green’.
3 Today I would think of high technology: a defective seal is overlooked, and a space-shuttle explodes.
4 Granted, causal relations have turned out to be excellent in some domains (e.g. non-chaotic dynamics, especially of point-masses), but this leaves open how suitable they are in other domains. After all, there are other, impeccably rewarding relations too.

Constructive content of proofs

One day Wittgenstein suggested that we take a look at Hardy’s Pure Mathematics together. This introduction to differential and integral calculus was a ‘classic’ at the time and, at least in England, very highly regarded. It dates from before the First World War, and fits the ethereal idea(l) of rigorous logical foundations (roughly, what engineers believe they do not need from analysis). Hardy was personally thrilled with Russell, and Wittgenstein had known Hardy in his best years. I only got to know him after he was very depressed. But even then he was an exceptionally impressive sight: so ethereal that even his bones seemed transparent. In any case, we had the kind of faith in the author that usually makes reading a book of his easier. Wittgenstein had only distaste. So something in the style, and perhaps also in the content, was liable to have got in the way; naturally, not concerning its (mere) validity, but its appropriateness. Such things happen in the best books, and in the logical tradition especially at the beginning. (The beginning of the Elements of Euclid is hard going, too.) Now, Wittgenstein never acquired the skill, so useful in science, of leafing through (say the way one scans a landscape, rather than feeling one’s way inch by inch). So with him difficulties at the start generally were the end of his reading. Moreover, for common sense, those initial difficulties are hardly avoidable in the logical tradition, which wants to derive all knowledge (tacitly: if possible) from a single fundamental principle.5 Now, this logical scheme often works quite well in a very limited field of knowledge, but hardly ever in elementary analysis (based on the least upper bound principle). At least for common sense, in such a logical enterprise a great deal of familiar knowledge is likely to be presented in a tortured (albeit formally correct) order. The proofs sound like a sort of litany: already in Euclid, and still more in Hardy. Depending on temperament one finds litanies memorable, or repulsive, or somewhere in between.

To avoid misunderstandings (especially by young people, who since their school days have been used to the ‘new’, i.e. abstract, mathematics) it must be stressed here that the foundational ideal (in Hardy) has not dominated ordinary mathematics in the last few decades. Abstractions, alias generalizations, of relatively many proofs (of analysis) by relatively few suitable spaces (or, say, topological groups) are the new ideal. In fact, such abstraction in mathematics had already begun before 1930 in Germany and France. Bourbaki’s article L’architecture des mathématiques, written in 1939 and published after the war in 1948, can be read as a manifesto for that enterprise. But in Cambridge in 1942 one relied on the usual mental reflexes of robust common sense. If I had had Wittgenstein’s inhibitions towards Hardy’s style, the mathematics of the day would hardly have attracted me.
Be that as it may, it soon turned out that the aspects of the proofs in Hardy’s book to which my reflexes drew attention also appealed to Wittgenstein. Of course, this does not just mean that those aspects were there (and that he saw them), but above all that they (also for him) were suited to the proofs in question. These aspects cut across the division (alias structuring) of analysis in Bourbaki. Roughly, in the conversations at issue it was a matter of aspects from which the engineer also obtains something, and not just in principle. Sometimes more efficient techniques of computation result in this way than simple number crunching. At the time I had not yet given the business a name, like ‘constructive content’.

Example. If y = f(x) is (the equation of) a curve continuous in the interval 0 ≤ x ≤ 1 and such that f(0) < 0 and f(1) > 0, then f intersects the x-axis. The job was to compute, from the proof (in Hardy), a point of intersection.6 Taken literally, the calculation would be restricted to curves for which the case distinction required in the procedure can be made effectively. It is important to note that the procedure does not generally determine a point of intersection even moderately precisely from arbitrarily precise data about the curve.7

5 Thus not even like a tree, which (as it were, in principle) grows from a seed only in good soil; to say nothing of a crystal that is seeded by a dust particle.
6 The proof runs as follows. If f(1/2) = 0, let x0 = 1/2. Otherwise, consider the interval 1/2 ≤ x ≤ 1 if f(1/2) < 0, and the interval 0 ≤ x ≤ 1/2 if f(1/2) > 0, and start again. This so-called bisection procedure determines an x0 such that f(x0) = 0.

In the conversations one looked for suitable additional data, and thus for a new formulation of the theorem in Hardy. This was in conflict with an early slogan of Wittgenstein’s (which was popular down market):

A theorem is understood by (inspecting) its proof.

Now

a proof is understood by stating a possibly new theorem.
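
The bisection procedure of note 6, and the instability pointed out in note 7, can be tried out in a few lines. What follows is only a minimal sketch in Python under stated assumptions: the names bisect_root, make_f1 and make_f2, the tolerance, and the value chosen for ε are illustrative and do not come from the text. The sketch also assumes that the sign of f at each midpoint can be decided, which is exactly the restriction mentioned in the example.

```python
from typing import Callable


def bisect_root(f: Callable[[float], float], lo: float = 0.0,
                hi: float = 1.0, tol: float = 1e-9) -> float:
    """Bisection as in note 6, assuming f is continuous and f(lo) < 0 < f(hi)."""
    if not (f(lo) < 0 < f(hi)):
        raise ValueError("need f(lo) < 0 < f(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        value = f(mid)
        if value == 0:
            return mid
        if value < 0:
            lo = mid  # sign change lies in the right half
        else:
            hi = mid  # sign change lies in the left half
    return (lo + hi) / 2


def make_f1(eps: float) -> Callable[[float], float]:
    """First curve of note 7: plateau at height +eps."""
    def f1(x: float) -> float:
        if x <= 1 / 3 + eps:
            return x - 1 / 3
        if x <= 2 / 3 + eps:
            return eps
        return x - 2 / 3
    return f1


def make_f2(eps: float) -> Callable[[float], float]:
    """Second curve of note 7: plateau at height -eps."""
    def f2(x: float) -> float:
        if x <= 1 / 3 - eps:
            return x - 1 / 3
        if x <= 2 / 3 - eps:
            return -eps
        return x - 2 / 3
    return f2


EPS = 0.01  # illustrative choice; any 0 < eps < 1/3 keeps the pieces in order
f1, f2 = make_f1(EPS), make_f2(EPS)
root1 = bisect_root(f1)
root2 = bisect_root(f2)
# Largest vertical distance between the two curves on a fine grid.
max_gap = max(abs(f1(i / 10000) - f2(i / 10000)) for i in range(10001))
print(root1, root2, max_gap)
```

In this sketch the two curves never differ by more than 2ε, yet the computed points of intersection lie near x = 1/3 for the first and near x = 2/3 for the second. Making ε smaller brings the curves closer together without bringing the two answers any closer, which is the sense in which arbitrarily precise data about the curve do not fix the point of intersection.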

In the first few conversations about Hardy’s book, Wittgenstein discussed everything thoroughly and memorably. The conversations were brisk and relaxed; never more than two proofs per conversation, never more than half an hour. Then one switched to another topic. After a few conversations the joint readings came to an end, even more informally than they had begun. It was, by then, clear that one could muddle through in the same manner. After the war I had a chance to go into mathematical logic in more detail; in particular, into consistency proofs. Instead of pursuing Hilbert’s aim of eliminating dubious doubts about the usual methods of mathematics, a more compelling application (better: interpretation) of those proofs occurred to me.8 Once again, the issue was a kind of constructive content; not, however, for items in some mathematical textbook, but for all derivations in some current formal systems. Different consistency proofs generally produced different procedures which (in current jargon) extract computer programs (mechanically) from formal derivations. I told Wittgenstein about this, probably in 1947; very briefly, since the new aim was immediately convincing to him (incidentally, without a trace of any objection on his part, say against ‘regimentation of language by formalization’9).

7 For instance, not with the two curves

\[
f_1(x) = \begin{cases}
x - \tfrac{1}{3} & \text{if } 0 \le x \le \tfrac{1}{3} + \varepsilon \\
\varepsilon & \text{if } \tfrac{1}{3} + \varepsilon \le x \le \tfrac{2}{3} + \varepsilon \\
x - \tfrac{2}{3} & \text{if } \tfrac{2}{3} + \varepsilon \le x \le 1
\end{cases}
\qquad
f_2(x) = \begin{cases}
x - \tfrac{1}{3} & \text{if } 0 \le x \le \tfrac{1}{3} - \varepsilon \\
-\varepsilon & \text{if } \tfrac{1}{3} - \varepsilon \le x \le \tfrac{2}{3} - \varepsilon \\
x - \tfrac{2}{3} & \text{if } \tfrac{2}{3} - \varepsilon \le x \le 1,
\end{cases}
\]

which can lie arbitrarily close to one another (at a distance of at most 2ε), and each of which has only a single point of intersection, but one at x = 1/3 and the other at x = 2/3.
8 Pedantically: not of precisely the given proofs, but of suitable variants.
9 For this project, formalization was the starting point (in modern jargon: part of the data) and not a ritual or some other ideological end-in-itself. More explicitly: not just any old (correct) formalization was meant, but one appropriate to the subject matter.

Family resemblances of concepts

Already before the summer vacation of 1942 Wittgenstein gave me a copy of the Blue Book. In leafing through it I came across the metaphor family resemblance of concepts; it pleased me, as I mentioned to him in conversation. But in the course of the summer a malaise developed. Not at all merely because it jars with ordinary English (after all, one has a concept of a family - here, of concepts - too), but because the subject was too familiar to me (at least as I understood the metaphor). The whole of abstract mathematics is full of families of concepts; for example, the family of groups, of which one has a concept (from the usual definition). But it is not a (tacitly: rewarding) object of research: roughly, what holds for all groups is rarely of use for any particular group. Usually such families have a few broad, simple properties that should not be forgotten. But, at least for common sense, brooding about them is liable to reach the point of diminishing returns quite soon. Viewed this way, the question at once arises:

Who can afford, as it were, to cut out the half of his brain where the experience from group-theory is stored, and still make headway with the matters for which the metaphor is meant in the Blue Book?

When I returned Wittgenstein’s transcript after the summer vacation I also mentioned my malaise, but this time he had no sympathy for it. For once I remember the approximate wording: ‘I taught you everything I,’ - then he interrupted himself - ‘you are capable of learning.’ I laughed; or, rather, was unable to suppress the kind of giggle to which he was in any case accustomed. He in turn found this odd - but not, I believe, insulting. After all, I was not laughing at him, but was delighted by the elegance of the remark. Its content was altogether impeccable (and the form not hackneyed): this time I should take care of my malaise myself. At the time I did not succeed: I lacked the right concepts. I returned to the matter only much later, once again in accordance with the motto:

relatively few distinctions for relatively broad domains of experience.

The concept of group becomes a rewarding object of research only in combination with others: as with finite groups, algebraic groups, and the like. These then have ‘family resemblances’. Whoever broods on what they all have in common stays below the threshold of scientific knowledge. Fortunately this claim is not (any longer) merely airy-fairy since logicians have done the brooding (e.g. under such headings as ‘Elementary Theory of Groups’, its undecidability, etc.), and one sees just how little even exceptionally skillful and clever people have squeezed out of it. Today Wittgenstein’s metaphor no longer conveys these insights convincingly. Who today (in the market for these commodities) does not think of the DNA structures of common ancestors? And with these structures you just can’t know too much. In the commerce of ideas (to use Kant’s metaphor, apparently of permanent market-value) one cannot afford to cling to every metaphor: some (e.g. Wittgenstein’s) lose their market-value.

Certainly, I still find Wittgenstein’s enterprise, as I see it in the Blue Book and elsewhere, most appealing. Many insights of broad validity become clear and even self-evident when one studies a particular subject seriously (here, group theory), but only for this subject. At the same time many of them not only have broad validity, but even contradict venerable heroic ideals. The enterprise is to convey that broad validity memorably by a suitable metaphor. It all comes down to being memorable, since a very general valid insight is usually quite elementary, and becomes important because it is in fact (and not just possibly!) forgotten when it is needed most. But precisely the metaphors that have become most popular are distracting (by experience, not only for me), e.g. the business of language games. In games (or experiments, in the sense of ‘tinkering’) their most conspicuous feature is their isolation from everything else. There are of course relations, but they are put out of mind. In any case, as I read the Blue Book, the so-called language games should rather be compared to something else familiar in science: descriptions of simple situations, in short, imagined experiments (also known as thought experiments). While these descriptions focus on simple aspects of the situation, they also recall many similar situations (and are supposed to): they direct attention to new relations.

‘Deep’ processes

In the next example the initiative came from me. It concerns, as it were, the opposite of a malaise. In the war (at least, after early 1944) I was occupied with hydrodynamics; among other things, with artificial harbors and sea-waves near the coast. One wanted to record the wave motion by measuring the pressure at the bottom of the sea (such measurements being inexpensive). Now, the theory looks like this. The (tacitly: exact) distribution of pressure on the bottom determines10 (also exactly) the so-called potential, and thereby the different aspects of wave motion. But here the experts actually overlooked the fact (so it was not just the mere possibility of an oversight) that small variations (i.e. errors) in the distribution of pressure can go with large variations in the shape and motion of the sea surface (cf. note 7). Pedantically: aspects that strike the eye (e.g. the height of the waves) remain undetermined. When the waves are short (compared to the depth of the sea) the pressure at the bottom of the sea is slight, even if the waves themselves are relatively high. In an extreme case one has white horses which, as one says, are a surface phenomenon.11

The observation in the last paragraph is still far from an adequate answer to the question: What does one obtain from pressure measurement? (here, with regard to the movement of ships outside and within artificial harbors). It is clear that a suitable smoothing that neglects short waves is sufficient: one does not need the wave-motion itself. (As usual, only after further investigation can we say what is adequate: here for the ships.) In short, to the extent that the observation above is in fact decisive here, the pressure at the bottom must determine the (suitably) smoothed-out motion of the surface. A calculation confirms this.

Everything that I then knew about Wittgenstein (including his background in engineering) suggested that this story would please him. (The insight pleased me.) As usual, a few short sentences sufficed to explain the matter to him; thus, without the long-winded introduction in the last two paragraphs. He seemed to be quite satisfied, especially with the observation (in fancy terms: the insight) that what is decisive here is not that the wave motion is (analytically) determined by the pressure, but how (continuously) it depends on it. Afterwards he suggested I should try harder and prove that the weather (in a suitable, corresponding sense) is not predictable. I did not follow up the suggestion. Perhaps it was premature.12 Certainly, work in the sixties, which confirmed the chaotic character of the weather (according to a current theory), used large computers that in the forties were not sufficiently developed. For me, this story of the waves on the surface and the pressure deep down remains a vivid metaphor, in particular for self-protection against gushing about depth, which to me is a kind of intellectual pollution. Of course, it can be rewarding to ask which aspects of a phenomenon that presents itself are, and which are not, connected to deep structures; and, of course, there are branches of ordinary common commerce of ideas that specialize in working in the depths, e.g. coal miners and mathematicians. For such people it would be a mistake to allow themselves to be (excessively) distracted by events on the surface. But these surface events sometimes have great market-value. Today one thinks of the global laws in chaotic dynamics, which are provably independent of the details of the local processes; thus, neither informative about these (deep) processes, nor understandable through them. Looked at in this way, the fuss about depth (as though it were obviously a Holy Grail) is simply embarrassing. In the forties one had only a pale inkling of all this. (The example from hydrodynamics was not entirely sufficient.) From the start, Wittgenstein was not in the market for the metaphor regarding depth. But he rarely said anything to me that would have contradicted its sense.

10 ‘Determines’ sounds better here than ‘causes’, since pressure and (surface) motion do not vary separately.
11 Incidentally (and this too should not be forgotten), this phenomenon often determines our impression of the sea more than the massive but slow motion of the tides.
12 To use a phrase with which I answered his (actually, rare) questions about Higher Things: Who am I of all people to know such a thing?

Warning

The conversations recorded here are not typical even for my own experience with Wittgenstein, to say nothing of that of other people. As a burnt child I should like explicitly to stress this. I still remember my disappointment (actually, shock is a more fitting word) when I picked up his Remarks on the Foundations of Mathematics, which had nothing at all in common with my own experience of Wittgenstein’s interests in these matters. And here it was a matter of his own formulations; merely edited with a taste (of the editors, e.g. for clumsy helplessness) that was, to put it mildly, foreign to me. For one thing, in our conversations Wittgenstein touched on the Higher Things (although, as already mentioned, seldom).
For example, he once said he was surprised that he could be friends with anybody as irreligious as me. (I was not surprised.) If I had been compelled to be surprised about something of this sort, then at most about the fact that there should have been any talk at all of friendship despite an age difference of nearly 35 years. (My contemporaries, too, regarded what today is called the generation gap as a ‘fact of our natural history’). For another, Wittgenstein insisted on speaking of ‘discussions’, which perhaps fits the conversations reported above. But mostly it was a matter of monologues on his part. These were for me a pure delight: one did not need to do anything, since he chose the topics (usually with great sensibility) and (as I would then have put it) did not ‘let himself go’, but formulated everything beautifully. So I had nothing against the practice itself, only against his description of it. To have such delight, one need not have suitable means for describing it too (at least, I did not). On the other hand, without such means (at least, for me) it would have been unsatisfactory to protest against his description. Today I should attempt it as follows. As for content, the monologues had a good deal in common with the notes in Zettel or the Vermischte Bemerkungen; more precisely, with those from the time after 1942. In contrast to the Remarks on the Foundations of Mathematics mentioned earlier, they evoke in me (as it were) rosy recollections. But in reading them I have the same experience as with some stage plays (even Shakespeare’s). To be sure, it’s beautiful. Isn’t that enough? Must it be right too? (Say, for an area of my own experience.) Of course, with this reaction one doesn’t get much from the content. The expository style (of Wittgenstein’s conversations, where ‘expository’ would not apply to discussions) was at any rate for me much more successful. 
This experience (of mine) leads to the following observation, which perhaps applies to others besides myself (but certainly not to all!). One gets nothing from carping at a style. Occasionally it is possible to read one’s way into Wittgenstein so that this replaces his voice and gestures (which I no longer clearly remember), and brings somewhat more life into, say, the highly polished Philosophical Investigations. Seen in this way, Wittgenstein’s favorite quotation:

Le style, c’est l’homme

is very apt, and even more so when one thinks of Buffon’s previous sentence:

Ces choses sont hors de l’homme.13

Only the utterly clueless would assume that the market-value of an article in the commerce of ideas is not affected by the style of the producer, or even not by the sensibility of the consumer.

13 What is meant are impersonal things, such as discoveries, insights, and the like: if one person does not notice them, then others will.

Chapter 10

Ludwig Wittgenstein

The broad aim of this chapter is to convey an idea of what is done in contemporary philosophy; not of the day-by-day activities (for which one would have to be in the trade), but of the broad interests; not by abstract formulations, but by memorable examples from which one can derive one’s own formulation according to one’s own background. For this purpose Wittgenstein’s interests seem to serve very well: he started early and became famous early; so he had a chance to give second thoughts to what he had done. (Besides he did not have to worry about a career.) And he not only had the chance; he used it. He also had, to a remarkable degree, intellectual sensibility - a nose for what is coming, not merely for what is staring at us from all sides - and a sense of the dramatic, disciplined by unusual taste. Given the two qualities in question I have been encouraged to reread his two books in a way that seems to me plus satisfaisant pour l’esprit than my earlier readings. Of course he chose his words carefully. But this does not mean that they should be taken literally if - in effect, not by conscious intention - the reader is to benefit most; tacitly, in the ways envisaged in the introduction to the Philosophical Investigations. In my case this ‘benefit’ consists in being encouraged to develop ideas on logical foundations that have been slowly growing out of my logical experience.

Meaning of ‘philosophy’: adaptation to background knowledge

The etymology of ‘philosophy’ is: love of wisdom. Who knows what kind of thing the Greeks had in mind when talking of ‘wisdom’. But let us suppose they had a sensible meaning, and were not merely gushing about Higher Things. I think one can say:

0 This chapter is based on lecture notes for the class ‘Wittgenstein: the philosophy of language (A survival kit: when facing the faithful)’, Winter 1984-85, Stanford Extension in Vienna, and it is published here for the first time.

We need wisdom when we are confronted with a difficult situation, one about which we know little. Otherwise the only wisdom is to be competent, and learn what there is to learn about the situation.

This explains why at an early stage all knowledge was regarded as the province of philosophy.

Even if one knows little about contemporary philosophy, one knows that philosophy is full of questions about existence and validity (What is mind? or What is matter?) and all sorts of imagined situations. Does this sort of thing have to do with the pursuit of wisdom, even when the latter is interpreted in sober terms as above? We remember the line:

What is matter? Never mind. What is mind? No matter.

I want to view this as follows; now ordered according to background knowledge required.

1. Imagined situations

There is a general story that Greek philosophers did not do experiments or make observations because they were too snobbish. Well, that may be so. But I don’t know enough to discuss it. Instead we have this point, supposing now that they were not snobbish. Just ask yourself the first question that comes to mind, and what observations or experiments you instinctively want to make, e.g. in astronomy (which has excited every culture because of its obvious regularities). The moon takes so many days to turn around the earth. What do you instinctively want to do? Go to the moon, and take a closer look. But how? Much the same applies to all sorts of other questions that interest you; for example, the stars. So, if one does not have the technical means or, as one says now, the technology, it is generally wise to rely on something else. And it was an exciting discovery that, at least occasionally, imagined situations will do. Let’s not forget one thing. The broad idea to use imagined situations is a prominent tool in our intellectual arsenal. Usually it is an adjunct, often a kind of spice. In philosophy it is part of the staple diet.

Reminder. However much we know there are always areas for which our technology is inadequate. Another way of putting it: we use imagined rather than concrete situations when we are not in a position to create the relevant concrete situations (or do not want to, e.g. experiments with humans). Here concrete is thought of as opposite to: abstract. We now have a second sense.

2. Abstraction

If we know little about something it is surely natural to ask: perhaps, we can find something that is true of everything, and then it might tell us what we want to know about the particular thing. It does not seem to be very promising. Things look similar only if you look at them from very far away. The first impression is that this is absolutely hopeless. A couple of discoveries caught the imagination of people: Mathematics, especially geometry (what is true for all triangles can be of interest for particular triangles) and, even more exciting, Logic (as Aristotle said: the science of Being; or Leibniz: What is true in all possible worlds). And since the first impression was that nothing of this kind is possible at all, the second impression is that there is no limit to its possibilities. If a commercial idea looks hopeless, the relevant shares will be undervalued, and any flicker of success will go to people’s heads.

Preview: This kind of thing determined Wittgenstein’s early work.

Let us go back for a moment to existence and validity. There are big words for this: ontology and epistemology. Far from being subtle these ideas require very little background knowledge: as knowledge expands these first questions recede into the background. All this is expressed by a pun on the word basic or fundamental. It can mean ‘elementary’ in the sense of: first steps; but also, as in ‘elementary particles in physics’: objects whose properties give you knowledge of a great variety of other things.1

Put this way the main theme of Wittgenstein’s late interests is already discounted: the danger of reaching the point of diminishing returns very soon when pursuing flashy, dazzling kinds of generalities. In other terms: exaggerated expectations lead to exaggerated reactions.

The usual knowledge business, including science, leaves a gap: one tends to look at successful strategies, and ignore others (as one says of the latter: we have to learn to forget them). Certainly, for the most natural sense of understanding our knowledge, including the feelings we have about our knowledge (and ignorance!), this tendency is defective: we also want to be aware of ideas that are plausible, but unsuccessful. Or, more precisely,

if we trust a bit in our instincts we expect such ideas to be successful under limited circumstances, different from those for which we intend them.

1 Just for the record. The presentation above is diametrically opposite to the most familiar claim for philosophy: namely, that it is supposed to provide ultimate and necessary truths. For example, Popper: science is supposed to be always relative, while his pronouncements are of eternal application. Well, it just doesn’t look to me that way at all.

An example of this shift of emphasis is furnished by Wittgenstein’s early interests: the kind of (formal) language that had caught his imagination is a model for computer languages.

Style of the chapter

What means are there to formulate memorably the fruits of the kind of intellectual activity that goes on in the kind of philosophy adumbrated above? I have already said what I want to do here. It is really the kind of thing one did in the 18-th century when people were educated by reading biographies. (Occasionally, I may use the biblical style of parables, too.) See Sections 2 and 4. Somewhat at the opposite extreme I will use a literary discovery of the last century, which dominated the early part of Wittgenstein’s work (not his discovery; but he discovered a way to convey its excitement): in the trade one speaks of metatheorems.2 See Sections 3 and 5. It has to be admitted that Wittgenstein rejected metamathematics as a tool for clarification (to his taste); at best, envisaging it as a rewarding subject matter. But this is only superficially paradoxical. For one thing, the metamathematics he had come across in his early studies is not an effective tool. More interestingly, his oversight fits in with a very widespread phenomenon at the present time, connected with pollution created by technology. But what a gut reaction against technology overlooks is, obviously, the fact that closely related technology can be used to locate and, with luck, clear up the pollution involved.

10.1 Wittgenstein: First and Second Thoughts

From time to time some striking discovery is made in a narrow area which catches the philosopher’s imagination: the new discovery suggests a general strategy in the knowledge business (partes pro toto). Specifically, having a striking instance of (what Descartes would later call) a rule of thought before one, one just ‘gives it a whirl’ (that is, thinks only of that rule) and sees what it does in every domain. Examples, the second very directly relevant to Wittgenstein’s first thoughts:

1. Greek philosophy

The Greek geometers discussed good answers to questions like: What is a circle? What is an ellipse? and so on. Their definitions - in what we now call ‘Euclidean’ terms - helped one to find out striking properties of the objects or notions defined.

2 Later work of Wittgenstein introduced - and made popular - another literary style: puzzles. So-called paradoxes are a kind of prototype, but exceptionally successful because they are both puzzles and good jokes.

The philosophers went haywire, and asked: What is X? whatever X was mentioned. Aristotle warned that every sequence of definitions must have an end, and admitted - at least implicitly, by saying that it was a matter of good breeding - how hard it can be to know whether one has reached the end. Indeed, he could have said that it is hard to know in what terms the definition should be given.

2. 19-th century chemistry

In 19-th century chemistry one had an answer to: What is X? for any substance under the sun (or, more precisely, to: What is - the chemical composition of - X?), under the scheme: a mixture of chemically pure substances, and each of the latter is built up of atoms. Of course, the chemical composition says nothing about the amount, geometric form, temperature, electric or negative charge of the substance. Well, if chemistry could do so much, so quickly, why not everything (with a little more effort)?

A universal language for definitions

Before the language of mathematical logic was developed the word logic was used widely in the philosophical literature. There was, for example, Hegel who used the word ‘logic’ quite simply to mean: principal laws (of anything: logic of history, logic of law, etc.). In order to make one word fit so many different things it was necessary to attribute the most curious properties to it. Sample. A proposition and its negation(s) are equally valid. When you have a thesis and an antithesis you don’t see where the thesis is valid, and the antithesis considered is not, or vice versa, but you are encouraged to synthesize, and more of that ilk.

It is against this background - of so-called idealist philosophy - that the language of mathematical logic came as a breath of fresh air. Beginning with Boole’s elementary propositional logic (not, and, or, implies, equivalent) and Cantor’s sets and relations, Frege considered also predicate logic (for all x, there is x) and discovered definitions of natural and real numbers. Actually, the mathematicians had discovered lots of things about natural numbers without these definitions; for example, Euclid in his Elements or Gauss in his Disquisitiones. But the thrill remained.

People have tried to find the words that might do justice to that thrill. They said that now, at last, we knew what - natural or real - numbers really are; or that we never knew the answer precisely, but now at last it is really rigorous. This was not convincing. For one thing there were difficulties, the so-called paradoxes (as mentioned, they are good jokes; tacitly, in small doses). But also without these difficulties - which, contrary to a common misunderstanding, did not hamper the subject but were needed to make it interesting at all - there was scepticism, for example, Bishop Butler’s: everything is what it is, and not another thing.
It is fair to say that Wittgenstein was the first, and remains the most successful, author to catch the thrill of the whole enterprise: that one, logical, language was to be the way of expressing all knowledge from so-called basic (simple) propositions, expressing unanalyzed facts.3 Literally,

Mathematics today, the world tomorrow.4

The attitude of the silent majority was: Trust in God, but keep your powder dry. You don’t expect miracles, but occasionally some ambiguity will be eliminated, for example, when ‘anybody’ could mean either everybody or somebody. Less trivially, even though one did not have the hardware for any efficient computer it was clear that a small vocabulary and a simple grammar would have some use as software - in other words, in programs - for such things. Grand talk about the universal language was treated by the silent majority as Churchill wanted to treat lies: the truth is so precious it needs a bodyguard of lies. The universal language was - then - such a delicate blossom it had to be protected by exaggerated claims (even if not necessary, they could not do much harm generally).5

Wittgenstein did not have the robust attitude of the silent majority; neither the general scepticism about thinking too much which is part of the Austrian intellectual tradition, nor the amusement at this kind of excess which is part of the English tradition. (Perhaps it is no accident that Wittgenstein attracted more attention in places and among people where clever people with his intellectual temperament were not common; unlike Germany and France. But I am sure I don’t know.)

Be that as it may Wittgenstein took himself not only seriously, but literally. During the 20’s, the decade after his first thoughts were published in the book Tractatus logico-philosophicus, he gave them second thoughts. He also was very depressed. People ask if he was depressed because of these second thoughts, or whether he was prepared to indulge in second thoughts because he was depressed (instead of enjoying fame, etc., resting on his laurels). Knowing as little as we do about such mechanisms it is as pretentious to ask such questions as: Which comes first, the chicken or the egg?6

3 In analogy to chemistry, simple propositions correspond to atoms, and sense to chemical combinations of them. In contrast to chemistry, with its list of (then) around 90 atoms, now increased by 50%, there was no list of simples for the whole world, only for mathematics.

4 Fifty years ago the Nazis had a song: Heute gehört uns Deutschland, morgen die ganze Welt (today Germany belongs to us, tomorrow the whole world).

5 They did on gifted people like Poincaré. Today similar exaggerations put off even average people, and attract only logical cripples.

6 The latter does have an answer! provided one accepts the so-called fundamental dogma of molecular genetics: from DNA to practice and never the reverse. In other words, any changes in the organism are preceded by changes in the DNA (of the fertilized egg compared to the DNA of the chicken that lays the egg).

Be that as it may, by the end of the 20’s, Wittgenstein forgot or suppressed all the virtues - achieved or expected - of the universal language, and devoted the rest of his life to its shortcomings.

Other functions of language

During the 30’s and early 40’s, Wittgenstein paid attention, roughly speaking, to just about everything that had been neglected in his first thoughts. These, as said already, were mainly concerned with the use of (a universal) language for defining notions. Needless to say, the second thoughts did not have comparable focus or even coherence to the earlier thoughts. As mathematicians say: the complement of a set is not a set.

He also developed a new vocabulary; on the principle that a standard vocabulary becomes threadbare, and so, if you want to rethink old issues, you have to freshen them up by use of new words. Perhaps, the best example is the phrase ‘role in our lives’ where, since Darwin, one had spoken of ‘survival value’. Actually, there is a little more to it than that. His words are less pretentious. After all, the word ‘survival value’ implies quite a lot about our global knowledge of the future of society; much of this ‘knowledge’ is illusory or at least dubious. The words ‘role in our lives’ of course do not give us better knowledge of the survival value of anything. But they also do not engender the illusion of such knowledge.

Wittgenstein’s second thoughts really run along that very familiar line of being constantly aware of how little we know. And his personal twist to that line is that (pretentious) language is the main obstacle to recognizing that familiar truth when we meet an instance of it, and so to acting accordingly, of course. It has a modest sound, without being modest at all; at least, in relation to professional colleagues. By implication they do not recognize that truth, and so are unaware of how much there is to know. In short, they are superficial, and this is not a term of endearment among philosophers. As Musil said in Der Mann ohne Eigenschaften: you judge people by what they want to do, not by what they achieve. (This would not be a good way of judging parachutists.)

Manifesto

I myself do not at all agree with what would popularly be called Wittgenstein’s philosophy, that is, the balance of emphasis in his interests. That’s just not the way the knowledge business works (best). But it represents in a so to speak chemically pure form the balance that is natural to some of us logically sensitive people. So it is rewarding to learn about it; especially since his literary style is agreeable.

10.2 Biographical Material

Speaking for myself, after having first met Wittgenstein when I entered Trinity College, Cambridge, in 1942 and seen a good deal of him for about ten years, I find the book of Hübner and Wuchterl [198?] one of the more agreeable published sources of biographical information about him. (My favorite, the unpublished family history by his eldest sister Hermine, is quoted occasionally in the book, and is the source for some of the material below.) In contrast, several published personal reminiscences of Wittgenstein present a personality I barely recognize. Presumably, to some extent the reminiscences reflect their authors’ ideas of what a great man ought to be like. But, if my own experience with Wittgenstein is at all typical, he may indeed have felt and behaved very differently in different company. He seemed to me quite extraordinarily pliable, far beyond the demands of ordinary good manners to be sensitive to one’s company. It was much more the kind of pliability associated with women who combine a flashy exterior with a heart of gold.

I am also not too happy with more ambitious - and certainly more famous - books about Wittgenstein’s cultural background, such as Wittgenstein’s Vienna by Janik and Toulmin [1973].7 Presumably this is fascinating in itself for those who learn about it for the first time. But I at least have not been able to enrich my reading of Wittgenstein’s works by use of this background, even after taking into consideration facts about it that are not in the book. Again speaking for myself, I am not surprised since Wittgenstein left for Berlin and England at an early age, spent the war away from Vienna, and some of the twenties in the country around the Semmering which is culturally as far from Vienna as Cambridge, where he went around 1930 for good. More generally, I am simply distracted by the parallel with astrology (which, on the surface, inspires little confidence in me, and I have not dug deeper).
Lives of individuals are embedded in celestial or terrestrial surroundings shared by many others. And little attention seems to be given to what statisticians would call the mean deviation among individuals in the same astrological or cultural environment. Naturally this is not a dogmatic rejection; for example, celestial events may be connected with an ice age.

Be all that as it may those ambitious books raise a question: To what extent can knowledge of Wittgenstein’s life help us get more out of his writing? Certainly, no ‘deep’ causal connection is meant here, nor exegeses of his own intentions. As I mean this question it is quite enough if aspects of his life fit his writing; at least roughly - not like equilateral and equiangular triangles (where one doesn’t wonder about cause and effect either). Nor are conscious intentions at a premium here; neither his nor the reader’s who may simply see new aspects after being steeped in the author’s life; through personal contact or through biographical literature. (And, like with most things in life, there is a risk of getting too much of a good thing.)

7 In connection with culture there is an extraordinary gap in the literature on Wittgenstein which would be very easy to fill. Contemporaries, even in Austria, do not seem to be aware of the - conscious or unconscious - allusions in his writings to the kind of culture which was perfectly ordinary in his (and even my) youth. For example, the expression Buntheit der Mathematik immediately brings to mind Goethe’s Grau ist alle Theorie. It is of course a separate matter how sensible it would be to look for, say, a unified theory that sparkles with vivid contrasts.

Wittgenstein’s was not a life of ‘raw’ power in overcoming external obstacles; not a single-minded pursuit of some goal that had absorbed him since early youth. Against this so to speak personal background it seems idle to belabor his contacts with Viennese culture or, for that matter, with England, where he spent more than half of his adult life. (For the record, I never heard him say a good thing about either place.) Similarly, little would be gained by being systematic; looking from left to right, and then from top to bottom. So what I propose to do is to tell you stories.8 This seems to me to correspond to the way we actually get a global impression of a broad canvas. (This is not needed for academic philosophy. But little of Wittgenstein is rewarding for that subject.) You can now judge this philosophy in action.

Ancestry

Wittgenstein himself was very much preoccupied, at least, at times, with what he thought of as his typically Jewish properties.9 Wittgenstein’s preoccupation is certainly not intrinsically absurd; for example, if ever one discovered relevant hereditary or cultural features. But even if one had statistical macroscopic information one could not answer Wittgenstein’s (self) doubts because the variations within any racial group are very great. There are only very few features that seem to be wholly confined to any particular group, such as certain hereditary diseases. Corollary. Wittgenstein’s preoccupation is a half baked thought of precisely the kind which he was very sensitive to in philosophical contexts.

But the matter of Jewish ancestry was a simple practical matter after Austria became part of the Third Reich. The question about the grandfather (in their case, on the father’s side) determined how the grandchildren were classified.

8 Umberto Eco (on the flyleaf of the German translation of Il nome della rosa): If you can’t speak about it, you must tell some stories. In contrast to the last sentence of Tractatus: What you cannot speak about, you must be silent of.

9 Cf. Vermischte Bemerkungen (Wittgenstein [1977]) on his lack of original thoughts. As for that defect, Section 1 stresses that neither his early nor his later work is distinguished by originality of content. On the contrary, in the former Wittgenstein expresses memorably the attraction of Frege’s foundational science, and in the latter he speaks for the silent majority (which is dubious about foundational sciences: What is matter? Never mind . . . )

Wittgenstein’s oldest and youngest sisters decided to emigrate. One of their uncles was a landowner in Yugoslavia. He agreed to get them Yugoslav passports, with which they then tried to emigrate. (They had plenty of money in Swiss accounts.) The passports were false, and they were arrested. Hermine’s family history describes the course of events with humor, and a certain pride in the good impression she and her sister made on the judge who ruled in their favor. He said that the attempted use of a false passport was not punishable by analogy with the attempted murder of a corpse! What she omits in her family history was that - after first having thought of employing a Jewish lawyer - they succeeded in getting Seiss-Inquart to take their case. He was in fact Schuschnigg’s successor after the Austrian annexation (Anschluss), and at the time he could have got off a murderer not only of a corpse, but also of - almost - any living person.

The father

When all is said and done, both in the biographies and in the picture book by Nedo and Ranchetti [1983], the figure that sticks most in my mind is Wittgenstein’s father.10 His resourcefulness, imagination and practical sense were most remarkable.

As a young apprentice he went, apparently empty handed, to an outing of the firm. Others had brought ham, cases of beer, etc. During the meal he asked if anybody had brought any mustard. As he had speculated, none had thought of it. His jar of mustard was appreciated more than the bulky contributions of the others.

At the opposite end of the scale. During the Russian-Turkish war in the 80’s his firm, which made light rails, competed with Krupp who made heavy rails. He persuaded the Russian commander that light rails were superior because the Russians would win quickly, and so heavy rails would be wasted. The father got the order. This was not all. The Russians forgot to cancel the order even though the war was coming to an end. So the father reminded them, but got their agreement that they would buy all the rails that had been produced for them already. Of course, they had no way to check how much had been produced.

The father was extremely generous in his support of artists. But he never gave any money to organizations, always to individual artists, whose career he followed closely. After all, spending large sums of money sensibly is not an easy job.11

10 Presumably this was not the intention of these authors (in contrast to Hermine). The same actually happened to me with the Brothers Karamasov when I was 17, although Dostoyewski surely wanted readers to pay some attention to Alyosha, too. I remember telling Wittgenstein of my difficulty, and found him sympathetic, one of those moments with him that I remember nostalgically; cf. also the episode with Littlewood on p. 225 of Chapter 12.

11 During World War I his sons Paul and Ludwig gave huge sums of money (2,000,000 Kronen each) to the mayor of Vienna and, respectively, the ministry of war. The purpose: to bring aid to returning prisoners of war and, respectively, to develop a mortar of bore > 30 cm.

In 1913 - when he retired quite early - he achieved his greatest financial coup. He sold all his investments (keeping of course the town house, and the 10,000 acre country estate at the Hochreith), and bought shares in the greatest US concerns. In addition, he put it all in a Swiss bank account.

The father seemed to be critical of his sons, all of whom were very artistic. He liked to say that he could not imagine any intelligent young man who would choose another occupation than his own, that is, the combination of engineer and business man, except: if such a man did not know that such a combination was possible. (Of course, this did not apply to his own sons!) In the circumstances it certainly made good sense to sell his investments as described above; instead of belly-aching about sons who were unable to carry on a tradition or found a dynasty.

It would be, I think, facile to speculate on the reactions of the sons to the father’s personality. Certainly, some of them, including the philosopher, did not find life easy, at least, not at certain times. Some were obviously unstable during the father’s lifetime (to the extent of committing suicide without being physically ill), though not Ludwig.

(Homo)sexuality

Much is made of their tendency to homosexuality. I am not competent to go into details. I am just too much impressed by differences within the group (without denying more significant differences between the group and other people). After all, there are flamboyant homosexuals like Oscar Wilde or W.H. Auden, where the first had a family, and the latter did not. Certainly, many homosexuals are quite possessive patres familias. They just don’t like physical contact with women much. Dean Swift didn’t either without (apparently) being homosexual.

Though Wittgenstein often tried to talk about such matters I didn’t like it. Old people should think of other things. But it seems to me that he:

• liked some grandes dames (Sraffa’s mother, Lady Keynes)

• generally, distrusted women: men are foul, but women are vile (cf. Pope: Some men to business, some to pleasure take - but every woman is a rake)

• actually disliked touching women; he once told me he had said to one - presumably, the one whom he was supposed to marry according to Nedo and Ranchetti [1983] - that he would love her all his life, but could not marry her (presumably, in the sense of ecclesiastical law: erectio, introductio, ejaculatio).

Inevitably, from ten years of frequent and extended conversations, information about his diaries, etc., I have an endless supply of anecdotes, also about this side. But I don’t really know what to do with them. I am willing to take on trust that the whole business plagued him, as it is supposed to plague others. But I don’t have enough imagination to put myself in his place. (I can’t claim: nihil humanum mihi alienum.) As I see it, if you have a cold you are really a helpless victim; there is nothing to do. The same applies to mental illness such as paralysis of the will. But with social matters there is the old English advice: Do what you like, as long as you don’t do it in the streets and frighten the horses.

There is another side which I have mentioned in print (and have thereby offended the locals). It is much less dramatic than the business of homosexuality. Wittgenstein spoke, at lectures, most dramatically of the importance of sex. One day, totally out of the blue, he told me with absolute conviction that it is impossible to imagine another person in a state of ecstasy; presumably, not on the trivial ground that some people keep their eyes closed.12 To me the obvious interpretation was that all his ecstasy was associated with his own imagination. G.B. Shaw, incidentally, expressed similar views, but in so to speak comparative terms; in particular, the Mona Lisa was to him more attractive than the women at hand. Again, I am not competent because I just don’t have enough experience. But almost any competent doctor should be able to say whether this sort of thing tends to be more or less satisfactory at different periods of life (e.g. in one’s thirties, Wittgenstein’s most difficult period).

The youngest sister

I stayed with Margarete (the youngest sister, but older than the philosopher) both in Vienna, and at the Villa Toscana in Gmunden in the late forties and early fifties (at Wittgenstein’s suggestion).

She struck me as a parody of him: similar gestures, a certain theatrical elegance, but all of it empty. It is not a matter of originality: she simply did not seem to have thought through anything she said! The only redeeming feature was that she avoided hackneyed language, but that was all. Some anecdotes:

• According to her she was psychoanalyzed by Freud13 (for completing her ‘education’). She was not a perfect pupil. Her eldest son developed a stammer. She took him to Freud at the age of five. After a few minutes Freud asked her how long the child had been stammering, and whether there were siblings. Yes, a younger son. Freud asked for his age: she had not noticed that the older child started stammering just after the other was born. After telling me the story she proposed to psychoanalyze me (to cure me of sleeping problems). Happy end to the story. Of course, the child did not stop stammering after Freud’s diagnosis. But as a grown man, in his fifties, he did: when the mother died.

12 Evidently, Picasso did not; otherwise it would not have been so natural to him to paint three-eyed women.

13 Incidentally, Wittgenstein had the same experience as I with Freud’s Interpretation of Dreams (or, for that matter, Marx’s Kapital): absolutely overwhelmed by the first half dozen pages. No trace of ‘internal’ resistance; on the contrary, a very strong temptation to be non critical.

• Wittgenstein was very proud of the following episode (I’d be too). The sister bought her clothes in Paris, I think at Dior’s. One day he saw her come down the staircase of the family townhouse, and found there was something wrong with her dress; he then showed her how he would change it, pulling it together near the navel. Then she told him that originally there was a button at precisely that place, which she had removed.

• The sister constantly bragged about her exquisite taste, and her open-mindedness. As to the former, her bedroom was full of cartoons by Daumier whose haggard faces could have been cartoons of her own. (The cartoons were excellent, and surely fine for her because she saw them, without seeing herself at the same time. But she received visitors to the levée.) As to the latter, at my suggestion she had a friend stay with her for several weeks. (He was in trouble with the police.) She threw him out because he repeated an admittedly weak joke about the two big reasons for Rosalind Russell’s success.

The rest of the family

I don’t know too much of other family relations (see Nedo and Ranchetti [1983]). For example, Wittgenstein talked to me of his mother with great warmth, but as somewhat absent minded. Hermine, the eldest sister, says of the mother that she was totally incapable of thinking any thought through, except for - what she called - musical thoughts.

As for the nephews and nieces, it has been said that Wittgenstein’s not having a family was proof of his sense of social responsibility; even if he has occasionally committed the ‘crime that dare not speak its name’.14

14 Incidentally, according to St. Thomas Aquinas the corresponding heterosexual activity is a worse crime in the eyes of the Church; pedantically, when the seed is spilled in situ: so near, and yet so far, as they say.

The house in the Kundtmanngasse

Speaking for myself, I was bowled over by the elegant proportions, especially of the interior (designed by Wittgenstein). As to the exterior it wasn’t so marvellous; for example, if I think of a place near Fontainebleau that now belongs to the Duc de Noailles, and was originally built for La Pompadour by the state architect of Louis XIV, who also designed the façade of the Place Vendôme. It seems such an odd idea to design a house oneself instead of travelling to France, and looking at a better design.

The amazing features were the proportions. The various rooms were of different heights; the library high, the bedrooms lower, but everything orderly. If you look at the whole building from afar, the impression is wholly harmonious. If you are close, and look at just one wing, the impression is again harmonious, but of course the proportions quite different: the eye rests, and is not distracted by the surroundings.

To me the most impressive ‘furniture’ at the house was Postl. He had been one of Wittgenstein’s pupils at Trattenbach (or nearby); or perhaps, Wittgenstein had only met him there at their musical evenings. I don’t think I have ever met a more serene person, with a more perfect manner; at least, to me: no trace of theatre, so to speak to help nature along. The only approximation I can think of are Dirac and Deligne among academics, and some Tuaregs (Bedouins) in North Africa. I didn’t have the sense to ask him for photographs of himself in the 20’s. Would others, besides Wittgenstein, have recognized these qualities of Postl when he was still in his teens or early twenties? (Perhaps it was easy. From time to time one meets people, and says to oneself: I never knew that they still make them that way.)

Music

There was a curious gushing way in which the Wittgensteins talked about music. I know nothing about music.15 But when Artur Rubinstein or Georg Solti talk about it in TV interviews I don’t detect this gushing note. I’d almost apply here a remark of Wittgenstein’s to me (he did not like my frivolous style): If you ever committed a murder I probably should not mind the deed. But I’d surely hate the way you’d talk about it.

Probably not the whole family talked that way. After all I only met Wittgenstein and his sister (of his generation), never the father.

15 Vague regrets. When I saw a lot of Wittgenstein (in the forties) I assumed one had to have at least some musical talent to listen to remarks about music; so I did not pay any attention to his. Now I know that this is not so. I know several (performing) musicians who can speak abstractly about music, and I can check my understanding by talking of the visual effects produced by (a film of) the orchestra or the pianist; I can’t do it with violinists.

A couple of reminiscences:

• Once - and quite contrary to his gushing style - Wittgenstein talked of the laws of harmony making good sense without any reference to feelings, and thus being a more suitable subject for study.

• From the sublime to the ridiculous. At least on one occasion he said that his first memory was of being in a cradle when he was tickled by Brahms’ beard. (I don’t have a clear first memory. But if I had to choose I’d be quite happy to choose this one.)

10.3 Tractatus

As I see Wittgenstein’s work - both the (early) Tractatus and the (later) Philosophical Investigations - it was inspired by rather simple logical ideas, and then given an artistic form. If I am right this is part of an old tradition; the extreme form is the kind of metaphysical poem inspired by very simple ideas of natural science; for example, the atomic structure of matter (Lucretius) or cosmological ideas about the beginning and end of the world (in most religions).

As for Tractatus, the logical idea is simply propositional logic; not even the use of set theory. Later it is the idea of formal rule as in rules for games like chess (and language games).

What I want to do is to explain the attraction of these ideas in quite sober terms: not only by sober examples, but also by expressing the attraction itself without gushing associations. My guess is that this kind of knowledge of the ideas can heighten the appreciation of Wittgenstein’s artistic presentation.

So the first step is to say something of propositional logic (without Wittgenstein’s hyperbole of the logical structure of the whole world); rather: as homeopathic medicine for knowledge.

Propositional logic

Grammar. Letters $p, q, \dots$ or $p_1, p_2, \dots$ and combinations:

• to each word $w$ also $\neg w$ (read: not $w$)

• to words $w_1$ and $w_2$ also $w_1 \wedge w_2$ (read: $w_1$ and $w_2$).

The following are abbreviations:

• $w_1 \vee w_2$ (read: $w_1$ or $w_2$, vel in Latin) for $\neg[(\neg w_1) \wedge (\neg w_2)]$

• $w_1 \to w_2$ (read: $w_1$ implies $w_2$) for $\neg[w_1 \wedge (\neg w_2)]$.

Intended meaning. Letters stand for propositions which are so clearly defined that they are either true or false, in particular, do not contain vague terms. For $\neg$ and $\wedge$ the meaning is described by the following tables:

$$\begin{array}{cc|c} p_1 & p_2 & p_1 \wedge p_2 \\ \hline T & T & T \\ F & T & F \\ T & F & F \\ F & F & F \end{array} \qquad \text{and} \qquad \begin{array}{c|c} p & \neg p \\ \hline T & F \\ F & T \end{array}$$

Because of the simple grammar, every combination of $p_1, \dots, p_n$ gets a value T or F.
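The grammar and the two tables can be tried out mechanically. What follows is only a minimal sketch, not anything taken from the text: words are represented as nested tuples, and the names `Not`, `And`, `Or`, `Implies`, `value` and `truth_table` are hypothetical helpers chosen for this illustration. The two abbreviations are expanded exactly as defined above, so only the tables for ¬ and ∧ are ever consulted.

```python
from itertools import product

# Assumed representation (not from the text): a letter is a string such as "p",
# and compound words are tuples ("not", w) or ("and", w1, w2).
def Not(w):
    return ("not", w)

def And(w1, w2):
    return ("and", w1, w2)

# The two abbreviations, expanded exactly as in the grammar.
def Or(w1, w2):
    return Not(And(Not(w1), Not(w2)))

def Implies(w1, w2):
    return Not(And(w1, Not(w2)))

def value(w, assignment):
    """Value (True for T, False for F) of the word w under an assignment of letters."""
    if isinstance(w, str):
        return assignment[w]
    if w[0] == "not":
        return not value(w[1], assignment)
    return value(w[1], assignment) and value(w[2], assignment)

def truth_table(w, letters):
    """List of (values of the letters, value of w), one entry per assignment."""
    rows = product([True, False], repeat=len(letters))
    return [(row, value(w, dict(zip(letters, row)))) for row in rows]

for row, result in truth_table(Implies("p", "q"), ["p", "q"]):
    print(row, result)
```

The loop prints one line per assignment of the two letters; the word for "p implies q" comes out false only when p is true and q is false.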

Exercise 10.3.1 There are $2^n$ different ways of assigning values T or F to $p_1, \dots, p_n$, and $2^{2^n}$ different ways (propositional functions) of assigning T or F to $(p_1, \dots, p_n)$ (including the two constant functions: e.g., $p \wedge \neg p$, $\neg(p \wedge \neg p)$).
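For small n the two counts in the exercise can be checked by plain enumeration. The sketch below is an illustration under assumptions of its own: a propositional function is represented as a dictionary from assignments to values, and the names `assignments` and `propositional_functions` are hypothetical.

```python
from itertools import product

def assignments(n):
    """All ways of assigning T (True) or F (False) to n letters."""
    return list(product([True, False], repeat=n))

def propositional_functions(n):
    """All functions from assignments of n letters to {T, F}.

    Each function is a dict: assignment -> value. A function is fixed by
    choosing one output per assignment, hence 2 ** (2 ** n) choices.
    """
    rows = assignments(n)
    return [dict(zip(rows, outputs))
            for outputs in product([True, False], repeat=len(rows))]

for n in range(4):
    print(n, len(assignments(n)), len(propositional_functions(n)))
```

Enumeration is only feasible for very small n, since the second count doubles its exponent with each new letter.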

Warning. Not all natural properties of propositions depend only on whether components are true or false; for example: interesting, offensive, silly, deceptive, and so forth.

Theorem 10.3.2 Functional Completeness Theorem. All propositional functions can be defined from ∧ and ¬ by composition.

Proof. By induction on n. It is true for n = 0 (constants T or F; cf. above). Suppose it is true for n. Any propositional function f of n + 1 arguments can be written as

f(p1, . . . , pn, pn+1) = [pn+1 ∧ f(p1, . . . , pn, T)] ∨ [¬pn+1 ∧ f(p1, . . . , pn, F)].

But both f(p1, . . . , pn, T) and f(p1, . . . , pn, F) are propositional functions of n arguments, and ∨ was defined above. □
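The induction in the proof can be read as a recursive procedure. The following Python sketch is an illustration only and not part of the original text: the representation of formulae as nested tuples and the names `build`, `evaluate` and `agrees` are assumptions made here. For a truth function of n arguments it builds a formula that uses only ∧ and ¬ (with ∨ as the abbreviation introduced earlier) and then compares the two on every assignment.

```python
from itertools import product

P1 = ("var", 1)
FALSE = ("and", P1, ("not", P1))   # the constant F, written as p ∧ ¬p
TRUE = ("not", FALSE)              # the constant T, written as ¬(p ∧ ¬p)

def or_(a, b):
    # a ∨ b is the abbreviation ¬[(¬a) ∧ (¬b)]
    return ("not", ("and", ("not", a), ("not", b)))

def build(f, n):
    """Formula over ∧ and ¬ computing the truth function f of n arguments."""
    if n == 0:
        return TRUE if f() else FALSE
    # the two functions of n - 1 arguments used in the induction step
    with_true = build(lambda *a: f(*a, True), n - 1)
    with_false = build(lambda *a: f(*a, False), n - 1)
    pn = ("var", n)
    return or_(("and", pn, with_true), ("and", ("not", pn), with_false))

def evaluate(formula, values):
    """Value of a formula when letter p_i has the value values[i - 1]."""
    tag = formula[0]
    if tag == "var":
        return values[formula[1] - 1]
    if tag == "not":
        return not evaluate(formula[1], values)
    return evaluate(formula[1], values) and evaluate(formula[2], values)

def agrees(f, n):
    """Check the built formula against f on all 2^n assignments (n >= 1)."""
    formula = build(f, n)
    return all(evaluate(formula, v) == f(*v)
               for v in product([True, False], repeat=n))
```

The formula produced this way grows quickly with n; the sketch is meant to show that the construction is mechanical, not that it is economical.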

Warning. This is not a good time, for us, to explain formal programs. Of course, there are formal programs to generate socalled valid expressions, that is f(p1, . . . , pn) which are = T for all combinations of p1, . . . , pn. But it is too easy to think of the meaning, and one has to force oneself to use a particular program. It is never wise to put oneself into a situation which leads to temptations. So in this chapter - on philosophy - we do it later. In a chapter on magic it would be different.

Finite Problems

The amazing fact is that anything at all can be expressed by these tiny tables.

Example. We express in propositional terms (the pigeon hole principle): if you put M things into N holes, and M > N then one hole will have more than one thing.

Let 1 ≤ i ≤ N and 1 ≤ j ≤ M, and pi,j mean: the thing j (has been put) in the hole i.

• For each j (j = 1,...,M), the thing j is in some hole:

p1,j ∨ p2,j ∨ · · · ∨ pN,j.

• Everything is in some hole:

(p1,1 ∨ p2,1 ∨ · · · ∨ pN,1) ∧ · · · ∧ (p1,M ∨ p2,M ∨ · · · ∨ pN,M ). (10.1)

• For each i (i = 1,...,N), more than one thing is in the hole i, that is, at least 2 things are in the hole i (Ai):

(pi,1 ∧ pi,2) ∨ (pi,1 ∧ pi,3) ∨ · · · ∨ (pi,1 ∧ pi,M) ∨ (pi,2 ∧ pi,3) ∨ · · · ∨ (pi,2 ∧ pi,M) ∨ · · · ∨ (pi,M−1 ∧ pi,M).

• At least one hole contains more than one thing:

A1 ∨ A2 ∨ · · · ∨ AN . (10.2)
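Formulae (10.1) and (10.2) can be written down and evaluated quite mechanically. The Python sketch below is an illustration under assumptions made here (the function name `pigeonhole_is_valid` and the list-of-lists layout for the pi,j are not from the original); it tries every assignment of T or F to the pi,j and reports whether the implication from (10.1) to (10.2) always comes out T.

```python
from itertools import product

def pigeonhole_is_valid(M, N):
    """True if (10.1) -> (10.2) has the value T under every assignment."""
    for bits in product([True, False], repeat=M * N):
        # p[i][j] stands for: thing j is in hole i (indices start at 0 here)
        p = [[bits[i * M + j] for j in range(M)] for i in range(N)]
        # (10.1): everything is in some hole
        everything_placed = all(any(p[i][j] for i in range(N))
                                for j in range(M))
        # (10.2): some hole i contains two different things j and k
        some_hole_crowded = any(p[i][j] and p[i][k]
                                for i in range(N)
                                for j in range(M)
                                for k in range(j + 1, M))
        if everything_placed and not some_hole_crowded:
            return False   # an assignment making the implication F
    return True
```

For M = 3 and N = 2 the check runs through 64 assignments; for M = N = 2 it finds an assignment that makes the implication false, as it should, since the hypothesis M > N fails.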

Suppose you have a gadget that can handle the simple grammar of propositional logic, but not the ideas of holes and what to put in them. Then the pigeon hole principle can be coded for the gadget, modulo the insight that the validity of (10.1 → 10.2) expresses the pigeon hole principle. If the gadget is programmed to compute (10.1 → 10.2) for all pi,j, then our knowledge of the pigeon hole principle allows us to conclude that the gadget will come up with a constant value T.

Reduction of the pigeon hole principle to (10.1 → 10.2) above is called logical (in particular, propositional) foundation (of knowledge about finite situations).

Use of the interpretation of (10.1 → 10.2) as expressing the pigeon hole principle is called exercise of mathematical imagination. The terminology ‘imagination’ is good because there is simply no limit on the variety of interpretations of a propositional formula.

Translations into propositional terms (above: of the pigeon hole principle) are of course not unique, but more limited.16 They are all correct, and differ in more delicate respects, for example, how easy they are to verify. This is in accordance with the warning on p. 172 that correctness (true versus false) is a special property of propositions only inasmuch as it is particularly simple.

16 Translation into the artificial language is the real art, not the processing of that language.

Before discussing how Wittgenstein’s imagination ran away with him in Tractatus, it is good to linger a moment over the temptation. (That is what saints do when they repent their sins.) Here it is best to do further

Exercises 10.3.3

a) Four color problem. If a map represents N countries, each country can be colored in one of four colors (say: red, white, blue, and green) in such a way that if two countries have a border, not merely a point, in common they get different colors. Express the problem in propositional terms. (Hint: Let ri, wi, bi, gi mean, respectively, that the country i gets the color red, white, blue, green.)

b) Finite linear orderings. If N things are linearly ordered then there is a least element. Express the problem in propositional terms. (Hint: Let pi,j mean that i is before j.)

Infinite problems

The expressive ‘power’ of the propositional language is increased further if one thinks of infinitely many propositional formulae, that is, uses descriptions of such infinite sets in one’s language; cf. differential calculus as against mere difference equations.

Example (going back to the four color problem 10.3.3.a). We think of infinitely many regions R1, R2, ... (e.g. on a sphere or on a torus), and have a list of pairs (i, j) ∈ CB where Ri and Rj have a common boundary, and of course (i, j) ∉ CB when Ri and Rj have no common boundary. They can be colored with, say, four colors if and only if the following propositional formulae can be made true: for each i,

\[ c_i^1 \lor c_i^2 \lor c_i^3 \lor c_i^4 \]

\[ c_i^1 \to (\neg c_i^2 \land \neg c_i^3 \land \neg c_i^4), \quad c_i^2 \to (\neg c_i^3 \land \neg c_i^4), \quad c_i^3 \to \neg c_i^4 \]

and, for all (i, j) ∈ CB,

\[ c_i^1 \to \neg c_j^1, \quad c_i^2 \to \neg c_j^2, \quad c_i^3 \to \neg c_j^3, \quad c_i^4 \to \neg c_j^4 \]

(read $c_i^1, c_i^2, c_i^3, c_i^4$ to mean, respectively, that the region Ri gets the color 1, 2, 3, 4.)

What is a possible advantage of expressing coloring properties in this way? The next theorem shows that if every finite number of regions can be colored in 4 colors then all of them can be so colored. Note that not every fact about infinite configurations has this simple relation to a formally similar statement about finite configurations (cf. every finite ordering has a least element: $\cdots, \frac{1}{n}, \frac{1}{n-1}, \cdots, \frac{1}{3}, \frac{1}{2}, 1$ has not).
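For finitely many regions the coloring formulae can be tested directly. The Python sketch below is an illustration under assumptions made here (the name `colorable`, regions numbered from 0, and the set `cb` of boundary pairs are not from the original). It tries all assignments to the letters, written here as c[i][k] for "region i gets color k", and checks the three groups of conditions: at least one color per region, no two colors for one region, and different colors across a common boundary.

```python
from itertools import product

def colorable(n_regions, cb, colors=4):
    """True if the coloring formulae for the given regions can be made true.

    cb is a collection of pairs (i, j) of regions with a common boundary.
    """
    for bits in product([True, False], repeat=n_regions * colors):
        # c[i][k] stands for: region i gets color k
        c = [bits[i * colors:(i + 1) * colors] for i in range(n_regions)]
        at_least_one = all(any(ci) for ci in c)
        # same content as the implications c_i^k -> not c_i^l for k < l
        at_most_one = all(not (ci[k] and ci[l])
                          for ci in c
                          for k in range(colors)
                          for l in range(k + 1, colors))
        neighbours_differ = all(not (c[i][k] and c[j][k])
                                for (i, j) in cb
                                for k in range(colors))
        if at_least_one and at_most_one and neighbours_differ:
            return True
    return False
```

Trying all assignments is of course only feasible for very small maps; the sketch shows what "can be made true" means, not how to decide it efficiently.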

Theorem 10.3.4 Finiteness Theorem. Suppose P1, P2, ... are propositional formulae, each built up from finitely many propositional symbols p1, p2, ... If for each N:

P1 ∧ P2 ∧ · · · ∧ PN can be made true by suitable choice of p1 = T or F, p2 = T or F, · · ·

then all Pi can be made true simultaneously.

Proof. Consider $p_1$.

1. If, for each $N$, $P_1 \land \cdots \land P_N$ can be made true by taking $p_1 = T$, then define $\bar{p}_1 = T$.

2. If case 1 is false, e.g., $P_1 \land \cdots \land P_{N_1}$ cannot be made true with $p_1 = T$, then, for each $N$, $P_1 \land \cdots \land P_N$ can be made true with $p_1 = F$. Otherwise some $P_1 \land \cdots \land P_{N_1'}$ cannot be made true with $p_1 = F$, and so if $N_1'' = N_1 + N_1'$ then $P_1 \land \cdots \land P_{N_1''}$ cannot be made true at all, neither with $p_1 = T$ nor $p_1 = F$. In case 2 we define $\bar{p}_1 = F$.

Suppose $\bar{p}_1, \ldots, \bar{p}_n$ have already been defined.

1. If, for each $N$, $P_1 \land \cdots \land P_N$ can be made true with $p_1 = \bar{p}_1, \ldots, p_n = \bar{p}_n$, and $p_{n+1} = T$, then define $\bar{p}_{n+1} = T$.

2. If case 1 is false, e.g., $P_1 \land \cdots \land P_{N_1}$ cannot be made true with the conditions described, then, as with $p_1$ and for each $N$, $P_1 \land \cdots \land P_N$ can be made true with $p_{n+1} = F$ (and $p_1 = \bar{p}_1, \ldots, p_n = \bar{p}_n$). Otherwise, again some $P_1 \land \cdots \land P_{N_1 + N_1'}$ cannot be made true with $p_1 = \bar{p}_1, \ldots, p_n = \bar{p}_n$ at all. (If it could, $p_{n+1}$ would have to be either $T$ or $F$.) Then define $\bar{p}_{n+1} = F$.

So we have defined a sequence $\bar{p}_1, \bar{p}_2, \ldots$ under the hypothesis that each finite conjunction $P_1 \land \cdots \land P_N$ can be made true.

Observe that $P_N$ contains only finitely many propositional symbols. By definition $\bar{p}_1, \bar{p}_2, \ldots$ satisfies $P_N$, and thus all the formulae $P_1, P_2, \ldots$ □
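The shape of the construction (fix one letter at a time, preferring T whenever the formulae can still be satisfied) can be imitated for finitely many formulae. This is only a finite analogue and an assumption-laden sketch: in the theorem the test "for each N" ranges over infinitely many conjunctions and is not something a program can simply run through. The names `extendable` and `choose_values`, and the representation of each formula as a Python function on a tuple of truth values, are introduced here for illustration.

```python
from itertools import product

def extendable(formulas, n, prefix):
    """True if the values in prefix (for p1..pk) can be completed to an
    assignment for p1..pn that makes every formula true."""
    k = len(prefix)
    return any(all(P(prefix + rest) for P in formulas)
               for rest in product([True, False], repeat=n - k))

def choose_values(formulas, n):
    """Define the values one letter at a time, as in the proof:
    take T if that still allows all formulas to be made true, else F."""
    if not extendable(formulas, n, ()):
        return None   # hypothesis of the theorem fails
    chosen = ()
    for _ in range(n):
        if extendable(formulas, n, chosen + (True,)):
            chosen += (True,)
        else:
            chosen += (False,)
    return chosen
```

As in the proof, choosing F in the second case is safe because the values chosen so far could be extended, and the extension with T has just been ruled out.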

Language: expressive and reasoning capacities

To recognize that the propositional formulae used earlier on express - what is normally described as - the possibility of coloring the regions Ri in 4 colors trivially requires that we also understand that possibility! So we have in no way analysed our ordinary reasoning capacities: we have extended our ordinary methods, e.g. by recognizing the finiteness property above. It is clumsy to stress analysis here (as is often done) instead of extension:

• It is false, and so creates suspicion of the whole enterprise.

• Practically speaking, extension is more rewarding than analysis (of what we know already).

• Talking of analysis draws attention to its problems - and as long as we have no real idea how to solve those problems, we’ll be stuck with sterile efforts.17

17 Is this philosophy? in the sense of: pursuing wisdom?

Anti-psychologism (another clumsy word for timely advice)

In the philosophical literature there is great excitement on whether logic does or does not depend on psychology, whether the laws of logic are objective or subjective, etc. Now it is simply obvious that actual thinking requires awareness; for example, one does not think too well when one is asleep. So, clearly, the issue is elsewhere. Can anything rewarding be said (about laws of logic) that neglects specifically psychological elements, socalled wetware altogether?18 So if there is any good sense to the heated debates about logic and psychology at all, a question to which we return below, it is a controversy on what is realistically possible.

Thus one side believes that simply nothing rewarding will be done in the general area (around logic) if psychological aspects are ignored altogether.19 This belief is certainly exaggerated because something useful can be done with switching circuits used in computers.

The other side believes that we know so little about specifically psychological aspects that merely touching them will drag one into a morass. But this belief is also not without consequences! Given this view one will regard even the most trivial little tricks about untainted, that is, non psychological, propositional logic as sensational because it is compared to: nothing at all on the psychological side.

The - incidentally widespread, so to speak ‘democratic’ - assumption that any heated debates are symptoms of some important, possibly unconscious issues neglects the possibility that the heat of the debate is its own fuel, to be compared to: indignation which can be its own reward.
A more distant parallel: l’appétit vient en mangeant (appetite comes with the eating). In other words, appetite is not an indication of hunger; at least, not among the people most of us know personally. And philosophical debates have the flavor of luxury rather than necessity.

The general discussion above is, literally, trivial, that is: not specific to our main subject; though it may be wise, in other words, sound philosophy, to remind oneself of these general points. However, it is important not to forget the specific matter of propositional logic either.

18 Cf. the case of matter, all of which is full of electrically charged particles: certain natural phenomena are not affected by this side, i.e. the electrical effects cancel out.

19 The obvious parallel is: if, in geometry, only constructions with rulers are considered, not even a pair of compasses, let alone, the differential calculus, geometry is bound to deteriorate into hair splitting.

Specific Reminders

Propositional language is remarkable because it combines considerable expressive possibilities with a simple vocabulary and grammar, granted that this expressive power (i.e. equivalence to expressions in a non flexible language) is recognized. But that simplicity is futile for those intelligent systems that already master a richer vocabulary and/or grammar, unless supplemented by more detailed results. Specifically, results depending on the fact that

propositional language has restricted expressive power!

For example, the Finiteness Theorem is a result that is clearly not shared by ordinary language (say, of ordinary arithmetic). It is an open question for which systems the simple vocabulary and grammar pays off. Switching circuits have already been mentioned, especially old ones which could implement ¬ and ∧ particularly cheaply. What has not even been touched so far is the matter of propositional reasoning in the wide (not necessarily psychological) sense:

efficient means of testing propositional formulas.

Recall that our knowledge of the pigeon hole principle was used to derive propositional formulae, not the other way around!

Formal rules for propositional logic

The main aim is to convey an idea which fired the imagination of lots of people in the first third of this century, including Wittgenstein. (He was very sensitive to this sort of thing.) One of these people was the best known mathematician of the time, Hilbert, with a special knack for catchy phrases. He talked of rules for games with symbols, the obvious source of Wittgenstein’s term language games, albeit with a mildly differing meaning.20 Today the idea is best illustrated by computer programs or, more precisely, by Simple Simon’s idea of a computer program. Indeed, once the rules below are presented it will be clear that

if a computer couldn’t do that it couldn’t do much else either.

In fact there is no need to go to any pains in order to convey the general idea involved since we all have it anyway. It is perhaps nice to have the particular example which was the grand daddy of it all. And one might as well use the example to illustrate other points into the bargain, in particular,

how to find a more efficient method for deciding if a propositional formula is valid (i.e. constant T) than by trying out all 2^n ways of assigning values T or F to the n basic symbols of the formula, say, P.21

20 Naturally - as always in such circumstances - the variant presents itself as ‘fundamentally’ different. (Luther thought of himself as much farther removed from the pope than from mere atheists.)

General strategy. To try and find one assignment which makes P false (instead of trying out all). If there is none, P is true. So what is needed is a strategy which finds one such assignment if there is one at all.

• First look: conditions for P being false.

1. If P is a basic symbol, p, there is no problem: make p false.

2. If P is ¬Q: Q must be true.

3. If P is Q1 ∧ Q2: at least one of Q1 and Q2 must be false.

The question about P is thus replaced by questions about, literally, parts of P . (Technical term: subformula.) So the process comes to an end. But the question about P being false is sometimes replaced by questions about (other) formulae being true. So we take a

• Second look: conditions for P being true.

1. If P is a basic symbol: no problem.

2. If P is ¬Q: Q must be false.

3. If P is Q1 ∧ Q2: both Q1 and Q2 must be true.

Even if we start with a single formula, the question about its truth is sometimes replaced by two questions (about the truth of simpler formulae).

This suggests considering sets of formulae Γ and ∆ and looking for conditions ensuring that

all formulae in Γ are true and all in ∆ are false (written Γ ⊢ ∆).

If it is not possible to satisfy the conditions then ⋀Γ → ⋁∆ is valid, where ⋀Γ is the conjunction of all formulae in Γ, and ⋁∆ the disjunction of all formulae in ∆.22

21 For each of these 2^n assignments some work is needed to compute the value assigned to P; the work depends on the number of occurrences of the two logical symbols ¬ and ∧ in P.

22 Recall that p → q is ¬(p ∧ ¬q) by definition.

Formal procedure. Suppose Γ = Γ1, P and ∆ = ∆1, P′. The first and second ‘looks’ are made into a system by examining, alternately, the left hand side and right hand side of ⊢ (i.e. what has to be made true, what false).

1. P is p:

Γ1, p ⊢ ∆
─────────
p, Γ1 ⊢ ∆

p is shifted in order that, at the next look but one, another rightmost formula of Γ is treated (the computer has to be told such things; otherwise it plays with p forever).

2. P′ is p:

Γ ⊢ ∆1, p
─────────
Γ ⊢ p, ∆1

3. P is ¬Q:

Γ1, ¬Q ⊢ ∆
──────────
Γ1 ⊢ ∆, Q

4. P′ is ¬Q′:

Γ ⊢ ∆1, ¬Q′
───────────
Γ, Q′ ⊢ ∆1

5. P is Q1 ∧ Q2:

Γ1, Q1 ∧ Q2 ⊢ ∆
───────────────
Γ1, Q1, Q2 ⊢ ∆

6. P′ is Q1′ ∧ Q2′:

Γ ⊢ ∆1, Q1′ ∧ Q2′
──────────────────────────
Γ ⊢ ∆1, Q1′     Γ ⊢ ∆1, Q2′

Here we have a branching (remember trees). Note that the top line is false if either Γ ⊢ ∆1, Q1′ or Γ ⊢ ∆1, Q2′ is false.

Exercise 10.3.5 Verify that you get rules of inference (i.e. rules of proof) in the sense that ⋀Γ → ⋁∆ on top of each rule can be inferred from the bottom. In particular, in 6, ⋀Γ → [⋁∆1 ∨ (Q1′ ∧ Q2′)] is valid provided both ⋀Γ → (⋁∆1) ∨ Q1′ and ⋀Γ → (⋁∆1) ∨ Q2′ are valid.

Suppose now the total number of occurrences of logical symbols in Γ ∪ ∆ is M. Then after at most M steps - not counting 1 and 2 - only basic symbols occur to the left and the right of ⊢.23 And there are two cases.

23 Exercise. How often can steps 1 and 2 occur, given M and the number of occurrences of basic symbols in Γ ∪ ∆?

• At some top node N of the tree, ΓN ⊢ ∆N (where only basic symbols, i.e. letters, occur), ΓN ∩ ∆N = ∅ (empty); that is, no letters occur on both sides of ⊢. Then all Γ can be made true, and all ∆ false.

Rule. Make the p in ΓN true, and those in ∆N false, and give any value you like to p which occur in Γ ∪ ∆, but not in ΓN ∪ ∆N.

Proof. Consider the whole path from N back to the starting point with Γ ⊢ ∆. Use induction on the distance from N to the starting point to show that

at each node on the path all formulae on the left of ⊢ are true, those on the right are false.

At N this is so by hypothesis. Suppose so at distance k. Look at k + 1, and suppose node k has been generated by 1 or 2: nothing to prove since p is only shifted; by 3 or 4: clear, and so is 5. For 6, node k is either Γk ⊢ ∆k, Q1′ or Γk ⊢ ∆k, Q2′ and node k + 1 is Γk ⊢ ∆k, Q1′ ∧ Q2′. But to make Q1′ ∧ Q2′ false, it is enough if either of the two nodes Γk ⊢ ∆k, Q1′ and Γk ⊢ ∆k, Q2′ satisfies the induction hypothesis.24

• At all top nodes N, ΓN ∩ ∆N ≠ ∅. Then ⋀Γ → ⋁∆ is valid, that is, no assignment of truth values to the letters makes all formulas in Γ true, and all those in ∆ false.

Proof. Again by induction, starting at the top nodes (where the assertion is obvious). The critical case is 6 when ∆k+1 is ∆k, Q1′ ∧ Q2′. Then the nodes above it are Γk+1 ⊢ ∆k, Q1′ and Γk+1 ⊢ ∆k, Q2′, and so by hypothesis ⋀Γk+1 → (⋁∆k) ∨ Q1′ and ⋀Γk+1 → (⋁∆k) ∨ Q2′ are valid; hence so is ⋀Γk+1 → [(⋁∆k) ∨ (Q1′ ∧ Q2′)] (cf. Exercise 10.3.5).
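The search described by rules 3 to 6 and the two cases above can be sketched as a short recursive program. The Python below is an illustration under assumptions made here, not the author’s own formulation: formulae are nested tuples, the names `falsify`, `var`, `neg`, `conj` and `implies` are hypothetical, and the bookkeeping of rules 1 and 2 (shifting letters) is replaced by simply picking any formula that still contains a logical symbol.

```python
def var(name):
    return ("var", name)

def neg(a):
    return ("not", a)

def conj(a, b):
    return ("and", a, b)

def implies(a, b):
    # a -> b is the abbreviation for not(a and not b)
    return neg(conj(a, neg(b)))

def falsify(gamma, delta):
    """Look for values making all formulae in gamma true and all in delta
    false. Returns a dict from letters to values, or None if there is none."""
    for i, f in enumerate(gamma):
        if f[0] != "var":
            rest = gamma[:i] + gamma[i + 1:]
            if f[0] == "not":                            # rule 3
                return falsify(rest, delta + [f[1]])
            return falsify(rest + [f[1], f[2]], delta)   # rule 5
    for i, f in enumerate(delta):
        if f[0] != "var":
            rest = delta[:i] + delta[i + 1:]
            if f[0] == "not":                            # rule 4
                return falsify(gamma + [f[1]], rest)
            # rule 6: branching, either conjunct may be the false one
            first = falsify(gamma, rest + [f[1]])
            if first is not None:
                return first
            return falsify(gamma, rest + [f[2]])
    # only letters are left: compare the two sides
    left = {f[1] for f in gamma}
    right = {f[1] for f in delta}
    if left & right:
        return None          # some letter would have to be both T and F
    values = {p: True for p in left}
    values.update({p: False for p in right})
    return values
```

A formula P is then valid exactly when `falsify([], [P])` returns `None`. Letters that do not appear in the returned dictionary may be given any value, as in the Rule stated in the first case.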

24 More sophisticated exercise. Find another proof by induction on the number of logical symbols in any formula occurring on the path considered. Reason for the exercise. In other parts of logic infinite paths occur such that no basic symbol ever occurs both left and right of ⊢ on any node. The sophisticated argument will generalize, but not the simple argument above which depends on having a path with an end.

Discussion

Suppose Γ and/or ∆ are infinite but countable with the basic symbols p1, p2, ... By the Finiteness Theorem, Γ ⊢ ∆ cannot be satisfied if and only if, for some finite sets Γ′ ⊆ Γ, ∆′ ⊆ ∆, ⋀Γ′ → ⋁∆′ is valid. Now it is easy to state the improved efficiency of the present procedure over considering all possible truth distributions. The latter are uncountable, the tree defined here, for given Γ and ∆, is countable.

The rules are found by ‘coding’ distributions which make ⋀Γ true and ⋁∆ false, identifying obstacles to the search (second case above), and translating these obstructions into rules of proof. In contrast, the old literature takes ‘received’ rules and then verifies that they generate all valid statements. (The technical expression is: semantic completeness.)

The argument works equally well for other choices of basic operations such as ∧ and ∨; in particular, there is no need for the logical symbols to define both ∧ and ¬. (The technical expression for collections of operations that do define ∧ and ¬ is: functional completeness). The literature treats such cases as if they needed additional inspiration.

Warning. Far from constituting a description of all possibilities of propositional reasoning the rules just given show that a tiny portion of those possibilities is enough to generate the valid implications ⋀Γ → ⋁∆, even if purely propositional methods are considered (and not interpretations of propositional formulas as expressing, for example, instances of the pigeon hole principle and map coloring theorems). For example

Γ1 ⊢ ∆1, P     Γ2, P ⊢ ∆2
──────────────────────────
Γ1 ∪ Γ2 ⊢ ∆1 ∪ ∆2

is valid but is not included among the rules given, since the latter, by construction, have the property:

Γ ⊢ ∆ is inferred from Γ′ ⊢ ∆′ (and, in one case, also Γ″ ⊢ ∆″) where the formulae in Γ′, ∆′ (and Γ″, ∆″) are proper parts of those in Γ ∪ ∆.

In contrast, P above can be anything! Once again it is an advantage that the rules given do not correspond to the totality of possibilities of reasoning, since we know that we know very little about that totality. As a corollary: There is a chance of doing better once we have discovered limited purposes for which the rules above are enough.

10.4 Transition to Wittgenstein’s Later Concerns

I have already stressed the fact that standard dates - socalled historical facts - cannot possibly be very informative simply because quite a lot of people are born and die on roughly the same day etc., but have very different lives (and if one were to try and be more exact one had better do astrology).

The conclusion is not that the facts are dubious in the sense of being possibly incorrect; they may be sterile. But also: ignorance of these simple facts is liable to lead to grotesque misconceptions.

This is a refrain going practically through everything I have been talking about for nearly 15 years. It is thus more promising to continue to look at singular events.

Indignation

Wittgenstein’s years in the decade after World War I were unhappy.25 There are several independent aspects to this, but, as always, if there is a weak spot somewhere the effect of another weak spot is increased. (If alcohol disturbs the chemical balance in one part of the brain, and barbiturates in another, the combination can be spectacular.) In order of accessibility:

1. If Tractatus really solved everything, as he said (and perhaps thought), what do you do for an encore? I once knew the girl who played Lolita in the film, thus being a huge success in her teens. She just didn’t know what to do with herself.

2. Clearly, 1 can become worse if one has doubts about the early achievement itself. It is a matter of temperament - some say ‘character’ - if it is. Another reaction is to investigate the doubts, and confirm or remove them.

Anecdote. Wittgenstein never recovered from an observation by Sraffa about a singularly exaggerated formulation in Tractatus: the formal structure of a symbol (i.e., a propositional sentence) corresponds to the objective structure of the fact symbolized. Sraffa’s (counter)example: an Italian gesture meaning that one is unconvinced. As I once told Wittgenstein: If this were the only defect of Tractatus it would be a very good book. A less coy way of saying the same thing: such an observation can only be a trigger; it has an effect if one is full of generalized doubts from the start.

Both 1 and 2 concern feelings about one’s knowledge or ignorance (not, primarily, about relations to others). A more popular point is this:

3. Wittgenstein’s homoerotic - or the presumably related autoerotic - tendencies (and, as was mentioned earlier in connection with Dean Swift, a fear of the unknown). Again, the primary question would be under what conditions, either exogenic or endogenic,26 this becomes an urgent problem.

Obviously, all these matters are most important, for example, to anybody who might have wanted to employ or marry Wittgenstein at the time. The question for us is: Does it concern us?

Sure it does. Lots of people live quite comfortably with the kinds of problems that Wittgenstein had; or, if not comfortably, frivolously, etc. So we have here a case of making a mountain out of a molehill and being proud of it. In short, Wittgenstein had an exceptional talent for indignation.

This is not conclusive. Sometimes indignation is confined to particular subjects, but not to one’s profession; e.g., in business people with a talent for indignation are likely to be successful executives only if this talent does not extend to their professional encounters. But in philosophy or literature many successes are directly related to the talent for indignation (and not only because of the pleasure which the exercise of this talent furnishes, with greater drive as a bonus).

The exceptional role of indignation in Wittgenstein’s private life provides a good reason at least for remembering the possibility of excessive indignation in his philosophical writing too; both at his own oversights and at common blind spots. And I think it pays to read him with this in mind.

25 When I knew Wittgenstein in the forties he was not particularly unhappy.

26 E.g. age: Wittgenstein was in his thirties.

Artistic skill

There is another aspect of his intellectual temperament, superficially similar to the excessive talent for indignation, but, qualitatively, at the opposite extreme: the rare talent of artistic skill. After all, there are plenty of people whose imagination is fired by a little learning, but few who can make something of general appeal out of it. It seems to me that Wittgenstein had this talent to a high degree.

Warnings. This talent is no guarantee for the ability to analyze itself! For example, Wittgenstein often talks about his interest in reflecting on art or artistic skills (and I at least have never found anything memorable about this in his writings except for the remark that this interest can be more absorbing than an interest in science). He also talks a great deal about clarification (Klärung), when he obviously means exposition. And the last thing he means is clear, sober exposition suited for somebody who wants to know; what he means is arresting, memorable exposition that makes you sit up, and, especially, exposition for the bewildered; for somebody who has thought too much (for his intelligence27), and just doesn’t know how to get out of the morass of such thoughts.

27 Remember Talleyrand: Le prince pense beaucoup trop pour son intelligence.

Thus if Wittgenstein’s particular talent is to be employed effectively, it had better look for an unconventional subject matter. For the conventional academic subjects are not likely to benefit from it, for a so to speak flattering reason: this talent is too rare to be needed in an academic subject (that has to be teachable), and so it is very chancy that it will ever be useful in such a subject. Accordingly - and here I mean: in effect, not necessarily on purpose of course - Wittgenstein picked on something that is miles away from the preoccupation of most contemporary philosophers, including his would-be disciples, but certainly has parallels in the mixed bag of topics that have been listed under ‘philosophy’ in the past. He himself speaks of his activity being one heir of (traditional) philosophy.

A practical conclusion for this chapter (since I don’t have his talent) is: not to attempt to pursue his own writing in detail - something that one can do by oneself - but rather to explain the ideas which fired his imagination. This need not mean that the particular aspects stressed here did the trick for him. But it seems fair to assume:

Whoever has the intellectual temperament needed to benefit from reading Wittgenstein, will quite naturally guess from the exposition given here which twists are central to Wittgenstein’s exposition.

(The twists may range from mere verbal association to germs of ideas that become effective only in very advanced sciences).

10.5 Philosophical Investigations

Tractatus seems to me the most memorable presentation of the excitement which - for good or ill - the idea of a universal language, with a formally precise grammar and a small vocabulary, arouses in us. Without exaggeration: tacitly assuming - our picture of - the world to be finite, Wittgenstein presents propositional logic as a typical universal language.

By not being too explicit about the meaning of the basic symbols, the socalled simples, Wittgenstein leaves, overall, a more, not less convincing impression; at least, in the reader who lets himself be carried away by the dramatic exposition. Afterwards one is prepared for the truly remarkable expressive power of propositional logic (measured, e.g., by its ratio to the simple syntactic structure). But given a modicum of intellectual sensibility the reader is also prepared - in effect, of course not by Wittgenstein’s conscious intention - for the limitations of his example for a universal language; with luck even prepared to suspect the very ideal of such a language.

After all, if propositional language is accepted as universal, all those ideas that Tractatus dubs as ‘inexpressible’, really cannot be expressed; tacitly, in propositional language. Viewed this way the last sentence of Tractatus - naturally, under the proviso that what is to be said, is to be said in the universal language considered - stares one in the face as a beginning of a new chapter. It is simply obvious that the rest need not be silence. There are all sorts of ways that not only can be, but are used effectively when we do not have notions (of some standard language) to say what we know (e.g. metaphors, including parables and proverbs).

But there is a real gap in Tractatus, corresponding perfectly to the would-be systematic literature on logical foundations. Except for a purely dogmatic, and thus openly suspect, identification between expressibility in some proposed universal language and the actual possibilities of understanding, Tractatus does not even touch the question: What is gained - or lost! - by pursuing the holy grail of such a language?

And here the most rewarding clues come - contrary to an almost universal superstition - not from negative, that is, inexpressibility results, but from theoretical successes, socalled rational reconstructions, showing that this or that can be translated-in-principle into the universal language.

Viewed this way the first few pages of Philosophical Investigations - about St. Augustine, a really suitable patron saint for the trade, with his stress on the arbitrary character of the choice of symbols, for example for explicit definitions - serve as an excellent introduction. They are about orders, as opposed to declarative sentences. Of course, orders can be translated into the latter: it is obligatory that . . . (and a minor extension of propositional logic - socalled deontic logic - provides the appropriate formal ritual). But what’s the pay off?

For a sensible answer one needs to look first at a, realistically speaking, equally difficult question: Where in everyday life is there any genuine pay off for anything remotely like a traditional logical analysis of orders? In other words, where is such an analysis even remotely relevant? Hardly in the building trade (discussed in the first pages of Philosophical Investigations), unless some kind of mechanization is envisaged.

Partial rules: Frege’s bête noire

A perfect parallel in mathematics is found in rules. When mathematical rules are talked about in set theoretic language one uses a paraphrase, the socalled graph of a rule. One forgets everything about the rule except the argument (starting point, input) and the value (output). In particular, if the rule leaves open the output, one leaves out the argument too. The graph is the set of pairs (a, v) where a is the argument and v is the value.

Frege’s ideal of precision was offended, and he wanted every rule to be defined everywhere. So if you also want to read off when the rule is not defined you take some distinguished object, not meant to be a value at all, say ∗, and put (a, ∗) whenever the rule is not defined. Without exaggeration: this ‘egg of Columbus’ simply shows that, as formulated, the holy grails (here: of extensionality) are mirages, as clear and usually more beautiful than the real thing.

So, as a socalled question of principle, the issue about forgetting orders is much ado about nothing. But in practice it can become an interesting question because you look at different aspects of an assertion when you think of it as the paraphrase of a rule and when you don’t. In particular, this will be central in connection with socalled partial rules.28

A prime example is given by computer programs. If you think of a program as a rule whose value is the displayed output, you generally do not know if the rule is defined. If you think of the program as a rule for taking the intermediate steps (as values), not the final output if there is one, then those programs are defined as rules operating on two arguments, the input and the number of the step in the execution.

28 All our ordinary rules are partial in the sense that they are defined for a particular domain, for example, 2^n is not usually defined when n = Julius Caesar. (Frege did worry about n = Julius Caesar!)
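Both devices mentioned above, the distinguished object ∗ and the rule with two arguments (input and step number), can be sketched in a few lines of Python. Everything named here is an assumption for illustration: `UNDEFINED`, `total_graph` and `state_after` are hypothetical names, "not defined" is modelled as a raised exception, and the little halving-and-tripling program merely stands in for an arbitrary program whose termination is not obvious.

```python
UNDEFINED = "*"   # distinguished object, not meant to be a value at all

def total_graph(rule, arguments):
    """Pairs (a, v) for the rule, with (a, *) where the rule gives no value.
    Assumption of this sketch: 'not defined' shows up as an exception."""
    pairs = []
    for a in arguments:
        try:
            pairs.append((a, rule(a)))
        except (ValueError, ZeroDivisionError):
            pairs.append((a, UNDEFINED))
    return pairs

def state_after(x, steps):
    """A rule on two arguments: the input x and the number of steps.
    It is defined everywhere, whether or not the run ever stops."""
    for _ in range(steps):
        if x == 1:
            return ("halted", x)
        x = x // 2 if x % 2 == 0 else 3 * x + 1
    return ("running", x)
```

The first device only works when being undefined is something one can detect; a program that never stops raises no exception, which is the point of the second device.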

Specific attention to partial rules - among the many other things that can be paraphrased in the universal language - is demonstrably of the essence in the case of computer programs. We need only think of - the possibility of - partial rules to see that Cantor’s diagonal argument, applied almost verbatim, leads to proofs of a certain rule being undefined at a suitable argument, in place of: different. And the recognition that computer programs - for all but extraordinarily primitive (ideal) computers - satisfy the hypothesis of Cantor’s argument modified for partial functions, makes this modification into a tool of technology (beyond being a nice jeu d’esprit) and produces the most famous29 - though by no means the most useful - result on computer programs and in logic, under the heading: what computers cannot do.

Generalities (going back to Cantor)

There are two ways of comparing sets: by inclusion ($X \subseteq Y$) and by matching (via a one-one onto function $X \to Y$).30 For finite sets $X$, any subset of $X$ that can be matched with $X$ is equal to it. For infinite sets this is not so (e.g., the even numbers 0, 2, 4, . . . are matched with all numbers 0, 1, 2, . . . ). A set is called countable if it is finite or can be matched with 0, 1, 2, . . . For example, the set of pairs of numbers $(n, m)$ is countable, via the matching

$$(n, m) \mapsto \langle n, m \rangle = 2^n(2m + 1) - 1.$$
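That this matching really is one-one and onto can be checked mechanically, at least on an initial segment of the numbers. The sketch below assumes nothing beyond the formula just given; the function names `pair` and `unpair` are hypothetical, and the inverse relies on the fact that every positive number is uniquely a power of 2 times an odd number.

```python
def pair(n, m):
    """The matching <n, m> = 2^n (2m + 1) - 1."""
    return 2**n * (2 * m + 1) - 1

def unpair(k):
    """Inverse of pair: write k + 1 as 2^n times an odd number 2m + 1."""
    k += 1
    n = 0
    while k % 2 == 0:
        k //= 2
        n += 1
    return n, (k - 1) // 2

# Every number below 1000 is the code of exactly one pair.
print(all(pair(*unpair(k)) == k for k in range(1000)))
```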

Let $f_0, f_1, \ldots$ be a countable sequence of functions of natural numbers. This means, for example, that

$f_0$ is $f_0(0), f_0(1), \ldots, f_0(n), \ldots$

$f_1$ is $f_1(0), f_1(1), \ldots, f_1(n), \ldots$

and generally $f_n(m)$ is the $m$-th value of the $n$-th function. Cantor observed that each $f_n$ differs from the socalled Cantorian diagonalized function $f_C$:

$1 + f_0(0), 1 + f_1(1), \ldots, 1 + f_n(n), \ldots$

Specifically, $f_n$ and $f_C$ differ at the $n$-th place:

$f_C(n) = 1 + f_n(n)$ and $f_n(n) \neq 1 + f_n(n)$.
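For functions that are defined everywhere the diagonal construction is a one-liner. The sketch below is an illustration only: the sample sequence `family`, with $f_n(m) = n \cdot m$, is an arbitrary assumption, and any other sequence of total functions would do as well.

```python
def diagonal(fs):
    """Given fs(n), the n-th function of the sequence, return f_C with f_C(n) = 1 + f_n(n)."""
    return lambda n: 1 + fs(n)(n)

def family(n):
    """Hypothetical sample sequence: f_n(m) = n * m."""
    return lambda m: n * m

f_C = diagonal(family)

# f_C differs from each f_n at the n-th place.
print(all(f_C(n) != family(n)(n) for n in range(100)))
```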

29 The secret behind the fame is simple. People had made a simple confusion. A simple confusion can be removed by a simple distinction. You get something for very little effort: that is a good bargain; in technical language: high marginal utility.

30 One speaks of (infinite) numbers or cardinals in connection with the matching of sets. Finite sets can be matched if and only if they have the same number in the ordinary sense. So there is no conflict if one extends the word ‘number’ to infinite sets, as long as one does not expect too many familiar laws to be satisfied.

Cantor’s point can be reworded as: the set of functions of natural numbers cannot be matched with the numbers, i.e. it is uncountable. All this is plain as a pikestaff if we think of functions as graphs, and fully defined on the numbers. If we think of rules all this sounds odd. After all, rules are usually thought of as being expressed in words or something like it. The idea of having uncountably many words sounds strange. And if we think of computer languages it is obvious that we can enumerate all programs. Can’t a computer do it too? And if it can, what happens?

What computers can and cannot do

More precisely, not real computers are meant, but our primitive idea of a computer; as Simple Simon imagines a computer. This is the reason why the discussion below belongs to philosophy, not to genuine computer science.31

31 Philosophers themselves stress that they are interested in ideas, so - by implication - in those ideas that do not correspond too closely to the real thing.

We now go back to sequences of functions (of natural numbers), but mean rules, in particular, computer programs (for defining such sequences). So we write $r$ instead of $f$. There is also another difference. Given, say, the $n$-th program applied to $m$, denoted by $r_n(m)$, we generally do not know if it terminates: in fancy language, $r_n$ may define a partial function; in ordinary language, there may be empty places in the table

$$\begin{array}{cccc} r_0(0) & r_0(1) & r_0(2) & \cdots \\ r_1(0) & r_1(1) & r_1(2) & \cdots \\ \cdots & & & \end{array}$$

The following cases concern computers that can do much less than we can do with them:

1. There is no arrangement of the - obviously countable set of - programs such that some enumeration program $r_E$ does this:

$$r_E(\langle n, m \rangle) = \begin{cases} r_n(m) & \text{if } r_n(m) \text{ terminates} \\ \text{does not terminate} & \text{if } r_n(m) \text{ does not.} \end{cases}$$

For short, we write

$$r_E(\langle n, m \rangle) \simeq r_n(m).$$

2. There is such an $r_E$, but no diagonalization program $r_D$ such that

$$r_D(n) \simeq r_E(\langle n, n \rangle) \simeq r_n(n).$$

3. There is such an $r_D$, but no Cantorian diagonalization program $r_C$ such that

$$r_C(n) \simeq r_D(n) + 1 \simeq r_n(n) + 1.$$
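The three programs $r_E$, $r_D$ and $r_C$ can be imitated on a toy scale. The following is a sketch under loud assumptions, not a model of any real computer: there are only three hypothetical programs (`double`, `loop_forever`, `countdown`), each written as a generator that yields once per step, and every run is cut off after a given number of steps (`fuel`). The result `None` therefore means only ‘no value yet’, never ‘does not terminate’.

```python
def pair(n, m):
    """The matching <n, m> = 2^n (2m + 1) - 1 used for coding pairs."""
    return 2**n * (2 * m + 1) - 1

def unpair(k):
    """Inverse of pair."""
    k, n = k + 1, 0
    while k % 2 == 0:
        k, n = k // 2, n + 1
    return n, (k - 1) // 2

# Toy programs (hypothetical): each yields once per step and returns its value.
def double(m):
    yield
    return 2 * m

def loop_forever(m):
    while True:
        yield

def countdown(m):
    while m > 0:
        m -= 1
        yield
    return 0

PROGRAMS = [double, loop_forever, countdown]  # r_0, r_1, r_2

def run(n, m, fuel):
    """Take at most `fuel` steps of r_n(m); None means no value has appeared so far."""
    gen = PROGRAMS[n](m)
    for _ in range(fuel):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value
    return None

def r_E(code, fuel):
    """Enumeration program: r_E(<n, m>) imitates r_n(m)."""
    n, m = unpair(code)
    return run(n, m, fuel)

def r_D(n, fuel):
    """Diagonalization program: r_D(n) imitates r_n(n)."""
    return r_E(pair(n, n), fuel)

def r_C(n, fuel):
    """Cantorian diagonalization: r_n(n) + 1 wherever r_n(n) has produced a value."""
    value = r_D(n, fuel)
    return None if value is None else value + 1

print([r_C(n, 100) for n in range(3)])
```

The step bound is the second argument spoken of earlier (the number of the step in the execution): with it every rule here is defined everywhere, and the empty places in the table show up as `None` however large the bound is taken.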

The most famous theorem in logic applies to computers which can do the sort of thing above, and it says that such computers cannot decide whether or not $r_n(n)$ terminates. In other words:

Theorem 10.5.1 There is no termination program $r_T$ such that

$$r_T(n) = \begin{cases} 1 \text{ (or true)} & \text{if } r_n(n) \text{ terminates} \\ 0 \text{ (or false)} & \text{if } r_n(n) \text{ does not.} \end{cases} \qquad (10.3)$$

Proof. Consider the program

$$r_{T'}(n) = \begin{cases} 0 & \text{if } r_T(n) = 0 \\ \text{undefined} & \text{otherwise.} \end{cases} \qquad (10.4)$$

In words the instructions $r_{T'}$ say to start computing $r_T$, and print out 0 if $r_T$ finishes up with 0; otherwise do nothing. Look at $r_T(T')$: by definition of $r_T$ it must terminate, and be either 0 or 1.32 But:

• if $r_T(T') = 0$ then by 10.4 $r_{T'}(T')$ is defined (actually equal to 0), and by 10.3 $r_T(T')$ should be 1

• if $r_T(T') = 1$ then by 10.3 $r_{T'}(T')$ should be defined, and by 10.4 $r_T(T')$ should be 0. $\square$
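The construction in the proof can be written out for any particular candidate. The sketch below makes two assumptions that are not in the text: a program is identified with a Python function rather than with its number $n$, and the candidate termination test `r_T` is taken to be total and deterministic. The names `make_contrary` and `refute` are hypothetical.

```python
def make_contrary(r_T):
    """Build r_T' from a claimed termination test r_T, following (10.4)."""
    def r_T_prime(program):
        if r_T(program) == 0:
            return 0      # defined, with value 0
        while True:       # 'undefined otherwise': never terminates
            pass
    return r_T_prime

def refute(r_T):
    """Say how the candidate r_T goes wrong on its own contrary program."""
    r_T_prime = make_contrary(r_T)
    verdict = r_T(r_T_prime)
    if verdict == 0:
        # r_T claims r_T'(T') does not terminate, so running it is safe: it returns 0.
        value = r_T_prime(r_T_prime)
        return "claimed non-termination, but the program terminates with " + str(value)
    # r_T claims termination; by (10.4) r_T'(T') loops forever, so it is not run here.
    return "claimed termination, but the program never terminates"

print(refute(lambda program: 0))
print(refute(lambda program: 1))
```

Whatever candidate is handed to `refute`, one of the two branches applies, which is all the proof needs; deciding which branch applies requires running the candidate itself.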

Discussion

Simply as a matter of course, rules named by a subscript of ‘r’, and their arguments named by the letter in parentheses after ‘r’ are treated on an equal footing, for example, in the enumeration $r_E$. There is nothing mysterious about it. But as a matter of historical fact nobody used it before logicians did when talking about general rules for computers. In this way a kind of blind spot was removed. There are oodles of simple questions about what has been done already which simply force themselves on one (the French say: qui s’imposent), and can be answered. We shall not do this because this is a different subject, called Recursion Theory (or at least its more interesting part).

For example, here we used nothing about $r_E$ except its mere existence: every $r_E$ satisfying the definition on p. 187 satisfies the rest. More interestingly, in connection with the diagonalization program $r_D$, we did not distinguish between the infinitely many $D$ satisfying

$$r_D(n) \simeq r_E(\langle n, n \rangle),$$

that is, infinitely many even for a given $r_E$. For some $r_D$ satisfying the equation, $r_D(D)$ will, for others $r_D(D)$ will not be defined.33

32 Warning. Given $T'$, in general the computer cannot decide which of the two cases holds. But that does not affect the conclusion. (After all, if one wants to answer a question about what computers cannot do, it would be silly to use only operations that the computer can do!)

33 For formal rules as treated below, there are results under the heading Henkin sentences.

Formal rules (for arithmetic)

Particular computer programs, socalled formal rules for mathematics, were studied intensively before computers were treated generally. A reasonably typical example is the set of rules for propositional logic. For these particular rules rather special kinds of arguments were used to derive special cases of the results given above. Grandiose claims were made for these particular computer programs, especially concerning their relation to reasoning (which are still used in the rhetoric of Artificial Intelligence).

There are reasons for the special form of these formal rules, albeit not good reasons! They are connected with grandiose purposes which these rules were intended to serve: they are not complicated enough for these programs, and are too complicated for the simple questions that are at issue here. As a result one fell between two stools. The grandiose purposes will be discussed in the next section.

The simple question is about arithmetic, and particular formal rules for it, say H (for Hilbert, who liked this sort of thing):

Is H adequate to decide all its problems?

That is, given any formula A in H, is there a derivation in H for A or for ¬A? The answer is a very simple consequence of the very simple result proved above for computers, and here is a translation in the language of computer programs.

• As a master program think of all possible derivations in H laid out in order. All this uses only countability of finite sequences of anything countable.

• To each formula A associate the program: follow the master program till you hit either a derivation of A or a derivation of ¬A, and write as output T , respectively F .

• Hard work (because it is specific to H): find a formula $A_H$ which corresponds to Cantor’s diagonal rule above. (There are lots of systems of arithmetic that cannot do this at all: for example, addition alone is not enough.)

• Easy result: either H is so defective in its expressive possibilities (vocabulary and grammar) that the first two steps above cannot be coded in it (although there is no ambiguity in sight) or for some A the master program will not terminate. Clearly, in the second case H does not decide A.

Given this result it is relatively easy to make short shrift of the original claims for the wonders of formal rules as contributions to an analysis of reasoning. It is equally easy to make short shrift of almost all social or moral claims of which the best tragedies are made. Equally, some of the philosophical writings about the feelings which were generated by those claims have high literary quality.

10.6 Logical Aspects of Language

Tractatus presents the positive side in an extremely simple, but by no means unrepresentative form. Wittgenstein’s later writings stress some negative sides; less simply, and perhaps therefore less convincingly.34 Knowing much more than he did, we naturally wonder about the use of this additional knowledge for assessing the logical aspects of language.

Necessity. Very little is actually needed to convey the principal points. In fact, what we have learned here about propositional logic is enough to make the points about the remarkable efficiency of such a small vocabulary and precise grammar, and about its role for extending our reasoning capacities as opposed to analysing - the mechanisms of - those capacities. More specifically, the existence of a complete formalization - that is, rules for generating all valid propositional formulae - has nothing to do with the actual possibilities of propositional reasoning; in fact, the formalization may draw attention away from these possibilities. As a corollary, the logical ideal of completeness is deceptive.

Luxuries. First of all, besides conveying a point, one may wish to prove a point. And, clearly, to prove the principal points (above), one has to show that they apply also in less simple situations than those of propositional logic. Secondly, one may be interested in the internal coherence of the logical ideals; whether they are of a piece. Take it on trust that the internal coherence (and precision) is impeccable; no less than the formal perfection of Newton’s mechanics or, for that matter, of astrology. The acid test comes with the question whether intended notions serve the purposes for which the notions are intended.

Positive aspects of ‘negative’ results

Obviously, a result which is ‘negative’ for one aim, may be distinctly positive for another. (One man’s meat is another man’s poison.35) Otherwise there would be less to worry about in the choice of aims.

Example. If you get it into your head that rational numbers are the measure of all things (Pythagoras) then the irrationality of $\sqrt{2}$ is a disaster. If you want to find linear factors for all quadratic equations (at least with rational coefficients) such as $x^2 - 2$, you’d better pray for irrationals.

34 In his introduction to Philosophical Investigations - as so often, much more mature than many parts of the text itself - Wittgenstein says that it is not a good book. As it happens he once told me what he meant by this: it should be memorable like a poem (and Tractatus almost lives up to this). Now, without stretching the meaning of the term ‘pollution’, the exaggerated claims and half baked aims of logical foundations, as dramatized in Tractatus, constitute intellectual pollution. For me it is hard to imagine that cleaning up pollution - especially if it is supposed to range from the philosophy of mathematics to the philosophy of sensations - could ever be as dramatic as creating it.

35 From the French: poisson (fish), poison (poison).

Extreme example. Suppose you have a complete system S for some part of arithmetic. Then there is certainly nothing gained from knowing that a formula is provable in S over knowing that it is true. At best, you get some extra knowledge from knowing a particular proof in S. On the other hand, if you have an incomplete system there is a chance that provability by restricted means has a pay off. The subject of mathematical logic is full of work along these lines, and some of it is fine.

Negative results can even be turned into a new philosophy: looking for analogs to the failure of traditional logical aims in connection with all traditional philosophical aims. Without exaggeration: this was Wittgenstein’s reaction (specifically: to the failure of his own version - in Tractatus - of traditional logical aims). With considerable skill and very remarkable persistence he found such analogs in the traditional philosophical literature on a great variety of subjects.

Remark about a gimmick. You can always reword any problem as a problem about language, that is, about the words you use to express your knowledge about the phenomena involved. More or less like lawyers who have their conventions for rephrasing different views about social, political or psychological matters in terms of interpretations of the (holy) laws.

Article of faith. I myself find it hard to believe that his efforts were needed, at least, for any reader who has sympathy for his fascination - with extending those negative conclusions at all. A few striking examples would be enough to convey the general idea. And if, as seems apparent, there is a common trait to the mixed bag of socalled traditional philosophy, the chances are that such a common trait will include - variants of - the negative aspects he happened to spot.
It would seem to me much more likely - and has turned out to be the case in my own experience - that the search for analogs in non-philosophical activities would be more rewarding; in other words, the pursuit of plausible, but self-defeating ideals. I haven’t looked at the law (and barely at politics). But for the last 15 years I have looked at certain areas of computer science, and found there - not in the broad area of mathematics - good reasons for Wittgenstein’s phrase: der unheilvolle Einbruch der Logik in die [Computer-]Mathematik (the disastrous invasion of Logic into [computer] mathematics).36

36 See Chapter 15.

Natural language

Perhaps the single most common reaction to Wittgenstein’s emphasis on the negative side of the logical ideal for understanding language is the current fashion of studying socalled natural language. A sure sign that this is a tiny variant of Wittgenstein is that the books talk of the profound differences between - each other, and between - them and Wittgenstein.37 But one difference seems to me worth mentioning. The current fashion looks for theories of natural language. Wittgenstein - in the jesuitical tradition - looks for facts that would be embarrassing to likely (even if, as yet, unstated) systematic theories; apparently forgetting how easy it would be to find many facts of mechanics that are embarrassing to Newton’s first law of perpetual motion.

As so often, the principal weaknesses are the points on which all concerned agree - not unreasonably, since we are dealing with an area where nobody has made substantial progress. It seems to me that the most obvious oversight is this:

How many ‘natural’ phenomena do we know at all which have a rewarding theoretical treatment?

that is, rewarding compared to our untutored knowledge of them. By ‘natural’ I don’t mean of course merely real (since also manufactured products had better not be mere optical illusions), but phenomena that present themselves to our untutored senses. Sure, there are some; for example, celestial mechanics (at least, after correction for errors of parallax). If one understands ‘theoretical’ in a very lax sense, one has Darwin’s theory of evolution. But all developments of substance, at least so far, have introduced the consideration of phenomena that are far removed from our untutored senses, for example, molecular biology. What disturbs me about traders in ‘theories’ of natural language is their scientific innocence: no glimmer of any thought about accumulated scientific experience with ‘natural’ phenomena.

Given the fact how easy it is to learn new languages, make up languages, etc., in short, to deal with artifacts, the subject of natural language (in contrast to the - completely unexplored - range of possibilities of learning languages) seems exceptionally unrewarding; for example, even more unrewarding than terrestrial mechanics, e.g. the mechanics of the motion of leaves or the wind, and the acoustics of the rustling of the leaves. There is the additional reason that, from practical use, we have an immense amount of untheoretical and even uncodified knowledge of natural (human) language. It’s like a breath of fresh air to learn something - albeit very elementary - about the languages of birds and bees.

It would be the height of irony if the excitement about natural language had been formed or at least sustained by Wittgenstein’s own slogans about the central role of language for God knows what. For then it would have been his own language that bewitched those bedazzled natural-language worshippers.

37 In this chapter I have bent over backwards, too, albeit in the opposite direction; by stressing how easily one gets from one religion to another or from Göbbels to Wittgenstein.

Chapter 11

A Parallel Between Wittgenstein’s Ways and Works

A main purpose of the parallel in question is to help convey a view about the philosophical ideal of a universal language. Experience with the logical foundations of mathematics has suggested this view, for example, to me. But it is so simple that it is best stated first in general terms. Incidentally, though simple it is in sharp conflict both with the heroic and the philistine traditions in Western culture.

First, the principal problem with the ideal is its relative sterility, not its profundity, let alone some inherent absurdity (a kind of mirror image of profundity). This is in conflict with heroic expectations of miraculous progress from any precise, even roughly universal language. On the contrary, by experience, the ideal has lent itself to almost effortless systematic development, with (formally correct) answers to a wide spectrum of questions like: What is possible? or: What is (the object) X? Mutatis mutandis, much the same applies to universal rules of inference or, more solemnly, of thought. (This side does not come up much in Wittgenstein’s works and not at all in Tractatus.)1

Secondly, the view stresses the (established) possibility of rewarding shifts of emphasis. This is in conflict with the philistine view, apparently held by the silent majority, that: what is true in general, will be trivial in each particular case (that could, realistically, concern us). In this extreme form the philistine does not expect the ideal to be merely relatively sterile. Naturally, research is required to discover appropriate shifts, in particular, restrictions on the domain(s) for which the universal language is meant, for example, for programming digital computers, or appropriate additional conditions over and above mere universality.

0 Published here for the first time.

1 Warning (for experts). For the present view the much publicized limitations on a universal language in the literal sense, for example, on providing its own truth definition, are minor. On the contrary they distract from the sterility in areas where a universal language is available.

It is not far-fetched to expect a (more) general philosophical relevance of the view just adumbrated; specifically, if it turns out that many ideals of a traditional philosophical literature share the properties, mentioned above, of the logical ideal of a universal language. Not far-fetched because: first, logical foundations follow - not only on purpose, but in effect - that tradition; secondly, they have been developed with exceptional imagination and determination. So their defects are liable to be intrinsic, not due to incompetence in their execution.

By the nature of the case the two parts of the view must be presented in different ways. An ideal can be a subject. Its defects do not constitute a subject at all, to be compared to the fact that the complement of a set - which, by Cantor’s definition, can be thought of as a unity - is not a set at all.

The main body of Tractatus will here be regarded as one way of presenting the ideal of a universal language, in particular, the excitement of the question: What can be said? To see Tractatus this way one has to allow for a particular, dramatic style of presentation.

Philosophical Investigations or, at least, the early parts will be regarded as an attempt to convey - some of - the defects of that ideal. The later parts are seen as doing much the same for equally flashy ideals in the philosophical literature outside the tradition of logical foundations (and thus confirm the general philosophical relevance, mentioned earlier, of examining the logical ideal). Again it is necessary to allow for a particular style of exposition; quite different from that of Tractatus, but equally dramatic in its own way.

Warnings or reminders (depending on the reader’s general background)

First and foremost, the universal language is not intended to incorporate those linguistic phenomena that strike the untutored attention most, for example, the remarkably rapid evolution of languages, including trade jargons. Roughly speaking, a principal original aim was to use a universal language for reasoning well (in science and philosophy); to be compared to the mechanics of point masses for thinking about planetary motion (and not terrestrial mechanics though the latter concerns us more).

The logical idea of a universal language must be distinguished from the quite vague general idea of such a thing, to be fleshed out by new discoveries, for example, about neurological elements involved. This might be compared to the difference between the philosophical question: What is matter? and its quite vague general form that has been ‘fleshed out’ - more precisely, been transformed - in modern physics to become: What is matter made of? The former requires an answer in terms of the background knowledge involved in asking it, the latter an immense amount of experience including the discovery of a microstructure.

(For a unified field theory the addition of ‘made of’ would be inappropriate.) Furthermore, in the modern answer, in terms of elementary particles, the question is restricted to other matter, while, applied to these particles the answer has the form: X = X. It is progress to restrict the question. As Aristotle might say today, one discovers where to end a chain of reasons, and does not always rely only on good breeding (cf. Metaphysica Γ 4, 1006a, 6–9).

It is not claimed that Wittgenstein’s presentations are particularly efficient, at least, not for a public that is really likely to benefit from an examination of logical ideals at all. An obvious alternative is to develop such ideals, and so give them rope to hang themselves. This manner of refuting ideals has been compared to establishing a political ideology in a whole country after it has failed in a small, but enthusiastic community. However, Wittgenstein’s presentations have a literary quality that makes them rewarding objects of reflection (for some of us).

Last but not least, a word about the literary device used below. The parallel, between Wittgenstein’s personal and philosophical styles, is here thought of as fitting the venerable tradition of reading biographies of notable personalities. The hope is that the reader would acquire effective styles of conduct, including the judgment for deciding which style, if any, fits a given situation. For a good use of the parallel below it is even more important to keep in mind what the tradition does not claim:

1. the hero of the biography need not have formulated, or even been aware of, any particular rule of conduct, let alone the one that the biography is expected to convey;

2. that rule need not even fit the actual conduct of the hero; it should just suggest itself to enough readers. This is the measure of success for the (educational) tradition in question.

Similarly, it is not relevant below to what extent Wittgenstein was aware of, let alone formulated, the view described at the outset, and to be conveyed by use of the parallel below. It would be satisfaisant pour l’esprit, du moins, le mien (satisfying to the mind, at least to mine), if this use were successful since Wittgenstein had a particular liking for Buffon’s: Le style c’est l’homme (the style is the man).

Of course, only the most simple-minded among us would even be tempted to speculate on some causal connection between, or common cause of, the (objectively) related styles. Our knowledge of such matters is so far below the threshold for speculations of this sort that the latter express coarse mindedness, not at all intellectual curiosity.

Wittgenstein’s ways

Here is what struck me during the years 1942-48, when I saw a lot of Wittgenstein (before his final illness).

His single most notable feature was a sense of drama and timing, at least, when we were alone, and he was at ease (cf. note 2 for contrast). Once he said to me: Science is O.K.; if only it weren’t so grey. To me it is not grey; and when it is, a colorful commentary helps. But daily life can be grey, and Wittgenstein’s company was not. Moreover, at least to me it seemed natural, and free of the hysteria often found in people who try to be ‘lively’.

The second most notable feature was his attachment to things or words that struck him, in the world around him or in his own imagination (in German: auf-, resp. einfallen). For someone who did not know him it must be hard to imagine how little he read of most books or articles that fell into his hands (cf. note 5). Any impression, positive or negative, triggered a train of thoughts, which he followed without any attempt to acquire more background knowledge. Also he never learnt the art of leafing through material, so useful, at least, in science.

A particularly memorable example of that second feature is described on p. 235 of Chapter 13. Some phrase, about truth, in the introduction to Gödel’s undecidability paper stopped him from reading on until the matter was explained to him more than 10 years later in terms more congenial to him. In the meantime he had written pages of bilge about it, published by his literary heirs in the Remarks on the foundations of mathematics. All of it is superseded by a single remark of his quoted loc. cit., incidentally quite similar to one later on in Gödel’s paper.

That second feature did not spoil for me his remarks on literature and art, and above all on the feelings we have about our own knowledge and ignorance. Perhaps he saw things in a similar way when he said that science could only interest him while those subjects absorbed him. But that feature made most of his remarks on everyday subjects and especially on politics almost painfully futile.2

2 When I visited Wittgenstein in 1945 together with Crick of the Double Helix, Wittgenstein was ill at ease, as usual when we were together, but not alone. He brought up the subject of those gruesome concentration camp films, which he saw as election propaganda for Churchill, as Britain’s saviour from a similar fate. Crick pointed out - rightly, if somewhat contemptuously - that domestic issues would be decisive. Wittgenstein had no clue. The truth is that Wittgenstein was struck by the awful effect of those films on the viewer who felt that he was being brutalized, and not made more sensitive to brutality. But he let his mind run on unchecked, drivelling about elections that did not really interest him.

Tractatus

Tractatus satisfies the criterion that Wittgenstein mentioned to me for a good philosophical book: one should remember it like a poem. As I see Tractatus now, it is a kind of metaphysical poem that conveys the excitement of the ideal of a universal language. In place of (Frege’s): What is the number 1? the question: What is the world? is answered here, telling you what makes sense. Given that Wittgenstein tacitly assumes that the supply of simples is finite (as he confirmed when I once asked him) the language considered, namely, propositional logic, is remarkably universal. Compared to Frege’s preoccupation Wittgenstein’s exposition corresponds to the step

from Heute gehört uns Deutschland (today Germany belongs to us) to Morgen [gehört uns] die ganze Welt (tomorrow the whole world [will]).

Obviously the kind of literal-minded people who worry about the exact stock of simples or for that matter who are attracted to analytic philosophy do not even begin to understand the excitement of the whole scheme. Many familiar ideas are not expressible in the universal language, for example, truth. In this respect propositional language is marvellously typical of the usual much more elaborate languages, many of which serve us very well. As somebody said Tractatus would be a very good book in the ordinary sense of ‘good’ if its only defect were that some Neapolitan gesture is not expressible according to its scheme.

Wittgenstein’s sense of drama, stressed above, helped him convey the excitement; this job carried him almost all the way. Only towards the end are there jarring (second) thoughts about the whole scheme; for example, at least, as I read them, the following reminders:

• Many ideas are better shown by examples than said, tacitly in terms of the universal - or, for that matter, any other - language considered.

• If one cannot speak about something - again, tacitly, in that language - it is often just as well to be silent. After all, this is the so to speak passive practice of the silent majority; more aggressively this is recommended by an anonymous mystic, looking for cover under The Cloud of Unknowing.

Wittgenstein’s formulations are memorable; we still talk about them. The sober formulations are not. Yet the reminders are useful; not because we can expect much progress from remembering them, but because it can be disastrous to forget them.

Obviously, Wittgenstein’s formulations are not meant literally. Many things can be both shown and said, for instance, what a spiral looks like. And many things that cannot be said according to some scheme have been conveyed successfully by metaphors, parables, illustrations, and in many other ways besides. This is so obvious that the last sentence of Tractatus is a great temptation to write a new book. How to do it is a different question. But first we have to look at another way of communicating the excitement of a universal scheme.

Einstein’s way

Einstein’s way of presenting his theories of relativity, but also - in the book Albert Einstein with Infeld - many other spectacular physical theories, has certainly the general flavour of logical ideals. There is the business about the general form of theories, and the use of very simple isolated observations; for example, the experiment of Michelson-Morley, which is paradoxical for the then current theory, or the equality of inertial and gravitational mass, which is familiar enough, but was neglected. Certainly Einstein’s presentations are almost uniquely artistic. But they also contained real surprises.

If we think of length, time and mass we think of dimensional analysis, appropriate for the most superficial investigations in physics. Even if the rhetoric about space, time and matter and ‘our’ intuitive conceptions of these things conflicts with Hugo von Hofmannsthal’s reminder about good taste,3 the fact remains that Einstein got somewhere by taking a fresh look at the general area suggested by those conceptions. What kind of temperament would resist the itch to do the same for other conceptions of the same vintage, such as those considered in logical foundations? It is not a matter of some kind of bewitchment, but the sweet smell of success.

More precisely, this was so in the first third of the century. The actual development of physics and of the other sciences including mathematics has been quite different from early expectations (cf. Wigner [1982]). The mix of simple, familiar ideas and their variants on the one hand and of experimental imagination and skill in using technological resources to extend experience on the other is weighted heavily towards the latter. One has developed confidence in the ability of our intellectual reflexes to develop effective ideas from this extended experience; one does not rely principally on closer analysis of what we know already; cf. also the end of this chapter. Of course, the price is high: the ‘entry fee’ is greater background knowledge.

But even before the scientific experience just described Einstein’s presentations could be seen to convey only a very one-sided kind of understanding; severely weighted towards theoretical ideas without much attention to the kind of phenomena where those ideas are spectacularly relevant; more pedantically, to the question in which areas of experience the forces involved in those ideas are dominant. Characteristically, Einstein himself had an exceptional talent for spotting such areas or, as one says, for discovering experimental consequences. For example, others had come close to the ideas of general relativity; only Einstein thought of genuinely relativistic phenomena.

With all this in mind we now return to Wittgenstein’s works.

3 Der gute Geschmack besteht darin, jeder Übertreibung jederzeit zu widerstehen (good taste is always resisting any sort of exaggeration).

Philosophical Investigations

Philosophical Investigations had, for Wittgenstein, the correction of ‘grave errors’ in Tractatus as a principal initial aim. This led naturally to excursions into parts of the world somewhat neglected in Tractatus such as psychological phenomena including mathematical reasoning, and schemes, specific to those parts, with a flavour similar to so-called logical atomism. (The schemes are easy to guess, but usually not stated by Wittgenstein.) Here the two features of Wittgenstein’s style stressed above interfered.

There just is nothing dramatic about a simple scheme being defective. Contrary to a widespread superstition, the schemes of logical foundations and more generally of logical atomism were sensational just because they were so obviously exorbitantly simple; certainly not because they were ‘subtle’. The second feature interfered even more; above all, through Wittgenstein’s reluctance to use more demanding background knowledge. More generally,

defects of a scheme become really convincing only if they plague also its most imaginative and sophisticated developments.

It is not enough if they appear in a very elementary exposition even if the latter is enough to convey the excitement of the scheme.

More superficially the second feature spoils Wittgenstein’s - in my view basically sound - case because his fascination with his own thoughts and his formulations of them hides, also from himself, the fact that they refer to things and practices that are quite well known under another, albeit less colorful name. Here are a couple of examples.

Families of concepts are often defined by so-called first order axioms that are derived from categorical higher order axioms, but simply turn out to serve the purposes of the latter better; more pedantically, they serve better the intended purposes of the intended (categorical) notions.

In the area of linguistic (or, to use a fashionable term, cognitive) phenomena language games are perfectly comparable to real or imagined experimental situations where complicating elements are eliminated. For example, in (terrestrial) mechanics Galileo eliminated disturbing effects of friction and air resistance by considering a cylinder rolling down an inclined plane or a sack of feathers falling freely behind a sphere of lead. Sure, there is a point to the word ‘game’ simply because there is little evidence that the linguistic phenomena Wittgenstein considers are comparably rewarding for any theoretical treatment to the mechanical phenomena spotted by Galileo; for one thing we really know an awful lot about language even before we look for any theoretical understanding. So it is best to think here of jeux d’esprit rather than of first steps towards bigger things.4

4 The list of examples where Wittgenstein does not seem to recognize the relevance of familiar practices to his own aim could be continued; for example, to take a couple of adjectives in place of the nouns above, he uses überschaubar und einprägsam in the case of proofs (that are easy to take in and to remember).

Wittgenstein’s attachment to old thoughts, both his own and others, goes well with his occasional worries about not having - tacitly, enough - new thoughts. It seems fair to say that the best help he got to wean himself from that attachment was to hear or see his own formulations from the mouths or in the writings of others, where they sounded less resonant. Thus many that appeared in the Blue and Brown Books or other earlier writings no longer intrude into the Philosophical Investigations.

The next two sections give a single example of a very different style for achieving Wittgenstein’s aim. Perhaps the principal difference is that Wittgenstein’s all-purpose expression ‘use of language’, a kind of frame without picture, will be replaced by specific uses in an area familiar enough to let us judge the matter.

Rules

Instead of quoting Tractatus, which he wanted to correct in Philosophical Investigations, Wittgenstein begins with a passage from St. Augustine’s Confessions5 on the general form of propositions as it were, namely, declarative sentences. The great news is that we also have commands, exclamations, and so forth.

5 Actually, Wittgenstein did not know the Confessions all that well, perhaps because of his reading habits described above. On several occasions I made allusions that would have infuriated him if he had known the passage meant. Of course he was aware of what was going on, but just did not know the passage. So he called me ‘whimsical’.

Now, what is gained or lost by paraphrasing commands as declarative sentences (for example, in the style of deontic logic)? Wittgenstein leaves not only the answer, but even the question to the reader; an inefficient tactic just because these matters are not altogether obvious. Without exaggeration, Wittgenstein had not even begun to think of a ‘use’ which would permit a convincing answer. In connection with the building trade such linguistic details could be relevant if the mate’s work is to be done by a robot, and a suitable programming language is needed. This reminder leads immediately to the example promised above.

In recursion theory, which studies the most elementary aspects of programs, the single most striking novelty is the use of partial functions. They are anathema to Frege or, for that matter, Tractatus, where good rules must be everywhere defined. A principal advantage of considering both good and bad rules together is stated in technical language as follows:

There is a partial recursive function that enumerates all partial recursive functions, but none that enumerates all total ones.

As a corollary, Cantor’s diagonal argument could be applied by Gödel and Turing. Here it establishes that certain rules are bad. More precisely, all rules, say, with arguments $n = 1, 2, 3, \ldots$ and values 0 or 1, are thought of as enumerated in ω order, say, $r_1, r_2, \ldots$ with a ‘diagonal’ program $r_D$:

$$r_D(n) \neq r_n(n).$$

For example, if

$$r_D(n) = 1 - r_n(n)$$

the condition above is satisfied. Evidently, if $n = D$, $r_D(n) \neq r_n(n)$ requires that $r_D(n)$ - that is, $r_D(D)$ - is not defined at all. This point became clear to Wittgenstein in the forties.6

Now we are ready for the question: What is lost if knowledge about partial functions, including bad rules, is paraphrased in terms of total functions?7 The general answer is simple. Attention is drawn away from some central points. (By the introduction, more specific answers cannot be simple inasmuch as defects do not constitute a ‘simple’ subject.) One such point is the asymmetry between the domain where a partial function is defined, and its complement; specifically, a numerical computation can verify that a rule is defined at some argument, but a mathematical proof is needed to establish that it is not. Another, but related point concerns closure under logical operations. In contrast to children, computers are never explicitly told or programmed what not to do. The paraphrases above have perfectly sensible negations; but the latter are not paraphrases of any commands, in other words, irrelevant to programming. These easy examples lead to a

Principal Point: For effective knowledge the matter of attention is critical. Rituals connected with such possibilities-in-principle as the ritual of paraphrasing in a universal language draw attention away from a proper focus to so to speak the outer limits of possibility. This is a crucial, but not primarily logical defect; it concerns sterility, not illegitimacy.

6 Wittgenstein almost certainly never realized how easily Gödel’s incompleteness theorem for a formal system, say $F$, is deduced from this. The point is to think of a formula with a single free variable $n$, say $A$, as a computation rule $r_A$. (Formal rules of inference used to be thought of as means of ‘checking’ proofs formally.) One chooses - quite unrealistically - canonical notations, say numerals, $\Delta_n$ for the number $n$, and - equally unrealistically - thinks of all derivations of $F$ laid out in an (appropriate) order. Then

$$r_A(n) = \begin{cases} 0 & \text{if } A(\Delta_n) \text{ is the end formula of some derivation in } F \\ 1 & \text{if the formal negation, } \neg A(\Delta_n), \text{ of } A(\Delta_n) \text{ is so derivable} \\ \text{undefined} & \text{if neither is the case.} \end{cases}$$

According to taste readers can use Gödel numbers of the formulae $A$ considered or just enumerate the latter. The counterpart of the rule $r_D$ in the text is the formula $\neg \exists x\, \mathrm{Prov}[x, s(n, n)]$; here $s(n, n)$ is the Gödel number of $A(\Delta_n)$ where $n$ is the (Gödel) number of the formula $A$. As usual, $\mathrm{Prov}(a, b)$ means that $a$ is the number of a derivation in $F$ of the formula with Gödel number $b$.

7 Such paraphrasing is in fact done by those who insist on going through the ritual of formulating elementary recursion theory in the official language of set theory, for example, by use of three-valued functions, when the value −1 indicates that the function is undefined.
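The diagonal rule and the asymmetry between verifying that a rule is defined and establishing that it is not can be made concrete in a few lines of Python. What follows is only a minimal sketch under stated assumptions: partiality is modelled by a step budget (`fuel`), a rule returning `None` when the budget runs out; the names `Rule`, `always_zero`, `parity`, `make_diagonal` and `verify_defined`, and the toy three-entry enumeration, are hypothetical illustrations and not taken from the text.

```python
from typing import Callable, List, Optional

# Assumption: a rule maps (argument n, step budget) to 0 or 1, or to None
# when the budget runs out before a value has been produced.
Rule = Callable[[int, int], Optional[int]]


def always_zero(n: int, fuel: int) -> Optional[int]:
    """A total rule: value 0 after a single step."""
    return 0 if fuel >= 1 else None


def parity(n: int, fuel: int) -> Optional[int]:
    """A total rule; we pretend the computation costs n steps."""
    return n % 2 if fuel >= n else None


def make_diagonal(rules: List[Rule]) -> Rule:
    """Build r_D with r_D(n) = 1 - r_n(n) wherever r_n(n) is defined."""
    def r_D(n: int, fuel: int) -> Optional[int]:
        if fuel < 1:
            return None
        value = rules[n - 1](n, fuel - 1)  # r_n(n); arguments start at 1
        return None if value is None else 1 - value
    return r_D


def verify_defined(rule: Rule, n: int, max_fuel: int) -> Optional[int]:
    """Search for a value by plain computation with growing budgets.

    Finding a value verifies that the rule is defined at n. Returning None
    is inconclusive: it shows only that max_fuel steps were not enough.
    """
    for fuel in range(1, max_fuel + 1):
        value = rule(n, fuel)
        if value is not None:
            return value
    return None


# A toy (finite) enumeration r_1, r_2, to which r_D itself is then added.
rules: List[Rule] = [always_zero, parity]
r_D = make_diagonal(rules)
rules.append(r_D)
D = len(rules)  # r_D occupies position D in the enumeration

print(verify_defined(r_D, 1, 10))  # differs from r_1(1)
print(verify_defined(r_D, 2, 10))  # differs from r_2(2)
print(verify_defined(r_D, D, 50))  # no value found within the budget
```

Here the search finds values of the diagonal rule at 1 and 2 by computation alone. At the argument D it merely fails within the budget; that no budget would ever suffice is seen from the definition of the rule, not from running it.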

This point helps to sharpen Wittgenstein’s aim of debunking the ideal of a universal language in general, and the language of (Frege or) Tractatus in particular. As already stressed, the futility of an ideal is established most convincingly in an area where it can be realized. In the present case Frege’s ideal of a good rule or well defined predicate can be realized in the area of programs, but is shown to be inefficient.8

But the real defect of a literal interpretation is elsewhere. It draws attention away from occasional rewards for some non-literal interpretation, that is, universality for a relatively large domain, such as programming languages for digital computers. Stretching the meaning slightly, we may look at the material of the present section as illustrating the use of another kind of universal element. Here, in the domain of partial functions, a ‘universal’ one enumerates all of them. As a corollary, any attempts, such as those of the later Wittgenstein, to attribute ‘grave’ logical defects to the ideal are themselves grave philosophical errors, reflecting a false perspective. Exaggerating very little, the shifts of emphasis from literal to (appropriately restricted) non-literal interpretations of an ideal constitute exercises of those sound intellectual reflexes that are involved in any sound use of any theoretical ideas; cf. p. 198 on the second sense of ‘understanding’. Differences between different theoretical ideas relate to the kind and degree of reward for simple shifts of emphasis. In particular, universal programming languages mentioned earlier are indeed possible, but - except for very crude problems, such as removing a blind spot - demonstrably inefficient.

Warning against the hope of panaceas. For some of us the efficient application of the diagonal argument above to partial functions is indeed an unsurpassably concise correction of Frege’s ideology. But the same argument has been (ab)used in Hofstadter’s Gödel, Escher, Bach9 to hawk another ideology, not an iota less pretentious or silly, the holy grail of the black box. Following Turing he is proud of his ignorance of the wetware of biological intelligence (that is, treats it as a black box), and compares it with artificial - better, digital - intelligence only by reference to the crudest kind of stimulus and response.10

8 Reminder. In contrast, such results as the indefinability of truth in note 1 are inadequacies only for the widest sense of ‘universal’; in other words, when the ideal is taken absolutely literally.

9 Collectors of blindspots with popular appeal will cherish Hofstadter’s book (but no more than the Liar in the next section). He tries to make out that Gödel’s theorem is a help to AI, although the theorem as stated shows what computers cannot do, while AI lives on what computers can do! What is true is that a step in the proof could be interpreted, by von Neumann, as a hint on how to apply programs to programs, which is indeed useful for AI.

10 Reminder. The possibility of use and abuse of the same or similar technology (here, of elementary recursion theory) is of course familiar, for example, in connection with pollution (here, intellectual pollution). Related technologies produce, identify, and clean up pollution; with an asymmetry reminiscent of that in the introduction between a view and its defects. Hofstadter’s abuse consists of a kind of gushing euphoria that has been compared to gushing drivel about solving all social problems by (Mendel’s laws of) genetics; but cf. also note 14 about ‘progress’.

Bewitchment: deep thoughts and deep feelings about them

In the light of the last section, on the need for shifts of emphasis in general, and non-literal interpretations in particular, Wittgenstein’s slogan about bewitchment (about thoughts11) is viewed as dramatizing effects of forgetting that need. The slogan goes well with Wittgenstein’s own attachment to thoughts (and feelings) mentioned on p. 196. Some of the components of such bewitchment are best illustrated by the old paradoxes, and their later non-literal interpretations, for example, by the cretinous Cretan Liar:

1. What is most remarkable is the survival, over more than 2,000 years, of the original form: All Cretans are liars (in ignorance of the fact that some variants do not have the defects below of the original). Who is a liar? One who lies (i) once or (ii) always, or, at least, has done so always in the past? There is not even a chance of a paradox unless all Cretans other than the speaker have been liars in the sense considered; in (ii) of course also the speaker. So the paradox arises only in a most singular context, not at all just with the one sentence.

2. At least somebody in most generations over the last 2,000 years was paralysed enough by the original form not to have asked the pertinent question in 1. After all, if things go wrong only in such a singular situation we have a very good language; cf. the ‘grave’ defect of Tractatus in connection with Neapolitan gestures.

11 I do not use ‘[bewitchment] by language’ because the ritual of replacing ‘thought’ by ‘language’ seems to me a prime example of bewitchment, if one wants to use this word.

Realistically speaking there can be no question of bewitchment. The Liar has disturbed humanity during the last 2,000 years no more than the set-theoretic paradoxes have disturbed the silent majority of mathematicians in this century.

A different matter is the following. There is a kind of cult of the paradoxes; as providing some kind of glimpse of some Profound Truth. This view is fitting enough if, as above, logical foundations are viewed as comparable to Einstein’s reflections on the foundations of geometry, where, as already mentioned, the experiment of Michelson-Morley has a paradoxical flavour. Specifically, without the fuss about the paradoxes the comparison is ludicrous since, judged by existing results, the contributions of logical foundations to knowledge are vanishingly small (even if reservations about relativity, as in Wigner [1982], are remembered). But there remains hope of future miracles if the current state of logical foundations can be presented as severely defective, and the cult of paradoxes has this effect, if not this conscious purpose.12

Last, but not least: interpretations of - or, less pompously, comments on - some of the paradoxes have become rewarding in connection with mathematical logic. The most famous example is Gödel’s switch to - or, equivalently, non-literal interpretation in terms of - formal derivability. One of the most recent examples, but clearly related to the matter of partial functions above, is Kripke’s version with his slogan about the facts being with us. The slogan is well illustrated in 1 above with its stress on the singular (factual) context needed for the Liar paradox.13

Loosely speaking, as in the cult of paradoxes the latter may be said to have found ‘uses’ in the interpretations above. But the perspective is quite different, perhaps comparable to the difference in Einstein’s use of the ‘paradoxical’ result of Michelson-Morley, and the traditional treatment of observations that could certainly be called ‘paradoxical’ without stretching the meaning of the term. For example:

1. the sky is blue though sunlight is white and air is colorless;

2. a straight stick looks bent when placed in water and is straight again when taken out of the water.

1 and 2 are not thought of as central problems giving rise to grand developments, but as odd facts that become intelligible as corollaries to general knowledge of optics; by the grace of God, not needing intricate knowledge of the human visual system.

12 Incidentally, the view taken here of the cult of the paradoxes comes under the heading of l’intelligibilité profonde (de notre pensée). This is particularly popular among French mathematicians except for one difference. They apply it mainly to intellectual successes, while here we are concerned with rhyme and reason of an intellectual aberration, namely, the fuss about the paradoxes.

13 The list is endless; for example, Russell’s paradox is interpreted in terms of the obvious fact that there is no biggest set $X$ satisfying: $x \in X \to x \notin x$ (since if $X$ satisfies this then so does $X \cup \{X\}$) and, respectively, that there is no biggest natural number for Zermelo’s successor: $x \mapsto x \cup \{x\}$.

Alternatives to philosophical brooding about familiar phenomena, including puzzles

It would be a misunderstanding to suppose that, in connection with traditional philosophical matters, High Science is the principal alternative to brooding. At least to some extent those matters have to do with feelings about our knowledge and ignorance, already referred to. Those feelings occupy, realistically speaking, much more of the inner lives of (interested members of) the fraternity than the preoccupations of most ordinary novels, and are, I believe, a proper subject for literature. A classic of this kind is Musil’s Der Mann ohne Eigenschaften. But there are also aperçus and asides in popular writing by thoughtful scientists that belong here.14

One extreme alternative to brooding is to focus attention on not only unfamiliar phenomena,15 but those hard to come by; involving macrostructures like stars and their interiors, and microstructures as in electronics or molecular biology. Of course, one does not forget the familiar phenomena, but tries to relate them to the new ones. For success it is not necessary to be brighter than Plato and Aristotle who, after all, had access to familiar phenomena too. With the present alternative one uses something that they certainly did not have, such as experience of high energy physics getting us ‘closer’ to the interior of the stars. Another alternative, specifically, to brooding, is the exercise of common sense; including those intellectual reflexes that help us decide when which alternative is likely to pay off.

Now, of course, Wittgenstein too thought he was looking at something new, even in Philosophical Investigations: elements of the easily accessible world that are perhaps systematically neglected, at least, in the contexts in question (and only in this sense unfamiliar). There can be no doubt that it is occasionally rewarding to look at such elements. (Polonius talked boringly about things that Hamlet neglected.) But even without Wittgenstein’s wish to make the matter dramatic, it may be hard to write a whole book about new elements.

14 For example, somewhere in Life Itself Crick conveys the general - but often unfounded - feeling after any appealing discovery that tremendous progress is just round the corner if only one bends the right way . . . To me this is much more vivid than Wittgenstein’s motto, about progress, for Philosophical Investigations (taken from Nestroy). For one thing, Crick expresses the feeling itself, not merely reservations about it; and in connection with topical discoveries to boot.

15 Warning. The expression ‘unfamiliar experience’ should not be associated with supernatural experience, drug-induced hallucinations and other things (admittedly, unfamiliar to me). Good luck to those who can procure such experiences and, above all, do something effective with them. But what I have in mind is the kind of ‘cost intensive’ experience mentioned below or, in contrast to the particular case of logical foundations, the possibilities of the mathematical imagination exhibited in Higher Mathematics.

Chapter 12

Wittgenstein on the Philosophy of Mathematics

0 This chapter is based on: ‘Wittgenstein’s Remarks on the foundations of mathematics’, British Journal for the Philosophy of Science, 9 (1958) 135–158; ‘Zu Wittgenstein’s Gesprächen und Vorlesungen über die Grundlagen der Mathematik’, International Wittgenstein Symposia, 2 (1978) 89–81; and ‘Review of Wittgenstein’s Remarks on the foundations of mathematics’, American Scientist, 67 (1979) 619.

Apart from the obviously impressive flair and vigor of the style, Tractatus remains an outstanding example of the heroic tradition of Western philosophy, with its questions about the general structure of knowledge or the correct analysis of (all meaningful) propositions. Since the questions certainly occur, to anyone, prior to any detailed intellectual experience, more or less the same is expected of the answers. Tractatus is quite remarkable in this respect: no appeal to anything that would ordinarily be called a discovery, about physical or mental phenomena, barely any use of new intellectual (let alone material) tools except - of all things - truth tables for propositional logic. In particular, there was very little about mathematics in Tractatus except for a brief reference to an operational analysis (in contrast to the set-theoretic analysis in Principia, which is also in the heroic tradition, but a less pure example).

When he later spelled out his misgivings, not so much about his personal contribution as about the whole heroic tradition, Wittgenstein found that quite elementary mathematics provided excellent illustrations of the dominant theme of the Remarks on the foundations of mathematics: weaknesses of the principal logical foundations. Trivially, logical theory dulls the variety (‘motley’, Buntheit) of mathematical experience.1 But significantly, at least in the bulk of mathematics, logical analysis fails in its own aim (as a safeguard against error) because it neglects just those features that make proofs perspicuous, memorable, and thereby convincing. Independently of the Remarks, logicians have tried to study quantitatively those neglected features (which are best discovered in an area, like foundations, where logical aspects are not obscured by more interesting mathematical ones). But neither their works nor Wittgenstein’s endless talk of (unspecified) ‘uses’ is adequate for anything beyond banalities about the aspects of proofs that interested him.2

1 Cf. Goethe’s comment that all (!) theory is gray.

The Remarks are concerned with both:

• general philosophy (cf. Section 1), which (in so far as it is concerned with mathematics) applies to philosophically interesting differences between mathematics and other intellectual activities, and

• foundations of mathematics (cf. Section 2), which applies to philosophically interesting differences between various parts or aspects of mathematics.3

2 The known representations, by formal derivations or ordinary expositions with pictures, just are not good enough. At least some crude idea(lization) of the memory structures actually involved in seeing proofs would seem to be needed.

3 For instance, the questions ‘what is a (correct) proof’ or, generally, ‘what is mathematics’ belong to general philosophy, ‘what is a constructive proof’ or ‘what is a predicative concept’ belong to the foundations of mathematics.

The vivid and incisive language (of the original4) and the many stimulating questions apart, they make a mixed impression, and seem to me a surprisingly insignificant product of a sparkling mind. Of course, one should not forget that the Remarks were not intended (unlike the Blue and Brown Books) even for limited circulation, let alone for publication.5

The value of the book does not lie in a new point of view, but in penetrating observations on a limited subject matter. Specifically, Wittgenstein’s significant contributions to the philosophy of mathematics concern very elementary computations, a subject which seems to have been neglected by contemporary philosophers though not, for example, by Kant. Also, Wittgenstein emphasises repeatedly those aspects of proofs which are neglected in the customary treatment by the methods of mathematical logic.

His most striking fault is that he believed that all significant philosophical problems occur at the level of elementary computations, and that he made unwarranted generalizations from this limited region of mathematics to mathematics generally. But even if one is aware of other philosophical problems, the book gives a misleading impression, because it suggests an artificial dichotomy in the foundations of mathematics, so to speak: Wittgenstein against the rest. In fact, the aspects emphasised by him are a few among many.

To me the single most disturbing (and most surprising) defect of the Remarks was and remains Wittgenstein’s own fumbling. This is not only in contrast to what I remember from his conversations with me, but also to other material such as Zettel, apparently also not intended for publication. For balance, I thus conclude this chapter with some personal reminiscences.

4 The letter R refers to the Remarks, and the numbers to page numbers in the German text.

5 Not to talk about the Lectures on the foundations of mathematics, which do not even record what Wittgenstein said, but what a bunch of students thought he had said.

12.1 General Philosophy

Wittgenstein’s starting point is this: he is not prepared to use the notions of mathematical object and mathematical truth as tools in philosophy. Actually he gives some arguments against them,6 but I do not find them convincing. To me the real objection to these notions is that, at any rate as far as I know, there does not exist a single significant development in philosophy based on them. As I see it, the position is similar to that of the notions of the atom or absolute simultaneity at the time of the Greeks: neither of them could be used for understanding the world at the stage of technical and conceptual development of that time.7 In other words, the notion of a mathematical object is defective because one has no clue for using it to provide satisfactory answers to the (philosophically significant) questions which it should answer, although no case has been made out that it cannot do so.

Empiricism

Now, granted that such ‘metaphysical’ notions as that of a mathematical object are to be avoided in favour of an empiricist approach (in the sense of discussing a subject in terms of things perceived and done, i.e. facts and phenomena, in contrast to sophisticated abstractions), it seems quite natural that Wittgenstein should be concerned with positions such as:

1. A theorem is a rule of language, and the proof tells how to use it (R 81).

2. The meaning of a theorem is determined by its proof (R 79, 122).

3. A calculation is a (psychological) experiment (R 99).

Note that 1 and 2 are more or less complementary because 1 leaves open the exact role of proof in mathematics, while 2 is mainly concerned with this question. Position 3 is contradictory to the others. What is common to them is that, separately, they are in the direction of an empiricist approach.

6 For example: they are supposed to be metaphysical and to get in the way of straight thinking like alchemy (R 60, 142). Another objection cites the misleading (mystifying) pictures that may be associated with these notions (R 36). Incidentally, the question is not the existence of mathematical objects, but the objectivity of mathematical truth. Wittgenstein argues against the former, but not against the latter (R 96, 124), since he often refers to formal facts (R 128).

7 I chose the atom as an example of a notion with a future, and absolute simultaneity as an example of a notion which was later discarded.

Partly the attraction of these positions is due to wishful thinking. As to 1, it is clear that language is something ‘tangible’, not a ‘hidden’ object; as to 2, proofs are recognised spontaneously, like color (R 96, 125), in contrast to theorems which, when regarded as assertions about mathematical objects, are inaccessible; as to 3, experiments are something practical and they tell us facts (R 99, 171).

Moreover, from these points of view the problem of an ultimate justification for mathematical assertions does not arise: in case 1, since there is no sense in speaking of the truth of a rule anyway; in 2, since there is no sense in speaking of the truth of an assertion before its meaning has been fixed, and so, if the proof is needed to determine the meaning of a theorem, then there can be no question of a justification for principles of proof; and in 3, one is simply seeing what is happening and this is supposed to be unproblematic.

Of course, these formulations are oversimplified, and Wittgenstein does not accept 1–3.8 Generally speaking, the fault lies in the assumption (of naive empiricism) that there is a sharp distinction between the empirically given and the means of description, while according to Wittgenstein (R 173) we need concepts to tell us what are the facts. But it is interesting to see more specifically where these positions go wrong, and what is sound about them.

Theorems are rules of language

In favour of this position we can say that, whatever else a theorem may be, it certainly is also a rule of inference: e.g., if A is a theorem then from A → B we can infer B. As for rules of languages, Wittgenstein stresses (R 184) an interesting point which shows a general limitation of crude empiricism: while he regards ‘believing oneself to follow a rule’ as an empirical notion, the notion of ‘following a rule correctly’ is not. In short, there is a non-empirical residue in the notion of a rule of language.9 The point seems to me of importance for foundations generally, but Wittgenstein does not develop it.

Position 1 raises the question why proofs are needed since a rule of language, as ordinarily understood, is a matter of simple decision. Wittgenstein does not give a plain answer, but suggests (R 77) that the proof tells us how to use the rule. A parallel development to this suggestion leads to position 2: before the proof is given the concept is pliable (R 144).

⁸ For example: 1 is criticised on R 119; 2 on R 77, 165; and 3 is rejected on R 95.

⁹ The same applies also to the formalist conception of mathematics as a manipulation of symbols because the mathematical concept lies not in the physical production of the symbols, but in the formal fact that the symbols are produced in accordance with the given rule or, as Wittgenstein prefers to put it (R 81), in our accepting the sequence of symbols as an application of the rule.

Proofs determine the meaning of theorems

This doctrine is familiar from intuitionist writings. As Gödel has pointed out to me (but see also R 93) the doctrine is supported at the level of computations where one considers symbolic operations with numerals (formal representations of numbers) in contrast to assertions about numbers (considered as characteristics of sets): ‘5 + 7 = 12’, at this level, means that this equation is the last of some sequence of equations obtained by the application of certain rules, and the proof goes just the one step further of exhibiting this sequence. But as soon as one regards numbers as characteristics of sets one can meaningfully ask whether certain computational rules are correct, and to this extent statements about numbers have meaning independent of the rules of proof considered.

Quite generally, it is simply not true that proof is primary and theorem derived, that only the proof determines the content of a theorem. In fact, Wittgenstein is wrong in saying that generally we change our way of looking at a theorem during the proof (R 122), since equally often we change our way of looking at the proof as a result of restating the theorem.
For example, if we are accustomed to the principle of proof that the totality of all subsets of a set is itself a set, we may reject it when it is pointed out to us that it is only valid for the notion of a combinatorial set and not, e.g., for the notion of a set as a rule of construction.¹⁰

I believe that Wittgenstein’s violent dislike of the consistency problem is connected with the thesis that proof is the fundamental concept. For, on the one hand it is difficult to understand how it is that different correct proofs do not lead to contradictory results, and that the calculations of different people agree. On the other hand there seems to be no approach to this problem on an empiricist basis.

Wittgenstein proceeds as follows. He minimizes the importance of consistency (R 105) by saying that he could imagine people who would be proud of a contradiction: but why should one attach more weight to Wittgenstein’s imaginings than to the fact that people are not? He attributes (R 122) the agreement between different people to a similarity of training, though presumably the same could be said about the agreement between reports of the same physical event by different people; or (R 13) simply calls the agreement interesting, obviously implying that one should recognise this as a necessary condition for mathematics (R 164) and not ask for explanations. The situation is quite unsatisfactory. We shall return to a more detailed discussion of consistency in Section 3.
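
The computational reading of ‘5 + 7 = 12’ described earlier in this subsection, on which the equation is the last member of a sequence of equations obtained by fixed rules, can be illustrated by a small sketch. The rules assumed here are the usual recursion equations for addition (a + 0 = a, and a + (k + 1) = (a + k) + 1); the text does not say which rules are meant, and the function name `derive_sum` is hypothetical.

```python
def derive_sum(a: int, b: int) -> list[str]:
    """Return the sequence of equations a + 0 = a, a + 1 = ..., a + b = ...

    Assumed rules (not taken from the text): a + 0 = a, and each further
    equation is obtained from the previous one by adding 1 on both sides.
    """
    equations = [f"{a} + 0 = {a}"]
    total = a
    for k in range(1, b + 1):
        total += 1  # one application of the successor rule
        equations.append(f"{a} + {k} = {total}")
    return equations


if __name__ == "__main__":
    # Exhibiting the sequence is, at this level, the whole proof.
    for line in derive_sum(5, 7):
        print(line)
```

On this reading the ‘proof’ is nothing more than the printed list. The question whether the rules themselves are correct for numbers regarded as characteristics of sets lies outside the sketch.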

¹⁰ I chose this example and not, e.g., cancellation by zero, because in the present case a restatement of the theorem is involved.

Calculations are (psychological) experiments

Wittgenstein does not support this position at all (R 124, 173). His criticism of it leads to a general criticism of naive empiricism: one needs concepts to tell us what facts are, and mathematics both provides and handles them. Wittgenstein’s presentation of this view seems specially vivid, particularly in his discussion of the difference between measurement and experiment (R 98):¹¹ the same physical acts constitute in one case a measurement, in the other an experiment.

The fact just mentioned immediately raises the question wherein this difference lies. It would be wholly barren to reply that it lies in the intention of the agent and that it is to be discovered by asking him whether he is making an experiment or a measurement. Wittgenstein wishes to remove the distinction from the sphere of psychology by saying that whether an act is a measurement or an experiment depends on the language game of which it is a part. It is not at all clear to me that this nice word ‘language game’ really clarifies the traditional term ‘meaning’, and does not merely replace it.

There seems to be something psychological in the distinction considered, because, with our present experience of machines, it does not seem possible to say significantly of a machine that it is making a measurement or an experiment: instead, we use it for such a purpose (cf. R 133: does a calculating machine calculate?)

Wittgenstein employs several other notions with a similar psychological flavour in his attempts to characterise (certain aspects of) proofs: to impress a procedure upon someone (R 14); to use something as a picture (R 18); to use it as a paradigm (R 72); to take in a structure at a glance (R 45), etc.

¹¹ He also talks about the difference between calculation and experiment (R 95).

Wittgenstein’s general conclusions

On the negative side, Wittgenstein regarded the traditional aims of philosophy, in particular of crude empiricism, as unattainable. But I do not see why these aims should not be modified in the light of criticism, and then pursued. Thus, though the requirements of crude empiricism cannot be satisfied, a leaning towards it seems fruitful. It seems to me that Lorenzen’s operative logic constitutes an interesting formulation of the idea that theorems are rules of language: Lorenzen, of course, does not try to reduce the act of following a rule (correctly) to empiricist terms, but starts with the fact (or: idealisation) that we do follow rules. Or we can develop a mathematics based on the primitive concept of ‘proof’ or even ‘constructive proof’, as done in intuitionism: we ignore the problem of how we come to recognize a proof and start from the fact (or: idealisation) that we do. Only, it is not likely that one approach will turn out to be ‘more unique’ than the other.

On the positive side, Wittgenstein said (R 171) that the aim of a philosophy of mathematics should consist in a clarification of its grammar. He does not say exactly wherein this consists (and I have not found a single passage in the Remarks which I could with certainty identify as a ‘clarification of grammar’). But it is evidently to do with applications and intellectual institutions (R 173, 176). The concepts used in his description of our mathematical activity are those of the previous subsection. Now, I see no objection to them on formal grounds, and I can believe that they afford a better framework than either a purely empiricist (behaviourist) or a purely introspective analysis. But these notions and the whole programme of clarifying the grammar remind me of the ‘soft options’ at school: human geography (not the mineral composition of mountains, but their effect on history) or economic chemistry (not the atomic structure of the chemical elements, but their uses in society). It is all sensible and interesting, not hackneyed (R 170), but often bypasses the problems which later turn out to be most fruitful. And even if this conception turns out to be useful, there is no clear reason for rejecting the others, no more than that human geography should exclude scientific geology.

There are such traditional problems as the genesis of our mathematical concepts, the justification of proofs (i.e. what makes them correct rather than what makes them interesting), which are the fundamental concepts and which derived: we need a conceptual apparatus to formulate these questions in a satisfactory way, and we do not have such an apparatus. But it seems unlikely that the concepts favoured by Wittgenstein will provide it. It is by no means clear that these questions are ripe for a precise formulation, any more than the general (and natural) questions of present day mathematics were ripe for a formulation at the time of the Greeks.

12.2 Foundations of Mathematics

Wittgenstein objected to a mathematical foundation of mathematics because the concepts used in the foundation are not sufficiently different from the concepts described (R 171), and he thought (R 177) that there are no mathematical solutions to his problems. As far as mathematical foundations are concerned, it is certainly true that if mathematical concepts are used in foundations, they are liable to raise the same type of problem as they are supposed to answer: we are left with elucidations rather than explanations. But to quote out of context (R 174): don’t demand too much and have no fear of your problems dissolving into nothing.

In fact, mathematical logic has done far more to get an overall view of mathematics, to help us find our way about (R 104), than any other single discipline: it provided concepts necessary for the description of mathematics, just as, according to Wittgenstein, mathematics provides the concepts necessary in the description of nature. Wittgenstein’s views on mathematical logic are not worth much because he knew very little and what he knew was confined to the Frege-Russell line of goods.¹² But it is true that the methods of mathematical logic have not been applied successfully to the subject of elementary computations (see strict finitism below), and this is precisely the subject which interested him most.

¹² This is not at all typical of the subject. However, I am told that many professional philosophers are similarly uneducated and are, therefore, likely to have similarly distorted views.

There is another, less austere conception of the philosophy of mathematics, which Wittgenstein ignores too. Since this conception, it seems to me, underlies most of current work in mathematical logic, he implicitly rejects it by rejecting mathematical logic. As mathematics has grown, a variety of different methods of proof, definitions, theorems have accumulated. By the light of nature we see differences, groupings within one branch, and similarities between different branches of mathematics. One may see one aim of a foundation of mathematics in getting a clear understanding of these connections, and there is no reason in advance why this should be done only by reference to applications and not, e.g., by mathematical properties, by mathematical characterisations. From this point of view it is a contribution to the philosophy of mathematics if a new aspect of the methods of mathematics has been noticed, such as, e.g., Wittgenstein’s own observations discussed below; here there is no one fundamental problem.

I regard the ‘rival’ foundations of mathematics in this light: not as contradictory in substance, but as emphasising different aspects of mathematics (cf. Bernays [1935]). It should be observed that the originators of the rival foundations took a different view: they insisted on rejecting those aspects of mathematics which they did not consider themselves; like a woman who fears one would not be interested in her if one remembered that others existed besides her.
Of course, from the point of view of a Lebensphilosophie these rival foundations are contradictory because they regard different aspects of mathematics as specially important. The historical background to Wittgenstein’s remarks consists of two parts. First there is the logicistic reduction of mathematics to logic and abstract set theory: many of Wittgenstein’s remarks are a reaction against this. Second, there are the so-called constructive tendencies which are also a reaction to the logicistic approach or, at least, are in a different direction. We describe the reduction to set theory first, and then some brands of constructivism in order to compare Wittgenstein’s views with them.

Set Theory

Abstract set theory provides the most famous of all foundations of mathematics. The remarkable fact is this: each known branch of mathematics has a model in abstract set theory, and frequently, a most natural model. For example, the question ‘what is a number’, to which it is hard to give a natural meaning, gets the answer: an element of the set which is the intersection of all inductive sets. Similarly, the corresponding questions for other mathematical concepts are answered in this uniform way. It is fair to say that this programme of regarding our mathematics from this point of view of set theory has been carried out in far more detail than, e.g., the programme of describing the physical world as assemblages of fundamental particles.¹³ In particular, the development of arithmetic within set theory has not only helped us to understand arithmetic better, but has had important repercussions on the study of set theory and logic generally, e.g. it permits the application of Gödel’s incompleteness theorem to systems of set theory, and related results.

The interest of the discovery just described cannot be doubted. But it is not so clear that it satisfies the philosopher who seeks a ‘simpler’ foundation, who looks for the fundamental concepts in mathematics and wants to build up derived ones. Those philosophers who see the content of a scientific statement in the observational verification will deny that the fundamental particles are simpler than the physical objects of our experience, but they cannot deny that at least they are smaller. There is no obvious order in which abstract sets precede numbers, as the fundamental particles precede objects of our immediate experience in size. And, as e.g. Poincaré pointed out with great lucidity, the reduction of arithmetic to set theory itself requires the processes of arithmetic.
If the notion of number presupposes the concept of a finite set, even then, as Lorenzen [1951] has observed, this does not mean that one has to consider arbitrary sets first and then restrict them to finite ones. In other words, in this search for foundations, for notions with a more elementary content, one may wish to select parts of mathematics, or, in particular, parts of set theory. Wittgenstein emphasises strongly the mathematical significance of such selections.

Another respect in which the set theoretic foundations fail, is in characterising the constructive aspects of mathematics which Bernays calls, epigrammatically, a mathematics of doing (cf. also R 118) in contrast to a mathematics of being, or, more formally, an idealisation of process as against an idealisation of what is the case. Neither of these conceptions can be expected to be fully comprehended by the other; although there are interesting formal relations between them. It is interesting to note that there is not only a constructivist critique of so-called classical (‘platonistic’) mathematics, but also a converse. In fact, many mathematicians are almost proud to declare that they don’t understand intuitionism (or the other constructive tendencies). They ask: How do you define constructive proof? and do not really expect an answer; rightly as long as they presuppose a definition in set theoretical terms. Their complaint about the vagueness (meaninglessness) of the notion of constructive proof is on a par with the intuitionist complaint about the meaninglessness of the notion of arbitrary functions: how do you specify them, since they are supposed to be non-enumerable? Yet without giving a list of either arbitrary sets or of all constructive proofs, Zermelo laid down properties of the former notion in his set theory, and Heyting of the latter in his axiom systems for intuitionistic logic.
From now on we shall be mainly concerned with the constructive side and discuss it on its merits without attempting to reduce it to set theoretical terms.

¹³ The common feature of these programmes is that they reduce the number of ‘primitives’.

Constructivity

It is the custom to lump together all constructive tendencies under the heading of intuitionism. To get a little more orientation we shall distinguish between intuitionism proper as developed by Brouwer and Heyting, and finitism. There is an even narrower conception of constructive mathematics, namely strict finitism, a notion described by Bernays [1935]. Wittgenstein’s views seem related and favourable to intuitionism, probably mainly because of common features such as the objection to the idea of a mathematical object, the priority attached to proofs over theorems, and the use of Brouwer’s household example of the decimal expansion of π (R 138, 144, 147). But a closer look shows that this similarity is superficial, and that Wittgenstein’s views on mathematics are near those of strict finitism; or, perhaps one should say, he concentrates on the strictly finitist aspects of mathematics. To justify this assertion it is necessary to describe some differences between these three constructivist tendencies.

Intuitionism and finitism

Hilbert and Bernays ([1934], pp. 20, 21, 65) state that intuitionism permits the use of general logical considerations in addition to the combinatorial facts with which finitist mathematics is concerned. Intuitionism deals with mental constructions, while a finitist piece of mathematics is to be regarded as a Gedanken experiment (description of an experiment) with concrete objects which are thought of as reproducible, and are to be recognisable and surveyable, i.e. thought of as built up of discrete parts whose structure can be surveyed.¹⁴ Intuitionism includes finitism because a picture of a concrete object can be used in a mental construction, as this word is employed by the intuitionists. But it goes beyond finitism because it makes statements concerning all possible constructions, which certainly do not constitute a concrete totality. These, and similar, general differences become apparent in the typical features of the notion of intuitionist as opposed to finitist proof. For example, a false proposition implies anything. Or undecided propositions, and even implications between such propositions, may be used as premisses in implications, i.e. one makes assertions which involve an hypothetical proof, namely a proof of the premise, though the totality of all proofs is not concretely specified;¹⁵ as a result it is not clear what ‘construction’ constitutes the content of such an implication (this applies in particular to the double negations of which intuitionist writings are full).

¹⁴ Hilbert and Bernays ([1934], p. 21) use überblickbar instead of Wittgenstein’s (R 65) übersehbar, with its ambiguity of ‘overlook’ and ‘look over’.

¹⁵ It is regarded as an accident that in most proofs of implications A → B the details of the proof of A are not used in the proof of B, e.g. in [A ∧ (A → B)] → B where one need merely attach the proof of A → B to the proof of A in order to get a proof of B, though, in fact, Brouwer’s proof of the fan theorem is the only known example of the contrary case.

Granted that features of this kind are typical of the difference between finitist and intuitionist mathematics, it is clear that it is unprofitable to compare Wittgenstein’s views and intuitionism. For, in his simple computational examples we are dealing with strictly combinatorial processes, and the typically intuitionist concepts do not apply: he leaves off before intuitionism starts.

Finitist mathematics does not use the general notion of a constructive proof at all, in fact it might be said to avoid logical inferences (which involve an impredicative concept of proof) because it is restricted to purely combinatorial operations. In particular, the logical connectives have a purely combinatorial character since they are applied only to decidable formulae and so the truth functional interpretation (truth table method) is not problematic. Universal quantifiers are not used at all except in so far as they can be replaced by free variables (e.g., not in the premiss of an implication). Existential quantifiers are used as shorthand for a (constructive) function or functional (if they occur in a premiss, e.g. in (∃x)A(x) → B, they are interpreted as A(n) → B when n is a free variable which does not occur in A(x) nor in B). Iterated implications are not used at all. These restrictions all follow more or less cogently from the general conception of a mathematics about concretely presented objects. The reason why there is such general confusion about the difference between finitist and intuitionist proofs is simply this: one very rarely uses all that intuitionism would allow.¹⁶

In short, all the mathematics which Wittgenstein considers clear (not, e.g., the completeness of the set of real numbers) fits comfortably within the framework of finitist mathematics, and so it is futile to compare his views with intuitionism. In fact it will turn out that an even narrower aspect of mathematics is considered by him.
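
The finitist treatment of connectives and quantifiers just described can be illustrated by a small sketch. The predicate `is_even`, the particular formula and the witness function are hypothetical choices made for the illustration and are not drawn from the text. The connective is evaluated by its truth table because it is applied only to decidable formulae; the universal quantifier appears only as a free variable `n`; and an existential statement is replaced by a function that computes the witness.

```python
def is_even(n: int) -> bool:
    """A decidable predicate: its truth at each numeral n is computed directly."""
    return n % 2 == 0


def implies(p: bool, q: bool) -> bool:
    """Truth-table reading of implication, unproblematic for decidable p, q."""
    return (not p) or q


def square_even_if_even(n: int) -> bool:
    """The free-variable formula even(n) -> even(n * n), checked at one numeral n."""
    return implies(is_even(n), is_even(n * n))


def next_even_after(n: int) -> int:
    """Constructive function standing in for the statement
    'there is an even m with m > n': it returns such an m."""
    return n + 2 if is_even(n) else n + 1


if __name__ == "__main__":
    print(square_even_if_even(6))   # one instance of the free-variable formula
    print(next_even_after(7))       # the witness, not a bare existence claim
```

Checking finitely many instances is of course not a proof of the free-variable formula. The sketch is meant only to show that each instance is a purely combinatorial matter.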

Strict finitism

Finitism is, of course, an idealisation too: it ignores certain differences, and distinctions based upon them. In particular, it does not distinguish between constructions which consist of a finite number of steps and those which can actually be carried out, or between configurations which consist of a finite number of discrete parts and those which can actually be kept in mind (or surveyed). In fact within any given degree of sophistication of proofs a classification according to the degree of complexity is most natural. Wittgenstein stresses (R 65) the further point that explicit definitions and new notations may convert a piece of mathematics which is not strictly finitist into one. No rigorous work has been done on this subject, partly, no doubt, because the current methods of mathematical logic do not seem to lend themselves to it.

¹⁶ For example, up to the present we do not know, for A known to be recursive, a single intuitionist proof of ¬(∀x)A(x) where we cannot specify an n such that ¬A(n) (though the general conception of an intuitionist proof makes it plausible that there are such A). Iterated implications are extremely rare in mathematics anyway.

Those logicians and philosophers whose taste leads them to work in developed fields where others have created concepts on which they can model their own, will certainly not be attracted to strict finitism at the present stage; just as they would not have been attracted to intuitionism before Heyting analysed its formal structure, and Tarski, Beth and Kripke important aspects of its interpretation. The study of finite computing machines or the human learning process makes the problems which arise here practically pressing. This concept of finitist proof in the strict sense can be applied to questions which interest logicians, such as the general notion of ‘equivalence of proofs’ or ‘content of proofs’. It seems to me that Wittgenstein gives some very interesting hints in this connection, which will be described in the next subsection.

Wittgenstein’s views

It seems to me that what Wittgenstein has to say about the reduction of arithmetic to logic is not concerned with any peculiarities of the branches of mathematics considered, but applies generally to proof theoretical reductions (mapping of proofs of one kind into proofs of another), or translations (of theorems of one system into theorems of another).

Wittgenstein’s first point is this: we do not really have a reduction of arithmetic to logic because, by the methods of logic alone, we could not decide whether a particular formula of logic corresponds to some given formula of numerical arithmetic. Also there would always be the question of whether a rule has been correctly applied (R 89, 91). This point seems wholly acceptable and, what is more, quite familiar: it concerns the metamathematical methods used for investigating relations between two systems. We do not speak of a ‘reduction’ unless the metamathematical methods are weaker in some suitable sense or, at least, more evident than the methods studied.¹⁷ In the colorless language of professional logicians one replies to Wittgenstein’s question: How do we know that the rule (of translation) has been correctly applied? by saying: the whole reduction must be considered relative to the methods of proof used in the metamathematical argument. And one would agree with him that the metamathematics required for a translation of arithmetic into logic requires some arithmetical concepts. However one would remember (e.g. from the discussion on Set Theory) that, without being a reduction, such a translation can be of central importance.

¹⁷ For instance, in his reduction of formalised classical arithmetic to intuitionistic arithmetic, Gödel was careful to use finitist methods of proof. Again, if a system is decidable, i.e. if there is an effective method of associating with every formula of the system one of the formulae 0 = 0, 0 = 1 (0 = 0 with provable ones, 0 = 1 with the others), we do not generally speak of a ‘reduction’ of the system to arithmetic modulo 2: for, in general, the method used for constructing this translation goes far beyond arithmetic modulo 2.

Wittgenstein’s next point is much more positive. Instead of just saying that the metamathematical methods used in the translation include the methods studied, he looks for appropriate weaker methods, and observes, e.g., that the reduction of the decimal notation for numerals to the stroke notation (R 66) cannot be done by strictly finitist methods. He thereby uses the notion of strict finitism for making a natural distinction.¹⁸

Wittgenstein repeatedly raises the question of characterising the equivalence of proofs in contrast to equivalence of results (R 66, 69). This is a somewhat elusive notion, a little like Heyting’s [1956] notion of the completeness of a calculus with respect to proofs and not only with respect to results, but in particular cases it is clear enough.¹⁹ Wittgenstein does not succeed in characterising this notion of equivalence or in comparing proofs.²⁰ He recognises (R 63) that under too strict a criterion of equivalence a proof will be equivalent only to itself: he does not see any objection to this, and does not attempt to find fruitful criteria. But, in my opinion, he raises an interesting problem here.

He does attempt to find a characterisation of a very general sort by basing a comparison of proofs on their applications²¹ or, as he puts it (R 155), on what one can do with them. I believe that at a certain level this is a useful approach, and that it aims at unifying our point of view: we are to look at two conceptions like the classical and intuitionist foundations of mathematics, and find a place for each from our point of view: according to their applications. Mathematicians sometimes pretend to a similar criterion of judgment, namely: mathematical fruitfulness.
But everybody knows that in subtle cases the criterion of fruitfulness may not apply, because people differ in just these cases in what they find interesting. From the discussion below it appears that in subtle cases Wittgenstein’s notion of application is no better. The aim is not achieved, and this fact seems to support the general conclusions of section 1.

¹⁸ What Wittgenstein is concerned with is not at all: time needed for a mechanical conversion, but, roughly: what can be taken in at a glance (überschaubar and einprägsam). Furthermore, he was concerned with such operations (on notations) as (those implementing) addition, multiplication, exponentiation. He did not assume that one and the same notation is a good bargain in all areas of computation! (He overlooked the possibility that relatively few may be good in relatively many cases, a statistical matter, which is anathema to philosophy!)

¹⁹ For instance, Shoenfield [1957] showed how to replace the induction schema in the elementary quantifier-free arithmetic of addition by means of a finite number of axioms, including a + b = b + a, without altering the set of theorems. Now, it is intuitively clear that the proof of a + b = b + a by induction in the original system does not have an ‘equivalent’ counterpart in the finitely axiomatised system.

²⁰ An incidental observation: I have sometimes felt that Wittgenstein’s violent dislike of the notion of mathematical truth is connected with comparisons of proofs. He could see that the comparison of proofs was informative, and he didn’t have a snappy answer to the objection sometimes made by mathematicians: But all we want to know is whether the theorem is true. The answer is: If you think about what you are saying you will see that you are wrong.

²¹ There seems to be a conflict with R 157, where he says that it is useless in the philosophy of mathematics to reformulate proofs: one would have thought that this might exhibit the application especially clearly.

Wittgenstein makes the remark on application in connection with non-constructive existence proofs.
Perhaps it is particularly easy to say what we can ‘do’ with a non-constructive existence proof because the matter of constructivity is at issue, and so a complete elimination of the non-constructive use of the quantifier is all we want, and we get it.²² In any case, what we can ‘do’ with the non-constructive proof is more informative than the non-constructive theorem. But in the more fundamental problems of the foundations of mathematics, the question ‘what can one do with it’ does not seem to help. Suppose one has a finitist and a non-finitist proof of a universal formula (∀x)A(x): What can one ‘do’ with the former which one cannot do with the latter? In both cases, for each numeral n we shall have A(n).²³
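
The contrast between decimal and stroke notation mentioned earlier in this subsection can be made tangible with a short sketch. The function names are hypothetical, and the sketch measures only the length of a stroke numeral; it is not an analysis of what can be ‘taken in at a glance’, which is the notion Wittgenstein is actually concerned with.

```python
def to_strokes(decimal: str) -> str:
    """Convert a decimal numeral to stroke notation, reading digits left to right.

    At each digit the strokes so far are repeated ten times and the new
    digit's strokes are appended (Horner's scheme carried out on strings).
    """
    strokes = ""
    for digit in decimal:
        strokes = strokes * 10 + "|" * int(digit)
    return strokes


def stroke_add(a: str, b: str) -> str:
    """Addition in stroke notation is mere concatenation."""
    return a + b


if __name__ == "__main__":
    print(to_strokes("12"))
    # A four-symbol decimal numeral already needs a thousand strokes.
    print(len("1000"), len(to_strokes("1000")))
```

Addition is trivial in stroke notation, while the numerals themselves grow tenfold with each further decimal digit and soon cease to be surveyable in any ordinary sense.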

12.3 Metamathematics

This section deals with isolated topics which are discussed at length in the Remarks.²⁴ I make no attempt to relate them to a general point of view. Some of them are so absurd that they seem due to very general misconceptions and not just carelessness (cf. R 55).

²² Suppose we have a proof of (∃x)A(x), where we think of the variables as ranging over natural numbers. The first thing that one may expect to ‘do’ with the proof of such a formula is to read off from it instructions for calculating an n such that A(n) holds. With the ordinary methods of proof this can be done directly if A is recursive. At the next level of complexity, A(x) is of the form (∀y)B(x, y). A non-constructive proof, naturally by reductio ad absurdum, of ¬(∀x)(∃y)¬B(x, y) would generally proceed as follows: suppose (∀x)(∃y)¬B(x, y); then, for a certain function f(x), (∀x)¬B[x, f(x)]; the proof (in general) shows ¬(∀x)¬B[x, f(x)], by constructing explicitly an x_f (depending on f) such that B[x_f, f(x_f)]. This formula, with its construction of (the functional) x_f in terms of f, may be said to tell us what we can ‘do’ with the non-constructive existence proof. All this involves a restatement of the theorem and a consequent reformulation of the proof without, as one says, changing the idea of the proof: cf. the preceding footnote.

²³ Incidentally, comparison between the intuitionist proof of the fan theorem and König’s proof of the corresponding theorem in classical mathematics raises the same question: what can I ‘do’ with one, what with the other? Nothing like the question of ‘estimates’ for existential quantifiers, which was central in the existential cases, is relevant here. What one is left with is simply this: if one starts with the classical conception Brouwer’s proof is simply not valid, because he makes an unjustified restriction on the methods of proof of the hypothesis which in any case is not relevant because one only wishes to assume its truth; and if one starts with the intuitionistic conception König’s proof is not valid, because he employs a (provably) non-constructive least number operator. I do not see any ‘practical purpose’ or considerations of ‘usefulness’ which could decide between the two proofs.

²⁴ For a complementary view of these topics, see Chapter 13.

Incompleteness

Wittgenstein criticises Gödel’s first incompleteness theorem or, at least, the part which states that if a suitable system of arithmetic is consistent then there is a true formula of the form (∀x)A(x) which is not provable in the system.²⁵ The arguments are wild, including such points as: an inconsistency wouldn’t matter (R 51); or how do we know that this is the correct translation of the arithmetic formula (∀x)A(x); or what does it mean to suppose that a formula is provable (R 177).

The answers to these objections are trivial. First, even if an inconsistency didn’t ‘matter’, one cannot hope to discuss significantly on this basis a result which explicitly supposes consistency of the system. Second, the elaborate details of Gödel’s paper are needed just because Gödel has to show that (on the assumption of consistency) the proposed translation is correct. And finally, one of the major purposes of considering formal systems (and it is formal systems which Gödel considers) is that a clear combinatorial (geometric) meaning is given to a formula being provable. Of course, nothing is wrong in Gödel’s results. But perhaps the following explanations are appropriate.

Logicians are quite aware of the choices open in translating a syntactic assertion in English into arithmetic form; in fact, Rosser’s variant of Gödel’s proof depends just on this fact. For the first undecidability theorem all that is needed of a translation of the proof predicate

m is (the number of) a proof of (the formula with number) n

is this: there should be an expression P(x, y) in the system such that P(0^(m), 0^(n)) is provable (refutable) in the system for the numerals 0^(m) and 0^(n) just in case m is (not) a proof of n. And this is ensured by consistency and the other conditions imposed on the system.

For a significant formulation of the second undecidability theorem the expression P has to be so chosen that one can ‘see’ that it possesses certain properties of the proof predicate; i.e. one must be able to prove formally in the system certain implications involving P. This was first analyzed by Hilbert and Bernays [1939], and later by Löb [1955] and Jeroslow [1973]. This work shows what a workmanlike approach to such a matter as an ambiguity in translation should be: it is not regarded as an inherent defect, but it is analysed till one sees what is actually needed in a particular context.

The beauty of Gödel’s own formulation is that his result can be separated from the question of truth in arithmetic. He has found a formula (∀x)A(x), A recursive, which is formally undecidable in the given system S if S is consistent. Now, given this syntactic result one argues: since any closed proposition is true or false, either (∀x)A(x) or ¬(∀x)A(x) is true, and so there is a true proposition which cannot be proved in S (on the intended interpretation of S). One sees that all one needs of the concept of truth is that either R or ¬R. Actually, our notion of truth is clear enough to argue: (∀x)A(x) is true. For if it were false for the numeral 0^(n), then ¬A(0^(n)) would be provable in S (provided all recursive predicates can be computed in S) and so ¬(∀x)A(x) would be provable, contradicting the undecidability of (∀x)A(x) in S.

To me Gödel’s results do not at all suggest that our intuitive notion of an integer is defective. I can’t believe that anyone ever had an intuition that there was a recursive decision method for arithmetic, even if, like Hilbert, he wanted one. It is not a death blow to the axiomatic approach since we now consider partial instead of complete axiomatisations. Nor necessarily to Hilbert’s consistency problem; one has to examine the methods employed in the consistency proof on their merits. In fact, what Gödel shows is that the methods employed in a consistency proof of a system of arithmetic cannot be more evident than the methods formalised in the system, simply by virtue of being more restrictive. However, just because of this Gödel’s results did destroy Hilbert’s aim of getting rid of all problems of foundations once and for all.

To me, the most striking fact is this: while nobody has ever been troubled because the particular sentence of arithmetic is undecidable, so to speak because the systems are open, Gödel’s own principle for deciding such sentences shows that they are artificially open; his principle is: if R is provable in the system then R. This principle can be formulated in each of the usual systems of arithmetic, and yet it extends the system considered.

25 The formula is one with number q which states: for every x, x is not the number of a proof of the formula with number q; i.e. a proof of itself.

Consistency

Wittgenstein’s criticism of the consistency problem ranges from a proposal to use the double negation as an enforced negation (R 53) (when ¬A and ¬¬A would not be contradictory), to the proposal of not drawing conclusions from a contradiction (R 169), to modifications in our arguments after we reach a contradiction (which is done every day, both as a correction of errors, oversights in definitions, etc.).

In fact the consistency problem as conceived by Hilbert had a perfectly specific point: if one can prove, by limited (‘understood’) methods, the consistency of a system S, then any universal recursive formula that can be proved in S at all can also be proved by the limited methods. Since he considered more elaborate formulae as ‘ideal’ elements, with no hope of assigning a clear constructive meaning to them,26 this was the most he could expect from foundations of arithmetic. In addition, it turned out that consistency proofs gave a great deal of additional information.

Here, for once, Wittgenstein also makes a justified objection (R 130) among all the wild shots which miss the mark: the onesidedness of the consistency problem. From the consistency we can draw some conclusions, perhaps more than one would expect. Yet, as Gödel has shown, there are formally consistent systems of arithmetic which are ω-inconsistent, i.e. (∃x)A(x) can be proved but, for each number n, A(0^(n)) can be refuted. This had obviously not been suspected. And even some of the results of the Hilbert school are more informatively described, e.g., as extended ε-theorems than as mere consistency results.

It is true that some popular reasons for the consistency problem are unacceptable. If, e.g. one’s aim is to save ‘our mathematics’, but if, like Hilbert, one regards only universal formulae as significant and others as auxiliary (ideal), one has not saved anything because these formulae can be proved by the methods employed in the consistency proof itself. If one is interested in scientific applications, consistency is certainly not enough: the system must be correct for the intended application, and in this case consistency will be a by-product. (Wittgenstein says, R 104, that consistency will be a by-product if one aims at a useful application: the quantum physicists wish it were).

And indeed an inconsistency is not shattering. For, if one thinks of mathematics as a body of rules and finds an inconsistency one modifies their statement: Cantor’s proofs still stand although they can be embedded in Frege’s inconsistent system. And if one thinks of mathematics as concerned with mathematical objects, the progress of science shows how our basic conceptions (R 138) may turn out to be inconsistent with experience, yet many observations made previously still stand, and only those whose interpretation is strongly tied to the particular conception are discarded.

26 Though this has been done by Herbrand and others (cf. note 22).
Finally, as is often stressed by the physicists, if mathematics is used as a handmaiden of the sciences, an inconsistency of the rules obtained by an unintended application, even if this had not been excluded explicitly in advance, may be quite acceptable.27 In any case there are ambiguities and uncertainties in the physical assumptions, so why not put up with similar faults in the mathematical manipulation? In fact, as Wittgenstein observed explicitly (R 189) one could do physics without distinguishing between mathematical and physical facts at all. In other words, we have the interesting fact that something to which our conception of mathematics does not apply, can also be used as a handmaiden of the sciences. But this fact does not invalidate our conception of mathematics which is more exacting than that just described.

When all this is recognised, the mathematical problem of consistency still stands, and is fruitful: proofs of consistency and, more generally, of independence yield, perhaps, a better control over a calculus than anything else.28 Also, the separation between mathematical and physical facts, even if artificial in places, has its value in the manipulation of physical theories, particularly when a modification of the theory is required.

Of course, when we speak of the consistency of rules, we make a mathematical assertion, and ignore certain epistemological problems, such as the one which Wittgenstein stresses particularly: how do we know that the rules have been correctly applied.29 I agree that this is also a problem. But it is completely barren within the context of consistency because it questions the very basis from which the consistency problem derives its significance. For, this epistemological problem arises with every application of a rule (and, indeed, an analogous problem arises with all communication) while the consistency problem arises with specific sets of rules.

27 Once, a mathematician derived a contradiction from the definition of a function introduced by a famous physicist, upon which the latter commented: Did I make a mess of my function?

28 In one place, Wittgenstein seems to agree with this (R 106).

Paradoxes

Wittgenstein simply did not know what to say about the paradoxes. I don’t either. But one thing is clear: the fruitful problem is not to ‘get rid of them’ but to get something out of them. Gödel got his general results on formal systems out of one set of paradoxes, Löb [1955] got a beautiful result out of another.

It is hard to say something really coherent about the paradoxes, but one can do better than speak of a head of Janus looking down on the other propositions (R 131). I myself have found the following version of Russell’s paradox illuminating, though it does not seem to lend itself to generalisation.

• Recall first that in one definition of natural numbers in set theory, the empty set corresponds to the number 0, and if N corresponds to n then N ∪ {N} corresponds to the successor of n. There is no greatest integer because N ∪ {N} is not included in N.

• Now let us consider any set A with the property P that its members do not belong to themselves, i.e. X ∈ A → X ∉ X, and hence A ∉ A. So A ∪ {A} also has the property P.

In other words the Russell paradox involves the same argument as the theorem that there is no greatest integer, and at the same time suggests a natural way of generalising the successor construction. From this point of view the Russell paradox does not seem more astonishing than a child’s assumption that there is a greatest integer: we have overlooked the fact that not every property has a definite extension.
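The successor step in the two bullet points above can be tried out on small finite sets. The following is only a minimal sketch, assuming that Python frozensets stand in for hereditarily finite sets; the names successor and has_property_p are hypothetical and chosen for illustration. Since a frozenset can never contain itself, the property P holds trivially here, so the sketch shows the shape of the argument and not the paradox itself.

```python
def successor(n):
    """Return N ∪ {N} for a set N represented as a frozenset."""
    return n | frozenset([n])


def has_property_p(a):
    """True if no member of `a` belongs to itself (X ∈ A → X ∉ X)."""
    return all(x not in x for x in a)


# Build 0, 1, 2, 3, 4 as ∅, {∅}, {∅, {∅}}, ...
zero = frozenset()
numerals = [zero]
for _ in range(4):
    numerals.append(successor(numerals[-1]))

for n in numerals:
    bigger = successor(n)
    assert has_property_p(n)        # n has the property P
    assert n not in n               # hence n is not a member of itself
    assert not bigger <= n          # N ∪ {N} is not included in N
    assert has_property_p(bigger)   # and N ∪ {N} again has the property P
```

Whatever set with the property P one starts from, the same step yields a strictly larger one, which is the sense in which there is no greatest such set.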

12.4 Wittgenstein’s Conversations About Foundations

I knew Wittgenstein from 1942 to his death in 1951. We spent a lot of time together talking about the foundations of mathematics, at a stage when I had read nothing on it other than the usual Schundliteratur. The topics raised were far from the center of interest of the Remarks,30 but full of remarks which can be found in Zettel or the Vermischte Bemerkungen.

At least in my own experience the style of Wittgenstein’s conversations on foundations (not on everyday matters!) was very different from his public performances, which were always tense and often incoherent. Without exaggeration: what Wittgenstein actually said in the seminars I attended, did not express at all well his views at the time. This seems very much to the point in connection with the Lectures (cf. note 5).

Here I shall stress less what I found useful, and more what I found agreeable - in fact, I want to stress the conflict; after all it is a bit much to ask that utile dulci should be combined. Only what is excessively bitter will not be used at all.

29 What is right about his criticism is this: if one is concerned with reliability in a realistic sense, then it would be wrong to consider only the abstract rules and not their use.

Applications

Wittgenstein wanted to explain the meaning of a concept by its use, its applications (cf. the end of Section 2). Evidently some discretion is needed here (if the program is not to be trivial): otherwise, an application of, say, a set theoretic concept would simply consist in its use for formulating properties of sets.

Such things were excluded in Wittgenstein’s Lectures in 1939 where he confined himself to applications outside mathematics, but in our conversations in 1942–43 he was particularly interested in applications within mathematics, specifically of so-called nonconstructive proofs.

The following example is typical: if f is a continuous function in the interval [0, 1], f(0) > 0 and f(1) < 0, then f(x) = 0 for some 0 < x < 1. The proof uses the so-called bisection procedure, via a sequence of nested intervals. We divide [0, 1] in half: if f(1/2) = 0 then take x = 1/2; otherwise, we consider the interval [0, 1/2] if f(1/2) < 0, and [1/2, 1] if f(1/2) > 0, and we start all over again. The nested sequence of intervals so generated converges to a zero of f.

Wittgenstein wanted to regard this proof as a first step toward the construction of x, and to restrict it by saying: the proof only gives an applicable method when the relevant decision (e.g. whether f(1/2) is equal to, greater than, or less than 0) can be done effectively (e.g. if f is a polynomial with algebraic coefficients).

I still find Wittgenstein’s suggestion (of a certain restriction) agreeable: satisfaisant pour l’esprit. But it is certainly not useful (since the restriction is hardly ever satisfied). A variant (Kreisel [1952b]) is much more useful: it applies when the restriction is only approximately satisfied, i.e. when one is able to decide not necessarily at x = 1/2 itself, but sufficiently close to it (e.g. in the case of recursive analytical functions on [0, 1]).
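The bisection procedure, in the restricted situation where the sign of f at each midpoint can be decided effectively, can be sketched as follows. This is only a minimal illustration under stated assumptions: f is taken to be a polynomial with rational coefficients (a special case of the algebraic coefficients mentioned above), exact arithmetic is done with Python's Fraction type, and the names poly and bisect are hypothetical. It is not the variant of Kreisel [1952b], which decides near the midpoint and not at it.

```python
from fractions import Fraction


def poly(coeffs):
    """Return the function x -> sum(coeffs[k] * x**k), evaluated exactly."""
    def f(x):
        result = Fraction(0)
        for c in reversed(coeffs):
            result = result * x + c  # Horner's rule
        return result
    return f


def bisect(f, steps):
    """Halve [0, 1] `steps` times, assuming f(0) > 0 and f(1) < 0.

    Returns (lo, hi) with f(lo) > 0 > f(hi), or (x, x) if an exact
    zero x is hit at some midpoint.
    """
    lo, hi = Fraction(0), Fraction(1)
    if not (f(lo) > 0 and f(hi) < 0):
        raise ValueError("need f(0) > 0 and f(1) < 0")
    for _ in range(steps):
        mid = (lo + hi) / 2
        value = f(mid)       # exact, so the three-way decision is effective
        if value == 0:
            return mid, mid
        if value < 0:
            hi = mid         # keep the left half [lo, mid]
        else:
            lo = mid         # keep the right half [mid, hi]
    return lo, hi


# f(x) = 1/2 - x^2 has f(0) > 0 and f(1) < 0; its zero in [0, 1] is irrational.
f = poly([Fraction(1, 2), 0, -1])
lo, hi = bisect(f, 20)
print(float(lo), float(hi))
```

With floating-point evaluation in place of exact rationals, the test value == 0 (and, near a zero, even the sign) is no longer reliably decidable, which is one way to see why the restriction is hardly ever satisfied in practice.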

30 One common point was this: every significant piece of mathematics has a solid mathematical core (R 142) and if we look honestly, we shall see it. I see the mathematical core in the combinatorial or constructive aspect of the proof, but I realize that there are other points of view, e.g. a purely abstract one (J.P. Serre once told me that he saw the mathematical core of the Chinese remainder theorem in a certain result of cohomology theory).

To find one’s way about

As everybody knows, Wittgenstein characterized (at least occasionally) philosophical questions as expressions of a malaise, of not knowing one’s way about a subject. Now one can hardly expect that there is only one, so to speak privileged method for getting out of this unpleasant state. Wittgenstein’s favorite method consisted in taking a fresh look at familiar material. Of course, often it is better to add something new (here: from mathematics).31

I don’t want to pursue here the question how Wittgenstein’s characterization of philosophical problems is to be supplemented; for example, by requiring a solution by means of Wittgenstein’s favorite method, which could perhaps be called literary. It seems better to mention an example which illustrates at the same time Wittgenstein’s and the mathematician’s method to find his way about.

In 1942 I attended a course by Littlewood, which incidentally I liked very much. One day he gave two proofs of the theorem by Cantor-Bendixson on the so-called perfect kernel of closed sets (cf. Kreisel [1959]). Grinning so to speak to himself he added: ‘They say the second proof uses the axiom of choice; I am sure I don’t know?’ I was terribly shocked, and didn’t even know why. (I didn’t know my way about at all, as far as Littlewood was concerned.) Of course I didn’t tell my disappointment to anybody but Wittgenstein. He was very sympathetic, and said quietly: Doesn’t he care?

Surely Wittgenstein’s comment calmed my excitement (about Littlewood’s frivolity and apathy) and was agreeable. But today - or, more precisely, for the last thirty years - I know what seems to me a much better method for the same purpose, a metamathematical theorem, which so to speak justifies Littlewood. Roughly speaking the theorem shows how to eliminate automatically the axiom of choice from proofs of results having the syntactic form of Cantor-Bendixson’s Theorem.32 Thus the axiom in question isn’t even a candidate for a convincing distinction between the two proofs mentioned above. Evidently it makes sense to ask whether Wittgenstein’s literary comment or the metamathematical theorem is more suitable.

Whatever our view may be in this particular case, one thing is certain: when one is really lost, one had better compare several (reasonable) explanations. Otherwise, an idea or a theory becomes plausible simply by the fact that the alternatives aren’t even worth discussing!

Wittgenstein often stressed in conversations how little evidence Darwin himself had for his theories: their principal attraction consisted in the fact that the well known alternative (story of the creation) was so hackneyed and unimaginative.33 (Tests of Darwin’s ideas were not really made before R. A. Fisher.)

31 For example, at school one learns to solve cubic equations by use of √ and ∛. One uses √−1 even when all roots of the equation are real. This makes one feel ill at ease. I know no better method to avoid this unpleasant state than to use Galois’ Theory and show that otherwise there just are no solutions by means of quadratic and cubic radicals.

32 Cf. the Appendix of Chapter 14.

Le style, c’est l’homme même

The matter of style came up quite often in my conversations with Wittgenstein. For example, once after he had invited F.J. Dyson, who at the time had rooms in College next to Wittgenstein’s, to discuss foundations. Dyson had said he did not wish to ‘discuss’ anything, because what Wittgenstein had to say was not different from anything everybody was saying anyway, but he wanted to hear how Wittgenstein put it. Wittgenstein spoke to me of the occasion, agreeing very much with what Dyson had said, but finding Dyson’s jargon a bit ‘odd’.

Wittgenstein’s lectures and even his notes (or at least those which he did not throw away) were very tense; actually also many of his conversations - in my presence - on everyday trivialities. In contrast he was completely relaxed when such analyses of proofs as in the first subsection above, or a bon mot as in the second were involved. Even more remarkable (for me) was the fact that on many a Friday afternoon he sketched in a few minutes the content of his two-hour seminar the following Saturday, and afterwards supplemented it with equal ease. Without exaggeration: what he actually said in his lectures did not express at all well his ideas (before and afterwards).34

Judging by my own experience I am reminded by Wittgenstein’s tense manner of the familiar state of mind where one tries to use an artificial tension so to speak to force oneself to have new ideas. But this interpretation is not convincing simply because only Wittgenstein’s intellectual attitude is natural to me, not at all his temperament. This is illustrated by two anecdotes which of course have little to do with the foundations of mathematics.

He often spoke of the wickedness of men, which bored me even then. (Unfortunately I did not know at the time the remark of Georg Büchner: Surely the Almighty will have created the world the way it should be.) When once again, towards the end of 1946, Wittgenstein was indignant about humanity I said it surely can’t be so bad; after all, one need only wait till the atmosphere becomes so radioactive by explosions of atomic bombs that many, many mutations, such as people with two heads, would be born. If people today are really so bad then things can only get better. He shook his head sadly, saying that he really didn’t want that.

A few days later he said out of the blue: ‘If anybody asked why you come to see me, I shouldn’t know what to say. But one thing is certain: you don’t come to inherit my money!’ Evidently this was intended to lead to some soul-baring which didn’t appeal to me at all. So I replied innocently: ‘Do you have a lot of money?’ (and the idea of a literary inheritance didn’t enter my head at all).

Actually, as far as one is competent to judge such things oneself, it seems to me quite clear why I liked to see him. First of all, one can easily imagine how entertaining those conversations were for me. But also he lived outside bourgeois society in a way which every teenager loves (and I right into my thirties); specifically, he lived like this without having failed in that society, let alone having been ‘disadvantaged’. Without exaggeration: he lived out a kind of (teenager’s) dream.

Today I am sorry that I avoided that soul-baring since probably - at least by the way - something of general interest would have resulted. This could help me now (when I am about as old as he was when I was meeting him) when I ask myself what the young want from me.

33 Incidentally, there is a really imaginative variant of the story of the creation (in the Bible), at least as far as life on earth is concerned, by Crick and Orgel [1973] (see also Crick [1981]). They start with the fact that living organisms need substances like molybdenum which are rare in our part of the world. So it is improbable that organisms would have come about by chance reactions according to familiar chemical laws. The variant of the story of creation consists in this: life was planted here on earth; not by the Almighty, but by beings from parts of the universe where these substances are less rare.

34 I’d be interested to know to what extent my observation - of Wittgenstein in my presence - applies to others. This would show to what extent it is appropriate to take his publications quite literally. According to Dr. Nedo, at least one other person confirms my observation. Also P. Sraffa found Wittgenstein’s conversations exceptionally elegant and relaxed.

Chapter 13

Wittgenstein on Consistency and Incompleteness

Historical background

I was very astonished by the Remarks on the Foundations of Mathematics when they came out, especially by those on Gödel’s incompleteness theorems, for reasons that I can state precisely only now; see Section 3. Certainly Wittgenstein often formulated insufficiently matured ideas (in stylistically perfect sentences); but the degree of his helplessness facing Gödel’s theorems was quite extraordinary. It expressed itself in wild dialectics; figuratively speaking, in exaggerated gesticulations.

Recently I turned over the pages of the Philosophische Bemerkungen of 1930, thus written before Gödel’s theorems, and found critical considerations concerning Hilbert’s program, more precisely concerning the essay On infinity. As a critique of Hilbert’s program these considerations are rather clumsy. But they emphasize aspects of mathematical thinking which, although certainly fruitful, were neglected in Hilbert’s program.

The above mentioned helplessness becomes entirely understandable, if one regards on the one hand Gödel’s theorems, especially the second one, as refutation of Hilbert’s program, as is customary (although Gödel explicitly warned against it), and on the other hand attempts to bring these theorems into harmony with Wittgenstein’s considerations.

Warning. I am not claiming I will provide a causal explanation of Wittgenstein’s helplessness. On the contrary, I would find an explanation based, e.g., on peculiarities of his mental make-up less satisfactory than my present aim: to make clear that this helplessness was appropriate in his ‘knowledge-theoretical situation’.

0 Originally published in International Wittgenstein Symposia, 7 (1983) 295–303, as ‘Einige Erläuterungen zu Wittgensteins Kummer mit Hilbert und Gödel’. Translated by W. Fuchs.

Summary

Wittgenstein’s considerations are principally concerned with proofs being easy to take in and remember (Einprägsamkeit und Überschaubarkeit), rather than with the validity of principles of proof per se. But they give no guidelines for the selection of such proofs among all valid ones and they ignore the relevant literature, e.g. the axiomatic analysis of proofs following the recipe of Bourbaki, that is by dissecting proofs into so-called fundamental structures. In contrast to the usual mathematical logic with its elegant metamathematical theorems (such as those about the rough notion of validity), the more subtle axiomatic analysis of proofs is content to reach its goal without the formulation of general theorems about its method. Instead of Wittgenstein’s familiar ‘psychological’ terminology one speaks here, colorlessly, of ‘useful generalizations’.

With regard to Hilbert’s program, Wittgenstein - like many of his contemporaries - was very enthusiastic about the formalization of mathematical reasoning, that is about the formalization of the formal and computational aspects, but not about Hilbert’s claims for his hobby horse: consistency. In Section 2 Wittgenstein’s uneasiness is formulated more pithily, traced back to the considerations in Section 1, and complemented by the critique of consistency as sufficient condition, which is familiar in mathematical logic. In agreement with Section 1 - and in contradiction to Wittgenstein’s own practice - the interest of consistency proofs, not just of consistency assertions, is stressed. When - again in agreement with Section 1 - the proofs themselves, not just provability, become the center of interest, then the incompleteness theorems lose their central role, since (in)completeness only concerns provability.
In Section 3 the question what incompleteness theorems are good for is then answered in a way that dawned on Wittgenstein only in the forties, which satisfied him at that time (see Historical background) and satisfies the silent majority to this day. In Section 4 Wittgenstein’s own expectations concerning the impact of his work are compared with its actual usefulness as I see it in the present context. Even if this context should prove atypical, this might still be useful for the analysis of his expectations in other fields, possibly as a contrast, leading, e.g., to a more reasonable attitude towards his posthumous papers.

13.1 Proofs Easy to Take in and Remember

Since the twenties one of the main concerns of mathematics has been to provide general guidelines for proofs to be easy to take in and remember. Yet this goal is not even mentioned, let alone analyzed. E.g., what prerequisites are necessary for being easy to take in? One speaks of generalizations or at best about the removal of unnecessary restrictions (dégager les hypothèses utiles).

Nevertheless one is concerned with the mentioned goal. For usually one starts from a long, opaque proof and dissects it - with intuition - into a few lemmas, that is to say into a structure easy to take in. In this process one tries to formulate (or, if necessary, to reformulate) the lemmas in such a way that the properties used in their proofs are easily assimilated by the memory, so that they are easy to remember. Where possible, one uses properties which occur frequently (like those defining Bourbaki’s fundamental structures, of which experience shows that they have an ‘intuitive’ appeal for us; cf. the formula: nos résonances intuitives à l’architecture mathématique). In this way the connection between perspicuity and appropriate generality is established; for these lemmas are valid for all structures possessing the used properties.

The above axiomatic analysis (more precisely: analysis using the fundamental structures; not to be confused with logical analysis in the framework of axiomatic set theory) has proved valuable in mathematical research. But beyond this it has also made a decisive contribution to the reliability and dependability of proofs.
In contemporary proofs being easy to take in and remember is decisive, because the brooding about the validity of the axioms or of the deductive rules has long ago passed the point of diminishing returns.1

A further simple but remarkable weakness of logical validity derives from the fact that presumably not all valid theorems have proofs easy to take in and remember, let alone proofs built on the recipe of Bourbaki. In this light the class of all (valid) proofs is not suitable for the intended analysis. For better chosen classes one can not only say more, but one can make decidedly more substantial assertions.2

1 Of course one can speak of an ‘idealized’ reliability in the sense of purely logical validity. But there remains the question whether this idealization is appropriate: if not, then it is bad, because it distracts from important aspects. (We will deal with many similar questions concerning logical categories later.)

2 One will certainly have to choose subclasses of the class of all valid proofs! But this in no way means that an analysis of the general notion of validity would be a useful ‘first’ step for a successful choice of subclass; cf. an analysis of the continuum when one chooses the rational, the algebraic or the p-adic numbers. (Reminder: physics does not begin with a consideration of all possible worlds.)

Incidentally, the skepticism of mathematicians about set-theoretical foundations is based on similar circumstances. Admittedly the above mentioned ‘fundamental structures’ of algebra and topology are defined set-theoretically, and their ‘fundamental properties’ can be proved from such simple axioms, i.e. such simple properties of sets, that ‘ontological’ questions are entirely irrelevant. Superficially viewed the definitions are simple in as much as they involve only one relation ∈, one predicate logical operation and one quantifier (in the language of set theory). But this yields nothing: e.g., taking it quite literally, absolutely no dissection of a proof into parts easy to take in and remember.

To repeat what cannot be repeated too often: contrary to the axiomatic analysis, the set-theoretical analysis belongs to logic (in the modern sense of the word). It became clear to Wittgenstein, and it was always clear to the silent majority, that the principal philosophical problem is whether a logical analysis is better suited than some other kind of analysis for questions concerning the nature of mathematics. Put differently: whether these questions, or which of these questions, have logical character. Compare physical questions concerning matter which can have nuclear, thermodynamic or simply kinematic character. The decision is generally the end result, not the starting point of research.

Now we are ready to apply some of Wittgenstein’s favorite slogans to the axiomatic analysis of proofs, e.g. the relatively original:

the proof constructs (i.e., in the proof one discovers) new concepts,

or the one very popular around 1930:

only the proof gives meaning to the theorem that it proves.

Naturally one must not take this too literally: e.g., there would be no need to separate theorem and proof if only the latter determined the meaning. Not so well-known is that often one is only able to formulate a more informative - in daily language: more significant - theorem after a proof of the original theorem has been found. Here are two (entirely elementary) examples.3 The first example is - entirely in Wittgenstein’s style - perhaps exaggeratedly simplicistic. A cave man conjectures that a2 − b2 = (a + b)(a − b) is valid for all even integers. Of course he is right. But the proof shows that the theorem has nothing to do with the distinctions of even and odd, integer or fraction. Therefore one formulates the (more general) theorem for arbitrary commutative rings. This notion is determined by those few properties of the even integers which enter in the proof of a2 −b2 = (a+b)(a−b). The more general theorem is more appropriate to the proof; in short: it is more meaningful. The second example4 is concerned with Russell’s paradox (that is considered as a joke, and actually as a good joke, by intelligent mathematicians, e.g. in Lit- tlewood’s A Mathematician’s Miscellany). One starts with the ‘defining’ property of Russell’s set r: ∀x(x ∈ r ↔ x 6∈ x).

1. Instead of looking at r one starts with r1:

∀x(x ∈ r1 → x ∉ x).

³ Warning. The fruitfulness of the analysis, and thus of the slogans, becomes clear only in more difficult proofs, which really require an analysis. Experienced mathematicians will have no trouble in finding impressive examples in higher mathematics. And the others won’t find it difficult to believe, since otherwise this kind of proof-analysis could not have remained one of the main directions in mathematics for the last fifty years.

⁴ This example will be important in Section 2, because it shows that in axiomatic analysis ‘proofs’ of contradictions do not play any privileged role: they are made in the same way as valid proofs, especially those that one does not understand.

For all such r1 (and there are obviously many of them, e.g. ∅, ω, etc.) one has: r1 ∉ r1 and thus r1 ∪ {r1} ≠ r1. Further,

∀x(x ∈ r1 ∪ {r1} → x ∉ x).

So r1 is certainly not the largest set satisfying one ‘half’ of the conditions on r. Therefore there is no largest set r1, let alone a largest one satisfying the complete condition imposed on r.

2. If wanted, one can also consider the second half of the condition on r, that is:

∀x(x ∉ x → x ∈ r2).

Now substitution of r2 for x yields: r2 ∈ r2.

In contrast to 1, no convincing examples of such r2 present themselves. A universal set will certainly satisfy the condition for r2, and therefore it is an element of itself: but this is hardly sensational news in the case of a universal set.

At any rate, under the present circumstances 1 is more fruitful than 2. Without doing violence to the language one can say that the proof of the paradox, and already part 1 of it, has led to the new concept r1, where r1 is defined or ‘constructed’ by the property considered in 1. Certainly the results of 1 make more sense than the paradox itself, i.e. the logical triviality

∀r¬∀x(x ∈ r ↔ x ∉ x).
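The steps of parts 1 and 2 can be collected in one place. The following LaTeX sketch only restates the formulas of the text with the substitutions written out; the layout and the justifications in the right-hand column are supplied here for illustration, and nothing beyond ordinary predicate logic is assumed.

```latex
% Requires amsmath. r_1, r_2 are the sets called r1, r2 in the text.
\begin{align*}
\text{(1)}\quad & \forall x\,(x \in r_1 \rightarrow x \notin x)
  && \text{first half of the condition on } r\\
& r_1 \in r_1 \rightarrow r_1 \notin r_1
  && \text{substitute } r_1 \text{ for } x\\
& r_1 \notin r_1
  && \text{propositional logic}\\
& r_1 \cup \{r_1\} \neq r_1
  && \text{since } r_1 \in r_1 \cup \{r_1\}\\[4pt]
\text{(2)}\quad & \forall x\,(x \notin x \rightarrow x \in r_2)
  && \text{second half of the condition on } r\\
& r_2 \notin r_2 \rightarrow r_2 \in r_2
  && \text{substitute } r_2 \text{ for } x\\
& r_2 \in r_2
  && \text{propositional logic}\\[4pt]
& \forall r\,\neg\forall x\,(x \in r \leftrightarrow x \notin x)
  && \text{a set satisfying both halves would give } r \notin r \text{ and } r \in r
\end{align*}
```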

Autobiographical remarks

For a time my confidence in Wittgenstein’s thoughts was strongly influenced by the facts that:

1. he did not recognize, and consequently did not stress, the close relations between his own aims and those of the customary ways of doing science;

2. he rejected his earlier problems from the traditional theory of logical foundations (such as the problem of validity per se) in a highly dramatic manner (as illegitimate, or at least unclear or even as incapable of precise formulation), instead of simply saying that they are not fruitful.

Exactly this dramatization (in 2) had at first an effect on me which was opposite to his intended aim, which was to direct attention away from validity towards, say, ease to take in and remember. It produced in me an itch to make precise different kinds of validity, such as the constructive or predicative ones, an endeavor in which I always succeeded: luckily, since itching and human dignity are incompatible. Only afterwards was I able to investigate at leisure what these achievements are good for: more accurately, whether these precisely defined logical categories are adequate for a treatment of the general questions of the Philosophy of Mathematics, e.g. the already mentioned reliability of mathematical reasoning.

I was also slow in recognizing the full extent of 1, in particular the relevance to these general questions of already existing, axiomatic (so-called structural) analyses; in genuine competition, so to speak, with the logical foundations. But I must also confess that for the longest time I did not quite take in the very brief objections of, e.g., Bourbaki, to logical analyses: that they are simply boring (le côté le moins intéressant).

13.2 Hilbert’s Program and Consistency

Like many others around 1930, Wittgenstein was decidedly enthusiastic about the main component of Hilbert’s program: formalization. In particular he was convinced that all (essential) insights about mathematical reasoning could be formulated by referring to its formal and computational features. But he did not commit himself to specific formal rules; in contrast to Hilbert’s ‘methodologically pure’ systems for logic, arithmetic, geometry, etc.

On the following two points he was more critical than Hilbert. Firstly (and this corresponds to his objections in Section 1 to the notion of validity) he thought it not fruitful to consider all calculations of a ‘calculus’. Put differently: formal provability (even by limited means) without regard to ease to take in and remember seemed to him a bad idealization - with the understood exception of particularly coarse-grained questions.

Secondly, he was disturbed by Hilbert’s exaggerated claims for the importance of consistency. (That they are exaggerated is, incidentally, obvious.) It is known that others, e.g. Brouwer and Russell, were also very critical. Less well known are the pregnant - and still relevant - criticisms by Gödel in his lecture to the Vienna group (Königsberg 1930), and by Gentzen at the end of his consistency proof for arithmetic ([1936]): consistency at best guarantees the validity of universal theorems (and even this only for certain systems, namely those which are complete for purely numerical formulas), whereas in practice one is rather more interested in existence theorems.

Scattered among Wittgenstein’s many ill-considered comments on consistency there are also a few worth mentioning, e.g. the following. ‘Proofs’ of a contradiction are nothing out of the ordinary since, at any rate in the axiomatic proof analysis considered in Section 1, such ‘proofs’ and proofs that one does not understand (well) are dissected and ‘torn up’ in a similar manner (cf. note 4). Somewhat more superficially: consistency cannot possibly be decisive, because it can be achieved at very small cost, i.e. by just not looking at contradictions. It turns out that this simple comment can be made precise, and it yields a sensible complement to the usual formulation of Gödel’s second theorem.⁵

Complementary remarks on formalization

In Hilbert’s program for ‘securing’ mathematics the formal and computational aspects were to be used for the analysis of the reliability of mathematical reasoning, of all things. The (undisputed) attractiveness of this plan is based on the following confusion. It is indeed true that formal rules are simply the prerequisite for reliable data processing, if the data themselves are formal objects, as in the case of the digital computer. But nothing in our experience with customary mathematical reasoning suggests that its reliability is enhanced if one forces oneself to forget everything that one knows about the data (e.g., their interpretation) except their formal structure; if one, so to speak, cuts out half of one’s brain.

Of course formal data processing has its value: more or less sensible people spend millions of dollars every year on it. Formalization of the usual mathematical reasoning, a very tiny part of information processing, is of value occasionally, but not in connection with reliability and validity. This is no misfortune, since the validity of a proven theorem is by no means the only ‘virtue’ of its proof.

Autobiographical remark

As was already stressed in Section 1, the proof of a theorem, in particular a consistency proof for a certain system, often leads to a more significant theorem. For example, nowadays one uses consistency proofs to formulate functional interpretations, even of (already proved) existence theorems. As far as I know, Wittgenstein never looked at a consistency proof: he only talked or thundered about the notion of consistency. Without exaggeration, the verbosity of his Remarks on consistency is hardly distinguishable from the idle talk one engages in to waste time. By this he has confirmed his own remark that (his) philosophical problems arise from (his) laziness in mathematics.

⁵ In this way one finds entirely natural formal systems which are not complete for purely numerical formulas. Without exaggeration, this unmasks Hilbert’s generalization of the notion of consistency for arbitrary formal systems:

at least one well-formed formula is not deducible.

For this generalized consistency cannot do those things to which the usual consistency owes its significance, e.g. (as mentioned) in relation to universal theorems.

13.3 From Provability to Proofs (explanations about a shift of accent)

Since completeness and incompleteness only relate to provability, and have nothing to do with the structure of proofs, they lose their central role. More precisely: completeness (implicitly: completeness with respect to a specific interpretation or ‘semantics’) only requires that there exists a proof in the system of every statement which is valid (under this interpretation). Thus very little is said, in general, about the actual possibilities of insights, e.g. of proofs easy to take in and remember. The philosopher is, sensibly, more interested in these possibilities than, e.g., in the restrictions to a more or less artificial formal system taken from the usual literature.⁶

What happens to incompleteness proofs when incompleteness itself loses its ‘fundamental’ significance? A normal person remembers the good advice: we have nothing to fear but fear itself. In other words, such proofs have more meaningful consequences (than mere incompleteness).

An anecdote from the forties

A few days after receiving several short, reasonable explanations of Gödel’s incompleteness proofs Wittgenstein opined, full of enthusiasm, that Gödel must be an exceptionally original mathematician, since he deduced arithmetical theorems from such banal - meaning: metamathematical - properties as consistency. In Wittgenstein’s opinion Gödel had discovered an absolutely new method of proof.⁷ What he meant was that the metamathematical interpretation (made possible by the arithmetization of metamathematical concepts) makes the relevant arithmetical theorems immediately evident. This can be compared to the geometrical interpretation of algebraic formulae, such as ax² + ay² + bx + cy + d = 0, from which it becomes obvious that two such equations cannot have more than two common roots (x, y), since two circles can intersect in at most 2 points.
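For comparison, the purely algebraic route to the same bound can be sketched as follows. The normalisation (dividing each equation by its leading coefficient, assumed nonzero), the primed coefficients and the restriction to real roots are assumptions introduced here for illustration; they are not in the text.

```latex
% Requires amsmath. Two circle equations, each divided by its nonzero leading coefficient:
\begin{align*}
x^2 + y^2 + b\,x + c\,y + d &= 0, \\
x^2 + y^2 + b'x + c'y + d' &= 0.
\end{align*}
% Subtracting removes the quadratic terms and leaves a linear equation:
\[
(b - b')\,x + (c - c')\,y + (d - d') = 0 .
\]
% Case (b, c) \neq (b', c'): this is a line. Solving it for one variable and
% substituting into the first equation gives a quadratic in the other variable
% whose leading coefficient is positive over the reals, hence at most two
% common real roots (x, y).
% Case (b, c) = (b', c') and d \neq d': the linear equation reads d - d' = 0,
% which is false, so there are no common roots at all.
```

The geometric reading reaches the conclusion at a glance; the computation needs a case distinction.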

⁶ Experts may think here of complete systems of algebra, such as the theory of real closed fields, and give thought to the fact that such systems do not exhaust the branch of algebra under consideration: one also uses topological methods, for which no complete system of (formal) rules exists. It is obvious, but nevertheless often overlooked, that the practice of using non-formalizable notions for the solution of problems for which a complete theory exists removes, so to speak, the philosophical sting from incompleteness (to use a phrase of Wittgenstein).

⁷ Evidently this drop of sound common sense can replace (with profit) a whole cloud of wild dialectics in Wittgenstein’s Remarks. These specific talks with Wittgenstein I forgot in the fifties (until 1979), but naturally not their general drift which is irreconcilable with that of the Remarks. This shows that the disagreeable surprise (mentioned in the ‘Historical background’) caused to me by reading the Remarks when they came out was actually justified.

One should remark that, from a logical point of view, the proof of

consistency → G,

where G is Gödel’s formally undecidable formula, by no means requires new logical tools. (For experts: the implication has a primitive recursive proof.) The novelty is within elementary arithmetic. Indeed, a look at G is enough to see that G expresses its own formal unprovability, if the system is consistent. Thus Gödel’s incompleteness proof is seen in a positive light; incidentally, entirely in harmony with Gödel’s attitude in his paper on the subject.

Analyses of theorems and of proofs

It is worthwhile to consider at this point Cantor’s diagonal method, namely its application to the problem whether there are transcendental, that is non-algebraic, real numbers. Certainly the method is not necessary for this problem. Liouville had proved, more than ten years before Cantor, that e.g. Σ 10^(−n!) is such a number. Despite this Cantor’s proof is of interest, because it shows that the coarse question whether such numbers exist (not whether e, π, e^e are transcendental) can be dealt with by entirely elementary cardinality considerations. Only a logical cripple can fail to see that Cantor established a new method of proof in the theory of transcendental numbers.⁸

Expressed differently: one distinguishes between the analysis of theorems and the analysis of proofs. Provability or non-provability (by some means or other) is obviously a property of a theorem. But proofs - even of the same theorem - differ from each other. It is the task of the analysis of proofs to find appropriate concepts in terms of which essential relations, in particular distinctions, between proofs can be formulated. Up to now there are reasonable and decidedly useful concepts of this type, but none that come close to being as impressive as the cyclopean (alias fundamental) logical concepts; just as the concepts and aims of modern higher arithmetic are not nearly as impressive as the Pythagorean claim that number is the measure of all things.

Remark on the danger of a disastrous invasion of logic. It is certain that for some people the analysis of theorems is a distraction from the much more difficult analysis of proofs. But nothing in our scientific experience indicates that in future a potentially fruitful contribution to the analysis of proofs in the sense of Wittgenstein (e.g. the axiomatic analysis mentioned in Section 1) would be ignored in the long run. On the contrary: the literature is full of trifles (concerning proofs) that occupy the logicians, just because they can do something with them. There is no lack of - youthful - eagerness in this world; but rather of sensible leadership.

⁸ Cf. the misguided decision by A. Weil to eliminate in the second edition of his book Basic Number Theory the reference to Cantor in the chapter on transcendence.

13.4 Wittgenstein’s Expectations (repeatedly expressed in the prefaces of his books)

Above all the Remarks were meant to stimulate the reader to have his own thoughts; especially those readers who had already come close to Wittgenstein’s thoughts. Of course a very special kind of stimulation was meant; rather a reinforcement of the reader’s thoughts: just as one reinforces the faith of the faithful, and does not try to convert those of different persuasions.

This expectation was confirmed by my own experience. When they came out, the Remarks did not help me at all. Since the end of the sixties I had myself been considering structural properties of proofs. After a lecture in 1973 in which I presented these ideas and their development (also by Statman), Nagel drew my attention to the fact that these tendencies (certainly not the details) reminded him of Wittgenstein’s Remarks. I was absolutely unaware of this connection before then. But I am entirely aware of the additional confidence in my own thoughts that I derived afterwards from leafing through, e.g., Wittgenstein’s Zettel. Added to this was a certain pleasure, at his skillful formulations and at my reformulations of his less skillful ones. Incidentally, I only noticed later how strongly Bourbaki’s axiomatic mathematics (certainly not axiomatic set theory), which has been popular since the twenties and which was discussed in detail in Section 1, goes in a similar direction.

In contrast, I cannot assert that Wittgenstein’s specific examples gave me a fruitful impetus for further work. That proofs should be easy to take in and remember, one accepts very willingly. But this does not even hint at how to find guidelines leading to these fine properties, and certainly does not mention that there is literature about the guidelines, e.g. the dissection of proofs on the scheme of Bourbaki’s fundamental structures. (Of course one can expect that in due time a few more fundamental structures will be added. But in any case it is remarkable how many proofs which have an intuitive appeal for us are amenable to an analysis by so few fundamental structures.)

At least until now I have also had no use for another one of Wittgenstein’s chief concerns, his fussing about clarity and clarification. In a trivial way one can achieve clarity very cheaply, as Wittgenstein got consistency cheaply in Section 2: by not looking at what is not clear.

Less trivial, but in my opinion often erroneous, is Wittgenstein’s contraposition of clarification of existing knowledge and new constructions. That things are a bit hectic when many are occupied with a task (here: to create something new) is hardly surprising: short pendulums also behave in this fashion. It is a fact that we see the earth as well as the stars more clearly, if we build a (new) spaceship and look at the world from outside the atmosphere of the earth. In mathematics numerical calculations become more perspicuous, if we first equip ourselves with something new, e.g. suitable algebraic structures, and then look (anew) at the Old or interpret it with the aid of the New.

Wittgenstein himself commented on this somewhere: that (possibly new) concepts are needed to describe the facts. The same is true of clarification; more precisely: one needs not only new concepts, but also new knowledge, new facts.⁹

⁹ Only abundance brings clarity . . . A ‘free’ translation of (Schiller’s): und im Abgrund wohnt die Wahrheit (and in the abyss lives the truth) might be: simple true laws must be about things or concepts that are not on the surface; because we know anyhow that the phenomena with which we have to deal are complicated.

Chapter 14

Wittgenstein and Bourbaki on Traditional Foundations

Trivially, Wittgenstein’s revised views in the decade after the Tractatus was completed are ‘revolutionary’ for traditional foundations, being a reaction against them; but they are quite close to those of many thoughtful mathematicians, for example, in Bourbaki’s most interesting (and no longer well-known) manifesto: L’architecture des mathématiques [1948]¹ (cf. the Appendix of Chapter 5). A principal problem was to put those views into words convincingly, and Wittgenstein was aware of this fact; thus the last sentence of the Lectures on the foundations of mathematics stresses:

[The seed I am most likely to sow is] a certain jargon.

The main aim of this chapter is to restate the complaints of Wittgenstein and Bourbaki about traditional foundations, with due regard for the discoveries of mathematical logic (cf. the Appendix), which those authors neglected. By and large, at least in my view, the discoveries of logic support the principal complaints. For balance, some local virtues of traditional foundations will also be mentioned.

⁰ This chapter is based on ‘Wittgenstein’s Lectures on the foundations of mathematics’, Bulletin of the American Mathematical Society, 84 (1978) 79–90.

¹ Occasionally, one has to read between the lines. Thus the negative remarks (toward the end of [1948]) about categorical axioms (in contrast to the axioms for Bourbaki’s basic structures, which are realized in many structures) are naturally interpreted in opposition to a preoccupation of traditional foundations, the analysis of so-called informal notions. Some of the best-known analyses of this sort have been of little use: Would Gauss’ Disquisitions have been better if he had started with Peano’s axioms? Less trivially, in practice (both in pure and in applied mathematics) the particular informal notions we start with often turn out to be unmanageable or otherwise unrewarding, and it is simply better to axiomatize those properties of such notions which have been used (for some striking conclusion). What Bourbaki actually say, about the ‘sterility’ of categorical axioms, is a bit glib. By neglecting such axioms altogether, one loses (pedagogically) useful explanations of the choice of familiar axioms in algebra and of so-called formal independence results.

Strategy and tactics

Naturally, Wittgenstein’s and Bourbaki’s principal, though not their only, target is the best-known branch or ‘school’ of traditional foundations, familiar from the formal-deductive presentation of mathematics in a universal system, of the kind often found in the first chapter of a mathematical text (but barely referred to later). Wittgenstein was most familiar with so-called logistic foundations going back to Frege and Russell, Bourbaki with its set-theoretic variant going back to Cantor and Zermelo. Usually the universal system has a single ‘primitive’ symbol, ∈, besides the logical operations; P ∈ Q is read as: P has the property (or ‘structure’) Q in logistic interpretations, and P belongs to (the set) Q in set-theoretic ones. It is then claimed that this sort of presentation provides the ‘fundamental’ analysis of mathematical notions (and proofs), and, at least occasionally (for example, by Russell), that the use of a single primitive reflects the ‘unity’ of mathematics.

Probably the most obvious difference between Wittgenstein’s and Bourbaki’s tactics in discussing such traditional claims is this: Bourbaki refer to wide experience in mathematics, while Wittgenstein uses very elementary examples. The latter are elegant (and popular, because most foundational issues present themselves when we know little), but leave open to what extent they are representative of wider experience, too.

A more basic difference is strategic. Bourbaki simply record their impression (of set-theoretic foundations):

This is only one side of the matter, and the least interesting at that,

and then go on to describe a better alternative with the same general aim: to exhibit, in terms of Bourbaki’s basic structures, what is vaguely called the nature of mathematics (Bourbaki speak of ‘unity’ too, though there are several such structures).² Bourbaki’s strategy leaves cold those professional philosophers who assume that the proper way to achieve that aim must use the notions prominent in traditional foundations (they dismiss Bourbaki as mere mathematicians lacking the higher sensibility needed for a true interest in traditional foundations!).

In contrast, Wittgenstein reacts to (admittedly, exaggerated) claims of logistic foundations and attempts to convert the fundamentalists by ‘deflating’ the notions and thus the so-called fundamental problems of traditional foundations, stated in terms of those notions. In his words, he wants to

show the fly the way out of the fly bottle.

² Bourbaki observe the academic proprieties; they are respectful about the language (of set-theoretic if not logistic foundations). But when they come to the importance of their basic structures, they do not even mention that such structures can be defined in the language of sets (and membership, ∈). Thus by implication they dismiss the familiar claim that this definability constitutes the ‘unity’ of mathematics.

He does this with much ingenuity and patience, and some overkill (e.g. he speaks of ‘the disastrous invasion of mathematics by logic’, cf. Chapter 15), while Bourbaki ignore the fly which does not see the way out by itself (by the light of the basic structures).

Even granted Wittgenstein’s pedagogic aim above, his style does not seem efficient. In my opinion, current mathematical logic, which has developed several notions of traditional foundations (has, so to speak, given them rope), seems much better, and some of those developments have positive interest to boot (cf. the Appendix).

General complaint: deceptive abstractions

When we know little (but want to make general, ‘big’ statements), we necessarily use superficial generalities, so-called abstractions.³ Being unquestionably venerable, all branches of traditional foundations concern principally early generalities; for example, they all agree in regarding such superficial ‘defining’ properties as validity (of proofs) or existence (of mathematical objects) as principal subjects of study, refined by almost equally obvious (and early) subdivisions into broad categories:

• constructive and nonconstructive validity

• concrete and abstract (in particular, infinite) objects.

Different branches of traditional foundations usually differ in picking on one category or the other as comprising all (justified) mathematics, or in the exact boundary they draw; cf. logistic and set-theoretic foundations mentioned above, or finitist and intuitionist (constructive) foundations.

Debates over traditional foundations generate much heat, an occupational hazard (or attraction) for philosophers, who trade in justifications and similar hot commodities. More importantly, those debates obscure what (on general scientific experience) is most suspect: the assumption that those broad categories, which are certainly on the surface, should serve for a ‘fundamental’ theory of (mathematical) phenomena which, on the surface, strike us by their diversity.

This suspicion is further obscured, in effect if not by intention, by, so to speak, the opposite assumption: that (most of) those early ideas are not only unrewarding, but simply incoherent or, at least, very difficult to make precise. This is simply false. Quite a number of traditional so-called informal notions have been analyzed precisely and convincingly: both ‘grand’ and ‘modest’ ones. And not only in traditional foundations (length of curves, formal rules, etc.), but also in traditional physics (rigid body, ideal fluid, perfect gas, etc.): it just so happens

³ As an example, Bourbaki cite general talk about the ‘experimental method’ being the link between physics and biology (in contrast, as we know now, to the general laws of molecular biology, which, quite literally, are not superficial at all).

that most of these often very appealing concepts have turned out not to serve very well for a fundamental theory (and others less easy to develop, like chemical composition, were more essential).⁴

Of course, precision by itself is rarely enough to inspire universal confidence; astrology is a good (extreme) example, as venerable as traditional foundations, and a model of precision and clarity. And thorough or even brilliant justifications of definitions can be quite sterile; thus analyses of the area of a triangle in Hilbert’s Foundations of Geometry have not helped with the (genuine) problems of defining the area of a surface.

In short, the general complaint (of Wittgenstein and Bourbaki) is that traditional foundations may be poor philosophy, in the broader popular sense of ‘philosophy’; specifically, if in practice the general aims of foundations are better served by alternatives, for example, by ordinary careful scientific research and exposition. In my opinion this alternative is particularly superior to traditional foundations with regard to reliability, at least, in the bulk of mathematical practice. This conclusion is of course quite consistent with the Appendix and note 4, which show that some developments of traditional foundations have occasional use (and appeal; being easy to handle, like other developments of simple-minded notions).

⁴ For example, the definitions of length of curves (by use of calculus) and of (planar flow of) ideal fluids (by use of function theory) express correctly the notions intended: this is not the issue at all. Both notions are of course so-called theoretical idealizations: only, ‘length of curves’ does, and ‘ideal fluid’ does not, isolate a dominant factor in (the bulk of) geometry and, respectively, hydrodynamics. Incidentally, it is now generally recognized that the notion of formal or, equivalently, (idealized) mechanical rule in the sense of recursion theory is a poor idealization for the study of computers; but, for example (see the Appendix), it is a good tool in algebra (and number theory) - to be compared to those parts of function theory which were originally developed for the analysis of ideal fluids, and have found impeccable uses elsewhere.

Principal complaint: explicit definitions

For all branches of traditional foundations the matter of explicit definitions is utterly trivial: for validity because such definitions can be systematically eliminated from proofs, and for so-called ontology because no existential assumptions are involved. Both Bourbaki and Wittgenstein emphasize - of course in accordance with ordinary mathematical experience - the choice of explicit definitions, as incomparably more significant than the glamorous preoccupations of traditional foundations, not only for discovery (a ‘mathematical’ affair), but also for intelligibility (a principal factor in reliability, and hence a more strictly ‘foundational’ business). In brutal terms: an idealization of reliability for which this factor is trivial is a poor idealization.

Bourbaki treat explicit definitions at length (at least, implicitly), in connection with the use of basic structures for solving ‘concrete’ problems. The scheme is this:

1. A structure S is explicitly defined in ‘concrete’ (say, number-theoretic) terms.

2. S is shown to be a basic structure, for example, (shown to satisfy the axioms for) a group.

3. Known properties of the basic structure are used to yield number-theoretic information.
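A standard textbook instance of the three steps, supplied here purely as an illustration (the example is not taken from Bourbaki’s manifesto or from the text), is the derivation of Fermat’s little theorem from Lagrange’s theorem:

```latex
% p denotes a prime; S and the integers u, v are names chosen for this sketch.
\begin{enumerate}
  \item $S = \{1, 2, \dots, p-1\}$ with multiplication modulo $p$
        (an explicit, number-theoretic definition).
  \item $S$ is a group of order $p-1$: it is closed because $p \nmid ab$
        whenever $p \nmid a$ and $p \nmid b$, and inverses exist because
        $\gcd(a,p) = 1$ gives $ua + vp = 1$ for some integers $u, v$.
  \item Lagrange's theorem (the order of an element divides the order of a
        finite group) then yields, for every integer $a$ with $p \nmid a$,
        \[ a^{\,p-1} \equiv 1 \pmod{p}. \]
\end{enumerate}
```

The number-theoretic conclusion in step 3 uses only a general property of groups; the specific work lies in steps 1 and 2, the choice and verification of the explicit definition.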

Without exaggeration: experience shows that the conscious use of the scheme literally alters our view of mathematics, and so, in the popular sense: our philosophy of mathematics.⁵

Wittgenstein stresses the importance of explicit definitions in so many words, specifically, in connection with logistic foundations of numerical arithmetic (a principal topic of his own, limited study of logic). Here structures are explicitly defined in logical terms, and shown to satisfy familiar arithmetic laws. (This kind of thing calmed Frege’s indignation at the ‘logical scandal’ of being left speechless by the question: What is the number 1?) Wittgenstein stresses the following aspect of logistic foundations (which Frege and Russell, and of course Wittgenstein in his youth, had ignored as being trivial ‘in principle’). If the logical formula FA expresses the arithmetic theorem A, knowledge of A is needed not only to recognize this fact, which goes without saying, but simply to prove FA convincingly.

An analogue to this is used frequently in current algebra: if A is a theorem about ordered fields, and FA the corresponding logical formula, then FA is in fact proved by using set-theoretic or algebraic operations on ordered fields. Sure, by the completeness theorem, FA has a proof using only the rules of ordinary predicate calculus, too; but this fact, which is certainly fundamental if the assumptions of traditional foundations are granted, has turned out to be quite marginal in practice. In short, as a matter of empirical fact, arithmetic does more for logic than logic for arithmetic.

Specific complaints about some glamor issues

Wittgenstein had a particularly strong aversion to one of the more dramatic topics of traditional foundations: the matter of contradictions (as in the paradoxes) or their absence, consistency (as in Hilbert’s program). Incidentally, at least by

5The scheme would be significant for traditional foundations if in striking applications the means used to establish 2 and 3 were foundationally problematic, not ‘reducible-in-principle’ to the usual methods of number theory. Not only logicians but, occasionally, also mathematicians assume that this must be so. They are wrong. The scheme above is unquestionably effective in existing practice also where the assumption is demonstrably false, as shown up to the hilt by the work referred to in the Appendix, for a wide range of precise formulations of the notions involved. Wittgenstein and Bourbaki on Traditional Foundations 244 implication, Bourbaki too are unimpressed; treating consistency (or the existence of some model) as a by-product; for example, the model of the - theory for the field C of - complex numbers√ furnished by the Euclidean plane, which was originally hailed for ‘legitimizing’ −1, is reinterpreted as a useful property of the plane. Be that as it may, the familiar dramatics about consistency, etc., are unconvincing. For one thing, one is accustomed to oversights or blindspots which result in straight errors and possible contradictions; cf. Hilbert’s recipe for proving Fermat’s conjecture (finitistically!): so lange herumrechnen, bis man sich endlich verrechnet (calculate, until you finally find a mistake). Also, there are confusions between notions: when P is true for one of them, and ¬P for another, it is simply futile to ask for a precise location of ‘the’ error. On the other hand, generally speaking, consistency alone is not too reassuring (from experience with skilled liars). Wittgenstein had a pet complaint: Why not ensure consistency trivially, by modifying the rules in the obvious way? Though he asks, in effect, what would be lost by this, he doesn’t really stop for an answer. 
Actually, his point is well illustrated by Rosser [1936] (without having been recognized explicitly by its author).6

Another matter which had long been prominent in traditional foundations, and especially in the writings of Wittgenstein’s teacher, Bertrand Russell, is the topic of higher (infinite) cardinals.7 Wittgenstein was particularly offended by the use of the harmless diagonal construction to support heavy infinities; ‘harmless’ inasmuch as - to Wittgenstein - the point of the construction was perfectly well illustrated by proving that the set of (ordinary) polynomials in one variable cannot be enumerated by a polynomial in two variables. Wittgenstein preferred to use the construction in the context of rules (for partial functions), specifically for proving Gödel’s incompleteness theorem, incidentally without appeal to the liar paradox. Wittgenstein considered rules r1, r2, ... for sequences of 0 and 1, where some rn says: put 0 (respectively 1) at the m-th place if rm tells you to put 1 (respectively 0) at the m-th place.
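
The diagonal rule just described can be imitated in a few lines of Python. This is only a sketch under assumptions of ours, not a formulation found in Wittgenstein or in the text: a rule is modeled as a function from a place m to 0, 1 or None (None standing for ‘the rule puts nothing there’), places are numbered from 0 rather than 1, and the names Rule, make_diagonal, in_progress and the three sample rules are hypothetical. The diagonal rule puts at place m the opposite of what the m-th rule puts at place m; asked about its own place it would have to consult itself, which the sketch reports as None.

```python
from typing import Callable, List, Optional

# A rule maps a place m to 0, 1, or None (None: the rule puts nothing there).
Rule = Callable[[int], Optional[int]]


def make_diagonal(rules: List[Rule]) -> Rule:
    """Build a rule that flips, at each place m, what rules[m] puts at place m."""
    in_progress = set()  # places whose value is currently being computed

    def diagonal(m: int) -> Optional[int]:
        if m >= len(rules) or m in in_progress:
            # No m-th rule, or the rule is consulting itself: nothing is put.
            return None
        in_progress.add(m)
        try:
            value = rules[m](m)
        finally:
            in_progress.discard(m)
        return None if value is None else 1 - value

    return diagonal


# Hypothetical sample rules, indexed from 0 for convenience.
rules: List[Rule] = [
    lambda m: 0,       # always puts 0
    lambda m: m % 2,   # alternates 0, 1, 0, 1, ...
    lambda m: None,    # never puts anything
]
diagonal = make_diagonal(rules)
rules.append(diagonal)  # the diagonal rule is itself rule number 3
own_place = len(rules) - 1

values = [diagonal(m) for m in range(len(rules))]
print(values)  # [1, 0, None, None]
```

The last entry is the interesting one: at its own place the diagonal rule has no value to flip, so nothing is put there. The guard against self-consultation is a design choice of the sketch; without it the call would simply recurse without end, which is another way of seeing that the value is left open.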

(So rn says: put nothing at the n-th place - and the value of rn(n) is undecided.)8

Bourbaki’s manifesto does not seem to commit itself on the matter of higher cardinals. But it seems fair to say that they would study problems about the natural numbers by means of (suitable generalizations to) finite fields rather than, say, by means of infinite ordinals (which Sierpinski attempted to do, for example, in his counterexample to an analog of Fermat’s conjecture).

While the substance of Wittgenstein’s complaints is certainly eminently reasonable, the ordinary style of mathematical logic is more efficient: one reformulates the theorems involved.9 In my opinion, Wittgenstein is too tolerant of traditional foundations, for example, far too soft on formalization as a (necessary) condition for mathematical rigor. Thus, in an exchange with Turing on the subject of making ordinary proofs ‘more’ formal, Wittgenstein does not question this aim but merely assumes (Lectures, p. 127) that it would be ‘easy’ to do, incidentally, contrary to an almost universal opinion.10

The business of formalization is at least as prominent in traditional foundations as Wittgenstein’s pet aversions (and perhaps not so easy to put into perspective). Problems about formal rules are of permanent interest when detail (not mere existence) is in question: for example, in a choice among complete sets of rules, and its effect on the geometric structure of the corresponding formal derivations of given theorems. By the nature of the case the choice of relevant detail requires more than the kind of general, superficial impressions on which the notions and problems of traditional foundations are based: for example, notions of (formal) rigor. Today the most obvious nondoctrinaire need for formalization comes from the application of computers to proofs: trivially, computers operate only on formal data (here: formalized proofs), and they are certainly needed when measures of proofs are relevant which are hard to calculate by hand (for example, the genus).11

6 Wittgenstein’s idea of modifying any system F into an obviously consistent one FR is treated systematically in Kreisel and Takeuti [1974]. If F itself is consistent, FR has not only the same theorems as F, but even the same proofs. What is ‘lost’ by the passage from F to FR becomes clear by stating the hypotheses of Gödel’s second incompleteness theorem properly: in the usual systems F the consistency of F cannot be proved; in FR (the consistency of FR can, but) the adequacy of FR for numerical arithmetic cannot be proved even when FR is adequate.

7 Concerning lack of discretion in the use of traditional foundations, Cantor speculated about higher infinite cardinals bringing us closer to the Almighty, with inconsistent manifolds keeping us at a respectful distance.

8 This use of a variant of the diagonal construction to establish incompleteness (in conversation in the forties, not in his writings) was reported in footnote 4 of Kreisel [1950]. To test his (rough) idea, one also considers a rule rp which says, so to speak, the opposite: put at the m-th place what the m-th rule tells you to put there. One would expect to be able to write anything at the p-th place. But this is not altogether adequate, as seen by considering a problem of Henkin [1952]. The upshot is that the general character of the inferences by means of which rp ‘tells’ you what to do, is critical: for socalled cut-free systems we have one answer, for the usual systems another; for a precise exposition, cf. Kreisel and Takeuti [1974].

9 As in note 6 for the case of Gödel’s second incompleteness theorem, and in the Appendix for debunking the ‘logical strength’ of languages of higher type.

10 The passage from a ‘given’ proof, say, in a mathematical text, to a formalization should be compared to other processing of ‘raw’ data for theoretical treatment; for example, to apply physical theory, (physically) significant data are needed, including the correction for artifacts. Correspondingly, the drill or ritual involved in mathematical texts is a likely source of artifacts (like a stylized description of a physical situation by someone not familiar with the relevant theory). Wittgenstein’s obviously offhand comment, to the effect that the passage above is ‘easy’, overlooks not only the general problems mentioned, but even the distinction between (absolute) effort and the ratio: effort/reward, familiar from economics (utility and marginal utility). As far as effort is concerned, Wittgenstein may be right, simply because one’s subjective judgment tends to be bad! Specifically, at least my estimates of the number of lines in a formalization tend to be unreliable even after it has been carried out.

11 The use of computers for operating on, or, as one says, for unwinding ‘given’ proofs is of course less glamorous than the better-known business of automatic theorem proving (which
many of us do better than computers), where one starts with a formula, a conjecture, not with a proof. Computer-assisted proofs raise (genuine) problems. Historically - and scientifically, if not artistically - speaking, such proofs, for example, of the four-color conjecture, involve incomparably more progress than, say, the use of large cardinals above. Compare the effort which would be needed to explain large cardinals to Archimedes with getting him to understand, let alone put together, the largish computer used by Haken and Appel. There are genuine doubts about the reliability of computer-aided proofs not resolved by the particular idealizations of reliability, that is, the doctrines of rigor in various branches of traditional foundations. Inasmuch as reliability is a principal topic of foundations, these new proofs present novel data for foundations: it would seem premature (to put it mildly) to assume that these new data are less fundamental than the matters of ‘principle’ stressed in traditional foundations.

Wittgenstein’s complaints live up to one of his quotable quotes (Lectures, p. 68):

Don’t treat your common sense like an umbrella. When you come into a room to philosophize, don’t leave it outside . . .

Of course, ‘philosophy’ (of mathematics) is not meant here in its academic sense, of traditional foundations, but rather in its popular sense. We shall return to possible inadequacies of common sense later.

Wittgenstein’s advice

When confronted - or, in Wittgenstein’s terms, ‘puzzled’ - by a philosophical problem about (mathematical) notions or proofs, we should see what we do with them, how we use them.12 This is like the familiar advice in ordinary mathematics to try and see what makes a proof work or, more formally, dégager les hypothèses utiles. Fair enough, compared to other elastic advice on conduct (ad majorem gloriam dei or its ‘enlightened’ up-date: pour l’honneur de l’esprit humain). But in really doubtful cases, usually more imagination is needed to find the concepts, the cadre, for stating a satisfactory answer than to think of the troublesome notion or proof in the first place. As to ‘puzzles’, many solve themselves in the ordinary course of nature, for example, by means of memorable counterexamples. In particular, see note 6 for Wittgenstein’s complaint about consistency. As to his other specific complaint, what was there to see in the thirties à propos of higher cardinals?

12 In the Lectures Wittgenstein concentrated on uses outside mathematics, but no longer in conversations with me in the forties.

Work that has been attributed to Wittgenstein’s or related advice is pretty varied, and of uneven interest. Here are two examples: both are extreme, the first in banality, the second in literalmindedness.

• When proving results about a wrongheaded project, one may stumble over something of interest, as in work on Hilbert’s consistency program or, more specifically, the elimination of nonelementary methods in number theory discussed in the Appendix. Then one tries to formulate that interest. This is familiar enough from the study of false hypotheses in the sciences, less so in the mainstream of pure mathematics which tends to be very conservative (confining itself to obviously relevant or well-tested notions).

• Probably the most literal minded interpretation of Wittgenstein’s advice, and hence very much more ‘philosophical’ in the sense of traditional foundations, is elaborated in socalled operational semantics, for example, of logical particles. Here the meaning of a word is determined by its ‘use’; in mathematics, (tacitly) by formal rules for the use of the word.13 Lorenzen has given two versions of operational semantics, one in his book Operative Logik, the other in his Dialogspiele, terminology which goes well with Wittgenstein’s ‘language games’.14

There is a quite separate question whether the work just quoted was not only ‘attributed’ to, but whether, realistically speaking, it was influenced by Wittgenstein’s advice. (Wittgenstein was clearly interested in the question, specifically in the heuristic and pedagogic value of what he had to say - as in his talk about ‘the disastrous invasion of logic into mathematics’, cf. Chapter 15, or about ‘tenacious misunderstanding difficult to get rid of’). Dramatics aside, these matters are sociological and hence severely statistical, difficult to judge not only because of the hackneyed business about interpreting statistical data, but because of the various skills needed to compile and process them. Abstractly, Wittgenstein was very sympathetic to an ‘impersonal’, statistical view of sociological matters. But in practice, he did not even try to see whether his particular views could be examined with existing resources; instead he expressed feelings, like ordinary mortals.

13 The matter was very much in the air in the thirties; most logicians are familiar with it not from Wittgenstein’s advice, but from a passing remark by Gentzen, on p. 80 of his collected papers.

14 Dialogspiele are two-person games associated with (many of) the usual logical systems in which the players choose formulas alternately: a formula F is said to be valid, if the proponent of F has a winning strategy; for a detailed exposition, see Lorenz [1968]. Evidently, as the name suggests, a rather special side of reasoning, scoring debating points, is emphasized here (where, incidentally, the rules of the games are heavily biased in favor of the proponent - as if intended for people who like to talk a lot). Nothing is said about the original choice of the usual logical systems, nor the fact that even after formal rules have been formulated we continue to reason logically without remembering them (the former has its parallel in the case of many definitions in mathematics, but not the latter).

The positive side of traditional foundations

In my opinion the weaknesses of traditional foundations, inherent and compared to available alternatives, mattered less to Wittgenstein than the style of traditional foundations:

• the almost staggering banality of ‘fundamental’ notions and problems compared to the ambitious general aims, and

• the - basically pretentious - simple-minded language used to formulate the results of traditional foundations.

For obvious reasons, elaborated at the end of the last section, it is far beyond the scope of (at least) this chapter to try and assess the pedagogic or heuristic value of the stylistic feature of traditional foundations just mentioned. But it is worth remembering the possibility of such a value; perhaps best by reference to related aims, notions and problems which are of about the same vintage as those of traditional foundations, but have made much more progress; specifically, the ideas of the Greeks about physics, in particular, space, time, matter.15

The first examples one thinks of are, of course, spectacular uses of the skeptical (socalled positivist or operational) tradition,16 such as Einstein’s most artistic presentation of the special theory of relativity in traditional philosophical terms. As a result of relativity theory there are now masses of data which would admit a ‘purer’, purely mechanical presentation (without bringing in light, that is, electromagnetic phenomena at all). But Einstein’s presentation seems to have a kind of permanent pedagogic appeal even to those of us who, like Einstein, have grown weary of positivism (which tells us to begin with operational definitions, as in operational semantics above, when in fact theory is needed for their choice). As far as mathematics is concerned, the switch from the nineteenth century’s version of the axiomatic method (used by Frege, Dedekind and others to set up categorical axioms, see note 1) to its current first-order version (including most of Bourbaki’s basic, in particular, algebraic structures) continues to be introduced in socalled formalist terms, formalism being intended as the specialization of positivism to mathematics. In short, the particular features of knowledge on which positivism concentrates, seem to be occasionally central, at least for pedagogy. Evidently, spectacular successes are few and far between.

15 Concerning the (natural) philosophy of the Greeks, its most obvious actual or potential value has been to make posterity familiar with imaginative, socalled revolutionary ideas in a general way, for example, the idea of a few elements or of cyclic time (Aristotle’s Physics, Book 8, 265a, 15 or 265b, 10 on ‘primary’ time and motion). But, as so often with - obviously - premature enterprises, most attempts at more precise or explicit formulations were hopelessly off the mark; for example, Aristotle was unhappy with Anaximander’s ‘elements’: earth, water, air, fire, but didn’t even connect them with: solids, liquids, gases, energy, as in a course on physics; instead he had the business of: dry, wet, cold, hot (things).

16 In physics, the atomic theory is the standard example of a success fitting into the speculative tradition (atoms being hardly much more plausible than ghosts, from ordinary experience); in mathematics, nonconstructive methods.

For the present (by the last section: necessarily statistical) purpose it is more interesting to look at modest, but appealing uses of traditional foundations (and their counterparts in physics). For balance, two illustrations from the speculative tradition will be given. Both are due to Gödel (who has shown more discretion and above all more flair than most exponents of that tradition).

• In physics the search for ghosts has so far not proved generally rewarding. But it can lead (one) smoothly to cosmological solutions of Einstein’s field equations in general relativity theory with cyclic time (cf. note 15).

• In mathematics, the search for open problems, say about the natural numbers, which are settled by means of (axioms about) large cardinals, has not been very rewarding either. In fact - and this is of course the principal conclusion of the Appendix - the superficial impression that anything like nondenumerable cardinals is used in existing analytic number theory, is simply false. But there is certainly a pedagogic interest in the possibility of any effective use of higher set theory for number theory, discovered and stressed by Gödel.17 What is more, the possibility is ‘revolutionary’ in the sense that it does not seem to be even remotely suggested by the bulk of mathematical practice.

It seems to me that, used with much discretion and a little flair, the ideas of traditional foundations provide occasional checks and balances on the strategy of relying on the ‘needs’ of current practice (Bourbaki) or on current uses (Wittgenstein), presumably, most often when the matters considered are far removed from current scientific study and uses. (Mathematicians tend to avoid such matters, like free choice sequences and large cardinals, to mention minor topics from the constructive, respectively nonconstructive branch of traditional foundations.)

When musing about the virtues and limitations of traditional foundations, readers may wish to recall the memorable successes of natural science in this century (which we can, perhaps, view with more detachment than our own subject). Some of the early ones in the first quarter, say on atomic and cosmological matters, have a distinct flavor of traditional foundations. Others which have, literally, changed our view of the world even more (like Rutherford’s on the structure of the atom), and the extraordinary advances of the last 40 years, which have changed our view of ourselves too, do not. Naturally, the faithful either disregard those advances as not ‘fundamental’, or assume that things would have gone even better if the early preoccupations had persisted.

17 Incidentally, in my view, Gödel’s own proposals for the use of higher cardinals are, at present, undervalued for an apparently quite accidental reason. He happened to propose them for settling (number-theoretic problems and, above all) the generalized continuum hypothesis, which, demonstrably, is not decided by anything remotely like the cardinals presently considered. (Earlier he had proposed to use the continuum hypothesis itself to settle number-theoretic problems: his own work could be used to refute this proposal, as in the Appendix.)

Appendix: Mathematical logic

Wittgenstein makes passing references to some kind of (mathematical) interest of mathematical logic which had grown out of traditional foundations, but without any hint of what that interest might be. Though this is easier to state now, by 1939 (the year of Wittgenstein’s Lectures) and especially by 1948 (the year of Bourbaki’s manifesto), some people with their wits about them had a pretty good idea; for details, see Vaught’s story [1974] of model theory up to 1945, especially about Malcev and Tarski. The principal uses of logic divide into:

1. solutions of previously stated problems. The best-known use of logic applies model theory to prove (an asymptotic version of) Artin’s conjecture on p-adic fields. To be precise, the proof combined a little logic with a good deal of algebra; but the fact remains that though some kind of relation between p-adic fields and fields of formal power series had been recognized by algebraists, model-theoretic notions were needed to formulate the relation precisely enough to finish the job.

2. adequate formulations of natural questions. A good example of using logical, in fact, recursion-theoretic notions for stating a theorem (not only for its proof) is in Higman’s work [1961] on finitely generated groups. This is a good answer to the (natural) question: ‘Which’ finitely generated groups can be embedded in finitely presented ones? A similar question, with logical answers, is this: What makes algebraically closed fields ‘special’? These fields are singled out, among all fields, by reference to the elimination of quantifiers (MacIntyre [1971]).

Unquestionably, this type of work, especially in 2, is satisfaisant pour l’esprit; but it is hardly central. Though results from logic are of course applied to many areas, within any one area the successes are strictly local. If a choice had to be made, one would lose more by neglecting Bourbaki’s basic structures than even the most respectable parts of logic such as model theory or recursion theory.

Apart from local uses of logic as in 1 and 2 above, there is an almost endless list of applications to intimate pedagogy, to answer contemplative, ‘useless’ questions which thoughtful mathematicians often ask (themselves): What do we know from what we have done so far? (which is ‘useless’ if we know that later we shall go much farther). Good examples of a ‘useless’ question (or of a ‘fly in a fly bottle’) come from the debates about the axiom of choice, or about socalled nonelementary proofs in number theory of purely number-theoretic theorems18 (of course, such proofs which happen to be elementary are of interest; the issue is whether this ‘raw’ interest derives from their elementary character). Practically speaking, except to doctrinaires, knowing how to use the axiom of choice and other nonelementary (civilized) methods is obviously a good thing. But even a nondoctrinaire reflective mathematician may simply want to know if these methods are eliminable from the proofs considered, as in Serre’s question whether ‘such’ uses of the axiom of choice as in his study of homotopy groups are logically necessary. (Though asked in the thrifty fifties, the question was not intended to be useful; for example, it was not expected that a general answer to this question would help in, say, the actual computation of homotopy groups. In short, no illusions were involved.) Inspection of Gödel’s work on - what he called19 - the relative consistency of the axiom of choice gives an easy negative answer to Serre’s question, and, of course, a precise formulation of a whole class of ‘such’ uses; cf. Kreisel [1956].

Concerning the business of nonelementary proofs, logicians have spent a good deal of time showing that those occurring in current number-theoretic practice can be eliminated. In the process we have dotted the i’s and crossed the t’s by making distinctions between ‘direct’ and other elementary proofs, introducing suitable definitions of ‘logical strength’, and finding formal languages progressively closer (than that of set theory) to those used in the branches of mathematical practice concerned. In this way it became progressively easier to verify that the usual nonelementary proofs can be mechanically converted into - obviously - elementary ones.20 Put differently, the set-theoretic principles used (implicitly) in actual practice are of low ‘logical strength’ inasmuch as the replacement schema and other schemata are applied only to formulae of low logical complexity.21 But since the instances of those schemata which happen not to be used are not in doubt either, we have little more than a modest discovery of a temporary feature of contemporary practice: we have not yet learned to use other instances efficiently, just as it took time to learn to use efficiently the law of the excluded middle applied to, say, the Riemann hypothesis. In fact, with the elimination before our eyes we see how little is gained by it. As a consequence, and as an example of how mathematical logic actually supports the doubts of Wittgenstein and Bourbaki about traditional foundations, the issue of nonelementary proofs or, more formally, of ‘logical strength’ is discredited, and thus the claims of those branches of traditional foundations for which the issue is central, are refuted.22 Admittedly, the effort involved recalls Bertrand Russell’s description (in My philosophical development) of Principia as ‘a parenthesis in the refutation of Kant’.

18 In the twenties such proofs were distinguished by the use of function-theoretic methods; more recently, by the use of l-adic cohomology (as in Manin’s problem [∞]). (The mark “[∞]” indicates that more precise references will be presented in future editions of this work.)

19 Consistency was not the main issue because the axiom of choice is true for the (only) notion which, at present, serves to make the consistency of the remaining axioms evident.

20 Although, of course, there are purely arithmetic theorems in current metamathematics for which the analogue is not true. [∞]

21 Cf. Friedman [1977] for a detailed exposition. [∞]

22 Cf. a similar use on p. 116 for consistency proofs of obviously consistent systems. [∞]

But we have not stopped at such refutations. We have gone on to look for factors which, unlike logical strength, do distinguish between elementary and prima facie nonelementary proofs, and above all measure what is gained by the latter.23

23 Cf. p. 127 concerning the reduction of the genus of proof figures as a possible measure of the difference. [∞]

Chapter 15

The Disastrous Invasion of Logic into Mathematics

The title is a quotation1 from Wittgenstein’s Remarks on the Foundations of Mathematics. Taken literally, it is simply false, as explained in the Introduction. But it can be given a twist which brings it in line with (at least) several views in Wittgenstein’s later writings. This is illustrated concretely, in Section 1, by means of examples concerning mathematical proofs and processes (rules), with the following main conclusion.

The aspects (of proofs and rules) which are regarded as basic in current - somewhat pretentious - logic, are not only different from those which are essential in current mathematical practice (which almost goes without saying), but actually harmful for a study of it. The reason is that those basic questions of ‘principle’, concerning the validity of principles of proof and definition, appear more glamorous than the genuinely useful problems concerning current mathematical practice, and thereby divert attention from the latter. The ‘practice’ referred to includes not only applications inside or outside mathematics, but also facts of experience concerning mathematical reasoning: which (combinatorial) configurations and (abstract) ideas we handle easily.

At the end of Section 1 there are some brief comments on the future development of computer science where Wittgenstein’s quotation is likely to apply in its literal sense.

Section 2 elaborates some observations, which arise incidentally in the discussion of proofs and rules, but are of more general interest. The main references are to Tractatus, in its own way a prime example of - what was called above - pretentious logic. The change in Wittgenstein’s style of thought, to that of Philosophical Investigations, is seen to fit in quite well with the particular weaknesses of Tractatus which disillusioned him (in accordance with Wittgenstein’s own indications). The degree of change does not seem appropriate, except as an overreaction, which tends to go with exaggerated expectations,2 here: aroused by philosophical problems. The section concludes with some alternatives to (the style of) Philosophical Investigations. Also - and in contrast to Wittgenstein - it is argued that (most) traditional philosophical questions are, practically speaking, as meaningful as most open questions, but somewhat superficial. Only their pedagogic interest is perennial inasmuch as they occur to us - phylogenetically and ontogenetically speaking - almost as soon as we begin to reflect.

0 Originally published in Acta Philosophica Fennica, 28 (1976) 166–187, as ‘Der unheilvolle Einbruch der Logik in die Mathematik’.

1 Der unheilvolle Einbruch der Logik in die Mathematik.

Introduction

No doubt, the ‘new’ logic was a principal object of interest in Wittgenstein’s discussions with mathematicians and in his own readings in higher mathematics. But in its proper, that is, statistical sense, the quotation in the title is simply false. There just is no question of any invasion since most mathematicians know precious little of logic anyway except (some) symbols for the logical particles. To speak of a disaster, sensibly and not merely dramatically, one has to make a few distinctions (even if we tacitly recognise the absurdity of talk of ‘disaster’ applied to an occupation which is truly harmless compared to most human activities).

Above all, a few gifted mathematicians have found quite excellent applications of logic, using in an essential way concepts which were, patently, developed in specifically logical investigations. For example, in algebra and number theory and on their border (theory of p-adic numbers), both recursion theory and model theory have been used successfully.3 Other gifted mathematicians, like Wiener or von Neumann, who originally specialized in logic, but did not find it particularly suited to their talents, later seem to have used their familiarity with the ‘new’ logic to good effect, in work on cybernetics and above all on programming computers.

Secondly, and this is surely the principal question raised by the quotation in the title - it is a delicate statistical matter4 in which way and to what degree logic has influenced the general development of mathematics and the teaching of mathematics (not whether the influence was good or bad, but whether it was significant). This matter cannot be decided by superficial impressions - and to me at least, it seems a bit silly to have an opinion on such matters without a quite exact analysis, let alone to ‘quarrel’ (provided we are in the happy position not to have got ourselves involved in practical decisions in this area).

2 Cf. the motto, due to Nestroy, which Wittgenstein chose for his Philosophical Investigations (all progress looks bigger than it is) and which, without doubt, he intended to apply to his philosophical investigations. See also Chapter 16.

3 Readers need only be reminded of Higman [1961] about finitely generated groups, Matyasevic [1970] about diophantine relations, Ax and Kochen [1965] and Ershov [1965] about p-adic fields.

4 Cf. two discussions in the literature of the influence of general ideas in the particular case of Herbrand’s work: by Gödel on pp. 8-11 of Wang [1974], and on pp. 107-109 of Kreisel, Mints and Simpson [1975].

In short, if there is any sense in the quotation from the Remarks, it must concern specific parts of mathematics or stages of its development (of interest to some of us); perhaps not even of mathematics itself, but rather the analysis of mathematics. It seems to me that the quotation applies to a part of logic which I know quite well, namely Hilbert’s proof theory and its development. This is the principal topic of the present chapter.

To avoid misunderstanding: at least in my own case, the quotation has not been of direct, not even of heuristic use. I have known it for nearly 35 years, and stressed its plausibility in the - otherwise rather negative - Chapter 12 on the Remarks. The brutal fact is that the quotation does not contain the remotest hint how (the pretentious) logical analysis is to be replaced, that is, which concepts should be used in the analysis of proofs in place of the ‘basic’ concepts of proof theory and which questions should be asked in place of the ‘principal’ problems of proof theory; in short, no hint of what one wants to know about proofs. The indications in the Remarks were too banal to be of use (to me). For example, Wittgenstein mentions that it must be possible to visualize proofs completely (Übersehbarkeit) - not only without proposing a precise measure for this property, but even without the remotest evidence for the assumption that this property lends itself to thorough study (the general idea of that property was hardly novel).5 If it turns out that my impression is right, the value of Wittgenstein’s quotation (for me) can perhaps be summarized as follows:

It is incisive and memorable, and so makes the reader familiar with a certain aim. If sometime later this aim is approximated, the reader is likely to take a closer look instead of moving on, breathlessly, to the next ‘interesting’ possibility.

In short, the quotation helps one achieve a philosophical attitude in the popular sense of the word; a certain stability when this is appropriate.

5 Below I shall occasionally mention (my) attempts at analyzing this property which turned out to be abortive, for example, the attempt to use so-called infinite proof figures (for analyzing - finite - proofs) which were expected to magnify differences that are too ‘subtle’ in the finite case. At present I find particularly convincing (for the class of proofs considered) a precise measure which Statman [1974] has introduced and studied, and which will be discussed below. It is fair to say that most (good) logicians brought up on traditional proof theory do not share my impression and are skeptical.

15.1 Proofs and Rules

Proof theory is, in my opinion, a particularly crass example of that pretentious logic which was mentioned in the opening of this chapter; perhaps more so than what is found in grand ‘literary’ philosophy. The claims of proof theory to have uncovered the true, in particular, formal nature of mathematical reasoning surpass in pretentiousness the claims of most traditional philosophers.6

To appreciate Wittgenstein’s remarks concerning proofs and rules, the reader should recall some standard7 formulation of the aims and claims of Hilbert’s proof theory (slogans: consistency and decision problems). For one thing, there is an objective relation anyway. But also Wittgenstein knew quite well, for example, Hilbert’s essay [1926] On the infinity,8 and Turing occasionally attended Wittgenstein’s lectures.

Wittgenstein’s critique of proof theory and its principal problems (for example, in the Remarks) is wildly exaggerated, and therefore quite unconvincing. Even where his critique applies to Hilbert’s own formulations, it rarely applies to the more reasonable reformulations which any average logician will find for himself; to be specific, I myself had no trouble in finding the currently standard formulations9 after reading Hilbert’s essay a couple of times. Evidently, these circumstances destroy all confidence in Wittgenstein’s critique.

Worse still, Wittgenstein’s own attempts to characterize what is essential in proofs aren’t much better (than Hilbert’s). He stresses that proofs create - or at least use! - new concepts.
This is surely true, and much stressed by mathematicians, especially when those new concepts seem to have nothing to do with the theorem proved. This occurs not only in analytical number theory, but also in quite elementary (witty) proofs, for example, of the irrationality of $\sqrt{2}$, that is, $p^2 \neq 2q^2$ for natural numbers $p$ and $q$. Here, the largest divisors, of the form $2^n$, of $p^2$ and $2q^2$ are considered; $n$ is clearly even in the first, and odd in the second case, and so $p^2 \neq 2q^2$. The concept of power (of 2) used in the proof is ‘new’ inasmuch as, apparently, it has nothing to do with the proposition $p^2 \neq 2q^2$ (which we understand as soon as we can multiply). Apart from their use in finding proofs, new concepts are often needed to check proofs, which we normally do by comparison with other facts, not merely by going over a given proof.

But granted that (many) proofs use new concepts, the brutal fact remains that, somewhere or other, propositions concerning these new concepts have to be proved too. Wittgenstein hems and haws, but never quite faces this fact - as if those propositions were something perverse or dirty that must not even be mentioned. Of course, all this is not due to a mere oversight, but related to Wittgenstein’s reductionist aim of analyzing the meaning of mathematical and other abstract notions in terms of what we ‘do’ with them (and, bien entendu, this ‘doing’ was not meant to include proving propositions about those notions). It is perhaps worth noting that Wittgenstein’s philosophy left room for different Lebensformen or forms of life (so to speak theoretically, not necessarily in his personal relations), much less for different forms of thought, and hardly any for different Seinsformen, that is, different kinds of entities such as those studied in mathematics.

6 Incidentally, in this respect a good deal of current logic - despite its precise, superficially dry style - is not much better. As I see things, there has been progress in philosophical discussions as a result of precise formulations made possible by mathematical logic; but the progress seems primarily aesthetic: even the average author can nowadays produce an agreeable mathematical presentation (without literary gifts). There is certainly no evidence that mathematical logic helps to inculcate genuine discipline, with a sense of proportion, and sound judgement on the possibility, at any given time, of making significant contributions.

7 [∞].

8 This essay is often regarded as an authentic presentation of Hilbert’s program (although the introduction to Hilbert and Bernays [1934] is incomparably better thought out). Wittgenstein had a reprint of Hilbert’s essay and wrote in the margin, opposite one of Hilbert’s particularly thoughtless passages: Heiliger Frege! (Holy Frege!).

9 [∞].

A second property of proofs which Wittgenstein stressed (as essential), was this: proofs must be graspable and memorable (überschaubar und einprägsam), i.e. easy to take in and remember.
In short, some kind of simplicity is meant. But all this is clearly secondary, as long as there are (genuine) doubts about the principles of proof that are used. If the possibility of mistakes of principle is considered at all, this comes before the question of the complexity of individual proofs (built up by use of those principles). Once again, Wittgenstein hems and haws, but does not take a clear stand on this matter of validity, specifically, the validity of principles of proof used in current mathematics. (Nobody denies that, at various periods of history, people experimented with dubious principles - more or less aware, according to the individual involved, of their weaknesses).10
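The witty proof of the irrationality of $\sqrt{2}$ recalled above turns on one ‘new’ concept: the largest power of 2 dividing a number. The following minimal Python sketch makes the parity argument concrete. The function names `two_adic_valuation` and `exponents` are illustrative choices, not taken from the text, and the loop is only a spot check on a small range, not a substitute for the proof.

```python
def two_adic_valuation(m: int) -> int:
    """Return the largest n such that 2**n divides m (m must be positive)."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    n = 0
    while m % 2 == 0:
        m //= 2
        n += 1
    return n


def exponents(p: int, q: int) -> tuple:
    """Exponents of the largest powers of 2 dividing p*p and 2*q*q."""
    return two_adic_valuation(p * p), two_adic_valuation(2 * q * q)


# Spot check: the first exponent is always even, the second always odd,
# so p*p and 2*q*q can never be equal for these p and q.
for p in range(1, 50):
    for q in range(1, 50):
        left, right = exponents(p, q)
        assert left % 2 == 0 and right % 2 == 1
```

The check says nothing about the proposition $p^2 \neq 2q^2$ itself that multiplication could not say; the point of the sketch is only that the auxiliary concept does all the work.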

10 Historical discussions of the infinitesimal calculus in the 17th and 18th centuries seem to me to underestimate the degree to which mathematicians at that time were aware of its heuristic character. In particular, I see no evidence that Newton’s only reason for writing Principia without using calculus was his vanity; he was certainly aware of some defects.

There is a real issue here if we have two sets of principles, P1 and P2, such that the principles P1 seem more reliable than P2 but, at least statistically, the same theorem has simpler proofs by use of P2 than by use of P1 (and, tacitly, the simpler the structure of a proof, the smaller the probability of an incorrect application of the principles considered).11

11 The issue demonstrably arises if P1 are the principles stressed in the literature on so-called constructive foundations, for example, finitist or predicative methods, and P2 the principles codified in, say, current axiomatic set theory. Actually I have long been skeptical about doubts concerning (the validity of) P2. But until the late fifties I had the impression that P1 might have some virtues (apart from having been discovered before P2). Naturally I did not expect to be able to specify those virtues before taking a closer look at P1, in particular, before giving a (more) precise characterization of P1. Equally naturally, I could not be sure of my impression, and therefore explicitly described the project of characterizing P1 as a calculated risk. It is quite clear that I underestimated the risk, and that work on P1 has been subject to the law of diminishing returns for quite a long time.

The views 1 and 2 below, concerning the validity of principles used in current mathematics, seem to me basically sound, and in accord, in effect if not in intention, with the general aims of the Remarks (if it is remembered that Wittgenstein’s specific criticisms of set theory apply to those early expositions he knew, which were really either defective or shallow).

1. There are no realistic doubts concerning the concepts used in current mathematical practice and concerning their basic properties formulated in the usual axiomatic systems. This applies both to (the usual) theories of sets and to current intuitionistic mathematics including the theory of lawless sequences.12

2. There is, in my opinion, no hint of evidence for the assumption that any analysis of the usual concepts or of their basic properties could improve in any general way on the mere recognition of their validity (asserted in 1). Naturally, no analysis can increase essentially the degree of certainty or reliability of usual practice which, by 1, is supposed to be 100%; but I do not even see any hint of evidence at the present time for any analysis which is likely to improve in any general way our understanding of our own knowledge of the concepts mentioned.13

12 It cannot be repeated too often that there can be and have been realistic doubts (and so the question of doubts is not rejected ‘in principle’). In particular, certain precursors of the concepts mentioned above (sets, lawless sequences) were indeed dubious in a realistic sense, and their foundational analyses were by no means superfluous. Incidentally, Hilbert’s reference (in [1926]) to ‘Cantor’s paradise’ was, in my opinion, premature as late as around 1930, since at that time there was no clear description of those sets which were to be considered. This was a matter of genuine doubt because the everyday use of ‘set’ obviously involves a mixture of concepts.

13 It goes without saying that special analyses are needed (and possible), for example, to settle specific questions of formal independence. Such analyses may be suggested by very philosophical (and even pretentious) sounding generalities, such as Tarski’s general answer to the general question: What is truth? namely

‘Snow is white’ is true if and only if snow is white.

But usually the principal skill involved is not in finding the generalities, but in spotting problems where, with imagination, those generalities can be used effectively. (Incidentally, Tarski’s general answer seems to be perfectly matched to the general question; given the thought behind the question, one certainly cannot expect more than Tarski’s answer).

Granted 1 and 2, the way is open to consider questions of complexity of proofs and rules as principal questions (concerning their reliability) - free of the conflict between this view of the matter and the ‘logical’ view involving only principles of proof and of definition.

Many readers will of course object to 1 and 2 as ignoring, rather than solving, their problems; forgetting that, at one time, similar problems were genuine; we can now assert 1 and 2 just because those problems were solved. Perhaps the principal reason for the objection is this: one is tempted to feel that ‘doubts’, for example, concerning validity show a higher philosophical sensibility (or responsibility!) than acceptance - as if doubts and questions could not be equally thoughtless or uncritical as acceptance and assertions, respectively. Of course, this point is utterly banal: Wittgenstein’s literary skill has made it memorable (and useful).

Autobiographical remarks

To avoid giving the (false) impression that I hold the views just described, especially 1, dogmatically or so to speak off the top of my head, it seems proper to summarize my relevant logical experience, and to document it. First of all, when I was a young student, I had genuine doubts about the usual axiomatic set theory in the following quite concrete way. In my first piece of independent work ([1950]) I showed that (the Skolem form of) GB, that is, Gödel-Bernays’ theory of classes, had no recursive model - but explicitly dubbed as hypothetical the corollary, of a formula of predicate logic which has a model, but no recursive model. I simply was not convinced of the consistency of GB (which contains the axiom of infinity). I actually went to the trouble of writing a sequel ([1953]) in which the same result was proved for GB without the axiom of infinity. Wang drew my attention to Zermelo’s paper on the cumulative hierarchy of types (as it happened, just before I completed the final version of the sequel), a paper which simply removed my doubts. In short, when I speak of ‘genuine doubts’, I don’t mean imaginings, but facts from my own experience. Incidentally, those doubts were left quite untouched by the circumstance that the usual paradoxes cannot be derived in the usual way in GB - which, then or now, is hardly any evidence for the consistency of GB.14

14 If this were the principal evidence, it would justify equally Quine’s system NF! Yet here we have a system which singles out its axioms by a certain property, namely stratification, but blithely uses (the usual) logical rules which do not preserve this property . . . ; not the kind of thing to inspire (my) confidence.

The second autobiographical remark concerns my long hesitation before studying the idea of simplicity or ‘graspability’ (Übersehbarkeit) of proofs. I just wasn’t confident about finding a sensible measure in any direct way. First, I tried my hand at analyzing simplicity of principles of proof (cf. P1 in note ), by means of so-called autonomous progressions. Granted that these attempts were pretty faithful to the intended meaning, I soon came to this conviction: if the analyses are (even only) approximately right then those intended principles are just of little intrinsic interest (though - even now - I find that the idea of those analyses has some logical wit). So instead I went back to more traditional questions about proofs, in particular, infinite proofs in intelligently chosen languages with infinitely long expressions, and, above all, intuitionistic logic. ‘Above all’, because proofs enter into the very meaning of the logical particles (or, more precisely, proofs of certain ‘logic-free’ propositions). What I overlooked was the witless way in which proofs entered! No recondite properties of proofs were involved, no relations between proofs or between proofs and other objects, nothing except their ‘logical’ aspect which occurs to us without any experience in mathematics at all! In short, nothing but the hackneyed business:

The proof establishes its conclusion15

(in particular, a logic-free conclusion in the intuitionistic case). Certainly, if this property, that is the validity of proofs, were genuinely problematic, it could be at least one of the principal objects of study in a theory of proofs. But, at least, I had become convinced of 1 and 2 above: to repeat, convinced that questions of validity are by no means theoretically senseless (since such questions were once acute, and will no doubt become acute again), but that they are unrewarding at the present time. And other traditional questions did not seem much more promising (in particular, not much more than: What is truth?, as discussed in note 13). At this stage it was natural to move so to speak to an opposite extreme, in particular, opposite to Hilbert’s proof theory: I went about looking for methods of proof and properties of proof which are trivial for proof theory, but essential for mathematical practice. Naturally, in view of 1 and 2 this ‘essential character’ was to be analyzed by appropriate mathematical measures of complexity. I now consider two examples, involving different measures of complexity.

15 Prawitz [1974], apparently seriously, relies on this property as an essential element of a general proof theory.

Explicit definitions

To be precise, much of what is called in the literature: explicit definition, expresses a discovery; for example, understanding the notion of circle geometrically, we may then discover that circles are loci of points which are equidistant from some fixed point (the center). However, here we think of explicit definitions as introducing new concepts, the definition being usually supplemented by a list of properties (of the new concept), which are proved by use of the explicit definition. As is well-known, this way of introducing a new concept is trivial for Hilbert’s proof theory, because such concepts are in an obvious way eliminable. On the other hand, for mathematical practice they are not only useful, but as it were typical - at least for modern mathematics, which is dominated by the axiomatic method. This proceeds as follows. A structure is defined explicitly in set theoretic or number theoretic terms, and then is shown to be, say, a unitary group: the axioms for unitary groups then constitute the supplementary ‘list of properties’ (of the structure or concept) mentioned above. The choice of such properties - or, as one says, of the proper cadre - is often the key to solving mathematical problems.

As a matter of empirical fact, logicians hearing of this practice tend to believe that the role of explicit definitions must consist in this: those properties (or axioms) of the structures considered can be proved only by use of ‘strong’ set theoretic or at least complicated number theoretic methods. The latter are so to speak logically necessary for the solution of the problem. Of course such cases are possible, and then the logical point of view is appropriate. But, as a matter of empirical fact, the possibility just mentioned is rarely realized even when - in the considered judgment of experts - the use of the axiomatic method is ‘essential’.16 Granted this, one has to look elsewhere for the role of the method described, that is, not in ‘logical’ or ‘proof theoretic strength’.

As far as I know, Wittgenstein himself never stressed the role of explicit definitions particularly. And I myself gave them little thought until about 20 years ago (in [1973]). As already mentioned above, the trouble was that it was hard to say concretely what those explicit definitions ‘did’, which features of the proof were being changed to increase its intelligibility. Certainly, proofs are shortened; but, it would seem, only by a roughly linear factor. In any case, we know from experience that length is rarely a decisive factor in intelligibility: we often do not take in a complicated sentence, and then again do take in a whole book. But Statman [1974] made, in my opinion, impressive progress by means of a suitable measure of complexity which is relevant in a large number of cases, in particular, for analyzing the role of explicit definitions. Roughly speaking, he measures the amount of intertwining or nesting of inferences. Evidently, the measure is sensitive to the style of presentation of proofs. He uses Gentzen’s natural deduction, in which assumptions are introduced and discarded (in contrast to the better known calculus of sequents in which assumptions are carried along).
Post factum, it is easy to see why the style of natural deduction is well adapted to an analysis of explicit definitions:

the introduction of an explicit definition is, formally, tantamount to an additional assumption.

(Conversely, the comprehension axiom has very much the general look - syntactic form - of an explicit definition).

16 Incidentally, one does not use the method only heuristically, to find a solution or a proof (of theorems which do not contain the auxiliary definitions), but the ‘detour’ via those definitions remains the standard proof.

To avoid malaise, readers should realize that (Statman’s) studies of complexity measures (and their proper areas of application) differ completely from those features of logic which make the latter so attractive to many logicians - and, in fact, to most people without scientific experience. Logic looks for ‘fundamental’ factors, preferably just one (or else points out piously that everything must be considered on its ‘merits’). When several factors, and especially judgment in recognizing their relative importance, are - obviously - needed, the beginner gets a feeling of hopelessness: Will the distinctions ever end? And even if there are relatively few factors which dominate in many cases, how shall we know that we have found the right factors? Nothing but a hard look at the actual progress of science and, in particular, of mathematics will really convince readers that things have not been hopeless. Of course, very occasionally, one does find a really fundamental factor, and gets one of those exceptional truly fundamental theories; but this is rare.

Historical remark. Wittgenstein’s Tractatus wanted to be fundamental; clean; giving once and for all a schema for analyzing all meaningful propositions. When this one - incidentally, quite banal - schema turned out to be inadequate, Wittgenstein’s reaction was not to consider any (theoretical) schema at all, but only special cases; cf. Section 2. (It has to be admitted that, statistically, the gifted reader gets a better orientation from well-chosen special examples than from any one very general and therefore, generally, banal schema). Our aim here is simply this: To do a bit better - than giving many special examples - by finding relatively few relevant schemata.

To summarize, let us call (mathematical) theory of proofs the study of those properties of and relations between proofs which strike the ordinary mathematician when he reflects on his activity by the light of nature; and let us take Hilbert’s proof theory as an example of the ‘logical view’ of that activity. Certainly - in terms of the title of this chapter - there has been an invasion of (this part of) mathematics by logic. Was it disastrous?
Not in any dramatic sense: some conjectures, by Hilbert, suggested by a false conception of proofs turned out to be false in general (but often sound for unexpectedly wide domains of mathematics). But the questions posed in proof theory are not ‘meaningless’ or particularly confused; on the contrary, the questions have turned out to be too banal, incorporating too little of the properties of proofs (in fancy language: of the possibilities of the mathematical imagination) which we have learnt to appreciate (only) after experience of modern mathematics.

Decision Procedures and Decisions

This second example concerns a more subtle ‘invasion’ by logic, namely, a somewhat exaggerated idea of the role of so-called logical languages, for example, of first order predicate logic. Obviously, nobody could reasonably deny that the discovery of such languages was useful, specifically, for the development of model theory and its applications mentioned in the Introduction. Equally obviously, only logical cripples would be tempted by the absurd claim that only definitions formulated in first order predicate logic (that is, of finite structures or of classes of structures) are well determined. The exaggeration to be considered here is less crass; it concerns the ideal form of a (mathematical) problem.

The example of real closed ordered fields seems to me very instructive. As shown by Descartes in special cases and by Sturm generally, there are relatively simple formulas involving a, b and the coefficients of the polynomial p(x) to determine how many roots (zeros) p(x) has for a ≤ x ≤ b. If the coefficients (and a and b) are rational or algebraic, those formulas permit effective decisions. Tarski, who (unlike the average mathematician of the twenties) knew first order predicate calculus, easily realized that essentially the same formulas allow a much more general conclusion: not only the number of roots, but all questions formulated in the first order language of fields are ‘decided’. To be precise, the same axioms (that is, properties of fields) used by Sturm to determine the number of roots, determine the truth value of an arbitrary (closed) formula in the language considered.
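Sturm's root count mentioned above can be made concrete for rational coefficients. The following is a minimal Python sketch of the textbook Sturm-sequence method, not a reconstruction of any procedure discussed in the literature cited here. All function names are illustrative. It assumes that p is non-constant with nonzero leading coefficient, that coefficients are listed from the highest degree down, and that neither a nor b is a root of p (so counting over a < x ≤ b agrees with a ≤ x ≤ b).

```python
from fractions import Fraction


def poly_eval(p, x):
    """Evaluate p (coefficients from highest degree down) at x by Horner's rule."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc


def poly_derivative(p):
    """Formal derivative of p."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def poly_remainder(a, b):
    """Remainder of a divided by b, computed exactly over the rationals."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient has just been cancelled
    while a and a[0] == 0:
        a.pop(0)  # drop leading zeros; an empty list means remainder 0
    return a


def sturm_sequence(p):
    """p, p', then negated remainders, until the remainder vanishes."""
    p = [Fraction(c) for c in p]
    seq = [p, poly_derivative(p)]
    while True:
        r = poly_remainder(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])


def sign_changes(values):
    """Number of sign changes in a list of numbers, ignoring zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_roots(p, a, b):
    """Number of distinct real roots of p with a < x <= b (p(a), p(b) nonzero)."""
    seq = sturm_sequence(p)
    at_a = sign_changes([poly_eval(q, Fraction(a)) for q in seq])
    at_b = sign_changes([poly_eval(q, Fraction(b)) for q in seq])
    return at_a - at_b
```

For instance, `count_roots([1, 0, -2], 0, 2)` counts the single root of x² - 2 between 0 and 2 using rational arithmetic alone, without ever computing that root. This is the sense in which the formulas ‘permit effective decisions’.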
Tarski realized that, in general, this determination was no longer simple, and so looked for applications in algebra, for which the evaluation of truth values was not needed, but only the fact that they are determined (by the axioms in question); in particular, if a formula is true in one real closed field, it is true in all. Clearly, for this purpose, it was sensible to look at all formulas of predicate logic (and possibly even at larger classes of formulas to which the ‘transfer’ principle above applies).

The trouble began when people started to get interested in the efficiency of decision procedures, and assumed - consciously or not - that the language which is well adapted to model theory, is also appropriate here. In other words, they assumed that the ‘ideal form’ of the decision problem for real closed ordered fields should deal with all formulas of the first order language (of fields). They found so-called upper and lower bounds, namely $2^{2^{cn}}$ and $2^{cn}$, respectively, where n is the length of the formula.17 These bounds are not known to be optimal. Incidentally, the upper bound is furnished by a particular decision procedure which uses axioms of logical complexity bounded by a certain function γ of the complexity c of the formula to be decided (the axioms in question come from the schema expressing that every polynomial of odd degree has a root). If it should turn out that we get more efficient decision procedures by using - logically (of course) unnecessary - axioms of logical complexity greater than γ(c), we would have another example of the inappropriateness (here) of the logical point of view.

The possibility just mentioned of an efficient use of unnecessarily complicated axioms is by no means purely ‘theoretical’, but suggested by the following example from the literature concerning a problem posed at the time of Newton (and Gregory) but solved only in the second half of this century by Schütte and van der Waerden [1953]:18

How many points can be placed on the unit sphere if the distance between any two points is to be ≥ 1? Specifically: 12 (Newton) or 13 (Gregory)?
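The easy half of this question can be checked numerically: 12 points are possible. The Python sketch below is an illustration added here, not part of the solution by Schütte and van der Waerden. It places the 12 vertices of a regular icosahedron on the unit sphere and computes their smallest mutual distance. The function names are illustrative, and the computation uses floating point, so it is a sanity check and not a proof. It says nothing about whether 13 points are possible, which is the hard part of the problem.

```python
import itertools
import math


def icosahedron_points():
    """The 12 vertices of a regular icosahedron, scaled onto the unit sphere."""
    phi = (1 + math.sqrt(5)) / 2  # golden ratio
    raw = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            # cyclic permutations of (0, +-1, +-phi)
            raw.append((0.0, s1 * 1.0, s2 * phi))
            raw.append((s1 * 1.0, s2 * phi, 0.0))
            raw.append((s2 * phi, 0.0, s1 * 1.0))
    norm = math.sqrt(1 + phi * phi)  # common length of all raw vertices
    return [(x / norm, y / norm, z / norm) for x, y, z in raw]


def min_pairwise_distance(points):
    """Smallest Euclidean distance between two distinct points."""
    return min(math.dist(a, b) for a, b in itertools.combinations(points, 2))


points = icosahedron_points()
smallest = min_pairwise_distance(points)
# smallest is about 1.0515, comfortably above the required distance 1
```

The slack between 1.0515 and 1 is what makes Gregory's guess of 13 plausible at first sight, and why excluding it needs a real argument.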

17 For the precise meaning of ‘bound’ here, see Rabin [1977].

18 See Kreisel [1975] for a discussion.

The solution uses trigonometric functions (and their properties), which are not generally available in arbitrary real closed ordered fields. Since, however, the old question can be expressed in the first order language of fields, we know that there is also a proof of the solution using only the axioms of real closed fields (and rules of ordinary predicate logic). Moreover - and this could or, at least, should interest proof theorists - the proof by Schütte and van der Waerden can certainly be formalized in type theory, and transformed, by normalization, into a particular first order derivation from the axioms for real closed fields. But now the axioms used may well be of complexity greater than γ(c), where c is the complexity of the problem solved. It stands to reason that the trigonometric functions will be approximated (by polynomials), and their relevant properties (inequalities) proved for the approximations. For this one needs polynomials of pretty high degree, that is, pretty long segments of the Taylor series for trigonometric functions.19

In short, it has to be admitted that the work on upper and lower bounds for decision procedures (for real closed ordered fields) looks more definitive than it is. Indeed, since the lower bounds are so large that a really practical procedure is excluded, the work on pushing down upper bounds would seem to be only of pure mathematical interest. But all this depends on the tacit assumption that a decision procedure ought to cover all formulae of first order language. If the assumption is dropped, the most obvious conclusion from the lower bounds is simply this: the full first order language is not appropriate! And one would now look for a subclass of that language where the ideas used for pushing down the general upper bound lead to a truly efficient procedure (that is, on that subclass, the bound is much smaller than $2^{cn}$): one expects to use those ideas to find an appropriate domain for a decision procedure.
To summarize: logic, in the form of the first order language of predicate logic, has certainly ‘invaded’ the part of mathematics or, more precisely, of mathematical logic which deals with decision procedures, and it is fair to say that this invasion led to a, statistically, unrewarding choice of problems. But, once again, it would be an exaggeration to speak of a ‘disaster’. For one thing, mathematicians have continued to solve problems (completely) which depend on some parameter, without talking grandly of ‘decision problems’; for example, for which dimensions d over the reals are there (not necessarily commutative) division algebras? Secondly, it must not be forgotten that the choice of manageable classes of problems has often required a great deal of imagination, incomparably more than the classification according to ‘external’ logical form.20

19 Open problem. The example above seems an excellent candidate for a study of measures of complexity of proofs. Certainly, the ‘abstract’ type theoretic mathematical proof is simpler than its logical normal form (obtained by normalization). It would seem worth while to examine if the normal form is much longer, or, if not, whether for example its genus (Statman’s measure of nesting) is radically increased. Of course, if the difference in length is really negligible, the example does not provide any evidence for expecting shorter decision procedures from the use of logically unnecessarily complicated axioms; it only draws attention to this possibility.

General remarks on limitations of the logical view (represented above by first order predicate logic or recursion theory). Perhaps the most striking feature of the logical view (which is certainly a defect by ordinary scientific standards) is this: easy early results on general logical questions, which have appeal and even a certain glamour for beginners, surprisingly rarely develop into ‘deeper’ refinements.21 Thus there are genuine doubts about the objective fruitfulness of the logical view (‘objective’ since there is little doubt that at least some of the people who have attempted to develop this view have great talent). All this seems quite consistent with Gödel’s unquestionable successes, since they all concern a very early ‘pioneer’ stage of the subjects he considered.22

20 Probably the most perfect case in point is furnished by diophantine equations $p(x_1, \ldots, x_n) = 0$. The ‘logical’ classification uses the number n of variables and the degree of the polynomial p; the ‘mathematical’ classification uses the genus of the variety determined by p = 0.

21 For example, some of the early, mathematically quite trivial results (in fact, little more than remarks) in recursion theory have a certain permanent interest, albeit on their original very elementary level; but attempts to develop them (for example, in degree theory) have been unrewarding: the ingenuity used turned out to be quite disproportionate to the results achieved.

22 In my opinion this applies even to the theory of the ramified hierarchy of sets, inasmuch as that theory altered its character before significant further progress (by Jensen) was possible: Gödel’s ‘logical’ definition of that L-hierarchy had to be replaced by the more mathematical J-hierarchy.

Computer Science

As seen above and stressed in the Introduction, contemporary mathematicians do not tend to overestimate the value of logic. On the contrary, it seems likely that, in time, much more of contemporary logic will become an integral part of standard mathematics than the average (contemporary) mathematician expects.

In contrast, at least judging by my very limited experience of computer scientists,23 here the situation seems quite different. There seems to be a fairly widely accepted ideal in computer science, of a universal programming language, preferably together with a so-called universal semantics. This ideal surely comes from the corresponding ideal of pretentious logic, for example, the universal semantics of Tractatus considered in Section 2. Of course, such an ideal (for example, of a universal semantics) is reasonable enough if one really has as good a view of what objects one is talking about as the (atomic) physicist has of the atomic structure of the matter he studies. But if one doesn’t, the chances are that one has overlooked something essential: and the grander the (universal) scheme, the more far reaching will be the consequences of an oversight.

23 Warning. My experience is patently biased because most of the computer scientists who (want to) talk to me at all, are or were logicians. And if I leaf through journals on the subject, I hardly ever understand the jargon - except the jargon of papers written by logicians. This reservation applies to the whole discussion below.

More specifically, (premature) universal schemes in the past seem to have had the following defects. First of all, such schemes diverted attention from less grandiose, but much more effective local schemes. One example was mentioned above, concerning the proper choice of ‘decision’ problems.
Another well-known example is provided by the language of set theory as a ‘universal’ language for mathematics which, for a long time, diverted attention away from the much more useful enterprise of finding a few concepts, the socalled structures-m`ere (not: the one concept of ‘set’) in terms of which many mathematical concepts are built up in a genuinely manageable way.24 The second defect, illustrated by the preceding sentence too, is this. Gener- ally speaking, people tend to be disproportionately uncritical towards universal schemes. To be specific, the same scientist who, on the one hand, is prepared to go in great detail into, say, a universal programming language and its variants, and to derive in it - with great satisfaction - a host of familiar baby programs (even if there is not a single novel application) is, on the other hand, often not prepared (for example, in connection with numerical computation) to examine a measure of complexity without knowing exactly, preferably in familiar terms, which features are being measured. In short, the glamour of glittering universal schemes is blinding. Even if my impressions described above, concerning the present state of com- puter science, are sound it would be premature to speak of any disastrous invasion (by logic). For one thing, the situation is not static; for example, years ago there was quite a lot of interest in the rater fruitless ‘abstract’ style of complexity theory, while several of the people involved have gone over into more realistic complexity theory (see Rabin [1977]). Secondly, to a certain extent, ordinary logic simply fills a certain void in an intellectually agreeable way: whatever doubts there may be about the role of the λ-calculus as a universal programming language or about its models as providing a universal semantics, there is no doubt of the mathe- matical and aesthetic value of the best work on the λ-calculus. 
Statistically, it is hardly likely to do harm inasmuch as there are no promising alternative with the same aims anyway - and people attracted by this work are probably not likely to be attracted by problems in computer science which are at present genuinely manageable. However, there does seem to be a genuine possibility that, because of the mathematical attraction of such studies as the work on the λ-calculus, people will simply become accustomed to its explicit aims, and the corresponding ideal form of problems. For any such ideal form, as illustrated by the discussion of decision problems earlier on, demonstrably optimal solutions may be (demonstra-

24The irony of the situation is that the language of set theory is by no means all that universal! For example, the set of natural numbers which are G¨odelnumbers of true formulae of set theory is of course not definable in that language (a point often forgotten in discussions of universal languages). The Disastrous Invasion of Logic into Mathematics 267 bly) impractical. From this one concludes that ‘there is nothing to be done’ - and overlooks that relatively few but nonideal methods can solve relatively many problems that actually arise in practice. For what it is worth: my own impression is that at least at the present time computer science needs systematic schemes less than resourcefulness and flexibil- ity of mind. Anecdote: in connection with the qualities just mentioned I remember Wittgen- stein talking about one of his pupils (whom he liked very well): If she found herself in front of a lamppost, all she’d ever think of doing would be to press her nose harder and harder against it; she’d never dream of stepping back and walking round it.

15.2 Philosophical Understanding: Questions of ‘Principle’

As I see them, the bulk of philosophical questions present principally a pedagogic problem of literally perennial interest; incidentally similar to that of other questions which move us, especially in adolescence. Realistically speaking, we often understand questions quite well and can be confident that they, or at least something like them, are important for us, before we have the experience or knowledge to give (or understand) a significant answer (that is, some perfectly reasonable answers simply cannot be made convincing by reference to our limited experience25). It is, in my opinion, a sheer assumption that precise analyses of these questions are generally rewarding (even if in most cases there is no reason to doubt the possibility of arriving at perfectly definite analyses - at least, after a couple of elementary distinctions). Presumably, philosophy in its literal sense, as the search for wisdom, should help one come to terms with questions or questioners that are not ripe for an answer (on given limited experience), and the strong feelings that, as a matter of empirical fact, are often aroused in such situations. I really have no view on the question what if anything to do about this (a question which may not be ripe for an answer on my limited experience). So I shall simply ignore this - obviously principal - aspect of philosophical questions, and continue to treat them with a straight face.

In keeping with its title, the main body of the present chapter concentrated on pragmatic issues, touching matters of ‘principle’ and their (philosophical) understanding only incidentally. Specifically, such matters arose only inasmuch as (what was called) pretentious logic (cl)aims to treat questions of ‘principle’, and was shown to be in conflict with practical needs. Also, there were asides on the banality of some such questions (which are so superficial that we understand them without any experience with specific subject matter) and, presumably therefore, frequent disappointment with literally correct answers.26

It may be worth while (or at least to the taste of some readers of this volume) to go into those ‘incidental’ points separately - perhaps in a more general context than that of proofs and rules discussed in this chapter. Tractatus seems to be a good context for this purpose (and not too general): both with respect to general aims and to style, it is a good standard of comparison - or paradigm, as Wittgenstein would have said - for traditional philosophical writing. Even if its answers are not ‘literally correct’ (beyond a very narrow domain), it is, I think, suitable for illustrating the kind of disappointment mentioned above. Wittgenstein was disappointed and disillusioned with it; at least if one goes by his ripe judgment, expressed in the mottoes to Tractatus and Philosophical Investigations, and not by the lively style in which he discussed privately the errors of Tractatus.

To put first things first. The general idea of Tractatus is certainly not hard to understand. Besides, both in his diaries and in private conversations Wittgenstein explained that idea. The model for an analysis of (meaningful) propositions (expressing possibilities, or ‘possible facts’) is the chemical analysis of molecules: logical atomism is modelled on chemical atomism.27 The logical ‘atoms’ were to be - logically independent - elementary propositions. Certainly such analyses are very familiar from elementary combinatorics and probability theory where one lists independent possible situations and considers combinations (or permutations) of them. I do not think that anybody was ‘bewitched’ by this model or picture; one was simply hugely impressed by the successes of chemical atomism, and then disappointed by the lack of success of logical atomism.

The real question seems to me to be this: Was there enough of a genuine idea in logical atomism? Was it comparable (enough) to the novel ideas involved in the introduction of chemical atomism, to have high expectations? Put differently, whatever the objective defects of logical atomism, is its failure so surprising as to destroy confidence in anything ‘like’ it - and to justify a complete turn-about as represented, for example, by the style of the Investigations?

The answer seems to me clearly negative. Firstly, chemical atomism was not a matter of ‘principle’ (as it had been at the time of the Greeks): a specific, relatively short list of atoms was available. Secondly, as far as constraints on combinations are concerned, numerical valencies - so it seems to me - are already several orders of magnitude more sophisticated than the truth functions used in logical atomism. Thirdly, even before a theory of geometric aspects of chemical atomism (that is, the geometry of the chemical bond discovered by Pauling by essential use of quantum theoretical properties of subatomic particles), the possibility of this kind of refinement had been evident. No comparable direction of refinement was mentioned in connection with logical atomism.

Finally, the business of logically independent elementary propositions was wholly unrealistic. One did not even find them in so-called ‘artificial’ languages, the simplest kind of formal systems, if ‘atomic’ (that is, logic-free) formulas were to be regarded as elementary: Is 0 = 1 supposed to be logically possible? It just doesn’t have the right smell.

Using a little hindsight, one should have been immediately suspicious of Wittgenstein’s idea that an analysis into elementary logically independent propositions ought to be plausible. After all, it is precisely the existence of logical relations between - apparently - atomic propositions which leads to the familiar paradoxes.28

In view of all this it is little short of ironic that Wittgenstein had to wait for enlightenment till Sraffa pointed out that some Italian gesture did not lend itself to an (obvious?) analysis by means of the schema of Tractatus! as if this failure were its principal weakness . . . Besides, realistically speaking, even the best of analyses or theories will only apply to a minute fraction of the phenomena we encounter: we have to search for - preferably - striking ones which lend themselves to theoretical analysis.29

Granted all this, it cannot be denied that there is something unsubstantial here - about Tractatus itself and about its oversights. It is easy to understand but difficult to agree with Wittgenstein’s own reaction, of looking for ‘profound’ mistakes or misconceptions that led (him) to the fiasco. He certainly felt that his view in Tractatus was very narrow, his paradigms extraordinarily special. He widened his view in the Investigations, but still staying in the range of the most familiar kind of experience; perhaps simply because the questions asked are intelligible and, in fact, occur to us, when we have only this kind of experience (and it would be elegant to answer them by reference to only such experience).

Perhaps with skill and imagination it is possible to give answers to some traditional questions in terms of such experience. But in view of the specific examples discussed in Section 1, one would expect difficulty in making those answers convincing. For 2000 years there have been reasonably intelligible questions about the roles of so-called abstract and concrete (or concretely realizable) notions in mathematics, with Kant’s distinction between (philosophical) analysis of abstract notions and (mathematical) constructions by imagining suitable concrete configurations. How much easier is it all nowadays when we actually have a body of set theoretic abstract mathematics before our eyes! Similarly, the aspects of explicit definitions and of decision procedures, discussed in Section 1, could not, I think, have been made really convincing without detailed references to higher mathematics. And when we are interested in problems of meaning, are we really clever enough to solve (any of) them without looking at the phenomena of animal communication, where often we genuinely do not know the meanings of - clearly meaningful - sounds and signs? (In our own case we may not always be able to give chapter and verse; but realistically speaking we often know better what people have in their heads than in their bellies).

More generally, and perhaps speaking only for a small minority (not: the silent majority), I find it hard to have confidence in our whole ‘critical’ philosophical tradition, with its paradoxes, its dramatic claims either to see profound errors in our ordinary views or profound misconceptions in 2000 year old questions. It all sounds like a paranoid’s paradise, and forgets the most striking fact of intellectual experience: how our thoughts seem to adapt themselves to the objects concerned, as we study them and get familiar with them (in a detached way) and how, with this familiarity, comes the judgment needed to distinguish between plausible and implausible theories, substantial and superficial contributions.30 Of course, it is a separate matter whether we are even remotely conscious of the steps involved, or (even) capable of making the essential ones conscious to ourselves.

25 For a very eloquent exposition of this state of affairs, see Solzhenitsyn’s Nobel Prize Lecture. Reviews of this lecture, as of most of his other writings, are inadequate because of the usual (journalistic) preoccupation with his views of current affairs.

26 As illustrated by: What is truth? For balance one should always remember the one striking exception among traditional philosophical questions: What is matter (made of)?

27 For example, possible combinations C_kH_mO_n are given by those triples (k, m, n) which are consistent with the valencies of carbon, hydrogen and oxygen.

28 For example if T is the truth predicate for a (meaningful) language L, F ∈ L and F′ is the formal negation of F, then T(F) and T(F′) are atomic, but satisfy: T(F′) ↔ ¬T(F).

29 This lesson of the history of science is often ignored by philosophers, particularly by philosophers of so-called natural language, who actually say (out loud) that there ‘ought’ to be a theory of natural language just because we know so much about our own language; thereby overlooking that this knowledge need not be theoretically manageable. It was much easier to find a theory of celestial than of terrestrial mechanics.
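The figure of 20 in the example cited in note 30 can be checked by a short count. The sketch below assumes that the ‘simple conditions’ amount to the comma-free condition (no codeword may appear across the boundary between two adjacent codewords), which is how the Crick, Griffith and Orgel proposal is usually described; the alphabet letters and the function names are illustrative and do not come from the text. It computes only the upper bound on the number of codewords, not the code itself.

```python
from itertools import product

ALPHABET = "ACGU"  # illustrative four-letter alphabet (an assumption, any 4 symbols work)


def cyclic_class(word):
    """Return the set of all cyclic rotations of a word."""
    return frozenset(word[i:] + word[:i] for i in range(len(word)))


def max_comma_free_words(alphabet=ALPHABET, length=3):
    """Upper bound on the size of a comma-free code with words of the given length.

    A word that equals one of its own proper rotations (AAA, CCC, ...) is excluded,
    and at most one word may be taken from each remaining rotation class.
    """
    classes = {cyclic_class("".join(w)) for w in product(alphabet, repeat=length)}
    return sum(1 for c in classes if len(c) == length)


def is_comma_free(code):
    """True if no codeword straddles the boundary of two concatenated codewords."""
    code = set(code)
    if not code:
        return True
    n = len(next(iter(code)))
    for u in code:
        for v in code:
            joined = u + v
            # Inspect every window that overlaps both u and v.
            for shift in range(1, n):
                if joined[shift:shift + n] in code:
                    return False
    return True
```

With the defaults, `max_comma_free_words()` counts the 64 three-letter words, discards the 4 constant ones, and groups the remaining 60 into rotation classes of 3, giving the bound (64 - 4)/3 = 20. The checker `is_comma_free` can be used to test any candidate code against the assumed condition.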

30 A little known, but quite beautiful example of a false theory which is almost unsurpassably plausible on limited knowledge is given by Crick, Griffith and Orgel [1957]. Granted that the genetic code for amino-acids uses 3 letter words from a 4 letter alphabet, the authors give simple conditions for a rational code which determine essentially a unique code which on the one hand permits exactly 20 amino-acids (in accordance with observation), and on the other hand turned out to be utterly false.

Chapter 16

All Progress Looks Bigger Than It Is

The title is the motto1 of Philosophical Investigations, and says in effect that the ratio of actual progress (as judged by mature reflection) to apparent progress (measured by expectations after a few initial successes) is generally poor. Here the motto is applied to progress in traditional philosophy, and compared to progress in some of the younger ‘heirs’ of philosophy such as the natural and mathematical sciences.

As a matter of historical curiosity, the motto was not Wittgenstein’s first choice. As he told me, he would have preferred Bishop Butler’s tag Everything is what it is, and nothing else, if G.E. Moore had not used it earlier. But Wittgenstein stuck to the motto up to the end of his life. So, at least for him, the motto had priority over the many passages in his writings with which it conflicts, a point which was ignored by many of his disciples.

Summary

Section 1 draws some general, but neglected conclusions from the obvious fact that traditional notions and questions occur to us when we know little; usually too little to discern even roughly the nature, the rewarding sides of the questions, or the kind of (intellectual) tools needed. In consequence traditional matters often lose their interest when we know more. These conclusions apply to all branches of traditional philosophy, and so are not affected by controversial views on the scope of the subject.

Section 2 concerns the specific branch of philosophy which searches for definitions of common notions in familiar terms, after the model of Euclid’s geometry which appeals to all of us (and impressed Plato as relevant). This search is seen to be particularly inappropriate when applied to the kind of notions which occur in connection with literal family resemblances; inappropriate for studying the phenomena involved, and especially our process of recognizing them. Contrary to a widespread opinion among philosophers, the defects above are not corrected by greater precision of the definitions. Section 2 concludes with occasional practical uses of traditional definitions for replacing the actual process of recognition (here: of family resemblances) by artificial means.

Section 3 goes into Wittgenstein’s own principally pedagogic aim for (his particular ‘heir’ of) philosophy, summed up in the slogan about freeing us flies in the fly-bottle from ‘bewitchment by our language’. To distinguish this from ordinary instruction and the automatic clarification brought about by extending knowledge, the part of pedagogy involved is here called intimate. Section 3 describes some features of Wittgenstein’s later style of such pedagogy, and compares it with that of another ‘heir’ of philosophy, early mathematical logic. However, the exposition in Section 3 introduces this twist. Results which are sometimes thought of as basic in the sense of deep, far from the surface, are here used for pedagogy on what occurs to us when we know very little, in short, they are basic only in the sense of exceptionally elementary. Such matters are obviously of perennial interest since we can rely on a perennial demand, each of us starting off knowing very little explicitly.

The Appendix applies the motto in the title to progress in mathematical logic, and reexamines logicism in the light of experience in ‘advanced’ mathematics, distinguishing sharply between the roles of logical language and logical reasoning.

Disclaimer. It is beyond the scope of a chapter, and certainly beyond my abilities, to give adequate support for the general conclusions summarized above. The main issues, on the value of notions and questions in traditional philosophy compared to the alternatives provided by its younger heirs, are severely statistical, needing data often hard to come by. In such matters our impressions, alias convictions, are well-known to be unreliable: far from being peculiarly subjective, questions of (heuristic) value need special precautions against undue ‘influence of the observer on the observation’. Interested readers will find relevant additional historical and autobiographical data in Chapters 14 and 15.

0 Originally published in Grazer Philosophische Studien, 6 (1978) 13–38, as ‘The Motto of ‘Philosophical Investigations’ and the Philosophy of Proofs and Rules’.

1 Überhaupt hat der Fortschritt das an sich, daß er viel größer ausschaut, als er wirklich ist.

16.1 General Features of Traditional Philosophy and Some of Their Implications

As mentioned in the Summary, traditional notions occur to us when we know very little. Statistically they cannot be expected to remain rewarding when we know much more. To be specific: when we know very little, we tend to see superficial, abstract features of objects. And when we do see specific features we often cannot say very well (cannot define in familiar terms) what we see.

Traditional philosophy makes those abstractions its principal objects of study (and regards them as the height of sophistication to boot).

Extending experience

When we know very little, the main intellectual tools available are a sense of coherence and, more generally, introspection. It is probably fair to say that the use of these tools (rather than subject matter) is the superficially most distinctive feature of traditional philosophy compared with its younger heirs; and it is almost certainly its most attractive feature for contemporary philosophers who dislike the need for extensive preliminary studies in the sciences. Without exaggerating dramatically the unreliability of these tools, we simply find we do better by looking ‘outward’, by extending experience.

An obvious example of the use of extended experience comes from traditional questions on space and time. They profited greatly from (Einstein) paying attention to bodies moving near the speed of light; but only after technology had developed enough to study such bodies.

Of course, there are exceptions where extensions are uneconomical, well illustrated by some famous criticisms of errors.

1. Wittgenstein himself often succeeded by paying attention to neglected experience, a much weaker kind of ‘extension’ than the novel experiments or technological advances used in the sciences. Thus to point out an error in St. Augustine’s particularly narrow ‘theory’ of language, it was sufficient to mention that language is also used for commands.2 However, similarly weak extensions are not likely to be sufficient for handling less blatant errors, let alone for a genuine, positive theory of language. In fact, one wonders if one can afford here to neglect information about animal communication, now that we have some tools to record such information in detail.

2 Though sufficient for pointing out the error, a mere passing mention would not have the following literary value of Wittgenstein’s own exposition. He makes vivid to us the feelings about our own knowledge which present themselves to us, often quite fleetingly (as do other literary works, except that they usually concern our feelings about other people or about our inner life, metaphorically or literally speaking). Like many other literary authors, in this respect he may have helped some of us readers more than himself.

2. Within the sciences, especially at an early stage, imaginative use of very familiar experience is, occasionally, spectacularly useful. The standard example is Galileo’s refutation of his first idea concerning freely falling bodies, namely that the velocity is directly proportional to the distance s fallen. He noticed, in modern terms, that then $s = s_0 e^{ct}$, and so a body would never start to fall. So the proposed law patently conflicts with very familiar experience (of bodies which start to fall): in this sense the law is as absurd as St. Augustine’s view of language.3
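Galileo’s observation in item 2 can be written out as a short derivation. This is a sketch in modern notation, not Galileo’s own argument: $v$ (velocity), $c$ (the assumed constant of proportionality) and $s_0$ (the distance already fallen at time $t = 0$) are labels chosen to match the formula quoted in the text, and the step from $s_0 = 0$ to $s \equiv 0$ relies on the uniqueness of solutions of a linear differential equation.

```latex
\begin{align*}
v = \frac{ds}{dt} &= c\,s, \qquad c > 0, \\
\Rightarrow\quad s(t) &= s_0\, e^{ct}, \qquad s_0 = s(0). \\
\intertext{For a body released at the point from which the distance is measured, $s_0 = 0$, hence}
s(t) &= 0 \quad \text{for all } t \ge 0 .
\end{align*}
```

On the proposed law, then, a body that has not yet fallen any distance has zero velocity and keeps it, so it never leaves its starting point. By contrast, a velocity proportional to elapsed time, $v = gt$, gives $s = \tfrac{1}{2} g t^2$, which does describe a body starting from rest.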

Doubts about the nature of traditional questions

When we know very little compared to the scope of a question we are often bad at guessing even remotely the methods needed for a satisfactory answer, though often we recognize such an answer immediately when we see one. (If this fact contradicts some norm or doctrine of ‘sense’, or of what questions and answers should be like, so much the worse for the doctrine).

The question asked by Greek philosophers: What is matter? provides an excellent illustration, inasmuch as there was genuine doubt whether it belongs to natural or analytic philosophy (the latter being closer to traditional philosophy than its scientific heirs). Of course, Plato and Aristotle would have been thrilled by the answer of contemporary physics to: What is matter?, and of course they would not have remained transfixed by their own preoccupations with ‘form’, ‘essence’, etc. And, of course, they would have found the physical properties of matter incomparably more rewarding than the analytical properties of ‘matter’, once they had the best modern presentations of both kinds of properties before them.

However (for the sake of perspective) it is equally important to stress that some 300 years ago the best that physics was likely to do with the question was much less satisfactory; not only less so than contemporary physics, but also than contemporary analytical philosophy. Fortunately, the matter is not purely hypothetical. Newton happened to be very much interested in the question; specifically, in establishing an atomic structure of matter. All he managed to produce was almost literally the same argument as Wittgenstein’s for the determinateness of sense, except that Newton referred to an ordinary stick instead of the sword Nothung. Newton’s argument, patently, was not used for developing modern atomic theory.

Actually, this example understates the relative value of analytical philosophy since here its contribution is infinitesimal, though perhaps of a different order from Newton’s. But there are also analytical contributions of obviously permanent value, for example (if a pun is permitted), Cauchy’s elimination of infinitesimals and limits by contextual definitions in terms of convergence.4 Another famous success in the style of analytical philosophy is von Staudt’s answer to: What is a point at infinity? in terms of bundles of straight lines; generally through a point, with the extreme case of bundles of parallel lines.

Digression. In view of all these earlier successes, it is objectively a bit odd that Frege’s or Russell’s analyses of the definite article by contextual definition are often considered highlights, if not the beginning, of analytical philosophy. But it may be instructive to some readers to reflect on the subjective sources of glamour of the business about the definite article.

3 Galileo’s proposed law, being an ‘empirical’ matter, could also have been refuted ‘empirically’ by careful measurements of freely falling bodies (which would show that the law is not even approximately true, not even well after bodies have started to fall). Galileo discovered that, modulo a very little mathematics, one got a more efficient refutation if one looked a bit at the mathematical properties of the proposed law before making experiments. Naturally, this does not cast doubt on the possibility of using the jargon of philosophers of science precisely; the significance of their distinctions (here: between mathematical and empirical refutations) is in doubt.

Conclusions

As for general conclusions to be drawn, nothing said so far suggests any basic error or misconception in the bulk of traditional philosophy. The difficulties are subtler, quantitative, concerning the frequency of successes and failures.

The single best hope is to develop some judgment for recognizing which questions are philosophical in the sense just described; in other words, for recognizing if a treatment by use of those specifically philosophical, attractive tools described above, can be rewarding. In particular, one must develop some judgment on whether extensions of experience are liable to modify a given question or its interest.5

Of course, the easier part is to learn what not to do. Probably the single most common mistake to avoid here is the assumption that questions which occur to us prior to (detailed) experience ought to have specifically philosophical, a priori solutions.

To be more concrete, Section 2 discusses in a general way a quite broad class of notions which tend to invite unrewarding philosophical analysis.

16.2 Successes and Limitations of Definitions

We first recall the successes of definitions as a tool of traditional and analytical philosophy, and then turn to some negative aspects.

Definitions in geometry

The most striking successes are to be found, both phylogenetically and ontogenetically speaking, in the discovery of definitions in geometry, for example, Euclid’s of circle, but also (some 2000 years later) of convex (a body B being convex if all segments joining any two points of B lie wholly in B), with definitions of length, area, volume, etc., in between. As the word ‘discovery’ implies, these definitions represent relations between independently understood notions, and certainly not arbitrary conventions or rules of language - even though, after discovery, they may be used efficiently as such rules (and justified by consistency proofs, incidentally very much in the style of analytical philosophy). Like the contextual definitions mentioned in Section 1, those geometric definitions clarify our common notions, and occasionally solve puzzles. But they do much more besides: they have become an integral part of our intellectual equipment, being used constantly for advancing our knowledge of circles and other convex bodies.

The undoubted importance of definitions has led to quite exaggerated claims, for example, that all precise reasoning must go back to definitions of the objects considered. This is obviously contradicted by the style of reasoning known as the ‘axiomatic method’, where one starts off with properties of the objects (expressed in the axioms). Some of these axioms are a priori, in the sense that no reasons that we can give for them (for example, on how they are learnt from experience) are as convincing as the axioms themselves; in terms of Section 1, they are best treated as a priori.

Though fairly close in form to geometric definitions, such concepts as perfect rigid body or ideal fluid (in so-called rational mechanics or hydromechanics) provide a sharp contrast to Section 1. These notions express quite correctly how we think of the world, how reasonable (rational?) solids and fluids ought to behave. But it turned out that most fluids behaved differently, and so the concepts were of little use for understanding the facts of hydrodynamics. Then there was special pleading for some kind of ethereal mathematical interest for the (physically discredited) notion of ‘ideal fluid’. This was a shame, inasmuch as most properties of that notion which were treated in the early literature had no interest without the hydrodynamics vocabulary. Sure, the results were mathematical: but not all mathematics is mathematically interesting! In fact (and this is becoming generally recognized) imagination and resourcefulness are needed to put mathematical notions in their proper place, often by disentangling them from the situations where they first happen to occur to us. The parallel in philosophy is obvious, whenever some ethereal philosophical interest is claimed for what is simply boring by the light of nature, for example, the logicist definition of the number one (to Poincaré).

4 Incidentally, Wittgenstein repeatedly told me how much he appreciated Cauchy’s philosophical finesse, and success in analyzing limits away.

5 Reminder. As we have seen above, there are questions which do lend themselves to rewarding philosophical analysis! So the matter of developing judgment is not illusory.

Family resemblances of concepts

Returning now to definitions in the traditional style, it has to be admitted that the law of diminishing returns generally applies as (successful) theoretical sciences develop. But there are also areas where such definitions are unrewarding even at an early stage, in short, where nothing sensible can be done with philosophical tools.

As I see it (now), Wittgenstein’s slogan of family resemblances reminds one of a class of phenomena where the limitations of the traditional style are exceptionally vivid, and hence instructive. I mean the phenomena of literal family resemblances, say, of the Hapsburgs or the Bourbons.

What can we realistically expect from any definition of such a family resemblance, say, in the style of analytical philosophy?

The first thing to expect is, probably, a genuine theory of literal family resemblances or some kind of practical mastery. As appears almost certain now, molecular biology is the appropriate tool here. But even before the advent of molecular biology, in fact, even before Wittgenstein’s death, it was plausible that something very different from any traditional analysis would be needed, something as far removed, so to speak, from ordinary experience as molecules.6 After all, family resemblances turn up in superficially very diverse categories: in blood types, bone structure (noses), features (hare lips), psychological quirks and gestures. We cannot expect to find a common element in ordinary experience.

A second use to expect from a definition would be for the study of our actual process of recognizing a family resemblance. It would hardly be expected to go through the steps implied by anything like a traditional definition, especially if the latter is sufficiently complicated, and recognition very rapid. In this case, one expects the process to involve some highly specific innate reaction (which, in grand jargon, has evolved to help us survive in the world in which we live); more simply, in one of Wittgenstein’s favorite phrases, the process is best regarded as a fact of our natural history. The phrase has its defects in that it hides some critical issues.7 But at least it reminds us that traditional definitions are not

6Of course, the main point here, that any useful definition may have to refer to things far removed from currently available experience, is also well illustrated by the modern answer in atomic terms to the old question: What is matter?, already mentioned in Section 1.
7Since not everything we do or say is to be dubbed ‘a fact of our natural history’, it pays to spell out a bit the principal issue.
Trivially, an act of recognition is not a brute fact of our natural history if it involves the conscious use of some criterion, or if we have been able to become conscious of some (subconscious) intermediate steps. The principal defect of so-called reasonable assumptions about our reason (which underlie most general schemes for analyzing knowledge) consists in overestimating the extent to which we were able to become conscious of subconscious steps, and in underestimating the precautions needed to guard against undue influence of the analysis on future conduct, leading the victim of the analysis to use different intermediate steps. For example, though judgments of convexity can be justified by reference to the definition given above, its actual - and reliable - perception may well involve a specific built-in reaction in the retina (as in the case of the frog’s eye), of which we cannot become conscious. It is a commonplace of ordinary learning and communication that practice relies, successfully, on a built-in predisposition to ‘pick up’ the intended meaning, which would be consistent with a whole battery of innate reactions. Less known, but equally fatal for those reasonable assumptions about our reason, is our experience of physical theory. The visible part of the physical world is certainly most vivid to us, but it is simply not physically significant: it does

necessarily rewarding in a study of family resemblances; in particular, possibly less than the usual style of natural history.8

Following the lesson learnt (in the discussion above on ‘ideal fluids’) about bogus mathematical interest, it is time to look for an area, not necessarily of philosophical interest, where a (hypothetical) traditional definition of a family resemblance has a chance of being rewarding.
An imaginative (clever) definition in this style, in terms of familiar things, may well be useful for simulating: not the actual process of recognition of family resemblances, but some of its useful results. In other words, the process is simulated by use of those ‘familiar things’ to which the definition refers. For example, if the definition has mechanical character it might suggest some mechanical device for recognizing a family resemblance from a photograph or a holograph. Put paradoxically, what is left of the ethereal philosophical aim (in the discussion above on ‘ideal fluids’) of establishing a norm by means of a traditional definition, is its technical use for artificial simulation. But there is a quite essential difference:

The traditional aim is taken to be achieved the moment we have some correct definition in terms acceptable to the particular scheme used. For the new aim, of simulation, the solution must be economical, too.

In short, simulation imposes stricter adequacy conditions.9 As a corollary the traditional aim occasionally retains some heuristic value: being less demanding, it is less frightening, and as a general rule, easier to satisfy.

The observations above and the consequences drawn from them may well be ‘revolutionary’ for the philosophical tradition - in accordance with the usual view of the revolutionary character of Wittgenstein’s later writings. But there is nothing here that is in the least revolutionary for the silent majority, notoriously skeptical of the philosophical tradition. As one spokesman for the silent majority

not lend itself to an autonomous theory. In short, not only in the emotional part of mental life, but also in perception and reasoning, the conscious part is not of much use for theoretical understanding. Of course, this part is real enough, and of (literary) interest inasmuch as it constitutes our view of our inner world, just as the visible part of the universe makes up our view of the external world.
8Warning. There is a widespread, but uncritical assumption that some kind of definition is prior to any study of the objects considered, to be used as a norm. Patently, there is no need for an explicit (verbal) definition if one can rely on recognizing the objects. It is a subtle matter to choose between definitions of the same object, even in the case of traditional definitions. It is a separate matter whether there is a fundamental choice, appropriate both for a theory of the objects and for analyzing our process of recognizing them.
9A prime example of the kind of simulation in question is provided by the definition of logical validity in terms of derivability, say by Frege’s rules. Those rules were presented as providing a norm, a criterion of precision, allegedly superior to ordinary reasoning.
They are not, and therefore (as elaborated in the Appendix) are rarely used in practice; in contrast to the constant use of the definitions of ‘circle’ or ‘convex’ in geometry. On the other hand, formal derivability is obviously relevant if the recognition of logical validity is to be simulated by means of a digital computer; only now there is the additional requirement that the rules be economical.

(Thomas Hewitt Key) put it a century ago:

What is matter? Never mind. What is mind? No matter.

In short, there was a lack of drama about those questions. (The silent majority is only skeptical and not particularly critical; unlike, say, positivists with their claims of dramatic misconceptions behind traditional questions.)

16.3 Intimate pedagogy

Suppose we have come to the conclusion that some given notion, for example, one of those grand traditional notions, has to do with a family resemblance in the sense of Section 2. Of course we do not assume that such a conclusion, even if sound, can be conveyed convincingly, especially to individuals with very limited experience.10 We ask the pedagogic question: What can be done? The appropriate style of pedagogy may have to be intimate, sensitive to the intended audience (in contrast to mass instruction of formal skills). Wittgenstein’s own style will be sketched first, and then compared with a superficially very different style.

The style of Wittgenstein and the Jesuits

Perhaps the most striking feature of Wittgenstein’s style (as, for example, in the passage referred to in note 10) is its obliqueness. Doctrines are discredited without being formulated explicitly, let alone precisely. Elaborate arguments in the literature concerning the doctrines are ignored.

This style is familiar from the Jesuit tradition of discrediting refutations of the Bible; for example, involving cosmology (continuous creation) or biology (evolution). Jesuit scientists present theories (Big Bang) or additional observations (shadings of butterflies without apparent survival value) which are so to speak embarrassing to the proposed refutations. The new material belongs to exact science, but the pedagogic use made of it breaks all the sacred rules of exact methodology. At a minimum the latter would require that the particular features of the Biblical story which are to be defended, be listed; and that the refutation intended by its opponents be stated, and an error located. Preferably (according to exact methodology), one would list a general class of theories (including those implicit in the refutation considered) which are in conflict with the new material. Jesuit scientists do none of these things.

10In other words, we do not make here the pious (so-called rationalist) assumption that, properly put, every sound conclusion can be made convincing to the rational mind. An eloquent presentation of weaknesses of the assumption is to be found in the first part of Solzhenitsyn’s Nobel Prize Lecture. Wittgenstein touches the matter obliquely in his preface to Philosophical Investigations (about readers who have come close to his thoughts on their own).

To put first things first: it is by no means evident that exact methodology is, statistically, superior to the Jesuit tradition; at least, in connection with the Biblical stories here considered. In popular language: what is the point of all that logic-chopping, when it is out of all proportion to the precision of the ‘data’ (given in the Bible)? Is it obviously rewarding to formulate refutations with loving precision, if one starts off with the conviction that they are badly mistaken anyway? Certainly, when attempts at verbal precision are premature they tend to produce unhappy wordings, and hence a futile chain of objections and counterobjections. In short, discretion is needed to recognize when the rules of exact methodology are rewarding. Nothing short of a statistical study can decide whether those rules have been generally applied with discretion.

Precise formulations as an (occasional) alternative

Granted all this, there remains the fact that occasionally we do better by using precise formulations (than the Jesuit style) for refuting an idea or doctrine. Galileo’s witty argument (in Section 1) gives the general flavour of such a use. Below there are a couple of results from mathematical logic with a similar flavour, which can be used for debunking some ‘grand’ claim or notion. Like Galileo’s example they are principally pedagogic, and do not help us find anything to replace the discredited ideas.11

Tarski’s discussion of truth definitions establishes both an implicit definition by means of the so-called adequacy conditions, and an indefinability by means of formulas of the language considered (for a wide class of languages, including all those generally studied in logical texts).

What about pedagogic uses of Tarski’s discussion? Realistically speaking, it is certainly a perfect answer to Pilate’s question: What is Truth? Whatever the defects of the answer, it is certainly up to the level of Pilate’s logical sophistication. Of course, the answer is not deep, not very rewarding. But then it would be sheer assumption to suppose that the question is deep, and that a correct answer must be rewarding; cf. Columbus’ answer to the question how to make an egg stand up.

Another excellent use of Tarski’s result is for defeating the claim that the language L of set theory is ‘universal’: the set of integers which are Gödel numbers of true sentences of L is not definable in L.

The heuristic value of traditional questions, stressed in Section 2, is very well illustrated by Tarski’s observations, too. Looking at the ‘grand’ notion of Truth, he found an implicitly, but not explicitly definable predicate. This continues to be of use in definability theory, but only when combined with specific constructions

11Warning. These results have also been used quite indiscreetly, to accompany philosophical claims as grand as anything in the traditional literature of the pretentious business of mentalism. Because of this need for discretion, the results have not been particularly good for mass instruction, only for intimate pedagogy.

(when more sophisticated indefinability problems are ‘reduced’ to the case of the truth definition).

Another beauty for intimate pedagogy is Gödel’s first incompleteness theorem,12 that is most naturally presented as the answer to the question: Is formal derivability in a system F equivalent to truth? say, for assertions about natural numbers. The (negative) answer is then established by use of Tarski’s observation above. One simply verifies that, for all systems F containing a minimum of arithmetic, formal derivability is definable while, by Tarski, truth is not.

Gödel’s theorem unquestionably refutes the idea that, in mathematics, abstract notions are merely used as a façon de parler. Hilbert expressed this idea explicitly and precisely in his consistency programme. A more direct formulation of the idea, which is equally easy to make precise, is that a proof by use of abstract notions of a theorem stated in elementary form, can be straightforwardly converted into an elementary proof. Incidentally, the idea was held quite widely by Hilbert’s contemporaries, too (and today, despite the refutation!). The idea is valid since most early uses of infinitistic notions (and modifications involving far from straightforward conversions of proofs) have had heuristic value (again, when combined with imaginative specific constructions or reformulations).
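The step from Tarski’s observation to incompleteness can be set out schematically. What follows is only a sketch under assumptions not spelled out in the text: F is taken to be recursively axiomatized and formulated in the language of arithmetic, and the symbols Prov_F, True and the corner quotes for Gödel numbers are notation introduced here purely for illustration.

```latex
% Notation assumed for this sketch (not in the original text):
% \ulcorner\varphi\urcorner is the G\"odel number of the sentence \varphi,
% \mathbb{N} is the standard model of arithmetic. Requires amsmath, amssymb.
\begin{align*}
\mathrm{Prov}_F &= \{\ulcorner\varphi\urcorner : F \vdash \varphi\}
  && \text{definable in arithmetic (for recursively axiomatized } F\text{)},\\
\mathrm{True} &= \{\ulcorner\varphi\urcorner : \mathbb{N} \models \varphi\}
  && \text{not definable in arithmetic (Tarski)},\\
\text{hence}\quad \mathrm{Prov}_F &\neq \mathrm{True}.
\end{align*}
% If, in addition, F derives only true sentences
% (\mathrm{Prov}_F \subseteq \mathrm{True}), the inequality yields a true
% sentence \varphi with F \nvdash \varphi.
```

In this form the argument does not exhibit any particular underivable sentence; it shows only that the two sets cannot coincide.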

Conclusions

As for general conclusions from the successes described above, the considerations of Section 1 suggest this.

On the positive side, comparable results may be expected for neglected ‘disreputable’ notions (where, of course, comparable satisfaction will not be absolutely great). The positive side can be illustrated by successes in Brouwer’s intuitionistic foundations, generally considered disreputable (Bolshevik). For a long time this subject was not examined with detachment, but simply pursued by a few enthusiasts.

On the negative side, similar successes should not be expected with notions which have been studied intensively. The negative side seems to me very well illustrated by the recent development of set theory. Despite attempts at traditional analyses, as a matter of recent history progress was not made with the originally intended, ‘most general’ notion of set, but with sophisticated versions or specializations. The latter turned out to be more rewarding: our knowledge of the general notion gives more when applied to those specializations. At the same time, confirming the pedagogic value of generalities, introductions to books on set theory give an increasingly prominent place to simple precise results about

such general notions as the full cumulative hierarchy of sets, and especially its segments.

12Here it should be added that Gödel’s original paper [1931] borders on the Jesuit tradition described earlier. He did not even give a definition of ‘formal system’, but considered Principia Mathematica and spoke of ‘related systems’; and only mentioned its bearing on Hilbert’s programme without giving any precise formulation.

Acknowledgement

I first heard a clear formulation of the pedagogic value of philosophy more than 30 years ago, without however appreciating its full inwardness at the time. When John Richard Geach was five years old, he complained to me of his parents’ objection to his taste for bread and other carbohydrates (he was getting a big tummy). His parents, themselves not stunningly slim, had warned him that girls didn’t like boys with big tummies, and that he would regret his eating habits when he was older. His objection was that he wouldn’t care. I asked him if he was going to be hungry the next day at 4 p.m. He said: How should I know now? Of course I didn’t answer, and of course he saw the point fairly soon. After a while he asked: Is this philosophy? I decided it was, and he concluded: Then teach me more philosophies. I did. I don’t know if he changed his eating habits after this bit of intimate pedagogy. Anyway, this was not its purpose. As Wittgenstein liked to say, what matters is not what you do, but the way you talk about it.

Appendix. The Place of Logic in the Light of Experience

This Appendix contains twists which appeal to me principally because of their freshness (without necessarily being more rewarding in the long run than quite hackneyed points). Of course, there is an element of self-indulgence in emphasizing those twists. But, at least statistically, there is also the objective reason that where there has been little progress, familiar considerations are not likely to be rewarding. This point goes well with Wittgenstein’s distaste for hackneyed banalities, the single most striking element of his thought.

General Introduction

Foundational reductions differ from those used in ordinary scientific practice, for example, of geometry to algebra by use of Cartesian coordinates, in a far-reaching way. The scientific reductions are introduced in the course of work on an established subject, on familiar objects, and used to solve problems about those objects. Consequently, whatever the eventual limitations of such reductions may be, one will not be left altogether empty-handed. Besides, in a familiar area, it is usually possible to judge the interest of a problem by the light of nature.

Foundational reductions are more chancy: whatever else may be in doubt, a definition of the number 1 will not primarily be used to do sums. The interest is liable to be more ethereal, with all the consequences explained in Section 2. In particular, the permanent interest of such reductions may not at all lie in the area where they were first introduced. Neither the particular reduction (analysis) nor its originally intended purpose need be viable, only something a bit similar. Generally, the proper area of application or ‘home’ for those reductions is not found by going over the original intentions more and more carefully.

Traditional aims for logic

A traditional aim formulated for logic is to contribute to the analysis of mathematical reasoning. If one ignores that what anything like logic can do for this aim is simply infinitesimal, it is surely ‘reasonable’ to expect logic to be fundamental. Logic, or, at least, its propositional part was one of the first areas of reasoning to be presented coherently. For many philosophers it was (and is) the only body of reasoning studied at a mature stage. Unquestionably, one gets the impression that it is typical of something quite general: so why not of all reasoning?13

As elaborated in Section 1, if we had no detailed experience we should simply have to rely on such more or less conflicting impressions as: our intuitions are all we have; or a little learning is a dangerous thing; or everything is subject to revision. Fortunately, the last 100 years have provided several new styles of reasoning in the natural and mathematical sciences. So our impressions on the place of logic can be tested. Obviously, it will be easier to establish errors, although quite often there is little doubt that this or that is here to stay.

Logical consequence

One thing that is surely here to stay is the notion of logical or axiomatic consequence, as used, for example, in current formulations of the independence of the parallel axiom. The geometric constructions involved were known long before one could formulate clearly and convincingly how they can be used (to provide models) for the independence proof. Once this was seen, we never looked back. Evidently, the discovery that the rest of Euclid’s geometry can be set out in terms of purely logical consequence from a few axioms was a prerequisite for the interest of this formulation.

13Incidentally, much the same applies to old-fashioned high school arithmetic: Hilbert’s programme expresses nothing else but the impression that this part is typical of all mathematics. In a different direction, Wittgenstein reacted to the truth functional analysis of propositional logic as follows: if such a simple observation brings order into the mind-boggling totality of all propositions, why not go the whole hog, and have a whole world scheme along those lines? In other words, why not the scheme of Tractatus?

First order logic

The separation of the language of first order logic was much less immediately convincing. For example, it was not needed in the independence of the parallel axiom (which also holds for the so-called second order formulation of Euclid’s axioms of geometry, in particular, of the continuity axiom).

Experience in my own lifetime is relevant here. Some 40 years ago logicians considered themselves to be misunderstood martyrs when the silent majority of mathematicians showed little interest in the compactness theorem and other generalities about first order logic. The lack of interest is consistent with the theme of this chapter, especially in Section 1. The logicians involved presented their general results as obviously significant because they concerned arbitrary axiomatic systems.
But this left open the possibility that, for any particular (familiar) axiomatic system, those results have only superficial consequences.14 Since logicians ignored that possibility (to which mathematicians were sensitive from bitter experience), their enthusiasm carried little conviction. For details on the subsequent development, see the Appendix of Chapter 14.

Formal rules

The third aspect of logic concerns not merely language, but inference; specifically, by means of formal rules for generating all valid formulas. Such rules were discovered by Frege, and proved complete by Gödel some 50 years later. The history is quite different from the two cases above.

In contrast to the notion of logical consequence (or, for that matter, to definitions of ‘circle’, etc.), those rules have not become an integral part of mathematical training (despite occasional attempts by ‘progressive’ educators).

In contrast to early results in first order logic, the discovery of a complete set of formal rules was not regarded as trivial, but rather as a sensation (not surprisingly, since we are certainly not conscious of following these or any other rules for recognizing logical validity). Formal rules were presented as an analysis of the nature of logical validity, if not as the ultimate criterion of rigor. And what did not fit in (for example, so-called second order logic: incidentally, the part of logic which originally put axiomatic theory on the map, as in Peano’s axioms for arithmetic or Dedekind’s for the continuum) was rejected; at least by so-called formalists.

However, the norm is simply not applied to the bulk of practice, not even in the case of theorems formulated in the language of logic: mathematicians reversed the reduction of mathematics to logic, learning to use set theoretical and arithmetical principles for proving logical theorems. For reliability one needs such mathematical proofs, in the perfectly realistic sense that logical proofs (say,

14In fact, 40 years ago, practically all the applications of the general results consisted of trivial theorems which had clumsy proofs in the literature.

by Frege’s rules), are long and likely to contain errors. To appeal to the possibility (the existence) of some logical derivation d of a theorem F is certainly not an application of a norm when we do not know d, but are convinced by a totally different argument for F.15

The development of mathematical practice also shows that formal rules are futile for understanding the actual processes of logical reasoning, as discussed in Section 2. In mathematical proofs of logical theorems the formal logical inferences are seldom mentioned explicitly at all, and are obviously never the essential part of the argument. Thus those rules do not constitute even an approximation to our possibilities of recognizing logical validity, of finding and following convincing proofs of logical theorems.

Reminder. For purposes of intimate pedagogy, the discovery of formal rules is of course of permanent value. After all, without that discovery one could not even state precisely how inefficient logical proofs are compared to mathematical proofs!

As for its heuristic value, the discovery has led to the subject of proof theory, but with the following (inevitable) consequence of the exaggerated sensation described above. In accordance with the shift of emphasis about ‘ideal’ fluids (in Section 2), work in proof theory has to be constantly reformulated and separated from its original, discredited aims.

General conclusion

Despite the familiar whining about increasing specialization, we all absorb an increasing absolute amount (of course, not: proportion) of information about progress outside our special fields. Except for a few doctrinaires, we develop in this way new perspectives on the relative importance of different styles of thought: not necessarily because we are bewitched by certain habits of thought, but because we are impressed by their successes.16

The change between the first 30 years of this century, the most recent golden era of foundations, and the last 30 years seems to me very striking. Whatever the lasting role of positivist doctrine may be, both relativity theory and quantum theory were first presented in terms derived from that doctrine (before Einstein and Heisenberg wearied of it). More generally, the role of the observer was stressed, as part of the very meaning of essentially all assertions. As explained [∞] this applies very much to intuitionist foundations.

In contrast, the principal successes of the last 30 years involved a much more even mix of intellectual and material tools, including the use of sophisticated data;

15Readers will, of course, have noticed that a similar point applies to alleged norms in physics, for example, for length. Here different means are needed to get correct measures of great and small distances, and there is a whole theory of errors to make the proper choice of method of measurement.
16We may not always like successful styles as much as the attractive traditional tools, mentioned already in Section 1.

for example, in molecular biology and studies of animal behavior mentioned already (in Sections 1 and 2), but also in cosmology (radio astronomy, space travel, including the use of computers for processing previously unmanageable quantities of data). Except for the kind of doctrinaires mentioned earlier, few of us fail to recognize the interest of these more recent successes sub specie aeternitatis;17 and the philosophical interest of the discovery that the law of diminishing returns applies to the austerely theoretical, so-called conceptual analysis which was so prominent in the first quarter of this century (also, as in the Appendix, in early mathematical logic).18

17Evidently, in the literal sense of the phrase, our view of the world, the way we see the world, is affected much less by generalities about the nature of space and time, or about microscopic indeterminacy than by answers to such questions as: Where does the sun get its heat? or: Why is glass transparent? (and in particular, by recognizing that some of those generalities are relevant to these questions).
18Incidentally, this view about diminishing returns is by no means new (and hence likely to be untrue), but has also been formulated, though in different terms, and analyzed in the highly literate account Weinberg [1976]. Specifically, the first aim is not to correct some alleged fundamental misconceptions of our view of the world, which would require something fundamentally different from familiar notions, in other words, ‘crazy’ theories. One expects more from extending experience, in the precise sense of extending the ranges of the physical parameters which we encounter ordinarily (without setting up special experiments). Besides, once we are confronted with the wider experience, a previously ‘crazy’ idea will often appear commonplace. True, there is no end in sight to possible extensions of those ranges. But at least statistically, after a great deal of familiarity with the phenomena involved, we aren’t too bad at judging when an extension is sufficient for progress of permanent interest. To this extent St. Thomas’ adaequatio rei intellectu seems to have a point.

Chapter 17

Kripke’s book on Wittgenstein

Some of us who saw a lot of the later Wittgenstein are struck by the clash between large chunks of his posthumous publications on the one hand, and many of his more mature maxims and introductory remarks on the other. The latter dominated our conversations, and conflict with passages (quoted in Kripke [1982]) in the Investigations which fit into a principal Anglo-Saxon tradition, in line with Hume. For one thing Wittgenstein felt he was outside such traditions. More specifically, Wittgenstein is shown to violate a favourite maxim of his (also quoted by Kripke): don’t think, look! by not merely thinking, but arguing (the hind legs off a donkey), in the most traditional manner.

Others may examine Wittgenstein’s failure to work his mature thoughts even into his last text (evidently, largely taken from earlier material), and the part played by editorial selection in keeping them from other texts. Here we ignore all the dialectics, and merely recall1 briefly how a couple of Kripke’s principal points are interpreted in the light of Wittgenstein’s mature views, with one difference. Like Wittgenstein we stress the sterility of traditional arguments, as opposed to simple reminders where to look (in heaven and earth). But unlike Wittgenstein we avoid dramatizing the blind spots, perpetuated or even created by those arguments, since exaggerations invite equally sterile counterarguments, and so on ad nauseam.

0Originally published in Canadian Philosophical Reviews, 3 (1983) 287–289, as ‘Wittgenstein on rules and private language’.
1Recall from previous chapters. But on their own the earlier interpretations were (bound to be) dull, in accordance with Wittgenstein’s own maxim in Tractatus 6.53, about sensible philosophical points having proper weight only in opposition to another’s thesis. Note that Wittgenstein violates this maxim too, by merely imagining ‘theses’ instead of quoting from the rich supply in the literature.

Every course of action can be made to accord with the rule [considered]

If we look we see that not every course of action is made to accord with the rule. So, what can (logically, as it were) be made to accord is just of dubious relevance here. Selection from those logical possibilities is of the essence.
Progress will require distinctions: most obviously, between biological (for example, human) and current electronic implementations of rules (called ‘data processing’ in the trade).
In biological implementations many specific gadgets2 must be expected; as in other biological experience with its batteries of enzymes (handling the replication of DNA), of immune responses, or of so-called atoms of visual perception; not counting the many relics which lavish Nature has left over after selecting some successes.
In contrast, in electronic implementations technical or economic reasons require very few basic elements, success depending here on many, fast repetitions.
We expect (and find) that the rewarding questions concerning rules are pretty different in the two cases above. In keeping with a general lesson from (bitter) scientific experience: what is true in general (here: for rules, that is for the general concept of rule), is liable to be trivial in each particular case.
Trivial, not illegitimate! Thus such questions of ‘principle’ as whether a rule is followed correctly, do have close parallels in the two cases above. Far from being subtle, those questions can in fact be asked without ever looking at the rule! But, by Aristotle, brooding over them will soon reach the point of diminishing returns. Worse still, it keeps us from looking for relevant extensions of experience which raise new and rewarding questions almost automatically; in the present case: about imaginative ways of ensuring the reliability of a rule, without the ritual of measures or criteria (key words: enriching data, cross checks).
Ways familiar from experimental science, which were close to Wittgenstein’s interests (with his family background in engineering) but, apparently, not to Kripke’s.3
Kripke’s book touches most of the points above, including the biological business. But, pursuing scholarship of contemporary academic philosophy, he never stops to face the principal question (for a philosopher in the popular sense):

What (if anything) of substance can be said at present about any kind of implementing rules?

2 Or, behaviouristically, reactions: this is not an issue here.
3 Cf. footnote 25 on p. 39.

A proof must be easy to take in and to remember

Kripke links this requirement on proofs to public inspection and agreement, with surveyable in place of easy to take in. Though this link struck him, in his own words, with the ‘force of a revelation’, it goes back at least to Hilbert’s familiar sales talks for formalization. Anyhow, given the particular passages in Wittgenstein on which Kripke picks, his interpretation is not unreasonable, even if some of his reasons are; partly because of his pitiful struggles with quite elementary German.4 But this should not be enough to stop Wittgenstein from turning in his grave at the hackneyed thoughts, his bête noire, that an exceptionally gifted reader associated with his words.5
Now, what do we find if we actually look at a proof? The requirement above is equally relevant if the proof is to be convincing to ourselves, a matter which is suspect only when we ‘think’ (within the straitjacket of a particular ideology of rigour, to boot). Readers can transfer here the litany of the previous section: about dubious doubts drawing attention away from more rewarding problems; or the question of principle, whether a proof is correct, blinding us to the principal question:

What, if anything, of substance can be said about all correct proofs (even when restricted to some formal language)?6

Private language

On the matter of private language, experience shows that many effective uses of mathematics are indeed purely formal, public and mechanical; but also this: to understand, that is, (privately!) digest a proof, we have to put it into ‘our own words,’ especially if it is complicated. Those words may be invented neologisms, metaphors or even diagrams. As with the digestion of food such ‘additives’ seem essential, at least, in trace amounts; and roughly, but not exactly, the same will suit all of us. So at least in the biological domain, some kind of capacity for private language may go with the capacity for understanding proofs. But to search here (and, above all, now) for greater precision would seem pretentious, and thus futile.7
In the same vein: the public varieties of language can be expected to be more rewarding to study, simply because of their more obvious survival value or, if Wittgenstein’s jargon is preferred, ‘role’ in our lives.8 Seen this way, the hoary questions about the capacity for doing higher mathematics, with their patently fabricated issues (for example, about certainty) would seem unpromising: lack of focus, as already mentioned; and lack of tools, since the capacity seems to be wholly confined to humans, with little scope for standard experimentation (on animals).

4 Cf. footnote 63 on p. 74.
5 However finicky in his own writing, Wittgenstein did not think of himself as a quibbler. He once said to me: If you want to know whether one shoe is black, the other white, ask me; if you want to know if there is a white spot on the black shoe, ask Moore.
6 Reminder. Bourbaki’s L’architecture des mathématiques ([1948]) suggests one recipe for making some (not: all correct!) proofs convincing: an analysis (easy to take in) into a few lemmas, each about a few familiar structures (hence: easy to remember). In this way, the objects in the final theorem (say, about the reals) are seen as (an instance of) one basic structure (say, group) in one lemma, another (topological space) in another, or both (topological group) in a third. Thus the present scheme provides a natural place for Wittgenstein’s preoccupation with seeing as, too.
7 A mistake in the sense of p. 101, footnote 81, about explaining protophenomena.

8 Cf. The compulsive communicators, the title of the segment on humans in David Attenborough’s TV series Life on Earth.

Bibliography

Ackermann, W.
[1924] Begründung des ‘tertium non datur’ mittels der Hilbertschen Theorie der Widerspruchsfreiheit, Math. Ann. 93 (1924) 1–36.
Ax, J., and Kochen, S.
[1965] Diophantine problems on local fields, Am. J. Math. 87 (1965) 605–630.
Bachmann, F.
[1975] Frege als konstruktiver Logizist, in Frege und die moderne Grundlagenforschung, Thiel ed., Meisenheim am Glan, 1975, pp. 160–168.
Barwise, J.
[1977] Handbook of mathematical logic, North Holland, 1977.
Barwise, J., and Perry, J., eds.
[1983] Situations and attitudes, M.I.T. Press, 1983.
Bell, J.S.
[1977] Boolean-valued models and independence proofs in set theory, Clarendon Press, 1977.

Benacerraf, P. and Putnam, H., eds.
[1964] Philosophy of mathematics, Prentice Hall, 1964.
Bernard, J.F.
[1973] Talleyrand, New York, 1973.
Bernays, P.
[1935] Sur le platonisme dans les mathématiques, Enseign. Math. 34 (1935) 52–69.
[1940] Review of Gödel [1939], J. Symb. Log. 5 (1940) 116–117.
[1961] Die hohen Unendlichkeiten und die Axiomatik der Mengenlehre, in Infinitistic methods, Pergamon Press, 1961, pp. 11–20.
Berry, M.V.
[199?] Some quantum-to-classical asymptotics,
Bezem, M.
[1985] Strongly majorizable functionals of finite type. A model for bar recursion containing discontinuous functionals, J. Symb. Log. 50 (1985) 652–660.
Bishop, E.
[1967] Foundations of constructive analysis, McGraw Hill, 1967.
Boole, G.
[1854] An investigation of the laws of thought, on which are founded the mathematical theories of logic and probability, Walton and Maberley, 1854.
Bourbaki, N.
[1948] L’architecture des mathématiques, in Les grands courants de la pensée mathématique, Le Lionnais ed., Blanchard, 1948, pp. 35–47, transl. in Am. Math. Monthly 57 (1950) 221–232.


Brouwer, L.E.J.
[1905] Leven, Kunst, en Mystiek, Delft, 1905, partly transl. in [1975], pp. 1–10.
[1907] Over de grondslagen der wiskunde, Dissertation, Amsterdam, 1907, transl. in [1975], pp. 15–101.
[1908] De onbetrouwbaarheid der logische principes, Tijdschr. v. wijsb. 2, 1908, transl. in [1975], pp. 107–111.
[1908a] see Volume II, p. 150.
[1913] Intuitionism and formalism, Bull. Am. Math. Soc. 20 (1913) 81–96, also in [1975], pp. 123–138.
[1924] Intuitionistische Zerlegung mathematischer Grundbegriffe, Jber. Deutsch. Math. Verein. 33 (1924) 251–256, also in [1975], pp. 275–280.
[1927] Über Definitionsbereiche von Funktionen, Math. Ann. 97 (1927) 60–75, transl. in van Heijenoort [1967], pp. 446–463.
[1947] Address to Prof. G. Mannoury, Synthese 6 (1947) 190–194, also in [1975], pp. 472–476.
[1948] Essentieel negatieve eigenschappen, Ind. Math. 10 (1948) 322–323, transl. in [1975], pp. 478–479.
[1948a] Consciousness, philosophy and mathematics, Proc. Int. Congr. Math. 10 (1948) 1235–1249, also in [1975], pp. 480–494.
[1975] L.E.J. Brouwer Collected Works, Volume I, Heyting ed., North Holland, 1975.
[1981] Cambridge lectures on intuitionism, van Dalen ed., Cambridge University Press, 1981.
Brouwer, L.E.J., and de Loor, B.
[1924] Intuitionistischer Beweis des Fundamentalsatzes der Algebra, Proc. Ned. Acad. Wetensch. 27 (1924) 186–188.
Browder, F.E., ed.
[1976] , Proc. Symp. Pure Appl. Math. 28, 1976.
Burgess, J.P.
[1981] The completeness of intuitionistic propositional calculus for its intended interpretation, Notre Dame J. Form. Log. 22 (1981) 17–28.
Cantor, G.
[1879] Nachr. Ges. Wiss. Göttingen (1879) 127–135.
[1885] Review of Frege [1884], Deuts. Literat. 6 (1885) 728–729, reprinted in Cantor [1932], pp. 440–441.
[1889] Letter to Dedekind, reprinted in Cantor [1932], pp. 443–447.
[1932] Gesammelte Abhandlungen, Zermelo ed., Springer, 1932.
Chang, C.C. and Keisler, H.J.
[1973] Model Theory, North Holland, 1973.
Chevalley, C.
[1936] Démonstration d’une hypothèse de M. Artin, Abh. Math. Sem. Univ. Hamburg 11 (1936) 73–78.
Church, A.
[1953] Non-normal truth tables for the propositional calculus, Boll. Soc. Mat. Mex. 10 (1953) 41–52.
Cohen, P.J.
[1963] The independence of the continuum hypothesis, Proc. Nat. Acad. Sci. 50 (1963) 1143–1148.
[1964] The independence of the continuum hypothesis II, Proc. Nat. Acad. Sci. 51 (1964) 105–110.

[1971] Comments on the foundations of set theory, Proc. Symp. Pure Appl. Math. 13 (1971) 9–15.
Conway, J.H.
[1976] On numbers and games, Academic Press, 1976.
Craig, W.
[1965] Boolean notions extended to higher dimensions, in The theory of models, Addison et al. eds., North Holland, 1965, pp. 55–69.
Crawshay-Williams, R.
[1970] Russell remembered, London, 1970.
Crick, F.
[1981] Life itself, Simon and Schuster, 1981.
Crick, F., Griffith, , and Orgel,
[1957] Codes without commas, Proc. Nat. Acad. Sci. 43 (1957) 416–422.
Crick, F., and Orgel,
[1973] Directed panspermia, Icarus 19 (1973) 341–346.
Davis, M., Matyasevic, Y. and Robinson, J.
[1976] Hilbert’s tenth problem. Diophantine equations: positive aspects of a negative solution, Proc. Symp. Pure Appl. Math. 28 (1976) 323–378.
Dedekind, R.
[1872] Stetigkeit und irrationale Zahlen, Braunschweig, 1872.
[1888] Was sind und was sollen die Zahlen, Braunschweig, 1888.
Denef, J.
[1984] The rationality of Poincaré series associated to the p-adic points of a variety, Invent. Math. 77 (1984) 1–23.
Descartes, R.
[1637] Discours de la méthode, Leyden, 1637.
Dirac, P.M.
[1978] Directions in physics, Wiley, 1978.
Dreben, B., Andrews, P. and Aanderaa, S.
[1963] False lemmas in Herbrand, Bull. Am. Math. Soc. 69 (1963) 699–706.
Dries, L. van den and Wilkie, A.J.
[1984] Gromov’s theorem on groups of polynomial growth and elementary logic, J. Alg. 89 (1984) 349–374.
Einstein, A.
[1944] Remarks on Bertrand Russell’s theory of knowledge, in The philosophy of Bertrand Russell, Schilpp ed., Evanston, 1944, pp. 279–291.
Eklof, P.
[1976] Whitehead’s problem is undecidable, Am. Math. Month. 83 (1976) 775–788.
Ellison, W.J.
[1975] Les nombres premiers, Hermann, 1975.
Ershov, Y.L.
[1965] On the elementary theories of local fields, Alg. Log. 4 (1965) 5–30.
Feferman, S. and Spector, C.
[1962] Incompleteness along paths in progressions of theories, J. Symb. Log. 27 (1962) 383–390.
Finsler, P.
[1926] Formale Beweise und die Entscheidbarkeit, Math. Zeit. 25 (1926) 676–682, transl. in van Heijenoort [1967], pp. 438–445.

Frege, G.
[1879] Begriffsschrift, Halle, 1879, transl. in van Heijenoort [1967], pp. 1–82.
[1884] Grundlagen der Arithmetik, Breslau, 1884.
[1893] Grundgesetze der Arithmetik, Jena, 1893.
Friedman, H.
[1977] Ann. Math. 105 (1977) 1–28.
Gabbay, D.
[1976] A note on Kreisel’s notion of validity in Post systems, Studia Logica 35 (1976) 285–295.
[1981] Semantical investigations in Heyting’s intuitionistic logic, Reidel, 1981.
Galvin, F. and Prikry, K.
[1976] Infinitary Jonsson algebras and partition relations, Alg. Univ. 6 (1976) 485–494.
Gandy, R.O. and Hyland, M.
[1977] Computable and recursively countable functions of higher type, in Logic Colloquium ’76, Gandy et al. eds., North Holland, 1977, pp. 407–438.
Gentzen, G.
[1935] Untersuchungen über das logische Schliessen, Math. Zeit. 39 (1935) 176–210, 405–431, transl. in [1969], pp. 68–131.
[1935a] First version of Sections IV and V of [1936], in [1969], pp. 201–213.
[1936] Die Widerspruchsfreiheit der reinen Zahlentheorie, Math. Ann. 112 (1936) 493–565, transl. in [1969], pp. 132–213.
[1969] The collected papers of Gerhard Gentzen, Szabo ed., North Holland, 1969.
Geroch, and Hartle,
[19?] Volume II p. 248
Getty, P.J.
[1963] My life and fortunes, New York, 1963.
Girard, J.Y.
[1987] Proof theory and logical complexity, Bibliopolis, 1987.
Goad, C.A.
[1978] Monadic infinitary propositional logic: a special operator, Rep. Math. Log. 10 (1978) 43–50.
Gödel, K.
[1929] Dissertation,
[1930] Die Vollständigkeit der Axiome des logischen Funktionenkalküls, Monatsh. Math. Phys. 37 (1930) 349–360, transl. in [1986], pp. .
[1930a] Einige metamathematische Resultate über Entscheidungsdefinitheit und Widerspruchsfreiheit, Anz. Akad. Wiss. Wien 67 (1930) 214–215, transl. in [1986], pp. .
[1931] Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I, Monatsh. Math. Phys. 38 (1931) 173–198, transl. in [1986], pp. 145–195.
[1931a] Diskussion zur Grundlegung der Mathematik, Erkenn. 2 (1931) 147–151, transl. in [1986], pp. .
[1932] Eine Eigenschaft der Realisierungen des Aussagenkalküls, Ergebn. Math. Kolloq. 3 (1932) 20–21, transl. in [1986], pp. .
[1932a] Zum intuitionistischen Aussagenkalkül, Anz. Akad. Wiss. Wien 69 (1932) 65–66, transl. in [1986], pp. .
[1933] Zur intuitionistischen Arithmetik und Zahlentheorie, Ergebn. Math. Kolloq. 4 (1933) 34–38, transl. in [1986], pp. 287–295.
[1933a] Zum Entscheidungsproblem des logischen Funktionenkalküls, Monatsh. Math. Phys. 40 (1933) 433–443, transl. in [1986], pp. .

[1933b] Eine Interpretation des intuitionistischen Aussagenkalküls, Ergebn. Math. Kolloq. 4 (1933) 39–40, transl. in [1986], pp. .
[1934] Review of Skolem [1933a], (1934) , transl. in [1986], pp. .
[1934a] On undecidable propositions of formal mathematical systems, in The undecidable, Davis ed., Raven Press, 1965, pp. 41–74.
[1936] Über die Länge von Beweisen, Ergebn. Math. Kolloq. 7 (1936) 23–24, transl. in [1986], pp. .
[1938] The consistency of the axiom of choice and of the generalized continuum-hypothesis, Proc. Nat. Acad. Sci. 24 (1938) 556–557.
[1939] Consistency-proof for the generalized continuum-hypothesis, Proc. Nat. Acad. Sci. 25 (1939) 220–224.
[1940] The consistency of the axiom of choice and of the generalized continuum hypothesis, Princeton University Press, 1940, 2nd edition 1951.
[1944] Russell’s mathematical logic, in The philosophy of Bertrand Russell, Schilpp ed., Evanston, 1944, pp. 125–153.
[1946] Remarks before the Princeton bicentennial conference on problems in mathematics, in The undecidable, Davis ed., Raven Press, 1965, pp. 84–88.
[1947] What is Cantor’s continuum problem, Am. Math. Mon. 54 (1947) 515–525.
[1949] An example of a new type of cosmological solutions of Einstein’s field equations of gravitation, Rev. Mod. Phys. 21 (1949) 447–450.
[1949a] A remark about the relationship between relativity theory and idealistic philosophy, in Albert Einstein, philosopher-scientist, Schilpp ed., Evanston, 1949, pp. 555–562.
[1950] Rotating universes in general relativity theory, Proc. Int. Congr. Math. (1950) 175–181.
[1958] Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes, Dial. 12 (1958) 280–287, transl. in [1986], pp. .
[1964] What is Cantor’s continuum problem, revised and enlarged edition, in Philosophy of mathematics, Benacerraf et al. eds., Prentice-Hall, 1964, pp. 258–287.
[1972] second edition of [1958]
[1972a] remarks on the undecidability results
[1974] Preface to the second edition of Robinson [1966].
[1986] Collected works, volume I, Oxford University Press, 1986.
[1989] Collected works, volume II, Oxford University Press, 1989.
Goldfarb, W.D.
[1979] Logic in the twenties: the nature of the quantifier, J. Symb. Log. 44 (1979) 351–368.
[1984] The unsolvability of the Gödel class with identity, J. Symb. Log. 49 (1984) 1237–1252.
Grattan-Guinness, I.
[1979] In memoriam: Kurt Gödel, Hist. Math. 6 (1979) 294–304.
Hajnal, A.
[1956] On a consistency theorem connected with the generalized continuum problem, Zeit. Math. Log. Grund. Math. 2 (1956) 131–136.
Hasenjäger, G.
[1953] Eine Bemerkung zu Henkins Beweis für die Vollständigkeit des Prädikatenkalküls, J. Symb. Log. 18 (1953) 42–48.
Hasse, H.
[1964] Vorlesungen über Zahlentheorie, 2nd edition, Springer, 1964.
Hawking, S.W. and Ellis, G.F.R.
[1973] The large scale structure of space-time, Cambridge University Press, 1973.
Heckmann, O. and Schücking, E.
[1962] Relativistic cosmology, in Gravitation: an introduction to current research, Witten ed., Wiley and Sons, 1962, pp. 428–469.
Heijenoort, van, J.
[1967] From Frege to Gödel, Harvard University Press, 1967.
Herbrand, J.
[1930] Recherches sur la théorie de la démonstration, Thesis, Paris, 1930, transl. in van Heijenoort [1967], pp. 525–581.
Heyting, A.
[1930] Die formalen Regeln der intuitionistischen Logik, Akad. Wiss. Phys. Math. (1930) 42–56.
[1956] Intuitionism. An introduction, North Holland, 1956.
Henkin, L.
[1952] A problem concerning provability, J. Symb. Log. 17 (1952) 160.
Hermann, I.
[1949] Denkpsychologische Betrachtungen im Gebiete der mathematischen Mengenlehre, Schweiz. Z. Psych. 8 (1949) 189–231.
Higman, G.
[1961] Subgroups of finitely presented groups, Proc. Roy. Soc. 262 (1961) 455–474.
Hilbert, D.
[1899] Grundlagen der Geometrie, Leipzig, 1899.
[1904] Über die Grundlagen der Logik und der Arithmetik, Verhandl. III Intern. Math. Kongr. (1904) 174–185, transl. in van Heijenoort [1967], pp. 129–138.
[1926] Über das Unendliche, Math. Ann. 95 (1926) 161–190, transl. in van Heijenoort [1967], pp. 367–392.
[1930] Probleme der Grundlegung der Mathematik, Math. Ann. 102 (1930) 1–9.
[1931] Grundlegung der elementaren Zahlentheorie, Math. Ann. 104 (1931) 485–494.
Hilbert, D. and Ackermann, W.
[1928] Grundzüge der theoretischen Logik, Springer, 1928.
Hilbert, D., and Bernays, P.
[1934] Grundlagen der Mathematik, Berlin, 1934.
[1939] Grundlagen der Mathematik, volume II, Berlin, 1939.
Hodge, W.V.D. and Pedoe, D.
[1947] Methods of algebraic geometry, Volume I, Cambridge, 1947.
Holmberg, M.A., ed.
[1951] Les prix Nobel en 1950, Stockholm, 1951.
Howard, W.A.
[1968] Functional interpretation of bar induction by bar recursion, Comp. Math. 20 (1968) 107–124.
[1970] Assignment of ordinals to terms for primitive recursive functionals of finite type, in Intuitionism and proof theory, Kino et al. eds., North Holland, 1970, pp. 443–458.
[1973] Hereditarily majorizable functionals of finite type, in Troelstra [1973], pp. 454–461.
[1980] Ordinal analysis of terms of finite types, J. Symb. Log. 45 (1980) 493–504.
Howard, W.A. and Kreisel, G.
[1966] Transfinite induction and bar induction of types 0 and 1, and the role of continuity in intuitionistic analysis, J. Symb. Log. 31 (1966) 325–358.
Hübner, , and Wuchterl,
[198 ]
Hume, D.
[1777] An enquiry concerning the principles of morals, London, 1777.

Janik, A., and Toulmin, S.
[1973] Wittgenstein’s Vienna, Simon and Schuster, 1973.
Jensen, R.
[1972] The fine structure of L, Ann. Math. Log. 4 (1972) 229–308.
Jeroslow, R.G.
[1973] Redundancies in the Hilbert-Bernays derivability conditions for Gödel’s second incompleteness theorem, J. Symb. Log. 38 (1973) 359–367.
Jong, de, D.H.J.
[1980] A class of intuitionistic connectives, in Kleene symposium, Barwise ed., North Holland, 1980, pp. 103–111.
Kanamori, A. and Magidor, M.
[1978] The evolution of large cardinal axioms in set theory, Springer Lect. Not. Math. 669 (1978) 99–275.
Ketonen, J., and Solovay, R.
[1981] Rapidly growing Ramsey functions, Ann. Math. 113 (1981) 267–314.
Kleene, S.C.
[1952] Recursive functions and intuitionistic mathematics, Proc. Int. Congr. Math. (1952) 679–685.
[1955] Hierarchies of number-theoretical predicates, Bull. Am. Math. Soc. 61 (1955) 193–213.
[1976] The work of Kurt Gödel, J. Symb. Log. 41 (1976) 761–778.
[1978] An addendum to [1976], J. Symb. Log. 43 (1978) 613.
Knorr, W.
[1978] Archimedes and the spirals: the heuristic background, Hist. Math. 5 (1978) 43–75.
[1983] ‘La croix des mathématiciens’: the Euclidean theory of irrational lines, Bull. Am. Math. Soc. 9 (1983) 41–69.
Komar, A.
[1964] Undecidability of macroscopically distinguishable states in quantum field theory, Phys. Rev. 133 (1964) 542–544.
Kreisel, G.
[1950] Note on arithmetical models for consistent formulae of the predicate calculus, Fund. Math. 37 (1950) 265–285.
[1951] On the interpretation of non-finitist proofs I, J. Symb. Log. 16 (1951) 241–267.
[1952] On the interpretation of non-finitist proofs II, J. Symb. Log. 17 (1952) 43–58.
[1952a] On the concept of completeness and interpretation of formal systems, Fund. Math. 39 (1952) 103–127.
[1952b] Some elementary inequalities, Ind. Math. 14 (1952) 334–338.
[1953] Note on arithmetical models for consistent formulae of the predicate calculus II, Proc. Int. Congr. Phil. 14 (1953) 39–49.
[1956] Some uses of metamathematics, Brit. J. Phil. Sci. 7 (1956) 161–173.
[1958] Relative consistency proofs (abstract), J. Symb. Log. 23 (1958) 109–110.
[1959] Analysis of the Cantor-Bendixson theorem by means of the analytic hierarchy, Bull. Acad. Pol. Sci. 7 (1959) 621–626.
[1960] see Volume II, p. 208.
[1965] Mathematical logic, in Lectures on modern mathematics, Saaty ed., Wiley, 1965, vol. 3, pp. 95–195.
[1967] Informal rigour and completeness proofs, in Problems in the philosophy of mathematics, Lakatos ed., North Holland, 1967, pp. 138–186.
[1968] Functions, ordinals, species, in Logic, Methodology and Philosophy of Science III, van Rootselaar and Staal eds., North Holland, 1968, pp. 145–159.

[1970] Hilbert’s Programme and the search for automatic proof procedures, Springer Lect. Not. Math. 125 (1970) 128–146.
[1971] Some reasons for generalizing recursion theory, in Logic Colloquium ’69, Gandy et al. eds., North Holland, 1971, pp. 139–198.
[1971a] Review of Kreisel [1971], Zentr. f. Math. 199 (1971) 300–301.
[1972] Which number-theoretic problems can be solved in recursive progressions on Π^1_1-paths through O?, J. Symb. Log. 37 (1972) 311–334.
[1973] Perspectives in the philosophy of pure mathematics, Log. Meth. Phil. Sci. 4 (1973) 255–277.
[1974] A notion of mechanistic theory, Synthese 29 (1974) 11–26.
[1975] Some uses of proof theory for finding computer programs, in Colloque International de Logique, Clermont-Ferrand, Guillaume ed., 1975, pp. 123–134.
[1977] On the kind of data needed for a theory of proofs, in Logic Colloquium ’76, Gandy et al. eds., North Holland, 1977, pp. 111–128.
[1980] Grundlagen der Mathematik, in Handbuch wissenschaftstheoretischer Begriffe, Vandenhoeck and Ruprecht, 1980, pp. 393–400.
[1982] Review of Pour El and Richards [1979] and [1981], J. Symb. Log. 47 (1982) 900–902.
[1985] Review of Strukturtypen der Logik by Stegmüller and Varga, Grazer Phil. Stud. 24 (1985) 185–195.
[1985a] Review of Fundamentals of generalized recursion theory by Fitting, Bull. Am. Math. Soc. 13 (1985) 182–197.
[1985b] Proof theory and the synthesis of programs: potential and limitations, Springer Lect. Not. Comp. Sci. 203 (1985) 136–150.
[1986] Philosophie: eine Ergänzung der Wissenschaft?, Int. Wittgen. Symp. 9 (1986) 51–56.

Kreisel, G., and Krivine, J.L.
[1966] Elements of mathematical logic, North Holland, 1966, 2nd edition 1971.
Kreisel, G., and MacIntyre, A.
[1982] Constructive logic versus algebraization I, in The Brouwer centenary symposium, Troelstra et al. eds., North Holland, 1982, pp. 217–260.
Kreisel, G., Mints, G.E., Simpson, S.G.
[1975] The use of abstract language in elementary metamathematics: some pedagogical examples, Springer Lect. Notes Math. 453 (1975) 38–131.
Kreisel, G. and Tait, W.W.
[1961] Finite definability of number-theoretic functions and parametric completeness of equational calculi, Zeit. Math. Log. Grund. Math. 7 (1961) 28–38.
Kreisel, G., and Takeuti, G.
[1974] Formally self-referential propositions for cut free classical analysis and related systems, Diss. Math. 118 (1974) 1–50.
Kreisel, G. and Troelstra, A.S.
[1971] Formal systems for some branches of intuitionistic analysis, Ann. Math. Log. 1 (1970) 229–387 and 3 (1971) 437–439.
Kripke, S.
[1982] Wittgenstein on rules and private language, Harvard University Press, 1982.
Lakatos, I., ed.
[1967] Problems in the philosophy of mathematics, North Holland, 1967.
Lévy, A.
[1957] Indépendance conditionnelle de V = L et d’axiomes qui se rattachent au système de M. Gödel, Comp. Rend. Acad. Sci. 245 (1957) 1582–1583.

Löb, M.H.
[1955] Solution of a problem of Leon Henkin, J. Symb. Log. 20 (1955) 115–118.
Lomonosov, V.I.
[1973] Invariant subspaces of the family of operators that commute with a completely continuous operator, Funk. Anal. Prilož. 7 (1973) 55–56.
Lopez-Escobar, E.G.K.
[1976] On a very restricted ω-rule, Fund. Math. 90 (1976) 156–172.
[1982] Further applications of ultraconservative ω-rules, Arch. Math. Log. Grund. 22 (1982) 89–102.
Lorenz, K.
[1968] Dialogspiele als semantische Grundlage von Logikkalkülen, Arch. Math. Log. Grund. 11 (1968) 32–55, 73–100.
Lorenzen, P.
[1951] Über endliche Mengen, Math. Ann. 123 (1951) 331–338.
MacIntyre, A.
[1971] On ω1-categorical theories of fields, Fund. Math. 71 (1971) 1–25.
Mahlo, P.
[1912] Zur Theorie und Anwendung der ρ0-Zahlen, Ber. Verh. Sachs. Akad. Wiss. 64 (1912) 190–200.
Malcev, A.
[1936] Untersuchungen aus dem Gebiete der mathematischen Logik, Mat. Sbor. 1 (1936) 323–336.
[1941] On a general method for obtaining local theorems in group theory, Ivanov Gos. Ped. Inst. 1 (1946) 3–9.
Maistre, J. de
[1821] Les soirées de Saint-Pétersbourg ou entretiens sur le gouvernement temporel de la providence: suivis d’un Traité sur les sacrifices, Paris, 1821.
Manewitz, L. and Stavi, J.
[1980] Δ^0_2 operators and alternating sentences in arithmetic, J. Symb. Log. 45 (1980) 144–154.
Martin, D.A.
[1975] Borel determinacy, Ann. Math. 102 (1975) 363–371.
[1976] , Proc. Symp. Pure Appl. Math. 28 (1976)
[1985] new proof of Borel determinacy, Proc. Symp. Pure Appl. Math. 42 (1985)
Matyasevic, Y.
[1970] Enumerable sets are diophantine, Dokl. Acad. Nauk 191 (1970) 279–282, transl. Sov. Math. Dokl. 11 (1970) 354–357.
McCarthy, M.T.
[1952] The groves of Academe, New York, 1952.
Milnor, J.
[1958] Some consequences of a theorem of Bott, Ann. Math. 68 (1958) 444–449.
Montague, R. and Vaught, R.L.
[1959] Natural models of set theories, Fund. Math. 47 (1959) 219–242.
Moore, G.
[1980] Beyond first-order logic, Hist. Phil. Log. 1 (1980) 95–137.
Mostowski, A.
[1952] Sentences undecidable in formalized arithmetic, North Holland, 1952.
Motz, L. and Motz, R.O.
[1990] Sensible mathematics, Nature 345 (1990) 300.

Musil, R.
[1930] Der Mann ohne Eigenschaften, Berlin, 1930.
Nedo, M., and Ranchetti, M.
[1983] Ludwig Wittgenstein: sein Leben in Bildern und Texten, Suhrkamp, 1983.
Nerode, A. and Harrington, L.
[1984] The work of Harvey Friedman, Not. Am. Math. Soc. 31 (1984) 563–566.
Neumann, von, J.
[1927] Zur Hilbertschen Beweistheorie, Math. Zeit. 27 (1927) 1–46.
[1928] Die Axiomatisierung der Mengenlehre, Math. Zeit. 27 (1928) 339–422.
Newman, M.H.A.
[1969] Luitzen Egbertus Jan Brouwer, part II, Biogr. Mem. Fell. Royal Soc. 15 (1969) 46–53.

Nkrumah, K.
[1970] Consciencism; philosophy and ideology for decolonization, Monthly Review Press, 1970.
Nobeling, G.
[1968] Verallgemeinerung eines Satzes von Herrn E. Specker, Inv. Math. 6 (1968) 41–55.
Parikh, R.J.
[1979] Review of Kreisel [1977], Math. Rev. 58 (1979) 3203–3204.
Paris, J.B., and Harrington, L.
[1977] A mathematical incompleteness in PA, in Barwise [1977], pp. 1133–1142.
Pour El, M.B. and Richards, I.
[1979] A computable ordinary differential equation which possesses no computable solutions, Ann. Math. Log. 17 (1979) 61–90.
[1981] The wave equation with computable initial data such that its unique solution is not computable, Adv. Math. 39 (1981) 215–239.
[1983] Non-computability in analysis and physics: a complete determination of the class of non-computable linear operators, Adv. Math. 48 (1983) 44–74.
Prawitz, D.
[1974] On the idea of a general proof theory, Synth. 27 (1974) 63–77.
Rabin, M.
[1977] Decidable theories, in Barwise [1977], pp. 595–629.
Rados, G.
[1906] Zur ersten Verteilung des Bolyai-Preises, Math. Ann. 62 (1906) 156–176.
Ramsey, F.P.
[1928] On a problem of formal logic, Proc. Lond. Math. Soc. 30 (1928) 338–384.
Robinson, A.
[1955] Théorie métamathématique des idéaux, Gauthier-Villars, 1955.
[1966] Non-standard analysis, North Holland, 1966, 2nd edition 1974.
Robinson, A., and Roquette, P.
[1975] On the finiteness theorem of Siegel and Mahler concerning diophantine equations, J. Number Th. 7 (1975) 121–176.
Robinson, J.
[1968] Recursive functions of one variable, Proc. Am. Math. Soc. 19 (1968) 815–820.
Rokeach, M.
[1964] The three Christs of Ypsilanti; a psychological study, New York, 1964.
Rosser, B.J.
[1936] Extensions of some theorems of Gödel and Church, J. Symb. Log. 1 (1936) 87–91.

Russell, B.
[1900] A critical exposition of the philosophy of Leibniz, Cambridge, 1900.
[1903] The principles of mathematics, London, 1903.
[1905] On denoting, Mind 14 (1905) 479–493.
[1906] On some difficulties in the theory of transfinite numbers and order types, Proc. Lond. Math. Soc. 4 (1906) 29–53.
[1912] The problems of philosophy, London, 1912.
[1919] Introduction to mathematical philosophy, London, 1919.
[1944] My mental development, in The philosophy of Bertrand Russell, Schilpp ed., Evanston, 1944, pp. 1–20.
[1945] A history of western philosophy, New York, 1945.
[1951] What desires are politically important?, in Les prix Nobel en 1950, Holmberg ed., Stockholm, 1951, pp. 259–270.
[1959] My philosophical development, New York, 1959.
[1967] Autobiography 1872–1914, vol. I, Atlantic Monthly Press, 1967.
[1968] Autobiography 1914–1944, vol. II, Atlantic Monthly Press, 1967.
[1969] Autobiography 1944–1968, vol. III, Atlantic Monthly Press, 1967.
Schroeder-Heister, P.
[1984] A natural extension of natural deduction, J. Symb. Log. 49 (1984) 1284–1300.
Schütte, K., and Van der Waerden, B.L.
[1953] Das Problem der dreizehn Kugeln, Math. Ann. 125 (1953) 325–334.
Scott, D.S.
[1962] More on the axiom of extensionality, in Essays on the foundations of mathematics, Bar Hillel et al. eds., Magnes Press, 1962, pp. 115–176.
Senovilla, J.M.M.
[1990] A new class of inhomogeneous cosmological perfect fluid solutions without Big Bang singularities, Phys. Rev. Letters 64 (1990) 2219.
Shoenfield, J.
[1957] Open sentences in partial systems of arithmetic, J. Symb. Log. 22 (1957) 112.
[1959] On the independence of the axiom of constructibility, Am. J. Math. 81 (1959) 537–540.
[1962] The problem of predicativity, in Essays on the foundations of mathematics, Bar Hillel et al. eds., North Holland, 1962, pp. 132–139.
Siegel, .
[1929]
Simpson, S.G.
[1984] Which set existence axioms are needed to prove the Cauchy-Peano theorem for ordinary differential equations?, J. Symb. Log. 49 (1984) 783–802.
Skolem, T.
[1922] Einige Bemerkungen zur axiomatischen Begründung der Mengenlehre, Proc. Congr. Scand. Math. 5 (1922) 217–232, transl. in van Heijenoort [1967], 290–301.
[1933] Review of Gödel [1933], Forts. Math. 59 (1933) 865–866.
[1933a] Über die Unmöglichkeit einer vollständigen Charakterisierung der Zahlenreihe mittels eines endlichen Axiomensystems, Norsk Mat. For. Skrif. 10 (1933) 73–82.
Smorynski, C.
[1977] ω-consistency and reflection, Proc. 1975 Log. Coll. Clermont-Ferrand, 1977, pp. 167–181.
[1977a] The incompleteness theorems, in Barwise [1977], pp. 821–865.

Solzhenitsyn, A.I.
[1968] The first circle, London, 1968.
[1968a] The cancer ward, London, 1968.
Specker, E.P.
[1950] Additive Gruppen von Folgen ganzer Zahlen, Port. Math. 9 (1950) 131–140.
Spector, C.
[1962] Provably recursive functionals of analysis: a consistency proof of analysis by an extension of principles formulated in current intuitionistic mathematics, Proc. Symp. Pure Appl. Math. 5 (1962) 1–27.
Statman, R.
[1974] Structural complexity of derivations, Ph.D. Thesis, Stanford University, 1974.
[1977] Herbrand’s theorem and Gentzen’s notion of a direct proof, in Barwise [1977], pp. 897–912.
Sturm, C.
[1835] Mémoire sur la résolution des équations numériques, Mem. Acad. Roy. Sci. 6, 1835.
Sundholm, G.
[1983] Constructions, proofs and the meaning of the logical constants, J. Phil. Log. 12 (1983) 151–172.
Tait, W.W.
[1967] Intensional interpretation of functionals of finite type, J. Symb. Log. 32 (1967) 198–212.
Tarski, A., Mostowski, A., and Robinson, R.M.
[1953] Undecidable theories, North Holland, 1953.
Taub, A.H.
[1951] Empty space-times, Ann. Math. 53 (1951) 472–490.
Taussky, O.
[1987] in Gödel remembered, Weingartner and Schmetterer eds., Bibliopolis, 1987, pp.
Troelstra, A.S.
[1973] Metamathematical investigations of intuitionistic arithmetic and analysis, Springer Lecture Notes in Mathematics 344, 1973.
[1977] Choice sequences, a chapter of intuitionistic mathematics, Clarendon Press, 1977.
[1981] The interplay between logic and mathematics: intuitionism, in Modern logic - A survey, Agazzi ed., Reidel, 1981, pp. 197–221.
[1983] Analysing choice sequences, J. Phil. Log. 12 (1983) 197–260.
Troelstra, A.S., and van Dalen, D.
[1988] Constructivism in mathematics, 2 volumes, North Holland, 1988.
Turing, A.M.
[1936] On computable numbers with an application to the Entscheidungsproblem, Proc. Lond. Math. Soc. 42 (1936) 230–265.
[1939] Systems of logic based on ordinals, Proc. Lond. Math. Soc. 45 (1939) 161–228.
Van den Hoeven, G.F., and Moerdijk, I.
[1984] Constructing choice sequences from lawless sequences of neighbourhood functions, in Logic Colloquium ’83, Müller et al. eds., Springer, 1984, pp. 207–234.
Van den Dries, L.
[1982] Some applications of a model-theoretic fact to (semialgebraic) geometry, Ind. Math. 44 (1982) 397–441.
Vaught, R.L.
[1974] Model theory before 1945, Proc. Symp. Pure Math. 25 (1974) 153–172.

Wang, H.
[1974] From mathematics to philosophy, Routledge and Kegan Paul, 1974.
Weil, A.
[1974] Basic number theory, Springer, 1974.
Weinberg, S.
[1976] The forces of nature, Bull. Am. Acad. Sci. 29 (1976) 13–29.
Weinstein, S.
[1983] The intended interpretation of intuitionistic logic, J. Phil. Log. 12 (1983) 261–270.
Weyl, H.
[1918] Das Kontinuum, Leipzig, 1918.
[1946] Review of The philosophy of Bertrand Russell, Am. Math. Monthly 53 (1946) 208–214.
Whitehead, A.N., and Russell, B.
[1910] Principia Mathematica, vol. I, Cambridge, 1910.
Wigner, E.
[1960] The unreasonable effectiveness of mathematics in the natural sciences, Comm. Pure Appl. Math. 13 (1960) 1–14.
[1982] , Nobel Conf. 17 (1982)
Wittgenstein, L.
[1921] Logisch-philosophische Abhandlung, Ann. Naturphil. 14 (1921) 185–262.
[195 ] Philosophical Investigations, 195 .
[1956] Bemerkungen über die Grundlagen der Mathematik, Oxford, 1956.
[1967] Zettel, Oxford, 1967.
[1980] Vermischte Bemerkungen, Chicago, 1980.
Wojtylak, P.
[1982] Collapse of a class of infinite disjunctions in intuitionistic propositional logic, Rep. Math. Log. 16 (1982) 37–49.
[1984] A recursive theory for the {¬, ∧, ∨, →, ◦} fragment of intuitionistic logic, Rep. Math. Log. 18 (1984) 3–35.
Zermelo, E.
[1896] Über einen Satz der Dynamik und die mechanische Wärmetheorie, Ann. Phys. 57 (1896) 485–494.
[1904] Beweis, dass jede Menge wohlgeordnet werden kann, Math. Ann. 59 (1904) 514–516, transl. in van Heijenoort [1967], pp. 139–141.
[1908] Untersuchungen über die Grundlagen der Mengenlehre I, Math. Ann. 65 (1908) 261–281, transl. in van Heijenoort [1967], pp. 199–215.
[1912] Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels, Proc. Int. Congr. Math. 5 (1912) 501–504.
[1930] Über Grenzzahlen und Mengenbereiche, Fund. Math. 16 (1930) 29–47.
[1932] Über Stufen der Quantifikation und die Logik des Unendlichen, Jber. Dt. Mat. Verein. 41 (1932) 85–88.
[1935] Grundlagen einer allgemeinen Theorie der mathematischen Satzsysteme, Fund. Math. 25 (1935) 136–146.